{"provider_name":"Hatena Blog","blog_url":"https://dk521123.hatenablog.com/","image_url":null,"categories":["Spark / PySpark"],"url":"https://dk521123.hatenablog.com/entry/2021/05/25/111051","height":"190","type":"rich","provider_url":"https://hatena.blog","html":"<iframe src=\"https://hatenablog-parts.com/embed?url=https%3A%2F%2Fdk521123.hatenablog.com%2Fentry%2F2021%2F05%2F25%2F111051\" title=\"[Distributed Processing] PySpark ~ DataFrame / Data Aggregation ~ - \u30d7\u30ed\u30b0\u30e9\u30e0 \u306e\u8d85\u500b\u4eba\u7684\u306a\u30e1\u30e2\" class=\"embed-card embed-blogcard\" scrolling=\"no\" frameborder=\"0\" style=\"display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;\"></iframe>","author_url":"https://blog.hatena.ne.jp/dk521123/","title":"[Distributed Processing] PySpark ~ DataFrame / Data Aggregation ~","width":"100%","version":"1.0","description":"\u25a0 Introduction: A continuation of https://dk521123.hatenablog.com/entry/2020/01/04/150942 . This post covers aggregating table data. Contents: [0] agg (aggregation) [1] min/max (minimum/maximum) [2] count (row count) [3] countDistinct (distinct count) Others include sum (total) and avg (average). [0] agg (aggregation) * aggregate = aggregation * Returns the result of running min, max, sum, and similar functions. API specification https://spark.apache.org/docs/latest/api/pyt\u2026","published":"2021-05-25 11:10:51","author_name":"dk521123","blog_title":"\u30d7\u30ed\u30b0\u30e9\u30e0 \u306e\u8d85\u500b\u4eba\u7684\u306a\u30e1\u30e2"}