{"provider_url":"https://hatena.blog","blog_title":"ablog","published":"2020-11-24 12:48:37","url":"https://yohei-a.hatenablog.jp/entry/20201124/1606189717","provider_name":"Hatena Blog","image_url":null,"height":"190","version":"1.0","title":"Reading Parquet from a dynamic_frame in a Glue Spark job fails with \"Unsupported encoding: DELTA_BINARY_PACKED\"","blog_url":"https://yohei-a.hatenablog.jp/","description":"Symptom: Reading Parquet from a dynamic_frame in a Glue Spark job fails with \"Unsupported encoding: DELTA_BINARY_PACKED\". Solution: Set the following: spark.conf.set(\"spark.sql.parquet.enableVectorizedReader\", \"false\") Reference: In order to generate the DELTA encoded parquet file in PySpark, we need to enable version 2 of the Parquet wri\u2026","html":"<iframe src=\"https://hatenablog-parts.com/embed?url=https%3A%2F%2Fyohei-a.hatenablog.jp%2Fentry%2F20201124%2F1606189717\" title=\"Reading Parquet from a dynamic_frame in a Glue Spark job fails with &quot;Unsupported encoding: DELTA_BINARY_PACKED&quot; - ablog\" class=\"embed-card embed-blogcard\" scrolling=\"no\" frameborder=\"0\" style=\"display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;\"></iframe>","author_name":"yohei-a","categories":["AWS"],"width":"100%","type":"rich","author_url":"https://blog.hatena.ne.jp/yohei-a/"}