{"blog_url":"https://yohei-a.hatenablog.jp/","published":"2020-01-01 23:42:12","version":"1.0","title":"Referencing the Glue Catalog from Spark on EMR","image_url":null,"blog_title":"ablog","author_name":"yohei-a","height":"190","width":"100%","provider_url":"https://hatena.blog","html":"<iframe src=\"https://hatenablog-parts.com/embed?url=https%3A%2F%2Fyohei-a.hatenablog.jp%2Fentry%2F20200101%2F1577889732\" title=\"Referencing the Glue Catalog from Spark on EMR - ablog\" class=\"embed-card embed-blogcard\" scrolling=\"no\" frameborder=\"0\" style=\"display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;\"></iframe>","provider_name":"Hatena Blog","type":"rich","author_url":"https://blog.hatena.ne.jp/yohei-a/","url":"https://yohei-a.hatenablog.jp/entry/20200101/1577889732","description":"Notes on referencing the Glue Catalog from Spark on EMR. Prerequisites: Databases and tables are assumed to already exist in the Glue Catalog. Setup: When creating the EMR cluster, check Spark under [Software Configuration], then check [AWS Glue Data Catalog settings (optional)] - [Use for Spark table metadata]. Referencing the Glue Catalog: Start the PySpark REPL. $ pyspark Run PySpark code that lists the databases. from pyspark.sql import Spark\u2026","categories":["AWS"]}