<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<oembed>
  <author_name>yohei-a</author_name>
  <author_url>https://blog.hatena.ne.jp/yohei-a/</author_url>
  <blog_title>ablog</blog_title>
  <blog_url>https://yohei-a.hatenablog.jp/</blog_url>
  <categories>
    <anon>Spark</anon>
  </categories>
  <description>PySpark uses Java regular expressions. Regex in PySpark internally uses Java regex. One common issue with regex is escaping backslashes, because Spark uses Java regex and we pass a raw Python string to spark.sql. We can see this with a sample example: \d represents a digit in regex. Let us use Spark regexp_extract to matc…</description>
  <height>190</height>
  <html>&lt;iframe src=&quot;https://hatenablog-parts.com/embed?url=https%3A%2F%2Fyohei-a.hatenablog.jp%2Fentry%2F20210612%2F1623470162&quot; title=&quot;PySpark uses Java regular expression syntax - ablog&quot; class=&quot;embed-card embed-blogcard&quot; scrolling=&quot;no&quot; frameborder=&quot;0&quot; style=&quot;display: block; width: 100%; height: 190px; max-width: 500px; margin: 10px 0px;&quot;&gt;&lt;/iframe&gt;</html>
  <image_url></image_url>
  <provider_name>Hatena Blog</provider_name>
  <provider_url>https://hatena.blog</provider_url>
  <published>2021-06-12 12:56:02</published>
  <title>PySpark uses Java regular expression syntax</title>
  <type>rich</type>
  <url>https://yohei-a.hatenablog.jp/entry/20210612/1623470162</url>
  <version>1.0</version>
  <width>100%</width>
</oembed>
