
Using Azure Databricks for ETL


This is a demo we can learn from and discuss together. If you have any questions, feel free to send a private message or leave a comment.

This article uses Azure Databricks to perform an ETL (extract, transform, and load) operation: data is extracted from Azure Data Lake Storage Gen2 into Azure Databricks, transformed there, and then loaded into Azure Synapse Analytics.

The steps in this article use the Azure Synapse connector for Azure Databricks to transfer data to Azure Synapse. The connector, in turn, uses Azure Blob storage as temporary staging for the data moved between the Azure Databricks cluster and Azure Synapse.

Application flow: extract from Azure Data Lake Storage Gen2, transform in Azure Databricks, and load into Azure Synapse Analytics, with Azure Blob storage used as the staging area.

Create an Azure Databricks service

In this section, you create an Azure Databricks service by using the Azure portal.

  • In the Azure menu, select Create a resource.

    Then select Analytics > Azure Databricks.

  • Under Azure Databricks Service, provide the following values to create the Databricks service:

    Table 1. Settings for the Databricks service

    Property         Description
    Workspace name   Provide a name for the Databricks workspace.
    Subscription     Select your Azure subscription from the drop-down list.
    Resource group   Specify whether to create a new resource group or use an existing one. A resource group is a container that holds related resources for an Azure solution. For details, see the Azure resource group overview.
    Location         Select China East 2. For other available regions, see the Azure services available by region.
    Pricing tier     Select Standard.
  • Select Pin to dashboard, and then select Create.

  • The account creation takes a few minutes. To monitor the operation status, watch the progress bar at the top.

Create a Spark cluster in Azure Databricks

  • In the Azure portal, go to the Databricks service that you created, and select Launch Workspace.

  • 系統隨后會將你重定向到 Azure Databricks 門戶。?在門戶中選擇“群集”。

  • On the New cluster page, provide the values to create a cluster.

  • Fill in values for the following fields, and accept the default values for the other fields:

    • Enter a name for the cluster.

    • Make sure you select the Terminate after __ minutes of inactivity check box. Provide a duration (in minutes) after which the cluster is terminated if it is not being used.

    • Select Create cluster. After the cluster is running, you can attach notebooks to it and run Spark jobs.

Create a file system in the Azure Data Lake Storage Gen2 account

    In this section, you create a notebook in the Azure Databricks workspace and then run code snippets to configure the storage account.

  • In the Azure portal, go to the Azure Databricks service that you created, and select Launch Workspace.

  • On the left, select Workspace. From the Workspace drop-down, select Create > Notebook.

  • In the Create Notebook dialog box, enter a name for the notebook. Select Scala as the language, and then select the Spark cluster that you created earlier.

  • Select Create.

  • The first code block below sets default service principal credentials for any ADLS Gen2 account accessed in the Spark session. The second code block appends the account name to the setting, so the credentials apply to one specific ADLS Gen2 account. Copy and paste either code block into the first cell of your Azure Databricks notebook.

    Session configuration

    Scala復制

    val appID = "<appID>"
    val secret = "<secret>"
    val tenantID = "<tenant-id>"

    spark.conf.set("fs.azure.account.auth.type", "OAuth")
    spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    spark.conf.set("fs.azure.account.oauth2.client.id", "<appID>")
    spark.conf.set("fs.azure.account.oauth2.client.secret", "<secret>")
    spark.conf.set("fs.azure.account.oauth2.client.endpoint", "https://login.microsoftonline.com/<tenant-id>/oauth2/token")
    spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "true")

    Account configuration

    Scala復制

    val storageAccountName = "<storage-account-name>"
    val appID = "<app-id>"
    val secret = "<secret>"
    val fileSystemName = "<file-system-name>"
    val tenantID = "<tenant-id>"

    spark.conf.set("fs.azure.account.auth.type." + storageAccountName + ".dfs.core.chinacloudapi.cn", "OAuth")
    spark.conf.set("fs.azure.account.oauth.provider.type." + storageAccountName + ".dfs.core.chinacloudapi.cn", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    spark.conf.set("fs.azure.account.oauth2.client.id." + storageAccountName + ".dfs.core.chinacloudapi.cn", "" + appID + "")
    spark.conf.set("fs.azure.account.oauth2.client.secret." + storageAccountName + ".dfs.core.chinacloudapi.cn", "" + secret + "")
    spark.conf.set("fs.azure.account.oauth2.client.endpoint." + storageAccountName + ".dfs.core.chinacloudapi.cn", "https://login.microsoftonline.com/" + tenantID + "/oauth2/token")
    spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "true")
    dbutils.fs.ls("abfss://" + fileSystemName + "@" + storageAccountName + ".dfs.core.chinacloudapi.cn/")
    spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "false")
  • In this code block, replace the <app-id>, <secret>, <tenant-id>, and <storage-account-name> placeholder values with the values you collected while completing the prerequisites of this tutorial. Replace the <file-system-name> placeholder value with whatever name you want to give the file system.

    • The <app-id> and <secret> are from the app that you registered with Active Directory as part of creating a service principal.

    • The <tenant-id> is from your subscription.

    • The <storage-account-name> is the name of your Azure Data Lake Storage Gen2 storage account.

  • Press SHIFT + ENTER to run the code in this block.

  • Note: these parameters depend on whether you are using international Azure or Azure China. This article uses Azure China; if you are on international Azure, change dfs.core.chinacloudapi.cn to the international endpoint (a sketch follows this note).
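
    For global (international) Azure, the ADLS Gen2 endpoint suffix is dfs.core.windows.net instead of dfs.core.chinacloudapi.cn (and, later in this article, blob.core.chinacloudapi.cn becomes blob.core.windows.net and database.chinacloudapi.cn becomes database.windows.net). Below is a minimal sketch of the account configuration for the global cloud, assuming the same variables (storageAccountName, appID, secret, tenantID, fileSystemName) as defined above:

    Scala

    // Sketch for global (international) Azure: only the endpoint suffix changes.
    // Assumes storageAccountName, appID, secret, tenantID, and fileSystemName
    // are defined as in the account-configuration cell above.
    val suffix = ".dfs.core.windows.net" // global Azure ADLS Gen2 endpoint suffix
    spark.conf.set("fs.azure.account.auth.type." + storageAccountName + suffix, "OAuth")
    spark.conf.set("fs.azure.account.oauth.provider.type." + storageAccountName + suffix, "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    spark.conf.set("fs.azure.account.oauth2.client.id." + storageAccountName + suffix, appID)
    spark.conf.set("fs.azure.account.oauth2.client.secret." + storageAccountName + suffix, secret)
    spark.conf.set("fs.azure.account.oauth2.client.endpoint." + storageAccountName + suffix, "https://login.microsoftonline.com/" + tenantID + "/oauth2/token")
    dbutils.fs.ls("abfss://" + fileSystemName + "@" + storageAccountName + suffix + "/")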

Ingest sample data into the Azure Data Lake Storage Gen2 account

    Enter the following code into a notebook cell (the sketch after the command then copies the downloaded file into ADLS Gen2):

    Bash

    %sh wget -P /tmp https://raw.githubusercontent.com/Azure/usql/master/Examples/Samples/Data/json/radiowebsite/small_radio_json.json
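
    The wget command above only downloads the file to /tmp on the cluster's driver node. The extraction step later in this article reads from the abfss:// path, so the file still has to be copied into the ADLS Gen2 file system. A short sketch, assuming the session/account configuration cells above have already been run and fileSystemName and storageAccountName are still defined:

    Scala

    // Copy the downloaded sample file from the driver's local /tmp into ADLS Gen2
    // so it can be read later via the abfss:// path.
    dbutils.fs.cp(
      "file:///tmp/small_radio_json.json",
      "abfss://" + fileSystemName + "@" + storageAccountName + ".dfs.core.chinacloudapi.cn/")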

    Below is the content of small_radio_json.json. If the command above does not work, you can copy this content, save it as a file, and upload it to Azure Data Lake Storage Gen2 (see the sketch after the data).

    {"ts":1409318650332,"userId":"309","sessionId":1879,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":2,"location":"Killeen-Temple, TX","lastName":"Montgomery","firstName":"Annalyse","registration":1384448062332,"gender":"F","artist":"El Arrebato","song":"Quiero Quererte Querer","length":234.57914} {"ts":1409318653332,"userId":"11","sessionId":10,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":9,"location":"Anchorage, AK","lastName":"Thomas","firstName":"Dylann","registration":1400723739332,"gender":"M","artist":"Creedence Clearwater Revival","song":"Born To Move","length":340.87138} {"ts":1409318685332,"userId":"201","sessionId":2047,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":11,"location":"New York-Newark-Jersey City, NY-NJ-PA","lastName":"Watts","firstName":"Liam","registration":1406279422332,"gender":"M","artist":"Gorillaz","song":"DARE","length":246.17751} {"ts":1409318686332,"userId":"779","sessionId":2136,"page":"Home","auth":"Logged In","method":"GET","status":200,"level":"free","itemInSession":0,"location":"Nashville-Davidson--Murfreesboro--Franklin, TN","lastName":"Townsend","firstName":"Tess","registration":1406970190332,"gender":"F"} {"ts":1409318697332,"userId":"401","sessionId":400,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":2,"location":"Atlanta-Sandy Springs-Roswell, GA","lastName":"Smith","firstName":"Margaux","registration":1406191211332,"gender":"F","artist":"Otis Redding","song":"Send Me Some Lovin'","length":135.57506} {"ts":1409318714332,"userId":"521","sessionId":520,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":39,"location":"Chicago-Naperville-Elgin, IL-IN-WI","lastName":"Morse","firstName":"Alan","registration":1401760632332,"gender":"M","artist":"Slightly Stoopid","song":"Mellow Mood","length":198.53016} {"ts":1409318743332,"userId":"244","sessionId":2261,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":1,"location":"San Jose-Sunnyvale-Santa Clara, CA","lastName":"Shelton","firstName":"Gabriella","registration":1389460542332,"gender":"F","artist":"NOFX","song":"Linoleum","length":130.2722} {"ts":1409318804332,"userId":"969","sessionId":968,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":0,"location":"Detroit-Warren-Dearborn, MI","lastName":"Williams","firstName":"Elijah","registration":1388691347332,"gender":"M","artist":"Nirvana","song":"The Man Who Sold The World","length":260.98893} {"ts":1409318832332,"userId":"401","sessionId":400,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":3,"location":"Atlanta-Sandy Springs-Roswell, GA","lastName":"Smith","firstName":"Margaux","registration":1406191211332,"gender":"F","artist":"Aventura","song":"La Nina","length":293.56363} {"ts":1409318891332,"userId":"779","sessionId":2136,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":1,"location":"Nashville-Davidson--Murfreesboro--Franklin, TN","lastName":"Townsend","firstName":"Tess","registration":1406970190332,"gender":"F","artist":"Harmonia","song":"Sehr kosmisch","length":655.77751} {"ts":1409318912332,"userId":"521","sessionId":520,"page":"NextSong","auth":"Logged 
In","method":"PUT","status":200,"level":"paid","itemInSession":40,"location":"Chicago-Naperville-Elgin, IL-IN-WI","lastName":"Morse","firstName":"Alan","registration":1401760632332,"gender":"M","artist":"Spragga Benz","song":"Backshot","length":122.53995} {"ts":1409318931332,"userId":"201","sessionId":2047,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":12,"location":"New York-Newark-Jersey City, NY-NJ-PA","lastName":"Watts","firstName":"Liam","registration":1406279422332,"gender":"M","artist":"Bananarama","song":"Love In The First Degree","length":208.92689} {"ts":1409318931332,"userId":"201","sessionId":2047,"page":"Home","auth":"Logged In","method":"GET","status":200,"level":"paid","itemInSession":13,"location":"New York-Newark-Jersey City, NY-NJ-PA","lastName":"Watts","firstName":"Liam","registration":1406279422332,"gender":"M"} {"ts":1409318993332,"userId":"11","sessionId":10,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":10,"location":"Anchorage, AK","lastName":"Thomas","firstName":"Dylann","registration":1400723739332,"gender":"M","artist":"Alliance Ethnik","song":"Repr???sente","length":252.21179} {"ts":1409319034332,"userId":"521","sessionId":520,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":41,"location":"Chicago-Naperville-Elgin, IL-IN-WI","lastName":"Morse","firstName":"Alan","registration":1401760632332,"gender":"M","artist":"Sense Field","song":"Am I A Fool","length":181.86404} {"ts":1409319064332,"userId":"969","sessionId":968,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":1,"location":"Detroit-Warren-Dearborn, MI","lastName":"Williams","firstName":"Elijah","registration":1388691347332,"gender":"M","artist":"Binary Star","song":"Solar Powered","length":268.93016} {"ts":1409319125332,"userId":"401","sessionId":400,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":4,"location":"Atlanta-Sandy Springs-Roswell, GA","lastName":"Smith","firstName":"Margaux","registration":1406191211332,"gender":"F","artist":"Sarah Borges and the Broken Singles","song":"Do It For Free","length":158.95465} {"ts":1409319215332,"userId":"521","sessionId":520,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":42,"location":"Chicago-Naperville-Elgin, IL-IN-WI","lastName":"Morse","firstName":"Alan","registration":1401760632332,"gender":"M","artist":"Incubus","song":"Drive","length":232.46322} {"ts":1409319245332,"userId":"11","sessionId":10,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":11,"location":"Anchorage, AK","lastName":"Thomas","firstName":"Dylann","registration":1400723739332,"gender":"M","artist":"Ella Fitzgerald","song":"On Green Dolphin Street (Medley) (1999 Digital Remaster)","length":427.15383} {"ts":1409319283332,"userId":"401","sessionId":400,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":5,"location":"Atlanta-Sandy Springs-Roswell, GA","lastName":"Smith","firstName":"Margaux","registration":1406191211332,"gender":"F","artist":"10cc","song":"Silly Love","length":241.34485} {"ts":1409319293332,"userId":"906","sessionId":1909,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":0,"location":"Toledo, 
OH","lastName":"Oconnell","firstName":"Aurora","registration":1406406461332,"gender":"F","artist":"Eric Johnson","song":"Trail Of Tears (Album Version)","length":361.37751} {"ts":1409319332332,"userId":"969","sessionId":968,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":2,"location":"Detroit-Warren-Dearborn, MI","lastName":"Williams","firstName":"Elijah","registration":1388691347332,"gender":"M","artist":"Phoenix","song":"Holdin' On Together","length":207.15057} {"ts":1409319365332,"userId":"750","sessionId":749,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":0,"location":"Grants Pass, OR","lastName":"Coleman","firstName":"Alex","registration":1404326435332,"gender":"M","artist":"Ween","song":"The Stallion","length":276.13995} {"ts":1409319447332,"userId":"521","sessionId":520,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":43,"location":"Chicago-Naperville-Elgin, IL-IN-WI","lastName":"Morse","firstName":"Alan","registration":1401760632332,"gender":"M","artist":"dEUS","song":"Secret Hell","length":299.83302} {"ts":1409319539332,"userId":"969","sessionId":968,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":3,"location":"Detroit-Warren-Dearborn, MI","lastName":"Williams","firstName":"Elijah","registration":1388691347332,"gender":"M","artist":"Holly Cole","song":"Cry (If You Want To)","length":158.98077}

    ?
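
    If the download does not work, one alternative (a hypothetical sketch, not from the original walkthrough) is to write the content to ADLS Gen2 directly from a notebook cell with dbutils.fs.put. Only the first two records are shown here for brevity; in practice, paste the full content above, or upload the saved file with a tool such as Azure Storage Explorer.

    Scala

    // Hypothetical alternative: write (part of) the sample data directly to ADLS Gen2.
    // Only two records are included here; paste the full content in practice.
    val sampleJson =
      """{"ts":1409318650332,"userId":"309","sessionId":1879,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"free","itemInSession":2,"location":"Killeen-Temple, TX","lastName":"Montgomery","firstName":"Annalyse","registration":1384448062332,"gender":"F","artist":"El Arrebato","song":"Quiero Quererte Querer","length":234.57914}
        |{"ts":1409318653332,"userId":"11","sessionId":10,"page":"NextSong","auth":"Logged In","method":"PUT","status":200,"level":"paid","itemInSession":9,"location":"Anchorage, AK","lastName":"Thomas","firstName":"Dylann","registration":1400723739332,"gender":"M","artist":"Creedence Clearwater Revival","song":"Born To Move","length":340.87138}""".stripMargin
    dbutils.fs.put(
      "abfss://" + fileSystemName + "@" + storageAccountName + ".dfs.core.chinacloudapi.cn/small_radio_json.json",
      sampleJson,
      true) // overwrite if the file already exists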

Extract data from the Azure Data Lake Storage Gen2 account

  • You can now load the sample JSON file as a dataframe in Azure Databricks. Paste the following code into a new cell, and replace the placeholder values shown in brackets with your own values.

    Scala復制

    val df = spark.read.json("abfss://" + fileSystemName + "@" + storageAccountName + ".dfs.core.chinacloudapi.cn/small_radio_json.json")
  • Press SHIFT + ENTER to run the code in this block.

  • Run the following code to see the contents of the dataframe:

    Scala復制

    df.show()

    You see output similar to the following snippet:

    Output

    +---------------------+---------+---------+------+-------------+----------+---------+-------+--------------------+------+--------+-------------+---------+--------------------+------+-------------+------+
    | artist| auth|firstName|gender|itemInSession| lastName| length| level| location|method| page| registration|sessionId| song|status| ts|userId|
    +---------------------+---------+---------+------+-------------+----------+---------+-------+--------------------+------+--------+-------------+---------+--------------------+------+-------------+------+
    | El Arrebato |Logged In| Annalyse| F| 2|Montgomery|234.57914| free | Killeen-Temple, TX| PUT|NextSong|1384448062332| 1879|Quiero Quererte Q...| 200|1409318650332| 309|
    | Creedence Clearwa...|Logged In| Dylann| M| 9| Thomas|340.87138| paid | Anchorage, AK| PUT|NextSong|1400723739332| 10| Born To Move| 200|1409318653332| 11|
    | Gorillaz |Logged In| Liam| M| 11| Watts|246.17751| paid |New York-Newark-J...| PUT|NextSong|1406279422332| 2047| DARE| 200|1409318685332| 201|
    ...
    ...

    You have now extracted the data from Azure Data Lake Storage Gen2 into Azure Databricks.
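
    Before transforming the data, it can help to confirm which columns and types Spark inferred from the JSON. A quick optional check on the dataframe created above:

    Scala

    // Print the schema Spark inferred from small_radio_json.json
    df.printSchema()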

Transform data in Azure Databricks

    The raw sample data file small_radio_json.json captures the audience of a radio station and has a number of different columns. In this section, you transform the data to retrieve only specific columns from the dataset.

  • First, retrieve only the columns firstName, lastName, gender, location, and level from the dataframe that you created.

    val specificColumnsDf = df.select("firstname", "lastname", "gender", "location", "level")
    specificColumnsDf.show()

    You receive output as shown in the following snippet:

    Output

    +---------+----------+------+--------------------+-----+
    |firstname| lastname|gender| location|level|
    +---------+----------+------+--------------------+-----+
    | Annalyse|Montgomery| F| Killeen-Temple, TX| free|
    | Dylann| Thomas| M| Anchorage, AK| paid|
    | Liam| Watts| M|New York-Newark-J...| paid|
    | Tess| Townsend| F|Nashville-Davidso...| free|
    | Margaux| Smith| F|Atlanta-Sandy Spr...| free|
    | Alan| Morse| M|Chicago-Napervill...| paid|
    |Gabriella| Shelton| F|San Jose-Sunnyval...| free|
    | Elijah| Williams| M|Detroit-Warren-De...| paid|
    | Margaux| Smith| F|Atlanta-Sandy Spr...| free|
    | Tess| Townsend| F|Nashville-Davidso...| free|
    | Alan| Morse| M|Chicago-Napervill...| paid|
    | Liam| Watts| M|New York-Newark-J...| paid|
    | Liam| Watts| M|New York-Newark-J...| paid|
    | Dylann| Thomas| M| Anchorage, AK| paid|
    | Alan| Morse| M|Chicago-Napervill...| paid|
    | Elijah| Williams| M|Detroit-Warren-De...| paid|
    | Margaux| Smith| F|Atlanta-Sandy Spr...| free|
    | Alan| Morse| M|Chicago-Napervill...| paid|
    | Dylann| Thomas| M| Anchorage, AK| paid|
    | Margaux| Smith| F|Atlanta-Sandy Spr...| free|
    +---------+----------+------+--------------------+-----+
  • You can further transform this data to rename the column level to subscription_type.

    Scala復制

    val renamedColumnsDF = specificColumnsDf.withColumnRenamed("level", "subscription_type")
    renamedColumnsDF.show()

    You receive output as shown in the following snippet.

    +---------+----------+------+--------------------+-----------------+
    |firstname| lastname|gender| location|subscription_type|
    +---------+----------+------+--------------------+-----------------+
    | Annalyse|Montgomery| F| Killeen-Temple, TX| free|
    | Dylann| Thomas| M| Anchorage, AK| paid|
    | Liam| Watts| M|New York-Newark-J...| paid|
    | Tess| Townsend| F|Nashville-Davidso...| free|
    | Margaux| Smith| F|Atlanta-Sandy Spr...| free|
    | Alan| Morse| M|Chicago-Napervill...| paid|
    |Gabriella| Shelton| F|San Jose-Sunnyval...| free|
    | Elijah| Williams| M|Detroit-Warren-De...| paid|
    | Margaux| Smith| F|Atlanta-Sandy Spr...| free|
    | Tess| Townsend| F|Nashville-Davidso...| free|
    | Alan| Morse| M|Chicago-Napervill...| paid|
    | Liam| Watts| M|New York-Newark-J...| paid|
    | Liam| Watts| M|New York-Newark-J...| paid|
    | Dylann| Thomas| M| Anchorage, AK| paid|
    | Alan| Morse| M|Chicago-Napervill...| paid|
    | Elijah| Williams| M|Detroit-Warren-De...| paid|
    | Margaux| Smith| F|Atlanta-Sandy Spr...| free|
    | Alan| Morse| M|Chicago-Napervill...| paid|
    | Dylann| Thomas| M| Anchorage, AK| paid|
    | Margaux| Smith| F|Atlanta-Sandy Spr...| free|
    +---------+----------+------+--------------------+-----------------+
Load data into Azure Synapse

    In this section, you upload the transformed data into Azure Synapse. You use the Azure Synapse connector for Azure Databricks to upload a dataframe directly and store it as a table in an Azure Synapse pool.

    As mentioned earlier, the Azure Synapse connector uses Azure Blob storage as temporary staging to move data between Azure Databricks and Azure Synapse. So you start by providing the configuration to connect to the storage account. You must already have created the account as part of the prerequisites for this article.

  • Provide the configuration needed to access the Azure Storage account from Azure Databricks.

    val blobStorage = "<blob-storage-account-name>.blob.core.chinacloudapi.cn"
    val blobContainer = "<blob-container-name>"
    val blobAccessKey = "<access-key>"
  • Specify a temporary folder to use while moving data between Azure Databricks and Azure Synapse.

    val tempDir = "wasbs://" + blobContainer + "@" + blobStorage + "/tempDirs"
  • Run the following snippet to store the Azure Blob storage access key in the configuration, so you don't have to keep the access key in the notebook in plain text.

    val acntInfo = "fs.azure.account.key." + blobStorage
    sc.hadoopConfiguration.set(acntInfo, blobAccessKey)
  • Provide the values needed to connect to the Azure Synapse instance. You must have created an Azure Synapse Analytics service as a prerequisite. Use the fully qualified server name for dwServer, for example <servername>.database.chinacloudapi.cn. An optional connectivity check follows the snippet below.

    //Azure Synapse related settings
    val dwDatabase = "<database-name>"
    val dwServer = "<database-server-name>"
    val dwUser = "<user-name>"
    val dwPass = "<password>"
    val dwJdbcPort = "1433"
    val dwJdbcExtraOptions = "encrypt=true;trustServerCertificate=true;hostNameInCertificate=*.database.chinacloudapi.cn;loginTimeout=30;"
    val sqlDwUrl = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser + ";password=" + dwPass + ";" + dwJdbcExtraOptions
    val sqlDwUrlSmall = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser + ";password=" + dwPass
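
    Optionally, you can verify the connection string before running the bulk load, so that credential or firewall problems surface early. A small sketch that opens and closes a JDBC connection (the Microsoft SQL Server JDBC driver is typically available on Databricks clusters):

    Scala

    // Optional check: open and close a JDBC connection to Azure Synapse
    // before attempting the bulk load.
    import java.sql.DriverManager
    val conn = DriverManager.getConnection(sqlDwUrl)
    try {
      println("Connected to: " + conn.getMetaData.getDatabaseProductName)
    } finally {
      conn.close()
    }
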
  • Run the following snippet to load the transformed dataframe, renamedColumnsDF, and store it as a table in Azure Synapse. This snippet creates a table called SampleTable in the SQL database.

    spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

    renamedColumnsDF.write
      .format("com.databricks.spark.sqldw")
      .option("url", sqlDwUrlSmall)
      .option("dbtable", "SampleTable")
      .option("forward_spark_azure_storage_credentials", "True")
      .option("tempdir", tempDir)
      .mode("overwrite")
      .save()

  • Connect to the SQL database and verify that you see a table named SampleTable.

  • Run a select query to verify the contents of the table. The table should contain the same data as the renamedColumnsDF dataframe. A read-back sketch using the connector follows.
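
    As an alternative to querying from the SQL side, the same connector can read the table back into a dataframe. A sketch, assuming sqlDwUrlSmall and tempDir from the cells above are still defined:

    Scala

    // Read SampleTable back through the Azure Synapse connector to verify the load.
    val verifyDF = spark.read
      .format("com.databricks.spark.sqldw")
      .option("url", sqlDwUrlSmall)
      .option("tempdir", tempDir)
      .option("forward_spark_azure_storage_credentials", "True")
      .option("dbtable", "SampleTable")
      .load()
    verifyDF.show()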

Summary

    You extracted sample data from Azure Data Lake Storage Gen2 into Azure Databricks, transformed it with Spark, and loaded the result into Azure Synapse Analytics through the Azure Synapse connector.
