In this 1-hour, project-based course, you will learn how to implement PolyBase in Azure Synapse SQL Pool.
PolyBase, in simple terms, is a feature of Azure SQL Pool that gives you a SQL interface to files stored in Azure Data Lake Storage, Azure Blob Storage, or HDFS. In other words, you can run SQL queries directly against the files that contain the data. The source we use to implement PolyBase in this project is a text file stored in Azure Data Lake Storage Gen2 (ADLS Gen2).

Prerequisites:
1. Azure subscription account
2. Basic understanding of Azure SQL Pool and Synapse Analytics
3. Basic understanding of T-SQL queries

Here is a brief description of the tasks we are going to perform in this project:

Task 1: Create Azure Data Lake Storage Gen2
In this task, we create the ADLS account that will hold the source file (Customer.txt), which we will eventually read through SQL queries.

Task 2: Create the source file and upload it to the ADLS container
In this task, we create a sample comma-delimited text file and upload it to the container created in the ADLS account. Tasks 1 and 2 together prepare our source.

Task 3: Create Azure SQL Pool
In this task, we create the Azure SQL Pool and the Azure Synapse Workspace. PolyBase is a feature of Azure SQL Pool, so we need to create this service along with the Synapse Workspace account.

Task 4: Configure PolyBase
By this point, the earlier tasks have created all the resources needed to configure and implement PolyBase. In this task, we configure PolyBase itself.

Task 5: PolyBase in action
In this task, we see PolyBase in action: we execute SQL queries against the Customer.txt file stored in the ADLS account and retrieve the data.
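To give a sense of what Tasks 4 and 5 involve, here is a minimal T-SQL sketch of a typical PolyBase setup in the SQL Pool. It is an illustration under stated assumptions, not necessarily the exact script used in the course: the object names (AdlsCredential, AdlsDataSource, CommaDelimitedText, dbo.CustomerExternal), the column list, the header-row setting, and every value in angle brackets are hypothetical placeholders. It also assumes authentication with the storage account access key.

```sql
-- Assumption: all object names, columns, and <placeholders> below are
-- illustrative; replace them with the values from your own environment.

-- 1. A master key protects the secret stored in the credential.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong-password>';

-- 2. Credential holding the storage account access key.
--    With key-based authentication the IDENTITY value is not used for
--    authentication, so any string is accepted.
CREATE DATABASE SCOPED CREDENTIAL AdlsCredential
WITH IDENTITY = 'user',
     SECRET   = '<storage-account-access-key>';

-- 3. External data source pointing at the ADLS Gen2 container.
CREATE EXTERNAL DATA SOURCE AdlsDataSource
WITH (
    TYPE       = HADOOP,
    LOCATION   = 'abfss://<container>@<storage-account>.dfs.core.windows.net',
    CREDENTIAL = AdlsCredential
);

-- 4. File format describing a comma-delimited text file.
--    FIRST_ROW = 2 assumes the file starts with a header row;
--    drop the option if Customer.txt has no header.
CREATE EXTERNAL FILE FORMAT CommaDelimitedText
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2)
);

-- 5. External table mapped onto Customer.txt.
--    The columns are hypothetical and must match the actual file layout.
CREATE EXTERNAL TABLE dbo.CustomerExternal (
    CustomerId   INT,
    CustomerName VARCHAR(100),
    City         VARCHAR(100)
)
WITH (
    LOCATION    = '/Customer.txt',
    DATA_SOURCE = AdlsDataSource,
    FILE_FORMAT = CommaDelimitedText
);
```

Once the external table exists, the file can be queried like any other table, which is the essence of Task 5:

```sql
-- Reads rows straight from Customer.txt in the ADLS container.
SELECT TOP 10 *
FROM dbo.CustomerExternal;
```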