Introduction

In this tutorial, we will walk through the steps required to write a PySpark DataFrame to a Delta table in Microsoft Fabric. We'll start by creating an example DataFrame, then demonstrate how to write it to a Delta table.

Goal

An example PySpark DataFrame should be created and written to a Delta table in a lakehouse.

Prerequisites

Before you begin, ensure you have the following prerequisites:

☑️ Notebook created

We have already created the notebook "dlnerds_notebook". If you want to know how to create a notebook, check out the following post:

How to create a Notebook in Microsoft Fabric: A Step-by-Step Guide

☑️ Lakehouse created

We have already created the lakehouse "dlnerds_lakehouse". If you want to know how to create a lakehouse, check out the following post:

How to create a Lakehouse in Microsoft Fabric: A Step-by-Step Guide

☑️ Lakehouse and Notebook connected

We have already established a connection between the notebook "dlnerds_notebook" and the lakehouse "dlnerds_lakehouse". If you want to know how to add a lakehouse to a notebook, check out the following post:

How to connect a Lakehouse and a Notebook in Microsoft Fabric

Step 1: Open Notebook

First, open the notebook you created.

Step 2: Create PySpark DataFrame

Next, create a PySpark DataFrame from a list of example data. To do this, use the createDataFrame() method of the SparkSession, which is available in Fabric notebooks as the built-in spark object.

# Example data: one row per (language, framework, users)
data = [
    ("Python", "Django", 20000),
    ("Python", "FastAPI", 9000),
    ("Java", "Spring", 7000),
    ("JavaScript", "ReactJS", 5000)
]
column_names = ["language", "framework", "users"]

# Create the DataFrame and render it in the notebook output
df = spark.createDataFrame(data, column_names)
display(df)

Step 3: Write PySpark DataFrame to Delta Table

Let's write the PySpark DataFrame to a Delta table named "framework" in the lakehouse "dlnerds_lakehouse".
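
A minimal sketch of this write, assuming the notebook is attached to "dlnerds_lakehouse" as its default lakehouse (so the table is created there) and using the standard PySpark DataFrameWriter API:

# Write the DataFrame as a managed Delta table named "framework".
# Assumes "dlnerds_lakehouse" is the notebook's default lakehouse.
df.write.format("delta").mode("overwrite").saveAsTable("framework")

# Optional check: read the table back and display it
display(spark.read.table("framework"))

With mode("overwrite"), rerunning the cell replaces the table's contents; use mode("append") instead if you want to add rows to an existing table. After the write, the "framework" table should appear under Tables in the lakehouse explorer.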
