Data Engineering - Deep Learning Nerds | The ultimate Learning Platform for AI and Data Science

58 posts

Academy Membership, dbt, dbt Analytics Engineering Certification

Set up a new dbt Project from Scratch: A Beginner’s Guide

Introduction: Want to start with dbt Core but don’t know where to begin? Don’t worry! In this tutorial, we'll walk through setting up a new dbt project from scratch. We cover the entire process, from creating a virtual environment to initializing your project and verifying...

by Data Engineer

Academy Membership, dbt, dbt Analytics Engineering Certification

Connect dbt to DuckDB: A Step-by-Step Guide

Introduction: Modern data work doesn’t always require complex infrastructure. If you're looking for a fast, local-first analytics stack, dbt + DuckDB is a powerful combination. In this tutorial, we'll walk through connecting dbt to DuckDB, a setup well suited to analysts, developers, and data enthusiasts who want simplicity without...

by Data Engineer

Academy Membership, dbt, dbt Analytics Engineering Certification

Defining configurations in dbt_project.yml in dbt

Introduction: In dbt, the dbt_project.yml file is the backbone of your project configuration. It controls how dbt behaves, where it looks for different resource types, and how your models are materialized and organized. In this tutorial, we'll walk through a sample dbt_project.yml file and...

by Data Engineer

Academy Membership, PySpark, Python

PySpark - Get statistical Properties of a DataFrame

Introduction: When working with PySpark DataFrames, understanding the statistical properties of your data is crucial for data exploration and preprocessing. PySpark provides the describe() and summary() DataFrame methods to generate useful summary statistics. In this tutorial, we’ll explore how to use both methods to get insights into our dataset. 📥 Import...

by Data Engineer

Academy Membership, dlt, Data Engineering

Install and use dlt (data load tool)

Introduction: dlt (data load tool) is a powerful Python package that simplifies data ingestion and helps you build efficient data pipelines. In the Extract, Load, Transform (ELT) process, dlt is particularly suited for the Extract (E) and Load (L) stages. In this tutorial, we'll guide you through the...

by Data Engineer

Academy Membership, dbt, dbt Analytics Engineering Certification

Understanding the dbt build command: How it works and when to use it

Introduction: In dbt, one of the most essential commands is dbt build. In this tutorial, we’ll dive into the dbt build command, exploring its syntax, functionality, and practical usage. Since this topic is relevant for the dbt Analytics Engineering Certification Exam, this guide will be a valuable resource on...

by Data Engineer

Academy Membership, Microsoft Fabric, DP-600

Create a Table in a Warehouse in Microsoft Fabric

Introduction: Microsoft Fabric provides a powerful platform for managing and analyzing data efficiently. One essential capability within a Fabric warehouse is the ability to create tables, which form the foundation for structuring and storing data. Understanding how to define and create tables is crucial for managing datasets, running analytical queries,...

by Data Engineer

Academy Membership, Microsoft Fabric, DP-600

Create a Schema in a Warehouse in Microsoft Fabric

Introduction: Microsoft Fabric provides a robust platform for data management and analytics. One fundamental aspect of managing a Fabric warehouse is the ability to create schemas. Schemas help organize tables and other database objects within a warehouse, improving structure, security, and manageability. For those preparing for the DP-600 certification exam,...

by DevOps Engineer

Academy Membership, Microsoft Fabric, DP-600

Create a Stored Procedure in a Warehouse in Microsoft Fabric

Introduction: Microsoft Fabric provides a robust environment for managing and transforming data within a data warehouse. One of its powerful features is the ability to create stored procedures, which encapsulate SQL logic that can be reused across multiple operations. Stored procedures simplify complex queries, automate repetitive tasks, and...

by Data Engineer