Common Table Expressions (CTEs):
A CTE, or Common Table Expression, is a temporary result set in SQL that can be referenced within a SELECT
, INSERT
, UPDATE
, or DELETE
statement. It's introduced with the WITH
keyword.
CTEs are used for breaking down complex SQL queries into simpler parts, making the query more readable and maintainable.
A CTE is similar to a derived table or a subquery but provides better readability and can be referenced multiple times in the main query.
The scope of a CTE is limited to the query in which it's defined. It's not stored as an object in the database and doesn't persist beyond the execution of the query.
Example:
WITH cte_name AS (
SELECT column1, column2
FROM table_name
WHERE condition
)
SELECT column1, column2
FROM cte_name;
dbt init
command. This sets up the directory structure and configuration files required for dbt.profiles.yml
file needs to be set up with connection details to your database. This file tells dbt how to connect to your data warehouse or database.models
directory of your dbt project. This directory is created when you initialize a dbt project.ref()
function, those referenced models should already exist in your dbt project.dbt deps
command before running your models.tests
directory and use the dbt test
command to run them.Once these prerequisites are met, you can start writing your dbt models, run them using the dbt run
command, and see the transformed data in your database.
In dbt, Jinja is a templating engine that allows for dynamic SQL generation, while macros are reusable pieces of SQL code that can be invoked in multiple places. Both Jinja and macros are essential components of dbt's functionality, enabling more modular and dynamic SQL code.
Example: Let's say you often need to format dates in a specific way in multiple models. Instead of writing the formatting SQL in each model, you can define a macro:
-- macros/date_formatting.sql
{% macro format_date(column_name) %}
TO_CHAR({{ column_name }}, 'YYYY-MM-DD')
{% endmacro %}
Then, in your models, you can use this macro:
SELECT
{{ format_date('created_at') }} as formatted_date,
...
FROM some_table
film_ratings.sql
:Create a new file in the macros/
directory of your dbt project, named film_ratings_macro.sql
: