Skip to content

Templating

Bruin supports Jinja as its templating language for SQL assets. This allows you to write dynamic SQL queries that can be parameterized with variables. This is useful when you want to write a query that is parameterized by a date, a user ID, or any other variable.

The following is an example SQL asset that uses Jinja templating for different start_date and end_date parameters:

sql
SELECT * FROM my_table WHERE dt BETWEEN '{{ start_date }}' AND '{{ end_date }}'

Since start_date and end_date parameters are automatically passed to your assets by Bruin, this allows the same SQL asset definition to be used both as your regular execution, e.g. daily or hourly, as well as backfilling a longer period of time.

You can do more complex stuff such as looping over a list of values, or using conditional logic. Here's an example of a SQL asset that loops over a list of days and dynamically generates column names.

Example

pipeline.yaml

yaml
name: sql-pipeline
variables:
  days:
    type: array
    default: [1, 3, 7, 15, 30, 90]

asset.sql

sql
SELECT
    conversion_date,
    cohort_id,
    {% for day_n in var.days %}
    SUM(IFF(days_since_install < {{ day_n }}, revenue, 0)) 
    AS revenue_{{ day_n }}_days
    {% if not loop.last %},{% endif %}
    {% endfor %}
FROM user_cohorts
GROUP BY 1,2

This will render into the following SQL query:

sql
SELECT
    conversion_date,
    cohort_id,
    SUM(IFF(days_since_install < 1, revenue, 0)) AS revenue_1_days,
    SUM(IFF(days_since_install < 3, revenue, 0)) AS revenue_3_days,
    SUM(IFF(days_since_install < 7, revenue, 0)) AS revenue_7_days,
    SUM(IFF(days_since_install < 15, revenue, 0)) AS revenue_15_days,
    SUM(IFF(days_since_install < 30, revenue, 0)) AS revenue_30_days,
    SUM(IFF(days_since_install < 90, revenue, 0)) AS revenue_90_days
FROM user_cohorts
GROUP BY 1,2

You can read more about Jinja here.

Adding variables

You can add variables in your pipeline.yml file. We support all YAML data types to give you the maximum flexibility in your variable configuration. variables are declared as JSON Schema object. Here's a comprehensive example:

yaml
name: var-pipeline
variables:
  users:
    type: array
    items:
      type: string
    default: ["jhon", "nick"]
  env:
    type: string
    default: dev
  tags:
    type: object
    properties:
      team:
        type: string
      tenant:
        type: string
    default:
      team: data
      tenant: acme

All user-defined variables are accessible via the var namespace. For example, if you define a variable called src, it will be available as {{ var.src }} in your assets.

Additionally all top level variables must define a default value. This will be used to render your assets in absence of values supplied on the commandline.

Overriding Variables

You can override variables defined in your pipeline.yml file during a bruin run execution by using the --var flag. This allows you to dynamically adjust variable values without modifying the pipeline configuration file.

Bruin supports two different syntax for overriding variables:

sh
# key value form; value must be valid JSON
$ bruin run --var key=value

# json form
$ bruin run --var '{"key": "value"}'

The --var flag can be specified multiple times. However, beware that if you specify the same key multiple times, the latter one will overwrite the previous one.

Builtin variables

Bruin injects various variables by default:

VariableDescriptionExample
start_dateThe start date in YYYY-MM-DD format"2023-12-01"
start_datetimeThe start date and time in YYYY-MM-DDThh:mm:ss format"2023-12-01T15:30:00"
start_timestampThe start timestamp in RFC3339 format"2023-12-01T15:30:00.000000+07:00"
end_dateThe end date in YYYY-MM-DD format"2023-12-02"
end_datetimeThe end date and time in YYYY-MM-DDThh:mm:ss format"2023-12-02T15:30:00"
end_timestampThe end timestamp in RFC3339 format"2023-12-02T15:30:00.000000Z07:00"
pipelineThe name of the currently executing pipelinemy_pipeline
run_idThe unique identifier for the current pipeline runrun_1234567890

You can use these variables in your SQL queries by referencing them with the syntax:

sql
SELECT * FROM my_table 
WHERE dt BETWEEN '{{ start_date }}' AND '{{ end_date }}'

NOTE

Date-related variables are passed in as strings, and they will be driven by the flags given to the bruin run command, read more on that here.

You can modify these variables with the use of filters.