Skip to content

run Command

This command is used to execute a Bruin pipeline or a specific asset within a pipeline.

  • You can run the pipeline from the current directory or a specific path to the pipeline/task definition.
  • If you don't specify a path, Bruin will run the pipeline from the current directory.
  • If you specify a path, Bruin will run the pipeline/asset from the directory of the file.
    • Bruin will try to infer if the given path is a pipeline or an asset and will run accordingly.
  • You can give specific start and end dates to run the pipeline/asset for a specific range.
  • You can limit the types of tasks to run by using the --only flag.
    • E.g. only run the quality checks: bruin run --only checks
bash
bruin run [FLAGS] [optional path to the pipeline/asset]
Bruin - init ## Flags
FlagTypeDefaultDescription
--downstreamboolfalseRun all downstream tasks as well.
--start-datestrBeginning of yesterdayThe start date of the range the pipeline will run for. Format: YYYY-MM-DD, YYYY-MM-DD HH:MM:SS, or YYYY-MM-DD HH:MM:SS.ffffff
--end-datestrEnd of yesterdayThe end date of the range the pipeline will run for. Format: YYYY-MM-DD, YYYY-MM-DD HH:MM:SS, or YYYY-MM-DD HH:MM:SS.ffffff
--environmentstr-The environment to use.
--forceboolfalseDo not ask for confirmation in a production environment.
--full-refreshboolfalseTruncate the table before running.
--no-log-fileboolfalseDo not create a log file for this run.
--only[]strmain, checksLimit the types of tasks to run. Options: main, checks, push-metadata.
--push-metadataboolfalsePush metadata to the destination database if supported (currently BigQuery).
--tagstr-Pick assets with the given tag.
--workersint16Number of workers to run tasks in parallel.
--continueboolfalseContinue from the last failed asset.

Continue from the last failed asset

If you want to continue from the last failed task, you can use the --continue flag. This will run the pipeline/asset from the last failed task. Bruin will automatically retrive all the flags used in the last run.

bash
bruin run --continue

NOTE

This will only work if the pipeline structure is not changed. If the pipeline structure has changed in any way, including asset dependencies, you will need to run the pipeline/asset from the beginning. This is to ensure that the pipeline/asset is run in the correct order.

Focused Runs: Filtering by Tags and Task Types

As detailed in the flag section above, the --tag, --downstream, and --only flags provide powerful ways to filter and control which tasks in your pipeline are executed. These flags can also be combined to fine-tune pipeline runs, allowing you to execute specific subsets of tasks based on tags, include their downstream dependencies, and restrict execution to certain task types.

Let’s explore how combining these flags enables highly targeted pipeline execution scenarios:

Combining Tags and Downstream

Using --tag with --downstream filters the assets by tag and includes all downstream dependencies for those tagged assets. For example:

bash
bruin run --tag important_tag --downstream

This will run all tasks of the assets tagged with important_tag and include all downstream tasks that depend on them.

Combining Tags and Task Types

Using --tag with --only restricts the tasks to specific types for the assets filtered by the given tag. For example:

bash
bruin run --tag quality_tag --only checks

This runs only the checks tasks for the assets tagged with quality_tag while excluding other task types.

Combining Tags, Downstream, and Task Types

Adding --only to the mix with --tag and --downstream allows you to specify the types of tasks to run for the tagged assets and their downstream dependencies. For example:

bash
bruin run --tag critical_tag --downstream --only main

This command runs only the main tasks for the assets tagged with critical_tag and their downstream dependencies.

Examples

Run the pipeline from the current directory:

bash
bruin run

Run the pipeline from a file:

bash
bruin run ./pipelines/project1/pipeline.yml

Run a specific asset:

bash
bruin run ./pipelines/project1/assets/my_asset.sql

Run the pipeline with a specific environment:

bash
bruin run --environment dev

Run the pipeline with a specific start and end date:

bash
bruin run --start-date 2024-01-01 --end-date 2024-01-31

Run the assets in the pipeline that contain a specific tag:

bash
bruin run --tag my_tag

Run only the quality checks:

bash
bruin run --only checks

Run only the main tasks and not the quality checks:

bash
bruin run --only main

Metadata Push

Metadata push is a feature that allows you to push metadata to the destination database/data catalog if supported. Currently, we support BigQuery as the catalog.

There are two ways to push metadata:

  1. You can set the --push-metadata flag to true when running the pipeline/asset.
  2. You can fill out the metadata_push dictionary in the pipeline/asset definition.
yaml
# pipeline.yml
name: bruin-init
schedule: daily
start_date: "2024-09-01"

default_connections:
   google_cloud_platform: "my-gcp-connection"

metadata_push:
  bigquery: true

When pushing the metadata, Bruin will detect the right connection to use, same way as it happens with running the asset.