Policies

Bruin supports policies to verify that data transformation jobs follow best practices and organisation-wide conventions. In addition to built-in lint rules, Bruin also lets users define custom lint rules in a policy.yml file.

This document explains how to define, configure, and use custom linting policies.

NOTE

For the purpose of this document, a resource means either an asset or a pipeline.

Quick Start

  1. Create a policy.yml file in your project root.
  2. Define custom rules under custom_rules (optional if only using built-in rules).
  3. Group rules into rulesets, using selectors to specify which resources they apply to.

Example:

yaml
rulesets:
  - name: ruleset-1
    selector:
      - path: .*/foo/.*
    rules:
      - asset-has-owner
      - asset-name-is-lowercase
      - asset-has-description

🚀 That's it! Bruin will now lint your assets according to these policies.

To verify that your assets satisfy your policies, you can run:

sh
$ bruin validate /path/to/pipelines

TIP

bruin run normally runs lint before pipeline execution, so any non-compliant resources will be stopped in their tracks.

Rulesets

A ruleset groups one or more rules together and specifies which resources they apply to, based on selectors.

Each ruleset has the following fields:

  • name: A unique name for the ruleset.
  • selector (optional): One or more predicates to select the applicable resources.
  • rules: List of rule names (built-in or custom) to apply.

If a selector is not specified, the ruleset applies to all resources.
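
As a minimal sketch of a ruleset without a selector, the following policy.yml fragment applies two built-in rules to every resource. The ruleset name baseline is a placeholder chosen for illustration.

```yaml
rulesets:
  # No selector: this ruleset applies to all resources.
  - name: baseline
    rules:
      - asset-has-description   # built-in rule, targets assets
      - pipeline-has-retries    # built-in rule, targets pipelines
```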

NOTE

Names must be alphanumeric and may contain dashes (-). This applies to both rulesets and rules.

Selector Predicates

Selectors determine which resources a ruleset should apply to. Supported predicates are:

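| Predicate | Target          | Description                |
|-----------|-----------------|----------------------------|
| path      | asset, pipeline | path of the asset/pipeline |
| pipeline  | asset, pipeline | name of the pipeline       |
| asset     | asset           | name of the asset          |
| tag       | asset           | asset tags                 |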

Each predicate is a regex string.

INFO

If multiple selectors are specified within a ruleset, all selectors must match for the ruleset to apply.

If no selectors are defined for a ruleset, the ruleset applies to all resources. Some selectors only work with certain rule targets. For instance, the tag selector only works for rules that target assets; pipeline-level rules ignore it.

TIP

If your ruleset only contains asset selectors, but uses pipeline rules, then those pipeline rules will apply to all pipelines. Make sure to define a pipeline or path selector if you don't intend for that to happen.

Example

yaml
rulesets:
  - name: production
    selector:
      - path: .*/prod/.*
      - tag: critical
    rules:
      - asset-has-owner
      - asset-name-is-lowercase

In this example:

  • The production ruleset applies only to resources that satisfy both conditions:
    • the path matches the regex .*/prod/.*
    • a tag matches critical.
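
The earlier tip about combining asset selectors with pipeline rules can be sketched as follows. The ruleset name and the pipeline name finance are hypothetical; both rules are built-in.

```yaml
rulesets:
  - name: critical-finance
    selector:
      - tag: critical       # asset-only predicate; pipeline rules ignore it
      - pipeline: finance   # hypothetical pipeline name; also restricts pipeline rules
    rules:
      - asset-has-owner       # checked on assets tagged critical in the finance pipeline
      - pipeline-has-retries  # checked only on pipelines matching finance
```

Without the pipeline predicate, pipeline-has-retries would be checked against every pipeline, because the tag predicate has no effect on pipeline-level rules.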

Custom Rules

Custom lint rules are defined inside the custom_rules section of policy.yml.

Each rule must include:

  • name: A unique name for the rule.
  • description: A human-readable description of the rule.
  • criteria: An expr boolean expression. If the expression evaluates to true, the resource passes validation.

Example

yaml
custom_rules:
  - name: asset-has-owner
    description: every asset should have an owner
    criteria: asset.Owner != ""

Targets

Custom rules can have an optional target attribute that defines what resource the rule acts on. Valid values are:

  • asset (default)
  • pipeline

Example

yaml
custom_rules:

  - name: pipeline-must-have-prefix-acme
    description: Pipeline names must start with the prefix 'acme'
    criteria: pipeline.Name startsWith 'acme'
    target: pipeline

  - name: asset-name-must-be-layer-dot-schema-dot-table
    description: Asset names must be of the form {layer}.{schema}.{table}
    criteria: len(split(asset.Name, '.')) == 3
    target: asset # optional

rulesets:
  - name: std
    rules:
      - pipeline-must-have-prefix-acme
      - asset-name-must-be-layer-dot-schema-dot-table

Variables

criteria has the following variables available for use in your expressions:

| Name     | Target          |
|----------|-----------------|
| asset    | asset           |
| pipeline | asset, pipeline |
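
Because asset rules can read both variables, a single criterion can compare an asset against its pipeline. The sketch below assumes a hypothetical convention in which asset names begin with the pipeline name. It uses only the asset.Name and pipeline.Name fields that appear elsewhere in this document; the rule and ruleset names are illustrative.

```yaml
custom_rules:
  # Hypothetical convention: asset names begin with the name of their pipeline.
  - name: asset-name-starts-with-pipeline-name
    description: Asset names must start with the name of their pipeline
    criteria: asset.Name startsWith pipeline.Name
    target: asset  # asset rules can use both the asset and pipeline variables

rulesets:
  - name: naming
    rules:
      - asset-name-starts-with-pipeline-name
```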

WARNING

The variables exposed here are the underlying Go structs, so check the latest version of these structs for the fields that are available.

In the future we will create dedicated schemas for custom rules with standards around them.

Built-in Rules

Bruin provides a set of built-in lint rules that are ready to use without requiring a definition.

| Rule | Target | Description |
|------|--------|-------------|
| asset-name-is-lowercase | asset | Asset names must be in lowercase. |
| asset-name-is-schema-dot-table | asset | Asset names must follow the format schema.table. |
| asset-has-description | asset | Assets must have a description. |
| asset-has-owner | asset | Assets must have an owner assigned. |
| asset-has-columns | asset | Assets must define their columns. |
| asset-has-primary-key | asset | Assets must define at least one column as a primary key. |
| asset-has-checks | asset | Assets must have at least one check (column or custom_checks). |
| asset-has-tags | asset | Assets must have at least one tag. |
| column-has-description | asset | All columns declared by an asset must have a description. |
| column-name-is-snake-case | asset | Column names must be in snake_case. |
| column-name-is-camel-case | asset | Column names must be in camelCase. |
| column-type-is-valid-for-platform | asset | Column types declared by an asset must be valid types on the relevant platform (BigQuery and Snowflake only). |
| description-must-not-be-placeholder | asset | Asset and column descriptions must not contain placeholder strings. |
| asset-has-no-cross-pipeline-dependencies | asset | Assets must not depend on assets in other pipelines. |
| pipeline-has-notifications | pipeline | Pipelines must declare at least one notification channel. |
| pipeline-has-retries | pipeline | Pipelines must have retries > 0. |
| pipeline-has-start-date | pipeline | Pipelines must have a `start_date`. |
| pipeline-has-metadata-push | pipeline | Pipelines must push their metadata. |

You can directly reference these rules in rulesets[*].rules.
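
For instance, built-in rules can be combined per directory without declaring any custom_rules. In this sketch the ruleset names and path patterns are placeholders, not paths that Bruin expects.

```yaml
rulesets:
  # Hypothetical layout: warehouse assets use snake_case columns.
  - name: warehouse-columns
    selector:
      - path: .*/warehouse/.*
    rules:
      - asset-has-columns
      - column-name-is-snake-case
  # Hypothetical layout: exported assets use camelCase columns.
  - name: exports-columns
    selector:
      - path: .*/exports/.*
    rules:
      - asset-has-columns
      - column-name-is-camel-case
```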

Full Example

yaml
custom_rules:
  - name: asset-has-owner
    description: every asset should have an owner
    criteria: asset.Owner != ""

rulesets:
  - name: production
    selector:
      - path: .*/production/.*
      - tag: critical
    rules:
      - asset-has-owner
      - asset-name-is-lowercase
      - asset-has-description
  - name: staging
    selector:
      - asset: stage.*
      - pipeline: staging
    rules:
      - asset-name-is-lowercase

Further Reading