Ingestr Assets

Ingestr is a CLI tool that allows you to easily move data between platforms. Bruin supports ingestr natively as an asset type.

Using Ingestr, you can move data from:

  • your production databases like:
    • MSSQL
    • MySQL
    • Oracle
  • your daily tools like:
    • Notion
    • Google Sheets
    • Airtable
  • other platforms such as:
    • HubSpot
    • Salesforce
    • Google Analytics
    • Facebook Ads
    • Google Ads

to your data warehouses:

  • Google BigQuery
  • Snowflake
  • AWS Redshift
  • Azure Synapse
  • Postgres

INFO

You can read more about the capabilities of ingestr in its documentation.

Asset Structure

yaml
name: string
type: ingestr
connection: string # optional; defaults to the destination platform's default connection in pipeline.yml
parameters:
  source: string # optional; set it when the source cannot be inferred from the connection alone, e.g. a GCP connection with a Google Sheets source
  source_connection: string
  source_table: string
  destination: bigquery | snowflake | redshift | synapse
  
  # optional
  incremental_strategy: replace | append | merge | delete+insert
  incremental_key: string
  sql_backend: pyarrow | sqlalchemy
  loader_file_format: jsonl | csv | parquet

Examples

The examples below show how to use the ingestr asset type in your pipeline. Adapt them to your needs.

Copy a table from MySQL to BigQuery

yaml
name: raw.transactions
type: ingestr
parameters:
  source_connection: mysql_prod
  source_table: public.transactions
  destination: bigquery

Copy a table from Microsoft SQL Server to Snowflake incrementally

This example uses the updated_at column to load data incrementally from Microsoft SQL Server into Snowflake.

yaml
name: raw.transactions
type: ingestr
parameters:
  source_connection: mssql_prod
  source_table: dbo.transactions
  destination: snowflake
  incremental_strategy: append
  incremental_key: updated_at
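
The same asset could use a different strategy from the list in Asset Structure. The following is a minimal sketch, not a tested configuration: it swaps append for delete+insert, which is meant for cases where rows sharing an incremental_key value should be replaced in the destination rather than appended again. The connection name mssql_prod is a hypothetical placeholder for a Microsoft SQL Server connection in your setup; check the ingestr documentation for the exact semantics of each strategy.

```yaml
name: raw.transactions
type: ingestr
parameters:
  source_connection: mssql_prod # hypothetical connection name
  source_table: dbo.transactions
  destination: snowflake
  incremental_strategy: delete+insert # any value from the Asset Structure list
  incremental_key: updated_at
```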

Copy data from Google Sheets to Snowflake

This example shows how to copy data from Google Sheets into your Snowflake database.

yaml
name: raw.manual_orders
type: ingestr
parameters:
  source: gsheets
  source_connection: gcp-default
  source_table: <mysheetid>.<sheetname>
  destination: snowflake
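
Set the optional parameters explicitly

This sketch fills in the optional fields listed under Asset Structure to show where each one goes. It is illustrative only: the connection names bigquery-default and mysql_prod are hypothetical, and not every combination of sql_backend and loader_file_format is necessarily supported by every source and destination, so verify the values against the ingestr documentation before relying on them.

```yaml
name: raw.transactions
type: ingestr
connection: bigquery-default # hypothetical; omit to use the default BigQuery connection from pipeline.yml
parameters:
  source_connection: mysql_prod # hypothetical connection name
  source_table: public.transactions
  destination: bigquery

  # optional
  incremental_strategy: replace
  sql_backend: pyarrow
  loader_file_format: parquet
```

Leaving the optional block out entirely, as in the first example, keeps the tool's defaults.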