Kafka
Kafka is a distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
Bruin supports Kafka as a source for ingestr assets, allowing you to ingest data from Kafka into your data warehouse.
To set up a Kafka connection, you need to add a configuration item to the connections section of the .bruin.yml file, complying with the schema shown in Step 2 below. For more information on how to obtain these credentials, see the Kafka section of the Ingestr documentation.
Follow the steps below to correctly set up Kafka as a data source and run ingestion:
Step 1: Create an Asset File for Data Ingestion
To ingest data from Kafka, you need to create an asset configuration file that defines the data flow from the source to the destination. Create a file (e.g., ingestr.kafka.asset.yml) and add the following content:
File: ingestr.kafka.asset.yml
name: public.kafka
type: ingestr
connection: postgres

parameters:
  source_connection: my_kafka
  source_table: 'kafka.my_topic'
  destination: postgres
name: The name of the asset.
type: Specifies the type of the asset. It is always ingestr for Kafka.
connection: The destination connection, i.e., where the ingested data will be stored (the sketch after this list shows a different destination).
parameters:
- source_connection: The name of the Kafka connection defined in .bruin.yml.
- source_table: The Kafka topic you want to ingest, in the format kafka.<topic-name>.
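As an illustration of the connection field, the hypothetical variant below lands the same topic in BigQuery instead of Postgres; the connection name my_bigquery, the asset name, and the destination value are assumptions and would need to match your own setup:

# hypothetical variant: same Kafka source, BigQuery as the destination
name: mydataset.kafka_events
type: ingestr
connection: my_bigquery  # assumes a BigQuery connection with this name exists in .bruin.yml

parameters:
  source_connection: my_kafka
  source_table: 'kafka.my_topic'
  destination: bigquery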
Step 2: Add a Connection to .bruin.yml
The .bruin.yml file stores the connections and secrets used in your pipelines. You need to add a configuration item to the connections section of this file, complying with the following schema.
File: .bruin.yml
connections:
  kafka:
    - name: "my_kafka"
      bootstrap_servers: "localhost:9093"
      group_id: "test123"
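If your cluster requires authentication, ingestr's Kafka source also accepts SASL settings. The sketch below is an assumption mirroring ingestr's Kafka options; check the Kafka section of the Ingestr documentation for the exact field names supported by your Bruin version:

connections:
  kafka:
    - name: "my_kafka"
      bootstrap_servers: "broker.example.com:9093"
      group_id: "test123"
      # assumed fields mirroring ingestr's Kafka options; verify the exact
      # key names in the Ingestr documentation before use
      security_protocol: "SASL_SSL"
      sasl_mechanisms: "PLAIN"
      sasl_username: "my_user"
      sasl_password: "my_password"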
Step 3: Run Asset to Ingest Data
bruin run ingestr.kafka.asset.yml
This command runs the asset and ingests the data from your Kafka topic into the Postgres database defined by the postgres connection.
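To verify the load, you can query the destination table directly. The sketch below uses psql; the host, user, and database flags are assumptions and must match your own Postgres connection settings:

# count the rows ingested into the destination table (adjust flags to your connection)
psql -h localhost -U postgres -d mydb -c "SELECT COUNT(*) FROM public.kafka;"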