Create a Sync from ClickHouse to Batch Profile Events

Before you start

To create a ClickHouse → Batch Profile Events sync, you'll need:

  • Access to the Batch dashboard

  • A ClickHouse table or view containing one row per event

  • Your ClickHouse credentials (Host, Port, Database, Username, Password)

  • A table (or view) that follows the Cloud Sync events input format (see below)


1) Prepare your ClickHouse table or view

Cloud Sync expects your ClickHouse source (table or view) to include:

  1. A profile identifier (to know which profile the event belongs to)

  2. An event name (the type of event being recorded)

  3. A cursor field (to know which rows are new since the last run)

  4. Any number of event attribute columns (sent to Batch as event properties)


1.1 One row per event

Your source must contain one row per event. Unlike profile attribute syncs, multiple rows can share the same custom_id — each row creates a separate event on the corresponding profile.


1.2 Required columns

Your table (or view) must include the following columns:

Column
Required
Description

custom_id

The identifier of the profile the event belongs to

event_name

The name of the event (e.g. add_to_cart, purchase)

last_updated_at

Cursor used for incremental sync

circle-exclamation

1.3 Optional columns

Column
Description

event_time

The timestamp of the event, in RFC 3339 UTC format (e.g. 2026-04-17T04:25:00Z). If omitted, Batch uses the time of reception.


1.4 Event attribute naming rules (ClickHouse-compatible)

ClickHouse column names do not support characters like $, (, or ). Cloud Sync relies on prefixes in column names to represent typed event attributes.

Supported prefixes

Prefix
Meaning
Example

date__

Date attribute

date__purchased_at

url__

URL attribute

url__item_url

Any column without a prefix is sent to Batch as a standard string, number, or boolean attribute.


1.5 Handling arrays

Array event attributes must be passed as a String column containing a stringified JSON array. Native ClickHouse Array() types are not supported.

Example:


1.6 Handling objects

Object event attributes must also be stringified as a JSON object before being passed to Cloud Sync. The connector parses the string and sends it in the correct format to the Profile API.

Example:


1.7 Example schema (e-commerce)

How this maps in Batch:

  • custom_id identifies the profile the event is attached to

  • event_name sets the event type

  • event_time sets the event timestamp (falls back to reception time if omitted)

  • item, quantity, price become standard event attributes

  • url__item_url is interpreted as a URL attribute

  • date__purchased_at is interpreted as a date attribute

  • product_details is interpreted as an object attribute


1.8 Using a View

If your raw event table doesn't match the expected naming or format, create a ClickHouse View that converts your schema into the correct conventions.

This approach lets you:

  • rename fields with the correct prefixes (date__, url__)

  • set a reliable last_updated_at based on insertion time

  • stringify array and object columns correctly

  • format event_time to RFC 3339 UTC


1.9 Handling nulls

If an optional column value is NULL, that attribute is simply omitted from the event. It does not affect other attributes on the same event or on the profile.


2) Create the Sync in the Batch dashboard

Cloud Sync is configured from the dashboard via a dedicated Sync module.

  1. Open the Batch dashboard

  2. Go to Data → Cloud Sync

  3. Click Create Sync

  4. Select ClickHouse as the source


2.1 Configure your ClickHouse connection

Enter the following fields:

Field
Description

Host

Your ClickHouse server hostname (e.g. your-host.example.com)

Port

ClickHouse HTTP(S) port — defaults to 8123

Database

The database containing your source table/view (default: default)

Username

Your ClickHouse username

Password

Your ClickHouse password

SSL

Toggle on to use a secure connection (recommended for production)

Table

The table or view name

Batch validates the connection before continuing.


2.2 Select the destination

In the Destination dropdown, select Batch > Profile events.


2.3 Configure event mapping

Cloud Sync applies a simple mapping model:

  • custom_id → identifies which Batch profile to attach the event to

  • event_name → sets the event type

  • last_updated_at → used only for incremental sync logic

  • all other columns → mapped to event attributes


3) How incremental sync works

Cloud Sync uses incremental processing, which means it does not re-import your full dataset at every run. Instead, it fetches only the rows that are new since the last successful sync.


3.1 The last_updated_at cursor

Batch stores the last successful cursor value internally.

At each run, Batch fetches only rows where:

  • last_updated_at is greater than the last stored cursor

This makes sync runs faster, more scalable, and more cost-efficient.

circle-exclamation

3.2 Inserts and deletes

Incremental syncs capture:

  • ✅ new event rows (inserts)

They do not capture:

  • ❌ deletions — events sent to Batch are immutable; they cannot be removed via Cloud Sync


3.3 Best practices for reliable incremental syncs

To avoid missing events:

  • Set last_updated_at at insertion time and never modify it afterwards

  • Use a View if you need to rename columns or apply type conversions

  • Use ReplacingMergeTree and order by last_updated_at for large tables


4) Test and enable your Sync

Before enabling the schedule:

  1. Run a test sync

  2. Verify:

    • Events are created on the correct profiles

    • event_time is set correctly (or defaults to reception time as expected)

    • date__ and url__ fields are interpreted correctly

    • Array and object attributes are correctly stringified

Once enabled, Batch automatically handles batching and retries.

Last updated