# Create a Sync from Snowflake to Batch Profile Attributes

## Create a Sync from Snowflake to Batch Profile Attributes

### Before you start

To create a Snowflake → Batch sync, you'll need:

* Access to the **Batch dashboard**
* A **Snowflake table or view** containing one row per profile
* Your **Snowflake credentials** (Username & Password, or Key Pair)
* Your **Host, Warehouse, Role, Database, and Table/View** details
* A table (or view) that follows the **Cloud Sync input format** (see below)

***

### 1) Prepare your Snowflake table or view

Cloud Sync expects your Snowflake source (table or view) to include:

1. A **profile identifier** (to know which profile to update)
2. A **cursor field** (to know what changed since the last run)
3. Any number of **attribute columns** (sent to Batch as profile attributes)

***

#### 1.1 One row per profile

Your source must contain **one row per profile**.\
Each row is interpreted as an update to a single Batch profile.

***

#### 1.2 Required columns

Your table (or view) must include:

| Column            | Required | Description                      |
| ----------------- | :------: | -------------------------------- |
| `custom_id`       |     ✅    | The profile identifier in Batch  |
| `last_updated_at` |     ✅    | Cursor used for incremental sync |

{% hint style="warning" %}
`last_updated_at` must be updated **every time any synced attribute changes**, otherwise updates may not be picked up by the next run.
{% endhint %}

***

#### 1.3 Attribute naming rules (Snowflake-compatible)

Cloud Sync reads Snowflake columns and converts them into Batch profile attributes.

Snowflake column names do not support characters like `$`, `(`, or `)`. Cloud Sync relies on **prefixes** in column names to represent typed or native fields.

**Supported prefixes**

| Prefix    | Meaning                                   | Example                |
| --------- | ----------------------------------------- | ---------------------- |
| `date__`  | Date attribute                            | `date__birthday`       |
| `url__`   | URL attribute                             | `url__avatar`          |
| `batch__` | Native profile fields (instead of `$...`) | `batch__email_address` |

***

#### 1.4 Handling arrays

Snowflake does not natively support array column types in this context. **Array attributes must be stringified as a JSON array** before being passed to Cloud Sync.

```sql
'["value1","value2","value3"]'
```

Example:

```sql
SELECT
  user_id                          AS custom_id,
  updated_at                       AS last_updated_at,
  '["yoga","wellness"]'            AS interests  -- stringified array
FROM raw.users;
```

***

#### 1.5 Example schema (e-commerce)

```sql
CREATE OR REPLACE TABLE customer_data.batch_profiles (
    custom_id        STRING,
    last_updated_at  TIMESTAMP_TZ,

    -- Native profile fields
    batch__email_address  STRING,

    -- Standard attributes
    plan             STRING,
    country          STRING,
    lifetime_value   FLOAT,
    is_vip           BOOLEAN,

    -- Array attribute (stringified)
    interests        STRING,   -- e.g. '["yoga","wellness"]'

    -- Typed attributes
    url__avatar          STRING,
    date__birthday       STRING,
    date__last_purchase  STRING
);
```

**How this maps in Batch:**

* `custom_id` identifies the profile
* `batch__email_address` updates the profile's native email field
* `plan`, `country`, `lifetime_value`, `is_vip` become standard attributes
* `interests` is interpreted as an array attribute (must be a stringified JSON array)
* `url__avatar` is interpreted as a URL attribute
* `date__birthday` and `date__last_purchase` are interpreted as date attributes

***

#### 1.6 Using a View

If your raw table doesn't match the expected naming or format, create a **Snowflake View** that converts your schema into the correct conventions.

```sql
CREATE OR REPLACE VIEW customer_data.batch_profiles_view AS
SELECT
  CAST(u.user_id AS STRING)                        AS custom_id,
  GREATEST(u.updated_at, o.last_order_updated_at)  AS last_updated_at,

  u.email                                          AS batch__email_address,
  u.country,
  u.plan,

  u.avatar_url                                     AS url__avatar,
  TO_VARCHAR(u.birth_date, 'YYYY-MM-DD')           AS date__birthday,

  o.last_order_date                                AS date__last_purchase,
  o.total_spent                                    AS lifetime_value,
  u.is_vip,

  -- Stringify arrays before passing to Cloud Sync
  ARRAY_TO_STRING(PARSE_JSON(u.interests), ',')    AS interests
FROM raw.users u
LEFT JOIN raw.user_orders_summary o
  ON u.user_id = o.user_id;
```

This approach lets you:

* rename fields with the correct prefixes (`batch__`, `date__`, `url__`)
* compute a reliable `last_updated_at`
* stringify array columns correctly
* ensure you always expose **one row per profile**

***

#### 1.7 Handling nulls

If a column value is `NULL`, Batch interprets it as **attribute removal** for that profile.

If you don't want an attribute removed:

* ensure your view returns a non-null value, or
* exclude the column from the sync entirely.

***

#### 1.8 Attributes limits and constraints

All attributes sent through Cloud Sync must respect the same limits and constraints as the Batch Profile API. See Profile API documentation, [the attributes object](https://doc.batch.com/getting-started/features/customer-engagement-platform).

***

### 2) Create the Sync in the Batch dashboard

Cloud Sync is configured from the dashboard via a dedicated **Sync module**.

1. Open the **Batch dashboard**
2. Go to **Data → Cloud Sync**
3. Click **Create Sync**
4. Select **Snowflake** as the source

***

#### 2.1 Configure your Snowflake connection

Select an **Authentication** method, then fill in the corresponding fields:

{% tabs %}
{% tab title="Username & Password" %}

| Field        | Description                                                        |
| ------------ | ------------------------------------------------------------------ |
| Username     | Your Snowflake username                                            |
| Password     | Your Snowflake password                                            |
| Host         | Your Snowflake account URL (e.g. `account.snowflakecomputing.com`) |
| Role         | The Snowflake role with read access on the source table/view       |
| Warehouse    | The Snowflake virtual warehouse to use for queries                 |
| Database     | The database containing your source table/view                     |
| Table        | The table or view name                                             |
| {% endtab %} |                                                                    |

{% tab title="Key Pair" %}

| Field         | Description                                                        |
| ------------- | ------------------------------------------------------------------ |
| Username      | Your Snowflake username                                            |
| Private Key   | Your RSA private key (PEM format)                                  |
| Host          | Your Snowflake account URL (e.g. `account.snowflakecomputing.com`) |
| Role          | The Snowflake role with read access on the source table/view       |
| Warehouse     | The Snowflake virtual warehouse to use for queries                 |
| Database      | The database containing your source table/view                     |
| Table         | The table or view name                                             |
| {% endtab %}  |                                                                    |
| {% endtabs %} |                                                                    |

Batch validates the connection before continuing.

***

#### 2.2 Configure profile mapping

Cloud Sync applies a simple mapping model:

* `custom_id` → identifies which Batch profile to update
* all other columns → mapped to profile attributes
* `last_updated_at` → used only for incremental sync logic

***

### 3) How incremental sync works

Cloud Sync uses **incremental processing**, which means it does not re-import your full dataset at every run. Instead, it fetches only the rows that changed since the last successful sync.

***

#### 3.1 The `last_updated_at` cursor

Batch stores the last successful cursor value internally.

At each run, Batch fetches only rows where:

* `last_updated_at` is greater than the last stored cursor

This makes sync runs faster, more scalable, and more cost-efficient.

***

#### 3.2 Inserts, updates, and deletes

Incremental syncs naturally capture:

* ✅ inserts
* ✅ updates

They do **not** automatically capture:

* ❌ deletes

If you need deletions reflected in Batch, rely on a different pipeline or implement soft deletes by setting all attributes to null in the Snowflake view when a profile is deleted.

***

#### 3.3 Best practices for reliable incremental syncs

To avoid missing changes:

* Ensure `last_updated_at` updates **every time a synced column changes**
* Avoid timestamps that only reflect partial updates
* Use a View if you need computed fields or type conversions
* Cluster on `last_updated_at` for large tables

***

### 4) Test and enable your Sync

Before enabling the schedule:

1. Run a **test sync**
2. Verify:
   * Profiles are created or updated correctly
   * `batch__`, `date__`, and `url__` fields are interpreted correctly
   * Array attributes are correctly stringified (e.g. `["yoga","wellness"]`)
   * Null values behave as expected (null → attribute removal)

Once enabled, Batch automatically handles batching and retries.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://doc.batch.com/getting-started/features/customer-engagement-platform/profiles/cloud-sync/create-a-sync-from-snowflake-to-batch-profile-attributes.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
