Integrate product data

Design and implement a robust product data integration from PIM and ERP systems into Composable Commerce.

Learn more about integrations in our self-paced Integration patterns module.

Product data integration is a foundational component of any Composable Commerce solution. Whether your products originate in a PIM, an ERP, or both, a well-planned integration ensures your catalog is accurate, consistent, and up to date. This page covers the planning, data mapping, and implementation of product data integrations using the Import API and the HTTP API.

Plan your product data integration

Before writing any integration code, answer these key questions. The answers directly shape your architecture.

  • What is the source of truth for product data? Define which system owns the authoritative version of each attribute. A PIM typically owns enriched content (descriptions, images), while an ERP may own SKUs, pricing, and inventory levels.
  • How many data sources are there? Determine whether product data flows from a single system or multiple systems (for example, a PIM for content plus a DAM for images). Multiple sources add complexity around sequencing and conflict resolution.
  • Can product data be modified in Composable Commerce? If product managers update attributes directly in the Merchant Center, you need to define which attributes are editable in Composable Commerce and which are read-only from the external system.
  • What is the direction of data flow? Evaluate whether the integration is one-way (external system to Composable Commerce) or bi-directional. For bi-directional flows, define how data consistency is maintained and which attributes sync back to the source.

Single source of product data

The simplest scenario is a single source such as a PIM where no product data is modified in Composable Commerce. The integration pulls data from the PIM and stores it in Composable Commerce. This is a one-way flow with a clear source of truth.

Multiple sources of product data

When multiple systems contribute product data, define the following:

  1. The source of truth for each attribute: determine which attributes come from each system (for example, descriptions from the PIM, inventory from the ERP). To prevent unintentional edits, place externally managed attributes in a restricted AttributeGroup, making them read-only in the Merchant Center.
  2. The process for creating Products: ensure the initial version of a Product is sent to Composable Commerce once, when it is first created. Restrict subsequent updates to only the attributes that can change from each source.
  3. The number of integrations required: choose between a single integration that consolidates all sources or multiple integrations. In the latter, a primary process creates the Product and secondary processes add data afterward.
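The attribute ownership rules above can be expressed as a small lookup that each integration consults before it sends an update. The following is a minimal sketch, not part of any Composable Commerce API: the source names (`pim`, `erp`), the attribute names, and the `filterOwnedAttributes` helper are all hypothetical.

```typescript
// Hypothetical source systems and the attributes each one owns.
type SourceSystem = "pim" | "erp";

const attributeOwners: Record<string, SourceSystem> = {
  description: "pim",
  images: "pim",
  sku: "erp",
  weight: "erp",
};

// Keep only the attributes for which the given source is the source of truth.
function filterOwnedAttributes(
  source: SourceSystem,
  payload: Record<string, unknown>
): Record<string, unknown> {
  const owned: Record<string, unknown> = {};
  for (const [name, value] of Object.entries(payload)) {
    if (attributeOwners[name] === source) {
      owned[name] = value;
    }
  }
  return owned;
}

// The ERP payload includes a description, but the ERP does not own it,
// so only sku and weight pass through.
const erpUpdate = filterOwnedAttributes("erp", {
  sku: "blue-widget-sku",
  weight: 250,
  description: "Overwritten by ERP",
});
console.log(erpUpdate);
```

Running every secondary integration through a filter like this keeps one source from silently overwriting attributes that another source owns.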

For bi-directional integrations, also define:

  1. The process for synchronizing attributes managed by Composable Commerce: identify the attributes for which Composable Commerce is the source of truth, and ensure only those attributes sync back to the source system.

Choose your integration approach

The method you choose depends on the frequency of updates, the volume of data, and the capabilities of your data sources.

Full versus incremental uploads

Full uploads send all product data from the source to Composable Commerce. They take longer than incremental uploads and can leave the catalog in a partially updated state while they run. Use full uploads for initial catalog loads or periodic complete refreshes.

Incremental uploads focus on products that have been created or modified since the last upload. Incremental uploads are faster, but not all data sources support change tracking natively. If your source doesn't support incremental updates, use a middleware with a database to track changes.

Incremental uploads are better suited for time-critical updates. If a product description contains an error, you can fix it in the PIM and run the integration immediately. You don't need to wait for the next full upload cycle.
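When the source has no native change tracking, the middleware can compare each exported record against a fingerprint stored after the last successful upload. The sketch below assumes a hypothetical `SourceProduct` shape and uses an in-memory `Map` as a stand-in for the middleware database. A real implementation would likely store a hash of the serialized record rather than the full string.

```typescript
// Hypothetical shape of a product record exported from the source system.
interface SourceProduct {
  key: string;
  [field: string]: unknown;
}

// Stable fingerprint: serialize top-level fields in sorted key order so that
// field ordering in the export does not register as a change.
function fingerprint(product: SourceProduct): string {
  const entries = Object.keys(product)
    .sort()
    .map((field) => [field, product[field]]);
  return JSON.stringify(entries);
}

// Return the records that are new or changed since the last successful sync.
function selectChanged(
  products: SourceProduct[],
  store: Map<string, string>
): SourceProduct[] {
  return products.filter(
    (product) => store.get(product.key) !== fingerprint(product)
  );
}

// Record fingerprints only after the upload of these records has succeeded.
function markSynced(products: SourceProduct[], store: Map<string, string>): void {
  for (const product of products) {
    store.set(product.key, fingerprint(product));
  }
}

const store = new Map<string, string>();

// First run: nothing is stored yet, so every record counts as changed.
const firstRun = selectChanged(
  [
    { key: "blue-widget", name: "Blue Widget" },
    { key: "red-gadget", name: "Red Gadget" },
  ],
  store
);
markSynced(firstRun, store);

// Second run: only the record whose name changed is selected.
const secondRun = selectChanged(
  [
    { key: "blue-widget", name: "Blue Widget" },
    { key: "red-gadget", name: "Red Gadget, 2nd edition" },
  ],
  store
);
console.log(secondRun.map((product) => product.key));
```

Separating `selectChanged` from `markSynced` means a failed upload does not mark records as synced, so they are picked up again on the next run.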

Event-driven versus bulk approach

Event-driven integrations handle data changes as they occur, providing near real-time updates. Event-driven integrations require frequent requests and are the right choice when the real-time accuracy of data is critical for operations.

Bulk integrations run at scheduled intervals (for example, nightly) and are more efficient for large data volumes where immediate updates are not essential. Bulk integrations work well for catalog refreshes and periodic synchronizations.

Data mapping

Data mapping aligns data from your source systems with the Composable Commerce data model. Follow these principles:

  • Only transfer essential product data: not all attributes from your PIM are necessary in Composable Commerce. Map and transfer only the data directly relevant to commerce operations (search, display, pricing, fulfillment).
  • Ensure structure compatibility: Composable Commerce structures data for commerce use, while a PIM structures it for product enrichment. Because the two models differ, you may need to transform data when mapping between them. See the Product catalog overview for details on how Composable Commerce structures product data.
  • Avoid a one-to-one relationship between Product Types and source categories: linking Product Types in Composable Commerce directly to categories or types in the source system creates a maintenance burden, because you must monitor the source for structural changes and apply each one. A more versatile Product Type setup with flexible attributes reduces this burden.
  • Prioritize search-critical attributes: establish a clear mapping strategy for attributes essential to search and filtering. Supplementary information can be consolidated as a JSON text attribute.
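The last principle can be sketched as a mapping function that gives search-critical attributes their own entries and consolidates the rest into a single JSON text attribute. The attribute names, the `supplementaryData` attribute, and the `mapAttributes` helper are hypothetical and would need to match the attributes defined on your Product Type.

```typescript
// Hypothetical flat record from a PIM export.
type PimRecord = Record<string, string>;

// Illustrative list of attributes used for search and filtering.
const searchCriticalAttributes = ["color", "size", "brand"];

interface MappedAttribute {
  name: string;
  value: string;
}

function mapAttributes(record: PimRecord): MappedAttribute[] {
  const attributes: MappedAttribute[] = [];
  const supplementary: Record<string, string> = {};

  for (const [name, value] of Object.entries(record)) {
    if (searchCriticalAttributes.includes(name)) {
      // Search-critical data gets a dedicated attribute.
      attributes.push({ name, value });
    } else {
      supplementary[name] = value;
    }
  }

  // Everything else is consolidated into one text attribute holding JSON.
  if (Object.keys(supplementary).length > 0) {
    attributes.push({
      name: "supplementaryData",
      value: JSON.stringify(supplementary),
    });
  }
  return attributes;
}

const mapped = mapAttributes({
  color: "blue",
  brand: "Acme",
  careInstructions: "Wipe clean",
  countryOfOrigin: "DE",
});
console.log(mapped);
```

With this split, adding a new supplementary field in the PIM requires no Product Type change. Only a new search-critical attribute does.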

Manage price and inventory separately

Price and inventory updates are typically more frequent and time-critical than updates to core product information such as descriptions and images. Implement price and inventory as separate, event-based integrations. Even if pricing and inventory data originate from the same system as product data, separating the processes ensures fast updates. This avoids blocking by slower catalog synchronizations.

Solution A: bulk import with the Import API

The Import API is purpose-built for asynchronous bulk data ingestion with automatic dependency resolution. For an overview of how it works, supported resources, and limitations, see the Import API overview.

Automatic dependency resolution makes the Import API the right choice for:

  • Initial catalog loads from a PIM or ERP.
  • Bulk migrations from another commerce platform.
  • Periodic full catalog refreshes.

Before submitting any import data, create an Import Container to logically group your import operations.

Import inventory data from the ERP

Inventory data often originates in an ERP system. A common pattern is to export the ERP data as a CSV file, parse it, and submit it to the Import API.

Consider the following example CSV from the ERP:

sku,quantityOnStock,restockableInDays,expectedDelivery
product-sku-1,100,5,2026-05-01T10:00:00.000Z
product-sku-2,250,3,2026-04-15T10:00:00.000Z
product-sku-3,0,10,2026-06-01T10:00:00.000Z

Parse the CSV and submit the inventory data as import operations:

import { parse } from "csv-parse/sync";
import * as fs from "fs";

// Parse the ERP CSV export
const csvContent = fs.readFileSync("erp-inventory-export.csv", "utf-8");
const records = parse(csvContent, { columns: true, skip_empty_lines: true });

// Map CSV rows to Import API InventoryImport resources
const inventoryImports = records.map((row) => ({
  key: `inventory-${row.sku}`,
  sku: row.sku,
  quantityOnStock: parseInt(row.quantityOnStock, 10),
  restockableInDays: parseInt(row.restockableInDays, 10),
  expectedDelivery: row.expectedDelivery,
}));

// Submit the inventory import operations
const inventoryImportRequest = await apiRoot
  .inventories()
  .importContainers()
  .withImportContainerKeyValue({ importContainerKey: "pim-product-import" })
  .post({
    body: {
      type: "inventory",
      resources: inventoryImports,
    },
  })
  .execute();
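
The example above sends every row in a single request. The Import API limits the number of resources per import request (20 at the time of writing, as far as we know; check the Import API reference for the current value), so a real export usually needs to be split into batches. The `chunk` helper and the placeholder resources below are a hypothetical sketch of that step.

```typescript
// Assumed per-request limit; verify it against the Import API reference.
const MAX_RESOURCES_PER_REQUEST = 20;

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 45 placeholder resources are split into batches of 20, 20, and 5.
const resources = Array.from({ length: 45 }, (_, i) => ({
  key: `inventory-product-sku-${i + 1}`,
}));
const batches = chunk(resources, MAX_RESOURCES_PER_REQUEST);
console.log(batches.map((batch) => batch.length));
```

Each batch would then be submitted with the same `.post()` call shown above, passing the batch as `resources`.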

Import product data from the PIM

Product data typically comes from a PIM system, also often as a CSV export. Products reference other resources such as Product Types and Categories by key, and the Import API resolves these dependencies automatically.

Consider the following example CSV from the PIM:

key,name.en,slug.en,description.en,productType,categories,sku,priceKey,price
blue-widget,"Blue Widget","blue-widget","A high-quality blue widget.","standard-product","widgets;accessories","blue-widget-sku","blue-widget-price","EUR 1999"
red-gadget,"Red Gadget","red-gadget","A versatile red gadget.","standard-product","gadgets","red-gadget-sku","red-gadget-price","EUR 2499"

Parse the CSV and submit the product data as import operations:

import { parse } from "csv-parse/sync";
import * as fs from "fs";

// Parse the PIM CSV export
const csvContent = fs.readFileSync("pim-product-export.csv", "utf-8");
const records = parse(csvContent, { columns: true, skip_empty_lines: true });

// Map CSV rows to Import API ProductDraftImport resources
const productDraftImports = records.map((row) => {
  const [currencyCode, centAmountStr] = row.price.split(" ");
  return {
    key: row.key,
    name: { en: row["name.en"] },
    slug: { en: row["slug.en"] },
    description: { en: row["description.en"] },
    productType: {
      key: row.productType,
      typeId: "product-type",
    },
    categories: row.categories.split(";").map((catKey) => ({
      key: catKey.trim(),
      typeId: "category",
    })),
    masterVariant: {
      key: row.sku,
      sku: row.sku,
      prices: [
        {
          key: row.priceKey,
          value: {
            type: "centPrecision",
            currencyCode: currencyCode,
            centAmount: parseInt(centAmountStr, 10),
          },
        },
      ],
    },
  };
});

// Submit the product import operations
const productDraftImportRequest = await apiRoot
  .productDrafts()
  .importContainers()
  .withImportContainerKeyValue({ importContainerKey: "pim-product-import" })
  .post({
    body: {
      type: "product-draft",
      resources: productDraftImports,
    },
  })
  .execute();

Note that the productType and categories fields use key references. The Import API resolves these dependencies asynchronously. If the referenced Product Type or Category doesn't exist yet, the operation waits up to 48 hours for the dependency to become available.

After submitting import operations, query the Import Container for a summary of operation statuses. Poll the summary periodically until all operations reach a terminal state (imported, rejected, or validationFailed). For operations that fail validation, inspect the individual ImportOperation for error details.

For guidance on organizing Import Containers, handling retries, rate limits, and anti-patterns to avoid, see the Import API best practices.
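
The polling step described above can be sketched as follows. The `ImportSummary` and `OperationStates` shapes are assumptions modeled on the state counts an import summary returns, and `fetchSummary` is a hypothetical callback that wraps whichever SDK or HTTP call you use to read the summary.

```typescript
// Assumed shape: a count of operations per processing state.
interface OperationStates {
  processing: number;
  validationFailed: number;
  unresolved: number;
  waitForMasterVariant: number;
  imported: number;
  rejected: number;
}

interface ImportSummary {
  states: OperationStates;
  total: number;
}

// Finished when every operation is imported, rejected, or validationFailed.
function isImportFinished(summary: ImportSummary): boolean {
  const { imported, rejected, validationFailed } = summary.states;
  return imported + rejected + validationFailed === summary.total;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until the import finishes or maxAttempts is reached.
async function waitForImport(
  fetchSummary: () => Promise<ImportSummary>,
  intervalMs = 30_000,
  maxAttempts = 120
): Promise<ImportSummary> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const summary = await fetchSummary();
    if (isImportFinished(summary)) {
      return summary;
    }
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Import did not finish after ${maxAttempts} polls`);
}
```

Operations with unresolved references can stay pending for a long time, so the attempt limit keeps the integration from polling indefinitely.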

Solution B: incremental sync with the HTTP API

The HTTP API is the right choice for real-time, incremental synchronization where you need immediate feedback on each operation. The core pattern is an upsert: fetch a resource by key, then create it if it doesn't exist or update it if it does.

This approach is ideal for:

  • Real-time synchronization from event-driven sources.
  • Incremental updates where only a few resources change at a time.
  • Scenarios where you need synchronous confirmation that each operation succeeded.

Upsert inventory entries

The upsert pattern for inventory uses the key to check existence. If the resource doesn't exist (HTTP 404), create it. If it exists, update it with the latest data.

async function upsertInventoryEntry(
  apiRoot, // Project-scoped request builder from the HTTP API SDK
  inventoryKey: string,
  sku: string,
  quantityOnStock: number
) {
  try {
    // Attempt to fetch the existing inventory entry by key.
    // Note: the HTTP API SDK exposes inventory entries under inventory(),
    // unlike the Import API SDK, which uses inventories().
    const existing = await apiRoot
      .inventory()
      .withKey({ key: inventoryKey })
      .get()
      .execute();

    // Resource exists: update it, passing the current version
    const updated = await apiRoot
      .inventory()
      .withKey({ key: inventoryKey })
      .post({
        body: {
          version: existing.body.version,
          actions: [
            {
              action: "changeQuantity",
              quantity: quantityOnStock,
            },
          ],
        },
      })
      .execute();

    return updated.body;
  } catch (error: any) {
    if (error.statusCode === 404) {
      // Resource doesn't exist: create it
      const created = await apiRoot
        .inventory()
        .post({
          body: {
            key: inventoryKey,
            sku: sku,
            quantityOnStock: quantityOnStock,
          },
        })
        .execute();

      return created.body;
    }
    // Any other error (for example, a version conflict) is rethrown
    throw error;
  }
}

Upsert products

The same pattern applies to Products. Fetch by key, then create or update depending on whether the Product already exists.

async function upsertProduct(
  apiRoot,
  productKey: string,
  productData: {
    name: { en: string };
    slug: { en: string };
    productType: { key: string; typeId: "product-type" };
    description?: { en: string };
  }
) {
  try {
    // Attempt to fetch the existing Product by key
    const existing = await apiRoot
      .products()
      .withKey({ key: productKey })
      .get()
      .execute();

    // Resource exists: update it
    const updated = await apiRoot
      .products()
      .withKey({ key: productKey })
      .post({
        body: {
          version: existing.body.version,
          actions: [
            {
              action: "changeName",
              name: productData.name,
            },
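            {
              action: "changeSlug",
              slug: productData.slug,
            },
            {
              // Keep the description in sync; an undefined value unsets it
              action: "setDescription",
              description: productData.description,
            },
          ],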
        },
      })
      .execute();

    return updated.body;
  } catch (error: any) {
    if (error.statusCode === 404) {
      // Resource doesn't exist: create it
      const created = await apiRoot
        .products()
        .post({
          body: {
            key: productKey,
            name: productData.name,
            slug: productData.slug,
            productType: productData.productType,
            description: productData.description,
          },
        })
        .execute();

      return created.body;
    }
    throw error;
  }
}

Build a resilient HTTP API client

When synchronizing product data at scale using the HTTP API, build resilience into your client:

  • Ramp up traffic gradually: don't send thousands of requests immediately. Start with a low request rate and increase it progressively. This prevents 502 Bad Gateway and 503 Service Unavailable errors caused by sudden traffic spikes.
  • Implement retry with exponential backoff: when you receive a 502, 503, or 429 Too Many Requests response, retry the request after a delay. Double the delay after each consecutive failure (for example, 1 second, 2 seconds, 4 seconds) up to a maximum threshold.
  • Manage concurrency: limit the number of parallel requests your client sends. A reasonable starting point is 10 to 20 concurrent requests. Monitor error rates and adjust as needed. Excessive concurrency leads to throttling and degraded performance.
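
The retry and concurrency guidance above can be combined in a small wrapper. This is a sketch under stated assumptions: errors are expected to carry a numeric `statusCode` property (as in the upsert examples on this page), and the helper names, default delays, and limits are illustrative rather than recommended values.

```typescript
// Status codes treated as retryable in the guidance above.
const RETRYABLE_STATUS_CODES = [429, 502, 503];

function isRetryable(statusCode: number | undefined): boolean {
  return statusCode !== undefined && RETRYABLE_STATUS_CODES.includes(statusCode);
}

// Delay before retry number `retry` (0-based): base, 2x, 4x, ... up to maxMs.
function backoffDelayMs(retry: number, baseMs = 1000, maxMs = 30_000): number {
  return Math.min(baseMs * 2 ** retry, maxMs);
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry a request on 429, 502, or 503, doubling the delay each time.
async function withRetry<T>(
  request: () => Promise<T>,
  maxRetries = 5
): Promise<T> {
  for (let retry = 0; ; retry++) {
    try {
      return await request();
    } catch (error) {
      const statusCode = (error as { statusCode?: number } | undefined)
        ?.statusCode;
      if (!isRetryable(statusCode) || retry >= maxRetries) {
        throw error;
      }
      await sleep(backoffDelayMs(retry));
    }
  }
}

// Run tasks with at most `limit` requests in flight at any time.
async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  limit = 10
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;

  // Each worker pulls the next unclaimed task until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await withRetry(tasks[index]);
    }
  }

  const workers = Array.from(
    { length: Math.min(limit, tasks.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}
```

To ramp up gradually, a client could start with a low `limit` and raise it between batches while monitoring error rates.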

Choose the right API

The Import API and the HTTP API serve different use cases. Use the Import API for bulk imports, migrations, and full catalog loads where asynchronous processing and automatic dependency resolution are beneficial. Use the HTTP API for real-time updates and incremental syncs where you need immediate feedback per request. For a detailed comparison of the two APIs, see the Import API overview.

Key differences for integration planning:

  • Idempotency: The Import API has built-in idempotency via resource keys. With the HTTP API, you must implement the upsert pattern (as shown in the incremental sync examples on this page).
  • Best for: The Import API suits high-volume batch operations. The HTTP API suits low-to-medium volumes where each request is handled individually.

In practice, many projects use both APIs: the Import API for the initial catalog load and periodic bulk refreshes, and the HTTP API for real-time incremental updates triggered by events in the PIM or ERP.