Integrate with external Product and inventory data

Learn to import and synchronize data from external systems into your Composable Commerce Project.

After completing this page, you should be able to:

  • Understand the key considerations and best practices for using both the Import API and the HTTP API.
  • Analyze different strategies for importing and synchronizing data into commercetools.
  • Choose the optimal integration tool based on specific Project requirements and data characteristics.

In many enterprise architectures, Product data originates from multiple external systems of record. Zen Electron's setup is a classic example: they need to source core Product data from a Product Information Management (PIM) system and Product images from a Digital Asset Management (DAM).

There are two primary ways to send data to the Project: the synchronous HTTP API and the asynchronous Import API.

To showcase the power of the Import API's dependency resolution, we will perform a two-step import:

  1. First, we will import the images from the DAM. The Import API will stage these images, referencing Product Variants that do not yet exist.
  2. Second, we will import the Product data from the PIM. The API will then automatically associate the pre-loaded images with the correct Product Variants as they are created.

A key best practice for any data import is to define unique keys for all resources. Keys make your import operations idempotent (safe to re-run) and provide a stable way to reference resources across different import jobs. Notice how each CSV file below includes a unique key for the primary resource it defines.

Let's assume we have received the data formatted into the following CSV files:

images_from_dam.csv:
product_variant_sku,image_url,image_key
WM-001,https://dam.example.com/images/mouse_front.jpg,mouse-front
WM-001,https://dam.example.com/images/mouse_side.jpg,mouse-side
BK-001,https://dam.example.com/images/keyboard_main.jpg,keyboard-main
UCH-001,https://dam.example.com/images/hub_all_ports.jpg,hub-all-ports

products_from_pim.csv:
key,name_en_US,slug_en_US,product_type_key,variant_sku,variant_key,price_key,price_cent_amount,price_currency,tax_category_key
product-001,Wireless Mouse,wireless-mouse,electronics,WM-001,wm-001,wm-001-us-price,2999,USD,standard-tax
product-002,Bluetooth Keyboard,bluetooth-keyboard,electronics,BK-001,bk-001,bk-001-us-price,1232,USD,standard-tax
product-003,USB-C Hub,usb-c-hub,electronics,UCH-001,uch-001,uch-001-us-price,3999,USD,standard-tax
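The import scripts below read these files with the csvtojson package. If you prefer to avoid the dependency, a minimal header-based parser is enough for simple values like these. This is a sketch for illustration: it assumes no quoted or escaped commas in the data.

```typescript
// Minimal CSV parser for simple files without quoted or escaped commas.
// A stand-in for the csvtojson package used in the import scripts below.
function parseSimpleCsv(csv: string): Record<string, string>[] {
  const [headerLine, ...dataLines] = csv.trim().split(/\r?\n/);
  const headers = headerLine.split(",");
  return dataLines
    .filter((line) => line.length > 0)
    .map((line) => {
      const values = line.split(",");
      const row: Record<string, string> = {};
      headers.forEach((header, i) => (row[header] = values[i] ?? ""));
      return row;
    });
}

// Example: one row shaped like products_from_pim.csv
const sampleRows = parseSimpleCsv(
  "key,name_en_US,variant_sku,price_cent_amount\n" +
    "product-001,Wireless Mouse,WM-001,2999"
);
// sampleRows[0].variant_sku === "WM-001"
```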

Solution A: The Import API

This approach leverages the Import API, which is specifically designed for bulk, asynchronous data ingestion. A service will extract data from the PIM and DAM, transform it into the required Import API format, and upload it in bulk.
The Import API processes data asynchronously, making it an ideal choice for initial catalog imports and scheduled batch updates. The solution below follows the general Import API workflow.

Step 1: Import Product Images from the DAM

We begin by importing the images. This might seem counterintuitive since the Products don't exist yet, but it's a perfect demonstration of the Import API's ability to handle dependencies. We are telling Composable Commerce, "Here are some images; please attach them to these Product Variant SKUs as soon as they become available."

a. Create an Image Import Container: This acts as a staging area for our image data.

apiRootImport
  .importContainers()
  .post({
    body: {
      key: 'zen-image-import-container',
      resourceType: 'product', // Specify the type of resource for this container
    },
  })
  .execute();
b. Submit ImageImportRequests: We read the images_from_dam.csv file, transform each row into an image import resource, and submit the resources in batched ImageImportRequests. Each resource includes the image_key from our CSV and references the Product Variant by its SKU.

async function importImages(importContainerKey: string) {
  const CHUNK_SIZE = 20;
  const rawImages = await csvtojson().fromFile("./images_from_dam.csv");

  // Map CSV data to Image resources, including the key
  const allResources = rawImages.map((image) => ({
    key: image.image_key, // Use the key from the DAM
    productVariant: {
      typeId: "product-variant",
      sku: image.product_variant_sku,
    },
    images: [{
      url: image.image_url,
      // You can also specify dimensions
      // dimensions: { w: 1920, h: 1080 }
    }],
  }));

  // Split resources into chunks and submit each to the Import API
  for (let i = 0; i < allResources.length; i += CHUNK_SIZE) {
    const chunk = allResources.slice(i, i + CHUNK_SIZE);
    const request: ImageImportRequest = {
      type: "image",
      resources: chunk,
    };

    try {
      const response = await apiRootImport
        .images()
        .importContainers()
        .withImportContainerKeyValue({ importContainerKey })
        .post({ body: request })
        .execute();
      console.log("Image chunk imported successfully.", response.body);
    } catch (error) {
      console.error("Error importing image chunk:", error);
    }
  }
}
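The chunking loop above can be factored into a small generic helper so both import functions share it. The chunk size of 20 mirrors the constant used in the scripts; check the current per-request resource limit for your Project before relying on a specific value.

```typescript
// Split an array into fixed-size chunks so each Import API request
// stays within the per-request resource limit.
function chunkResources<T>(resources: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < resources.length; i += size) {
    chunks.push(resources.slice(i, i + size));
  }
  return chunks;
}
```

The submission loops can then iterate `for (const chunk of chunkResources(allResources, CHUNK_SIZE)) { … }` instead of slicing inline.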

After submission, the import operations for these images will have a state of unresolved because their target Product Variants do not exist yet.

Step 2: Import Product Data from the PIM

Now, we import the core Product data. As the Import API processes these Products, it will find the pending image import operations that reference the same SKUs and automatically resolve the dependencies, linking the images to the newly created Product Variants. The Product drafts are submitted to a separate Import Container (zen-product-import-container), created the same way as in Step 1, because its resource type (ProductDraftImport) differs from that of the image data.

async function importProductDrafts(importContainerKey: string) {
  const CHUNK_SIZE = 20;
  const rawProducts = await csvtojson().fromFile("./products_from_pim.csv");

  // Map CSV data to ProductDraftImport resources, including all keys
  const allResources: ProductDraftImport[] = rawProducts.map((product) => ({
    key: product.key, // Product key
    name: { "en-US": product.name_en_US },
    productType: { typeId: "product-type", key: product.product_type_key },
    slug: { "en-US": product.slug_en_US },
    taxCategory: { typeId: "tax-category", key: product.tax_category_key },
    masterVariant: {
      sku: product.variant_sku,
      key: product.variant_key, // Variant key
      prices: [{
        key: product.price_key, // Price key
        value: {
          type: "centPrecision",
          currencyCode: product.price_currency,
          centAmount: parseInt(product.price_cent_amount, 10),
        },
      }],
    },
    // Set publish to true to make the product available immediately
    publish: true,
  }));

  // Split resources into chunks and submit each to the Import API
  for (let i = 0; i < allResources.length; i += CHUNK_SIZE) {
    const chunk = allResources.slice(i, i + CHUNK_SIZE);
    const request: ProductDraftImportRequest = {
      type: "product-draft",
      resources: chunk,
    };

    try {
      const response = await apiRootImport
        .productDrafts()
        .importContainers()
        .withImportContainerKeyValue({ importContainerKey })
        .post({ body: request })
        .execute();
      console.log("Product chunk imported successfully.", response.body);
    } catch (error) {
      console.error("Error importing product chunk:", error);
    }
  }
}
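Because keys are what make these imports idempotent, it is worth failing fast on source rows that are missing one before submitting anything. A small pre-submission check, assuming the products_from_pim.csv column names shown earlier:

```typescript
// Report the indexes of rows that are missing any of the keys that
// make the import idempotent. Column names match products_from_pim.csv.
function findRowsMissingKeys(
  rows: Record<string, string>[],
  requiredColumns: string[] = ["key", "variant_key", "price_key", "variant_sku"]
): number[] {
  const badRowIndexes: number[] = [];
  rows.forEach((row, index) => {
    if (requiredColumns.some((col) => !row[col] || row[col].trim() === "")) {
      badRowIndexes.push(index);
    }
  });
  return badRowIndexes;
}
```

Running this after the csvtojson step and aborting (or logging) when it returns a non-empty list keeps incomplete rows from ever reaching the Import API.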

Step 3: Check the status

After submitting both sets of data, you can poll the status of each container to monitor progress. The import job will wait for a maximum of 48 hours for dependencies to be provided before timing out.

a. Check the import summary: You can check the status of all operations in a container to verify completion. Do this for both zen-image-import-container and zen-product-import-container.

async function checkImportSummary(importContainerKey: string) {
  const response = await apiRootImport
    .importContainers()
    .withImportContainerKeyValue({ importContainerKey })
    .importSummaries()
    .get()
    .execute();
  console.log(`Summary for ${importContainerKey}:`, response.body);
}
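Rather than checking the summary once, an integration will usually poll until the container has finished processing or a timeout is reached. A generic polling helper is sketched below; the `isComplete` callback is an assumption standing in for whatever check you build on the import summary endpoint above (for example, no operations left in a processing or unresolved state).

```typescript
// Poll an async status check until it reports completion or a timeout elapses.
// Returns true if the check succeeded, false if the deadline passed first.
async function pollUntilComplete(
  isComplete: () => Promise<boolean>,
  intervalMs: number,
  timeoutMs: number
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await isComplete()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false; // timed out before the container finished processing
}
```

Pick an interval that respects the asynchronous nature of the Import API; polling every few seconds is usually more than enough for batch jobs.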

b. Once the Product imports are complete, the previously unresolved image import operations will transition to imported, and the images will appear on their respective Product Variants in the Merchant Center.

Key considerations for the Import API

  • Idempotency and keys: Providing a unique key for every resource (Products, Product Variants, Prices, Images, etc.) is a crucial best practice. This key, which you define in your source system (PIM/DAM), allows the Import API to either create a new resource or update an existing one with the same key. This ensures your import scripts are idempotent, allowing them to run repeatedly without producing duplicate data.
  • Asynchronous operation: The Import API is asynchronous. You submit a request and then poll for status updates.
  • Data orchestration: The Import API excels at handling complex data relationships. In our scenario, we imported images first, referencing Product Variant SKUs that didn't exist yet. The API held these operations in an unresolved state until the corresponding Products were imported, at which point it automatically linked them. This dependency resolution works as long as all related data is sent within 48 hours.

The beauty of the Import API is that it handles both create and update scenarios with the same request structure and intelligently manages dependencies, simplifying the integration of data from multiple systems like a PIM and a DAM.

Solution B: The HTTP API

The HTTP API is synchronous and provides immediate feedback on whether an operation succeeded or failed. While this approach offers more control, it also requires managing backoff responses, resource dependencies, and error handling within your client.

The workflow for an "upsert" (update or insert) operation is as follows:

  1. Attempt to fetch each Product by its key.
  2. If it's not found (404 error), send a POST request to create it.
  3. If it is found, get its current version and send a POST request with the appropriate updateActions to modify it.

async function upsertProductsFromCSV() {
  const rows = await csvtojson().fromFile("./products_from_pim.csv");

  for (const row of rows) {
    try {
      // 1. Try to fetch the product
      const existing = await apiRoot.products().withKey({ key: row.key }).get().execute();
      const currentVersion = existing.body.version;

      // 3. If found, update it
      console.log(`Product ${row.key} exists, updating...`);
      await apiRoot.products().withKey({ key: row.key }).post({
        body: {
          version: currentVersion,
          actions: [
            { action: "changeName", name: { "en-US": row.name_en_US } },
            { action: "changeSlug", slug: { "en-US": row.slug_en_US } },
            // Add other update actions as needed...
          ],
        },
      }).execute();

    } catch (err: any) {
      if (err.statusCode === 404) {
        // 2. If not found, create it
        console.log(`Product ${row.key} not found, creating...`);
        await apiRoot.products().post({
          body: {
            key: row.key,
            productType: { typeId: "product-type", key: row.product_type_key },
            name: { "en-US": row.name_en_US },
            slug: { "en-US": row.slug_en_US },
            // Add other fields for creation...
          },
        }).execute();
      } else {
        console.error(`Failed to process product ${row.key}`, err.message);
      }
    }
  }
}
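Sending update actions unconditionally creates needless new versions of unchanged Products. A pure helper that computes only the actions whose values actually differ keeps the upsert loop quiet on no-op rows. This is a sketch covering the two fields updated above; extend it with whichever other fields you sync.

```typescript
interface ProductRow {
  key: string;
  name_en_US: string;
  slug_en_US: string;
}

interface CurrentProductData {
  name?: { "en-US"?: string };
  slug?: { "en-US"?: string };
}

// Build only the update actions whose values differ from the current
// product data, so an unchanged product yields an empty action list.
function buildUpdateActions(current: CurrentProductData, row: ProductRow) {
  const actions: { action: string; [key: string]: unknown }[] = [];
  if (current.name?.["en-US"] !== row.name_en_US) {
    actions.push({ action: "changeName", name: { "en-US": row.name_en_US } });
  }
  if (current.slug?.["en-US"] !== row.slug_en_US) {
    actions.push({ action: "changeSlug", slug: { "en-US": row.slug_en_US } });
  }
  return actions;
}
```

In the upsert loop, skip the update POST entirely when `buildUpdateActions(...)` returns an empty array.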

A key advantage of this approach is that results are nearly instantaneous, provided all dependencies (like Product Types) are in place. However, this also means you are responsible for building a resilient client.

Building a resilient HTTP API Client

When using the HTTP API for bulk operations, it is crucial to implement client-side logic that can:

  • Ramp up request load gradually: Avoid sudden large bursts of requests. Start with a low number of concurrent requests and only increase the load when no back-pressure error codes (like 502 or 503) are returned.
  • Retry gracefully: Implement a backoff strategy for transient errors like 502 Bad Gateway or 503 Service Unavailable.
  • Manage concurrency: Implement a request manager or middleware to throttle outgoing requests, using a dynamic concurrency strategy with robust error handling.

A well-designed request manager acts as a smart traffic controller for your API calls. It should manage a dynamic concurrency limit, handle errors with retries, and provide observability through logging. This ensures your integration is robust and respects the API's overcapacity guidelines.
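The retry part of such a client can be sketched as a small wrapper that retries only on the back-pressure codes mentioned above, with exponential delay and a little jitter. This is a minimal illustration, not a full request manager; production code would also cap total elapsed time and honor any Retry-After header the response carries.

```typescript
// Retry an async operation on transient errors (502/503) with
// exponential backoff and jitter. Non-transient errors are rethrown
// immediately; transient errors are rethrown once retries are exhausted.
async function withBackoff<T>(
  operation: () => Promise<T>,
  maxRetries = 5,
  baseDelayMs = 200
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err: any) {
      const transient = err?.statusCode === 502 || err?.statusCode === 503;
      if (!transient || attempt >= maxRetries) throw err;
      const delayMs = baseDelayMs * 2 ** attempt + Math.random() * 100;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Wrapping each `.execute()` call from the upsert loop in `withBackoff(() => …)` gives the loop the graceful-retry behavior described above without changing its structure.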

The big picture: Choosing the right API

Both the Import API and the HTTP API are powerful tools for getting data into your Project. The best choice depends on your specific use case.

The Import API is designed for asynchronous bulk inserts and updates. It is ideal for:

  • Initial data migrations.
  • Scheduled batch syncs from multiple systems (PIM, ERP, DAM).
  • Scenarios where developer experience and simplified dependency management are prioritized over real-time feedback.

The HTTP/GraphQL API is designed for real-time, transactional operations. It is the better choice for:

  • Smaller, incremental updates where immediate feedback is required.
  • Use cases where precise control over individual operations and error handling is necessary.
  • Powering user-facing applications that require synchronous data manipulation.
  • Scenarios where speed is a critical requirement.

Test your knowledge