Integrate external search

Learn more about integrations in our self-paced Integration patterns module.

In digital commerce, search is the primary means of product discovery: it lets your customers find what they're looking for on the website. Poorly organized search results create usability problems, frustrate customers, and, most importantly, lead to subpar customer experiences and lost sales. It is therefore critical to implement an optimized search experience, which can contribute to increased conversion rates.

Many of our customers choose to use a specialized commerce search engine. This guide outlines how to integrate commercetools Frontend and the commercetools commerce APIs with an external search engine to support end-to-end search functionality.

Functional scope

Search engine integrations are generally composed of two main areas:

Backend: product sync

As a prerequisite to supporting search functionality on commerce websites, product catalog data must be synchronized to and indexed by the search engine. Product sync can be divided into full and partial product sync. A full product sync sends the entire product catalog to the search engine, while a partial product sync sends only the changes (deltas) made since the last sync.

Frontend: product discovery

Product discovery is the frontend implementation where the customer interacts with the search on the commerce website. These interactions include text search (including auto-complete), product listing pages (PLP), and search pages with filtering and faceting.

Solution

The diagram shows how both product sync and product discovery scenarios can be integrated.

Search integration with commercetools and Frontend.

Product data sync

After setting up the product data, we recommend using Subscriptions to synchronize product data to an external search engine.
With Subscriptions, you can implement an asynchronous solution that captures any updates to product data and pushes them to the third-party system using a product sync connector hosted with Connect or any public cloud messaging service (for example, AWS EventBridge). For product sync, the relevant Subscription Messages are ProductPublished and ProductUnpublished, which are described under "Event-driven updates with Subscriptions".

Key questions for your integration

Before designing your product data synchronization, consider the following business requirements that affect how you query and structure Product data for your search engine.

  • Stores: do you operate multiple Stores with different assortments? If so, you need to synchronize Products per Store and use Product Selections to manage Store-specific assortments.
  • Product Selections: are you using Product Selections to define different assortments? The Products available in each selection affect which Products you index for each Store or channel.
  • Product Tailoring: do you tailor Product names, descriptions, or other fields per Store? If so, use Product Tailoring to retrieve Store-specific data alongside the base Product data.
  • Localization: which locales does your storefront support? Ensure you include all required localized fields (names, descriptions, slugs) in your search engine index for each locale.
  • Pricing: how do you structure Prices? Consider whether you need to index Embedded Prices per currency, country, Customer Group, or channel. If you use Standalone Prices, account for those in your data model as well.
  • Inventory: do you need to reflect inventory availability in search results? If so, query Inventory Entries and include stock data per channel or supply channel in your index.

These questions help you determine the correct query parameters, data model, and Subscription Messages for your search engine index.
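
To illustrate how the answers shape the data model, the following is a minimal sketch that flattens one Product Projection into one search document per locale. The function name `to_search_documents`, the `documentId` convention, and the document layout are assumptions made for illustration; your search engine's schema will differ. Only the master variant is mapped here to keep the example short.

```python
from typing import Any


def to_search_documents(
    projection: dict[str, Any], locales: list[str]
) -> list[dict[str, Any]]:
    """Flatten one Product Projection into one search document per locale."""
    variant = projection.get("masterVariant", {})
    # Keep one entry per Embedded Price so the index can filter by scope.
    prices = [
        {
            "currency": price["value"]["currencyCode"],
            "centAmount": price["value"]["centAmount"],
            "country": price.get("country"),
        }
        for price in variant.get("prices", [])
    ]
    documents = []
    for locale in locales:
        documents.append(
            {
                # Hypothetical ID convention: Product ID plus locale.
                "documentId": f"{projection['id']}_{locale}",
                "productId": projection["id"],
                "locale": locale,
                "name": projection.get("name", {}).get(locale),
                "slug": projection.get("slug", {}).get(locale),
                "description": projection.get("description", {}).get(locale),
                "sku": variant.get("sku"),
                "prices": prices,
            }
        )
    return documents
```

One document per locale keeps queries simple at the cost of a larger index. A single document with locale-suffixed fields is an equally valid layout.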

Full synchronization: the initial data load

Before receiving delta updates, you must perform a full synchronization to load the entire product catalog into the search engine. The following best practices apply:

  • Use Product Projections to query the current (published) projection, which represents the data visible to customers.
  • Use cursor-based pagination with the sort and withTotal=false parameters for efficient querying of large catalogs.
  • If you operate multiple Stores, query Products per Store using the storeProjection parameter.
  • Include Price Selection parameters to retrieve the correct Prices for each context (currency, country, Customer Group, channel).
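
The practices above can be combined into a single query string. The sketch below is illustrative only: the helper name `build_projection_query` and its arguments are hypothetical, while `storeProjection`, `priceCurrency`, and `priceCountry` are the query parameter names assumed for Store projection and Price Selection. The country is only added together with a currency.

```python
from typing import Optional
from urllib.parse import urlencode


def build_projection_query(
    last_id: Optional[str] = None,
    store_key: Optional[str] = None,
    currency: Optional[str] = None,
    country: Optional[str] = None,
    limit: int = 100,
) -> str:
    """Build a Product Projections query path for one page of a full sync."""
    params = [
        ("staged", "false"),      # current (published) projection
        ("withTotal", "false"),   # skip the expensive total count
        ("sort", "id asc"),       # stable order for cursor-based paging
        ("limit", str(limit)),
    ]
    if last_id:
        # Cursor: continue after the last Product ID of the previous page.
        params.append(("where", f'id > "{last_id}"'))
    if store_key:
        params.append(("storeProjection", store_key))
    if currency:
        params.append(("priceCurrency", currency))
        if country:
            params.append(("priceCountry", country))
    # urlencode percent-encodes the predicate and encodes spaces as "+".
    return "/product-projections?" + urlencode(params)
```

For example, `build_projection_query(store_key="de-store", currency="EUR", country="DE")` yields a first-page query scoped to a hypothetical Store with the key `de-store`.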

The full synchronization follows these steps:

  1. Prepare your search engine index: create or clear the target index in your search engine, defining the schema and field mappings for Product data.
  2. Query Product Projections: retrieve the published Product Projections from Composable Commerce using pagination. Use the following query parameters for efficient pagination:
    Query Product Projections (first page):
    GET /product-projections?staged=false&withTotal=false&sort=id asc&limit=100
    
    For subsequent pages, use the where parameter with the last Product ID from the previous page:
    Query Product Projections (next page):
    GET /product-projections?staged=false&withTotal=false&sort=id asc&limit=100&where=id > "last-product-id"
    
  3. Transform the data: map each Product Projection to the document schema required by your search engine. Include all relevant fields such as names, descriptions, categories, Prices, Attributes, and availability.
  4. Index the data: send the transformed documents to your search engine in batches. Most search engines support bulk indexing operations for efficient loading.
  5. Verify the index: confirm the total document count in your search engine matches the expected number of Products. Run test queries to validate the data quality.
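
The steps above can be sketched as a small loop. This is a minimal sketch, not a definitive implementation: `fetch_page`, `transform`, and `index_batch` are hypothetical callables that you would back with your HTTP client and your search engine's bulk API, and the batch size of 500 is an arbitrary assumption.

```python
from typing import Any, Callable, Iterator, Optional

Page = list[dict[str, Any]]


def iter_projections(
    fetch_page: Callable[[Optional[str]], Page]
) -> Iterator[dict[str, Any]]:
    """Yield every Product Projection, paging by the last seen Product ID."""
    last_id: Optional[str] = None
    while True:
        # fetch_page(None) requests the first page; otherwise it is expected
        # to add where=id > "<last_id>" to the query.
        page = fetch_page(last_id)
        if not page:
            return
        yield from page
        last_id = page[-1]["id"]


def full_sync(
    fetch_page: Callable[[Optional[str]], Page],
    transform: Callable[[dict[str, Any]], dict[str, Any]],
    index_batch: Callable[[Page], None],
    batch_size: int = 500,
) -> int:
    """Transform and index all Products in batches; return the number indexed."""
    batch: Page = []
    total = 0
    for projection in iter_projections(fetch_page):
        batch.append(transform(projection))
        if len(batch) >= batch_size:
            index_batch(batch)
            total += len(batch)
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        index_batch(batch)
        total += len(batch)
    return total
```

The returned count can be compared with the document count reported by the search engine as part of the verification step.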

For catalogs with more than 100,000 Products, run the full sync in parallel by partitioning on the Product ID.
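
One possible way to partition on the Product ID is to split the ID space by its first character and give each worker its own predicate. The sketch below assumes that Product IDs are lowercase UUID strings and that query predicates compare them lexicographically; the function name `id_partition_predicates` is hypothetical.

```python
def id_partition_predicates(boundaries: str = "0123456789abcdef") -> list[str]:
    """Return one where predicate per partition of the Product ID space."""
    predicates = []
    for index, lower in enumerate(boundaries):
        parts = []
        if index > 0:
            # Every partition except the first has a lower bound.
            parts.append(f'id >= "{lower}"')
        if index < len(boundaries) - 1:
            # Every partition except the last has an upper bound.
            parts.append(f'id < "{boundaries[index + 1]}"')
        predicates.append(" and ".join(parts))
    return predicates
```

Each worker would then page through its own partition by combining its predicate with the `id > "last-product-id"` cursor shown in the steps above.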

Delta updates

After the initial data load, you need to keep the search engine index in sync with ongoing changes. Two approaches support delta updates: scheduled polling and event-driven updates with Subscriptions.

Scheduled polling

With scheduled polling, your application periodically queries Composable Commerce for Products that changed since the last synchronization. You use the lastModifiedAt field to identify updated Products.

Scheduled polling works as follows:

  1. Store the timestamp of your last successful sync.

  2. At each polling interval, query Product Projections with a where predicate filtering on lastModifiedAt:
    Query Product Projections modified since last sync:
    GET /product-projections?staged=false&where=lastModifiedAt > "2025-01-01T00:00:00.000Z"&sort=lastModifiedAt asc&withTotal=false&limit=100
    
  3. Process and index the returned Products.

  4. Update the stored timestamp to the lastModifiedAt value of the last processed Product.
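
A single polling cycle following these steps might look like the sketch below. The names `poll_once`, `fetch_modified_since`, and `index_batch` are hypothetical; `fetch_modified_since` is assumed to run the query above with the given timestamp and return one page of results sorted by lastModifiedAt.

```python
from typing import Any, Callable

Page = list[dict[str, Any]]


def poll_once(
    fetch_modified_since: Callable[[str], Page],
    index_batch: Callable[[Page], None],
    last_sync: str,
) -> str:
    """Index all Products modified after last_sync; return the new timestamp."""
    watermark = last_sync
    while True:
        page = fetch_modified_since(watermark)
        if not page:
            # Nothing newer than the watermark: the cycle is complete.
            return watermark
        index_batch(page)
        # Results are sorted ascending, so the last item is the newest.
        watermark = page[-1]["lastModifiedAt"]
```

Persist the returned timestamp only after indexing has succeeded. Because the predicate uses a strict `>`, Products that share the exact lastModifiedAt value of a page's last item could be skipped; depending on your requirements, you may want to re-read a small overlap window.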

Scheduled polling is simpler to implement but has trade-offs compared to the event-driven approach:

| Consideration | Scheduled polling | Event-driven (Subscriptions) |
| --- | --- | --- |
| Latency | Depends on the polling interval (minutes to hours). | Near real-time (seconds). |
| API usage | Each poll queries the API, regardless of whether changes occurred. | Messages are delivered only when changes occur. |
| Complexity | Lower initial complexity, but requires state management for timestamps. | Requires setting up a messaging service and message handlers. |
| Reliability | Stateful; if the stored timestamp is lost, a full re-sync may be needed. | Message queues provide built-in retry and delivery guarantees. |
| Scalability | Higher API load for frequently changing catalogs. | Scales well with high-frequency changes. |

Event-driven updates with Subscriptions

The recommended approach for delta updates is to use Subscriptions to receive messages when Products are published or unpublished. Event-driven Subscriptions provide near real-time updates with lower API overhead.

The key Messages for search engine synchronization are:

  • ProductPublished: sent when a Product is published or updated. Use the productProjection field in the message payload to directly access the published Product data without making an additional API call.
  • ProductUnpublished: sent when a Product is unpublished. Remove the Product from your search engine index.

To set up event-driven updates:

  1. Create a Subscription in Composable Commerce that targets your preferred messaging service. Supported destinations include Google Cloud Pub/Sub, AWS SQS, AWS SNS, AWS EventBridge, Azure Service Bus, and Confluent Cloud.
  2. Configure the Subscription to listen for ProductPublished and ProductUnpublished Messages.
  3. Implement a message consumer that processes incoming Messages and updates your search engine index.

For example, to create a Subscription with Google Cloud Pub/Sub as the destination:

{
  "key": "product-sync-subscription",
  "destination": {
    "type": "GoogleCloudPubSub",
    "projectId": "your-gcp-project-id",
    "topic": "product-updates"
  },
  "messages": [
    { "resourceTypeId": "product", "types": ["ProductPublished", "ProductUnpublished"] }
  ]
}

Using Subscriptions with a cloud messaging service provides built-in message durability and retry mechanisms, making this approach more reliable than polling for production workloads.
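
A message consumer for these two Messages could be sketched as follows. The function name `handle_message` and the `upsert` and `delete` callables are hypothetical stand-ins for your search engine client. The sketch assumes the decoded message body exposes `type`, `resource.id`, and, for ProductPublished, `productProjection`.

```python
from typing import Any, Callable


def handle_message(
    message: dict[str, Any],
    upsert: Callable[[dict[str, Any]], None],
    delete: Callable[[str], None],
) -> str:
    """Apply one Subscription Message to the search index."""
    message_type = message.get("type")
    product_id = message.get("resource", {}).get("id")

    if message_type == "ProductPublished":
        projection = message.get("productProjection")
        if projection is None:
            # No embedded projection: the caller would need to fetch the
            # Product Projection by ID before indexing it.
            return "missing-projection"
        upsert(projection)
        return "upserted"

    if message_type == "ProductUnpublished" and product_id:
        delete(product_id)
        return "deleted"

    # Any other Message type is not relevant for the search index.
    return "ignored"
```

Writing the handler so that it is safe to run more than once for the same Message (upsert by ID, delete by ID) fits well with the retry behavior of message queues.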

Product sync with Connect

The open-source Algolia integration of the Store Launchpad for B2C Retail is designed for use with the B2C sample data, and it synchronizes product data with Algolia search. You can create a private Connector using this repository by following the Connect getting started guide.

Product Discovery with Frontend

To support product discovery functionality in the frontend with commercetools Frontend, there are two main concepts to understand.

Frontend components

A Frontend component is a central building block of Frontend development and consists of two parts:
  • JavaScript entry point, which is a React component that receives some special props.
  • Frontend component schema, which is a JSON file that defines the data a Frontend component requires and how it must be configured.
For an external search functionality in the frontend, the Store Launchpad for B2C Retail comes with an out-of-the-box data sync integration with Algolia, which you can use as a reference to customize or enhance search based on your needs.

Algolia search components must be developed with React InstantSearch Hooks. The Store Launchpad for B2C Retail has two out-of-the-box Algolia search components:

  • Real-time interactive search: supports text search on the website.
  • Interactive product list for category and product search pages: supports filtering in PLP.
For more information, see Frontend-Algolia integration.

Extensions

Frontend extensions are JavaScript or TypeScript functions that run inside the API hub. The API hub acts as an orchestration layer to integrate with the external search engine, commercetools, or any other third party.
For more information, see Extensions.

Conclusion

In this guide, we've highlighted how the HTTP API, Connect, and Frontend can support end-to-end search functionality by integrating with an external search engine, and how you can take advantage of the out-of-the-box data sync integration with Algolia provided by the Store Launchpad for B2C Retail.