In the digital commerce world, search is the primary means of product discovery, allowing your customers to easily find what they're looking for on the website. Poorly organized search results can create usability problems, frustrate customers, and most importantly, result in subpar customer experiences and lost sales. Hence, it is critical to implement an optimized search experience that can contribute to increased conversion rates.
Many of our customers choose to use a specialized commerce search engine. This guide outlines how to integrate commercetools Frontend and the commercetools commerce APIs with an external search engine to support end-to-end search functionality.
Functional scope
Search engine integrations are generally composed of two main areas:
Backend: product sync
As a prerequisite to supporting search functionality on commerce websites, product catalog data must be synchronized and indexed by the search engine. Product sync can be divided into full and partial product sync. A full product sync synchronizes the entire product catalog data to the search engine, while a partial product sync updates delta changes to the search engine.
Frontend: product discovery
Product discovery is the frontend implementation where the customer interacts with the search on the commerce website. These interactions include text search (including auto-complete), product listing pages (PLP), and search pages with filtering and faceting.
Solution
The diagram shows how both product sync and product discovery scenarios can be integrated.
Product data sync
- Product Messages
- Store Messages
- Product Selection Messages (required only if Product Selections are used to manage assortments for different Stores)
Key questions for your integration
Before designing your product data synchronization, consider the following business requirements that affect how you query and structure Product data for your search engine.
- Stores: do you operate multiple Stores with different assortments? If so, you need to synchronize Products per Store and use Product Selections to manage Store-specific assortments.
- Product Selections: are you using Product Selections to define different assortments? The Products available in each selection affect which Products you index for each Store or channel.
- Product Tailoring: do you tailor Product names, descriptions, or other fields per Store? If so, use Product Tailoring to retrieve Store-specific data alongside the base Product data.
- Localization: which locales does your storefront support? Ensure you include all required localized fields (names, descriptions, slugs) in your search engine index for each locale.
- Pricing: how do you structure Prices? Consider whether you need to index Embedded Prices per currency, country, Customer Group, or channel. If you use Standalone Prices, account for those in your data model as well.
- Inventory: do you need to reflect inventory availability in search results? If so, query Inventory Entries and include stock data per channel or supply channel in your index.
These questions help you determine the correct query parameters, data model, and Subscription Messages for your search engine index.
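To make the data-model decision concrete, the following is a minimal sketch of how one Product Projection could be flattened into a single-locale, single-currency search document. The function name `to_search_document` and the target document fields are hypothetical; your search engine's schema and your answers to the questions above determine the real mapping. The sketch reads only Embedded Prices on the master variant.

```python
from typing import Any, Optional


def to_search_document(
    projection: dict[str, Any], locale: str, currency: str
) -> dict[str, Any]:
    """Flatten one Product Projection into a flat document for one locale.

    The output field names are illustrative assumptions, not a required schema.
    """
    variant = projection.get("masterVariant", {})

    # Pick the first Embedded Price in the requested currency, if any.
    price: Optional[int] = next(
        (
            p["value"]["centAmount"]
            for p in variant.get("prices", [])
            if p["value"]["currencyCode"] == currency
        ),
        None,
    )

    return {
        "id": projection["id"],
        # Localized fields are maps keyed by locale; keep only one locale.
        "name": (projection.get("name") or {}).get(locale),
        "slug": (projection.get("slug") or {}).get(locale),
        "description": (projection.get("description") or {}).get(locale),
        "categoryIds": [c["id"] for c in projection.get("categories", [])],
        "sku": variant.get("sku"),
        "priceCentAmount": price,
        "currency": currency if price is not None else None,
    }
```

If you index several locales or currencies, one option is to call such a function once per combination and write each result to a locale-specific index.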
Full synchronization: the initial data load
Before receiving delta updates, you must perform a full synchronization to load the entire product catalog into the search engine. The following best practices apply:
- Use Product Projections to query the `current` (published) projection, which represents the data visible to customers.
- Use cursor-based pagination with the `sort` and `withTotal=false` parameters for efficient querying of large catalogs.
- If you operate multiple Stores, query Products per Store using the `storeProjection` parameter.
- Include Price Selection parameters to retrieve the correct Prices for each context (currency, country, Customer Group, channel).
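As an illustration of these practices, the sketch below assembles a query string for published Product Projections. The helper `build_projection_query` is hypothetical. The `priceCurrency` and `priceCountry` names are assumed to be the Price Selection query parameters; verify them against the API reference. The path omits the project key prefix and host, as in the rest of this guide.

```python
from typing import Optional
from urllib.parse import urlencode


def build_projection_query(
    store_key: Optional[str] = None,
    currency: Optional[str] = None,
    country: Optional[str] = None,
    last_id: Optional[str] = None,
    limit: int = 100,
) -> str:
    """Build a path and query string for one page of published projections."""
    params = {
        "staged": "false",      # current (published) projection
        "withTotal": "false",   # skip the total count for faster queries
        "sort": "id asc",       # stable order for cursor-based paging
        "limit": str(limit),
    }
    if last_id is not None:
        # Cursor: continue after the last Product ID of the previous page.
        params["where"] = f'id > "{last_id}"'
    if store_key:
        params["storeProjection"] = store_key
    if currency:
        params["priceCurrency"] = currency  # assumed Price Selection parameter
    if country:
        params["priceCountry"] = country    # assumed Price Selection parameter
    # urlencode percent-encodes spaces, quotes, and comparison operators.
    return "/product-projections?" + urlencode(params)
```

For example, `build_projection_query(store_key="berlin", currency="EUR")` encodes `sort=id asc` as `sort=id+asc`, which avoids sending raw spaces in the URL.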
The full synchronization follows these steps:
1. Prepare your search engine index: create or clear the target index in your search engine, defining the schema and field mappings for Product data.
2. Query Product Projections: retrieve the published Product Projections from Composable Commerce using pagination. Use the following query parameters for efficient pagination:

   `GET /product-projections?staged=false&withTotal=false&sort=id asc&limit=100`

   For subsequent pages, use the `where` parameter with the last Product ID from the previous page:

   `GET /product-projections?staged=false&withTotal=false&sort=id asc&limit=100&where=id > "last-product-id"`
3. Transform the data: map each Product Projection to the document schema required by your search engine. Include all relevant fields such as names, descriptions, categories, Prices, Attributes, and availability.
4. Index the data: send the transformed documents to your search engine in batches. Most search engines support bulk indexing operations for efficient loading.
5. Verify the index: confirm the total document count in your search engine matches the expected number of Products. Run test queries to validate the data quality.
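The paging and batching parts of these steps can be sketched as follows. This is a minimal sketch, not a definitive implementation: `fetch_page` is a hypothetical callable that you would implement with your HTTP client or SDK. It receives the last seen Product ID (or `None` for the first page) and a page size, and returns the `results` list for that page, sorted by `id asc`.

```python
from typing import Callable, Iterable, Iterator, Optional

# Hypothetical signature: (last_id, limit) -> list of Product Projections.
FetchPage = Callable[[Optional[str], int], list[dict]]


def iter_product_projections(
    fetch_page: FetchPage, limit: int = 100
) -> Iterator[dict]:
    """Yield every Product Projection using ID-based cursor pagination."""
    last_id: Optional[str] = None
    while True:
        page = fetch_page(last_id, limit)
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < limit:
            return
        last_id = page[-1]["id"]


def batches(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Group documents into lists of at most `size` for bulk indexing."""
    batch: list[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

A full sync could then loop over `batches(iter_product_projections(fetch_page), 500)`, transform each batch, and pass it to the bulk indexing call of your search engine. The batch size of 500 is only an example.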
For catalogs with more than 100,000 Products, run the full sync in parallel by partitioning on the Product ID.
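One possible way to partition on the Product ID is to split the ID space by its first character and give each worker its own range. The sketch below assumes that Product IDs are lowercase hexadecimal UUIDs and that string comparisons on `id` in query predicates are lexicographic; both are assumptions to verify for your Project. The helper names are hypothetical.

```python
from typing import Optional

HEX_DIGITS = "0123456789abcdef"


def id_partitions(count: int) -> list[tuple[str, Optional[str]]]:
    """Split the UUID space into `count` ranges keyed by first hex digit.

    Returns (lower, upper) pairs; upper is None for the last, open range.
    """
    if count < 1 or 16 % count != 0:
        raise ValueError("count must be 1, 2, 4, 8, or 16")
    step = 16 // count
    starts = [HEX_DIGITS[i] for i in range(0, 16, step)]
    uppers: list[Optional[str]] = [*starts[1:], None]
    return list(zip(starts, uppers))


def partition_predicate(
    lower: str, upper: Optional[str], last_id: Optional[str] = None
) -> str:
    """Build the `where` predicate for one page inside one partition."""
    # After the first page, continue from the last seen ID instead of `lower`.
    start = f'id > "{last_id}"' if last_id else f'id >= "{lower}"'
    return start if upper is None else f'{start} and id < "{upper}"'
```

Each worker would then page through its own range with `sort=id asc`, so no two workers process the same Product.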
Delta updates
After the initial data load, you need to keep the search engine index in sync with ongoing changes. Two approaches support delta updates: scheduled polling and event-driven updates with Subscriptions.
Scheduled polling
Scheduled polling queries Product Projections at a regular interval and uses the `lastModifiedAt` field to identify updated Products. Scheduled polling works as follows:
1. Store the timestamp of your last successful sync.
2. At each polling interval, query Product Projections with a `where` predicate filtering on `lastModifiedAt`:

   `GET /product-projections?staged=false&where=lastModifiedAt > "2025-01-01T00:00:00.000Z"&sort=lastModifiedAt asc&withTotal=false&limit=100`
3. Process and index the returned Products.
4. Update the stored timestamp to the `lastModifiedAt` value of the last processed Product.
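The polling steps above can be sketched as a single function that runs once per interval. Both callables are hypothetical placeholders: `fetch_modified_since` would issue the `lastModifiedAt` query and return the page of results sorted by `lastModifiedAt asc`, and `index_products` would write a page of Products to your search engine. Persisting the returned timestamp between runs is left to the caller.

```python
from typing import Callable

# Hypothetical signatures for the two collaborators.
FetchModified = Callable[[str, int], list[dict]]  # (since, limit) -> page
IndexProducts = Callable[[list[dict]], None]


def poll_once(
    fetch_modified_since: FetchModified,
    index_products: IndexProducts,
    last_sync: str,
    limit: int = 100,
) -> str:
    """Index all Products modified after `last_sync`; return the new timestamp."""
    cursor = last_sync
    while True:
        page = fetch_modified_since(cursor, limit)
        if not page:
            return cursor
        index_products(page)
        # Advance only after the page was indexed successfully.
        cursor = page[-1]["lastModifiedAt"]
        if len(page) < limit:
            return cursor
```

Because the cursor advances with a strict `>` comparison, Products that share the same `lastModifiedAt` value across a page boundary could be skipped. If that matters for your catalog, consider re-querying with a small overlap and relying on idempotent indexing.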
Scheduled polling is simpler to implement but has trade-offs compared to the event-driven approach:
| Consideration | Scheduled polling | Event-driven (Subscriptions) |
|---|---|---|
| Latency | Depends on the polling interval (minutes to hours). | Near real-time (seconds). |
| API usage | Each poll queries the API, regardless of whether changes occurred. | Messages are delivered only when changes occur. |
| Complexity | Lower initial complexity, but requires state management for timestamps. | Requires setting up a messaging service and message handlers. |
| Reliability | Stateful; if the stored timestamp is lost, a full re-sync may be needed. | Message queues provide built-in retry and delivery guarantees. |
| Scalability | Higher API load for frequently changing catalogs. | Scales well with high-frequency changes. |
Event-driven updates with Subscriptions
The key Messages for search engine synchronization are:
- ProductPublished: sent when a Product is published or updated. Use the `productProjection` field in the message payload to directly access the published Product data without making an additional API call.
- ProductUnpublished: sent when a Product is unpublished. Remove the Product from your search engine index.
To set up event-driven updates:
- Create a Subscription in Composable Commerce that targets your preferred messaging service. Supported destinations include Google Cloud Pub/Sub, AWS SQS, AWS SNS, AWS EventBridge, Azure Service Bus, and Confluent Cloud.
- Configure the Subscription to listen for `ProductPublished` and `ProductUnpublished` Messages.
- Implement a message consumer that processes incoming Messages and updates your search engine index.
For example, to create a Subscription with Google Cloud Pub/Sub as the destination:
{
  "key": "product-sync-subscription",
  "destination": {
    "type": "GoogleCloudPubSub",
    "projectId": "your-gcp-project-id",
    "topic": "product-updates"
  },
  "messages": [
    {
      "resourceTypeId": "product",
      "types": ["ProductPublished", "ProductUnpublished"]
    }
  ]
}
Using Subscriptions with a cloud messaging service provides built-in message durability and retry mechanisms, making this approach more reliable than polling for production workloads.
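The message consumer can be sketched as a small handler that maps each Message type to an index operation. In this minimal sketch, a plain dictionary stands in for the search engine client, and the function name `handle_message` is hypothetical. The handler assumes the Message has already been parsed from the queue payload into a dictionary with `type`, `resource`, and (for `ProductPublished`) `productProjection` fields.

```python
def handle_message(message: dict, index: dict) -> None:
    """Apply one product Message to the search index.

    `index` is a stand-in for a search engine client, keyed by Product ID.
    """
    product_id = message["resource"]["id"]
    message_type = message["type"]

    if message_type == "ProductPublished":
        # The payload already carries the published data, so no API call
        # is needed; upsert the document.
        index[product_id] = message["productProjection"]
    elif message_type == "ProductUnpublished":
        # Deleting a missing entry is a no-op, so reprocessing is safe.
        index.pop(product_id, None)
    # Other Message types are ignored by this sketch.
```

Both branches are idempotent, so processing the same Message twice leaves the index in the same state. In a real consumer you would typically transform the Product Projection into your search document schema before writing it.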
Product sync with Connect
Product Discovery with Frontend
To support product discovery functionality in the frontend with commercetools Frontend, there are two main concepts to understand.
Frontend components
- JavaScript entry point, which is a React component that receives some special props.
- Frontend component schema, which is a JSON file that defines the data a Frontend component requires and how it must be configured.
Algolia search components must be developed with React InstantSearch Hooks. The Store Launchpad for B2C Retail has two out-of-the-box Algolia search components:
- Real-time interactive search: supports text search on the website.
- Interactive product list for category and product search pages: supports filtering in PLP.
Extensions
Conclusion
In this guide, we've highlighted how the HTTP API, Connect, and Frontend can support end-to-end search functionality by integrating with an external search engine, and how you can take advantage of the out-of-the-box data sync integration with Algolia provided by the Store Launchpad for B2C Retail.