To get the most out of the Import API, we recommend the following best practices.
Using Import Containers effectively
Organizing Import Containers
It is entirely up to you how to organize your Import Containers.
As a general recommendation, use more Import Containers that each contain less data (rather than fewer containers with more data). The best setup, however, depends on your use case and your organizational and monitoring needs:
- When importing full data sets, creating a new dedicated Import Container for the task can help in distinguishing imports performed on different occasions.
- When performing routine data imports, reusing a dedicated Import Container may be better.
- When importing data from multiple sources, using an Import Container for each source can help in organizing and monitoring progress.
In these three use cases, Import Containers would be organized by resource type, reusing a container for recurring import activities, or by data source.
| Use case | Possible Import Container organization |
| --- | --- |
| Import Products and Categories | Create separate containers for Products and Categories. |
| Import Price changes daily at 5 PM | Create a reusable container. If there are more than 200 000 imports per day, break them down by some other business logic, or use a temporary container for the excess. |
| Import Product changes from multiple sources | Create one container per source for Product imports. |
| Day | Import Operation total count | Import Containers (Import Operation count) |
| --- | --- | --- |
| Day 1 | 100 000 | container-a (100 000) |
| Day 2 | 500 000 | container-a (100 000), container-b (200 000), container-c (200 000) |
| Day 3 | 400 000 | container-a (0), container-b (200 000), container-c (200 000) |
| Day 4 | 200 000 | container-a (200 000), container-b (0), container-c (0) |
On Day 3, container-a is empty because all of its Import Operations have passed the 48-hour lifetime; container-a is now ready to be reused. Similarly, container-b and container-c can be reused from Day 4.
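The reuse rule above can be sketched locally: a container is safe to reuse once every Import Operation in it is older than the 48-hour lifetime. This is a minimal illustration with a hypothetical `is_reusable` helper, not part of the Import API itself.

```python
from datetime import datetime, timedelta

# Import Operations are kept for 48 hours (per the Import API lifetime).
LIFETIME = timedelta(hours=48)

def is_reusable(operation_timestamps, now):
    """A container is reusable once all its operations have expired."""
    return all(now - ts >= LIFETIME for ts in operation_timestamps)

day1 = datetime(2024, 1, 1, 9, 0)
day3 = day1 + timedelta(days=2)

# container-a was filled on Day 1; by Day 3 its operations have expired.
print(is_reusable([day1], now=day3))  # True: safe to reuse container-a
```

On Day 2 the same check would return False, since the Day 1 operations are still within their 48-hour lifetime.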
You can have any number of Import Containers.
There are no limits on Import Operations and Import Requests per Import Container, but we recommend keeping containers to about 200 000 Import Operations each, especially if you plan to query these containers.
If you have an event based architecture and do not plan to actively monitor these containers, you could have more Import Operations per container (such as up to 500 000 or higher).
Cleaning up data from Import Containers and removing unused Import Containers
You can delete Import Containers at your own convenience. This will immediately delete all Import Operations in the container. However, data that has been imported to your Project will not be affected.
How to send Import Requests to Import Containers?
The batch size is limited to 20 resources per Import Request. If you have a large number of resources to import, we recommend sending Import Requests in parallel (for example, from multiple threads) to get your data into an Import Container as fast as possible.
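A minimal sketch of this pattern: split the resources into batches of at most 20 and submit the batches concurrently. The `send_import_request` function is a stand-in for the actual HTTP call to the Import API.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_BATCH = 20  # maximum resources per Import Request

def chunk(resources, size=MAX_BATCH):
    """Split resources into batches of at most `size`."""
    return [resources[i:i + size] for i in range(0, len(resources), size)]

def send_import_request(batch):
    # Placeholder for the real API call; here it just reports the batch size.
    return len(batch)

resources = [{"key": f"product-{i}"} for i in range(45)]
batches = chunk(resources)  # 45 resources -> 3 batches (20, 20, 5)

# Send the batches concurrently from multiple threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    sent = list(pool.map(send_import_request, batches))

print(sent)  # [20, 20, 5]
```

In a real client, keep the level of parallelism moderate and respect any rate limits that apply to your Project.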
When to use product draft vs product/variant/prices individually?
How to effectively monitor the import progress?
Two of our monitoring endpoints, Import Summary and Query Import Operations, are container-based. Call the Import Summary endpoint for a quick summary and (later) the Query Import Operations to fetch details.
- Import Summary can be used to get an aggregated progress summary, which tells you whether there are any errors.
- Query Import Operations can be used with filters such as states to query specific situations. For example, query for all the errors in order to fix them, or for the unresolved operations in order to resolve them.
- You can use debug mode to fetch the unresolved references if there are operations in the unresolved state. This way, you know which references to resolve.
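The recommended monitoring flow can be sketched as: first inspect an aggregated summary, and only if it shows problems, query the individual operations by state. The operation records and state names below are illustrative stand-ins for what the real endpoints return.

```python
from collections import Counter

# Hypothetical records, shaped like Query Import Operations results.
operations = [
    {"id": "op-1", "state": "imported"},
    {"id": "op-2", "state": "unresolved"},
    {"id": "op-3", "state": "validationFailed"},
    {"id": "op-4", "state": "imported"},
]

# Step 1: a quick aggregated view, as Import Summary would provide.
summary = Counter(op["state"] for op in operations)
print(dict(summary))

# Step 2: only if the summary shows problems, drill down by state.
if summary.get("validationFailed") or summary.get("unresolved"):
    problems = [op["id"] for op in operations if op["state"] != "imported"]
    print(problems)  # operations to inspect, fix, and resolve
```

This avoids repeatedly querying the detailed endpoints when the summary already shows that everything was imported.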
How to best utilize the 48 hours lifetime of import operations?
Import Operations are kept for 48 hours to allow you to send other referenced data (unresolved references) during this time period.
For example, suppose one of your teams is responsible for the Product import, but business validation usually delays it by 1-2 days, while another team imports Prices much faster. In this case, the Import API keeps the Price data for up to 48 hours and waits for the Products to be imported.
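This resolution behavior can be sketched as follows: a Price referencing a Product that has not arrived yet stays unresolved, and it resolves automatically once the Product is imported (within the 48-hour window). The functions and state handling here are a simplified local model, not the Import API client.

```python
imported_products = set()
price_ops = {}  # price key -> (state, referenced product key)

def import_price(key, product_key):
    # A Price stays unresolved until its referenced Product exists.
    state = "imported" if product_key in imported_products else "unresolved"
    price_ops[key] = (state, product_key)

def import_product(key):
    imported_products.add(key)
    # Pending Prices that referenced this Product can now resolve.
    for price_key, (state, ref) in price_ops.items():
        if state == "unresolved" and ref == key:
            price_ops[price_key] = ("imported", ref)

import_price("price-1", "product-1")  # Product not there yet -> unresolved
import_product("product-1")           # arrives within 48 hours
print(price_ops["price-1"][0])        # imported
```

If the Product never arrived within 48 hours, the unresolved Price operation would expire instead.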
What is a very large import? What are the limitations of the Import API?
What is the recommendation on retries?
You only need to retry if your Import Operation has the rejected status. In all other cases, the Import API handles retries internally, and you do not need to retry.
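A sketch of that retry policy: filter the operation results and re-send only the rejected ones. The result data is illustrative.

```python
# Only `rejected` operations need a client-side retry; other states
# are retried internally by the Import API.
def needs_client_retry(state):
    return state == "rejected"

results = {
    "op-1": "imported",
    "op-2": "rejected",
    "op-3": "unresolved",  # the Import API keeps waiting for up to 48 hours
}

to_retry = [op for op, state in results.items() if needs_client_retry(state)]
print(to_retry)  # ['op-2']
```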
What not to do
- Do not send duplicate Import Requests concurrently. Since the Import API imports data asynchronously, the order is not guaranteed, and duplicates may also lead to a concurrent modification error.
- In case of errors, do not repeatedly query the Import Operations or Import Summary endpoints without fixing the problems, as this may slow down the import process. If required, use debug mode to find out more details, and retry after fixing the problems.