To get the best results from the Import API, we recommend the following best practices when implementing it.
How you organize your Import Containers is entirely up to you. We recommend breaking them down by data source, by resource type, or as reusable containers for recurring import activities.
| Use Case | Possible Import Container breakdown |
| --- | --- |
| Import Products and Categories | Create separate containers for Products and Categories. |
| Import Price changes daily at 5 PM | Create one container that can be reused. If there are more than 200 000 imports per day, break them down by some other business logic, or use a temporary container for the excess. |
| Import Product changes from multiple sources | Create one container per source for Product imports. |
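The breakdowns above can be expressed as a simple client-side naming scheme for container keys. This is a minimal sketch; the `container_key` helper and the key format are assumptions for illustration, not part of the Import API.

```python
# Hypothetical naming scheme for Import Container keys, following the
# breakdowns in the table above (the key format is an assumption, not an API rule).
def container_key(resource_type, source=None, schedule=None):
    """Build a container key per resource type, data source, or recurring schedule."""
    parts = [resource_type.lower()]
    if source:
        parts.append(source.lower())   # one container per data source
    if schedule:
        parts.append(schedule)         # reusable container for recurring imports
    return "-".join(parts)

print(container_key("Product"))                # product
print(container_key("Price", schedule="daily"))  # price-daily
print(container_key("Product", source="ERP"))  # product-erp
```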
We recommend having up to 200 000 Import Requests at a time in a single Import Container. This keeps monitoring activities at the container level inexpensive. Since Import Operations are automatically deleted 48 hours after their creation time, you can reuse Import Containers within this limit over time.
| Day | Import Operation total count | Import Containers (Import Operation count) |
| --- | --- | --- |
| Day 1 | 100 000 | container-a (100 000) |
| Day 2 | 500 000 | container-a (100 000), container-b (200 000), container-c (200 000) |
| Day 3 | 400 000 | container-a (0), container-b (200 000), container-c (200 000) |
| Day 4 | 200 000 | container-a (200 000), container-b (0), container-c (0) |
On Day 3, container-a is empty because all of its Import Operations reached the 48-hour limit and expired, so you can reuse it. Similarly, container-b and container-c are cleaned up on Day 4.
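The reuse pattern above can be sketched with a little client-side bookkeeping. The `ContainerTracker` class is a hypothetical helper, not part of the Import API; only the 48-hour retention and the 200 000 recommendation come from this document.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(hours=48)   # Import Operations expire after 48 hours
CONTAINER_LIMIT = 200_000         # recommended operations per container

class ContainerTracker:
    """Client-side bookkeeping of what was sent where (hypothetical helper)."""

    def __init__(self):
        # container key -> list of (creation_time, operation_count) batches
        self.batches = {}

    def record(self, container, when, count):
        self.batches.setdefault(container, []).append((when, count))

    def live_count(self, container, now):
        """Operations not yet expired, i.e. created within the last 48 hours."""
        return sum(c for t, c in self.batches.get(container, [])
                   if now - t < RETENTION)

    def pick_container(self, containers, now, count):
        """Reuse the first container whose live count leaves room for `count`."""
        for key in containers:
            if self.live_count(key, now) + count <= CONTAINER_LIMIT:
                return key
        return None  # all containers full: create a new one

tracker = ContainerTracker()
day1 = datetime(2024, 1, 1)
tracker.record("container-a", day1, 100_000)
# Day 3: the Day-1 operations are past the 48-hour retention, so container-a
# is empty again and can be reused for a fresh batch.
day3 = day1 + timedelta(days=2)
print(tracker.live_count("container-a", day3))  # 0
```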
Although there is currently no limit on the number of Import Containers, you may expect an expiration or deletion policy on them in the future. There is also no limit on the number of Import Operations or Import Requests per container, but we recommend breaking containers down at 200 000 Import Operations, especially if you will query these containers. If you have an event-based architecture and do not plan to actively monitor these containers, you can have more Import Operations per container, even 500 000 or more.
You can delete Import Containers at your own convenience. We recommend using more containers with less data rather than fewer containers with more data.
You do not need to clean up an Import Container, as Import Operations are automatically deleted 48 hours after they are created. If needed, you can delete the Import Container; this immediately deletes all Import Operations in the container. However, data that has already been imported is not deleted from the platform.
Because the batch size is limited to 20 resources per Import Request, if you have a large number of resources to import, we recommend optimizing your threading to send your data to an Import Container as fast as possible. Note that as soon as the first Import Request is received by the Import Container, the import already starts asynchronously.
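A minimal sketch of that threading pattern: split the resources into batches of 20 and submit them from a thread pool. The `send_import_request` function is a placeholder for your actual Import API client call.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 20  # maximum resources per Import Request

def chunk(resources, size=BATCH_SIZE):
    """Split resources into Import-Request-sized batches."""
    return [resources[i:i + size] for i in range(0, len(resources), size)]

def send_import_request(batch):
    # Placeholder for the real HTTP call to the Import API (assumed client code).
    return len(batch)

resources = list(range(205))  # e.g. 205 products to import
batches = chunk(resources)

# Send batches from multiple threads so the container fills as fast as possible;
# the import starts as soon as the first request arrives.
with ThreadPoolExecutor(max_workers=8) as pool:
    sent = list(pool.map(send_import_request, batches))

print(len(batches), sum(sent))  # 11 batches, 205 resources sent
```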
- Two of our monitoring endpoints, Import Summary and Query Import Operations, are container-based. You can call the Import Summary endpoint for a quick summary and later fetch details using the Query Import Operations endpoint.
- The Import Summary endpoint should be used to get an aggregated summary of the progress, which tells you whether you have any errors.
- Query Import Operations should be used with filters, such as states, to query specific situations. For example, fetch all the errors to fix them, or fetch the unresolved operations to resolve them.
- You can use debug mode to fetch the unresolved references if there are operations in the `unresolved` state. This way, you know which references to resolve.
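The monitoring flow above can be sketched with an in-memory stand-in for the two endpoints: check the aggregated summary first, and only query individual operations (with debug mode for unresolved references) when the summary shows problems. The operation shapes and state names here are assumptions for illustration; the real calls are HTTP requests against the Import API.

```python
# Hypothetical in-memory stand-in for the container's Import Operations.
OPERATIONS = [
    {"id": "op-1", "state": "imported"},
    {"id": "op-2", "state": "rejected", "errors": ["DuplicateField"]},
    {"id": "op-3", "state": "unresolved",
     "unresolvedReferences": [{"typeId": "product", "key": "red-shirt"}]},
]

def import_summary(operations):
    """Aggregate counts per state, like the Import Summary endpoint."""
    summary = {}
    for op in operations:
        summary[op["state"]] = summary.get(op["state"], 0) + 1
    return summary

def query_operations(operations, state, debug=False):
    """Filter by state; with debug mode, unresolved references are included."""
    results = [op for op in operations if op["state"] == state]
    if not debug:
        results = [{k: v for k, v in op.items()
                    if k != "unresolvedReferences"} for op in results]
    return results

summary = import_summary(OPERATIONS)
print(summary)  # {'imported': 1, 'rejected': 1, 'unresolved': 1}

# Only dig into details when the summary shows something to act on.
if summary.get("unresolved"):
    for op in query_operations(OPERATIONS, "unresolved", debug=True):
        print(op["unresolvedReferences"])  # tells you which references to send
```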
The purpose of keeping Import Operations for 48 hours is to allow you to send other referenced data (unresolved references) during this time period.
Suppose one of your teams is responsible for Product imports, but business validation usually delays the Product import by 1-2 days, while another team imports Prices and is very fast at importing its data. The Import API will keep the Price data for a maximum of 48 hours and wait for the Products to be imported.
- The Import API does not have a data import limit for your project. However, we advise against importing all of your data through one Import Container. See the Import Container optimization best practices above.
- Please refer to the platform limits to review the limits on individual resources and make sure your Import Requests are aligned with your project limits.
You need to retry only if your Import Operation has the rejected status. In all other cases, you do not need to retry; the Import API handles retries internally.
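That rule reduces to a small guard in client code. This is a sketch under the assumptions that operations carry a `state` field and that `send` is your own function performing the actual Import Request; neither helper is part of the Import API.

```python
def resend_if_rejected(operation, send):
    """Resend an Import Request only when its operation was rejected.

    `operation` is a dict like {"state": ..., "resourceKey": ...} and `send`
    is the client function that performs the actual Import Request (assumed).
    """
    if operation["state"] == "rejected":
        return send(operation["resourceKey"])
    return None  # other states are retried internally by the Import API

sent = []
resend_if_rejected({"state": "rejected", "resourceKey": "red-shirt"}, sent.append)
resend_if_rejected({"state": "unresolved", "resourceKey": "blue-shirt"}, sent.append)
print(sent)  # ['red-shirt']
```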
- You should not send duplicate Import Requests concurrently. Since the Import API imports data asynchronously, the order is currently not guaranteed, and duplicates may also lead to a concurrent modification error.
- You should not query the Import Operations or Import Summary endpoints repeatedly in case of errors without fixing the problems; this may slow down the import process. You can use debug mode if required to find out more details, and retry after fixing the problems.
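To avoid hammering the monitoring endpoints, polling with exponential backoff is one option. A minimal sketch, assuming `fetch_summary` is your own client function returning a state-to-count dict and `"processing"` is the pending state to watch (both assumptions for illustration):

```python
import time

def poll_summary(fetch_summary, max_polls=6, base_delay=1.0, sleep=time.sleep):
    """Poll the Import Summary with exponential backoff instead of a tight loop.

    `fetch_summary` is assumed client code returning a state->count dict;
    polling stops as soon as nothing is pending.
    """
    summary = fetch_summary()
    for attempt in range(max_polls):
        if summary.get("processing", 0) == 0:
            return summary
        sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
        summary = fetch_summary()
    return summary

# Simulated endpoint: two polls still processing, the third is done.
responses = iter([
    {"processing": 5, "imported": 0},
    {"processing": 2, "imported": 3},
    {"processing": 0, "imported": 5},
])
delays = []
result = poll_summary(lambda: next(responses), sleep=delays.append)
print(result, delays)  # {'processing': 0, 'imported': 5} [1.0, 2.0]
```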