On 4.x
The Nosto module for Magento 2, from version 4.0.0 up to (but not including) 5.0.0, adds two new indexers alongside the existing Magento indexers. These indexers substantially improve the performance of synchronising the product catalog with Nosto and of generating the product tagging for product detail pages.
If you are using version >= 5.0.0 of the Nosto module, please refer to this article.
The indexers can be found by navigating to the Index Management view in the Magento settings.
The two indexers are named Nosto Product Data Invalidator and Nosto Product Data Index.
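The same indexers can also be listed from the command line. The sketch below assumes Magento's CLI is reachable as `bin/magento` from the current directory; the `MAGENTO_BIN` variable and the function name are hypothetical helpers, not part of the Nosto module.

```shell
# Path to the Magento CLI (assumption; override MAGENTO_BIN for your setup).
MAGENTO_BIN="${MAGENTO_BIN:-bin/magento}"

# Print only the Nosto rows of `indexer:info`, which lists every
# registered indexer with its ID and title.
nosto_list_indexers() {
  "$MAGENTO_BIN" indexer:info | grep -i nosto
}
```

Calling `nosto_list_indexers` shows the indexer IDs that the reindex commands expect as arguments.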
The first indexer listens for product changes in Magento. When a product is updated, the indexer marks it as dirty in the product cache table.
The second indexer listens for changes in the product cache table itself. When a table entry is marked as dirty, the DataIndexer checks whether the product data has actually changed.
The product will be rebuilt, serialised and stored in the product_data field. Then the product is set as out of sync, which means that the data should be sent to Nosto through our APIs.
To further optimise the process, the module makes use of message queues.
All rebuilt products are divided into batches and passed as messages to the queue processor, which sends the data to Nosto and marks each product as in_sync once the request succeeds.
You can run a full reindex of the product catalog by using Magento's built-in CLI indexer. From version 4.0.0 onwards, the Nosto extension uses two indexers.
To reindex all products, re-run both indexers:
Invalidate indexer
Product data indexer
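A minimal sketch of the two reindex calls, in the order listed above. The indexer IDs `nosto_index_product_invalidate` and `nosto_index_product_data` are assumptions here; confirm them with `bin/magento indexer:info` before use. `MAGENTO_BIN` and the function name are hypothetical.

```shell
# Path to the Magento CLI (assumption; override MAGENTO_BIN for your setup).
MAGENTO_BIN="${MAGENTO_BIN:-bin/magento}"

# Assumed indexer IDs; verify with `bin/magento indexer:info`.
NOSTO_INVALIDATE_INDEXER="${NOSTO_INVALIDATE_INDEXER:-nosto_index_product_invalidate}"
NOSTO_DATA_INDEXER="${NOSTO_DATA_INDEXER:-nosto_index_product_data}"

# Run the invalidate indexer first, then the product data indexer.
# The second command only runs if the first one succeeds.
nosto_full_reindex() {
  "$MAGENTO_BIN" indexer:reindex "$NOSTO_INVALIDATE_INDEXER" &&
    "$MAGENTO_BIN" indexer:reindex "$NOSTO_DATA_INDEXER"
}
```

Setting `MAGENTO_BIN=echo` before calling `nosto_full_reindex` prints the arguments instead of running Magento, which is a cheap way to review the commands first.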
Starting with version 2.2.6, Magento supports parallel reindexing. Both Nosto indexers support parallelisation and can be executed in parallel mode. The indexers are scoped by store: if a merchant has n stores, n processes run in parallel, each indexing a specific store (also called a "dimension").
There are a few steps to be taken before enabling parallelisation:
Check the dimension mode for the indexer
Set the dimension mode of both indexers to store
Make sure that the number of threads declared in the environment variable MAGE_INDEXER_THREADS_COUNT is equal to the maximum number of stores.
For testing purposes, it can be declared in the CLI, like:
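The steps above can be sketched as follows. This assumes the Nosto indexers are registered with Magento's `indexer:show-dimensions-mode` and `indexer:set-dimensions-mode` commands under the IDs `nosto_index_product_invalidate` and `nosto_index_product_data` (assumed IDs; check `bin/magento indexer:info`). `MAGENTO_BIN` and the function names are hypothetical.

```shell
# Path to the Magento CLI (assumption; override MAGENTO_BIN for your setup).
MAGENTO_BIN="${MAGENTO_BIN:-bin/magento}"

# Show the current dimension modes, then switch both Nosto indexers
# (assumed IDs) to the "store" dimension mode.
nosto_enable_store_dimensions() {
  "$MAGENTO_BIN" indexer:show-dimensions-mode &&
    "$MAGENTO_BIN" indexer:set-dimensions-mode nosto_index_product_invalidate store &&
    "$MAGENTO_BIN" indexer:set-dimensions-mode nosto_index_product_data store
}

# Reindex both Nosto indexers with the thread count given as $1.
# The variable is set only for this single command, which suits testing.
nosto_parallel_reindex() {
  MAGE_INDEXER_THREADS_COUNT="$1" "$MAGENTO_BIN" indexer:reindex \
    nosto_index_product_invalidate nosto_index_product_data
}
```

For a shop with three stores, `nosto_parallel_reindex 3` would run the reindex with three threads. For permanent use, the variable is better declared in the server environment than on the command line.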
We recommend the following best practices for Nosto indexers.
We strongly advise setting the mode of both indexers to Update by Schedule for better performance. This also makes product updates to Nosto more reliable. For example, scheduled catalog price rules are not updated to Nosto in real time unless the indexer mode is set to Update by Schedule.
If you have multiple store views, we recommend that you enable multi-dimensional indexing for both indexers.
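Switching both indexers to Update by Schedule can be done from the CLI with Magento's `indexer:set-mode` command. The indexer IDs below are assumptions (verify with `bin/magento indexer:info`); `MAGENTO_BIN` and the function name are hypothetical.

```shell
# Path to the Magento CLI (assumption; override MAGENTO_BIN for your setup).
MAGENTO_BIN="${MAGENTO_BIN:-bin/magento}"

# Set both Nosto indexers (assumed IDs) to "schedule" mode,
# then print the resulting modes as a check.
nosto_schedule_mode() {
  "$MAGENTO_BIN" indexer:set-mode schedule \
    nosto_index_product_invalidate nosto_index_product_data &&
    "$MAGENTO_BIN" indexer:show-mode \
      nosto_index_product_invalidate nosto_index_product_data
}
```

Scheduled indexers are processed by Magento cron, so cron must be running for the updates to reach Nosto.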
If you are having issues with indexing, first enable Magento's debug logging (https://devdocs.magento.com/guides/v2.3/config-guide/cli/logging.html). This enables more verbose logging for the indexing. Indexing-related entries are written to the debug log (debug.log by default), and all of them are prefixed with "nosto".
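A small sketch of both steps: enabling debug logging and filtering the log for Nosto entries. It assumes the default log location `var/log/debug.log` relative to the Magento root; `MAGENTO_BIN` and the function names are hypothetical.

```shell
# Path to the Magento CLI (assumption; override MAGENTO_BIN for your setup).
MAGENTO_BIN="${MAGENTO_BIN:-bin/magento}"

# Enable Magento's debug logging and flush the cache so it takes effect.
nosto_enable_debug_logging() {
  "$MAGENTO_BIN" setup:config:set --enable-debug-logging=true &&
    "$MAGENTO_BIN" cache:flush
}

# Print the Nosto entries of a log file.
# $1: log file path (defaults to var/log/debug.log, an assumption).
nosto_log_entries() {
  grep -i 'nosto' "${1:-var/log/debug.log}"
}
```

To follow new entries while an indexer runs, `tail -f var/log/debug.log | grep -i nosto` does the same filtering on a live log.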
If you frequently update a large number of products (for example via API or import), there is a chance that the indexer cannot finish processing the previous update before the next update batch is executed. In these cases we recommend parallelising the indexer as a first step.
We also recommend identifying the source of the frequent product updates and optimising the mview subscriptions / triggers. For example, if a third-party module or integration frequently updates all product images but those images are not used for recommendations, you might want to remove the gallery-related subscriptions. The mview.xml file can be modified, for example, using Magento's patches.
innodb_buffer_pool_size
You will most likely see this warning in your Magento logs if you've installed MySQL using the defaults. To get rid of this warning, we recommend increasing innodb_buffer_pool_size in your MySQL server configuration. You can find more information about indexer optimisation in the official Magento documentation.
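Before changing the setting, it can help to check the current value. The sketch below assumes the `mysql` client is installed and can connect with its default credentials (for example via `~/.my.cnf`); `MYSQL_CMD` and the function name are hypothetical.

```shell
# Print the current innodb_buffer_pool_size in MiB.
# MYSQL_CMD is a hypothetical override, e.g. "mysql -u root -p".
innodb_buffer_pool_mib() {
  # -N drops the column header, -B gives plain tab-separated output.
  bytes="$(${MYSQL_CMD:-mysql} -N -B -e 'SELECT @@innodb_buffer_pool_size')" || return 1
  echo $((bytes / 1024 / 1024))
}
```

A stock MySQL installation reports 128 here, since the default buffer pool is 134217728 bytes. The new value goes under the `[mysqld]` section of the server configuration file and needs sizing against the memory available on the database host.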
If the product data is not synchronised to Nosto, verify that the message queue consumers nosto_product_sync.update and nosto_product_sync.delete are running. Magento cron should start (and, if needed, restart) the consumers automatically, but you can verify this by checking the process list (ps -ax, for example) on your server.
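The process-list check can be scripted roughly as below. It assumes the consumers show up in the process list with a command line containing `queue:consumers:start <consumer name>`, which is how Magento's CLI starts a consumer. `PS_CMD` and the function name are hypothetical.

```shell
# Report whether each Nosto consumer has a matching process.
# PS_CMD is a hypothetical override for the process listing command.
nosto_consumer_status() {
  processes="$(${PS_CMD:-ps -ax})"
  for consumer in nosto_product_sync.update nosto_product_sync.delete; do
    # -F matches the string literally, -q suppresses grep's output.
    if printf '%s\n' "$processes" | grep -Fq "queue:consumers:start $consumer"; then
      echo "$consumer: running"
    else
      echo "$consumer: NOT running"
    fi
  done
}
```

If a consumer is reported as not running, `bin/magento queue:consumers:list` shows whether it is registered at all, and `bin/magento queue:consumers:start nosto_product_sync.update` starts it manually in the foreground.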
You can also see the number of products that are out of sync ("Products Out Of Sync") or need to be rebuilt ("Products Marked As Dirty") in Magento's Nosto settings (Stores > Configuration > Services > Nosto).
If you have the indexers running in mode "Update on Save", bulk operations are not automatically reflected to Nosto. This is due to how Magento processes bulk updates internally.
It is highly recommended to run all indexers in mode "Update by schedule".
If the indexing process is too slow even after parallelising it, you can set the product cache building process to be run in a cron job. This way the sometimes heavy operation of building the product cache does not affect other Magento indexers. Please keep in mind that if the product building is done in a cron job, you cannot utilise parallelisation, and the product synchronisation to Nosto will be slower as a result.
You can switch the product cache building to a cron job from the Nosto module's settings (Stores > Configuration > Services > Nosto).
This happens by design. When Nosto settings that affect Nosto product data are changed and the indexers are set to run in mode "Update by Schedule", Nosto automatically triggers a full reindex to keep the product data up to date.
We highly recommend updating your module version to > 5.0.0.