On 4.x

Nosto module (4.0.0 < 5.0.0) for Magento 2 introduced two new indexers to the existing Magento indexers. These indexers are used to drastically improve the performance of synchronising the product catalog with Nosto as well as generating the product tagging for product detail pages.

If you are using version >= 5.0.0 of Nosto module, please refer to this article.

The indexers can be found by navigating to the Index Management view in the Magento settings.

The two indexers are named Nosto Product Data Invalidator and Nosto Product Data Index.

The main focus of the first indexer is to listen for product changes from Magento. In case a product is updated, the indexer will sign the product as dirty in the cached product table.

The second indexer will listen for changes inside the product cache table itself. When a table entry is set to dirty, the DataIndexer will check and compare that the product has changed.

The product will be rebuilt, serialised and stored in the product_data field. Then the product is set as out of sync, which means that the data should be sent to Nosto through our APIs.

To further optimise the process, the module makes use of message queues.

All the products that have been rebuilt will be divided into batches and passed as messages to queue processor which will take care of sending the data and set the product as in_sync after it's a success.

Manually Running the Indexer

You can run a full reindex of the product catalog by using Magento's built-in CLI indexer. After version 4.0.0, Nosto extension makes use of two indexers.

To reindex all products, re-run both indexers:

  • Invalidate indexer

    bin/magento indexer:reset nosto_index_product_invalidate
    bin/magento indexer:reindex nosto_index_product_invalidate
  • Product data indexer

    bin/magento indexer:reset nosto_index_product_data
    bin/magento indexer:reindex nosto_index_product_data

Indexer Parallelisation

Starting with version 2.2.6 Magento supports parallel reindexing. Nosto's indexers support parallelization and both the Nosto indexers can be executed in parallel mode. The indexers are scoped based on stores. This means that if a merchant has n-stores, there will be n-processes running in parallel, each indexing a specific store (also called a "Dimension").

There are a few steps to be taken before enabling parallelisation:

  1. Check the dimension mode for the indexer

bin/magento indexer:show-dimensions-mode
Product Price:                                     none
Nosto Product Index Data :                         none
Nosto Product Data Invalidator:                    none
  1. Set the indexer mode for both to store

bin/magento indexer:set-dimensions-mode nosto_index_product_data store
Dimensions mode for indexer "Nosto Product Index Data " was changed from 'none' to 'store'
bin/magento indexer:set-dimensions-mode nosto_index_product_invalidate store
Dimensions mode for indexer "Nosto Product Data Invalidator" was changed from 'none' to 'store'
  1. Make sure that the number of threads declared in the env variable MAGE_INDEXER_THREADS_COUNT is equal to the max number of stores.

For testing purposes, it can be declared in the CLI, like:

MAGE_INDEXER_THREADS_COUNT=3 php -f bin/magento indexer:reindex nosto_index_product_data

Best Practices

We recommend the following best practices for Nosto indexers.

  • We strongly advise that both indexer modes are set to Update by Schedule for better performance. This will also make the product updates to Nosto more reliable. For example the scheduled catalog price rules would not be updated in real-time to Nosto unless the indexer mode is set to Update by Schedule

  • If you have multiple store views, we recommend that you enable multi-dimensional indexing for both indexers.

Troubleshoot

If you are having issues with indexing you want to first enable Magento's debug logging https://devdocs.magento.com/guides/v2.3/config-guide/cli/logging.html. This will enable more verbose logging for the indexing. You will find indexing related logs from debug log (debug.log by default). All log entries are prefixed with "nosto".‌

Indexer is not keeping up with product updates‌

If you are frequently updating massive amount of products (for example via API or import) there's a chance that the indexer cannot process the previous update before the next update batch is executed. In these cases we recommend parallelising the indexer as a first step.‌

We also recommend figuring out the source of frequent product updates and do optimisations for the mview subscriptions / triggers. For example if you are using 3rd party module / integration that updates all product images frequently but those images are not used for recommendations you might want to remove gallery related subscriptions. Modifying the mview.xml file can be done for example using Magento's patches.‌

Warning about innodb_buffer_pool_size

You will most likely see this warning in your Magento logs if you've installed MySQL using the defaults. To get rid of this warning we recommend increasing innodb_buffer_pool_size on you MySQL server configuration. You can find more info about indexer optimization from the official Magento documentation.‌

Products not synchronized to Nosto

If the product data is not synchronized to Nosto you must verify the message queue consumers nosto_product_sync.update and nosto_product_sync.delete are running. Magento cron should take care of running (and restarting if needed) the consumers automatically but you can verify this by checking the process list (ps -ax for example) on your server.‌

You can also see the amount of products that are out of sync ("Products Out Of Sync") or needs to be rebuilt ("Products Marked As Dirty") in Magento's Nosto settings (Stores > Configuration > Services > Nosto).‌

Bulk attribute updates not synchronized to Nosto

If you have the indexers running on mode "Update by save" the bulk operations are not automatically reflected to Nosto. This is due to how Magento processes bulk updates internally.‌

It is highly recommended to run all indexers in mode "Update by schedule".‌

Nosto indexer is blocking other Magento indexers ‌

If the indexing process is too slow even after parallelising the indexing process you can set the product cache building process to be ran in a cron job. This way the sometimes heavy operation of building product cache would not affect other Magento indexers. Please keep in mind that if you set the product building to be done in a cron job you cannot utilise the parallelisation and the product synchronisation to Nosto will be slower due to this.‌

You can switch the product cache to to a cron from Nosto module's settings (Stores > Services > Nosto).‌

Nosto indexer runs after Nosto module settings are changed ‌

This happens by design. When Nosto settings that affect Nosto product data are changed and indexers are defined to be run in mode "Update By Schedule" Nosto will automatically initialise a full reindex to keep the product data up to date.

We highly recommend updating your module version to > 5.0.0.

Last updated