Recommendation systems have become omnipresent in our daily lives. Where the technology used to be limited to tech giants like Facebook, Netflix and Spotify, we now find them used in more and more websites.
Retail companies want to show relevant products when you are browsing their store page, podcast apps suggest new shows to listen to and news websites find content related to your topics of interest.
The research community is very active and proposes many new algorithms, methods and metrics.
At Froomle, the research team realised that the choice of algorithm is not as important as making sure that the model is up to date and that it is trained on the right amount of data.
Controlling these two factors well has a larger impact on customers’ KPIs than trying out many different algorithms.
This blog post will discuss why these two factors are so important, and how the Froomle solution controls them.
A trained recommendation model freezes user interests and item relationships. The world, on the other hand, keeps changing. Items that were related a while ago might no longer be related, old items lose relevance and user interests change. Eventually, as time passes, the model's frozen reality is so different from the environment's reality, that it hurts the quality of the recommendations.
This is especially true for news use cases: due to the quick rotation of relevant items, most items lose relevance after a short period in the spotlight.
To limit the gap between the model’s reality and the environment’s reality, the models need to be kept up to date. However, every model update costs money, so to balance performance with cost, companies need to schedule the right number of model updates at the right moments.
Typically the starting point is a certain budget available for updates, which gets translated, based on past experience, into an average number of updates per day (or month).
A basic scheduling solution is to update the models on a fixed cadence, for example every 2 hours. However, research shows that this is not the optimal use of the available model updates. Activity fluctuates: at night there is usually less traffic on a website, so any update scheduled during a period of low traffic is far less useful than one during peak hours. During peak hours large amounts of information are collected, which must be captured by updating the model.
Fortunately, it is pretty easy to achieve this behaviour. Rather than specifying a fixed schedule based on time, scheduling can be based on the number of events that occurred since the last update. Once enough events have been collected, the model is retrained. The threshold defining "enough" can be computed based on the number of allowed updates, and the average number of events collected every day.
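The event-count approach can be sketched in a few lines. This is an illustrative example, not Froomle’s actual API: the class and parameter names are made up, and the threshold is derived exactly as described above, from the update budget and the average daily event volume.

```python
from dataclasses import dataclass


@dataclass
class EventCountScheduler:
    """Trigger a model update once enough new events have arrived.

    If `updates_per_day` retrains are budgeted and roughly
    `avg_events_per_day` events are observed daily, each retrain
    should cover about avg_events_per_day / updates_per_day events.
    """
    updates_per_day: int
    avg_events_per_day: int
    events_since_update: int = 0

    @property
    def threshold(self) -> int:
        return self.avg_events_per_day // self.updates_per_day

    def record_event(self) -> bool:
        """Record one event; return True when a retrain should run."""
        self.events_since_update += 1
        if self.events_since_update >= self.threshold:
            self.events_since_update = 0
            return True
        return False


# With a budget of 12 updates/day and ~120k daily events, a retrain
# fires every 10,000 events: often during peak hours, rarely at night.
scheduler = EventCountScheduler(updates_per_day=12, avg_events_per_day=120_000)
fired = sum(scheduler.record_event() for _ in range(25_000))
print(fired)  # 2 retrains triggered after 25k events
```

Because the trigger is event volume rather than wall-clock time, the schedule automatically concentrates updates where the traffic is.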
This is already a step towards better scheduling, but the method still falls short, especially in settings where models do not show regular performance degradation, such as webshops and streaming services. In these settings, models do not grow stale at all over long periods of time, so they do not need frequent updates. Nor do they grow stale at a regular pace, as the news models do. Instead, there are a few moments where the environment’s reality shifts drastically rather than gradually.
In these settings, the staleness of a model is not influenced by the number of new events collected, but rather by the amount of new information those events carry. Events that confirm knowledge already present in the model are not as useful as those that the model had not expected.
For example, if the model thinks a user is interested in Squid Game, and the user watches an episode from that series, that contains very little additional information to the model.
But if the model has no idea that the user likes The Office, and they watch the first episode of that series, that has a lot more value to a future model, because the system can learn something new about the user’s preferences.
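One way to make this idea concrete is to accumulate the surprisal of each observed event, -log2(p), where p is the probability the current model assigned to that interaction, and retrain once enough bits of surprise have piled up. This is a minimal sketch of an information-based scheduler under that assumption; the class, the bits threshold, and the use of surprisal as the information measure are illustrative choices, not Froomle’s production implementation.

```python
import math


class SurpriseScheduler:
    """Schedule retrains on accumulated information, not event counts.

    Each event contributes its surprisal, -log2(p), where p is the
    probability the current model assigned to the observed interaction.
    Expected events (high p) add little; surprising ones add a lot.
    """

    def __init__(self, bits_threshold: float):
        self.bits_threshold = bits_threshold
        self.accumulated_bits = 0.0

    def record_event(self, model_probability: float) -> bool:
        # Clamp the probability to avoid log(0) for events the
        # model considered impossible.
        self.accumulated_bits += -math.log2(max(model_probability, 1e-12))
        if self.accumulated_bits >= self.bits_threshold:
            self.accumulated_bits = 0.0
            return True
        return False


scheduler = SurpriseScheduler(bits_threshold=10.0)
# Five expected events (p = 0.9) contribute ~0.15 bits each: no retrain.
expected = [scheduler.record_event(0.9) for _ in range(5)]
# One unexpected event (p = 0.001) contributes ~10 bits: retrain fires.
surprising = scheduler.record_event(0.001)
print(expected, surprising)
```

In this sketch, a user bingeing a series the model already predicted barely moves the counter, while a single out-of-profile interaction can trigger an update on its own.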
This realization led to exploring information-based schedules.
Froomle has already implemented the event-based scheduler, after promising offline results, and is working on implementing the IPR scheduler as well.
Applying the smarter schedulers in production has led to an increase in CTR of 2% (relative) on news and a 20% (relative) increase in CTR on retail.
The second aspect to optimize is the amount of historic data to use when training models.
Through experimentation, Froomle noticed that even when months or years of data are available, training simple models on just the last few hours of data is a very effective way to produce better recommendations.
The changing environment is again at the root of this result. Old interactions carry information that differs from what is relevant now, so using them to train models that need to perform now can contaminate the recommendations.
Some models are able to use more data effectively, by accounting for the order of events, or how old events are, but simpler models are easily drowned by older events, giving poor recommendations when they receive too much training data.
One could conclude that these simpler models are therefore a poor fit for those use cases. However, researchers found that by training these algorithms on only recent interactions, they can outperform the more complicated algorithms.
From this experimentation, the Froomle standard procedure for optimizing AB-tests now also includes finding the right window of data to use when training recommendation models. For example, for news use cases the best results were achieved with training windows between 12 hours and 36 hours for personalisation models, and 1 hour for popularity.
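Selecting a training window amounts to filtering the interaction log by timestamp before training. The helper below is an illustrative sketch, not Froomle’s actual pipeline; the window sizes mirror the ones reported above (hours for personalisation, 1 hour for popularity), and the event tuples are hypothetical.

```python
from datetime import datetime, timedelta


def select_training_window(events, now, window):
    """Keep only (user_id, item_id, timestamp) interactions
    that fall inside the training window ending at `now`."""
    cutoff = now - window
    return [e for e in events if e[2] >= cutoff]


now = datetime(2022, 7, 26, 12, 0)
events = [
    ("u1", "article-a", now - timedelta(minutes=30)),
    ("u1", "article-b", now - timedelta(hours=20)),
    ("u2", "article-c", now - timedelta(days=5)),  # stale, always dropped
]

# Popularity model: 1 hour of data; personalisation model: 24 hours.
popularity_data = select_training_window(events, now, timedelta(hours=1))
personalisation_data = select_training_window(events, now, timedelta(hours=24))
print(len(popularity_data), len(personalisation_data))  # 1 2
```

Sweeping the `window` parameter in an A/B test, rather than fixing it ahead of time, is what finds the per-use-case optimum described above.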
In a series of AB tests on Mediahuis brands, Froomle found an uplift of 8% for the popularity model, training it on 1 hour of data, rather than 3 hours of data.
An interesting side-effect of this data reduction is that training the model takes less time and fewer resources than training on the full history would.
Thus more models can be trained on the same budget, and so the production model is more up-to-date as well.
For more information, the Froomle research team has published a paper on the methodology and experiments in the Proceedings of the Perspectives on the Evaluation of Recommender Systems Workshop 2022.
In order to get high-quality recommendations, companies need to look beyond the choice of algorithm, and consider when to train their models, and on which data to train them.
By getting these choices right, companies are able to make more effective recommendations. Ignoring them can have dramatic effects, as the model might be hopelessly out of date, or trained on data that is not representative of the environment in which it needs to make predictions.
Robin Verachtert, Lien Michiels and Bart Goethals. “Are We Forgetting Something? Correctly Evaluate a Recommender System With an Optimal Training Window”. In Proceedings of the Perspectives on the Evaluation of Recommender Systems Workshop 2022. URL: https://ceur-ws.org/Vol-3228/paper1.pdf
Jon Atle Gulla, Lemei Zhang, Peng Liu, Özlem Özgöbek, and Xiaomeng Su. “The Adressa Dataset for News Recommendation”. In Proceedings of the International Conference on Web Intelligence, WI ’17, pages 1042–1048, New York, NY, USA, 2017. Association for Computing Machinery. URL: https://doi.org/10.1145/3106426.3109436
Gabriel de Souza Pereira Moreira, Felipe Ferreira, and Adilson Marques da Cunha. “News Session-based Recommendations Using Deep Neural Networks”. In Proceedings of the 3rd Workshop on Deep Learning for Recommender Systems, DLRS 2018, pages 15–23, New York, NY, USA, 2018. ACM. URL: https://doi.org/10.1145/3270323.3270328
Michael Kechinov. Cosmeticsshop E-commerce Dataset. URL: https://www.kaggle.com/mkechinov/ecommerce-events-history-in-cosmetics-shop. Accessed: 2022-07-26.