Ido Guy on two papers accepted to the SIGIR 2020 conference

Our research teams never rest.
They keep publishing papers and presenting at leading conferences all over the world.
Today, Ido Guy gives us the TL;DR on two papers that were accepted to the SIGIR 2020 conference.


A. In cooperation with Professor Oren Kurland from the Technion, we studied query reformulation in eCommerce search engines.

When users want to buy something, they run a search, and the search engine is supposed to return the best results it can. We know that during an average search session, users edit and change their keywords several times. As a first step, we tried to understand how queries change and what drives those changes. The name of the game is intent: what is the user trying to achieve with the search?

We examined and published a collection of findings that show, among other things, that search queries usually get more verbose during the session. In other words, users add words to the query more often than they remove them as the session goes on. We also observed that, unlike in regular search engines, changing the query on an eCommerce search engine leads to an almost completely different set of top ten and even top fifty search results. These insights led us to ask whether we can predict the query reformulation: can we tell in advance which changes the user will make? If we can predict that, we might be able to give them personalized results. Of course, this depends on many variables. For example, is the query easy or hard with regard to the retrieval scores of its results (which can indicate how well the search engine “understood” what the user is looking for)?
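To make the two observations above more concrete, here is a minimal Python sketch of how one might measure them on a search log. It is not the code used in the paper: the function names (`term_changes`, `top_k_overlap`), the whitespace tokenization, the Jaccard overlap measure and the sample session are all assumptions chosen for illustration.

```python
def term_changes(prev_query, next_query):
    """Return (added, removed) term sets between two consecutive queries.

    Assumption: queries are compared as lowercase, whitespace-separated terms.
    """
    prev_terms = set(prev_query.lower().split())
    next_terms = set(next_query.lower().split())
    return next_terms - prev_terms, prev_terms - next_terms


def top_k_overlap(prev_results, next_results, k=10):
    """Jaccard overlap between the top-k result IDs of two queries (0.0 to 1.0)."""
    a, b = set(prev_results[:k]), set(next_results[:k])
    if not a and not b:
        return 1.0  # two empty result lists are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical session: these queries are made up for illustration.
session = ["iphone", "iphone 11", "iphone 11 case black"]
for prev_q, next_q in zip(session, session[1:]):
    added, removed = term_changes(prev_q, next_q)
    print(f"{prev_q!r} -> {next_q!r}: added={sorted(added)}, removed={sorted(removed)}")
```

In this toy session every reformulation only adds terms, which is the "more verbose" pattern described above. Applying `top_k_overlap` to the ranked item IDs returned before and after each reformulation would show how much of the result list survives the change.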

The bottom line is that we can get very good results. As a world leader in eCommerce, eBay keeps trying to improve its search capabilities. To us, a successful reformulation session is one that ends with a purchase, after the user has found the product they were looking for, quickly and efficiently.


B. The second paper was written in cooperation with Professor Bracha Shapira and Professor Lior Rokach from Ben-Gurion University of the Negev.

We examined bundles on the website: listings that contain more than one product from the catalog. A bundle is a package of items sold together, and it is a very common selling method, in which the seller sells several items in one deal and the buyer enjoys a discount. Bundles play a central role in the eCommerce arena. The challenge begins when items aren’t labeled properly: sellers mark items as bundles even though they aren’t, and vice versa.

Our mission was to identify which listings are bundles and which are not. As part of the research, we released a labeled dataset of items marked as bundle or non-bundle. Since most of our data is either unlabeled or carries “noisy labels”, the existing supervised Machine Learning methods didn’t work very well for us. We developed a method that learns from both labeled and unlabeled data (we have an abundance of unlabeled data, obviously). Our semi-supervised system starts with the labeled data, learns a model, applies it to the unlabeled data, and then takes the most “confident” predictions and adds them to the labeled data for the next learning iteration. The process then repeats: a new model is trained on the enlarged labeled set and applied again, iteration after iteration. This way, we built up a large collection of models, and the final model is a smart combination of selected models that were created during the iterations (an ensemble).
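The iterative process described above is commonly known as self-training. Here is a minimal Python sketch of the idea, not the method from the paper. To keep it dependency-free it makes several simplifying assumptions: each item is reduced to a single made-up numeric feature, the "model" is a toy nearest-centroid classifier, the confidence threshold is arbitrary, and the ensemble simply averages all per-iteration models instead of selecting among them.

```python
from statistics import mean


def fit_centroids(xs, ys):
    """Toy stand-in model: one centroid per class (0 = not a bundle, 1 = bundle)."""
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in (0, 1)}


def bundle_score(model, x):
    """Score in [0, 1]; higher means x is closer to the bundle centroid."""
    d0, d1 = abs(x - model[0]), abs(x - model[1])
    return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)


def self_train(labeled_x, labeled_y, unlabeled_x, rounds=5, threshold=0.8):
    """Grow the labeled set with confident predictions; keep every model trained."""
    xs, ys, pool = list(labeled_x), list(labeled_y), list(unlabeled_x)
    models = []
    for _ in range(rounds):
        model = fit_centroids(xs, ys)
        models.append(model)
        remaining = []
        for x in pool:
            score = bundle_score(model, x)
            if score >= threshold:        # confidently a bundle
                xs.append(x)
                ys.append(1)
            elif score <= 1 - threshold:  # confidently not a bundle
                xs.append(x)
                ys.append(0)
            else:                         # not confident yet, keep unlabeled
                remaining.append(x)
        if len(remaining) == len(pool):   # nothing new was labeled, so stop
            break
        pool = remaining
    return models


def ensemble_predict(models, x):
    """Assumption: combine models by averaging their scores (the paper selects models)."""
    return int(mean(bundle_score(m, x) for m in models) >= 0.5)


# Hypothetical data: the feature could be something like "products mentioned in the title".
models = self_train([1.0, 1.2, 3.0, 3.5], [0, 0, 1, 1], [0.9, 1.1, 2.2, 3.2, 4.0])
print(len(models), ensemble_predict(models, 3.4), ensemble_predict(models, 1.0))
```

Items that never reach the confidence threshold, such as the borderline value 2.2 here, simply stay unlabeled. That is the point of only promoting the most confident predictions: mistakes on ambiguous items are less likely to pollute the next training round.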

In the end, we achieved impressive results in identifying bundles in two common categories: “Video games and consoles” and “Cameras and images”. The next phase will be to “disassemble” each bundle into its separate products and perform further actions based on this information.