Search Solutions 2020: Abstracts

Paul Cleverley, Robert Gordon University: The Impact of COVID-19 on Enterprise Search
Agnes Molnar, Search Explained, Budapest: Impact of information quality on successful enterprise search implementation - a case study
Jeremy Pickens, OpenText, USA: Challenges and Opportunities in Ediscovery and Information Governance: Not Everything is Big Data
Michael Bendersky, Google Research: TF-Ranking: Learning-to-Rank in TensorFlow
Marianne Sweeny, Daedalus Information Systems, USA: IR Intelligence: Introduction to Neural IR & Learning To Rank
Charlie Hull, OpenSource Connections: Building the best ecommerce search with open source software
Tony Russell-Rose, UX Labs & Goldsmiths University: Searching fast & slow
Paul Levay, NICE: Systematic searching in the age of uncertainty: identifying the evidence for NICE guidelines
David Maxwell, TUDelft: Searching, Stopping and User Modelling
Elaine Toms, University of Sheffield: Conceptualising Search as a Set of Cognitive Prostheses
Ricardo Baeza-Yates, Northeastern University at Silicon Valley: Search Biases

Paul Cleverley, Robert Gordon University: The Impact of COVID-19 on Enterprise Search

The COVID-19 pandemic has created unprecedented challenges for organisations, yet no study has examined its impact on information search behaviour. A case study was undertaken in a knowledge-intensive multinational organisation, covering over 2.5 million search queries made during the pandemic. A surge of unique users and explicit COVID-19 search queries in early March 2020 may equate to ‘peak uncertainty and activity’, demonstrating the importance of an effective corporate search engine in times of crisis. Search volumes dropped 24% after lockdowns; the ‘L-shaped’ recovery may be a surrogate for business activity. Explicit COVID-19 search queries transitioned from awareness-related topics to impact, strategy, policy, response, and ways of working, a shift that may be significant for future search design. Low clickthrough rates imply that some information needs were not met, and searches on mental health increased during lockdowns. Search logs may not be fully exploited by organisations to monitor business needs and health risks.

Agnes Molnar, Search Explained, Budapest: Impact of information quality on successful enterprise search implementation - a case study

Having worked with enterprise search for over a decade, the most important lesson I’ve learned is that we should never underestimate the importance of content quality when planning and implementing enterprise search. Almost all of my clients have had issues with their information architecture as well as their metadata quality. In this case study session I’ll go through the whole process of understanding, optimizing and enhancing content and metadata quality in order to achieve better search. You’ll get an A to Z overview, as well as practical action items to take in your organization.

Jeremy Pickens, OpenText, USA: Challenges and Opportunities in Ediscovery and Information Governance: Not Everything is Big Data

Information Retrieval and Machine Learning have found wide applicability in today's large-scale, consumer-driven web and commercial spaces. The collection of massive amounts of objective and subjective (e.g. behavioral) data has been a boon to these areas. However, there are a number of problem domains which just as assuredly require proper solutions, but for which such volumes of data are not, and never will be, available. Two such domains are eDiscovery (eDisclosure) and Information Governance. In this talk, I will describe these domains, discuss some of their challenges, and outline steps that have been taken toward solutions.

Michael Bendersky, Google Research: TF-Ranking: Learning-to-Rank in TensorFlow

In this talk, I will introduce TF-Ranking, a popular open-source library for building learning-to-rank (LTR) models in TensorFlow. I will first provide an overview of standard pointwise, pairwise and listwise approaches to LTR, and how these approaches are implemented in TF-Ranking. I will then discuss some recent advances in neural ranking that are available in TF-Ranking: neural GAMs for interpretable LTR, listwise BERT models, and Document Interaction Networks. Finally, I will demonstrate the state-of-the-art performance of TF-Ranking on a variety of both public and large-scale proprietary datasets.
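
For readers unfamiliar with the three LTR families named in this abstract, the following is a minimal sketch in plain Python, not TF-Ranking's API. The function names and the specific loss choices (squared error, pairwise logistic, softmax cross-entropy) are assumptions picked for illustration; the talk itself may cover different losses.

```python
import math

# Illustrative only: these helpers are hypothetical and are NOT part of TF-Ranking.

def pointwise_mse_loss(scores, labels):
    """Pointwise: each document is scored against its own label in isolation."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)

def pairwise_logistic_loss(scores, labels):
    """Pairwise: penalise every pair where the more relevant document
    does not outscore the less relevant one by a comfortable margin."""
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                losses.append(math.log1p(math.exp(-(scores[i] - scores[j]))))
    return sum(losses) / len(losses) if losses else 0.0

def listwise_softmax_loss(scores, labels):
    """Listwise: compare the softmax distribution over the whole list
    with the normalised label distribution (cross-entropy)."""
    total = sum(labels)
    if total == 0:
        return 0.0
    max_s = max(scores)  # subtract the max for numerical stability
    log_z = max_s + math.log(sum(math.exp(s - max_s) for s in scores))
    return -sum((y / total) * (s - log_z) for s, y in zip(scores, labels))

# One query with three documents: graded relevance 2, 1, 0.
labels = [2.0, 1.0, 0.0]
good_scores = [3.0, 2.0, 1.0]  # ranks documents in the ideal order
bad_scores = [1.0, 2.0, 3.0]   # ranks documents in reverse order

for loss in (pointwise_mse_loss, pairwise_logistic_loss, listwise_softmax_loss):
    print(loss.__name__, loss(good_scores, labels), loss(bad_scores, labels))
```

In all three cases the correctly ordered scores should yield the lower loss; the families differ in whether the unit of comparison is a single document, a pair, or the full result list.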

Marianne Sweeny, Daedalus Information Systems, USA: IR Intelligence: Introduction to Neural IR & Learning To Rank

In the beginning, there were discrete silos and life was much easier for information retrieval systems and developers. The elegant efficiency of TF*IDF created a search landscape that worked best for those trained professionals who designed to its simplicity. Documents ruled. Librarians were the findability Knights Templar of information retrieval.

Then, Google’s game-changing introduction of human input into relevance ranking and the collection of massive amounts of user search data introduced a new IR landscape with user behavior as a ranking factor. Today, query terms do not have to appear in document text, as search engines use artificial intelligence to “understand” the query. AI-mediated IR is an experience that is calculated rather than informed by human understanding. It is determined by machine intelligence rather than guided by empathetic, collaborative design thinking.

However, does AI-driven IR serve human needs as well as fulfill its instructions? In this session, we will explore the intersections between AI and IR as represented by Learning to Rank (LTR), a “machine experience” governed by algorithms interpreting human behavior, structure, and text. When we understand and embrace this new machine search experience, we will be better able to deliver an optimal human search experience.

Charlie Hull, OpenSource Connections: Building the best ecommerce search with open source software

Effective search is a vital component of ecommerce websites that are experiencing huge rises in usage as the world shifts to online shopping, accelerated by the current pandemic. However, many commercial and open source ecommerce platforms offer a poor search experience for the customer and fail to provide the right tools or support for marketers and others attempting to improve search quality and thus online sales.

Charlie will first show some simple tools to assess common failings of ecommerce search and their possible causes. He will then introduce Chorus, a reference implementation combining various open source projects that allows ecommerce providers to build their own high-quality search engine that can easily be measured, rated, tuned and tested. He’ll then demonstrate some simple search relevance measurement and tuning operations using an example web shop.

Tony Russell-Rose, UX Labs & Goldsmiths University: Searching fast & slow

Knowledge workers such as healthcare information professionals, legal researchers, and librarians need to create and execute search strategies that are comprehensive, transparent, and reproducible. The traditional solution is to use command-line query builders offered by proprietary database vendors. However, these are based on a procedural paradigm that dates from the days when users could access databases only via text-based terminals and command-line syntax. In this talk, we explore and review alternative approaches based on a declarative paradigm in which users express concepts as objects on a visual canvas and manipulate them to articulate relationships. This offers a more intuitive user experience (UX) that eliminates many sources of error, makes the query semantics more transparent, and offers new ways to collaborate, share, and optimize search strategies and best practices.

Paul Levay, NICE: Systematic searching in the age of uncertainty: identifying the evidence for NICE guidelines

This presentation will reflect on searching to help the National Institute for Health and Care Excellence (NICE) produce evidence-based guidance on promoting healthy living and preventing ill health. It will discuss the challenges of the last 10 years and consider the developments currently underway to transform guidance production. The presentation will consider how the role of the expert searcher has developed as the remit of NICE has expanded from pharmaceutical products to public health, social care and COVID-19. How does NICE identify the best available evidence within the time and resources available?

David Maxwell, TUDelft: Searching, Stopping and User Modelling

I will be presenting the work I did during my PhD studies at Glasgow on user modelling and on understanding users' stopping behaviours (with an emphasis on Foraging Theory), and how these findings could be useful for us as a community moving forward. I will conclude by outlining my current line of research on 'search as learning'.

Elaine Toms, University of Sheffield: Conceptualising Search as a Set of Cognitive Prostheses

The objective of search has been to identify the relevant item. But in a complex work environment, search needs to be tailored to the intent of the immediate action, and that intent will vary significantly depending on the nature of the task that initiated the search in the first place. This work proposes decomposing the typical query-response framework into a series of cognitive prostheses: mini tools based on IR processes that augment human intellect and aid task completion.

Ricardo Baeza-Yates, Northeastern University at Silicon Valley: Search Biases

In this presentation we survey all biases that affect search systems. They include biases in the data, the algorithms, and the user interaction, in particular those related to relevance feedback loops (e.g., ranking and personalization) that are tainted by the cognitive biases of users. We also address the biases that are a product of the evaluation measures and methods used. This presentation is partially based on Bias on the Web, Communications of the ACM, June 2018.