2nd Workshop on Augmented Intelligence for Technology-Assisted Review Systems:
Evaluation Metrics and Protocols for eDiscovery and Systematic Review Systems.


2nd April 2023

To be held as part of the 45th European Conference on Information Retrieval (ECIR 2023) 2-6 April 2023 | Dublin, Ireland


Augmented Intelligence (AI) is “a subsection of AI machine learning developed to enhance human intelligence rather than operate independently of or outright replace it. It is designed to do so by improving human decision-making and, by extension, actions taken in response to improved decisions.” In this sense, users are supported, not replaced, in the decision-making process by the filtering capabilities of Augmented Intelligence solutions: the final decision is always taken by the users, who remain accountable for their actions.
In this field, technology-assisted review systems (TARS) use a human-in-the-loop approach in which classification and/or ranking algorithms are continuously trained on the relevance feedback of expert reviewers, until a substantial number of the relevant documents have been identified. This approach has been shown to be more effective and more efficient than traditional eDiscovery and systematic review practices, which typically consist of a mix of keyword search and manual review of the search results.
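The loop described above can be sketched in a few lines. This is a deliberately minimal illustration with hypothetical names (`cal_review`, a term-overlap "model"), not the algorithm of any specific TAR product, which would typically use a trained classifier retrained after each batch of feedback:

```python
def cal_review(docs, is_relevant, seed_terms, batch_size=2, budget=20):
    """Toy continuous active learning (CAL) loop: rank unjudged documents
    against vocabulary learned from documents already judged relevant,
    send the top batch to the reviewer, fold the feedback back in."""
    judged = {}                  # doc id -> reviewer's label
    rel_terms = set(seed_terms)  # crude "model": known relevant vocabulary
    while len(judged) < min(budget, len(docs)):
        unjudged = [d for d in docs if d["id"] not in judged]
        # score = term overlap with the current relevant vocabulary
        unjudged.sort(key=lambda d: -len(rel_terms & set(d["text"].split())))
        for d in unjudged[:batch_size]:
            judged[d["id"]] = label = is_relevant(d)  # human in the loop
            if label:            # "retrain" on the positive feedback
                rel_terms |= set(d["text"].split())
    return judged

# Toy run: the expert deems a document relevant iff it mentions "drug".
docs = [
    {"id": 0, "text": "randomized drug trial outcome"},
    {"id": 1, "text": "drug trial placebo results"},
    {"id": 2, "text": "football match report"},
    {"id": 3, "text": "trial court hearing"},
    {"id": 4, "text": "clinical drug study"},
]
judged = cal_review(docs, lambda d: "drug" in d["text"], {"drug"})
```

In this toy run the feedback from the first batch pushes the remaining relevant document ahead of the non-relevant ones, which is the effect that makes CAL more efficient than a fixed keyword search followed by linear review.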
Given these premises, ALTARS will focus on high-recall Information Retrieval (IR) systems, which tackle the challenging task of finding (nearly) all the relevant documents in a collection. Electronic discovery (eDiscovery) and systematic review systems are probably the most important examples of such systems, where relevant information must be found with limited resources, such as time and money.


In this workshop, we aim to assess the effectiveness of these systems, which is a research challenge in itself. Despite the number of evaluation measures at our disposal for assessing the effectiveness of a "traditional" retrieval approach, TAR systems require additional dimensions of evaluation.
For example, an effective high-recall system should find the majority of the relevant documents using the smallest number of assessments. However, this type of evaluation ignores the resources used to achieve this goal, such as the total time spent on those assessments or the cost of the experts judging the documents.
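As a minimal sketch of such an effort-oriented view (the function name and the recall target are our own illustrative choices, not a standard named measure):

```python
def effort_to_recall(review_order, qrels, target=0.8):
    """Number of assessments needed before `target` recall is reached
    when documents are reviewed in `review_order`; returns None if the
    target is never reached.  `qrels` maps doc id -> 0/1 relevance."""
    total_rel = sum(qrels.values())
    found = 0
    for n, doc_id in enumerate(review_order, start=1):
        found += qrels.get(doc_id, 0)
        if found >= target * total_rel:
            return n
    return None

qrels = {"a": 1, "b": 0, "c": 1, "d": 0, "e": 1}  # 3 relevant documents
good_order = ["a", "c", "e", "b", "d"]            # relevant ranked first
poor_order = ["b", "d", "a", "c", "e"]            # relevant ranked last
effort_good = effort_to_recall(good_order, qrels)
effort_poor = effort_to_recall(poor_order, qrels)
```

Both orderings eventually reach the same recall, but the effort differs (3 vs. 5 assessments here); multiplying effort by time or cost per assessment turns this into exactly the resource-aware comparison the paragraph above calls for.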
The topics include, but are not restricted to:

  • Novel evaluation approaches and measures for Systematic reviews;
  • Reproducibility of experiments with test collections;
  • Design and evaluation of interactive high-recall retrieval systems;
  • Study of evaluation measures;
  • User studies in high-recall retrieval systems;
  • Novel evaluation protocols for continuous Active Learning;
  • Evaluation of sampling bias.


Research papers, describing original ideas on the listed topics and on other fundamental aspects of Technology-Assisted Review methodologies and technologies, are solicited. Moreover, short papers on early research results, new results on previously published works, and extended abstracts on previously published works are also welcome.

  • Research papers presenting original work should be 9-10 pages long;
  • short papers should be 6-7 pages long;
  • posters should be 3-4 pages long;
  • extended abstracts should be 2 pages long.
For all submission types, references are included in the page limit. Papers must be in the CEUR-ART single-column style.


The accepted papers will be published in the ALTARS 2023 Proceedings. The Proceedings will be published by CEUR-WS, which is gold open access and indexed by SCOPUS and DBLP.


Authors must submit their papers via EasyChair.

Important Dates

  • Abstract submission deadline: January 23, 2023
  • Submission deadline (extended): February 6, 2023 (originally January 30, 2023)
  • Acceptance notification (extended): February 28, 2023 (originally February 20, 2023)
  • Workshop: April 2, 2023


Workshop Chairs

Giorgio Maria Di Nunzio, University of Padua (Italy)
Evangelos Kanoulas, University of Amsterdam (The Netherlands)
Prasenjit Majumder, DAIICT, Gandhinagar and TCG CREST, Kolkata (India)

Program Committee

Amanda Jones, Lighthouse (USA)
Dave Lewis, Redgrave Data (USA)
Parth Mehta, Parmonic (USA)
Doug Oard, University of Maryland (USA)
Fabrizio Sebastiani, CNR-ISTI (Italy)
Rene Spijker, Cochrane (The Netherlands)
Mark Stevenson, University of Sheffield (UK)
Eugene Yang, Johns Hopkins University (USA)


Workshop Program (all times are local Dublin times)


Invited talk

Building Ad Hoc IR Evaluation Collections with Active Learning

Eugene Yang and Dawn J. Lawrie


An important quality of reusable evaluation collections for ad hoc retrieval is fairness to future systems that are not part of the collection creation process, which usually translates into judging most of the relevant documents for each topic. A typical approach to determining which documents to judge for a topic in the collection is pooling, which pools the top-ranked documents from a set of retrieval systems. However, the reusability of the collection depends on the diversity of the pooled systems so that most of the relevant documents will be judged; otherwise, it is possible to have systematic biases in the unjudged documents, which are generally considered not relevant.
In this talk, we argue that the problem of creating a reusable evaluation collection is itself a high-recall retrieval problem; thus, technology-assisted review (TAR), a common high-recall retrieval framework, can be utilized. We will discuss two cross-language information retrieval evaluation collections, HC4 and HC3, and how TAR was used to create them. We will also present a pooling experiment on HC3 to compare the effectiveness of creating evaluation collections using TAR.
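The depth-k pooling that the abstract contrasts with TAR can be sketched as follows (a generic illustration with hypothetical names, not the pipeline actually used for HC4/HC3):

```python
def depth_k_pool(runs, depth):
    """Depth-k pooling: the set of documents that any contributing
    system ranked in its top `depth`.  Only pooled documents are
    judged; unpooled documents are assumed non-relevant, which is
    where systematic bias against future systems can creep in."""
    pooled = set()
    for run in runs:              # each run is one system's ranking
        pooled.update(run[:depth])
    return pooled

runs = [
    ["d1", "d2", "d3"],   # system A's ranked results
    ["d2", "d4", "d5"],   # system B's ranked results
    ["d3", "d1", "d6"],   # system C's ranked results
]
pool = depth_k_pool(runs, depth=2)
unjudged = {"d5", "d6"} - pool    # never judged, treated as non-relevant
```

If the pooled systems are too similar, documents like the unjudged ones above may include relevant material that a future, different system retrieves, and that system is then unfairly penalized; this is the reusability problem the talk proposes to address with TAR.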


Coffee Break


Measuring Impact of False Negatives at Citation Screening Step on the Outcome of Systematic Review

Wojciech Kusa, Guido Zuccon, Petr Knoth and Allan Hanbury


Entity Enhanced Attention Graph-Based Passages Retrieval

Lucas Albarede, Lorraine Goeuriot, Philippe Mulhem, Claude Le Pape-Gardeux, Sylvain Marié and Trinidad Chardin-Segui


Automatic Citation Screening Using Pattern-Exploiting Training and Paraphrasing

Alice Romagnoli, Wojciech Kusa and Gabriella Pasi


Defining Effectiveness in One-Phase Technology-Assisted Review

David Lewis, Lenora Gray, Aravind Kuchibhatla and Mark Noel



Lunch Break


Classification Protocols with Minimal Disclosure

Jinshuo Dong, Aravindan Vijayaraghavan and Jason Hartline


Challenges and Opportunities in Extreme Systematic Reviews for Environmental Policymaking

Jean-Jacques Dubois and Yue Wang


Invited Talk

Improving Query Formulation for Systematic Review Literature Search
Guido Zuccon


Systematic reviews are comprehensive literature reviews that answer highly focused research questions. In evidence-based medicine, they are often considered the highest form of evidence. To create a high-quality systematic review, researchers construct complex Boolean queries to retrieve studies relevant to the review topic. However, constructing effective queries is time-consuming, and poor queries can lead to biased or invalid reviews or increased review costs due to irrelevant studies.
In this talk, I will discuss computational methods for creating high-quality queries for systematic reviews. These methods range from mimicking the human query formulation process to following objective formulation criteria, learning to rank query variations, incorporating controlled medical terminologies, and exploiting recent advances in generative language models (such as ChatGPT). I will also discuss the promises and limitations of these methods, outline open challenges, and chart future research directions.

Coffee Break

Keynote (together with Legal IR workshop)

The Limitations and Misuse of Information Retrieval in Legal Cases

Maura Grossman