High precision but variable recall – comparing the performance of five deduplication tools


Heidrun Janka
Maria-Inti Metzendorf

Abstract

Deduplication methods for multiple database searches conducted for evidence syntheses differ in the time they require, their accuracy, and how comprehensively they identify duplicates. Deduplication tools can make the search task in evidence syntheses considerably more efficient. Widely used tools include reference management software (e.g. EndNote), built-in deduplication features in systematic review software (e.g. Covidence, Rayyan), and automated deduplication tools (e.g. Deduklick, SRA Deduplicator). Newer tools leverage machine learning algorithms, crafted by information specialists, that encompass natural language normalization and rule-based approaches. Using six datasets, we investigated five frequently used automated and semi-automated deduplication tools with regard to their performance, core features, and time efficiency in comparison with manual deduplication in EndNote.
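
As an illustration of the two measures named in the title, the following minimal Python sketch computes precision and recall for a tool's output against a manually deduplicated reference set. It assumes that each record has a unique ID and that performance is scored per record flagged as a duplicate; the function name `precision_recall`, the record IDs, and the toy data are hypothetical and are not taken from the article or its datasets.

```python
def precision_recall(flagged, gold):
    """Return (precision, recall) for record IDs flagged as duplicates.

    flagged: IDs the tool marked as duplicates (assumed input format).
    gold:    IDs marked as duplicates in the manual reference deduplication.
    """
    flagged, gold = set(flagged), set(gold)
    true_positives = len(flagged & gold)
    # Convention chosen for this sketch: an empty set scores 1.0.
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return precision, recall


# Hypothetical toy data, not results from the study.
gold_duplicates = {"r02", "r05", "r07", "r09"}
tool_flagged = {"r02", "r05", "r07"}

precision, recall = precision_recall(tool_flagged, gold_duplicates)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints "precision=1.00 recall=0.75"
```

In this toy case every flagged record is a true duplicate (high precision), but one true duplicate is missed (lower recall), which is the pattern the title describes.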

Article Details

How to Cite
1. High precision but variable recall – comparing the performance of five deduplication tools. J Eur Assoc Health Info Libr [Internet]. 2024 Mar. 17 [cited 2024 Nov. 21];20(1):12-7. Available from: https://ojs.eahil.eu/JEAHIL/article/view/607
Section
Feature Articles
