High precision but variable recall – comparing the performance of five deduplication tools

  • Heidrun Janka, Institute of General Practice, Heinrich-Heine-University Düsseldorf
  • Maria-Inti Metzendorf

Abstract


Deduplication methods for multiple database searches conducted for evidence syntheses differ in the time they require, their accuracy, and how comprehensively they identify duplicates. Deduplication tools can make the search task in evidence syntheses considerably more efficient. Widely used tools for deduplication include reference management software (e.g. EndNote), built-in deduplication features in systematic review software (e.g. Covidence, Rayyan), and automated deduplication tools (e.g. Deduklick, SRA Deduplicator). Newer tools leverage machine learning algorithms, crafted by information specialists, that encompass natural language normalization and rule-based approaches. Using six datasets, we investigated five frequently used automated and semi-automated deduplication tools with regard to their performance, core features, and time efficiency in comparison to manual deduplication in EndNote.
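To make the terms in the title concrete, the following minimal sketch shows one way precision and recall could be computed for a deduplication run against a manually curated gold standard. It is an illustration under stated assumptions, not the method used in the study or the algorithm of any of the tools named above: the toy matching rule (normalized title plus year), the function names, the sample records and the gold-standard set are all hypothetical.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def flag_duplicates(records):
    """Return ids of records whose (normalized title, year) key was already seen.

    Hypothetical toy rule for illustration only; real tools compare more fields.
    """
    seen = set()
    flagged = set()
    for rec in records:
        key = (normalize_title(rec["title"]), rec["year"])
        if key in seen:
            flagged.add(rec["id"])
        else:
            seen.add(key)
    return flagged


def precision_recall(flagged, true_duplicates):
    """Precision = TP / flagged records; recall = TP / true duplicates."""
    true_positives = len(flagged & true_duplicates)
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / len(true_duplicates) if true_duplicates else 1.0
    return precision, recall


# Hypothetical sample records, not data from the study.
records = [
    {"id": 1, "title": "Deduplication of Search Results", "year": 2020},
    {"id": 2, "title": "Deduplication of search results.", "year": 2020},
    {"id": 3, "title": "Déduplication of search-results", "year": 2020},
    {"id": 4, "title": "Deduplication of search results: a comparison", "year": 2020},
    {"id": 5, "title": "A different study", "year": 2019},
]

# Assumed manual gold standard: records 2, 3 and 4 duplicate record 1.
gold_duplicates = {2, 3, 4}

flagged = flag_duplicates(records)
precision, recall = precision_recall(flagged, gold_duplicates)
print(f"flagged={sorted(flagged)} precision={precision:.2f} recall={recall:.2f}")
```

In this toy run every flagged record is a true duplicate, so precision is perfect, but record 4 is missed because its title carries an extra subtitle, so recall is lower. A strict matching rule of this kind trades recall for precision.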
Published
2024-03-17
How to Cite
Janka H, Metzendorf M-I. High precision but variable recall – comparing the performance of five deduplication tools. JEAHIL [Internet]. 17 Mar. 2024 [cited 13 Apr. 2024];20(1):12-7. Available from: http://ojs.eahil.eu/ojs/index.php/JEAHIL/article/view/607
Section
Feature Articles