QUALITY MATTERS: BIOCURATION EXPERTS ON THE IMPACT OF DUPLICATION AND OTHER DATA QUALITY ISSUES IN BIOLOGICAL DATABASES
Qingyu Chen, Ramona Britto, Ivan Erill, Constance J. Jeffery, Arthur Liberzon, Michele Magrane, Jun-ichi Onami, Marc Robinson-Rechavi, Jana Sponarova, Justin Zobel, Karin Verspoor
Genomics Proteomics Bioinformatics. 2020 Jul 8; S1672-0229(20)30063-2.
Biological databases represent an extraordinary collective volume of work and serve as an invaluable resource for many researchers. Given the scale of these databases, some of the records are inevitably redundant, inconsistent, inaccurate, incomplete, or outdated. To characterize the impact of duplication and other quality issues in biological databases, and to explore solutions to them, the authors of this article consulted more than 20 domain experts via a questionnaire-based survey. They also examined the curation process of one biological database in detail by means of a descriptive report and an interview. The authors show that biocuration is vital for removing duplicates and handling other quality issues in data resources. These results highlight the need for a broader community effort to provide adequate support for comprehensive curation and to ensure an added benefit to the end users of biological databases.