Learning from crowds in digital pathology using scalable variational Gaussian processes

Citations

Cited 13 times in Web of Science

Cited 11 times in Europe PMC

Cited 1 time in Google Scholar

Grant support

This work was supported by the Agencia Estatal de Investigacion of the Spanish Ministerio de Ciencia e Innovacion under contract PID2019-105142RB-C22/AEI/10.13039/501100011033, and by the United States National Institutes of Health National Cancer Institute Grants U01CA220401 and U24CA19436201. P.M.'s contribution was made mostly before he joined Microsoft Research, when he was supported by La Caixa Banking Foundation (ID 100010434, Barcelona, Spain) through La Caixa Fellowship for Doctoral Studies LCF/BQ/ES17/11600011.

Analysis of institutional authors

López-Pérez, Miguel (Author)

February 4, 2025

Publications

Article

Yes

Learning from crowds in digital pathology using scalable variational Gaussian processes

Published in: Scientific Reports, 11(1): 11612, 2021-06-02. DOI: 10.1038/s41598-021-90821-3

Authors: Lopez-Perez, Miguel; Amgad, Mohamed; Morales-Alvarez, Pablo; Ruiz, Pablo; Cooper, Lee A D; Molina, Rafael; Katsaggelos, Aggelos K

Affiliations

Microsoft Res, Cambridge CB1 2FB, England - Author

Northwestern Univ, Ctr Computat Imaging & Signal Analyt, Chicago, IL 60611 USA - Author

Northwestern Univ, Dept Pathol, Chicago, IL 60611 USA - Author

Northwestern Univ, Dept Elect & Comp Engn, Evanston, IL 60208 USA - Author

OriGenAI, Brooklyn, NY 11201 USA - Author

Univ Granada, Dept Comp Sci & Artificial Intelligence, Granada 18071, Spain - Author

Abstract

The volume of labeled data is often the primary determinant of success in developing machine learning algorithms. This has increased interest in methods for leveraging crowds to scale data labeling efforts, and in methods to learn from noisy crowd-sourced labels. The need to scale labeling is acute but particularly challenging in medical applications like pathology, due to the expertise required to generate quality labels and the limited availability of qualified experts. In this paper we investigate the application of Scalable Variational Gaussian Processes for Crowdsourcing (SVGPCR) in digital pathology. We compare SVGPCR with other crowdsourcing methods using a large multi-rater dataset in which pathologists, pathology residents, and medical students annotated tissue regions in breast cancer. Our study shows that SVGPCR is competitive with equivalent methods trained using gold-standard pathologist-generated labels, and that SVGPCR meets or exceeds the performance of other crowdsourcing methods based on deep learning. We also show how SVGPCR can effectively learn the class-conditional reliabilities of individual annotators and demonstrate that Gaussian-process classifiers have comparable performance to similar deep learning methods. These results suggest that SVGPCR can meaningfully engage non-experts in pathology labeling tasks, and that the class-conditional reliabilities estimated by SVGPCR may assist in matching annotators to tasks where they perform well.

Keywords

Breast neoplasms; Crowdsourcing; Deep learning; Female; Histocytochemistry; Humans; Normal distribution; Software

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal Scientific Reports, which, according to Scopus (SJR), has become a reference in its field thanks to its progression and the good impact it has achieved in recent years. In 2021, the year the work was published, the journal was ranked Q1 (first quartile) in the Multidisciplinary category. Notably, the journal is positioned above the 90th percentile.

From a relative perspective, the normalized impact indicator calculated from the Field Citation Ratio (FCR) of the Dimensions source is 9.87, which indicates that the work is cited above average compared to works in the same discipline and year of publication. (source consulted: Dimensions Jul 2025)

Specifically, according to different indexing agencies, this work had accumulated the following number of citations as of 2025-07-30:

WoS: 13
Europe PMC: 11
Google Scholar: 1
OpenAlex: 27

Impact and social visibility

Leadership analysis of institutional authors

This work was carried out through international collaboration, specifically with researchers from: Granada; United Kingdom; United States of America.

There is a significant leadership presence, as one of the institution's authors appears as first or last author: First Author (López Pérez, Miguel).
