October 30, 2024
Proceedings Paper

Textual-Content-Based Classification of Bundles of Untranscribed Manuscript Images

Published in: Proceedings - International Conference on Pattern Recognition, pp. 3162-3169, 2021-01-01. DOI: 10.1109/ICPR48806.2021.9412688

Authors:

Prieto, JR; Bosch, V; Vidal, E; Alonso, C; Orcero, MC; Marquez, L

Affiliations

Inst Andaluz Patrimonio Hist, Ctr Arqueol Subacuat - Author
Univ Politecn Valencia, PRHLT Res Ctr - Author

Abstract

Content-based classification of manuscripts is an important task that is generally performed in archives and libraries by experts with a wealth of knowledge of the manuscripts' contents. Unfortunately, many manuscript collections are so vast that it is not feasible to rely solely on experts to perform this task. Current approaches for textual-content-based manuscript classification generally require the handwritten images to be first transcribed into text, but achieving sufficiently accurate transcripts is generally infeasible for large sets of historical manuscripts. We propose a new approach to perform this classification task automatically which does not rely on any explicit image transcripts. It is based on probabilistic indexing, a relatively novel technology which allows the intrinsic word-level uncertainty generally exhibited by handwritten text images to be represented effectively. We assess the performance of this approach on a large collection of complex manuscripts from the Spanish Archivo General de Indias, with promising results. To the best of our knowledge, this is the first published work proposing, developing and assessing a successful approach for content-based classification of untranscribed manuscript images.

Keywords

Classification tasks; Content-based classification; Convolutional neural-networks; Extraction; Handwritten images; Handwritten texts; Image classification; New approaches; Text processing; Textual content; Word level

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in Proceedings - International Conference on Pattern Recognition. According to the agency Scopus (SJR), this venue has progressed and achieved good impact in recent years, becoming a reference in its field. In 2021, the year of publication, it was ranked in the second quartile (Q2) of the category Computer Vision and Pattern Recognition.

Regardless of the expected impact determined by the dissemination channel, it is important to highlight the actual observed impact of the contribution itself.

According to the different indexing agencies, the number of citations accumulated by this publication as of 2026-04-02 is:

  • WoS: 5
  • Scopus: 9

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2026-04-02:

  • The use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, as well as general views, suggests that readers are using the publication as a basis for their current work. This may be a notable indicator of future, more formal academic citations. This claim is supported by the result of the "Capture" indicator, which yields a total of 4 (PlumX).

Leadership analysis of institutional authors

There is a significant leadership presence, as some of the institution's authors appear as first or last author, detailed as follows: First Author (Prieto, JR).

The corresponding author is Prieto, JR.
