Please use this identifier to cite or link to this record: https://hdl.handle.net/1822/76687

Title: NER in archival finding aids: extended
Author(s): Cunha, Luís Filipe da Costa; Ramalho, José Carlos
Keywords: named entity recognition; archival search aids; machine learning; deep learning; maximum entropy
Date: 17-Jan-2022
Publisher: Multidisciplinary Digital Publishing Institute
Journal: Machine Learning and Knowledge Extraction (MAKE)
Citation: Cunha, L.F.d.C.; Ramalho, J.C. NER in Archival Finding Aids: Extended. Mach. Learn. Knowl. Extr. 2022, 4, 42-65. https://doi.org/10.3390/make4010003
Abstract: The amount of information preserved in Portuguese archives has increased over the years. These documents represent a national heritage of high importance, as they portray the country's history. Currently, most Portuguese archives have made their finding aids available to the public in digital format; however, these data are not annotated, so it is not always easy to analyze their content. In this work, Named Entity Recognition solutions were created that allow the identification and classification of several named entities from the archival finding aids. These named entities carry crucial information about their context and, when recognized with high confidence, can be used for several purposes, for example, the creation of smart browsing tools by using entity linking and record linking techniques. To achieve high scores, we annotated several corpora to train our own Machine Learning algorithms in this domain. We also used different architectures, such as CNNs, LSTMs, and Maximum Entropy models. Finally, all the created datasets and ML models were made available to the public through a web platform developed for this purpose, NER@DI.
Type: Article
URI: https://hdl.handle.net/1822/76687
DOI: 10.3390/make4010003
e-ISSN: 2504-4990
Publisher version: https://www.mdpi.com/2504-4990/4/1/3
Peer reviewed: yes
Access: Open access
Appears in collections: CCTC - Artigos em revistas internacionais

Files in this record:
File: make-04-00003.pdf
Size: 1.69 MB
Format: Adobe PDF

This work is licensed under a Creative Commons License.
