Use this identifier to cite or link to this record:
https://hdl.handle.net/1822/70215
Title: | Towards an automated classification of spreadsheets |
Author(s): | Mendes, Jorge Cunha; Do, Kha N.; Saraiva, João |
Keywords: | Spreadsheets; Data mining; Classification |
Date: | Jan-2016 |
Publisher: | Springer |
Journal: | Lecture Notes in Computer Science |
Citation: | Mendes J., Do K.N., Saraiva J. (2016) Towards an Automated Classification of Spreadsheets. In: Milazzo P., Varró D., Wimmer M. (eds) Software Technologies: Applications and Foundations. STAF 2016. Lecture Notes in Computer Science, vol 9946. Springer, Cham. https://doi.org/10.1007/978-3-319-50230-4_26 |
Abstract: | Many spreadsheets in the wild have neither documentation nor categorization associated with them. This makes it difficult to apply spreadsheet research that targets specific spreadsheet domains, such as financial or database. In this paper, we introduce a methodology to automatically classify spreadsheets into different domains. We exploit existing data mining classification algorithms using spreadsheet-specific features. The algorithms were trained and validated with cross-validation using the EUSES corpus, reaching an accuracy of up to 89%. The best algorithm was applied to the larger Enron corpus in order to gain some insight from it and to demonstrate the usefulness of this work. |
Type: | Conference paper |
URI: | https://hdl.handle.net/1822/70215 |
ISBN: | 978-3-319-50229-8 |
e-ISBN: | 978-3-319-50230-4 |
DOI: | 10.1007/978-3-319-50230-4_26 |
ISSN: | 0302-9743 |
Publisher version: | https://link.springer.com/chapter/10.1007/978-3-319-50230-4_26 |
Peer reviewed: | yes |
Access: | Open access |