Improving NLP techniques by integrating linguistic input to detect Hate Speech in CMC Corpora

doi:10.1007/978-3-031-38248-2_3

Use this identifier to cite this record: https://hdl.handle.net/1822/91036

Title:	Improving NLP techniques by integrating linguistic input to detect Hate Speech in CMC Corpora
Author(s):	Dias, Idalete; Pereira, Ana Filipa Vilela
Keywords:	Hate speech; Computer mediated communication; Natural language processing; Pragmatic-discursive features
Date:	Dec-2023
Publisher:	Palgrave Macmillan
Citation:	Dias, I., Pereira, F. (2023). Improving NLP Techniques by Integrating Linguistic Input to Detect Hate Speech in CMC Corpora. In: Ermida, I. (eds) Hate Speech in Social Media. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-38248-2_3
Abstract:	Hate speech detection research relies heavily on automatic detection models that make use of machine learning (ML), opinion mining, sentiment analysis and polarity detection. The highly informal and speech-like nature of Computer Mediated Communication (CMC) poses many challenges for electronic processing and automatic detection methods. In this study, we describe details of the natural language processing (NLP) techniques applied to obtain a lemmatised and part-of-speech-tagged Portuguese-English CMC corpus. Considering that automatic analysis and annotation tools are optimised for standard written production, we will address the limitations of these tools due to CMC-specific phenomena and how their performance can be improved by integrating linguistic input. We propose a mixed methods approach in which linguistic knowledge, including lexical, syntactic and pragmatic input, is used in conjunction with NLP techniques to trace and analyse fixed expressions in order to detect potential hate speech in user-generated content. Our focus will be on analysing the behaviour of opinion markers that exhibit a certain degree of fixedness as potential pointers to prejudiced hateful content in Netlang’s English Subcorpus as a contribution to the optimisation of hate speech detection NLP and ML models.
Type:	Book chapter
URI:	https://hdl.handle.net/1822/91036
ISBN:	978-3-031-38247-5
e-ISBN:	978-3-031-38248-2
DOI:	10.1007/978-3-031-38248-2_3
Publisher's version:	https://link.springer.com/chapter/10.1007/978-3-031-38248-2_3
Access:	Restricted access (author only)
Appears in collections:	CEHUM - Livros e Capítulos de Livros