Improving NLP techniques by integrating linguistic input to detect Hate Speech in CMC Corpora

doi:10.1007/978-3-031-38248-2_3

Please use this identifier to cite or link to this record: https://hdl.handle.net/1822/91036

Full record

DC Field	Value	Language
dc.contributor.author	Dias, Idalete	por
dc.contributor.author	Pereira, Ana Filipa Vilela	por
dc.date.accessioned	2024-04-17T07:23:21Z	-
dc.date.issued	2023-12	-
dc.identifier.citation	Dias, I., Pereira, F. (2023). Improving NLP Techniques by Integrating Linguistic Input to Detect Hate Speech in CMC Corpora. In: Ermida, I. (eds) Hate Speech in Social Media. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-38248-2_3	por
dc.identifier.isbn	978-3-031-38247-5	por
dc.identifier.uri	https://hdl.handle.net/1822/91036	-
dc.description.abstract	Hate speech detection research relies heavily on automatic detection models that make use of machine learning (ML), opinion mining, sentiment analysis and polarity detection. The highly informal and speech-like nature of Computer Mediated Communication (CMC) poses many challenges for electronic processing and automatic detection methods. In this study, we describe details of the natural language processing (NLP) techniques applied to obtain a lemmatised and part-of-speech-tagged Portuguese-English CMC corpus. Considering that automatic analysis and annotation tools are optimised for standard written production, we will address the limitations of these tools due to CMC-specific phenomena and how their performance can be improved by integrating linguistic input. We propose a mixed methods approach in which linguistic knowledge, including lexical, syntactic and pragmatic input, is used in conjunction with NLP techniques to trace and analyse fixed expressions in order to detect potential hate speech in user-generated content. Our focus will be on analysing the behaviour of opinion markers that exhibit a certain degree of fixedness as potential pointers to prejudiced hateful content in Netlang’s English Subcorpus as a contribution to the optimisation of hate speech detection NLP and ML models.	por
dc.language.iso	eng	por
dc.publisher	Palgrave Macmillan	por
dc.relation	info:eu-repo/grantAgreement/FCT/3599-PPCDT/PTDC%2FLLT-LIN%2F29304%2F2017/PT	por
dc.rights	closedAccess	por
dc.subject	Hate speech	por
dc.subject	Computer mediated communication	por
dc.subject	Natural language processing	por
dc.subject	Pragmatic-discursive features	por
dc.title	Improving NLP techniques by integrating linguistic input to detect Hate Speech in CMC Corpora	por
dc.type	bookPart	por
dc.relation.publisherversion	https://link.springer.com/chapter/10.1007/978-3-031-38248-2_3	por
oaire.citationStartPage	79	por
oaire.citationEndPage	105	por
dc.identifier.doi	10.1007/978-3-031-38248-2_3	por
dc.date.embargo	10000-01-01	-
dc.identifier.eisbn	978-3-031-38248-2	por
dc.subject.fos	Humanidades::Outras Humanidades	por
sdum.bookTitle	Hate Speech in Social Media. Linguistic Approaches	por
oaire.version	VoR	por
Appears in collections:	CEHUM - Livros e Capítulos de Livros