Use this identifier to cite or link to this record: https://hdl.handle.net/1822/74307

Title: Skeleton driven action recognition using an image-based spatial-temporal representation and convolution neural network
Author(s): Silva, Vinícius; Soares, Filomena; Leão, Celina Pinto; Esteves, João Sena; Vercelli, Gianni
Keywords: Human action recognition; Human computer interaction; Autism spectrum disorder; Convolutional neural network
Date: 25-Jun-2021
Publisher: Multidisciplinary Digital Publishing Institute (MDPI)
Journal: Sensors
Citation: Silva, V.; Soares, F.; Leão, C.P.; Esteves, J.S.; Vercelli, G. Skeleton Driven Action Recognition Using an Image-Based Spatial-Temporal Representation and Convolution Neural Network. Sensors 2021, 21, 4342. https://doi.org/10.3390/s21134342
Abstract: Individuals with Autism Spectrum Disorder (ASD) typically present difficulties in engaging and interacting with their peers. Thus, researchers have been developing different technological solutions as support tools for children with ASD. Social robots, one example of these technological solutions, are often unaware of their game partners, which prevents them from automatically adapting their behavior to the user. One source of information that can enrich this interaction and, consequently, adapt the system behavior is the recognition of the user's actions using RGB cameras and/or depth sensors. The present work proposes a method to automatically detect, in real time, typical and stereotypical actions of children with ASD, using the Intel RealSense and the Nuitrack SDK to detect and extract the user's joint coordinates. The pipeline starts by mapping the temporal and spatial joint dynamics onto a color image-based representation. Usually, the joints are clustered into groups in the final image. To verify whether the order of the joints in the final image representation influences the model's performance, two main experiments were conducted: in the first, the order of the grouped joints in the sequence was changed, and in the second, the joints were randomly ordered. In each experiment, statistical methods were used in the analysis. The experiments showed statistically significant differences with respect to the joint sequence in the image, indicating that the order of the joints might impact the model's performance. The final model, a Convolutional Neural Network (CNN) trained on the different actions (typical and stereotypical), was used to classify the different patterns of behavior, achieving a mean accuracy of 92.4% ± 0.0% on the test data. The entire pipeline ran on average at 31 FPS.
Type: Article
URI: https://hdl.handle.net/1822/74307
DOI: 10.3390/s21134342
ISSN: 1424-8220
e-ISSN: 1424-8220
Publisher version: https://www.mdpi.com/1424-8220/21/13/4342
Peer reviewed: yes
Access: Open access
Appears in collections: BUM - MDPI
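
The first pipeline step named in the abstract, mapping the temporal and spatial joint dynamics onto a color image, can be illustrated with a minimal sketch. The abstract does not give the exact encoding, so everything here is an assumption for illustration: the function name `skeleton_to_image`, the min-max scaling of the x, y, z coordinates into the R, G, B channels, the joints-as-rows and frames-as-columns layout, and the 19-joint demo sequence are hypothetical and are not taken from the paper. The `joint_order` parameter shows where the joint ordering studied in the two experiments would enter such a representation.

```python
import numpy as np


def skeleton_to_image(joints, joint_order=None):
    """Encode a skeleton sequence as an RGB image (illustrative sketch only).

    joints: array-like of shape (frames, joints, 3) holding x, y, z coordinates.
    joint_order: optional sequence of joint indices giving the row order.
    Returns a uint8 array of shape (joints, frames, 3).
    """
    seq = np.asarray(joints, dtype=np.float64)
    if seq.ndim != 3 or seq.shape[2] != 3:
        raise ValueError("expected an array of shape (frames, joints, 3)")

    # Optionally reorder the joints; this changes only the row order of the image.
    if joint_order is not None:
        seq = seq[:, list(joint_order), :]

    # Min-max scale each coordinate axis over the whole sequence to [0, 255].
    lo = seq.min(axis=(0, 1), keepdims=True)
    hi = seq.max(axis=(0, 1), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on a constant axis
    scaled = (seq - lo) / span * 255.0

    # Rows are joints, columns are frames, channels are the scaled x, y, z values.
    return np.rint(scaled).astype(np.uint8).transpose(1, 0, 2)


# Demo on random data: 30 frames and 19 joints are arbitrary choices.
rng = np.random.default_rng(0)
demo_sequence = rng.normal(size=(30, 19, 3))
demo_image = skeleton_to_image(demo_sequence)
print(demo_image.shape)
```

An image produced this way could be passed to any image classifier; because the scaling is computed over the whole sequence, passing a different `joint_order` permutes the rows without changing the pixel values.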

Files in this record:
File: sensors-21-04342.pdf
Size: 7.53 MB
Format: Adobe PDF

This work is licensed under a Creative Commons License.
