Detection of errors in the reading-aloud utilizing pattern recognition in speech for the early detection of dyslexia
DOI: https://doi.org/10.24054/rcta.v1i39.1370
Keywords: Dyslexia, Speech Recognition, Pattern Recognition, Neural Networks
Abstract
This article compares three pattern recognition techniques for classifying isolated words that simulate common errors made when reading aloud by children who show early stages of dyslexia. A database of recordings from five different speakers, sampled at 16 kHz, was built and later extended with artificial data augmentation. The recordings were preprocessed to remove noise, and the Mel-frequency cepstral coefficients (MFCC) were then extracted. We compare the performance of K-nearest neighbors (KNN), Multi-Layer Perceptron (MLP), and Convolutional Neural Network (CNN) classifiers, obtaining results on the order of 99 points for phonetically distinct words and on the order of 70 points for phonetically similar words.
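As a rough illustration of the KNN stage only, the sketch below classifies short feature vectors by majority vote among their nearest neighbors. It is not the authors' implementation: the three-dimensional vectors are synthetic stand-ins for MFCC features, the labels `word_a` and `word_b` are hypothetical, and the Gaussian-noise helper `add_noise` is an assumed, simplified form of data augmentation.

```python
import math
import random
from collections import Counter


def add_noise(vector, scale, rng):
    """Return a copy of `vector` with Gaussian noise added (simple augmentation)."""
    return [x + rng.gauss(0.0, scale) for x in vector]


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


rng = random.Random(0)

# Hypothetical 3-dimensional feature vectors standing in for MFCC summaries
# of two words; real MFCC features would have many more dimensions.
prototypes = {"word_a": [0.0, 0.0, 0.0], "word_b": [5.0, 5.0, 5.0]}

# Augment each prototype into 20 noisy training examples.
train = [
    (add_noise(vec, 0.5, rng), label)
    for label, vec in prototypes.items()
    for _ in range(20)
]

# Evaluate on fresh noisy samples drawn the same way.
test = [
    (add_noise(vec, 0.5, rng), label)
    for label, vec in prototypes.items()
    for _ in range(10)
]
correct = sum(knn_predict(train, vec) == label for vec, label in test)
accuracy = correct / len(test)
print(f"accuracy: {accuracy:.2f}")
```

In this toy setup the two prototypes are far apart relative to the noise, which loosely mirrors the phonetically distinct case. Moving the prototypes closer together would be one way to mimic phonetically similar words, where the classes overlap and nearest-neighbor voting becomes less reliable.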
License
Copyright (c) 2022 REVISTA COLOMBIANA DE TECNOLOGIAS DE AVANZADA (RCTA)
![Creative Commons License](http://i.creativecommons.org/l/by-nc/4.0/88x31.png)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors publishing in this journal agree to the following conditions:
- Authors retain the moral rights of authorship and grant the journal the right of first publication of the work. This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), allowing third parties to use the work provided appropriate credit is given to the authors and the first publication in this journal. Commercial use of the work is not permitted, and derivative works cannot be created.
- Authors may enter into separate, additional contractual agreements for the non-exclusive distribution of the journal's published version of the article (for example, including it in an institutional repository or publishing it in a book), provided that the work is clearly indicated as having first been published in this journal.
- Authors are permitted and encouraged to post their work online (e.g., on institutional or personal websites) before and during the review and publication process, as it can lead to productive exchanges and greater and faster dissemination of the published work.