Voice Driven UML Modeling System for Visually Impaired Students in Software Engineering Education
DOI: https://doi.org/10.24054/rcta.v2i46.4126

Keywords: accessibility, speech synthesis, UML modeling

Abstract
Visual impairment limits students' access to object-oriented modeling environments that rely on graphical interfaces. Advances in Text-to-Speech (TTS) and Speech-to-Text (STT) technologies make it possible to compensate for this barrier, yet their adoption in educational platforms lacks evidence-based guidelines. This work aims to design, implement, and validate a voice-controlled UML modeling system that enables visually impaired students to create, edit, and query class and use-case diagrams. A four-phase methodology was followed: (i) requirements elicitation; (ii) modular architectural design with grammatical validation and large language models (LLMs); (iii) implementation in Python 3.11, FastAPI, and cloud-based STT/TTS services; and (iv) technical evaluation using recognition accuracy, latency, and task-time metrics. The prototype executed 20 critical commands with 97% recognition accuracy and maintained full syntactic coherence in the generated UML models. Average completion times for creation, editing, and navigation tasks were 4.23 s, 6.78 s, and 5.24 s, respectively. The average TTS latency (1468 ms) exceeded the 500 ms target, with the NLP module identified as the main bottleneck. The system demonstrates technical feasibility and adheres to the defined accessibility guidelines (WCAG 2.2). Future improvements will focus on reducing TTS latency, expanding the command repertoire, and conducting large-scale usability evaluations (SUS).
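The pipeline summarized in the abstract (an STT transcription validated against a command grammar, applied to a UML model, and confirmed back to the student via TTS) can be illustrated with a minimal FastAPI sketch. This is an assumption-laden illustration rather than the authors' implementation: the endpoint name, the toy regular-expression grammar, the in-memory class-diagram structure, and the latency field are all hypothetical, and the cloud STT/TTS calls and LLM-based interpretation described in the paper are omitted.

```python
# Minimal sketch of a voice-command pipeline, assuming Python 3.11 + FastAPI as in the paper:
# transcribed text -> grammar validation -> UML model update -> spoken-style confirmation.
# All names (endpoint, grammar, data structures) are illustrative, not the paper's actual design.
import re
import time

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Voice-driven UML modeling (sketch)")

# In-memory class diagram: class name -> list of attribute names.
uml_model: dict[str, list[str]] = {}

# Toy grammar covering three command types (creation, editing, query).
GRAMMAR = {
    "create_class": re.compile(r"^create class (\w+)$", re.IGNORECASE),
    "add_attribute": re.compile(r"^add attribute (\w+) to (\w+)$", re.IGNORECASE),
    "describe_class": re.compile(r"^describe class (\w+)$", re.IGNORECASE),
}


class VoiceCommand(BaseModel):
    text: str  # transcription produced upstream by the STT service


@app.post("/command")
def handle_command(cmd: VoiceCommand) -> dict:
    start = time.perf_counter()
    for action, pattern in GRAMMAR.items():
        match = pattern.match(cmd.text.strip())
        if not match:
            continue
        if action == "create_class":
            name = match.group(1)
            uml_model.setdefault(name, [])
            reply = f"Class {name} created."
        elif action == "add_attribute":
            attr, name = match.group(1), match.group(2)
            if name not in uml_model:
                raise HTTPException(404, f"Class {name} does not exist.")
            uml_model[name].append(attr)
            reply = f"Attribute {attr} added to {name}."
        else:  # describe_class
            name = match.group(1)
            attrs = ", ".join(uml_model.get(name, [])) or "no attributes"
            reply = f"Class {name} has {attrs}."
        # Processing time for this step, comparable in spirit to the 500 ms latency target.
        latency_ms = (time.perf_counter() - start) * 1000
        # In the full system, the reply text would be handed to the cloud TTS service here.
        return {"reply": reply, "latency_ms": round(latency_ms, 2)}
    raise HTTPException(400, "Command not recognized by the grammar.")
```

Rejecting anything the grammar does not match, before any model mutation, is one simple way to preserve the syntactic coherence of the generated UML model that the evaluation reports.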
References
A. D. P. D. Santos, A. H. G. Suzuki, F. O. Medola, and A. Vaezipour, "A systematic review of wearable devices for orientation and mobility of adults with visual impairment and blindness," IEEE Access, vol. 9, pp. 162306-162324, 2021, doi: 10.1109/ACCESS.2021.3132887.
World Wide Web Consortium, "Web Content Accessibility Guidelines (WCAG) 2.2," Dec. 2024.
C. Rottondi, M. Sacchetto, L. Severi, and A. Bianco, "Toward an inclusive framework for remote musical education and practices," IEEE Access, vol. 12, pp. 173836-173849, 2024, doi: 10.1109/ACCESS.2024.3501414.
D. Chen et al., "Visually impaired people learning virtual textures through multimodal feedback combining vibrotactile and voice," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 33, pp. 453-465, 2025, doi: 10.1109/TNSRE.2025.3528048.
S. Raffoul and L. Jaber, "Text-to-Speech Software and Reading Comprehension: The Impact for Students with Learning Disabilities," Canadian Journal of Learning and Technology, vol. 49, no. 2, Art. no. 2, Nov. 2023, doi: 10.21432/cjlt28296.
D. Chen, J. Liu, L. Tian, X. Hu, and A. Song, "Research on the method of displaying the contour features of image to the visually impaired on the touch screen," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 29, pp. 2260-2270, 2021, doi: 10.1109/TNSRE.2021.3123394.
D. Chen et al., "Development and evaluation of refreshable braille display and active touch-reading system for digital reading of the visually impaired," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 32, pp. 934-945, 2024, doi: 10.1109/TNSRE.2024.3363495.
H.-S. Moon and J. Seo, "Sample-efficient training of robotic guide using human path prediction network," IEEE Access, vol. 10, pp. 104996-105007, 2022, doi: 10.1109/ACCESS.2022.3210932.
M. Manzoni, S. Mascetti, D. Ahmetovic, R. Crabb, and J. M. Coughlan, "MapIO: a gestural and conversational interface for tactile maps," IEEE Access, vol. 13, pp. 84038-84056, 2025, doi: 10.1109/ACCESS.2025.3566286.
S. L. Gay, E. Pissaloux, K. Romeo, and N.-T. Truong, "F2T: a novel force-feedback haptic architecture delivering 2D data to visually impaired people," IEEE Access, vol. 9, pp. 94901-94911, 2021, doi: 10.1109/ACCESS.2021.3091441.
Z. Yuan et al., "Network-aware 5G edge computing for object detection: Augmenting wearables to 'see' more, farther and faster," IEEE Access, vol. 10, pp. 29612-29632, 2022, doi: 10.1109/ACCESS.2022.3157876.
M. Zeinullin and M. Hersh, "Tactile audio responsive intelligent system," IEEE Access, vol. 10, pp. 122074-122091, 2022, doi: 10.1109/ACCESS.2022.3223099.
F. Leo, E. Cocchi, and L. Brayda, "The effect of programmable tactile displays on spatial learning skills in children and adolescents of different visual disability," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 25, no. 7, pp. 861-872, Jul. 2017, doi: 10.1109/TNSRE.2016.2619742.
P. Mejía, L. C. Martini, F. Grijalva, J. C. Larco, and J. C. Rodríguez, "A survey on mathematical software tools for visually impaired persons: a practical perspective," IEEE Access, vol. 9, pp. 66929-66947, 2021, doi: 10.1109/ACCESS.2021.3076306.
S. Gatto, O. Gaggi, L. Grosset, and L. G. N. Fovino, "Accessible mathematics: Representation of functions through sound and touch," IEEE Access, vol. 12, pp. 121552-121569, 2024, doi: 10.1109/ACCESS.2024.3448509.
R. Jafri, S. M. M. Althbiti, N. A. A. Alattas, A. A. A. Albraiki, and S. H. A. Almuhawwis, "Tac-trace: a tangible user interface-based solution for teaching shape concepts to visually impaired children," IEEE Access, vol. 10, pp. 131153-131165, 2022, doi: 10.1109/ACCESS.2022.3228455.
T. Tanabe, K. Nunokawa, K. Doi, and S. Ino, "Training system for white cane technique using illusory pulling cues induced by asymmetric vibrations," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 30, pp. 305-313, 2022, doi: 10.1109/TNSRE.2022.3148770.
G. V. Helden, V. Van Der Werf, G. N. Saunders-Smits, and M. M. Specht, "The use of digital peer assessment in higher education—an umbrella review of literature," IEEE Access, vol. 11, pp. 22948-22960, 2023, doi: 10.1109/ACCESS.2023.3252914.
K. Villalba et al., "Eyeland: a visually-impaired accessible English learning application using a design-based research framework," IEEE Access, vol. 12, pp. 142275-142290, 2024, doi: 10.1109/ACCESS.2024.3444741.
M. N. Islam et al., "A multilingual handwriting learning system for visually impaired people," IEEE Access, vol. 12, pp. 10521-10534, 2024, doi: 10.1109/ACCESS.2024.3353781.
P. F. Balestrucci, E. Di Nuovo, M. Sanguinetti, L. Anselma, C. Bernareggi, and A. Mazzei, "An educational dialogue system for visually impaired people," IEEE Access, vol. 12, pp. 150502-150519, 2024, doi: 10.1109/ACCESS.2024.3479883.
T. C. Lethbridge, S. E. Sim, and J. Singer, "Studying software engineers: Data collection techniques for software field studies," Empirical Software Engineering, vol. 10, no. 3, pp. 311-341, Jul. 2005, doi: 10.1007/s10664-005-1290-x.
O. Cico, L. Jaccheri, A. Nguyen-Duc, and H. Zhang, "Exploring the intersection between software industry and software engineering education - A systematic mapping of software engineering trends," Journal of Systems and Software, vol. 172, p. 110736, Feb. 2021, doi: 10.1016/j.jss.2020.110736.
Object Management Group, "OMG Unified Modeling Language (OMG UML), Version 2.5.1," Object Management Group (OMG), Specification formal/17-12-05, Dec. 2017. [Online]. Available: https://www.omg.org/spec/UML/2.5.1
M. Fowler, UML Distilled: A Brief Guide to the Standard Object Modeling Language, 3rd ed. USA: Addison-Wesley Longman Publishing Co., Inc., 2003.
T.-S. Nguyen, S. Stueker, and A. Waibel, "Super-Human Performance in Online Low-latency Recognition of Conversational Speech," Jul. 26, 2021, arXiv: arXiv:2010.03449. doi: 10.48550/arXiv.2010.03449.
L. Bass, P. Clements, and R. Kazman, "Software Architecture in Practice, 3rd Edition." Accessed: Jul. 27, 2025. [Online]. Available: https://www.sei.cmu.edu/library/software-architecture-in-practice-third-edition/
S. Geng, M. Josifoski, M. Peyrard, and R. West, "Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning," Jan. 18, 2024, arXiv: arXiv:2305.13971. doi: 10.48550/arXiv.2305.13971.
J. Nielsen, Usability Engineering. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1994.
License
Copyright (c) 2025 Carlos Henriquez Miranda, Malak Andres Sanchez Cataño, German Sanchez-Torres

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.