Internet Access by Voice Commands: A Navigation Application for Facebook, Gmail, and Chrome
DOI: https://doi.org/10.24054/rcta.v1i45.2963
Keywords: Human-Computer Interaction, Information and Communication Technologies, Internet Browsing, Speech Recognition
Abstract
A system for browsing the Internet by voice commands is presented. The implemented tool enabled verbal control of the Google Chrome, Gmail, and Facebook applications. Tests were conducted with a group of 33 people with different levels of Internet browsing experience, comprising young adults, older adults, and people with motor disabilities. Each application was tested separately through guided dialogues using voice commands and dictations. The speech recognition tests used 2,871 voice commands and 594 dictations; voice commands performed best in the Facebook application and dictations in the Google Chrome application. The overall mean recognition rate was 84.69% with a standard deviation of 6.45% for voice commands, and 74.63% with a standard deviation of 2.75% for dictations.
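The abstract gives no implementation details, so the following is only a minimal, hypothetical sketch of the general idea it describes: a recognized utterance is matched against a small command vocabulary and mapped to a browser navigation action. It uses Python's third-party SpeechRecognition package with Google's free web speech endpoint purely for illustration; the command table, the listen_once helper, and the exit word are assumptions, and the published system's actual recognizer, vocabulary, and dialogue logic differ.

# Minimal sketch (not the authors' implementation): map recognized Spanish
# voice commands to browser navigation actions for Chrome, Gmail, and Facebook.
# Assumes the third-party SpeechRecognition and PyAudio packages are installed.
import webbrowser

import speech_recognition as sr

# Hypothetical command-to-URL table standing in for the paper's guided dialogues.
COMMANDS = {
    "abrir chrome": "https://www.google.com",
    "abrir gmail": "https://mail.google.com",
    "abrir facebook": "https://www.facebook.com",
}

def listen_once(recognizer: sr.Recognizer) -> str:
    """Capture one utterance from the microphone and return its transcription."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    # Google's free web speech endpoint is used here only for illustration.
    return recognizer.recognize_google(audio, language="es-ES").lower()

def main() -> None:
    recognizer = sr.Recognizer()
    while True:
        try:
            utterance = listen_once(recognizer)
        except sr.UnknownValueError:
            print("Command not recognized, please repeat.")
            continue
        if utterance == "salir":          # "exit" ends the session
            break
        url = COMMANDS.get(utterance)
        if url is not None:
            webbrowser.open(url)          # hand navigation off to the default browser
        else:
            print(f"Unknown command: {utterance}")

if __name__ == "__main__":
    main()

In this sketch, dictation (free-form text entry, e.g. composing an email body) would simply reuse the transcription returned by listen_once instead of looking it up in the command table.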
License
Copyright 2025 REVISTA COLOMBIANA DE TECNOLOGIAS DE AVANZADA (RCTA)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.