Using Confidence Scores to Improve Eyes-free Detection of Speech Recognition Errors

Sadia Nowrin, Keith Vertanen

PETRA '25: Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments, 2025 (to appear).

Conversational systems rely heavily on speech recognition to interpret and respond to user commands and queries. Despite progress in speech recognition accuracy, errors still occur and can significantly affect the end-user utility of such systems. While visual feedback can help users detect errors, it is not always practical, especially for people who are blind or low-vision. In this study, we investigate ways to improve error detection by manipulating the audio output of the transcribed text based on the recognizer's confidence level in its result. Our findings show that selectively slowing down the audio when the recognizer exhibited uncertainty led to a 12% relative increase in participants' ability to detect errors compared to uniformly slowing the audio. It also reduced the time it took participants to listen to the recognition result and decide if there was an error by 11%.

Reference: