Towards Fluid Speech-based Text Interaction
CHI '17: Extended Abstracts of the ACM International Conference on Human Factors in Computing Systems (workshop), to appear.
Relying on speech alone to create, edit, and revise text can be challenging. While dictating the bulk of a text using speech recognition can be quick, subsequent editing is often best done with other input methods such as a keyboard or mouse. This position paper describes our efforts to make editing more fluid when speech is the primary or only input modality. First, we describe our approach to automatically inferring the location of a spoken correction or revision within the original speech recognition result. Second, we describe our probabilistic merge model, which combines information from the original recognition and the correction recognition to improve accuracy on the final correction. Lastly, we describe how allowing users to provide spelling information can substantially improve accuracy.
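To make the idea of locating a spoken correction concrete, here is a minimal sketch in Python. It is not the method described in the paper, whose details the abstract does not give. As a stand-in, it assumes the user re-speaks the corrected words along with a little surrounding context, and it picks the span of the original result with the smallest word-level edit distance to the correction. The function names (`edit_distance`, `locate_correction`, `apply_correction`) and the example sentence are hypothetical.

```python
from typing import List, Tuple


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def locate_correction(original: List[str],
                      correction: List[str]) -> Tuple[int, int, int]:
    """Return (start, end, distance) for the span original[start:end]
    that is closest to the correction. Ties keep the earliest span found."""
    best = (0, 0, len(correction))  # empty span: insert the whole correction
    for start in range(len(original)):
        for end in range(start + 1, len(original) + 1):
            dist = edit_distance(original[start:end], correction)
            if dist < best[2]:
                best = (start, end, dist)
    return best


def apply_correction(original: List[str], correction: List[str]) -> List[str]:
    """Replace the best-matching span of the original with the correction."""
    start, end, _ = locate_correction(original, correction)
    return original[:start] + correction + original[end:]


# Illustrative data only: "male" stands in for a misrecognized "mail".
recognized = "please send the male to my boss tomorrow".split()
spoken_fix = "send the mail to".split()
print(" ".join(apply_correction(recognized, spoken_fix)))
# prints: please send the mail to my boss tomorrow
```

This brute-force search is quadratic in the number of spans and treats every word mismatch as equally costly. A probabilistic approach such as the merge model mentioned in the abstract would presumably also weigh recognizer confidence, which this sketch ignores.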