A study on speech recognition control for a surgical robot

Kateryna Zinchenko, Chien Yu Wu, Kai-Tai Song*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

57 Scopus citations


Speech recognition is common in electronic appliances and personal services, but its use for industrial and medical purposes is rare because of motion ambiguity. For minimally invasive surgical robotic assistants, this ambiguity arises because the robotic motion is not calibrated to the camera images. This paper presents a design for a speech recognition interface for an HIWIN robotic endoscope holder. A new intentional speech control is proposed to control movement over long distances. To decrease ambiguity, a method is proposed for voice-to-motion calibration that compares the degree of change in the endoscope image for a voice command. A speech recognition algorithm is implemented on Ubuntu OS, using CMU Sphinx. The control signal is sent to the robot controller using serial-port communication through an RS232 cable. The experimental results show that the proposed intentional speech control strategy has a navigation precision of up to 3.1° of angular displacement for the endoscope. The overall system processing time, including robotic motion, is 3.22 s for a ∼1.8-s speech duration. The reference image navigation range is from 2.5 mm for a ∼0.5-s speech duration up to 6 mm for a ∼1.8-s speech duration, using a setup in which the camera tip is located 5 cm from the remote center of motion point.

Original language: English
Article number: 7737032
Pages (from-to): 607-615
Number of pages: 9
Journal: IEEE Transactions on Industrial Informatics
Issue number: 2
State: Published - 1 Apr 2017


  • Automated system
  • human-robot interface
  • motion control
  • robotic surgery
  • speech recognition control

