A proposed system for machine translation by crowdsourcing

Kantaro Mera, Taichi Sugimura, Takahiro Koita


To enjoy video content, hearing-impaired viewers and viewers who speak other languages need video captioning. Automatic captioning systems are already deployed on commercial platforms such as YouTube, where they generate captions from a video's audio track. However, the audio handled by these systems often mixes several speakers' voices, which makes it difficult to caption. In addition, machine-translated captions are often not accurate enough to convey the meaning of the original conversation. In this paper, we propose a new captioning system that uses crowdsourcing to improve the accuracy of foreign-language captions for multi-speaker conversations. The proposed system includes a step that rewrites utterances into machine-translation-adaptive sentences to increase accuracy. We compared the accuracy of captions created by the proposed system against YouTube's existing system and demonstrate that the proposed system achieves higher accuracy.
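The core idea of the rewriting step can be illustrated with a minimal sketch. This is not the authors' implementation: in the actual system a crowdworker performs the rewriting, and a real machine-translation service performs the translation. All function names below are hypothetical, and the filler-removal rule is only a toy stand-in for human simplification of ellipses, fillers, and colloquialisms.

```python
def make_mt_adaptive(sentence: str) -> str:
    """Toy stand-in for the crowdsourced rewriting step: strip
    conversational fillers so the sentence is easier to translate.
    A human worker would do far more (resolve ellipsis, slang, etc.)."""
    fillers = {"um,", "uh,"}  # illustrative only
    words = [w for w in sentence.split() if w.lower() not in fillers]
    return " ".join(words)


def caption_pipeline(utterances, translate):
    """Rewrite each transcribed utterance, then machine-translate
    the adapted sentence to produce the foreign-language caption."""
    return [translate(make_mt_adaptive(u)) for u in utterances]


# Placeholder "translator" so the sketch is self-contained;
# a deployed system would call an actual MT service here.
captions = caption_pipeline(
    ["um, I think we should go"],
    translate=lambda s: s.upper(),
)
print(captions[0])
```

The point of the intermediate step is that the translator receives a cleaned, unambiguous sentence rather than raw conversational speech, which is where much of the accuracy gain is expected to come from.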


crowdsourcing; machine translation; automatic captioning system; multilingual communication
