CMKL

We are developing a large-scale Thai automated speech recognition model using a combination of Thai-language speech datasets. The experiment takes an approach similar to the SpeechStew model (https://arxiv.org/abs/2104.02133), but our model is tuned specifically for the Thai language. The combined dataset contains more than 3,000 hours of speech (over one billion input frames), spanning read speech, lectures, and conversational speech. The model is based on NVIDIA’s Conformer-CTC model and has roughly 30 million parameters. Preliminary results show that the model, which is still training on the cluster, can outperform models trained on any single domain. It also outperforms a domain-specific model on a held-out domain.
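As a rough sanity check on the figures above, the relationship between hours of audio and input frames can be sketched in a few lines. The 10 ms frame shift used here is an assumption (a common choice for acoustic features), not a value stated by the project, and `total_frames` is a hypothetical helper introduced only for illustration.

```python
def total_frames(hours: float, frame_shift_ms: float = 10.0) -> int:
    """Approximate number of feature frames in `hours` of audio.

    frame_shift_ms is an assumed hop size between consecutive frames;
    the project description does not state the value actually used.
    """
    seconds = hours * 3600
    frames_per_second = 1000 / frame_shift_ms
    return int(seconds * frames_per_second)


# 3,000 hours is the lower bound given for the combined dataset.
frames = total_frames(3000)
print(f"{frames:,} frames")  # prints "1,080,000,000 frames"
```

Under that assumption, 3,000 hours corresponds to about 1.08 billion frames, which is consistent with the "over one billion input frames" figure.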

Powered by Apex

A Large-Scale Thai Automated Speech Recognition Model

MORE PROJECTS


Porjai: Thai Automatic Speech Recognition (ASR) for E-Commerce

CMKL University

Preventing "Lai-Tai" Among Thais: Discovering the Genetic Causes and Treatments of Lai-Tai

Faculty of Medicine, Chulalongkorn University

Real-Time Image Diagnosis Platform for Cervical Cancer Screening via Pap Smear Tests

Department of Biomedical Engineering, School of Engineering, KMITL

COVID Variant Identifier Service on CiRA

Faculty of Information Technology, KMITL
