My name is Bogdan Petcu and I am working on CMU Sphinx as part of this year’s RSoC program.
CMU Sphinx is a toolkit for building applications that use speech recognition. Our aim for this summer is to implement a module in Sphinx4 that adapts the acoustic model used for decoding, so that the recognition process produces better results.
When decoding with Sphinx4 you can use a general acoustic model (e.g., one for English), but if you want to decode audio files containing speech from non-native speakers, or recordings made in an environment with background noise, you will want to adapt this general acoustic model to those particular speakers so that decoding is more accurate. Currently, adapting an acoustic model requires the sphinxtrain tool provided by CMU Sphinx, and using sphinxtrain requires manually preparing several files (recordings, their transcriptions, etc.).
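To give an idea of the manual preparation involved, the sphinxtrain adaptation workflow expects, alongside the recordings themselves, a list of utterance ids and a matching transcription file. The file names and contents below are illustrative examples following the format used in the CMUSphinx adaptation tutorial, not files from this project:

```
# adapt.fileids — one recording per line, without the .wav extension
speaker1/utt01
speaker1/utt02

# adapt.transcription — the expected text, followed by the utterance id
<s> hello world </s> (utt01)
<s> open the window </s> (utt02)
```

Keeping these files in sync with the audio by hand is exactly the kind of friction our module aims to remove.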
Our aim is to make Sphinx4 adapt the acoustic model by itself, using information from a first decoding pass, and then re-decode with the adapted model, improving recognition accuracy.
So far I have implemented a component that collects adaptation data and another component that, based on the collected adaptation data, builds a file containing the transformation that will be applied to the acoustic model in order to adapt it.
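To illustrate what such a transformation does conceptually: MLLR-style adaptation (the technique used by sphinxtrain’s adaptation tools) replaces each Gaussian mean vector μ of the acoustic model with A·μ + b, where the matrix A and bias b are estimated from the adaptation data. The sketch below is not the project’s code; it is a minimal, self-contained illustration with made-up values for A and b:

```java
import java.util.Arrays;

public class MllrSketch {
    // Apply the affine transform A*mu + b to a single Gaussian mean vector.
    static double[] adaptMean(double[][] a, double[] b, double[] mu) {
        double[] out = new double[mu.length];
        for (int i = 0; i < mu.length; i++) {
            double sum = b[i];
            for (int j = 0; j < mu.length; j++) {
                sum += a[i][j] * mu[j];
            }
            out[i] = sum;
        }
        return out;
    }

    public static void main(String[] args) {
        // 2-dimensional toy example: identity scaling plus a small shift.
        // Real A and b would be estimated from collected adaptation data.
        double[][] a = {{1.0, 0.0}, {0.0, 1.0}};
        double[] b = {0.5, -0.5};
        double[] mu = {2.0, 3.0};
        System.out.println(Arrays.toString(adaptMean(a, b, mu)));
        // prints [2.5, 2.5]
    }
}
```

In the real adaptation process the same transform is applied to every mean in the model, which is why a single small file is enough to adapt a large acoustic model.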
Github repository: https://github.com/bogdanpetcu/sphinx4