Experiencing Text to Speech

xujiajin

Administrator
Staff member
http://www.iflytek.com/speech%20shows.asp
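Here you can try speech synthesis technology for yourself.
You can type any Chinese characters into the input box, or mix Chinese and English, and then run the synthesis; the system will read the text aloud. This technology is called TTS (Text to Speech).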


The counterpart of synthesis is speech recognition (also called voice recognition).
 
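Quite right. iFlytek works mainly on Chinese. The Chinese announcer they hired does indeed have problems with his English pronunciation. I tried my hardest to help him correct it, but the result still does not feel satisfactory.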
 
Speech Recognition Technology Used in CALL Applications in China
[Image: 2005070109311391.jpg]


Technology in use

We have made a general evaluation of the recognition engines used in learning software such as TeLL me More Pro, LiveABC Interactive (developed by the same company as CNN Interactive), and many others.

Most of the CALL programs are PC-based and some include web-based recognition.
The record/playback/compare method predominates. During playback and comparison, a pass-fail mode is used to control the learning pace, but the problem is that the recognition technology is itself unreliable. In some programs, a waveform-based evaluation score for each word and a score for overall performance are displayed. Others use a "pass / try again / listen to the original sound / skip" sequence to guide the learning path.
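To make the "pass / try again / listen to the original sound / skip" path concrete, here is a minimal sketch in Python. It is not taken from any of the programs reviewed; the pass mark, the retry limit, and the function names are all assumptions made for illustration, and the scores stand in for whatever a recognition engine would return.

```python
from typing import List

PASS_THRESHOLD = 0.7  # hypothetical pass mark on a 0..1 score
MAX_ATTEMPTS = 3      # hypothetical number of attempts before the path changes


def next_step(score: float, attempts: int, heard_original: bool) -> str:
    """Choose the next step after one recorded attempt."""
    if score >= PASS_THRESHOLD:
        return "pass"
    if attempts < MAX_ATTEMPTS:
        return "try again"
    if not heard_original:
        return "listen to the original sound"
    return "skip"


def run_drill(scores: List[float]) -> List[str]:
    """Replay a sequence of attempt scores and collect the steps taken."""
    steps = []
    attempts = 0
    heard_original = False
    for score in scores:
        attempts += 1
        step = next_step(score, attempts, heard_original)
        steps.append(step)
        if step == "listen to the original sound":
            heard_original = True
        if step in ("pass", "skip"):
            break  # the drill for this sentence is over
    return steps


if __name__ == "__main__":
    print(run_drill([0.2, 0.4, 0.9]))
    print(run_drill([0.1, 0.1, 0.1, 0.1]))
```

Because the scores themselves are unreliable, a design like this one always leaves the learner a way out (the "skip" step) instead of blocking progress on a failed comparison.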
[Image: 2005070109333880.jpg]


Task design in CALL applications with speech recognition technology
Due to the apparent limitations of speech processing technology, current voice-interactive CALL applications allow only for the training of lower-level, repetitive skills in language learning. But this does not keep us from devising some original tasks or activities with speech recognition technology. Apart from record, playback and comparison, we can also write good voice-based exercises such as the following.

(1) Oral cloze, in which learners speak the correct answers, chosen from the given options or distracters.

(2) Situated multiple choice, in which learners answer the questions orally, that is, they choose the right answer from among three or four options. Usually the first question in such an activity is an open one that selects the situation in which the episode develops. Different choices lead to different sets of questions, so learners are motivated to do the exercise and their interest in learning is raised.

(3) Role play, a popular activity employed by almost all the CALL applications we have reviewed thus far. In a voice-based role play, the learner first selects a role in the dialog. The computer then mutes the recording of the role the learner has chosen and plays the recording of the other role. The learner reads or recites the sentences of the role s/he plays. After the last sentence of the dialog has been read, the computer mixes the learner's and the computer's "talk" and plays it back. The learner can sit back and listen to the dialog s/he has had with the computer. S/he may also choose to listen to the original sound.
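As a rough illustration of the situated multiple choice task, the sketch below shows how a first open question can select a situation and how each choice leads to a different set of questions. Everything here is hypothetical: the episode content, the node names, and the use of fuzzy string matching (Python's standard `difflib`) as a stand-in for the text a speech recognizer would return.

```python
import difflib
from typing import List, Optional

# Hypothetical episode: each node has a prompt and spoken options
# that lead to other nodes.
EPISODE = {
    "start": {
        "prompt": "Where would you like to go?",
        "options": {"the airport": "airport", "a restaurant": "restaurant"},
    },
    "airport": {
        "prompt": "What do you need at the airport?",
        "options": {"a boarding pass": "end", "a taxi": "end"},
    },
    "restaurant": {
        "prompt": "What would you like to order?",
        "options": {"a bowl of noodles": "end", "a cup of tea": "end"},
    },
    "end": {"prompt": "Thank you.", "options": {}},
}


def match_option(recognized: str, options: List[str],
                 cutoff: float = 0.6) -> Optional[str]:
    """Return the option closest to the recognized text, or None."""
    lowered = {opt.lower(): opt for opt in options}
    hits = difflib.get_close_matches(
        recognized.lower().strip(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[hits[0]] if hits else None


def advance(node: str, recognized: str) -> str:
    """Move to the next node, or stay put so the question is asked again."""
    options = EPISODE[node]["options"]
    choice = match_option(recognized, list(options))
    return options[choice] if choice is not None else node


if __name__ == "__main__":
    node = advance("start", "a restarant")  # imperfect recognizer output
    print(node, "->", EPISODE[node]["prompt"])
```

Matching against a small closed set of options is what makes this kind of task workable with an unreliable recognizer: the engine only has to tell three or four candidate answers apart.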

[This post was edited by the author on July 1, 2005, at 09:36:11]
 
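iFlytek's large speech synthesis systems all use concatenative synthesis, so their naturalness is very good. The QQ car now also uses iFlytek's speech synthesis technology, but the in-car version uses formant synthesis, so the speech is much less natural and sounds very robotic.
The AT&T and Bell Labs websites also have demos of the English speech synthesis they have built, which are quite interesting too.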
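For readers unfamiliar with the term, the core idea of concatenative synthesis is joining stored waveform units end to end and smoothing the joins. The sketch below is only a toy under stated assumptions: sine tones stand in for recorded speech units, the linear crossfade and the function names are my own choices, and a real system selects its units from a large recorded database. It says nothing about how iFlytek's system is actually built.

```python
import math
from typing import List


def tone(freq_hz: float, n_samples: int, rate: int = 8000) -> List[float]:
    """Generate a sine tone; a stand-in for one recorded speech unit."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n_samples)]


def concatenate(units: List[List[float]], overlap: int) -> List[float]:
    """Join a non-empty list of units, linearly crossfading up to
    `overlap` samples at each boundary."""
    out = list(units[0])
    for unit in units[1:]:
        k = min(overlap, len(out), len(unit))
        start = len(out) - k
        for i in range(k):
            w = (i + 1) / (k + 1)  # weight of the incoming unit rises across the join
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[k:])
    return out


if __name__ == "__main__":
    units = [tone(220.0, 800), tone(330.0, 800), tone(262.0, 800)]
    signal = concatenate(units, overlap=80)
    print(len(signal))  # each join shortens the total by the overlap
```

Formant synthesis, by contrast, generates the waveform from rules that control resonance frequencies rather than from recordings, which is one reason it tends to sound more mechanical.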
 
Re: Experiencing Text to Speech

I used this passage as the test sample:

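迅飞做的主要是中文。请的那位中文播音员英文发音的确有问题。我拼命帮他纠正,感觉都不能令人满意。
[English gloss: "iFlytek works mainly on Chinese. The Chinese announcer they hired does indeed have problems with his English pronunciation. I tried my hardest to help him correct it, but the result still does not feel satisfactory."]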

Except for some misplaced stress on function words (e.g. 那位), the
output is impressive. I think it sounds much better than
the AT&T system that I tested a while back.
 