Improved speech recognition method for intelligent robot (foreign literature translation).doc
About 20 pages, DOC format
Posted by member wanli1988go
Improved speech recognition method
for intelligent robot
2. Overview of speech recognition
Speech recognition has received increasing attention recently because of its theoretical significance and practical value [5]. To date, most speech recognition has been based on conventional linear system theory, such as the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW). As research has deepened, it has been found that the speech signal is a complex nonlinear process. For speech recognition research to achieve a breakthrough, nonlinear-system theory must be introduced. With the recent development of nonlinear-system theories such as artificial neural networks (ANN), chaos, and fractals, it has become possible to apply these theories to speech recognition. The study in this paper is therefore based on ANN, with chaos and fractal theories introduced to process speech recognition.
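As a point of reference for the conventional techniques named above, DTW can be sketched in a few lines. This is the textbook dynamic-programming formulation on one-dimensional sequences, not code from the paper; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Textbook DTW: minimal accumulated cost of aligning sequence a with b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same contour spoken at a different tempo aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In a real recognizer the sequences would be frames of feature vectors rather than scalars, and the local cost would be a vector distance.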
Speech recognition is divided into two types: speaker dependent and speaker independent. Speaker dependent refers to a pronunciation model trained by a single person; the recognition rate for the trainer's commands is high, while other people's commands are recognized at a low rate or not at all. Speaker independent refers to the pronunciation model
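Improved speech recognition method for intelligent robot (Chinese translation)
2. Overview of speech recognition
Recently, speech recognition has received increasing attention because of its great theoretical significance and practical value. To date, most speech recognition has been based on conventional linear system theory, such as the Hidden Markov Model and Dynamic Time Warping. As research on speech recognition has deepened, researchers have found that the speech signal is a complex nonlinear process; if speech recognition research is to achieve a breakthrough, nonlinear system theory methods must be introduced. Recently, with the development of nonlinear system theories such as artificial neural networks, chaos, and fractals, it has become possible to apply these theories to speech recognition. Therefore, the study in this paper is based on neural networks, with chaos and fractal theories introduced to handle speech recognition.
Speech recognition can be divided into two types: speaker-dependent and speaker-independent. Speaker-dependent means that the pronunciation model is trained by a single person; it recognizes the trainer's commands well, but recognizes other people's commands poorly or not at all. Speaker-independent means that the pronunciation model is trained by people of different ages, genders, and regions, so it can recognize the commands of a group of people. In general, speaker-independent systems are more widely used because users do not need to perform any training. Therefore, in a speaker-independent system, extracting speech features from the speech signal is a fundamental problem of the speech recognition system.
Speech recognition comprises training and recognition, and can be regarded as a pattern recognition task. Typically, the speech signal can be viewed as a time series characterized by a Hidden Markov Model. Through feature extraction, the speech signal is converted into feature vectors that serve as observations. During training, these observations are fed into the estimation of the HMM's model parameters. These parameters include the probability density functions of the observations for their corresponding states, the transition probabilities between states, and so on. After parameter estimation, the trained model can then be applied ...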
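To make the role of these parameters concrete, the following minimal sketch scores an observation sequence with the standard HMM forward algorithm. It is an illustration under simplifying assumptions, not the paper's implementation: the observations are discrete symbols with an emission table instead of feature vectors with probability density functions, and the names `forward_likelihood`, `start`, `trans`, and `emit`, as well as all numbers, are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete-observation HMM (forward algorithm).

    start[i]    -- probability of beginning in state i
    trans[i][j] -- probability of moving from state i to state j
    emit[i][o]  -- probability that state i emits observation symbol o
    """
    n_states = len(start)
    # Initialization: probability of the first observation in each state.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    # Induction: fold in one observation at a time.
    for o in obs[1:]:
        alpha = [
            emit[j][o] * sum(alpha[i] * trans[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    # Termination: sum over all possible final states.
    return sum(alpha)


# Hypothetical two-state model with two observation symbols (0 and 1).
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.5], [0.1, 0.9]]
print(forward_likelihood([0, 1, 1], start, trans, emit))
```

In recognition, one such model would typically be trained per word or command, and the model giving the highest likelihood for the incoming observation sequence would be chosen.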