References

Speech Signal Processing and Speech Recognition

Research and teaching consortium:
* Intelligent Systems Laboratory - "Gh. Asachi" Technical University of Iasi
* Department of Applied Informatics - Faculty of Computer Science, "Al. I. Cuza" University
* Laboratory of Signal Processing and Fuzzy and Neuro-Fuzzy Systems - Institute of Theoretical Informatics of the Romanian Academy

Encyclopedia of Information Systems, Academic Press, 2002

International Journal of Speech Technology, at http://www.kluweronline.com/issn/1381-2416. The September 2002 issue (Volume 5, Issue 3) is dedicated to research from Romania and is freely viewable at http://ipsapp008.lwwonline.com/ips/frames/toc.asp?J=4778&I=9. Several of the papers in this issue will also be discussed in this course or in the laboratory sessions.

G. Stolojanu, V. Podaru, F. Cetina: Prelucrarea numerica a semnalului vocal. Ed. Militara, 1984

H.N. Teodorescu, D. Mlynek, A. Kandel, H.J. Zimmermann (Eds.): Intelligent Systems and Interfaces. 480 pp., ISBN: 079237763X, Kluwer Academic Press, Boston, 2000, ch. 8, 9

H.-N. Teodorescu, L. Buchholtzer, C. Posa: Comunicarea orala om-masina, Ed. Tehnica, 1986 (general introduction; the technical part is outdated)

H.N. Teodorescu, A. Brezulianu: Procesarea imaginilor si a semnalului vocal. Iasi 1996, Universitatea Tehnica Iasi

H.N. Teodorescu, L.C. Jain (Eds): Intelligent Systems & Interfaces, CRC Press 2001, ch. 1, 3

D. Tufis (Ed), Limbaj si tehnologie, Ed. Academiei Romane, 1996

D. Tufis, F. Filip (Eds), Limba Romana in Societatea Cunoasterii, Ed. Expert, Buc., 2003

http://mambo.ucsc.edu/psl/speech.html is a page of links (some already outdated), most of them of interest for current research worldwide.

3. X. Rodet, "Sound and Music from Chua's Circuit", Journal of Circuits, Systems and Computers, Special Issue on Chua's Circuit: a Paradigm for Chaos, Vol. 3, No. 1, pp. 49-61, March 1993.

4. X. Rodet, "Models of Musical Instruments from Chua's Circuit with Time Delay", in IEEE Trans. on Circ. and Syst., Special Issue on Chaos in nonlinear electronic circuits, Sept. 1993.

5. Shahrokh D. Yadegari, Self-similar Synthesis: On the Border Between Sound and Music. Master Thesis submitted to Media Arts and Sciences Section, (Media Lab) School of Architecture and Planning, at Massachusetts Institute of Technology on August 25, 1992

6. H.N. Teodorescu: Utilizarea tehnicilor nuantate (fuzzy) si de dinamica neliniara pentru sinteza adaptiva a vorbirii. In: Dan Tufis, Florin Gh. Filip (Eds), Limba Româna în Societatea Informationala - Societatea Cunoasterii. Editura Expert, Academia Româna, 2002, pp. 277-290.

Research in Iasi, Romania, on speech, speech technology and related natural language-speech technology issues

In Iasi, there are three groups, drawing on people from the Technical University of Iasi, the Institute for Information Science of the Romanian Academy, and the University "Al. I. Cuza" Iasi. The three groups work closely together. Their main research topics have been:

Contributions to the use of nonlinear analysis in speech analysis. The nonlinear analysis has revealed a direct connection between the nonlinear processes, as represented by the formants, and the change in the nonlinear behavior of the pitch. This research has been performed in cooperation with the Department of Computer Science and Engineering, University of South Florida, where a PhD thesis has continued the research begun in Iasi. The results have been presented in [3,4,5,8,9].

Research performed in cooperation with physicians revealed that the larynx muscles are driven by bio-electric potentials that can be used to decode which vowels are uttered. As a direct consequence, voice prostheses driven by the larynx muscles have been proposed [1, 13, 14].

Various pioneering studies with applications in medicine have been conducted. Results are presented in [11, 15, 16, 17].

Adaptive speech synthesis, including the use of fuzzy systems to perform and adapt the synthesis, has been addressed by the groups in Iasi. The adaptation aims to improve the quality of synthesized speech under various ambient conditions (non-stationary noise). Results are reported in [2,6,7,10].

References (only papers authored by the group led by H.N. Teodorescu)

Papers and chapters

[1] Teodorescu H.N., et al.: Neurological control of laryngeal prosthesis, in vol. Progress Report in Electronic in Medicine and Biology, 1986, IERE Press, London, p. 269-275
[2] Vasile Apopei, Doina Jitca, Florin Grigoras, H.N. Teodorescu: Naturalness in Speech Synthesis by Fuzzy Control of the Glottal Parameters. Proc. 6th International Conference on Soft Computing (IIZUKA2000), September 30-October 4, 2000
[3] W. Rodriguez, H.N. Teodorescu, F. Grigoras, A. Kandel, H. Bunke: A Fuzzy information space approach to speech signal nonlinear analysis. J. of Intelligent Systems (Wiley), Dec. 1999
[4] Fl. Grigoras, H.N. Teodorescu, V. Apopei: Nonlinear Analysis and Synthesis of Speech. Studies in Informatics and Control, vol. 7, no. 1, March 1998, pp. 57-72
[5] H.N. Teodorescu, Fl. Grigoras, V. Apopei: Nonlinear processes in speech production. Int. J. Chaos Theory and Applications, vol. 2, no. 2 (1997), pp. 35-52
[6] Teodorescu H.N.: Making speech synthesisers noise-adaptabile. Electronic Engineering (UK), July 1987, p. 23
[7] F. Grigoras, H. N. Teodorescu, V. Apopei: Fuzzy Control for Speech AI-based Synthesis. European Control Conference, Germany, September 1999
[8] F. Grigoras, H.N. Teodorescu, V. Apopei: Analysis of nonlinear and nonstationary processes in speech production, IEEE 1997 Workshop on Applications of Signal Processing to Audio and Acoustics, Mohonk Mountain House, New Paltz, New York, October 19-22, 1997 (IEEE Catalog # 97TH8278)
[9] H.N. Teodorescu, Fl. Grigoras: Nonlinear Techniques in Speech Signal Analysis. Proc. International Conference on Intelligent Technologies in Human-Related Sciences, ITHURS'96. July 5-7, 1996, Leon, Spain. Vol. 2, pp. 293-298
[10] Teodorescu H.N., Chelaru M., Sofron E., Adascalitei A.: Adaptive speech synthesis. In vol. Digitale Sprach-verarbeitung - Prinzipien und Anwendungen. VDE Verlag, Berlin (W), 1988, pp. 183-188.
[11] Teodorescu H.N. et al - Fuzzy models in speech analysis and medical application, in Book of Summaries Int. Conf "Modelling and Simulation", Istanbul, Turkey, July 1988, vol. 1, p. 162 (Summary)
[12] F. Grigoras, H. N. Teodorescu, V. Apopei: Fuzzy Control for Speech AI-based Synthesis. European Control Conference, Germany, September 1999
[13] Teodorescu H.N., L. Buchholtzer, Chelaru M., Teodorescu L.: A laryngeal prosthesis based on perilaryngean reflexes, Proc. 9th Int. EMBS Conf. IEEE, Boston. Vol. 4, IEEE, 1987, pp. 2114-2115

[14] Teodorescu H.N., Teodorescu L., Buchholtzer L.: Patent (Rom.) no. 84397/03.28.1984: Command device for laryngeal prostheses (Dispozitiv de comanda pentru proteze vocale)
[15] Teodorescu H.N., Talmaciu M., Teodorescu L.: Patent (Rom.) no. 84641/04.10.1984: Hearing-disabled aid (Proteza pentru handicapati auditivi)
[16] Teodorescu H.N., Posa C., Teodorescu L.: Patent (Rom.) no. 84396/03.28.1984: Biofeedback method (Metoda de biofeedback)
[17] Teodorescu H.N., Talmaciu M., Teodorescu L.: Patent (Rom.) no. 84641/04.10.1984: Prosthesis for hearing disabled (Proteza pentru handicapati auditivi)

Two volumes (handbooks) and several book chapters have been authored in connection with this research.

Bibliographic references

[1]. Teodorescu H.N., Chelaru M., Sofron E., Adascalitei A.: Adaptive speech synthesis. In vol. Digitale Sprach-verarbeitung - Prinzipien und Anwendungen. VDE Verlag, Berlin (W), pp. 183-188, 1988
[2]. Teodorescu H.N.: Interrelationship, Communication, Semiotics, and Artificial Consciousness. In: Kitamura, T. (Ed.): What Should be Computed to Understand and Model Brain Functions? FLSI Book Series, vol. 3, World Scientific, 2000
[3]. Teodorescu H.N.: Computer semiotics: understanding meanings and parallel languages (Refereed invited paper), Proc. Int. Conf. IIZUKA'98, Japan, 1998
[4]. Teodorescu H.N.: Making speech synthesisers noise-adaptabile. Electronic Engineering (UK), July 1987, p. 23
[5]. Rodriguez, W., Teodorescu H.N., Grigoras Fl., Kandel A., Bunke H.: A Fuzzy information space approach to speech signal nonlinear analysis. J. of Intelligent Systems (Wiley), Dec. 1999
[6]. Grigoras Fl., Teodorescu H.N., Apopei V.: Nonlinear Analysis and Synthesis of Speech. Studies in Informatics and Control, vol. 7, no. 1, March 1998, pp. 57-72
[7]. Teodorescu H.N., Grigoras Fl., Apopei V.: Nonlinear processes in speech production. Int. J. Chaos Theory and Applications, vol. 2, no. 2 (1997), pp. 35-52
[8]. Teodorescu H.N., Grigoras Fl.: Nonlinear Techniques in Speech Signal Analysis. Proc. International Conference on Intelligent Technologies in Human-Related Sciences, ITHURS'96. July 5-7, Leon, Spain. Vol. 2, pp. 293-298, 1996
[9]. Grigoras Fl., Teodorescu H.N., Apopei V.: Analysis of nonlinear and nonstationary processes in speech production, IEEE 1997 Workshop on Applications of Signal Processing to Audio and Acoustics. Mohonk Mountain House, New Paltz, New York, October 19-22, 1997 (IEEE Catalog # 97TH8278)
[10]. Burlui V., Teodorescu H.N., Morarasu C.S.: La fonction phonatoire chez l'edente total. Analyse en frequence. Les Cahiers de Prothese (France), No. 88, Decembre 1994, pp. 63-68
[11]. Teodorescu H.N. et al.: Fuzzy models in speech analysis and medical application, in Book of Summaries Int. Conf Modelling and Simulation, Istanbul, Turkey, July 1988, vol. 1, p. 162 (Summary)
[12]. Teodorescu H.N., L. Buchholtzer, Chelaru M., Teodorescu L.: A laryngeal prosthesis based on perilaryngean reflexes, Proc. 9th Int. EMBS Conf. IEEE, Boston. Vol. 4, IEEE, pp. 2114-2115, 1987
[13]. Anonymous Automotive Industry OEM/Supplier: Talking to computers vs. talking to humans 7/12/2000. http://www-nrd.nhtsa.dot.gov/departments/nrd-13/driver-distraction/Topics013040293.htm#A293
[14]. Anne-Marie Derouault, The Future of Speech Recognition. Evolving speech recognition technology is driving transparent computing, making it easier for people to interact with computers. http://www.advisor.com/Articles.nsf/ID/OA000107.DERO01
[15]. House D., Bell L., Gustafson K. & Johansson L. Child-directed speech synthesis: evaluation of prosodic variation for an educational computer program. Proc of Eurospeech'99, pp. 1843-1846, 1999
[16]. Heldner M., Strangert E. & Deschamps T.: Focus detection using overall intensity and high frequency emphasis. In: Andersson R, Abelin Å, Allwood J & Lindblad P, eds. Proc of Fonetik 99; pp. 73-76, 1999.
[17]. Heldner M., Strangert E. & Deschamps T.: A focus detector using overall intensity and high frequency emphasis. Proc of ICPhS-99, pp. 1491-1494, 1999.
[18]. Heldner M.: On the non-linear lengthening of focally accented Swedish words. In: W. van Dommelen & T Fretheim, eds. Nordic Prosody: Proc of the VIIIth Conference, Trondheim 2000. Frankfurt am Main: Peter Lang. 2001
[19]. Karlsson I., Banziger T., Dankovicová J., Johnstone T., Lindberg J., Melin H., Nolan F. & Scherer K.: Within-speaker variability due to speaking manners. Mannell RH & Robert-Ribes J, eds. Proc of ICSLP98, 2379-2382. 1998
[20]. Karlsson I.: Within-speaker variability in the VeriVox database. In: Andersson R, Abelin Å, Allwood J & Lindblad P, eds. Proc. of Fonetik 99, pp. 93-96, 1999.
[21]. Karlsson I, Banziger T, Dankovicova J, Johnstone T, Lindberg J, Melin H, Nolan F, Scherer K (1998), Within speaker variation due to induced stress, Proc Fonetik-98, 150-153. www.ling.su.se/ fon/publications/fonetik98/
[22]. Gustafson-Capkova S & Megyesi B.: A Comparative Study of Pauses in Dialogues and Read Speech. Proc of Eurospeech 2001, pp. 931-935, 2001
[23]. Beskow J.: A tool for teaching and development of parametric speech synthesis. In: Branderud P & Traunmüller H (eds). Proc of Fonetik -98, pp. 162-165. 1-98, 1998
[24]. Rachel I. Mayberry, Elizabeth Lock, Hena Kazmi: Linguistic ability and early language exposure. NATURE, Vol. 417, 2 May 2002, p. 38, 2002
[25]. Microsoft Co.: Platform SDK: Agent. Characters. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msagent/deschar_8nn6.asp
[26]. Mauricio Lumbreras, Gustavo Rossi: Metaphor for the Visually Impaired: Browsing Information in a 3D Auditory Environment. CHI'95 Proc., www.acm.org/sigchi/chi95/proceedings/shortppr/ml_bdy.htm
[27]. Christophe d'Alessandro & Jean-Sylvain Liénard: 5.2 Synthetic Speech Generation. In: Survey of the State of the Art in Human Language Technology. http://cslu.cse.ogi.edu/HLTsurvey/ch5node4.html#SECTION52
[28]. Teodorescu H.N.: Chaos in fuzzy systems and signals. Vol. Proceedings of the 2nd Int. Conf. on Fuzzy Logic and Neural Networks. Vol. 1., pp. 21-50 (Jono Printing Co., 1992, Iizuka, Japan)
[29]. Teodorescu H.N., Kandel A., Jain L. C. (Eds.), Fuzzy and Neuro-Fuzzy Systems in Medicine (International Series on Computational Intelligence). CRC Press, Boca Raton, USA, 1998.
[30]. Teodorescu H.N., Mlynek D., Kandel A. (Eds.): Intelligent Systems and Interfaces (The Kluwer International Series In Intelligent Systems). Kluwer Publ., Boston, 2000.
[31]. Yasuhisa Niimi, Masanori Kasamatu, Takuya Nishimoto and Masahiro Araki: Synthesis of Emotional Speech Using Prosodically Balanced VCV Segments. http://www.ssw4.org/papers/133.pdf.
[32]. Nick Campbell: Where is the information in speech? (and to what extent can it be modelled in synthesis?) www.slt.atr.co.jp/cocosda/jenolan/Proc/r82/r82.pdf.
[33]. Hakulinen J., Turunen, M.: Prosodic Features for Speech User Interfaces. www.cs.uta.fi/hci/spi/reports/Prosodic_Features_for_Speech_User _Interfaces.pdf.
[34]. Ansgar Rinscheid: Voice Conversion Based On Topological Feature Maps and Time-Variant Filtering. www.asel.udel.edu/icslp/cdrom/vol3/235 /a235.pdf.
[35]. Syrdal A., Stylianou Y., Garrison L., Conkie A., Schroeter J.: TD-PSOLA versus Harmonic Plus Noise Model in Diphone Based Speech Synthesis. www.research.att.com/projects/tts/papers/1998_ICASSP/paperSYN.ps.
