A Retrospective Assessment of Fuzzy Logic Applications in Voice Communications and Speech Analytics
Keywords:
fuzzy logic, fuzzy system, speech, communication, VAD, speech segmentation, speech coding, speech analytics
Abstract
Voice and speech communication is a major topic that simultaneously covers 'communication', 'control' (because coding algorithms often involve control), and 'computing', from speech analysis and recognition to speech analytics and speech coding over communication channels. Although fuzzy logic was specifically conceived to deal with language and reasoning, it has so far seen only limited use in this field. We discuss some of the main current applications from the perspective of half a century since the inception of fuzzy logic.
License
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is free under the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
You are free to:
-Share: copy and redistribute the material in any medium or format;
-Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.