Classification of Oromo fricatives with machine learning

Feda Negesse

Using machine learning, this study investigates how segment duration, spectral moments, and DCT coefficients differentiate singleton from geminate fricatives in Oromo. Eighteen native speakers of Western Oromo produced a set of fricatives in intervocalic position. Duration, Bark-transformed spectral moments, and the first six DCT coefficients were extracted from these segments. The sounds were classified with support vector machines, random forests, and multilayer perceptron neural networks. The results reveal that segment duration is the most consistent feature for distinguishing singletons from geminates, with DCT coefficients performing slightly better than spectral moments. The highest classification accuracy is achieved by combining duration and spectral moments, whereas non-temporal features lead to more errors.
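The three feature families named in the abstract can be illustrated with a minimal sketch. It is not the author's extraction pipeline: the Bark conversion uses Traunmüller's approximation, the spectral moments treat the power spectrum as a probability distribution, and the DCT is an unnormalised DCT-II applied to the spectrum in dB. All of these choices, the function names, and the toy spectrum in the usage note are assumptions made for illustration.

```python
import math


def hz_to_bark(f_hz):
    """Convert Hz to Bark (Traunmüller's approximation; an assumed choice)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def spectral_moments(freqs, power):
    """Return mean, standard deviation, skewness and excess kurtosis
    of a spectrum, treating normalised power as a probability mass."""
    total = sum(power)
    p = [x / total for x in power]
    mean = sum(f * w for f, w in zip(freqs, p))
    var = sum((f - mean) ** 2 * w for f, w in zip(freqs, p))
    sd = math.sqrt(var)
    skew = sum((f - mean) ** 3 * w for f, w in zip(freqs, p)) / sd ** 3
    kurt = sum((f - mean) ** 4 * w for f, w in zip(freqs, p)) / sd ** 4 - 3.0
    return mean, sd, skew, kurt


def dct_coefficients(values, n_coef=6):
    """Return the first n_coef unnormalised DCT-II coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coef)
    ]


def feature_vector(duration_ms, freqs_hz, power):
    """Hypothetical helper: duration + 4 Bark moments + 6 DCT coefficients."""
    bark = [hz_to_bark(f) for f in freqs_hz]
    level_db = [10.0 * math.log10(p) for p in power]  # power must be > 0
    return [duration_ms, *spectral_moments(bark, power), *dct_coefficients(level_db)]
```

For example, `feature_vector(120.0, [500 * i for i in range(1, 9)], [1, 2, 4, 8, 8, 4, 2, 1])` yields an 11-element vector for an invented 120 ms segment. Vectors of this kind, built from duration alone, moments alone, DCT coefficients alone, or combinations, could then be passed to any of the classifier types the abstract mentions in order to compare feature sets.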

Keywords
Oromo, fricatives, machine learning, DCT coefficients

How to cite
Negesse, Feda. "Classification of Oromo fricatives with machine learning". Estudios de fonética experimental, 2025, vol. 34, p. 151-68, https://raco.cat/index.php/EFE/article/view/980000003466.