Ranking of Persian Speech Phonemes from the Point of View of Efficiency in Speaker Recognition

Sheykhzadegan, Javad

doi:10.22059/jolr.2016.59415

Article type: Research article

Author

Javad Sheykhzadegan

Associate Professor, Research Center of Intelligent Signal Processing

https://doi.org/10.22059/jolr.2016.59415

Abstract

This paper studies how useful each Persian speech phoneme is for speaker recognition and ranks the phonemes accordingly. To estimate the efficiency of a phoneme, we use a criterion defined as the ratio of the phoneme's "inter-speaker distance" to its "intra-speaker distance", which we call the "speaker affectability ratio". The experiments and computations were carried out for all Persian speech phonemes (except /À/) using the Persian speech database "Farsdat", and rankings were produced both for the broad phoneme classes and for individual phonemes. Among the broad phoneme classes, vowels and semivowels rank first, nasals, fricatives and liquids rank second, and stops and plosives rank third in efficiency for speaker recognition. Among individual phonemes, /∂/ ranks first and /t/ ranks last. Compared with results reported for other languages such as English, German and Dutch, the ranking of the broad phoneme classes agrees closely, but the detailed rankings differ noticeably.

Article title [English]

Ranking of Persian Speech Phonemes from the Point of View of Efficiency in Speaker Recognition

Author [English]

Javad Sheykhzadegan

Associate Professor, Research Center of Intelligent Signal Processing

Abstract [English]

In this paper, the efficiency of Persian speech phonemes in speaker recognition is studied, and the phonemes are ranked according to that efficiency. To estimate the efficiency of each phoneme, we introduce a criterion defined as the ratio of the phoneme's "inter-speaker distance" to its "intra-speaker distance", referred to as the "Speaker Affectability Ratio" (SAR). The necessary experiments and computations were carried out for all Persian speech phonemes (with the exception of /À/) using the Persian speech database "Farsdat", and single phonemes and phoneme groups were then ranked on the basis of the results. In the ranking of phoneme groups, vowels and semivowels come first, nasals, fricatives and liquids second, and stops and plosives third in efficiency for speaker recognition. Likewise, the ranking of single phonemes shows that the phoneme /∂/ is first and the phoneme /t/ is last. Compared with research on other spoken languages such as English, German and Dutch, these results show high agreement in the ranking of phoneme groups but noticeable differences in the details of the rankings.
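The ratio described in the abstract can be illustrated with a minimal sketch. The abstract does not state which acoustic features or distance measure were used, so everything concrete here is an assumption: tokens are plain feature vectors, distance is Euclidean, the inter-speaker distance is the mean distance between speaker centroids, and the intra-speaker distance is the mean distance of each token to its own speaker's centroid. The function names and the toy data are hypothetical and are not taken from the paper or from Farsdat.

```python
from itertools import combinations
from math import dist
from statistics import fmean


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [fmean(component) for component in zip(*vectors)]


def speaker_affectability_ratio(tokens_by_speaker):
    """Illustrative SAR for one phoneme (assumed definitions, see lead-in).

    tokens_by_speaker maps a speaker id to a list of feature vectors,
    one vector per token of the phoneme. At least two speakers are needed,
    and the tokens of a speaker must not all be identical for every speaker
    (otherwise the intra-speaker distance is zero).
    """
    centroids = {spk: centroid(vecs) for spk, vecs in tokens_by_speaker.items()}
    # Assumed inter-speaker distance: mean pairwise distance between centroids.
    inter = fmean(dist(a, b) for a, b in combinations(centroids.values(), 2))
    # Assumed intra-speaker distance: mean distance of tokens to own centroid.
    intra = fmean(
        dist(vec, centroids[spk])
        for spk, vecs in tokens_by_speaker.items()
        for vec in vecs
    )
    return inter / intra


# Toy two-dimensional "features" for two phonemes and two speakers (made up).
toy = {
    "a": {"spk1": [[0.0, 0.0], [0.0, 2.0]], "spk2": [[10.0, 0.0], [10.0, 2.0]]},
    "t": {"spk1": [[0.0, 0.0], [0.0, 4.0]], "spk2": [[2.0, 0.0], [2.0, 4.0]]},
}

sar = {phoneme: speaker_affectability_ratio(data) for phoneme, data in toy.items()}
# Higher SAR means speakers differ more relative to their own variability.
ranking = sorted(sar, key=sar.get, reverse=True)
print(sar, ranking)
```

In this toy data the phoneme labeled "a" separates the two speakers far more than its within-speaker spread, so it ranks above "t"; the actual rankings reported in the paper come from Farsdat measurements, not from this sketch.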

Keywords [English]

phoneme ranking
Persian speech
phoneme efficiency
speaker recognition
speaker affectability ratio

Pazhuhesh-ha-ye Zabani (Language Research)

Volume 7, Issue 1
Khordad 1395 (May-June 2016)
Pages 77-96
