Document Type: Research Paper

Authors

1 Department of Linguistics, Faculty of Foreign Languages, University of Isfahan, Isfahan, Iran

2 Department of Linguistics, Faculty of Foreign Languages, University of Isfahan, Isfahan, Iran

10.22059/jolr.2025.389287.666909

Abstract

In the digital age, accurate speaker identification plays a crucial role in forensic and security investigations. However, the widespread use of internet-based communication platforms such as WhatsApp has introduced new challenges in this field. Variable microphone quality, background noise, network distortions, and audio compression can significantly alter a speaker’s acoustic features and reduce the accuracy of speaker identification systems. Evaluating how acoustic features perform under such conditions is therefore essential for advancing forensic phonetics and improving its practical application in real-world settings. This study examines the role of voiceless fricatives in capturing between-speaker variability in audio recordings obtained through WhatsApp. Its novelty lies in investigating whether Persian voiceless fricatives can distinguish speakers under non-ideal recording conditions. To this end, speech data were collected from 100 male Persian speakers, and Mel-frequency cepstral coefficients (MFCCs) were extracted from the voiceless fricative segments. These features were then used as input to a support vector machine (SVM) model for speaker classification. When all voiceless fricatives were considered together, the model achieved an overall speaker identification accuracy of 69%. Accuracy increased when each fricative was analyzed separately: /s/ yielded the highest accuracy (77%), followed by /ʃ/ (75%), /f/ (74%), and /x/ (73%). These findings suggest that even in non-ideal recording conditions, such as WhatsApp recordings, voiceless fricatives can provide valuable information for speaker differentiation.
However, this study focuses on only one type of non-ideal recording condition, and further research is needed to explore other potential sources of degradation. The results highlight the potential of voiceless fricatives in speaker identification applications, particularly in informal, uncontrolled, real-world scenarios where high-quality recording equipment is unavailable.
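
The pipeline described in the abstract (MFCC extraction from fricative segments, followed by SVM speaker classification) can be illustrated with a minimal sketch. The abstract does not state the toolchain or parameter settings used, so everything below is an assumption made for illustration: NumPy and scikit-learn, a 16 kHz sampling rate, 13 coefficients from 26 mel filters, an RBF kernel, and four hypothetical speakers whose "fricatives" are synthetic shaped noise rather than WhatsApp recordings. The accuracy it prints says nothing about the figures reported in the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb


def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II basis, used to decorrelate log-mel energies."""
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))


def mfcc_mean(segment, sr=16000, n_fft=512, hop=160, n_filters=26, n_coeffs=13):
    """Mean MFCC vector over all frames of one fricative segment."""
    window = np.hamming(n_fft)
    starts = range(0, len(segment) - n_fft + 1, hop)
    frames = np.array([segment[s:s + n_fft] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    return (log_mel @ dct_matrix(n_coeffs, n_filters).T).mean(axis=0)


def synthetic_fricative(rng, centre_hz, sr=16000, n=1600):
    """Hypothetical stand-in for a fricative token: noise with a spectral peak."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / sr)
    shape = np.exp(-0.5 * ((freqs - centre_hz) / 800.0) ** 2) + 0.05
    return np.fft.irfft(spectrum * shape, n=n)


# Four hypothetical speakers, each with an assumed spectral peak location.
rng = np.random.default_rng(0)
speaker_peaks_hz = {0: 3000.0, 1: 4000.0, 2: 5000.0, 3: 6000.0}
tokens_per_speaker = 30

features, labels = [], []
for speaker, peak in speaker_peaks_hz.items():
    for _ in range(tokens_per_speaker):
        features.append(mfcc_mean(synthetic_fricative(rng, peak)))
        labels.append(speaker)
X, y = np.array(features), np.array(labels)

# Hold out 30% of tokens per speaker, then fit a scaled RBF-kernel SVM.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out speaker identification accuracy: {accuracy:.2f}")
```

Averaging the MFCC frames gives one fixed-length vector per token, which keeps the classifier input simple. A per-fricative analysis like the one reported above would, under the same assumptions, repeat the fit on the subset of tokens belonging to each fricative.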
