Authenticity at Risk: Key Factors in the Generation and Detection of Audio Deepfakes
Academic Video Service
  • Release Date: 2025-02-26
  • audio deepfake
  • generation
  • detection
  • acoustic context
Video Introduction

This video is adapted from 10.3390/app15020558

Detecting audio deepfakes is essential for ensuring authenticity and security, particularly in critical legal, security, and human rights contexts. Factors such as complex acoustic backgrounds can enhance the realism of deepfakes, yet their impact on how deepfakes are created and detected has been insufficiently explored.

This discussion systematically analyzes how the acoustic environment, user type, and signal-to-noise ratio influence the quality and detectability of deepfakes. The analysis uses the WELIVE dataset, which contains audio recordings of 14 female victims of gender-based violence captured in uncontrolled environments. The findings show that the complexity of the acoustic scene significantly affects both the generation and the detection of deepfakes.
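As a point of reference for the signal-to-noise ratio mentioned above, the sketch below computes SNR in decibels using the standard definition, 10 · log10(P_signal / P_noise). The source does not say how SNR was measured on the WELIVE recordings, so the function names (`power`, `snr_db`) and the toy signals here are illustrative assumptions, not the study's procedure.

```python
import math


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """SNR in decibels, standard definition: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(power(signal) / power(noise))


# Toy data (assumed for illustration only): a sine wave as the "speech"
# component and a low-amplitude alternating sequence as the "background".
signal = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
noise = [0.1 * (-1) ** n for n in range(100)]

print(f"SNR: {snr_db(signal, noise):.2f} dB")
```

In practice the clean signal and the noise are rarely available separately in uncontrolled recordings, so SNR usually has to be estimated, for example from segments without speech; that estimation step is outside this sketch.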

Interestingly, classifiers, especially the linear SVM, detect deepfakes more accurately in complex acoustic environments. This suggests that simpler acoustic settings may allow the creation of more realistic deepfakes, which are in turn harder for classifiers to detect. These insights highlight the need for adaptive models that can handle diverse acoustic environments, ultimately improving detection reliability in dynamic, real-world situations.
