X3A: Efficient Multimodal Deepfake Detection with Score-Level Fusion

  • Chan Park
  • Bohyun Moon
  • Minsun Jeon
  • Jee Weon Jung
  • Simon S. Woo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Advances in deepfake generation have highlighted the need for sophisticated detection methods and realistic datasets to ensure that models generalize effectively. While traditional datasets focused on unimodal manipulations, the emergence of multimodal datasets, which include audio-visual forgeries, has increased the complexity of deepfake detection. The recently released LAV-DF and AV-Deepfake1M datasets feature partial manipulations in multimodal content and underscore the need for effective video-level detection methods to identify these forgeries. In this work, we propose X3A, an efficient multimodal video deepfake detection model that combines two powerful unimodal models through probabilistic score-level fusion. X3A operates directly on raw visual and audio inputs without relying on hand-crafted features. We conducted extensive experiments on multiple multimodal deepfake benchmark datasets and achieved superior performance on multimodal deepfake detection, successfully detecting both entirely and partially manipulated content. Our X3A model achieves an accuracy of 0.9960 and an AUC of 0.9999 on the most challenging AV-Deepfake1M benchmark, surpassing all existing models.
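
The abstract does not spell out the fusion rule, so the following is only a minimal sketch of what probabilistic score-level fusion can look like, not the authors' implementation. It assumes each unimodal model outputs a probability that its modality is fake and combines the two with a noisy-OR rule, so the video is flagged when either stream looks manipulated. The function names, the noisy-OR choice, and the 0.5 threshold are all illustrative assumptions.

```python
# Hypothetical sketch: noisy-OR fusion of two unimodal fake-probabilities.
# The rule, the names, and the threshold are assumptions, not taken from the paper.

def fuse_scores(p_visual: float, p_audio: float) -> float:
    """Return the probability that at least one modality is fake."""
    for p in (p_visual, p_audio):
        if not 0.0 <= p <= 1.0:
            raise ValueError("scores must be probabilities in [0, 1]")
    # P(video fake) = 1 - P(visual real) * P(audio real), assuming independence.
    return 1.0 - (1.0 - p_visual) * (1.0 - p_audio)


def classify(p_visual: float, p_audio: float, threshold: float = 0.5) -> str:
    """Label a video as 'fake' when the fused score reaches the threshold."""
    return "fake" if fuse_scores(p_visual, p_audio) >= threshold else "real"


if __name__ == "__main__":
    # A video whose audio track alone looks manipulated is still flagged.
    print(classify(p_visual=0.1, p_audio=0.9))
```

Under this assumed rule, a forgery confined to one modality still raises the video-level score, which matches the partially manipulated setting the abstract describes.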

Original language: English
Title of host publication: 40th Annual ACM Symposium on Applied Computing, SAC 2025
Publisher: Association for Computing Machinery
Pages: 767-774
Number of pages: 8
ISBN (Electronic): 9798400706295
State: Published - 14 May 2025
Event: 40th Annual ACM Symposium on Applied Computing, SAC 2025 - Catania, Italy
Duration: 31 Mar 2025 → 4 Apr 2025

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing

Conference

Conference: 40th Annual ACM Symposium on Applied Computing, SAC 2025
Country/Territory: Italy
City: Catania
Period: 31/03/25 → 4/04/25

Keywords

  • deepfake detection
  • multimodal deepfake
  • score-level fusion
