TY - GEN
T1 - DeepDistAL: Deepfake Dataset Distillation Using Active Learning
T2 - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
AU - Rana, Md Shohel
AU - Nur Nobi, Mohammad
AU - Sung, Andrew
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024/7
Y1 - 2024/7
N2 - In the rapidly evolving landscape of artificial intelligence (AI), particularly in the Deepfake domain, large-scale datasets play a pivotal role in ensuring model performance, including accuracy, robustness, and trustworthiness. However, the increasing size and intricacy of these datasets impose a growing demand for computational resources and amplify the cost and duration of model building. Dataset distillation offers a way to mitigate this challenge. For the Deepfake detection problem, noteworthy datasets such as VDFD, FaceForensics++, DFDC, and Celeb-DF underscore the indispensability of extensive data for ensuring model robustness. Nevertheless, the computational requirements associated with these datasets present significant obstacles. This paper describes a dataset distillation method utilizing Active Learning to reduce dataset size while retaining essential data qualities. The proposed method facilitates efficient model training by selecting representative samples that capture the most salient features, thereby enabling effective performance in resource-constrained environments. The study encompasses developing a dataset distillation algorithm tailored for Deepfake detection, rigorous experimentation with a major Deepfake dataset to validate its efficacy, and a comprehensive comparison of model performance when trained on distilled versus original datasets. Through thorough analysis, we demonstrate the practicality and effectiveness of the proposed method in alleviating computational demands without compromising detection accuracy.
AB - In the rapidly evolving landscape of artificial intelligence (AI), particularly in the Deepfake domain, large-scale datasets play a pivotal role in ensuring model performance, including accuracy, robustness, and trustworthiness. However, the increasing size and intricacy of these datasets impose a growing demand for computational resources and amplify the cost and duration of model building. Dataset distillation offers a way to mitigate this challenge. For the Deepfake detection problem, noteworthy datasets such as VDFD, FaceForensics++, DFDC, and Celeb-DF underscore the indispensability of extensive data for ensuring model robustness. Nevertheless, the computational requirements associated with these datasets present significant obstacles. This paper describes a dataset distillation method utilizing Active Learning to reduce dataset size while retaining essential data qualities. The proposed method facilitates efficient model training by selecting representative samples that capture the most salient features, thereby enabling effective performance in resource-constrained environments. The study encompasses developing a dataset distillation algorithm tailored for Deepfake detection, rigorous experimentation with a major Deepfake dataset to validate its efficacy, and a comprehensive comparison of model performance when trained on distilled versus original datasets. Through thorough analysis, we demonstrate the practicality and effectiveness of the proposed method in alleviating computational demands without compromising detection accuracy.
KW - Active Learning
KW - Dataset Distillation
KW - DeepDistAL
KW - Deepfake
KW - VDFD
UR - https://openaccess.thecvf.com/content/CVPR2024W/DDCV/papers/Rana_DEEPDISTAL_Deepfake_Dataset_Distillation_using_Active_Learning_CVPRW_2024_paper.pdf
U2 - 10.1109/CVPRW63382.2024.00768
DO - 10.1109/CVPRW63382.2024.00768
M3 - Conference article
T3 - IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
SP - 7723
EP - 7730
BT - Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
PB - IEEE Computer Society
Y2 - 16 June 2024 through 22 June 2024
ER -