AdFlush: A Real-World Deployable Machine Learning Solution for Effective Advertisement and Web Tracker Prevention

Kiho Lee, Chaejin Lim, Beomjin Jin, Taeyoung Kim, Hyoungshick Kim

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

3 Scopus citations

Abstract

Conventional ad blocking and tracking prevention tools often fall short in addressing web content manipulation. Machine learning approaches have been proposed to enhance detection accuracy, yet aspects of practical deployment have frequently been overlooked. This paper introduces AdFlush, a novel machine learning model for real-world browsers. To develop AdFlush, we evaluated the effectiveness of 883 features, ultimately selecting 27 key features for optimal performance. We tested AdFlush on a dataset of 10,000 real-world websites, achieving an F1 score of 0.98, thereby outperforming AdGraph (F1 score: 0.93), WebGraph (F1 score: 0.90), and WTAgraph (F1 score: 0.84). Additionally, AdFlush significantly reduces computational overhead, requiring 56% less CPU and 80% less memory than AdGraph. We also assessed AdFlush's robustness against adversarial manipulations, demonstrating superior resilience with F1 scores ranging from 0.89 to 0.98, surpassing the performance of AdGraph and WebGraph, which recorded F1 scores between 0.81 and 0.87. A six-month longitudinal study confirmed that AdFlush maintains a high F1 score above 0.97 without the need for retraining, underscoring its effectiveness.

Original languageEnglish
Title of host publicationWWW 2024 - Proceedings of the ACM Web Conference
PublisherAssociation for Computing Machinery, Inc
Pages1902-1913
Number of pages12
ISBN (Electronic)9798400701719
DOIs
StatePublished - 13 May 2024
Event33rd ACM Web Conference, WWW 2024 - Singapore, Singapore
Duration: 13 May 202417 May 2024

Publication series

NameWWW 2024 - Proceedings of the ACM Web Conference

Conference

Conference33rd ACM Web Conference, WWW 2024
Country/TerritorySingapore
CitySingapore
Period13/05/2417/05/24

Keywords

  • ad blocking
  • machine learning
  • web security
  • web tracking

Cite this