ConTheModel: Can We Modify Tweets to Confuse Classifier Models?

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

News on social media can significantly influence users and can be used to manipulate them for political or economic reasons. Adversarial manipulations of text have been shown to expose vulnerabilities in classifiers, and current research aims to find classifier models that are not susceptible to such manipulations. In this paper, we present a novel technique called ConTheModel, which slightly modifies social media news to confuse machine learning (ML)-based classifiers in a black-box setting. ConTheModel replaces a word in the original tweet with its synonym or antonym to generate tweets that confuse classifiers. We evaluate our technique on three different scenarios of the dataset and compare five well-known machine learning algorithms, namely Support Vector Machine (SVM), Naive Bayes (NB), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Multilayer Perceptron (MLP), to demonstrate how the classifiers perform on the modifications made by ConTheModel. Our results show that the classifiers are confused after modification, with a maximum drop of 16.36%. We additionally conducted a human study with 25 participants to validate the effectiveness of ConTheModel and found that the majority of participants (65%) found it challenging to classify the tweets correctly. We hope our work will help in finding ML models that are robust against adversarial examples.
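The single-word substitution idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy lexicon, the function names `single_word_variants` and `confusing_variants`, and the `predict` callback are all assumptions made for illustration, since the abstract does not say where synonyms and antonyms come from or how candidates are selected. The only access to the classifier is through its predicted label, which matches the black-box setting.

```python
from typing import Callable, Dict, List

# Hypothetical toy lexicon mapping a word to synonym/antonym substitutes.
# The source of substitutes used in the paper is not stated in the abstract.
LEXICON: Dict[str, List[str]] = {
    "good": ["great", "bad"],
    "win": ["triumph", "lose"],
}


def single_word_variants(tweet: str,
                         lexicon: Dict[str, List[str]] = LEXICON) -> List[str]:
    """Return every variant of the tweet that differs in exactly one word."""
    tokens = tweet.split()
    variants = []
    for i, token in enumerate(tokens):
        # Look up substitutes case-insensitively; unknown words yield none.
        for substitute in lexicon.get(token.lower(), []):
            variants.append(" ".join(tokens[:i] + [substitute] + tokens[i + 1:]))
    return variants


def confusing_variants(tweet: str,
                       predict: Callable[[str], str],
                       lexicon: Dict[str, List[str]] = LEXICON) -> List[str]:
    """Keep only variants whose predicted label differs from the original's.

    `predict` is treated as a black box: only its output label is used.
    """
    original_label = predict(tweet)
    return [v for v in single_word_variants(tweet, lexicon)
            if predict(v) != original_label]
```

With a stand-in classifier such as `lambda t: "neg" if "bad" in t.split() else "pos"`, calling `confusing_variants("good news today", ...)` keeps only the variant that flips the label. A real evaluation would replace the stand-in with a trained model such as the SVM, NB, RF, XGBoost, or MLP classifiers named in the abstract.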

Original language: English
Title of host publication: Silicon Valley Cybersecurity Conference - First Conference, SVCC 2020, Revised Selected Papers
Editors: Younghee Park, Divyesh Jadav, Thomas Austin
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 205-219
Number of pages: 15
ISBN (Print): 9783030727246
State: Published - 2021
Externally published: Yes
Event: 1st Silicon Valley Cybersecurity Conference, SVCC 2020 - San Jose, United States
Duration: 17 Dec 2020 → 19 Dec 2020

Publication series

Name: Communications in Computer and Information Science
Volume: 1383 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 1st Silicon Valley Cybersecurity Conference, SVCC 2020
Country/Territory: United States
City: San Jose
Period: 17/12/20 → 19/12/20

Keywords

  • Adversarial examples
  • Machine learning
  • Social media
  • Tweets
