Agency's hate speech A.I. destroys harmful Twitter accounts

Possible's 'We Counter Hate' movement pledges a $1 donation for every retweet of hate.

More than half of the Twitter accounts spewing hate speech detected by one ad agency’s A.I. technology have been removed from the public domain.

Possible launched "We Counter Hate" earlier this year after its creators discovered that, in 2017, there were more tweets involving messages of hate than tweets about Game of Thrones, Major League Baseball, the Super Bowl and the Grammys.

The platform helps combat hate speech by turning retweets into donations that go toward organizations dedicated to fighting it, like Life After Hate. Possible teamed up with Spredfast to train A.I. technology to identify hatefluencers and use human moderation to respond to the hate that’s found.

The model has radically outperformed expectations, identifying hate speech with 91 per cent success relative to a human moderator, and Possible is continuing to improve it.


Jason Carmel, global chief data officer at Possible, said: "In addition to reducing the spread of hate speech, we’re just starting to delve into the data on the make-up of hate speech on Twitter.

"Some early analysis, for example, indicates that anti-Semitic hate speech accounts for nearly 40 per cent of volume of hate speech we analyzed, but that anti-LGBTQ hate speech is nearly twice as toxic. We’re looking at a number of different initiatives to track the volume and toxicity of posts against specific groups to learn more about how hate spreads and intensifies in social media."

When #WeCounterHate responds to a hate tweet, it reduces the spread of that hate by between 50 and 60 per cent, claims the agency. In 57 per cent of cases, the countered tweet has been deleted or the account behind it has been suspended or made private.

Once hate speech on Twitter is identified, the tweets are tagged with the reply: "This hate tweet is now being countered. Think twice before retweeting. For every retweet, a donation will be committed to a non-profit fighting for equality, inclusion and diversity. #WeCounterHate https://www.wecounterhate.com/counteredhate."

This permanent marker lets those looking to spread hate know that retweeting will commit a donation to a nonprofit with a mission to rehabilitate individuals who have lived a life of hate.


The movement has gone viral organically. Earlier this month, it caught the attention of Kenny Stills, wide receiver for the Miami Dolphins, who tweeted in support.

The agency adapted Dr. Gregory Stanton’s "10 Stages of Genocide" to create a structure for hate speech identification. The report, originally presented in a briefing to the U.S. Department of State in 1996, helped Possible understand the process of classification and dehumanization.

The team changed it by condensing the stages and removing some that aren’t relevant to Twitter. They also added contemporary phenomena found in social media (like coded language).

"We created this A.I. platform in hopes that innovation could actually yield results for good," said Shawn Herron, experience technology director at Possible.

"Knowing that we’ve eliminated over four million potential impressions of hate speech is exciting, yet what’s most thrilling is the deletion and suspension rates we’re witnessing. Challenging hate speech with love as a way to stop its spread, and seeing it actually work, reminds our team that taking risks is necessary for change to happen."
