Moderating online hate
Although many countries have laws against hate speech, enforcing them is difficult, particularly online. Online platforms do moderate hate speech, but there is currently no reliable way to detect it across platforms other than highly labour-intensive manual review of content. Efforts to automate hate speech detection exist, but they cannot yet classify hate speech reliably. Moreover, not all platforms moderate user-generated content: Twitter, for example, scaled back its moderation after Elon Musk acquired the platform, with major negative consequences, notably an increase in hateful content.
The offline impact of online hate speech
Offline hate crimes are often preceded by online hate speech, and online extremist narratives have been linked to real-world hate crimes against both individuals and entire communities. In 2019, for example, the extreme-right terrorist Brenton Tarrant attacked two mosques in Christchurch, New Zealand, shooting 51 people in 36 minutes.
What is unique about Tarrant’s attack is that he announced it 15 minutes beforehand on the platform 8chan and live-streamed the atrocity on Facebook, exposing how poorly such platforms are able to moderate content. Although Facebook removed the video within an hour, it was still widely shared: it was re-uploaded more than 2 million times across social media platforms and remained easily accessible for another 24 hours after the attack.
Reducing online hate
The example above shows the relation between online hate speech and offline hate crimes: what started as online hate speech ended in the deadliest terrorist attack in New Zealand’s history. To combat this, more focus should be placed on detecting hate speech, for the sake of online (and potential offline) victims. What is needed is a rapid, continuous approach to combating hate speech.
Monitoring online hate
Tilt’s hate speech detection model offers a valuable service to those monitoring and analysing online hate speech, toxicity and violence across platforms such as Twitter, Reddit, YouTube and Instagram. Rather than making a binary hateful/not-hateful decision, our models rank messages by level of severity. Content is categorised and ranked according to the definition of hate speech given above, enabling users to quickly detect, rank and report hate speech.
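Tilt’s model itself is not public, but a rough sketch can show what severity-ranked (rather than binary) classification looks like in practice. Everything below is illustrative: the `Severity` scale, the thresholds, and the `score_fn` parameter (assumed to return a score between 0 and 1 from some underlying model) are assumptions, not Tilt’s actual implementation.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Iterable, List


class Severity(IntEnum):
    """Severity levels, from harmless to violent (hypothetical scale)."""
    NONE = 0
    TOXIC = 1
    HATEFUL = 2
    VIOLENT = 3


@dataclass
class RankedMessage:
    text: str
    severity: Severity


def classify(text: str, score_fn: Callable[[str], float]) -> RankedMessage:
    """Map a raw model score in [0, 1] onto a severity level.

    The thresholds here are illustrative, not Tilt's actual values.
    """
    score = score_fn(text)
    if score < 0.25:
        severity = Severity.NONE
    elif score < 0.5:
        severity = Severity.TOXIC
    elif score < 0.75:
        severity = Severity.HATEFUL
    else:
        severity = Severity.VIOLENT
    return RankedMessage(text, severity)


def rank_messages(messages: Iterable[str],
                  score_fn: Callable[[str], float]) -> List[RankedMessage]:
    """Classify each message and return them ordered by descending severity."""
    return sorted((classify(m, score_fn) for m in messages),
                  key=lambda r: r.severity, reverse=True)
```

The point of the design is the ordering: a monitoring team triages the most severe content first instead of wading through an undifferentiated stream of flagged messages.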
Tilt helps organisations detect hate speech quickly and easily, saving time on investigations and broadening the scope of their research. Swift, low-effort detection lets organisations efficiently report hateful content and demand that platforms remove it. Detecting hate speech thus strengthens an organisation’s sway and ultimately reduces online hate, as well as its offline impact.