Machine learning algorithms can help police sweep the dark web and crack down on cybercrime. But beyond evidence of illegal activity, they are helping law enforcement find something far more valuable: motive.
The dark web (sometimes also called the deep, invisible or hidden web), with its veil of anonymity, has been associated with crimes such as drug dealing, child pornography and credit card fraud.
This dire reputation is difficult to confirm through research, as criminal sites can only be accessed legally by law enforcement officers conducting an investigation.
But a recent study led by Monash University PhD candidate Janis Dalins – who also happens to be an Australian Federal Police (AFP) officer – has been able to shed light on the actual activity going on in the dark web’s shadowy realms.
Dalins’ main academic supervisor, Dr Campbell Wilson, told create digital the PhD research was not originally focused on law enforcement, but Dalins’ policing background offered a unique opportunity to develop tools to improve community safety online.
“It became clear over time that we could bring Janis’ experience in law enforcement together with research, particularly in machine learning,” Wilson said.
Motivation is important
So far, policing of the dark web has been done the old-fashioned way: agents access and investigate sites manually.
AI tools with machine learning capability have the potential to make the AFP’s job easier. Wilson explained the research team’s aim was to develop a classification system to train algorithms to target illegal material, rather than legitimate uses of anonymous browsing.
In particular, they hoped to draw out not just what kind of material the sites were hosting, but why: for example, whether they were set up for sales, advocacy, file sharing or discussion.
“It’s not just about what was out there, but how it is being used. Motivation is important to law enforcement,” Wilson said.
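To illustrate what labelling a page on two axes (what it hosts and why) might look like, here is a minimal Python sketch. The category names are taken from examples mentioned in this article, not from the study's actual taxonomy, and the class names, the helper function and the page addresses are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Topic(Enum):
    """What a page hosts (illustrative categories only)."""
    FINANCIAL_CRIME = "financial crime"
    NARCOTICS = "narcotics"
    ILLEGAL_PORNOGRAPHY = "illegal pornography"
    LEGITIMATE = "legitimate"
    UNCLEAR = "unclear"


class Motivation(Enum):
    """Why the page exists (illustrative categories only)."""
    SALES = "sales"
    ADVOCACY = "advocacy"
    FILE_SHARING = "file sharing"
    DISCUSSION = "discussion"


@dataclass(frozen=True)
class PageLabel:
    url: str
    topic: Topic
    motivation: Motivation


def motivation_breakdown(labels, topic):
    """Return the share of each motivation among pages with the given topic."""
    subset = [label for label in labels if label.topic is topic]
    if not subset:
        return {}
    counts = Counter(label.motivation for label in subset)
    return {motivation: n / len(subset) for motivation, n in counts.items()}


# Invented placeholder pages, not data from the study.
labels = [
    PageLabel("page-1", Topic.FINANCIAL_CRIME, Motivation.SALES),
    PageLabel("page-2", Topic.FINANCIAL_CRIME, Motivation.SALES),
    PageLabel("page-3", Topic.FINANCIAL_CRIME, Motivation.SALES),
    PageLabel("page-4", Topic.FINANCIAL_CRIME, Motivation.DISCUSSION),
    PageLabel("page-5", Topic.LEGITIMATE, Motivation.ADVOCACY),
]
breakdown = motivation_breakdown(labels, Topic.FINANCIAL_CRIME)
```

Keeping topic and motivation as separate fields means the same set of labels can answer both questions: how much of the material is, say, financial crime, and how much of that is there to sell something.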
Sites on the dark web can only be accessed with specialised software that hides the user’s identity. For his research, Dalins used The Onion Router (Tor), which was developed in 2002 as a tool to protect online privacy.
Tor hides the user’s IP address by wrapping web traffic in a minimum of three layers of encryption. It relays the traffic through a series of three computers selected from a global volunteer network, each of which removes one ‘onion’ layer before sending the data to the next point.
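The layering idea can be sketched in a few lines of Python. This is a toy illustration only: it uses a simple hash-based XOR stream in place of Tor's real cryptography, it ignores how keys are negotiated with each relay, and the function names and relay keys are invented for the example.

```python
import hashlib


def keystream(key, length):
    """Derive a repeatable byte stream from a key (toy construction, not secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(data, key):
    """Add or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message, relay_keys):
    """Sender side: add one layer per relay, so the first relay's layer is outermost."""
    for key in reversed(relay_keys):
        message = xor_layer(message, key)
    return message


def relay_path(onion, relay_keys):
    """Each relay in turn removes its own layer; returns what each one passes on."""
    passed_on = []
    for key in relay_keys:
        onion = xor_layer(onion, key)
        passed_on.append(onion)
    return passed_on


# Hypothetical keys for three relays; a real client agrees a key with each relay.
keys = [b"first-relay-key", b"middle-relay-key", b"last-relay-key"]
onion = wrap(b"GET /index.html", keys)
hops = relay_path(onion, keys)
# With XOR the layers could be removed in any order; the loop simply follows
# the relay order described in the text.
```

Only after the third relay has removed its layer does the original request reappear, which is why no single relay in the middle of the chain sees both who sent the traffic and what it contains.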
With permission from the federal justice minister, the researchers developed a randomised web crawler that used Tor to access more than 232,000 anonymous pages. Dalins manually classified more than 4000 of these pages according to content and use.
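The article does not describe how the crawler was randomised. As one plausible reading, the sketch below picks the next page at random from the set of discovered but unvisited links. It is not the researchers' code: it makes no network or Tor connections, and the `get_links` callback, the function name and the small link graph are hypothetical stand-ins.

```python
import random


def random_crawl(seeds, get_links, max_pages, rng=None):
    """Visit up to max_pages pages, choosing the next one at random each time."""
    rng = rng or random.Random()
    frontier = list(seeds)   # discovered but not yet visited
    seen = set(seeds)        # everything ever added to the frontier
    visited = []
    while frontier and len(visited) < max_pages:
        # Swap a randomly chosen entry to the end so it can be popped cheaply.
        idx = rng.randrange(len(frontier))
        frontier[idx], frontier[-1] = frontier[-1], frontier[idx]
        url = frontier.pop()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# An in-memory link graph standing in for real pages (invented names).
toy_graph = {
    "page-a": ["page-b", "page-c"],
    "page-b": ["page-c", "page-d"],
    "page-c": ["page-a"],
    "page-d": [],
}
visited = random_crawl(["page-a"], lambda url: toy_graph.get(url, []),
                       max_pages=10, rng=random.Random(0))
```

Choosing at random rather than following links in the order they are found is one way to avoid a sample dominated by whichever sites happen to be most heavily linked from the starting point.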
Not just dirty dealing
Press coverage of illicit uses of the dark web has earned it a bad name as a virtual den of thieves – and worse. But while the researchers found content associated with financial dirty dealing, narcotics and illegal pornography, they also uncovered material they described as “downright banal”.
Nearly 40 per cent of the pages analysed were legitimate. Once content with unclear intent is added to that figure, it is likely that more than half of the material analysed would not be of interest to law enforcement officials.
Of the illegal material, the most frequently encountered category was financial crime, at 16 per cent of the pages accessed. Of these sites, 95 per cent offered products and services for sale, including bitcoin laundering sites and exchanges. Stolen credit cards, bank account details and gift cards were also on offer.
Wilson was unable to comment on whether any investigations or arrests have arisen from the findings of the study, but confirmed the researchers have applied for funding for the next stage of research.
The follow-up study will focus on applying the classification system to train machine learning algorithms that could be used by police to weed out crime on the dark web and other online platforms in the future.
“Essentially what we are doing is laying the foundations for the effective use of machine learning in law enforcement,” Wilson said.