New academic research could pave the way for technology that helps prevent intellectual property theft by tricking hackers into stealing fake documents.
Researchers at Dartmouth College's Department of Computer Science, working under V.S. Subrahmanian, Distinguished Professor in Cybersecurity, Technology, and Society and director of the school's Institute for Security, Technology, and Society, recently devised a new data protection system that mimics a canary trap.
The idea of a canary trap, a method long used by intelligence agencies, is to supply different versions of a sensitive document to different recipients in order to see which version gets leaked.
The system, WE-FORGE, builds on FORGE, the Fake Online Repository Generation Engine, an earlier system Dartmouth students created two years ago. The newer version uses artificial intelligence to create phony documents that the academics claim can protect IP like drug designs or military technology by slowing down and confusing an attacker once he or she is in a system.
The research was published in ACM (Association for Computing Machinery) Transactions on Management Information Systems, a scholarly quarterly journal, last month.
"Malicious actors are stealing intellectual property right now and getting away with it for free," says Subrahmanian this week. "This system raises the cost that thieves incur when stealing government or industry secrets."
As an example, the researchers consider a patent, a document that could include 1,000 different concepts, each with up to 20 possible replacements. Given the combinatorics, WE-FORGE could take that data and create millions of different versions, replacing words and concepts simultaneously through what the researchers call joint optimization.
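For a back-of-the-envelope sense of that scale (the figures below are illustrative, not from the paper), each replaceable concept multiplies the number of possible variants:

```python
# Back-of-the-envelope combinatorics (illustrative, not from the paper):
# with k replaceable concepts and r alternatives each, every variant picks
# one of (r + 1) options per concept (the original or a replacement).
k, r = 5, 20          # just 5 of the ~1,000 concepts, 20 replacements each
variants = (r + 1) ** k
print(f"{variants:,} possible variants from {k} concepts")  # 4,084,101
```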
After computing similarities between concepts in a document, WE-FORGE analyzes how relevant each word is to the document, sorts the concepts into "bins," and computes feasible replacement candidates for each bin, according to ACM's summary of the research.
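The paper's full algorithm isn't detailed in that summary, but a minimal Python sketch of the general idea might look like the following: greedily bin mutually similar concepts, then swap every member of a bin jointly so each fake stays internally consistent. The function names, threshold, and similarity callable here are all illustrative assumptions, not WE-FORGE's actual code.

```python
import random

def bin_concepts(concepts, similarity, threshold=0.7):
    """Greedily group concepts whose pairwise similarity clears a threshold.
    'similarity' is any callable returning a score in [0, 1]; WE-FORGE's
    actual metric is not specified in the article."""
    bins = []
    for concept in concepts:
        for group in bins:
            if all(similarity(concept, other) >= threshold for other in group):
                group.append(concept)
                break
        else:
            bins.append([concept])
    return bins

def forge_variants(text, bins, candidates, n_fakes=5, seed=0):
    """Produce n_fakes decoy documents, replacing all concepts in a bin
    together (a rough stand-in for the paper's 'joint optimization')."""
    rng = random.Random(seed)
    fakes = []
    for _ in range(n_fakes):
        fake = text
        for group in bins:
            replacement = rng.choice(candidates[tuple(group)])
            for concept in group:   # joint swap keeps each fake consistent
                fake = fake.replace(concept, replacement)
        fakes.append(fake)
    return fakes

# Toy usage: both occurrences of the dosage change together, so the
# decoy never contradicts itself.
text = "Administer 50 mg of compound X; repeat the 50 mg dose daily."
bins = [["50 mg"]]
candidates = {("50 mg",): ["25 mg", "75 mg", "100 mg"]}
print(forge_variants(text, bins, candidates, n_fakes=2))
```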
When an attacker is in a system, unless they've established persistence, they usually have limited time to achieve their objective. Dartmouth's system, in theory, trips an attacker up further: if there are countless versions of a similar-looking document, they have less time to determine which of the documents they've exfiltrated is legitimate.
Psychologically speaking, WE-FORGE aims to undermine attackers' confidence as well.
According to Subrahmanian, the AI system forces attackers to waste time and effort trying to determine whether they've accessed and stolen the correct document.
"Even if they do, they may not have confidence that they got it right," the professor said.
As part of their research, the academics generated fake versions of a series of computer science and chemistry patents and asked a panel of knowledgeable subjects to decide which documents were real. They report that the system consistently generated highly believable fakes that deceived the panel.
While WE-FORGE can’t outright prevent or detect data theft, it could make things harder for an attacker moving laterally through a system, especially one with limited time who is unsure whether what they’ve stolen is authentic.