EVERY year a computer worm emerges to stalk the internet, each one seemingly bigger and badder than the last (see diagram). Although they seem to come from nowhere, every new bit of malware has a history. Sussing out the family resemblances could generate a faster response to future threats.
“Our vision is to have a database of the world’s malware, which people can use to share insights,” says Josh Saxe from Invincea labs in Fairfax, Virginia. His firm’s scheme is based on a novel method for classifying malware, the programs hackers use to steal passwords, send spam and carry out other nefarious activities.
Malware is produced at such an astonishing rate that security experts already have automated systems for classifying new strains. But many of these systems rely on analyses of malware code, which hackers can often disguise. The new approach focuses instead on the behaviour of the malware itself.
Saxe and colleagues tested their ideas on just over 100,000 malware samples collected between February 2011 and June 2012. The team ran each piece of malware and logged the communication between the software and the machine it was running on. This communication is made up of “calls”, such as requests to read the contents of a particular file. Individual strains often produce tens of thousands of such calls.
After watching the behaviour of many strains, Saxe and colleagues were able to break the communication data into blocks containing specific sequences of calls that occurred repeatedly across different samples. These blocks are a consequence of malware authors reusing code from older strains. The team used these blocks to classify the strains and to group them into what Saxe calls “malware families”. He presented his work last month at the Neural Information Processing Systems Conference in Lake Tahoe, Nevada.
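The idea of grouping strains by shared blocks of calls can be illustrated with a small sketch. This is not Invincea's algorithm, whose details the article does not give. It assumes, purely for illustration, that a "block" is a fixed-length window of consecutive calls and that two samples belong to the same family if they share at least one block. The call names, sample names and function names are all hypothetical.

```python
from collections import defaultdict


def call_blocks(calls, n=3):
    """Return the set of length-n windows of consecutive calls in one trace."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}


def shared_blocks(traces, n=3, min_samples=2):
    """Map each block to the samples containing it, keeping only blocks
    that appear in at least min_samples different samples."""
    owners = defaultdict(set)
    for name, calls in traces.items():
        for block in call_blocks(calls, n):
            owners[block].add(name)
    return {b: s for b, s in owners.items() if len(s) >= min_samples}


def group_families(traces, n=3):
    """Group samples that share at least one block (assumed family rule)."""
    parent = {name: name for name in traces}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for samples in shared_blocks(traces, n).values():
        first, *rest = sorted(samples)
        for other in rest:
            parent[find(other)] = find(first)

    families = defaultdict(set)
    for name in traces:
        families[find(name)].add(name)
    return sorted(sorted(members) for members in families.values())


# Hypothetical call traces; real traces run to tens of thousands of calls.
TRACES = {
    "sample_a": ["open_file", "read_file", "connect", "send", "close"],
    "sample_b": ["create_key", "open_file", "read_file", "connect", "sleep"],
    "sample_c": ["list_dir", "delete_file", "exit"],
}

print(group_families(TRACES))
```

Here sample_a and sample_b end up in one family because both contain the block open_file, read_file, connect, while sample_c stands alone. A real system would need a more robust notion of similarity than a single shared window.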
Analysts will be able to use this catalogue of malware families to share information about new threats. Such a tool could be extremely useful, because much of the collaboration that takes place in the security community is ad hoc. Groups of researchers sometimes band together to tackle new threats, but detailed analyses of emerging strains tend to happen in parallel and independently.
The Invincea system offers an alternative. Analysts can attach notes to the blocks of call sequences and, if a new strain produces the same block, those notes pop up. This should allow others to get to grips with the strain more rapidly.
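The note-sharing idea amounts to a lookup keyed on call blocks. The sketch below shows one way it might work and does not describe the Invincea system itself. It assumes a block is a fixed-length run of consecutive calls. The class name BlockNotes, its methods and the call names are hypothetical.

```python
from collections import defaultdict


class BlockNotes:
    """Toy store of analyst notes keyed by a block of consecutive calls."""

    def __init__(self, n=3):
        self.n = n  # assumed fixed block length
        self.notes = defaultdict(list)

    def annotate(self, block, note):
        """Attach a note to one block of exactly n calls."""
        block = tuple(block)
        if len(block) != self.n:
            raise ValueError(f"block must contain exactly {self.n} calls")
        self.notes[block].append(note)

    def lookup(self, calls):
        """Return the notes for every annotated block found in a new trace."""
        hits = {}
        for i in range(len(calls) - self.n + 1):
            block = tuple(calls[i:i + self.n])
            if block in self.notes:
                hits[block] = list(self.notes[block])
        return hits


store = BlockNotes(n=3)
store.annotate(
    ["open_file", "read_file", "connect"],
    "Reads a local file, then contacts a remote server",
)

# A new, hypothetical strain that reuses the annotated block.
new_strain = ["create_key", "open_file", "read_file", "connect", "sleep"]
for block, notes in store.lookup(new_strain).items():
    print(block, notes)
```

An analyst examining the new strain would see the earlier note straight away, without repeating the analysis that produced it.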
Analysts should also be able to visualise groups of malware families, which could help determine the ancestors and authors of new strains, says Gunter Ollmann, chief technology officer at IOActive in Seattle. But he cautions that such sharing may be hampered by technical differences between analysts’ set-ups.
The Invincea project is one of several funded under the Cyber Genome Program, run by the US Defense Advanced Research Projects Agency. The results, many of which have not been made public, will be used to secure computer networks run by the Department of Defense.
“We’re trying to allow people to make more intelligent choices about what to analyse to avoid repetition,” says Saxe. “We’re trying to multiply the effectiveness of analysts.”
Source: New Scientist, http://goo.gl/rjk4q