Walowdac – Analysis of a Peer-to-Peer Botnet

Sunday, January 3. 2010
One of the most interesting botnets of 2009 was Waledac: the botnet implements a peer-to-peer-based communication channel and it can be seen as the successor of Storm Worm, since it implemented many similar ideas (e.g., a very similar language for spam templates was used). The researchers from Trend Micro had published an analysis of the botnet and we also examined the botnet. The result is a paper entitled "Walowdac - Analysis of a Peer-to-Peer Botnet": instead of passively observing the network, we implemented an active infiltration component. We emulate the protocol of a bot and are able to observe the inner communication aspects of the network. As a result, we obtain an in-depth overview of the botnet that enables us to study different aspects of the network, e.g., efficiency of the spam campaigns or number of active bots. As a small peak of the results, the following pictures shows the number of active bots in different countries on a specific day in August 2009. We can for example observe diurnal patterns and clearly see the effects of timezones on the size of the botnet:


Abstract:
A botnet is a network of compromised machines under the control of an attacker. Botnets are the driving force behind several misuses on the Internet, for example spam mails or automated identity theft. In this paper, we study the most prevalent peer-to-peer botnet in 2009: Waledac. We present our infiltration of the Waledac botnet, which can be seen as the successor of the Storm Worm botnet. To achieve this we implemented a clone of the Waledac bot named Walowdac. It implements the communication features of Waledac but does not cause any harm, i.e., no spam emails are sent and no other commands are executed. With the help of this tool we observed a minimum daily population of 55,000 Waledac bots and a total of roughly 390,000 infected machines throughout the world. Furthermore, we gathered internal information about the success rates of spam campaigns and newly introduced features like the theft of credentials from victim machines.

The paper was joint work with Ben Stock, Jan Göbel, Markus Engelberth, and Felix C. Freiling. The full paper is available at http://honeyblog.org/junkyard/paper/waledac-ec2nd09.pdf and it was published at EC2ND 2009.

Automatic Analysis of Malware Behavior using Machine Learning

Monday, December 28. 2009
CWSandbox
In the last couple of years, several honeypot solutions to automatically "collect" malware samples were developed. With these tools, it is possible to obtain copies of malware samples without any human interaction. As a result, we are able to collect quite a few malware samples per day, which then also need to be analyzed. Thus, several sandbox solutions were developed that automate the analysis step by performing dynamic, behavior-based analysis. The result of the dynamic analysis is typically a report that summarizes the observed behavior. The next logical step is to use that information to perform malware classification and malware clustering: at the end of that process, we can then obtain information about which samples perform basically the same kind of activity. We can then automatically find variants of well-known threats, identify new malware families, and reduce the manual effort needed to analyze the large number of incoming malware samples.

In the last couple of months, we worked on malware classification and malware clustering. The results are summarized in a technical report. In the article, we introduce a learning-based framework for automatic analysis of malware behavior. To apply this framework in practice, it suffices to collect a large number of malware samples and monitor their behavior using a sandbox environment. By embedding the observed behavior in a vector space, reflecting behavioral patterns in its dimensions, we are able to apply learning algorithms, such as clustering and classification, for analysis of malware behavior. Both techniques are important for an automated processing of malware samples and we show in several experiments that our techniques significantly improve previous work in this area. For example, the concept of prototypes allows for efficient clustering and classification, while also enabling a security researcher to focus manual analysis on prototypes instead of all malware samples. Moreover, we introduce a technique to perform behavior-based analysis in an incremental way that avoids run-time and memory overhead inherent to previous approaches.

Abstract
Malicious software — so called malware — poses a major threat to the security of computer systems. The amount and diversity of its variants render classic security defenses ineffective, such that millions of hosts in the Internet are infected with malware in form of computer viruses, Internet worms and Trojan horses. While obfuscation and polymorphism employed by malware largely impede detection at file level, the dynamic analysis of malware binaries during run-time provides an instrument for characterizing and defending against the threat of malicious software.
In this article, we propose a framework for automatic analysis of malware behavior using machine learning. The framework allows for automatically identifying novel classes of malware with similar behavior (clustering) and assigning unknown malware to these discovered classes (classification). Based on both, clustering and classification, we propose an incremental approach for behavior-based analysis, capable to process the behavior of thousands of malware binaries on a daily basis. The incremental analysis significantly reduces the run-time overhead of current analysis methods, while providing an accurate discovery and discrimination of novel malware variants.

The full technical report is available at http://honeyblog.org/junkyard/paper/malheur-TR-2009.pd. It was joint work with Konrad Rieck, Philipp Trinius, and Carsten Willems. And the word cloud was generated using http://www.wordle.net/.

AV Tracker

Thursday, October 22. 2009
CWSandbox
A couple of days ago, the website "AV Tracker" went online, which publishes information about various automated analysis systems. The idea is that the attacker uploads a binary to an analysis system, waits for the sample to be executed, and then the binary phones home some information to a server under the control of the attacker. The collected information is then published at "AV Tracker", exposing information about the analysis systems. Besides some well-known AV companies, also CWSandbox and Anubis were affected.

We analyzed the binary and found that it sends a simply HTTP request, in which all extracted information is encoded. An example for an analysis report generated by one of the samples is http://anubis.iseclab.org/?action=result&task_id=361b5a8ee7235954252b02d33b3a7d24. This can be defeated by blocking access to the reporting server or by regularly changing the IP address of the analysis systems, but at the end this will be some kind of arms race again.

Some other interesting information is also embedded in the binary. When extracting the strings from the sample, the following text becomes visible (some information is hidden by dots):
This is Peter Kl....... fuck ...... fuck the world fuck you all!
I was once working with ...... and was a white hat, now I am the worst mean motherfucker black hat and I am selling the source code of ...... .. :D
I am with the SinowalWhistler developers, funny days, aren't ;) and fuck ..... they don't have no idea :D bitches

A related article was also published today at http://www.viruslist.com/en/weblog under the title "A black hat loses control".

$645.00 ...

Thursday, September 10. 2009
... is the amount I am worth in the underground economy, at least according to Symantec's new website on which they advertise (in a somewhat entertaining way) Norton 2010 products. Here are the results when I take the risk assessment:
[...] In the underground economy, you're really worth about $645.00. And that's on a good day.
Your entire digital life could go on the auction block for as little as $10.96, whether you like it or not.

How they compute these numbers and on what methodology / measurements this is based remains completely unclear, after all it is just some kind of marketing. But the movies are funny, perhaps they can serve as some kind of security awareness campaign. Main drawback is that the website is almost completely built on top of Flash and JavaScript - how about not using all these techniques next time? In some recent measurements we found that the vast majority of web surfers still have an unpatched version of Flash installed, better teach them to regularly update their system next time...

Thread Graphs for Visualizing Malware Behavior

Tuesday, August 25. 2009
CWSandbox
The last blog post dealt with our recent research on visualizing malware behavior. Now a quick update on the thread graphs we generate for visualizing malware behavior: since tree maps display nothing about the sequence of operations, we use another presentation format to visualize the temporal behavior of the individual threads of a sample. A thread graph can be regarded as a behavioral fingerprint of the sample that represents the temporal order of executed system commands and the different threads spawned by a binary. The x-axis represents the time (sequence of performed actions), while the y-axis indicates the operation/section of the performed action. An analyst can then study this behavior graph to quickly learn more about the actions of each individual thread.

The following two pictures show examples of this kind of visualization:


On the left hand picture, we can see that one thread is responsible for the majority of operations for the sample. This thread performs many registry operations and initially performs many network- and system-related operations (operations 90-140). Additionally, two more threads are spawned, but they perform only a limited amount of operations during the analysis phase. The thread graph for the malware sample on the right side is completely different and an analyst can get a quick overview of what actions a given samples performs.