Observing Malware Outbreaks with Honeypots

Saturday, July 26. 2008
Low-interaction honeypots like Nepenthes or Amun are good at capturing autonomously spreading malware that propagates by exploiting vulnerabilities in network services: by emulating specific vulnerabilities, these honeypots trick malware into exploiting the honeypot and we can capture a copy of the malware.
These honeypots also allow us to observe outbreaks of new malware samples: since quite a few people run Nepenthes or Amun nowadays and also send the samples to cwsandbox.org for automated malware analysis, we can correlate the submissions of many different sensors at a central location. For example, we received the malware sample with MD5 sum cb032b12af742555e60124f6d7d2d2ea from a total of 57 different sensors at the timestamps depicted below:

Timestamp Filename
2008-01-10 19:36:25 grospolinacb032b12af742555e60124f6d7d2d2eauLa1AA
2008-01-10 22:11:47 nepenthescb032b12af742555e60124f6d7d2d2easBj96A
2008-01-11 00:03:32 nepenthescb032b12af742555e60124f6d7d2d2easm4aaA
2008-01-11 00:18:58 nepenthescb032b12af742555e60124f6d7d2d2eaA
2008-01-11 00:22:22 nepenthescb032b12af742555e60124f6d7d2d2eayK4gcQ
2008-01-11 00:22:56 nepenthescb032b12af742555e60124f6d7d2d2eadOoZcA
2008-01-11 00:34:36 nepenthescb032b12af742555e60124f6d7d2d2eaf92wA
2008-01-11 00:44:56 nepenthescb032b12af742555e60124f6d7d2d2eaBmLfOg
2008-01-11 00:45:09 nepenthescb032b12af742555e60124f6d7d2d2eagv4WoQ
2008-01-11 00:53:59 nepenthescb032b12af742555e60124f6d7d2d2eaOewZcA
2008-01-11 01:11:01 nepenthescb032b12af742555e60124f6d7d2d2eaQANtUA
2008-01-11 01:56:59 nepenthescb032b12af742555e60124f6d7d2d2eaeEtIA
2008-01-11 04:48:11 nepenthescb032b12af742555e60124f6d7d2d2eaYO0fA
2008-01-11 05:32:44 nepenthescb032b12af742555e60124f6d7d2d2eadOoZcA
2008-01-11 06:35:31 nepenthescb032b12af742555e60124f6d7d2d2eaf0fA
2008-01-11 08:21:13 nepenthescb032b12af742555e60124f6d7d2d2eaze0fA
2008-01-11 08:49:09 nepenthescb032b12af742555e60124f6d7d2d2eaSu4fA
2008-01-11 09:25:49 nepenthescb032b12af742555e60124f6d7d2d2eaanj2kA
2008-01-11 09:41:40 nepenthescb032b12af742555e60124f6d7d2d2eaJ8ZcA
2008-01-11 12:00:10 cb032b12af742555e60124f6d7d2d2ea
2008-01-11 13:42:14 nepenthescb032b12af742555e60124f6d7d2d2ea1E4a6A
2008-01-11 14:15:43 nepenthescb032b12af742555e60124f6d7d2d2eaSHkgA
2008-01-11 14:37:06 grospolinacb032b12af742555e60124f6d7d2d2eamKgfA
2008-01-11 14:38:37 nepenthescb032b12af742555e60124f6d7d2d2eabGhXGQ
2008-01-11 18:30:29 nepenthescb032b12af742555e60124f6d7d2d2eaMPofKg
2008-01-11 18:39:25 nepenthescb032b12af742555e60124f6d7d2d2eaGSGoWQ
2008-01-11 20:33:26 nepenthescb032b12af742555e60124f6d7d2d2eab0fA
2008-01-12 04:19:46 nepenthescb032b12af742555e60124f6d7d2d2eauJQiA
2008-01-12 12:12:12 nepenthescb032b12af742555e60124f6d7d2d2eaGDoqMQ
2008-01-12 14:32:15 nepenthescb032b12af742555e60124f6d7d2d2eaSIUgA
2008-01-13 20:37:45 nepenthescb032b12af742555e60124f6d7d2d2eaYO0fA
2008-01-14 17:38:54 nepenthescb032b12af742555e60124f6d7d2d2eaQ8fA
2008-01-14 22:26:54 grospolinacb032b12af742555e60124f6d7d2d2ea2rqiGw
2008-01-15 06:27:12 nepenthescb032b12af742555e60124f6d7d2d2eaM0sA
2008-01-15 09:32:40 nepenthescb032b12af742555e60124f6d7d2d2eaM0sA
2008-01-18 10:20:58 nepenthescb032b12af742555e60124f6d7d2d2eaKEuA
2008-01-19 02:10:38 nepenthescb032b12af742555e60124f6d7d2d2eagfofkA
2008-01-20 05:37:39 nepenthescb032b12af742555e60124f6d7d2d2eaxeoZcA
2008-01-25 09:43:36 nepenthescb032b12af742555e60124f6d7d2d2eaLvAfA
2008-01-29 15:36:08 nepenthescb032b12af742555e60124f6d7d2d2eaBxofsA
2008-01-29 20:47:39 nepenthescb032b12af742555e60124f6d7d2d2eaJ00A
2008-02-01 18:48:12 nepenthescb032b12af742555e60124f6d7d2d2eaEcoA
2008-02-02 12:24:22 nepenthescb032b12af742555e60124f6d7d2d2eawcUgLg
2008-02-02 19:35:56 cb032b12af742555e60124f6d7d2d2ea
2008-02-07 13:59:24 cb032b12af742555e60124f6d7d2d2ea.dat
2008-02-08 15:48:30 nepenthescb032b12af742555e60124f6d7d2d2eaGfoWA
2008-02-14 14:14:03 cb032b12af742555e60124f6d7d2d2eacb032b12af742555...2ea
2008-02-21 14:20:01 nepenthescb032b12af742555e60124f6d7d2d2eaWN0fA
2008-02-28 16:56:53 nepenthescb032b12af742555e60124f6d7d2d2eaoexA
2008-03-03 15:15:39 nepenthescb032b12af742555e60124f6d7d2d2eaA
2008-03-11 02:56:00 nepenthescb032b12af742555e60124f6d7d2d2eaAfA
2008-03-14 11:11:51 nepenthescb032b12af742555e60124f6d7d2d2eaJgfA
2008-03-15 17:31:37 nepenthescb032b12af742555e60124f6d7d2d2eaGGYnA
2008-03-20 10:55:43 nepenthescb032b12af742555e60124f6d7d2d2eacb032b1...2ea
2008-03-20 17:05:07 nepenthescb032b12af742555e60124f6d7d2d2eaoflA
2008-03-31 12:12:02 nepenthescb032b12af742555e60124f6d7d2d2eaYO0fA
2008-04-07 07:06:12 nepenthescb032b12af742555e60124f6d7d2d2eaxMUg3A
2008-04-08 02:37:22 cb032b12af742555e60124f6d7d2d2ea

Each timestamp marks the first point in time at which the specific sensor captured a copy of the malware. As you can see, the malware outbreak presumably started on January 10, 2008. From then on, honeypot sensors all around the world captured a copy of this specific bot. The CWSandbox report contains more detailed information about the botnet, e.g.:
  • The bot creates a file named C:\WINDOWS\system32\explorer.exe, which is a copy of itself

  • It creates a run key for the Windows registry such that the bot is started again after a reboot

  • The C&C server is located at the IP address 67.43.232.36 and listens on the TCP port 8080

  • C&C channel is #wawa and the command issued by the botmaster at the time of analysis is: ipscan s.s.s dcom2 -f -s
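
The correlation of first-seen timestamps shown in the table above can be sketched in a few lines of Python. This is only an illustration: it assumes the submissions are available as plain "timestamp filename" lines like the ones listed above, and the function names are made up for this example (they are not part of CWSandbox, Nepenthes, or Amun).

```python
from collections import Counter
from datetime import datetime


def parse_submissions(lines):
    """Parse 'YYYY-MM-DD HH:MM:SS filename' lines into (datetime, filename) tuples."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank lines and the 'Timestamp Filename' header
        stamp = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
        rows.append((stamp, parts[2]))
    return rows


def outbreak_start(rows):
    """Earliest first-seen timestamp across all sensors."""
    return min(stamp for stamp, _ in rows)


def first_seen_per_day(rows):
    """Count how many sensors captured the sample for the first time on each day."""
    return Counter(stamp.date().isoformat() for stamp, _ in rows)


# Three rows copied from the table above
sample = [
    "Timestamp Filename",
    "2008-01-10 19:36:25 grospolinacb032b12af742555e60124f6d7d2d2eauLa1AA",
    "2008-01-10 22:11:47 nepenthescb032b12af742555e60124f6d7d2d2easBj96A",
    "2008-01-11 00:03:32 nepenthescb032b12af742555e60124f6d7d2d2easm4aaA",
]
rows = parse_submissions(sample)
print(outbreak_start(rows))
print(first_seen_per_day(rows))
```

A sharp rise in the per-day counts right after the earliest timestamp is what makes an outbreak visible in this kind of data.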

Fast-Flux Data

Wednesday, July 16. 2008
Back in February, we published a paper on fast-flux service networks at NDSS'08. The basic idea behind fast-flux networks is a fast change in the mapping between a domain name and the corresponding IP addresses. The attackers use this mechanism to build a proxy-network on top of compromised machines to maintain a robust hosting infrastructure for their services. For more information on this topic, see the paper by the Honeynet Project or our NDSS paper.
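
To make the "fast change in the mapping" a bit more concrete, here is a minimal sketch that summarizes how much the set of A records for a domain changes across repeated lookups. Note that this is not the detection metric from our NDSS paper; the function name and the sample data (addresses from the documentation ranges) are just assumptions for illustration.

```python
def flux_summary(snapshots):
    """Summarize the change in a domain's A-record set across lookups.

    snapshots: list of sets of IP address strings, in chronological order.
    """
    all_ips = set()
    for ips in snapshots:
        all_ips |= ips
    # Count consecutive lookups whose answer set differs from the previous one
    changed = sum(1 for prev, cur in zip(snapshots, snapshots[1:]) if prev != cur)
    return {
        "lookups": len(snapshots),
        "distinct_ips": len(all_ips),
        "changed_lookups": changed,
    }


# Hypothetical data: a stable domain vs. a domain whose answers rotate
stable = [{"192.0.2.10"}, {"192.0.2.10"}, {"192.0.2.10"}]
fluxy = [
    {"198.51.100.1", "203.0.113.7"},
    {"198.51.100.9", "203.0.113.7"},
    {"192.0.2.44", "198.51.100.9"},
]
print(flux_summary(stable))
print(flux_summary(fluxy))
```

A benign site served from a CDN can also return changing addresses, so a high count on its own is only a hint and not proof of fast-flux.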

To foster research in this area, the data collected during our study is available for research purposes. Up to now, quite a few people mailed me and asked for the data. To make this process a bit more scalable and also minimize the amount of work needed on my side, we decided to simply publish all the data so that everyone can download the raw data and use it for whatever purpose. Today, I uploaded a tarball which contains a summary of the fast-flux data collected over a period of several weeks. The tarball contains a potpourri of different measurements and has a total size of 7.3 MB. It contains about 55K raw dig lookup files and has an unpacked size of about 220 MB. The archive contains the following data:
  • storm-qavoter.com.log: dig lookups for a domain used by the Storm Worm botnet, which uses fast-flux techniques

  • asprox-damnec-hydra.log: dig lookups for Asprox/Damnec botnet which also uses fast-flux techniques

  • lookups-ff: dig lookups for fast-flux domains, confirmed manually

  • lookups-spam: dig lookups for various domains found in spam e-mails

  • lookups-benign: dig lookups for (probably) benign domains, most of them collected via dmoz or Alexa

  • lookups-ndss: part of the domains used for the NDSS paper

  • lookups-ndss-ff: suspected fast-flux domains from NDSS paper

So if you are interested in this area and want to learn more about it, just download the archive (7.3 MB) and play with the files :)
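
As a starting point for playing with the files, the following sketch extracts A records and their TTLs from dig output. It assumes the files contain the usual dig answer lines of the form "name TTL IN A address"; I have not encoded anything specific to the archive layout here, and the function name and sample text are only illustrative.

```python
def parse_dig_a_records(text):
    """Extract (name, ttl, ip) tuples for A records from dig-style output."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines, comments, and the question section
        fields = line.split()
        if (len(fields) == 5 and fields[1].isdigit()
                and fields[2] == "IN" and fields[3] == "A"):
            records.append((fields[0], int(fields[1]), fields[4]))
    return records


# Hypothetical lookup result using documentation addresses
sample_output = """
;; QUESTION SECTION:
;example.com.            IN      A

;; ANSWER SECTION:
example.com.     180     IN      A       192.0.2.10
example.com.     180     IN      A       198.51.100.23
example.com.     180     IN      NS      ns1.example.com.
"""
for name, ttl, ip in parse_dig_a_records(sample_output):
    print(name, ttl, ip)
```

Collecting the addresses per lookup file and comparing them over time, together with the TTL values, gives a first impression of how quickly a domain rotates its hosts.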

Survival of the Fittest

Monday, July 14. 2008
The Internet Storm Center blogged about the Survival Time on the Internet today. The survival time is defined as:
The survivaltime is calculated as the average time between reports for an average target IP address. If you are assuming that most of these reports are generated by worms that attempt to propagate, an unpatched system would be infected by such a probe.
The average time between probes will vary widely from network to network. Some of our submitters subscribe to ISPs which block ports commonly used by worms. As a result, these submitters report a much longer 'survival time'. On the other hand, University Networks and users of high speed internet services are frequently targeted with additional scans from malware like bots. If you are connected to such a network, your 'survival time' will be much smaller.
The main issue here is of course that the time to download critical patches will exceed this survival time.

With the help of honeypots, we can measure the survival time. For example, we can use low-interaction honeypots such as Nepenthes or Amun that emulate common network-based vulnerabilities and deploy them at different locations. The average time it takes to download the first binary is an estimate of the survival time: the honeypots emulate known vulnerabilities and are thus exploited by different kinds of autonomously spreading malware, similar to an unpatched system. At our lab, we deployed ten honeypots in different network ranges and measured different things, as I'll explain with the following graphs. These are all based on measurements between August 2007 and July 2008.


This plot shows the total number of attacks (blue) and of downloads (red) per sensor for the measurement period. We see that there are huge differences depending on the network location (e.g., whether or not the ISP filters specific ports). Furthermore, not all attacks are successful: we also observed quite a lot of failed attacks.


This plot shows the percentage of attacks (red) and downloads (blue) per time of day. We can observe a clear diurnal pattern: lower attack volume during the night and higher attack volume during the day, following the typical behavior of humans.


This plot shows the attacks (blue) and the downloads (red) per weekday for all sensors during the measurement period. The values are given in percentage of the sum of all attacks/downloads over the chosen period of time. The attack traffic is slightly higher during the weekends.
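
The percentages per time of day and per weekday used in the two plots above can be computed along these lines. This is a minimal sketch which assumes that each attack (or download) is available as a Python datetime; the helper name and the sample events are made up.

```python
from collections import Counter
from datetime import datetime


def percentage_by(events, key):
    """Return {bucket: percentage of all events} for the given bucketing function."""
    counts = Counter(key(ts) for ts in events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bucket: 100.0 * n / total for bucket, n in sorted(counts.items())}


# Hypothetical attack timestamps
events = [
    datetime(2008, 1, 12, 3, 0),
    datetime(2008, 1, 12, 3, 30),
    datetime(2008, 1, 14, 15, 0),
    datetime(2008, 1, 14, 16, 0),
]
by_hour = percentage_by(events, lambda ts: ts.hour)
by_weekday = percentage_by(events, lambda ts: ts.weekday())  # 0 = Monday
print(by_hour)
print(by_weekday)
```

Keep in mind that the hour of day depends on the time zone of the timestamps, which matters when sensors and attackers are spread around the world.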


Another interesting observation is whether or not the attacks originate from the same ASN as the honeypot as depicted in the above picture. The figure shows the percentage of attacks coming from the same ISP as the honeypot, e.g., for sensor 1, about 90% of the attacks originate from machines within the same autonomous system. The graph can be interpreted as many attacks being local - which makes sense since autonomous spreading malware often prefers to propagate locally. In some ASNs, however, it seems like most attacks originate from other ASNs.
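
The share of local attacks per sensor can be sketched as follows. The example assumes that the source address of each attack has already been mapped to an AS number by some other means (that lookup is not shown); the function name, sensor names, and the private-use AS numbers are placeholders.

```python
from collections import defaultdict


def same_asn_share(attacks, sensor_asn):
    """Percentage of attacks per sensor that come from the sensor's own AS.

    attacks: iterable of (sensor_id, source_asn) tuples
    sensor_asn: dict mapping sensor_id to the AS number of the sensor
    """
    total = defaultdict(int)
    local = defaultdict(int)
    for sensor_id, source_asn in attacks:
        total[sensor_id] += 1
        if source_asn == sensor_asn[sensor_id]:
            local[sensor_id] += 1
    return {s: 100.0 * local[s] / total[s] for s in total}


# Hypothetical data with private-use AS numbers
sensor_asn = {"sensor1": 64512, "sensor2": 64513}
attacks = [
    ("sensor1", 64512), ("sensor1", 64512), ("sensor1", 64512), ("sensor1", 64600),
    ("sensor2", 64700), ("sensor2", 64513),
]
print(same_asn_share(attacks, sensor_asn))
```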


Finally, this graph shows an estimate of the survival time, i.e., the average amount of time until the honeypot is attacked successfully. Red bars are honeypots with a static IP address, thus we have only one measurement point for these honeypots. For the blue bars, each honeypot had a dynamic IP address, e.g., a disconnect every 24 hours. The bar depicts the average time from obtaining a new DHCP lease to the first download, which can be interpreted as the time it would take for an unpatched system to be compromised. Compared to the survival time from the Internet Storm Center, which is currently below five minutes, we measure a higher survival time. However, the time is still short and you need to patch a system before taking it online.
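
For the honeypots with dynamic IP addresses, the estimate can be sketched like this: for every new lease, take the time until the first download that happens before the next lease starts, and average these values. This is only a sketch under the assumption that lease and download times are available as datetime lists; the function names are made up and this is not the code used for the measurements.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def survival_times(lease_starts, download_times):
    """Time from each lease start to the first download within that lease."""
    leases = sorted(lease_starts)
    downloads = sorted(download_times)
    result = []
    for i, start in enumerate(leases):
        end = leases[i + 1] if i + 1 < len(leases) else None
        j = bisect_left(downloads, start)  # first download at or after the lease start
        if j < len(downloads) and (end is None or downloads[j] < end):
            result.append(downloads[j] - start)
    return result


def mean_survival(times):
    """Average of a list of timedeltas, or None if the list is empty."""
    return sum(times, timedelta()) / len(times) if times else None


# Hypothetical leases (one per day) and download events
leases = [datetime(2008, 1, 1), datetime(2008, 1, 2), datetime(2008, 1, 3)]
downloads = [
    datetime(2008, 1, 1, 0, 20),
    datetime(2008, 1, 1, 5, 0),
    datetime(2008, 1, 2, 0, 40),
]
times = survival_times(leases, downloads)
print(mean_survival(times))
```

Leases without any download are simply skipped here, which biases the average towards shorter times; depending on the question, one might want to count them with the full lease duration instead.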

More information and many more graphs are available in the thesis by Laura Itzel (unfortunately in German only).

Update: I updated the description of the fourth figure to explain it a bit better for non-German speaking readers.

Sicherheit'08: "Monkey-Spider: Detecting Malicious Websites with Low-Interaction Honeyclients"

Sunday, July 6. 2008
Back in April, our paper on low-interaction, client-side honeypots entitled "Monkey-Spider: Detecting Malicious Websites with Low-Interaction Honeyclients" was published at Sicherheit'08, the main security conference for the German speaking community. The paper presents a client-side honeypot that can be used to detect malicious web sites. The basic idea is to use the crawler Heritrix to download content efficiently and then analyze the downloaded content with different means, e.g., AV scanners, CWSandbox, or other tools. To our surprise, the paper won the best paper award of the conference :-)

Abstract:
Client-side attacks are on the rise: malicious websites that exploit vulnerabilities in the visitor’s browser are posing a serious threat to client security, compromising innocent users who visit these sites without having a patched web browser. Currently, there is neither a freely available comprehensive database of threats on the Web nor sufficient freely available tools to build such a database. In this work, we introduce the Monkey-Spider project. Utilizing it as a client honeypot, we portray the challenge in such an approach and evaluate our system as a high-speed, Internet-scale analysis tool to build a database of threats found in the wild. Furthermore, we evaluate the system by analyzing different crawls performed during a period of three months and present the lessons learned.

The full paper is now also available for download and the software is published at SourceForge: http://monkeyspider.sourceforge.net/. The software is released under the terms of GPLv3 and the maintainer is Ali Ikinci (ali at ikinci dot info).

WEIS'08: "Studying Malicious Websites and the Underground Economy on the Chinese Web"

Friday, July 4. 2008
The 7th Workshop on the Economics of Information Security (WEIS'08) took place last week at Dartmouth College's Tuck School of Business. Several interesting papers like "Security Economics and European Policy", "Do Data Breach Disclosure Laws Reduce Identity Theft?", or "The Impact of Incentives on Notice and Take-down" were presented during the workshop. Our paper entitled "Studying Malicious Websites and the Underground Economy on the Chinese Web" deals with several aspects of the underground economy within China's part of the World Wide Web. Amongst other techniques, we use client-side honeypots to study malicious websites.

Abstract:
The World Wide Web gains more and more popularity within China with more than 1.31 million websites on the Chinese Web in June 2007. Driven by the economic profits, cyber criminals are on the rise and use the Web to exploit innocent users. In fact, a real underground black market with thousand of participants has developed which brings together malicious users who trade exploits, malware, virtual assets, stolen credentials, and more. In this paper, we provide a detailed overview of this underground black market and present a model to describe the market. We substantiate our model with the help of measurement results within the Chinese Web. First, we show that the amount of virtual assets traded on this underground market is huge. Second, our research proves that a significant amount of websites within China’s part of the Web contain some kind of malicious content: our measurements reveal that about 1.49% of the examined sites contain malicious content that tries to attack the visitor’s browser.

The paper is a collaboration with several researchers from China (Jianwei Zhuge, Chengyu Song, Jinpeng Guo, Xinhui Han, and Wei Zou) and a revised version of our technical report on the same topic. The full version of the paper is now available.

