Technical Report: "Abusing Social Networks for Automated User Profiling"

Wednesday, March 17. 2010
research
We recently published a technical report on another project related to social networks. The paper is entitled "Abusing Social Networks for Automated User Profiling" and we focus on automatically collecting information about users based on the information available in different networks.

Imagine that you have a profile on Facebook, on LinkedIn, and on MySpace. Perhaps you do not want to directly link these profiles, for example since you want to have a more serious profile on LinkedIn, while having a more relaxed one on MySpace and Facebook. Thus you use different pseudonym/names on the different profiles and expect that the information can not be correlated. However, there is a problem with that assumption: during the registration on the different networks, you used the same e-mail address. And a social network typically enables a user to search for e-mail addresses in order to find friends (a convenient feature, after all you want to network with your friends). An attacker can thus go ahead and search on each network for a given e-mail address, scrape the profile related to that address, and then correlate the information found on different network. At the end, an attacker can thus enrich a given e-mail address with information collected on different social networks.

An attacker can not only search for one e-mail address at a time, but typically for hundreds or even thousands. And he can not only do this once, but thousands of times per day. For example, we were able to check about 10 million e-mail addresses on Facebook per day. A spammer could use this "feature" to verify e-mail addresses by using Facebook as an oracle to determine whether or not a given e-mail address is valid. Furthermore, the correlation aspect is of course also a privacy problem since an attacker can find "hidden" information and correlate information across different networks.

We have contacted different social networks. Facebook and XING have already addressed the problem - thanks a lot!

Abstract:
Recently, social networks such as Facebook have experienced a huge surge in popularity. The amount of personal information stored in these sites calls for appropriate security precautions to protect this data.
In this paper, we describe how we are able to take advantage of a common weakness, namely the fact that an attacker can query the social network for registered e-mail addresses on a large scale. Starting with a list of about 10.4 million email addresses, we were able to automatically identify more than 1.2 million user profiles associated with these addresses. By crawling these profiles, we collect publicly available personal information about each user, which we use for automated profiling (i.e., to enrich the information available from each user).
Finally, we propose a number of mitigation techniques to protect the user’s privacy. We have contacted the most popular providers, who acknowledged the threat and are currently implementing our countermeasures. Facebook and XING in particular have recently fixed the problem.

The technical report is available at http://www.iseclab.org/papers/socialabuse-TR.pdf and it was joint work with Marco Balduzzi, Christian Platzer, Engin Kirda, Davide Balzarotti, and Christopher Kruegel.

"Inspector Gadget: Automated Extraction of Proprietary Gadgets from Malware Binaries"

Friday, March 12. 2010
When analyzing malware samples, a human analyst is typically interested in understanding/recovering a specific algorithms of the given sample. In the case of Conficker, for example, she might be interested in extracting the domain generation algorithm such that she can understand what domains are currently and in the future used by the malware. Or for spam bots, she might be interested in how the malware downloads spam templates, decodes them, and then generates the actual spam messages. Or for bots, she might be interested in understanding how binary updates are downloaded, decoded, and then executed.

In each case, the binary itself encodes the algorithm, but it is cumbersome and hard work to understand all of this. Thus it would be useful to have a tool that enables a malware analyst to automatically extract from a given binary sample the relevant algorithm related to a specific task. In a paper that will be presented at the 31st IEEE Symposium on Security & Privacy we introduce Inspector Gadget, a tool that implements exactly this. A gadget encapsulates all code related to a specific task and can be executed in a stand-alone fashion. A gadget player can take a gadget and replay it, for example to determine which domains are currently used by Conficker, or download and decode an update for a bot binary. Furthermore, we introduce an approach to revert gadget based on a enhanced brute-force algorithm: this is useful to understand the effects of malware in detail and we can (in certain cases) also revert obfuscation algorithms, i.e., to understand what data has been exfiltrated by a given sample. The full paper has all the details and describes Inspector Gadget in more depth. And if you are interested in the topic, you should also read the paper by Caballero et al. on BCR (paper title is "Binary Code Extraction and Interface Identification for Security Applications").

Abstract:
Unfortunately, malicious software is still an unsolved problem and a major threat on the Internet. An important component in the fight against malicious software is the analysis of malware samples: Only if an analyst understands the behavior of a given sample, she can design appropriate countermeasures. Manual approaches are frequently used to analyze certain key algorithms, such as downloading of encoded updates, or generating new DNS domains for command and control purposes.
In this paper, we present a novel approach to automatically extract, from a given binary executable, the algorithm related to a certain activity of the sample. We isolate and extract these instructions and generate a so-called gadget, i.e., a stand-alone component that encapsulates a specific behavior. We make sure that a gadget can autonomously perform a specific task by including all relevant code and data into the gadget such that it can be executed in a self-contained fashion.
Gadgets are useful entities in analyzing malicious software: In particular, they are valuable for practitioners, as understanding a certain activity that is embedded in a binary sample (e.g., the update function) is still largely a manual and complex task. Our evaluation with several real-world samples demonstrates that our approach is versatile and useful in practice.

The full paper is available at http://www.iseclab.org/papers/ieee_sp10_inspector_gadget.pdf and will be presented in May at the 31st IEEE Symposium on Security & Privacy. The paper was joint work with Clemens Kolbitsch, Christopher Kruegel, and Engin Kirda - all members of the International Secure Systems Lab.