It seems that many organisations, including some of the largest ones, do not sufficiently use the open-source intelligence capabilities available online to gain further insight into their own cyber security threats. By adopting even basic techniques, organisations may be able to improve their detection time and responsiveness to at least some of those threats.

Long detection times and unaddressed data breaches are still a major concern

The well-known 2013 Data Breach Investigations Report (DBIR) from Verizon provides an in-depth analysis of a broad range of security breaches and sheds some light on the circumstances under which they were detected. According to the report, for 66% of the incidents investigated, the span of time from the initial compromise to the moment the victim organization discovered the incident was a matter of months or even longer. That is a pretty long time for a compromised system and sensitive information to be at the disposal of the bad guys while still going unnoticed!

Along with these alarming figures, the report gives some telling indicators of how breaches initially get discovered. Third parties discover data breaches much more frequently than the breached victims do (69% versus 31%, respectively). Incidents categorized as reported by a third party include those learned of from law enforcement agencies, clients, partners and other external parties. This looks very bad, doesn't it? And these are just the incidents we know about; other cyber crimes may have been flying under the radar. A key question is how these detections could have turned out better.

SIEM requires great effort and aggregates data often limited to one's own environment

Security Information and Event Management (SIEM) technology has been a hot investment for the last decade. Great effort has gone into gathering intelligence by mining data and correlating internal sources, including databases, middleware, infrastructure components, Intrusion Detection and Prevention Systems (IDPS), and many more. The most recent generations of SIEM technology are riding the "big data" buzzword, expanding SIEM capabilities into even smarter mechanisms fuelled by massive data sets from a much wider variety of sources, often measured in terabytes per week. This technology is part of the new security arsenal designed to address the limitations of traditional security solutions, such as signature-based detection. SIEM may valuably augment security breach detection capabilities while improving the overall security posture. This, however, comes at a cost: getting a handle on SIEM usually requires significant resources, effort and expertise, and most small and even many medium-sized businesses simply cannot afford the luxury. Moreover, SIEM is generally limited to generating intelligence from internal sources within the perimeter network, and its collection mechanisms often do not integrate nicely with unstructured external data sources.

Emerging cyber threat intelligence won't uncover 100% of the APTs

In the light of today's fast-paced cyber threat landscape and the emergence of Advanced Persistent Threats (APTs), security vendors and providers are bringing new threat intelligence solutions to the table.
Rather than just providing CERT, SANS and vendor advisories on the latest ongoing threats, they tend to aggregate supplementary, restricted data sources, such as honeypots, malware zoos, vendor-managed devices and other endpoints, giving a better understanding of what is going on down the wire. By combining this with contextual analysis, a vendor can provide its customers with an actionable threat intelligence feed related to corporate IP addresses, domain names, sensitive URLs, file content, and so on. Some of these new services will probably pay off and help uncover quite a few APTs. They might be especially worthwhile when offered by big players in the telco managed services or enterprise security product arenas: the larger their infrastructure, the wider their field of view is likely to be.

As usual, a multi-layered approach is better when it comes to security
As with any other information security practice, a good cyber threat intelligence strategy should follow a multi-layered approach. This is particularly true when considering APTs, which are by nature designed to evade detection. On the one hand, if an organization focuses all its efforts on SIEM or similar techniques, its field of view might be limited to the inner perimeter. On the other hand, an organization relying on an external vendor to carry out threat intelligence monitoring outside its perimeter will probably be limited to that vendor's field of view.

Embedding OSINT into a cyber threat intelligence strategy

With this in mind, some forms of Open-Source Intelligence (OSINT) could be embedded into the corporate threat intelligence strategy to connect the dots between the other layers. Although OSINT has traditionally been used by government and military agencies, some of the underlying techniques may suit other businesses. The discipline is called "open source" because it relies on overt, publicly available sources on the Internet. OSINT techniques consist of conducting regular reviews and/or continuous monitoring of multiple sources, including search engines, social networks, blogs, comments, underground forums, blacklists/whitelists and so on. Similar techniques are commonly used by marketing departments for competitive intelligence and business intelligence purposes; there, however, they serve as strategic decision support tools rather than as a means of cyber threat intelligence. In the latter case, the purpose is twofold.

Various techniques to reveal weaknesses and uncover ongoing cyber threats

The first objective is to uncover ongoing threats by searching open sources for signs of suspicious activity associated with a predefined set of targets. The second objective is to understand the organization's footprint and how it might be viewed by potential cyber criminals in terms of interest, visible information, exhibited vulnerabilities and weaknesses. This can be achieved through reconnaissance, much as a typical pentester would do, although the process should be more repeatable and automated than a pentester's one-off exercise.

Techniques are numerous and may be more or less complex, depending on the needs and what can be afforded. They can be carried out using a variety of services and tools, including free online utilities, and range from specially crafted search engine alerts (e.g. Google dorks, web crawling for blacklists) to creating dummy user accounts in underground hacking forums. As a simple example, an organization named "MyOrg Ltd" may implement some basic search engine alerts crawling the web for patterns like "MyOrg has been hacked", "MyOrg * defaced", "MyOrg * SQL injection" or even "Fake MyOrg emails". Such a surveillance mechanism could allow the organization to catch up on an ongoing threat being discussed by users, bloggers, online newspapers or hackers claiming to have broken into a system. Plenty of hackers still show off their tours de force on Twitter and, from time to time, exchange lists of vulnerable URLs in public places. Another straightforward example is to query various search engines with common vulnerability patterns using so-called "Google dorks" or similar requests. For instance, the following queries may at times reveal some juicy targets: "inurl:MyOrg.com 'login: *' 'password= *' filetype:xls" or "site:www.MyOrg.com inurl:administrator_login.asp".
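To make this concrete, here is a minimal sketch in Python that turns the alert patterns and dorks above into ready-to-review search URLs. The organization name "MyOrg", the pattern lists and the choice of Google as the example engine are illustrative assumptions rather than a prescription; in practice the queries would be registered with a search-alert service or reviewed manually, and any automated querying should respect the search engine's terms of service.

    # Minimal sketch: turn an organization name into a set of OSINT search queries.
    # "MyOrg" and the pattern lists below are illustrative placeholders.
    from urllib.parse import quote_plus

    ORG = "MyOrg"

    # Plain-text alert patterns, suitable for a search-engine alert service.
    ALERT_PATTERNS = [
        f'"{ORG} has been hacked"',
        f'"{ORG} * defaced"',
        f'"{ORG} * SQL injection"',
        f'"Fake {ORG} emails"',
    ]

    # "Google dork" style queries looking for exposed files and admin pages.
    DORK_PATTERNS = [
        f"inurl:{ORG}.com \"login: *\" \"password= *\" filetype:xls",
        f"site:www.{ORG}.com inurl:administrator_login.asp",
    ]

    def to_search_url(query: str) -> str:
        """Encode a query as a search URL (Google is used purely as an example)."""
        return "https://www.google.com/search?q=" + quote_plus(query)

    if __name__ == "__main__":
        print("# Alert patterns (register these with your alerting service):")
        for query in ALERT_PATTERNS:
            print(f"{query}\n  -> {to_search_url(query)}")
        print("\n# Dork-style queries (review the results manually):")
        for query in DORK_PATTERNS:
            print(f"{query}\n  -> {to_search_url(query)}")

The output is just a list of queries; wiring them into an actual alerting workflow, and deciding how often to review the hits, is left to the organization.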
To give one last illustration, social networks and search engines could both be monitored for traces of careless employees or contractors posting confidential information about their work activities. This could be done by setting up alerts based on patterns related to trade secrets, current R&D projects or classification footers like "Confidential MyOrg Ltd" or "MyOrg Ltd proprietary information" (see the sketch further below). A side note here: care should be taken not to violate privacy and to comply with regulations when monitoring activity on social networks.

OSINT for whom, and for what purpose?

All kinds of businesses may see advantages in implementing OSINT mechanisms. While many large organisations focus their efforts on implementing a comprehensive (and expensive) SIEM system, some may balance their investments with other forms of cyber threat intelligence. As with other information security investments, OSINT initiatives should follow a risk-based and cost-effective approach. For instance, a major defence corporation may be interested in a more ambitious OSINT program than a small toy manufacturer, simply because it is a juicier target. It may implement complex and highly customised OSINT techniques focusing on several criteria, including, but not limited to, details of key stakeholders and executives, sensitive projects, IP address ranges, domain names and URLs. Small businesses, which cannot easily afford SIEM or costly cyber threat intelligence solutions, could find it worth considering simple OSINT techniques similar to some of those outlined above.

As with other cyber threat intelligence mechanisms, bear in mind that OSINT will not uncover all ongoing threats; the simplest mechanisms might uncover only a few of them. They may, however, come at minimal cost, so why not go with them? Think of them as one component of a multi-layered threat detection or cyber threat intelligence strategy.

Conclusion

It is clear today that traditional (and expensive) security protections such as signature-based and "wall-and-fortress" approaches are no longer enough to protect against emerging cyber threats. Organizations should move towards new approaches to fill the gaps in their traditional security arsenal. As has been seen, no single solution will fill all the gaps; organizations should instead seek a combination of several, adopting a multi-layered approach. Although they have traditionally been used mainly by government and military agencies, some forms of OSINT technique will probably be adopted by a growing variety of organizations as an additional detection layer.

Going back to the original question, there are multiple ways to make use of OSINT. Organizations may focus their OSINT strategy on uncovering ongoing threats (reactive detection), conducting online reconnaissance (proactive prevention) or a combination of both. As has been noted, the complexity of an OSINT strategy may vary significantly between organizations depending on objectives and resources, and it may evolve over time to meet new and higher requirements. The simplest techniques can be carried out at almost no cost. More complex techniques require more effort to set up, to maintain the process and to review the output data. There may also be a high rate of false positives, which takes significant work to filter out and to tune the system. OSINT can be performed either as ad hoc reviews or as a continuous process.
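As a closing illustration of the continuous mode, and of the classification-footer monitoring described earlier, here is a minimal sketch. It assumes the organization already has some way of collecting candidate public text (search alerts, feeds, paste monitors); the fetch_public_snippets function is a hypothetical placeholder, and the patterns, interval and deduplication set are illustrative choices rather than part of any particular product. Deduplicating on the source URL is one simple way to keep the false-positive review workload manageable.

    # Minimal sketch of a continuous OSINT check with basic deduplication.
    # fetch_public_snippets() is a hypothetical placeholder: plug in whatever
    # collection mechanism (search alerts, feeds, paste monitors) is available.
    import re
    import time
    from typing import Iterable, Tuple

    # Illustrative leakage patterns: classification footers used by "MyOrg Ltd".
    LEAK_PATTERNS = [
        re.compile(r"Confidential\s+MyOrg\s+Ltd", re.IGNORECASE),
        re.compile(r"MyOrg\s+Ltd\s+proprietary\s+information", re.IGNORECASE),
    ]

    def fetch_public_snippets() -> Iterable[Tuple[str, str]]:
        """Placeholder collector: yield (source_url, text) pairs from public sources."""
        return []  # replace with a real collector

    def scan_once(seen_urls: set) -> None:
        """Scan one batch of snippets and report only matches not seen before."""
        for url, text in fetch_public_snippets():
            for pattern in LEAK_PATTERNS:
                if pattern.search(text) and url not in seen_urls:
                    seen_urls.add(url)  # deduplicate to limit repeated alerts
                    print(f"[ALERT] possible leak matching {pattern.pattern!r} at {url}")

    if __name__ == "__main__":
        seen: set = set()
        while True:           # continuous mode; call scan_once() once for an ad hoc review
            scan_once(seen)
            time.sleep(3600)  # illustrative interval: hourly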
OSINT can be done either in-house or contracted out. At the moment, a limited number of vendors provide OSINT services, and few of them offer comprehensive, global surveillance of open sources across cyberspace. Frequently, the offered solution cannot be customised enough to really meet the organization's needs. Many consultancies provide OSINT reviews as a service, sometimes even in continuous mode; because of a more limited client portfolio, the service they run may be more tailored to each client's needs.

A last point to mention is that organizations must keep addressing the human factor while carrying out OSINT practices. Staff online behaviour remains a key concern. Organizations should develop guidelines and best practices for personnel use of the web and social networks, while performing OSINT reviews and monitoring to ascertain how much sensitive information can be found online. Bear in mind that even if a piece of data disclosed in a public place is not a big concern in itself, it may be correlated with other available data to infer more sensitive information.