One of the most pressing challenges facing cyber security professionals today is the sheer number of security incident alerts, which is becoming too high to cope with even for the largest and best-equipped security teams. The increase is the result of two factors. The more obvious one is the sharp rise in cyber attacks in recent years. The other is more complex and somewhat ironic: it arises from the growing use of tools and devices within an organization whose original function is to detect and mitigate incidents in the first place.
Security Operations Centers (SOCs) now use more devices designed to alert security analysts to cyber attacks than ever before, with the side effect that there are too many alerts for security teams to handle. Consequently, some of the most credible threats go undetected or are simply not acted upon.
Addressing the Threat Noise Issue
With so many systems monitoring potential security threats and incidents creating alerts, and also taking into consideration that in many cases SOCs are severely understaffed, it comes as no surprise that analysts have a hard time staying on top of every single alert and responding to them appropriately and in a timely fashion. Since they don’t have the time or sufficient human resources to handle all alerts, SOCs often choose to disregard some and try to focus on those they deem to be credible, which understandably can lead to real threats slipping through the cracks and inflicting serious and irreparable damage to organizations.
In an effort to address the issue of threat noise, some SOCs opt for either reducing the number of devices generating alerts or expanding their staff. While seemingly simple and straightforward, these options can be both counterproductive and quite costly. They are not the only solutions available to SOCs, however: there is an alternative that neither allows alerts to go undetected nor requires hiring additional security analysts.
Automating the Most Time-Consuming Parts of the Process
While the number of alerts generated by monitoring devices doesn’t necessarily have to be a reason for concern in itself, the fact that alerts take a significant amount of time to analyze and handle often makes them an insurmountable challenge for understaffed security teams. One very promising tactic for tackling this challenge is to enable an automated response to specific types of alerts, an approach that can yield a wide range of benefits to organizations.
The idea is to automate the routine tasks that are repetitive and that do not require a lot of human expertise, but do usually take a lot of time to respond to and handle. By automating the response to these types of alerts, SOC analysts get more time to handle the alerts that pose a greater risk to their organizations, which must be analyzed in a more focused and comprehensive manner.
As noted in a recent SANS Spotlight paper titled “SOC Automation – Disaster or Deliverance”, written by Eric Cole: “The rate at which organizations are attacked is increasing, as is the speed at which those attacks compromise a network – and it is not possible for a human to keep up with the speed of a computer. The only way to beat a computer is with a computer”.
However, it must be noted that the implementation of incident response automation itself brings a certain degree of risk to organizations, as it might produce false positives, leaving analysts unable to determine whether specific alerts are legitimate threats or not. This means that if automation is not properly implemented with predetermined processes and procedures in place, analysts may end up spending much of their time on alerts that aren’t actual attacks and don’t pose any foreseeable danger. Having said that, organizations should not shy away from automation because of these potential drawbacks, but should instead implement it in a balanced and well-thought-out manner. The key is to manage and control false positives as opposed to simply eliminating them. It is therefore important to automate only the low-risk alerts that are not expected to have a major impact on an organization and leave the more serious threats to be handled by security professionals who can apply their expertise to resolve them.
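To make the split between automated and analyst-handled alerts concrete, here is a minimal Python sketch. The alert categories, the criticality scale and the threshold are all assumptions made up for illustration; a real SOC would define its own.

```python
from dataclasses import dataclass

# Hypothetical categories considered routine enough to automate; tune per organization.
LOW_RISK_CATEGORIES = {"phishing_report", "port_scan", "known_adware"}


@dataclass
class Alert:
    alert_id: str
    category: str
    asset_criticality: int  # assumed scale: 1 (lab machine) to 5 (business-critical system)


def route(alert: Alert) -> str:
    """Return 'automated' for routine low-risk alerts, otherwise 'analyst'."""
    if alert.category in LOW_RISK_CATEGORIES and alert.asset_criticality <= 2:
        return "automated"
    return "analyst"


alerts = [
    Alert("A-1", "port_scan", 1),
    Alert("A-2", "port_scan", 5),   # routine category, but on a critical asset
    Alert("A-3", "ransomware", 1),  # never automated in this sketch
]
queues = {"automated": [], "analyst": []}
for alert in alerts:
    queues[route(alert)].append(alert.alert_id)
print(queues)
```

Only the first alert lands in the automated queue. Anything touching a critical asset, or belonging to a category outside the low-risk list, still goes to a person.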
When deciding whether to adopt automation, organizations need to be aware of its pros and cons. If this assessment is carried out correctly, they will realize that the advantages of this approach clearly outweigh the disadvantages, which can also be controlled and managed to minimize any potential negative impact.
Looking at the pros and cons of automation, it’s easy to see that the most important benefit is that it allows SOCs to monitor and analyze many more incidents than they could manually, freeing up the security team’s bandwidth to focus on the high-risk and high-impact alerts. Other key benefits include a more consistent response to alerts and tickets, a higher volume of ticket closure and response to incidents, and coverage of a larger area and a larger number of tickets. On the other hand, automation can yield false positives, which direct time and resources towards resolving alerts that are not legitimate attacks. In the worst case, this can lead organizations to shut down operations unnecessarily, with an impact on their business and their bottom line.
All said and done, automated incident response has the potential to bring significant benefits to organizations, provided that it’s implemented properly and cautiously, with a well-thought-out strategy. Overall, it should be a serious consideration for any SOC that has to handle large volumes of alerts on a daily basis.
For further information on SOC automation, read the recent SANS Institute Spotlight Paper, “SOC Automation – Disaster or Deliverance”.
Threat actors are increasingly adopting security automation and machine learning – security teams will have to follow suit, or risk falling behind.
Many organizations still conduct incident response based on manual processes. Many playbooks that we have seen in our customer base, for example, hand off to other stakeholders within the organization to wait for additional forensic data, and to execute remediation and containment actions.
While this may seem like good practice to avoid inadvertent negative consequences such as accidentally shutting down critical systems or locking out innocent users, it also means that many attacks are not contained in a sufficiently short time to avoid the worst of their consequences.
Manual Processes Cannot Compete with Automation
Reports are mounting about threat actors and hackers leveraging security automation and machine learning to increase the scale and volume, as well as the velocity of attacks. The implications for organizations should be cause for concern, considering that we have been challenged to effectively respond to less sophisticated attacks in the past.
Ransomware is a case in point. In its most simple form, a ransomware attack does not require the full cyber kill chain to be successful. A user receives an email attachment, executes it, the data is encrypted and the damage is done. At that point, incident response turns into disaster recovery.
Automated attacks have been with us for a long time. Worms and Autorooters have been around since the beginning of hacking, with WannaCry and its worming capability only the most recent example. But these have only automated some aspects of the attack, still permitting timely and successful threat containment further along the kill chain.
Threat actors have also leveraged automated command and control infrastructure for many years. DDoS Zombie Botnets, for example, are almost fully automated. To sum it up, the bad guys have automated, the defenders have not. Manual processes cannot compete with automation.
With the increase in the adoption of automation and machine learning by cyber criminals, enterprises will find that they will have to automate as well. The future mantra will be “Automate or Die”.
Making the Cure More Palatable Than the Disease
But automating containment actions is still a challenging topic. Here at DFLabs we still encounter a lot of resistance to the idea from our customers. Security teams understand that the escalating sophistication and velocity of cyber-attacks mean that they must become more agile to rapidly respond to cyber incidents. But the risk of detrimentally impacting operations makes them reluctant to do so, and they rarely have the political backing and clout even if they want to.
Security teams will find themselves having to rationalize the automation of incident response to other stakeholders in their organization more and more in the future. This will require being able to build a business case to justify the risk of automating containment. They will have to explain why the cure is not worse than the disease.
There are three questions that are decisive in evaluating whether to automate containment actions:
- How reliable are the detection and identification?
- What is the potential detrimental impact if the automation goes wrong?
- What is the potential risk if this is not automated?
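One way to turn these three questions into something a team can argue about is to write them down as a small decision function. The sketch below is only an illustration: the scales and thresholds are invented assumptions, not recommended values.

```python
def containment_mode(detection_confidence: float,
                     impact_if_wrong: int,
                     risk_if_delayed: int) -> str:
    """Pick a containment mode from the three questions.

    detection_confidence: 0.0-1.0, how reliable detection and identification are.
    impact_if_wrong:      1-5, damage if the automated action misfires.
    risk_if_delayed:      1-5, damage if containment has to wait for a human.
    All thresholds below are illustrative assumptions.
    """
    # Reliable detection and a cheap mistake: let the machine act on its own.
    if detection_confidence >= 0.9 and impact_if_wrong <= 2:
        return "automate"
    # Reasonably reliable, and waiting costs more than a misfire: automate, but ask first.
    if detection_confidence >= 0.7 and risk_if_delayed > impact_if_wrong:
        return "automate_with_approval"
    return "manual"


print(containment_mode(0.95, 2, 5))  # e.g. known ransomware on a workstation
print(containment_mode(0.80, 3, 5))  # decent signal, moderate blast radius
print(containment_mode(0.75, 5, 4))  # uncertain signal on a critical system
```

The value of writing it down is less the code than the conversation it forces: each threshold is a risk decision that other stakeholders can review and sign off on.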
Our approach at DFLabs is to carefully evaluate what to automate, and how to do this safely. We support organizations in selectively applying automation through our R3 Rapid Response Runbooks. Incident Responders can apply dual-mode actions that combine manual, semi-automated and fully automated steps to provide granular control over what is automated. R3 Runbooks can also include conditional statements that apply full automation when it is safe to do so but request that a human vets the decision in critical environments or where it may have a detrimental impact on operational integrity.
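The idea of conditional automation with a human checkpoint can be sketched generically in Python. This is not the R3 Runbook format or API; the `Step` class, the `run_runbook` function and the `approve` callback are hypothetical names used purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[dict], str]
    # Returns True when the step may run without a human in the loop.
    safe_to_automate: Callable[[dict], bool]


def run_runbook(steps, incident, approve):
    """Run each step; ask `approve` first whenever a step is not safe to automate."""
    log = []
    for step in steps:
        if step.safe_to_automate(incident) or approve(step, incident):
            log.append(step.action(incident))
        else:
            log.append(f"skipped: {step.name}")
    return log


steps = [
    # Enrichment is read-only, so it is always automated in this sketch.
    Step("enrich", lambda i: f"enriched {i['host']}", lambda i: True),
    # Isolation is automated everywhere except production.
    Step("isolate", lambda i: f"isolated {i['host']}",
         lambda i: i["environment"] != "production"),
]

# Stand-in for a human reviewer who declines every request.
deny = lambda step, incident: False

print(run_runbook(steps, {"host": "lab-07", "environment": "lab"}, deny))
print(run_runbook(steps, {"host": "erp-01", "environment": "production"}, deny))
```

The lab host is isolated automatically, while the production host is only enriched, because the isolation step waits for an approval that never arrives. In practice the `approve` callback would be a ticket, a chat prompt or a console action.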
We just released a whitepaper, “Automate or Die, without Dying”, by our Vice President of Product Evangelism and former Gartner analyst, Oliver Rochford, that discusses best practices to safely approach automation. Download the whitepaper here for an in-depth discussion on this controversial and challenging, but important topic.
Alert fatigue is the desensitization that occurs when people are overwhelmed with too much information. The constant repetition and sheer volume of redundant information are painful and arduous, but sadly often constitute the daily reality for many people working in cyber security. Mike Fowler (DFLabs’ VP of Professional Services) discusses several best practices to help with some of the challenges involved in his recent whitepaper “DFLabs as a Force Multiplier in Incident Response”. I am going to discuss another one, but looking at it from a slightly different angle.
Imagine the scenario where we have tens of thousands of alerts. Visualize these as jigsaw pieces with a multitude of different shapes, sizes and colors and the additional dimension of different states. We have alerts from a firewall, anomalies from behavioral analytics, authentication attempts, data source retrieval attempts or policy violations. Now, there are a lot of ways to sift through this information, for example by using a SIEM to correlate the data and reduce the number of alerts. The SIEM could identify and cross-reference the colors and shapes of the jigsaw pieces, so to speak.
The next question, once I’ve got all the pieces I need for the puzzle, is how do I put this together? How do I complete the puzzle and unlock the picture?
The “what does the jigsaw depict?” question is something that will often puzzle the responders, pun intended. How do you prioritise and escalate incidents to the correct stakeholders? How do you apply the correct playbook for a specific scenario? How do you know which pieces of information to analyse to fit the jigsaw pieces together and make sure the puzzle looks correct?
Automation can speed up putting that puzzle together, but making sure you automate the right things is just as critical. If skilled staff are running search queries that are menial, repetitive and require little cognitive skill to execute, you should ask yourself why they are performing these instead of analyzing the puzzle pieces to figure out how they fit together.
Remove the menial tasks. Allow automation to do the heavy lifting so that your teams not only have the information they need to successfully manage the response to an incident, but also have more time to figure out the why, how and what of the threat.
We also welcome you to join us for a webinar hosted by Mike Fowler on this topic on the 6th of September.
I can remember sometime around late 2001 or early 2002, GREPing Snort logs for that needle in a haystack until I thought I was going to go blind. I further recall around the same time cheering the release of the Analysis Console for Intrusion Databases (ACID) tool which helped to organize the information into something that I could start using to correlate events by way of analysis of traffic patterns.
Skip ahead and the issues we faced while correlating data subtly changed from a one-off analysis to a lack of standardization for the alert formats that were available in the EDR marketplace. Each vendor was producing significant amounts of what was arguably critical information, but unfortunately all in their own proprietary format. This rendered log analysis and information tools constantly behind the 8-ball when trying to ingest all of these critical pieces of disparate event information.
We have since evolved to the point that log file information sharing can be easily facilitated through a number of industry standards, e.g., RFC 6872. Unfortunately, with the advent of the Internet of Things (IoT), we have also created new challenges that must be addressed in order to make the most effective use of data during event correlation. Specifically, how do we quickly correlate and review:
a. Large amounts of data;
b. Data delivered from a number of different resources (IoT);
c. Data which may be trickling in over an extended period of time; and
d. Data segments that, when evaluated separately, will not give insight into the “Big Picture”?
How can we now ingest these large amounts of data from disparate devices and rapidly draw conclusions that allow us to make educated decisions during the incident response life cycle? I can envision success coming through the intersection of 4 coordinated activities, all facilitated through event automation:
1. Event filtering – This consists of discarding events that are deemed to be irrelevant by the event correlator. This is also important when we seek to avoid alarm fatigue due to a proliferation of nuisance alarms.
2. Event aggregation – This is a technique where a collection of many similar events (not necessarily identical) are combined into an aggregate that represents the underlying event data.
3. Event masking – This consists of ignoring events pertaining to systems that are downstream of a failed system.
4. Root cause analysis – This is the last and quite possibly the most complex step of event correlation. Through root cause analysis, we can visualize data juxtapositions to identify similarities or matches between events, determine whether some events can be explained by others, or identify causal factors between security events.
The results of these 4 event activities will promote the identification and correlation of similar cyber security incidents, events and epidemiologies.
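The first three activities are simple enough to sketch in a few lines of Python. Everything here is a toy assumption: the event fields, the severity scale and the `UPSTREAM` topology map are invented for illustration, and root cause analysis is deliberately left out because it depends heavily on the environment.

```python
from collections import Counter

# Hypothetical topology: each host maps to the upstream host it depends on.
UPSTREAM = {"web-01": "fw-01", "db-01": "fw-01", "fw-01": None}

events = [
    {"host": "fw-01", "type": "link_down", "severity": 5},
    {"host": "web-01", "type": "unreachable", "severity": 3},
    {"host": "db-01", "type": "unreachable", "severity": 3},
    {"host": "web-01", "type": "heartbeat", "severity": 0},
    {"host": "fw-01", "type": "link_down", "severity": 5},
]


def filter_events(events, min_severity=1):
    """Event filtering: discard events deemed irrelevant (here, below a severity floor)."""
    return [e for e in events if e["severity"] >= min_severity]


def mask_events(events, failed_hosts):
    """Event masking: drop events from hosts downstream of a failed system."""
    def behind_failure(host):
        parent = UPSTREAM.get(host)
        while parent is not None:
            if parent in failed_hosts:
                return True
            parent = UPSTREAM.get(parent)
        return False
    return [e for e in events if not behind_failure(e["host"])]


def aggregate_events(events):
    """Event aggregation: collapse similar events into one counted entry."""
    return Counter((e["host"], e["type"]) for e in events)


failed = {e["host"] for e in events if e["type"] == "link_down"}
summary = aggregate_events(mask_events(filter_events(events), failed))
print(summary)
```

Five raw events collapse into a single aggregated entry for the failed firewall. The heartbeat is filtered out, and the two "unreachable" events are masked because they sit behind the failure. Masking is applied before aggregation here simply to keep the counts clean; the order is a design choice.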
According to psychology experts, up to 90% of information is transmitted to the human brain visually. Taking that into consideration, when we are seeking to construct an associational link between large amounts of data, we must be able to process the information using a visual model. DFLabs IncMan™ provides a feature-rich correlation engine that is able to extrapolate information from cyber incidents in order to present the analyst with a contextualized representation of current and historical cyber incident data.
As we can see from the correlation graph above, IncMan has helped simplify and speed up a comprehensive response by identifying the original infection point of entry into the network and then visually representing the network nodes that were subsequently affected, denoted by their associational links.
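For readers who want a feel for what associational link analysis does under the hood, here is a minimal sketch. It is not how IncMan implements correlation; the incident data and function names are hypothetical. Incidents are linked whenever they share an observable, and a breadth-first walk from the suspected point of entry finds everything connected to it.

```python
from collections import defaultdict, deque

# Hypothetical incidents, each with the observables (IPs, files, accounts) seen in it.
incidents = {
    "INC-1": {"10.0.0.5", "evil.exe"},
    "INC-2": {"evil.exe", "10.0.0.9"},
    "INC-3": {"10.0.0.9", "svc-backup"},
    "INC-4": {"192.168.1.20"},
}


def build_links(incidents):
    """Link two incidents whenever they share at least one observable."""
    by_observable = defaultdict(set)
    for inc, observables in incidents.items():
        for obs in observables:
            by_observable[obs].add(inc)
    links = defaultdict(set)
    for related in by_observable.values():
        for inc in related:
            links[inc] |= related - {inc}
    return links


def connected_to(start, links):
    """Breadth-first walk returning incidents reachable from `start`, in discovery order."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        current = queue.popleft()
        order.append(current)
        for neighbor in sorted(links[current]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


print(connected_to("INC-1", build_links(incidents)))
```

Starting from INC-1, the walk reaches INC-2 through the shared file and INC-3 through the shared IP address, while INC-4 stays unrelated. A graph view is essentially this result drawn as nodes and edges.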
The ability to ingest large amounts of data and conduct associational link analysis and correlation, while critical, does not have to be overly complicated, provided of course that you have the right tools. If you’re interested in seeing additional capabilities available to simplify your cyber incident response processes, please contact us for a demo at [email protected]
One of my favorite sports, American football, uses a term which has always fascinated me. This term is ‘situational football’ and its whole concept is to react according to the scenario in which you find yourself. American football clubs split their squads into essentially three teams.
–Attack, which is the offensive team and the guys that typically score points.
–Defense, which is the opposing team tasked with stopping the attacking team from scoring points.
–Special teams, which is an often overlooked team. This team can be part of the defense or offense and is typically used for every other play that is not defined as an offensive or defensive setting.
Now, you may be wondering why I am talking about sports in a cyber security blog?!
Well, I always like to relate the cyber security industry to other industries and to try to think outside of the box when discussing some of our approaches. That said, I’m going to make a beeline for this idea and start relating this to our thinking:
–Attack, or Red teams, can have a positive impact on your response strategy. Relating your response plans and playbooks directly to common attack methods is advisable and should be used in conjunction with the relevant compliance standards. The actions taken in response to specific attack vectors will usually have a higher success rate than a generic catch-all cyber incident response plan. I would take a lot more comfort knowing I have playbooks designed for a specific threat vector than I would hoping that one of my generic playbooks would cover it.
–Defense, or Blue Teams, are already a big part of response plans, and ongoing refinement of these plans should coincide with every incident’s lessons learned. A successful response should still have lessons to consider!
–Special Teams are a mix of Red and Blue, of offense and defense. They are best positioned to engage in ‘situational football’ and to enable you to define your approach with more than one mindset, even, in some cases, conflicting mindsets. This combined approach brings an attacker’s methodology to the search for enrichment information during incident identification, and a defender’s pragmatism to containment and eradication activities. Having a defined response to each phase of IR is important, but engaging special teams and having the ability to refactor your playbooks on the fly is a key capability when orchestrating an effective cyber security incident response to a dynamic incident.
Unique situations can present themselves at every moment of the game. Our playbook features allow you to make your defense attack-minded by feeding in all the information gathered from your playbooks and allowing you to not be restricted by baseline actions alone. We want your defense to run actions at every point and to allow you to call an audible in any situation that presents itself. The freedom to apply this mindset will drive your incident response teams above and beyond what they see in front of them.
At DFLabs, we not only create playbooks specific to compliance standards and cyber security incident response standards, we also enable you to create and to actively amend your own custom playbooks. Our flexibility ensures that your playbooks can be built on the experience of your Red and Blue teams, in line with adversarial thinking specific to your organization or industry, and to the satisfaction of your corporate, industry and regulatory policies.
Contact us to find out more at [email protected]
According to an October 2016 Fortune Tech article by Jonathan Vanian, entitled “Here’s How Much Businesses Worldwide Will Spend on Cybersecurity by 2020”, organizations will be spending approximately $73.3 billion in 2016 on network security, with a projected increase of roughly 38% to $101.6 billion in 2020. Stakeholders know all too well that the pennies you save today may equate to dollars in lost revenue and fines tomorrow following a significant breach or personal information leak. Finding the balance between risk and ROI is the type of thing that keeps CISOs and CTOs awake at night.
This becomes even more critical for multinational corporations as we approach the May 25, 2018 General Data Protection Regulation (GDPR) implementation date. Once GDPR is in force, failing to protect the data of EU citizens could result not only in lost reputation and accompanying revenue, but also in hefty fines totaling more than some information security budgets.
This brings into sharp focus the need to make the best use of the resources we have while ensuring that we invest in the strategies that provide us the best return. Striking a balance between technology and personnel allows us to leverage each one in a coordinated effort that makes each one a force multiplier for the other.
One of the true pleasures I get here at DFLabs is speaking to our customers, listening to their pain points and discussing how they are dealing with them both on a strategic and tactical level. It never ceases to amaze me how creative the solutions are and I’ve been blown away more than once by some truly outside of the box thinking on their part.
ESG Research recently published a whitepaper entitled “Next Generation Cyber Security Analytics and Operations Survey”, in which one of the (many) takeaways is that the top 5 challenges for security analytics and operations consist of:
- Total cost of operations
- Volume of alerts doesn’t allow time for strategy and process improvement
- Time to remediate incidents
- Lack of tools and processes to operationalize threat intelligence
- Lack of staff and/or skill set to properly address each task associated with an alert
These 5 pain points come as no surprise and while there is certainly no “silver bullet” there are some steps we can take to lessen the severity and improve our cyber incident response position significantly.
Total Cost of Operations
Addressing the total cost of operations can be the biggest factor in building a solid security analytics and operations capability. The key here is to leverage the resources you currently possess to their maximum potential, be it personnel, processes or technological solutions. Automation and incident orchestration allow the blending of human-to-machine or machine-to-machine activities in a real-time incident response. This not only makes the best use of existing resources, but also provides you with the much-needed insight to determine where your funds are best spent going forward.
Volume of alerts doesn’t allow time for strategy and process improvement
In the whitepaper entitled “Automation as a Force Multiplier in Cyber Incident Response” I address the alert fatigue phenomenon and discuss ways to address it within your organization. The strategy discussed, including automatically addressing lesser priority or “nuisance” alerts, will provide your operations team with additional time for strategizing and process evaluation.
Time to Remediate Incidents
We are certainly familiar with the term dwell time as it applies to InfoSec. One of the 5 focus areas outlined in Joshua Douglas’ paper entitled “Cyber Dwell Time and Lateral Movement” is granulated visibility and correlated intelligence. This requires a centralized orchestration platform for incident review and processing that provides not only automated response, but also the ability to leverage intelligence feeds to orchestrate that response. Given this capability, that single pane of glass now becomes a fully functional orchestration and automation platform. Now we can see correlated data across multiple systems and incidents, giving us the capability to locate, contain and remediate incidents faster than we thought possible and to reduce dwell time significantly.
Lack of tools and processes to operationalize threat intelligence
The ability to integrate threat intelligence feeds into existing incidents to enrich the data, or alternatively to create incidents based on threat intelligence to proactively seek out these threats, is integral to your security analytics and operations capabilities. This could be a centralized mechanism in your strategic response and an integral part of your orchestration and automation platform. The ability to coordinate this activity is referred to as Supervised Active Intelligence (SAI)™ and provides the ability to scale the response using the most appropriate methods based on fact-based and intelligence-driven data. This coordination should enhance your existing infrastructure, making use of your current (and future) security tools.
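Both directions described here, enriching an existing incident and opening new incidents from intelligence, can be illustrated with a short Python sketch. The feed contents, field names and confidence threshold are assumptions for the example, and the substring matching is deliberately naive.

```python
# Hypothetical indicator feed: observable -> context supplied by the intelligence source.
indicators = {
    "203.0.113.7": {"threat": "botnet C2", "confidence": 90},
    "bad-domain.example": {"threat": "phishing", "confidence": 70},
}


def enrich(incident, indicators):
    """Return a copy of the incident with any matching indicator context attached."""
    matches = {obs: indicators[obs]
               for obs in incident["observables"] if obs in indicators}
    return {**incident, "intel": matches}


def incidents_from_intel(indicators, logs, min_confidence=80):
    """Open an incident for each log line that mentions a high-confidence indicator.

    Plain substring matching is used for brevity; a real implementation would
    parse the log fields to avoid partial matches.
    """
    return [
        {"source": "threat-intel", "observables": [obs], "evidence": line}
        for line in logs
        for obs, ctx in indicators.items()
        if ctx["confidence"] >= min_confidence and obs in line
    ]


incident = {"id": "INC-9", "observables": ["203.0.113.7", "10.0.0.4"]}
logs = ["conn 10.0.0.8 -> 203.0.113.7 port 443", "dns query bad-domain.example"]

print(enrich(incident, indicators)["intel"])
print(incidents_from_intel(indicators, logs))
```

The existing incident picks up the botnet context for the one observable that appears in the feed. The proactive pass opens a single new incident, because the phishing indicator falls below the assumed confidence threshold.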
Lack of staff and/or skillset to properly address each task associated with an alert
Of all the pain points in security analytics and operations, this is the one I hear about most frequently. The ability to leverage the knowledge veterans possess to help grow less experienced team members is an age-old issue. Fortunately, this may be the easiest to solve given the capabilities and amount of data we have available and the process by which we can communicate these practices. Orchestration and automation platforms must include not only a Knowledge Base capable of educating new team members on the latest IR techniques, but also incident workflows (commonly called “Playbooks”) that provide incident responders on their first day with the same structured response used by the organization’s veterans. This workflow doesn’t require the veteran to be present, as the tactics, techniques and procedures have already been laid out to guide less experienced employees.
We’ve seen that there are some significant pain points when developing a structured security analytics and operations capability. However, I hope you’ve also seen that each of those points can be addressed via orchestration and automation directed toward prioritizing the improvement of your existing resources, with an eye toward the future.
“Noise” is a prevalent term in the cyber security industry. DFLabs consistently receives feedback from vendor partners and clients that one of the major issues they face daily is the ability to sift through noise in order to understand and differentiate an actual critical problem from a wild goose chase.
Noise is the vast amount of information passed from security products that can have little or no meaning to the person receiving it. Typically, many products are not tuned or adapted for their environment and therefore present more information than is needed or required.
Noise is a problem for all of us in the security industry, as there are meanings within these messages that are many times simply ignored or passed over for higher priorities. This can happen, for example, when policies and procedures are incorrectly identified or adapted, or when a product is not properly aligned within the network topology.
There is no one security product that can deal with every attack vector that businesses experience today. What’s more disturbing about this paradigm is that the products do not talk to each other natively, yet all these products have intelligence data that can overlay to enrich security and incident response teams.
Cyber incident investigative teams spend a vast number of hours doing simple administration, a burden that can be relieved by introducing an effective case management system. Given the sheer volume we see from SIEM products on a day-to-day basis, we can execute all of the human-to-machine actions and follow best practice per type of incident and company guidelines through automated playbooks.
Rethinking what information is being presented and how we deal with it is the biggest challenge. There are several ways to manage this:
• Fully automate the noise-worthy tasks. If these are consistently coming into your Security Operations Center (SOC), causing you to spend more time on administration than investigation, it may be prudent to handle the tasks in this manner.
• Semi-automate tasks to give your SOC teams more control over how to deal with huge numbers. Automating 95% of the task and leaving only the final sign-off for a manual review can heavily reduce time if your organisation is against completely automating the process.
• Leverage all your existing products to provide better insight into the incident. For example, leverage an existing Active Directory to lock out or suspend a user account if the user logs in outside of normal business hours. Additionally, it’s possible to sandbox and snapshot that machine to understand what is happening. A key consideration here is to avoid disrupting work at every opportunity. It really is a balancing act; however, depending on their privilege level, you may want to act faster for some users than others.
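The last example, reacting to off-hours logins while acting faster for privileged users, can be sketched as a small decision function in Python. The business hours, the action names and the rule itself are assumptions for illustration. The function only returns the action to take, and the actual account suspension or snapshot would be carried out by whatever directory and endpoint tooling you already have.

```python
from datetime import datetime, time

BUSINESS_START, BUSINESS_END = time(8, 0), time(18, 0)  # assumed working hours


def login_response(login_at: datetime, privileged: bool) -> str:
    """Decide how to react to a login, acting faster for privileged accounts."""
    in_hours = (login_at.weekday() < 5  # Monday to Friday
                and BUSINESS_START <= login_at.time() < BUSINESS_END)
    if in_hours:
        return "allow"
    if privileged:
        # High-privilege accounts: contain first, investigate afterwards.
        return "suspend_account_and_snapshot_host"
    # Regular users: avoid disrupting work, leave the final call to an analyst.
    return "flag_for_analyst_sign_off"


print(login_response(datetime(2017, 3, 6, 10, 30), privileged=False))  # Monday morning
print(login_response(datetime(2017, 3, 6, 23, 15), privileged=True))   # Monday late night
print(login_response(datetime(2017, 3, 5, 14, 0), privileged=False))   # Sunday afternoon
```

The regular user's off-hours login is handled in the semi-automated style described above: the detection and triage are automatic, and only the final sign-off stays manual.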
In 2017, the readiness and capability to respond to a variety of cyber incidents will continue to be at the top of every C-level agenda.
By leveraging the orchestration and automation capabilities afforded by IncMan™, stakeholders can gain 360-degree visibility during each stage of the incident response life cycle. This not only provides consistency across investigations for personnel, but also encourages the implementation of Supervised Active Intelligence™ across the entire incident response spectrum.
At DFLabs we showcase our capacity to reduce investigative time and incident dwell time, all while increasing incident handling consistency and reducing liability. Arming your SOC teams with information prior to the start of their incident investigation will help to drive focus purely on the incidents that need attention rather than the noise.
If you’re interested in seeing how we can work together to grow your incident response capabilities, visit us at https://www.DFLabs.com and schedule a demonstration of how we can utilize what you already have and make it better.