Automated Responder Knowledge (ARK) in Action

We released our Machine Learning Engine PRISM in our most recent 4.2 release. The first capability that we developed from PRISM is our Automated Responder Knowledge (ARK). This capability will change the way incident responders and SOC analysts respond to incidents, and how they share and transfer their entire knowledge to the rest of the team. The key to this capability is that it learns from your own analyst’s responses to historical incidents to guide the response to new ones.

We are not re-inventing the wheel with this feature. SOC and Incident Response teams have been doing this the old-fashioned way for a long time – through 6-12 months training. What we’re doing is providing a GPS and Satellite Navigation, guiding the wheel and giving you different paths to choose from according to the terrain you are in.

We do this by analyzing incidents and their associated attributes and observables, to work out how closely they are related. Then we can suggest actions and playbooks based on your organizations’ historical responses to similar threats and incidents.

Automated Responder Knowledge (ARK)

Using Automated Responder Knowledge (ARK) in IncMan

Step 1: Not really a step – as it’s done automatically by Automated Responder Knowledge (ARK), but this occurs in the background for every incoming incident. Every Incident possesses a feature space1 that contains all the information related to it, composed of every attribute, associated observable and attached evidence. ARK analyses the feature spaces associated with every incident ever resolved. When a new incident is opened, it is scored and ranked and then compared by ARK to the historical model to identify related incidents or actions based on similar and shared attributes. The weighting of the ranking can be customized by analysts.

 

Automated Responder Knowledge (ARK)

 

Step 2: Open the incident, selecting the applicable incident type. To save time, you can create an incident template to prepopulate some of the contexts automatically in future.

 

Automated Responder Knowledge (ARK)

 

Step 3: Select Playbooks, and PRISM.

In the next screen, you will see a variety of suggested related actions and related incidents based on the feature space that your incident type is matched with. The slider at the top is used to determine the weighting in ranking for actions that are suggested. For example, if I move the slider to the left, the entire feature space actions appear, then if I move the slider to the far-right only a few actions appear from highly ranked incidents.

 

Automated Responder Knowledge (ARK)

 

 

Automated Responder Knowledge (ARK)

 

Step 4: Determine which automation and actions you want to use from the suggestions. After saving, you will be presented with options such as Auto-Commit, Auto-Run, Skip Enrichment, Containment, Notification or Custom Actions. You have the ability to select only the actions you want to automate. If you are concerned about running containment automatically, for example, you just deselect those options.

 

Automated Responder Knowledge (ARK)

 

Step 5: The automated actions are executed, resolving the incident, based on prior machine-learning generated automated responder knowledge.

ARK is designed to facilitate knowledge transfer from senior to junior analysts and to speed up incident response by applying machine learning to automate the knowledge gathering and analysis.

Automate or Die Without Breaking Your Internet

Threat actors are increasingly adopting security automation and machine learning – security teams will have to follow suit, or risk falling behind.

Many organizations still conduct incident response based on manual processes. Many playbooks that we have seen in our customer base, for example, hand off to other stakeholders within the organization to wait for additional forensic data, and to execute remediation and containment actions.

While this may seem like good practice to avoid inadvertent negative consequences such as accidentally shutting down critical systems or locking out innocent users, it also means that many attacks are not contained in a sufficiently short time to avoid the worst of their consequences.

Manual Processes Cannot Compete with Automation

Reports are mounting about threat actors and hackers leveraging security automation and machine learning to increase the scale and volume, as well as the velocity of attacks. The implications for organizations should be cause for concern, considering that we have been challenged to effectively respond to less sophisticated attacks in the past.

Ransomware is a case in point. In its most simple form, a ransomware attack does not require the full cyber kill chain to be successful. A user receives an email attachment, executes it, the data is encrypted and the damage is done. At that point, incident response turns into disaster recovery.

Automated attacks have been with us for a long time. Worms and Autorooters have been around since the beginning of hacking, with WannaCry and its worming capability only the most recent example. But these have only automated some aspects of the attack, still permitting timely and successful threat containment further along the kill chain.

Threat actors have also leveraged automated command and control infrastructure for many years. DDoS Zombie Botnets, for example, are almost fully automated. To sum it up, the bad guys have automated, the defenders have not. Manual processes cannot compete with automation.

With the increase in the adoption of automation and machine learning by cyber criminals, enterprises will find that they will have to automate as well. The future mantra will be “Automate or Die”.

Making the Cure More Palatable Than the Disease

But automating containment actions is still a challenging topic. Here at DFLabs we still encounter a lot of resistance to the idea by our customers. Security teams understand that the escalating sophistication and velocity of cyber-attacks means that they must become more agile to rapidly respond to cyber incidents. But the risk of detrimentally impacting operations means that they are reluctant to do so, and rarely have the political backing and clout even if they want to.

Security teams will find themselves having to rationalize the automation of incident response to other stakeholders in their organization more and more in the future. This will require being able to build a business case to justify the risk of automating containment. They will have to explain why the cure is not worse than the disease.

There are three questions that are decisive in evaluating whether to automate containment actions:

  1. How reliable are the detection and identification?
  2. What is the potential detrimental impact if the automation goes wrong?
  3. What is the potential risk if this is not automated?

Our approach at DFLabs to this is to carefully evaluate what to automate, and how to do this safely. We support organizations in selectively applying automation through our R3 Rapid Response Runbooks. Incident Responders can apply dual-mode actions that combine manual, semi-automated and fully automated steps to provide granular control over what is automated. R3 Runbooks can also include conditional statements that apply full automation when it is safe to do so but request that a human vet’s the decision in critical environments or where it may have a detrimental impact on operational integrity.

We just released a whitepaper, “Automate or Die, without Dying”, by our Vice President of Product Evangelism and former Gartner analyst, Oliver Rochford, that discusses best practices to safely approach automation. Download the whitepaper here for an in-depth discussion on this controversial and challenging, but important topic.

Security Event Automation and Orchestration in the Age of Ransomware

We have recently experienced a devastating wave of ransomware attacks such as Wannacry or ‘WannCrypt’ which spread to more than 200 countries across the globe. While Russia was hit hard, Spain and the United Kingdom saw significant damage to their National Health Services. Hospitals were forced to unplug their computers to stop the malware from spreading even further. This is just one of the security threats posed by special malware that encrypts computer files, network file shares, and even databases thereby preventing user access (Green 18-19). It happens in spite of heavy investments in a wide array of security automation and orchestration solutions and staff required to triage, investigate and resolve threats.

The primary problem is that organizations seem to be losing the battle against cyber attackers (Radichel, 2). The security administrators are overburdened and compelled to manually perform time-consuming and repetitive tasks to identify, track, and resolve security concerns across various security platforms. Notwithstanding the time and effort, it is difficult to analyze and adequately prioritize the security events and alerts necessary to protect their networks. Still, the inadequate visibility into the present activities of the security teams, metrics and performance leave security managers struggling to justify additional resources. It has long been accepted that the organizational efficiency depends heavily on the ability of the security system to reduce false positives so that analysts can focus on the critical events along with indicators of compromise.

Security event automation and orchestration ensures that an organization detects a compromise in real time. A rapid incident response ensures a quick containment of the threat. Through the automation of common investigation enrichment and response actions, as well as the use of a centralized workflow for performing incident response, it is possible to minimize response times and thus make the organization more secure. Security events automation and orchestration expedites workflows across the threat life-cycle in various phases. However, for the security team to deploy security automation and orchestration of event-driven security, there must be access to data concerning events occurring in the environment that warrant a response. To effectively employ event-driven security, automation should be embedded into processes that could introduce new threats to the environment (Goutam, Kamal and Ingle, 431). The approach requires that there be a way to audit the environment securely and trigger event based on data patterns that indicate security threat or intrusion. Of particular importance, continuous fine tuning of processes is required to make certain the events automation and orchestration being deployed is not merely automating the process, but providing long-term value in the form of machine learning and automated application of incident response workflows that have previously resolved incidents successfully.

At a time of increased cybersecurity threats, a structured approach can expedite the entire response management process from event notification to remediation and closure through automated orchestration and workflow. An automatic gathering of key information, the building of decision cases and the execution of critical actions to prevent and/or remediate cyber threats based on logical incident response processes are enabled. With security orchestration and event automation, various benefits are realized such as cost effectiveness, mitigation of security incidents and improved speed and effectiveness of the response. Hence, security event automation and orchestration is the real deal in containing security threats before real damage takes place.