Visualizing Data Breach Patterns
Data sharing presents a huge challenge to the security industry. When organizations are compromised, the common response is to switch to self-preservation mode. The full details of breaches are rarely disclosed, which limits collective intelligence and arguably makes life easier for attackers.
Happily, there are several projects working to fix this.
VERIS (the Vocabulary for Event Recording and Incident Sharing) aims to provide a common taxonomy for organizations to share information about their breaches. By helping organizations exchange war stories, the project hopes to facilitate co-operation and improve risk management.
Alongside this project is the VERIS Community Database: a project to collate and disseminate information about all publicly disclosed data breaches. Excitingly for data visualization enthusiasts like us, its data is openly available on GitHub.
Let’s take a look!
The data model
The VERIS team has designed a schema that helps organizations record breaches in a ‘structured and repeatable’ way. It uses the A4 model to describe and classify incidents by:
- Actor – i.e. who performed the attack?
- Action – i.e. what was the attack vector?
- Asset – i.e. who or what was the attack victim?
- Attribute – i.e. what was the outcome / impact of the attack?
For each of the 5500+ attacks listed there are more than 150 data points, so we need to design a visual model that will enable us to explore the data set and answer some key questions.
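To make the four A's concrete, here is a minimal Python sketch that summarizes one incident record along those axes. The field names (`actor`, `action`, `asset.assets`, `attribute`) follow our reading of the VERIS JSON schema, and the sample record is invented for illustration, so treat both as assumptions rather than a definitive parser.

```python
from typing import Any


def summarize_a4(incident: dict[str, Any]) -> dict[str, list[str]]:
    """Return the top-level categories recorded under each of the four A's."""
    assets = incident.get("asset", {}).get("assets", [])
    return {
        "actor": sorted(incident.get("actor", {})),
        "action": sorted(incident.get("action", {})),
        "asset": sorted({a.get("variety", "Unknown") for a in assets}),
        "attribute": sorted(incident.get("attribute", {})),
    }


# Invented sample record, shaped like a VERIS incident (assumption).
sample = {
    "actor": {"external": {"variety": ["Activist"]}},
    "action": {"hacking": {"variety": ["SQLi"], "vector": ["Web application"]}},
    "asset": {"assets": [{"variety": "S - Web application"}]},
    "attribute": {"confidentiality": {"data_disclosure": "Yes"}},
}

summary = summarize_a4(sample)
print(summary)
```

A summary like this is a quick way to check which of the 150+ data points are actually populated before deciding what to draw.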
As actors (attackers) and actions (vectors) are grouped into categories, we will model our graph in the following way:

Let’s take the data (handily provided in JSON format) and load it in a KeyLines chart.
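Before anything reaches the chart, the JSON records have to be reshaped into nodes and links. The sketch below shows one way to do that in Python, assuming actor groups and victims become nodes and each action category becomes a link between them. The output is a plain list of dictionaries rather than the KeyLines chart format, and the field names (`victim.victim_id`, `actor.*.variety`) and sample records are our assumptions.

```python
from typing import Any


def build_graph(incidents: list[dict[str, Any]]) -> dict[str, list[dict[str, str]]]:
    """Turn incidents into actor-group nodes, victim nodes and action links."""
    nodes: dict[str, dict[str, str]] = {}
    links: list[dict[str, str]] = []
    for n, incident in enumerate(incidents):
        victim = incident.get("victim", {}).get("victim_id", "Unknown")
        victim_id = f"victim:{victim}"
        nodes[victim_id] = {"id": victim_id, "kind": "victim", "label": victim}
        # One node per actor variety (e.g. "Activist"), shared across incidents.
        for details in incident.get("actor", {}).values():
            for variety in details.get("variety", ["Unknown"]):
                actor_id = f"actor:{variety}"
                nodes[actor_id] = {"id": actor_id, "kind": "actor", "label": variety}
                # One link per action category, so links can be colored by category.
                for category in incident.get("action", {}):
                    links.append({
                        "id": f"link:{n}:{variety}:{category}",
                        "source": actor_id,
                        "target": victim_id,
                        "category": category,
                    })
    return {"nodes": list(nodes.values()), "links": links}


# Invented records for illustration only.
incidents = [
    {"victim": {"victim_id": "Acme Corp"},
     "actor": {"external": {"variety": ["Activist"]}},
     "action": {"hacking": {}}},
    {"victim": {"victim_id": "Example Bank"},
     "actor": {"external": {"variety": ["Activist"]}},
     "action": {"hacking": {}, "malware": {}}},
]

graph = build_graph(incidents)
print(len(graph["nodes"]), "nodes,", len(graph["links"]), "links")
```

Keying actor nodes by variety rather than by incident is what lets many breaches converge on a handful of attacker groups.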
Step 1: Data overview

This is a fairly large dataset, but KeyLines’ HTML5 Canvas renderer is able to handle it perfectly well, without the need to switch to the faster WebGL option. We have color coded the links (attack vectors) by the categories supplied in the dataset, which helps us pick out some early patterns:
- The large red group shows Activist Groups favor ‘Advanced Technology’ including remote access, command shell and VPNs.
- The large mostly blue/green cluster indicates breaches originating from end-users or employees. These are more likely to be caused by carelessness, physical access or basic technology, like desktop sharing or document theft.
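Patterns like these can also be checked numerically. As a rough sketch, the Python below counts how often each actor variety appears alongside each attack vector. The field names (`actor.*.variety`, `action.*.vector`) are our assumption about the record layout, and the sample incidents are invented.

```python
from collections import Counter
from typing import Any


def vectors_by_actor(incidents: list[dict[str, Any]]) -> Counter:
    """Count (actor variety, attack vector) pairs across all incidents."""
    counts: Counter = Counter()
    for incident in incidents:
        # Collect every vector listed under any action category.
        vectors = [
            vector
            for details in incident.get("action", {}).values()
            for vector in details.get("vector", [])
        ]
        for details in incident.get("actor", {}).values():
            for variety in details.get("variety", []):
                for vector in vectors:
                    counts[(variety, vector)] += 1
    return counts


# Invented records for illustration only.
incidents = [
    {"actor": {"external": {"variety": ["Activist"]}},
     "action": {"hacking": {"vector": ["VPN", "Command shell"]}}},
    {"actor": {"external": {"variety": ["Activist"]}},
     "action": {"hacking": {"vector": ["VPN"]}}},
    {"actor": {"internal": {"variety": ["End-user"]}},
     "action": {"error": {"vector": ["Carelessness"]}}},
]

counts = vectors_by_actor(incidents)
for (variety, vector), total in counts.most_common():
    print(variety, vector, total)
```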
Step 2: Temporal patterns
One of the data points collated by VERIS is a date stamp of when the breach was reported. Let’s add this to our chart with the KeyLines time bar:

The overall picture here is quite lumpy, with peaks in February ’13 and ’14. But using the filters, let’s take a sub-network view of how different vectors change through the months:
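The same monthly view can be approximated outside the chart. This is a minimal Python sketch that buckets incidents by month and optionally keeps only one action category. It reads `timeline.incident.year` and `timeline.incident.month`, which is our assumption about where the date lives and may not be the same date stamp used by the time bar; the sample incidents are invented.

```python
from collections import Counter
from typing import Any, Optional


def breaches_per_month(
    incidents: list[dict[str, Any]], category: Optional[str] = None
) -> Counter:
    """Count incidents per 'YYYY-MM', optionally for a single action category."""
    counts: Counter = Counter()
    for incident in incidents:
        if category is not None and category not in incident.get("action", {}):
            continue
        when = incident.get("timeline", {}).get("incident", {})
        year, month = when.get("year"), when.get("month")
        if year is None or month is None:
            continue  # skip records without a usable date
        counts[f"{year:04d}-{month:02d}"] += 1
    return counts


# Invented records for illustration only.
incidents = [
    {"action": {"hacking": {}}, "timeline": {"incident": {"year": 2013, "month": 2}}},
    {"action": {"error": {}}, "timeline": {"incident": {"year": 2013, "month": 2}}},
    {"action": {"hacking": {}}, "timeline": {"incident": {"year": 2014, "month": 2}}},
    {"action": {"hacking": {}}, "timeline": {"incident": {"year": 2014}}},
]

all_months = breaches_per_month(incidents)
hacking_months = breaches_per_month(incidents, category="hacking")
print(sorted(all_months.items()))
print(sorted(hacking_months.items()))
```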



Attack vectors by attacker group
The advantage of a graph-based visualization is that we can see our data in its full connected environment. Using simple filters, we can find some other trends… For example:




Find the unlucky victims
We’ve looked at the attackers and the attack vectors. The third entity type in this data is the victim. By sizing victim nodes by degree (number of connections to other nodes in the network) we can get an idea of the most frequently breached organizations:
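Degree is simple to compute by hand, which makes it easy to sanity-check what the chart shows. The sketch below counts the links touching each node and ranks the victims. The `source`/`target` keys and the `victim:` id prefix are our own illustrative conventions, not a KeyLines format, and the links are invented.

```python
from collections import Counter


def degree_by_node(links: list[dict[str, str]]) -> Counter:
    """Count how many links touch each node id."""
    degree: Counter = Counter()
    for link in links:
        degree[link["source"]] += 1
        degree[link["target"]] += 1
    return degree


def rank_victims(degree: Counter) -> list[tuple[str, int]]:
    """Return victim nodes ordered by degree, highest first, ties by id."""
    victims = [(node, d) for node, d in degree.items() if node.startswith("victim:")]
    return sorted(victims, key=lambda item: (-item[1], item[0]))


# Invented links for illustration only.
links = [
    {"source": "actor:Activist", "target": "victim:Acme Corp"},
    {"source": "actor:Organized crime", "target": "victim:Acme Corp"},
    {"source": "actor:Activist", "target": "victim:Example Bank"},
]

degree = degree_by_node(links)
ranking = rank_victims(degree)
print(ranking)
```

The resulting degree can then be mapped to whatever node size scale the chart uses.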

Find threats in your own cyber data
KeyLines makes it possible for us to visualize and explore thousands of attack records in a single chart. By configuring the data model to show different aspects of the VERIS schema we can find new trends – a process made faster and more interactive with the help of layouts, filters, social network analysis and the KeyLines time bar.
Want to explore the data for yourself? You can find the demo application in the KeyLines SDK (Demos > Data Breaches).
The post Visualizing Cyber Security Graphs: Data Breaches appeared first on Cambridge Intelligence.