Making the Internet Safer With The Honeynet Project

We attended NANOG 91 and announced our new collaboration with The Honeynet Project, a non-profit dedicated to making the internet safer.

Tony Falco

Published:

Jun 21, 2024

4 minute read

Last week at NANOG 91 in Kansas City, Hydrolix debuted a collaboration with The Honeynet Project that brings together the respective strengths of the two organizations: Hydrolix’s streaming data lake software and a vast, constantly evolving trove of honeypot data gathered by The Honeynet Project from around the world.

Hydrolix booth
Representing Hydrolix at NANOG 91

According to Krassi Tzvetanov, an experienced engineer and Honeynet Project volunteer who works at Hydrolix while pursuing his PhD at Purdue University, the goal of the collaboration is to “make the honey pot data vital to tracking and stopping cyber attacks much more easily available to the worldwide network of volunteer researchers at The Honeynet Project.”

Meet The Honeynet Project

For the past 25 years, The Honeynet Project, a 501(c)(3) non-profit, has pursued a mission to keep the internet safe by investigating the latest cybersecurity attacks and developing open source security tools. Recent tools include Cuckoo, Capture-HPC, Glastopf, HoneyC, Honeyd, and Honeywall. The organization is made up of volunteers from all over the globe, and the considerable volume of data generated by honeypots and associated research activities is a major obstacle to collaborating effectively.

Tzvetanov summed up the challenges:

“This data is really valuable, but like any log data, it is expensive to work with and to retain. We needed a way for lots of people to work simultaneously on data without interfering with each other. And we needed to be able to work with log data older than that which is possible with the data retention window, which is thirty days.”

“And we need to do it on a shoestring budget,” Tzvetanov added.

To address this critical issue, Hydrolix donated its streaming data lake software to The Honeynet Project. The donation includes paying for and maintaining the full cloud deployment of a cluster dedicated to hosting Honeynet Project data.

Grafana dashboard shows top alerts, top exploited CVEs, top source IPs, top events by country, and number of events per ASN.
A dashboard visualizing Suricata stream excessive retransmission activity that combines Hydrolix, Grafana, and data from The Honeynet Project.
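The "top N" panels described above boil down to grouped counts over alert events. As a rough illustration only, and not the actual Hydrolix or Grafana queries behind the dashboard, the sketch below counts alerts per source IP and per signature. It assumes events shaped like Suricata's EVE JSON output (`event_type`, `src_ip`, `alert.signature`); the sample records, addresses, and the `top_n` helper are made up for the example.

```python
import json
from collections import Counter

# Made-up sample events in a Suricata EVE-like shape (not real Honeynet data).
# The IP addresses come from ranges reserved for documentation.
SAMPLE_EVENTS = """\
{"event_type": "alert", "src_ip": "192.0.2.10", "alert": {"signature": "SURICATA STREAM excessive retransmissions"}}
{"event_type": "alert", "src_ip": "192.0.2.10", "alert": {"signature": "SURICATA STREAM excessive retransmissions"}}
{"event_type": "alert", "src_ip": "192.0.2.10", "alert": {"signature": "hypothetical signature B"}}
{"event_type": "alert", "src_ip": "198.51.100.7", "alert": {"signature": "SURICATA STREAM excessive retransmissions"}}
{"event_type": "flow", "src_ip": "203.0.113.5"}
"""


def top_n(events, key, n=3):
    """Return the n most common values of key(event) as (value, count) pairs."""
    return Counter(key(event) for event in events).most_common(n)


# Parse one JSON object per line, then keep only alert events.
events = [json.loads(line) for line in SAMPLE_EVENTS.splitlines() if line.strip()]
alerts = [e for e in events if e.get("event_type") == "alert"]

top_sources = top_n(alerts, lambda e: e["src_ip"])
top_signatures = top_n(alerts, lambda e: e["alert"]["signature"])

print("Top source IPs:", top_sources)
print("Top alert signatures:", top_signatures)
```

In a real deployment this kind of aggregation would run as a query inside the datastore rather than in client-side Python; the point here is only the shape of the computation.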

With Hydrolix, The Honeynet Project will be able to increase data retention from thirty days to two years. By providing long-term, readily available access to data, we aim to address some of the fundamental challenges that The Honeynet Project faces.

Transforming the Economics of Log Data

Our mission at Hydrolix is to remove the economic and technical barriers to keeping and analyzing log data—and to make more log data use cases economically viable. Put another way, far too much valuable data is simply thrown away because of cost. And the data that is retained is often moved to cold storage or becomes otherwise inaccessible, making it difficult to work with.

Operators working with log data at scale typically face other challenges as well. These include queries that time out, frustrating delays waiting for data to be restored from cold storage (if it’s ever restored at all), and the anxiety of being forced to throw away data that might contain information vital to their business, such as evidence of a compromise or insights into how to build better products.

Hydrolix addresses these challenges with its unique architecture, which includes:

  • Decoupled storage from compute, with cost-effective S3-compatible object stores and data kept securely in your own virtual private cloud (VPC)
  • Advanced compression techniques that reduce your data footprint by 20-50x
  • Fully-indexed and columnar datastore optimized and partitioned for timestamped log data
  • High volume streaming ingest and data transformation prior to storage

The benefit to the end user? Up to 95% reduction in storage costs, hot tier retention windows that allow you to query your data for years, and sub-second query performance on trillion-row data sets. 
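One way to read the compression figures above is as simple arithmetic: a compression ratio of r leaves 1/r of the raw footprint. The sketch below works that through. It is an illustration, not Hydrolix's pricing or sizing model; the 100 GB/day ingest rate is a hypothetical number chosen for the example, and real storage costs depend on more than compression alone.

```python
def footprint_reduction(compression_ratio: float) -> float:
    """Fraction of raw storage saved at a given compression ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 - 1.0 / compression_ratio


def retained_gb(daily_raw_gb: float, retention_days: int,
                compression_ratio: float = 1.0) -> float:
    """Stored volume in GB for a retention window, after compression."""
    return daily_raw_gb * retention_days / compression_ratio


# The 20-50x range quoted in the list above.
for ratio in (20, 50):
    print(f"{ratio}x compression -> {footprint_reduction(ratio):.0%} smaller footprint")

# Hypothetical ingest rate, used only to compare retention windows.
DAILY_RAW_GB = 100
thirty_days_raw = retained_gb(DAILY_RAW_GB, 30)
two_years_compressed = retained_gb(DAILY_RAW_GB, 730, compression_ratio=20)

print(f"30 days uncompressed: {thirty_days_raw:.0f} GB")
print(f"2 years at 20x:       {two_years_compressed:.0f} GB")
```

Under these assumptions, two years of data at 20x compression occupies roughly the same space as thirty days of uncompressed data, which is why compression at that level changes what retention windows are affordable.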

There couldn’t be a better proof point than The Honeynet Project, and we get to support a group of people who are donating their time to make the internet safer.

Postscript: NANOG 91 and going forward

Hats off to the organizers of NANOG 91 for this event. It was a fantastic venue, not just to debut our collaboration with The Honeynet Project but also to talk with network capacity planners, CISOs, and network engineers about their concerns and priorities. A common concern is cost reduction, as teams are seeing log volumes grow quickly. The era in which professionals in cybersecurity and beyond simply accept that data costs always go up, and that log-intensive applications like observability should account for more than 20% of an organization’s cloud budget, is coming to an end.

NANOG talk shows crowded hall of attendees

NANOG 92 in Toronto

Part of our plan is to take our collaboration with The Honeynet Project on the road. Look for us at events like IBC in Amsterdam and, of course, NANOG 92 in Toronto in October. Based on our experience in Kansas City, we wouldn’t miss it for anything.

