Boost Splunk Performance and Reduce Costs 10x With Hydrolix

With Hydrolix’s Splunk integration, you can query Hydrolix data in Splunk, reduce observability costs 10x, and retain all your data long-term.

Franz Knupfer

Published: Jun 06, 2024

7 minute read

Splunk has exceptional tooling and a UI that many teams love, but it is also expensive, especially at scale. With the average volume of log data going up 500% over the past three years, many enterprises are faced with a dilemma—pay high (and still rising) costs, discard data, or start over with a new platform.

If you’re dealing with data at terabyte scale, you now have another option. With Hydrolix’s new Splunk integration, you can enhance your Splunk cluster by ingesting data into Hydrolix and querying it in Splunk, dramatically reducing costs while also improving performance.

Benefits of Integrating Hydrolix and Splunk

  • Lower total cost of ownership (TCO): Hydrolix is a streaming data lake that makes log-intensive use cases much more cost-effective. Per GB costs for Hydrolix are at least 7x lower than Splunk’s, based on Splunk Cloud’s listing in AWS Marketplace.
  • Real-time streaming ingest and transformation: Hydrolix streams your log data in real time at terabyte scale. This is in contrast to Splunk, which no longer offers real-time stream processing. Hydrolix also transforms data in real time before storage so you can normalize, enrich, and obfuscate data.
  • Long-term data retention (12 months or more): Hydrolix uses decoupled, commodity object storage, and runs in your virtual private cloud (VPC). With high-density compression that reduces your data footprint by 20x-50x, all your data stays “hot” for long-term querying at dramatically reduced storage costs.
  • Sub-second query latency even on trillion row datasets: Hydrolix uses massive parallelism, partition pruning, micro-indexing, extreme predicate pushdown, and other advanced features to provide low-latency queries no matter how big the dataset.
  • Combine multiple data sources in one table: Query and compare data from multiple sources without complex, compute-intensive JOIN statements.
  • World-class customer success team: Whether you need help for a trial, deploying a Hydrolix cluster, or optimizing queries, Hydrolix’s customer success team is deeply technical and provides world-class support.

Setting Up Hydrolix’s Splunk Integration

The Splunk integration works by combining Splunk DB Connect with a lightweight, custom Hydrolix JDBC driver so you can query Hydrolix data directly from Splunk. You can read the setup instructions in the Splunk with DB Connector documentation.

If you have your Splunk and Hydrolix credentials handy, you can set up the integration in about thirty minutes. If you don’t have Hydrolix yet, you can set up a trial or demo to try out the integration.

The following image shows a SQL query in Splunk that retrieves data from Hydrolix.

You can also use Hydrolix data for dashboards and other visualizations. Hydrolix also offers summary tables that aggregate data for dashboards to reduce system load and query costs.
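To make the idea of a summary table concrete, here is a minimal Python sketch of pre-aggregation. It is not how Hydrolix implements summary tables. The record fields (`ts`, `status`), the sample events, and the one-minute bucket size are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def summarize_per_minute(events):
    """Roll raw events up into (minute, status) -> count rows."""
    summary = Counter()
    for event in events:
        ts = datetime.fromtimestamp(event["ts"], tz=timezone.utc)
        # Truncate the timestamp to the start of its minute.
        minute = ts.replace(second=0, microsecond=0)
        summary[(minute.isoformat(), event["status"])] += 1
    return summary


# Hypothetical raw log events (epoch seconds, HTTP status code).
raw_events = [
    {"ts": 1717632000, "status": 200},
    {"ts": 1717632010, "status": 200},
    {"ts": 1717632030, "status": 500},
    {"ts": 1717632065, "status": 200},
]

summary = summarize_per_minute(raw_events)
for (minute, status), count in sorted(summary.items()):
    print(minute, status, count)
```

A dashboard that reads the rolled-up rows touches three rows here instead of four raw events. The gap grows quickly as raw volume increases, which is where the reduction in system load comes from.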

Enhance Your Splunk Cluster with Hydrolix

If you’re ingesting at least a terabyte of log data every day, or would like to if it were more cost-effective, then Hydrolix is a perfect complement to Splunk.

Lower Total Cost of Ownership (TCO) by 10x (or More)

The TCO of Hydrolix is typically at least 4x lower than that of previous solutions, and the savings are typically larger when the previous solution is Splunk. You can use our pricing calculator to estimate license costs. Compared to the TCO of Splunk Cloud, the per GB cost of Hydrolix is anywhere from 7x to 20x lower. Note that the highest daily volume that Splunk lists is 100 GB/day, while Hydrolix starts at 1 TB/day.

  • At 100 GB/day, the TCO of Splunk Cloud is $6,667/month ($80,000/year) in the US, according to Splunk Cloud’s listing in AWS Marketplace. The per gigabyte cost is ~$2.19/GB/month.
  • At 2 TB/day with data in AWS, the TCO of Hydrolix is $12,654/month. This assumes 15x compression (typical compression rates are 20x-50x in Hydrolix) and one year of data retention. The per gigabyte cost is $.20/GB/month.

Based on these listed prices, the per GB TCO of Hydrolix is about 11x lower than that of Splunk Cloud.
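If you want to check the arithmetic yourself, the per GB figures can be reproduced by dividing each monthly TCO by the gigabytes ingested in an average month. The sketch below uses the prices quoted above. The average month length (365/12 days) and the decimal terabyte (1 TB = 1,000 GB) are assumptions.

```python
DAYS_PER_MONTH = 365 / 12  # assumption: average month length
GB_PER_TB = 1000           # assumption: decimal terabytes


def cost_per_gb(monthly_tco_usd, gb_per_day):
    """Monthly TCO divided by the GB ingested in an average month."""
    return monthly_tco_usd / (gb_per_day * DAYS_PER_MONTH)


splunk = cost_per_gb(6_667, 100)               # Splunk Cloud, 100 GB/day
hydrolix = cost_per_gb(12_654, 2 * GB_PER_TB)  # Hydrolix, 2 TB/day
ratio = splunk / hydrolix

print(f"Splunk Cloud: ${splunk:.2f} per GB")
print(f"Hydrolix:     ${hydrolix:.2f} per GB")
print(f"Ratio:        {ratio:.1f}x")
```

Under these assumptions the ratio lands between 10x and 11x. The exact figure shifts slightly with rounding and with whether a terabyte is counted as 1,000 or 1,024 GB.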

This isn’t quite an apples-to-apples comparison because Splunk’s per GB costs may be lower at higher scale, but 100 GB/day is the maximum tier offered on AWS Marketplace. Splunk isn’t designed for terabyte-scale use cases and data at that scale is prohibitively expensive in Splunk.

Hydrolix also becomes even more cost-effective at scale. At 10 TB/day, the cost per GB for Hydrolix is ~$0.11/GB/month, which is about 20x less than Splunk Cloud’s stated cost. Even at 1 TB/day, the lower limit for Hydrolix users, the per gigabyte cost is $0.32/GB/month, roughly 7x less than Splunk Cloud.

Real-Time Streaming Ingest and Transformation at Terabyte Scale

With Hydrolix, you can stream, transform, and store your incoming log data, with data ready to analyze in under a minute. Hydrolix can handle intensive traffic, with peaks of 11 million log lines per second for the biggest football game of the year. You can use Hydrolix’s HTTP Stream API or integrations with AWS Kinesis or Kafka.
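As a rough illustration of what streaming over HTTP can look like, the following Python sketch builds (but deliberately does not send) a JSON POST request. Treat every specific in it as an assumption to verify against the Hydrolix documentation: the hostname, the `/ingest/event` path, the `x-hdx-table` and `x-hdx-transform` header names, and the project, table, and transform names are placeholders, not confirmed by this post.

```python
import json
import urllib.request


def build_ingest_request(host, table, transform, events):
    """Build (but do not send) a JSON POST for a streaming ingest endpoint."""
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/ingest/event",  # assumed endpoint path
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-hdx-table": table,            # assumed header name
            "x-hdx-transform": transform,    # assumed header name
        },
    )


request = build_ingest_request(
    host="hydrolix.example.com",             # hypothetical hostname
    table="sample_project.sample_table",     # hypothetical project.table
    transform="sample_transform",            # hypothetical transform name
    events=[{"timestamp": "2024-06-06T00:00:00Z", "status": 200}],
)

print(request.get_method(), request.full_url)
# Sending is left out on purpose; urllib.request.urlopen(request) would POST it.
```

Batching several events into one request body, as the list argument allows here, is a common way to reduce per-request overhead at high volume.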

Many observability platforms struggle to handle peak events, and issues with dropped data and slow dashboards are common. Splunk itself discontinued its Data Stream Processor, which customers had described as too expensive and complicated, so there currently isn’t a native solution in Splunk for real-time streaming data.

Hydrolix handles real-time streaming ingest at any scale so you don’t have to add another third-party tool to the mix. All Hydrolix tables include transforms for each data source so you can enrich, standardize, obfuscate, and format your data before it’s stored.
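To show what enriching, standardizing, and obfuscating a record means in practice, here is a small Python sketch of a transform applied to one log record before storage. It illustrates the concept only and is not Hydrolix’s transform syntax. The field names and the sample record are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def transform(record):
    """Standardize, obfuscate, and enrich one raw log record."""
    ts = datetime.fromtimestamp(record["epoch"], tz=timezone.utc)
    status = int(record["status"])  # standardize: string -> integer
    return {
        # Standardize: epoch seconds -> ISO 8601 UTC timestamp.
        "timestamp": ts.isoformat(),
        # Obfuscate: store a truncated hash instead of the raw IP address.
        "client_ip_hash": hashlib.sha256(
            record["client_ip"].encode("utf-8")
        ).hexdigest()[:16],
        "status": status,
        # Enrich: add a derived field that queries can filter on directly.
        "is_error": status >= 500,
    }


raw = {"epoch": 1717632000, "client_ip": "203.0.113.7", "status": "503"}
clean = transform(raw)
print(clean)
```

Because the raw IP address never reaches storage, downstream tools only ever see the hashed value. A production pipeline would likely use a keyed hash or tokenization rather than a plain hash, since the IPv4 address space is small enough to brute-force.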

And because each Hydrolix subsystem (ingest, query, and storage) is independently scalable, you can scale ingest to use more resources during peak events and maintain peak efficiency. You can also scale back resources (even down to zero) when demand subsides to reduce costs.

Long-Term Data Retention (12 Months or More)

Hydrolix offers long-term data retention with sub-second access to your data whether it’s a minute or a year old. With many solutions (including Splunk), log data at scale is too expensive, so data is often quickly moved into tiered storage, where it’s more difficult to access, or discarded altogether.

Hydrolix does not use tiering or “cold” storage to reduce costs. Because Hydrolix maximizes the strengths of commodity S3-compatible object storage, there is no distinction between “hot” and “cold” data. All data is hot and available for immediate analysis. Hydrolix also uses high-density compression to reduce your data footprint by 20x-50x, reducing storage costs even further.
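For a sense of scale, the compression range above can be turned into a storage footprint. The sketch below assumes a hypothetical 1 TB/day of ingest retained for 365 days, and it treats the compression ratio as the only factor that affects stored size.

```python
def stored_tb(ingest_tb_per_day, retention_days, compression_ratio):
    """Compressed footprint of the retained data, in TB."""
    return ingest_tb_per_day * retention_days / compression_ratio


# Hypothetical workload: 1 TB/day kept for one year (365 TB of raw data).
for ratio in (20, 50):
    footprint = stored_tb(1, 365, ratio)
    print(f"{ratio}x compression: {footprint:.2f} TB stored")
```

At these ratios a year of data at 1 TB/day occupies roughly 7 TB to 18 TB of object storage rather than 365 TB, which is what makes keeping everything queryable affordable.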

Long-term, readily available data has a number of major advantages for observability, SIEM, and more, including:

  • Avoid dark data, where valuable insights about issues, vulnerabilities, and threats are no longer accessible because data has been moved into cold storage.
  • Find deeper trends in your data, such as how services are performing over longer periods of time.
  • Test threat hunting hypotheses without any data bottlenecks.

Sub-Second Query Latency Even on Trillion Row Datasets

Many data solutions struggle to provide efficient query performance at scale, and it’s not uncommon for queries of very large datasets to take a long time or even time out altogether. This isn’t just inconvenient—it’s a real problem if you need to quickly find and fix issues.

Hydrolix partitions data by timestamp to maximize query efficiency for log-based use cases. Hydrolix is a columnar datastore that uses advanced querying techniques like micro-indexing, partition pruning, extreme predicate pushdown, and massive parallelism to provide low-latency query performance. And with Hydrolix’s unique architecture, all columns are indexed without negative impacts to storage footprint or querying, so you don’t need to make difficult decisions about which columns to index ahead of time.
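Partition pruning is easier to picture with a toy example. The Python sketch below is a conceptual illustration, not Hydrolix’s implementation: each hypothetical partition records the minimum and maximum timestamp it contains, and a query with a time range skips every partition whose range cannot overlap it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str
    min_ts: int  # earliest event time in the partition (epoch seconds)
    max_ts: int  # latest event time in the partition (epoch seconds)


def prune(partitions, query_start, query_end):
    """Keep only partitions whose time range overlaps [query_start, query_end]."""
    return [
        p for p in partitions
        if p.max_ts >= query_start and p.min_ts <= query_end
    ]


# Hypothetical one-hour partitions.
partitions = [
    Partition("p0", 0, 3599),
    Partition("p1", 3600, 7199),
    Partition("p2", 7200, 10799),
]

to_scan = prune(partitions, query_start=4000, query_end=5000)
print([p.name for p in to_scan])
```

Only one of the three partitions needs to be read for this query. With log data, where most queries target a bounded time window, the same check lets a query over a very large table ignore the vast majority of stored data.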

Hydrolix’s Splunk integration sends queries from Splunk to Hydrolix using any ANSI-compliant SQL, then Hydrolix returns the query results to Splunk.
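The queries you send are ordinary SQL. As a sketch of the shape such a query might take, the Python example below runs a time-bounded aggregate against an in-memory SQLite database. SQLite is only a stand-in so the example is runnable; the `logs` table, its columns, and the sample rows are hypothetical, and in the integration the SQL would go through Splunk DB Connect to Hydrolix instead.

```python
import sqlite3

# Hypothetical time-bounded aggregate: requests per status code in a window.
QUERY = """
SELECT status, COUNT(*) AS requests
FROM logs
WHERE ts >= ? AND ts < ?
GROUP BY status
ORDER BY status
"""

conn = sqlite3.connect(":memory:")  # stand-in database, not Hydrolix
conn.execute("CREATE TABLE logs (ts INTEGER, status INTEGER)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [(100, 200), (110, 200), (120, 500), (900, 200)],
)

# Only rows with 100 <= ts < 200 fall inside the queried window.
rows = conn.execute(QUERY, (100, 200)).fetchall()
print(rows)
conn.close()
```

Bounding the query by timestamp matters in practice. Hydrolix partitions data by timestamp, so a tight time filter limits how much data a query has to consider.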

With Hydrolix, you can also use query pools to eliminate resource contention, a feature referred to as precision query scaling. You can increase or limit the query resources a pool uses to achieve the right balance between performance and cost, a feature unique to Hydrolix. For example, you could create a query pool with more compute resources for SREs to quickly mitigate performance issues, and you could create another query pool that prioritizes cost-effectiveness over performance for machine learning training runs.

To learn more about how querying works in Hydrolix, see:

Maximize the Value of Your Log Data With Hydrolix

Splunk isn’t the only way to visualize and query data in Hydrolix. Hydrolix is designed to be a store-once, use-many solution for all of your log data.

One of the challenges that many Splunk users face is vendor lock-in, which can make it difficult to analyze your log data outside of Splunk. Hydrolix solves that problem by making your log data available to other tools, including visualization and dashboard tools such as Grafana, Superset, Tableau, and Looker Studio.

With Hydrolix, you own all of your data in your own virtual private cloud (VPC) and there is no vendor lock-in. You can keep all your data for long-term analysis and compliance and share it with other teams for data science, business intelligence, and machine learning use cases. By removing limits to cost and retention, Hydrolix makes many more log data use cases economically viable.

Next Steps

If you’re already using both Splunk and Hydrolix, follow along with the Splunk with DB Connector documentation to start querying Hydrolix data in Splunk.

If you’re not using Hydrolix yet:

  • Use our pricing calculator to estimate your TCO with Hydrolix and learn how much you’ll save.
  • Start a trial or demo and see firsthand how the integration works.

Read Transforming the Economics of Log Management to learn about the high costs of observability at scale for most solutions and how Hydrolix builds on object storage to dramatically reduce costs.
