Frequently asked questions
Your questions about data ingestion and batch processing answered
GENERAL frequently asked questions
Hydrolix is purpose-built to make big data cost-effective to collect, retain, and query. If you are currently working with or planning to work with datasets at volumes of 1 TB/day or greater, then Hydrolix is an excellent choice.
Hydrolix is the only cloud data platform combining stream processing, indexed search, and decoupled storage, resulting in a unique combination of high-performance, cost-efficient data management features.
Unlike traditional solutions, Hydrolix offers a database platform designed to handle massive data volumes at low cost, making it a perfect fit for organizations dealing with big data growth.
Its stateless architecture allows for flexible data ingestion while keeping ingest and query functions independent, ensuring optimal performance. Hydrolix’s low-latency query performance, long-term data retention, and high scalability empower you to unlock the full potential of your data without sacrificing speed or cost-effectiveness.
Here are the estimated costs for running Hydrolix on Amazon EKS, ingesting two terabytes of raw data per day and assuming 12 months of data retention and 15x compression.
Estimated Monthly Cloud Provider Expenses
Core Platform: $2,237
Ingest Capacity: $1,470
Query Capacity: $1,960
Cloud Storage: $1,146
Cloud Provider Total: $6,814 / mo
Hydrolix License & Support: $5,840 / mo
Total Cost of Ownership: $12,654 / mo
Effective Cost: $0.20 / GB
You can use this online pricing calculator to get a cost estimate.
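The figures above can be cross-checked with a few lines of arithmetic. The sketch below sums the listed line items and derives a cost per GB of raw data ingested. It assumes a 30-day month and decimal units (1 TB = 1,000 GB); the estimate does not state either, so the results land close to the published numbers rather than exactly on them. The variable names are illustrative.

```python
# Line items copied from the estimate above (USD per month).
cloud_items = {
    "core_platform": 2237,
    "ingest_capacity": 1470,
    "query_capacity": 1960,
    "cloud_storage": 1146,
}
license_and_support = 5840

# Assumptions not stated in the estimate: 30-day month, 1 TB = 1,000 GB.
DAYS_PER_MONTH = 30
GB_PER_TB = 1000
raw_tb_per_day = 2

# The items sum to 6813; the listed total of 6814 presumably reflects
# rounding of the individual line items.
cloud_total = sum(cloud_items.values())
total_cost = cloud_total + license_and_support

monthly_raw_gb = raw_tb_per_day * GB_PER_TB * DAYS_PER_MONTH
cost_per_gb = total_cost / monthly_raw_gb

# Steady-state stored volume: 12 months of raw data at 15x compression.
stored_tb = raw_tb_per_day * 365 / 15

print(f"cloud total: ${cloud_total:,}")
print(f"total cost: ${total_cost:,}")
print(f"cost per raw GB ingested: ${cost_per_gb:.2f}")
print(f"compressed data retained: {stored_tb:.1f} TB")
```

Under these assumptions, roughly 49 TB of compressed data is retained after a year, which is the volume the cloud storage line item has to cover.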
Hydrolix offers a free trial, allowing developers to experience the platform’s capabilities firsthand.
Explore Hydrolix’s features, assess its suitability for your projects, and gain insights into its performance and benefits.
Yes. Compared with the ELK stack, Hydrolix offers several advantages: high-performance querying, cost-efficient data retention, and seamless scalability.
Hydrolix’s architecture and features make it a compelling choice for organizations seeking improved data management and analytics capabilities beyond what the ELK stack provides.
Vendor lock-in is not a concern with Hydrolix, as your data remains within your architecture.
For cloud deployments, your data resides inside your chosen object storage environment, such as Amazon S3 or Azure Blob Storage.
This flexibility ensures you can easily migrate your data, offering peace of mind and data sovereignty.
Cloud and on-premises offerings all adhere to the following security configurations:
- Role-based access control (RBAC) to limit access to project data.
- Strict data separation between projects – You can create multiple projects and limit access to teams and individuals as needed.
- SOC 2 compliance to ensure that all data is stored and processed securely.
- GDPR compliance
- TLS for data in transit – To retrieve data from cloud storage, Hydrolix clusters use token-based authentication over TLS.
COLLECT frequently asked questions
Yes, Hydrolix fully supports batch processing. This collection feature allows you to efficiently load data from a storage bucket into a target table, ensuring you can work with your data at scale.
Hydrolix offers two mechanisms for batch ingestion: the Batch Job API, which handles one-off tasks for loading one or more files based on job configurations, and Batch Auto-Ingest, which continuously ingests new files arriving in a storage bucket.
Supported data formats include CSV and JSON. Please note that Hydrolix requires read permissions to access external storage buckets for batch ingestion.
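As a rough illustration of what a one-off batch job needs to describe, the sketch below assembles a job description as a plain Python dictionary: a source bucket path, a target table, and a file format. The helper `build_batch_job` and every field name in it (`source_url`, `target_table`, `format`) are hypothetical and chosen for readability. They are not the Batch Job API's actual schema, so consult the Hydrolix API reference before sending anything to a cluster.

```python
import json

# Formats named in the answer above.
SUPPORTED_FORMATS = {"csv", "json"}


def build_batch_job(name: str, source_url: str, target_table: str, fmt: str) -> dict:
    """Assemble a hypothetical one-off batch job description.

    Field names are illustrative only, not the real Batch Job API schema.
    """
    fmt = fmt.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return {
        "name": name,
        "source_url": source_url,      # bucket path the cluster must be able to read
        "target_table": target_table,  # table that receives the loaded rows
        "format": fmt,
    }


job = build_batch_job(
    name="load_march_logs",
    source_url="s3://example-bucket/logs/2024-03/",
    target_table="example_project.example_table",
    fmt="JSON",
)
print(json.dumps(job, indent=2))
```

Validating the format before building the job keeps unsupported file types from reaching the cluster at all.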
QUERY frequently asked questions
Hydrolix improves query efficiency compared to other cloud data platforms through its unique architecture. Our decoupled and stateless design separates ingest and query resources from storage, allowing us to focus on efficiently handling high-cardinality and high-dimensionality data.
Here’s how our architecture achieves query efficiency:
- Scalable Query Pools: Hydrolix enables you to scale query resources independently, ensuring consistently low-latency queries as your data workload grows.
- Partition Metadata: We utilize partition metadata to speed up time-based queries, which is particularly beneficial for time-series data analysis.
- Full Column Indexing: Hydrolix uses full-column indexing, so queries can locate the data they need quickly instead of scanning everything.
- Predicate Pushdown: Hydrolix applies filter conditions as close to the stored data as possible, so less data has to be read and transferred to answer a query.
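To make the partition-metadata and predicate-pushdown points concrete, here is a toy model in Python. It assumes each partition records the minimum and maximum timestamp of its rows, so a time-bounded query can skip any partition whose range cannot overlap the query window. The remaining filter is then applied while reading rows from the surviving partitions. The `Partition` class, its fields, and the sample data are invented for illustration and say nothing about Hydrolix's actual storage format or indexing.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    min_ts: int  # earliest row timestamp (epoch seconds)
    max_ts: int  # latest row timestamp (epoch seconds)
    rows: list   # (timestamp, status) tuples


def query(partitions, start_ts, end_ts, status):
    """Return rows with start_ts <= ts < end_ts and a matching status,
    plus the number of partitions that actually had to be read."""
    scanned = 0
    hits = []
    for p in partitions:
        # Metadata check: skip partitions that cannot overlap the window.
        if p.max_ts < start_ts or p.min_ts >= end_ts:
            continue
        scanned += 1
        # The filter runs while rows are read, not on a full copy afterwards.
        hits.extend(
            (ts, st) for ts, st in p.rows
            if start_ts <= ts < end_ts and st == status
        )
    return hits, scanned


partitions = [
    Partition(0, 99, [(10, 200), (50, 500)]),
    Partition(100, 199, [(120, 500), (150, 200)]),
    Partition(200, 299, [(210, 500), (250, 500)]),
]

hits, scanned = query(partitions, start_ts=100, end_ts=200, status=500)
print(hits, scanned)
```

In this example only one of the three partitions is read, which is the effect that makes time-based queries cheaper as the total data volume grows.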
Because Hydrolix query infrastructure is decoupled from storage and collection, you can quickly scale query pools up or down. Small query pools give you consistent low-cost queries, while large pools give you consistent low-latency queries.
You can create separate query pools for different groups of users. For example, you might configure separate sandboxes for administrator, interactive analyst, and monitoring queries.
Pool groups support independent scaling, so the capacity for each pool can adjust automatically to satisfy demand. You can even scale an entire pool to zero when demand is negligible, for example over the weekend when staff do not need to access data. When demand returns, you can scale the pool back up within minutes.
Hydrolix provides an ANSI-compliant SQL interface that uses the syntax and some of the SQL engine of ClickHouse. All standard features are supported, including the interface API for querying data.
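Because the interface follows ClickHouse syntax, a time-series query can be written with ClickHouse functions such as `toStartOfMinute` and `count()`, assuming the cluster exposes them. The sketch below only builds the SQL text in Python. The table `example_project.example_table` and the columns `timestamp` and `status` are hypothetical, and how the statement is submitted depends on the client or query endpoint you use.

```python
def errors_per_minute_sql(table: str, start: str, end: str) -> str:
    """Build a ClickHouse-style query counting status-500 rows per minute.

    Table and column names are placeholders. start and end are
    'YYYY-MM-DD HH:MM:SS' strings and must come from trusted code,
    because they are interpolated directly into the SQL text.
    """
    return (
        "SELECT toStartOfMinute(timestamp) AS minute, count() AS errors\n"
        f"FROM {table}\n"
        f"WHERE timestamp >= '{start}' AND timestamp < '{end}'\n"
        "  AND status = 500\n"
        "GROUP BY minute\n"
        "ORDER BY minute"
    )


sql = errors_per_minute_sql(
    "example_project.example_table",
    "2024-03-01 00:00:00",
    "2024-03-01 01:00:00",
)
print(sql)
```

Bounding the query on the timestamp column is what lets a time-partitioned store narrow the search before any other filter is evaluated.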
RETAIN frequently asked questions
Hydrolix provides a cost-effective data retention solution with patented high-density compression technology. This technology lets you keep all your data online without offloading or sampling. And because of reduced storage costs, you can retain data for analysis, compliance, and security, eliminating the trade-off between data retention and cost.
You also get the additional benefit of reducing your environmental footprint by reducing the storage infrastructure required to store massive datasets.
In the context of Hydrolix, zero-egress means that when you deploy our solution on-premises, you have complete control over your data, and no additional egress costs are incurred.
This allows you to manage your data efficiently and cost-effectively within your own infrastructure.
No, Hydrolix doesn’t distinguish between hot, warm, or cold data. Queries against all data, whether minutes or years old, deliver sub-second performance, ensuring all of your data remains accessible.
Because Hydrolix combines high-density compression technology with decoupled storage, the cost to deliver low-latency queries on your data, regardless of age, is 4x lower than other databases.