If your head is often in the clouds (that is, thinking about cloud architecture or using cloud provider tools), you may have heard the term “frugal architect,” which Dr. Werner Vogels, CTO and VP of Amazon.com, defined in his AWS re:Invent 2023 keynote. In the keynote, Vogels defines the seven laws of a frugal architect, which include principles on designing, measuring, and optimizing applications with cost in mind. Vogels’ principles don’t just apply to building applications. They also provide a useful framework for thinking about observability.
Law I: Make Cost a Non-Functional Requirement
Non-functional requirements don’t contribute to the actual functionality of a feature or application, but they are still requirements. In his keynote, Vogels defined not just cost but also sustainability, security, compliance, and accessibility as non-functional requirements.
For observability, you likely have functional requirements for your solution (whether it’s SaaS, software you run, or DIY). These may include ingesting and analyzing logs from specific services, using the dashboards that work best for your team, and alerting on certain thresholds.
But ultimately, in an uncertain macroeconomic environment, cost is the most important requirement of all. In the case of observability, your solution must be cost-effective. Many popular observability solutions are built on legacy data layers that make it expensive to ingest, analyze, and especially retain data at scale. The next generation of observability and log monitoring solutions, which includes products built using Hydrolix, is designed to handle and store large volumes of log data at a much more reasonable cost than traditional solutions.
Law II: Systems That Last Align Cost to Business
“The durability of a system depends on how well its costs are aligned to the business model…make sure the architecture follows the money.” – Dr. Werner Vogels
This makes sense in terms of building applications, but how does it apply to observability?
Observability helps ensure that your architecture is following the money. For example, your request logs provide valuable information on which services are being used most and how performant they are. But it’s not about having observability for the sake of it. You must be able to answer important business questions with your data. If you are collecting data simply for potential compliance and letting it become dark data (data that isn’t easy to access and is never analyzed), then you are not aligning the value of those logs to your business model. And once again, cost becomes critical. Observability should not consume 20-30% of your cloud spend, the share that Charity Majors, CTO of Honeycomb, has suggested. If observability costs are a huge chunk of your cloud spend, you have fewer resources for the cloud infrastructure that actually powers your business.
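To see where you stand against that 20-30% range, you can work out observability's share of your own cloud bill. The snippet below is a minimal sketch: the function name and the monthly dollar figures are hypothetical and chosen only for illustration.

```python
def observability_share(observability_cost: float, total_cloud_cost: float) -> float:
    """Return observability spend as a fraction of total cloud spend."""
    if total_cloud_cost <= 0:
        raise ValueError("total_cloud_cost must be positive")
    return observability_cost / total_cloud_cost


# Hypothetical monthly figures, for illustration only.
monthly_cloud_cost = 200_000.0
monthly_observability_cost = 50_000.0

share = observability_share(monthly_observability_cost, monthly_cloud_cost)
print(f"Observability is {share:.0%} of cloud spend")
```

With these made-up numbers, a quarter of the cloud budget goes to observability rather than to the infrastructure that powers the business.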
Law III: Architecting is a Series of Trade-offs
“It’s crucial to find the right balance between your technical and business needs… Frugality is about maximizing value, not just minimizing spend.” – Dr. Werner Vogels
In the observability space, there are a number of key trade-offs to consider for log data at scale. Here are a few:
- All-in-one versus specialized tools: Do you want the convenience of one observability tool that does many things reasonably well, even if it’s not optimal for some of your business needs? Or would you rather use multiple tools that better fit your use cases, even if it means more screens to monitor and manage?
- Managed versus self-managed: Do you prefer a managed solution, even if it means less control and visibility when it comes to your data? Or do you need to manage and control your data for security, compliance, and cost purposes?
- Traditional SaaS observability versus next-generation solutions: Do you derive greater benefit from the rich user interfaces, extensive feature sets, and ecosystem of integrations that a traditional SaaS observability platform provides, even though long-term retention and storing data at scale can be prohibitively expensive? Or do you go with next-generation solutions that provide long-term data retention and cost-effective log data at scale?
Law IV: Unobserved Systems Lead to Unknown Costs
“Without careful observation and measurement, the true costs of operating a system remain invisible…Tracking utilization, spending, errors, and more, is crucial for cost management.” – Dr. Werner Vogels
This is a direct endorsement of observability. For the frugal architect, it’s not optional—observability is a core tenet.
Law V: Cost-Aware Architectures Implement Cost Controls
“Granular control over components optimizes both cost and experience. Infrastructure, languages, databases should all be tunable.” – Dr. Werner Vogels
This is a core challenge for many observability platforms, especially SaaS solutions, and it ties to scalability. AWS and other cloud providers are extremely scalable, but this is not the case with the majority of traditional observability platforms, which use legacy architectures that tightly couple compute and storage. For your observability solution to be tunable, each of its components must be decoupled.
In his 2023 re:Invent keynote, Vogels states, “You need to be able to switch some of those components off… the switch should be in the hands of the business.” In the case of Hydrolix, you can separately tune and scale ingest, query, and storage, all without impacting other components in the system. This tunability allows you to scale up for peak events and scale down when you don’t need resources. You can even scale Hydrolix resources down to zero in order to maximize savings.
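The idea of decoupled, independently tunable components can be sketched in a few lines. This is a toy model, not Hydrolix's actual configuration or API: the `Cluster` class, its field names, and the numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cluster:
    """Hypothetical model of a platform with decoupled components."""
    ingest_workers: int
    query_workers: int
    storage_tb: float


def scale(cluster: Cluster, **changes: float) -> Cluster:
    """Return a copy with only the named components changed."""
    for name, value in changes.items():
        if value < 0:
            raise ValueError(f"{name} cannot be negative")
    return replace(cluster, **changes)


baseline = Cluster(ingest_workers=4, query_workers=2, storage_tb=50.0)

# Peak event: add ingest capacity without touching query or storage.
peak = scale(baseline, ingest_workers=16)

# Quiet period: scale compute to zero while the stored data stays put.
idle = scale(baseline, ingest_workers=0, query_workers=0)

print(peak)
print(idle)
```

In a tightly coupled architecture, the equivalent of `scale` would have to change compute and storage together, which is what makes cost controls coarse.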
Law VI: Cost Optimization is Incremental
“There is always room for improvement… if we keep looking. The savings we reap today fund innovation for tomorrow.” – Dr. Werner Vogels
This applies to observability, too. Many customers are having growing pains with their observability solutions. In this case, the pain is related to the explosive growth in log data coming from microservices, distributed architecture, CDNs, and more. It is no longer cost-effective to use many traditional solutions for data at terabyte scale. So customers often try to add incremental cost savings by discarding data, either by sampling, limiting retention, or aggregating data into metrics and discarding the raw data. Even with data loss, however, many customers just keep paying more for observability. So if you’re experiencing explosive costs with data at scale, it may be time to consider Hydrolix, which gives you long-term data retention while also cutting the total cost of ownership (TCO) of your log solution by an average of 4x or more.
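The arithmetic below shows how much data the discard strategies give up. It is a rough sketch that assumes steady daily ingest and treats stored volume as ingest times sample rate times retention window. The function name and all figures are hypothetical.

```python
def stored_volume_tb(daily_ingest_tb: float, sample_rate: float, retention_days: int) -> float:
    """Approximate steady-state stored volume in TB.

    sample_rate is the fraction of log events kept (1.0 means no sampling).
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    return daily_ingest_tb * sample_rate * retention_days


# Hypothetical workload: 10 TB of logs per day.
full = stored_volume_tb(daily_ingest_tb=10.0, sample_rate=1.0, retention_days=90)
trimmed = stored_volume_tb(daily_ingest_tb=10.0, sample_rate=0.1, retention_days=30)

print(f"Full fidelity, 90 days: {full:.0f} TB")
print(f"10% sample, 30 days:    {trimmed:.0f} TB")
print(f"Share of data kept:     {trimmed / full:.1%}")
```

Under these assumptions, combining sampling with a shorter retention window keeps only a few percent of the original data, which is the data loss described above.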
Law VII: Unchallenged Success Leads to Assumptions
“Software teams often fall into the trap of assuming their current technologies, architectures, or languages will always be the best choice, simply because they have worked in the past. This can create a false sense of security that discourages questioning the status quo or exploring new options which could be more efficient, cost-effective, or scalable.” – Dr. Werner Vogels
In observability, major providers like Splunk, Datadog, New Relic, and Elastic offer rich features, integrations, and dashboards, but those aren’t the only considerations to keep in mind. Just because they were the best choice in the past doesn’t mean they will continue to provide the scalability you need to handle terabyte or petabyte volumes of log data. If you’re running into cost issues, it may be time to question the status quo and try a proof of concept (POC) with Hydrolix.