IBM recently launched Cloud Logs, a new solution designed to let customers efficiently collect and analyze log data at any scale. IBM is no slouch in the product development department, but Big Blue realized its internally developed observability solutions couldn’t match what one company had built: Coralogix.
As the most voluminous of the Holy Trinity of observability data (the other two being metrics and traces), logs are essential for detecting IT problems, such as erroneous updates, the presence of hackers or malware, or barriers to Web application scalability. Thanks to an acceleration in digital transformation initiatives, log data is also growing quickly. In fact, by some measures, it’s growing 35% per year, faster than data as a whole.
That enormous growth is putting pressure on companies to come up with more effective and efficient ways to deal with their log data. The standard method of analyzing logs–which entails extracting the relevant information, storing it in a big database on fast storage, and then building indexes over it–is no longer cutting it in the new log world, according to Jason McGee, an IBM Fellow and the CTO of IBM Cloud.
“We see that with data volumes continuously growing, the cost of indexing logs and placing them in hot storage has become prohibitively expensive,” McGee said in a recent press release. “As a result, many companies have opted to sample only a subset of their data as well as limit storage retention to one or two weeks. But these practices can hurt observability with incomplete data for troubleshooting and trend analysis.”
What companies need is a new approach to log storage and analysis. The approach that IBM ultimately selected is the one developed by Coralogix, an IT observability firm based in Tel Aviv, Israel.
Streaming Logs
When Coralogix was founded 10 years ago, the company’s solution was largely based on the Elasticsearch, Logstash, and Kibana (ELK) stack and used a traditional database to index and query data. As log volumes increased, the company realized it needed a new technological underpinning. And so in 2019, the company embarked on a project to rearchitect the product around streaming data, using Apache Kafka and Kafka Streams.
“It’s a way of organizing your databases–all your read databases and write databases–such that you can horizontally scale your processes really easily and quickly, which makes it cheaper for us to run,” says Coralogix Head of Developer Advocacy Chris Cooney. “But what it really means is that for customers, they can query the data at no additional cost. That means unbounded exploration of the data.”
Instead of building indexes and storing them on high-cost storage, Coralogix developed its Strema solution around its three-“S” architecture: source, stream, and sink. The Strema solution uses Kafka Connect and Kafka Streams, runs atop Kubernetes for dynamic scaling, and persists data to object storage (e.g., Amazon S3).
“What we do is we say, okay let’s do log analytics up front. Let’s start there, and we’ll do it in a streaming pipeline kind of way, rather than in a batch process, in the database,” Cooney said. “That has some really significant implications.”
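To make the source-stream-sink pattern concrete, here is a minimal Kafka Streams sketch of the kind of pipeline Cooney describes: logs are read from an ingest topic, analyzed in flight, and routed onward rather than indexed into a database first. The topic names, the trivial “malware” filter, and the archive topic are illustrative assumptions, not Coralogix’s actual topology.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class LogPipelineSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "log-pipeline-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Source: raw log lines arriving on an ingest topic (hypothetical name).
        KStream<String, String> rawLogs = builder.stream("raw-logs");

        // Stream: analyze each record as it flows through, instead of indexing it
        // into a database first. Here, a trivial filter flags suspect lines.
        KStream<String, String> alerts =
                rawLogs.filter((key, line) -> line != null && line.contains("malware"));

        // Sink: flagged records go to an alerts topic for real-time notification,
        // while everything is forwarded to an archive topic that a Kafka Connect
        // S3 sink connector could persist to object storage.
        alerts.to("security-alerts");
        rawLogs.to("logs-to-archive");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Because the analysis happens in the stream itself, adding capacity is mostly a matter of running more instances of the same application, which is the horizontal-scaling property Cooney points to above.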
In addition to adopting Kafka, Coralogix adopted Apache Arrow, the fast in-memory data format for data interchange. Intelligent data tiering that’s built into the platform automatically moves more frequently accessed data from slower S3 buckets into faster S3 storage. The company also developed a piped query language called DataPrime to give customers more powerful tools for extracting useful information from their log data.
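Arrow’s contribution in a pipeline like this is to keep batches of log records in a columnar, language-agnostic layout as they move between processing stages. Below is a small, hypothetical sketch using the plain Apache Arrow Java API (not Coralogix code); the field names and log messages are invented for illustration.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.BigIntVector;
import org.apache.arrow.vector.FieldVector;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.VectorSchemaRoot;

import java.nio.charset.StandardCharsets;
import java.util.List;

public class ArrowLogBatchSketch {
    public static void main(String[] args) {
        // The allocator owns the off-heap buffers backing the columnar vectors.
        try (BufferAllocator allocator = new RootAllocator();
             BigIntVector timestamp = new BigIntVector("timestamp_ms", allocator);
             VarCharVector message = new VarCharVector("message", allocator)) {

            timestamp.allocateNew(2);
            timestamp.set(0, 1_718_000_000_000L);
            timestamp.set(1, 1_718_000_000_250L);
            timestamp.setValueCount(2);

            message.allocateNew(2);
            message.set(0, "payment-svc: request completed in 31ms".getBytes(StandardCharsets.UTF_8));
            message.set(1, "auth-svc: suspicious binary detected".getBytes(StandardCharsets.UTF_8));
            message.setValueCount(2);

            // A VectorSchemaRoot groups the columns into one record batch that can be
            // handed between pipeline stages (or written out) without row-by-row copying.
            VectorSchemaRoot batch = new VectorSchemaRoot(List.<FieldVector>of(timestamp, message));
            System.out.println(batch.contentToTSVString());
        }
    }
}
```

Keeping log batches columnar like this is what lets downstream components scan only the fields a query touches rather than deserializing whole records.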
“The beauty of it is that they can basically keep all of the data and manage their costs themselves,” Cooney said. “They use something called the TCO Optimizer, which is a self-service tool that lets you say, okay, this application here, the less important noisy machine logs, we’ll send them straight to the archive. If we need them, we’ll query them directly whenever we want.”
Logging TCO
When you add it all up, these technological adaptations give Coralogix the ability not only to deliver sub-second response to log events–such as firing an alert on a dashboard when a log is sent indicating the presence of malware–but also to return very fast responses to ad hoc user queries that touch log data sitting in object storage, Cooney says. In fact, these queries that scan data in S3 (or IBM Cloud Object Storage, as the case may be) sometimes execute faster than queries in mainstream logging solutions based on databases and indexes, he says.
“When you combine TCO optimization in Coralogix with the S3 intelligent tiering…and the clever optimization of data, you’re looking at between 70% and 80% cost reduction in comparison to someone like Datadog,” Cooney tells Datanami. “That’s just in the log space. In the metric space, it’s more.”
Thanks to this innovation–in particular, pulling the cost out of storing indexes by switching to a Kafka-based streaming sub-system–Coralogix is able to radically simplify its pricing scheme for its 2,000 or so customers. Instead of charging for each individual component, the company charges for its logging solution based on how much data the customer ingests. Once it’s ingested, customers can run queries to their heart’s content.
“Data that previously was purely the realm of the DevOps team, for example…the DevOps teams will jealously guard that data. Nobody else can query it, because that is money. You’re actually encouraging silos there,” Cooney says. “What we say is explore the data as much as you like. If you’re part of a BI team, have at it. Go have fun.”
IBM rolled out IBM Cloud Logs to customers in Germany and Spain last month, and will continue its global rollout through the third quarter.
Related Items:
OpenTelemetry Is Too Complicated, VictoriaMetrics Says
Coralogix Brings ‘Loggregation’ to the CI/CD Process
Log Storage Gets ‘Chaotic’ for Communications Firm