Gravwell's ingesters can pull data from a wide variety of sources and we advocate keeping raw data formats for root cause analysis, but sometimes it's nice to massage the data a little before sending it to the indexers. Maybe you're getting JSON data sent over syslog and would like to strip out the syslog headers. Maybe you're getting gzip-compressed data from an Apache Kafka stream. Maybe you'd like to be able to route entries to different tags based on the contents of the entries. Gravwell's ingest preprocessors make this possible by inserting one or more processing steps before an entry is sent upstream to the indexer.
We proud to announce the immediate availability of Gravwell version 3.2.3. This release is all about performance and bug fixes, but we did manage to slip in a new Kafka ingester.
We've had some benchmarking requests from multiple organizations struggling with ingest performance from Elasticsearch, so we're publishing them here. The latest Gravwell release marks a significant improvement in ingest and indexing performance and this post covers the nitty gritty details. Better ingest performance means reduced infrastructure cost, less dropped data, and faster time-to-value. See how Gravwell stacks up.
We're continuing to work with investigative reporters to research unscrupulous activity on social media. Most recently, Engadget published a piece on nefarious political influencers on Reddit. We’ve written in the past about analyzing social media comments, but didn’t make the ingesters publicly available. With an increasing need for research in this area, we decided that releasing our Reddit and Hacker News ingesters could help new users get started with Gravwell even faster, so we open-sourced them. Read on to learn how to get the ingesters, how to run them, and how to get started with the data.
Gravwell recently introduced a new ingester which accepts entries via HTTP POST requests. Now it's easy to send arbitrary data to Gravwell via scripts using only the curl command. In this blog post, we'll use the HTTP ingester to build a weather-monitoring dashboard!
We’re pleased to announce the release of Gravwell 2.2.1! For a point release, it’s got some very cool new features; read on to learn what we’ve added.
Thanks to Gravwell's Google PubSub ingester, it's easy to collect logs and other data from services deployed in the Google Cloud Platform. In this blog post, we'll show how to set up Gravwell in GCP and ingest system logs from your virtual machines.
This post is mostly about building your own docker images. If you're interested in getting up and running fast using Gravwell+Docker, head over to our docs that cover our pre-built images:
For this blog post we are going to go over the deployment of a distributed Docker-based Gravwell cluster. We will use Docker and a few manageability features to very quickly build and deploy a cluster of Gravwell indexers. By the end of the post we will have deployed a 6 node Gravwell cluster, a load balancing federator, and a couple ingesters. Also, the six node “cluster” is also going to absolutely SCREAM, collecting over 4 million entries per second on a single Ryzen 1700 CPU. You read that right, we are going to crush the ingest rate of every other unstructured data analytics solution available on a single $250 CPU. Lets get started.
Amazon’s Kinesis Streams service provides a powerful way to aggregate data (logs, etc.) from a large number of sources and feed that data into multiple data consumers. For instance, a large enterprise might use one Kinesis stream to gather log data from their cloud infrastructure and another stream to aggregate sales data from the web services running on that infrastructure. Once the data is in the stream, it remains available for up to a day (or optionally longer) for any number of applications to read it back for processing and analysis. This is particularly useful to customers that want to deploy and destroy virtual machines on a whim; data is stored in the stream, rather than the ephemeral VMs.