Thanks to Gravwell's Google PubSub ingester, it's easy to collect logs and other data from services deployed in the Google Cloud Platform. In this blog post, we'll show how to set up Gravwell in GCP and ingest system logs from your virtual machines.
For this blog post we are going to go over the deployment of a distributed Docker-based Gravwell cluster. We will use Docker and a few manageability features to very quickly build and deploy a cluster of Gravwell indexers. By the end of the post we will have deployed a 6 node Gravwell cluster, a load balancing federator, and a couple ingesters. Also, the six node “cluster” is also going to absolutely SCREAM, collecting over 4 million entries per second on a single Ryzen 1700 CPU. You read that right, we are going to crush the ingest rate of every other unstructured data analytics solution available on a single $250 CPU. Lets get started.
Amazon’s Kinesis Streams service provides a powerful way to aggregate data (logs, etc.) from a large number of sources and feed that data into multiple data consumers. For instance, a large enterprise might use one Kinesis stream to gather log data from their cloud infrastructure and another stream to aggregate sales data from the web services running on that infrastructure. Once the data is in the stream, it remains available for up to a day (or optionally longer) for any number of applications to read it back for processing and analysis. This is particularly useful to customers that want to deploy and destroy virtual machines on a whim; data is stored in the stream, rather than the ephemeral VMs.