Gravwell Blog

The Shift from SIEM to Cybersecurity Data Platform

Written by Gravwell | Apr 24, 2025 8:32:21 PM

"We are looking at transitioning to a real data lake for security, not the traditional SIEM setup" - a guy I just spoke with a minute ago.

The way organizations approach cybersecurity data is changing fast. We're seeing a shift in how teams extract value from security data. It's moved beyond conversations at this point: Fortune 100s are releasing RFPs looking for "cybersecurity data lakes" and "cyber data platforms" instead of, or in addition to, a SIEM. With threats growing more sophisticated and infrastructure growing more complex, many teams are rethinking their reliance on traditional SIEMs, and for good reason.

SIEMs have served us well, but they weren’t built for today’s data scale, diversity, or velocity. They’re rigid, expensive, and force you to decide what data "matters" before you’ve even seen what’s inside.

That’s why we founded Gravwell as a structure-on-read data lake purpose-built for modern cybersecurity operations. In this post, I'll hit on one reason why Gravwell is the data lake for security folks: structure-on-read is the superpower that makes your SOC agile and helps you cut the absolutely horrendous 300-day average attacker dwell time[1] down to something you can be proud of.

The SIEM Struggle Is Real

Most security teams already know the pain points:

  • Ingestion-based pricing makes visibility expensive
  • Pre-normalization requirements slow down deployments and introduce fragility
  • Short retention windows limit historical investigation
  • Rigid schema makes it hard to adapt to new threats or data types

These limitations force security leaders into uncomfortable tradeoffs: drop valuable data to control costs, or over-engineer data pipelines just to get it into the SIEM’s format. Neither is a winning strategy.

 

Gravwell’s Superpower: Structure-on-Read built from scratch

At Gravwell, we flipped the script.

Instead of making you transform and normalize your data before it enters the system, Gravwell lets you ingest everything: raw, text or binary, unstructured, full-fidelity, without compromise.

Then, when you're ready to analyze, enrich, or correlate, you apply structure "on demand".

It’s not just more flexible; it’s faster, cheaper, and more future-proof.

What This Means for You:

  • Ingest now, ask questions later: No need to define schemas or normalize up front. Start capturing data immediately.
  • Keep full-fidelity data for the long haul: Gravwell's efficient architecture lets you retain rich, raw telemetry without blowing up your budget.
  • Evolve your detections: As threats change, your ability to query and extract meaning evolves with them—no reingestion required.
  • More data, less noise: Instead of tossing out “non-essential” data, you keep it all and filter at query time. Better coverage, better detection.
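To make "ingest now, ask questions later" concrete, here's a minimal Python sketch of structure-on-read. This is a toy illustration, not Gravwell's implementation; the sample events, regex, and field names are invented for the example. The point is that the stored data stays raw, and fields exist only while a query asks for them:

```python
import re

# Structure-on-read in miniature: events are stored exactly as they arrived.
RAW_EVENTS = [
    'sta=d0:c5:d3:a2:11:d3 satisfaction_now=74 anomalies=tcp_latency',
    'sta=aa:bb:cc:dd:ee:ff satisfaction_now=91 anomalies=none',
]

def query(events, pattern):
    """Apply structure at query time: fields exist only while we ask."""
    rx = re.compile(pattern)
    for line in events:
        m = rx.search(line)
        if m:
            yield m.groupdict()

# A detection written months after ingest still works on the raw data;
# no schema was declared up front and nothing needs reingestion.
low_satisfaction = [
    e for e in query(RAW_EVENTS,
                     r'sta=(?P<sta>\S+) satisfaction_now=(?P<score>\d+)')
    if int(e['score']) < 80
]
print(low_satisfaction)
```

If tomorrow's detection needs a different field, you change the pattern, not the pipeline.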

Real Use Cases: Where Gravwell Shines

We see customers using Gravwell for everything from traditional SOC workflows to novel detection and response strategies:

  • Threat hunting with context-rich historical data
  • Root-cause analysis using full PCAP and NetFlow records
  • Machine learning on years of endpoint telemetry
  • Insider threat detection across unstructured logs

And unlike SIEMs that box you into a narrow set of use cases, Gravwell adapts to your environment—cloud, hybrid, air-gapped, you name it.


A Short Foray into the Weeds: Real Log Collection Challenges


Here are some logs from a wireless access provider (the vendor isn't important; this is a challenge everywhere). In the same log source, which must be delivered over syslog, we have a key/value log format, a JSON format, and a completely unstructured format. If I'm going to ingest these logs into a SIEM, Postgres, Snowflake, ClickHouse, or any other columnar store, I must first parse them into their relevant fields. In this instance, that means nested parsers to handle the syslog envelope, then some logic to handle the subtype. It's a nightmare. Then consider what happens if the vendor changes their log format: without a structure-on-read system, I may lose data while I wait for my SIEM/lake vendor to update their parsers to handle the new format.

With Gravwell, you can collect even the ugliest data and chain parsing modules together for extreme flexibility. If a vendor changes their log format, you update your extractor definitions and can apply those retroactively as needed. No collection gaps. No visibility gaps. Minimum work required.

```
<30>Apr 24 09:22:46 hardingwest12 xxxxxxxxx,XXXXXXX: mcad: mcad[20960]: wireless_agg_stats.log_sta_anomalies(): bssid=80:2a:a8:25:cb:14 radio=wifi0 vap=ath0 sta=d0:c5:d3:a2:11:d3 satisfaction_now=74 anomalies=tcp_latency

<30>Apr 24 09:21:35 hardingeast2 xxxxxxxxxx,XXXXXXX: stahtd: stahtd[3338]: [STA-TRACKER].stahtd_dump_event(): {"message_type":"STA_ASSOC_TRACKER","mac":"dc:e5:fa:ea:bb:71","vap":"ath0","event_type":"soft failure","assoc_status":"0","dns_responses":"20","dns_timeouts":"0","ip_delta":"-1648110208","ip_assign_type":"N/A","wpa_auth_delta":"80000","assoc_delta":"0","auth_delta":"0","event_id":"1","auth_ts":"7848786.477385","dns_resp_seen":"yes","avg_rssi":"-77"}

<14>Apr 24 09:21:31 hardingwest8 xxxxxxxxxx,XXXXXXX: libubnt[3678]: wevent[3678]: wevent.ubnt_custom_event(): EVENT_STA_LEAVE ath4: dc:e5:5b:aa:c2:11 / 1
```

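To show what "chained parsers" means in practice, here's a toy Python sketch of the two-stage approach described above. The parser structure and field names are illustrative, not Gravwell internals: stage one strips the syslog envelope, stage two dispatches on what the payload looks like (JSON, key/value, or unstructured).

```python
import json
import re

# Stage 1: peel off the syslog envelope (PRI, timestamp, host, tag soup).
SYSLOG = re.compile(
    r'<(?P<pri>\d+)>(?P<ts>\w{3} +\d+ [\d:]+) '
    r'(?P<host>\S+) (?P<tags>\S+) (?P<payload>.*)'
)

def parse(line):
    """Chain parsers: syslog envelope first, then a subtype-specific stage."""
    m = SYSLOG.match(line)
    if not m:
        # Keep the raw line rather than dropping it on the floor.
        return {'format': 'unknown', 'raw': line}
    payload = m.group('payload')
    record = {'host': m.group('host'), 'ts': m.group('ts')}
    # Stage 2: dispatch on the payload's apparent subtype.
    brace = payload.find('{')
    if brace != -1:
        record['format'] = 'json'
        record.update(json.loads(payload[brace:]))
    elif '=' in payload:
        record['format'] = 'kv'
        record.update(dict(tok.split('=', 1)
                           for tok in payload.split() if '=' in tok))
    else:
        record['format'] = 'unstructured'
        record['raw'] = payload
    return record
```

In a structure-on-read system this dispatch happens at query time, so when the vendor invents a fourth format, you add a branch and rerun it over everything you already collected.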
SIEM + Gravwell: Better Together

If you’re not ready to ditch your SIEM entirely, no problem. Many organizations use Gravwell alongside their SIEM in a tiered architecture:

  • SIEM for real-time alerts and compliance
  • Gravwell for exploration, enrichment, and long-term retention

This gives you the best of both worlds: actionable alerts today, deep visibility tomorrow, and zero data regrets.
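A tiered deployment can be as simple as a routing rule at the collection layer. Here's a hypothetical sketch; the destination names and the alert condition are illustrative, not a real Gravwell or SIEM API:

```python
def route(event: str) -> list[str]:
    """Tiered architecture sketch: the lake gets everything at full
    fidelity; the SIEM gets only the events worth alerting on."""
    destinations = ['data_lake']  # long-term raw retention: always
    # Illustrative alert condition: any reported anomaly goes to the SIEM tier.
    if 'anomalies=' in event and 'anomalies=none' not in event:
        destinations.append('siem')
    return destinations

print(route('sta=aa:bb anomalies=tcp_latency'))
```

The SIEM's ingestion bill now tracks alert volume instead of total telemetry, while nothing is ever dropped from the lake.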


Gravwell Isn’t Just a Tool, It’s a Philosophy

We believe you shouldn’t have to "guess" which data matters. You shouldn’t be punished for collecting too much. And you definitely shouldn’t have to normalize everything up front just to get started.

Gravwell gives you the power to:
  • Ingest anything
  • Explore freely at speed
  • Apply data models on-the-fly
  • Automate analysis
  • Handle the weird
  • Support other business units, like IT Ops
  • Scale without surprise costs


That’s what Gravwell is all about!

 

More Than Just Security: Gravwell for IT Ops and Beyond

Gravwell isn't just for the SOC. Its flexibility and automation capabilities make it a powerful tool across multiple business units. IT operations teams, in particular, rely on Gravwell for everything from system performance monitoring and anomaly detection to automating routine investigations. By unifying data access and workflows across security and operations, Gravwell helps teams break down silos and solve problems faster—together.


Ready to See It for Yourself?

Whether you’re running a lean SOC or operating a global security program, Gravwell can give your team the flexibility and power you’ve been missing.

Book A Demo

or try it free and start building a future-proof security data lake, with your data, your way.


[1] IBM publishes the "Cost of a Breach" report which includes reporting about average attacker dwell time, a stat that has been trending upwards and sits at a staggering 300 days. https://www.ibm.com/reports/data-breach