Apache Kafka vs. Fluvio Benchmarks

VP Product, InfinyOn Inc.
Introduction
At InfinyOn, we have spent the past six years obsessing over the developer ergonomics, functionality, and reliability of Fluvio and Stateful DataFlow. Building a distributed streaming engine from the ground up is not trivial.
We had internal benchmarks and some baseline comparisons, but we had not given developers the ability to run benchmarks themselves. It was not on our roadmap until a handful of Fluvio open source developers started asking for benchmarks and P99 latencies.
Benchmarks are never perfect.
Benchmarks are indicative of what to expect from a system.
Benchmarks are not the main or the only reason behind software purchase decisions.
Benchmarks are a decent indication of the scalability and reliability of a distributed system.
Benchmark results, if they are any good, are a reflection of the engineering effort that has gone into building a system.
Anyway, given that users wanted to run benchmarks, we scoped in the work to build a benchmarking capability. It shipped in the last major Fluvio release, and here are the results.
Given that Apache Kafka is the standard in distributed streaming, and that intelligent builders can extrapolate the comparable RedPanda performance, we decided to keep it simple and compare Apache Kafka and Fluvio on two system configurations.
The results are as you’d expect.
Benchmark Setup
Before diving into the benchmarks, it’s important to understand Fluvio’s architecture. Fluvio is a next-generation distributed streaming engine, crafted in Rust over the last six years. It follows the conceptual patterns of Apache Kafka and adds the programming design patterns of Rust, along with a WebAssembly-based stream processing framework called Stateful DataFlow (SDF). This makes Fluvio a complete platform for event streaming use cases.
For the setup, we used a MacBook Pro 18 with an Apple M1 Max processor and 32 GB of RAM, and an AWS EC2 C7g.xlarge instance with an ARM64 Graviton processor, 4 vCPUs, 8 GB of RAM, and Elastic Block Store (EBS) storage, running the latest Ubuntu LTS release.
Specification | AWS EC2 C7g.xlarge | MacBook Pro M1 Max |
---|---|---|
Processor | AWS Graviton3 (ARM-based) | Apple M1 Max (ARM-based) |
vCPU/Cores | 4 vCPUs | 10 CPU cores (8 performance + 2 efficiency) |
Memory | 8 GB | 32 GB |
Storage | EBS-Only | Built-in SSD |
Architecture | arm64 | arm64 |
We followed the Apache Kafka Quickstart guide to install and set up Kafka, and the Fluvio Quickstart guide to set up Fluvio.
For Kafka, we installed the latest versions of Java and Kafka, launched Kafka, and ran the benchmark.
For Fluvio, we installed the latest versions of Rust and Fluvio, launched Fluvio, and ran the benchmark.
On both machines, we ran the benchmarks for Kafka first, followed by Fluvio. We ran a series of benchmarks with 200,000 records of 5,120 bytes each.
Kafka Benchmark Command:
bin/kafka-producer-perf-test.sh --producer-props bootstrap.servers=localhost:9092 --topic test \
--throughput -1 --num-records 200000 --record-size 5120
Fluvio Benchmark Command:
fluvio benchmark producer --num-records 200000 --record-size 5120
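Because the Kafka numbers shifted as the JVM warmed up, it can help to script repeated runs instead of launching each one by hand. The following is a minimal Python sketch, not the harness used for this post: `run_repeated` and its `runs` parameter are hypothetical names, the two argument lists simply restate the commands above, and the Kafka script path assumes you launch it from the Kafka install directory.

```python
import subprocess

# The two commands from this post, split into argument lists.
# The relative Kafka path assumes the current directory is the Kafka install.
KAFKA_CMD = [
    "bin/kafka-producer-perf-test.sh",
    "--producer-props", "bootstrap.servers=localhost:9092",
    "--topic", "test",
    "--throughput", "-1",
    "--num-records", "200000",
    "--record-size", "5120",
]
FLUVIO_CMD = [
    "fluvio", "benchmark", "producer",
    "--num-records", "200000",
    "--record-size", "5120",
]


def run_repeated(cmd, runs=5):
    """Run `cmd` `runs` times in sequence and return each run's stdout."""
    outputs = []
    for _ in range(runs):
        # check=True raises CalledProcessError if a run exits non-zero.
        done = subprocess.run(cmd, capture_output=True, text=True, check=True)
        outputs.append(done.stdout)
    return outputs
```

Calling `run_repeated(KAFKA_CMD)` and then `run_repeated(FLUVIO_CMD)` mirrors the order used here. Each returned string is whatever the tool wrote to standard output; this sketch does not attempt to parse it.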
Benchmark Results
Here are the benchmark results in sequence: first on the MacBook Pro and then on the EC2 instance, with Apache Kafka followed by Fluvio in each case.
MacBook Pro 18 with an Apple M1 Max processor and 32 GB of RAM:
Metric | Fluvio on M1 Max | Kafka on M1 Max |
---|---|---|
Records/sec | 76,923 | 49,346 |
Throughput | 394.6 MB/sec | 240.95 MB/sec |
P99 Latency | 5.8ms | 132ms |
- Kafka Performance: We started Kafka and initially observed 181 MB/s throughput with a P99 latency of 541 ms. As the JVM warmed up, performance improved, peaking around 240 MB/s with a P99 latency of 150 ms. At this point, the JVM and Kafka needed ~1 GB of RAM while idle and a couple of percentage points of CPU to keep the engine humming.
- Fluvio Performance: As we moved to Fluvio, we noticed immediate differences. Starting a local cluster with a single streaming controller and processing unit, we achieved 391 MB/s throughput and a much lower P99 latency of 7 ms right from the start. Fluvio required near 0% of CPU cycles and ~50 MB of RAM in its idle state. Memory utilization went up momentarily as the records were generated on the fly to run the benchmarks.
AWS EC2 C7g.xlarge instance with an ARM64 Graviton processor, 4 vCPUs, and 8 GB of RAM:
Metric | Fluvio on EC2 | Kafka on EC2 |
---|---|---|
Records/sec | 37,195 | 26,780 |
Throughput | 190.8 MB/sec | 130.77 MB/sec |
P99 Latency | 10.8ms | 419ms |
- Kafka Performance: Kafka’s initial throughput was underwhelming, starting at less than 30 MB/s with a P99 of 1,516 ms. After multiple runs, its performance stabilized at around 130 MB/s throughput with a P99 of 424 ms. The JVM behaved the same way, needing nearly 1 GB of RAM while running for optimal performance.
- Fluvio Performance: Fluvio again demonstrated substantial advantages. It reached 192 MB/s throughput with a P99 latency of just 14 ms, and maintained around 190 MB/s throughput with a P99 latency of roughly 11 ms over several trials. Fluvio required the same ~50 MB of RAM to stay on and saw momentary loads as it generated data on the fly.
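As a sanity check on units, throughput should be roughly records/sec multiplied by the 5,120-byte record size. The sketch below converts the records/sec figures from both tables using decimal megabytes (10^6 bytes) and binary mebibytes (2^20 bytes). The helper names are illustrative, and the idea that each tool derives its throughput figure this way is an assumption rather than something confirmed in this post.

```python
RECORD_SIZE = 5120  # bytes per record, from the benchmark commands

# (records/sec, reported throughput) copied from the two result tables.
RESULTS = {
    ("Fluvio", "M1 Max"): (76_923, 394.6),
    ("Kafka", "M1 Max"): (49_346, 240.95),
    ("Fluvio", "EC2"): (37_195, 190.8),
    ("Kafka", "EC2"): (26_780, 130.77),
}


def throughput(records_per_sec, unit_bytes):
    """Bytes per second, expressed in units of `unit_bytes` bytes."""
    return records_per_sec * RECORD_SIZE / unit_bytes


for (engine, machine), (rps, reported) in RESULTS.items():
    mb = throughput(rps, 10**6)   # decimal megabytes per second
    mib = throughput(rps, 2**20)  # binary mebibytes per second
    print(f"{engine} on {machine}: {mb:.2f} MB/s, {mib:.2f} MiB/s "
          f"(reported {reported})")
```

Under this assumption, the Kafka figures line up with the binary convention (about 240.95 and 130.76), while the Fluvio figures sit closer to the decimal one (about 393.85 and 190.44). If that holds, records/sec is the safer column for a like-for-like comparison.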
Comparison
Again, there is no perfectly fair way to compare Kafka’s JVM-dependent, garbage-collected engine with Fluvio’s Rust-based engine. Still, the benchmarks point to a few specific observations.
- Infrastructure Overhead: Fluvio requires minimal resources while idle, which indicates substantially lower infrastructure overhead.
- Speed and Scale: At default configs, Fluvio delivers about 1.5 times the throughput and roughly 23 to 39 times lower P99 latency.
The raw performance and minimal infrastructure overhead enable Fluvio to offer orders of magnitude performance upgrades on end-to-end distributed streaming and stream processing workloads.
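The ratios in this comparison can be recomputed directly from the result tables. Here is a small Python sketch that does so. The `TABLES` layout and the `ratios` helper are illustrative names, and the throughput ratio is taken on records/sec so that it does not depend on how each tool defines a megabyte.

```python
# Figures copied from the result tables: (records/sec, P99 latency in ms).
TABLES = {
    "M1 Max": {"Fluvio": (76_923, 5.8), "Kafka": (49_346, 132.0)},
    "EC2": {"Fluvio": (37_195, 10.8), "Kafka": (26_780, 419.0)},
}


def ratios(machine):
    """Return (throughput gain, P99 latency reduction) of Fluvio over Kafka."""
    f_rps, f_p99 = TABLES[machine]["Fluvio"]
    k_rps, k_p99 = TABLES[machine]["Kafka"]
    return f_rps / k_rps, k_p99 / f_p99


for machine in TABLES:
    gain, latency_cut = ratios(machine)
    print(f"{machine}: {gain:.2f}x records/sec, {latency_cut:.1f}x lower P99")
```

This works out to roughly 1.56x and 1.39x on records/sec, and about 22.8x and 38.8x on P99 latency, for the M1 Max and EC2 runs respectively.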
Conclusion
We could say many things. But it’s always better to hear from a customer about the value they are getting. So here is a clip of what Miles, the CEO of Trustless Engineering, has to say about the benefits they get from Fluvio’s performance on InfinyOn Cloud.
Stay in Touch:
Thanks for checking out the benchmarks. Hopefully these are useful as you navigate the data streaming landscape.
If you’d like to see how InfinyOn Cloud could level up your data operations, just ask.
- Share your thoughts on our GitHub Discussions.
- Join the conversation on the Fluvio Discord Server.
- Subscribe to our YouTube channel for project updates.
- Follow us on Twitter for the latest news.
- Connect with us on LinkedIn for professional networking.