TLDR
YaFF is Yandex’s open-source zero-copy wire format for Protobuf — Apache 2.0, currently C++, v0.1.0.
The .proto file stays the source of truth; only the physical memory layout changes.
On Yandex’s benchmarks, the Flat Layout reads hot data ~3.8× faster than FlatBuffers, within 1.2× of a raw C++ struct.
Four layouts — Fixed, Flat, Sparse, Dynamic — trade read speed for schema flexibility; Dynamic is the default.
YaFF runs in its advertising recommendation system, where it reports 10–20% CPU savings at production scale.
Adoption is incremental: drop it into one hot path, with two-way Protobuf conversion at the edges.
Yandex has open-sourced YaFF (Yet another Flat Format), a high-performance C++ serialization library, under Apache 2.0. YaFF provides a zero-copy wire format for the Protobuf ecosystem: your .proto file stays the single source of truth, and only the way data sits in memory changes. It targets server-side runtimes.
What is YaFF
YaFF is not a replacement for Protobuf. It is an alternative wire format for Protobuf messages. The same .proto schema generates a proto-like C++ API. Reads need no parsing step, so fields come straight from the buffer. Less performance-sensitive code can still parse the wire format back into Protobuf messages. That two-way conversion is what makes module-by-module adoption realistic. You introduce YaFF in one hot path and leave the rest on Protobuf.
The Problem it Targets
Protobuf parsing can consume double-digit percentages of CPU in high-load backends. At scale, that maps to thousands of physical cores. The common zero-copy option is FlatBuffers, also from Google. But FlatBuffers is not a Protobuf drop-in: it is semantically incompatible with Protobuf and requires maintaining a separate schema and conversion layer. Migrating means duplicated schemas, different schema-evolution rules, and hand-written field converters. Many teams conclude the cost is not worth it. YaFF aims at that gap: zero-copy reads with Protobuf semantics preserved.
How the Layouts Work
A layout decides how a message is stored in the buffer. It changes only the physical representation, leaving the schema and generated interfaces unchanged. YaFF ships four layouts. Fixed is a plain packed struct with no header and a frozen schema. Flat adds a two-byte header and supports schema evolution. Sparse addresses fields through a meta table, fitting sparse schemas. Dynamic is the default and selects Flat or Sparse at runtime. It uses Flat while the schema permits, then switches to Sparse when evolution breaks flat alignment.
Layout | Read access | Per-message overhead | Schema evolution | Best for
Fixed | 1 read, 0 branches | 0 bytes | Frozen | Small inlined primitives
Flat | 2 reads, 1 branch | 2 bytes | Restricted (type preservation) | Dense, hot data
Sparse | 4 reads, 2 branches | 6 bytes | Unrestricted | Sparse schemas, free evolution
Dynamic (default) | Flat or Sparse at runtime | 2 or 6 bytes | Unrestricted | General application logic
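To make the "reads and branches" columns concrete, here is a minimal sketch of what a header-less read and a read behind a two-byte header could look like. The byte layouts, the meaning given to the header (body size in bytes), and the names LoadAt, ReadFixedField, and ReadFlatField are all assumptions made for illustration; they are not YaFF's actual encoding or API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical illustration only: these layouts are NOT YaFF's real encoding.

// Load a trivially copyable value from raw bytes (native byte order).
template <typename T>
T LoadAt(const std::uint8_t* base, std::size_t offset) {
    T value;
    std::memcpy(&value, base + offset, sizeof(T));
    return value;
}

// "Fixed"-style read: no header, the field sits at a known offset.
// One read, no branch. The schema must never change.
inline std::int32_t ReadFixedField(const std::uint8_t* msg,
                                   std::size_t field_offset) {
    return LoadAt<std::int32_t>(msg, field_offset);
}

// "Flat"-style read: an ASSUMED 2-byte header stores the body size in bytes.
// A writer with an older schema emits a shorter body, so the reader first
// reads the header (read 1), checks that the field exists (branch), and only
// then reads the field (read 2). Missing fields fall back to a default.
inline std::int32_t ReadFlatField(const std::uint8_t* msg,
                                  std::size_t field_offset,
                                  std::int32_t default_value) {
    const auto body_size = LoadAt<std::uint16_t>(msg, 0);
    if (field_offset + sizeof(std::int32_t) > body_size) {
        return default_value;  // field was not written by this producer
    }
    const std::size_t header_size = sizeof(std::uint16_t);
    assert(header_size == 2);
    return LoadAt<std::int32_t>(msg + header_size, field_offset);
}
```

Under these assumptions, the extra read and branch in ReadFlatField are the price of letting an older writer omit trailing fields, which is the kind of trade-off the table describes.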
Benchmark
Yandex ships a reproducible benchmark suite, built with google/benchmark in a Release build. The numbers below are median nanoseconds per read on an AMD EPYC 7713 with Clang 20.1.8. Lower is faster. In the hot hierarchical case, the Flat Layout reads in 9.79 ns. FlatBuffers needs 37.30 ns, and Protobuf needs 219.35 ns. The raw C++ struct baseline is 8.14 ns. So the Flat Layout reads about 3.8× faster than FlatBuffers here, and about 22× faster than Protobuf. It stays within 1.2× of the raw struct.
Format | Read time (ns) | Slowdown vs raw struct
Raw C++ struct | 8.14 | 1.0×
YaFF Flat Layout | 9.79 | 1.2×
YaFF Sparse Layout | 21.23 | 2.6×
FlatBuffers | 37.30 | 4.6×
Protobuf | 219.35 | 26.9×
Median ns per read, hierarchical / hot / no chain caching. Source: https://yaff.tech/docs/en/benchmarks/access
Note: The absolute numbers depend on the host CPU and memory. The ratios between formats are expected to hold across hardware.
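The ratios quoted in this section follow directly from the published medians. The short sketch below recomputes them; the constants are the figures reported above, while the helper names Slowdown and PrintSlowdowns are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Median nanoseconds per read, copied from the benchmark figures above.
constexpr double kRawStructNs   = 8.14;
constexpr double kYaffFlatNs    = 9.79;
constexpr double kYaffSparseNs  = 21.23;
constexpr double kFlatBuffersNs = 37.30;
constexpr double kProtobufNs    = 219.35;

// How many times slower `ns` is than `baseline_ns`.
inline double Slowdown(double ns, double baseline_ns) {
    assert(baseline_ns > 0.0);
    return ns / baseline_ns;
}

// Prints each format's slowdown relative to the raw C++ struct.
inline void PrintSlowdowns() {
    struct Row { const char* name; double ns; };
    const Row rows[] = {
        {"Raw C++ struct", kRawStructNs},
        {"YaFF Flat Layout", kYaffFlatNs},
        {"YaFF Sparse Layout", kYaffSparseNs},
        {"FlatBuffers", kFlatBuffersNs},
        {"Protobuf", kProtobufNs},
    };
    for (const Row& row : rows) {
        std::printf("%-20s %8.2f ns  %5.1fx\n",
                    row.name, row.ns, Slowdown(row.ns, kRawStructNs));
    }
}
```

Dividing by the Flat Layout time instead of the raw struct time gives the headline comparisons: 37.30 / 9.79 is about 3.8, and 219.35 / 9.79 is about 22.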
The Compiler Aliasing Detail
FlatBuffers and YaFF both read fields by reinterpreting raw memory as the target type. That type-punning leaves type-based alias analysis (TBAA) without strong enough facts, so LLVM's alias analysis falls back to a conservative MayAlias verdict. The compiler then cannot prove that the result of an earlier access is safe to reuse, and writing root.intermediate().leaf().a() twice re-walks the tree each time. YaFF adds annotations in its generated code that tell the compiler when reuse is safe. As long as the relevant memory is not modified between reads, those annotations can often let the compiler reuse the access chain without the caller caching it by hand.
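The cost of re-walking a chain is easier to see with a toy buffer. The sketch below is not YaFF's generated code and does not show its annotations: RootView, IntermediateView, LeafView, and BuildTree are hypothetical names, the offset-based layout is invented, and the loads use memcpy rather than type-punning so that the example stays well-defined. It only contrasts a chained access written twice with the same access hoisted into a local by hand.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Load a trivially copyable value from raw bytes.
template <typename T>
T Load(const std::uint8_t* p) {
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}

// Hypothetical views over a buffer where each parent stores a 4-byte
// offset (relative to itself) to its child.
struct LeafView {
    const std::uint8_t* p;
    std::int32_t a() const { return Load<std::int32_t>(p); }
    std::int32_t b() const { return Load<std::int32_t>(p + 4); }
};
struct IntermediateView {
    const std::uint8_t* p;
    LeafView leaf() const { return LeafView{p + Load<std::uint32_t>(p)}; }
};
struct RootView {
    const std::uint8_t* p;
    IntermediateView intermediate() const {
        return IntermediateView{p + Load<std::uint32_t>(p)};
    }
};

// Layout: [0,4) root -> intermediate, [4,8) intermediate -> leaf,
// [8,12) leaf.a, [12,16) leaf.b.
inline std::vector<std::uint8_t> BuildTree(std::int32_t a, std::int32_t b) {
    std::vector<std::uint8_t> buf(16);
    const std::uint32_t step = 4;
    std::memcpy(buf.data(), &step, 4);
    std::memcpy(buf.data() + 4, &step, 4);
    std::memcpy(buf.data() + 8, &a, 4);
    std::memcpy(buf.data() + 12, &b, 4);
    return buf;
}

// As written, each expression walks root -> intermediate -> leaf again.
inline std::int64_t SumNaive(RootView root) {
    return static_cast<std::int64_t>(root.intermediate().leaf().a()) +
           root.intermediate().leaf().b();
}

// The chain is walked once and the leaf view is reused explicitly.
inline std::int64_t SumHoisted(RootView root) {
    const LeafView leaf = root.intermediate().leaf();
    return static_cast<std::int64_t>(leaf.a()) + leaf.b();
}
```

Both functions return the same value. Whether an optimizer turns SumNaive into SumHoisted on its own depends on what its alias analysis can prove about the buffer, which is the gap the article says YaFF's annotations are meant to close.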
Where It Fits: Use Cases
YaFF targets systems where you control both producer and consumer. Recommendation and ad-serving backends are the clearest fit. According to Yandex, YaFF runs in its advertising recommendation system, where it reports 10–20% CPU savings at production scale. Memory-mapped indexes are a second fit. A host can hold tens of gigabytes of local data. Those mmap-able indexes survive service restarts without re-parsing. Search indexes, feature stores, and feed services share that read-heavy profile. The planned Columnar Layout targets analytics and ML pipelines with large repeated fields. YaFF can also be more compact than FlatBuffers, which helps cache behavior.
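The memory-mapped index case can be sketched with plain POSIX calls. This is a generic illustration, not YaFF code: the Record struct, WriteIndex, and MappedIndex are hypothetical, the file holds fixed-size records in native byte order, and error handling is kept to the minimum needed to stay correct. The point is only that a restarted process can map the file and read records in place, with no parse step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical fixed-size record stored back to back in the index file.
struct Record {
    std::uint64_t id;
    float score;
    std::uint32_t flags;
};
static_assert(sizeof(Record) == 16, "sketch assumes a 16-byte record");

// Writes `count` records to `path`. Returns true on success.
inline bool WriteIndex(const char* path, const Record* records,
                       std::size_t count) {
    const int fd = ::open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (fd < 0) return false;
    const std::size_t bytes = count * sizeof(Record);
    const ssize_t written = ::write(fd, records, bytes);
    ::close(fd);
    return written == static_cast<ssize_t>(bytes);
}

// Read-only mapping of an index file; records are read directly from it.
class MappedIndex {
public:
    explicit MappedIndex(const char* path) {
        const int fd = ::open(path, O_RDONLY);
        if (fd < 0) return;
        struct stat st {};
        if (::fstat(fd, &st) == 0 && st.st_size > 0) {
            const auto bytes = static_cast<std::size_t>(st.st_size);
            void* p = ::mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                data_ = static_cast<const std::uint8_t*>(p);
                bytes_ = bytes;
            }
        }
        ::close(fd);  // the mapping stays valid after the descriptor closes
    }
    ~MappedIndex() {
        if (data_ != nullptr) ::munmap(const_cast<std::uint8_t*>(data_), bytes_);
    }
    MappedIndex(const MappedIndex&) = delete;
    MappedIndex& operator=(const MappedIndex&) = delete;

    bool ok() const { return data_ != nullptr; }
    std::size_t size() const { return bytes_ / sizeof(Record); }

    // Copies one record out of the mapping; nothing else is deserialized.
    Record at(std::size_t i) const {
        assert(i < size());
        Record r;
        std::memcpy(&r, data_ + i * sizeof(Record), sizeof(Record));
        return r;
    }

private:
    const std::uint8_t* data_ = nullptr;
    std::size_t bytes_ = 0;
};
```

A service would call WriteIndex once when building the index and construct a MappedIndex on every start. The operating system pages data in on demand, so startup cost does not grow with a parse pass over the whole file.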
A Look at the Code
The read path mirrors Protobuf, minus the parse step.
#include <string_view>

#include "feed.pb.h"    // generated by protoc
#include "feed.yaff.h"  // generated by yaff_generate()

// 1. Serialize an existing Protobuf message into a YaFF buffer.
feed::FeedResponse proto = LoadFeedResponse();
const auto buffer = yaff::Serialize<protoyaff::feed::FeedResponse>(proto);

// 2. Read fields directly from the buffer. There is no parsing step.
const auto& response = yaff::ReadMessage<protoyaff::feed::FeedResponse>(buffer.Data());
for (const auto& item : response.items()) {
    std::string_view title = item.title();
    std::string_view author = item.author().name();  // empty if author is unset
}

// 3. Convert back to Protobuf when a consumer needs the parsed message.
feed::FeedResponse restored;
response.ParseTo(restored);
You add YaFF through CMake (find_package) or Conan. Code generation runs protobuf_generate() then yaff_generate(). Generated YaFF types live in the protoyaff::<package> namespace. Most projects only link yaff::core and yaff::proto.
Resources:
Check out the GitHub repository and Documentation.
The post Yandex Open-Sources YaFF: A Zero-Copy Wire Format for Protobuf With Near-Struct Read Speed appeared first on MarkTechPost.

