Satellites are becoming software platforms. More software means more bugs, and bugs 400 km overhead are not easy to fix. Binary parsers written in C are a recurring weak spot: the KA-SAT attack bricked thousands of satellite modems by exploiting the management interface, and researchers have since demonstrated RF signal injection against VSAT terminal firmware by crafting inputs that exploit parsing flaws.
EverParse has a good answer to that problem, for on-earth software. It generates C validators and parsers with proofs of memory safety, arithmetic safety, double-fetch freedom, and correctness. It is part of Microsoft's Project Everest, which ran from 2016 to 2021 around F* (a dependently typed language from the ML family) and produced software that ended up in the Windows kernel, Hyper-V, Linux, Firefox, and Python.
The EverParse compiler seems like a perfect fit for embedded satellite
software. I wanted to try it, but I did not want to write
.3d
files by hand (how do you maintain those? and how do you write
serialisers from them?). So I
wrote ocaml-wire, a
combinator library that lets me describe a format once in OCaml, automatically
write zero-allocation (when possible) parser and serialiser codecs, and generate the
.3d schema and C glue from the same description.
I will use CCSDS Space Packets as the running example, because they are small, bitfield-heavy, and annoying to parse by hand.
EverParse
For a CCSDS Space Packet header, the .3d format is already close to
the specification:
module SpacePacket
typedef struct SpacePacket {
UINT16BE Version:3;
UINT16BE Type:1;
UINT16BE SecHdrFlag:1;
UINT16BE APID:11;
UINT16BE SeqFlags:2;
UINT16BE SeqCount:14;
UINT16BE DataLength;
} SpacePacket;
You can read this as two 16-bit words of bitfields, followed by
a 16-bit DataLength field. EverParse turns this into a C validator
you could link into flight software. The guarantees are useful: memory safety, arithmetic safety,
double-fetch freedom, and correctness with respect to the .3d
specification. They do not prove that the .3d file matches the blue
book, nor that the C extraction from F* or the C compiler are
bug-free. Still, this is exactly the kind of low-level code where
mistakes hurt.
What .3d does not give me is the OCaml side. I would still need
separate code for decoding, encoding, and field access.
The Same Header in OCaml
ocaml-wire lets me
describe the same header once in OCaml, then use it directly in OCaml
and project it to EverParse .3d. It can also generate the OCaml/C
glue to call the verified validator. So, for the Space Packet primary
header at least, I get an OCaml codec and a C validator from the same source.
Define named fields with Field.v, bind them to record projections
with $, and assemble a codec with Codec.v:
open Wire
type packet = {
version : int; type_ : int; sec_hdr : int; apid : int;
seq_flags : int; seq_count : int; data_len : int;
}
let f_version = Field.v "Version" (bits ~width:3 U16be)
let f_type = Field.v "Type" (bits ~width:1 U16be)
let f_sec_hdr = Field.v "SecHdrFlag" (bits ~width:1 U16be)
let f_apid = Field.v "APID" (bits ~width:11 U16be)
let f_seq_flags = Field.v "SeqFlags" (bits ~width:2 U16be)
let f_seq_count = Field.v "SeqCount" (bits ~width:14 U16be)
let f_data_len = Field.v "DataLength" uint16be
(* Bind fields before building the codec. The same bound fields are then
reused for get/set. *)
let bf_version = Codec.(f_version $ fun p -> p.version)
let bf_type = Codec.(f_type $ fun p -> p.type_)
let bf_sec_hdr = Codec.(f_sec_hdr $ fun p -> p.sec_hdr)
let bf_apid = Codec.(f_apid $ fun p -> p.apid)
let bf_seq_flags = Codec.(f_seq_flags $ fun p -> p.seq_flags)
let bf_seq_count = Codec.(f_seq_count $ fun p -> p.seq_count)
let bf_data_len = Codec.(f_data_len $ fun p -> p.data_len)
let packet_codec =
let open Codec in
v "SpacePacket"
(fun version type_ sec_hdr apid seq_flags seq_count data_len ->
{ version; type_; sec_hdr; apid; seq_flags; seq_count; data_len })
[ bf_version;
bf_type;
bf_sec_hdr;
bf_apid;
bf_seq_flags;
bf_seq_count;
bf_data_len ]
The generated .3d is a projection of that OCaml description. It is not
a second source of truth:
entrypoint
typedef struct _SpacePacket(mutable WireCtx *ctx)
{
UINT16BE Version : 3 {:act WireSetU16BE(ctx, (UINT32) 0, Version); };
UINT16BE Type : 1 {:act WireSetU16BE(ctx, (UINT32) 1, Type); };
UINT16BE SecHdrFlag : 1 {:act WireSetU16BE(ctx, (UINT32) 2, SecHdrFlag); };
UINT16BE APID : 11 {:act WireSetU16BE(ctx, (UINT32) 3, APID); };
UINT16BE SeqFlags : 2 {:act WireSetU16BE(ctx, (UINT32) 4, SeqFlags); };
UINT16BE SeqCount : 14 {:act WireSetU16BE(ctx, (UINT32) 5, SeqCount); };
UINT16BE DataLength {:on-success WireSetU16BE(ctx, (UINT32) 6, DataLength);
return true; };
} SpacePacket;
The annotations in the output show how EverParse uses the codec at
validation time. :act writes each validated field value into an
output structure (via the WireSet* callbacks); :on-success is the
longer form that can conditionally fail. So the generated C parser
validates and extracts in a single pass.
The same codec also gives zero-copy field access on the OCaml side:
(* Stage once, reuse the closure *)
let get_apid = Staged.unstage (Codec.get packet_codec bf_apid)
let set_seq = Staged.unstage (Codec.set packet_codec bf_seq_count)
let apid = get_apid buf 0 (* zero allocation for int fields *)
let () = set_seq buf 0 42 (* read-modify-write for bitfields *)
(* Full record decode/encode when needed *)
let pkt = Result.get_ok (Codec.decode packet_codec buf 0)
let () = Codec.encode packet_codec pkt buf 0
The Space Packet header has a fixed layout, but many formats have
variable-length payloads. For those, dependent sizes use Field.ref
to let one field's value determine the size of another:
let f_len = Field.v "Length" uint16be
let f_data = Field.v "Data" (byte_array ~size:(Field.ref f_len))
Testing against C
The same description drives testing against the C path:
let schema = Everparse.schema packet_codec
let () = Wire_3d.run ~outdir:"schemas" [ schema ]
let () = Wire_stubs.generate ~schema_dir:"schemas" ~outdir:"."
[ C packet_codec ]
Wire_3d.run writes the .3d files, invokes EverParse, and produces
C validators. Wire_stubs.generate writes the OCaml/C FFI glue
(external declarations and C stubs) so OCaml code can call the
verified validators directly.
The fuzz/ tests use
Crowbar to check that the
OCaml codec does not crash on random input. Separately, test/diff/ generates random
schemas, runs EverParse to produce C validators, and checks that OCaml
and C agree on every random input. One OCaml description defines the
codec, the schema, and the test oracle.
Performance
Describing a format is one thing; the question is whether the generated codec stays fast enough for real-time packet processing. The benchmarks below measure three common operations (routing, frame reassembly, status polling) and compare the Wire OCaml path with C loops built on EverParse validators (generated from the same OCaml codecs). The protocols come from CCSDS (Space Packets, TM Frames, CLCW words), which travel over SpaceWire links at up to 200 Mbit/s.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Versi|T|S| APID |Seq| SeqCount |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| DataLength |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Space Packet (6 bytes)
Packet routing (10M variable-size packets in a 1.2 GB stream). A satellite receives interleaved telemetry from different subsystems (housekeeping, science, diagnostics). Each Space Packet carries an 11-bit Application Process Identifier (APID) that says which subsystem sent it. The router reads the 6-byte header, extracts the APID, looks up a handler in a 2048-entry table, and advances to the next packet. This is the tightest loop in a ground station receiver.
Status polling (1M words). The CLCW is a 32-bit status word that the spacecraft sends back inside every telemetry frame to report the state of its command link (locked out? waiting? retransmitting?). The ground system polls it continuously to detect anomalies. The word packs 13 bitfields, but the polling loop only needs 4 of them.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|C|CLC|Statu|COP| VCID |Spare|N|N|L|W|R|FAR| ReportValue |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
CLCW (32 bits)
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver| SCID |VCID |O| MCCount | VCCount |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|S|P|Seg| FirstHdrPtr |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
TM Frame (6 bytes)
Frame reassembly (1M frames, 15M packets extracted). Telemetry travels in fixed-size frames (1115 bytes each), but Space Packets are variable-size and can span frame boundaries. The TM Frame header has a FirstHdrPtr field that points to where the first complete packet starts inside the frame. The reassembler walks each frame, extracts the embedded packets, and stitches together packets that were split across two frames. This is the most complex of the three benchmarks.
The chart below shows wall-clock time per operation (lower is better). Each benchmark has two bars: the C baseline (blue, using EverParse validators) and the Wire OCaml path (orange). The ratio column shows how much slower (or faster) Wire is compared to C. A ratio below 1.0 means Wire wins.
Reading the chart from top to bottom: packet routing is 2.6x slower than C, which sounds like a lot until you note that 14.1 ns per packet gives 71 Mpkt/s (a SpaceWire link at 200 Mbit/s with 128-byte packets carries roughly 200 kpkt/s, so Wire has 350x headroom). Frame reassembly is at parity. And status polling is the surprise: Wire is faster than C.
One caveat: the C baseline uses EverParse validators that extract
all fields (7 for a Space Packet, 13 for a CLCW word), while the
OCaml path uses Staged.unstage (Codec.get ...) to read only the
fields it needs. That makes the comparison
slightly unfair to C, but it reflects how the two paths would be used
in practice. Wire wins on CLCW precisely because it loads the 4-byte
word once and extracts only the 4 fields the polling loop needs.
Additional per-field micro-benchmarks confirm the pattern. Reading an 11-bit
bitfield like SpacePacket.apid takes 3.4 ns with zero allocation.
Writing a bitfield (read-modify-write) runs at 2–3 ns.
The remaining 2.5x gap in routing comes from three sources of overhead in OCaml's runtime:
- Function dispatch. The staged field readers are closures, not inlined code. Each call goes through an indirect dispatch (about 11 extra instructions per call).
- Thread-safety barriers. OCaml 5 inserts memory barriers on every mutable write to support multicore. C writes are plain stores.
- Register pressure. The compiler cannot keep state in registers across closure calls, so it spills to the stack between reads.
None of this is specific to Wire. It is the cost of OCaml 5's runtime model. OxCaml would address all three (unboxed closures, thread-local stores, better register allocation across calls), but that is the story for another post.
Getting started
To try the examples from this post, clone the repo and build:
git clone https://github.com/parsimoni-labs/ocaml-wire.git
cd ocaml-wire
opam install . --deps-only
dune build
The library requires OCaml 5.1 or later. The codec definitions from
this post are in
examples/space
(Space Packet, CLCW, TM Frame);
examples/net
does the same for Ethernet, IPv4, TCP, and UDP. Generating .3d
files works out of the box; compiling them into verified C needs
EverParse
(3d.exe in PATH).
Satellite protocols have many binary formats: Space Packets, TM
frames, CLCW words, telecommand segments, and more. I do not want to
write C parsers and serialisers for each of them by hand. I want to
describe the format once in OCaml, get a codec I can use and compose the layers directly, and
generate the EverParse .3d and C glue from the same source.
ocaml-wire is still experimental, but I already find it quite useful
(if only to get ASCII art diagrams of packet headers :-). With OxCaml closing the remaining routing gap, the C side might matter
even less (apart from differential testing against a formally verified
parser). If you want to try it,
the repo is here.
