0009 — Protocraft port and end-to-end test suite

Status: implemented
Implemented in: 2026-03-24
App: prototext


Problem

The current test suite for prototext consists of:

The fixture round-trip test (fixture_roundtrip_annotated) reads pre-committed .pb files from prototext/fixtures/cases/ and verifies that render_as_text produces a stable output — but it does not verify that render_as_bytes(render_as_text(wire)) == wire. It tests text stability, not losslessness.

More critically, the existing fixtures were generated by the Python protocraft library and committed as binary files. There is no in-repo tool that can generate new fixtures, extend the suite, or verify that the fixture wire bytes match a known intent.

In ../../code/prototools (the reference Python implementation), a richer test infrastructure exists:

This spec describes porting protocraft to Rust and wiring it into a comprehensive end-to-end test suite within the prototext crate.


Goals

  1. Port protocraft to Rust as a test-only library inside the prototext crate.
  2. Port craft_a.py — all 30+ named fixtures — as Rust fixture definitions using the ported protocraft API.
  3. Add end-to-end tests that, for every fixture, check the round-trip properties specified in §3.
  4. Keep the .pb fixture files in prototext/fixtures/cases/ as the authoritative golden outputs, but derive them from protocraft at test time rather than committing pre-generated binaries.

Non-goals


Specification

1. Protocraft Rust library

Location: prototext/src/protocraft/ (compiled only under #[cfg(test)], or pulled in through [dev-dependencies]).

Alternatively, if the module grows large: a separate workspace crate protocraft/ with publish = false.

1.1 Core abstractions

/// A message builder accumulating wire bytes.
pub struct Message { ... }

/// An integer value with optional overhanging bytes.
pub struct Integer {
    pub value: u64,
    pub ohb: u8,   // overhanging bytes (0 = canonical)
}

/// A wire tag with explicit field number, wire type, and optional ohb.
pub struct Tag {
    pub field: u64,
    pub wire_type: u8,
    pub ohb: u8,
}

1.2 Builder API

The Rust API mirrors the Python API structurally. Each builder function encodes a field into the current message's buffer.

impl Message {
    /// Create a new root message.
    pub fn new() -> Self;

    /// Encode an int32 field (wire type 0, zigzag-free).
    pub fn int32(&mut self, field: impl IntoTag, value: impl IntoInteger);
    pub fn int64(&mut self, field: impl IntoTag, value: impl IntoInteger);
    pub fn uint32(&mut self, field: impl IntoTag, value: impl IntoInteger);
    pub fn uint64(&mut self, field: impl IntoTag, value: impl IntoInteger);
    pub fn sint32(&mut self, field: impl IntoTag, value: i32);
    pub fn sint64(&mut self, field: impl IntoTag, value: i64);
    pub fn bool_(&mut self, field: impl IntoTag, value: impl IntoInteger);
    pub fn enum_(&mut self, field: impl IntoTag, value: impl IntoInteger);

    pub fn fixed32(&mut self, field: impl IntoTag, value: u32);
    pub fn fixed64(&mut self, field: impl IntoTag, value: u64);
    pub fn sfixed32(&mut self, field: impl IntoTag, value: i32);
    pub fn sfixed64(&mut self, field: impl IntoTag, value: i64);
    pub fn float_(&mut self, field: impl IntoTag, value: f32);
    pub fn double_(&mut self, field: impl IntoTag, value: f64);

    pub fn bytes_(&mut self, field: impl IntoTag, value: &[u8]);
    pub fn string(&mut self, field: impl IntoTag, value: &str);

    /// Append a nested message (length-delimited).
    pub fn message(&mut self, field: impl IntoTag, nested: Message);

    /// Append a group (start tag + contents + end tag).
    pub fn group(&mut self, field: impl IntoTag, end_field: impl IntoTag, nested: Message);

    /// Append raw bytes verbatim (no tag, no length prefix).
    pub fn raw(&mut self, data: &[u8]);

    /// Append an arbitrary field with a custom wire tag and raw value bytes.
    pub fn custom(&mut self, tag_bytes: &[u8]);

    /// Return the accumulated wire bytes.
    pub fn build(self) -> Vec<u8>;
}

IntoTag is implemented for:

IntoInteger is implemented for:
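A minimal sketch of how the conversion traits and the builder could fit together. It assumes IntoTag is implemented for a bare u64 field number and for Tag, and IntoInteger for a bare u64 and for Integer. Both sets are assumptions inferred from the call sites in §2, and the trait method names (into_tag, into_integer) and the put_tag helper are hypothetical. Only uint64, message, and build are shown.

```rust
// Hypothetical sketch: the impl sets and trait method names are assumptions.
#[derive(Clone, Copy)]
pub struct Integer { pub value: u64, pub ohb: u8 }

#[derive(Clone, Copy)]
pub struct Tag { pub field: u64, pub wire_type: u8, pub ohb: u8 }

pub trait IntoTag {
    /// Resolve to a full Tag; bare field numbers take the builder's default wire type.
    fn into_tag(self, default_wire_type: u8) -> Tag;
}
impl IntoTag for u64 {
    fn into_tag(self, default_wire_type: u8) -> Tag {
        Tag { field: self, wire_type: default_wire_type, ohb: 0 }
    }
}
impl IntoTag for Tag {
    // An explicit Tag is used verbatim, so a fixture can force a mismatched wire type.
    fn into_tag(self, _default_wire_type: u8) -> Tag { self }
}

pub trait IntoInteger { fn into_integer(self) -> Integer; }
impl IntoInteger for u64 {
    fn into_integer(self) -> Integer { Integer { value: self, ohb: 0 } }
}
impl IntoInteger for Integer {
    fn into_integer(self) -> Integer { self }
}

/// Varint encoder with `ohb` redundant trailing bytes (see §1.3).
fn encode_varint_ohb(mut value: u64, ohb: u8) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 { out.push(byte); break; }
        out.push(byte | 0x80);
    }
    if ohb > 0 {
        let last = out.len() - 1;
        out[last] |= 0x80;                  // canonical last byte now continues
        for _ in 1..ohb { out.push(0x80); } // zero payload, continuation set
        out.push(0x00);                     // terminating zero byte
    }
    out
}

#[derive(Default)]
pub struct Message { buf: Vec<u8> }

impl Message {
    pub fn new() -> Self { Self::default() }

    fn put_tag(&mut self, tag: Tag) {
        let key = (tag.field << 3) | u64::from(tag.wire_type);
        self.buf.extend(encode_varint_ohb(key, tag.ohb));
    }

    pub fn uint64(&mut self, field: impl IntoTag, value: impl IntoInteger) {
        self.put_tag(field.into_tag(0)); // wire type 0: varint
        let v = value.into_integer();
        self.buf.extend(encode_varint_ohb(v.value, v.ohb));
    }

    pub fn message(&mut self, field: impl IntoTag, nested: Message) {
        self.put_tag(field.into_tag(2)); // wire type 2: length-delimited
        self.buf.extend(encode_varint_ohb(nested.buf.len() as u64, 0));
        self.buf.extend(nested.buf);
    }

    pub fn build(self) -> Vec<u8> { self.buf }
}
```

Under these assumptions, a canonical field is written as inner.uint64(1u64, 150u64), while an anomalous one passes a Tag or Integer with a non-zero ohb. The literals are suffixed here only to keep type inference unambiguous in the sketch.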

1.3 Varint encoding with overhanging bytes

/// Encode a varint with `ohb` extra continuation bytes appended.
/// ohb=0 produces canonical minimal encoding.
fn encode_varint_ohb(value: u64, ohb: u8) -> Vec<u8>;

Example: encode_varint_ohb(42, 3) yields [0xaa, 0x80, 0x80, 0x00] (4 bytes instead of the canonical 1).
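One possible body for this function, as a sketch: emit the canonical varint, then set the continuation bit on its last byte and append ohb redundant bytes, of which the final one is 0x00. The exact byte layout is inferred from the example above and is an assumption for any other input.

```rust
/// Encode `value` as a protobuf varint followed by `ohb` redundant bytes.
/// Sketch only: the layout of the overhanging bytes is inferred from the
/// single documented example (42 with ohb = 3).
fn encode_varint_ohb(mut value: u64, ohb: u8) -> Vec<u8> {
    let mut out = Vec::new();

    // Canonical part: 7 payload bits per byte, least significant group first.
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // last canonical byte, continuation bit clear
            break;
        }
        out.push(byte | 0x80);
    }

    // Overhang: mark the last canonical byte as "more follows", then pad
    // with zero-payload bytes so that exactly `ohb` extra bytes are added.
    if ohb > 0 {
        let last = out.len() - 1;
        out[last] |= 0x80;
        for _ in 1..ohb {
            out.push(0x80);
        }
        out.push(0x00);
    }
    out
}
```

A decoder that accumulates payload bits still reads the original value, because every added byte contributes zero bits; only the encoded length changes.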

1.4 Wire tag encoding

/// Encode a wire tag: (field_number << 3) | wire_type, as a varint with
/// optional overhanging bytes.
fn encode_tag(field: u64, wire_type: u8, ohb: u8) -> Vec<u8>;
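A sketch of encode_tag built on the varint encoder from §1.3. The wire type is deliberately not validated, on the assumption that fixtures need to emit invalid wire types such as 7. The encode_varint_ohb body is repeated so the example is self-contained.

```rust
/// Varint encoder with `ohb` redundant trailing bytes (sketch, see §1.3).
fn encode_varint_ohb(mut value: u64, ohb: u8) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
    if ohb > 0 {
        let last = out.len() - 1;
        out[last] |= 0x80;
        for _ in 1..ohb {
            out.push(0x80);
        }
        out.push(0x00);
    }
    out
}

/// Encode the key (field_number << 3) | wire_type as a varint.
/// No range check on `wire_type`: values above 7 would spill into the
/// field-number bits, which the caller is assumed to avoid.
fn encode_tag(field: u64, wire_type: u8, ohb: u8) -> Vec<u8> {
    let key = (field << 3) | u64::from(wire_type);
    encode_varint_ohb(key, ohb)
}
```

For instance, field 11 with wire type 2 gives the single byte 0x5a, and field 74 with wire type 0 and ohb = 2 gives a two-byte canonical key followed by two overhanging bytes.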

2. Fixture definitions (craft_a.rs)

Location: prototext/src/protocraft/craft_a.rs

A Rust port of all named fixtures from craft_a.py. Each fixture is a fn returning Vec<u8>:

pub fn test_empty() -> Vec<u8> { Message::new().build() }

pub fn test_field_invalid() -> Vec<u8> {
    let mut m = Message::new();
    let mut inner = Message::new();
    inner.custom(b"\x07");          // invalid wire type
    m.message(/* messageRp */ 11, inner);
    // ... etc.
    m.build()
}

pub fn test_n_overhanging_bytes() -> Vec<u8> {
    let mut m = Message::new();
    let mut inner = Message::new();
    inner.uint64(Tag { field: 1, wire_type: 0, ohb: 17 }, 0u64);
    inner.uint64(Tag { field: /* uint64Rp */ 74, wire_type: 0, ohb: 2 }, 0u64);
    m.message(11, inner);
    m.build()
}

The complete list of fixtures to port (from craft_a.py):

Validation: Each craft_a function output must match the corresponding committed .pb file in prototext/fixtures/cases/. This is verified by a dedicated test:

#[test]
fn craft_a_matches_committed_fixtures() {
    for (name, func) in ALL_FIXTURES {
        let generated = func();
        let committed = load_case_bytes(name).expect("fixture file missing");
        assert_eq!(generated, committed,
            "craft_a::{name} output does not match fixtures/cases/{name}.pb");
    }
}

This test acts as a contract: if craft_a diverges from the committed files, the test fails and the developer must either fix craft_a or regenerate the committed fixture.
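The test relies on ALL_FIXTURES and load_case_bytes, which are not defined in this spec. A sketch of one possible shape follows: a static table of (name, function) pairs and a loader that reads <name>.pb from a directory. The cases_dir parameter, the placeholder fixture, and the mismatched_fixtures helper are hypothetical and exist only for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Signature shared by every fixture function.
type FixtureFn = fn() -> Vec<u8>;

// Stand-in fixtures; the real ones would be built with the Message builder.
fn test_empty() -> Vec<u8> { Vec::new() }
fn placeholder() -> Vec<u8> { vec![0x08, 0x01] } // hypothetical: field 1, varint 1

/// Registry pairing each fixture name with its generator.
const ALL_FIXTURES: &[(&str, FixtureFn)] = &[
    ("test_empty", test_empty),
    ("placeholder", placeholder),
];

/// Read `<cases_dir>/<name>.pb`. The directory argument is an assumption;
/// the spec's helper takes only the name.
fn load_case_bytes(cases_dir: &Path, name: &str) -> io::Result<Vec<u8>> {
    fs::read(cases_dir.join(format!("{name}.pb")))
}

/// Names of fixtures whose generated bytes differ from the committed file.
fn mismatched_fixtures(cases_dir: &Path) -> io::Result<Vec<&'static str>> {
    let mut bad = Vec::new();
    for &(name, func) in ALL_FIXTURES {
        if func() != load_case_bytes(cases_dir, name)? {
            bad.push(name);
        }
    }
    Ok(bad)
}
```

Collecting all mismatches before failing reports every divergent fixture in one run, whereas asserting inside the loop stops at the first.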

3. End-to-end round-trip tests

Location: prototext/tests/roundtrip.rs (extend existing file) or a new prototext/tests/e2e.rs.

3.1 Lossless round-trip with annotations (all fixtures)

For every fixture name in ALL_FIXTURES:

let wire = craft_a::name();
let schema = knife_schema();
let text = render_as_text(&wire, Some(&schema), opts(true)).unwrap();
let wire2 = render_as_bytes(&text, opts(true)).unwrap();
assert_eq!(wire2, wire, "{name}: round-trip with annotations must be bit-exact");
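The snippet above uses craft_a::name() as shorthand for "each fixture in turn". A sketch of how the loop could be driven from a fixture table follows, with an optional name prefix so the same driver also covers a subset such as the canonical_ fixtures of §3.3. The functions to_text and to_bytes are hex-codec stand-ins for render_as_text and render_as_bytes, not the prototext text format, and the fixture bodies are placeholders.

```rust
type FixtureFn = fn() -> Vec<u8>;

// Placeholder fixtures (hypothetical contents).
fn test_empty() -> Vec<u8> { Vec::new() }
fn canonical_one() -> Vec<u8> { vec![0x08, 0x01] }

const ALL_FIXTURES: &[(&str, FixtureFn)] = &[
    ("test_empty", test_empty),
    ("canonical_one", canonical_one),
];

/// Stand-in for render_as_text: lowercase hex.
fn to_text(wire: &[u8]) -> Result<String, String> {
    Ok(wire.iter().map(|b| format!("{b:02x}")).collect())
}

/// Stand-in for render_as_bytes: parse the hex string back into bytes.
fn to_bytes(text: &str) -> Result<Vec<u8>, String> {
    if !text.is_ascii() || text.len() % 2 != 0 {
        return Err(format!("not a hex string: {text:?}"));
    }
    (0..text.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&text[i..i + 2], 16).map_err(|e| e.to_string()))
        .collect()
}

/// Round-trip every fixture whose name starts with `prefix` ("" selects all)
/// and return one message per failure instead of stopping at the first.
fn roundtrip_failures(prefix: &str) -> Vec<String> {
    let mut failures = Vec::new();
    for &(name, func) in ALL_FIXTURES {
        if !name.starts_with(prefix) {
            continue;
        }
        let wire = func();
        match to_text(&wire).and_then(|text| to_bytes(&text)) {
            Ok(wire2) if wire2 == wire => {}
            Ok(_) => failures.push(format!("{name}: bytes differ after round-trip")),
            Err(e) => failures.push(format!("{name}: {e}")),
        }
    }
    failures
}
```

A test would then assert that roundtrip_failures("") returns an empty list, printing the collected messages otherwise.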

3.2 No panic without annotations (all fixtures)

For every fixture:

let wire = craft_a::name();
let schema = knife_schema();
// Must not panic or return Err
let _ = render_as_text(&wire, Some(&schema), opts(false)).unwrap();

3.3 Lossless round-trip without annotations (canonical fixtures only)

For canonical fixtures (name.starts_with("canonical_")):

let wire = craft_a::name();
let schema = knife_schema();
let text = render_as_text(&wire, Some(&schema), opts(false)).unwrap();
let wire2 = render_as_bytes(&text, opts(false)).unwrap();
assert_eq!(wire2, wire,
    "{name}: canonical fixture must round-trip without annotations");

4. Fixture file regeneration

A binary target or test helper gen_fixtures that:

  1. Calls every craft_a::* function.
  2. Writes the output to prototext/fixtures/cases/<name>.pb.

This replaces the Python scripts/gen_prototext_fixtures.py script. It is invoked manually when the fixture definitions change, not as part of the normal test run.
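A sketch of the core of such a helper, assuming a fixture table of (name, function) pairs. The table shape, the cases_dir parameter, and the returned file count are hypothetical choices for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

type FixtureFn = fn() -> Vec<u8>;

// Stand-in fixture; the real table would list every craft_a function.
fn test_empty() -> Vec<u8> { Vec::new() }

const ALL_FIXTURES: &[(&str, FixtureFn)] = &[("test_empty", test_empty)];

/// Write every fixture to `<cases_dir>/<name>.pb`, overwriting existing
/// files, and return the number of files written.
fn gen_fixtures(cases_dir: &Path) -> io::Result<usize> {
    fs::create_dir_all(cases_dir)?;
    for &(name, func) in ALL_FIXTURES {
        fs::write(cases_dir.join(format!("{name}.pb")), func())?;
    }
    Ok(ALL_FIXTURES.len())
}
```

A binary target could call gen_fixtures(Path::new("prototext/fixtures/cases")) from main and exit non-zero on error.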

5. Field number mapping

The Python craft_a fixtures reference SwissArmyKnife fields by name. The Rust port uses field numbers directly: protocraft performs no schema-aware name resolution, which remains a Python-only feature because the Rust code has no runtime descriptor introspection in this context.

The knife.proto field number mapping must be documented in a comment at the top of craft_a.rs for reference.


Implementation order

  1. Implement the protocraft Rust library (Message, Integer, Tag, encode_varint_ohb, encode_tag).
  2. Port canonical_* fixtures first (simplest — no anomalies).
  3. Add craft_a_matches_committed_fixtures test; verify canonical fixtures match committed .pb files.
  4. Port remaining test_* fixtures incrementally, verifying each against the committed .pb file.
  5. Add e2e round-trip tests (§3.1, §3.2, §3.3).
  6. Add gen_fixtures regeneration helper.

References