0006 — Fixture coverage model and gap-filling fixtures

Status: implemented
App: prototext
Implemented in: 2026-03-12

Problem

The fixture set in fixtures/ was assembled incrementally, each fixture motivated by a specific bug or feature. There is no explicit coverage model stating which inputs the fixtures are intended to cover, nor any systematic record of what is deliberately left uncovered.

Without a coverage model:

Goals

Non-goals


Specification

1. Coverage model

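A prototext fixture is a pair (wire_bytes, schema). Each fixture exercises the decoder (render_text.rs) and then the encoder (encode_text.rs) under the round-trip invariant: decoding the wire bytes to text and re-encoding that text must reproduce the input exactly, i.e. wire → text → wire' with wire' == wire.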
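The invariant can be written down as a small check. This is a sketch only: `check_round_trip`, `decode`, and `encode` are hypothetical names, and a hex codec stands in for the real decoder and encoder (which also take the schema into account).

```python
from typing import Callable

def check_round_trip(wire: bytes,
                     decode: Callable[[bytes], str],
                     encode: Callable[[str], bytes]) -> bool:
    """Return True when wire -> text -> wire' reproduces the input bytes."""
    text = decode(wire)
    return encode(text) == wire

# Stand-in codec (an assumption for illustration): hex text instead of prototext.
sample = bytes.fromhex("2a03006302")
ok = check_round_trip(sample, bytes.hex, bytes.fromhex)
```

A fixture passes only if the comparison holds byte for byte, which is why non-canonical inputs are interesting: the text form has to carry enough information to rebuild them.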

The space of proto inputs is partitioned along the following six dimensions:

Dimension A — Wire type

Every record in a protobuf message carries one of six wire types:

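| A-code | Wire type | Value | Proto field types |
|--------|-----------|-------|-------------------|
| A0 | VARINT | 0 | int32, int64, uint32, uint64, bool, enum, sint32, sint64 |
| A1 | I64 | 1 | fixed64, sfixed64, double |
| A2 | LEN | 2 | string, bytes, message, packed repeated |
| A3 | SGROUP | 3 | group (start) |
| A4 | EGROUP | 4 | group (end) |
| A5 | I32 | 5 | fixed32, sfixed32, float |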
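The wire type occupies the low three bits of each record's tag, with the field number in the remaining bits. A minimal sketch of that packing, using the hypothetical helper names `make_tag` and `split_tag`:

```python
# Wire type values as listed in Dimension A.
WIRE_TYPES = {0: "VARINT", 1: "I64", 2: "LEN", 3: "SGROUP", 4: "EGROUP", 5: "I32"}

def make_tag(field_number: int, wire_type: int) -> int:
    """Combine a field number and a wire type into a tag value."""
    return (field_number << 3) | wire_type

def split_tag(tag: int) -> tuple[int, int]:
    """Split a tag value into (field_number, wire_type)."""
    return tag >> 3, tag & 0x7

packed_colors_tag = make_tag(5, 2)       # field 5, LEN
group_start = split_tag(0x3B)            # field 7, SGROUP
```

The tag itself is then written as a varint, so small field numbers fit in a single byte.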

Dimension B — Schema relationship

| B-code | Relationship | Meaning |
|--------|--------------|---------|
| B0 | Unknown | Field number not in schema |
| B1 | Mismatch | Field number known but wire type ≠ declared type |
| B2 | Known | Field number and wire type match schema |
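The three relationships amount to a lookup followed by a comparison. The sketch below assumes a simplified schema that maps each field number to a single expected wire type; `classify_record` and the sample schema are hypothetical and ignore cases where a field may legitimately arrive with more than one wire type.

```python
def classify_record(field_number: int, wire_type: int,
                    schema: dict[int, int]) -> str:
    """Return the Dimension B code for one decoded record."""
    declared = schema.get(field_number)
    if declared is None:
        return "B0"  # field number not in schema
    if declared != wire_type:
        return "B1"  # field known, wire type differs
    return "B2"      # field and wire type both match

# Hypothetical schema: field number -> expected wire type value.
sample_schema = {2: 0, 5: 2}
codes = [
    classify_record(9, 0, sample_schema),
    classify_record(2, 2, sample_schema),
    classify_record(5, 2, sample_schema),
]
```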

Dimension C — Field cardinality

| C-code | Cardinality | Meaning |
|--------|-------------|---------|
| C0 | Optional | optional field (default in proto3; explicit in proto2) |
| C1 | Repeated unpacked | repeated field, one record per value |
| C2 | Repeated packed | repeated ... [packed=true], values length-prefixed |
| C3 | Required | required field (proto2 only) |

Dimension D — Encoding anomaly

| D-code | Anomaly | Meaning |
|--------|---------|---------|
| D0 | Canonical | Minimal, spec-compliant encoding |
| D1 | Overhang | Extra bytes in tag or value varint (non-minimal encoding) |
| D2 | Truncated | Varint or bytes field terminates early / is missing |
| D3 | Out-of-range tag | Field number ≥ 2^29 (protobuf limit) |
| D4 | Truncated negative | int32/enum encoded as 5-byte (proto2 quirk) |
| D5 | Invalid packed | Corrupt varint record inside packed array |
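To make D0, D1, and D2 concrete for a single varint, the following sketch decodes a buffer and compares the number of bytes consumed with the minimal length for the decoded value. The function names are hypothetical, and no 64-bit limit is enforced, so this illustrates the distinction and is not a validator.

```python
def decode_varint(buf: bytes) -> tuple[int, int]:
    """Decode a varint from the start of buf; return (value, bytes_used)."""
    result = 0
    for used, byte in enumerate(buf, start=1):
        result |= (byte & 0x7F) << (7 * (used - 1))
        if not byte & 0x80:  # continuation bit clear: last byte
            return result, used
    raise ValueError("truncated varint")

def minimal_varint_len(value: int) -> int:
    """Number of bytes a minimal varint needs for a non-negative value."""
    return max(1, (value.bit_length() + 6) // 7)

def classify_varint(buf: bytes) -> str:
    """Return D0 (canonical), D1 (overhang) or D2 (truncated)."""
    try:
        value, used = decode_varint(buf)
    except ValueError:
        return "D2"
    return "D0" if used == minimal_varint_len(value) else "D1"

canonical = classify_varint(b"\x01")      # value 1 in one byte
overhang = classify_varint(b"\x81\x00")   # value 1 padded to two bytes
truncated = classify_varint(b"\x81")      # continuation bit set, no next byte
```

An overhang input decodes to the same value as its canonical form, so a byte-exact round trip has to record the extra length somewhere in the text.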

Dimension E — Nesting depth

| E-code | Depth | Meaning |
|--------|-------|---------|
| E0 | Flat | No sub-messages or groups |
| E1 | One level | Field is a message or group |
| E2 | Two levels | Nested message inside message/group |

Dimension F — Enum-specific (applies only to A0 enum fields)

| F-code | Enum condition | Meaning |
|--------|----------------|---------|
| F0 | Known value | Decoded integer is in enum_values table |
| F1 | Unknown value | Decoded integer is not in enum_values table (ENUM_UNKNOWN) |
| F2 | Zero value | Decoded integer is 0 (proto default; must not be confused with "unset") |
| F3 | Negative value | Enum constant has a negative numeric value (proto2 allows this) |
| F4 | Primitive-name collision | Enum type name matches a proto primitive keyword (e.g. float) |
| F5 | Mixed in packed | A single packed array contains both known and unknown values |
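The known/unknown distinction comes down to whether a number is present in a sorted table of enum values. A rough Python analogue of such a lookup is sketched below; `lookup_enum` and both tables are hypothetical, and returning `None` for an unknown number is an assumption that leaves the rendering decision to the caller.

```python
from bisect import bisect_left
from typing import Optional

# Tables must be sorted by number for the binary search to be valid.
COLOR = [(0, "RED"), (1, "GREEN"), (2, "BLUE")]
SIGNED = [(-1, "NEG_ONE"), (0, "ZERO"), (1, "ONE")]  # hypothetical F3 table

def lookup_enum(number: int, table: list[tuple[int, str]]) -> Optional[str]:
    """Return the symbolic name for number, or None if it is not in table."""
    keys = [n for n, _ in table]
    i = bisect_left(keys, number)
    if i < len(table) and table[i][0] == number:
        return table[i][1]
    return None

known = lookup_enum(2, COLOR)       # F0
unknown = lookup_enum(99, COLOR)    # F1
zero = lookup_enum(0, COLOR)        # F2: present on the wire, not "unset"
negative = lookup_enum(-1, SIGNED)  # F3
```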

2. Coverage audit

Enum coverage (A0 × F × C):

| | F0 known | F1 unknown | F2 zero | F3 negative | F4 collision | F5 mixed-packed |
|---|---|---|---|---|---|---|
| C0 optional | enum_collision_color_known | enum_collision_color_unknown | num_enum_zero | | enum_collision_float_kind | |
| C1 repeated | enum_collision_color_repeated | | | | | |
| C2 packed | enum_collision_color_packed | | | | | |
| C3 required | | | | | | |

Note: num_enum_{zero,one,neg_one} use enumOp, which is declared as int32 in knife.proto rather than as a real protobuf enum type. They exercise the varint path but not the binary_search_by_key symbolic-lookup path in render_text.rs.

Enum in nested context:

| | Flat (E0) | In message (E1) | In group (E1) |
|---|---|---|---|
| Enum field | ✓ all enum_collision_* | | |

Varint types (A0 × B2 × C0):

All proto varint scalar types are covered: int32, int64, uint32, uint64, bool, sint32, sint64. ✓

Varint boundary values:

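| Type | Min | Max | Zero | Notes |
|------|-----|-----|------|-------|
| sint32 | num_sint32_min | num_sint32_max | | |
| sint64 | num_sint64_min | num_sint64_max | | |
| int32 | | | | |
| int64 | | | | |
| uint32 | | | | |
| uint64 | | | | |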
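The boundary values matter because encoded length changes sharply at the extremes. In the standard protobuf encoding, negative int32 and int64 values are sign-extended to 64 bits before varint encoding, so they always take ten bytes. The sketch below (hypothetical helper names) computes the lengths for the boundaries in the table.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a minimal base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_int(value: int) -> bytes:
    """Encode an int32/int64/uint32/uint64 value, sign-extending to 64 bits."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)

lengths = {
    "zero": len(encode_int(0)),
    "int32_max": len(encode_int(2**31 - 1)),
    "int32_min": len(encode_int(-2**31)),
    "uint32_max": len(encode_int(2**32 - 1)),
    "int64_max": len(encode_int(2**63 - 1)),
    "int64_min": len(encode_int(-2**63)),
    "uint64_max": len(encode_int(2**64 - 1)),
}
```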

String escape sequences:

escape_string_into() (serialize/common.rs) handles \n, \t, \r, \", \', \\, and octal \NNN. Covered by string_escapes. ✓

Packed arrays:

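| Type | Non-empty canonical | Empty | With unknown enum |
|------|---------------------|-------|-------------------|
| int32 | test_varint_packed | | N/A |
| enum (real type) | enum_collision_color_packed | enum_collision_empty_packed | enum_collision_packed_mixed |
| sint32 | | | N/A |
| sint64 | | | N/A |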

3. Deliberate non-goals of the fixture set

The following combinations are intentionally not covered by fixtures because they are either impossible in valid protobuf, covered by fuzz testing, or represent proto3-only semantics outside the current scope:

4. Fixture definitions

4.1 num_sint32_min, num_sint64_max, num_sint64_min

Gap: sint64 coverage is limited to neg_one and neg_128. The zigzag codec has boundary behaviour at INT64_MIN / INT64_MAX.

Schema: SwissArmyKnife
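For reference, the zigzag mapping sends the signed extremes to the two largest unsigned values, which is where the boundary behaviour shows up. A minimal sketch of the standard mapping, with hypothetical helper names:

```python
def zigzag(n: int, bits: int) -> int:
    """Map a signed integer of the given width to its unsigned zigzag form."""
    mask = (1 << bits) - 1
    return ((n << 1) ^ (n >> (bits - 1))) & mask

def unzigzag(z: int) -> int:
    """Inverse of zigzag for any width."""
    return (z >> 1) ^ -(z & 1)

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

sint64_min_wire = zigzag(INT64_MIN, 64)  # largest 64-bit unsigned value
sint64_max_wire = zigzag(INT64_MAX, 64)  # one below it
sint32_min_wire = zigzag(INT32_MIN, 32)
```

Both sint64 extremes need a full ten-byte varint on the wire, so these fixtures also exercise the longest varint the decoder has to accept.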

4.2 enum_collision_color_unknown_repeated

Gap: Unknown enum value (ENUM_UNKNOWN) in a repeated (non-packed) field. The existing enum_collision_color_unknown uses an optional field. The repeated case exercises a different code path in render_text.rs.

Values: colors = [0, 99, 2] — RED known, 99 unknown, BLUE known.

Schema: EnumCollision
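In the unpacked form each value gets its own tag. The sketch below builds the expected wire bytes for colors (field 4 in enum_collision.proto); the helper names are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a minimal base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_repeated_unpacked(field_number: int, values: list[int]) -> bytes:
    """Emit one VARINT record (wire type 0) per value."""
    tag = encode_varint((field_number << 3) | 0)
    return b"".join(tag + encode_varint(v) for v in values)

wire = encode_repeated_unpacked(4, [0, 99, 2])
```

The result is three records, `20 00`, `20 63`, and `20 02`, so the unknown value 99 sits between two known ones and the decoder handles each record independently.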

4.3 enum_collision_packed_mixed

Gap: Packed array where some elements are known enum values and others are unknown.

Values: colors_pk = [0, 99, 2] — RED=0 known, 99 unknown, BLUE=2 known.

Schema: EnumCollision
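In the packed form the three values share one length-prefixed record. The sketch below builds the expected wire bytes for colors_pk (field 5 in enum_collision.proto); the helper names are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a minimal base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_packed(field_number: int, values: list[int]) -> bytes:
    """Emit a single LEN record (wire type 2) holding all values as varints."""
    payload = b"".join(encode_varint(v) for v in values)
    tag = encode_varint((field_number << 3) | 2)
    return tag + encode_varint(len(payload)) + payload

wire = encode_packed(5, [0, 99, 2])
```

The record is `2a 03 00 63 02`: one tag, a length of three, then the values back to back, so known and unknown elements must be told apart inside a single payload.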

4.4 enum_collision_empty_packed

Gap: Empty packed array for a real enum field.

Values: colors_pk = [] — zero elements.

Schema: EnumCollision
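On the wire, an empty packed array is a tag followed by a zero length and nothing else. The decoding sketch below (hypothetical helper names, restricted to a buffer holding exactly one LEN record) shows that this case yields an empty list, not an error.

```python
def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def decode_packed_record(record: bytes) -> tuple[int, list[int]]:
    """Decode one LEN record of packed varints; return (field_number, values)."""
    tag, pos = decode_varint(record, 0)
    if tag & 0x7 != 2:
        raise ValueError("not a LEN record")
    length, pos = decode_varint(record, pos)
    end = pos + length
    if end != len(record):
        raise ValueError("payload length does not match record")
    values = []
    while pos < end:
        value, pos = decode_varint(record, pos)
        values.append(value)
    return tag >> 3, values

empty = decode_packed_record(b"\x2a\x00")
```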

4.5 enum_in_nested_message

Gap: Enum field inside a nested sub-message. All existing enum fixtures are flat.

A nested self-referential field is added to enum_collision.proto:

optional EnumCollision nested = 6;

Values: nested.color = GREEN, nested.unknown_color = 99.

Schema: EnumCollision
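The expected wire bytes can be assembled by hand: the inner message is encoded first, then wrapped in a LEN record for field 6. A sketch with hypothetical helper names, using the field numbers from enum_collision.proto (color = 2, unknown_color = 3) and GREEN = 1:

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a minimal base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def varint_field(field_number: int, value: int) -> bytes:
    """Encode one VARINT record (wire type 0)."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)

def len_field(field_number: int, payload: bytes) -> bytes:
    """Encode one LEN record (wire type 2) around an already-encoded payload."""
    return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload

inner = varint_field(2, 1) + varint_field(3, 99)  # color = GREEN, unknown_color = 99
wire = len_field(6, inner)
```

The outer record is `32 04` followed by the inner bytes `10 01 18 63`, so the enum lookup runs one level below the top-level message.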

4.6 enum_in_group

Gap: Enum field inside a proto2 group.

An EnumGroup group field is added to enum_collision.proto:

optional group EnumGroup = 7 {
  optional Color group_color = 1;
}

Values: EnumGroup.group_color = BLUE.

Schema: EnumCollision
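Unlike a nested message, a group has no length prefix: its body is bracketed by an SGROUP tag and an EGROUP tag carrying the same field number. A sketch of the expected bytes, with hypothetical helper names and BLUE = 2:

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a minimal base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_group(field_number: int, body: bytes) -> bytes:
    """Wrap body in SGROUP (wire type 3) and EGROUP (wire type 4) tags."""
    start = encode_varint((field_number << 3) | 3)
    end = encode_varint((field_number << 3) | 4)
    return start + body + end

group_color = encode_varint((1 << 3) | 0) + encode_varint(2)  # field 1 = BLUE
wire = encode_group(7, group_color)
```

The fixture is four bytes, `3b 08 02 3c`, and the decoder has to find the matching end tag to know where the group stops.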

4.7 string_escapes

Gap: No fixture exercises the string escape-sequence paths in escape_string_into() (serialize/common.rs).

Value: stringOp = "tab:\there\nnewline\\backslash\"quote" — exercises \t, \n, \\, \".

Schema: SwissArmyKnife
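As an illustration of the escape rules this fixture targets, here is a rough Python analogue. It is not the Rust implementation: `escape_bytes` is a hypothetical name, and emitting every byte outside printable ASCII as three-digit octal is an assumption about how non-ASCII input is treated.

```python
# Named escapes for newline, tab, carriage return, quotes and backslash.
_NAMED = {
    0x0A: r"\n", 0x09: r"\t", 0x0D: r"\r",
    0x22: r"\"", 0x27: r"\'", 0x5C: r"\\",
}

def escape_bytes(data: bytes) -> str:
    """Escape data for use inside a double-quoted text-format string."""
    out = []
    for b in data:
        if b in _NAMED:
            out.append(_NAMED[b])
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))          # printable ASCII passes through
        else:
            out.append("\\%03o" % b)    # everything else as octal \NNN
    return "".join(out)

raw = "tab:\there\nnewline\\backslash\"quote".encode("utf-8")
escaped = escape_bytes(raw)
```

The fixture value hits four of the named escapes in one string, while \r, \', and the octal path are not touched by it.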

4.8 string_escapes_bytes

Gap: Bytes fields containing non-UTF-8 / non-printable byte values.

Value: bytesOp = bytes(range(256)) — all 256 byte values.

Schema: SwissArmyKnife
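Two properties of this payload are easy to confirm: it is not valid UTF-8, and its length of 256 needs a two-byte varint prefix. The sketch below checks both; the field number of bytesOp is not stated here, so the tag is left out.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a minimal base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

payload = bytes(range(256))
length_prefix = encode_varint(len(payload))  # 256 does not fit in 7 bits

try:
    payload.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
```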

5. Schema additions to enum_collision.proto

See §4.5 and §4.6 above. The final schema is:

syntax = "proto2";

enum float { FLOAT_ZERO = 0; FLOAT_ONE = 1; FLOAT_TWO = 2; }

enum Color { RED = 0; GREEN = 1; BLUE = 2; }

message EnumCollision {
  optional float  kind          = 1;
  optional Color  color         = 2;
  optional Color  unknown_color = 3;
  repeated Color  colors        = 4;
  repeated Color  colors_pk     = 5 [packed=true];
  optional EnumCollision nested = 6;
  optional group EnumGroup = 7 {
    optional Color group_color = 1;
  }
}

References