# RFC: Composite Primary Keys
This document outlines a practical, incremental plan to add composite (multi-column) primary key support to Tonbo while maintaining backward compatibility. It explains design goals, changes required across the codebase, and a step-by-step implementation and validation plan.
## Current Status (2025-08-11)
- Phase 1: Completed. Plural `Schema` APIs, fixed projections, and Parquet writer configuration are implemented for single-PK schemas.
  - `Schema` now exposes `primary_key_indices()` and `primary_key_paths_and_sorting()` (`src/record/mod.rs`). Macro-generated single-PK schemas return one-element slices.
  - Read paths build fixed projections as `[0, 1] ∪ PKs` using `primary_key_indices()` (`src/lib.rs`, `src/transaction.rs`).
  - `DbOption::new` configures sorting columns (`_ts` then PKs) and enables stats + bloom filters for each PK column path (`src/option.rs`).
- Phase 2: Not implemented. Composite key types under `src/record/key/composite/` are placeholders; the derive macro still accepts only a single `#[record(primary_key)]` field and generates a single-field key. No multi-PK trybuild/integration tests exist yet.
- Phase 3: Not implemented. `DynSchema` remains single-PK (stores one `primary_index_arrow`, one `pk_path`, and sorting with a single PK column).
## Goals
- Support multi-column primary keys with lexicographic ordering of PK components.
- Preserve existing single-column PK behavior and public APIs (backward compatible).
- Keep zero-copy reads and projection pushdown guarantees for PK columns.
- Ensure the on-disk layout (Parquet) remains sorted by `_ts` then PK(s), with statistics/bloom filters enabled for PK columns.
- Make it easy to use via the `#[derive(Record)]` macro by allowing multiple `#[record(primary_key)]` fields.
## Non-Goals (for this RFC)
- Foreign keys, cascades, or relational constraints.
- Secondary indexes.
- Schema migrations for existing data files.
- Composite keys in dynamic records in the first phase (can be added subsequently).
## High-Level Design
- Schema trait changes (completed)
  - Now: `Schema` exposes `primary_key_indices()` and `primary_key_path()`; `primary_key_index()` was removed in favor of the slice-based `primary_key_indices()`.
  - Additive helper: `primary_key_paths_and_sorting()` returns all PK column paths plus sorting columns.
  - For single-column PKs, implementations return a one-element slice from `primary_key_indices()`. A sketch of the resulting API shape follows this list.
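For orientation, a minimal sketch of the plural API shape. The method names come from this RFC; the exact signatures and the use of parquet's `ColumnPath`/`SortingColumn` types here are assumptions, not a verbatim excerpt of the trait:

```rust
use parquet::{format::SortingColumn, schema::types::ColumnPath};

// Simplified excerpt; tonbo's real `Schema` trait has more items
// (Arrow schema, array builders, etc.), elided here.
pub trait Schema {
    /// Arrow column indices of all PK components, in comparison order.
    /// Single-PK implementations return a one-element collection.
    fn primary_key_indices(&self) -> Vec<usize>;

    /// Parquet path of the first (or only) PK component, kept for
    /// existing single-PK call sites.
    fn primary_key_path(&self) -> ColumnPath;

    /// All PK column paths plus the sorting columns: `_ts` first,
    /// then each PK component.
    fn primary_key_paths_and_sorting(&self) -> (Vec<ColumnPath>, Vec<SortingColumn>);
}
```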
- Composite key type(s)
  - Introduce a composite key in `src/record/key/composite/` with lexicographic `Ord`:
    - Option A (preferred): the macro generates a record-specific key struct, e.g., `UserKey { k1: u64, k2: String }` and `UserKeyRef<'r> { ... }`.
    - Option B (interim): provide generic tuple implementations for `(K1, K2)`, `(K1, K2, K3)`, … up to a small N. Each implements `Key` and `KeyRef` with lexicographic `Ord`, plus `Encode`/`Decode`, `Hash`, `Clone`.
  - For string/bytes components, `KeyRef` holds borrowed forms, mirroring current single-PK behavior (see the Option B sketch below).
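A minimal Option B sketch under simplified trait shapes. Tonbo's real `Key`/`KeyRef` traits carry additional obligations (`Encode`/`Decode`, `Hash`, Arrow datum conversion) that are omitted here as assumptions; the sketch only captures the ordering and borrowing story:

```rust
// Simplified stand-ins for tonbo's Key/KeyRef traits (assumption: the real
// traits also require Encode/Decode, Hash, and Arrow conversions).
pub trait Key: Ord + Clone {
    type Ref<'r>: KeyRef<'r, Key = Self>
    where
        Self: 'r;
    fn as_key_ref(&self) -> Self::Ref<'_>;
}

pub trait KeyRef<'r>: Ord + Clone {
    type Key: Key;
    fn to_key(self) -> Self::Key;
}

// Rust tuples already compare lexicographically, so the pair impl only
// has to wire up the borrowed form.
impl<K1: Key, K2: Key> Key for (K1, K2) {
    type Ref<'r> = (K1::Ref<'r>, K2::Ref<'r>) where Self: 'r;

    fn as_key_ref(&self) -> Self::Ref<'_> {
        (self.0.as_key_ref(), self.1.as_key_ref())
    }
}

impl<'r, R1: KeyRef<'r>, R2: KeyRef<'r>> KeyRef<'r> for (R1, R2) {
    type Key = (R1::Key, R2::Key);
    fn to_key(self) -> Self::Key {
        (self.0.to_key(), self.1.to_key())
    }
}

// Example component impls: u64 is Copy; owned String borrows as &str.
impl Key for u64 {
    type Ref<'r> = u64 where Self: 'r;
    fn as_key_ref(&self) -> u64 {
        *self
    }
}

impl<'r> KeyRef<'r> for u64 {
    type Key = u64;
    fn to_key(self) -> u64 {
        self
    }
}

impl Key for String {
    type Ref<'r> = &'r str where Self: 'r;
    fn as_key_ref(&self) -> &str {
        self
    }
}

impl<'r> KeyRef<'r> for &'r str {
    type Key = String;
    fn to_key(self) -> String {
        self.to_owned()
    }
}

fn main() {
    let a = (1u64, "zoe".to_string());
    let b = (2u64, "amy".to_string());
    // The first component dominates; ties fall through to the second.
    assert!(a < b);
    assert!(a.as_key_ref() < b.as_key_ref());
    assert_eq!(b.as_key_ref().to_key(), b);
}
```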
- Macro updates (tonbo_macros)
  - Allow multiple `#[record(primary_key)]` fields. Order of appearance in the struct determines comparison order (later we can add `order = i` if needed).
  - Generate:
    - a record-specific key struct and ref struct (Option A), or a mapping to a tuple (Option B);
    - `type Key = <GeneratedKey>` in the `Schema` impl;
    - `fn key(&self) -> <GeneratedKeyRef>` in the `Record` impl;
    - `fn primary_key_indices(&self) -> Vec<usize>` in the `Schema` impl (indices are offset by 2 for `_null`, `_ts`).
  - Ensure `RecordRef::from_record_batch` and projection logic always keep all PK columns, even if they are not listed in the projection.
  - Keep encoding/array-builder signatures unchanged; they already append values per field.
  - A conceptual Option A expansion follows this list.
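A conceptual sketch of what the derive might emit for a two-field PK (Option A). The trait shapes are simplified stand-ins for tonbo's `Record`/`Schema`, and all generated names are hypothetical:

```rust
// Simplified stand-ins; the real traits carry more items.
pub trait Record {
    type KeyRef<'r>
    where
        Self: 'r;
    fn key(&self) -> Self::KeyRef<'_>;
}

pub trait Schema {
    type Key;
    fn primary_key_indices(&self) -> Vec<usize>;
}

// User-written record (attributes shown as comments):
pub struct User {
    pub tenant_id: u64, // #[record(primary_key)]
    pub name: String,   // #[record(primary_key)]
    pub email: Option<String>,
    pub age: u8,
}

// Generated key and key-ref types; derived Ord follows field order,
// which yields exactly the lexicographic comparison the RFC requires.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct UserKey {
    pub tenant_id: u64,
    pub name: String,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct UserKeyRef<'r> {
    pub tenant_id: u64,
    pub name: &'r str,
}

pub struct UserSchema;

impl Record for User {
    type KeyRef<'r> = UserKeyRef<'r> where Self: 'r;

    fn key(&self) -> UserKeyRef<'_> {
        UserKeyRef { tenant_id: self.tenant_id, name: &self.name }
    }
}

impl Schema for UserSchema {
    type Key = UserKey;

    fn primary_key_indices(&self) -> Vec<usize> {
        // Struct fields 0 and 1, shifted by 2 for the `_null`, `_ts` columns.
        vec![2, 3]
    }
}
```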
- Projections and read paths
  - Replace single-index assumptions with multi-index collections:
    - Use `[0, 1] ∪ primary_key_indices()` to build fixed projections in `src/lib.rs` and `src/transaction.rs` (see the sketch below).
    - In all `RecordRef::projection` usages, ensure all PK columns are always retained (already implied by the fixed mask).
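A minimal sketch of assembling the fixed mask; the helper name and call-site shape are hypothetical, not tonbo's actual code:

```rust
/// Build the fixed projection: `_null` (0), `_ts` (1), plus every PK column.
/// `primary_key_indices` already carries the +2 offset; user-requested
/// columns are appended afterwards, with duplicates removed.
fn fixed_projection(primary_key_indices: &[usize], requested: &[usize]) -> Vec<usize> {
    let mut mask: Vec<usize> = vec![0, 1];
    mask.extend_from_slice(primary_key_indices);
    mask.extend_from_slice(requested);
    mask.sort_unstable();
    mask.dedup();
    mask
}

fn main() {
    // PKs at Arrow columns 2 and 3; the user asks only for column 5.
    assert_eq!(fixed_projection(&[2, 3], &[5]), vec![0, 1, 2, 3, 5]);
}
```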
- Parquet writer configuration
  - In `DbOption::new`, use `primary_key_paths_and_sorting()` to:
    - enable stats and bloom filters for each PK column path via `.set_column_statistics_enabled()` and `.set_column_bloom_filter_enabled()` (invoked once per path);
    - set sorting columns as `[SortingColumn(_ts, …), SortingColumn(pk1, …), SortingColumn(pk2, …), …]`.
  - A sketch of the builder calls follows.
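A sketch of how these settings could be applied per PK path. The parquet builder methods are the real crate API; the helper's shape and the exact descending/nulls flags for `_ts` are assumptions:

```rust
use parquet::{
    file::properties::{EnabledStatistics, WriterProperties},
    format::SortingColumn,
    schema::types::ColumnPath,
};

/// Sketch: writer properties sorted by `_ts` then each PK component, with
/// stats and bloom filters enabled on every PK column.
fn pk_writer_properties(
    pk_paths: Vec<ColumnPath>,
    pk_arrow_indices: &[usize],
) -> WriterProperties {
    // `_ts` (Arrow column 1) sorts first, then each PK component.
    let mut sorting = vec![SortingColumn::new(1, true, true)];
    sorting.extend(
        pk_arrow_indices
            .iter()
            .map(|&i| SortingColumn::new(i as i32, false, true)),
    );

    let mut builder = WriterProperties::builder().set_sorting_columns(Some(sorting));

    // Stats and bloom filters are per-column options: one call per PK path.
    for path in pk_paths {
        builder = builder
            .set_column_statistics_enabled(path.clone(), EnabledStatistics::Page)
            .set_column_bloom_filter_enabled(path, true);
    }
    builder.build()
}
```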
- Dynamic records (phase 3)
  - Extend `DynSchema` to track `primary_indices: Vec<usize>` in metadata (replacing the single `primary_key_index`); a rough sketch follows this list.
  - Update `DynRecordRef::new` and readers to honor multiple PK indices.
  - Ensure tombstone writes (row == None) still populate all PK columns from `Ts.key` so ordering/lookups remain correct.
  - Define a composite key wrapper for `Value`/`ValueRef` (or generate a per-dyn-schema composite type if feasible). Initially out of scope for phase 1.
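A rough sketch of the metadata change; apart from the `primary_indices` name, the surrounding types and constructor are assumptions about the dynamic-record module, not its current code:

```rust
// Assumed, simplified per-column descriptor.
pub struct ValueDesc {
    pub name: String,
    pub nullable: bool,
}

pub struct DynSchema {
    fields: Vec<ValueDesc>,
    primary_indices: Vec<usize>, // was: a single primary_key_index
}

impl DynSchema {
    /// PK components must stay non-nullable, mirroring the static-record rule.
    pub fn new(fields: Vec<ValueDesc>, primary_indices: Vec<usize>) -> Self {
        for &i in &primary_indices {
            assert!(
                !fields[i].nullable,
                "primary key component `{}` must not be nullable",
                fields[i].name
            );
        }
        Self { fields, primary_indices }
    }

    /// All PK indices, consumed by projection building and readers.
    pub fn primary_key_indices(&self) -> &[usize] {
        &self.primary_indices
    }
}
```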
## Step-by-Step Plan
### Phase 1: Core plumbing (single-PK stays working)
- Extend `Schema` trait
  - Add `primary_key_indices()` and `primary_key_paths_and_sorting()` with default impls wrapping existing methods (see the shim sketch below).
  - Update call sites in `DbOption::new`, `src/lib.rs`, and `src/transaction.rs` to use the plural forms.
  - Acceptance: all tests pass; no behavior change for single-PK users.
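A sketch of the migration shim this step describes; per the status section above, the legacy single-index method was later removed once call sites migrated:

```rust
// Migration-shim sketch: the plural method defaults to wrapping the legacy
// single-PK accessor, so existing Schema implementations compile unchanged.
pub trait Schema {
    fn primary_key_index(&self) -> usize;

    /// Default impl keeps single-PK schemas working during the refactor.
    fn primary_key_indices(&self) -> Vec<usize> {
        vec![self.primary_key_index()]
    }
}
```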
- Fixed projection refactor
  - Replace single `primary_key_index` usage with iteration over `primary_key_indices()` to construct `fixed_projection = [0, 1] ∪ PKs`.
  - Acceptance: existing tests and scan/get projections behave identically for single-PK.
- Parquet writer properties
  - Replace the single `primary_key_path()` usage with the plural variant to configure stats, bloom filters, and sorting columns for `_ts` plus all PK components.
  - Acceptance: files write successfully; read paths unchanged.

### Phase 2: Macro + key types
- Composite key data structure
  - Implement composite key(s) in `src/record/key/composite/` with `Encode`/`Decode`, `Ord`, `Hash`, `Key`/`KeyRef`.
  - Start with tuples `(K1, K2)`, `(K1, K2, K3)`, etc. (Option B) for faster delivery; later switch the macro default to a per-record key type (Option A).
  - Acceptance: unit tests confirm lexicographic ordering and encode/decode round-trip for composite keys (ordering half sketched below).
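A sketch of the ordering half of the acceptance tests; the encode/decode round-trip is elided because it depends on tonbo's fusio-based `Encode`/`Decode` traits:

```rust
#[cfg(test)]
mod composite_key_tests {
    #[test]
    fn tuples_order_lexicographically() {
        let mut keys = vec![
            (2u64, "amy".to_string()),
            (1u64, "zoe".to_string()),
            (1u64, "amy".to_string()),
        ];
        keys.sort();
        // First component dominates; ties break on the second.
        assert_eq!(
            keys,
            vec![
                (1, "amy".to_string()),
                (1, "zoe".to_string()),
                (2, "amy".to_string()),
            ]
        );
    }
}
```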
- Update `#[derive(Record)]`
  - Allow multiple `#[record(primary_key)]` fields and generate:
    - `type Key = (<K1>, <K2>, …)` (Option B) or `<RecordName>Key` (Option A);
    - `fn key(&self) -> (<K1Ref>, <K2Ref>, …)`;
    - `fn primary_key_indices(&self) -> Vec<usize>` with the +2 offset.
  - Ensure `from_record_batch` and projection retain all PK columns.
  - Acceptance: trybuild tests covering multi-PK compile and run; single-PK tests unchanged.
- Integration tests
  - Add end-to-end tests: insert/get/remove, range scans, projection, and ordering on 2+ PK fields (e.g., `tenant_id: u64, name: String`); a sketch follows.
  - Acceptance: all new tests pass.
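A sketch of the shape such a test might take. It assumes the multi-PK derive from this RFC lands; the tonbo calls mirror today's single-PK examples, and the composite key-ref shape passed to `get` (a tuple here) is an assumption about the generated API:

```rust
// Sketch only; constructor signatures and the key-ref shape may differ.
use fusio::path::Path;
use tonbo::{executor::tokio::TokioExecutor, DbOption, Projection, Record, DB};

#[derive(Record, Debug)]
pub struct User {
    #[record(primary_key)]
    pub tenant_id: u64,
    #[record(primary_key)]
    pub name: String,
    pub age: u8,
}

#[tokio::test]
async fn composite_pk_insert_get() {
    let options = DbOption::new(
        Path::from_filesystem_path("/tmp/tonbo_composite").unwrap(),
        &UserSchema,
    );
    let db = DB::new(options, TokioExecutor::current(), UserSchema)
        .await
        .unwrap();

    db.insert(User { tenant_id: 1, name: "alice".into(), age: 30 })
        .await
        .unwrap();

    // Point get by the full composite key; key-ref shape is hypothetical.
    let txn = db.transaction().await;
    let user = txn.get(&(1u64, "alice"), Projection::All).await.unwrap();
    assert!(user.is_some());
}
```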
### Phase 3: Dynamic records (optional)
- `DynSchema` multi-PK
  - Store `primary_indices` metadata; update dynamic arrays/refs to keep all PK columns in projections.
  - Provide a composite `ValueRef` key wrapper for in-memory operations.
  - Ensure tombstones populate PK components from `Ts.key` in builders (e.g., `DynRecordBuilder::push`).
  - Acceptance: dynamic tests mirroring the integration scenarios pass, including tombstone rows retaining all PK components.
## Code Touchpoints
- Traits/APIs: `src/record/mod.rs` (`Schema`), `src/option.rs` (`DbOption::new`)
- Read paths: `src/lib.rs` (get/scan/package), `src/transaction.rs` (get/scan)
- Macro codegen: `tonbo_macros/src/record.rs`, `tonbo_macros/src/keys.rs`, `tonbo_macros/src/data_type.rs`
- Key types: `src/record/key/composite/`
- Dynamic (phase 3): `src/record/dynamic/*` (incl. tombstone handling in `DynRecordBuilder::push`)
## Testing Strategy
- Unit tests:
  - Composite key `Ord`, `Eq`, `Hash`, `Encode`/`Decode` round-trip.
  - `Schema` default-impl compatibility.
- trybuild tests:
  - Multiple `#[record(primary_key)]` fields in a struct compile and generate the expected APIs.
  - Reject nullable PK components.
- Integration tests:
  - Insert/get/remove by composite key; range scans across composite key ranges; projection keeps PK columns.
  - WAL/compaction unaffected (basic smoke tests).
- (Optional) Property tests: ordering equivalence vs. native tuple lexicographic ordering when Option B is used (sketched below).
- Tombstones:
  - For both single- and multi-column PKs, verify that tombstone rows (row == None) keep all PK column values populated from `Ts.key`, both in the Arrow arrays and through `RecordRef::from_record_batch`.
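A sketch of the optional property test using proptest (an assumed dev-dependency, not currently in the tree); it pins the generated key's derived `Ord` (Option A shape shown, name hypothetical) to the native tuple ordering:

```rust
use proptest::prelude::*;

// Hypothetical generated key; derived Ord follows field order.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct UserKey {
    tenant_id: u64,
    name: String,
}

proptest! {
    #[test]
    fn key_ord_matches_tuple_ord(
        a1 in any::<u64>(), a2 in ".*",
        b1 in any::<u64>(), b2 in ".*",
    ) {
        let lhs = UserKey { tenant_id: a1, name: a2.clone() };
        let rhs = UserKey { tenant_id: b1, name: b2.clone() };
        // The struct's Ord must agree with the native tuple's
        // lexicographic comparison on every input.
        prop_assert_eq!(lhs.cmp(&rhs), (a1, a2).cmp(&(b1, b2)));
    }
}
```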
## Backward Compatibility & Migration
- All existing single-PK code continues to work without changes due to default-impl fallbacks.
- Users opting into composite PKs need only annotate multiple fields with `#[record(primary_key)]`.
- No on-disk migration is required for existing tables; new tables with composite PKs will write Parquet sorting columns for all PK components.
## Risks and Mitigations
- API surface increase: keep new APIs additive with conservative defaults.
- Projection bugs: comprehensive tests to ensure PK columns are always included.
- Performance: lexicographic compare is standard; Arrow array lengths are uniform, so no extra bounds checks needed.
- Dynamic records complexity: staged to a later phase to avoid blocking initial delivery.
## Example (target macro UX)

```rust
#[derive(Record, Debug)]
pub struct User {
    #[record(primary_key)]
    pub tenant_id: u64,
    #[record(primary_key)]
    pub name: String,
    pub email: Option<String>,
    pub age: u8,
}

// Generated (conceptually):
// type Key = (u64, String);
// fn key(&self) -> (u64, &str);
// fn primary_key_indices(&self) -> Vec<usize> { vec![2, 3] }
```
## Delivery Checklist
- Add `Schema` plural APIs and refactor call sites.
- Implement composite key types (tuples first).
- Enable multiple PK fields in macro; generate composite key/ref and PK indices.
- Update projection logic to retain all PK columns.
- Configure Parquet sorting/statistics for all PK components.
- Add unit/trybuild/integration tests.
- Update user guide (mention composite PK support and examples).
- Ensure tombstone rows keep PK component values (builders/readers), validated by tests.