Provable Subset

Provable Subset

Status: standing reference

This document defines the clearly analyzable, proof-eligible subset of Concrete. It specifies what is included, what is excluded, why, and how the subset connects to the proof pipeline.

For the safety model, see SAFETY. For the proof architecture, see ARCHITECTURE. For the formalization roadmap, see ../research/proof-evidence/formalization-breakdown. For language decisions that shape this subset, see DECISIONS.

Definition

The provable subset is the fragment of Concrete programs that can be formally reasoned about in Lean 4. It currently has two related boundaries:

  1. ProofCore (Concrete/ProofCore.lean) — the actual compiler extractor over validated Core IR
  2. --report proof (Concrete/Report.lean) — a stricter report that approximates which functions are currently good proof targets
  3. Proof (Concrete/Proof.lean) — formal evaluation semantics and proved properties over the proof fragment

ProofCore does not define its own semantics. It is a filter, not a rival IR. The semantic authority remains CoreCheck; ProofCore identifies which validated Core declarations the Lean proof infrastructure can reason about.

The important distinction is:

  • ProofCore extraction is the real compiler boundary today.
  • --report proof is a stricter operational heuristic that also flags extern calls and raw-pointer operations in function bodies.

This document describes both, and labels them explicitly.

What Is Included

Functions

Current ProofCore extractor

A function is proof-eligible when all of the following hold:

CriterionCheckRationale
Empty capability setf.capSet.isEmptyNo side effects — pure computation only
Not trusted!f.isTrustedNo pointer-level implementation techniques
Not an entry point!f.isEntryPointEntry points have runtime obligations (exit code, setup)
No trusted impl originf.trustedImplOrigin.isNoneNot inherited from a trusted impl block

A ProofCore-eligible function is intended to be a pure transformation: it takes values, computes, and returns a value. The current extractor enforces the signature-level and declaration-level conditions above. It does not yet inspect function bodies for extern calls or raw-pointer operations.

Stricter --report proof heuristic

The --report proof mode applies the same base conditions and additionally excludes functions that:

  • call extern functions
  • perform raw pointer operations in their bodies

That makes the report a better approximation of today’s practical proof targets than the current extractor alone.

Types

DeclarationEligible whenRationale
StructNot repr(C), not packed, no reprAlignNo FFI layout constraints — pure algebraic structure
EnumNo builtinIdNot a compiler-intercepted builtin — ordinary algebraic data
Trait definitionAlways includedProvides context for method resolution (not yet verified)

Expressions (formalized in Proof.lean)

The current formal semantics covers:

ConstructStatus
Integer literals and arithmetic (add, sub, mul)Formalized
Boolean literals and logical operatorsFormalized
Comparison operators (eq, ne, lt, le, gt, ge)Formalized
Let bindingsFormalized
If/then/elseFormalized
Non-recursive function callsFormalized
Structs and field accessNot yet formalized
Enums and matchNot yet formalized
While loopsNot yet formalized
Recursive functionsNot yet formalized (uses fuel for termination)
String/char operationsNot yet formalized

What Is Excluded

Always excluded by the current ProofCore extractor

DeclarationWhy
Functions with capabilitiesSide effects make formal reasoning about return values insufficient — the function’s meaning includes its effects
Trusted functionsContain pointer-level techniques that are outside the formal model
Entry pointsRuntime setup/teardown obligations, not pure computation
repr(C) / packed structsLayout is dictated by FFI constraints, not algebraic structure
Builtin-overridden enumsCompiler-intercepted semantics, not user-defined algebraic data

Additionally excluded by --report proof

ConstructWhy
Functions that call extern functionsTheir behavior depends on external code not modeled in Proof.lean
Functions with raw pointer operations in their bodiesThese are a stronger practical signal that the function is outside today’s proof-oriented fragment

Not yet modeled (future work)

ConstructBlocking issue
Structs and enums in proof semanticsNeeds PExpr extensions for constructors, field access, match
Recursive functionsNeeds termination discipline beyond fuel
While loops with break/continueNeeds loop semantics formalization
String/char operationsNeeds string model
Heap operationsFundamentally effectful — may never enter the pure subset
Raw pointer operationsSame — trusted/unsafe territory

Pipeline Position

Source → Parse → Resolve → Check → Elab → CoreCanonicalize → CoreCheck → [ValidatedCore]
                                                                              │
                                                                    extractProofCore
                                                                              │
                                                                        [ProofCore]
                                                                              │
                                                                     Proof.lean semantics
                                                                              │
                                                                      Lean 4 theorems

The proof boundary sits after CoreCheck and before Mono. At this point the program is:

  • Fully elaborated (no surface sugar)
  • Normalized by CoreCanonicalize
  • Validated by CoreCheck (capabilities, types, match coverage, declaration legality)

ProofCore extraction happens at the ValidatedCore stage, not after monomorphization. This means proof-eligible functions are proved over their pre-monomorphization form. Generic functions that are proof-eligible in principle would need to be proved per-instantiation after mono — this is future work.

Proved Properties (Current)

Concrete/Proof.lean contains 17 theorems:

Concrete function correctness:

  • abs_positive, abs_negative, abs_zero — abs function returns correct values
  • max_right, max_left, max_self — max function returns correct values
  • clamp_in_range, clamp_below, clamp_above — clamp function returns correct values

Structural lemmas:

  • eval_lit — integer literals evaluate to themselves
  • eval_bool_lit — boolean literals evaluate to themselves
  • eval_var_bound — variable lookup succeeds when bound
  • eval_if_true, eval_if_false — conditional reduction
  • eval_add_lits, eval_sub_lits, eval_mul_lits — arithmetic on literals

All proofs use either native_decide (kernel reduction) or simp (structural simplification).

Tooling

--report proof

Shows which functions in a program are likely practical proof targets and why excluded functions are excluded:

$ concrete program.con --report proof
=== Proof Eligibility Report ===

module main:
  ✓ pure_add
  ✓ factorial
  ✗ read_file  (requires capabilities: File)
  ✗ main  (requires capabilities: Alloc, Console)
  ✗ unsafe_cast  (trusted boundary)

Totals: 5 functions, 2 eligible for ProofCore, 3 excluded

Exclusion reasons are specific: “requires capabilities: X”, “trusted boundary”, “calls extern: …”, “raw pointer operations”.

Relationship to Other Subsets

ProofCore vs High-Integrity Profile

The provable subset (ProofCore) and the high-integrity profile (see SAFETY) are related but different:

Provable subsetHigh-integrity profile
PurposeFormal verification in LeanStricter runtime restrictions for critical code
ScopePure functions onlyEntire programs under tighter rules
CapabilitiesNone allowedRestricted set (no Unsafe, bounded allocation)
TrustedExcludedAllowed but restricted to approved wrappers
FFIExcludedAllowed through approved trusted extern fn only
EnforcementProofCore extraction filter (with stricter --report proof heuristic today)Future compiler mode/checks

The provable subset is strictly narrower than the high-integrity profile. Every proof-eligible function would also pass high-integrity restrictions, but many high-integrity functions (those with restricted capabilities or approved trusted wrappers) are not proof-eligible.

ProofCore vs the rest of the program

ProofCore is not a wall. It is a window.

Programs are not expected to be entirely proof-eligible. The typical pattern is:

  • Most of the program uses capabilities, trusted wrappers, and FFI normally
  • Selected pure functions (algorithms, validators, transformers) are proof-eligible
  • The --report proof tool shows what is and is not eligible
  • Proofs target the eligible fragment; the rest is covered by testing, reports, and audits

Design Constraints

The provable subset’s shape follows directly from Concrete’s permanent language decisions:

  1. No closures / no trait objects → all dispatch in proof-eligible code is statically resolved. No hidden function pointers or vtables to reason about.
  2. No inference-heavy abstraction → types are explicit. The proof model does not need to reconstruct inferred types.
  3. Trusted = pointer containment only → the boundary between provable and non-provable code is clean. Trusted functions are excluded because they use pointer techniques, not because of vague “unsafety.”
  4. Capabilities in signatures → the eligibility check is trivial: capSet.isEmpty. No need to analyze function bodies for hidden effects.
  5. Phase separation → ProofCore extracts from a well-defined pipeline stage (ValidatedCore) with known invariants. It does not depend on information from later passes.

Future Direction

Tracked in ../research/proof-evidence/formalization-breakdown. The main expansion axes are:

  1. Richer proof semantics — structs, enums, match, recursion in Proof.lean
  2. Source-to-Core traceability — connecting proof results back to user-visible source locations
  3. Language guarantee proofs — proving that capability checking is sound, linearity is enforced, trusted boundaries are honest
  4. Compiler preservation proofs — proving that Core→SSA lowering preserves the semantics of proof-eligible functions
  5. User-facing proof workflow — exporting proof subjects, documenting proof artifacts, tying proofs to --report proof output

Each level builds on the previous. The current state (Level 1: formal semantics for a small pure fragment) is the foundation. See the formalization breakdown for the full roadmap.