IR Model
The primary internal data model in LINC is BindingPackage.
This is the durable evidence artifact used for:
- JSON serialization
- validation against native artifacts
- link-plan construction
- evidence hand-off to downstream tools
Module Organization
The IR is split into focused submodules:
- ir::types for declarations, functions, records, enums, typedefs, and variables
- ir::link for native link surfaces and provider matching
- ir::macros for preprocessor macro capture and classification
The crate root intentionally stays narrower and exposes workflow-facing entry
points plus the small cross-crate contracts. Consumers that need the detailed
binding IR should import it from linc::ir directly.
Top-Level Shape
At a high level, a BindingPackage contains:
- schema_version
- linc_version
- target
- inputs
- macros
- layouts
- link
- provenance
- macro_provenance
- effective_macro_environment
- source_path
- items
- diagnostics
This matters because the package is not just “the declarations”. It is the declaration surface plus the environment needed to interpret it.
target
target stores information about the scan environment:
- target triple
- compiler command
- compiler version
- flavor
These fields are descriptive rather than prescriptive. They help downstream tooling understand what environment produced the package.
inputs
inputs records the source-side configuration of the scan:
- entry_headers
- include_dirs
- defines
That is useful for reproducibility, debugging, downstream rebuild decisions, and comparing packages produced from different header environments.
link
BindingLinkSurface is the normalized native-link surface attached to the
package. It preserves:
- preferred link mode
- native surface kind
- platform constraints
- include, framework, and library paths
- declared libraries, frameworks, and artifacts
- original ordered inputs
This is evidence about the native surface, not a build system of its own.
items
items is the core declaration surface.
Supported variants:
| Variant | Meaning |
|---|---|
| Function | C function declaration |
| Record | struct or union |
| Enum | C enum with named variants |
| TypeAlias | typedef |
| Variable | extern/global variable |
| Unsupported | recognized but not fully representable construct |
Downstream tools should not ignore Unsupported blindly. Those entries are
signals that the source surface contains shapes LINC saw but could not
faithfully lower.
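As a rough sketch of that advice, a consumer might collect the Unsupported entries and report them rather than dropping them. The Item enum below is a hypothetical, simplified stand-in for the real linc::ir item type (each variant carries only a name here), and unsupported_names and is_partial are illustrative helper names, not part of the crate.

```rust
// Hypothetical, simplified mirror of the item variants listed above.
// The real linc::ir types carry far more data than a name.
#[derive(Debug, Clone, PartialEq)]
pub enum Item {
    Function(String),
    Record(String),
    Enum(String),
    TypeAlias(String),
    Variable(String),
    Unsupported(String),
}

/// Returns the names of every Unsupported entry so a caller can report them
/// instead of silently skipping them.
pub fn unsupported_names(items: &[Item]) -> Vec<&str> {
    items
        .iter()
        .filter_map(|item| match item {
            Item::Unsupported(name) => Some(name.as_str()),
            _ => None,
        })
        .collect()
}

/// Treats a declaration surface with any Unsupported entry as partial.
pub fn is_partial(items: &[Item]) -> bool {
    !unsupported_names(items).is_empty()
}
```

A downstream generator could call is_partial before emitting bindings and refuse, or warn, when it returns true.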
BindingType
BindingType represents the type graph used throughout functions, fields,
typedefs, and variables.
Primitive types include:
- Void
- Bool
- Char
- SChar
- UChar
- Short
- UShort
- Int
- UInt
- Long
- ULong
- LongLong
- ULongLong
- Float
- Double
- LongDouble
Compound/reference forms include:
- Pointer
- Array
- FunctionPointer
- TypedefRef
- RecordRef
- EnumRef
- Opaque
Pointer constness is modeled on the pointee so the IR can distinguish
char *, const char *, and char * const without pretending to be a full C
semantic model.
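To illustrate how the three pointer shapes can stay distinct, here is a minimal sketch. The Ty enum and its const_pointee and const_self fields are assumptions made for this example and are not the actual BindingType definition. The renderer uses the east-const spelling (char const *), which C treats as equivalent to const char *.

```rust
// Hypothetical, reduced type graph: only enough to show where constness
// lives. Names and fields are assumptions, not the linc::ir API.
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    Char,
    Pointer {
        pointee: Box<Ty>,
        const_pointee: bool, // the pointed-to value is const
        const_self: bool,    // the pointer itself is const
    },
}

/// Renders a C spelling with the qualifier placed after the thing it
/// qualifies, so nested pointers also come out correctly.
pub fn c_spelling(ty: &Ty) -> String {
    match ty {
        Ty::Char => "char".to_string(),
        Ty::Pointer { pointee, const_pointee, const_self } => {
            let mut out = c_spelling(pointee);
            if *const_pointee {
                out.push_str(" const");
            }
            out.push_str(" *");
            if *const_self {
                out.push_str(" const");
            }
            out
        }
    }
}
```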
Functions
FunctionBinding contains:
- name
- calling_convention
- parameters
- return_type
- variadic
- source_offset
Current calling convention coverage is conservative. When the extractor sees
recognized declaration attributes such as stdcall, cdecl, fastcall,
vectorcall, or thiscall, FunctionBinding.calling_convention preserves
that evidence instead of flattening everything to plain C.
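The idea of preserving attribute evidence can be sketched as follows. The CallingConvention enum, its variant names, and the from_attribute and resolve helpers are hypothetical, and the handling of leading and trailing underscores is an assumption about how attributes might be spelled in headers, not a description of the LINC extractor.

```rust
// Hypothetical enum; variant names follow the attribute names listed above
// and are not taken from the actual linc::ir definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallingConvention {
    C,
    Cdecl,
    Stdcall,
    Fastcall,
    Vectorcall,
    Thiscall,
}

/// Maps one declaration attribute to a convention. Surrounding underscores
/// (as in `__stdcall`) are stripped first. Unrecognized attributes yield None.
pub fn from_attribute(attr: &str) -> Option<CallingConvention> {
    match attr.trim_matches('_') {
        "cdecl" => Some(CallingConvention::Cdecl),
        "stdcall" => Some(CallingConvention::Stdcall),
        "fastcall" => Some(CallingConvention::Fastcall),
        "vectorcall" => Some(CallingConvention::Vectorcall),
        "thiscall" => Some(CallingConvention::Thiscall),
        _ => None,
    }
}

/// Picks the first recognized attribute and falls back to plain C only when
/// no calling-convention evidence is present.
pub fn resolve(attrs: &[&str]) -> CallingConvention {
    attrs
        .iter()
        .find_map(|attr| from_attribute(attr))
        .unwrap_or(CallingConvention::C)
}
```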
Records
RecordBinding represents both struct and union.
It carries:
- kind
- name
- fields
- optional representation evidence
- source_offset
Opaque records are represented by fields: None. That means the type exists
by name but layout and fields are intentionally unavailable.
Each FieldBinding may also carry bit_width. When bit_width is present,
the field is a bitfield and LINC preserved that width as partial evidence even
if full ABI placement is not yet available.
FieldBinding.layout is the companion ABI-evidence slot for compiler-probed
field placement. It mainly carries offset_bytes when record field probing has
been requested and succeeded.
When available from compiler probing, RecordBinding.representation preserves
size, align, and completeness.
RecordBinding.abi_confidence is the higher-level summary of how much ABI
evidence was attached.
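A consumer can read these record signals roughly as sketched below. The structs are trimmed-down, hypothetical mirrors: offset_bytes is flattened onto the field instead of living under FieldBinding.layout, and the helper methods are illustrative names that the source does not define.

```rust
// Hypothetical, trimmed-down mirrors of FieldBinding and RecordBinding.
#[derive(Debug, Clone, PartialEq)]
pub struct FieldBinding {
    pub name: String,
    pub bit_width: Option<u32>,    // Some(_) marks a bitfield
    pub offset_bytes: Option<u64>, // stands in for FieldBinding.layout
}

#[derive(Debug, Clone, PartialEq)]
pub struct RecordBinding {
    pub name: String,
    pub fields: Option<Vec<FieldBinding>>, // None means the record is opaque
}

impl RecordBinding {
    /// An opaque record exists by name only.
    pub fn is_opaque(&self) -> bool {
        self.fields.is_none()
    }

    /// Names of bitfield members. Empty for opaque records.
    pub fn bitfield_names(&self) -> Vec<&str> {
        self.fields
            .iter()
            .flatten()
            .filter(|field| field.bit_width.is_some())
            .map(|field| field.name.as_str())
            .collect()
    }

    /// True when the record is not opaque and every field has a probed offset.
    pub fn fully_probed(&self) -> bool {
        match &self.fields {
            Some(fields) => fields.iter().all(|f| f.offset_bytes.is_some()),
            None => false,
        }
    }
}
```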
Enums
EnumBinding stores:
- enum name
- variants
- optional representation evidence
- source offset
Each variant carries name and optional value. If a value is absent,
downstream tooling should not invent one without understanding the original
source and evaluation context.
When available from compiler probing, EnumBinding.representation preserves
underlying size and signedness.
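One way to respect the "do not invent values" rule is to resolve an enum only when every variant has an explicit value. The sketch below assumes a simplified, hypothetical EnumVariant with an i64 value; the real variant type and value width may differ.

```rust
// Hypothetical, simplified mirror of an enum variant.
#[derive(Debug, Clone, PartialEq)]
pub struct EnumVariant {
    pub name: String,
    pub value: Option<i64>,
}

/// Returns (name, value) pairs only when every variant carries a value.
/// Otherwise returns the names that lack one, leaving the decision about
/// how to evaluate them to the caller.
pub fn resolved_values(variants: &[EnumVariant]) -> Result<Vec<(&str, i64)>, Vec<&str>> {
    let mut resolved = Vec::with_capacity(variants.len());
    let mut missing = Vec::new();
    for variant in variants {
        match variant.value {
            Some(value) => resolved.push((variant.name.as_str(), value)),
            None => missing.push(variant.name.as_str()),
        }
    }
    if missing.is_empty() {
        Ok(resolved)
    } else {
        Err(missing)
    }
}
```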
Macros And Provenance
The IR also carries supporting evidence:
- macro capture and classification
- compiler-probed type layouts
- provenance for declarations and macros
- effective macro environment snapshots
Do not think of the IR as “just declarations”. It is the full analysis record.
Serialization Rules
- keep container fields defaultable
- preserve declaration order where the contract depends on it
- treat additive fields as the normal evolution path
- serialize deterministic JSON in tests and fixtures
What The IR Is Not
The IR is not:
- a parser AST
- a Rust codegen AST
- a shared ABI crate
- a build graph
It is the LINC evidence model.
For typedef-style named types, TypeAliasBinding.abi_confidence records whether the alias remains
declaration-only or now has compiler-probed layout evidence attached.
TypeAliasBinding.canonical_resolution is the alias-normalization slot.
When present, it preserves:
- the typedef names crossed during alias chasing
- the terminal non-alias BindingType that downstream consumers can treat as the canonical shape
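Alias chasing of this kind can be sketched as a walk over a typedef table. Everything below is hypothetical: the CanonicalResolution struct, its field names, and the string-keyed table are stand-ins chosen for illustration, whereas the real slot stores a BindingType as the terminal shape.

```rust
use std::collections::HashMap;

// Hypothetical result shape; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct CanonicalResolution {
    pub crossed: Vec<String>, // typedef names visited, in order
    pub terminal: String,     // first non-alias type reached
}

/// Follows typedef -> target links until reaching a name that is not itself
/// a typedef. Returns None if the table contains a cycle.
pub fn resolve_alias(start: &str, typedefs: &HashMap<&str, &str>) -> Option<CanonicalResolution> {
    let mut crossed: Vec<String> = Vec::new();
    let mut current = start;
    while let Some(next) = typedefs.get(current) {
        // Seeing the same alias twice means the table is malformed.
        if crossed.iter().any(|seen| seen == current) {
            return None;
        }
        crossed.push(current.to_string());
        current = *next;
    }
    Some(CanonicalResolution {
        crossed,
        terminal: current.to_string(),
    })
}
```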
BindingType now also carries explicit qualifier metadata beyond the old pointer-const shortcut:
- pointer nodes preserve pointer-self qualifiers
- BindingType::Qualified preserves top-level const / volatile / restrict / atomic evidence
The current Rust codegen layer still lowers most qualifiers conservatively, but downstream library consumers no longer need to reconstruct that evidence from diagnostics alone.
JSON Transport
BindingPackage remains the durable transport artifact, but LINC no longer wraps JSON transport in
crate-specific helper functions.
Use serde_json directly at the tool boundary:
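```rust
use linc::ir::BindingPackage;

// Illustrative helper: round-trips a package through pretty-printed JSON.
fn round_trip(package: &BindingPackage) -> serde_json::Result<BindingPackage> {
    let json = serde_json::to_string_pretty(package)?;
    serde_json::from_str(&json)
}
```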
That keeps the artifact story centered on the data contract itself rather than convenience helpers.
Variables
VariableBinding captures global variables and extern declarations with:
- name
- ty
- source_offset
These are validated separately from functions because symbol-kind mismatches matter.
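A symbol-kind check of the sort implied here might look like the sketch below. SymbolKind, SymbolCheck, and check_symbol are hypothetical names invented for this example, and the source does not describe how LINC actually inspects native artifacts.

```rust
// Hypothetical classification of a native symbol.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SymbolKind {
    Function,
    Data,
}

// Hypothetical outcome of comparing a declaration with the native artifact.
#[derive(Debug, PartialEq, Eq)]
pub enum SymbolCheck {
    Matches,
    Missing,
    KindMismatch { declared: SymbolKind, found: SymbolKind },
}

/// Compares the kind a declaration implies with the kind found in the
/// artifact, where None means the symbol was not found at all.
pub fn check_symbol(declared: SymbolKind, found: Option<SymbolKind>) -> SymbolCheck {
    match found {
        None => SymbolCheck::Missing,
        Some(kind) if kind == declared => SymbolCheck::Matches,
        Some(kind) => SymbolCheck::KindMismatch { declared, found: kind },
    }
}
```

Under this sketch, a variable declaration whose name resolves to a function symbol is reported as a mismatch rather than accepted as present.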
Unsupported Items
UnsupportedItem should be treated as a first-class signal.
It usually means some combination of these:
- the syntax was recognized
- extraction could not preserve enough structure
- a diagnostic was emitted
- the binding package is partial, not complete
This is a safer design than silently dropping source constructs.
Macros, Layouts, And Link Data Are Part Of The IR
Even though declarations are the center of the package, three other surfaces often matter just as much:
- macros
- layouts
- link
That is why they live at the package level. They are package-wide evidence, not per-item add-ons.
Declaration Provenance
provenance is a package-level list aligned with items.
Each entry may currently carry:
- item_name
- item_kind
- source_offset
- source_origin
- source_location
This is intentionally additive evidence. It gives downstream tooling a stable way to talk about where a declaration came from without rewriting the declaration IR around source metadata.
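If "aligned with items" means index alignment (an assumption in this sketch), a consumer can pair each item with its provenance entry and detect drift between the two lists. The Provenance struct here is a reduced, hypothetical mirror, and pair_with_provenance is an illustrative helper name.

```rust
// Hypothetical, reduced entry; the real one also carries item_kind,
// source_origin, and source_location.
#[derive(Debug, Clone, PartialEq)]
pub struct Provenance {
    pub item_name: String,
    pub source_offset: Option<usize>,
}

/// Pairs each item name with the provenance entry at the same index.
/// Returns None when the lists differ in length or the names disagree,
/// since either case means the alignment assumption does not hold.
pub fn pair_with_provenance<'a>(
    item_names: &[&'a str],
    provenance: &'a [Provenance],
) -> Option<Vec<(&'a str, &'a Provenance)>> {
    if item_names.len() != provenance.len() {
        return None;
    }
    let mut pairs = Vec::with_capacity(item_names.len());
    for (name, entry) in item_names.iter().zip(provenance) {
        if entry.item_name != *name {
            return None;
        }
        pairs.push((*name, entry));
    }
    Some(pairs)
}
```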
Serialization Rules
The entire package is serde serializable.
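```rust
use linc::ir::BindingPackage;

// Illustrative helper: round-trips a package through pretty-printed JSON.
fn round_trip(package: &BindingPackage) -> serde_json::Result<BindingPackage> {
    let json = serde_json::to_string_pretty(package)?;
    serde_json::from_str(&json)
}
```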
Important artifact behavior:
- the package carries a schema_version
- unknown future schema versions are rejected
- older JSON missing newer optional fields generally deserializes with defaults
That makes the package suitable for machine-to-machine contracts.
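The version rule above can be sketched as a small gate run before trusting a deserialized package. This assumes schema_version is an unsigned integer, which the source does not state, and SUPPORTED_SCHEMA_VERSION, SchemaError, and check_schema_version are hypothetical names with a made-up version number.

```rust
// Hypothetical constant; the real supported version lives in the linc crate.
pub const SUPPORTED_SCHEMA_VERSION: u32 = 3;

#[derive(Debug, PartialEq, Eq)]
pub enum SchemaError {
    UnknownFutureVersion { found: u32, supported: u32 },
}

/// Accepts the supported version and anything older, and rejects versions
/// newer than this reader knows about.
pub fn check_schema_version(found: u32) -> Result<(), SchemaError> {
    if found > SUPPORTED_SCHEMA_VERSION {
        Err(SchemaError::UnknownFutureVersion {
            found,
            supported: SUPPORTED_SCHEMA_VERSION,
        })
    } else {
        Ok(())
    }
}
```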
What The IR Is Not
The IR is useful, but it is intentionally not:
- a full semantic C type system
- a full ABI proof
- a final link plan
- a final code generation contract for every target
Use it as the normalized source of truth for downstream decisions, not as a claim that all C semantics have been solved.