PARC Reference

PARC is the source frontend of the toolchain. The real crate surface today is:

  • preprocessing through both external-driver and built-in paths
  • C parsing into a typed AST
  • extraction into a durable source IR
  • header scanning that goes straight to SourcePackage
  • AST-oriented support APIs such as visiting, spans, locations, and printing

That means the crate serves two audiences at once:

  1. downstream tools that want parc::ir::SourcePackage
  2. parser-facing tools that want direct AST access

What PARC Owns

  • preprocessing
  • parsing
  • parser recovery
  • source extraction
  • source diagnostics and provenance
  • source IR
  • header scanning
  • AST traversal and debug support

What PARC Does Not Own

  • symbol inventories
  • binary validation
  • link-plan construction
  • Rust lowering or crate emission

Actual Data Flow

raw source / headers
  -> driver or built-in preprocessor
  -> parser AST
  -> extraction
  -> SourcePackage
  -> serialized source artifact or downstream harness

scan short-circuits that flow into one high-level operation, while parse and driver expose the earlier stages for syntax-level consumers.

Module Layout

Module        What it is actually for
driver        file-oriented parse flow using an external preprocessor
preprocess    built-in preprocessing, tokenization, include resolution
parse         fragment parsing and direct translation-unit parsing from strings
scan          end-to-end header scanning into SourcePackage
extract       AST-to-IR lowering and normalization
ir            durable PARC-owned source contract
ast           syntax tree for parser-facing consumers
visit         traversal hooks over the AST
span / loc    source-position helpers
print         debug-oriented AST printer
intake        already-preprocessed source intake helpers

Boundary

The strongest consumer boundary is parc::ir::SourcePackage.

That is the point where PARC stops owning the problem. Anything involving binary evidence or Rust generation is downstream from PARC, even if tests and harnesses compose those crates together elsewhere.

Reading Strategy

Read the book in one of these orders:

  1. source-contract path: Getting Started -> Source IR -> Extraction -> Header Scanning -> API Contract
  2. parser-facing path: Getting Started -> Driver API -> Parser API -> AST Model -> Visitor Pattern
  3. contributor/debug path: Project Layout -> Testing -> Diagnostics And Printing -> Parser Boundaries

Getting Started

This chapter is the shortest path from real source or headers to something that PARC actually produces today: either a parsed AST or a SourcePackage.

Read parc as the source frontend of the toolchain:

  • parc owns preprocessing, parsing, extraction, and source diagnostics
  • linc owns link and binary evidence
  • gerc owns Rust lowering and emitted build output

The boundary rule is strict: parc/src/** must not depend on linc or gerc, and any cross-package translation belongs only in tests, examples, or external harnesses.

Add the crate

[dependencies]
parc = { path = "../parc" }

Pick the right API first

Use parc::driver when you have a file on disk and want PARC to run a system preprocessor first.

use parc::driver::{parse, Config};

fn main() -> Result<(), parc::driver::Error> {
    let config = Config::default();
    let parsed = parse(&config, "src/tests/files/minimal.c")?;

    println!("preprocessed bytes: {}", parsed.source.len());
    println!("top-level items: {}", parsed.unit.0.len());
    Ok(())
}

Use parc::parse when you already have source text in memory and want to parse a fragment directly.

use parc::driver::Flavor;
use parc::parse;

fn main() {
    let expr = parse::expression("a + b * 2", Flavor::StdC11).unwrap();
    println!("{:#?}", expr);
}

Choose a language flavor

PARC supports three parser modes:

Flavor      Meaning
StdC11      Strict C11
GnuC11      C11 plus GNU syntax such as typeof, attributes, statement expressions, and GNU asm
ClangC11    C11 plus Clang-oriented extensions such as availability attributes

For file-based parsing, Config::default() selects:

  • clang -E on macOS
  • gcc -E on other targets

You can also select explicitly:

#![allow(unused)]
fn main() {
use parc::driver::Config;

let gnu = Config::with_gcc();
let clang = Config::with_clang();
}

First useful parse example

This example parses a translation unit through the normal driver path:

use parc::driver::{parse, Config};

fn main() -> Result<(), parc::driver::Error> {
    let parsed = parse(&Config::default(), "src/tests/files/minimal.c")?;

    for (i, item) in parsed.unit.0.iter().enumerate() {
        println!("item #{i}: {:?}", item.node);
    }

    Ok(())
}

First useful scan example

If what you really want is source IR rather than a raw AST, start with parc::scan:

#![allow(unused)]
fn main() {
use parc::scan::{scan_headers, ScanConfig};

let config = ScanConfig::new().entry_header("demo.h");
let result = scan_headers(&config).unwrap();

println!("items: {}", result.package.items.len());
}

This is the closest thing PARC has to a “frontend product” API.

First fragment example

If you only need one declaration or statement, the direct parser API is faster to wire in:

use parc::driver::Flavor;
use parc::parse;

fn main() {
    let decl = parse::declaration("static const int answer = 42;", Flavor::StdC11).unwrap();
    let stmt = parse::statement("return answer;", Flavor::StdC11).unwrap();

    println!("{:#?}", decl);
    println!("{:#?}", stmt);
}

Architectural boundary

parc is the source frontend.

It owns:

  • preprocessing
  • parsing
  • source extraction
  • source diagnostics
  • the parc::ir::SourcePackage artifact

It does not own:

  • symbol inventory
  • binary validation
  • link planning
  • Rust code generation

In this repository, cross-package composition should not live in parc library code. linc and gerc should consume parc output only from tests, examples, or external harnesses.

Common Workflows

Most confusion with PARC comes from choosing the wrong entry point. This chapter maps common tasks to the right API.

Read the workflows in this order:

  1. prefer source/frontend workflows that stay inside parc
  2. serialize SourcePackage when another tool needs the result
  3. keep any cross-package translation in tests, examples, or external harnesses

Workflow selection

Situation                                                                   API
Turn headers into SourcePackage                                             scan::scan_headers
Parse a .c or .h file with includes and macros                              driver::parse
Parse already-preprocessed text from memory                                 driver::parse_preprocessed
Parse one expression, declaration, statement, or translation unit string    parse::*
Walk an AST you already parsed                                              visit
Print an AST for debugging                                                  print::Printer

Scan headers into source IR

Use this when your real target is the PARC source contract rather than the raw syntax tree.

#![allow(unused)]
fn main() {
use parc::scan::{scan_headers, ScanConfig};

let result = scan_headers(&ScanConfig::new().entry_header("demo.h")).unwrap();
println!("diagnostics: {}", result.package.diagnostics.len());
}

This is the best fit for downstream toolchains that want declarations, provenance, macros, and diagnostics in one package.

Parse a real file

Use this when your source depends on #include, #define, or compiler predefined macros.

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};

let config = Config::default();
let parsed = parse(&config, "src/main.c")?;
Ok::<(), parc::driver::Error>(())
}

This gives you:

  • parsed.source: the preprocessed source text
  • parsed.unit: the AST root

Parse preprocessed text

Use this when another tool already ran preprocessing and you only want PARC to parse.

#![allow(unused)]
fn main() {
use parc::driver::{parse_preprocessed, Config};

let config = Config::default();
let source = r#"
# 1 "generated.i"
typedef int count_t;
count_t answer(void) { return 42; }
"#
.to_string();

let parsed = parse_preprocessed(&config, source)?;
Ok::<(), parc::driver::SyntaxError>(())
}

This is useful for:

  • snapshot-based tests
  • integration with custom build systems
  • reproducing parse bugs from stored .i files

Parse a fragment

Use parc::parse when you are not dealing with a whole file.

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

let expr = parse::expression("ptr->len + 1", Flavor::GnuC11)?;
let decl = parse::declaration("unsigned long flags;", Flavor::StdC11)?;
let stmt = parse::statement("if (ok) return 1;", Flavor::StdC11)?;
Ok::<(), parc::parse::ParseError>(())
}

This is the right choice for:

  • unit tests
  • parser experiments
  • editor tooling for partial snippets

Build an analyzer

The normal analyzer flow is:

  1. Parse with driver or parse
  2. Traverse with visit
  3. Use span and loc for diagnostics

Example outline:

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};
use parc::visit::{self, Visit};
use parc::{ast, span};

struct FunctionCounter {
    count: usize,
}

impl<'ast> Visit<'ast> for FunctionCounter {
    fn visit_function_definition(
        &mut self,
        node: &'ast ast::FunctionDefinition,
        span: &'ast span::Span,
    ) {
        self.count += 1;
        visit::visit_function_definition(self, node, span);
    }
}

let parsed = parse(&Config::default(), "src/main.c")?;
let mut counter = FunctionCounter { count: 0 };
counter.visit_translation_unit(&parsed.unit);
println!("functions: {}", counter.count);
Ok::<(), parc::driver::Error>(())
}

Debug the parse tree

Use the printer when you need a human-readable structural dump:

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};
use parc::print::Printer;
use parc::visit::Visit;

let parsed = parse(&Config::default(), "src/main.c")?;

let mut out = String::new();
Printer::new(&mut out).visit_translation_unit(&parsed.unit);
println!("{}", out);
Ok::<(), parc::driver::Error>(())
}

Rule of thumb

  • If you want SourcePackage, start with scan.
  • If preprocessing matters and you still want the AST, start with driver.
  • If you already have plain text in memory, start with parse.
  • If you need diagnostics tied back to original files, keep the preprocessed source string.
  • If another crate needs PARC output, stop at SourcePackage and translate it outside parc/src/**.

Driver API

The driver module is the high-level API for file parsing. It runs a system preprocessor, then parses the resulting text into a TranslationUnit.

Main types

#![allow(unused)]
fn main() {
pub struct Config {
    pub cpp_command: String,
    pub cpp_options: Vec<String>,
    pub flavor: Flavor,
}

pub enum Flavor {
    StdC11,
    GnuC11,
    ClangC11,
}

pub struct Parse {
    pub source: String,
    pub unit: TranslationUnit,
}
}

The return value matters:

  • source is the preprocessed source PARC actually parsed
  • unit is the AST root

Basic file parsing

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};

let config = Config::default();
let parsed = parse(&config, "examples/demo.c")?;

println!("preprocessed bytes: {}", parsed.source.len());
println!("top-level nodes: {}", parsed.unit.0.len());
Ok::<(), parc::driver::Error>(())
}

Configuring the preprocessor

You can override both the preprocessor executable and its arguments.

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config, Flavor};

let config = Config {
    cpp_command: "gcc".into(),
    cpp_options: vec![
        "-E".into(),
        "-Iinclude".into(),
        "-DMODE=2".into(),
        "-nostdinc".into(),
    ],
    flavor: Flavor::GnuC11,
};

let parsed = parse(&config, "src/input.c")?;
Ok::<(), parc::driver::Error>(())
}

This is the place to inject:

  • include directories with -I...
  • macro definitions with -D...
  • stricter or more isolated builds with -nostdinc

GCC vs Clang helpers

The convenience constructors also select parser flavor:

#![allow(unused)]
fn main() {
use parc::driver::Config;

let gcc = Config::with_gcc();     // gcc -E, GNU flavor
let clang = Config::with_clang(); // clang -E, Clang flavor
}

Use these when you want the parser flavor to match the syntax accepted by the external preprocessor.

Parsing preprocessed text directly

If you already have .i-style content, skip parse and call parse_preprocessed.

#![allow(unused)]
fn main() {
use parc::driver::{parse_preprocessed, Config};

let source = r#"
# 1 "sample.i"
typedef int count_t;
count_t next(count_t x) { return x + 1; }
"#
.to_string();

let parsed = parse_preprocessed(&Config::default(), source)?;
println!("{}", parsed.unit.0.len());
Ok::<(), parc::driver::SyntaxError>(())
}

Error model

driver::parse returns:

#![allow(unused)]
fn main() {
Result<Parse, parc::driver::Error>
}

The error variants are:

  • PreprocessorError(io::Error) when the external preprocessor fails
  • SyntaxError(SyntaxError) when preprocessing succeeded but parsing failed

Working with syntax errors

SyntaxError includes:

  • source: the preprocessed source
  • line, column, offset: the parse failure position in that source
  • expected: a set of expected tokens

Example:

#![allow(unused)]
fn main() {
use parc::driver::{parse_preprocessed, Config};

let broken = "int main( { return 0; }".to_string();
match parse_preprocessed(&Config::default(), broken) {
    Ok(_) => {}
    Err(err) => {
        eprintln!("parse failed at {}:{}", err.line, err.column);
        eprintln!("expected: {:?}", err.expected);
    }
}
}

If the preprocessed source contains line markers, SyntaxError::get_location() can reconstruct the original file and include stack.
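The mapping idea behind that reconstruction can be sketched in plain Rust. This is an illustrative toy under stated assumptions, not the parc implementation: origin_of is a hypothetical helper that reads GNU-style linemarkers of the form # LINE "FILE" and ignores any trailing marker flags.

```rust
// Toy sketch: map a line index in preprocessed output back to the
// original file named by the most recent linemarker. Illustrative
// only; trailing marker flags are ignored.
fn origin_of(pp_lines: &[&str], target: usize) -> Option<(String, usize)> {
    let mut file: Option<String> = None;
    let mut base = 0; // preprocessed line where the marker took effect
    let mut orig = 1; // original line number named by the marker
    for (i, line) in pp_lines.iter().enumerate() {
        if i == target {
            return file.map(|f| (f, orig + (target - base)));
        }
        if let Some(rest) = line.strip_prefix("# ") {
            let mut parts = rest.splitn(2, ' ');
            orig = parts.next()?.parse().ok()?;
            let name = parts.next()?.trim().trim_start_matches('"');
            file = Some(name.split('"').next()?.to_string());
            base = i + 1;
        }
    }
    None
}

fn main() {
    let pp = ["# 1 \"demo.h\"", "typedef int count_t;", "count_t answer(void);"];
    // preprocessed line index 2 corresponds to line 2 of demo.h
    println!("{:?}", origin_of(&pp, 2));
}
```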

Built-in preprocessor

PARC includes a built-in C preprocessor that eliminates the need for an external gcc or clang binary. Use parse_builtin instead of parse:

#![allow(unused)]
fn main() {
use parc::driver::{parse_builtin, Config};
use std::path::Path;

let config = Config::with_gcc();
let include_paths = vec![Path::new("/usr/include")];
let parsed = parse_builtin(&config, "src/input.c", &include_paths)?;
Ok::<(), parc::driver::Error>(())
}

The built-in preprocessor supports:

  • Object-like and function-like macros (with #, ##, __VA_ARGS__)
  • Conditional compilation (#if, #ifdef, #ifndef, #elif, #else, #endif)
  • #include resolution with configurable search paths
  • Include guard detection and optimization
  • defined() operator in #if expressions
  • Full C constant expression evaluation (arithmetic, bitwise, logical, ternary)
  • Predefined target macros (architecture, OS, GCC compatibility)

Macro extraction

To extract all #define macros from a C file (equivalent to gcc -dD -E):

#![allow(unused)]
fn main() {
use parc::driver::capture_macros;
use std::path::Path;

let macros = capture_macros("src/input.c", &[Path::new("/usr/include")])?;
for (name, value) in &macros {
    println!("#define {} {}", name, value);
}
Ok::<(), parc::driver::Error>(())
}

This returns all macros active after preprocessing, including predefined target macros and macros from included headers.

Practical advice

  • Keep parsed.source if you plan to report errors later.
  • Use parse_preprocessed for deterministic regression tests.
  • Prefer explicit cpp_options in tools and CI so parse behavior stays reproducible.
  • Use parse_builtin when you need zero-dependency parsing without a C toolchain.

Built-in Preprocessor

PARC includes a complete built-in C preprocessor in the parc::preprocess module. This eliminates the runtime dependency on gcc or clang for preprocessing.

Architecture

The preprocessor is split into focused modules:

Module        Purpose
token         Token types (Ident, Number, Punct, etc.)
lexer         Preprocessor tokenizer (§6.4 preprocessing tokens)
directive     Directive parser (#define, #if, #include, etc.)
macros        Macro table, object-like and function-like expansion
expr          #if constant expression evaluator
processor     Conditional compilation engine
include       #include resolution with search paths and guard tracking
predefined    Target-specific predefined macros

Quick start

#![allow(unused)]
fn main() {
use parc::preprocess::preprocess;

let output = preprocess("#define X 42\nint a = X;\n");
// output.tokens contains the expanded token stream
}

Macro expansion

Both object-like and function-like macros are supported:

#define SIZE 1024
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define LOG(fmt, ...) printf(fmt, __VA_ARGS__)

Features:

  • # stringification operator
  • ## token pasting operator
  • __VA_ARGS__ for variadic macros
  • Recursive expansion with “paint set” to prevent infinite recursion (C standard §6.10.3.4)
  • Self-referential macros handled correctly (#define X X + 1 expands to X + 1)
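The paint-set rule can be sketched in a few lines of std-only Rust. This is a minimal illustration of the idea, not parc's macro engine; expand and its flat token-list macros are simplifying assumptions (no arguments, no stringification).

```rust
use std::collections::{HashMap, HashSet};

// Sketch of "paint set" expansion: a macro name currently being
// expanded is painted and not expanded again, so `#define X X + 1`
// yields `X + 1` instead of looping forever.
fn expand(
    token: &str,
    macros: &HashMap<&str, Vec<&str>>,
    painted: &mut HashSet<String>,
) -> Vec<String> {
    if painted.contains(token) {
        return vec![token.to_string()]; // painted: emit verbatim
    }
    match macros.get(token) {
        None => vec![token.to_string()], // not a macro
        Some(body) => {
            painted.insert(token.to_string());
            let out = body.iter().flat_map(|t| expand(t, macros, painted)).collect();
            painted.remove(token);
            out
        }
    }
}

fn main() {
    let mut macros = HashMap::new();
    macros.insert("X", vec!["X", "+", "1"]); // #define X X + 1
    let result = expand("X", &macros, &mut HashSet::new());
    println!("{}", result.join(" "));
}
```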

Conditional compilation

All standard conditional directives are supported:

#if CONDITION
#ifdef NAME
#ifndef NAME
#elif CONDITION
#else
#endif

The #if expression evaluator supports:

  • Integer literals (decimal, octal, hex, binary)
  • Character constants ('x')
  • defined(NAME) and defined NAME
  • All C operators: arithmetic, bitwise, logical, comparison, ternary
  • Undefined identifiers evaluate to 0 (per C standard §6.10.1p4)
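The two identifier rules can be shown in isolation. This is a hypothetical helper, not the parc evaluator: substitute handles only the parenthesized defined(NAME) form and rewrites tokens before any arithmetic would run.

```rust
use std::collections::HashSet;

// Sketch of the identifier rules in #if evaluation: `defined(NAME)`
// becomes 1 or 0 from the macro table, and any identifier still left
// afterwards becomes 0.
fn substitute(tokens: &[&str], table: &HashSet<&str>) -> Vec<String> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        let is_ident = tokens[i]
            .chars()
            .next()
            .map_or(false, |c| c.is_ascii_alphabetic() || c == '_');
        if tokens[i] == "defined"
            && i + 3 < tokens.len()
            && tokens[i + 1] == "("
            && tokens[i + 3] == ")"
        {
            // defined ( NAME )
            out.push(if table.contains(tokens[i + 2]) { "1" } else { "0" }.to_string());
            i += 4;
        } else if is_ident {
            out.push("0".to_string()); // undefined identifier evaluates to 0
            i += 1;
        } else {
            out.push(tokens[i].to_string());
            i += 1;
        }
    }
    out
}

fn main() {
    let table: HashSet<&str> = ["NDEBUG"].into_iter().collect();
    let tokens = ["defined", "(", "NDEBUG", ")", "&&", "MYSTERY"];
    println!("{}", substitute(&tokens, &table).join(" "));
}
```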

Include resolution

#![allow(unused)]
fn main() {
use parc::preprocess::{IncludeResolver, Processor};

let mut resolver = IncludeResolver::new();
resolver.add_system_path("/usr/include");
resolver.add_local_path("./include");

let mut processor = Processor::new();
let result = resolver.preprocess_file(
    std::path::Path::new("src/main.c"),
    &mut processor,
);
}

Features:

  • "local" includes search relative to the including file, then local paths
  • <system> includes search system paths only
  • Include guard detection (#ifndef X / #define X / ... / #endif)
  • File content caching
  • Maximum include depth (200) to prevent infinite recursion
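Guard detection amounts to recognizing one syntactic shape. The helper below is hypothetical and deliberately simplified (it ignores nested conditionals inside the body); it only illustrates the pattern the resolver looks for.

```rust
// Hypothetical sketch, not parc's code: recognize the classic
// `#ifndef X / #define X / ... / #endif` include-guard shape so a
// file can be skipped on re-inclusion.
fn guard_macro(lines: &[&str]) -> Option<String> {
    let mut meaningful = lines.iter().map(|l| l.trim()).filter(|l| !l.is_empty());
    let name = meaningful.next()?.strip_prefix("#ifndef")?.trim().to_string();
    if meaningful.next()?.strip_prefix("#define")?.trim() != name {
        return None; // the #define must name the same macro
    }
    // the guard must close at the final non-empty line
    if meaningful.last()? != "#endif" {
        return None;
    }
    Some(name)
}

fn main() {
    let header = ["#ifndef DEMO_H", "#define DEMO_H", "int demo(void);", "#endif"];
    println!("{:?}", guard_macro(&header));
}
```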

Predefined macros

Target-specific macros are available for common platforms:

#![allow(unused)]
fn main() {
use parc::preprocess::{MacroTable, Target, define_target_macros};

let mut table = MacroTable::new();
define_target_macros(&mut table, &Target::host());
// Now table has __STDC__, __linux__, __x86_64__, __GNUC__, etc.
}

Supported targets:

  • Architectures: x86_64, aarch64, x86, arm
  • Operating systems: Linux, macOS (Darwin), Windows

Standard macros defined:

  • __STDC__, __STDC_VERSION__, __STDC_HOSTED__
  • Architecture-specific: __x86_64__, __aarch64__, __i386__, __arm__
  • OS-specific: __linux__, __APPLE__, _WIN32, etc.
  • GCC compatibility: __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__
  • Type sizes: __SIZEOF_POINTER__, __SIZEOF_INT__, etc.
  • Limits: __CHAR_BIT__, __INT_MAX__, __LONG_MAX__

Source IR

The parc::ir module defines the durable intermediate representation produced by the PARC frontend. It is the primary contract between the parser/extractor and downstream consumers (LINC, GERC).

Design Principles

  • Smaller than the AST: only normalized declarations, not the full syntax tree
  • Serializable: all types derive serde::Serialize and serde::Deserialize
  • Parser-agnostic: downstream consumers should depend on parc::ir, not parc::ast
  • No link/binary concerns: no ABI probing, no library paths, no symbol validation

Key Types

SourcePackage

The top-level container:

#![allow(unused)]
fn main() {
use parc::ir::SourcePackage;

let pkg = SourcePackage::new();
assert!(pkg.is_empty());
}

SourceType

Represents C types at source level:

Void, Bool, Char, SChar, UChar, Short, UShort,
Int, UInt, Long, ULong, LongLong, ULongLong,
Float, Double, LongDouble, Int128, UInt128,
Pointer, Array, Qualified, FunctionPointer,
TypedefRef, RecordRef, EnumRef, Opaque

SourceItem

One extracted declaration:

  • Function — function declaration with name, parameters, return type, calling convention
  • Record — struct/union with optional fields
  • Enum — enum with named variants and optional values
  • TypeAlias — typedef declaration
  • Variable — extern variable declaration
  • Unsupported — placeholder for unrepresentable declarations

SourceMacro

Captured preprocessor macro with form (object-like/function-like), kind, and optional parsed value.

SourceDiagnostic

Frontend diagnostic with kind, severity, message, optional location, and optional item name.

Provenance

  • SourceOrigin — where a declaration came from (Entry, UserInclude, System, Unknown)
  • DeclarationProvenance — per-item provenance metadata
  • MacroProvenance — per-macro provenance metadata
  • SourceTarget — compiler/target identity
  • SourceInputs — entry headers, include dirs, defines

JSON Serialization

All IR types support JSON roundtrip:

#![allow(unused)]
fn main() {
use parc::ir::SourcePackage;

let pkg = SourcePackage::new();
let json = serde_json::to_string_pretty(&pkg).unwrap();
let back: SourcePackage = serde_json::from_str(&json).unwrap();
assert_eq!(pkg, back);
}

Querying

SourcePackage provides typed accessors:

#![allow(unused)]
fn main() {
// pkg.functions()      -> Iterator<Item = &SourceFunction>
// pkg.records()        -> Iterator<Item = &SourceRecord>
// pkg.enums()          -> Iterator<Item = &SourceEnum>
// pkg.type_aliases()   -> Iterator<Item = &SourceTypeAlias>
// pkg.variables()      -> Iterator<Item = &SourceVariable>
// pkg.unsupported_items() -> ...
// pkg.find_function("malloc")
// pkg.find_record("point")
// pkg.find_enum("color")
// pkg.find_type_alias("size_t")
// pkg.find_variable("errno")
}

Extraction

The parc::extract module converts a parsed C AST into the normalized SourcePackage IR. It handles all declaration families.

Quick Start

#![allow(unused)]
fn main() {
use parc::extract;

let source = r#"
    typedef unsigned long size_t;
    void *malloc(size_t size);
    struct point { int x; int y; };
"#;

let pkg = extract::extract_from_source(source).unwrap();
assert_eq!(pkg.function_count(), 1);
assert_eq!(pkg.record_count(), 1);
assert_eq!(pkg.type_alias_count(), 1);
}

API Functions

extract_from_source

Parse and extract in one step using GNU C11 flavor:

#![allow(unused)]
fn main() {
let pkg = parc::extract::extract_from_source("int foo(void);").unwrap();
}

parse_and_extract

Parse and extract with a specific flavor:

#![allow(unused)]
fn main() {
let pkg = parc::extract::parse_and_extract(
    "int foo(void);",
    parc::driver::Flavor::StdC11,
).unwrap();
}

extract_from_translation_unit

Extract from an already-parsed AST:

#![allow(unused)]
fn main() {
let unit = parc::parse::translation_unit("int foo(void);", parc::driver::Flavor::StdC11).unwrap();
let pkg = parc::extract::extract_from_translation_unit(&unit, Some("test.h".into()));
}

parse_and_extract_resilient

Parse with error recovery and extract what’s possible:

#![allow(unused)]
fn main() {
let pkg = parc::extract::parse_and_extract_resilient(
    "int valid;\n@@@bad@@@;\nint also_valid;",
    parc::driver::Flavor::StdC11,
);
}

extract_file

Read a file from disk and extract:

#![allow(unused)]
fn main() {
let pkg = parc::extract::extract_file("path/to/header.h", parc::driver::Flavor::GnuC11).unwrap();
assert!(pkg.source_path.is_some());
}

What Gets Extracted

C Declaration            Source Item
typedef int T;           SourceTypeAlias
int foo(void);           SourceFunction
int foo(void) { ... }    SourceFunction (body ignored)
struct S { int x; };     SourceRecord
struct S;                SourceRecord (opaque)
union U { ... };         SourceRecord (Union kind)
enum E { A, B };         SourceEnum
extern int x;            SourceVariable
static int f() {}        Diagnostic (not bindable)
_Static_assert(...)      Diagnostic

Diagnostics

The extractor produces diagnostics for constructs it cannot fully represent:

  • Bitfield widths (partial representation)
  • Inline/noreturn specifiers (ignored)
  • Calling convention attributes (captured on function, other attributes warned)
  • K&R function declarations (unsupported)
  • Block pointers (unsupported)
  • Static functions (not bindable)

Header Scanning

parc::scan is the highest-level PARC API for people who want the source contract, not just the AST. It preprocesses headers, parses them, extracts items, and returns a SourcePackage plus the preprocessed source text.

Quick Start

#![allow(unused)]
fn main() {
use parc::scan::{ScanConfig, scan_headers};

let config = ScanConfig::new()
    .entry_header("api.h")
    .include_dir("/usr/include")
    .define_flag("NDEBUG")
    .with_builtin_preprocessor();

let result = scan_headers(&config).unwrap();
let pkg = result.package;
}

What scan really owns

The scan path currently owns all of these steps:

  1. choose builtin or external preprocessing
  2. build the preprocessing environment
  3. parse the preprocessed translation unit
  4. extract declarations into parc::ir
  5. attach input metadata and diagnostics
  6. optionally resolve typedef chains in the produced package

That makes it the closest thing PARC has to a “source artifact producer”.

ScanConfig

Builder for scan configuration:

Method                         Description
entry_header(path)             Add an entry-point header
include_dir(path)              Add a preprocessor include search path
define(name, value)            Add a preprocessor define with value
define_flag(name)              Add a flag-style define (no value)
with_compiler(cmd)             Set the external preprocessor command
with_flavor(flavor)            Set the parser flavor
with_builtin_preprocessor()    Use the built-in preprocessor

Preprocessing Modes

External (default)

Uses gcc -E or clang -E to preprocess headers. Requires the compiler to be installed. Supports all system headers.

Built-in

Uses parc::preprocess directly. This is useful for controlled fixtures and repo-local tests. It is not a promise that the built-in preprocessor already matches every hostile system-header stack.

ScanResult

The scan produces:

  • package: SourcePackage — the extracted declarations and metadata
  • preprocessed_source: String — the preprocessed source text

Intake

For already-preprocessed source (e.g., output of gcc -E), use parc::intake::PreprocessedInput:

#![allow(unused)]
fn main() {
use parc::intake::PreprocessedInput;

let input = PreprocessedInput::from_string("int foo(void);")
    .with_path("output.i")
    .with_flavor(parc::driver::Flavor::GnuC11);

let pkg = input.extract();
}

What to expect from failures

scan_headers() can fail early on preprocessing setup problems, and it can also return a package with parse diagnostics if preprocessing succeeded but the source could not be fully parsed.

That split is intentional:

  • operational setup failures are Err(...)
  • source-level failures become package.diagnostics when possible

Parser API

The parse module exposes direct parsing functions that work on in-memory strings. Unlike driver, it does not invoke an external preprocessor.

Available entry points

#![allow(unused)]
fn main() {
parse::constant(source, flavor)
parse::expression(source, flavor)
parse::declaration(source, flavor)
parse::statement(source, flavor)
parse::translation_unit(source, flavor)
}

These map to progressively larger grammar fragments.

Return types

The direct parser returns the same ParseResult<T> shape for every entry point:

#![allow(unused)]
fn main() {
type ParseResult<T> = Result<T, ParseError>;
}

ParseError contains:

  • line
  • column
  • offset
  • expected

That makes it well suited for parser tests and editor integrations.

Parse an expression

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

let expr = parse::expression("value + 1 * scale", Flavor::StdC11)?;
println!("{:#?}", expr);
Ok::<(), parc::parse::ParseError>(())
}

The return type is Box<Node<Expression>>, so you get both the expression and its span.

Parse a declaration

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

let decl = parse::declaration(
    "static const unsigned long mask = 0xff;",
    Flavor::StdC11,
)?;

println!("{:#?}", decl.node);
Ok::<(), parc::parse::ParseError>(())
}

Declarations are useful when you want to inspect:

  • storage class
  • type qualifiers
  • declarator structure
  • initializers

Parse a statement

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

let stmt = parse::statement(
    "for (int i = 0; i < 4; i++) total += i;",
    Flavor::StdC11,
)?;

println!("{:#?}", stmt.node);
Ok::<(), parc::parse::ParseError>(())
}

Parse a whole translation unit

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

let source = r#"
typedef int count_t;
count_t inc(count_t x) { return x + 1; }
"#;

let unit = parse::translation_unit(source, Flavor::StdC11)?;
println!("items: {}", unit.0.len());
Ok::<(), parc::parse::ParseError>(())
}

Flavor-sensitive parsing

GNU or Clang syntax only parses when you select a compatible flavor.

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

let gnu_expr = "({ int x = 1; x + 2; })";
assert!(parse::expression(gnu_expr, Flavor::GnuC11).is_ok());
assert!(parse::expression(gnu_expr, Flavor::StdC11).is_err());
}

When to prefer parse

Use parse when:

  • you already have a string in memory
  • you are testing grammar behavior directly
  • you are parsing snippets, not full files
  • you want a deterministic input without shelling out to gcc or clang

Use driver instead when preprocessing is part of the problem.

AST Model

The ast module contains the syntax tree PARC produces after parsing. Most types track the C11 grammar closely, with additional variants for supported GNU and Clang extensions.

Core wrapper types

Many parsed values are wrapped in Node<T>:

#![allow(unused)]
fn main() {
pub struct Node<T> {
    pub node: T,
    pub span: Span,
}
}

That means most interesting values come with a byte range in the parsed source.
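Since spans are byte ranges, slicing the parsed source with a node's span recovers the exact original text. The sketch below uses stand-in Span and Node types with a hand-picked range; it only illustrates the relationship, not parc's real definitions.

```rust
// Stand-in types mirroring the Node<T> shape above (illustrative only).
struct Span { start: usize, end: usize }
struct Node<T> { node: T, span: Span }

fn main() {
    let source = "int answer = 42;";
    // Hypothetical span for the identifier `answer` (bytes 4..10).
    let ident = Node { node: "answer".to_string(), span: Span { start: 4, end: 10 } };
    // The span slices back to the original text of the node.
    assert_eq!(&source[ident.span.start..ident.span.end], "answer");
    println!("{} @ {}..{}", ident.node, ident.span.start, ident.span.end);
}
```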

Top-level structure

The root is:

#![allow(unused)]
fn main() {
pub struct TranslationUnit(pub Vec<Node<ExternalDeclaration>>);
}

Top-level items are:

#![allow(unused)]
fn main() {
pub enum ExternalDeclaration {
    Declaration(Node<Declaration>),
    StaticAssert(Node<StaticAssert>),
    FunctionDefinition(Node<FunctionDefinition>),
}
}

So a translation unit is a flat list of:

  • declarations
  • static assertions
  • function definitions

Declarations

Declarations are split into specifiers and declarators:

#![allow(unused)]
fn main() {
pub struct Declaration {
    pub specifiers: Vec<Node<DeclarationSpecifier>>,
    pub declarators: Vec<Node<InitDeclarator>>,
}
}

This mirrors C’s real syntax. For example:

static const unsigned long value = 42;

roughly becomes:

  • storage class specifier: Static
  • type qualifier: Const
  • type specifiers: Unsigned, Long
  • one declarator with identifier value
  • initializer expression 42

Declarators are the hard part

Declarator separates:

  • the name-bearing core (DeclaratorKind)
  • derived layers such as pointers, arrays, and functions
  • extension nodes

That design lets PARC represent C declarators without flattening away their structure.

Examples:

  • int *p;
  • int values[16];
  • int (*handler)(int);
  • void f(int x, int y);
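The layered idea can be pictured with a toy model. These types are illustrative only, not the parc ast definitions: a declarator keeps a name-bearing core plus ordered derived layers, so reading the layers outward describes the type of int (*handler)(int).

```rust
// Toy model of layered declarators (not parc's real types).
#[allow(dead_code)]
enum Derived {
    Pointer,
    Array(usize),
    Function(usize), // parameter count, just for this sketch
}

struct Declarator {
    name: &'static str,
    layers: Vec<Derived>,
}

// Read the derived layers in order to describe the declared type.
fn describe(d: &Declarator, base: &str) -> String {
    let mut s = format!("{}: ", d.name);
    for layer in &d.layers {
        s += match layer {
            Derived::Pointer => "pointer to ",
            Derived::Array(_) => "array of ",
            Derived::Function(_) => "function returning ",
        };
    }
    s + base
}

fn main() {
    // int (*handler)(int);
    let handler = Declarator {
        name: "handler",
        layers: vec![Derived::Pointer, Derived::Function(1)],
    };
    println!("{}", describe(&handler, "int"));
}
```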

Expressions

Expression is a large enum covering C expression syntax:

  • identifiers
  • constants
  • string literals
  • member access
  • calls
  • casts
  • unary operators
  • binary operators
  • conditional expressions
  • comma expressions
  • sizeof, _Alignof
  • GNU statement expressions
  • offsetof and va_arg expansions

Examples:

x
42
ptr->field
f(a, b)
(int) value
a + b * c
cond ? left : right
({ int t = 1; t + 2; })

Statements

Statement covers:

  • labeled statements
  • compound blocks
  • expression statements
  • if
  • switch
  • while
  • do while
  • for
  • goto
  • continue
  • break
  • return
  • GNU asm statements

Blocks contain BlockItem, which can be:

  • a declaration
  • a static assertion
  • another statement

That means a compound statement preserves the declaration/statement distinction instead of erasing everything into one generic node list.

Types and declarator support

Important declaration-side types include:

  • TypeSpecifier
  • TypeQualifier
  • StorageClassSpecifier
  • FunctionSpecifier
  • AlignmentSpecifier
  • TypeName
  • DerivedDeclarator
  • ParameterDeclaration
  • Initializer
  • Designator

This is enough to model:

  • pointer chains
  • arrays and VLA-like forms
  • function parameter lists
  • designated initializers
  • anonymous and named structs/unions/enums
  • typedef names
  • typeof

Extension nodes

PARC includes explicit AST nodes for extensions instead of hiding them:

  • Extension::Attribute
  • Extension::AsmLabel
  • Extension::AvailabilityAttribute
  • TypeSpecifier::TypeOf
  • Statement::Asm
  • Expression::Statement

That makes it practical to write tools that either support or reject extension syntax intentionally.

Reading the AST effectively

When working with PARC, a useful order is:

  1. Start at TranslationUnit
  2. Split declarations from function definitions
  3. Inspect declarators carefully for type shape
  4. Use the visitor API instead of hand-recursing everywhere
  5. Use Printer to learn unfamiliar subtrees

Visitor Pattern

The visit module provides recursive AST traversal. It exposes:

  • a Visit<'ast> trait with hook methods
  • free functions like visit_expression and visit_function_definition that recurse into children

The important rule

When you override a method, continue the walk by calling the free function from parc::visit, not the trait method on self. Calling self.visit_* from inside the override dispatches straight back into your override and recurses endlessly without ever visiting the children.

Count function definitions

#![allow(unused)]
fn main() {
use parc::{ast, span, visit};
use parc::visit::Visit;

struct FunctionCounter {
    count: usize,
}

impl<'ast> Visit<'ast> for FunctionCounter {
    fn visit_function_definition(
        &mut self,
        node: &'ast ast::FunctionDefinition,
        span: &'ast span::Span,
    ) {
        self.count += 1;
        visit::visit_function_definition(self, node, span);
    }
}
}

Collect identifiers from expressions

#![allow(unused)]
fn main() {
use parc::{ast, span, visit};
use parc::visit::Visit;

struct IdentifierCollector {
    names: Vec<String>,
}

impl<'ast> Visit<'ast> for IdentifierCollector {
    fn visit_identifier(&mut self, node: &'ast ast::Identifier, span: &'ast span::Span) {
        self.names.push(node.name.clone());
        visit::visit_identifier(self, node, span);
    }
}
}

Use the visitor

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};
use parc::visit::Visit;

let parsed = parse(&Config::default(), "examples/sample.c")?;

let mut counter = FunctionCounter { count: 0 };
counter.visit_translation_unit(&parsed.unit);

println!("functions: {}", counter.count);
Ok::<(), parc::driver::Error>(())
}

When to override which method

  • Override visit_translation_unit for whole-file summaries
  • Override visit_function_definition for function-level analysis
  • Override visit_declaration for declaration inspection
  • Override visit_expression for expression-wide checks
  • Override narrow hooks like visit_call_expression when you only care about one form

Traversal style

Two common styles work well:

Pre-order

Do work before recursing:

#![allow(unused)]
fn main() {
fn visit_expression(&mut self, node: &'ast ast::Expression, span: &'ast span::Span) {
    self.seen += 1;
    visit::visit_expression(self, node, span);
}
}

Selective traversal

Only recurse when the node passes a filter:

#![allow(unused)]
fn main() {
fn visit_statement(&mut self, node: &'ast ast::Statement, span: &'ast span::Span) {
    if matches!(node, ast::Statement::Return(_)) {
        self.returns += 1;
    }
    visit::visit_statement(self, node, span);
}
}

Practical advice

  • Start with a broad hook like visit_expression while learning the tree.
  • Narrow to specific hooks once you understand the shapes you care about.
  • Pair the visitor with Printer when a subtree is unclear.

Location Tracking

PARC tracks source positions in two related ways:

  • Span stores byte offsets into the parsed input
  • loc maps byte offsets in preprocessed source back to original files and lines

Span

Span is a byte range:

#![allow(unused)]
fn main() {
pub struct Span {
    pub start: usize,
    pub end: usize,
}
}

Most AST values are wrapped in Node<T>, which adds a span field:

#![allow(unused)]
fn main() {
pub struct Node<T> {
    pub node: T,
    pub span: Span,
}
}

What spans point to

This depends on the API you used:

  • with parse::*, spans refer to the string you passed in
  • with driver::parse_preprocessed, spans refer to the preprocessed string you passed in
  • with driver::parse, spans refer to the preprocessor output stored in Parse::source

That last case is important: spans do not directly point into the original .c file when preprocessing has inserted line markers or expanded includes.

Mapping offsets back to files

The loc module reads preprocessor line markers like:

# 42 "include/header.h" 1

From those markers, get_location_for_offset reconstructs:

  • the active file
  • the active line number
  • the include stack

Basic example

#![allow(unused)]
fn main() {
use parc::loc::get_location_for_offset;

let src = "# 1 \"main.c\"\nint value;\n";
let (loc, includes) = get_location_for_offset(src, 18);

assert_eq!(loc.file, "main.c");
assert!(includes.is_empty());
}

Using spans with locations

The common pattern is:

  1. take a node span
  2. use span.start or span.end
  3. map that offset through loc

Example:

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};
use parc::loc::get_location_for_offset;

let parsed = parse(&Config::default(), "examples/sample.c")?;

if let Some(first) = parsed.unit.0.first() {
    let (loc, include_stack) = get_location_for_offset(&parsed.source, first.span.start);
    println!("first item starts in {}:{}", loc.file, loc.line);
    println!("include depth: {}", include_stack.len());
}
Ok::<(), parc::driver::Error>(())
}

SyntaxError::get_location

For parser failures in the driver path, driver::SyntaxError already exposes:

#![allow(unused)]
fn main() {
err.get_location()
}

That returns:

  • the active source location
  • the include chain that led there

This is the best starting point for user-facing diagnostics.

Caveat: byte offsets, not UTF-16 columns

PARC stores Rust byte offsets. That is usually what you want for source processing, but if you are feeding results into another tool that expects a different coordinate system, convert explicitly.
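
For example, an LSP client expects a line number plus a UTF-16 column, not a byte offset. A small helper can do that conversion; the function below is an illustration and is not part of parc.

```rust
// Hypothetical helper, not part of parc: map a byte offset in `src` to a
// 1-based line number and a 0-based UTF-16 column. The offset must fall on
// a UTF-8 character boundary (span endpoints from a parser always do).
fn offset_to_lsp_position(src: &str, offset: usize) -> (usize, usize) {
    let before = &src[..offset];
    // Count newlines before the offset to get the 1-based line.
    let line = before.bytes().filter(|&b| b == b'\n').count() + 1;
    // Find where the current line starts, then measure it in UTF-16 units.
    let line_start = before.rfind('\n').map(|i| i + 1).unwrap_or(0);
    let column = src[line_start..offset].encode_utf16().count();
    (line, column)
}

fn main() {
    // 'é' is two bytes in UTF-8 but one UTF-16 code unit.
    let src = "int é;\nint x;";
    assert_eq!(offset_to_lsp_position(src, 6), (1, 5)); // the ';' after 'é'
    assert_eq!(offset_to_lsp_position(src, 8), (2, 0)); // start of line 2
    println!("ok");
}
```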

Testing

parc is the source-meaning crate in the toolchain, so its tests should prove three things:

  • the frontend accepts or rejects source as intended
  • the extracted SourcePackage contract carries the intended meaning
  • cross-package composition can start from parc artifacts without relying on parc internals

PARC has two broad testing layers:

  • direct parser/API tests in src/tests
  • corpus-style fixtures under test/reftests/ and, when present, test/full_apps/

There are also explicit grouped failure suites:

  • failure_matrix_preprocess for scan/preprocessor hard failures and conservative scan outcomes
  • failure_matrix_source for source-parse hard failures, resilient recovery, and diagnostic-preserving extraction

Basic commands

The repository Makefile wraps the normal Cargo flow:

make build
make test

Those run:

  • cargo build --release
  • cargo test

Hermeticity split

The large PARC surfaces should be read in three groups:

  • always-on hermetic baselines
  • host-dependent but high-value ladders
  • hostile or conservative-failure surfaces

The hermetic baselines should remain the default confidence floor. The host-dependent ladders should strengthen confidence when available. The failure surfaces should prove that PARC stays diagnostic and deterministic when it cannot fully model a header family yet.

Contract tests

Contract tests are the tests a downstream toolchain should treat as the main statement of support:

  • parse_api tests for direct parser entry points
  • extraction tests for declaration/source modeling
  • scan tests for preprocessing and multi-file source intake
  • consumability tests for the SourcePackage artifact

If one of those changes meaningfully, the corresponding book chapter should change in the same patch.

Parse API tests

src/tests/parse_api.rs checks the public parse entry points directly.

Examples covered in the repository include:

  • constants
  • expressions
  • declarations
  • statements
  • translation units

This layer is useful when:

  • adding a new public parser entry point
  • fixing a small grammar regression
  • documenting a minimal parsing example

Reference tests

The reftest harness in src/tests/reftests.rs reads files from test/reftests/. Each case stores:

  • the source snippet
  • optional #pragma directives that affect parsing
  • an expected AST printout between /*=== and ===*/

That means reftests verify both:

  • whether parsing succeeds
  • whether the produced tree matches the expected printer output

Reftest update workflow

The harness supports TEST_UPDATE=1 to rewrite expected outputs when printer changes are intentional.

TEST_UPDATE=1 cargo test reftests

Use that carefully. It is appropriate after deliberate AST or printer changes, not as a substitute for reviewing diffs.

Full-app fixtures

The repository includes a full-app harness in src/tests/full_apps.rs. It supports fixture directories with a fixture.toml manifest describing:

  • mode
  • flavor
  • entry
  • expected
  • include_dirs
  • allow_system_includes
  • tags

Supported modes are:

  • translation_unit
  • driver
  • preprocessed

This is the right layer for:

  • multi-file examples
  • include-path behavior
  • external fixture snapshots
  • deterministic .i inputs
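
A manifest using those fields might look like the following sketch. The field names come from the list above; the values are invented for illustration and do not come from a real fixture.

```toml
# Hypothetical fixture.toml: field names from the manifest description,
# values invented for illustration.
mode = "driver"
flavor = "gnu"
entry = "src/main.c"
expected = "expected.txt"
include_dirs = ["include"]
allow_system_includes = false
tags = ["synthetic"]
```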

Filtering larger fixture runs

The full-app runner supports environment filters:

FULL_APP_FILTER=musl/stdint make test
FULL_APP_TAG=synthetic make test

These are useful when debugging one fixture family instead of running the whole corpus.

Current workspace note

The test harness and README describe test/full_apps, but that directory is not present in this workspace snapshot. The book documents the supported format because the code and README define it.

Extraction tests

src/tests/extraction_fixtures.rs contains fixture-based tests for the extraction pipeline: typical C patterns (stdio-style, nested structs, typedef chains, function pointers, etc.).

src/extract/mod.rs also contains unit tests for each declaration family.

Hostile header tests

src/tests/hostile_headers.rs covers edge-case and historically problematic C declarations: deep pointer nesting, anonymous structs/enums, specifier ordering variations, bitfield-only structs, extreme enum values, forward-then-define patterns, etc.

Recovery tests

src/tests/recovery.rs tests graceful handling of broken, incomplete, or unusual input, using both strict parsing (an error is expected) and resilient parsing (recovery is expected).

Contract tests

src/tests/contract.rs and src/tests/consumability.rs verify that the SourcePackage contract is sufficient for downstream consumers. These tests cover iteration patterns, type navigation, serialization, filtering, merging, and programmatic construction.

Differential tests

src/tests/differential.rs documents the known differences between parc extraction and bic extraction, ensuring behavioral equivalence on standard declarations and explicitly documenting intentional divergences (pointer model, no ABI fields, typedef chain preservation).

Multi-file scan tests

src/tests/scan_multifile.rs covers multi-header scanning scenarios: include chains, multiple entry headers, cross-file struct references, conditional compilation, include guards, include directory resolution, and metadata population.

Adding new tests

A practical progression is:

  1. Add a parse_api unit test for the exact regression
  2. Add a reftest if you need a stable printed-tree expectation
  3. Add an extraction test if the issue is about declaration modeling
  4. Add a scan test if preprocessing or multi-file behavior matters
  5. Add a full-app fixture if the case needs a full filesystem layout

Cross-crate integration proof

parc library tests should not import linc or gerc.

Cross-crate proof belongs in:

  • linc tests/examples that ingest serialized or translated parc artifacts
  • gerc tests/examples that ingest translated source artifacts
  • external harnesses that exercise the full toolchain

That keeps parc’s own test suite focused on source meaning while still proving the larger pipeline elsewhere.

What “supported” means

For parc, support means:

  • the syntax path is covered by parser-facing tests
  • the extracted source meaning is covered by SourcePackage-level tests
  • the relevant limitations are documented honestly when behavior is partial or conservative

It does not mean:

  • every downstream consumer will accept the artifact unchanged
  • every hostile system header already has perfect preprocessing coverage
  • every parser-internal helper is part of the public contract

Diagnostics And Printing

PARC includes two pieces that are especially useful when building tools on top of the parser:

  • detailed parse errors
  • a tree printer for AST inspection

Direct parser diagnostics

The parse module returns ParseError:

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

match parse::expression("a +", Flavor::StdC11) {
    Ok(_) => {}
    Err(err) => {
        eprintln!("line: {}", err.line);
        eprintln!("column: {}", err.column);
        eprintln!("offset: {}", err.offset);
        eprintln!("expected: {:?}", err.expected);
    }
}
}

This is enough for:

  • editor error messages
  • parser regression tests
  • grammar debugging

Driver diagnostics

The driver adds preprocessor context on top:

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config, Error};

match parse(&Config::default(), "broken.c") {
    Ok(_) => {}
    Err(Error::PreprocessorError(err)) => {
        eprintln!("preprocessor failed: {}", err);
    }
    Err(Error::SyntaxError(err)) => {
        let (loc, includes) = err.get_location();
        eprintln!("syntax error in {}:{}:", loc.file, loc.line);
        eprintln!("column in preprocessed source: {}", err.column);
        for include in includes {
            eprintln!("included from {}:{}", include.file, include.line);
        }
    }
}
}

Formatting expected tokens

driver::SyntaxError also has format_expected, which is useful when building a custom human-readable error message.

AST printing

print::Printer is a visitor that renders the tree as an indented text dump.

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};
use parc::print::Printer;
use parc::visit::Visit;

let parsed = parse(&Config::default(), "examples/sample.c")?;

let mut out = String::new();
Printer::new(&mut out).visit_translation_unit(&parsed.unit);
println!("{}", out);
Ok::<(), parc::driver::Error>(())
}

The printer is ideal when:

  • learning how PARC models a syntax form
  • updating reftests
  • debugging traversal code

A practical debugging loop

When a new syntax form is not behaving the way you expect:

  1. Parse the smallest reproducer with parse::*
  2. Print the AST with Printer
  3. Inspect spans on the nodes you care about
  4. Switch to driver if preprocessing is involved
  5. Map spans back to original files with loc

Project Layout

This chapter is for contributors and advanced users who want to understand where the parser logic lives.

Top-level crate layout

The repository is organized around a small public API surface and several internal support modules.

Path              Purpose
src/lib.rs        Public module exports
src/ir/           Source-level IR (SourcePackage, SourceType, etc.)
src/extract/      Declaration extraction from AST to IR
src/scan/         Header scanning (preprocess + parse + extract)
src/intake/       Preprocessed source intake
src/driver.rs     File-based parsing via external preprocessing
src/preprocess/   Built-in C preprocessor
src/parse.rs      Direct fragment parsing API
src/ast/          AST type definitions
src/visit/        Recursive visitor functions and trait
src/parser/       Parser implementation split by grammar area
src/loc.rs        Preprocessor line-marker location mapping
src/span.rs       Span and Node<T> wrappers
src/print.rs      AST debug printer
src/tests/        Test harnesses and integration-style tests

AST and visitor organization

The AST is split into focused files:

  • src/ast/declarations.rs
  • src/ast/expressions.rs
  • src/ast/statements.rs
  • src/ast/extensions.rs
  • src/ast/lexical.rs

The visitor layer mirrors that structure in src/visit/.

That symmetry is useful:

  • if you add a new AST node, you usually need a matching visitor hook
  • if you are looking for traversal behavior, the corresponding file is easy to find

Parser organization

The parser implementation is divided by grammar topics instead of one giant file. Examples include:

  • translation_units_and_functions.rs
  • declarations_entry.rs
  • declarators.rs
  • statements_iteration_and_jump.rs
  • casts_and_binary.rs
  • typeof_and_ts18661.rs

That split makes grammar work more localized.

Internal environment handling

Parsing depends on Env, which tracks parser state such as known typedef names and enabled syntax flavor. The public parse and driver APIs construct the right environment for you.

This matters because some C parses depend on whether an identifier is currently known as a typedef.

Testing layout

src/tests/ contains:

  • API tests
  • reftest harnesses
  • larger fixture harnesses
  • external/system-header related coverage

When changing parser behavior, expect to touch both narrow tests and corpus-style fixtures.

Contributor workflow

A good change sequence is:

  1. reproduce with the smallest possible parse::* input
  2. add or update a focused test
  3. inspect the tree with Printer
  4. patch the grammar or AST logic
  5. run make test

Why the parser is split this way

The parser is organized by syntax areas because C grammar work tends to be local but not trivial. That split helps with three things:

  • keeping grammar changes reviewable
  • matching failures to the right part of the parser quickly
  • reducing the chance that one large parser file becomes impossible to maintain

For example:

  • declaration bugs often land in declarations_entry.rs, declarators.rs, or related files
  • expression bugs often land in primary_and_generic.rs, casts_and_binary.rs, or nearby files
  • statement bugs often land in the statements_* files

Public versus internal boundaries

These are normal consumer-facing modules:

  • ir (primary data contract)
  • extract
  • scan
  • intake
  • driver
  • preprocess
  • parse
  • ast
  • visit
  • loc
  • span
  • print

These are implementation-oriented and should not be treated as a stable downstream boundary:

  • parser
  • env
  • astutil
  • strings

That distinction matters when you are extending the book or the crate API. Documentation should prefer the consumer-facing modules unless the chapter is specifically contributor-oriented.

API Contract

This chapter records the intended public consumer surface of parc.

It is not a blanket promise about every future change. It is the current guidance for how downstream tools should integrate with the crate without depending on parser internals or accidentally turning parc into a shared ABI owner for the rest of the pipeline.

First Principle

parc is the source-meaning layer of the pipeline: preprocessing, parsing, and source-level semantic extraction.

The intended downstream pattern is:

  1. scan headers or parse source via driver, scan, or parse
  2. extract normalized declarations via extract
  3. consume the SourcePackage IR from ir
  4. use visit, span, and loc to analyze AST-level details if needed

Downstream consumers that want source contracts should depend on parc::ir, not on parc::ast directly.

More importantly for this repository:

  • parc library code must not depend on linc or gerc
  • linc and gerc should not require parc as a library dependency in their production code paths
  • integration should happen through PARC-owned artifacts in tests/examples or external harnesses
  • there is no shared ABI crate that all three libraries depend on
  • there is no obligation to preserve discarded pipeline shapes for backward compatibility

Preferred public surface

These are the main consumer-facing modules:

Module            Role                                    Current expectation
parc::ir          source-level IR (SourcePackage)         preferred data contract
parc::extract     declaration extraction from AST         preferred extraction entry point
parc::scan        header scanning (preprocess + extract)  preferred high-level entry point
parc::intake      preprocessed source intake              preferred for already-preprocessed source
parc::driver      parse files and preprocessed source     preferred parse entry point
parc::preprocess  built-in C preprocessor                 preferred preprocessing entry point
parc::parse       parse string fragments directly         preferred low-level entry point
parc::ast         typed syntax tree                       internal data model
parc::visit       recursive traversal hooks               preferred traversal API
parc::span        byte-range metadata                     preferred location primitive
parc::loc         map offsets back to files/lines         preferred diagnostics helper
parc::print       AST debug dumping                       preferred inspection helper

Internal modules are not the contract

These modules are public only indirectly through behavior, not as a recommended downstream surface:

  • parser
  • env
  • astutil
  • strings

If a downstream tool depends directly on how those modules work, it is probably coupling itself to implementation details rather than the intended library boundary.

Normative consumer rules

If you are building on top of parc, the safest current rules are:

  1. use driver when preprocessing matters
  2. use parse::* for fragment parsing or already-controlled text inputs
  3. treat ir::SourcePackage as the primary output contract
  4. use visit for traversal instead of hand-rolling recursive descent everywhere
  5. use span and loc for diagnostics rather than guessing source positions
  6. do not rely on exact error-message strings for durable control flow
  7. do not treat PARC as semantic analysis, type checking, or ABI proof
  8. if another crate needs PARC output, serialize the PARC-owned artifact and translate it outside library code

What is part of the practical contract

Today the strongest practical contract is:

  • ir::SourcePackage, SourceType, SourceItem, and all IR types — the primary data contract
  • extract::extract_from_source, extract_from_translation_unit, parse_and_extract, parse_and_extract_resilient
  • scan::ScanConfig, scan_headers, ScanResult
  • intake::PreprocessedInput
  • ir::SourcePackageBuilder — programmatic package construction
  • driver::Config, Flavor, Parse, Error, SyntaxError, parse_builtin, and capture_macros
  • preprocess::{Processor, IncludeResolver, MacroTable, Lexer, preprocess, tokens_to_text, Target, define_target_macros}
  • parse::{constant, expression, declaration, statement, translation_unit, translation_unit_resilient}
  • the AST model under ast
  • the traversal hooks under visit
  • the span/location model under span and loc

Those are the surfaces the rest of the book assumes consumers will use.

The important point is this: PARC has two practical contracts today, not one:

  1. a source-contract path centered on ir, extract, and scan
  2. a parser-facing path centered on driver, parse, ast, and visit

The docs should not pretend the AST side does not exist, because the crate very much exposes it.

What is intentionally weaker

The following should be treated as less stable than the core parsing surface:

  • exact debug formatting of AST values
  • exact Display wording of parse errors
  • internal parser file layout under src/parser/
  • incidental ordering of implementation helper functions

These details are useful for debugging and contribution work, but they are not the main consumer contract.

Explicit non-goals

The current contract does not promise:

  • semantic name resolution beyond parsing decisions such as typedef handling
  • type checking
  • ABI compatibility guarantees
  • full support for every GCC or Clang extension
  • preservation of raw macro definitions beyond what capture_macros provides

Those are outside the scope of PARC as a source frontend.

Downstream posture

For long-lived integrations, the safest posture is:

  1. use scan or extract as your primary entry point — these produce SourcePackage
  2. consume ir::SourcePackage rather than raw AST types where possible
  3. use driver and parse only when you need AST-level access
  4. treat unsupported syntax and parser errors as normal outcomes
  5. keep tests with representative preprocessed inputs for the syntax families you depend on
  6. keep cross-package translation in tests/examples/harnesses rather than adding library dependencies
  7. see Migration From bic if you are transitioning from bic

End-To-End Workflows

This chapter ties the public modules together into practical usage patterns.

Workflow 1: Parse A Real C File

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};

let parsed = parse(&Config::default(), "include/demo.h")?;
println!("items: {}", parsed.unit.0.len());
Ok::<(), parc::driver::Error>(())
}

This is the baseline path when:

  • includes matter
  • macros matter
  • compiler predefined types or macros matter

The result gives you both the AST and the exact preprocessed source PARC saw.

Workflow 2: Parse A Preprocessed Snapshot

#![allow(unused)]
fn main() {
use parc::driver::{parse_preprocessed, Config};

let source = std::fs::read_to_string("snapshots/demo.i").unwrap();
let parsed = parse_preprocessed(&Config::default(), source)?;
Ok::<(), parc::driver::SyntaxError>(())
}

Use this when:

  • reproducing a parse bug
  • building deterministic tests
  • integrating with a nonstandard build system

This workflow isolates parser behavior from preprocessor invocation behavior.

Workflow 3: Parse A Fragment In Tests

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

let decl = parse::declaration("typedef unsigned long word_t;", Flavor::StdC11)?;
let expr = parse::expression("ptr->field + 1", Flavor::GnuC11)?;
Ok::<(), parc::parse::ParseError>(())
}

This is the right workflow for:

  • unit tests
  • grammar debugging
  • editor or language-server experiments

Workflow 4: Build A Syntax Analyzer

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};
use parc::visit::{self, Visit};
use parc::{ast, span};

struct ReturnCounter {
    count: usize,
}

impl<'ast> Visit<'ast> for ReturnCounter {
    fn visit_statement(&mut self, node: &'ast ast::Statement, span: &'ast span::Span) {
        if matches!(node, ast::Statement::Return(_)) {
            self.count += 1;
        }
        visit::visit_statement(self, node, span);
    }
}

let parsed = parse(&Config::default(), "src/main.c")?;
let mut counter = ReturnCounter { count: 0 };
counter.visit_translation_unit(&parsed.unit);
println!("return statements: {}", counter.count);
Ok::<(), parc::driver::Error>(())
}

This is the normal PARC analyzer pattern:

  1. parse
  2. traverse
  3. inspect spans and locations
  4. emit your own diagnostics or analysis data

Workflow 5: Build Diagnostics With Real File Locations

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};
use parc::loc::get_location_for_offset;

let parsed = parse(&Config::default(), "src/main.c")?;

for item in &parsed.unit.0 {
    let (loc, _) = get_location_for_offset(&parsed.source, item.span.start);
    println!("top-level item starts at {}:{}", loc.file, loc.line);
}
Ok::<(), parc::driver::Error>(())
}

Use this when your users care about original file locations rather than raw byte offsets in the preprocessed stream.

Workflow 6: Debug A New Syntax Form

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;
use parc::print::Printer;
use parc::visit::Visit;

let expr = parse::expression("({ int x = 1; x + 1; })", Flavor::GnuC11)?;

let mut out = String::new();
Printer::new(&mut out).visit_expression(&expr.node, &expr.span);
println!("{}", out);
Ok::<(), parc::parse::ParseError>(())
}

This is the most effective loop when exploring unfamiliar AST shapes.

Workflow 7: Regression-Test A Parse Failure

A practical bug workflow is:

  1. capture the smallest failing input
  2. decide whether preprocessing is relevant
  3. add a parse_api test or a reftest
  4. patch the grammar
  5. verify the printed AST or error outcome

That keeps parser changes concrete and reviewable.

Error Surface

This chapter describes the error model PARC exposes today.

Two layers of errors

PARC has two main error surfaces:

  1. direct parser errors from parse
  2. driver errors from driver

The distinction is important because the driver includes external preprocessing.

Direct parser errors

The parse module returns:

#![allow(unused)]
fn main() {
Result<T, parc::parse::ParseError>
}

ParseError includes:

  • line
  • column
  • offset
  • expected

This error means:

  • the parser could not consume the full input
  • the failure happened at the given position
  • one of the listed tokens or grammar expectations would have allowed parsing to continue

Driver errors

The driver module returns:

#![allow(unused)]
fn main() {
Result<parc::driver::Parse, parc::driver::Error>
}

That error enum has two branches:

  • PreprocessorError(io::Error)
  • SyntaxError(SyntaxError)

This split is a real contract boundary:

  • preprocessor failures mean PARC never reached parsing
  • syntax failures mean preprocessing succeeded and PARC failed on the resulting text

SyntaxError

driver::SyntaxError contains:

  • source
  • line
  • column
  • offset
  • expected

It also provides:

  • get_location() to map back to source files and include stack
  • format_expected() for user-facing token formatting

What consumers should key on

For durable control flow, consumers should branch on:

  • error type
  • structured fields such as line, column, and expected

Consumers should not branch on:

  • exact human-readable Display text
  • incidental token ordering inside formatted strings

Practical examples

Fragment parsing

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

match parse::statement("if (x) {", Flavor::StdC11) {
    Ok(_) => {}
    Err(err) => {
        eprintln!("statement parse failed at {}:{}", err.line, err.column);
    }
}
}

File parsing

#![allow(unused)]
fn main() {
use parc::driver::{parse, Config, Error};

match parse(&Config::default(), "broken.c") {
    Ok(_) => {}
    Err(Error::PreprocessorError(err)) => {
        eprintln!("preprocessor failure: {}", err);
    }
    Err(Error::SyntaxError(err)) => {
        let (loc, includes) = err.get_location();
        eprintln!("syntax failure in {}:{} ({})", loc.file, loc.line, err.column);
        eprintln!("include depth: {}", includes.len());
    }
}
}

Resilient parsing

parse::translation_unit_resilient provides error recovery. When a declaration fails to parse, it skips to the next synchronization point (; at file scope or } at brace depth zero) and continues parsing.

#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

let source = "int good;\n$broken$;\nint also_good;\n"; // sample input with one unparseable declaration
let tu = parse::translation_unit_resilient(source, Flavor::GnuC11);
// tu.0 contains all successfully parsed declarations
// unparseable regions are silently skipped
}

Use this when you want partial results from files that contain unsupported syntax. The strict translation_unit function is still preferred when you need to detect all errors.

Failure-model guidance

Downstream tools should treat parse failures as normal, reportable outcomes.

That means:

  • do not crash just because one translation unit fails
  • surface the structured error data to the caller
  • retain the preprocessed source when debugging hard failures

Explicit limitations of the current error model

The current model does not provide:

  • semantic diagnostics
  • fix-it suggestions
  • a typed taxonomy for every grammar category of failure
  • warning channels separate from parse success

PARC’s errors are syntax-oriented rather than compiler-like.

Flavor And Extension Support

PARC supports three language flavors and several extension families.

This chapter records what that means in practice.

Flavors

Flavor      Intent
StdC11      strict C11 parsing
GnuC11      C11 plus GNU-oriented syntax
ClangC11    C11 plus Clang-oriented syntax

Use the flavor that matches the syntax you expect in the input.

Why flavor matters

Some C parses are ambiguous or extension-specific.

Examples include:

  • GNU statement expressions
  • typeof
  • GCC-style attributes
  • GNU asm statements
  • Clang availability attributes

If you parse extension-heavy source in StdC11, errors are expected.
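As a sketch, the same GNU statement expression can be fed to two flavors to make the mismatch visible. This assumes `parse::translation_unit` is the strict string-parsing counterpart of `translation_unit_resilient` described later in this chapter:

```rust
#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

// GNU statement expression: extension syntax, not strict C11.
let src = "int x = ({ int t = 1; t + 1; });";

// Errors are expected under StdC11 for extension-heavy source...
if parse::translation_unit(src, Flavor::StdC11).is_err() {
    eprintln!("rejected under StdC11, as expected for GNU syntax");
}
// ...while GnuC11 models statement expressions explicitly.
let _ = parse::translation_unit(src, Flavor::GnuC11);
}
```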

GNU-oriented support

The AST and parser explicitly model GNU-oriented syntax such as:

  • typeof
  • statement expressions
  • GNU asm statements
  • asm labels
  • attributes
  • designated range initializers

In practice, if the source is GCC-flavored or Linux-kernel-like, GnuC11 is usually the right starting point.

Clang-oriented support

PARC also models Clang-specific or Clang-common syntax including:

  • Clang availability attributes
  • the ClangC11 flavor path in driver and parse

If your preprocessing and syntax assumptions are built around Clang, use Config::with_clang() or Flavor::ClangC11.

C23 keyword support

PARC accepts the following C23 keywords in all flavors, because modern compilers (GCC 15+) emit them in preprocessed output by default:

C23 keyword      C11 equivalent     Notes
bool             _Bool              type specifier
true             1                  parsed as integer constant
false            0                  parsed as integer constant
nullptr          0                  parsed as integer constant
static_assert    _Static_assert     declaration
alignas          _Alignas           alignment specifier
alignof          _Alignof           expression
thread_local     _Thread_local      storage class
constexpr        (none)             storage class specifier
typeof           __typeof__         type specifier (was GNU-only)
_BitInt(N)       (none)             type specifier with width
noreturn         _Noreturn          function specifier
complex          _Complex           type specifier
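A minimal sketch of that acceptance, assuming the strict `parse::translation_unit` string entry point:

```rust
#![allow(unused)]
fn main() {
use parc::driver::Flavor;
use parc::parse;

// C23 spellings that GCC 15+ can leave in preprocessed output.
let src = r#"
static_assert(1, "size check");
_Bool legacy = 0;
bool modern = true;
"#;

// These keywords are accepted in all flavors, including strict StdC11.
let _ = parse::translation_unit(src, Flavor::StdC11);
}
```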

GCC extension types

PARC recognizes these GCC extension types in GNU mode:

Type                 AST variant                  Notes
__int128             TypeSpecifier::Int128        non-unique, combinable with signed/unsigned
__float128           TypeSpecifier::Float128      unique type specifier
__builtin_va_list    typedef                      handled as built-in typedef name

Standard-mode guidance

Use StdC11 when:

  • you want to reject vendor syntax deliberately
  • your test corpus is intended to stay close to the standard
  • you want parser behavior that is easier to reason about across compilers

Practical consumer policy

A useful integration policy is:

  1. default to the compiler family you actually preprocess with
  2. add tests for the specific extension families you rely on
  3. treat unsupported extensions as explicit parser limitations, not random bugs

What this chapter does not claim

This chapter does not claim exhaustive support for every extension accepted by GCC or Clang.

It does claim that PARC has explicit support for several important extension families and that the flavor setting is part of the API contract for using them correctly.

Unsupported Cases

This chapter records the important unsupported or intentionally out-of-scope areas.

The goal is to prevent downstream users from mistaking absence of detail for implicit support.

It also acts as the current frontend-family closure ledger. Every hard family should fit into one of these buckets:

  • fully supported
  • resilient-only support
  • diagnostics-only improvement
  • intentional rejection

For the Level 1 production claim, this ledger is part of the real contract. If a family is not classified here, it should not be treated as in-scope production behavior.

Frontend-Family Closure Ledger

The current important families are:

  • K&R function declarations: diagnostics-only improvement. PARC preserves the function surface and emits explicit unsupported diagnostics.
  • block pointers: intentional rejection. They still fail in parsing; current work is about sharper diagnostics, not pretending they lower cleanly.
  • bitfield-heavy records: resilient-only support. PARC keeps record shape and bit widths, but layout truth remains partial.
  • vendor attributes and calling-convention attributes: resilient-only support. PARC preserves the declaration and emits partial diagnostics when attributes are ignored.
  • macro-heavy include stacks: fully supported on current canonical corpora. The canonical corpora are the proof surface; more corpora still need to land before claiming broad closure.
  • hostile include-order and typedef-chain environments: fully supported on current canonical corpora. Treat this as corpus-backed support, not universal extension parity.

This ledger is intentionally blunt:

  • if a family is not yet honestly representable, reject it
  • if a family is only partially representable, say so
  • if a family is only proven on named corpora, document that exact scope

The Level 1 production envelope is Linux/ELF-first and corpus-backed. That means “supported” here should be read as one of:

  • fully supported within the named canonical corpus
  • partially supported with explicit diagnostics
  • rejected explicitly as out of scope

Semantic analysis

PARC does not provide:

  • full name resolution
  • type checking
  • constant folding as a stable analysis contract
  • ABI or layout proof
  • compiler-quality warnings

It is a parser with source-structure support, not a complete compiler frontend.

Preprocessing

PARC does not implement a standalone C preprocessor in the driver path.

Instead it depends on an external preprocessor command such as:

  • gcc -E
  • clang -E

That means PARC does not try to normalize every compiler’s preprocessing behavior internally.

The built-in preprocessor is increasingly useful for scan-first workflows, but it is still a scoped compatibility surface rather than a promise of universal host-header parity.

Extension completeness

PARC supports several GNU and Clang extensions, but the project does not promise complete parity with every extension accepted by modern GCC or Clang releases.

Downstream tools should not assume:

  • full GNU extension completeness
  • full Clang extension completeness
  • identical acceptance behavior across all compiler-version-specific syntax edges

Macro inventory and expansion modeling

PARC parses the post-preprocessing result. It does not expose a first-class macro inventory or a stable semantic model of macro definitions as its own output contract.

If you need macro capture as data, that is outside PARC’s current scope.

Translation-unit semantics

PARC can parse translation units, but it does not guarantee:

  • cross-file symbol resolution
  • duplicate-definition analysis as a stable feature
  • semantic correctness of declarations
  • linkability of parsed declarations

Those tasks belong to later analysis layers, not the parser itself.

Diagnostics depth

PARC does not currently provide:

  • warning classes
  • fix-it suggestions
  • rich categorized error codes
  • a stable diagnostic JSON schema

The current error model is strong enough for syntax handling, not full compiler UX.

The practical rule for the remaining hard families is:

  • if PARC can keep a trustworthy declaration surface, it should do so and emit diagnostics
  • if PARC cannot keep a trustworthy declaration surface, it should reject the construct explicitly

Consumer guidance

Downstream tools should treat these gaps as explicit non-guarantees.

That means:

  • build policy around syntax success and failure, not semantic certainty
  • isolate extension-heavy assumptions behind tests
  • keep representative preprocessed fixtures for any hard parser dependency
  • treat the closure ledger above as part of the real contract, not as a vague future roadmap

Reproducibility

Parsing C is sensitive to the exact preprocessor environment.

This chapter documents how to keep PARC-based workflows reproducible.

Main reproducibility risks

The biggest sources of drift are:

  • different preprocessor executables
  • different default include paths
  • different predefined macros
  • different parser flavor settings
  • different preprocessed snapshots in tests

Best practices

For durable automation:

  1. prefer explicit Config values over ambient defaults in CI
  2. pin include paths with -I... when they matter
  3. use -nostdinc for isolated fixture testing when appropriate
  4. keep preprocessed snapshots for hard parser regressions
  5. keep the parser flavor explicit in tests

Deterministic parse debugging

If a real file parse is inconsistent across machines, a strong debugging move is:

  1. capture the preprocessed output
  2. switch the failing test to parse_preprocessed
  3. debug PARC against the stable snapshot

That separates:

  • preprocessing differences
  • parser differences
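The snapshot step can be sketched like this. The exact location and signature of `parse_preprocessed` are assumptions based on the name used above, and the fixture path is hypothetical:

```rust
#![allow(unused)]
fn main() {
use parc::driver::{parse_preprocessed, Config};

// A pinned .i snapshot captured once with `gcc -E` and checked in.
// (Hypothetical fixture path.)
let snapshot = std::fs::read_to_string("fixtures/regression.i").unwrap();

// Debug the parser against the stable snapshot, with no host
// preprocessor in the loop.
let parsed = parse_preprocessed(&Config::default(), &snapshot);
}
```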

Reftests and snapshots

The reftest harness already encourages deterministic expectations by comparing against printed AST output. For parser bugs that depend on preprocessing, a pinned .i file is often even better.

Consumer guidance

If PARC is part of a larger pipeline, keep the following recorded somewhere durable:

  • preprocessor executable
  • preprocessor arguments
  • flavor
  • representative fixtures
  • expected parse outcome

Without that context, debugging parser regressions is much slower.

Stable Usage Patterns

This chapter records usage patterns that are safest for downstream consumers.

Pattern 1: Separate parsing from analysis

A durable integration pattern is:

  1. parse with PARC
  2. convert the AST into your own analysis model if needed
  3. run later semantic or policy logic on that model

This avoids coupling too much of your tool to every detail of PARC’s raw AST layout.

Pattern 2: Preserve preprocessed source for diagnostics

If you use driver, keep Parse::source around for as long as you may need diagnostics.

That enables:

  • mapping spans back to files and lines
  • debugging parser failures later
  • reproducing failures from stored snapshots
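A sketch of that retention, assuming `Parse::source` holds the preprocessed text as a string (the file name is hypothetical; the call shape follows the driver example earlier in the book):

```rust
#![allow(unused)]
fn main() {
use parc::driver::{parse, Config};

// Keep the whole Parse value while diagnostics may still be needed,
// instead of immediately discarding everything but the AST.
let parsed = parse(&Config::default(), "input.c").unwrap();

// The retained preprocessed text supports span mapping and
// snapshot-based reproduction later.
let preprocessed_len = parsed.source.len();
}
```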

Pattern 3: Make flavor explicit

Even when defaults are convenient, explicit flavor choices are easier to maintain in tools and tests.

Prefer:

  • Flavor::StdC11 for strict grammar tests
  • Flavor::GnuC11 when GNU syntax is intentional
  • Flavor::ClangC11 when Clang-specific syntax is intentional

Pattern 4: Test the syntax you depend on

If your downstream tool depends on a specific syntax family, keep representative tests for it.

Examples:

  • function-pointer declarators
  • designated initializers
  • GNU statement expressions
  • inline asm
  • availability attributes

Pattern 5: Treat parse failure as data

A mature integration does not assume every input will parse. It treats parse failure as a structured, reportable outcome.

That means:

  • returning parse diagnostics to the caller
  • logging the failing source context when appropriate
  • keeping failure fixtures in the test corpus

Pattern 6: Prefer local traversal hooks

When building analyzers, override the narrowest useful visitor hook instead of one huge catch-all traversal method.

That makes the analysis easier to maintain as the AST evolves.
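As an illustrative sketch only: the trait and hook names below (`visit::Visitor`, `visit_function_definition`, `ast::FunctionDefinition`) are assumptions about the `visit` module's shape, not confirmed API:

```rust
#![allow(unused)]
fn main() {
use parc::ast;
use parc::visit;

// Count function definitions and nothing else, by overriding only
// the narrowest relevant hook instead of a catch-all traversal.
struct FnCounter {
    count: usize,
}

impl visit::Visitor for FnCounter {
    fn visit_function_definition(&mut self, _def: &ast::FunctionDefinition) {
        self.count += 1;
    }
}
}
```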

Contributor Workflow

This chapter records a practical workflow for changing parc safely.

Smallest-reproducer rule

When fixing or extending the parser, start with the smallest input that demonstrates the issue.

That input should usually be one of:

  • a direct parse::* snippet
  • a reftest file
  • a preprocessed snapshot

This keeps parser work focused. The standard loop is:

  1. reproduce the issue with the smallest possible input
  2. decide whether the right test layer is parse_api, reftest, or full-app style
  3. inspect the AST or failure position with Printer or structured errors
  4. patch the relevant parser module
  5. rerun the focused tests
  6. only then widen out to broader test coverage

Choosing the right test layer

Use parse_api tests when:

  • the bug is a simple grammar acceptance issue
  • you only need a success/failure assertion

Use reftests when:

  • tree shape matters
  • printer output is the clearest regression oracle

Use preprocessed or full-app style fixtures when:

  • includes or macro expansion are part of the problem
  • driver behavior matters

Grammar-oriented debugging

A good parser debugging loop is:

  1. isolate the failing syntax
  2. parse with the right flavor
  3. inspect the closest AST shape that already works
  4. patch the grammar in the most local parser file possible

This is usually better than broad speculative rewrites.

AST changes

If you add or change an AST node, review the corresponding surfaces too:

  • visitor hooks in visit
  • printer behavior in print
  • any book examples that describe the shape
  • reftest expectations if printer output changed

Documentation changes

If a syntax family becomes better supported, update the book at the same time. The important places are usually:

  • flavor/extension guidance
  • unsupported cases
  • workflows
  • AST or visitor examples

That keeps the book aligned with the real parser contract.

Boundary rule

When changing parc, keep the ownership split explicit:

  • parc owns preprocessing, parsing, extraction, and source artifacts
  • parc does not own link evidence or Rust lowering
  • do not document parser internals as if they were a shared ABI for the rest of the pipeline

If a change makes the source artifact richer, document the richer source meaning directly instead of hinting that downstream crates depend on parc library internals.

Maintenance rule

The maintenance bar is simple:

  1. add or tighten the smallest useful test first
  2. keep public contract docs and examples in the same patch
  3. prefer deleting stale workflow language over preserving it for history
  4. do not keep dead compatibility stories in the book

Support Tiers

This chapter records a practical support posture for PARC’s public surface.

It is meant to help downstream users judge which parts of the crate are the safest long-term integration points.

Tier 1: Core Consumer Surface

These are the most important public surfaces to depend on:

  • driver
  • parse
  • ast
  • visit
  • span
  • loc

These modules define the main parsing contract of the crate.

Tier 2: Debugging And Inspection Surface

These are public and useful, but more inspection-oriented than contract-critical:

  • print
  • Debug views of AST nodes
  • formatted error text

They are valuable for debugging and tests, but long-lived tooling should still prefer structured data over formatted strings.

Tier 3: Contributor-Oriented Knowledge

These are important for contributors but should not be treated as downstream contracts:

  • parser file organization under src/parser/
  • helper-module layout
  • incidental internal naming
  • current implementation decomposition across grammar files

These details may evolve as the parser changes.

Consumer guidance

If you are building external tooling on top of PARC, bias toward Tier 1 surfaces first. Reach for Tier 2 when you need diagnostics or debugging support. Treat Tier 3 as implementation detail unless you are actively contributing to PARC itself.

Hardening Matrix

This chapter translates the large PARC test surface into an explicit hardening ladder.

The important point is not “how many tests exist”. The important point is which surfaces are carrying confidence for real-header parsing, preprocessing, and source extraction.

How To Read The Matrix

Read each surface on three axes:

  • hermetic or host-dependent
  • parser-only versus scan-first
  • success path versus conservative failure path

A surface is stronger when it is:

  • hermetic
  • scan-first
  • repeated deterministically
  • tied to a realistic system or library family

Tier 1: Hermetic Canonical Baselines

These are the first surfaces that should stay green on every machine:

  • vendored musl stdint
  • vendored zlib
  • vendored libpng builtin-preprocessor success path
  • repo-owned macro_env_a hostile macro corpus
  • repo-owned type_env_b hostile type corpus
  • parser and extraction corpus fixtures under src/tests/**

These matter because they exercise:

  • multi-header scanning
  • macro and include handling
  • extraction into SourcePackage
  • deterministic behavior without relying on the host toolchain layout

Tier 2: Host-Dependent Canonical Ladders

These should stay green on developer and CI hosts where the headers exist, but they are not the first portability baseline:

  • OpenSSL public wrapper extraction
  • combined Linux event-loop wrapper extraction
  • larger libc and system-header clusters

These surfaces matter because they are closer to the “real ugly header world” target than the small synthetic fixtures.

Tier 3: Hostile And Conservative-Failure Surfaces

These prove that PARC is refusing or degrading honestly instead of pretending to understand everything:

  • hostile declaration fixtures
  • repo-owned hostile corpora that force builtin-preprocessor macro and typedef expansion
  • recovery fixtures
  • unsupported or partial declaration families that still emit diagnostics and partial metadata
  • extraction-status summaries that distinguish supported, partial, and unsupported output trust

For release purposes, these failures are good when they are:

  • deterministic
  • diagnostic
  • documented

Determinism Anchors

The most important repeat-run anchors right now are:

  • vendored musl scan
  • vendored zlib scan
  • vendored libpng scan
  • macro_env_a scan
  • type_env_b scan
  • OpenSSL wrapper extraction
  • combined Linux event-loop wrapper extraction

If any of those become unstable, the release posture should drop immediately.

What This Matrix Does Not Mean

This matrix does not mean:

  • every random system library now parses perfectly
  • every preprocessor corner is solved
  • every large host-dependent surface is equally mature

It means the current confidence ladder is explicit instead of implied.

Parser Boundaries

This chapter explains where PARC starts and where it intentionally stops.

PARC owns syntax parsing

PARC is responsible for:

  • accepting supported C syntax
  • building an AST
  • carrying spans
  • mapping parse positions back through preprocessor line markers

That is the core boundary of the crate.

PARC does not own full compilation

PARC does not attempt to be:

  • a full preprocessor implementation
  • a type checker
  • a linker-aware analyzer
  • a code generator
  • a full semantic compiler frontend

These are not accidental omissions. They are part of the intended scope boundary.

Practical layering

A healthy toolchain boundary looks like this:

  1. a compiler or preprocessor produces acceptable input
  2. PARC parses it
  3. a later layer performs semantic analysis, policy checks, or code generation

This keeps PARC focused on syntax and source structure.

Why this matters for consumers

If a downstream tool needs:

  • ABI guarantees
  • linker truth
  • semantic type equivalence
  • macro inventories as data

then PARC should be one component in the pipeline, not the whole pipeline.

Why this matters for contributors

When deciding whether a new feature belongs in PARC, a useful question is:

“Does this improve PARC’s syntax parsing and source-structure contract, or does it drag PARC into a later compiler stage?”

If it is mostly a later-stage concern, it probably belongs outside PARC.

Release Checklist

This chapter is a pragmatic checklist for documentation and parser changes before a release.

The important release posture is architectural:

  • parc releases source/frontend behavior
  • it does not release binary or Rust-generation policy
  • the tested SourcePackage contract matters more than parser-internal churn

Parser changes

Before releasing parser changes:

  1. confirm the smallest reproducer has a test
  2. confirm the intended flavor coverage is tested
  3. confirm the AST shape change is deliberate
  4. confirm visitor and printer behavior still make sense

Book changes

Before releasing documentation changes:

  1. confirm the affected public behavior is described in the book
  2. confirm unsupported or out-of-scope cases are still documented honestly
  3. confirm examples still match the actual public API names

Error-surface changes

Before releasing changes around errors:

  1. confirm structured fields still provide the needed information
  2. avoid treating formatted strings as the real contract
  3. update the error-surface chapter if the practical behavior changed

Workflow changes

Before releasing changes to the normal integration path:

  1. update the workflow chapter
  2. update the API contract chapter if the preferred boundary changed
  3. update stable-usage guidance if downstream posture should change

Artifact contract changes

Before releasing a SourcePackage shape change:

  1. confirm the changed field meaning is covered by contract-level tests
  2. confirm the consuming workflow examples still describe artifact boundaries
  3. confirm cross-crate composition is still described as tests/examples/harness work, not library coupling

Release gate

parc is ready to release only when:

  • make build passes
  • make test passes
  • the canonical hardening surfaces are still green
    • vendored musl stdint
    • vendored zlib
    • vendored libpng scan
    • OpenSSL public wrapper extraction
    • libcurl public wrapper extraction
    • combined Linux event-loop wrapper extraction
  • deterministic repeated extraction still holds on the canonical large surfaces
  • the book still teaches parc as the source-meaning crate
  • unsupported or partial source behavior is still documented honestly

Final practical rule

If a change would force a downstream PARC consumer to rethink how it parses, traverses, or reports on source, the book should say so explicitly in the same change.

Readiness Scorecard

This chapter ties PARC readiness to real suites instead of vague confidence claims.

Overall Posture

PARC should currently be read as:

  • strong on parser and extraction fundamentals
  • strong on scan-first vendored baselines
  • materially stronger on hostile real-world builtin-preprocessor corners
  • intentionally conservative when a large header family cannot be modeled honestly

For Level 1 production, PARC should be read as Linux/ELF-first and canonical-corpus-backed, not as a universal frontend for arbitrary C headers.

For whole-pipeline claims, this score is also capped by downstream gerc anchors that ingest translated PARC source surfaces in tests/examples.

That is good progress, but it is not the same thing as “finished for every C header in the wild”.

Subsystem Scorecard

  • parser entrypoints: high
  • AST traversal and printing: high
  • extraction to SourcePackage: high
  • scan-first vendored baselines: high
  • hostile-header recovery: medium-high
  • built-in preprocessor coverage on ugly system headers: medium-high
  • large host-dependent wrapper extraction: medium-high
  • deterministic behavior on canonical large surfaces: high

Canonical Readiness Anchors

The release posture should be judged against these anchors first:

  • vendored musl stdint
  • vendored zlib
  • vendored libpng scan
  • repo-owned macro_env_a
  • repo-owned type_env_b
  • OpenSSL public wrapper extraction
  • combined Linux event-loop wrapper extraction

If those anchors stay green and deterministic, PARC is earning trust. If they drift, the scorecard should be lowered even if many smaller tests still pass.

What Would Raise Readiness Further

The next meaningful gains would be:

  • broader built-in-preprocessor coverage on other hostile width and platform gates beyond the libpng family
  • more ugly combined system-header clusters
  • more repeat-run deterministic scans on large host-dependent surfaces
  • clearer unsupported-case diagnostics for the remaining difficult families

Migration From bic

This chapter documents how to migrate downstream consumers from bic’s frontend extraction to parc’s SourcePackage contract.

Why migrate

parc now owns source-level declaration extraction. bic’s extract.rs was the legacy location for this logic. The canonical path is now:

C headers  ->  parc::scan / parc::extract  ->  SourcePackage  ->  downstream

bic should consume parc::ir::SourcePackage instead of owning its own extraction.

Type mapping

bic type                 parc type                Notes
BindingPackage           SourcePackage            parc has no layouts, link, or bic_version
BindingItem              SourceItem               Same variant set
BindingType              SourceType               Pointer model differs (see below)
FunctionBinding          SourceFunction           Identical structure
ParameterBinding         SourceParameter          Identical structure
RecordBinding            SourceRecord             No representation or abi_confidence
FieldBinding             SourceField              No layout field
EnumBinding              SourceEnum               Identical structure
TypeAliasBinding         SourceTypeAlias          No canonical_resolution
VariableBinding          SourceVariable           Identical structure
UnsupportedItem          SourceUnsupported        Identical structure
CallingConvention        CallingConvention        parc version includes Unknown(String)
TypeQualifiers           TypeQualifiers           Identical structure
BindingTarget            SourceTarget             Identical structure
BindingInputs            SourceInputs             Identical structure
BindingDefine            SourceDefine             Identical structure
MacroBinding             SourceMacro              parc drops function_like and category
DeclarationProvenance    DeclarationProvenance    Identical structure
MacroProvenance          MacroProvenance          Identical structure

Pointer model difference

bic:

#![allow(unused)]
fn main() {
Pointer {
    pointee: Box<BindingType>,
    const_pointee: bool,      // whether pointee is const
    qualifiers: TypeQualifiers, // qualifiers on the pointer itself
}
}

parc:

#![allow(unused)]
fn main() {
Pointer {
    pointee: Box<SourceType>,
    qualifiers: TypeQualifiers, // is_const means pointee is const
}
}

In parc, qualifiers.is_const on a Pointer indicates that the pointee is const-qualified. Use SourceType::const_ptr(inner) and SourceType::ptr(inner) as constructors.

Missing fields in parc

These bic fields are intentionally absent from parc because they belong to the link/ABI layer:

  • FieldBinding.layout (field offset) — use LINC probing
  • RecordBinding.representation — use LINC probing
  • RecordBinding.abi_confidence — use LINC validation
  • TypeAliasBinding.canonical_resolution — parc preserves TypedefRef chains
  • BindingPackage.layouts — use LINC probing
  • BindingPackage.link — use LINC link surface
  • BindingPackage.effective_macro_environment — use LINC macro analysis

Migration steps

Step 1: Replace extraction call

Before:

#![allow(unused)]
fn main() {
use bic::extract::Extractor;
use bic::ir::BindingPackage;

let extractor = Extractor::new();
let (items, diagnostics) = extractor.extract(&unit);
let mut pkg = BindingPackage::new();
pkg.items = items;
}

After:

#![allow(unused)]
fn main() {
use parc::extract;
use parc::ir::SourcePackage;

let pkg = extract::extract_from_translation_unit(&unit, Some("header.h".into()));
}

Or for end-to-end scanning:

#![allow(unused)]
fn main() {
use parc::scan::{ScanConfig, scan_headers};

let config = ScanConfig::new()
    .entry_header("header.h")
    .with_builtin_preprocessor();
let result = scan_headers(&config).unwrap();
let pkg: &SourcePackage = &result.package;
}

Step 2: Update type references

Replace all uses of BindingType with SourceType, BindingItem with SourceItem, etc. The variant names are identical.

Step 3: Handle pointer model

Replace const_pointee checks:

#![allow(unused)]
fn main() {
// Before (bic)
if let BindingType::Pointer { const_pointee: true, .. } = ty { ... }

// After (parc)
if let SourceType::Pointer { qualifiers, .. } = ty {
    if qualifiers.is_const { ... }
}
}

Step 4: Remove ABI fields

Any code that reads FieldBinding.layout, RecordBinding.representation, or RecordBinding.abi_confidence should be moved to LINC’s domain.

Step 5: Use builder for programmatic construction

#![allow(unused)]
fn main() {
use parc::ir::{SourcePackageBuilder, SourceItem, SourceFunction, ...};

let pkg = SourcePackageBuilder::new()
    .source_path("api.h")
    .item(SourceItem::Function(func))
    .item(SourceItem::Record(rec))
    .build();
}

API reference

Key public APIs for downstream consumers:

  • parc::extract::extract_from_source(src) — parse and extract
  • parc::extract::extract_from_translation_unit(unit, path) — extract from AST
  • parc::extract::parse_and_extract(src, flavor) — with flavor control
  • parc::extract::parse_and_extract_resilient(src, flavor) — with error recovery
  • parc::scan::scan_headers(config) — end-to-end header scanning
  • parc::ir::SourcePackage — the contract type
  • parc::ir::SourcePackageBuilder — programmatic construction
  • SourcePackage::retain_items(pred) — filter items
  • SourcePackage::merge(other) — combine packages
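A short sketch of the last two calls together, assuming `merge` takes the other package by value and `retain_items` takes a predicate over `SourceItem`:

```rust
#![allow(unused)]
fn main() {
use parc::ir::{SourceItem, SourcePackage};

fn functions_only(mut base: SourcePackage, extra: SourcePackage) -> SourcePackage {
    // Combine two scan results, then keep only the function items.
    base.merge(extra);
    base.retain_items(|item| matches!(item, SourceItem::Function(_)));
    base
}
}
```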