AST Model

The ast module contains the syntax tree PARC produces after parsing. Most types track the C11 grammar closely, with additional variants for supported GNU and Clang extensions.

Core wrapper types

Many parsed values are wrapped in Node<T>:

pub struct Node<T> {
    pub node: T,
    pub span: Span,
}

Most parsed values therefore carry a Span: the byte range they occupy in the parsed source.
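As a rough illustration of what the span is for, the sketch below slices the original source text with a node's byte range. The Node and Span types here are local stand-ins, and the Span field names (start, end) are an assumption rather than something confirmed by this page; check the real definitions in the ast module before relying on them.

```rust
// Local stand-ins for illustration only. The field names `start` and `end`
// are assumptions; PARC's real Span may be laid out differently.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start: usize, // assumed: byte offset of the first byte
    pub end: usize,   // assumed: byte offset one past the last byte
}

#[derive(Debug, Clone, PartialEq)]
pub struct Node<T> {
    pub node: T,
    pub span: Span,
}

/// Returns the source text covered by `node`, or None when the span is out
/// of range or does not fall on UTF-8 character boundaries.
pub fn source_text<'a, T>(source: &'a str, node: &Node<T>) -> Option<&'a str> {
    source.get(node.span.start..node.span.end)
}

fn main() {
    let source = "int x = 42;";
    // Hand-built node standing in for a parsed identifier.
    let ident = Node {
        node: "x".to_string(),
        span: Span { start: 4, end: 5 },
    };
    println!("{:?}", source_text(source, &ident));
}
```

Using str::get instead of direct indexing keeps the helper from panicking if a span is paired with the wrong source buffer.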

Top-level structure

The root is:

pub struct TranslationUnit(pub Vec<Node<ExternalDeclaration>>);

Top-level items are:

pub enum ExternalDeclaration {
    Declaration(Node<Declaration>),
    StaticAssert(Node<StaticAssert>),
    FunctionDefinition(Node<FunctionDefinition>),
}

So a translation unit is a flat list of:

  • declarations
  • static assertions
  • function definitions
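Because the list is flat, a single match over the three variants is enough to classify every top-level item. The following is a minimal sketch that mirrors the shapes shown above with empty placeholder payloads; Counts, count_items, and wrap are hypothetical helpers, and the Span fields are assumed.

```rust
// Simplified mirror of the types above; payload structs are empty placeholders.
#[derive(Debug, Clone, Copy, Default)]
pub struct Span {
    pub start: usize, // assumed field
    pub end: usize,   // assumed field
}

pub struct Node<T> {
    pub node: T,
    pub span: Span,
}

pub struct Declaration;
pub struct StaticAssert;
pub struct FunctionDefinition;

pub enum ExternalDeclaration {
    Declaration(Node<Declaration>),
    StaticAssert(Node<StaticAssert>),
    FunctionDefinition(Node<FunctionDefinition>),
}

pub struct TranslationUnit(pub Vec<Node<ExternalDeclaration>>);

/// Hypothetical summary type, not part of PARC.
#[derive(Debug, Default, PartialEq)]
pub struct Counts {
    pub declarations: usize,
    pub static_asserts: usize,
    pub function_definitions: usize,
}

/// Walks the flat top-level list once and tallies each kind of item.
pub fn count_items(unit: &TranslationUnit) -> Counts {
    let mut counts = Counts::default();
    for item in &unit.0 {
        match &item.node {
            ExternalDeclaration::Declaration(_) => counts.declarations += 1,
            ExternalDeclaration::StaticAssert(_) => counts.static_asserts += 1,
            ExternalDeclaration::FunctionDefinition(_) => counts.function_definitions += 1,
        }
    }
    counts
}

/// Hypothetical helper: wraps a value in a Node with an empty span.
pub fn wrap<T>(node: T) -> Node<T> {
    Node { node, span: Span::default() }
}

fn main() {
    let unit = TranslationUnit(vec![
        wrap(ExternalDeclaration::Declaration(wrap(Declaration))),
        wrap(ExternalDeclaration::FunctionDefinition(wrap(FunctionDefinition))),
        wrap(ExternalDeclaration::Declaration(wrap(Declaration))),
    ]);
    println!("{:?}", count_items(&unit));
}
```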

Declarations

Declarations are split into specifiers and declarators:

pub struct Declaration {
    pub specifiers: Vec<Node<DeclarationSpecifier>>,
    pub declarators: Vec<Node<InitDeclarator>>,
}

This mirrors C’s real syntax. For example:

static const unsigned long value = 42;

roughly becomes:

  • storage class specifier: Static
  • type qualifier: Const
  • type specifiers: Unsigned, Long
  • one declarator with identifier value
  • initializer expression 42
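The breakdown above can be sketched as data. This is a deliberately simplified model: the Node wrappers are dropped, InitDeclarator is reduced to a name and an integer initializer, and the variant names of DeclarationSpecifier are assumptions chosen to match the bullet list rather than PARC's confirmed API.

```rust
// Simplified stand-ins; variant and field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum StorageClassSpecifier { Static, Extern }

#[derive(Debug, Clone, PartialEq)]
pub enum TypeQualifier { Const, Volatile }

#[derive(Debug, Clone, PartialEq)]
pub enum TypeSpecifier { Int, Long, Unsigned }

#[derive(Debug, Clone, PartialEq)]
pub enum DeclarationSpecifier {
    StorageClass(StorageClassSpecifier),
    TypeQualifier(TypeQualifier),
    TypeSpecifier(TypeSpecifier),
}

pub struct InitDeclarator {
    pub name: String,
    pub initializer: Option<i64>, // the real AST would hold an expression node
}

pub struct Declaration {
    pub specifiers: Vec<DeclarationSpecifier>,
    pub declarators: Vec<InitDeclarator>,
}

/// Hand-built model of `static const unsigned long value = 42;`.
pub fn example_declaration() -> Declaration {
    Declaration {
        specifiers: vec![
            DeclarationSpecifier::StorageClass(StorageClassSpecifier::Static),
            DeclarationSpecifier::TypeQualifier(TypeQualifier::Const),
            DeclarationSpecifier::TypeSpecifier(TypeSpecifier::Unsigned),
            DeclarationSpecifier::TypeSpecifier(TypeSpecifier::Long),
        ],
        declarators: vec![InitDeclarator {
            name: "value".to_string(),
            initializer: Some(42),
        }],
    }
}

/// Collects only the type specifiers, in source order.
pub fn type_specifiers(decl: &Declaration) -> Vec<&TypeSpecifier> {
    decl.specifiers
        .iter()
        .filter_map(|s| match s {
            DeclarationSpecifier::TypeSpecifier(t) => Some(t),
            _ => None,
        })
        .collect()
}

/// True when the declaration carries the given storage class.
pub fn has_storage_class(decl: &Declaration, wanted: &StorageClassSpecifier) -> bool {
    decl.specifiers
        .iter()
        .any(|s| matches!(s, DeclarationSpecifier::StorageClass(c) if c == wanted))
}

fn main() {
    let decl = example_declaration();
    println!("static: {}", has_storage_class(&decl, &StorageClassSpecifier::Static));
    println!("type specifiers: {:?}", type_specifiers(&decl));
}
```

Keeping the specifiers as one ordered list, rather than pre-resolving them into a single type, is what lets a tool see that unsigned and long arrived as two separate tokens.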

Declarators are the hard part

Declarator separates:

  • the name-bearing core (DeclaratorKind)
  • derived layers such as pointers, arrays, and functions
  • extension nodes

That design lets PARC represent C declarators without flattening away their structure.

Examples:

  • int *p;
  • int values[16];
  • int (*handler)(int);
  • void f(int x, int y);
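To show why keeping the layers matters, here is a small sketch that turns a declarator into an English reading. It uses a simplified model, not PARC's actual definitions: the DeclaratorKind and DerivedDeclarator variants, their payloads, and the describe function are all assumptions made for illustration. Within one level, array and function suffixes are applied before pointers, and a parenthesized inner declarator is read first.

```rust
// Simplified model for illustration; not PARC's real type definitions.
pub enum DeclaratorKind {
    Abstract,
    Identifier(String),
    Declarator(Box<Declarator>), // parenthesized inner declarator
}

pub enum DerivedDeclarator {
    Pointer,
    Array(Option<usize>), // element count, if written
    Function(usize),      // parameter count only
}

pub struct Declarator {
    pub kind: DeclaratorKind,
    pub derived: Vec<DerivedDeclarator>,
}

fn layer_text(layer: &DerivedDeclarator) -> String {
    match layer {
        DerivedDeclarator::Pointer => "pointer to ".to_string(),
        DerivedDeclarator::Array(Some(n)) => format!("array[{}] of ", n),
        DerivedDeclarator::Array(None) => "array of ".to_string(),
        DerivedDeclarator::Function(n) => format!("function({} params) returning ", n),
    }
}

// Appends layer descriptions innermost-first and returns the declared name.
fn collect(decl: &Declarator, out: &mut String) -> String {
    let name = match &decl.kind {
        DeclaratorKind::Abstract => "<abstract>".to_string(),
        DeclaratorKind::Identifier(name) => name.clone(),
        DeclaratorKind::Declarator(inner) => collect(inner, out),
    };
    // Array and function suffixes bind tighter than pointers at the same level.
    for layer in &decl.derived {
        if !matches!(layer, DerivedDeclarator::Pointer) {
            out.push_str(&layer_text(layer));
        }
    }
    for layer in &decl.derived {
        if matches!(layer, DerivedDeclarator::Pointer) {
            out.push_str(&layer_text(layer));
        }
    }
    name
}

/// Produces a reading such as "p: pointer to int".
pub fn describe(decl: &Declarator, base_type: &str) -> String {
    let mut layers = String::new();
    let name = collect(decl, &mut layers);
    format!("{}: {}{}", name, layers, base_type)
}

fn main() {
    // int (*handler)(int);
    let handler = Declarator {
        kind: DeclaratorKind::Declarator(Box::new(Declarator {
            kind: DeclaratorKind::Identifier("handler".to_string()),
            derived: vec![DerivedDeclarator::Pointer],
        })),
        derived: vec![DerivedDeclarator::Function(1)],
    };
    println!("{}", describe(&handler, "int"));
}
```

A flattened representation could not distinguish int (*handler)(int) from int *handler(int); the nested kind is what records the parentheses.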

Expressions

Expression is a large enum covering C expression syntax:

  • identifiers
  • constants
  • string literals
  • member access
  • calls
  • casts
  • unary operators
  • binary operators
  • conditional expressions
  • comma expressions
  • sizeof, _Alignof
  • GNU statement expressions
  • offsetof and va_arg expansions

Examples:

x
42
ptr->field
f(a, b)
(int) value
a + b * c
cond ? left : right
({ int t = 1; t + 2; })
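An expression tree encodes precedence in its shape, so a + b * c is an addition whose right operand is a multiplication. The sketch below uses a tiny stand-in enum with three variants to make that visible; the variant names and the parenthesize helper are hypothetical and far smaller than the real Expression type.

```rust
// Tiny stand-in for the real Expression enum; names are assumptions.
pub enum BinaryOperator {
    Plus,
    Multiply,
}

pub enum Expression {
    Identifier(String),
    Constant(i64),
    Binary {
        op: BinaryOperator,
        lhs: Box<Expression>,
        rhs: Box<Expression>,
    },
}

/// Renders the tree with explicit parentheses around every binary node,
/// which exposes how the operands were grouped.
pub fn parenthesize(expr: &Expression) -> String {
    match expr {
        Expression::Identifier(name) => name.clone(),
        Expression::Constant(value) => value.to_string(),
        Expression::Binary { op, lhs, rhs } => {
            let symbol = match op {
                BinaryOperator::Plus => "+",
                BinaryOperator::Multiply => "*",
            };
            format!("({} {} {})", parenthesize(lhs), symbol, parenthesize(rhs))
        }
    }
}

/// Hypothetical shorthand for building identifier leaves.
pub fn ident(name: &str) -> Box<Expression> {
    Box::new(Expression::Identifier(name.to_string()))
}

fn main() {
    // Hand-built tree for `a + b * c`.
    let tree = Expression::Binary {
        op: BinaryOperator::Plus,
        lhs: ident("a"),
        rhs: Box::new(Expression::Binary {
            op: BinaryOperator::Multiply,
            lhs: ident("b"),
            rhs: ident("c"),
        }),
    };
    println!("{}", parenthesize(&tree));
}
```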

Statements

Statement covers:

  • labeled statements
  • compound blocks
  • expression statements
  • if
  • switch
  • while
  • do while
  • for
  • goto
  • continue
  • break
  • return
  • GNU asm statements

Blocks contain BlockItem, which can be:

  • a declaration
  • a static assertion
  • another statement

That means a compound statement preserves the declaration/statement distinction instead of erasing everything into one generic node list.
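One thing the preserved distinction makes easy is a style check for declarations that appear after the first statement in a block. The sketch below assumes a simplified BlockItem whose payloads are plain strings; the variant names follow the list above, but the payload types and the first_late_declaration helper are hypothetical.

```rust
// Simplified stand-in; real payloads would be AST nodes, not strings.
pub enum BlockItem {
    Declaration(String),
    StaticAssert(String),
    Statement(String),
}

/// Returns the index of the first declaration that follows a statement,
/// or None when all declarations come before any statement.
pub fn first_late_declaration(items: &[BlockItem]) -> Option<usize> {
    let mut seen_statement = false;
    for (index, item) in items.iter().enumerate() {
        match item {
            BlockItem::Statement(_) => seen_statement = true,
            BlockItem::Declaration(_) if seen_statement => return Some(index),
            _ => {}
        }
    }
    None
}

fn main() {
    // Models the block `{ int a = 1; a++; int b = 2; }`.
    let block = vec![
        BlockItem::Declaration("int a = 1;".to_string()),
        BlockItem::Statement("a++;".to_string()),
        BlockItem::Declaration("int b = 2;".to_string()),
    ];
    println!("{:?}", first_late_declaration(&block));
}
```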

Types and declarator support

Important declaration-side types include:

  • TypeSpecifier
  • TypeQualifier
  • StorageClassSpecifier
  • FunctionSpecifier
  • AlignmentSpecifier
  • TypeName
  • DerivedDeclarator
  • ParameterDeclaration
  • Initializer
  • Designator

This is enough to model:

  • pointer chains
  • arrays and VLA-like forms
  • function parameter lists
  • designated initializers
  • anonymous and named structs/unions/enums
  • typedef names
  • typeof

Extension nodes

PARC includes explicit AST nodes for extensions instead of hiding them:

  • Extension::Attribute
  • Extension::AsmLabel
  • Extension::AvailabilityAttribute
  • TypeSpecifier::TypeOf
  • Statement::Asm
  • Expression::Statement

That makes it practical to write tools that either support or reject extension syntax intentionally.
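A tool that wants to reject extension syntax can look for those variants directly. The following sketch models only a handful of Statement and Expression variants; the payload shapes and the two checker functions are assumptions for illustration, and a real checker would cover the full enums (or use the visitor API).

```rust
// Reduced stand-ins for the real enums; payload shapes are assumptions.
pub enum Expression {
    Identifier(String),
    Call(Box<Expression>, Vec<Expression>),
    Statement(Vec<Statement>), // GNU statement expression: ({ ... })
}

pub enum Statement {
    Expression(Expression),
    Compound(Vec<Statement>),
    Asm(String), // GNU asm statement
}

/// True when the expression contains a GNU statement expression anywhere.
pub fn expr_uses_extension(expr: &Expression) -> bool {
    match expr {
        Expression::Identifier(_) => false,
        Expression::Call(callee, args) => {
            expr_uses_extension(callee) || args.iter().any(expr_uses_extension)
        }
        Expression::Statement(_) => true,
    }
}

/// True when the statement, or anything nested inside it, uses an extension.
pub fn stmt_uses_extension(stmt: &Statement) -> bool {
    match stmt {
        Statement::Expression(expr) => expr_uses_extension(expr),
        Statement::Compound(items) => items.iter().any(stmt_uses_extension),
        Statement::Asm(_) => true,
    }
}

fn main() {
    // Models `{ f(({ x; })); }`: a call whose argument is a statement expression.
    let body = Statement::Compound(vec![Statement::Expression(Expression::Call(
        Box::new(Expression::Identifier("f".to_string())),
        vec![Expression::Statement(vec![Statement::Expression(
            Expression::Identifier("x".to_string()),
        )])],
    ))]);
    println!("uses extensions: {}", stmt_uses_extension(&body));
}
```

Because the match arms are exhaustive, adding a new extension variant to the enums forces the checker to make an explicit decision about it.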

Reading the AST effectively

When working with PARC, a useful order is:

  1. Start at TranslationUnit
  2. Split declarations from function definitions
  3. Inspect declarators carefully for type shape
  4. Use the visitor API instead of hand-recursing everywhere
  5. Use Printer to learn unfamiliar subtrees