Compiler Diagnostics
This chapter is about compiler reporting, not about language-level panic or
report semantics.
In other words:
panicandreportdescribe what your program does- diagnostics describe what the compiler tells you when it cannot continue or when it wants to surface important information
Why this chapter exists
FOL now has a real compiler pipeline:
fol-streamfol-lexerfol-parserfol-packagefol-resolverfol-typecheckfol-lowerfol-runtimefol-backendfol-diagnostics
That means errors are no longer just loose strings printed from one place. Compiler failures now move through a shared diagnostics layer with stable structure.
This matters for three reasons:
- humans need readable compiler output
- tests need stable enough structure to assert against
- future tools need a machine-readable format that is not just a copy of the human renderer
What a diagnostic contains
At the current compiler stage, a diagnostic can carry:
- severity
- main message
- a stable diagnostic code (e.g.
P1001,R1003,T1003) - one primary location
- zero or more related locations
- notes
- helps
- suggestions
The current compiler mostly emits Error, but the reporting layer also supports
Warning and Info.
Diagnostic codes
Every diagnostic carries a stable producer-owned code. The code identifies the error family and specific failure without relying on message text.
Current code families:
| Prefix | Producer | Examples |
|---|---|---|
P1xxx | parser | P1001 syntax, P1002 file root |
K1xxx | package loading | K1001 metadata, K1002 layout |
R1xxx | resolver | R1003 unresolved, R1005 ambiguous |
T1xxx | type checker | T1003 type mismatch |
L1xxx | lowering | L1001 unsupported surface |
F1xxx | frontend | F1001 invalid input, F1002 workspace not found |
K11xx | build evaluator | K1101 build failure |
Codes are structurally assigned. The parser carries an explicit ParseErrorKind
field on each error rather than deriving the code from message text. This means
message wording can change without breaking code identity.
Human output shows codes in brackets:
error[R1003]: could not resolve name 'answer'
JSON output includes the code as a top-level field:
{ "code": "R1003", "message": "could not resolve name 'answer'" }
Primary location
The most important part of a diagnostic is its primary location.
That location is currently expressed as:
- file
- line
- column
- optional span length
This is what allows the compiler to point at the exact token or source span that caused the failure.
Every diagnostic now carries a real location. Parser errors that previously lacked locations (safety-bound overflows, constraint violations like duplicate parameter names) now extract file/line/column from the current token position.
Typical examples:
- a parser error at the token that made a declaration invalid
- a package-loading error at the control file or package root that failed
- a resolver error at the unresolved identifier or ambiguous reference
- a typecheck error at the expression or declaration whose types do not match
- a lowering error at the typed surface that has no current
V1lowering rule
Related locations
Some compiler failures are not well described by one location alone.
For example:
- duplicate declarations
- ambiguous references
- duplicate package metadata fields
In those cases the compiler keeps one primary site and can also attach related sites as secondary labels.
That allows the compiler to say things like:
- this declaration conflicts with an earlier declaration
- this name could refer to either of these two candidates
- this metadata field was already defined elsewhere
Notes, helps, and suggestions
FOL diagnostics separate extra guidance into different buckets instead of forcing everything into one long message.
The current contract is:
- the main message says what went wrong
- notes add technical context
- helps add actionable guidance
- suggestions describe a possible replacement or next step when the producer can express one
This split matters because tooling and tests can preserve structure instead of trying to parse intent back out of prose.
Error recovery
The parser implements error recovery so that a single syntax mistake does not cascade into dozens of unrelated errors.
When a declaration parse fails, the parser calls sync_to_next_declaration to
skip forward to the next declaration-start keyword (fun, var, def, typ,
pro, log, seg, ali, imp, lab, con, use) or EOF. This means:
fun[exp] emit(...) = { ... }produces exactly 1 error, not 20+- two broken declarations separated by a good one produce 2 errors, and the good declaration still parses correctly
Cascade suppression
Even with parser recovery, edge cases in any pipeline stage can cascade.
The diagnostic report layer applies two safety nets:
- same-code, same-line dedup: if the most recently added diagnostic has the same code and same line as a new one, the new one is suppressed
- hard cap: the report accepts at most 50 diagnostics total and shows “(output truncated)” when the limit is reached
These limits prevent walls of identical errors without hiding genuinely distinct failures.
Human-readable diagnostics
By default the CLI prints human-readable diagnostics.
The current renderer is designed around:
- a severity prefix with a diagnostic code bracket (e.g.
error[R1003]:) - an arrow line with
file:line:column - a source snippet when the file and line can be loaded
- an underline for the primary span
- note-style summaries for related labels
- note/help lines after the main snippet
Illustrative shape:
error[R1003]: could not resolve name 'answer'
--> app/main.fol:3:12
|
3 | return answer
| ^^^^^^ unresolved name
note: no visible declaration with that name was found in the current scope chain
help: check imports or declare the name before use
Messages are clean human-readable text. The compiler does not prepend internal
kind labels like ResolverUnresolvedName: to messages. The diagnostic code in
brackets is the stable identifier.
Source fallbacks
Sometimes the compiler knows the location but cannot render the source line itself.
Examples:
- the file is no longer readable
- the file path is missing
- the requested line is outside the current file contents
In those cases the compiler still keeps the location and falls back cleanly instead of crashing the renderer.
So the priority order is:
- exact location
- source snippet when available
- explicit fallback note when the snippet cannot be shown
JSON diagnostics
When the CLI is invoked with --json, diagnostics are emitted as structured
JSON instead of human-readable text.
This output is meant for scripts, tests, editor tooling, and future integration layers.
Important rule:
- JSON is not a lossy summary of human output
Instead, both human and JSON outputs are generated from the same structured diagnostic model.
The editor/LSP layer should follow that same rule too: editor diagnostics should be adapted from the shared structured diagnostic model rather than rebuilt from free-form strings.
That means JSON can preserve:
- severity
- code
- message
- primary location
- related labels
- notes
- helps
- suggestions
Illustrative shape:
{
"severity": "Error",
"code": "R1003",
"message": "could not resolve name 'answer'",
"location": {
"file": "app/main.fol",
"line": 3,
"column": 12,
"length": 6
},
"labels": [
{
"kind": "Primary",
"message": "unresolved name",
"location": {
"file": "app/main.fol",
"line": 3,
"column": 12,
"length": 6
}
}
],
"notes": [
"no visible declaration with that name was found in the current scope chain"
],
"helps": [
"check imports or declare the name before use"
],
"suggestions": []
}
Again, the exact payload can evolve, but the important guarantee is that the structured fields are first-class rather than reverse-engineered from text.
Which compiler phases currently participate
At head, the main producers that lower into the shared diagnostics layer are:
- parser
- package loading
- resolver
- type checking
- lowering
- build evaluator
- backend
- frontend (workspace discovery and input validation)
That means diagnostics are already strong across:
- syntax errors (with error recovery so cascades are contained)
- package metadata and package-root errors
- import-loading failures
- unresolved names
- duplicate names
- ambiguous references
- type mismatches and unsupported semantic surfaces inside
V1 - unsupported lowered
V1surfaces before target emission - backend emission and build failures when lowered
V1workspaces cannot become runnable artifacts - build graph evaluation failures
This is the important boundary for the current compiler stage:
- the compiler can now parse, resolve, type-check, and lower the supported
V1subset - diagnostics already cover failures from each of those stages plus backend emission/build failures
- the project now does promise a finished first backend for the current
V1subset, while later targets, optimizations, C ABI work, and Rust interop work remain outside this chapter
What diagnostics do not guarantee
Diagnostics are strong, but they are not a substitute for the later semantic phases.
Current limits still matter:
- parser diagnostics do not imply type checking has happened
reportcompatibility still belongs to later semantic work- type mismatch, coercion, and conversion diagnostics are future type-checker work
- ownership and borrowing diagnostics are future semantic work
- C ABI diagnostics are future
V4package/type/backend work - Rust interop diagnostics are future
V4package/type/backend work
So the current guarantee is:
- if stream, lexer, parser, package loading, resolver, typechecker, or lowering can identify the problem now, diagnostics should be structured and exact
But not:
- all language-semantic errors already exist today
Practical rule of thumb
When reading compiler output, think in this order:
- look at the diagnostic code in brackets to identify the error family
- trust the primary location
- use related labels to understand competing or earlier sites
- read notes for technical context
- read helps for the most actionable next step
That mental model matches how the compiler currently structures reporting.