TBCX

Tcl Bytecode eXchange tbcx 1.1 Apr 23, 2026

NAME

tbcx — serialize, load, and inspect precompiled Tcl 9.1 bytecode (procs, OO methods, and lambdas). Artifacts require an exact Tcl major/minor match at load time.

SYNOPSIS

tbcx::save in out ?-include-source?
tbcx::load in
tbcx::dump filename
tbcx::gc

DESCRIPTION

The tbcx extension provides four commands that enable an efficient save → load → eval pipeline for Tcl 9.1 scripts.

The goal is to pay the cost of parsing and compiling at save time, so that loading is as fast as reading a compact binary while remaining functionally equivalent to running source on the original script.

Artifacts store the compiled top-level, precompiled proc bodies, TclOO method/ctor/dtor bodies, and lambda literals for use with apply. Loading installs these into the current interpreter and executes the top-level block in the caller’s current namespace, with source-equivalent semantics for variable scope, frame identity, and info script.

In safe interpreters, tbcx_SafeInit provides the package and type infrastructure but does not register any tbcx::* commands. A parent interpreter may selectively grant access with interp alias or interp expose.
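As a minimal sketch of selective access, a parent can hand the child one restricted wrapper instead of a raw tbcx::* command. The names restrictedDump and dumpArtifact and the directory /opt/app/artifacts are hypothetical. The wrapper runs tbcx::dump in the parent, so the child only receives the resulting text.

```tcl
# Parent-side wrapper (hypothetical name): dumps only artifacts that live
# directly inside allowedDir.
proc restrictedDump {allowedDir name} {
    # Accept a bare file name only, so the child cannot escape allowedDir.
    if {$name eq "" || $name in {. ..}
            || [string index $name 0] eq "~"
            || [file tail $name] ne $name} {
        error "artifact name must be a plain file name"
    }
    # Required here so the wrapper can be defined before tbcx is loaded.
    package require tbcx
    return [tbcx::dump [file join $allowedDir $name]]
}

# Per the text above, a safe child has no tbcx::* commands of its own.
set child [interp create -safe]

# Grant exactly one capability: dumpArtifact in the child calls
# restrictedDump in the parent, with the directory fixed by the parent.
interp alias $child dumpArtifact {} restrictedDump /opt/app/artifacts
```

The child would then call dumpArtifact app.tbcx. Any name containing a directory component is rejected before tbcx::dump is reached.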

COMMANDS

tbcx::save in out ?-include-source?

Synopsis — Compile a script and write a .tbcx artifact.

Parameters

in — Resolved in this order:

  1. Open channel name — if the value names an existing open channel, it is read as text (the caller controls encoding; TBCX does not alter it).
  2. Readable file path — if the value is a path to a readable file, it is opened in text mode (default UTF-8), read, and closed by tbcx::save. The normalized path is recorded in the artifact header for info script restoration at load time.
  3. Literal script text — otherwise the value is treated as inline Tcl script text. Consequently, a value that looks like a path but is not currently readable is compiled as script text, not reported as a file-open error. No source path is recorded.

out — One of:

  • Writable channel — an open channel; binary mode (-translation binary -eofchar {}) is enforced. The caller’s channel settings are mutated and not restored. The channel is not closed.
  • Writable path — TBCX writes a temporary file in the target directory and renames it into place only after serialization succeeds, so a failed save never leaves a truncated artifact at the final path.

-include-source — Optional flag. Embeds the authored source text of every proc and TclOO method body in the artifact as an LPString. Required when consumers depend on info body, info class definition, TIP #280 line attribution in stack traces, disassembly annotations, or introspection-based clone idioms (e.g. cloneRule, installTocRule) to return the original authored text. Artifact size grows proportional to the aggregate source text of all procs and methods.

Default behavior (no -include-source): Every proc/method body source field is emitted as an empty LPString. At load time the loader substitutes the diagnostic sentinel described in SOURCE PRESERVATION below, so info body returns a loud string instead of silently returning empty. This matches the 25-year tclcompiler/tbcload precedent for Tcl AOT compile output: deployment artifacts are typically distributed instead of source, and stripping keeps the artifact smaller and preserves the IP-protection angle.

Behavior
  • Performs a single-pass capture and rewrite of the script, extracting proc, namespace eval, oo::class create, oo::define/oo::objdefine, and self method definitions while producing a rewritten script with method bodies replaced by stubs.
  • Pre-compiles namespace eval body literals and other script-body patterns (try/on/trap/finally, foreach, lmap, while, for, catch, if/elseif/else, eval, uplevel, time, timerate, dict for/map/update/with, lsort -command, and self method bodies inside oo::define).
  • Performs two-phase instruction-level body detection: Phase 1 analyzes invokeStk operand stack patterns; Phase 2 identifies unpushed literals from inline-compiled foreach/lmap loops. Uses an O(1) opcode dispatch table covering all Tcl 9.1 instructions.
  • Strips startCommand debugging instructions from all compiled blocks, replacing them with nop bytes for ~15% leaner bytecode execution.
  • Detects bytearray literals (bytes ≥ 0x80) and emits them as TBCX_LIT_BYTEARR to prevent UTF-8 encoding corruption.
  • Body literals are emitted as TBCX_LIT_BYTESRC (source text + compiled bytecode) enabling cross-interpreter recompilation.
  • When the input is a readable file path, the normalized path is recorded in the header via Tcl_FSGetNormalizedPath so the loader can restore info script to the authored value.
  • Serializes the header, top-level block, Procs table (with per-proc source-text LPString), Classes catalog (advisory), and Methods table (with per-method source-text LPString and kind 4 for self methods).
  • Lambda literals (apply forms) are compiled and serialized as lambda-bytecode literals.
  • Conflicting proc definitions across if/else branches are handled via indexed markers, enabling position-based matching at load time.
Returns
The output object: either the normalized path written, or the writable channel handle.
Errors
Typical errors include: unreadable input; unwritable output; unknown/unsupported AuxData; size limit exceeded; runaway serialization (too many literals, blocks, excessive recursion depth, or output too large); short read/write.
Notes
  • Input channels retain their encoding settings; output channels are set to binary mode (and not restored).
  • The produced artifact targets Tcl 9.1 (format version 92); other versions are rejected at load time.

Examples

# Save from path → path (default: source stripped)
set path [tbcx::save ./app.tcl ./app.tbcx]

# Save from path → path with body source preserved for
# info body / info class definition / TIP #280 line attribution.
tbcx::save ./app.tcl ./app.tbcx -include-source

# Save from string value → path
set script {proc hi {} {puts Hello}; hi}
tbcx::save $script ./hello.tbcx

# Save from channel → channel
set in  [open ./lib/foo.tcl r]
set out [open ./foo.tbcx w]
fconfigure $out -translation binary -eofchar {}
try {
    tbcx::save $in $out
} finally {
    close $in
    close $out
}
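
Because a value that is neither an open channel nor a readable file falls through to the literal-script case, a mistyped path is compiled as script text. A caller that always means "file" can check first. The following is a minimal sketch; saveFileStrict is a hypothetical helper name, not part of tbcx.

```tcl
# Hypothetical helper: refuse to call tbcx::save unless srcPath is a
# readable regular file, so a typo never compiles as inline script text.
proc saveFileStrict {srcPath outPath args} {
    if {![file isfile $srcPath] || ![file readable $srcPath]} {
        error "tbcx source file is not readable: $srcPath"
    }
    # Required here so the helper can be defined before tbcx is loaded.
    package require tbcx
    # args forwards the optional -include-source flag unchanged.
    return [tbcx::save $srcPath $outPath {*}$args]
}
```

Usage would look like saveFileStrict ./app.tcl ./app.tbcx -include-source. The check does not cover the case where the value also happens to name an open channel, which tbcx::save resolves first.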

tbcx::load in

Synopsis — Load a .tbcx artifact, install precompiled entities, and execute the top-level block in the caller’s current namespace.

Parameters

in — One of:

  • Readable channel — an open channel positioned at the beginning of a .tbcx stream (binary).
  • Readable path — a filesystem path to a .tbcx file.
Behavior
  • Validates header and producer version (exact major.minor match required); deserializes sections.
  • If the header carries a recorded authored source path, reads it into a Tcl_Obj for info script restoration.
  • Installs a temporary ProcShim that intercepts the proc command to substitute precompiled bodies when the name and argument signature match. The body’s string representation is set to either the preserved source text (artifacts built with -include-source) or the diagnostic sentinel (default stripped mode).
  • Installs a temporary OOShim that intercepts oo::define and oo::objdefine to substitute precompiled method/constructor/destructor bodies. Self methods (kind 4, TBCX_METH_SELF) are installed via oo::define { self method } to preserve metaclass inheritance for subclasses. Method body string representations follow the same preserved-or-sentinel rule as proc bodies.
  • Registers precompiled lambdas in a persistent per-interpreter ApplyShim that recovers the compiled representation if type shimmer evicts it.
  • Executes the precompiled top-level block via Tcl_EvalObjEx with flags 0 (no TCL_EVAL_GLOBAL), evaluating in the caller’s current namespace. iPtr->scriptFile is saved, set to the header’s source path (or the tbcx artifact path as a fallback), and restored after evaluation — matching Tcl_FSEvalFileEx’s handling in tclIOUtil.c. Compiled locals for the top-level frame are installed on the caller’s active variable frame by linking named variables to existing same-name variables in the caller’s scope. A top-level return is handled identically to source.
  • Removes the ProcShim and OOShim after execution; the ApplyShim persists for the interpreter’s lifetime.
Returns
The result of the top-level block evaluation.
Errors
Typical errors include: unreadable input; bad header; incompatible Tcl version; malformed/unknown section; size limit exceeded; short read; class or namespace creation errors during top-level evaluation.
Notes
  • The top-level block executes in the caller’s current namespace, with compiled locals linked to the caller’s variable frame. When tbcx::load is called from inside a wrapper proc, the wrapper should use uplevel 1 [list tbcx::load $path] so that the top-level block runs in the frame of the wrapper’s caller, exactly as a wrapper around source must.
  • info script inside the loaded top-level block returns the authored .tcl source path when the artifact was built from a file, matching what source would have set. This supports both [file dirname [info script]] (for sibling asset lookup) and [info script] eq $::argv0 (self-invocation guards).
  • Loading executes code; only load artifacts from trusted sources.

Examples

# Load into current interp
tbcx::load ./app.tbcx

# Load from an open channel
set ch [open ./app.tbcx r]
fconfigure $ch -translation binary -eofchar {}
try {
    tbcx::load $ch
} finally {
    close $ch
}

# Drop-in replacement for source, inside a module loader proc.
# Both branches use uplevel 1 so they reach the caller’s frame.
proc moduleLoad {path} {
    set tbcxPath [file join [file dirname $path] ../tbcx [file tail $path].tbcx]
    if {[file exists $tbcxPath]} {
        uplevel 1 [list tbcx::load $tbcxPath]
    } else {
        uplevel 1 [list source $path]
    }
}
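
A build-on-demand variant of the loader above can be sketched as follows. The helper names artifactIsStale and sourceCached are hypothetical, as is the policy of placing the artifact next to the source and comparing modification times.

```tcl
# Hypothetical helper: true when the artifact is missing or older than
# its source file.
proc artifactIsStale {src artifact} {
    expr {![file exists $artifact]
          || [file mtime $artifact] < [file mtime $src]}
}

# Hypothetical loader: rebuild the artifact when stale, then load it in
# the caller's frame, as the uplevel note above recommends.
proc sourceCached {src} {
    set artifact [file rootname $src].tbcx
    # Required here so both procs can be defined before tbcx is loaded.
    package require tbcx
    if {[artifactIsStale $src $artifact]} {
        tbcx::save $src $artifact
    }
    uplevel 1 [list tbcx::load $artifact]
}
```

A modification-time check does not detect an artifact produced by a different Tcl version. In that case tbcx::load reports an incompatible version, and the artifact has to be regenerated.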

tbcx::dump filename

Synopsis — Disassemble and describe a .tbcx artifact in human-readable form.

Parameters
filename — A readable path to a .tbcx file.
Behavior
Reads the artifact and prints: header fields (including the authored source path when recorded); section summaries; literals and AuxData; exception ranges; disassembly of the top-level, proc, method, and lambda bodies; and the preserved body source text (indented inline, no truncation) for each proc and method when the artifact was built with -include-source. For stripped artifacts (the default), each body record shows source: <stripped at save time>.
Returns
A string containing the formatted dump.
Errors
Unreadable file; bad header; malformed section; short read.

Examples

% package require tbcx
% tbcx::save hello.tcl hello.tbcx -include-source
% puts [tbcx::dump hello.tbcx]
TBCX Header:
  magic = 0x58434254 ('T''B''C''X')
  format = 92
  tcl_version = 9.1.0 (type 0)
  top: code=12, except=0, lits=2, aux=0, locals=1, stack=2
  source = /path/to/hello.tcl

Top-level block:
  Disassembly (top-level):
    ...

Procs: 1
  - proc hi  (ns=::)
    args:
    source: (14 bytes)
      puts Hello
    Disassembly (hi):
    ...

tbcx::gc

Synopsis — Purge stale entries from the per-interpreter lambda shimmer-recovery registry.

Behavior
The ApplyShim maintains a registry of precompiled lambda objects. Over time, entries for lambdas that are no longer referenced (except by the registry itself) accumulate. This command purges those stale entries immediately. Stale entries are also purged lazily on each tbcx::load call, so explicit use of tbcx::gc is typically unnecessary except in long-running interpreters.

tbcx::gc is a no-op if no ApplyShim has been installed yet (i.e. before any tbcx::load call), and it is safe to call multiple times.

Parameters
None.
Returns
Empty string.

SOURCE PRESERVATION

Without -include-source, every proc and method body is emitted with an empty source-text field on the wire, and the loader installs a two-line diagnostic sentinel as the body’s string representation:

# tbcx: body source stripped at save time; info body unavailable
error "tbcx: introspection-based cloning is not supported for this artifact"

The shape matches the canonical tclcompiler pattern (Compiler8.html, “Example 1: Cloning Procedures”).

With -include-source, the authored source text is preserved in the artifact and attached as the body Tcl_Obj’s string representation via Tcl_InvalidateStringRep + Tcl_InitStringRep. The ByteCode internal representation is untouched; execution still uses the precompiled bytecode. info body, info class definition, info class constructor, and TIP #280 source attribution round-trip byte-for-byte.
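
Code that clones procs through info body can test for the sentinel before relying on the text. This is a minimal sketch: bodyIsStripped is a hypothetical helper, and it assumes the first sentinel line appears in the body exactly as quoted above.

```tcl
# Hypothetical helper: true when the proc's body text contains the
# first line of the stripped-source sentinel.
proc bodyIsStripped {procName} {
    # Resolve the proc name in the caller's context.
    set body [uplevel 1 [list info body $procName]]
    set marker "# tbcx: body source stripped at save time"
    expr {[string first $marker $body] >= 0}
}
```

A cloning routine could call bodyIsStripped first and report that the artifact needs to be rebuilt with -include-source, instead of copying the sentinel into a new proc.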

FILE FORMAT (OVERVIEW)

This section summarizes the on-disk structure. Format version is 92 (Tcl 9.1). All integers are little-endian.

Header
Magic (0x58434254) + format version (92) + producing Tcl version; size/count metadata for the top-level block (code length, exception ranges, literal count, AuxData count, locals, max stack); authored source path LPString (empty for inline/channel inputs).
Sections (order)
(1) Top-level block (code, literals, AuxData, exceptions, locals epilogue); (2) Procs (FQN, ns, arg spec, body-source LPString, compiled block); (3) Classes (advisory catalog of discovered class names; actual creation occurs at load time via the top-level script); (4) Methods (class, kind 0–4, name, args, body-source LPString, compiled block).
Literal kinds
boolean, (wide)int/uint, double, bignum, string, bytearray, list, dict (insertion order preserved), bytesrc (bytecode + source text for cross-interp recompilation), and lambda-bytecode literals for use with apply.
AuxData families
jump tables (string or numeric), dictupdate, and NewForeachInfo.
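
A quick pre-check for the magic value can be sketched in plain Tcl. The sketch assumes the magic is stored as an unsigned 32-bit little-endian integer in the first four bytes of the file, which this overview suggests but does not state explicitly. The proc name looksLikeTbcx is hypothetical, and the check is no substitute for the loader's own header validation.

```tcl
# Hypothetical helper: true when the file starts with the TBCX magic.
# Assumes a 4-byte little-endian magic at offset 0.
proc looksLikeTbcx {path} {
    set ch [open $path rb]
    try {
        set head [read $ch 4]
    } finally {
        close $ch
    }
    if {[string length $head] != 4} {
        return 0
    }
    # i = 32-bit little-endian integer, u = treat it as unsigned.
    binary scan $head iu magic
    expr {$magic == 0x58434254}
}
```

On disk the value 0x58434254 in little-endian order is the byte sequence T, B, C, X, which matches the magic line printed by tbcx::dump.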

LAMBDA SUPPORT

Literals in the script that represent lambdas for apply (lists of the form {args body ?ns?}) are compiled and serialized as lambda-bytecode literals at save time. On load, the compiled body, argument list, and optional namespace element are rehydrated into a Proc and registered in the ApplyShim so that the first call to apply does not trigger compilation. If type shimmer later evicts the lambdaExpr internal representation, the ApplyShim transparently re-installs it on the next apply call.
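
The two lambda shapes described above look like this in a script. The example is ordinary Tcl with made-up names (::geom, area); whether a given literal is picked up at save time depends on the detection rules described in this manual.

```tcl
# Two-element lambda literal {args body} passed directly to apply.
set doubled [lmap n {1 2 3} {
    apply {{x} {expr {$x * 2}}} $n
}]

# Three-element lambda literal {args body ns}: the body resolves
# commands relative to the ::geom namespace.
namespace eval ::geom {
    proc area {w h} {
        expr {$w * $h}
    }
}
set a [apply {{w h} {area $w $h} ::geom} 3 4]
```

Here doubled is the list 2 4 6 and a is 12, with or without precompilation.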

SEMANTICS AND PERFORMANCE

Artifacts are portable across little- and big-endian hosts. The loader detects host byte order once and applies a tight in-place swap over bulk sections that require conversion, avoiding per-field branching so cross-endian loads remain close to native performance.

Loading evaluates the precompiled top-level in the caller’s current namespace with iPtr->scriptFile set to the authored source path (if recorded). As the top-level runs, the ProcShim and OOShim substitute precompiled proc/method bodies for the definitions it executes, and lambda literals are rehydrated from the artifact. The intent is to be functionally indistinguishable from source of the original script, with the benefit of faster startup due to avoided parsing/compilation.

PRECOMPILATION BOUNDARY

TBCX precompiles bodies and lambdas only when they are present in statically identifiable literal positions (script-body arguments to commands like foreach, while, try, eval, etc., or lambda literals for apply). Strings assembled at runtime — for example with format, string interpolation, or list construction — still round-trip correctly, but they remain ordinary data and compile at execution time when Tcl evaluates them.
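
The boundary can be illustrated with two loops that behave the same at run time. This is plain Tcl with illustrative variable names. The first body is a literal in a script-body position, the second is a string built at run time.

```tcl
# Literal body in a script-body position: statically identifiable,
# so it is a candidate for precompilation at save time.
set squares {}
foreach x {1 2 3} {
    lappend squares [expr {$x * $x}]
}

# Body assembled at run time with format: it round-trips as ordinary
# string data and is compiled only when foreach evaluates it.
set op *
set body [format {lappend cubes [expr {$x %s $x %s $x}]} $op $op]
set cubes {}
foreach x {1 2 3} $body
```

Both loops produce the same kind of result (squares is 1 4 9, cubes is 1 8 27). Only the first avoids compilation at execution time.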

OO SUPPORT

TBCX preserves normal TclOO class/object construction semantics by executing the rewritten top-level script, while substituting precompiled bodies for recognized oo::define / oo::objdefine method forms. Tested scenarios include class methods, self methods, per-object methods, private methods, inheritance (including diamond), mixins, filters, forwards, abstract/singleton metaclasses, method rename/delete/export changes, metaclasses with self method, and next-based constructor chaining.

With -include-source, info class definition, info class constructor, info class destructor, and info object definition all return the authored body text byte-for-byte, enabling introspection-based idioms such as cloneMethod src dst patterns to work identically to the source-based baseline.

MULTI-INTERPRETER AND THREAD SUPPORT

TBCX follows Tcl’s standard threading model: only the thread that created an interpreter may call tbcx::save, tbcx::load, tbcx::dump, or tbcx::gc on that interpreter. Multi-thread support means multiple independent interpreters, each used by its owning thread — not sharing one interpreter across threads. Calling a TBCX command from a non-owning thread returns TCL_ERROR with a diagnostic message.

Artifacts are designed to load into interpreters other than the originating one. Interpreter-specific state (ApplyShim lambda registry, load depth, OO shim hidden-ID counter) remains strictly per-interpreter and is cleaned up automatically when the interpreter is deleted.

LIMITS

Sanity caps exist for code size (64 MiB), literal/AuxData/exception counts (1M each), string lengths (4 MiB), total serialized output (256 MiB), serialization recursion depth (64), total WriteLiteral calls (2M), and total WriteCompiledBlock calls (256K). Exceeding any limit produces an error.

Nested or reentrant tbcx::load calls are capped at depth 8 per interpreter to prevent runaway recursive loading.

DIAGNOSTICS

Representative messages include: “bad header”, “incompatible Tcl version”, “short read/write”, “unsupported AuxData kind”, “input is neither an open channel nor a readable file”, “runaway serialization detected”, “tbcx::save: unknown option …; expected -include-source”, and Tcl errors from top-level evaluation.

SECURITY

Loading executes code. Only load artifacts you trust. Safe interpreters receive no tbcx::* commands by default; use interp alias or interp expose from a parent interpreter to grant selective access.

SEE ALSO

source(n), TclOO(n), info(n), tclcompiler (historical Tcl 8.x AOT tool)

© 2025–2026 Miguel Banon

MIT License.
