Emergent Hardware Verification

Appendix H AI-Driven RTL Generation from Actor Specifications

AI tools generating RTL today work from English specifications to Verilog. The translation crosses a large specification gap: English is unstructured, ambiguous, and frequently incomplete; Verilog is structurally concurrent, precisely typed, and demands every signal be accounted for. The AI must reconstruct the concurrent structure from prose, infer the timing model, decide on handshake protocols, and produce something that synthesizes. Output is often broken because reconstruction is error-prone, and the prose itself is ambiguous. State-of-the-art results are roughly: simple combinational designs work, simple FSMs sometimes work, anything involving multiple interacting modules with timing and backpressure is unreliable.

The actor methodology changes the picture in two ways, and the second is the one this appendix builds toward. First, it closes the specification gap on the input side: when the input is actor code rather than English, the AI is no longer reconstructing structure but performing a shape-preserving translation between two forms of the same thing. Second, and more consequentially, it splits the problem so the AI does only the part it is good at. The deterministic actor-to-RTL lowering is a proven compiler’s job, not a guess — an actor is a latency-insensitive bounded dataflow network node, and the open Golden Gate compiler lowers such a node to cycle-exact RTL automatically with a formal guarantee (Appendix E §E.5). What is left for the AI is what AI is actually good at: helping author and refine the actor specification, exploring topology variants, suggesting coverage. The error-prone reconstruction step — where today’s AI-RTL fails — does not exist in this flow, and what remains is guarded by an oracle (§H.7).

H.1 The Specification Gap, Concretely

The current AI-Verilog flow:

[1] Engineer writes English spec:
    "An 8-bit UART transmitter that takes parallel input data when valid_i
     is asserted, drops a start bit, shifts out 8 data bits LSB-first at
     the configured baud rate, and emits a stop bit. Once started a
     transmission cannot be interrupted."


[2] AI tool reads the spec, attempts Verilog generation.


[3] AI must decide:
    - clock domain (1 clk_i? phase-locked baud clk?)
    - reset polarity (sync? async? active-high? active-low?)
    - parameter names (BAUD, BAUD_DIV, BAUD_RATE, NCO?)
    - handshake on input (ready/valid? FIFO? pulse?)
    - output port shape (serial bit + valid? open-drain?)
    - what "configured baud rate" means (parameter? input port? CSR?)
    - state machine encoding
    - whether to support back-to-back transmission


[4] Generated Verilog often:
    - has off-by-one bit-counter errors
    - mixes signed/unsigned in widths
    - produces inferred latches in always_comb
    - assumes input is a pulse when caller wanted a level
    - drops the second word in back-to-back transmissions

Every decision in step [3] is a place where the AI can produce a result that compiles, lints clean, and is wrong. The fundamental issue is not the AI’s competence but the input format: English does not pin down the structural decisions Verilog requires.

H.2 The Shape-Preserving Translation with Actor Input

The actor flow:

[1] Engineer writes the actor (e.g. UartTxActor in actor_pkg style):


    class UartTxActor extends Actor;
      UartConfig_s cfg;

      // Handler: invoked once per message taken from the actor's mailbox.
      task act(MsgBase msg);
        if (msg.getTypeName() == $typename(UartItem_s)) begin
          UartItem_s item = Msg#(UartItem_s)::unwrap(msg);
          if (item.dir == UART_TX) begin
            drive_byte(item.data);
            // optionally publish a UartTxDone event
          end
        end
      endtask

      // Serializes one byte onto the line.
      task drive_byte(logic [7:0] data);
        // start, 8 data, optional parity, stop, all at cycles_per_bit
      endtask
    endclass


[2] AI tool is given:
    - the class source
    - the synthesizable-form rules
    - the typed message struct (UartItem_s) as input/output channel format


[3] AI generates the synthesizable form by mechanical mapping:
    - class -> module
    - mailbox.get -> in_valid_i / in_ready_o handshake on UartItem_s bundle
    - publish -> out_valid_o / out_ready_i handshake (if any)
    - drive_byte body -> registered FSM with bit counter
    - cfg.baud_rate -> parameter or CSR-driven divisor
    - lineage trace_id -> propagated through the message bundle


[4] Generated Verilog:
    - structurally identical to the class (one module per actor)
    - one input channel, optional one output channel
    - state machine is the synthesizable form of the handler
    - port naming follows the standard handshake pattern

The AI is no longer making architectural decisions. The architecture is in the input. The translation has one degree of freedom (specific FSM encoding), whereas the English-to-Verilog flow has dozens. Output is reliable because the structure is given; the AI fills in the cycle-level details — or, better, hands them to a compiler (§H.3).
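
As an illustration of the mapping in step [3], here is a minimal sketch of what the synthesizable form of UartTxActor could look like. It is not the book's reference implementation. Several names are assumptions made for the example: the package uart_sketch_pkg, the uart_dir_e encoding (including UART_RX), the CYCLES_PER_BIT parameter standing in for cfg, the rst_ni reset, and the tx_o pin. Parity, the optional UartTxDone publish, and the trace_id field are left out to keep the sketch short.

```systemverilog
// Hypothetical sketch only: names not found in the class above are assumptions.
package uart_sketch_pkg;
  typedef enum logic {UART_TX = 1'b0, UART_RX = 1'b1} uart_dir_e;  // encoding assumed
  typedef struct packed {
    uart_dir_e  dir;
    logic [7:0] data;
  } UartItem_s;
endpackage

module uart_tx_actor
  import uart_sketch_pkg::*;
#(
  parameter int unsigned CYCLES_PER_BIT = 16   // assumed: cfg lowered to a parameter
) (
  input  logic      clk_i,
  input  logic      rst_ni,      // assumed: active-low synchronous reset
  input  logic      in_valid_i,  // mailbox.get -> valid/ready on the UartItem_s bundle
  output logic      in_ready_o,
  input  UartItem_s in_data_i,
  output logic      tx_o         // serial line, idles high
);
  localparam int unsigned DivW = (CYCLES_PER_BIT > 1) ? $clog2(CYCLES_PER_BIT) : 1;

  // Frame is {stop, data[7:0], start}; shifting right sends start first, then data LSB-first.
  logic [9:0]      shift_q;
  logic [3:0]      bits_left_q;   // 0 means idle
  logic [DivW-1:0] div_q;         // counts clock cycles within one bit period

  logic busy, in_fire, bit_done;
  assign busy       = (bits_left_q != 4'd0);
  assign in_ready_o = !busy;                  // a started frame cannot be interrupted
  assign in_fire    = in_valid_i && in_ready_o;
  assign bit_done   = (div_q == DivW'(CYCLES_PER_BIT - 1));
  assign tx_o       = busy ? shift_q[0] : 1'b1;

  always_ff @(posedge clk_i) begin
    if (!rst_ni) begin
      shift_q     <= '1;
      bits_left_q <= '0;
      div_q       <= '0;
    end else if (in_fire) begin
      // Items for the other direction are consumed and ignored, as in act().
      if (in_data_i.dir == UART_TX) begin
        shift_q     <= {1'b1, in_data_i.data, 1'b0};
        bits_left_q <= 4'd10;
        div_q       <= '0;
      end
    end else if (busy) begin
      if (bit_done) begin
        div_q       <= '0;
        shift_q     <= {1'b1, shift_q[9:1]};
        bits_left_q <= bits_left_q - 4'd1;
      end else begin
        div_q <= div_q + 1'b1;
      end
    end
  end
endmodule
```

The only choice the sketch makes beyond the mapping is how to encode the handler's progress (a shift register plus a bit counter rather than an enumerated FSM), which is the single degree of freedom named above. Every open question from the English flow in §H.1 (handshake style, port shape, whether a frame can be interrupted) is answered by the class and the handshake convention rather than by the translator.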

H.3 The Lowering Is a Compiler’s Job, Not a Guess

The shape-preserving translation reduces the AI’s degrees of freedom from dozens to one. The strongest form of the methodology removes even that one, because the remaining lowering is mechanical and has an automatic implementation with a formal correctness guarantee.

An actor in the synthesizable form is a primitive latency-insensitive bounded dataflow network (LI-BDN) node (Appendix E §E.5). The open Golden Gate compiler (Magyar/Biancolin et al., ICCAD 2019) takes arbitrary FIRRTL and emits a cycle-exact hardware model of such a node automatically, with a machine-checked partial-implementation guarantee that the emitted model matches the source cycle for cycle. So the actor-to-RTL lowering does not need the AI at all: a proven compiler performs it, and the result is correct by a formal argument rather than by a test pass.

This reframes the strongest version of actor-AI-RTL. It is not “the AI generates the RTL.” It is “the AI helps author the actor; a proven compiler lowers it.” The two parts have very different trust profiles. Authoring the actor is creative, ambiguous, and exactly where a human-plus-AI collaboration adds value; it is verified at the specification level by the scoreboards and coverage that already exist (Appendix D §D.4). Lowering the actor to gates is deterministic, structure-preserving, and carries its own equivalence proof through Golden Gate. The 100-line counter RTL of Appendix E is what such a lowering emits, not what an AI guesses — and the guarantee that it matches the three-line class is mechanical.

Two existence proofs support this, one from research and one shipping. Golden Gate is the formally guaranteed instance of the lowering: open, cycle-exact, machine-checked. There is also a commercially shipping instance: high-level synthesis (Catapult, Stratus) accepts synthesizable SystemC as its input, through tool chains already in production use (Appendix J §J.5). So “the lowering is a compiler’s job” is not a bet on future tooling; it is a research result and a path through shipping tools, and the AI sits above both. Nor is the target restricted to RTL: the same compiler-floor argument covers the GPU kernel and the accelerator tile (Appendix M), one more backend for the same lowering. The authoring the AI does rises above a floor that itself spans every substrate the book targets.

Where a proven compiler is not in the flow — a hand-written actor-DSL transpiler, or an AI applying the rules directly — the lowering is still mechanical but no longer self-certifying, and Sequential Equivalence Checking is the trust anchor (§H.7). Either way, the architectural decisions live in the actor, the lowering preserves them, and the AI’s reconstruction step — the source of today’s AI-RTL failures — is gone.

H.4 Worked Example: Counter Actor

The class-based form (illustrative, in actor_pkg.sv style):

class CounterActor extends Actor;
  int count;
  task act(MsgBase msg);
    count++;
    `PUBLISH(count);
  endtask
endclass

Three lines of substantive content: state declaration, handler, publish. The synthesizable RTL form (appE_synth/examples/counter_actor.sv, 100 lines) is what the lowering — compiler or AI-applying-the-rules — produces:

• Module signature with parameterized message width and standard handshake ports.
• Three registered signals: count_q, out_valid_q, out_data_q.
• Combinational next-state block deriving *_d from *_q and the inbound handshake.
• Sequential block latching *_d into *_q on posedge clk_i.
• Reset clears state; ready/valid handshake regulates input acceptance and output presentation.

The lowering’s steps, given the class and the rules:

1. Recognize that count is module-scope state -> register pair count_q/count_d.
2. Recognize that act(msg) returns void and modifies state -> combinational block updates count_d on in_fire.
3. Recognize that `PUBLISH(count) produces an outbound message -> register out_data_q, set out_valid_q on the same cycle.
4. Recognize that the actor’s mailbox accepts one message at a time -> in_ready_o = !out_valid_q || out_ready_i, ensuring no message is lost while the previous output is unaccepted.
5. Generate the standard reset block.

Every step is mechanical. There is no architectural decision; the architecture was specified by the actor’s structure. The synthesizable form was hand-translated for this appendix; Golden Gate performs the equivalent lowering on arbitrary FIRRTL automatically (§H.3), and an AI given the same input plus the rules in Appendix E would produce equivalent output. The synthesized result, from Yosys (Appendix E): 98 standard cells per actor instance — 33 flip-flops, 30 XOR + 30 AND + 1 XNOR for the increment, plus a small handful for the handshake. Real hardware, structurally clean, comprehensible to a designer reviewing the netlist.
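
The five steps can be sketched in a condensed form as follows. This is a hypothetical illustration written for this section, not the contents of appE_synth/examples/counter_actor.sv. The module name, the WIDTH parameter, the rst_ni reset and its polarity, and the out_valid_o/out_data_o port names are assumptions. The register names and the in_ready_o expression come from the steps above.

```systemverilog
// Condensed, hypothetical sketch of the five lowering steps for CounterActor.
module counter_actor_sketch #(
  parameter int unsigned WIDTH = 32   // assumed message width
) (
  input  logic             clk_i,
  input  logic             rst_ni,      // assumed: active-low synchronous reset
  input  logic             in_valid_i,  // inbound message offered
  output logic             in_ready_o,
  output logic             out_valid_o, // outbound message offered
  input  logic             out_ready_i,
  output logic [WIDTH-1:0] out_data_o
);
  logic [WIDTH-1:0] count_q, count_d;         // step 1: class state -> register pair
  logic [WIDTH-1:0] out_data_q, out_data_d;   // step 3: published value
  logic             out_valid_q, out_valid_d;
  logic             in_fire;

  // Step 4: accept a message only if the output slot is free or being drained.
  assign in_ready_o  = !out_valid_q || out_ready_i;
  assign in_fire     = in_valid_i && in_ready_o;
  assign out_valid_o = out_valid_q;
  assign out_data_o  = out_data_q;

  // Steps 2 and 3: next-state logic. Defaults hold state, so no latch is inferred.
  always_comb begin
    count_d     = count_q;
    out_data_d  = out_data_q;
    out_valid_d = out_valid_q;
    // An accepted output frees the slot.
    if (out_valid_q && out_ready_i) out_valid_d = 1'b0;
    // An accepted input runs the handler: count++, then publish the new count.
    if (in_fire) begin
      count_d     = count_q + 1'b1;
      out_data_d  = count_q + 1'b1;
      out_valid_d = 1'b1;
    end
  end

  // Step 5: standard reset block and state update.
  always_ff @(posedge clk_i) begin
    if (!rst_ni) begin
      count_q     <= '0;
      out_data_q  <= '0;
      out_valid_q <= 1'b0;
    end else begin
      count_q     <= count_d;
      out_data_q  <= out_data_d;
      out_valid_q <= out_valid_d;
    end
  end
endmodule
```

Because the handler ignores the inbound payload, the sketch carries no in_data_i port. A lowering that keeps the full message bundle on the input channel would add one without changing the control logic.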

The harder case — a stateful checker, not a counter — is what the substrate-swap example (appG_firesim_substrate_swap) demonstrates: a scoreboard actor with a golden model, an expected-value FIFO, and a comparator, lowered to 231 flip-flops and run against its software rendering with identical results (Appendix E §E.4). That a verification actor lowers as cleanly as a design actor is what makes the whole-graph AI flow credible, not just the DUT.

H.5 The Prompt Structure

When the AI rather than a proven compiler performs the lowering, a practical prompt has three components:

1. The actor source. The class-based actor file as a string. The AI consumes this as the structural specification.
2. The translation rules. Appendix E’s five rules, the class-to-module mapping table, and the handshake convention. Any AI capable of structured code generation can apply these rules mechanically.
3. The message-bundle declarations. The typed structs the actor publishes/consumes, given as packed-struct declarations the AI uses as the wire-bundle format on each channel.

A full prompt is on the order of ten kilobytes for a realistic actor (counter, UART byte driver, simple monitor); the output is a single .sv file. Compared to the prompt required to produce the same output from an English spec — which involves the AI inferring port shapes, handshake conventions, FSM encoding, and the boundary against the rest of the system — the actor-input prompt is shorter, more deterministic, and reproducible. Two AI tools given the same input should produce structurally identical output, differing only in cosmetic details. And the output is checkable against the reference, which is the point of §H.7.

H.6 Beyond One-Shot Translation: AI Moves Up the Flow

Shape-preserving translation is the visible benefit. The deeper benefit is that, once the lowering is a compiler’s job, AI participation moves up the design flow to where ambiguity — and therefore the value of a capable collaborator — actually lives.

Architecture refinement. The AI is given the actor model and asked to suggest a topology variant: “what if the alert handler used a hierarchical aggregator instead of a flat broadcast?” The AI proposes new actor classes and `WIRE edges; the same scoreboards run against the proposed topology and the AI iterates against the resulting metrics. This is design exploration with AI as a collaborator, not a transpiler hoping to guess right.

Coverage point generation. Given an actor and an observed message trace, the AI proposes coverage points: “the actor never observed an inbound message of this type while the pending output bit was set; consider adding a coverage bin.” The typed-message discipline gives the AI a structured surface to reason about; coverage suggestions become concrete additions to a coverage actor, not free-form English.
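
A suggestion of this kind might be rendered on the RTL side as follows. This is a sketch under assumptions: the module name actor_handshake_cov and the idea of attaching it to the lowered actor's handshake signals (for example with a bind) are illustrative, and the signal names are borrowed from the counter example's handshake. A class-based coverage actor would express the same bin at the message level.

```systemverilog
// Hypothetical coverage sketch: samples the handshake of one lowered actor.
module actor_handshake_cov (
  input logic clk_i,
  input logic rst_ni,       // assumed: active-low reset
  input logic in_valid_i,   // inbound message offered
  input logic in_ready_o,   // actor can accept it
  input logic out_valid_q   // the "pending output" bit
);
  // Cross of "message offered" against "output pending", sampled out of reset.
  covergroup inbound_vs_pending_cg @(posedge clk_i iff rst_ni);
    cp_in_valid  : coverpoint in_valid_i;
    cp_pending   : coverpoint out_valid_q;
    // The bin the suggestion targets is <1,1>: a message arrives while output is pending.
    x_in_pending : cross cp_in_valid, cp_pending;
  endgroup

  inbound_vs_pending_cg cg = new();

  // The same scenario as a single cover property.
  c_in_while_pending : cover property (
    @(posedge clk_i) disable iff (!rst_ni) in_valid_i && out_valid_q
  );

  // Companion scenario: the inbound message is actually stalled by backpressure.
  c_in_stalled : cover property (
    @(posedge clk_i) disable iff (!rst_ni) in_valid_i && !in_ready_o
  );
endmodule
```

The two cover properties separate "offered while pending" from "stalled while pending". With in_ready_o = !out_valid_q || out_ready_i, the first can hit without the second whenever the consumer drains the output in the same cycle.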

Refactoring. The AI is given two related actors and asked to factor common behavior into a base class. Because actors have well-defined message contracts and no shared state, the refactoring is local and verifiable: the typed messages on each channel must match before and after.

Cross-language porting. The SystemC port (Appendix J) and the SystemVerilog implementation share the same actor topology. An AI given a SystemC actor produces the SystemVerilog equivalent or vice versa by following the mapping table in Appendix J. This is stronger than “AI can translate code” — it is “AI can preserve methodology while changing host language,” which is what cross-team collaboration needs. The renderings the AI can move among are exactly those Appendix E §E.1 maps — SystemVerilog class, SystemC module, pure-C++ object (Appendix K), and the synthesizable RTL the compiler emits — and the port is shape-preserving across all of them because the topology and the typed messages do not change between renderings.

Verification. The AI is given an actor specification and a candidate RTL implementation and asked to identify points where the implementation could violate the spec. Because the specification is structurally explicit (typed messages, fixed topology), the AI’s task is bounded equivalence checking rather than open-ended fault-finding.

H.7 The Kernel Oracle, Restored

There is one asymmetry between AI-assisted theorem proving (Chapter 3) and AI-assisted RTL that the methodology has to confront. AI-assisted proving has a kernel oracle: the prover’s type-checker mechanically rejects malformed proofs, so an LLM’s wrong tactic suggestion is filtered automatically and soundness survives the assistance. English-to-Verilog AI has no equivalent: the test suite that would play the role is itself fallible, so a wrong-but-passing output escapes. The actor methodology does not give AI-RTL a perfect kernel, but it supplies a check in each of the three regimes the flow can run in, and it confines the surface that no built-in mechanical filter guards to exactly one of them.

1. AI authors the actor; a proven compiler lowers it. The oracle is two-part and mechanical. The actor specification is verified at the specification level by the scoreboards and coverage that already exist (Appendix D §D.4); the proven derivation — Golden Gate’s LI-BDN partial-implementation guarantee, or a Kami-style extraction (machine-checked to the Bluespec level, the compiler below it trusted) — carries that verification down to the RTL (Appendix D §D.8). Nothing the AI produces reaches gates unchecked: the spec is checked by the testbench, the lowering is checked by the proof.
2. AI applies the lowering rules itself, but the lowering is model-checkable. Even without a full proven compiler, a lowered actor model is small and closed enough to check against its reference by bounded model checking — the push-button BMC flow Golden Gate uses to certify its optimized models (Appendix E §E.5) is exactly this, applied per node. The AI’s lowering output is filtered by a mechanical check, not trusted on a test pass.
3. AI generates RTL directly, no proven compiler in the loop. This is the only unguarded regime, and here Sequential Equivalence Checking (SEC) between the actor specification’s reference behavior and the AI-generated RTL is the trust anchor (Chapter 3 §3.5). The actor framework makes this the smallest possible SEC problem: the reference and the implementation are the same shape, one structured concurrent representation against another, so the equivalence checker’s state-matching is tractable where a UVM-reference-against-RTL SEC would not be.

The methodology’s contribution to AI-RTL trust is thus not a claim that AI output is correct; it is that the architecture confines the unguarded surface to regime 3 and gives regime 3 the smallest SEC problem it can have. Appendices D §D.8 and E §E.5 state the same point from the synthesis and continuum sides.
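
The property that regimes 2 and 3 hand to a model checker or SEC tool can be sketched as a miter-style checker. The module below is a hypothetical illustration, not a tool flow from the book: it assumes the reference rendering and the AI-generated RTL are driven by the same inputs, that their channel signals are routed to the ports shown, and that equivalence is meant cycle for cycle. All port and label names are invented for the example. A latency-insensitive comparison would match message sequences rather than cycles.

```systemverilog
// Hypothetical miter-style checker comparing two renderings of one actor.
module actor_equiv_checker #(
  parameter int unsigned WIDTH = 32   // assumed outbound message width
) (
  input logic             clk_i,
  input logic             rst_ni,           // assumed: active-low reset
  // Reference rendering (for example, the lowered actor specification).
  input logic             ref_in_ready_i,
  input logic             ref_out_valid_i,
  input logic [WIDTH-1:0] ref_out_data_i,
  // Candidate rendering (for example, AI-generated RTL).
  input logic             impl_in_ready_i,
  input logic             impl_out_valid_i,
  input logic [WIDTH-1:0] impl_out_data_i
);
  default clocking cb @(posedge clk_i); endclocking
  default disable iff (!rst_ni);

  // Both renderings accept inbound messages on the same cycles.
  a_ready_match : assert property (ref_in_ready_i == impl_in_ready_i);

  // Both renderings publish on the same cycles.
  a_valid_match : assert property (ref_out_valid_i == impl_out_valid_i);

  // Whenever a message is published, the payloads agree.
  a_data_match  : assert property (
    ref_out_valid_i |-> (ref_out_data_i == impl_out_data_i)
  );
endmodule
```

Because both sides share the same channel shape, the checker needs only three assertions per actor. That is the practical meaning of "the smallest possible SEC problem": the state-matching burden falls on registers inside one node, not on a mapping between a testbench model and a differently structured design.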

H.8 Why This Matters Now

AI-assisted RTL design is widely cited as one of the fastest-growing areas of EDA tooling. Today’s tools work from English to Verilog because that is the format engineers currently use to capture intent. The argument of this book is that the format itself is the leverage point — structured concurrent specifications give AI a tractable substrate, and a proven lowering on top of them removes the step where AI-RTL fails; unstructured English gives neither. Teams that adopt the actor methodology now get a clean upgrade path as AI tooling and proven compilers mature around structured inputs. Teams that stick to English specifications and UVM testbenches inherit the friction of a bad input format at every AI iteration, with a fallible test suite as their only check.

The concrete deliverable for AI-RTL on the actor methodology is not a new tool but a new input, and a re-division of labor. The same AI tools consuming actor source instead of English produce more reliable results; the deterministic lowering moves to a proven compiler; the AI’s effort moves up to the spec and the exploration. Structured concurrency in, structured RTL out, shape preserved, equivalence checked.

H.9 What This Appendix Has Established

• The shape-preserving translation is mechanical, and the lowering is a proven compiler’s job. Appendix E shows the worked example; Golden Gate (§H.3) automates the lowering with a formal guarantee. The AI need not perform the deterministic step at all.
• The architectural decisions are in the input, not the AI. Class-based actors specify state, handler, message types, and topology; the AI fills in nothing architectural. This is the difference between English-to-Verilog and actor-to-Verilog.
• AI participation moves up the flow. Authoring and refining the spec, exploring topology, generating coverage, refactoring, cross-language porting, and bounded equivalence checking become the AI’s role — the work where ambiguity, and therefore a collaborator’s value, actually lives.
• The kernel oracle is restored, and the unguarded surface is confined to SEC. In two of three regimes a mechanical check (testbench plus proven derivation; or per-node model checking) filters the AI’s output; in the third, SEC is the anchor, and the actor framework gives it the smallest possible problem.
• The framework is the format leverage. The same property that makes actors carry through to RTL synthesis (Appendix E) makes them tractable for AI translation and verifiable after it. Picking the right model of computation once unlocks both.