Emergent Hardware Verification

Appendix C Verifying Earl Grey as Actors

Appendix A demonstrated the actor framework on a single block (UBUS), and Appendix B composed a synthetic two-master / three-peripheral mini-SoC. This appendix takes the next step: a real production silicon root-of-trust SoC, OpenTitan’s Earl Grey variant, with twenty-eight IP types modeled as actors, and a chip-level test that exercises power, security, lifecycle, crypto, entropy, symbolic-name CSR access through the framework’s RAL, and lockstep glitch detection.

The point of the appendix is not the line-count delta. It is the mental-model fit. Hardware verification at the SoC scale is a concurrent, distributed problem; the actor topology mirrors the silicon block diagram rather than imposing a class hierarchy on top of it. Re-implementing pieces of Earl Grey makes the framework’s claims empirical rather than rhetorical.

The full source lives in appC_earlgrey/ and runs end-to-end on Verilator 5.049.

C.1 Why Earl Grey

Earl Grey is the canonical OpenTitan implementation: an Ibex-based RISC-V SoC with around twenty-eight IP blocks, a TileLink interconnect, multiple clock and reset domains, and a heavy security verification surface (alert handler with four parallel escalation classes, lockstep Ibex pair, lifecycle controller, key manager, true-random-number chain, OTP-protected secrets). The current OpenTitan UVM verification effort is approximately 205,000 lines distributed across:

  • hw/dv/sv/ – shared agents, base classes, RAL utilities (~41,000 lines)

  • hw/ip//dv/ – per-IP DV environments (~144,000 lines across twenty-eight IPs)

  • hw/top_earlgrey/dv/ – chip-level env with roughly a hundred chip_sw_* sequences (~19,000 lines)

That is the scale at which any framework claim about hardware verification has to hold up. Block-level proofs (Appendix A) and synthetic SoC proofs (Appendix B) do not exercise the cross-IP coordination, multi-clock complexity, security-critical FSM interactions, or fault-injection paths that an industrial SoC has in abundance. Earl Grey does.

Figure C.1: Earl Grey at chip scale: 28 IP types modeled as actors, wired through a single TileLink xbar. The dashed-red edge is a cross-IP causality the chip-level test exercises (§C.8).

C.2 Scope of This Appendix

The example covers all twenty-eight Earl Grey IP types, organized by concern:

  • Bus and routing: TL-UL master / slave / monitor / xbar

  • Power island: pwrmgr, clkmgr, rstmgr, lc_ctrl

  • Non-volatile memory: flash_ctrl, rom_ctrl, otp_ctrl

  • Key derivation: keymgr

  • Crypto: aes, kmac, hmac, otbn

  • RNG chain: entropy_src, csrng, edn0, edn1

  • CPU and interrupts: rv_plic, rv_core_ibex (with internal lockstep pair)

  • Common peripherals: UART (4 instances), AON timer, alert handler

  • General-purpose peripherals: gpio, pinmux_aon, pwm_aon, adc_ctrl_aon

  • Serial buses: i2c (3 instances), spi_host (2 instances), spi_device

  • USB: usbdev with twelve endpoints

The IP actors are behavioral models that simulate the SW-visible behavior of each block: register operations, state machines, message protocols, interrupt sources, alert sources, and clock-domain interactions. They are not wrappers around the real OpenTitan RTL. To run the framework against real Earl Grey RTL, point the TlulSlaveActor BFM at OpenTitan’s RTL files and replace each behavioral actor with a thin wrapper that proxies bus reads and writes; the structural verification machinery (BFMs, scoreboards, coverage, supervision, observability) transfers unchanged.

C.3 Shared Infrastructure

Two pieces of shared infrastructure appear once and are used everywhere: the TileLink BFM stack and the Earl Grey memory map.

TileLink Uncached Lightweight (TL-UL)

Every Earl Grey IP exposes its CSRs through TL-UL. The actor framework models the bus protocol with four small actors:

  • TlulMasterActor – consumes TlulReq_s envelopes from its mailbox, drives the request channel, samples the response, and publishes a TlulRsp_s back.

  • TlulSlaveActor – responds to bus requests in the slave’s address window with an associative-array backing store; one slave actor per IP CSR region.

  • TlulMonitorActor – passive observer; pairs request and response handshakes and publishes one TlulMonPkt_s per completed transaction for downstream subscribers (scoreboard, RAL predictor, tracer, recorder).

  • TlulXbarActor – routes requests by address to the right slave actor; arbitrates between multiple masters; synthesizes an error response on address-decode misses.

Together these four files are around 385 lines. The OpenTitan UVM equivalent (tl_agent plus base classes) is roughly 3,000 lines.
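
The routing rule described for TlulXbarActor can be illustrated with a short behavioral sketch, written here in Python rather than in the framework's SystemVerilog. The class names Slave and Xbar, the 64 KiB window size, and the (error, data) response shape are assumptions made for the example; only the two base addresses are taken from the Earl Grey memory map.

```python
from dataclasses import dataclass, field

WINDOW = 0x1_0000  # assumed window size per slave, for illustration only


@dataclass
class Slave:
    """One CSR region: a base address plus a dictionary backing store."""
    name: str
    base: int
    store: dict = field(default_factory=dict)

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + WINDOW

    def write(self, addr: int, data: int) -> None:
        self.store[addr - self.base] = data & 0xFFFF_FFFF

    def read(self, addr: int) -> int:
        # Registers that were never written read back as zero in this sketch.
        return self.store.get(addr - self.base, 0)


class Xbar:
    """Routes each request to the slave whose window contains the address."""

    def __init__(self, slaves):
        self.slaves = list(slaves)

    def access(self, addr: int, write: bool, data: int = 0):
        """Return (error, read_data); error is True on an address-decode miss."""
        for slave in self.slaves:
            if slave.contains(addr):
                if write:
                    slave.write(addr, data)
                    return (False, 0)
                return (False, slave.read(addr))
        return (True, 0)  # synthesized error response


xbar = Xbar([Slave("uart0", 0x4000_0000), Slave("gpio", 0x4004_0000)])
xbar.access(0x4000_0010, write=True, data=0x3)
```

A read of 0x4000_0010 then returns the stored value, while an access to an unmapped address such as 0x5000_0000 comes back with the error flag set. Arbitration between masters is left out of the sketch.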

Earl Grey memory map

The chip-level memory map is transcribed verbatim from hw/top_earlgrey/doc/memory_map.md (which OpenTitan’s topgen.py auto-generates from its Hjson SoC description). It is the single source of truth that the chip env uses to wire the xbar’s address regions and that test code uses to do CSR access by name:

package earlgrey_memory_map_pkg;
  // Peripheral xbar (0x4000_0000 - 0x40FF_FFFF)
  parameter logic [31:0] EG_UART0_BASE         = 32'h4000_0000;
  parameter logic [31:0] EG_UART1_BASE         = 32'h4001_0000;
  parameter logic [31:0] EG_GPIO_BASE          = 32'h4004_0000;
  // ... uart2/3, spi_device, i2c0..2, pattgen
  parameter logic [31:0] EG_RV_TIMER_BASE      = 32'h4010_0000;
  parameter logic [31:0] EG_OTP_CTRL_BASE      = 32'h4013_0000;
  parameter logic [31:0] EG_LC_CTRL_BASE       = 32'h4014_0000;
  parameter logic [31:0] EG_ALERT_HANDLER_BASE = 32'h4015_0000;

  // Always-on (AON) sub-bus
  parameter logic [31:0] EG_PWRMGR_AON_BASE    = 32'h4040_0000;
  parameter logic [31:0] EG_RSTMGR_AON_BASE    = 32'h4041_0000;
  parameter logic [31:0] EG_CLKMGR_AON_BASE    = 32'h4042_0000;
  parameter logic [31:0] EG_AON_TIMER_AON_BASE = 32'h4047_0000;

  // Main xbar (security/crypto, 0x4100_0000 - 0x41FF_FFFF)
  parameter logic [31:0] EG_FLASH_CTRL_BASE    = 32'h4100_0000;
  parameter logic [31:0] EG_AES_BASE           = 32'h4110_0000;
  parameter logic [31:0] EG_HMAC_BASE          = 32'h4111_0000;
  parameter logic [31:0] EG_KMAC_BASE          = 32'h4112_0000;
  parameter logic [31:0] EG_KEYMGR_BASE        = 32'h4114_0000;
  parameter logic [31:0] EG_CSRNG_BASE         = 32'h4115_0000;
  parameter logic [31:0] EG_ENTROPY_SRC_BASE   = 32'h4116_0000;
  parameter logic [31:0] EG_EDN0_BASE          = 32'h4117_0000;
  parameter logic [31:0] EG_EDN1_BASE          = 32'h4118_0000;
  parameter logic [31:0] EG_ROM_CTRL_BASE      = 32'h411E_0000;

  // Memory blocks
  parameter logic [31:0] EG_ROM_MEM_BASE       = 32'h0000_8000;
  parameter logic [31:0] EG_SRAM_MAIN_BASE     = 32'h1000_0000;
  parameter logic [31:0] EG_FLASH_MEM_BASE     = 32'h2000_0000;

  // ... (full list in appC_earlgrey/model/common/earlgrey_memory_map_pkg.sv)
endpackage

In real verification, every CSR access in every test goes through the RAL by symbolic name; the address comes from this package. The actor framework’s actor_ral_pkg::RalActor (Chapter 6, §6.9) plays the same role, resolving symbolic register names via addr_of and re-publishing each access as a RalEvent_s; the deliberately simplified, dictionary-backed variant of Appendix B (Step 5) shares the name but not the symbolic-name API.
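
Symbolic-name resolution amounts to adding a per-register offset to the block base. The sketch below shows that idea in Python; the RalBlock class is hypothetical, and the register offsets are placeholders chosen for the example rather than values taken from OpenTitan. Only the addr_of name and the two base addresses come from the text.

```python
class RalBlock:
    """Maps symbolic register names to absolute addresses for one IP."""

    def __init__(self, base: int, offsets: dict):
        self.base = base
        self.offsets = dict(offsets)

    def addr_of(self, name: str) -> int:
        if name not in self.offsets:
            raise KeyError(f"unknown register {name!r}")
        return self.base + self.offsets[name]


EG_UART0_BASE = 0x4000_0000  # from the memory map package
EG_AES_BASE = 0x4110_0000

# Offsets are placeholders for illustration, not OpenTitan's register map.
ral_uart0 = RalBlock(EG_UART0_BASE, {"CTRL": 0x10, "STATUS": 0x14})
ral_aes = RalBlock(EG_AES_BASE, {"TRIGGER": 0x80})

ctrl_addr = ral_uart0.addr_of("CTRL")
```

Test code then never spells out a raw address: it asks for a register by name, and a rename or relocation in the generated map changes only the table.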

C.4 The Power Island as Four Cooperating Actors

The single most novel modeling exercise in this appendix is the always-on power island: pwrmgr, clkmgr, rstmgr, and lc_ctrl working together to manage power states, clock gating, reset domains, and lifecycle state.

In OpenTitan UVM, verifying these four IPs in concert involves the chip-level scoreboard, multiple monitors on different clock domains, and careful CDC-aware reasoning about when each domain’s signals are stable. Several of those chip_sw_* sequences exist specifically to exercise this subsystem (chip_sw_deep_sleep_*, chip_sw_all_escalation_resets_vseq, chip_sw_pwrmgr_*).

As actors, each of the four IPs is a small message-driven state machine, and they cooperate by publishing into each other’s mailboxes:

  • PwrmgrActor – six-state FSM (RESET / LOW_POWER_REQ / LOW_POWER / DEEP_SLEEP / FAST_WAKEUP / ACTIVE). Subscribes to PwrLowPowerReq_s from SW (via the bus), PwrWakeupEvent_s from wakeup sources (AON timer, sysrst_ctrl, pinmux), and ResetEvent_s from rstmgr. Publishes PwrStateTransition_s on every state change and ClkGateReq_s to clkmgr.

  • ClkmgrActor – subscribes to ClkGateReq_s from pwrmgr and ClkHintReq_s from SW. Publishes ClkStateChange_s on every gate change.

  • RstmgrActor – wired for ResetReq_s from any source. Maps reset requests to Earl Grey-specific reset domains and reset reasons, records reason history, and publishes ResetEvent_s pairs (assert + deassert) downstream. Downstream IP actors consume those events to clear their own state, while a chip-level Supervisor (actor_supervision_pkg, REST_FOR_ONE) restarts any actor that fails outright. At block scope, the reusable OtResetSupervisor from common/ packages this same ResetReq_s \(\rightarrow \) ResetEvent_s conversion together with its own supervisor.

  • LcCtrlActor – security-critical state machine (RAW \(\rightarrow \) TEST_UNLOCKED \(\rightarrow \) DEV/PROD/RMA \(\rightarrow \) SCRAP). SCRAP is terminal. Subscribes to LcTransitionReq_s from SW (token-protected) and EscAction_s from the alert handler (ESC_LC_SCRAP forces immediate transition to SCRAP, irrespective of current state).

The interesting property is the interaction. A SW low-power request publishes PwrLowPowerReq_s; pwrmgr receives it, transitions through the request and entry states, publishes ClkGateReq_s to clkmgr (which gates io / usb / main clocks); pwrmgr stays in LOW_POWER until a PwrWakeupEvent_s arrives from AON timer; on wakeup, pwrmgr transitions through FAST_WAKEUP and back to ACTIVE while telling clkmgr to ungate.

None of the four actors needs to know the others’ internal state. Coordination is entirely by message. The chip-level scoreboard is wired with one `WIRE per relevant message type and observes the cross-IP causality: this PwrLowPowerReq_s caused this sequence of ClkStateChange_s, which put the system into low power, and that PwrWakeupEvent_s caused the reverse. UVM expresses the same observation with TLM analysis fifos, virtual sequencer hooks, and reconstruction logic in the chip scoreboard. Actors express it with a handful of `WIRE declarations.
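
The low-power round trip can be sketched with a toy publish/subscribe fabric in Python. Everything here is an illustrative assumption: the Bus class and its wire method only imitate the role of a typed edge, delivery is synchronous rather than through mailboxes and threads, and the pwrmgr model keeps just the four states the scenario visits.

```python
from collections import defaultdict


class Bus:
    """Toy fabric: one edge per (producer, message type, consumer)."""

    def __init__(self):
        self.edges = defaultdict(list)

    def wire(self, producer, msg_type, consumer):
        self.edges[(producer, msg_type)].append(consumer)

    def publish(self, producer, msg_type, payload=None):
        for consumer in self.edges[(producer, msg_type)]:
            consumer.receive(msg_type, payload)


class Pwrmgr:
    def __init__(self, bus):
        self.bus, self.state = bus, "ACTIVE"

    def _goto(self, state):
        self.state = state
        self.bus.publish(self, "PwrStateTransition", state)

    def receive(self, msg_type, payload):
        if msg_type == "PwrLowPowerReq" and self.state == "ACTIVE":
            self._goto("LOW_POWER_REQ")
            self.bus.publish(self, "ClkGateReq", True)    # gate clocks
            self._goto("LOW_POWER")
        elif msg_type == "PwrWakeupEvent" and self.state == "LOW_POWER":
            self._goto("FAST_WAKEUP")
            self.bus.publish(self, "ClkGateReq", False)   # ungate clocks
            self._goto("ACTIVE")


class Clkmgr:
    def __init__(self, bus):
        self.bus, self.gated = bus, False

    def receive(self, msg_type, payload):
        if msg_type == "ClkGateReq" and payload != self.gated:
            self.gated = payload
            self.bus.publish(self, "ClkStateChange", payload)


class Scoreboard:
    """Passive observer: records what it is wired to, owns nothing."""

    def __init__(self):
        self.log = []

    def receive(self, msg_type, payload):
        self.log.append((msg_type, payload))


bus = Bus()
pwrmgr, clkmgr, scb = Pwrmgr(bus), Clkmgr(bus), Scoreboard()
bus.wire("sw", "PwrLowPowerReq", pwrmgr)
bus.wire("aon_timer", "PwrWakeupEvent", pwrmgr)
bus.wire(pwrmgr, "ClkGateReq", clkmgr)
bus.wire(pwrmgr, "PwrStateTransition", scb)
bus.wire(clkmgr, "ClkStateChange", scb)

bus.publish("sw", "PwrLowPowerReq")
bus.publish("aon_timer", "PwrWakeupEvent")
```

The scoreboard ends up with an ordered record of state transitions interleaved with clock-gate changes, without either manager holding a reference to it.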

C.5 The Entropy Chain

A second multi-IP composition that exercises the framework is the random-number chain: entropy_src (NIST SP800-90B) feeds csrng (SP800-90A AES-CTR DRBG), which feeds two edn endpoints (EDN0, EDN1), which feed hardware consumers (AES, KMAC, OTBN, keymgr).

In silicon, this is an entropy distribution network. As actors:

// In EarlgreyChipEnvActor::new():
`WIRE(entropy_src, EntropySeed_s, csrng)  // raw seeds flow up to CSRNG
`WIRE(csrng,       CsrngRsp_s,    edn0)   // CSRNG output feeds both EDNs
`WIRE(csrng,       CsrngRsp_s,    edn1)
`WIRE(edn0,        CsrngCmd_s,    csrng)  // EDN GENERATE requests flow back
`WIRE(edn1,        CsrngCmd_s,    csrng)

The chain is five data-path `WIRE edges (alert and response side edges add more), each naming the producer, the exact message type flowing on that edge, and the consumer. Adding a new entropy consumer (say, a future ascon crypto IP) is one more line.

The data path mirrors the silicon: entropy_src ingests raw 4-bit noise samples (in real silicon these come from AST analog noise sources; here they are testbench-published EntropyNoiseSample_s structs), runs an SP800-90B repetition-count health test (repcnt; the adaptive-proportion test adaptp is declared but stubbed), and emits an EntropySeed_s containing 384 conditioned bits once enough samples have been gathered. csrng consumes the seed to instantiate one DRBG instance per EDN and produces a CsrngRsp_s whenever an EDN issues a GENERATE request; edn0 and edn1 repackage the words and route them to the requesting hardware consumer.

The verification scoreboards observe each step independently. entropy_src.health_failures counts SP800-90B health-test trips; csrng.ops_done counts DRBG operations; per-EDN buffers track refresh latency. None of this requires touching the entropy chain itself; coverage and observability are added as new subscribers.
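
A minimal Python sketch of the entropy_src behavior described above may help fix the idea. It rests on several assumptions that are not stated in the text: the seed is emitted after exactly 96 samples (384 / 4), the repetition-count cutoff is 8, a failed window is simply discarded, and conditioning is modeled with SHA3-384 from the standard library. The EntropySrc class is a stand-in, not the framework's actor.

```python
import hashlib

SAMPLE_BITS = 4
SEED_BITS = 384
SAMPLES_PER_SEED = SEED_BITS // SAMPLE_BITS  # 96: assumed gathering threshold
REPCNT_CUTOFF = 8                            # assumed cutoff, for illustration


class EntropySrc:
    def __init__(self):
        self.pending = []          # samples gathered toward the next seed
        self.seeds = []            # emitted 384-bit seeds
        self.health_failures = 0
        self._last, self._run = None, 0

    def push(self, sample: int) -> None:
        sample &= 0xF
        # Repetition-count test: track the current run of identical samples.
        self._run = self._run + 1 if sample == self._last else 1
        self._last = sample
        if self._run >= REPCNT_CUTOFF:
            self.health_failures += 1
            self.pending.clear()   # discard the suspect window
            self._last, self._run = None, 0
            return
        self.pending.append(sample)
        if len(self.pending) == SAMPLES_PER_SEED:
            # Pack two 4-bit samples per byte, then condition to 384 bits.
            raw = bytes((self.pending[i] << 4) | self.pending[i + 1]
                        for i in range(0, SAMPLES_PER_SEED, 2))
            self.seeds.append(hashlib.sha3_384(raw).digest())
            self.pending.clear()


src = EntropySrc()
for i in range(100):               # 100 synthetic, non-repeating samples
    src.push(i % 16)

stuck = EntropySrc()
for _ in range(REPCNT_CUTOFF):     # a stuck-at source trips the test
    stuck.push(0x5)
```

Under these assumptions, 100 healthy samples yield one 48-byte seed with four samples left over, and a source stuck at one value registers a health failure without producing a seed.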

C.6 The Security Path: Lockstep, Alerts, and Lifecycle

The most important demonstration in this appendix is the security path. It is the verification problem where UVM scaffolding hurts most, and where actor topology buys the most clarity.

The scenario: a glitch is injected on Ibex CoreA’s program counter; the lockstep comparator detects divergence from CoreB; a fatal alert is raised; the alert handler escalates through phases 0–3 firing NMI, lifecycle scrap, system reset, and chip reset; the lifecycle controller is forced to SCRAP; the key manager is forced to DISABLED; the reset supervisor restarts the topology cleanly.

In the actor framework, this is a topology, not a state-coordination problem:

  • Two Ibex core actors plus a comparator actor. Two IbexCoreActor instances, core_a (CoreA) and core_b (CoreB), each publish InstrTrace_s on every step(). IbexLockstepComparatorActor subscribes to both and compares per-cycle PC and instruction. On mismatch, it publishes LockstepMismatch_s and AlertEvent_s.

  • Alert handler with four concurrent escalation FSMs. AlertHandlerActor dispatches incoming AlertEvent_s to all four EscClassFsmActor instances; each filters on its own class. The CLASS_A FSM (the comparator’s target class) progresses through phases 0..3, publishing EscPhaseChange_s and EscAction_s on each phase entry. The other three FSMs ignore this alert and are free to handle their own.

  • Action handlers as composable subscribers. NmiActionActor is wired for EscAction_s (handles ESC_NMI); LcScrapActionActor is wired for EscAction_s (handles ESC_LC_SCRAP); ResetActionActor is wired for EscAction_s (handles ESC_RESET_SYS and ESC_RESET_CHIP). Each publishes EscActionResult_s after performing its action.

  • Cross-IP forcing via the same `WIRE primitive. lc_ctrl is wired to the four class FSMs for EscAction_s; on ESC_LC_SCRAP it forces its own state to SCRAP. keymgr does the same with KEYMGR_DISABLED. No special coordination machinery is needed.

  • Reset cascade through the supervisor. ResetActionActor publishes ResetReq_s for the reset phases. rstmgr (subscribing) records the reset reason and publishes ResetEvent_s pairs (assert + deassert); every actor wired to that stream clears its own state on the assert and resumes on the deassert, while the chip-level Supervisor relaunches any actor that fails outright.

  • Scoreboard observes everything. ChipScoreboardActor is wired for each event type it tracks, with one `WIRE per producer per type: AlertEvent_s, ResetReq_s, AonTimerEvent_s, IrqMsg_s, TlulMonPkt_s, and RalEvent_s. It observes the causality (alert at time \(t\), reset at time \(t + \Delta \) within tolerance) and reports that the chain ran end-to-end.

The chip-level test (§C.8) walks through this scenario and verifies all of it.
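
The detection and escalation steps listed above can be sketched in a few lines of Python. The classes Core, Comparator, and EscalationFsm, the callback-style subscription, and the program-counter values are all assumptions made for the example; escalation timing between phases is omitted, and 0x13 is used as the instruction word because it is the RISC-V NOP encoding.

```python
ACTIONS = ["ESC_NMI", "ESC_LC_SCRAP", "ESC_RESET_SYS", "ESC_RESET_CHIP"]  # phases 0..3


class Core:
    def __init__(self):
        self.glitch = 0

    def step(self, pc, instr):
        trace = (pc ^ self.glitch, instr)
        self.glitch = 0            # an injected glitch affects one step only
        return trace


class Comparator:
    def __init__(self, on_alert):
        self.on_alert, self.mismatches = on_alert, 0

    def compare(self, trace_a, trace_b):
        if trace_a != trace_b:
            self.mismatches += 1
            self.on_alert()


class EscalationFsm:
    def __init__(self):
        self.subscribers, self.fired = [], []

    def escalate(self):
        for phase, action in enumerate(ACTIONS):
            self.fired.append((phase, action))
            for handle in self.subscribers:
                handle(action)


state = {"lc_ctrl": "RAW", "keymgr": "OWNER_INT_KEY"}


def lc_ctrl_handler(action):
    if action == "ESC_LC_SCRAP":
        state["lc_ctrl"] = "SCRAP"


def keymgr_handler(action):
    if action == "ESC_LC_SCRAP":
        state["keymgr"] = "DISABLED"


fsm = EscalationFsm()
fsm.subscribers += [lc_ctrl_handler, keymgr_handler]
comparator = Comparator(fsm.escalate)
core_a, core_b = Core(), Core()

for i in range(4):                 # four clean lockstep steps
    pc = 0x100 + 4 * i
    comparator.compare(core_a.step(pc, 0x13), core_b.step(pc, 0x13))

core_a.glitch = 0x10               # XOR applied to CoreA's next PC
comparator.compare(core_a.step(0x110, 0x13), core_b.step(0x110, 0x13))
```

Neither handler knows about the comparator or about the other handler; each reacts only to the action it cares about, which is the property the topology relies on.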

C.7 Composing the Chip-Level Environment

EarlgreyChipEnvActor::new constructs all twenty-eight IP actors and wires the topology with `WIRE declarations. The constructor is around 460 lines (roughly 220 `WIRE typed edges plus the IP-actor instantiations), which is the entire chip-level wiring. The OpenTitan UVM equivalent is around 1,500 lines of chip_env.sv plus chip_env_cfg.sv plus connect-phase wiring.

A representative excerpt:

// Power island reset cascade
`WIRE(pwrmgr,                  PwrStateTransition_s, rstmgr)           // power island drives the cascade
`WIRE(aon_timer_env.timer,     ResetReq_s,           rstmgr)           // watchdog bite
`WIRE(alert_env.reset_handler, ResetReq_s,           rstmgr)           // escalation
`WIRE(alert_env.reset_handler, ResetReq_s,           chip_scoreboard)  // causality
`WIRE(aon_timer_env.timer,     ResetReq_s,           chip_scoreboard)

// rstmgr broadcasts ResetEvent_s to every reset-aware IP
foreach (uart_envs[i]) `WIRE(rstmgr, ResetEvent_s, uart_envs[i].uart)
`WIRE(rstmgr, ResetEvent_s, aon_timer_env.timer)
`WIRE(rstmgr, ResetEvent_s, rom_ctrl)
`WIRE(rstmgr, ResetEvent_s, otp_ctrl)
`WIRE(rstmgr, ResetEvent_s, pwrmgr)
`WIRE(rstmgr, ResetEvent_s, lc_ctrl)
`WIRE(rstmgr, ResetEvent_s, ibex.core_a)
`WIRE(rstmgr, ResetEvent_s, ibex.core_b)

// Alert sources (real Earl Grey IPs, not synthetic stand-ins)
`WIRE(rom_ctrl,        AlertEvent_s,       alert_env.handler)
`WIRE(entropy_src,     AlertEvent_s,       alert_env.handler)
`WIRE(ibex.comparator, AlertEvent_s,       alert_env.handler)
`WIRE(ibex.comparator, LockstepMismatch_s, alert_env.handler)

// Alert escalation -> security-critical actors
foreach (alert_env.handler.classes[i]) begin
  `WIRE(alert_env.handler.classes[i], EscAction_s, lc_ctrl)
  `WIRE(alert_env.handler.classes[i], EscAction_s, keymgr)
end

// Entropy chain
`WIRE(entropy_src, EntropySeed_s, csrng)
`WIRE(csrng,       CsrngRsp_s,    edn0)
`WIRE(csrng,       CsrngRsp_s,    edn1)
`WIRE(edn0,        CsrngCmd_s,    csrng)
`WIRE(edn1,        CsrngCmd_s,    csrng)

// OTP fan-out at boot
`WIRE(otp_ctrl, OtpInitDone_s, keymgr)
`WIRE(otp_ctrl, OtpInitDone_s, rom_ctrl)
`WIRE(otp_ctrl, OtpInitDone_s, lc_ctrl)

// IRQ aggregation -> Ibex
`WIRE(aon_timer_env.timer, IrqMsg_s,       plic)
`WIRE(plic,                PlicIrqClaim_s, ibex.core_a)

Each line is one typed edge. The result is a directed graph that mirrors the silicon block diagram. Adding a new observer (say, a security coverage actor that wants to watch every LcTransitionResult_s and KeymgrAdvanceResult_s together) is one new actor file plus two `WIRE calls, one per type.

C.8 The Seven-Phase Chip-Level Test

earlgrey_chip_sw_test.sv runs seven phases that mirror the shape of OpenTitan’s chip_sw_* sequence library. Test code is concise because the topology is already correct; the test is just stimulus.

Phase 1 – Boot flow. The test pre-loads a small ROM image, computes the matching hash, seeds OTP with creator/owner secrets and the ROM digest, and publishes a ResetEvent_s with asserted=0 so rom_ctrl and otp_ctrl “boot.” rom_ctrl runs its hash check (matches; no alert raised). otp_ctrl publishes OtpInitDone_s with seeds. keymgr (subscribing) advances three times: RESET \(\rightarrow \) INIT \(\rightarrow \) CREATOR_ROOT_KEY \(\rightarrow \) OWNER_INT_KEY. lc_ctrl powers up in RAW, and the test requests a volatile transition straight to DEV; since RAW \(\rightarrow \) DEV is not a legal hop, lc_ctrl rejects it and stays in RAW (it is later forced to SCRAP by alert escalation in Phase 3). The Ibex executes sixteen NOPs in lockstep; the comparator sees no mismatch.
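
The rejected RAW-to-DEV request follows from treating the lifecycle chain as a table of legal next states. The Python sketch below is one reading of the chain given in §C.4 (RAW, TEST_UNLOCKED, DEV/PROD/RMA, SCRAP); the LEGAL_NEXT table and the LcCtrl class are illustrative assumptions, token checking is omitted, and the real controller's transition matrix is richer than this.

```python
# Assumed transition table, derived from the chain described in the text.
LEGAL_NEXT = {
    "RAW": {"TEST_UNLOCKED"},
    "TEST_UNLOCKED": {"DEV", "PROD", "RMA"},
    "DEV": {"SCRAP"},
    "PROD": {"SCRAP"},
    "RMA": {"SCRAP"},
    "SCRAP": set(),  # terminal
}


class LcCtrl:
    def __init__(self):
        self.state = "RAW"

    def request(self, target: str) -> bool:
        """SW-requested transition: accepted only if it is a legal hop."""
        if target in LEGAL_NEXT[self.state]:
            self.state = target
            return True
        return False

    def force_scrap(self) -> None:
        """Escalation path: SCRAP from any state, no legality check."""
        self.state = "SCRAP"


lc = LcCtrl()
accepted = lc.request("DEV")       # RAW -> DEV is not a legal hop
state_after_request = lc.state
lc.force_scrap()                   # what ESC_LC_SCRAP later does
```

The request path and the forcing path are deliberately separate: software can only move along the table, while escalation bypasses it.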

Phase 1.5 – Peripheral traffic. The test exercises the general-purpose peripherals concurrently. pinmux routes four IP signals to four physical pads. gpio enables interrupt-on-change for the bottom four input pins, drives a 32-bit output value, and observes an external input change. pwm configures channel 0 with a 50% duty cycle (publishes pulse-edge events on its background tick thread). adc_ctrl configures both ADC channels with high/low thresholds, sees an under-low and an over-high sample on each channel, and publishes a wakeup event. i2c0 runs a four-byte host-mode write. spi_host0 reads the JEDEC ID from spi_device (configured in flash mode). usbdev configures three endpoints and processes a SETUP, IN, and OUT packet from the host stub.

Phase 1.7 – CSR access through the RAL by symbolic name. This phase exercises the full bus \(\rightarrow \) RAL \(\rightarrow \) scoreboard observation path. The test resolves register addresses through the per-IP RalActor (env.ral_uart0.addr_of("CTRL"), env.ral_aes.addr_of("TRIGGER"), etc.) and calls ibex.write_reg() / ibex.read_reg() to drive the access through the xbar. The xbar publishes a synthetic TlulMonPkt_s on every forward; each per-IP RalActor (subscribed to the xbar) filters by its block-base offset and re-publishes a symbolic RalEvent_s with the resolved register name. The chip scoreboard subscribes to every RalEvent_s and counts accesses by symbolic name. The phase issues twelve writes and seven reads — nineteen accesses in all — spanning UART, GPIO, I2C, SPI host, AES, KMAC, HMAC, pwrmgr, lc_ctrl, keymgr, entropy_src, csrng, OTBN, flash_ctrl, rom_ctrl, and otp_ctrl, and the scoreboard reports each by symbolic register name (twelve distinct names, since names such as CTRL, CMD, and STATUS recur across IPs).
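
The observation half of this path, from monitor packet back to symbolic name, can be sketched in Python as follows. The RalObserver class, the 64 KiB block size, the register offsets, and the (address, is_write) packet shape are assumptions for the example, not the framework's types.

```python
from collections import Counter


class RalObserver:
    """Filters bus packets by block window and resolves register names."""

    def __init__(self, base, size, offsets):
        self.base, self.size = base, size
        self.name_at = {off: name for name, off in offsets.items()}

    def resolve(self, addr):
        """Return the register name, or None if the packet is not ours."""
        off = addr - self.base
        if not 0 <= off < self.size:
            return None
        return self.name_at.get(off)


# Offsets are placeholders for illustration, not OpenTitan's register map.
uart0 = RalObserver(0x4000_0000, 0x1_0000, {"CTRL": 0x10, "STATUS": 0x14})
aes = RalObserver(0x4110_0000, 0x1_0000, {"TRIGGER": 0x80, "STATUS": 0x84})

writes, reads = Counter(), Counter()
packets = [  # (address, is_write) pairs as a bus monitor might report them
    (0x4000_0010, True),
    (0x4110_0080, True),
    (0x4000_0014, False),
    (0x4110_0084, False),
]
for addr, is_write in packets:
    for block in (uart0, aes):
        name = block.resolve(addr)
        if name is not None:
            (writes if is_write else reads)[name] += 1
```

Because the counters are keyed by name alone, the two STATUS reads from different IPs fold into one entry, which is why the report lists fewer distinct names than accesses.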

Phase 2 – Crypto and entropy. The test publishes 100 EntropyNoiseSample_s structs into entropy_src. The actor passes them through the SP800-90B health tests (no failures) and emits one EntropySeed_s. The test then asks csrng to instantiate (which consumes the seed), and asks edn0 for random data (which causes csrng to GENERATE four 128-bit words, i.e. sixteen 32-bit words; edn0 then unpacks them and routes four 32-bit words back to the consumer). The test then issues one AES encrypt, one KMAC digest, one HMAC digest, and one keymgr generate. All complete; the chip scoreboard counts each.
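
The word counts in this phase can be checked with a small Python sketch. The helper name unpack_words, the least-significant-word-first ordering, the choice of which four words are delivered, and the sample 128-bit value are assumptions; the text fixes only the sizes.

```python
def unpack_words(words128):
    """Split 128-bit integers into 32-bit words, least-significant word first."""
    out = []
    for word in words128:
        for i in range(4):
            out.append((word >> (32 * i)) & 0xFFFF_FFFF)
    return out


noise_bits = 100 * 4               # 100 samples of 4 bits each
generated = [0x00112233_44556677_8899AABB_CCDDEEFF] * 4  # four 128-bit words
words32 = unpack_words(generated)  # sixteen 32-bit words
delivered = words32[:4]            # four words routed to the consumer
```

The 400 raw noise bits exceed the 384 needed for one seed, and four 128-bit words unpack to sixteen 32-bit words, of which four reach the consumer.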

Phase 2.5 – OTBN program. The test loads eight instruction words into otbn’s IMEM and issues an OtbnExecReq_s. The actor transitions IDLE \(\rightarrow \) BUSY \(\rightarrow \) IDLE (publishing OtbnStateChange_s on each transition), simulates execution latency, raises an EXEC_DONE interrupt to the CPU, and publishes OtbnExecDone_s for the chip scoreboard.

Phase 3 – Lockstep glitch. After four clean lockstep steps, the test injects a glitch on ibex.core_a via ibex.inject_glitch(0x10), which XORs the next PC. The next step(pc, instr) call drives both cores; CoreA executes at pc ^ 0x10, CoreB at pc. The comparator sees the PC mismatch, publishes LockstepMismatch_s and AlertEvent_s. alert_handler’s CLASS_A FSM kicks off escalation. The four phases fire over \(\sim \)400 ns: NMI in phase 0 (handled by nmi_handler), ESC_LC_SCRAP in phase 1 (which lc_ctrl sees and forces its state to SCRAP; keymgr sees and forces DISABLED), system reset in phase 2 (which rstmgr broadcasts; the supervisor stops and restarts the supervised set), and chip reset in phase 3.

Phase 4 – Watchdog bite. The test configures the AON timer’s watchdog (bark threshold 50, bite threshold 100) and then goes idle. The watchdog ticks on the AON clock (200 kHz); at 50 ticks bark fires; at 100 ticks bite fires. Bite publishes a ResetReq_s with kind=RST_SYSTEM, which rstmgr consumes; the chip scoreboard observes the bite-to-reset causality.
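
The bark and bite thresholds reduce to a tick counter with two comparison points. The Python sketch below uses a hypothetical Watchdog class and computes idealized event times from the 200 kHz clock (5 µs per tick); the simulation's own timestamps depend on its time scale and need not match these figures.

```python
AON_HZ = 200_000
TICK_US = 1_000_000 / AON_HZ       # 5.0 microseconds per AON tick


class Watchdog:
    def __init__(self, bark: int, bite: int):
        self.bark, self.bite, self.count = bark, bite, 0
        self.events = []

    def tick(self) -> None:
        self.count += 1
        if self.count == self.bark:
            self.events.append(("bark", self.count * TICK_US))
        if self.count == self.bite:
            self.events.append(("bite", self.count * TICK_US))
            # Bite is the point at which a system reset is requested.
            self.events.append(("ResetReq", "RST_SYSTEM"))


wdt = Watchdog(bark=50, bite=100)
for _ in range(100):               # the test stays idle, so nothing resets the count
    wdt.tick()
```

With these thresholds the bark lands at 250 µs and the bite at 500 µs of idealized AON time, and the reset request follows the bite immediately.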

Sample run output

Running make -C model earlgrey from appC_earlgrey/ produces output ending with the chip-level report (abridged below; the per-IP scoreboard summary lines are trimmed to their salient fields and the per-register list is grouped for space):

[100]     === Phase 1: BOOT FLOW ===
[850]     Boot phase done. Ibex begins fetching...
[1170]    === Phase 1.5: PERIPHERAL TRAFFIC ===
[6170]    === Phase 1.7: RAL CSR ACCESS BY SYMBOLIC NAME ===
[8360]    === Phase 2: CRYPTO + ENTROPY ===
[10810] === Phase 2.5: OTBN PROGRAM ===
[12310] === Phase 3: LOCKSTEP GLITCH ===
[12390] Injecting glitch on Ibex CoreA...
[12390] AlertHandlerScb: action class=0 phase=0 action=1
[12390] AlertHandlerScb: action class=0 phase=1 action=2
[12390] AlertHandlerScb: result nmi.handler           "NMI dispatched to CPU"
[12390] AlertHandlerScb: result lc_scrap.handler "Lifecycle controller ... SCRAP"
[12590] AlertHandlerScb: action class=0 phase=2 action=4
[12590] AlertHandlerScb: result reset.action          "system reset asserted"
[12790] AlertHandlerScb: action class=0 phase=3 action=5
[12790] AlertHandlerScb: result reset.action          "chip reset asserted"
[12890] === Phase 4: WATCHDOG BITE ===


==== Earlgrey chip-level report ====
ChipScoreboard:
  bus_txns          = 19
  alerts            = 1
  irqs              = 4
  aon_bite          = 2 (bite-to-reset causality observed = 1)
  system_resets     = 3
  chip_resets       = 1
  alert-to-reset causality observed = 1
  RAL accesses      = 12 writes / 7 reads (across 12 distinct registers)
  per-register (symbolic names from the auto-generated RAL):
    CTRL : 3 writes        CMD : 3 writes    STATUS : 4 reads
    TRIGGER, CONTROL, CTRL_CFG_REGWEN, DIRECT_OUT, FIFO_CTRL,
    MODULE_ENABLE : 1 write each;      DATA_IN, FATAL_ALERT_CAUSE,
    OP_STATUS : 1 read each
RvCoreIbex: reads=7 writes=12 lockstep_mismatches=1
Per-IP summaries:
  pwrmgr.state              = EG_PWR_ACTIVE
  lc_ctrl.state             = EG_LC_SCRAP         (forced by alert escalation)
  keymgr.state              = KEYMGR_DISABLED    (forced by alert escalation)
  aes / kmac / hmac         = 1 op each
  entropy_src                = 1 seed (health-fail = 0)
  csrng                     = 2 ops
  otbn programs/faults       = 1 / 0
  pinmux routes             = 4 configured
  ActorRegistry size        = 39
AonTimerScb: bark=1 bite=1 reset_reqs=1
AlertHandlerScb: phases=5 actions=4 results=4 (class 0: phases 0..3 each once)

The two “causality observed = 1” lines are the technical claim of the chapter, made empirical: across independent IPs, with no shared state and no central coordinator, the cross-IP cause-and-effect was observed by a passive scoreboard wired with one `WIRE per observed type. lc_ctrl reaching SCRAP and keymgr reaching DISABLED are the security-path consequences of the same chain.
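
A causality check of this kind reduces to pairing timestamps within a tolerance. The Python sketch below uses the alert and reset times from the run output; the function name causality_observed and the tolerance of 500 time units are assumptions, since the scoreboard's actual tolerance is not stated.

```python
def causality_observed(causes, effects, tolerance):
    """Count causes that are followed by an effect within `tolerance` time units."""
    count = 0
    for t_cause in causes:
        if any(0 <= t_eff - t_cause <= tolerance for t_eff in effects):
            count += 1
    return count


alert_times = [12390]              # lockstep alert, from the run output
reset_times = [12590, 12790]       # system reset and chip reset assertions

observed = causality_observed(alert_times, reset_times, tolerance=500)
too_strict = causality_observed(alert_times, reset_times, tolerance=100)
```

With the assumed tolerance the single alert is matched to a reset, giving a count of one; tightening the tolerance below the 200-unit gap would report none.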

C.9 The Mental-Model Claim, Validated

Chapter 6 argued that hardware verification is a concurrent, distributed problem and that the actor mental model fits the problem more naturally than the OOP class hierarchy UVM imposes. Each of the patterns exercised in this appendix supports that claim point by point:

  • Power island shape mirrors silicon shape. Four IPs in the silicon power island become four actors in the testbench, communicating through the same message types the silicon hardware uses (clock-gate requests, reset events, lifecycle transitions). UVM’s class hierarchy forces these into an inheritance tree that does not match the silicon’s lateral coordination.

  • Concurrent escalation chains run in parallel as separate actors. The four CLASS_A..D FSMs in alert_handler are four separate actors. They cannot interfere with each other because they have no shared state. UVM’s chip scoreboard reconstructs the same parallelism from analysis fifos, but at the cost of explicit timing reasoning in the scoreboard.

  • Lockstep CPU pair literally is two actors. Modeling a redundant compute element as two independent actors with a third comparator actor matches the silicon exactly. UVM’s lockstep verification (chip_sw_rv_core_ibex_lockstep_glitch_vseq.sv, 800 lines) owes most of its size to coordinating the comparator with the recovery flow through a virtual sequencer.

  • Cross-IP causality is observable from a passive scoreboard. The chip scoreboard observes alert\(\rightarrow \)reset and bite\(\rightarrow \)reset causality without owning either source — it only wires for the events it tracks. UVM’s equivalent requires deliberate analysis-port instrumentation in every involved component plus reconstruction logic in the scoreboard.

  • Security forcing is a normal message. lc_ctrl transitions to SCRAP and keymgr disables itself by handling the same EscAction_s message any other escalation subscriber would. There is no special UVM-style forcing mechanism, no hierarchical reference, and no test author opting into the behavior.

  • Multi-clock crossings need no special discipline. The AON timer’s actor publishes from its 200 kHz thread; rstmgr’s actor consumes from its system-clock thread. The mailbox between them is the synchronization point. UVM’s CDC-aware monitors and clock-domain-crossing scoreboard discipline are entirely absent because the actor framework does not need them.

  • Compositionality is the integration story. The reset supervisor was written for the AON watchdog bite; the alert handler reuses it without changes. The entropy chain composes into the chip env with five data-path `WIRE edges. Adding a new IP is six files (per the established pattern). UVM integration cost grows with the size of the inherited base class library.

C.10 Pure-Actor Real-RTL DV: No DPI Bridges

OpenTitan’s own Verilator chip-level testbench (hw/top_earlgrey/dv/verilator/chip_sim_tb.sv plus six C-side bridges uartdpi, gpiodpi, spidpi, usbdpi, jtagdpi, dmidpi) takes a particular shape: SystemVerilog instantiates the real chip RTL, then each chip-level pin (UART tx/rx, GPIO p2d/d2p, SPI device sck/csb/sdi/sdo, USB d+/d-, JTAG tck/tms/tdi/tdo) is wired to a tiny bespoke C bridge that translates between bit-level signals and a host-side primitive — a pseudo-terminal for UART, a named pipe for GPIO, a TCP listener for OpenOCD JTAG, a separate pseudo-terminal for SPI. Each bridge is a separate fileset in fusesoc, a separate import "DPI-C" interface in the SV wrapper, and a separate piece of host-side tooling (opentitantool subprocess for UART, openocd for JTAG, cat/tail -f for monitor logs).

Same Result, One Mechanism

The actor framework collapses all six DPI bridges into one substrate: a typed message contract per peripheral plus an actor that watches or drives the corresponding pins. The chip-level DV testbench in appC_earlgrey/dv/chip/chip_actor_tb.sv replaces chip_sim_tb.sv with a SystemVerilog-only top that:

  • Instantiates the same real chip_earlgrey_verilator DUT.

  • Wires cio_uart_tx_d2p_o into a uart_if interface and reuses the model side’s UartActor as the pin-level monitor/driver. The actor publishes UartItem_s for every received byte and accepts inbound UartItem_s to drive cio_uart_rx_p2d_i: one actor, both directions, no separate uartdpi.c + uartdpi.sv.

  • Watches GPIO outputs with an inline edge counter (a one-line stand-in for a GpioMonitor actor) and holds the SPI and USB pins idle for this first cut; adding GPIO, SPI, and USB pin-level monitor actors follows the same pattern as UartActor (small Actor subclass watching pin signals, publishing typed events).

  • Has no import "DPI-C" declarations, no chip_sim_tb.cc entry point, no libelf link dependency, and no host-side IPC primitives (no pty allocation, no FIFO creation, no TCP listener).
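
What a pin-level UART monitor has to do, turning line samples into bytes, can be sketched in Python. The framing is assumed to be 8N1 with the least-significant bit first and exactly one sample per bit time; the function names are hypothetical, and a real monitor would derive bit timing from the configured baud rate rather than receive pre-aligned samples.

```python
def encode_8n1(byte):
    """Start bit (0), eight data bits LSB first, stop bit (1)."""
    return [0] + [(byte >> n) & 1 for n in range(8)] + [1]


def decode_8n1(bits):
    """Decode line samples (one per bit time, idle high) into byte values."""
    out, i = [], 0
    while i + 10 <= len(bits):
        if bits[i] == 1:           # idle: keep scanning for a start bit
            i += 1
            continue
        data = bits[i + 1:i + 9]   # eight data bits, LSB first
        if bits[i + 9] == 1:       # valid stop bit: accept the frame
            out.append(sum(b << n for n, b in enumerate(data)))
            i += 10
        else:                      # framing error: slide by one and resynchronize
            i += 1
    return out


line = [1, 1] + encode_8n1(ord("O")) + [1] + encode_8n1(ord("K"))
received = bytes(decode_8n1(line))
```

Each decoded byte corresponds to one published item on the receive side; driving the opposite pin is the same framing run in reverse.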

What the Single Substrate Buys

  • One mechanism, six obsoleted bridges. A typed message plus an actor subclass replaces six bespoke C IPC bridges. Adding a seventh peripheral (say I2C) is the same pattern: declare the message struct, write a \(\sim \)30-line actor that watches the pins. There is no parallel C-side machinery to set up.

  • Network-transparent observation, for free. Any actor message can be routed across processes or machines through actor_distributed_pkg’s transport bridges (ZMQ, NATS, iceoryx, libfabric). A UartItem_s can be consumed simultaneously by a scoreboard, a coverage actor, a recorder, a developer’s interactive Python REPL, a CI dashboard, and a remote regression server, without any of those consumers caring whether the simulation is local or distributed. DPI bridges are point-to-point; the actor approach is publish-subscribe by construction.

  • Polyglot consumers attach naturally. screen /dev/ttys15 is one specific consumer of UART traffic. With the actor framework, any tool that speaks ZMQ or NATS (Python, Rust, Go, Node, the browser via WebSockets) is a first-class consumer. OpenOCD attaches to a JTAG actor through a small ZMQ adapter the same way it attaches to jtagdpi’s TCP listener — the same code path, now part of the verification framework instead of an out-of-tree C bridge.

  • Headless and portable. No /dev/ttys* allocation, no named-pipe creation, no privileged TCP listener. One process, one log. Runs in CI containers without elevated privileges. Reproduces deterministically on a developer laptop and a regression farm without per-environment plumbing.

  • Faster verification closure. The DV engineer integrating a new peripheral does not have to think about DPI calls, host-side build scripts, pty plumbing, or how to wire opentitantool into the test. They write one actor and add one `WIRE per observed type into the chip scoreboard, and observation, coverage, recording, and external visibility all light up at once. Host-tool integration is no longer an afterthought; it is the same actor framework already verifying the chip.

  • Cross-IP causality stays in one log. chip_sim_tb’s six DPI bridges write to six separate destinations (UART pty, GPIO pipe, SPI pty, USB log file, JTAG TCP socket, dmidpi TCP socket). Reconstructing causality across them requires post-processing. The actor framework carries every observed event through the same `WIRE topology to one chip scoreboard: the alert → reset → UART-byte chain is a single timeline, observable in real time as the simulation runs.

  • Build complexity collapses. chip_sim.vc (the file list for chip_sim_tb) has 671 SystemVerilog files plus 81 .vlt waivers plus 8 C++ files plus 15 C files (the DPI bridges and their helpers), pulled in via 12 fusesoc cores, linked against libelf for ELF parsing in verilator_memutil. chip.vc (the file list chip_actor_tb uses) has 658 SystemVerilog files plus 81 .vlt waivers, no C/C++ files, no library dependencies. The build does not need a C compiler tuned for DPI, does not need OS-specific pty allocation, and does not need libelf installed.
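The publish-subscribe and single-timeline points above can be illustrated with a minimal model. This is a Python sketch under stated assumptions, not the framework's implementation: Bus, wire(), and publish() are hypothetical stand-ins for the `WIRE composition mechanism, the three event types are invented for the example, and delivery is synchronous and in-process, with no mailboxes or transport bridges.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


# Hypothetical typed messages; field names are illustrative only.
@dataclass(frozen=True)
class AlertEvent:
    time: int
    name: str


@dataclass(frozen=True)
class ResetEvent:
    time: int
    cause: str


@dataclass(frozen=True)
class UartByte:
    time: int
    data: int


class Bus:
    """Routes each published message to every handler wired to its type."""

    def __init__(self) -> None:
        self._subs: DefaultDict[type, List[Callable]] = defaultdict(list)

    def wire(self, msg_type: type, handler: Callable) -> None:
        self._subs[msg_type].append(handler)

    def publish(self, msg: object) -> None:
        for handler in self._subs[type(msg)]:
            handler(msg)


bus = Bus()
timeline: List[object] = []   # chip scoreboard: one ordered log for all types
uart_bytes: List[int] = []    # a second, independent consumer of UART traffic

for msg_type in (AlertEvent, ResetEvent, UartByte):
    bus.wire(msg_type, timeline.append)
bus.wire(UartByte, lambda m: uart_bytes.append(m.data))

# Three different sources publish; the scoreboard sees one causal chain.
bus.publish(AlertEvent(100, "fatal_fault"))
bus.publish(ResetEvent(140, "escalation"))
bus.publish(UartByte(900, 0x4F))
```

Adding a further consumer is one more wire() call and leaves the producers untouched. That is the property the bullets contrast with point-to-point DPI bridges, each of which writes to its own destination.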

What the DPI Bridges Looked Like, and What Replaced Them
DPI bridge (hw/dv/dpi/)  Files                                                Actor replacement
uartdpi                  .sv + .c (~300 lines)                                UartActor on uart_if
gpiodpi                  .sv + .c (~260 lines)                                GpioMonitor actor on pin transitions
spidpi                   .sv + .c (~500 lines)                                SpiHostActor or SpiDeviceActor
usbdpi                   .sv + 8 .c files (~3,370 lines)                      UsbActor (USB packet generator/monitor)
jtagdpi                  .sv + .c (~200 lines)                                JtagActor + optional ZMQ bridge for OpenOCD
dmidpi                   .sv + .c (~490 lines)                                DmiActor + optional ZMQ bridge
Plus C glue              verilator_memutil, verilator_sim_ctrl, dpi_memutil   Actor::run() + ActorRegistry
                         libelf for ELF parsing                               not needed; framework loads ROM via a dedicated RomLoaderActor

The Generalization Beyond OpenTitan

The DPI-bridge anti-pattern is not unique to OpenTitan; it is what most chip-level Verilator testbenches end up with. Every project’s verification team writes its own variants of uartdpi, spidpi, and jtagdpi, then writes glue scripts to connect them to host tools, then writes parallel UVM agents to model the same peripherals from the bus side, then writes regression infrastructure that spawns subprocesses to feed the DPI bridges. The actor framework absorbs all of those layers into one substrate: typed messages, single-mailbox actors, `WIRE edges, and an optional distributed transport for cross-process or cross-machine fan-out. For a verification team adopting the actor framework, peripheral integration becomes a one-shot exercise rather than an ongoing maintenance burden split across SystemVerilog, C, and host-side scripts: write the actor once, and every consumer (scoreboard, coverage, recorder, OpenOCD, Slack bot, regression dashboard) attaches with one `WIRE per observed type or a transport bridge.

C.11 Line-Count Comparison

The example in this appendix verifies the same twenty-eight Earl Grey IPs and the same chip-level integration that OpenTitan’s UVM environment does. The line counts below are not a measurement of feature parity — OpenTitan’s UVM ships a far broader test library, randomized stimulus generators, and coverage closure tooling that the example does not yet match — but they do measure the cost of writing the testbench architecture (per-IP environments, chip-level integration, bus and protocol infrastructure, register abstraction layer) that is in scope for both. Counts are wc -l totals over OpenTitan’s release-tagged trees and, for the actor framework, the appC_earlgrey/ directory plus the shared actor_pkg/*.sv packages and the Python RAL generator (appC_earlgrey/tools/reggen_actor.py) it reuses.

                                                             UVM (OpenTitan)  Actor framework
Per-IP DV environments (28 IPs)                              144,135          6,826
  of which auto-generated RAL                                -                2,387
Chip-level / SoC integration                                 19,438           1,631
Shared DV libraries / verification framework                 41,016           1,316
Common bus and protocol packages (TL-UL, IRQ, alert, reset)  -                974
RAL generator tool (Python)                                  -                283
Total                                                        204,589          11,030

The shared OpenTitan DV libraries figure (hw/dv/sv/: 41,016 lines) is shown alongside the actor framework’s own packages (actor_pkg.sv, actor_supervision_pkg.sv, actor_lifecycle_pkg.sv, actor_observability_pkg.sv, actor_persistence_pkg.sv, actor_ral_pkg.sv: 1,316 lines combined). Both are reusable substrates; both are counted because both are needed to build the chip-level test. The 31× ratio in this row reflects exactly what Chapter 6 argued: a small set of bus-aware actor primitives plus a single `WIRE-based composition mechanism is enough where UVM’s class hierarchy needed driver/monitor/agent/env/sequencer/sequence/scoreboard inheritance towers per IP.

The per-IP DV ratio (~21×) reflects the same architectural collapse repeated twenty-eight times: each IP’s actor environment is a typed-message contract, an actor that handles one message at a time, a scoreboard that subscribes via `WIRE, a coverage actor that subscribes via `WIRE, and an env that wires the typed edges together. The UVM equivalents replicate the agent/sequencer/sequence/driver/monitor/scoreboard/env/test pattern in every IP, with the size dominated by sequence libraries and scoreboard reconstruction logic.
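The totals and the two ratios quoted above follow directly from the table figures. The short Python check below recomputes them; the dictionary keys are labels chosen for this sketch, and the numbers are copied from the table rather than measured again.

```python
# Line counts copied from the comparison table (wc -l totals).
uvm = {
    "per_ip": 144_135,   # per-IP DV environments (28 IPs)
    "chip": 19_438,      # chip-level / SoC integration
    "shared": 41_016,    # hw/dv/sv/ shared DV libraries
}
actor = {
    "per_ip": 6_826,         # includes the 2,387 auto-generated RAL lines
    "chip": 1_631,
    "shared": 1_316,         # actor framework packages
    "bus_protocol": 974,     # TL-UL, IRQ, alert, reset packages
    "ral_generator": 283,    # Python RAL generator
}

uvm_total = sum(uvm.values())
actor_total = sum(actor.values())

# Ratios exist only for rows present in both columns.
ratios = {row: uvm[row] / actor[row] for row in uvm}
overall_ratio = uvm_total / actor_total

for row, value in ratios.items():
    print(f"{row}: {value:.1f}x")
print(f"total: {uvm_total} vs {actor_total} ({overall_ratio:.1f}x)")
```

The column sums reproduce the Total row (204,589 and 11,030), the shared-library and per-IP rows round to 31× and 21×, and the overall ratio comes out at about 18.5×.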

The single largest reduction is the RAL itself. OpenTitan’s per-IP UVM RAL files ship as part of each IP’s DV directory and are bulk-generated by reggen from the same Hjson register descriptions our reggen_actor.py reads. Both tools are auto-generators; the difference is what they emit. UVM’s generated RAL is the four-copy mirrored-state model (per Section 6.9). The actor RAL is a single immutable contract plus a backdoor pass-through, and the generated SystemVerilog reflects that: 2,387 lines for the twenty-six register-bearing IPs combined, compared to OpenTitan’s UVM RAL files, which are typically 1–3 thousand lines per IP.

These numbers are the empirical version of the chapter’s qualitative claims. Every architectural decision Chapter 6 advocated for (shape-matched topology, single-mailbox actors, declarative typed `WIRE, supervisor-driven reset, no-shadow RAL) shows up as a line-count reduction at chip scale because the actor approach removes the layers (factory, configuration, sequencer hierarchy, predictor, mirrored state, virtual sequencer) that UVM accreted over twenty years to make a class-tree framework do what a message-passing framework does natively.