TinyML Accelerator + PicoRV32 Visualizer

System architecture, wrapper FSM, PCPI handshake, and PE-level systolic data movement

1) System Architecture

System-level view based on pcpi_tinyml_accel.v. Memory is drawn below the wrapper because the wrapper, not the accelerator, owns the accel_mem_* interface.

clk and resetn fan out to CPU, wrapper, accelerator, and memory model.
Domains: CPU, wrapper, accelerator, memory.

PicoRV32 Core (CPU domain)
  Presents pcpi_valid, pcpi_insn, pcpi_rs1, pcpi_rs2. Stalls while pcpi_wait is high.

pcpi_tinyml_accel Wrapper (wrapper domain)
  Decode + FSM: insn_match, state, resp_valid
  Memory Sequencer: accel_mem_valid, accel_mem_we, accel_mem_addr, accel_mem_wdata
  Base + Index: base_a, base_b, elem_idx
  Operand Buffers: a_flat, b_flat
  Response Path: result_reg -> pcpi_rd, plus pcpi_ready / pcpi_wr

matrix_accel_4x4_q5_10 (accelerator domain)
  Latch + Control: start, busy, done, cycle_count, a_latched, b_latched
  Issue Logic: row_inject_flat, col_inject_flat
  Result Side: c_flat, accel_done, accel_cycle_count

System Memory (memory domain)
  Wrapper reads A/B and writes C through the shared memory path.
  A @ base_a
  B @ base_b
  C @ 0x0000_0200

Legend: PCPI handshake, wrapper to accelerator, wrapper to memory, active in current step.
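
The base + index addressing used by the memory sequencer can be sketched in a few lines of Python. This is only an illustrative model, not the logic in pcpi_tinyml_accel.v: the one-element-per-32-bit-word stride, the one-request-per-element ordering, and the helper names element_addr and transaction_requests are all assumptions. Only base_a, base_b, elem_idx, the 4x4 size, and the C address 0x0000_0200 come from the diagram.

```python
# Hypothetical model of the wrapper's base + index address generation.
N = 4                    # matrix dimension, from the name matrix_accel_4x4_q5_10
ELEM_COUNT = N * N       # 16 elements per matrix
WORD_BYTES = 4           # ASSUMPTION: one element per 32-bit word
C_BASE = 0x0000_0200     # fixed C location shown in the diagram


def element_addr(base, elem_idx, stride=WORD_BYTES):
    """Return the byte address of element elem_idx relative to base."""
    if not 0 <= elem_idx < ELEM_COUNT:
        raise ValueError("elem_idx out of range")
    return (base + elem_idx * stride) & 0xFFFF_FFFF  # wrap to 32 bits


def transaction_requests(base_a, base_b):
    """Yield (phase, accel_mem_we, accel_mem_addr) in the assumed issue order."""
    phases = (
        ("LOAD_A", base_a, 0),   # reads of A
        ("LOAD_B", base_b, 0),   # reads of B
        ("STORE_C", C_BASE, 1),  # writes of C
    )
    for phase, base, we in phases:
        for elem_idx in range(ELEM_COUNT):
            yield phase, we, element_addr(base, elem_idx)


if __name__ == "__main__":
    for phase, we, addr in transaction_requests(0x100, 0x140):
        print(f"{phase:8s} we={we} addr=0x{addr:08x}")
```

Under these assumptions one full transaction issues 48 memory requests: 16 reads for A, 16 reads for B, and 16 writes for C.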

2) Systolic Array View

Top-right panel. Each active PE shows entry and exit values for x and y.

For an active PE(r,c) at reduction index k, the current input pair is x_in = A[r][k], y_in = B[k][c]. Those values are forwarded unchanged as x_out and y_out while the PE accumulates the scaled product into z_acc.
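
A minimal behavioral sketch of one PE, in Python, may help make the data movement concrete. It assumes that Q5.10 means 10 fractional bits and that "scaled" means each product is shifted right by 10 before accumulation; neither detail is confirmed by the visualizer. The names PE, to_q, from_q, and matmul_via_pes are hypothetical, and the model ignores the skewed injection timing, saturation, and bit-width limits of the real array.

```python
FRAC_BITS = 10  # ASSUMPTION: Q5.10 = sign + 5 integer bits + 10 fractional bits


def to_q(value):
    """Convert a float to a Q5.10 integer (no saturation modeled)."""
    return int(round(value * (1 << FRAC_BITS)))


def from_q(q):
    """Convert a Q5.10 integer back to a float."""
    return q / (1 << FRAC_BITS)


class PE:
    """One processing element: forwards x and y, accumulates into z_acc."""

    def __init__(self):
        self.z_acc = 0

    def step(self, x_in, y_in):
        # ASSUMPTION: the product is rescaled before it is accumulated.
        self.z_acc += (x_in * y_in) >> FRAC_BITS
        return x_in, y_in  # x_out, y_out pass through unchanged


def matmul_via_pes(a, b):
    """Feed PE(r,c) the pairs A[r][k], B[k][c] for each k (timing ignored)."""
    n = len(a)
    pes = [[PE() for _ in range(n)] for _ in range(n)]
    for k in range(n):
        for r in range(n):
            for c in range(n):
                pes[r][c].step(a[r][k], b[k][c])
    return [[pes[r][c].z_acc for c in range(n)] for r in range(n)]


if __name__ == "__main__":
    pe = PE()
    pe.step(to_q(1.5), to_q(2.0))
    print(from_q(pe.z_acc))  # 3.0
```

After all four k steps, z_acc in PE(r,c) holds the fixed-point value of C[r][c] in this model.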

3) State Transitions

Bottom-left panel. Includes one-step reverse navigation.

Flow: IDLE -> LOAD_A -> LOAD_B -> KICK -> WAIT_ACC (10 cycles) -> STORE_C -> RESP -> IDLE
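
The flow above can be sketched as a small Python state machine. The state order and the 10-cycle WAIT_ACC come from the flow line; the transition conditions (pcpi_valid with insn_match to leave IDLE, a last-element flag to leave the load and store states, accel_done to leave WAIT_ACC), the one-cycle-per-element memory timing, and the mapping of states to pcpi_wait / pcpi_ready / pcpi_wr are assumptions made for illustration, not a transcription of the RTL.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    LOAD_A = 1
    LOAD_B = 2
    KICK = 3
    WAIT_ACC = 4
    STORE_C = 5
    RESP = 6


# States that step through elem_idx, and where each goes after the last element.
_AFTER_LAST = {State.LOAD_A: State.LOAD_B, State.LOAD_B: State.KICK,
               State.STORE_C: State.RESP}


def next_state(state, pcpi_valid=False, insn_match=False,
               last_elem=False, accel_done=False):
    """Return the next wrapper state under the assumed conditions."""
    if state is State.IDLE:
        return State.LOAD_A if (pcpi_valid and insn_match) else State.IDLE
    if state in _AFTER_LAST:
        return _AFTER_LAST[state] if last_elem else state
    if state is State.KICK:
        return State.WAIT_ACC
    if state is State.WAIT_ACC:
        return State.STORE_C if accel_done else State.WAIT_ACC
    return State.IDLE  # RESP always returns to IDLE


def pcpi_outputs(state):
    """ASSUMPTION: wait while working, ready and wr only in RESP."""
    busy = state not in (State.IDLE, State.RESP)
    ready = state is State.RESP
    return {"pcpi_wait": int(busy), "pcpi_ready": int(ready), "pcpi_wr": int(ready)}


def walk(elems=16, acc_cycles=10):
    """List the states visited in one transaction, one entry per modeled cycle."""
    state, idx, waited, trace = State.IDLE, 0, 0, []
    while True:
        trace.append(state)
        if state in _AFTER_LAST:
            last = idx == elems - 1
            idx = 0 if last else idx + 1
            nxt = next_state(state, last_elem=last)
        elif state is State.WAIT_ACC:
            waited += 1
            nxt = next_state(state, accel_done=(waited == acc_cycles))
        else:
            nxt = next_state(state, pcpi_valid=True, insn_match=True)
        if state is State.RESP:
            return trace
        state = nxt


if __name__ == "__main__":
    for s in walk():
        print(s.name, pcpi_outputs(s))
```

With an always-ready memory, this model spends 16 cycles in each of LOAD_A, LOAD_B, and STORE_C, and 10 cycles in WAIT_ACC.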

Signal                Value
State                 IDLE
Micro-cycle           0
WAIT_ACC cycle        0 / 10
insn_match            0
pcpi_valid            0
pcpi_wait             0
pcpi_ready            0
pcpi_wr               0
pcpi_rd               -
resp_valid            0
pcpi_rs1 / base_a     -
pcpi_rs2 / base_b     -
elem_idx              -
a_flat status         empty
b_flat status         empty
accel_mem_valid       0
accel_mem_we          0
accel_mem_addr        -
accel_mem_wdata       -
accel_mem_rdata       -
accel_mem_ready       0
accel_busy            0
accel_done            0
accel_cycle_count     0

Current State Meaning

S_IDLE

The wrapper is idle. No memory request is active and the CPU is not stalled.

Wrapper FSM

  1. S_IDLE
  2. S_LOAD_A
  3. S_LOAD_B
  4. S_KICK
  5. S_WAIT_ACC
  6. S_STORE_C
  7. S_RESP

4) Event Log

Bottom-right panel. Readable trace of the transaction sequence.