RTL Datasets
We offer training data for chip-design AI: a large corpus of VHDL, Verilog, and SystemVerilog, plus RL environments, for labs training models on hardware design.
- 60,000+ projects
- 5 million+ files
- 100 billion+ tokens
The datasets are proprietary and unique, and have not previously been used for model training. Exclusive and non-exclusive licenses are available. You retain full ownership of all trained models and outputs.
By chip type:
TPU
- Systolic arrays: Weight-stationary, output-stationary, and row-stationary dataflows for matrix multiplication in training and inference silicon
- Tensor processing pipelines: Activation functions, accumulator chains, and configurable NxN matrix multiply units
- AI accelerator SoCs: End-to-end TPU-style designs with on-chip SRAM, DMA engines, and host interfaces
GPU
- Shader cores & SIMD pipelines: Warp schedulers, register files, and execution units for GPGPU compute
- Tensor & matrix cores: Dedicated FP16/INT8 matrix multiply units for deep learning workloads
- 2D/3D graphics engines: Rasterizers, framebuffer controllers, display pipelines, and video processing
CPU
- RISC-V cores: Single-cycle through superscalar out-of-order implementations, privilege modes, and ISA extensions
- ARM Cortex & custom ARM: Pipelined and multi-cycle ARM processors with SoC integration logic
- MIPS architectures: Classic 5-stage pipelines, multi-cycle designs, and full ISA implementations
- Custom ISA processors: 8-bit through 64-bit original CPU designs, ALUs, branch predictors, and cache hierarchies
- VLIW processors: Very Long Instruction Word architectures with multi-issue execution units, instruction bundling, and static scheduling
DPU
- Xilinx DPU inference engines: DPUCZDX8G and variant configurations for CNN inference on Zynq and ZCU platforms
- Data processing units: DAG processing architectures, dual-clock domain processing, and memory pool controllers
- Network-attached processing: SmartNIC-style offload engines for packet parsing, classification, and transformation
MPU
- 8-bit & 16-bit microprocessors: 8051, 6502, PIC, and AVR-compatible implementations in Verilog and VHDL
- 32-bit microcontrollers: ARM Cortex-M0-based SoCs and embedded processor cores, plus 16-bit openMSP430 designs
- Microcomputer systems: Complete microprocessor designs with peripherals, memory controllers, and bus interfaces
NPU
- CNN accelerators: Convolutional neural network engines with configurable kernel sizes, pooling, and activation layers
- Neural network processors: Spiking neural networks, MLP accelerators, and on-chip inference engines
- Edge ML hardware: Low-power neural engines for always-on inference, keyword spotting, and image classification
ASIC
- Complete ASIC flows: RTL-to-GDSII designs including synthesis, place-and-route, and signoff for tapeout
- Standard cell & physical design: Cell libraries, floorplanning, power grid synthesis, and timing closure at advanced nodes
- Application-specific designs: Custom cryptographic engines, DSP blocks, communication controllers, and sensor interfaces
MCU
- 8051 & legacy cores: Classic 8051 implementations, turbo variants, and embedded cryptosystem integrations in Verilog and VHDL
- RISC-V & ARM Cortex-M MCUs: Soft microcontroller SoCs with APB/AHB buses, GPIO, UART, SPI, I2C, timers, and interrupt controllers
- Peripheral & IO subsystems: Standalone UART, SPI, I2C, GPIO, watchdog, and PWM timer IP blocks for embedded integration
FPGA
- Xilinx & Intel/Altera platforms: Vivado and Quartus project files, IP integrations, and board-level designs across device families
- FPGA-based accelerators: CNN inference, cryptomining, signal processing, and high-speed data acquisition on reconfigurable fabric
- Prototyping & IP cores: Reusable FPGA IP for Ethernet, USB, HDMI, SDRAM, SPI, I2C, UART, and PCIe interfaces
Waveform & Signal Processing:
Oscilloscopes
- FPGA-based digital oscilloscopes: Complete acquisition systems with ADC interfaces, trigger logic, and VGA/HDMI display on Xilinx Zynq, Artix, and Intel Cyclone platforms
- Mixed-signal scopes: Combined analog and logic analyzer designs with FFT analysis, multi-channel capture, and Ethernet/USB readout
ADC
- SAR ADC: Successive approximation register converters from 8-bit to 16-bit, including open-source tapeout-ready designs targeting Sky130 and Efabless flows
- Delta-Sigma ADC: Oversampling converters with decimation filters for high-resolution audio and sensor applications
- Xilinx XADC integrations: Hard-block analog-to-digital configurations across Arty, Basys, Nexys, and Zybo board families
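To illustrate the successive-approximation principle behind the SAR converters listed above, here is a minimal behavioural sketch in Python. It is written for this page and is not taken from the datasets; the function name `sar_convert` and the ideal, noise-free comparator are assumptions made for the example.

```python
def sar_convert(vin: float, vref: float, bits: int) -> int:
    """Ideal SAR conversion: binary-search the input, MSB first.

    Hypothetical helper for illustration; assumes 0 <= vin and an ideal
    internal DAC and comparator.
    """
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)            # tentatively set this bit
        dac_voltage = trial * vref / (1 << bits)
        if vin >= dac_voltage:               # comparator decision
            code = trial                     # keep the bit
    return code
```

Each loop iteration corresponds to one clock cycle of the SAR register, so an N-bit conversion takes N comparator decisions. For example, `sar_convert(1.0, 4.0, 8)` resolves a quarter-scale input to code 64.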
DAC
- R-2R ladder & Sigma-Delta DAC: Resistor-network and noise-shaped digital-to-analog converters for audio and waveform output
- MASH & modulated DAC: Multi-stage noise-shaping architectures and shift-modulated sigma-delta designs, including TinyTapeout variants
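As a rough picture of how a sigma-delta DAC turns a multi-bit sample into a 1-bit stream, the Python sketch below models a first-order accumulator-and-carry modulator. It is an illustrative model only, not code from the datasets; the name `sigma_delta_bitstream` and the choice of a first-order loop are assumptions (MASH designs cascade several such stages).

```python
def sigma_delta_bitstream(sample: int, bits: int, n: int) -> list[int]:
    """First-order sigma-delta: emit the accumulator carry each cycle.

    Hypothetical helper; assumes 0 <= sample < 2**bits. The density of
    ones in the output approximates sample / 2**bits.
    """
    modulus = 1 << bits
    acc = 0
    out = []
    for _ in range(n):
        acc += sample
        if acc >= modulus:       # carry out of the accumulator
            acc -= modulus
            out.append(1)
        else:
            out.append(0)
    return out
```

Low-pass filtering the bitstream recovers the analog level; a half-scale input such as `sigma_delta_bitstream(128, 8, 4)` alternates between 0 and 1.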
DDS & Waveform Generation
- Direct Digital Synthesizers: CORDIC-based and LUT-based DDS engines with configurable frequency, phase, and amplitude for sine, square, triangle, and arbitrary waveforms
- Arbitrary waveform generators: Function generators with UART/SPI control, multi-channel output, and DDS-FIR filter integration
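The LUT-based DDS approach mentioned above can be sketched as a phase accumulator indexing a sine table. The Python model below is written for this page, not drawn from the datasets; the function names and the 32-bit accumulator, 10-bit table address, and 12-bit amplitude are assumptions chosen for the example.

```python
import math


def make_sine_lut(addr_bits: int, amp_bits: int) -> list[int]:
    """Signed sine table with 2**addr_bits entries (hypothetical helper)."""
    size = 1 << addr_bits
    peak = (1 << (amp_bits - 1)) - 1
    return [round(peak * math.sin(2 * math.pi * i / size)) for i in range(size)]


def tuning_word_for(f_out: float, f_clk: float, acc_bits: int = 32) -> int:
    """Tuning word so that f_out is approximately tuning_word * f_clk / 2**acc_bits."""
    return round(f_out * (1 << acc_bits) / f_clk)


def dds_samples(tuning_word: int, n: int, acc_bits: int = 32,
                addr_bits: int = 10, amp_bits: int = 12) -> list[int]:
    """Generate n samples; the top addr_bits of the phase select the LUT entry."""
    lut = make_sine_lut(addr_bits, amp_bits)
    mask = (1 << acc_bits) - 1
    phase = 0
    out = []
    for _ in range(n):
        out.append(lut[phase >> (acc_bits - addr_bits)])
        phase = (phase + tuning_word) & mask   # wraps like a hardware register
    return out
```

Output frequency is set only by the tuning word, so retuning requires no change to the table. A tuning word of `1 << 30` advances a quarter cycle per clock.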
Simulation & Waveform Tooling
- VCD, FST & FSDB waveform flows: Complete RTL simulation pipelines with Icarus Verilog, Verilator, GHDL, and GTKWave, plus Synopsys Verdi FSDB for waveform capture and analysis
- Testbench waveform generation: Stimulus-from-VCD converters, automated waveform comparison, and CI-integrated simulation templates
For training AI on chip design:
Pre-training
- Massive RTL corpus: Verilog, VHDL, and SystemVerilog across 60,000+ projects for language model pre-training on hardware design
- Full project context: Not just isolated modules, but complete designs with testbenches, constraints, Makefiles, and documentation
- Architecture diversity: Processors, accelerators, interconnects, memory controllers, and peripherals spanning every major chip category
- EDA tool flows: Synthesis scripts, OpenROAD/Yosys/OpenLane configurations, and physical design collateral
Post-training (RL Environments)
- UVM & SystemVerilog verification: Thousands of structured verification environments with scoreboards, monitors, and coverage models that provide automated pass/fail feedback
- Simulation testbenches: Cocotb, Verilator, and Icarus Verilog setups for rapid design-simulate-evaluate reward loops
- Synthesis & timing feedback: EDA tool integration for area, power, and timing metrics as RL reward signals
- Benchmark circuits: Reference designs and known-good implementations for evaluating generated RTL quality
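One way the pass/fail and synthesis feedback above could be combined into a scalar RL reward is sketched below in Python. This is an illustration only: the `EvalResult` fields, the `reward` function, and the 0.5/0.25/0.25 weighting are assumptions for the example, not the scoring used by any environment in the datasets.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical summary of one simulate-and-synthesize run."""
    tests_passed: int
    tests_total: int
    area_um2: float
    worst_slack_ns: float


def reward(result: EvalResult, area_budget_um2: float) -> float:
    """Functional correctness gates the quality terms (assumed weighting)."""
    if result.tests_total <= 0:
        return 0.0
    if result.tests_passed < result.tests_total:
        # Partial credit only; timing and area are ignored until tests pass.
        return 0.5 * result.tests_passed / result.tests_total
    timing_met = 1.0 if result.worst_slack_ns >= 0.0 else 0.0
    if result.area_um2 > 0:
        area_score = min(1.0, area_budget_um2 / result.area_um2)
    else:
        area_score = 0.0
    return 0.5 + 0.25 * timing_met + 0.25 * area_score
```

Gating on correctness first keeps a policy from trading functional failures for smaller area. In practice the inputs would be parsed from simulator and synthesis tool reports.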
Many more architectures are available beyond those listed here. Contact us for details.