Overview
The DOOM Neuron project uses a distributed architecture that separates the training system (running DOOM and PyTorch models) from the CL1 biological neural hardware. This separation allows computationally intensive game rendering and model training to run on a powerful training server while the delicate neural interface operates on dedicated CL1 hardware.
Architecture Components
Training System
VizDoom Environment (training_server.py:1043-1380)
The VizDoomEnv class wraps the VizDoom game engine and provides:
- Game state extraction: Processes game variables (health, ammo, position, velocity)
- Enemy tracking: Tracks up to 5 enemies with position, velocity, and facing direction
- Visual observation: Optional CNN input with configurable downsampling
- Reward shaping: Computes rewards for kills, damage taken, armor pickup, etc.
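The reward-shaping idea above can be sketched as a delta-based function over game variables. This is a hedged illustration, not the actual `VizDoomEnv` code: the variable names, weights, and dict interface are all assumptions.

```python
# Hypothetical sketch of delta-based reward shaping, assuming game
# variables are exposed as a dict. The weights and names are
# illustrative, not the real training_server.py values.
def shape_reward(prev: dict, curr: dict) -> float:
    reward = 0.0
    reward += 10.0 * (curr["kills"] - prev["kills"])         # reward new kills
    reward -= 0.5 * max(0, prev["health"] - curr["health"])  # penalize damage taken
    reward += 0.1 * max(0, curr["armor"] - prev["armor"])    # reward armor pickups
    return reward
```

Comparing consecutive states rather than raw values keeps the reward signal centered on what changed during the tick.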
PPO Policy Network (training_server.py:721-1037)
The PPOPolicy class implements the complete encoder-decoder architecture:
PPOPolicy Components
Encoder Network (EncoderNetwork)
- Converts game observations to stimulation parameters (frequency and amplitude)
- Optional CNN for visual processing (64 base channels by default)
- Trainable Beta distributions for frequency/amplitude sampling
- Outputs for 8 channel sets (encoding, movement, turning, attack)
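The Beta-distribution sampling can be illustrated in plain Python: draw from a Beta on (0, 1) and rescale into the stimulation ranges documented later in this page. The `(alpha, beta)` shape values here are placeholders; in the real encoder they are trainable network outputs.

```python
import random

# Illustrative sketch: sample stimulation parameters from Beta
# distributions and scale into the documented ranges. The (alpha,
# beta) values are placeholders for what the encoder learns.
FREQ_RANGE = (4.0, 40.0)   # Hz
AMP_RANGE = (1.0, 2.5)     # μA

def sample_param(alpha: float, beta: float, lo: float, hi: float) -> float:
    """Draw u ~ Beta(alpha, beta) on (0, 1) and rescale to [lo, hi]."""
    u = random.betavariate(alpha, beta)
    return lo + u * (hi - lo)

freq = sample_param(2.0, 2.0, *FREQ_RANGE)
amp = sample_param(2.0, 2.0, *AMP_RANGE)
```

A Beta distribution is a natural choice here because its support is bounded, so sampled frequencies and amplitudes can never leave the hardware-safe range.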
Decoder Network (DecoderNetwork)
- Converts spike counts to action logits
- Linear readout heads with optional non-negative weight constraints
- Single joint action head for combinatorial action space (54 discrete actions)
- Minimal parameters to ensure biological neurons control behavior
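A linear readout with a non-negative weight constraint can be sketched as follows. This is a plain-Python illustration of the idea, not the actual PyTorch `DecoderNetwork`; clamping weights at zero means a channel group can only excite an action logit, never suppress it.

```python
# Minimal sketch of a linear readout from spike counts to action
# logits with non-negative weights. Names are illustrative; the
# real DecoderNetwork is a PyTorch module.
def decode(spike_counts, weights, bias):
    """logits[j] = sum_i max(w[j][i], 0) * spikes[i] + b[j]"""
    logits = []
    for w_row, b in zip(weights, bias):
        z = sum(max(w, 0.0) * s for w, s in zip(w_row, spike_counts))
        logits.append(z + b)
    return logits
```

Keeping the readout this small is deliberate: with so few trainable parameters, most of the behavioral complexity must come from the biological neurons themselves.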
Value Network (ValueNetwork)
- Estimates state value for the PPO critic
- 2-layer MLP with SiLU activations
- Hidden size: 128 units (configurable)
Channel Organization
The system uses 8 channel groups (59 usable channels out of 64 available). Channels 0, 4, 7, 56, and 63 are reserved by the CL1 hardware and cannot be used for stimulation.
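The reserved-channel arithmetic can be checked directly: 64 hardware channels minus the 5 reserved ones leaves 59 available for stimulation.

```python
# Reserved channels from the text; the remaining 59 of 64 are
# available for stimulation across the 8 channel groups.
RESERVED_CHANNELS = {0, 4, 7, 56, 63}
usable = [ch for ch in range(64) if ch not in RESERVED_CHANNELS]
```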
CL1 Neural Interface
Hardware Loop (cl1_neural_interface.py:292-499)
The CL1 device runs a tight loop at configurable frequency (default 10 Hz):
Stimulation Application (cl1_neural_interface.py:173-218)
Stimulation uses the CL SDK to create biphasic pulses:
The CL1 device caches stimulation designs using an LRU cache (maxsize=2048) to avoid recreating identical StimDesign objects, improving performance.
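The caching pattern can be sketched with `functools.lru_cache`. This is a hedged illustration: `make_design`/`cached_design` are stand-ins, not the CL SDK API, and the placeholder tuple stands in for a real `StimDesign` object. The key point is rounding the amplitude before lookup so near-identical requests hit the same cache entry.

```python
from functools import lru_cache

# Sketch of the stimulation-design cache (illustrative names, not
# the CL SDK API). Amplitude is rounded before lookup so that
# near-identical requests share one cached design.
@lru_cache(maxsize=2048)
def cached_design(channel: int, frequency: float, amplitude: float):
    # Placeholder for creating a real StimDesign object.
    return ("StimDesign", channel, frequency, amplitude)

def design_for(channel: int, frequency: float, amplitude: float):
    return cached_design(channel, frequency, round(amplitude, 2))
```

Because the hardware loop requests the same few designs every tick, the cache turns repeated object construction into a dictionary lookup.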
Spike Collection (cl1_neural_interface.py:219-236)
Spikes are counted per channel group:
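Per-group counting can be sketched as a simple aggregation over spike events. The group names and channel assignments below are assumptions for illustration, not the real layout in `cl1_neural_interface.py`.

```python
# Illustrative spike aggregation: given (channel, timestamp) spike
# events and a channel-group mapping, count spikes per group.
# Group names and channel assignments are assumptions.
def count_spikes(events, groups):
    counts = {name: 0 for name in groups}
    channel_to_group = {ch: name for name, chans in groups.items() for ch in chans}
    for channel, _ts in events:
        name = channel_to_group.get(channel)
        if name is not None:  # ignore spikes on unmapped channels
            counts[name] += 1
    return counts
```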
UDP Communication Protocol
Packet Formats (udp_protocol.py)
Binary Packet Specifications
- Stimulation Command (Training → CL1): 72 bytes
- Spike Data (CL1 → Training): 40 bytes
- Event Metadata (Training → CL1): variable size
- Feedback Command (Training → CL1): 120 bytes
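As an example of working with a fixed-size, little-endian packet, the 40-byte spike packet could be packed with `struct`. The field layout below is a hypothetical illustration that happens to total 40 bytes; the authoritative layout lives in `udp_protocol.py`.

```python
import struct

# Hypothetical 40-byte spike-packet layout (the real field order is
# defined in udp_protocol.py): a uint64 tick timestamp followed by
# eight uint32 per-group spike counts, little-endian throughout.
SPIKE_FMT = "<Q8I"  # 8 + 8*4 = 40 bytes

def pack_spikes(tick_us: int, counts: list[int]) -> bytes:
    return struct.pack(SPIKE_FMT, tick_us, *counts)

def unpack_spikes(payload: bytes):
    tick_us, *counts = struct.unpack(SPIKE_FMT, payload)
    return tick_us, counts
```

Fixed formats like this let both ends parse with a single `struct` call instead of a variable-length protocol parser.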
Port Configuration (Default)
Data Flow
Forward Pass (Observation → Action)
- VizDoom generates game state and screen buffer
- Encoder (EncoderNetwork.sample()) converts observation to stimulation parameters:
  - Frequencies: 4-40 Hz range
  - Amplitudes: 1.0-2.5 μA range
- UDP sends stimulation command to CL1 device
- CL1 applies biphasic stimulation to biological neurons
- Neurons respond with spike patterns
- CL1 counts spikes per channel group and sends via UDP
- Decoder (DecoderNetwork.forward()) converts spike counts to action logits
- Action sampling produces discrete actions (forward/strafe/turn/attack)
- VizDoom executes action and produces next observation
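The steps above can be stitched together in a sketch of one tick. Every function here is a stub standing in for a real component (encoder, UDP link, CL1 device, decoder, environment); none of these names are the actual API, and the greedy argmax replaces the real action sampling for simplicity.

```python
# End-to-end sketch of one forward pass through the pipeline.
# All callables are stand-ins for the real components.
def forward_pass(observation, encoder, send_stim, recv_spikes, decoder, env_step):
    stim = encoder(observation)        # observation -> stimulation parameters
    send_stim(stim)                    # UDP: training server -> CL1
    spikes = recv_spikes()             # UDP: CL1 -> training server
    logits = decoder(spikes)           # spike counts -> action logits
    action = max(range(len(logits)), key=logits.__getitem__)  # greedy for the sketch
    return env_step(action)            # VizDoom executes, returns next observation
```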
Training Loop
The training system collects experience rollouts and performs PPO updates.
Visualization
MJPEG Streaming (mjpeg_server.py)
The MJPEGServer provides real-time visualization of the game:
- Runs in separate process using multiprocessing
- Pre-encodes JPEG once to minimize CPU usage
- Threaded HTTP server supports multiple clients
- Access via http://training-host:12349/doom.mjpeg
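For context on what the stream endpoint sends, MJPEG framing wraps each pre-encoded JPEG in a multipart part with a boundary marker. This is a generic illustration of the format, not code from `mjpeg_server.py`; the boundary name is arbitrary.

```python
# Sketch of MJPEG multipart framing: each JPEG is one part with a
# boundary line, Content-Type, and Content-Length. Boundary name
# is arbitrary; servers declare it in the response headers.
BOUNDARY = b"--frame"

def mjpeg_part(jpeg_bytes: bytes) -> bytes:
    header = (
        BOUNDARY + b"\r\n"
        + b"Content-Type: image/jpeg\r\n"
        + b"Content-Length: " + str(len(jpeg_bytes)).encode("ascii") + b"\r\n\r\n"
    )
    return header + jpeg_bytes + b"\r\n"
```

Encoding the JPEG once and reusing the same bytes for every connected client is what keeps the per-client CPU cost low.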
Configuration
Key Architecture Parameters
Performance Considerations
Optimization Strategies
Stimulation Caching
- LRU cache (2048 entries) for cl.StimDesign and cl.BurstDesign objects
- Cache key: (channel_index, frequency, rounded_amplitude)
- Avoids repeated object creation in the tight loop
- CL1 sockets set to non-blocking mode to prevent stalling
- Training system uses socket timeouts for graceful fallback
- Missing packets default to zero stimulation/spikes
- Fixed-size packets (40-120 bytes) minimize parsing overhead
- Little-endian byte order for x86/ARM compatibility
- Float32 precision sufficient for neural stimulation
Tick Frequency
- Default: 10 Hz (100 ms per tick)
- Configurable via the --tick-frequency flag
- Higher frequencies improve temporal resolution but increase network traffic
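A fixed-rate tick loop of this kind can be sketched as follows; this is a generic pattern, not the actual hardware loop from `cl1_neural_interface.py`, and the function names are illustrative.

```python
import time

# Sketch of a fixed-rate tick loop: do the tick's work, then sleep
# for whatever remains of the period (100 ms at the default 10 Hz).
def run_ticks(tick_fn, frequency_hz: float = 10.0, n_ticks: int = 3):
    period = 1.0 / frequency_hz
    for _ in range(n_ticks):
        start = time.monotonic()
        tick_fn()
        remaining = period - (time.monotonic() - start)
        if remaining > 0:
            time.sleep(remaining)
```

If a tick overruns its budget, the loop simply skips the sleep and starts the next tick immediately rather than stalling.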
Recording and Logging
The CL1 interface automatically records neural data:
DataStream: