Overview
training_server.py is a modified version of ppo_doom.py designed for distributed training. It runs PPO training on a GPU server while communicating with a remote CL1 device over UDP.
Architecture:
- Training System: Runs VizDoom, PyTorch models, and PPO algorithm
- CL1 Device: Runs cl1_neural_interface.py to handle neural hardware
- Communication: UDP protocol for low-latency stimulation/spike exchange
- No direct CL SDK imports (CL SDK only on CL1 device)
- UDP sockets for stimulation commands and spike data
- MJPEG server for remote visualization
- Event metadata logging to CL1 device
source/training_server.py
Command-Line Arguments
Basic Options
Execution mode for the training server. Choices: train, watch
- train: Full training mode with gradient updates
- watch: Observe neural activity without training (inference mode)
Path to checkpoint file for loading pre-trained weights. Example: --checkpoint checkpoints/l5_2048_rand/checkpoint_7900.pt

Maximum number of training episodes before termination
PyTorch device for gradient computation. Choices: cpu, cuda

Neural Interface Options
Ablation mode for diagnostic testing of decoder dependency on spikes. Choices: none, zero, random
- none: Normal operation, use real spike features
- zero: Replace spike features with zeros (tests decoder bias)
- random: Replace spike features with random values (tests decoder robustness)
Enable a CNN encoder over the screen buffer in addition to scalar features. When enabled, the downsampled screen buffer is processed through a convolutional network.
Display & Visualization
Display the VizDoom game window on the training system. Note: For remote viewing, use the MJPEG stream instead.
Directory path on the CL1 device for saving neural recordings. This path is sent to the CL1 device via event metadata; the actual recordings are managed by the CL1 device.
TCP port for the MJPEG visualization stream. Access the live gameplay feed at:
http://<training-host>:<port>/doom.mjpeg

Hardware Loop Configuration
Frequency (Hz) for running the game loop. This should match the --tick-frequency setting on the CL1 device. Controls the rate of:
- Sending stimulation commands to CL1
- Receiving spike data from CL1
- Game state updates
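The three rates above are all driven by one fixed-period loop. A minimal sketch of such a loop (the real loop lives in training_server.py; `run_loop` and its signature are illustrative, not the script's actual API):

```python
import time

def run_loop(tick_hz: float, n_ticks: int, on_tick) -> None:
    """Call on_tick at a fixed rate, sleeping off any leftover time per tick."""
    period = 1.0 / tick_hz
    next_deadline = time.monotonic()
    for _ in range(n_ticks):
        on_tick()  # send stim, receive spikes, step the game
        next_deadline += period
        # Sleep only if ahead of schedule; otherwise catch up immediately.
        remaining = next_deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)

# Example: a 100 Hz loop for 10 ticks (~0.1 s total).
counter = []
run_loop(100.0, 10, lambda: counter.append(1))
```

Missing a deadline is absorbed by skipping the sleep rather than drifting, which keeps the average rate aligned with the CL1 device's --tick-frequency.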
UDP Network Configuration
IP address of the CL1 device running cl1_neural_interface.py. Example: --cl1-host 192.168.1.100

UDP port for sending stimulation commands to the CL1 device. Must match --stim-port on the CL1 device.

UDP port for receiving spike data from the CL1 device. Must match --spike-port on the CL1 device.

UDP port for sending event metadata to the CL1 device. Events include episode completions, checkpoints, and training completion signals. Must match --event-port on the CL1 device.

UDP port for sending feedback stimulation commands to the CL1 device. Feedback includes reward signals and event-based stimulation (kills, damage, etc.). Must match --feedback-port on the CL1 device.

Feedback Configuration
Enable episode-level feedback stimulation. When enabled, applies feedback stimulation at the end of each episode based on total reward.

Disable episode-level feedback stimulation. Convenience flag to turn off episode feedback (sets use_episode_feedback=False).

Scale episode feedback intensity by TD-error surprise magnitude. When enabled, unexpected rewards/penalties receive stronger feedback.

Disable surprise scaling for episode feedback. Uses a fixed feedback intensity regardless of prediction error.
Usage Examples
Basic Training Setup
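A minimal invocation sketch. --cl1-host and --tick-frequency appear in this doc; treat any other flag spelling as an assumption, not the script's actual CLI:

```shell
# On the CL1 device:
python cl1_neural_interface.py --tick-frequency 30

# On the training system:
python training_server.py --cl1-host 192.168.1.100
```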
On CL1 Device:

Custom Port Configuration
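Port numbers must agree pairwise between the two processes. A sketch using the defaults listed under "UDP Packet Flow" (flag names on the training side mirror the CL1 side here, which is an assumption):

```shell
python cl1_neural_interface.py --stim-port 12345 --spike-port 12346 \
    --event-port 12347 --feedback-port 12348

python training_server.py --cl1-host 192.168.1.100 \
    --stim-port 12345 --spike-port 12346 --event-port 12347 --feedback-port 12348
```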
Resume from Checkpoint
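A sketch using the documented --checkpoint example path (other flags as above are assumptions):

```shell
python training_server.py --cl1-host 192.168.1.100 \
    --checkpoint checkpoints/l5_2048_rand/checkpoint_7900.pt
```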
Watch Mode (Inference)
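The mode names train and watch are documented above; how the mode is passed is not, so the --mode flag shown here is an assumption:

```shell
python training_server.py --mode watch --cl1-host 192.168.1.100 \
    --checkpoint checkpoints/l5_2048_rand/checkpoint_7900.pt
```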
Disable Episode Feedback
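The doc describes a convenience flag that sets use_episode_feedback=False; its exact name is not given, so the spelling below is an assumption:

```shell
python training_server.py --cl1-host 192.168.1.100 --no-episode-feedback
```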
Network Communication
UDP Packet Flow
Training System → CL1 Device:
- Stimulation Commands (port 12345): Frequencies and amplitudes for neural stimulation
- Event Metadata (port 12347): Episode completion, checkpoint saves, training status
- Feedback Commands (port 12348): Reward/event-based stimulation

CL1 Device → Training System:
- Spike Data (port 12346): Spike counts per channel group from each hardware tick
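These flows are small fixed-size binary datagrams. A sketch of how such a packet can be packed and unpacked with the struct module (the layout below — a uint32 tick followed by one uint16 count per channel group — is purely illustrative; the real formats are defined in udp_protocol.py):

```python
import struct

N_GROUPS = 8
# Little-endian: one uint32 tick counter, then N_GROUPS uint16 spike counts.
SPIKE_FMT = "<I" + "H" * N_GROUPS

def pack_spikes(tick: int, counts: list[int]) -> bytes:
    """Serialize one tick's spike counts into a UDP payload."""
    return struct.pack(SPIKE_FMT, tick, *counts)

def unpack_spikes(payload: bytes) -> tuple[int, list[int]]:
    """Recover the tick counter and per-group counts from a payload."""
    fields = struct.unpack(SPIKE_FMT, payload)
    return fields[0], list(fields[1:])

pkt = pack_spikes(42, [0, 3, 1, 0, 7, 2, 0, 5])
tick, counts = unpack_spikes(pkt)
```

Fixed-size packing keeps every payload well under one MTU, so a datagram is never fragmented and a lost packet costs exactly one tick of data.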
UDP Protocol Details
See udp_protocol.py for packet format specifications.

Socket Setup
MJPEG Visualization
The training server hosts an MJPEG stream for remote visualization:
http://<training-host>:12349/doom.mjpeg
The stream shows:
- Game screen (if use_screen_buffer=True)
- Player stats overlay
- Episode statistics
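MJPEG streams are plain HTTP responses of type multipart/x-mixed-replace, one JPEG per part. A sketch of the framing, assuming the server uses this standard scheme (the actual server is in training_server.py):

```python
BOUNDARY = b"--frame"

# The response headers sent once, before the first part, advertise the boundary.
STREAM_HEADERS = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: multipart/x-mixed-replace; boundary=frame\r\n\r\n"
)

def mjpeg_part(jpeg_bytes: bytes) -> bytes:
    """Wrap one encoded JPEG frame as a multipart chunk for the stream."""
    header = (
        BOUNDARY + b"\r\n"
        + b"Content-Type: image/jpeg\r\n"
        + b"Content-Length: " + str(len(jpeg_bytes)).encode("ascii") + b"\r\n\r\n"
    )
    return header + jpeg_bytes + b"\r\n"

part = mjpeg_part(b"\xff\xd8jpegdata\xff\xd9")  # placeholder JPEG bytes
```

A browser pointed at the /doom.mjpeg URL replaces the displayed image each time a new part arrives, which is why no client-side code is needed.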
Event Metadata
The training system sends metadata to the CL1 device for recording:

Episode End Event
Training Complete Event
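A sketch of what these two events might look like as JSON datagrams on the event port. The field names below are assumptions for illustration; the real schema is defined by training_server.py and udp_protocol.py:

```python
import json
import socket

EVENT_PORT = 12347  # default event port from this doc

# Hypothetical payloads for the two event types described above.
episode_end = {"event": "episode_end", "episode": 120, "total_reward": 35.5}
training_complete = {"event": "training_complete", "episodes": 5000}

def send_event(sock: socket.socket, host: str, event: dict) -> bytes:
    """Serialize an event dict and send it to the CL1 device's event port."""
    payload = json.dumps(event).encode("utf-8")
    sock.sendto(payload, (host, EVENT_PORT))
    return payload
```

On the CL1 side, these events let the recorder close out files at episode boundaries and shut down cleanly when training completes.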
Performance Considerations
Network Latency
- Target Latency: < 5ms round-trip
- Typical: 1-2ms on local network
- UDP: No retransmission overhead
Tick Frequency Trade-offs
| Frequency | Game Speed | Latency Sensitivity | Compute Load |
|---|---|---|---|
| 10 Hz | Slow | Low | Low |
| 30 Hz | Normal | Medium | Medium |
| 60 Hz | Fast | High | High |
| 120 Hz | Very Fast | Very High | Very High |
Troubleshooting
No Spike Data Received
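A quick diagnostic you can run on the training system (with the training server stopped, so the port is free) to check whether spike datagrams are arriving at all. The helper below is illustrative, not part of the project:

```python
import socket

def spike_port_receiving(port: int = 12346, wait_s: float = 2.0) -> bool:
    """Listen briefly on the spike port; True if any datagram arrives."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    sock.settimeout(wait_s)
    try:
        sock.recvfrom(4096)
        return True
    except socket.timeout:
        return False
    finally:
        sock.close()
```

If this returns False while the CL1 device is streaming, check that --spike-port matches on both sides, that --cl1-host points at the right interface, and that no firewall is dropping UDP.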
High Packet Loss
- Reduce tick frequency
- Check network bandwidth
- Use wired connection instead of WiFi
- Ensure no firewall blocking UDP ports
Latency Issues
- Monitor latency with built-in logging (every 1000 packets)
- Consider switching to a 1 Gbps or 10 Gbps network
- Reduce batch size to decrease compute time
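The every-1000-packets logging mentioned above can be mirrored by a small accumulator like the one below (a sketch; the built-in logging in training_server.py may differ):

```python
class LatencyMonitor:
    """Accumulate round-trip times and report the running average
    every `log_every` packets."""

    def __init__(self, log_every: int = 1000):
        self.log_every = log_every
        self._total = 0.0
        self._count = 0
        self.reports: list[float] = []

    def record(self, rtt_s: float) -> None:
        self._total += rtt_s
        self._count += 1
        if self._count % self.log_every == 0:
            avg_ms = 1000.0 * self._total / self._count
            self.reports.append(avg_ms)
            print(f"avg RTT over {self._count} packets: {avg_ms:.2f} ms")

mon = LatencyMonitor(log_every=1000)
for _ in range(2000):
    mon.record(0.0015)  # 1.5 ms, in the doc's typical 1-2 ms range
```

Averages drifting toward the 5 ms target are an early sign to drop the tick frequency or move to a faster link.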
See Also
- ppo_doom.py - Direct CL1 training script
- cl1_neural_interface.py - CL1 hardware interface server
- udp_protocol.py - UDP packet formats