Overview
training_server.py is a modified version of ppo_doom.py designed for distributed training. It runs PPO training on a GPU server while communicating with a remote CL1 device over UDP.
Architecture:
- Training System: Runs VizDoom, PyTorch models, and PPO algorithm
- CL1 Device: Runs cl1_neural_interface.py to handle neural hardware
- Communication: UDP protocol for low-latency stimulation/spike exchange
- No direct CL SDK imports (CL SDK only on CL1 device)
- UDP sockets for stimulation commands and spike data
- MJPEG server for remote visualization
- Event metadata logging to CL1 device
source/training_server.py
Command-Line Arguments
Basic Options
Execution mode for the training server. Choices: train, watch
- train: Full training mode with gradient updates
- watch: Observe neural activity without training (inference mode)
Path to checkpoint file for loading pre-trained weights. Example: --checkpoint checkpoints/l5_2048_rand/checkpoint_7900.pt

Maximum number of training episodes before termination
PyTorch device for gradient computation. Choices: cpu, cuda

Neural Interface Options
Ablation mode for diagnostic testing of decoder dependency on spikes. Choices: none, zero, random
- none: Normal operation, use real spike features
- zero: Replace spike features with zeros (tests decoder bias)
- random: Replace spike features with random values (tests decoder robustness)
Enable a CNN encoder over the screen buffer in addition to scalar features. When enabled, the downsampled screen buffer is processed through a convolutional network.
Display & Visualization
Display the VizDoom game window on the training system. Note: For remote viewing, use the MJPEG stream instead.
Directory path on the CL1 device for saving neural recordings. This path is sent to the CL1 device via event metadata; the actual recordings are managed by the CL1 device.
TCP port for the MJPEG visualization stream. Access the live gameplay feed at:
http://<training-host>:<port>/doom.mjpeg

Hardware Loop Configuration
Frequency (Hz) for running the game loop. This should match the --tick-frequency setting on the CL1 device. Controls the rate of:
- Sending stimulation commands to CL1
- Receiving spike data from CL1
- Game state updates
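The three rates above are all driven by one fixed-period loop. A minimal sketch of such a loop (the real loop lives in training_server.py; `run_loop` and its signature are illustrative, not the script's actual API):

```python
import time

def run_loop(tick_hz: float, n_ticks: int, on_tick) -> None:
    """Call on_tick at a fixed rate, sleeping off any leftover time per tick."""
    period = 1.0 / tick_hz
    next_deadline = time.monotonic()
    for _ in range(n_ticks):
        on_tick()  # send stim, receive spikes, step the game
        next_deadline += period
        # Sleep only if ahead of schedule; otherwise catch up immediately.
        remaining = next_deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)

# Example: a 100 Hz loop for 10 ticks (~0.1 s total).
counter = []
run_loop(100.0, 10, lambda: counter.append(1))
```

Missing a deadline is absorbed by skipping the sleep rather than drifting, which keeps the average rate aligned with the CL1 device's --tick-frequency.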
UDP Network Configuration
IP address of the CL1 device running cl1_neural_interface.py. Example: --cl1-host 192.168.1.100

UDP port for sending stimulation commands to the CL1 device. Must match --stim-port on the CL1 device.

UDP port for receiving spike data from the CL1 device. Must match --spike-port on the CL1 device.

UDP port for sending event metadata to the CL1 device. Events include episode completions, checkpoints, and training completion signals. Must match --event-port on the CL1 device.

UDP port for sending feedback stimulation commands to the CL1 device. Feedback includes reward signals and event-based stimulation (kills, damage, etc.). Must match --feedback-port on the CL1 device.

Feedback Configuration
Enable episode-level feedback stimulation. When enabled, applies feedback stimulation at the end of each episode based on total reward.

Disable episode-level feedback stimulation. Convenience flag to turn off episode feedback (sets use_episode_feedback=False).

Scale episode feedback intensity by TD-error surprise magnitude. When enabled, unexpected rewards/penalties receive stronger feedback.

Disable surprise scaling for episode feedback. Uses a fixed feedback intensity regardless of prediction error.
Usage Examples
Basic Training Setup
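A minimal invocation sketch. --cl1-host and --tick-frequency appear in this doc; treat any other flag spelling as an assumption, not the script's actual CLI:

```shell
# On the CL1 device:
python cl1_neural_interface.py --tick-frequency 30

# On the training system:
python training_server.py --cl1-host 192.168.1.100
```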
On CL1 Device:

Custom Port Configuration
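Port numbers must agree pairwise between the two processes. A sketch using the defaults listed under "UDP Packet Flow" (flag names on the training side mirror the CL1 side here, which is an assumption):

```shell
python cl1_neural_interface.py --stim-port 12345 --spike-port 12346 \
    --event-port 12347 --feedback-port 12348

python training_server.py --cl1-host 192.168.1.100 \
    --stim-port 12345 --spike-port 12346 --event-port 12347 --feedback-port 12348
```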
Resume from Checkpoint
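A sketch using the documented --checkpoint example path (other flags as above are assumptions):

```shell
python training_server.py --cl1-host 192.168.1.100 \
    --checkpoint checkpoints/l5_2048_rand/checkpoint_7900.pt
```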
Watch Mode (Inference)
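The mode names train and watch are documented above; how the mode is passed is not, so the --mode flag shown here is an assumption:

```shell
python training_server.py --mode watch --cl1-host 192.168.1.100 \
    --checkpoint checkpoints/l5_2048_rand/checkpoint_7900.pt
```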
Disable Episode Feedback
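The doc describes a convenience flag that sets use_episode_feedback=False; its exact name is not given, so the spelling below is an assumption:

```shell
python training_server.py --cl1-host 192.168.1.100 --no-episode-feedback
```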
Network Communication
UDP Packet Flow
Training System → CL1 Device:
- Stimulation Commands (port 12345): Frequencies and amplitudes for neural stimulation
- Event Metadata (port 12347): Episode completion, checkpoint saves, training status
- Feedback Commands (port 12348): Reward/event-based stimulation

CL1 Device → Training System:
- Spike Data (port 12346): Spike counts per channel group from each hardware tick
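These flows are small fixed-size binary datagrams. A sketch of how such a packet can be packed and unpacked with the struct module (the layout below — a uint32 tick followed by one uint16 count per channel group — is purely illustrative; the real formats are defined in udp_protocol.py):

```python
import struct

N_GROUPS = 8
# Little-endian: one uint32 tick counter, then N_GROUPS uint16 spike counts.
SPIKE_FMT = "<I" + "H" * N_GROUPS

def pack_spikes(tick: int, counts: list[int]) -> bytes:
    """Serialize one tick's spike counts into a UDP payload."""
    return struct.pack(SPIKE_FMT, tick, *counts)

def unpack_spikes(payload: bytes) -> tuple[int, list[int]]:
    """Recover the tick counter and per-group counts from a payload."""
    fields = struct.unpack(SPIKE_FMT, payload)
    return fields[0], list(fields[1:])

pkt = pack_spikes(42, [0, 3, 1, 0, 7, 2, 0, 5])
tick, counts = unpack_spikes(pkt)
```

Fixed-size packing keeps every payload well under one MTU, so a datagram is never fragmented and a lost packet costs exactly one tick of data.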
UDP Protocol Details
See udp_protocol.py for packet format specifications.

Socket Setup
MJPEG Visualization
The training server hosts an MJPEG stream for remote visualization:
http://<training-host>:12349/doom.mjpeg
The stream shows:
- Game screen (if use_screen_buffer=True)
- Player stats overlay
- Episode statistics
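MJPEG streams are plain HTTP responses of type multipart/x-mixed-replace, one JPEG per part. A sketch of the framing, assuming the server uses this standard scheme (the actual server is in training_server.py):

```python
BOUNDARY = b"--frame"

# The response headers sent once, before the first part, advertise the boundary.
STREAM_HEADERS = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: multipart/x-mixed-replace; boundary=frame\r\n\r\n"
)

def mjpeg_part(jpeg_bytes: bytes) -> bytes:
    """Wrap one encoded JPEG frame as a multipart chunk for the stream."""
    header = (
        BOUNDARY + b"\r\n"
        + b"Content-Type: image/jpeg\r\n"
        + b"Content-Length: " + str(len(jpeg_bytes)).encode("ascii") + b"\r\n\r\n"
    )
    return header + jpeg_bytes + b"\r\n"

part = mjpeg_part(b"\xff\xd8jpegdata\xff\xd9")  # placeholder JPEG bytes
```

A browser pointed at the /doom.mjpeg URL replaces the displayed image each time a new part arrives, which is why no client-side code is needed.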
Event Metadata
The training system sends metadata to the CL1 device for recording:

Episode End Event
Training Complete Event
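A sketch of what these two events might look like as JSON datagrams on the event port. The field names below are assumptions for illustration; the real schema is defined by training_server.py and udp_protocol.py:

```python
import json
import socket

EVENT_PORT = 12347  # default event port from this doc

# Hypothetical payloads for the two event types described above.
episode_end = {"event": "episode_end", "episode": 120, "total_reward": 35.5}
training_complete = {"event": "training_complete", "episodes": 5000}

def send_event(sock: socket.socket, host: str, event: dict) -> bytes:
    """Serialize an event dict and send it to the CL1 device's event port."""
    payload = json.dumps(event).encode("utf-8")
    sock.sendto(payload, (host, EVENT_PORT))
    return payload
```

On the CL1 side, these events let the recorder close out files at episode boundaries and shut down cleanly when training completes.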
Performance Considerations
Network Latency
- Target Latency: < 5ms round-trip
- Typical: 1-2ms on local network
- UDP: No retransmission overhead
Tick Frequency Trade-offs
| Frequency | Game Speed | Latency Sensitivity | Compute Load |
|---|---|---|---|
| 10 Hz | Slow | Low | Low |
| 30 Hz | Normal | Medium | Medium |
| 60 Hz | Fast | High | High |
| 120 Hz | Very Fast | Very High | Very High |
Troubleshooting
No Spike Data Received
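A quick diagnostic you can run on the training system (with the training server stopped, so the port is free) to check whether spike datagrams are arriving at all. The helper below is illustrative, not part of the project:

```python
import socket

def spike_port_receiving(port: int = 12346, wait_s: float = 2.0) -> bool:
    """Listen briefly on the spike port; True if any datagram arrives."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    sock.settimeout(wait_s)
    try:
        sock.recvfrom(4096)
        return True
    except socket.timeout:
        return False
    finally:
        sock.close()
```

If this returns False while the CL1 device is streaming, check that --spike-port matches on both sides, that --cl1-host points at the right interface, and that no firewall is dropping UDP.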
High Packet Loss
- Reduce tick frequency
- Check network bandwidth
- Use wired connection instead of WiFi
- Ensure no firewall blocking UDP ports
Latency Issues
- Monitor latency with built-in logging (every 1000 packets)
- Consider switching to a 1 Gbps or 10 Gbps network
- Reduce batch size to decrease compute time
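The every-1000-packets logging mentioned above can be mirrored by a small accumulator like the one below (a sketch; the built-in logging in training_server.py may differ):

```python
class LatencyMonitor:
    """Accumulate round-trip times and report the running average
    every `log_every` packets."""

    def __init__(self, log_every: int = 1000):
        self.log_every = log_every
        self._total = 0.0
        self._count = 0
        self.reports: list[float] = []

    def record(self, rtt_s: float) -> None:
        self._total += rtt_s
        self._count += 1
        if self._count % self.log_every == 0:
            avg_ms = 1000.0 * self._total / self._count
            self.reports.append(avg_ms)
            print(f"avg RTT over {self._count} packets: {avg_ms:.2f} ms")

mon = LatencyMonitor(log_every=1000)
for _ in range(2000):
    mon.record(0.0015)  # 1.5 ms, in the doc's typical 1-2 ms range
```

Averages drifting toward the 5 ms target are an early sign to drop the tick frequency or move to a faster link.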
See Also
- ppo_doom.py - Direct CL1 training script
- cl1_neural_interface.py - CL1 hardware interface server
- udp_protocol.py - UDP packet formats