Re-training Models

After the agent completes a run, you can re-train the model with new data using the train.sh script.

When to Use

Train on updated or expanded datasets
Fine-tune with additional samples
Reproduce training with different data splits

Basic Usage

./train.sh \
  --agent-dir outputs/<agent_id> \
  --train-data /path/to/new_train.csv \
  --validation-data /path/to/new_validation.csv \
  --artifacts-dir /path/to/output_artifacts

Required Arguments

Argument	Description
`--agent-dir`	Path to completed agent output folder
`--train-data`	Path to new training CSV file
`--validation-data`	Path to new validation CSV file
`--artifacts-dir`	Where to save new training artifacts

Optional Arguments

Argument	Description
`--cpu-only`	Run without GPU
`--local`	Run locally without Docker
`--help`	Show help message

Example

# Re-train using new data
./train.sh \
  --agent-dir outputs/enchanted_fixing_reigned \
  --train-data datasets/updated_data/train.csv \
  --validation-data datasets/updated_data/validation.csv \
  --artifacts-dir outputs/retrained_model
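
If you launch re-training from Python (for example, from a scheduled job), the command can be assembled from the flags documented above. This is only a sketch: `build_train_command` and `run_training` are hypothetical helper names, not part of the project, and the sketch assumes `train.sh` sits in the current working directory.

```python
import subprocess


def build_train_command(agent_dir, train_data, validation_data, artifacts_dir,
                        cpu_only=False, local=False, script="./train.sh"):
    """Assemble the train.sh argument list from the documented flags."""
    cmd = [
        script,
        "--agent-dir", str(agent_dir),
        "--train-data", str(train_data),
        "--validation-data", str(validation_data),
        "--artifacts-dir", str(artifacts_dir),
    ]
    if cpu_only:
        cmd.append("--cpu-only")  # run without GPU
    if local:
        cmd.append("--local")     # run locally without Docker
    return cmd


def run_training(**kwargs):
    """Run train.sh and raise CalledProcessError if it exits non-zero."""
    # Assumption: train.sh signals failure through its exit code.
    subprocess.run(build_train_command(**kwargs), check=True)
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems when paths contain spaces.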

How It Works

The script:

Loads the agent's train.py script from best_run_files/
Uses the agent's conda environment
Runs training with the new data
Saves artifacts to the specified directory

Data Format

Your new data files must match the format expected by the agent's training script:

Same column names as original training data
Same feature encoding/preprocessing expectations
Target column with same name and format
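
A quick way to catch a column mismatch before starting a run is to compare the header rows of the original and new CSV files. The sketch below assumes both files have a header row and are UTF-8 encoded; `read_header` and `compare_columns` are illustrative names, not part of the project.

```python
import csv


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def compare_columns(original_csv, new_csv):
    """Report columns missing from, or unexpected in, the new file."""
    original = read_header(original_csv)
    new = read_header(new_csv)
    return {
        "missing": [c for c in original if c not in new],
        "unexpected": [c for c in new if c not in original],
        "same_order": original == new,
    }
```

This compares column names only. It does not check feature encodings or the format of the target values, so those still need to be verified against the original data.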

Output

After training completes, the directory passed to --artifacts-dir contains whatever the agent's train.py writes:

artifacts_dir/
├── ...                 # Artifacts produced by train.py
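
Because the contents depend on the agent's train.py, it can help to list what a run actually wrote. A minimal sketch, with `list_artifacts` as a hypothetical helper name:

```python
from pathlib import Path


def list_artifacts(artifacts_dir):
    """Return (relative_path, size_in_bytes) for every file under artifacts_dir."""
    root = Path(artifacts_dir)
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    # Path taken from the example above; adjust to your own --artifacts-dir.
    target = Path("outputs/retrained_model")
    if target.is_dir():
        for rel_path, size in list_artifacts(target):
            print(f"{size:>12}  {rel_path}")
```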

Docker vs Local Mode

By default, training runs in Docker for isolation. Use --local for direct execution:

# Docker mode (default)
./train.sh --agent-dir outputs/my_agent ...

# Local mode
./train.sh --local --agent-dir outputs/my_agent ...

GPU Support

A GPU is used automatically if one is available. To disable GPU use:

./train.sh --cpu-only --agent-dir outputs/my_agent ...

Troubleshooting

"Docker image not found"

Run ./run.sh once to build the Docker image, or use --local mode.

"Agent directory not found"

Ensure the path points to a completed agent output in outputs/.
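
To check a folder before passing it to --agent-dir, you can look for the training script the re-training step loads. This sketch assumes best_run_files/ sits directly inside the agent output folder, which this page does not spell out; `check_agent_dir` is a hypothetical helper name.

```python
from pathlib import Path


def check_agent_dir(agent_dir):
    """Return a list of problems found with an agent output folder."""
    root = Path(agent_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    # Assumption: best_run_files/ is a direct child of the agent folder.
    train_script = root / "best_run_files" / "train.py"
    if not train_script.is_file():
        problems.append(
            f"{train_script} not found; the agent run may not have completed"
        )
    return problems
```

An empty list means the folder passed this basic check. It does not guarantee that the run itself finished successfully.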

"Column mismatch"

Your new data must have the same column names as the original training data, including a target column with the same name and format.

Next Steps

Running Inference - Make predictions with trained models
Understanding Outputs - Explore what the agent produces