CLI Options

Complete reference for `run.sh` command-line options.

Basic Options

Option	Description	Default
`--model <name>`	LLM model to use	Interactive selection
`--dataset <name>`	Dataset name	Interactive selection
`--iterations <n>`	Number of iterations	Prompted in interactive mode (default 5)
`--val-metric <metric>`	Validation metric to optimize	Task-based default (`AUROC` for classification, `MAE` for regression)
`--timeout <seconds>`	Time limit for entire run	None
`--run-python-timeout <seconds>`	Timeout for each `run_python` tool execution; this caps the maximum training time of a single execution	`21600` (6 hours)

The run stops when either the iteration count is reached or the timeout expires.
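Because `--timeout` is given in seconds, it can be convenient to derive the value from a budget in hours. The following is a minimal sketch: the variable names are illustrative and not part of `run.sh`, and the model and dataset values are the placeholders used elsewhere on this page. It only prints the command so it can be reviewed before launching.

```shell
# Illustrative only: these variable names are assumptions, not run.sh settings.
max_hours=2                              # overall budget for the whole run
timeout_seconds=$((max_hours * 3600))    # --timeout expects seconds

# The run ends after 10 iterations or once the timeout expires.
run_cmd="./run.sh --model openai/gpt-4 --dataset my_data --iterations 10 --timeout $timeout_seconds"

# Print the command for review; copy the printed line to launch it.
echo "$run_cmd"
```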

Deployment Options

Option	Description
`--build-images`	Build Docker images locally
`--local`	Run without Docker (uses conda)
`--cpu-only`	Disable GPU acceleration
`--ollama`	Use local Ollama for LLM

Listing Options

Option	Description
`--list-models`	Show available LLM models
`--list-datasets`	Show prepared datasets
`--list-metrics`	Show available validation metrics
`--help`	Show help message

Advanced Options

Option	Description
`--user-prompt <text>`	Custom prompt for the agent
`--iteration-plan-model <name>`	LLM model used for generating the iteration plan (defaults to `--model`)
`--foundation-model-type <type>`	Pre-download foundation models (`dna`, `rna`, `protein`, `molecule`, `all`)
`--use-provisioning-key`	Use a temporary OpenRouter API key
`--spend-limit <n>`	Spend limit for provisioning key (requires `--use-provisioning-key`)
`--verbosity <summary|full>`	How much agent interaction detail is printed during the run (default: `full`)
`--disable-training-reporting`	Disable the TrainingReporter helper that emits structured training progress updates from the agent's training script
`--split-allowed-iterations <n>`	Iterations that can modify train/val split (default 1)
`--exploration-iterations <n>`	Baseline exploration iterations (default 4)
`--run-python-timeout <seconds>`	Per-training timeout for `run_python` tool (default 21600)

Forking

Option	Description
`--fork-from-run <path>`	Path to a source `outputs/<run_id>` directory. Creates a new run that branches off from a checkpoint in that run. Most other options are optional when forking; any that are omitted are inherited from the source run config.
`--fork-from-iteration <n>`	Iteration number to fork from (default: latest). Only used with `--fork-from-run`.
`--fork-from-step <step>`	Step ID to fork from, e.g. `data_split` or `model_training` (default: latest checkpoint). Only used with `--fork-from-run`.

When `--iterations`, `--split-allowed-iterations`, or `--exploration-iterations` are provided alongside `--fork-from-run`, they are interpreted as additional iterations from the fork point rather than absolute totals.
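As a rough illustration of the "additional iterations" rule, the sketch below assembles a fork command. The fork iteration (3) and the iteration count (5) are hypothetical values chosen for the example, and `outputs/<run_id>` is left as a placeholder to be replaced with a real run directory. The command is printed rather than executed.

```shell
# Hypothetical values for illustration; adjust to your own run.
run_dir='outputs/<run_id>'   # placeholder: substitute the real source run directory
fork_iteration=3             # checkpoint iteration to branch from
extra_iterations=5           # with --fork-from-run this means "5 more", not "5 in total"

fork_cmd="./run.sh --fork-from-run '$run_dir' --fork-from-iteration $fork_iteration --iterations $extra_iterations"

# Print the command for review before running it.
echo "$fork_cmd"
```

Options that are left out here, such as `--model` and `--dataset`, would be inherited from the source run config, as described in the table above.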

See Forking a Run for a full guide and examples.

Examples

Basic Run

./run.sh --model openai/gpt-4 --dataset breast_cancer --iterations 10

Quick Start with Pre-built Images

./run.sh

Local Mode

./run.sh --local --model openai/gpt-4 --dataset my_data

With Time Limit

./run.sh --timeout 3600 --model openai/gpt-4 --dataset my_data

Custom Optimization Goal

./run.sh --user-prompt "Focus on interpretable models only" --model openai/gpt-4

Using Ollama

./run.sh --ollama

CPU Only

./run.sh --cpu-only --model openai/gpt-4 --dataset my_data

Pre-download Foundation Models

./run.sh --foundation-model-type protein --model openai/gpt-4

Run with Locally Built Docker Images

./run.sh --build-images

Validation Metrics

Available metrics for `--val-metric`:

Classification:

ACC - Accuracy
AUROC - Area Under ROC Curve
AUPRC - Area Under Precision-Recall Curve
F1 - F1 Score (macro)
LOG_LOSS - Log loss
MCC - Matthews Correlation Coefficient

Regression:

MSE - Mean Squared Error
RMSE - Root Mean Squared Error
MAE - Mean Absolute Error
MAPE - Mean Absolute Percentage Error
PEARSON - Pearson Correlation
SPEARMAN - Spearman Correlation
R2 - R-squared
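To check a metric name before passing it to `--val-metric`, a small lookup like the one below can help. This is a hypothetical helper, not part of `run.sh`; it simply mirrors the two lists above and assumes the names are matched exactly as written (upper case).

```shell
# Hypothetical helper (not provided by run.sh): report which task type a
# metric name from the lists above belongs to. Matching is case-sensitive.
metric_task() {
    case "$1" in
        ACC|AUROC|AUPRC|F1|LOG_LOSS|MCC)
            echo "classification" ;;
        MSE|RMSE|MAE|MAPE|PEARSON|SPEARMAN|R2)
            echo "regression" ;;
        *)
            echo "unknown"
            return 1 ;;
    esac
}

metric_task AUROC   # prints "classification"
metric_task RMSE    # prints "regression"
```

For the authoritative list in a given installation, `--list-metrics` remains the source of truth.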

Environment Variables

API keys and logging settings come from environment variables or `.env`. See Environment Variables.

Model Names

Model names are provider-specific. Use `--list-models` to see available models for your configured providers.

Exit Codes

Code	Meaning
0	Success
1	General error
2	Invalid arguments
130	Interrupted (Ctrl+C)
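A wrapper script can branch on these codes. The sketch below is one possible way to do that; `describe_exit_code` is a hypothetical helper name, and the call to `run.sh` is shown only in a comment so the snippet runs on its own.

```shell
# Hypothetical helper: translate an exit status into the meanings from the
# table above. Codes not listed in the table fall through to the default case.
describe_exit_code() {
    case "$1" in
        0)   echo "Success" ;;
        1)   echo "General error" ;;
        2)   echo "Invalid arguments" ;;
        130) echo "Interrupted (Ctrl+C)" ;;
        *)   echo "Undocumented exit code $1" ;;
    esac
}

# Typical use in a wrapper (commented out so the sketch runs without run.sh):
#   ./run.sh --model openai/gpt-4 --dataset my_data
#   describe_exit_code "$?"
describe_exit_code 2   # prints "Invalid arguments"
```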