Running the Agent

The main entry point for Agentomics-ML is run.sh. This guide covers both interactive and non-interactive usage.

Interactive Mode

Running without arguments launches interactive mode:

./run.sh

Docker mode expects a .env file in the repo root. To create one, copy .env.example to .env and fill in your values.
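The copy step can be wrapped in a small guard so that an existing .env is never overwritten. This is a minimal sketch: `ensure_env_file` is a hypothetical helper name, not something shipped with Agentomics-ML, and it assumes only that .env.example sits next to where .env should go.

```shell
# ensure_env_file DIR
# Hypothetical helper (not part of Agentomics-ML): creates DIR/.env from
# DIR/.env.example, but only when DIR/.env does not exist yet.
ensure_env_file() {
  env_dir="${1:-.}"
  if [ -f "$env_dir/.env" ]; then
    echo "$env_dir/.env already exists; leaving it untouched"
  elif [ -f "$env_dir/.env.example" ]; then
    cp "$env_dir/.env.example" "$env_dir/.env"
    echo "created $env_dir/.env from .env.example"
  else
    echo "no .env.example found in $env_dir" >&2
    return 1
  fi
}
```

Calling `ensure_env_file .` from the repo root before `./run.sh` would be one way to use it; the copied file still needs to be edited by hand.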

You'll be prompted to select:

  1. LLM Model - Choose from available models
  2. Dataset - Select a prepared dataset
  3. Iterations - Number of optimization cycles (the prompt defaults to 5)
  4. Validation Metric - Optional metric to optimize (defaults: AUROC for classification, MAE for regression)

Non-Interactive Mode

Supply parameters directly to skip prompts:

./run.sh \
  --model openai/gpt-4 \
  --dataset breast_cancer \
  --iterations 10

For non-interactive runs, provide at least --model, --dataset, and --iterations.
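When launching runs from a script, it can help to check the three required values before building the command. The sketch below uses a hypothetical helper, `run_cmd`, that only prints the command rather than executing it. The check that the iteration count is a whole number is an assumption of this sketch, not documented behavior of run.sh.

```shell
# run_cmd MODEL DATASET ITERATIONS
# Hypothetical helper: validates the three required values and prints the
# resulting ./run.sh command. It does not execute anything.
run_cmd() {
  rc_model="${1:-}"
  rc_dataset="${2:-}"
  rc_iterations="${3:-}"
  if [ -z "$rc_model" ] || [ -z "$rc_dataset" ] || [ -z "$rc_iterations" ]; then
    echo "usage: run_cmd MODEL DATASET ITERATIONS" >&2
    return 1
  fi
  # Assumption of this sketch: iterations must consist of digits only.
  case "$rc_iterations" in
    *[!0-9]*)
      echo "iterations must be a whole number" >&2
      return 1
      ;;
  esac
  printf './run.sh --model %s --dataset %s --iterations %s\n' \
    "$rc_model" "$rc_dataset" "$rc_iterations"
}

# Prints: ./run.sh --model openai/gpt-4 --dataset breast_cancer --iterations 10
run_cmd openai/gpt-4 breast_cancer 10
```

Printing first makes it easy to review or log the exact invocation before running it.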

Common Options

| Option | Description | Example |
|---|---|---|
| --model | LLM model to use | --model openai/gpt-4 |
| --dataset | Dataset name | --dataset my_data |
| --iterations | Number of iterations | --iterations 15 |
| --val-metric | Validation metric (optional, task-based default if omitted) | --val-metric AUROC |
| --timeout | Time limit in seconds | --timeout 3600 |
| --run-python-timeout | Timeout in seconds for each run_python tool execution (see CLI options) | --run-python-timeout 43200 |
| --use-provisioning-key | Use a provisioning key for OpenRouter | --use-provisioning-key |
| --spend-limit | Spend limit for provisioning key | --spend-limit 25 |

Listing Available Options

# List available models
./run.sh --list-models

# List prepared datasets
./run.sh --list-datasets

# List available metrics
./run.sh --list-metrics

Deployment Flags

| Flag | Description |
|---|---|
| --pull-images | Pull pre-built Docker images |
| --local | Run without Docker (uses conda) |
| --cpu-only | Disable GPU acceleration |
| --ollama | Use local Ollama models |

Advanced Options

Foundation Models

Pre-download domain-specific foundation models:

./run.sh --foundation-model-type dna

Available types: dna, rna, protein, molecule. Also supported: all.

Data Split and Exploration Controls

./run.sh --split-allowed-iterations 1 --exploration-iterations 4

  • --split-allowed-iterations controls how many early iterations are allowed to resplit train/validation (ignored if you provide validation.csv).
  • --exploration-iterations controls how long the agent spends on baseline/exploration models.

Time Limits

Set a deadline for the entire run:

./run.sh --timeout 7200  # 2 hour limit

Set a timeout for each training execution (the default is 6 hours):

./run.sh --run-python-timeout 43200  # 12 hours per training run
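Both timeout options take seconds, so a small conversion avoids arithmetic slips when thinking in hours. This is a minimal sketch; `hours_to_seconds` is a hypothetical helper, not part of run.sh.

```shell
# hours_to_seconds HOURS
# Hypothetical helper: converts whole hours to seconds using shell arithmetic.
hours_to_seconds() {
  echo $(( $1 * 3600 ))
}

echo "--timeout $(hours_to_seconds 2)"              # prints: --timeout 7200
echo "--run-python-timeout $(hours_to_seconds 12)"  # prints: --run-python-timeout 43200
```

The same helper can be used inline, for example `./run.sh --timeout "$(hours_to_seconds 2)"`.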

Custom User Prompt

Override the default optimization goal:

./run.sh --user-prompt "Only use simple models like logistic regression"

See Custom Prompts for more details.

Full Help

./run.sh --help

What Happens During a Run

  1. Dataset Preparation - Validates and prepares data in prepared_datasets/
  2. Iterative Development - Agent runs exploration, training, and evaluation cycles
  3. Snapshot Best Model - Tracks the best-performing iteration
  4. Final Evaluation - Tests on held-out test set (if provided)
  5. Output Results - Saves everything to outputs/<agent_id>/
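Since each run saves to its own directory under outputs/, a quick way to find the most recent one is to sort by modification time. The sketch below is an illustration only: `latest_run_dir` is a hypothetical helper, it assumes that directory modification time is a good enough proxy for "latest run", and it parses `ls` output, so it is not robust to unusual directory names.

```shell
# latest_run_dir [BASE]
# Hypothetical helper: prints the most recently modified subdirectory of BASE
# (default: outputs). Prints nothing if BASE has no subdirectories.
latest_run_dir() {
  lr_base="${1:-outputs}"
  # -t sorts newest first; -d lists the directories themselves, not their contents.
  ls -td "$lr_base"/*/ 2>/dev/null | head -n 1
}
```

For example, `ls "$(latest_run_dir)"` would list whatever the latest run wrote; the files inside are not described here.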

Monitoring Progress

During execution, you'll see:

  • Current iteration number
  • Agent step (exploration, training, etc.)
  • Validation metrics after each iteration
  • Best iteration tracking

Stopping a Run

Press Ctrl+C to stop. The agent will attempt to save current progress.

Next Steps