
Metrics Reference

Complete list of available metrics for model evaluation.

Classification Metrics

Accuracy (ACC)

The proportion of correct predictions.

ACC = (TP + TN) / (TP + TN + FP + FN)
| Property | Value |
| --- | --- |
| Range | 0 to 1 |
| Best | Higher |
| Use case | Balanced datasets |
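A minimal pure-Python sketch of this definition (illustrative only; counting matches directly is equivalent to the TP/TN/FP/FN form above):

```python
def accuracy(y_true, y_pred):
    """Proportion of predictions that match the true labels."""
    assert len(y_true) == len(y_pred), "inputs must be the same length"
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```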

AUROC

Area Under the Receiver Operating Characteristic Curve.

| Property | Value |
| --- | --- |
| Range | 0 to 1 |
| Best | Higher |
| Use case | Ranking quality, imbalanced data |

Multi-class: Uses one-vs-rest.
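For intuition, binary AUROC equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one (the Mann-Whitney formulation). A pairwise sketch of that equivalence, not the tool's implementation (the one-vs-rest extension applies this per class):

```python
def auroc(y_true, scores):
    """Binary AUROC via pairwise comparison: fraction of
    (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Note the O(n_pos * n_neg) loop is for clarity; practical implementations use a sort-based O(n log n) computation.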


AUPRC

Area Under the Precision-Recall Curve.

| Property | Value |
| --- | --- |
| Range | 0 to 1 |
| Best | Higher |
| Use case | Highly imbalanced data |

Multi-class: Macro average.
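One common way to compute the area is average precision: sort by score and average the precision at the rank of each positive. A pure-Python sketch of that convention (implementations of AUPRC differ in interpolation details, so treat this as illustrative):

```python
def average_precision(y_true, scores):
    """Average of precision@k taken at the rank of each positive,
    with examples sorted by score descending."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(1 for y in y_true if y == 1)
    tp, ap = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            ap += tp / rank
    return ap / n_pos
```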


F1 Score (F1)

Harmonic mean of precision and recall.

F1 = 2 * (Precision * Recall) / (Precision + Recall)
| Property | Value |
| --- | --- |
| Range | 0 to 1 |
| Best | Higher |
| Use case | Balance precision and recall |

Multi-class: Macro average.
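A binary sketch of the formula above, computing precision and recall from the confusion counts (illustrative; the macro average repeats this per class and averages):

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```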


Log Loss (LOG_LOSS)

Negative log-likelihood of the true labels given predicted probabilities.

| Property | Value |
| --- | --- |
| Range | 0 to ∞ |
| Best | Lower |
| Use case | Probability calibration |
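A binary sketch of the definition. Clipping probabilities away from 0 and 1 (the `eps` below is an assumed convention, not necessarily this tool's) keeps the logarithm finite:

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under
    predicted probabilities of the positive class."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```

An uninformative predictor that always outputs 0.5 scores ln 2 ≈ 0.693; confident correct predictions drive the loss toward 0.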

Matthews Correlation Coefficient (MCC)

Correlation between predicted and actual classifications.

| Property | Value |
| --- | --- |
| Range | -1 to 1 |
| Best | Higher (1 is perfect) |
| Use case | Imbalanced data, overall quality |
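The binary MCC is computed from the four confusion counts; a minimal sketch (illustrative, with the usual convention of returning 0 when the denominator vanishes):

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient:
    (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```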

Regression Metrics

Mean Squared Error (MSE)

Average of squared prediction errors.

MSE = (1/n) * Σ(y_true - y_pred)^2
| Property | Value |
| --- | --- |
| Range | 0 to ∞ |
| Best | Lower |
| Use case | Penalizing large errors |

Root Mean Squared Error (RMSE)

Square root of MSE.

RMSE = sqrt(MSE)
| Property | Value |
| --- | --- |
| Range | 0 to ∞ |
| Best | Lower |
| Use case | Same units as target |

Mean Absolute Error (MAE)

Average of absolute prediction errors.

MAE = (1/n) * Σ|y_true - y_pred|
| Property | Value |
| --- | --- |
| Range | 0 to ∞ |
| Best | Lower |
| Use case | Robust to outliers |
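The three error metrics above (MSE, RMSE, MAE) map directly to their formulas; a plain-Python sketch, illustrative rather than the tool's implementation:

```python
import math

def mse(y_true, y_pred):
    """Mean of squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean of absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```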

Mean Absolute Percentage Error (MAPE)

Average absolute percent error.

| Property | Value |
| --- | --- |
| Range | 0 to ∞ |
| Best | Lower |
| Use case | Relative error on positive targets |
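A sketch of the definition. Note this returns a fraction; some conventions multiply by 100 to report a percentage, and whether this tool does is an assumption left unstated here. Zero targets are undefined:

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction
    (multiply by 100 for a percentage); targets must be non-zero."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```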

R-squared (R2)

Proportion of variance explained by the model.

R2 = 1 - (SS_res / SS_tot)
| Property | Value |
| --- | --- |
| Range | -∞ to 1 |
| Best | Higher (1 is perfect) |
| Use case | Model explanatory power |
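A sketch of the formula above, with SS_res the residual sum of squares and SS_tot the total sum of squares about the mean (illustrative only):

```python
def r_squared(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot. Negative when the model is
    worse than predicting the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```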

Pearson Correlation (PEARSON)

Linear correlation between predictions and true values.

| Property | Value |
| --- | --- |
| Range | -1 to 1 |
| Best | Higher |
| Use case | Linear relationship strength |

Spearman Correlation (SPEARMAN)

Rank correlation between predictions and true values.

| Property | Value |
| --- | --- |
| Range | -1 to 1 |
| Best | Higher |
| Use case | Monotonic relationship strength |
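The two correlations relate simply: Spearman is Pearson computed on ranks, which is why it captures monotonic rather than strictly linear relationships. A minimal sketch (no tie handling in the ranking; real implementations average tied ranks):

```python
import math

def pearson(x, y):
    """Pearson r: covariance normalized by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rho: Pearson r on the ranks of the data."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    return pearson(ranks(x), ranks(y))
```

A monotonic but nonlinear relationship (e.g. y = x²) gives Spearman 1.0 while Pearson stays below 1.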

Choosing a Metric

Classification

| Scenario | Recommended Metric |
| --- | --- |
| Balanced classes | ACC, F1 |
| Imbalanced classes | AUROC, AUPRC, MCC |
| Probability calibration | LOG_LOSS |
| Overall quality | MCC |

Regression

| Scenario | Recommended Metric |
| --- | --- |
| General performance | RMSE, MAE |
| Relative performance | R2, PEARSON, SPEARMAN |
| Same units as target | RMSE, MAE |

Using Metrics

CLI

./run.sh --val-metric AUROC

Listing Available Metrics

./run.sh --list-metrics

Metric Abbreviations

| Abbreviation | Full Name |
| --- | --- |
| ACC | Accuracy |
| AUROC | Area Under ROC Curve |
| AUPRC | Area Under Precision-Recall Curve |
| F1 | F1 Score |
| LOG_LOSS | Log Loss |
| MCC | Matthews Correlation Coefficient |
| MSE | Mean Squared Error |
| RMSE | Root Mean Squared Error |
| MAE | Mean Absolute Error |
| MAPE | Mean Absolute Percentage Error |
| R2 | R-squared |
| PEARSON | Pearson Correlation |
| SPEARMAN | Spearman Correlation |