Tuning CatBoost With GASearchCV
CatBoost exposes several parameters whose names or semantics differ from their nearest XGBoost or LightGBM counterparts: bagging_temperature controls data perturbation via Bayesian bootstrap weights, border_count sets the granularity of numeric feature binning, and colsample_bylevel samples features per tree level rather than per tree. These parameters interact with regularization, so joint optimization, which is what a genetic algorithm (GA) performs, can be more effective than tuning each parameter independently.
Prerequisites
pip install sklearn-genetic-opt catboost
Setup
import warnings
from pprint import pprint
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import loguniform, randint, uniform
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, balanced_accuracy_score, roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold, train_test_split
from catboost import CatBoostClassifier
from sklearn_genetic import (
    EvolutionConfig, GASearchCV, OptimizationConfig, PopulationConfig, RuntimeConfig,
)
from sklearn_genetic.callbacks import ConsecutiveStopping, TimerStopping
from sklearn_genetic.schedules import ExponentialAdapter, InverseAdapter
from sklearn_genetic.space import Continuous, Integer
warnings.filterwarnings("ignore")
RANDOM_STATE = 42
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=RANDOM_STATE
)
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=RANDOM_STATE)
print(f"Train: {X_train.shape}, Test: {X_test.shape}")
Baseline Model
def evaluate(name, estimator, X_eval, y_eval):
    # Score a fitted estimator on held-out data and return rounded metrics.
    predictions = estimator.predict(X_eval)
    probabilities = estimator.predict_proba(X_eval)[:, 1]
    return {
        "name": name,
        "accuracy": round(accuracy_score(y_eval, predictions), 4),
        "balanced_accuracy": round(balanced_accuracy_score(y_eval, predictions), 4),
        "roc_auc": round(roc_auc_score(y_eval, probabilities), 4),
    }
baseline = CatBoostClassifier(verbose=0, random_state=RANDOM_STATE)
baseline.fit(X_train, y_train)
baseline_metrics = evaluate("CatBoost defaults", baseline, X_test, y_test)
print(baseline_metrics)
# {'name': 'CatBoost defaults', 'accuracy': 0.9708, 'balanced_accuracy': 0.9680, 'roc_auc': 0.9951}
Always pass verbose=0 to CatBoostClassifier
Without it, CatBoost prints a detailed training table for every iteration of every cross-validation fold. A 3-fold search with 100 iterations per candidate produces thousands of output lines. Unlike LightGBM's verbose=-1, CatBoost uses verbose=0 (integer, not negative).
Search Space
CatBoost's parameter names and their valid ranges differ from XGBoost/LightGBM in important ways.
param_grid = {
    # Boosting volume
    "iterations": Integer(50, 500),  # number of trees (= n_estimators elsewhere)
    # Step size: CatBoost default is 0.03, much lower than XGBoost/LightGBM
    "learning_rate": Continuous(0.01, 0.3, distribution="log-uniform"),
    # Tree complexity
    "depth": Integer(4, 10),  # max tree depth (CatBoost default: 6)
    # Regularization
    "l2_leaf_reg": Continuous(1e-3, 10.0, distribution="log-uniform"),  # L2 reg on leaf weights
    # Sampling: per level, not per tree
    "colsample_bylevel": Continuous(0.5, 1.0),  # feature fraction per tree level
    # Bayesian bootstrap (incompatible with subsample; see note below)
    "bagging_temperature": Continuous(0.0, 1.0),  # 0 = uniform weights, 1 = exponential
    # Numeric feature granularity
    "border_count": Integer(32, 255),  # bins per numeric feature
}
CatBoost-specific parameters explained
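bagging_temperature: this parameter applies to CatBoost's Bayesian bootstrap (bootstrap_type='Bayesian'), which reweights samples rather than subsampling them. Each training sample gets a weight drawn from an exponential distribution scaled by bagging_temperature. At 0.0 all weights are equal (no perturbation); at 1.0 the weights follow an exponential distribution, which regularizes strongly through data perturbation. Which bootstrap type CatBoost selects by default depends on the objective, task type, and version, so set bootstrap_type='Bayesian' on the estimator if you want the choice to be explicit. bagging_temperature takes the place of subsample; do not include both in the same search.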
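As a rough illustration of the weighting described above, the sketch below draws Bayesian bootstrap weights in plain Python. It assumes the weight of each sample is a ** t, where a is drawn from a unit exponential distribution and t is bagging_temperature, which is our reading of the CatBoost documentation. The helper name bayesian_bootstrap_weights is hypothetical and this is not CatBoost's internal implementation.

```python
import random
import statistics


def bayesian_bootstrap_weights(n_samples, bagging_temperature, seed=0):
    """Hypothetical helper: draw one weight per sample as a ** t.

    Assumes a ~ Exponential(1) and t = bagging_temperature.
    """
    rng = random.Random(seed)
    return [
        rng.expovariate(1.0) ** bagging_temperature
        for _ in range(n_samples)
    ]


# Compare how widely the weights spread at different temperatures.
for temperature in (0.0, 0.5, 1.0):
    weights = bayesian_bootstrap_weights(1000, temperature)
    spread = statistics.pstdev(weights)
    print(f"bagging_temperature={temperature}: weight std = {spread:.3f}")
```

At 0.0 every weight is exactly 1, so the spread is zero; larger temperatures spread the weights further, which is the perturbation the GA is tuning.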
border_count: the number of candidate splits (borders) evaluated for each numeric feature, analogous to max_bin in XGBoost and LightGBM. Higher values allow finer-grained splits, which can improve accuracy at the cost of slower training. The CPU default is 254; searching from 32 to 255 covers fast-but-coarse binning through roughly the default granularity.
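To make the binning idea concrete, here is a minimal sketch of how a numeric feature can be discretized with a given number of borders. It uses equal-frequency quantiles as a stand-in; CatBoost picks borders with its own strategy (controlled by feature_border_type), so the helpers quantile_borders and bin_index are illustrative assumptions only.

```python
import bisect
import random
import statistics


def quantile_borders(values, border_count):
    """Return border_count cut points that split values into equal-frequency bins."""
    return statistics.quantiles(values, n=border_count + 1)


def bin_index(value, borders):
    """Map a raw value to the index of the bin it falls into."""
    return bisect.bisect_right(borders, value)


rng = random.Random(0)
feature = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # synthetic placeholder data

for border_count in (4, 32, 254):
    borders = quantile_borders(feature, border_count)
    used_bins = {bin_index(v, borders) for v in feature}
    print(f"border_count={border_count}: {len(borders)} borders, {len(used_bins)} bins used")
```

A tree can only split at one of these borders, so a small border_count limits how precisely a threshold can be placed, while a large one adds more candidates to evaluate per feature.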
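colsample_bylevel: the fraction of features considered at each tree level, in contrast to per-tree subsampling such as colsample_bytree in XGBoost and LightGBM (XGBoost also offers its own colsample_bylevel). Combined with bagging_temperature, this provides two complementary regularization axes: one perturbs the features, the other perturbs the sample weights.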
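The difference between per-level and per-tree sampling can be sketched in a few lines. The helper below is hypothetical and only mimics the idea of drawing a fresh feature subset for each level; it does not reproduce CatBoost's actual sampling code.

```python
import random


def sample_features_per_level(n_features, depth, colsample_bylevel, seed=0):
    """Hypothetical helper: pick a fresh feature subset for every tree level."""
    rng = random.Random(seed)
    n_keep = max(1, round(n_features * colsample_bylevel))
    return [
        sorted(rng.sample(range(n_features), n_keep))
        for _ in range(depth)
    ]


# With per-tree sampling a single subset would be reused at every level;
# here each level gets its own draw.
for level, features in enumerate(sample_features_per_level(10, 3, 0.5)):
    print(f"level {level}: candidate features {features}")
```

Setting colsample_bylevel=1.0 makes every level see all features, which corresponds to the value used in the warm-start configuration of this guide.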
Configure GASearchCV
CatBoost is slower per iteration than LightGBM, so we use fewer generations and rely on TimerStopping as the primary budget control.
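A quick back-of-the-envelope estimate shows why the timer matters. The sketch below assumes, as a rough upper bound, that one full population is evaluated for the initial generation and for each later generation, and that every candidate is fitted once per CV fold. It ignores cache hits, local search, random immigrants, and early stopping, so the real count in sklearn-genetic-opt may differ. The function name estimate_max_fits is hypothetical.

```python
def estimate_max_fits(population_size, generations, n_splits):
    """Rough upper bound on evaluated candidates and model fits."""
    candidates = population_size * (generations + 1)
    return candidates, candidates * n_splits


# Values taken from the configuration used in this guide.
candidates, fits = estimate_max_fits(population_size=20, generations=20, n_splits=3)
print(f"up to {candidates} candidates, {fits} fits")
# up to 420 candidates, 1260 fits

timer_seconds = 300
print(f"timer allows about {timer_seconds / fits:.2f} s per fit if run serially")
# timer allows about 0.24 s per fit if run serially
```

Parallel execution raises the effective per-fit allowance, but if CatBoost fits take noticeably longer than that, TimerStopping will end the search before all generations run.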
callbacks = [
    ConsecutiveStopping(generations=8, metric="fitness_best"),
    TimerStopping(total_seconds=300),
]
ga_search = GASearchCV(
    estimator=CatBoostClassifier(verbose=0, random_state=RANDOM_STATE),
    param_grid=param_grid,
    scoring="roc_auc",
    cv=cv,
    evolution_config=EvolutionConfig(
        population_size=20,
        generations=20,
        crossover_probability=ExponentialAdapter(
            initial_value=0.8, end_value=0.4, adaptive_rate=0.15
        ),
        mutation_probability=InverseAdapter(
            initial_value=0.25, end_value=0.05, adaptive_rate=0.20
        ),
        tournament_size=3,
        elitism=True,
        keep_top_k=3,
    ),
    population_config=PopulationConfig(
        initializer="smart",
        warm_start_configs=[{
            "iterations": 100,
            "learning_rate": 0.03,  # CatBoost default; note: lower than XGBoost!
            "depth": 6,
            "l2_leaf_reg": 3.0,
            "colsample_bylevel": 1.0,
            "bagging_temperature": 1.0,
            "border_count": 254,
        }],
    ),
    runtime_config=RuntimeConfig(
        n_jobs=-1,
        parallel_backend="cv",  # CatBoost manages its own thread pool
        use_cache=True,
        verbose=True,
    ),
    optimization_config=OptimizationConfig(
        local_search=True,
        local_search_top_k=2,
        local_search_steps=1,
        local_search_radius=0.2,
        diversity_control=True,
        diversity_threshold=0.30,
        diversity_stagnation_generations=3,
        diversity_mutation_boost=1.8,
        random_immigrants_fraction=0.10,
        fitness_sharing=True,
        sharing_radius=0.35,
    ),
)
GPU acceleration
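If a CUDA-capable GPU is available, CatBoost training can be substantially faster on large datasets; on a small dataset such as the one used here, the speedup may be negligible:
CatBoostClassifier(verbose=0, task_type="GPU", random_state=RANDOM_STATE)
GPU training builds symmetric trees by default, as on CPU, but some defaults differ (for example, border_count defaults to 128 on GPU rather than 254), which may affect optimal depth and border_count values. Consider adjusting the search space if switching to GPU.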
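One way to switch between CPU and GPU without editing the estimator by hand is to probe for a device first. The sketch below assumes catboost.utils.get_gpu_device_count is available in the installed CatBoost version; the wrapper pick_task_type is a hypothetical helper, and it falls back to CPU when CatBoost is not importable.

```python
def pick_task_type():
    """Return "GPU" when CatBoost reports a usable device, otherwise "CPU"."""
    try:
        from catboost.utils import get_gpu_device_count
    except ImportError:
        # CatBoost (or this utility) is not available in this environment.
        return "CPU"
    return "GPU" if get_gpu_device_count() > 0 else "CPU"


task_type = pick_task_type()
print(f"task_type={task_type}")
# The result can then be passed on, for example:
# CatBoostClassifier(verbose=0, task_type=task_type, random_state=RANDOM_STATE)
```

Keeping the decision in one place makes it easier to rerun the same search script on machines with and without a GPU.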
Fit and Results
started_at = time.perf_counter()
ga_search.fit(X_train, y_train, callbacks=callbacks)
ga_seconds = time.perf_counter() - started_at
print(f"\nBest CV ROC AUC: {ga_search.best_score_:.4f}")
print(f"Search time: {ga_seconds:.0f}s")
pprint(ga_search.best_params_)
Evaluation Mechanics
print(ga_search.fit_stats_)
# {
# 'evaluated_candidates': 240,
# 'unique_candidates': 238,
# 'cache_hits': 2,
# 'random_immigrants': 18,
# }
Generation Telemetry
history = pd.DataFrame(ga_search.history)
cols = ["gen", "fitness", "fitness_max", "fitness_std",
        "unique_individual_ratio", "genotype_diversity", "stagnation_generations"]
print(history[[c for c in cols if c in history.columns]].to_string())
Fitness Evolution
ax = history.plot(
    x="gen",
    y=["fitness_best", "fitness_max", "fitness"],
    marker="o",
    figsize=(9, 4),
)
ax.set_title("CatBoost GA Search — Fitness over Generations")
ax.set_xlabel("Generation")
ax.set_ylabel("ROC AUC (CV)")
ax.legend(["best so far", "generation max", "generation mean"])
plt.tight_layout()
plt.show()
border_count vs bagging_temperature Interaction
border_count controls split precision; bagging_temperature controls regularization strength through data perturbation. High border_count with low bagging_temperature can lead to overfitting; the GA discovers the balance.
cv_results = pd.DataFrame(ga_search.cv_results_)
cv_results = cv_results.dropna(subset=["mean_test_score"])
fig, ax = plt.subplots(figsize=(8, 6))
scatter = ax.scatter(
    cv_results["param_border_count"].astype(float),
    cv_results["param_bagging_temperature"].astype(float),
    c=cv_results["mean_test_score"],
    cmap="RdYlGn",
    s=60,
    alpha=0.8,
    edgecolors="none",
)
plt.colorbar(scatter, ax=ax, label="Mean CV ROC AUC")
ax.set_xlabel("border_count")
ax.set_ylabel("bagging_temperature")
ax.set_title("Evaluated Candidates — border_count vs bagging_temperature")
plt.tight_layout()
plt.show()
Compare with RandomizedSearchCV
randomized_search = RandomizedSearchCV(
    estimator=CatBoostClassifier(verbose=0, random_state=RANDOM_STATE),
    param_distributions={
        "iterations": randint(50, 501),
        "learning_rate": loguniform(0.01, 0.3),
        "depth": randint(4, 11),
        "l2_leaf_reg": loguniform(1e-3, 10.0),
        "colsample_bylevel": uniform(0.5, 0.5),
        "bagging_temperature": uniform(0.0, 1.0),
        "border_count": randint(32, 256),
    },
    n_iter=20,
    scoring="roc_auc",
    cv=cv,
    n_jobs=-1,
    random_state=RANDOM_STATE,
)
started_at = time.perf_counter()
randomized_search.fit(X_train, y_train)
rs_seconds = time.perf_counter() - started_at
rs_metrics = evaluate("RandomizedSearchCV", randomized_search, X_test, y_test)
ga_metrics = evaluate("GASearchCV", ga_search, X_test, y_test)
comparison = pd.DataFrame([baseline_metrics, rs_metrics, ga_metrics])
comparison["best_cv_score"] = [
    None,
    round(randomized_search.best_score_, 4),
    round(ga_search.best_score_, 4),
]
comparison["fit_seconds"] = [None, round(rs_seconds, 1), round(ga_seconds, 1)]
print(comparison.to_string(index=False))
Expected output (approximate):
              name  accuracy  balanced_accuracy  roc_auc  best_cv_score  fit_seconds
 CatBoost defaults    0.9708             0.9680   0.9951            NaN          NaN
RandomizedSearchCV    0.9766             0.9742   0.9958         0.9936         38.5
        GASearchCV    0.9825             0.9810   0.9972         0.9962         72.4
Practical Notes
- verbose=0 is not optional. CatBoost accepts either an integer or a boolean for this parameter, so verbose=False also works.
- CatBoost default learning_rate is 0.03, not 0.3 like XGBoost or 0.1 like LightGBM. Reflect this in warm_start_configs.
- Do not mix subsample and bagging_temperature: bagging_temperature applies only to the Bayesian bootstrap (bootstrap_type='Bayesian'), which does not accept subsample. Adding subsample to the search space while on Bayesian bootstrap will raise a parameter conflict error. If you want to use subsample, drop bagging_temperature from the search space and also search bootstrap_type=Categorical(['Bernoulli', 'MVS']).
- border_count impacts speed: 255 bins can be up to 8× slower than 32 on wide datasets. If search time is a bottleneck, fix border_count=64 and remove it from the search space.
- parallel_backend="cv": CatBoost, like XGBoost and LightGBM, manages its own thread pool. CV-level parallelism avoids oversubscription.
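The subsample and bagging_temperature rule from the notes above can be checked before launching a long search. The following is a minimal sketch: the compatibility table reflects our understanding of which sampling parameter each bootstrap type uses, and find_bootstrap_conflicts is a hypothetical helper, not part of CatBoost or sklearn-genetic-opt.

```python
# Assumed compatibility table: the sampling parameter each bootstrap type uses.
SAMPLING_PARAM_BY_BOOTSTRAP = {
    "Bayesian": "bagging_temperature",
    "Bernoulli": "subsample",
    "MVS": "subsample",
}


def find_bootstrap_conflicts(param_names, bootstrap_type="Bayesian"):
    """Hypothetical helper: list sampling parameters that do not match bootstrap_type."""
    allowed = SAMPLING_PARAM_BY_BOOTSTRAP[bootstrap_type]
    sampling_params = set(SAMPLING_PARAM_BY_BOOTSTRAP.values())
    return sorted(
        name for name in param_names
        if name in sampling_params and name != allowed
    )


search_space_keys = ["iterations", "depth", "bagging_temperature", "subsample"]
print(find_bootstrap_conflicts(search_space_keys, "Bayesian"))   # ['subsample']
print(find_bootstrap_conflicts(search_space_keys, "Bernoulli"))  # ['bagging_temperature']
```

Running such a check on param_grid.keys() before fit can surface a conflicting search space early instead of partway through the first generation.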
See Also
- Tune XGBoost: depth-wise boosting
- Tune LightGBM: leaf-wise boosting with num_leaves interaction
- Advanced Optimizer Control: diversity, fitness sharing, local search
- GASearchCV API
