feat: enhance precision optimization in model training

- Introduced a plan to change the Optuna objective from ROC-AUC to maximizing precision subject to a recall >= 0.35 constraint, for scenarios where false positives are costly.
- Updated training scripts to implement precision-based metrics and adjusted the walk-forward cross-validation process to incorporate precision and recall calculations.
- Enhanced the active LGBM parameters and training log to reflect the new metrics and model configurations.
- Added a new design document outlining the implementation steps for the precision-focused optimization.

This update aims to refine the model's decision-making process by emphasizing precision, thereby reducing potential losses from false positives.
21in7
2026-03-03 00:57:19 +09:00
parent 3613e3bf18
commit 6fe2158511
6 changed files with 1590 additions and 627 deletions


@@ -81,3 +81,36 @@ Environment variables via `.env` file (see `.env.example`). Key vars: `BINANCE_A
- **Docker**: `Dockerfile` (Python 3.12-slim) + `docker-compose.yml`
- **CI/CD**: Jenkins pipeline (Gitea → Docker registry → LXC production server)
- Models stored in `models/`, data cache in `data/`, logs in `logs/`
## Design & Implementation Plans
All design documents and implementation plans are stored in `docs/plans/` with the naming convention `YYYY-MM-DD-feature-name.md`. Design docs (`-design.md`) describe architecture decisions; implementation plans (`-plan.md`) contain step-by-step tasks for Claude to execute.
**Chronological plan history:**
| Date | Plan | Status |
|------|------|--------|
| 2026-03-01 | `xrp-futures-autotrader` | Completed |
| 2026-03-01 | `discord-notifier-and-position-recovery` | Completed |
| 2026-03-01 | `upload-to-gitea` | Completed |
| 2026-03-01 | `dockerfile-and-docker-compose` | Completed |
| 2026-03-01 | `fix-pandas-ta-python312` | Completed |
| 2026-03-01 | `jenkins-gitea-registry-cicd` | Completed |
| 2026-03-01 | `ml-filter-design` / `ml-filter-implementation` | Completed |
| 2026-03-01 | `train-on-mac-deploy-to-lxc` | Completed |
| 2026-03-01 | `m4-accelerated-training` | Completed |
| 2026-03-01 | `vectorized-dataset-builder` | Completed |
| 2026-03-01 | `btc-eth-correlation-features` (design + plan) | Completed |
| 2026-03-01 | `dynamic-margin-ratio` (design + plan) | Completed |
| 2026-03-01 | `lgbm-improvement` | Completed |
| 2026-03-01 | `15m-timeframe-upgrade` | Completed |
| 2026-03-01 | `oi-nan-epsilon-precision-threshold` | Completed |
| 2026-03-02 | `rs-divide-mlx-nan-fix` | Completed |
| 2026-03-02 | `reverse-signal-reenter` (design + plan) | Completed |
| 2026-03-02 | `realtime-oi-funding-features` | Completed |
| 2026-03-02 | `oi-funding-accumulation` | Completed |
| 2026-03-02 | `optuna-hyperparam-tuning` (design + plan) | Completed |
| 2026-03-02 | `user-data-stream-tp-sl-detection` (design + plan) | Completed |
| 2026-03-02 | `adx-filter-design` | Completed |
| 2026-03-02 | `hold-negative-sampling` (design + plan) | Completed |
| 2026-03-03 | `optuna-precision-objective-plan` | Pending |


@@ -0,0 +1,80 @@
# Change the Optuna objective to focus on precision
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Change the Optuna objective, which currently optimizes only ROC-AUC, so that it **maximizes precision under the constraint recall >= 0.35**. AUC is a threshold-independent metric, so it does not reflect the performance (precision) seen at actual operating time. Because false positives (= wrong entries) cause real losses, precision-first optimization is needed.
**Tech Stack:** Python, LightGBM, Optuna, scikit-learn
---
## Files changed
- `scripts/tune_hyperparams.py` (the only file to change)
---
## Implementation steps
### 1. Add the `_find_best_precision_at_recall` helper function
- Use `sklearn.metrics.precision_recall_curve` to return the maximum precision, and its threshold, among points where recall >= min_recall
- If no point satisfies the condition, fall back to `(0.0, 0.0, 0.50)`
- Same logic as train_model.py:277-292
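To make the constraint concrete, here is a minimal pure-Python sketch of the selection rule from step 1. It is not the planned implementation: instead of calling `precision_recall_curve`, it sweeps every distinct score as a candidate threshold so the example has no dependencies. The function name `best_precision_at_recall` and the toy labels and scores are hypothetical.

```python
def best_precision_at_recall(y_true, proba, min_recall=0.35):
    """Return (precision, recall, threshold) with the highest precision
    among thresholds whose recall is at least min_recall.

    Falls back to (0.0, 0.0, 0.50) when no threshold qualifies.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        return (0.0, 0.0, 0.50)

    best = None
    # Ascending sweep: on a precision tie the lowest threshold is kept.
    for thr in sorted(set(proba)):
        tp = sum(1 for y, p in zip(y_true, proba) if p >= thr and y == 1)
        fp = sum(1 for y, p in zip(y_true, proba) if p >= thr and y == 0)
        precision = tp / (tp + fp)  # tp + fp >= 1 because thr is one of the scores
        recall = tp / n_pos
        if recall >= min_recall and (best is None or precision > best[0]):
            best = (precision, recall, thr)
    return best if best is not None else (0.0, 0.0, 0.50)


# Toy data (hypothetical): three positives among six samples.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.8, 0.9]
print(best_precision_at_recall(labels, scores, min_recall=0.35))
print(best_precision_at_recall(labels, scores, min_recall=0.70))
```

Raising `min_recall` from 0.35 to 0.70 forces a lower threshold here, which trades precision for recall. That is the trade-off the `--min-recall` option exposes.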
### 2. Modify `_walk_forward_cv`
- Old return value: `(mean_auc, fold_aucs)` → new: `(mean_score, details_dict)`
- `details_dict` keys: `fold_aucs`, `fold_precisions`, `fold_recalls`, `fold_thresholds`, `fold_n_pos`, `mean_auc`, `mean_precision`, `mean_recall`
- **Score formula**: `precision + auc * 0.001` (AUC is a tiebreaker when precision is tied)
- If a fold has fewer than 3 positives, set that fold's precision to 0.0 and exclude it from the mean
- New argument: `min_recall: float = 0.35`
- New import: `from sklearn.metrics import precision_recall_curve`
- Pruning: report only folds with enough positives, to avoid false pruning
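The score formula and the fold-exclusion rule can be sketched in a few lines. This is an illustration only, assuming that precision is averaged over folds with at least 3 positives while AUC is averaged over all folds. The helper name `composite_score` and the fold values are hypothetical.

```python
from statistics import mean

MIN_POSITIVES = 3  # folds with fewer positives are left out of the precision mean


def composite_score(fold_precisions, fold_aucs, fold_n_pos, min_pos=MIN_POSITIVES):
    """Return (score, mean_precision, mean_auc) for one trial."""
    valid = [p for p, n in zip(fold_precisions, fold_n_pos) if n >= min_pos]
    mean_precision = mean(valid) if valid else 0.0
    mean_auc = mean(fold_aucs) if fold_aucs else 0.5
    # AUC is scaled by 0.001 so it can only break near-ties in precision.
    return mean_precision + mean_auc * 0.001, mean_precision, mean_auc


# Hypothetical fold results: the second fold has a single positive sample,
# so its precision of 0.0 does not drag the mean down.
score, prec, auc = composite_score([0.5, 0.0, 0.7], [0.9, 0.5, 0.8], [10, 1, 8])
print(f"precision={prec:.3f} auc={auc:.3f} score={score:.6f}")

# With equal precision, the trial with the higher AUC scores higher.
low_auc, _, _ = composite_score([0.6], [0.80], [5])
high_auc, _, _ = composite_score([0.6], [0.90], [5])
assert high_auc > low_auc
```

Because AUC lies in [0, 1], the tiebreaker term never exceeds 0.001. Any precision gap larger than that therefore decides the ranking on its own.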
### 3. Modify `make_objective`
- Add a `min_recall` argument → pass it to `_walk_forward_cv`
- Store precision/recall/threshold/n_pos etc. via `trial.set_user_attr`
- Return value: `mean_score` (precision + auc * 0.001)
### 4. Modify `measure_baseline`
- Add a `min_recall` argument
- Change the return value to `(mean_score, details_dict)`
### 5. Add the `--min-recall` CLI argument
- `parser.add_argument("--min-recall", type=float, default=0.35)`
- Pass it to `make_objective` and `measure_baseline`
### 6. Modify `print_report`
- Show Best Score, Precision, and AUC
- Show per-fold AUC + Precision + Recall + Threshold + positive count
- Show the improvement over the baseline in terms of precision
### 7. Modify `save_results`
- Add `min_recall_constraint` and precision/recall/threshold fields to the JSON
- Add `score`, `precision`, `recall`, `threshold`, `fold_precisions`, `fold_recalls`, `fold_thresholds`, `fold_n_pos` to `best_trial`
- Keep the `best_trial.params` structure unchanged (backward compatibility)
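The backward-compatibility point in step 7 can be illustrated with a small reader that touches only `best_trial.params`. This is a sketch under assumptions: the plan does not show how `train_model.py` actually parses the file, so `load_tuned_params` is a hypothetical stand-in and every value in the two payloads is a placeholder.

```python
import json


def load_tuned_params(text: str) -> dict:
    """Return best_trial.params from a tuning-result JSON document."""
    payload = json.loads(text)
    return dict(payload["best_trial"]["params"])


# Old layout (AUC only) and new layout (extra precision fields).
# All numbers are placeholders, not real tuning results.
old_format = json.dumps({
    "best_trial": {"number": 7, "auc": 0.9, "params": {"num_leaves": 20}},
})
new_format = json.dumps({
    "min_recall_constraint": 0.35,
    "best_trial": {
        "number": 7,
        "score": 0.6009,
        "auc": 0.9,
        "precision": 0.6,
        "recall": 0.4,
        "params": {"num_leaves": 20},
    },
})

# A reader that only looks at best_trial.params sees no difference.
assert load_tuned_params(old_format) == load_tuned_params(new_format)
```

As long as the new fields are added next to `params` rather than nested inside it, a consumer written against the old layout should keep working unchanged.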
### 8. Comparison logic and other changes
- line 440: `study.best_value > baseline_auc` → `study.best_value > baseline_score`
- `study_name`: `"lgbm_wf_auc"` → `"lgbm_wf_precision"`
- progress callback: show Precision and AUC together
- `n_warmup_steps` 2 → 3 (because precision is noisier than AUC)
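The effect of raising `n_warmup_steps` can be seen with a simplified stand-in for a median pruning rule. This is not Optuna's implementation: it assumes pruning is skipped while `step < n_warmup_steps` and compares only the current value against the median of earlier trials at the same step. The function `should_prune` and the sample values are hypothetical.

```python
from statistics import median


def should_prune(step, value, earlier_values_at_step, n_warmup_steps=3):
    """Simplified median rule for a score that is being maximized."""
    if step < n_warmup_steps:
        return False  # still warming up: never prune
    if not earlier_values_at_step:
        return False  # nothing to compare against yet
    return value < median(earlier_values_at_step)


# Fold indices are 0-based, so a 5-fold run reports steps 0..4.
earlier = [0.5, 0.6, 0.7]  # placeholder scores from previous trials
prunable = [s for s in range(5) if should_prune(s, 0.4, earlier)]
print("steps where a weak trial could be pruned:", prunable)

# With only 3 folds (steps 0..2) this rule can never fire.
assert not any(should_prune(s, 0.4, earlier) for s in range(3))
```

Under these assumptions, a quick run with `--folds 3` never prunes, while the default 5 folds leave only the last two folds as pruning checkpoints.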
---
## How to verify
```bash
# Default run (min_recall=0.35)
python scripts/tune_hyperparams.py --trials 10 --folds 3
# Adjust min_recall
python scripts/tune_hyperparams.py --trials 10 --min-recall 0.4
# Confirm the existing tests still pass
bash scripts/run_tests.sh
```
Checkpoints:
- Per-fold precision/recall/threshold appear in the report
- The recall >= min_recall constraint works correctly
- active_lgbm_params.json is updated based on precision
- train_model.py reads the new JSON format the same way as before

File diff suppressed because it is too large


@@ -401,5 +401,30 @@
"reg_lambda": 0.80039
},
"weight_scale": 0.718348
},
{
"date": "2026-03-03T00:39:05.427160",
"backend": "lgbm",
"auc": 0.9436,
"best_threshold": 0.3041,
"best_precision": 0.467,
"best_recall": 0.269,
"samples": 1524,
"features": 23,
"time_weight_decay": 0.5,
"model_path": "models/lgbm_filter.pkl",
"tuned_params_path": "models/active_lgbm_params.json",
"lgbm_params": {
"n_estimators": 221,
"learning_rate": 0.031072,
"max_depth": 5,
"num_leaves": 20,
"min_child_samples": 39,
"subsample": 0.83244,
"colsample_bytree": 0.526349,
"reg_alpha": 0.062177,
"reg_lambda": 0.082872
},
"weight_scale": 1.431662
}
]


@@ -17,7 +17,7 @@ import joblib
import lightgbm as lgb
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score, classification_report
from sklearn.metrics import roc_auc_score, classification_report, precision_recall_curve
from src.indicators import Indicators
from src.ml_features import build_features, FEATURE_COLS
@@ -275,7 +275,6 @@ def train(data_path: str, time_weight_decay: float = 2.0, tuned_params_path: str
auc = roc_auc_score(y_val, val_proba)
# Find the best threshold: maximize precision subject to a minimum recall (0.15)
from sklearn.metrics import precision_recall_curve
precisions, recalls, thresholds = precision_recall_curve(y_val, val_proba)
# Drop the last element of precision_recall_curve, which is (1.0, 0.0)
precisions, recalls = precisions[:-1], recalls[:-1]
@@ -375,6 +374,7 @@ def walk_forward_auc(
train_end_start = int(n * train_ratio)
aucs = []
fold_metrics = []
for i in range(n_splits):
tr_end = train_end_start + i * step
val_end = tr_end + step
@@ -395,12 +395,30 @@ def walk_forward_auc(
proba = model.predict_proba(X_val)[:, 1]
auc = roc_auc_score(y_val, proba) if len(np.unique(y_val)) > 1 else 0.5
aucs.append(auc)
# Per-fold best threshold (maximize precision subject to recall >= 0.15)
MIN_RECALL = 0.15
precs, recs, thrs = precision_recall_curve(y_val, proba)
precs, recs = precs[:-1], recs[:-1]
valid_idx = np.where(recs >= MIN_RECALL)[0]
if len(valid_idx) > 0:
best_i = valid_idx[np.argmax(precs[valid_idx])]
f_thr, f_prec, f_rec = float(thrs[best_i]), float(precs[best_i]), float(recs[best_i])
else:
f_thr, f_prec, f_rec = 0.50, 0.0, 0.0
fold_metrics.append({"auc": auc, "precision": f_prec, "recall": f_rec, "threshold": f_thr})
print(
f" 폴드 {i+1}/{n_splits}: 학습={tr_end}개, "
f"검증={tr_end}~{val_end} ({step}개), AUC={auc:.4f}"
f"검증={tr_end}~{val_end} ({step}개), AUC={auc:.4f} | "
f"Thr={f_thr:.4f} Prec={f_prec:.3f} Rec={f_rec:.3f}"
)
mean_prec = np.mean([m["precision"] for m in fold_metrics])
mean_rec = np.mean([m["recall"] for m in fold_metrics])
mean_thr = np.mean([m["threshold"] for m in fold_metrics])
print(f"\n Walk-Forward 평균 AUC: {np.mean(aucs):.4f} ± {np.std(aucs):.4f}")
print(f" 평균 Precision: {mean_prec:.3f} | 평균 Recall: {mean_rec:.3f} | 평균 Threshold: {mean_thr:.4f}")
print(f" 폴드별: {[round(a, 4) for a in aucs]}")


@@ -7,6 +7,7 @@ Optuna를 사용한 LightGBM 하이퍼파라미터 자동 탐색.
python scripts/tune_hyperparams.py --trials 10 --folds 3 # quick test
python scripts/tune_hyperparams.py --data data/combined_15m.parquet --trials 100
python scripts/tune_hyperparams.py --no-baseline # skip the baseline measurement
python scripts/tune_hyperparams.py --min-recall 0.4 # adjust the minimum recall constraint
Results:
- Console: Best Params + Walk-Forward report
@@ -28,7 +29,7 @@ import lightgbm as lgb
import optuna
from optuna.samplers import TPESampler
from optuna.pruners import MedianPruner
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_auc_score, precision_recall_curve
from src.ml_features import FEATURE_COLS
from src.dataset_builder import generate_dataset_vectorized, stratified_undersample
@@ -82,6 +83,37 @@ def load_dataset(data_path: str) -> tuple[np.ndarray, np.ndarray, np.ndarray, np
return X, y, w, source
# ──────────────────────────────────────────────
# Precision helper
# ──────────────────────────────────────────────
def _find_best_precision_at_recall(
    y_true: np.ndarray,
    proba: np.ndarray,
    min_recall: float = 0.35,
) -> tuple[float, float, float]:
    """
    Return the maximum precision on the precision_recall_curve that satisfies
    recall >= min_recall, together with the matching threshold.
    Returns:
        (best_precision, best_recall, best_threshold)
        (0.0, 0.0, 0.50) if no point satisfies the condition
    """
    precisions, recalls, thresholds = precision_recall_curve(y_true, proba)
    precisions, recalls = precisions[:-1], recalls[:-1]
    valid_idx = np.where(recalls >= min_recall)[0]
    if len(valid_idx) > 0:
        best_idx = valid_idx[np.argmax(precisions[valid_idx])]
        return (
            float(precisions[best_idx]),
            float(recalls[best_idx]),
            float(thresholds[best_idx]),
        )
    return (0.0, 0.0, 0.50)
# ──────────────────────────────────────────────
# Walk-Forward cross-validation
# ──────────────────────────────────────────────
@@ -94,17 +126,28 @@ def _walk_forward_cv(
params: dict,
n_splits: int,
train_ratio: float,
min_recall: float = 0.35,
trial: "optuna.Trial | None" = None,
) -> tuple[float, list[float]]:
) -> tuple[float, dict]:
"""
Return the mean AUC from Walk-Forward cross-validation.
Return a precision-based composite score from Walk-Forward cross-validation.
Score = mean_precision + mean_auc * 0.001 (AUC is a tiebreaker)
If a trial is given, report the intermediate value to Optuna after each fold to enable pruning.
Returns:
(mean_score, details) where details contains per-fold metrics.
"""
n = len(X)
step = max(1, int(n * (1 - train_ratio) / n_splits))
train_end_start = int(n * train_ratio)
fold_aucs: list[float] = []
fold_precisions: list[float] = []
fold_recalls: list[float] = []
fold_thresholds: list[float] = []
fold_n_pos: list[int] = []
scores_so_far: list[float] = []
for fold_idx in range(n_splits):
tr_end = train_end_start + fold_idx * step
@@ -119,8 +162,14 @@ def _walk_forward_cv(
source_tr = source[:tr_end]
bal_idx = stratified_undersample(y_tr, source_tr, seed=42)
n_pos = int(y_val.sum())
if len(bal_idx) < 20 or len(np.unique(y_val)) < 2:
fold_aucs.append(0.5)
fold_precisions.append(0.0)
fold_recalls.append(0.0)
fold_thresholds.append(0.50)
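fold_n_pos.append(n_pos)
scores_so_far.append(0.5 * 0.001)  # placeholder score (precision 0.0, AUC 0.5) so scores_so_far stays index-aligned with fold_n_pos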
continue
model = lgb.LGBMClassifier(**params, random_state=42, verbose=-1)
@@ -132,14 +181,47 @@ def _walk_forward_cv(
auc = roc_auc_score(y_val, proba) if len(np.unique(y_val)) > 1 else 0.5
fold_aucs.append(float(auc))
# Optuna pruning: report the intermediate value
if trial is not None:
trial.report(float(np.mean(fold_aucs)), step=fold_idx)
if trial.should_prune():
raise optuna.TrialPruned()
# Precision at recall-constrained threshold
if n_pos >= 3:
prec, rec, thr = _find_best_precision_at_recall(y_val, proba, min_recall)
else:
prec, rec, thr = 0.0, 0.0, 0.50
fold_precisions.append(prec)
fold_recalls.append(rec)
fold_thresholds.append(thr)
fold_n_pos.append(n_pos)
# Pruning: report only the scores of folds with enough positives
score = prec + auc * 0.001
scores_so_far.append(score)
if trial is not None and n_pos >= 3:
valid_scores = [s for s, np_ in zip(scores_so_far, fold_n_pos) if np_ >= 3]
if valid_scores:
trial.report(float(np.mean(valid_scores)), step=fold_idx)
if trial.should_prune():
raise optuna.TrialPruned()
# Average precision over folds with enough positives only
valid_precs = [p for p, np_ in zip(fold_precisions, fold_n_pos) if np_ >= 3]
mean_auc = float(np.mean(fold_aucs)) if fold_aucs else 0.5
return mean_auc, fold_aucs
mean_prec = float(np.mean(valid_precs)) if valid_precs else 0.0
valid_recs = [r for r, np_ in zip(fold_recalls, fold_n_pos) if np_ >= 3]
mean_rec = float(np.mean(valid_recs)) if valid_recs else 0.0
mean_score = mean_prec + mean_auc * 0.001
details = {
"fold_aucs": fold_aucs,
"fold_precisions": fold_precisions,
"fold_recalls": fold_recalls,
"fold_thresholds": fold_thresholds,
"fold_n_pos": fold_n_pos,
"mean_auc": mean_auc,
"mean_precision": mean_prec,
"mean_recall": mean_rec,
}
return mean_score, details
# ──────────────────────────────────────────────
@@ -153,6 +235,7 @@ def make_objective(
source: np.ndarray,
n_splits: int,
train_ratio: float,
min_recall: float = 0.35,
):
"""Return an objective function that captures the dataset in a closure."""
@@ -190,23 +273,31 @@ def make_objective(
"reg_lambda": reg_lambda,
}
mean_auc, fold_aucs = _walk_forward_cv(
mean_score, details = _walk_forward_cv(
X, y, w_scaled, source, params,
n_splits=n_splits,
train_ratio=train_ratio,
min_recall=min_recall,
trial=trial,
)
# Store per-fold AUCs in user_attrs (for the result report)
trial.set_user_attr("fold_aucs", fold_aucs)
# Store detailed per-fold metrics in user_attrs (for the result report)
trial.set_user_attr("fold_aucs", details["fold_aucs"])
trial.set_user_attr("fold_precisions", details["fold_precisions"])
trial.set_user_attr("fold_recalls", details["fold_recalls"])
trial.set_user_attr("fold_thresholds", details["fold_thresholds"])
trial.set_user_attr("fold_n_pos", details["fold_n_pos"])
trial.set_user_attr("mean_auc", details["mean_auc"])
trial.set_user_attr("mean_precision", details["mean_precision"])
trial.set_user_attr("mean_recall", details["mean_recall"])
return mean_auc
return mean_score
return objective
# ──────────────────────────────────────────────
# Baseline AUC measurement (current fixed parameters)
# Baseline measurement (current fixed parameters)
# ──────────────────────────────────────────────
def measure_baseline(
@@ -216,8 +307,9 @@ def measure_baseline(
source: np.ndarray,
n_splits: int,
train_ratio: float,
) -> tuple[float, list[float]]:
"""Measure the baseline AUC with the current production parameters (active file or hard-coded defaults)."""
min_recall: float = 0.35,
) -> tuple[float, dict]:
"""Measure the baseline with the current production parameters (active file or hard-coded defaults)."""
active_path = Path("models/active_lgbm_params.json")
if active_path.exists():
@@ -241,7 +333,11 @@ def measure_baseline(
}
print("베이스라인 측정 중 (active 파일 없음 → 코드 내 기본 파라미터)...")
return _walk_forward_cv(X, y, w, source, baseline_params, n_splits=n_splits, train_ratio=train_ratio)
return _walk_forward_cv(
X, y, w, source, baseline_params,
n_splits=n_splits, train_ratio=train_ratio,
min_recall=min_recall,
)
# ──────────────────────────────────────────────
@@ -250,17 +346,24 @@ def measure_baseline(
def print_report(
study: optuna.Study,
baseline_auc: float,
baseline_folds: list[float],
baseline_score: float,
baseline_details: dict,
elapsed_sec: float,
output_path: Path,
min_recall: float,
) -> None:
"""Print the final report to the console."""
best = study.best_trial
best_auc = best.value
best_folds = best.user_attrs.get("fold_aucs", [])
improvement = best_auc - baseline_auc
improvement_pct = (improvement / baseline_auc * 100) if baseline_auc > 0 else 0.0
best_score = best.value
best_prec = best.user_attrs.get("mean_precision", 0.0)
best_auc = best.user_attrs.get("mean_auc", 0.0)
best_rec = best.user_attrs.get("mean_recall", 0.0)
baseline_prec = baseline_details.get("mean_precision", 0.0)
baseline_auc = baseline_details.get("mean_auc", 0.0)
prec_improvement = best_prec - baseline_prec
prec_improvement_pct = (prec_improvement / baseline_prec * 100) if baseline_prec > 0 else 0.0
elapsed_min = int(elapsed_sec // 60)
elapsed_s = int(elapsed_sec % 60)
@@ -276,11 +379,15 @@ def print_report(
f"(완료={len(completed)}, 조기종료={len(pruned)}) | "
f"소요: {elapsed_min}{elapsed_s}")
print(sep)
print(f" Best AUC : {best_auc:.4f} (Trial #{best.number})")
if baseline_auc > 0:
sign = "+" if improvement >= 0 else ""
print(f" Baseline : {baseline_auc:.4f} (현재 train_model.py 고정값)")
print(f" 개선폭 : {sign}{improvement:.4f} ({sign}{improvement_pct:.1f}%)")
print(f" 최적화 지표: Precision (recall >= {min_recall} 제약)")
print(f" Best Prec : {best_prec:.4f} (Trial #{best.number})")
print(f" Best AUC : {best_auc:.4f}")
print(f" Best Recall: {best_rec:.4f}")
if baseline_score > 0:
sign = "+" if prec_improvement >= 0 else ""
print(dash)
print(f" Baseline : Prec={baseline_prec:.4f}, AUC={baseline_auc:.4f}")
print(f" 개선폭 : Precision {sign}{prec_improvement:.4f} ({sign}{prec_improvement_pct:.1f}%)")
print(dash)
print(" Best Parameters:")
for k, v in best.params.items():
@@ -289,19 +396,42 @@ def print_report(
else:
print(f" {k:<22}: {v}")
print(dash)
print(" Walk-Forward 폴드별 AUC (Best Trial):")
for i, auc in enumerate(best_folds, 1):
print(f" 폴드 {i}: {auc:.4f}")
if best_folds:
arr = np.array(best_folds)
print(f" 평균: {arr.mean():.4f} ± {arr.std():.4f}")
if baseline_folds:
# Per-fold details
fold_aucs = best.user_attrs.get("fold_aucs", [])
fold_precs = best.user_attrs.get("fold_precisions", [])
fold_recs = best.user_attrs.get("fold_recalls", [])
fold_thrs = best.user_attrs.get("fold_thresholds", [])
fold_npos = best.user_attrs.get("fold_n_pos", [])
print(" Walk-Forward 폴드별 상세 (Best Trial):")
for i, (auc, prec, rec, thr, npos) in enumerate(
zip(fold_aucs, fold_precs, fold_recs, fold_thrs, fold_npos), 1
):
print(f" 폴드 {i}: AUC={auc:.4f} Prec={prec:.3f} Rec={rec:.3f} Thr={thr:.3f} (양성={npos})")
if fold_precs:
valid_precs = [p for p, np_ in zip(fold_precs, fold_npos) if np_ >= 3]
if valid_precs:
arr_p = np.array(valid_precs)
print(f" 평균 Precision: {arr_p.mean():.4f} ± {arr_p.std():.4f}")
if fold_aucs:
arr_a = np.array(fold_aucs)
print(f" 평균 AUC: {arr_a.mean():.4f} ± {arr_a.std():.4f}")
# Baseline per-fold details
bl_folds = baseline_details.get("fold_aucs", [])
bl_precs = baseline_details.get("fold_precisions", [])
bl_recs = baseline_details.get("fold_recalls", [])
bl_thrs = baseline_details.get("fold_thresholds", [])
bl_npos = baseline_details.get("fold_n_pos", [])
if bl_folds:
print(dash)
print(" Baseline 폴드별 AUC:")
for i, auc in enumerate(baseline_folds, 1):
print(f" 폴드 {i}: {auc:.4f}")
arr = np.array(baseline_folds)
print(f" 평균: {arr.mean():.4f} ± {arr.std():.4f}")
print(" Baseline 폴드별 상세:")
for i, (auc, prec, rec, thr, npos) in enumerate(
zip(bl_folds, bl_precs, bl_recs, bl_thrs, bl_npos), 1
):
print(f" 폴드 {i}: AUC={auc:.4f} Prec={prec:.3f} Rec={rec:.3f} Thr={thr:.3f} (양성={npos})")
print(dash)
print(f" 결과 저장: {output_path}")
print(f" 다음 단계: python scripts/train_model.py (파라미터 수동 반영 후)")
@@ -310,10 +440,11 @@ def print_report(
def save_results(
study: optuna.Study,
baseline_auc: float,
baseline_folds: list[float],
baseline_score: float,
baseline_details: dict,
elapsed_sec: float,
data_path: str,
min_recall: float,
) -> Path:
"""Save the results to a JSON file and return its path."""
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
@@ -327,8 +458,12 @@ def save_results(
if t.state == optuna.trial.TrialState.COMPLETE:
all_trials.append({
"number": t.number,
"auc": round(t.value, 6),
"score": round(t.value, 6),
"auc": round(t.user_attrs.get("mean_auc", 0.0), 6),
"precision": round(t.user_attrs.get("mean_precision", 0.0), 6),
"recall": round(t.user_attrs.get("mean_recall", 0.0), 6),
"fold_aucs": [round(a, 6) for a in t.user_attrs.get("fold_aucs", [])],
"fold_precisions": [round(p, 6) for p in t.user_attrs.get("fold_precisions", [])],
"params": {
k: (round(v, 6) if isinstance(v, float) else v)
for k, v in t.params.items()
@@ -336,19 +471,33 @@ def save_results(
})
result = {
"timestamp": datetime.now().isoformat(),
"data_path": data_path,
"n_trials_total": len(study.trials),
"n_trials_complete": len(all_trials),
"elapsed_sec": round(elapsed_sec, 1),
"timestamp": datetime.now().isoformat(),
"data_path": data_path,
"min_recall_constraint": min_recall,
"n_trials_total": len(study.trials),
"n_trials_complete": len(all_trials),
"elapsed_sec": round(elapsed_sec, 1),
"baseline": {
"auc": round(baseline_auc, 6),
"fold_aucs": [round(a, 6) for a in baseline_folds],
"score": round(baseline_score, 6),
"auc": round(baseline_details.get("mean_auc", 0.0), 6),
"precision": round(baseline_details.get("mean_precision", 0.0), 6),
"recall": round(baseline_details.get("mean_recall", 0.0), 6),
"fold_aucs": [round(a, 6) for a in baseline_details.get("fold_aucs", [])],
"fold_precisions": [round(p, 6) for p in baseline_details.get("fold_precisions", [])],
"fold_recalls": [round(r, 6) for r in baseline_details.get("fold_recalls", [])],
"fold_thresholds": [round(t, 6) for t in baseline_details.get("fold_thresholds", [])],
},
"best_trial": {
"number": best.number,
"auc": round(best.value, 6),
"fold_aucs": [round(a, 6) for a in best.user_attrs.get("fold_aucs", [])],
"number": best.number,
"score": round(best.value, 6),
"auc": round(best.user_attrs.get("mean_auc", 0.0), 6),
"precision": round(best.user_attrs.get("mean_precision", 0.0), 6),
"recall": round(best.user_attrs.get("mean_recall", 0.0), 6),
"fold_aucs": [round(a, 6) for a in best.user_attrs.get("fold_aucs", [])],
"fold_precisions": [round(p, 6) for p in best.user_attrs.get("fold_precisions", [])],
"fold_recalls": [round(r, 6) for r in best.user_attrs.get("fold_recalls", [])],
"fold_thresholds": [round(t, 6) for t in best.user_attrs.get("fold_thresholds", [])],
"fold_n_pos": best.user_attrs.get("fold_n_pos", []),
"params": {
k: (round(v, 6) if isinstance(v, float) else v)
for k, v in best.params.items()
@@ -373,6 +522,7 @@ def main():
parser.add_argument("--trials", type=int, default=50, help="Optuna trial 수 (기본: 50)")
parser.add_argument("--folds", type=int, default=5, help="Walk-Forward 폴드 수 (기본: 5)")
parser.add_argument("--train-ratio", type=float, default=0.6, help="학습 구간 비율 (기본: 0.6)")
parser.add_argument("--min-recall", type=float, default=0.35, help="최소 재현율 제약 (기본: 0.35)")
parser.add_argument("--no-baseline", action="store_true", help="베이스라인 측정 건너뜀")
args = parser.parse_args()
@@ -381,29 +531,40 @@ def main():
# 2. Measure the baseline
if args.no_baseline:
baseline_auc, baseline_folds = 0.0, []
baseline_score, baseline_details = 0.0, {}
print("베이스라인 측정 건너뜀 (--no-baseline)\n")
else:
baseline_auc, baseline_folds = measure_baseline(X, y, w, source, args.folds, args.train_ratio)
baseline_score, baseline_details = measure_baseline(
X, y, w, source, args.folds, args.train_ratio, args.min_recall,
)
bl_prec = baseline_details.get("mean_precision", 0.0)
bl_auc = baseline_details.get("mean_auc", 0.0)
bl_rec = baseline_details.get("mean_recall", 0.0)
print(
f"베이스라인 AUC: {baseline_auc:.4f} "
f"(폴드별: {[round(a, 4) for a in baseline_folds]})\n"
f"베이스라인: Prec={bl_prec:.4f}, AUC={bl_auc:.4f}, Recall={bl_rec:.4f} "
f"(recall >= {args.min_recall} 제약)\n"
)
# 3. Run the Optuna study
optuna.logging.set_verbosity(optuna.logging.WARNING)
sampler = TPESampler(seed=42)
pruner = MedianPruner(n_startup_trials=5, n_warmup_steps=2)
pruner = MedianPruner(n_startup_trials=5, n_warmup_steps=3)
study = optuna.create_study(
direction="maximize",
sampler=sampler,
pruner=pruner,
study_name="lgbm_wf_auc",
study_name="lgbm_wf_precision",
)
objective = make_objective(X, y, w, source, n_splits=args.folds, train_ratio=args.train_ratio)
objective = make_objective(
X, y, w, source,
n_splits=args.folds,
train_ratio=args.train_ratio,
min_recall=args.min_recall,
)
print(f"Optuna 탐색 시작: {args.trials} trials, {args.folds}폴드 Walk-Forward")
print(f"최적화 지표: Precision (recall >= {args.min_recall} 제약)")
print("(trial 완료마다 진행 상황 출력)\n")
start_time = time.time()
@@ -411,12 +572,13 @@ def main():
def _progress_callback(study: optuna.Study, trial: optuna.trial.FrozenTrial) -> None:
if trial.state == optuna.trial.TrialState.COMPLETE:
best_so_far = study.best_value
leaves = trial.params.get("num_leaves", "?")
depth = trial.params.get("max_depth", "?")
prec = trial.user_attrs.get("mean_precision", 0.0)
auc = trial.user_attrs.get("mean_auc", 0.0)
print(
f" Trial #{trial.number:3d} | AUC={trial.value:.4f} "
f" Trial #{trial.number:3d} | Prec={prec:.4f} AUC={auc:.4f} "
f"| Best={best_so_far:.4f} "
f"| leaves={leaves} depth={depth}"
f"| leaves={trial.params.get('num_leaves', '?')} "
f"depth={trial.params.get('max_depth', '?')}"
)
elif trial.state == optuna.trial.TrialState.PRUNED:
print(f" Trial #{trial.number:3d} | PRUNED (조기 종료)")
@@ -431,21 +593,32 @@ def main():
elapsed = time.time() - start_time
# 4. Save and print the results
output_path = save_results(study, baseline_auc, baseline_folds, elapsed, args.data)
print_report(study, baseline_auc, baseline_folds, elapsed, output_path)
output_path = save_results(
study, baseline_score, baseline_details, elapsed, args.data, args.min_recall,
)
print_report(
study, baseline_score, baseline_details, elapsed, output_path, args.min_recall,
)
# 5. Automatically update the active file when performance improves
import shutil
active_path = Path("models/active_lgbm_params.json")
if not args.no_baseline and study.best_value > baseline_auc:
if not args.no_baseline and study.best_value > baseline_score:
shutil.copy(output_path, active_path)
improvement = study.best_value - baseline_auc
print(f"[MLOps] AUC +{improvement:.4f} 개선 → {active_path} 자동 갱신 완료")
best_prec = study.best_trial.user_attrs.get("mean_precision", 0.0)
bl_prec = baseline_details.get("mean_precision", 0.0)
improvement = best_prec - bl_prec
print(f"[MLOps] Precision +{improvement:.4f} 개선 → {active_path} 자동 갱신 완료")
print(f"[MLOps] 다음 train_model.py 실행 시 새 파라미터가 자동 적용됩니다.\n")
elif args.no_baseline:
print("[MLOps] --no-baseline 모드: 성능 비교 없이 active 파일 유지\n")
else:
print(f"[MLOps] 성능 개선 없음 (Best={study.best_value:.4f} ≤ Baseline={baseline_auc:.4f}) → active 파일 유지\n")
best_prec = study.best_trial.user_attrs.get("mean_precision", 0.0)
bl_prec = baseline_details.get("mean_precision", 0.0)
print(
f"[MLOps] 성능 개선 없음 (Prec={best_prec:.4f} ≤ Baseline={bl_prec:.4f}) "
f"→ active 파일 유지\n"
)
if __name__ == "__main__":