feat: implement LightGBM model improvement plan with feature normalization and walk-forward validation

- Added a Markdown document outlining a plan to raise the LightGBM model's AUC from 0.54 to 0.57+ through feature normalization, strong time weighting, and walk-forward validation.
- Implemented rolling z-score normalization for absolute value features in `src/dataset_builder.py` to improve model robustness against regime changes.
- Introduced a walk-forward validation function in `scripts/train_model.py` to accurately measure future prediction performance.
- Added ONNX model export functionality for compatibility.
- Adjusted model training parameters and recorded the new runs' performance metrics and detailed validation results in the training log.
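
The two techniques named in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/dataset_builder.py` or `scripts/train_model.py`: the function names `rolling_zscore` and `walk_forward_splits`, the window and fold parameters, and the choice to normalize each value against only the preceding window are all assumptions made for the example.

```python
import math
from collections import deque
from typing import Iterator, List, Sequence, Tuple


def rolling_zscore(values: Sequence[float], window: int) -> List[float]:
    """Z-score each value against the `window` values that precede it.

    Assumption: only past values enter the statistics, so a row never sees
    its own or future data. Rows without a full window, or whose window has
    zero variance, are set to 0.0.
    """
    out: List[float] = []
    past: deque = deque(maxlen=window)
    for v in values:
        if len(past) == window:
            mean = sum(past) / window
            var = sum((p - mean) ** 2 for p in past) / window
            std = math.sqrt(var)
            out.append((v - mean) / std if std > 0 else 0.0)
        else:
            out.append(0.0)
        past.append(v)
    return out


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges; each test block follows its train block.

    Assumption: an expanding training window. The last fold absorbs any
    leftover samples so every index after `min_train` is tested once.
    """
    fold_size = (n_samples - min_train) // n_folds
    if fold_size <= 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size if k < n_folds - 1 else n_samples
        yield range(0, train_end), range(train_end, test_end)


if __name__ == "__main__":
    print(rolling_zscore([1.0, 2.0, 3.0, 4.0], window=2))
    for train, test in walk_forward_splits(n_samples=10, n_folds=3, min_train=4):
        print(f"train 0..{train.stop - 1}, test {test.start}..{test.stop - 1}")
```

Because every test range lies strictly after its training range, a per-fold AUC from such splits estimates performance on data the model has not yet seen, which a shuffled split cannot do for time-ordered samples.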
21in7
2026-03-01 22:02:32 +09:00
parent c6428af64e
commit a6697e7cca
7 changed files with 487 additions and 22 deletions

Binary file not shown.

Binary file not shown.

Binary file not shown.

@@ -45,5 +45,95 @@
"samples": 3269,
"features": 21,
"model_path": "models/lgbm_filter.pkl"
},
{
"date": "2026-03-01T21:46:29.599674",
"backend": "mlx",
"auc": 0.516,
"samples": 6470,
"train_sec": 1.3,
"time_weight_decay": 2.0,
"model_path": "models/mlx_filter.weights"
},
{
"date": "2026-03-01T21:50:12.449819",
"backend": "lgbm",
"auc": 0.4772,
"samples": 6470,
"features": 21,
"time_weight_decay": 2.0,
"model_path": "models/lgbm_filter.pkl"
},
{
"date": "2026-03-01T21:50:32.491318",
"backend": "lgbm",
"auc": 0.4943,
"samples": 6470,
"features": 21,
"time_weight_decay": 2.0,
"model_path": "models/lgbm_filter.pkl"
},
{
"date": "2026-03-01T21:50:48.665654",
"backend": "lgbm",
"auc": 0.4943,
"samples": 6470,
"features": 21,
"time_weight_decay": 2.0,
"model_path": "models/lgbm_filter.pkl"
},
{
"date": "2026-03-01T21:51:02.539565",
"backend": "lgbm",
"auc": 0.4943,
"samples": 6470,
"features": 21,
"time_weight_decay": 2.0,
"model_path": "models/lgbm_filter.pkl"
},
{
"date": "2026-03-01T21:51:09.830250",
"backend": "lgbm",
"auc": 0.4925,
"samples": 1716,
"features": 13,
"time_weight_decay": 2.0,
"model_path": "models/lgbm_filter.pkl"
},
{
"date": "2026-03-01T21:51:20.133303",
"backend": "lgbm",
"auc": 0.54,
"samples": 1716,
"features": 13,
"time_weight_decay": 2.0,
"model_path": "models/lgbm_filter.pkl"
},
{
"date": "2026-03-01T21:51:25.445363",
"backend": "lgbm",
"auc": 0.4943,
"samples": 6470,
"features": 21,
"time_weight_decay": 2.0,
"model_path": "models/lgbm_filter.pkl"
},
{
"date": "2026-03-01T21:52:24.296191",
"backend": "lgbm",
"auc": 0.54,
"samples": 1716,
"features": 13,
"time_weight_decay": 2.0,
"model_path": "models/lgbm_filter.pkl"
},
{
"date": "2026-03-01T22:00:34.737597",
"backend": "lgbm",
"auc": 0.5097,
"samples": 6470,
"features": 21,
"time_weight_decay": 3.0,
"model_path": "models/lgbm_filter.pkl"
}
]
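
The log entries above record a `time_weight_decay` value (2.0 for most runs, 3.0 for the last), but the diff does not show how it is applied. One plausible reading, offered purely as an assumption, is an exponential recency weight in which the newest sample weighs `exp(decay)` times the oldest. The helper name `time_weights` and the mean-1 normalization are hypothetical.

```python
import math
from typing import List


def time_weights(n_samples: int, decay: float) -> List[float]:
    """Exponential recency weights for time-ordered samples (oldest first).

    Assumption: weight grows as exp(decay * position) with position in [0, 1],
    then is rescaled so the weights average to 1. Not taken from the repository.
    """
    if n_samples <= 0:
        return []
    if n_samples == 1:
        return [1.0]
    raw = [math.exp(decay * (i / (n_samples - 1) - 1.0)) for i in range(n_samples)]
    mean = sum(raw) / n_samples
    return [w / mean for w in raw]


if __name__ == "__main__":
    for decay in (2.0, 3.0):
        w = time_weights(5, decay)
        print(decay, [round(x, 3) for x in w])
```

Under this reading, raising the decay from 2.0 to 3.0 shifts more of the training weight onto recent samples. A list like this could be passed as per-sample weights to a trainer that accepts them.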