cointrader

Author	SHA1	Message	Date
21in7	6fe2158511	feat: enhance precision optimization in model training - Introduced a new plan to modify the Optuna objective function to prioritize precision under a recall constraint of 0.35, improving model performance in scenarios where false positives are costly. - Updated training scripts to implement precision-based metrics and adjusted the walk-forward cross-validation process to incorporate precision and recall calculations. - Enhanced the active LGBM parameters and training log to reflect the new metrics and model configurations. - Added a new design document outlining the implementation steps for the precision-focused optimization. This update aims to refine the model's decision-making process by emphasizing precision, thereby reducing potential losses from false positives.	2026-03-03 00:57:19 +09:00
21in7	fce4d536ea	feat: implement HOLD negative sampling and stratified undersampling in ML pipeline Added HOLD candles as negative samples to increase training data from ~535 to ~3,200 samples. Introduced a negative_ratio parameter in generate_dataset_vectorized() for sampling HOLD candles alongside signal candles. Implemented stratified undersampling to ensure signal samples are preserved during training. Updated relevant tests to validate new functionality and maintain compatibility with existing tests. - Modified dataset_builder.py to include HOLD negative sampling logic - Updated train_model.py to apply stratified undersampling - Added tests for new sampling methods Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 00:13:42 +09:00
21in7	74966590b5	feat: apply stratified undersampling to hyperparameter tuning Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 00:09:43 +09:00
21in7	6cd54b46d9	feat: apply stratified undersampling to training pipeline Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 00:03:09 +09:00
21in7	6d82febab7	feat: implement Active Config pattern for automatic param promotion - tune_hyperparams.py: after the search completes, automatically update models/active_lgbm_params.json if Best AUC > Baseline AUC - tune_hyperparams.py: measure the baseline against the active file (fall back to the in-code defaults if no active file exists) - train_model.py: add automatic discovery of the active file to _load_lgbm_params(); priority: --tuned-params > active_lgbm_params.json > hardcoded defaults - models/active_lgbm_params.json: initialize with the current best parameters - .gitignore: exclude tune_results_*.json, keep the active file tracked in git Made-with: Cursor	2026-03-02 14:56:42 +09:00
21in7	d5f8ed4789	feat: update default LightGBM params to Optuna best (trial #46, AUC=0.6002) Results of the Optuna 50-trial, 5-fold Walk-Forward search (tune_results_20260302_144749.json): - Baseline AUC: 0.5803 → Best AUC: 0.6002 (+0.0199, +3.4%) - n_estimators: 500 → 434 - learning_rate: 0.05 → 0.123659 - max_depth: (unset) → 6 - num_leaves: 31 → 14 - min_child_samples: 15 → 10 - subsample: 0.8 → 0.929062 - colsample_bytree: 0.8 → 0.946330 - reg_alpha: 0.05 → 0.573971 - reg_lambda: 0.1 → 0.000157 - weight_scale: 1.0 → 1.783105 Made-with: Cursor	2026-03-02 14:52:41 +09:00
21in7	ce02f1335c	feat: add run_optuna.sh wrapper script for Optuna tuning Made-with: Cursor	2026-03-02 14:50:50 +09:00
21in7	4afc7506d7	feat: connect Optuna tuning results to train_model.py via --tuned-params - add _load_lgbm_params() helper: returns the default parameters, overridden by a JSON file when one is given - train(): add tuned_params_path argument, apply weight_scale - walk_forward_auc(): add tuned_params_path argument, apply weight_scale - main(): add --tuned-params argparse argument and pass it to both functions - record tuned_params_path, lgbm_params, weight_scale in training_log.json Made-with: Cursor	2026-03-02 14:45:15 +09:00
21in7	caaa81f5f9	fix: add shebang and executable permission to tune_hyperparams.py Made-with: Cursor	2026-03-02 14:41:13 +09:00
21in7	8dd1389b16	feat: add Optuna Walk-Forward AUC hyperparameter tuning pipeline - scripts/tune_hyperparams.py: Optuna + 5-fold Walk-Forward AUC objective function - cache the dataset once and share it across all trials (speed optimization) - enforce the constraint num_leaves <= 2^max_depth - 1 (prevents overfitting on small datasets) - terminate underperforming trials early with MedianPruner - output: console report + models/tune_results_YYYYMMDD_HHMMSS.json - requirements.txt: add optuna>=3.6.0 - README.md: add a usage section on automatic hyperparameter tuning - docs/plans/: add design document and implementation plan Made-with: Cursor	2026-03-02 14:39:07 +09:00
21in7	0fca14a1c2	feat: auto-detect first run in train_and_deploy.sh (365d full vs 35d upsert) Made-with: Cursor	2026-03-02 14:15:00 +09:00
21in7	10b1ecd273	feat: fetch 35 days for daily upsert instead of overwriting 365 days Made-with: Cursor	2026-03-02 14:13:16 +09:00
21in7	016b13a8f1	fix: fill NaN in oi_change/funding_rate after concat when columns missing in existing parquet Made-with: Cursor	2026-03-02 14:13:00 +09:00
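21in7	3c3c7fd56b	feat: add upsert_parquet to accumulate OI/funding data incrementally Added the upsert_parquet() function to work around the Binance OI history API providing only the most recent 30 days. On each daily run, only the ranges where oi_change/funding_rate is 0.0 in the existing parquet are overwritten with new values, gradually filling in historical data. The previous overwrite behavior remains available via the --no-upsert flag. Made-with: Cursor	2026-03-02 14:09:36 +09:00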
21in7	c89374410e	feat: enhance trading bot functionality and documentation - Updated README.md to reflect new features including dynamic margin ratio, model hot-reload, and multi-symbol streaming. - Modified bot logic to ensure raw signals are passed to the `_close_and_reenter` method, even when the ML filter is loaded. - Introduced a new script `run_tests.sh` for streamlined test execution. - Improved test coverage for signal processing and re-entry logic, ensuring correct behavior under various conditions.	2026-03-02 01:51:53 +09:00
21in7	9ec78d76bd	feat: implement immediate re-entry after closing position on reverse signal - Added `_close_and_reenter` method to handle immediate re-entry after closing a position when a reverse signal is detected, contingent on passing the ML filter. - Updated `process_candle` to call `_close_and_reenter` instead of `_close_position` for reverse signals. - Enhanced test coverage for the new functionality, ensuring correct behavior under various conditions, including ML filter checks and position limits.	2026-03-02 01:34:36 +09:00
21in7	725a4349ee	chore: Update MLXFilter model deployment and logging with new training results and ONNX file management - Added new training log entries for lgbm backend with AUC, precision, and recall metrics. - Enhanced deploy_model.sh to manage ONNX and lgbm model files based on the selected backend. - Adjusted output shape in mlx_filter.py for ONNX export to support dynamic batch sizes.	2026-03-02 01:08:12 +09:00
21in7	0f6a22fcb5	feat: change MLX threshold search to precision-first (subject to recall>=0.15) Made-with: Cursor	2026-03-01 23:54:38 +09:00
21in7	aa413f4d7c	feat: change LightGBM threshold search to precision-first (subject to recall>=0.15) Made-with: Cursor	2026-03-01 23:54:13 +09:00
21in7	3b7ee3e890	chore: add .worktrees/ to .gitignore Made-with: Cursor	2026-03-01 23:50:18 +09:00
21in7	24d3ba9411	feat: enhance data fetching and model training with OI and funding rate integration - Updated `fetch_history.py` to collect open interest (OI) and funding rate data from Binance, improving the dataset for model training. - Modified `train_and_deploy.sh` to include options for OI and funding rate collection during data fetching. - Enhanced `dataset_builder.py` to incorporate OI change and funding rate features with rolling z-score normalization. - Updated training logs to reflect new metrics and features, ensuring comprehensive tracking of model performance. - Adjusted feature columns in `ml_features.py` to include OI and funding rate for improved model robustness.	2026-03-01 22:25:38 +09:00
21in7	4245d7cdbf	feat: implement 15-minute timeframe upgrade for model training and data processing - Introduced a new markdown document detailing the plan to transition the entire pipeline from a 1-minute to a 15-minute timeframe, aiming to improve model AUC from 0.49-0.50 to over 0.53. - Updated key parameters across multiple scripts, including `LOOKAHEAD` adjustments and default data paths to reflect the new 15-minute interval. - Modified data fetching and training scripts to ensure compatibility with the new timeframe, including changes in `fetch_history.py`, `train_model.py`, and `train_and_deploy.sh`. - Enhanced the bot's data stream configuration to operate on a 15-minute interval, ensuring real-time data processing aligns with the new model training strategy. - Updated training logs to capture new model performance metrics under the revised timeframe.	2026-03-01 22:16:15 +09:00
21in7	a6697e7cca	feat: implement LightGBM model improvement plan with feature normalization and walk-forward validation - Added a new markdown document outlining the plan to enhance the LightGBM model's AUC from 0.54 to 0.57+ through feature normalization, strong time weighting, and walk-forward validation. - Implemented rolling z-score normalization for absolute value features in `src/dataset_builder.py` to improve model robustness against regime changes. - Introduced a walk-forward validation function in `scripts/train_model.py` to accurately measure future prediction performance. - Updated training log to include new model performance metrics and added ONNX model export functionality for compatibility. - Adjusted model training parameters for better performance and included detailed validation results in the training log.	2026-03-01 22:02:32 +09:00
21in7	c6428af64e	feat: enhance Jenkins pipeline with Discord notifications and model hot-reload functionality - Added a new stage to the Jenkins pipeline to notify Discord when a build starts, succeeds, or fails, improving communication during the CI/CD process. - Implemented model hot-reload functionality in the MLFilter class, allowing automatic reloading of models when file changes are detected, enhancing responsiveness to updates. - Updated deployment scripts to provide clearer messaging regarding model loading and container status, improving user experience and debugging capabilities.	2026-03-01 21:46:36 +09:00
21in7	d9238afaf9	feat: enhance MLX model training with combined data handling - Introduced a new function `_split_combined` to separate XRP, BTC, and ETH data from a combined DataFrame. - Updated `train_mlx` to utilize the new function, improving data management and feature handling. - Adjusted dataset generation to accommodate BTC and ETH features, with warnings for missing features. - Changed default data path in `train_mlx` and `train_model` to point to the combined dataset for consistency. - Increased `LOOKAHEAD` from 60 to 90 and adjusted `ATR_TP_MULT` for better model performance.	2026-03-01 21:43:27 +09:00
21in7	db144750a3	feat: enhance model training and deployment scripts with time-weighted sampling - Updated `train_model.py` and `train_mlx_model.py` to include a time weight decay parameter for improved sample weighting during training. - Modified dataset generation to incorporate sample weights based on time decay, enhancing model performance. - Adjusted deployment scripts to support new backend options and improved error handling for model file transfers. - Added new entries to the training log for better tracking of model performance metrics over time. - Included ONNX model export functionality in the MLX filter for compatibility with Linux servers.	2026-03-01 21:25:06 +09:00
21in7	301457ce57	chore: remove unused risk_per_trade references Made-with: Cursor	2026-03-01 20:39:26 +09:00
21in7	d1af736bfc	feat: implement BTC/ETH correlation features for improved model accuracy - Added a new design document outlining the integration of BTC/ETH candle data as additional features in the XRP ML filter, enhancing prediction accuracy. - Introduced `MultiSymbolStream` for combined WebSocket data retrieval of XRP, BTC, and ETH. - Expanded feature set from 13 to 21 by including 8 new BTC/ETH-related features. - Updated various scripts and modules to support the new feature set and data handling. - Enhanced training and deployment scripts to accommodate the new dataset structure. This commit lays the groundwork for improved model performance by leveraging the correlation between BTC and ETH with XRP.	2026-03-01 19:30:17 +09:00
21in7	de933b97cc	feat: remove in-container retraining, training is now mac-only Made-with: Cursor	2026-03-01 18:54:00 +09:00
21in7	fd96055e73	perf: replace generate_dataset with vectorized version in train_mlx_model Made-with: Cursor	2026-03-01 18:53:21 +09:00
21in7	db134c032a	perf: replace generate_dataset with vectorized version in train_model Made-with: Cursor	2026-03-01 18:52:56 +09:00
21in7	8f834a1890	feat: implement training and deployment pipeline for LightGBM model on Mac to LXC - Added comprehensive plans for training a LightGBM model on M4 Mac Mini and deploying it to an LXC container. - Created scripts for model training, deployment, and a full pipeline execution. - Enhanced model transfer with error handling and logging for better tracking. - Introduced profiling for training time analysis and dataset generation optimization. Made-with: Cursor	2026-03-01 18:30:01 +09:00
21in7	298d4ad95e	feat: enhance train_model.py to dynamically determine CPU count for parallel processing - Added a new function to accurately retrieve the number of allocated CPUs in containerized environments, improving parallel processing efficiency. - Updated the dataset generation function to utilize the new CPU count function, ensuring optimal resource usage during model training. Made-with: Cursor	2026-03-01 17:46:40 +09:00
21in7	b86c88a8d6	feat: add README and enhance scripts for data fetching and model training - Created README.md to document project features, structure, and setup instructions. - Updated fetch_history.py to include path adjustments for module imports. - Enhanced train_model.py for parallel processing of dataset generation and added command-line argument for specifying worker count. Made-with: Cursor	2026-03-01 17:42:12 +09:00
21in7	7e4e9315c2	feat: implement ML filter with LightGBM for trading signal validation - Added MLFilter class to load and evaluate LightGBM model for trading signals. - Introduced retraining mechanism to update the model daily based on new data. - Created feature engineering and label building utilities for model training. - Updated bot logic to incorporate ML filter for signal validation. - Added scripts for data fetching and model training. Made-with: Cursor	2026-03-01 17:07:18 +09:00
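
Note on commits 0f6a22fcb5 and aa413f4d7c: a precision-first threshold search under a recall floor could look roughly like the sketch below. This is an illustration only, not the repository's code. The function name `pick_threshold`, its signature, and the tie-breaking rule are assumptions; only the recall>=0.15 condition comes from the commit messages.

```python
from typing import Optional, Sequence


def pick_threshold(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    min_recall: float = 0.15,
) -> Optional[float]:
    """Return the threshold with the highest precision among those whose
    recall is at least min_recall, or None if no threshold qualifies.

    Hypothetical helper: name and signature are not from the repository.
    """
    positives = sum(y_true)
    if positives == 0:
        return None  # recall is undefined without positive labels

    best_threshold = None
    best_precision = -1.0
    # Candidate thresholds are the distinct predicted probabilities.
    for threshold in sorted(set(y_prob)):
        predicted = [p >= threshold for p in y_prob]
        tp = sum(1 for hit, y in zip(predicted, y_true) if hit and y == 1)
        fp = sum(1 for hit, y in zip(predicted, y_true) if hit and y == 0)
        if tp + fp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / positives
        # Assumption: on a precision tie, the lowest threshold wins
        # (strict '>' never replaces an earlier, lower candidate).
        if recall >= min_recall and precision > best_precision:
            best_precision = precision
            best_threshold = threshold
    return best_threshold
```

Raising `min_recall` forces the search toward lower thresholds that accept more signals, typically at the cost of precision.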

35 Commits
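
Note on commit 6d82febab7: the parameter priority it describes (--tuned-params > active_lgbm_params.json > hardcoded defaults) can be sketched as follows. This is a minimal illustration, not the actual `_load_lgbm_params()`. The function name `load_params`, the flat JSON layout, and the fall-through when an explicitly given file is missing are all assumptions; the default values shown are the pre-tuning values listed in commit d5f8ed4789.

```python
import json
from pathlib import Path
from typing import Optional

# Illustrative defaults only (pre-tuning values from commit d5f8ed4789).
DEFAULT_PARAMS = {"n_estimators": 500, "learning_rate": 0.05, "num_leaves": 31}


def load_params(
    tuned_path: Optional[str] = None,
    active_path: str = "models/active_lgbm_params.json",
    defaults: Optional[dict] = None,
) -> dict:
    """Resolve parameters by priority: tuned file, then active file, then defaults.

    Hypothetical helper. Assumes each JSON file holds a flat object of
    parameter names to values.
    """
    params = dict(DEFAULT_PARAMS if defaults is None else defaults)
    # The first existing file wins; lower-priority files are not read.
    # Assumption: a tuned_path that does not exist falls through silently.
    for candidate in (tuned_path, active_path):
        if candidate and Path(candidate).is_file():
            with open(candidate, encoding="utf-8") as fh:
                params.update(json.load(fh))
            break
    return params
```

Keys absent from the chosen file keep their default values, so a partial JSON file overrides only the parameters it names.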