- Updated `fetch_history.py` to collect open interest (OI) and funding rate data from Binance, improving the dataset for model training.
- Modified `train_and_deploy.sh` to include options for OI and funding rate collection during data fetching.
- Enhanced `dataset_builder.py` to incorporate OI change and funding rate features with rolling z-score normalization.
- Updated training logs to reflect new metrics and features, ensuring comprehensive tracking of model performance.
- Adjusted feature columns in `ml_features.py` to include OI and funding rate for improved model robustness.
- Introduced a new markdown document detailing the plan to transition the entire pipeline from a 1-minute to a 15-minute timeframe, aiming to improve model AUC from 0.49-0.50 to over 0.53.
- Updated key parameters across multiple scripts, including `LOOKAHEAD` adjustments and default data paths to reflect the new 15-minute interval.
- Modified data fetching and training scripts to ensure compatibility with the new timeframe, including changes in `fetch_history.py`, `train_model.py`, and `train_and_deploy.sh`.
- Enhanced the bot's data stream configuration to operate on a 15-minute interval, ensuring real-time data processing aligns with the new model training strategy.
- Updated training logs to capture new model performance metrics under the revised timeframe.
- Added a new markdown document outlining the plan to enhance the LightGBM model's AUC from 0.54 to 0.57+ through feature normalization, strong time weighting, and walk-forward validation.
- Implemented rolling z-score normalization for absolute value features in `src/dataset_builder.py` to improve model robustness against regime changes.
- Introduced a walk-forward validation function in `scripts/train_model.py` to accurately measure future prediction performance.
- Updated training log to include new model performance metrics and added ONNX model export functionality for compatibility.
- Adjusted model training parameters for better performance and included detailed validation results in the training log.
- Updated `train_model.py` and `train_mlx_model.py` to include a time weight decay parameter for improved sample weighting during training.
- Modified dataset generation to incorporate sample weights based on time decay, enhancing model performance.
- Adjusted deployment scripts to support new backend options and improved error handling for model file transfers.
- Added new entries to the training log for better tracking of model performance metrics over time.
- Included ONNX model export functionality in the MLX filter for compatibility with Linux servers.
- Added a new design document outlining the integration of BTC/ETH candle data as additional features in the XRP ML filter, enhancing prediction accuracy.
- Introduced `MultiSymbolStream` for combined WebSocket data retrieval of XRP, BTC, and ETH.
- Expanded feature set from 13 to 21 by including 8 new BTC/ETH-related features.
- Updated various scripts and modules to support the new feature set and data handling.
- Enhanced training and deployment scripts to accommodate the new dataset structure.
This commit lays the groundwork for improved model performance by leveraging the correlation between BTC and ETH with XRP.
- Added a new entry to the training log for the LightGBM model, including date, AUC, sample count, and model path.
- This entry mirrors an existing one, potentially for tracking model performance over time.
- Added comprehensive plans for training a LightGBM model on M4 Mac Mini and deploying it to an LXC container.
- Created scripts for model training, deployment, and a full pipeline execution.
- Enhanced model transfer with error handling and logging for better tracking.
- Introduced profiling for training time analysis and dataset generation optimization.
Made-with: Cursor