Commit Graph

8 Commits

Author SHA1 Message Date
21in7
0af138d8ee feat: add stratified_undersample helper function
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 23:58:15 +09:00
21in7
b7ad358a0a fix: make HOLD negative sampling tests non-vacuous
The two HOLD negative tests (test_hold_negative_labels_are_all_zero,
test_signal_samples_preserved_after_sampling) were passing vacuously
because sample_df produces 0 signal candles (ADX ~18, below threshold
25). Added signal_producing_df fixture with higher volatility and volume
surges to reliably generate signals. Removed if-guards so assertions
are mandatory. Also restored the full docstring for
generate_dataset_vectorized() documenting btc_df/eth_df,
time_weight_decay, and negative_ratio parameters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 23:45:10 +09:00
21in7
8e56301d52 feat: add HOLD negative sampling to dataset_builder
Add negative_ratio parameter to generate_dataset_vectorized() that
samples HOLD candles as label=0 negatives alongside signal candles.
This increases training data from ~535 to ~3,200 samples when enabled.

- Split valid_rows into base_valid (shared) and sig_valid (signal-only)
- Add 'source' column ("signal" vs "hold_negative") for traceability
- HOLD samples get label=0 and random 50/50 side assignment
- Default negative_ratio=0 preserves backward compatibility
- Fix incorrect column count assertion in existing test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 23:34:45 +09:00
21in7
bcc717776d fix: RS 계산을 np.divide(where=) 방식으로 교체 — epsilon 이상치 폭발 차단
Made-with: Cursor
2026-03-02 00:30:36 +09:00
21in7
820d8e0213 refactor: 분모 연산을 1e-8 epsilon 패턴으로 통일
Made-with: Cursor
2026-03-01 23:52:59 +09:00
21in7
417b8e3c6a feat: OI/펀딩비 결측 구간을 np.nan으로 마스킹 (0.0 → nan)
Made-with: Cursor
2026-03-01 23:52:19 +09:00
21in7
d1af736bfc feat: implement BTC/ETH correlation features for improved model accuracy
- Added a new design document outlining the integration of BTC/ETH candle data as additional features in the XRP ML filter, enhancing prediction accuracy.
- Introduced `MultiSymbolStream` for combined WebSocket data retrieval of XRP, BTC, and ETH.
- Expanded feature set from 13 to 21 by including 8 new BTC/ETH-related features.
- Updated various scripts and modules to support the new feature set and data handling.
- Enhanced training and deployment scripts to accommodate the new dataset structure.

This commit lays the groundwork for improved model performance by leveraging the correlation between BTC and ETH with XRP.
2026-03-01 19:30:17 +09:00
21in7
e1560f882b feat: add vectorized dataset builder (1x pandas_ta call)
Made-with: Cursor
2026-03-01 18:52:34 +09:00