feat: switch MLX threshold search to precision-first (subject to recall>=0.15)

Made-with: Cursor
feat: switch LightGBM threshold search to precision-first (subject to recall>=0.15)
2026-03-01 23:54:38 +09:00 · 2026-03-01 23:54:13 +09:00 · 2026-03-01 23:53:49 +09:00 · 2026-03-01 23:52:59 +09:00 · 2026-03-01 23:52:19 +09:00 · 2026-03-01 23:50:18 +09:00
33 changed files with 2947 additions and 177 deletions
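
The precision-first search named in the commit subjects above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the repository's code: the function name `find_precision_first_threshold`, the 0.5 fallback, and the sample labels and scores are assumptions made for the example. Only the `recall >= 0.15` floor comes from the commit subject.

```python
from typing import Sequence, Tuple


def find_precision_first_threshold(
    y_true: Sequence[int],
    scores: Sequence[float],
    min_recall: float = 0.15,  # recall floor taken from the commit subject
    fallback: float = 0.5,     # assumption: returned when nothing qualifies
) -> Tuple[float, float, float]:
    """Return (threshold, precision, recall) with the highest precision
    among thresholds whose recall is at least `min_recall`."""
    total_pos = sum(y_true)
    best = (fallback, 0.0, 0.0)
    if total_pos == 0:
        return best  # no positives, so recall is undefined

    found = False
    for thr in sorted(set(scores)):
        # A sample is predicted positive when its score is >= thr.
        predicted = [y for y, s in zip(y_true, scores) if s >= thr]
        tp = sum(predicted)
        precision = tp / len(predicted)  # len >= 1 because thr is one of the scores
        recall = tp / total_pos
        # Strict ">" keeps the lowest (highest-recall) threshold on precision ties.
        if recall >= min_recall and (not found or precision > best[1]):
            best = (thr, precision, recall)
            found = True
    return best


# Hypothetical labels and model scores, for illustration only.
y_true = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
threshold, precision, recall = find_precision_first_threshold(y_true, scores)
print(f"threshold={threshold}, precision={precision:.2f}, recall={recall:.2f}")
```

The quadratic loop trades speed for having no dependencies. The plan in this change set names scikit-learn's `precision_recall_curve` as the basis for the real search, which produces the same kind of precision/recall pairs in one pass.
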
--- a/.gitignore
+++ b/.gitignore
@@ -8,3 +8,4 @@ logs/
 venv/
 models/*.pkl
 data/*.parquet
+.worktrees/
--- a/30
+++ b/30
@@ -7,9 +7,24 @@ pipeline {
        IMAGE_TAG     = "${env.BUILD_NUMBER}"
        FULL_IMAGE    = "${REGISTRY}/${IMAGE_NAME}:${IMAGE_TAG}"
        LATEST_IMAGE  = "${REGISTRY}/${IMAGE_NAME}:latest"
+        
+        // Load the Discord webhook URL stored in Jenkins credentials.
+        DISCORD_WEBHOOK = credentials('discord-webhook')
    }

    stages {
+        // Send a notification as soon as the build starts.
+        stage('Notify Build Start') {
+            steps {
+                sh """
+                curl -H "Content-Type: application/json" \
+                     -X POST \
+                     -d '{"content": "🚀 **[빌드 시작]** `cointrader` (Build #${env.BUILD_NUMBER}) 배포 파이프라인 가동"}' \
+                     ${DISCORD_WEBHOOK}
+                """
+            }
+        }
+
        stage('Git Clone from Gitea') {
            steps {
                git branch: 'main',
@@ -55,12 +70,25 @@ pipeline {
        }
    }

+    // Discord notifications based on the pipeline result
    post {
        success {
            echo "Build #${env.BUILD_NUMBER} 성공: ${FULL_IMAGE} → 운영 LXC(10.1.10.24) 배포 완료"
+            sh """
+            curl -H "Content-Type: application/json" \
+                 -X POST \
+                 -d '{"content": "✅ **[배포 성공]** `cointrader` (Build #${env.BUILD_NUMBER}) 운영 서버(10.1.10.24) 배포 완료!\\n- 📦 이미지: `${FULL_IMAGE}`"}' \
+                 ${DISCORD_WEBHOOK}
+            """
        }
        failure {
            echo "Build #${env.BUILD_NUMBER} 실패"
+            sh """
+            curl -H "Content-Type: application/json" \
+                 -X POST \
+                 -d '{"content": "❌ **[배포 실패]** `cointrader` (Build #${env.BUILD_NUMBER}) 파이프라인 에러 발생. 젠킨스 로그를 확인해 주세요!"}' \
+                 ${DISCORD_WEBHOOK}
+            """
        }
    }
-}
+}
--- a/docs/plans/2026-03-01-15m-timeframe-upgrade.md
+++ b/docs/plans/2026-03-01-15m-timeframe-upgrade.md
@@ -0,0 +1,376 @@
+# 15-Minute Candle Timeframe Upgrade Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
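+**Goal:** Switch the entire 1-minute candle pipeline to 15-minute candles and set LOOKAHEAD=24 (a 6-hour view) to raise the model AUC from the 0.49-0.50 range to 0.53 or higher.
+
+**Architecture:** Change parameters in this order: data collection (fetch_history.py) → dataset builder (dataset_builder.py) → training scripts (train_model.py, train_mlx_model.py) → live bot (bot.py, data_stream.py). Each layer only needs its `interval` string and `LOOKAHEAD` constant changed; the feature structure stays the same.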
+
+**Tech Stack:** Python, LightGBM, pandas, python-binance, pytest
+
+---
+
+## Change Summary
+
+| File | Change |
+|------|--------|
+| `src/dataset_builder.py` | `LOOKAHEAD 90→24`, `WARMUP 60→60` (unchanged) |
+| `scripts/train_model.py` | `LOOKAHEAD 60→24`, `--data` default `combined_1m→combined_15m` |
+| `scripts/train_mlx_model.py` | `--data` default `combined_1m→combined_15m` |
+| `scripts/fetch_history.py` | `--interval` default `1m→15m`, `--days` default `90→365`, `--output` default updated to match |
+| `scripts/train_and_deploy.sh` | `--interval 1m→15m`, file names `1m→15m` |
+| `src/bot.py` | `interval="1m"→"15m"` |
+| `src/data_stream.py` | `buffer_size` default `200→200` (unchanged; 200 candles of 15 minutes = 50 hours, which is sufficient) |
+
+---
+
+## Task 1: dataset_builder.py: change the LOOKAHEAD constant
+
+**Files:**
+- Modify: `src/dataset_builder.py:14-17`
+
+**Step 1: Check the current constants**
+
+```bash
+head -20 src/dataset_builder.py
+```
+
+Expected: `LOOKAHEAD = 90`, `WARMUP = 60`
+
+**Step 2: Change the constants**
+
+Line 14 of `src/dataset_builder.py`:
+```python
+# Before
+LOOKAHEAD    = 90
+ATR_SL_MULT  = 1.5
+ATR_TP_MULT  = 2.0
+WARMUP       = 60
+
+# After
+LOOKAHEAD    = 24   # 15-minute candles × 24 = 6-hour view
+ATR_SL_MULT  = 1.5
+ATR_TP_MULT  = 2.0
+WARMUP       = 60   # 60 candles at 15 minutes = 15 hours (enough for the indicators to stabilize)
+```
+
+**Step 3: Verify the change**
+
+```bash
+head -20 src/dataset_builder.py
+```
+
+Expected: `LOOKAHEAD = 24`
+
+---
+
+## Task 2: train_model.py: change the LOOKAHEAD constant and the default data path
+
+**Files:**
+- Modify: `scripts/train_model.py:56-61`, `scripts/train_model.py:360`
+
+**Step 1: Check the current constants**
+
+```bash
+sed -n '55,62p' scripts/train_model.py
+sed -n '358,362p' scripts/train_model.py
+```
+
+Expected: `LOOKAHEAD = 60`, `--data default="data/combined_1m.parquet"`
+
+**Step 2: Change LOOKAHEAD**
+
+Line 56 of `scripts/train_model.py`:
+```python
+# Before
+LOOKAHEAD = 60
+
+# After
+LOOKAHEAD = 24  # 15-minute candles × 24 = 6 hours (kept in sync with dataset_builder.py)
+```
+
+**Step 3: Change the --data default**
+
+In the `argparse` section near line 360 of `scripts/train_model.py`:
+```python
+# Before
+parser.add_argument("--data", default="data/combined_1m.parquet")
+
+# After
+parser.add_argument("--data", default="data/combined_15m.parquet")
+```
+
+**Step 4: Verify the change**
+
+```bash
+grep -n "LOOKAHEAD\|combined_" scripts/train_model.py
+```
+
+Expected: `LOOKAHEAD = 24`, `combined_15m.parquet`
+
+---
+
+## Task 3: train_mlx_model.py: change the default data path
+
+**Files:**
+- Modify: `scripts/train_mlx_model.py:149`
+
+**Step 1: Check the current default**
+
+```bash
+grep -n "combined_" scripts/train_mlx_model.py
+```
+
+Expected: `default="data/combined_1m.parquet"`
+
+**Step 2: Change the default**
+
+Line 149 of `scripts/train_mlx_model.py`:
+```python
+# Before
+parser.add_argument("--data", default="data/combined_1m.parquet")
+
+# After
+parser.add_argument("--data", default="data/combined_15m.parquet")
+```
+
+**Step 3: Verify the change**
+
+```bash
+grep -n "combined_" scripts/train_mlx_model.py
+```
+
+Expected: `combined_15m.parquet`
+
+---
+
+## Task 4: fetch_history.py: change the default interval, days, and output
+
+**Files:**
+- Modify: `scripts/fetch_history.py:114-118`
+
+**Step 1: Check the current argparse defaults**
+
+```bash
+sed -n '112,120p' scripts/fetch_history.py
+```
+
+Expected: `--interval default="1m"`, `--output default="data/xrpusdt_1m.parquet"`
+
+**Step 2: Change the defaults**
+
+```python
+# Before
+parser.add_argument("--interval", default="1m")
+parser.add_argument("--days",     type=int, default=90)
+parser.add_argument("--output",   default="data/xrpusdt_1m.parquet")
+
+# After
+parser.add_argument("--interval", default="15m")
+parser.add_argument("--days",     type=int, default=365)
+parser.add_argument("--output",   default="data/xrpusdt_15m.parquet")
+```
+
+**Step 3: Verify the change**
+
+```bash
+grep -n "interval\|output\|days" scripts/fetch_history.py | grep "default"
+```
+
+Expected: `default="15m"`, `default=365`, `default="data/xrpusdt_15m.parquet"`
+
+---
+
+## Task 5: train_and_deploy.sh: change the interval and file names
+
+**Files:**
+- Modify: `scripts/train_and_deploy.sh:26-43`
+
+**Step 1: Check the current script**
+
+```bash
+cat scripts/train_and_deploy.sh
+```
+
+**Step 2: Change the script**
+
+```bash
+# Before (lines 26-32)
+echo "=== [1/3] 데이터 수집 (XRP + BTC + ETH 3심볼, 1년치) ==="
+python scripts/fetch_history.py \
+    --symbols XRPUSDT BTCUSDT ETHUSDT \
+    --interval 1m \
+    --days 365 \
+    --output data/xrpusdt_1m.parquet
+# Result: data/combined_1m.parquet (merged on timestamp)
+
+# After
+echo "=== [1/3] 데이터 수집 (XRP + BTC + ETH 3심볼, 1년치) ==="
+python scripts/fetch_history.py \
+    --symbols XRPUSDT BTCUSDT ETHUSDT \
+    --interval 15m \
+    --days 365 \
+    --output data/xrpusdt_15m.parquet
+# Result: data/combined_15m.parquet (merged on timestamp)
+```
+
+```bash
+# Before (lines 38-43)
+    python scripts/train_mlx_model.py --data data/combined_1m.parquet --decay "$DECAY"
+else
+    echo "  백엔드: LightGBM (CPU), decay=${DECAY}"
+    python scripts/train_model.py --data data/combined_1m.parquet --decay "$DECAY"
+
+# After
+    python scripts/train_mlx_model.py --data data/combined_15m.parquet --decay "$DECAY"
+else
+    echo "  백엔드: LightGBM (CPU), decay=${DECAY}"
+    python scripts/train_model.py --data data/combined_15m.parquet --decay "$DECAY"
+```
+
+**Step 3: Verify the change**
+
+```bash
+grep -n "1m\|15m" scripts/train_and_deploy.sh
+```
+
+Expected: every `1m` reference has been changed to `15m`
+
+---
+
+## Task 6: bot.py: change the live stream interval
+
+**Files:**
+- Modify: `src/bot.py:22-25`
+
+**Step 1: Check the current interval**
+
+```bash
+grep -n "interval" src/bot.py
+```
+
+Expected: `interval="1m"` (in the MultiSymbolStream constructor)
+
+**Step 2: Change the interval**
+
+Lines 21-25 of `src/bot.py`:
+```python
+# Before
+self.stream = MultiSymbolStream(
+    symbols=[config.symbol, "BTCUSDT", "ETHUSDT"],
+    interval="1m",
+    on_candle=self._on_candle_closed,
+)
+
+# After
+self.stream = MultiSymbolStream(
+    symbols=[config.symbol, "BTCUSDT", "ETHUSDT"],
+    interval="15m",
+    on_candle=self._on_candle_closed,
+)
+```
+
+**Step 3: Verify the change**
+
+```bash
+grep -n "interval" src/bot.py
+```
+
+Expected: `interval="15m"`
+
+---
+
+## Task 7: Verify all changes
+
+**Step 1: Check for leftover hardcoded `1m` values**
+
+```bash
+grep -rn '"1m"' src/ scripts/
+```
+
+Expected: no results (all changed to `"15m"`)
+
+**Step 2: Check that LOOKAHEAD is in sync**
+
+```bash
+grep -rn "LOOKAHEAD" src/ scripts/
+```
+
+Expected:
+- `src/dataset_builder.py`: `LOOKAHEAD = 24`
+- `scripts/train_model.py`: `LOOKAHEAD = 24`
+
+**Step 3: Check that combined file names are consistent**
+
+```bash
+grep -rn "combined_" src/ scripts/
+```
+
+Expected: every reference points to `combined_15m`
+
+**Step 4: Pipeline dry run (import test without data)**
+
+```bash
+python -c "
+from src.dataset_builder import LOOKAHEAD, ATR_SL_MULT, ATR_TP_MULT, WARMUP
+assert LOOKAHEAD == 24, f'LOOKAHEAD={LOOKAHEAD}'
+print(f'OK: LOOKAHEAD={LOOKAHEAD}, ATR_SL={ATR_SL_MULT}, ATR_TP={ATR_TP_MULT}, WARMUP={WARMUP}')
+"
+```
+
+Expected: `OK: LOOKAHEAD=24, ATR_SL=1.5, ATR_TP=2.0, WARMUP=60`
+
+---
+
+## Task 8: Collect data and run Walk-Forward validation
+
+> This task requires a real Binance API key and network access.
+
+**Step 1: Collect 15-minute candle data**
+
+```bash
+python scripts/fetch_history.py \
+    --symbols XRPUSDT BTCUSDT ETHUSDT \
+    --interval 15m \
+    --days 365 \
+    --output data/xrpusdt_15m.parquet
+```
+
+Expected: `data/combined_15m.parquet` is created with about 35,040 rows (365 days × 96 candles/day)
+
+**Step 2: Measure Walk-Forward AUC (establish the baseline)**
+
+```bash
+python scripts/train_model.py \
+    --data data/combined_15m.parquet \
+    --wf \
+    --wf-splits 5
+```
+
+Expected: a mean Walk-Forward AUC of 0.53 or higher confirms the improvement
+
+**Step 3: Run the full training and save the model**
+
+```bash
+python scripts/train_model.py \
+    --data data/combined_15m.parquet \
+    --decay 2.0
+```
+
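+Expected: `models/lgbm_filter.pkl` is saved and the previous model is backed up as `lgbm_filter_prev.pkl`
+
+---
+
+## Rollback
+
+If the 15-minute candle model falls short of expectations:
+
+```bash
+# Restore the previous 1-minute candle model
+cp models/lgbm_filter_prev.pkl models/lgbm_filter.pkl
+
+# Restore the code with git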
+git checkout src/dataset_builder.py scripts/train_model.py \
+    scripts/train_mlx_model.py scripts/fetch_history.py \
+    scripts/train_and_deploy.sh src/bot.py
+```
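
The candle arithmetic quoted throughout the plan above (6-hour lookahead, 15-hour warm-up, 50-hour buffer, about 35,040 rows) can be checked with a short sketch. The helper names `candles_to_duration`, `candles_in_days`, and `hours` are hypothetical and do not exist in the repository; the candle counts are the ones the plan states.

```python
from datetime import timedelta

# Minutes per candle for the two intervals the plan discusses.
INTERVAL_MINUTES = {"1m": 1, "15m": 15}


def candles_to_duration(n_candles: int, interval: str) -> timedelta:
    """Wall-clock time covered by `n_candles` candles of the given interval."""
    return timedelta(minutes=n_candles * INTERVAL_MINUTES[interval])


def candles_in_days(days: int, interval: str) -> int:
    """Number of candles per symbol in `days` full days."""
    return days * 24 * 60 // INTERVAL_MINUTES[interval]


def hours(td: timedelta) -> float:
    """Convert a timedelta to fractional hours."""
    return td.total_seconds() / 3600


print("LOOKAHEAD=24 at 15m   :", hours(candles_to_duration(24, "15m")), "h")
print("WARMUP=60 at 15m      :", hours(candles_to_duration(60, "15m")), "h")
print("buffer_size=200 at 15m:", hours(candles_to_duration(200, "15m")), "h")
print("old LOOKAHEAD=90 at 1m:", hours(candles_to_duration(90, "1m")), "h")
print("rows for 365 days, 15m:", candles_in_days(365, "15m"))
```

One point the numbers make visible: the label horizon grows from 1.5 hours (`LOOKAHEAD = 90` at 1 minute in `dataset_builder.py`) to 6 hours, so the upgrade changes the prediction target as well as the sampling rate.
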
--- a/docs/plans/2026-03-01-dynamic-margin-ratio-design.md
+++ b/docs/plans/2026-03-01-dynamic-margin-ratio-design.md
@@ -0,0 +1,131 @@
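+# Dynamic Margin Ratio Design
+
+**Date**: 2026-03-01  
+**Purpose**: Introduce a safer position-sizing calculation that uses 50% of the balance as margin, with the ratio decreasing linearly as the balance grows
+
+---
+
+## Background
+
+- Current position sizing: `risk_per_trade = 0.02` (2% of the balance) × leverage → notional value
+- At the current balance of 22 USDT this gives 22 × 0.02 × 10 = 4.4 USDT, so the minimum-notional guarantee (5 USDT) kicks in and only a 5 USDT position is opened
+- Goal: use 50% of the balance as margin to get a meaningful position size
+- Safeguard: the ratio shrinks automatically as the balance grows, preventing excessive exposure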
+
+---
+
+## Architecture
+
+### Data Flow
+
+```
+bot.run()
+  └─ balance = await exchange.get_balance()
+  └─ risk.set_base_balance(balance)          ← once at bot startup
+
+bot._open_position()
+  └─ balance = await exchange.get_balance()
+  └─ margin_ratio = risk.get_dynamic_margin_ratio(balance)   ← new
+  └─ exchange.calculate_quantity(balance, price, leverage, margin_ratio)
+```
+
+### Ratio Formula
+
+```
+ratio = MAX_RATIO - (balance - base_balance) × DECAY_RATE
+ratio = clamp(ratio, MIN_RATIO, MAX_RATIO)
+```
+
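+- `base_balance`: the actual balance fetched from the Binance API at bot startup
+- `MAX_RATIO`: the maximum ratio, used when the balance is at the base value (default 50%)
+- `MIN_RATIO`: the floor the ratio never drops below, however much the balance grows (default 20%)
+- `DECAY_RATE`: the decrease in ratio per 1 USDT of balance growth (default 0.0006)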
+
+### Simulation (default parameters, base_balance = 22 USDT)
+
+| Balance | Margin ratio | Margin | Notional (×10 leverage) |
+|---|---|---|---|
+| 22 USDT | 50.0% | 11.0 USDT | 110 USDT |
+| 100 USDT | 45.3% | 45.3 USDT | 453 USDT |
+| 300 USDT | 33.3% | 100.0 USDT | 1,000 USDT |
+| 600 USDT | 20.0% (floor) | 120 USDT | 1,200 USDT |
+
+---
+
+## Changed Files
+
+### 1. `src/config.py`
+
+Add three parameters to the `Config` dataclass:
+
+```python
+margin_max_ratio: float = 0.50
+margin_min_ratio: float = 0.20
+margin_decay_rate: float = 0.0006
+```
+
+Read the values from `.env` in `__post_init__`:
+
+```python
+self.margin_max_ratio = float(os.getenv("MARGIN_MAX_RATIO", "0.50"))
+self.margin_min_ratio = float(os.getenv("MARGIN_MIN_RATIO", "0.20"))
+self.margin_decay_rate = float(os.getenv("MARGIN_DECAY_RATE", "0.0006"))
+```
+
+### 2. `src/risk_manager.py`
+
+Add two methods:
+
+```python
+def set_base_balance(self, balance: float) -> None:
+    """Set the base balance at bot startup."""
+    self.initial_balance = balance
+
+def get_dynamic_margin_ratio(self, balance: float) -> float:
+    """Return a margin ratio that decreases linearly with the balance."""
+    ratio = self.config.margin_max_ratio - (
+        (balance - self.initial_balance) * self.config.margin_decay_rate
+    )
+    return max(self.config.margin_min_ratio, min(self.config.margin_max_ratio, ratio))
+```
+
+### 3. `src/exchange.py`
+
+Add a `margin_ratio` parameter to the `calculate_quantity` signature:
+
+```python
+def calculate_quantity(self, balance: float, price: float, leverage: int, margin_ratio: float) -> float:
+    notional = balance * margin_ratio * leverage
+    if notional < self.MIN_NOTIONAL:
+        notional = self.MIN_NOTIONAL
+    ...
+```
+
+Remove the existing `risk_per_trade`-based logic.
+
+### 4. `src/bot.py`
+
+- `run()`: fetch the balance at startup, then call `risk.set_base_balance(balance)`
+- `_open_position()`: call `margin_ratio = self.risk.get_dynamic_margin_ratio(balance)` and pass the result to `calculate_quantity`
+
+### 5. `.env`
+
+```
+MARGIN_MAX_RATIO=0.50
+MARGIN_MIN_RATIO=0.20
+MARGIN_DECAY_RATE=0.0006
+```
+
+---
+
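+## Removed Settings
+
+- `RISK_PER_TRADE`: removed from `.env` and `Config` (replaced by the dynamic ratio)
+
+---
+
+## Risk Considerations
+
+- Balance 22 USDT × 50% × 10x leverage = 110 USDT notional exposure (5x the balance)
+- When losses reduce the balance, the next position size shrinks with it, which acts as a natural safeguard
+- The decay speed can be tuned through `MARGIN_DECAY_RATE`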
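
The ratio formula and simulation in the design above can be reproduced with a short sketch. It is a free-function version written for illustration: `dynamic_margin_ratio` and its keyword arguments are hypothetical names, while the default values, the base balance of 22 USDT, and the 10x leverage are the ones the design states.

```python
def dynamic_margin_ratio(
    balance: float,
    base_balance: float,
    max_ratio: float = 0.50,
    min_ratio: float = 0.20,
    decay_rate: float = 0.0006,
) -> float:
    """Linear decay from max_ratio, clamped to [min_ratio, max_ratio]."""
    ratio = max_ratio - (balance - base_balance) * decay_rate
    return max(min_ratio, min(max_ratio, ratio))


BASE_BALANCE = 22.0  # balance at bot startup in the design's example
LEVERAGE = 10        # leverage used in the design's simulation table

for balance in (22.0, 100.0, 300.0, 600.0):
    ratio = dynamic_margin_ratio(balance, BASE_BALANCE)
    margin = balance * ratio
    print(
        f"balance={balance:7.2f}  ratio={ratio:6.2%}  "
        f"margin={margin:7.2f}  notional={margin * LEVERAGE:8.2f}"
    )

# Balance at which the floor is first reached with these defaults.
floor_balance = BASE_BALANCE + (0.50 - 0.20) / 0.0006
print(f"floor reached at about {floor_balance:.0f} USDT")
```

With the defaults the ratio falls by 0.06 percentage points per USDT, so the 20% floor takes over once the balance is 500 USDT above the base. Below the base balance the clamp holds the ratio at 50%.
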
--- a/docs/plans/2026-03-01-dynamic-margin-ratio-plan.md
+++ b/docs/plans/2026-03-01-dynamic-margin-ratio-plan.md
@@ -0,0 +1,368 @@
+# Dynamic Margin Ratio Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
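+**Goal:** Introduce dynamic position sizing that uses 50% of the balance as margin, with the ratio decreasing linearly as the balance grows
+
+**Architecture:** Add a `get_dynamic_margin_ratio(balance)` method to `RiskManager` and call it from `bot.py` before entering a position. `calculate_quantity` in `exchange.py` takes a `margin_ratio` parameter, replacing the existing `risk_per_trade` logic. At bot startup, the actual balance is fetched from the Binance API and stored as the reference value (`base_balance`).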
+
+**Tech Stack:** Python 3.11, python-binance, loguru, pytest, python-dotenv
+
+---
+
+## Preliminary Checks
+
+- Current `.env`: contains `RISK_PER_TRADE=0.02`
+- Current `Config`: contains `risk_per_trade: float = 0.02`
+- `calculate_quantity` currently uses the `balance * risk_per_trade * leverage` logic
+- Test files live in the `tests/` directory (create it if missing)
+
+---
+
+### Task 1: Add dynamic margin parameters to Config
+
+**Files:**
+- Modify: `src/config.py`
+- Modify: `.env`
+
+**Step 1: Add the new parameters to `.env`**
+
+Append to the end of the `.env` file:
+
+```
+MARGIN_MAX_RATIO=0.50
+MARGIN_MIN_RATIO=0.20
+MARGIN_DECAY_RATE=0.0006
+```
+
+Delete the existing `RISK_PER_TRADE=0.02` line.
+
+**Step 2: Edit `src/config.py`**
+
+Add the new fields to the `Config` dataclass and remove the `risk_per_trade` field:
+
+```python
+@dataclass
+class Config:
+    api_key: str = ""
+    api_secret: str = ""
+    symbol: str = "XRPUSDT"
+    leverage: int = 10
+    max_positions: int = 3
+    stop_loss_pct: float = 0.015
+    take_profit_pct: float = 0.045
+    trailing_stop_pct: float = 0.01
+    discord_webhook_url: str = ""
+    margin_max_ratio: float = 0.50
+    margin_min_ratio: float = 0.20
+    margin_decay_rate: float = 0.0006
+
+    def __post_init__(self):
+        self.api_key = os.getenv("BINANCE_API_KEY", "")
+        self.api_secret = os.getenv("BINANCE_API_SECRET", "")
+        self.symbol = os.getenv("SYMBOL", "XRPUSDT")
+        self.leverage = int(os.getenv("LEVERAGE", "10"))
+        self.discord_webhook_url = os.getenv("DISCORD_WEBHOOK_URL", "")
+        self.margin_max_ratio = float(os.getenv("MARGIN_MAX_RATIO", "0.50"))
+        self.margin_min_ratio = float(os.getenv("MARGIN_MIN_RATIO", "0.20"))
+        self.margin_decay_rate = float(os.getenv("MARGIN_DECAY_RATE", "0.0006"))
+```
+
+**Step 3: Commit**
+
+```bash
+git add src/config.py .env
+git commit -m "feat: add dynamic margin ratio config params"
+```
+
+---
+
+### Task 2: Add the dynamic ratio methods to RiskManager
+
+**Files:**
+- Modify: `src/risk_manager.py`
+- Create: `tests/test_risk_manager.py`
+
+**Step 1: Write the failing tests**
+
+Create `tests/test_risk_manager.py`:
+
+```python
+import pytest
+from src.config import Config
+from src.risk_manager import RiskManager
+
+
+@pytest.fixture
+def config():
+    c = Config()
+    c.margin_max_ratio = 0.50
+    c.margin_min_ratio = 0.20
+    c.margin_decay_rate = 0.0006
+    return c
+
+
+@pytest.fixture
+def risk(config):
+    r = RiskManager(config)
+    r.set_base_balance(22.0)
+    return r
+
+
+def test_set_base_balance(risk):
+    assert risk.initial_balance == 22.0
+
+
+def test_ratio_at_base_balance(risk):
+    """Returns the maximum ratio (50%) at the base balance."""
+    ratio = risk.get_dynamic_margin_ratio(22.0)
+    assert ratio == pytest.approx(0.50, abs=1e-6)
+
+
+def test_ratio_decreases_as_balance_grows(risk):
+    """The ratio decreases as the balance grows."""
+    ratio_100 = risk.get_dynamic_margin_ratio(100.0)
+    ratio_300 = risk.get_dynamic_margin_ratio(300.0)
+    assert ratio_100 < 0.50
+    assert ratio_300 < ratio_100
+
+
+def test_ratio_clamped_at_min(risk):
+    """The ratio never drops below the minimum (20%), even for a very large balance."""
+    ratio = risk.get_dynamic_margin_ratio(10000.0)
+    assert ratio == pytest.approx(0.20, abs=1e-6)
+
+
+def test_ratio_clamped_at_max(risk):
+    """The ratio never exceeds the maximum (50%), even when the balance is below the base."""
+    ratio = risk.get_dynamic_margin_ratio(5.0)
+    assert ratio == pytest.approx(0.50, abs=1e-6)
+```
+
+**Step 2: Confirm the tests fail**
+
+```bash
+pytest tests/test_risk_manager.py -v
+```
+
+Expected: `AttributeError: 'RiskManager' object has no attribute 'set_base_balance'`
+
+**Step 3: Edit `src/risk_manager.py`**
+
+Add two methods to the existing code:
+
+```python
+def set_base_balance(self, balance: float) -> None:
+    """Set the base balance at bot startup (reference point for the dynamic ratio)."""
+    self.initial_balance = balance
+
+def get_dynamic_margin_ratio(self, balance: float) -> float:
+    """Return a margin ratio that decreases linearly with the balance."""
+    ratio = self.config.margin_max_ratio - (
+        (balance - self.initial_balance) * self.config.margin_decay_rate
+    )
+    return max(self.config.margin_min_ratio, min(self.config.margin_max_ratio, ratio))
+```
+
+**Step 4: Confirm the tests pass**
+
+```bash
+pytest tests/test_risk_manager.py -v
+```
+
+Expected: all 5 tests PASS
+
+**Step 5: Commit**
+
+```bash
+git add src/risk_manager.py tests/test_risk_manager.py
+git commit -m "feat: add get_dynamic_margin_ratio to RiskManager"
+```
+
+---
+
+### Task 3: Update calculate_quantity in exchange.py
+
+**Files:**
+- Modify: `src/exchange.py:18-29`
+- Create: `tests/test_exchange.py`
+
+**Step 1: Write the failing tests**
+
+Create `tests/test_exchange.py`:
+
+```python
+import pytest
+from unittest.mock import MagicMock
+from src.config import Config
+from src.exchange import BinanceFuturesClient
+
+
+@pytest.fixture
+def client():
+    config = Config()
+    config.leverage = 10
+    c = BinanceFuturesClient.__new__(BinanceFuturesClient)
+    c.config = config
+    return c
+
+
+def test_calculate_quantity_basic(client):
+    """Balance 22, ratio 50%, leverage 10x → notional 110; XRP price 2.5 → quantity 44.0."""
+    qty = client.calculate_quantity(balance=22.0, price=2.5, leverage=10, margin_ratio=0.50)
+    # notional = 22 * 0.5 * 10 = 110, quantity = 110 / 2.5 = 44.0
+    assert qty == pytest.approx(44.0, abs=0.1)
+
+
+def test_calculate_quantity_min_notional(client):
+    """A notional below the minimum (5 USDT) is raised to the minimum."""
+    qty = client.calculate_quantity(balance=1.0, price=2.5, leverage=1, margin_ratio=0.01)
+    # notional = 1 * 0.01 * 1 = 0.01 < 5 → minimum of 5 USDT
+    assert qty * 2.5 >= 5.0
+
+
+def test_calculate_quantity_zero_balance(client):
+    """With a zero balance, returns a quantity based on the minimum notional."""
+    qty = client.calculate_quantity(balance=0.0, price=2.5, leverage=10, margin_ratio=0.50)
+    assert qty > 0
+```
+
+**Step 2: Confirm the tests fail**
+
+```bash
+pytest tests/test_exchange.py -v
+```
+
+Expected: `TypeError: calculate_quantity() got an unexpected keyword argument 'margin_ratio'`
+
+**Step 3: Edit `src/exchange.py`**
+
+Replace the `calculate_quantity` method with the following:
+
+```python
+def calculate_quantity(self, balance: float, price: float, leverage: int, margin_ratio: float) -> float:
+    """Position size from the dynamic margin ratio (guarantees the $5 minimum notional)."""
+    notional = balance * margin_ratio * leverage
+    if notional < self.MIN_NOTIONAL:
+        notional = self.MIN_NOTIONAL
+    quantity = notional / price
+    qty_rounded = round(quantity, 1)
+    if qty_rounded * price < self.MIN_NOTIONAL:
+        qty_rounded = round(self.MIN_NOTIONAL / price + 0.05, 1)
+    return qty_rounded
+```
+
+**Step 4: Confirm the tests pass**
+
+```bash
+pytest tests/test_exchange.py -v
+```
+
+Expected: all 3 tests PASS
+
+**Step 5: Commit**
+
+```bash
+git add src/exchange.py tests/test_exchange.py
+git commit -m "feat: replace risk_per_trade with margin_ratio in calculate_quantity"
+```
+
+---
+
+### Task 4: Wire it into bot.py
+
+**Files:**
+- Modify: `src/bot.py:85-99` (`_open_position`)
+- Modify: `src/bot.py:165-172` (`run`)
+
+**Step 1: Add the `set_base_balance` call to `run()`**
+
+Replace the `run()` method with the following:
+
+```python
+async def run(self):
+    logger.info(f"봇 시작: {self.config.symbol}, 레버리지 {self.config.leverage}x")
+    await self._recover_position()
+    balance = await self.exchange.get_balance()
+    self.risk.set_base_balance(balance)
+    logger.info(f"기준 잔고 설정: {balance:.2f} USDT (동적 증거금 비율 기준점)")
+    await self.stream.start(
+        api_key=self.config.api_key,
+        api_secret=self.config.api_secret,
+    )
+```
+
+**Step 2: Apply the dynamic ratio in `_open_position()`**
+
+Update the `quantity` calculation inside `_open_position()`:
+
+```python
+async def _open_position(self, signal: str, df):
+    balance = await self.exchange.get_balance()
+    price = df["close"].iloc[-1]
+    margin_ratio = self.risk.get_dynamic_margin_ratio(balance)
+    quantity = self.exchange.calculate_quantity(
+        balance=balance, price=price, leverage=self.config.leverage, margin_ratio=margin_ratio
+    )
+    logger.info(f"포지션 크기: 잔고={balance:.2f} USDT, 증거금비율={margin_ratio:.1%}, 수량={quantity}")
+    # Keep the existing code below (stop_loss, take_profit, place_order, etc.)
+```
+
+**Step 3: Run the full test suite**
+
+```bash
+pytest tests/ -v
+```
+
+Expected: all PASS
+
+**Step 4: Commit**
+
+```bash
+git add src/bot.py
+git commit -m "feat: apply dynamic margin ratio in bot position sizing"
+```
+
+---
+
+### Task 5: Clean up remaining risk_per_trade references
+
+**Files:**
+- Search: check the whole project for `risk_per_trade` references
+
+**Step 1: Search for leftover references**
+
+```bash
+grep -r "risk_per_trade" src/ tests/ .env
+```
+
+Expected: no results (everything already removed)
+
+If any remain, remove them from the affected files.
+
+**Step 2: Final full test run**
+
+```bash
+pytest tests/ -v
+```
+
+Expected: all PASS
+
+**Step 3: Commit**
+
+```bash
+git add -A
+git commit -m "chore: remove unused risk_per_trade references"
+```
+
+---
+
+## Verification Checklist
+
+- [ ] `pytest tests/test_risk_manager.py`: 5 PASS
+- [ ] `pytest tests/test_exchange.py`: 3 PASS
+- [ ] `pytest tests/`: all PASS
+- [ ] `.env` contains `MARGIN_MAX_RATIO`, `MARGIN_MIN_RATIO`, `MARGIN_DECAY_RATE`
+- [ ] `.env` does not contain `RISK_PER_TRADE`
+- [ ] The bot startup log prints "기준 잔고 설정: XX USDT"
+- [ ] The position entry log prints "증거금비율=50.0%" (at a balance of 22 USDT)
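
The three `calculate_quantity` tests in the plan above all use a price of 2.5, where rounding to one decimal never pushes the order below the minimum, so the second rounding branch is not exercised by those values. The sketch below is a standalone copy of the plan's sizing logic for experimenting outside the client class. `MIN_NOTIONAL = 5.0` is an assumption based on the 5 USDT minimum the plan mentions, and the price 2.45 is a hypothetical value chosen to trigger the branch.

```python
MIN_NOTIONAL = 5.0  # assumption: the 5 USDT minimum notional the plan refers to


def calculate_quantity(balance: float, price: float, leverage: int, margin_ratio: float) -> float:
    """Standalone copy of the plan's sizing logic (not the client method itself)."""
    notional = max(balance * margin_ratio * leverage, MIN_NOTIONAL)
    qty = round(notional / price, 1)
    if qty * price < MIN_NOTIONAL:
        # Rounding to one decimal dropped the order below the minimum: bump it up.
        qty = round(MIN_NOTIONAL / price + 0.05, 1)
    return qty


# Values from the plan's tests: the top-up branch is not taken.
print(calculate_quantity(22.0, 2.5, 10, 0.50))
print(calculate_quantity(0.0, 2.5, 10, 0.50))

# Hypothetical price where 5 / 2.45 rounds down to 2.0, i.e. only 4.9 USDT.
qty = calculate_quantity(0.0, 2.45, 10, 0.50)
print(qty, qty * 2.45)
```

A test at a price like this would cover the top-up path. Whether one decimal place is the right step size for the traded symbol is outside what the plan states.
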
--- a/docs/plans/2026-03-01-lgbm-improvement.md
+++ b/docs/plans/2026-03-01-lgbm-improvement.md
@@ -0,0 +1,251 @@
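+# LightGBM Predictive Power Improvement Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
+**Goal:** Raise the LightGBM model from its current AUC of about 0.54 to 0.57 or higher by applying three changes in order: feature normalization, stronger time weighting, and Walk-Forward validation.
+
+**Architecture:**
+- Add rolling z-score normalization to `src/dataset_builder.py` to make the features robust to regime changes.
+- Add a Walk-Forward validation loop to `scripts/train_model.py` to measure real predictive power accurately.
+- Train on the 1-year `combined_1m.parquet` data with a strong time weight (decay=4.0 or higher) to get both a large sample count and recency.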
+
+**Tech Stack:** LightGBM, pandas, numpy, scikit-learn, Python 3.13
+
+---
+
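+## Background: Diagnosis of the Current Problem
+
+| Data | Independent AUC per segment | Overall 80/20 AUC |
+|--------|----------------|----------------|
+| combined, 1 year | 0.49-0.51 (same across all segments) | 0.49 |
+| xrpusdt, 3 months | 0.49-0.58 (large variation between segments) | 0.54 |
+
+**Two root causes:**
+1. Absolute-value features such as `xrp_btc_rs` vary about 4x, from Q1=0.86 to Q4=3.68 → the model is confused by the change in scale
+2. The training set (past) fails to explain the validation set (recent) → Walk-Forward is needed to measure real predictive power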
+
+---
+
+## Task 1: Improve feature normalization (rolling z-score)
+
+**Files:**
+- Modify: `src/dataset_builder.py`, inside the `_calc_features_vectorized()` function
+
+**Goal:** Normalize the absolute-value features (`atr_pct`, `vol_ratio`, `xrp_btc_rs`, `xrp_eth_rs`, `ret_1/3/5`, `btc_ret_1/3/5`, `eth_ret_1/3/5`) with a rolling 200-candle window z-score so they are insensitive to regime changes.
+
+**Step 1: Add the normalization helper function**
+
+Add at the top of the `_calc_features_vectorized()` function:
+
+```python
+def _rolling_zscore(arr: np.ndarray, window: int = 200) -> np.ndarray:
+    """Rolling-window z-score normalization. Positions with fewer than `window` samples are filled with 0."""
+    s = pd.Series(arr)
+    mean = s.rolling(window, min_periods=window).mean()
+    std  = s.rolling(window, min_periods=window).std()
+    z = (s - mean) / std.replace(0, np.nan)
+    return z.fillna(0).values.astype(np.float32)
+```
+
+**Step 2: Apply normalization to the absolute-value features**
+
+When building the `result` DataFrame, replace the following features with their normalized versions:
+
+```python
+# Before
+"atr_pct":   atr_pct.astype(np.float32),
+"vol_ratio": vol_ratio.astype(np.float32),
+"ret_1":     ret_1.astype(np.float32),
+"ret_3":     ret_3.astype(np.float32),
+"ret_5":     ret_5.astype(np.float32),
+
+# After
+"atr_pct":   _rolling_zscore(atr_pct),
+"vol_ratio": _rolling_zscore(vol_ratio),
+"ret_1":     _rolling_zscore(ret_1),
+"ret_3":     _rolling_zscore(ret_3),
+"ret_5":     _rolling_zscore(ret_5),
+```
+
+Do the same for the BTC/ETH features:
+```python
+"btc_ret_1": _rolling_zscore(btc_r1), "btc_ret_3": _rolling_zscore(btc_r3), ...
+"xrp_btc_rs": _rolling_zscore(xrp_btc_rs), "xrp_eth_rs": _rolling_zscore(xrp_eth_rs),
+```
+
+**Step 3: Verify**
+
+```bash
+cd /Users/gihyeon/github/cointrader
+.venv/bin/python -c "
+from src.dataset_builder import generate_dataset_vectorized
+import pandas as pd
+df = pd.read_parquet('data/combined_1m.parquet')
+base = ['open','high','low','close','volume']
+btc = df[[c+'_btc' for c in base]].copy(); btc.columns = base
+eth = df[[c+'_eth' for c in base]].copy(); eth.columns = base
+ds = generate_dataset_vectorized(df[base].copy(), btc_df=btc, eth_df=eth, time_weight_decay=0)
+print(ds[['atr_pct','vol_ratio','xrp_btc_rs']].describe())
+"
+```
+
+Expected: `atr_pct`, `vol_ratio`, and `xrp_btc_rs` all have mean≈0 and std≈1
+
+---
+
+## Task 2: Add a Walk-Forward validation function
+
+**Files:**
+- Modify: `scripts/train_model.py`: add a `walk_forward_auc()` function after `train()` and a `--wf` flag in `main()`
+
+**Goal:** Repeat training/validation n_splits times while preserving time order, and use the mean AUC as an estimate of real out-of-sample predictive power.
+
+**Step 1: Add the walk_forward_auc function**
+
+Add directly below the `train()` function:
+
+```python
+def walk_forward_auc(
+    data_path: str,
+    time_weight_decay: float = 2.0,
+    n_splits: int = 5,
+    train_ratio: float = 0.6,
+) -> None:
+    """Walk-Forward validation: train/validate n_splits times, growing the training window each fold."""
+    import warnings
+    from sklearn.metrics import roc_auc_score
+
+    print(f"\n=== Walk-Forward 검증 ({n_splits}폴드) ===")
+    df_raw = pd.read_parquet(data_path)
+    base_cols = ["open", "high", "low", "close", "volume"]
+    btc_df = eth_df = None
+    if "close_btc" in df_raw.columns:
+        btc_df = df_raw[[c + "_btc" for c in base_cols]].copy(); btc_df.columns = base_cols
+    if "close_eth" in df_raw.columns:
+        eth_df = df_raw[[c + "_eth" for c in base_cols]].copy(); eth_df.columns = base_cols
+    df = df_raw[base_cols].copy()
+
+    dataset = generate_dataset_vectorized(df, btc_df=btc_df, eth_df=eth_df,
+                                          time_weight_decay=time_weight_decay)
+    actual_feature_cols = [c for c in FEATURE_COLS if c in dataset.columns]
+    X = dataset[actual_feature_cols].values
+    y = dataset["label"].values
+    w = dataset["sample_weight"].values
+    n = len(dataset)
+
+    step = int(n * (1 - train_ratio) / n_splits)
+    train_end_start = int(n * train_ratio)
+
+    aucs = []
+    for i in range(n_splits):
+        tr_end = train_end_start + i * step
+        val_end = tr_end + step
+        if val_end > n:
+            break
+
+        X_tr, y_tr, w_tr = X[:tr_end], y[:tr_end], w[:tr_end]
+        X_val, y_val = X[tr_end:val_end], y[tr_end:val_end]
+
+        pos_idx = np.where(y_tr == 1)[0]
+        neg_idx = np.where(y_tr == 0)[0]
+        if len(neg_idx) > len(pos_idx):
+            np.random.seed(42)
+            neg_idx = np.random.choice(neg_idx, size=len(pos_idx), replace=False)
+        idx = np.sort(np.concatenate([pos_idx, neg_idx]))
+
+        model = lgb.LGBMClassifier(
+            n_estimators=500, learning_rate=0.05, num_leaves=31,
+            min_child_samples=15, subsample=0.8, colsample_bytree=0.8,
+            reg_alpha=0.05, reg_lambda=0.1, random_state=42, verbose=-1,
+        )
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            model.fit(X_tr[idx], y_tr[idx], sample_weight=w_tr[idx])
+
+        proba = model.predict_proba(X_val)[:, 1]
+        if len(np.unique(y_val)) < 2:
+            auc = 0.5
+        else:
+            auc = roc_auc_score(y_val, proba)
+        aucs.append(auc)
+        print(f"  폴드 {i+1}/{n_splits}: 학습={tr_end}, 검증={tr_end}~{val_end} ({step}개), AUC={auc:.4f}")
+
+    print(f"\n  Walk-Forward 평균 AUC: {np.mean(aucs):.4f} ± {np.std(aucs):.4f}")
+    print(f"  폴드별: {[round(a,4) for a in aucs]}")
+```
+
+**Step 2: Add the --wf flag to main()**
+
+```python
+parser.add_argument("--wf", action="store_true", help="Walk-Forward 검증 실행")
+parser.add_argument("--wf-splits", type=int, default=5)
+
+# Argument handling
+if args.wf:
+    walk_forward_auc(args.data, time_weight_decay=args.decay, n_splits=args.wf_splits)
+else:
+    train(args.data, time_weight_decay=args.decay)
+```
+
+**Step 3: Run the validation**
+
+```bash
+# xrpusdt 3-month Walk-Forward
+.venv/bin/python scripts/train_model.py --data data/xrpusdt_1m.parquet --decay 2.0 --wf
+
+# combined 1-year Walk-Forward
+.venv/bin/python scripts/train_model.py --data data/combined_1m.parquet --decay 2.0 --wf
+```
+
+Expected: per-fold AUC in the 0.50-0.58 range, mean 0.52 or higher
+
+---
+
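+## Task 3: Stronger time weighting + 1-year data optimization
+
+**Files:**
+- Modify: `scripts/train_model.py`: the `--decay` default and a recommended-value comment in the `train()` function
+
+**Goal:** On `combined_1m.parquet`, use decay=4.0-5.0 to focus on the most recent 3 months while still drawing on patterns from the full year.
+
+**Step 1: Run the AUC comparison across decay values**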
+
+```bash
+for decay in 1.0 2.0 3.0 4.0 5.0; do
+    echo "=== decay=$decay ==="
+    .venv/bin/python scripts/train_model.py --data data/combined_1m.parquet --decay $decay --wf --wf-splits 3 2>&1 | grep "Walk-Forward 평균"
+done
+```
+
+**Step 2: Final training with the best decay value**
+
+Using the decay value with the highest mean Walk-Forward AUC:
+
+```bash
+.venv/bin/python scripts/train_model.py --data data/combined_1m.parquet --decay <best_decay>
+```
+
+**Step 3: Check the results**
+
+```bash
+.venv/bin/python -c "import json; log=json.load(open('models/training_log.json')); [print(e) for e in log[-3:]]"
+```
+
+---
+
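+## Expected Results
+
+| Improvement step | Expected AUC |
+|-----------|---------|
+| Current (3 months, default) | 0.54 |
+| + rolling z-score normalization | 0.54-0.56 |
+| + accurate measurement via Walk-Forward | Better measurement accuracy |
+| + decay=4.0, 1-year data | 0.55-0.58 |
+
+---
+
+## Notes
+
+- `_rolling_zscore` is used only inside `dataset_builder.py` (the live bot path, `ml_features.py`, is left untouched)
+- Walk-Forward runs only with the `--wf` flag; the default `train()` stays as it is
+- A rolling window of 200 covers roughly 3 to 4 hours of 1-minute candles → it reflects short-term regime changes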
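
The scale problem described in the plan above, and what a rolling z-score does to it, can be illustrated without the project's data. The sketch below is a dependency-free stand-in for the plan's pandas-based `_rolling_zscore`, not that helper itself. The synthetic series, its noise level, and the name `rolling_zscore` are assumptions; the two levels 0.86 and 3.68 are the Q1 and Q4 values the plan quotes for `xrp_btc_rs`.

```python
import random
import statistics


def rolling_zscore(values, window=200):
    """Trailing-window z-score; positions without a full window get 0.0."""
    out = []
    for i, v in enumerate(values):
        if i + 1 < window:
            out.append(0.0)
            continue
        win = values[i + 1 - window : i + 1]  # includes the current value
        std = statistics.stdev(win)           # sample standard deviation (ddof=1)
        out.append(0.0 if std == 0 else (v - statistics.fmean(win)) / std)
    return out


# Synthetic feature with a regime shift from about 0.86 to about 3.68.
random.seed(0)
raw = [random.gauss(0.86, 0.05) for _ in range(1000)]
raw += [random.gauss(3.68, 0.05) for _ in range(1000)]
z = rolling_zscore(raw)

for name, lo, hi in (("regime 1", 500, 1000), ("regime 2", 1500, 2000)):
    print(
        f"{name}: raw mean={statistics.fmean(raw[lo:hi]):.2f}  "
        f"z mean={statistics.fmean(z[lo:hi]):+.2f}  z std={statistics.stdev(z[lo:hi]):.2f}"
    )
```

The raw means differ by roughly 4x between the two regimes, while the z-scored values sit near zero in both. Right after the shift the window straddles both regimes and the z-scores spike for up to `window` candles; that transient is a property of the technique.
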
--- a/docs/plans/2026-03-01-oi-nan-epsilon-precision-threshold.md
+++ b/docs/plans/2026-03-01-oi-nan-epsilon-precision-threshold.md
@@ -0,0 +1,463 @@
+# OI NaN Masking / Denominator Epsilon / Precision-First Threshold Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
+**Goal:** Treat ranges with missing OI data as np.nan, unify denominator handling on the 1e-8 pattern, and change the threshold search to precision-first (subject to a minimum recall).
+
+**Architecture:**
+- `dataset_builder.py`: NaN masking for OI/funding rate + unified denominator epsilon + NaN-safe handling in `_rolling_zscore`
+- `mlx_filter.py`: apply `np.nanmean`/`np.nanstd` + `nan_to_num` when normalizing in `fit()`
+- `train_model.py`: replace the threshold search function with a `precision_recall_curve`-based one
+- `train_mlx_model.py`: apply the same threshold search function
+
+**Tech Stack:** numpy, pandas, scikit-learn(precision_recall_curve), lightgbm, mlx
+
+---
+
+### Task 1: `dataset_builder.py`: nan masking for OI/funding rate
+
+**Files:**
+- Modify: `src/dataset_builder.py:261-268`
+- Test: `tests/test_dataset_builder.py`
+
+**Step 1: Run the existing tests (establish a baseline)**
+
+```bash
+python -m pytest tests/test_dataset_builder.py -v
+```
+Expected: all existing tests PASS (pre-change baseline)
+
+**Step 2: Write the OI nan-masking tests**
+
+Add the tests below to `tests/test_dataset_builder.py`:
+
+```python
+def test_oi_nan_masking_no_column():
+    """Without an oi_change column, the feature must be nan everywhere."""
+    import numpy as np
+    import pandas as pd
+    from src.dataset_builder import _calc_features_vectorized, _calc_signals, _calc_indicators
+
+    # Minimal OHLCV data (long enough for the indicator calculations)
+    n = 100
+    np.random.seed(0)
+    df = pd.DataFrame({
+        "open":   np.random.uniform(1, 2, n),
+        "high":   np.random.uniform(2, 3, n),
+        "low":    np.random.uniform(0.5, 1, n),
+        "close":  np.random.uniform(1, 2, n),
+        "volume": np.random.uniform(1000, 5000, n),
+    })
+    d = _calc_indicators(df)
+    sig = _calc_signals(d)
+    feat = _calc_features_vectorized(d, sig)
+
+    # Without an oi_change column, the oi_change feature must be all nan
+    # (nan must also propagate through the rolling z-score)
+    assert feat["oi_change"].isna().all(), "oi_change must be all nan when the column is missing"
+
+
+def test_oi_nan_masking_with_zeros():
+    """Even with an oi_change column, 0.0 spans are masked to nan while real values stay finite."""
+    import numpy as np
+    import pandas as pd
+    from src.dataset_builder import _calc_features_vectorized, _calc_signals, _calc_indicators
+
+    n = 100
+    np.random.seed(0)
+    df = pd.DataFrame({
+        "open":      np.random.uniform(1, 2, n),
+        "high":      np.random.uniform(2, 3, n),
+        "low":       np.random.uniform(0.5, 1, n),
+        "close":     np.random.uniform(1, 2, n),
+        "volume":    np.random.uniform(1000, 5000, n),
+        "oi_change": np.concatenate([np.zeros(50), np.random.uniform(-0.1, 0.1, 50)]),
+    })
+    d = _calc_indicators(df)
+    sig = _calc_signals(d)
+    feat = _calc_features_vectorized(d, sig)
+
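+    # The first 50 rows were 0, so they are masked to nan and stay nan after the rolling z-score
+    # The last 50 rows hold real values, so at least some of them must be finite
+    assert feat["oi_change"].iloc[50:].notna().any(), "the span with real OI values must contain finite values"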
+```
+
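+**Step 3: Run the tests (confirm FAIL)**
+
+```bash
+python -m pytest tests/test_dataset_builder.py::test_oi_nan_masking_no_column tests/test_dataset_builder.py::test_oi_nan_masking_with_zeros -v
+```
+Expected: `test_oi_nan_masking_no_column` FAILS (the feature is currently filled with 0.0, so `isna().all()` is False). `test_oi_nan_masking_with_zeros` only asserts that the tail contains finite values, so it may already PASS before the change.
+
+**Step 4: Modify `dataset_builder.py`**
+
+Replace lines 261~268 of `src/dataset_builder.py` with: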
+
+```python
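+    # OI change rate / funding-rate features
+    # No column: all nan. Column present: mask 0.0 spans (periods with no data) as nan.
+    # Note: a genuine reading of exactly 0.0 is masked as well.
+    # LightGBM handles nan natively; MLX applies nanmean/nanstd + nan_to_num in fit()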
+    if "oi_change" in d.columns:
+        oi_raw = np.where(d["oi_change"].values == 0.0, np.nan, d["oi_change"].values)
+    else:
+        oi_raw = np.full(len(d), np.nan)
+
+    if "funding_rate" in d.columns:
+        fr_raw = np.where(d["funding_rate"].values == 0.0, np.nan, d["funding_rate"].values)
+    else:
+        fr_raw = np.full(len(d), np.nan)
+
+    result["oi_change"]    = _rolling_zscore(oi_raw.astype(np.float64))
+    result["funding_rate"] = _rolling_zscore(fr_raw.astype(np.float64))
+```
+
+**Step 5: Check `_rolling_zscore` and make it nan-safe**
+
+Make the `_rolling_zscore` function in `src/dataset_builder.py` (lines 118~128) nan-safe:
+
+```python
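+def _rolling_zscore(arr: np.ndarray, window: int = 288) -> np.ndarray:
+    """Rolling-window z-score normalization. A nan input row yields a nan output row.
+    The window is 3 days of 15-minute candles (288 candles); min_periods=1 lets early rows be used too."""
+    s = pd.Series(arr.astype(np.float64))
+    r = s.rolling(window=window, min_periods=1)
+    mean = r.mean()   # window statistics skip nan values
+    std  = r.std(ddof=0)
+    # Floor tiny std at 1e-8; a nan std (all-nan window) also fails the comparison and becomes 1e-8
+    std  = std.where(std >= 1e-8, other=1e-8)
+    z = (s - mean) / std  # s is nan on masked rows, so z stays nan there
+    return z.values.astype(np.float32)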
+```
+
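+> Note: pandas `rolling().mean()` and `rolling().std()` skip nan values inside the window, so the statistics need no extra handling.
+> The output on a nan row is still nan, because `s - mean` is nan whenever `s` is nan.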
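
The nan behavior relied on here can be checked without pandas. The following is a dependency-free sketch of the same rule (window statistics skip nan, std floored at 1e-8, nan rows stay nan); `rolling_zscore` is a stand-in written for illustration, not the project's `_rolling_zscore`.

```python
import math


def rolling_zscore(values: list[float], window: int) -> list[float]:
    """Rolling z-score that skips nan inside the window but keeps nan outputs.

    The window mean/std ignore nan entries (population std, ddof=0), the std
    is floored at 1e-8, and a nan input row produces a nan output row.
    """
    out = []
    for i, x in enumerate(values):
        win = [v for v in values[max(0, i - window + 1): i + 1] if not math.isnan(v)]
        if math.isnan(x) or not win:
            out.append(math.nan)
            continue
        mean = sum(win) / len(win)
        std = math.sqrt(sum((v - mean) ** 2 for v in win) / len(win))
        out.append((x - mean) / max(std, 1e-8))
    return out


if __name__ == "__main__":
    masked = [math.nan, 1.0, math.nan, 3.0]
    print(rolling_zscore(masked, window=4))
```

The masked rows come back as nan, while the two real observations are scored only against each other, which is the property the masking tests depend on.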
+
+**Step 6: Re-run the tests (confirm PASS)**
+
+```bash
+python -m pytest tests/test_dataset_builder.py -v
+```
+Expected: all tests PASS
+
+**Step 7: Commit**
+
+```bash
+git add src/dataset_builder.py tests/test_dataset_builder.py
+git commit -m "feat: OI/펀딩비 결측 구간을 np.nan으로 마스킹 (0.0 → nan)"
+```
+
+---
+
+### Task 2: `dataset_builder.py`: unify the denominator epsilon
+
+**Files:**
+- Modify: `src/dataset_builder.py:157-168`
+- Test: `tests/test_dataset_builder.py`
+
+**Step 1: Write the epsilon test**
+
+Add to `tests/test_dataset_builder.py`:
+
+```python
+def test_epsilon_no_division_by_zero():
+    """Flat prices (bb_range=0) must not produce inf in any numeric feature.
+    close=0 and vol_ma20=0 are not exercised by this data, and nan is not checked."""
+    import numpy as np
+    import pandas as pd
+    from src.dataset_builder import _calc_features_vectorized, _calc_signals, _calc_indicators
+
+    n = 100
+    # Set every close to the same value to force bb_range=0
+    df = pd.DataFrame({
+        "open":   np.ones(n),
+        "high":   np.ones(n),
+        "low":    np.ones(n),
+        "close":  np.ones(n),
+        "volume": np.ones(n),
+    })
+    d = _calc_indicators(df)
+    sig = _calc_signals(d)
+    feat = _calc_features_vectorized(d, sig)
+
+    numeric_cols = feat.select_dtypes(include=[np.number]).columns
+    assert not feat[numeric_cols].isin([np.inf, -np.inf]).any().any(), \
+        "no inf values allowed"
+```
+
+**Step 2: Run the test (baseline)**
+
+```bash
+python -m pytest tests/test_dataset_builder.py::test_epsilon_no_division_by_zero -v
+```
+
+**Step 3: Unify the denominator epsilon in `_calc_features_vectorized`**
+
+Replace lines 157~168 of `src/dataset_builder.py` with:
+
+```python
+    bb_range = bb_upper - bb_lower
+    bb_pct = (close - bb_lower) / (bb_range + 1e-8)
+
+    ema_align = np.where(
+        (ema9 > ema21) & (ema21 > ema50),  1,
+        np.where(
+            (ema9 < ema21) & (ema21 < ema50), -1, 0
+        )
+    ).astype(np.float32)
+
+    atr_pct   = atr / (close + 1e-8)
+    vol_ratio = volume / (vol_ma20 + 1e-8)
+```
+
+And the relative-strength calculation (lines 246~247):
+
+```python
+        xrp_btc_rs_raw = (xrp_r1 / (btc_r1 + 1e-8)).astype(np.float32)
+        xrp_eth_rs_raw = (xrp_r1 / (eth_r1 + 1e-8)).astype(np.float32)
+```
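
One caveat with the `x / (den + 1e-8)` pattern on a signed denominator such as a return: the offset only guards non-negative values, and a denominator of exactly `-1e-8` cancels to zero. The sketch below shows a sign-preserving guard as an optional alternative, not something this plan prescribes; `safe_div` and `naive_div` are hypothetical helper names.

```python
import math


def naive_div(num: float, den: float, eps: float = 1e-8) -> float:
    """The `den + eps` pattern, which suits non-negative denominators."""
    return num / (den + eps)


def safe_div(num: float, den: float, eps: float = 1e-8) -> float:
    """Divide while keeping |denominator| >= eps and preserving its sign."""
    if abs(den) < eps:
        den = -eps if den < 0 else eps
    return num / den


if __name__ == "__main__":
    print(safe_div(0.002, 0.0))      # large but finite
    print(safe_div(0.002, -1e-9))    # finite, and still negative
    try:
        naive_div(0.002, -1e-8)      # denominator cancels to exactly 0.0
    except ZeroDivisionError as exc:
        print("naive pattern failed:", exc)
```

With NumPy arrays the same cancellation yields inf rather than raising, so an inf check like the one in `test_epsilon_no_division_by_zero` would be what catches it in practice.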
+
+**Step 4: Re-run the tests**
+
+```bash
+python -m pytest tests/test_dataset_builder.py -v
+```
+Expected: all tests PASS
+
+**Step 5: Commit**
+
+```bash
+git add src/dataset_builder.py tests/test_dataset_builder.py
+git commit -m "refactor: 분모 연산을 1e-8 epsilon 패턴으로 통일"
+```
+
+---
+
+### Task 3: `mlx_filter.py`: nan-safe normalization
+
+**Files:**
+- Modify: `src/mlx_filter.py:140-145`
+- Test: `tests/test_mlx_filter.py`
+
+**Step 1: Write the nan-safe normalization test**
+
+Add to `tests/test_mlx_filter.py`:
+
+```python
+def test_fit_with_nan_features():
+    """Training must complete normally when the oi_change feature contains nan."""
+    import numpy as np
+    import pandas as pd
+    from src.mlx_filter import MLXFilter
+    from src.ml_features import FEATURE_COLS
+
+    n = 300
+    np.random.seed(42)
+    X = pd.DataFrame(
+        np.random.randn(n, len(FEATURE_COLS)).astype(np.float32),
+        columns=FEATURE_COLS,
+    )
+    # Set the first half of oi_change to nan
+    X["oi_change"] = np.where(np.arange(n) < n // 2, np.nan, X["oi_change"])
+    y = pd.Series((np.random.rand(n) > 0.5).astype(np.float32))
+
+    model = MLXFilter(input_dim=len(FEATURE_COLS), hidden_dim=32, epochs=3)
+    model.fit(X, y)  # must finish without raising even though nan is present
+
+    proba = model.predict_proba(X)
+    assert not np.any(np.isnan(proba)), "predicted probabilities must not contain nan"
+    assert proba.min() >= 0.0 and proba.max() <= 1.0
+```
+
+**Step 2: Run the test (confirm FAIL)**
+
+```bash
+python -m pytest tests/test_mlx_filter.py::test_fit_with_nan_features -v
+```
+Expected: FAIL (nan currently passes through unchanged, producing loss=nan)
+
+**Step 3: Fix the normalization in `mlx_filter.py` fit()**
+
+Replace lines 140~145 of `src/mlx_filter.py` with:
+
+```python
+        X_np = X[FEATURE_COLS].values.astype(np.float32)
+        y_np = y.values.astype(np.float32)
+
+        # nan-safe normalization: compute statistics with nanmean/nanstd, then impute nan with 0.0
+        # (after z-scoring, 0.0 is the column mean, the most neutral fill value for a neural network)
+        self._mean = np.nanmean(X_np, axis=0)
+        self._std  = np.nanstd(X_np, axis=0) + 1e-8
+        X_np = (X_np - self._mean) / self._std
+        X_np = np.nan_to_num(X_np, nan=0.0)
+```
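
The fit-then-impute order matters: statistics are computed from observed values only, and nan is replaced with 0.0 after z-scoring so that a missing entry lands on the column mean. A dependency-free sketch of that order follows; `fit_nan_safe` and `transform_nan_safe` are hypothetical names that mirror the NumPy calls above and are not part of `MLXFilter`.

```python
import math


def fit_nan_safe(rows: list[list[float]]) -> tuple[list[float], list[float]]:
    """Per-column mean and std over non-nan entries only (std gets + 1e-8)."""
    means, stds = [], []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows if not math.isnan(r[j])]
        mean = sum(col) / len(col) if col else math.nan
        var = sum((v - mean) ** 2 for v in col) / len(col) if col else math.nan
        means.append(mean)
        stds.append(math.sqrt(var) + 1e-8)
    return means, stds


def transform_nan_safe(rows, means, stds):
    """Z-score each value, then replace any nan result with 0.0 (the column mean)."""
    out = []
    for r in rows:
        z = [(v - m) / s for v, m, s in zip(r, means, stds)]
        out.append([0.0 if math.isnan(v) else v for v in z])
    return out


if __name__ == "__main__":
    data = [[1.0, math.nan], [3.0, 10.0], [math.nan, 14.0]]
    means, stds = fit_nan_safe(data)
    print(means)
    print(transform_nan_safe(data, means, stds))
```

The same stored statistics have to be reused at prediction time, which is why the plan applies the identical `nan_to_num` step inside `predict_proba`.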
+
+**Step 4: Apply nan_to_num in `predict_proba` as well**
+
+Lines 185~189 of `src/mlx_filter.py`:
+
+```python
+    def predict_proba(self, X: pd.DataFrame) -> np.ndarray:
+        X_np = X[FEATURE_COLS].values.astype(np.float32)
+        if self._trained and self._mean is not None:
+            X_np = (X_np - self._mean) / self._std
+            X_np = np.nan_to_num(X_np, nan=0.0)
+```
+
+**Step 5: Re-run the tests**
+
+```bash
+python -m pytest tests/test_mlx_filter.py -v
+```
+Expected: all tests PASS
+
+**Step 6: Commit**
+
+```bash
+git add src/mlx_filter.py tests/test_mlx_filter.py
+git commit -m "fix: MLXFilter fit/predict에 nan-safe 정규화 적용 (nanmean + nan_to_num)"
+```
+
+---
+
+### Task 4: `train_model.py`: precision-first threshold search
+
+**Files:**
+- Modify: `scripts/train_model.py:234-246`
+- Test: none (script-level change, verified manually)
+
+**Step 1: Replace the threshold search in `train_model.py`**
+
+Replace lines 234~246 of `scripts/train_model.py` with:
+
+```python
+    val_proba = model.predict_proba(X_val)[:, 1]
+    auc = roc_auc_score(y_val, val_proba)
+
+    # Best-threshold search: maximize precision subject to a minimum recall (0.15)
+    from sklearn.metrics import precision_recall_curve
+    precisions, recalls, thresholds = precision_recall_curve(y_val, val_proba)
+    # Drop the final (precision=1.0, recall=0.0) point, which has no matching threshold
+    precisions, recalls = precisions[:-1], recalls[:-1]
+
+    MIN_RECALL = 0.15
+    valid_idx = np.where(recalls >= MIN_RECALL)[0]
+    if len(valid_idx) > 0:
+        best_idx  = valid_idx[np.argmax(precisions[valid_idx])]
+        best_thr  = float(thresholds[best_idx])
+        best_prec = float(precisions[best_idx])
+        best_rec  = float(recalls[best_idx])
+    else:
+        best_thr, best_prec, best_rec = 0.50, 0.0, 0.0
+        print(f"  [경고] recall >= {MIN_RECALL} 조건 만족 임계값 없음 → 기본값 0.50 사용")
+
+    print(f"\n검증 AUC: {auc:.4f}  |  최적 임계값: {best_thr:.4f} "
+          f"(Precision={best_prec:.3f}, Recall={best_rec:.3f})")
+    print(classification_report(y_val, (val_proba >= best_thr).astype(int), zero_division=0))
+```
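
The selection rule itself is small enough to check by hand. Below is a dependency-free sketch of the same idea on a toy validation set: among thresholds whose recall meets the floor, take the one with the highest precision. `pick_threshold` is a hypothetical helper written for illustration; the plan's code uses scikit-learn's `precision_recall_curve` instead.

```python
def pick_threshold(y_true, scores, min_recall=0.15, fallback=0.50):
    """Return (threshold, precision, recall) maximizing precision with recall >= min_recall.

    Candidate thresholds are the distinct scores in ascending order, and a
    sample is predicted positive when score >= threshold. Ties in precision
    resolve to the lowest threshold, like np.argmax over an ascending array.
    """
    total_pos = sum(y_true)
    best = None
    for thr in sorted(set(scores)):
        predicted = [y for y, s in zip(y_true, scores) if s >= thr]
        tp = sum(predicted)
        precision = tp / len(predicted)
        recall = tp / total_pos if total_pos else 0.0
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (thr, precision, recall)
    return best if best is not None else (fallback, 0.0, 0.0)


if __name__ == "__main__":
    labels = [0, 0, 1, 0, 1, 1]
    probas = [0.1, 0.2, 0.3, 0.4, 0.8, 0.9]
    print(pick_threshold(labels, probas))                  # recall floor 0.15
    print(pick_threshold(labels, probas, min_recall=0.7))  # stricter recall floor
```

Raising `min_recall` pushes the chosen threshold down and trades precision for coverage, so the 0.15 floor in this plan deliberately accepts few signals in exchange for cleaner ones.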
+
+Then add the threshold fields to the log-saving section (lines 261~271):
+
+```python
+    log.append({
+        "date": datetime.now().isoformat(),
+        "backend": "lgbm",
+        "auc": round(auc, 4),
+        "best_threshold": round(best_thr, 4),
+        "best_precision": round(best_prec, 3),
+        "best_recall":    round(best_rec, 3),
+        "samples": len(dataset),
+        "features": len(actual_feature_cols),
+        "time_weight_decay": time_weight_decay,
+        "model_path": str(MODEL_PATH),
+    })
+```
+
+**Step 2: Manual verification (dry-run)**
+
+```bash
+python scripts/train_model.py --data data/combined_15m.parquet 2>&1 | tail -30
+```
+Expected: output of the form "최적 임계값: X.XXXX (Precision=X.XXX, Recall=X.XXX)"
+
+**Step 3: Commit**
+
+```bash
+git add scripts/train_model.py
+git commit -m "feat: LightGBM 임계값 탐색을 정밀도 우선(recall>=0.15 조건부)으로 변경"
+```
+
+---
+
+### Task 5: `train_mlx_model.py`: apply the same threshold search
+
+**Files:**
+- Modify: `scripts/train_mlx_model.py:119-122`
+
+**Step 1: Replace the threshold search in `train_mlx_model.py`**
+
+Replace lines 119~122 of `scripts/train_mlx_model.py` with:
+
+```python
+    val_proba = model.predict_proba(X_val)
+    auc = roc_auc_score(y_val, val_proba)
+
+    # Best-threshold search: maximize precision subject to a minimum recall (0.15)
+    from sklearn.metrics import precision_recall_curve, classification_report
+    precisions, recalls, thresholds = precision_recall_curve(y_val, val_proba)
+    precisions, recalls = precisions[:-1], recalls[:-1]
+
+    MIN_RECALL = 0.15
+    valid_idx = np.where(recalls >= MIN_RECALL)[0]
+    if len(valid_idx) > 0:
+        best_idx  = valid_idx[np.argmax(precisions[valid_idx])]
+        best_thr  = float(thresholds[best_idx])
+        best_prec = float(precisions[best_idx])
+        best_rec  = float(recalls[best_idx])
+    else:
+        best_thr, best_prec, best_rec = 0.50, 0.0, 0.0
+        print(f"  [경고] recall >= {MIN_RECALL} 조건 만족 임계값 없음 → 기본값 0.50 사용")
+
+    print(f"\n검증 AUC: {auc:.4f}  |  최적 임계값: {best_thr:.4f} "
+          f"(Precision={best_prec:.3f}, Recall={best_rec:.3f})")
+    print(classification_report(y_val, (val_proba >= best_thr).astype(int), zero_division=0))
+```
+
+Then add the threshold fields to the log-saving section:
+
+```python
+    log.append({
+        "date": datetime.now().isoformat(),
+        "backend": "mlx",
+        "auc": round(auc, 4),
+        "best_threshold": round(best_thr, 4),
+        "best_precision": round(best_prec, 3),
+        "best_recall":    round(best_rec, 3),
+        "samples": len(dataset),
+        "train_sec": round(t3 - t2, 1),
+        "time_weight_decay": time_weight_decay,
+        "model_path": str(MLX_MODEL_PATH),
+    })
+```
+
+**Step 2: Commit**
+
+```bash
+git add scripts/train_mlx_model.py
+git commit -m "feat: MLX 임계값 탐색을 정밀도 우선(recall>=0.15 조건부)으로 변경"
+```
+
+---
+
+### Task 6: Confirm the full test suite passes
+
+**Step 1: Run the full test suite**
+
+```bash
+python -m pytest tests/ -v --tb=short 2>&1 | tail -40
+```
+Expected: all tests PASS
+
+**Step 2: Final commit (if needed)**
+
+```bash
+git add -A
+git commit -m "chore: OI nan 마스킹 / epsilon 통일 / 정밀도 우선 임계값 전체 통합"
+```
--- a/models/mlx_filter.meta.npz
+++ b/models/mlx_filter.meta.npz
--- a/models/mlx_filter.npz
+++ b/models/mlx_filter.npz
--- a/models/mlx_filter.onnx
+++ b/models/mlx_filter.onnx
--- a/models/mlx_filter.weights.onnx
+++ b/models/mlx_filter.weights.onnx
--- a/models/training_log.json
+++ b/models/training_log.json
@@ -31,5 +31,190 @@
    "samples": 1696,
    "features": 21,
    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:03:56.314547",
+    "auc": 0.5406,
+    "samples": 1707,
+    "features": 21,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:12:23.866860",
+    "auc": 0.502,
+    "samples": 3269,
+    "features": 21,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:46:29.599674",
+    "backend": "mlx",
+    "auc": 0.516,
+    "samples": 6470,
+    "train_sec": 1.3,
+    "time_weight_decay": 2.0,
+    "model_path": "models/mlx_filter.weights"
+  },
+  {
+    "date": "2026-03-01T21:50:12.449819",
+    "backend": "lgbm",
+    "auc": 0.4772,
+    "samples": 6470,
+    "features": 21,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:50:32.491318",
+    "backend": "lgbm",
+    "auc": 0.4943,
+    "samples": 6470,
+    "features": 21,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:50:48.665654",
+    "backend": "lgbm",
+    "auc": 0.4943,
+    "samples": 6470,
+    "features": 21,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:51:02.539565",
+    "backend": "lgbm",
+    "auc": 0.4943,
+    "samples": 6470,
+    "features": 21,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:51:09.830250",
+    "backend": "lgbm",
+    "auc": 0.4925,
+    "samples": 1716,
+    "features": 13,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:51:20.133303",
+    "backend": "lgbm",
+    "auc": 0.54,
+    "samples": 1716,
+    "features": 13,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:51:25.445363",
+    "backend": "lgbm",
+    "auc": 0.4943,
+    "samples": 6470,
+    "features": 21,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T21:52:24.296191",
+    "backend": "lgbm",
+    "auc": 0.54,
+    "samples": 1716,
+    "features": 13,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T22:00:34.737597",
+    "backend": "lgbm",
+    "auc": 0.5097,
+    "samples": 6470,
+    "features": 21,
+    "time_weight_decay": 3.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T22:12:06.299119",
+    "backend": "mlx",
+    "auc": 0.5746,
+    "samples": 533,
+    "train_sec": 0.2,
+    "time_weight_decay": 2.0,
+    "model_path": "models/mlx_filter.weights"
+  },
+  {
+    "date": "2026-03-01T22:13:20.434893",
+    "backend": "mlx",
+    "auc": 0.5663,
+    "samples": 533,
+    "train_sec": 0.2,
+    "time_weight_decay": 2.0,
+    "model_path": "models/mlx_filter.weights"
+  },
+  {
+    "date": "2026-03-01T22:15:43.163315",
+    "backend": "lgbm",
+    "auc": 0.5581,
+    "samples": 533,
+    "features": 21,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T22:18:59.852831",
+    "backend": "lgbm",
+    "auc": 0.5504,
+    "samples": 533,
+    "features": 21,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T22:19:29.532472",
+    "backend": "lgbm",
+    "auc": 0.5504,
+    "samples": 533,
+    "features": 21,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T22:19:30.938005",
+    "backend": "mlx",
+    "auc": 0.5714,
+    "samples": 533,
+    "train_sec": 0.1,
+    "time_weight_decay": 2.0,
+    "model_path": "models/mlx_filter.weights"
+  },
+  {
+    "date": "2026-03-01T22:26:46.459326",
+    "backend": "mlx",
+    "auc": 0.6167,
+    "samples": 533,
+    "train_sec": 0.2,
+    "time_weight_decay": 2.0,
+    "model_path": "models/mlx_filter.weights"
+  },
+  {
+    "date": "2026-03-01T22:45:55.473533",
+    "backend": "lgbm",
+    "auc": 0.556,
+    "samples": 533,
+    "features": 23,
+    "time_weight_decay": 2.0,
+    "model_path": "models/lgbm_filter.pkl"
+  },
+  {
+    "date": "2026-03-01T23:04:51.194544",
+    "backend": "mlx",
+    "auc": 0.5972,
+    "samples": 533,
+    "train_sec": 0.1,
+    "time_weight_decay": 2.0,
+    "model_path": "models/mlx_filter.weights"
  }
 ]
--- a/requirements.txt
+++ b/requirements.txt
@@ -12,3 +12,4 @@ lightgbm>=4.3.0
 scikit-learn>=1.4.0
 joblib>=1.3.0
 pyarrow>=15.0.0
+onnxruntime>=1.18.0
--- a/scripts/deploy_model.sh
+++ b/scripts/deploy_model.sh
@@ -1,66 +1,78 @@
 #!/usr/bin/env bash
 # 맥미니에서 학습한 모델을 LXC 컨테이너 볼륨 경로로 전송한다.
-# 사용법: bash scripts/deploy_model.sh [LXC_HOST] [LXC_MODELS_PATH]
+# 사용법: bash scripts/deploy_model.sh [lgbm|mlx]
 #
 # 예시:
-#   bash scripts/deploy_model.sh 10.1.10.28 /path/to/cointrader/models
-#   bash scripts/deploy_model.sh root@10.1.10.28 /root/cointrader/models
+#   bash scripts/deploy_model.sh        # LightGBM (기본값)
+#   bash scripts/deploy_model.sh mlx    # MLX 신경망

 set -euo pipefail

-LXC_HOST="${1:-root@10.1.10.24}"
-LXC_MODELS_PATH="${2:-/root/cointrader/models}"
-LOCAL_MODEL="models/lgbm_filter.pkl"
+BACKEND="${1:-lgbm}"
+LXC_HOST="root@10.1.10.24"
+LXC_MODELS_PATH="/root/cointrader/models"
 LOCAL_LOG="models/training_log.json"

-if [[ ! -f "$LOCAL_MODEL" ]]; then
-  echo "[오류] 모델 파일 없음: $LOCAL_MODEL"
-  echo "먼저 python scripts/train_model.py 를 실행하세요."
-  exit 1
+# ── 백엔드별 파일 목록 설정 ──────────────────────────────────────────────────
+# mlx: ONNX 파일만 전송 (Linux 서버는 onnxruntime으로 추론)
+# lgbm: pkl 파일 전송
+if [ "$BACKEND" = "mlx" ]; then
+  LOCAL_FILES=("models/mlx_filter.weights.onnx")
+else
+  LOCAL_FILES=("models/lgbm_filter.pkl")
 fi

-echo "=== 모델 전송 시작 ==="
-echo "  대상: ${LXC_HOST}:${LXC_MODELS_PATH}"
-echo "  파일: $LOCAL_MODEL"
-
-# 기존 모델을 prev로 백업 (원격)
-ssh "${LXC_HOST}" "
-  if [ -f '${LXC_MODELS_PATH}/lgbm_filter.pkl' ]; then
-    cp '${LXC_MODELS_PATH}/lgbm_filter.pkl' '${LXC_MODELS_PATH}/lgbm_filter_prev.pkl'
-    echo '  기존 모델 백업 완료'
+# ── 파일 존재 확인 ────────────────────────────────────────────────────────────
+for f in "${LOCAL_FILES[@]}"; do
+  if [[ ! -f "$f" ]]; then
+    echo "[오류] 모델 파일 없음: $f"
+    exit 1
  fi
+done
+
+echo "=== 모델 전송 시작 (백엔드: ${BACKEND}) ==="
+echo "  대상: ${LXC_HOST}:${LXC_MODELS_PATH}"
+
+# ── 원격 디렉터리 생성 + lgbm 기존 모델 백업 ─────────────────────────────────
+ssh "${LXC_HOST}" "
  mkdir -p '${LXC_MODELS_PATH}'
+  if [ '$BACKEND' = 'lgbm' ] && [ -f '${LXC_MODELS_PATH}/lgbm_filter.pkl' ]; then
+    cp '${LXC_MODELS_PATH}/lgbm_filter.pkl' '${LXC_MODELS_PATH}/lgbm_filter_prev.pkl'
+    echo '  기존 lgbm 모델 백업 완료'
+  fi
 "

-# 모델 파일 전송 (rsync 우선, 없으면 scp 폴백)
-if command -v rsync &>/dev/null && ssh "${LXC_HOST}" "command -v rsync" &>/dev/null; then
-  rsync -avz --progress \
-    "$LOCAL_MODEL" \
-    "${LXC_HOST}:${LXC_MODELS_PATH}/lgbm_filter.pkl"
-else
-  echo "  rsync 없음 → scp 사용"
-  scp "$LOCAL_MODEL" "${LXC_HOST}:${LXC_MODELS_PATH}/lgbm_filter.pkl"
-fi
-
-# 학습 로그도 함께 전송 (있을 경우)
-if [[ -f "$LOCAL_LOG" ]]; then
+# ── 파일 전송 헬퍼 (rsync 우선, scp 폴백) ────────────────────────────────────
+_send() {
+  local src="$1" dst="$2"
+  echo "  전송: $src → ${LXC_HOST}:$dst"
  if command -v rsync &>/dev/null && ssh "${LXC_HOST}" "command -v rsync" &>/dev/null; then
-    rsync -avz "$LOCAL_LOG" "${LXC_HOST}:${LXC_MODELS_PATH}/training_log.json"
+    rsync -avz --progress "$src" "${LXC_HOST}:$dst"
  else
-    scp "$LOCAL_LOG" "${LXC_HOST}:${LXC_MODELS_PATH}/training_log.json"
+    scp "$src" "${LXC_HOST}:$dst"
  fi
+}
+
+# ── 모델 파일 전송 ────────────────────────────────────────────────────────────
+for f in "${LOCAL_FILES[@]}"; do
+  _send "$f" "${LXC_MODELS_PATH}/$(basename "$f")"
+done
+
+# ── 학습 로그 전송 ────────────────────────────────────────────────────────────
+if [[ -f "$LOCAL_LOG" ]]; then
+  _send "$LOCAL_LOG" "${LXC_MODELS_PATH}/training_log.json"
  echo "  학습 로그 전송 완료"
 fi

 echo "=== 전송 완료 ==="
 echo ""

-# 봇 컨테이너가 실행 중이면 모델 핫리로드, 아니면 건너뜀
-echo "=== 핫리로드 시도 ==="
+# ── 핫리로드 안내 ────────────────────────────────────────────────────────────
+# 봇이 캔들마다 모델 파일 mtime을 감지해 자동 리로드한다.
+# 컨테이너가 실행 중이면 다음 캔들(최대 1분) 안에 자동 적용된다.
+echo "=== 모델 전송 완료 — 봇이 다음 캔들에서 자동 리로드합니다 ==="
 if ssh "${LXC_HOST}" "docker inspect -f '{{.State.Running}}' cointrader 2>/dev/null | grep -q true"; then
-  ssh "${LXC_HOST}" "docker exec cointrader python -c \
-    \"from src.ml_filter import MLFilter; f=MLFilter(); f.reload_model(); print('리로드 완료')\""
-  echo "=== 핫리로드 완료 ==="
+  echo "  컨테이너 실행 중: 다음 캔들 마감 시 자동 핫리로드 예정"
 else
-  echo "  cointrader 컨테이너가 실행 중이 아닙니다. 건너뜁니다."
+  echo "  cointrader 컨테이너가 실행 중이 아닙니다."
 fi
--- a/scripts/fetch_history.py
+++ b/scripts/fetch_history.py
@@ -2,6 +2,11 @@
 바이낸스 선물 REST API로 과거 캔들 데이터를 수집해 parquet으로 저장한다.
 사용법: python scripts/fetch_history.py --symbol XRPUSDT --interval 1m --days 90
       python scripts/fetch_history.py --symbols XRPUSDT BTCUSDT ETHUSDT --days 90
+
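+OI / funding-rate collection constraints:
+  - OI history: the Binance API serves only the most recent 30 days (period=15m, limit=500/req)
+  - Funding rate: 8-hour cadence, merged onto the 15-minute candles by forward-fill
+  - Candles older than 30 days get oi_change=0; funding_rate is 0 only for candles before the first collected funding record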
 """
 import sys
 from pathlib import Path
@@ -9,6 +14,7 @@ sys.path.insert(0, str(Path(__file__).parent.parent))

 import asyncio
 import argparse
+import aiohttp
 from datetime import datetime, timezone, timedelta
 import pandas as pd
 from binance import AsyncClient
@@ -21,6 +27,7 @@ load_dotenv()
 # 1500개씩 가져오므로 90일 1m 데이터 = ~65회 요청/심볼
 # 심볼 간 딜레이 없이 연속 요청하면 레이트 리밋(-1003) 발생
 _REQUEST_DELAY = 0.3  # 초당 ~3.3 req → 안전 마진 충분
+_FAPI_BASE = "https://fapi.binance.com"


 def _now_ms() -> int:
@@ -107,15 +114,164 @@ async def fetch_klines_all(
    return dfs


+async def _fetch_oi_hist(
+    session: aiohttp.ClientSession,
+    symbol: str,
+    period: str = "15m",
+) -> pd.DataFrame:
+    """
+    바이낸스 /futures/data/openInterestHist 엔드포인트로 OI 히스토리를 수집한다.
+    API 제한: 최근 30일치만 제공, 1회 최대 500개.
+    """
+    url = f"{_FAPI_BASE}/futures/data/openInterestHist"
+    all_rows = []
+    # 30일 전부터 현재까지 수집
+    start_ts = int((datetime.now(timezone.utc) - timedelta(days=30)).timestamp() * 1000)
+    now_ms = int(datetime.now(timezone.utc).timestamp() * 1000)
+
+    print(f"  [{symbol}] OI 히스토리 수집 중 (최근 30일)...")
+    while start_ts < now_ms:
+        params = {
+            "symbol": symbol,
+            "period": period,
+            "limit": 500,
+            "startTime": start_ts,
+        }
+        async with session.get(url, params=params) as resp:
+            data = await resp.json()
+
+        if not data or not isinstance(data, list):
+            break
+
+        all_rows.extend(data)
+        last_ts = int(data[-1]["timestamp"])
+        if last_ts >= now_ms or len(data) < 500:
+            break
+        start_ts = last_ts + 1
+        await asyncio.sleep(_REQUEST_DELAY)
+
+    if not all_rows:
+        print(f"  [{symbol}] OI 데이터 없음 — 빈 DataFrame 반환")
+        return pd.DataFrame(columns=["oi", "oi_value"])
+
+    df = pd.DataFrame(all_rows)
+    df["timestamp"] = pd.to_datetime(df["timestamp"].astype(int), unit="ms", utc=True)
+    df = df.set_index("timestamp")
+    df = df[["sumOpenInterest", "sumOpenInterestValue"]].copy()
+    df.columns = ["oi", "oi_value"]
+    df["oi"] = df["oi"].astype(float)
+    df["oi_value"] = df["oi_value"].astype(float)
+    # OI 변화율 (1캔들 전 대비)
+    df["oi_change"] = df["oi"].pct_change(1).fillna(0)
+    print(f"  [{symbol}] OI 수집 완료: {len(df):,}행")
+    return df[["oi_change"]]
+
+
+async def _fetch_funding_rate(
+    session: aiohttp.ClientSession,
+    symbol: str,
+    days: int,
+) -> pd.DataFrame:
+    """
+    바이낸스 /fapi/v1/fundingRate 엔드포인트로 펀딩비 히스토리를 수집한다.
+    8시간 주기 데이터 → 15분봉 인덱스에 forward-fill로 병합 예정.
+    """
+    url = f"{_FAPI_BASE}/fapi/v1/fundingRate"
+    all_rows = []
+    start_ts = int((datetime.now(timezone.utc) - timedelta(days=days)).timestamp() * 1000)
+    now_ms = int(datetime.now(timezone.utc).timestamp() * 1000)
+
+    print(f"  [{symbol}] 펀딩비 히스토리 수집 중 ({days}일)...")
+    while start_ts < now_ms:
+        params = {
+            "symbol": symbol,
+            "startTime": start_ts,
+            "limit": 1000,
+        }
+        async with session.get(url, params=params) as resp:
+            data = await resp.json()
+
+        if not data or not isinstance(data, list):
+            break
+
+        all_rows.extend(data)
+        last_ts = int(data[-1]["fundingTime"])
+        if last_ts >= now_ms or len(data) < 1000:
+            break
+        start_ts = last_ts + 1
+        await asyncio.sleep(_REQUEST_DELAY)
+
+    if not all_rows:
+        print(f"  [{symbol}] 펀딩비 데이터 없음 — 빈 DataFrame 반환")
+        return pd.DataFrame(columns=["funding_rate"])
+
+    df = pd.DataFrame(all_rows)
+    df["timestamp"] = pd.to_datetime(df["fundingTime"].astype(int), unit="ms", utc=True)
+    df = df.set_index("timestamp")
+    df["funding_rate"] = df["fundingRate"].astype(float)
+    print(f"  [{symbol}] 펀딩비 수집 완료: {len(df):,}행")
+    return df[["funding_rate"]]
+
+
+def _merge_oi_funding(
+    candles: pd.DataFrame,
+    oi_df: pd.DataFrame,
+    funding_df: pd.DataFrame,
+) -> pd.DataFrame:
+    """
+    캔들 DataFrame에 OI 변화율과 펀딩비를 병합한다.
+    - oi_change: 15분봉 인덱스에 nearest merge (없는 구간은 0)
+    - funding_rate: 8시간 주기 → forward-fill 후 병합 (없는 구간은 0)
+    """
+    result = candles.copy()
+
+    # OI 병합: 타임스탬프 기준 reindex + nearest fill
+    if not oi_df.empty:
+        oi_reindexed = oi_df.reindex(result.index, method="nearest", tolerance=pd.Timedelta("8min"))
+        result["oi_change"] = oi_reindexed["oi_change"].fillna(0).astype(float)
+    else:
+        result["oi_change"] = 0.0
+
+    # 펀딩비 병합: forward-fill (8시간 주기이므로 다음 펀딩 시점까지 이전 값 유지)
+    if not funding_df.empty:
+        funding_reindexed = funding_df.reindex(
+            result.index.union(funding_df.index)
+        ).sort_index()
+        funding_reindexed = funding_reindexed["funding_rate"].ffill()
+        result["funding_rate"] = funding_reindexed.reindex(result.index).fillna(0).astype(float)
+    else:
+        result["funding_rate"] = 0.0
+
+    return result
+
+
+async def _fetch_oi_and_funding(
+    symbol: str,
+    days: int,
+    candles: pd.DataFrame,
+) -> pd.DataFrame:
+    """단일 심볼의 OI + 펀딩비를 수집해 캔들에 병합한다."""
+    async with aiohttp.ClientSession() as session:
+        oi_df = await _fetch_oi_hist(session, symbol)
+        await asyncio.sleep(1)
+        funding_df = await _fetch_funding_rate(session, symbol, days)
+
+    return _merge_oi_funding(candles, oi_df, funding_df)
+
+
 def main():
    parser = argparse.ArgumentParser(
        description="바이낸스 선물 과거 캔들 수집. 단일 심볼 또는 멀티 심볼 병합 저장."
    )
    parser.add_argument("--symbols", nargs="+", default=["XRPUSDT"])
    parser.add_argument("--symbol",   default=None, help="단일 심볼 (--symbols 미사용 시)")
-    parser.add_argument("--interval", default="1m")
-    parser.add_argument("--days",     type=int, default=90)
-    parser.add_argument("--output",   default="data/xrpusdt_1m.parquet")
+    parser.add_argument("--interval", default="15m")
+    parser.add_argument("--days",     type=int, default=365)
+    parser.add_argument("--output",   default="data/combined_15m.parquet")
+    parser.add_argument(
+        "--no-oi", action="store_true",
+        help="OI/펀딩비 수집을 건너뜀 (캔들 데이터만 저장)",
+    )
    args = parser.parse_args()

    # 하위 호환: --symbol 단독 사용 시 symbols로 통합
@@ -124,8 +280,11 @@ def main():

    if len(args.symbols) == 1:
        df = asyncio.run(fetch_klines(args.symbols[0], args.interval, args.days))
+        if not args.no_oi:
+            print(f"\n[OI/펀딩비] {args.symbols[0]} 수집 중...")
+            df = asyncio.run(_fetch_oi_and_funding(args.symbols[0], args.days, df))
        df.to_parquet(args.output)
-        print(f"저장 완료: {args.output} ({len(df):,}행)")
+        print(f"저장 완료: {args.output} ({len(df):,}행, {len(df.columns)}컬럼)")
    else:
        # 멀티 심볼: 단일 클라이언트로 순차 수집 후 타임스탬프 기준 inner join 병합
        dfs = asyncio.run(fetch_klines_all(args.symbols, args.interval, args.days))
@@ -139,6 +298,11 @@ def main():
                how="inner",
            )

+        # 주 심볼(XRP)에 대해서만 OI/펀딩비 수집 후 병합
+        if not args.no_oi:
+            print(f"\n[OI/펀딩비] {primary} 수집 중...")
+            merged = asyncio.run(_fetch_oi_and_funding(primary, args.days, merged))
+
        output = args.output.replace("xrpusdt", "combined")
        merged.to_parquet(output)
        print(f"\n병합 저장 완료: {output} ({len(merged):,}행, {len(merged.columns)}컬럼)")
--- a/scripts/train_and_deploy.sh
+++ b/scripts/train_and_deploy.sh
@@ -1,47 +1,75 @@
 #!/usr/bin/env bash
 # 맥미니에서 전체 학습 파이프라인을 실행하고 LXC로 배포한다.
-# 사용법: bash scripts/train_and_deploy.sh [LXC_HOST] [LXC_MODELS_PATH]
+# 사용법: bash scripts/train_and_deploy.sh [mlx|lgbm] [wf-splits]
 #
 # 예시:
-#   bash scripts/train_and_deploy.sh
-#   bash scripts/train_and_deploy.sh root@10.1.10.24 /root/cointrader/models
+#   bash scripts/train_and_deploy.sh             # LightGBM + Walk-Forward 5폴드 (기본값)
+#   bash scripts/train_and_deploy.sh mlx         # MLX GPU 학습 + Walk-Forward 5폴드
+#   bash scripts/train_and_deploy.sh lgbm 3      # LightGBM + Walk-Forward 3폴드
+#   bash scripts/train_and_deploy.sh mlx 0       # MLX 학습만 (Walk-Forward 건너뜀)
+#   bash scripts/train_and_deploy.sh lgbm 0      # LightGBM 학습만 (Walk-Forward 건너뜀)

 set -euo pipefail

-LXC_HOST="${1:-root@10.1.10.24}"
-LXC_MODELS_PATH="${2:-/root/cointrader/models}"
-
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"

+VENV_PATH="${VENV_PATH:-$PROJECT_ROOT/.venv}"
+if [ -f "$VENV_PATH/bin/activate" ]; then
+    # shellcheck source=/dev/null
+    source "$VENV_PATH/bin/activate"
+else
+    echo "경고: 가상환경을 찾을 수 없습니다 ($VENV_PATH). 시스템 Python을 사용합니다." >&2
+fi
+
+BACKEND="${1:-lgbm}"
+WF_SPLITS="${2:-5}"   # 두 번째 인자: Walk-Forward 폴드 수 (0이면 건너뜀)
+
 cd "$PROJECT_ROOT"

-echo "=== [1/3] 데이터 수집 (XRP + BTC + ETH 3심볼) ==="
+echo "=== [1/3] 데이터 수집 (XRP + BTC + ETH 3심볼, 1년치 + OI/펀딩비) ==="
 python scripts/fetch_history.py \
    --symbols XRPUSDT BTCUSDT ETHUSDT \
-    --interval 1m \
-    --days 90 \
-    --output data/xrpusdt_1m.parquet
-# 결과: data/combined_1m.parquet (타임스탬프 기준 병합)
+    --interval 15m \
+    --days 365 \
+    --output data/combined_15m.parquet

 echo ""
-echo "=== [2/3] 모델 학습 (21개 피처: XRP 13 + BTC/ETH 상관관계 8) ==="
-# TRAIN_BACKEND=mlx 로 설정하면 Apple Silicon GPU(Metal)를 사용한다 (기본: lgbm)
-BACKEND="${TRAIN_BACKEND:-lgbm}"
+echo "=== [2/3] 모델 학습 (23개 피처: XRP 13 + BTC/ETH 8 + OI/펀딩비 2) ==="
+DECAY="${TIME_WEIGHT_DECAY:-2.0}"
 if [ "$BACKEND" = "mlx" ]; then
-    echo "  Backend: MLX (Apple Silicon GPU)"
-    python scripts/train_mlx_model.py --data data/combined_1m.parquet
+    echo "  Backend: MLX (Apple Silicon GPU), decay=${DECAY}"
+    python scripts/train_mlx_model.py --data data/combined_15m.parquet --decay "$DECAY"
 else
-    echo "  Backend: LightGBM (CPU)"
-    python scripts/train_model.py --data data/combined_1m.parquet
+    echo "  Backend: LightGBM (CPU), decay=${DECAY}"
+    python scripts/train_model.py --data data/combined_15m.parquet --decay "$DECAY"
+fi
+
+# Walk-forward validation (only when WF_SPLITS > 0)
+if [ "$WF_SPLITS" -gt 0 ] 2>/dev/null; then
+    echo ""
+    echo "=== [2.5/3] Walk-forward validation (${WF_SPLITS} folds) ==="
+    if [ "$BACKEND" = "mlx" ]; then
+        python scripts/train_mlx_model.py \
+            --data data/combined_15m.parquet \
+            --decay "$DECAY" \
+            --wf \
+            --wf-splits "$WF_SPLITS"
+    else
+        python scripts/train_model.py \
+            --data data/combined_15m.parquet \
+            --decay "$DECAY" \
+            --wf \
+            --wf-splits "$WF_SPLITS"
+    fi
 fi

 echo ""
 echo "=== [3/3] Deploy to LXC ==="
-bash scripts/deploy_model.sh "$LXC_HOST" "$LXC_MODELS_PATH"
+bash scripts/deploy_model.sh "$BACKEND"

 echo ""
 echo "=== Full pipeline complete ==="
 echo ""
 echo "If the bot needs a restart:"
-echo "  ssh ${LXC_HOST} 'cd /root/cointrader && docker compose restart cointrader'"
+echo "  ssh root@10.1.10.24 'cd /root/cointrader && docker compose restart cointrader'"
--- a/scripts/train_mlx_model.py
+++ b/scripts/train_mlx_model.py
@@ -25,14 +25,40 @@ MLX_MODEL_PATH = Path("models/mlx_filter.weights")
 LOG_PATH = Path("models/training_log.json")


-def train_mlx(data_path: str) -> float:
+def _split_combined(df: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame | None, pd.DataFrame | None]:
+    """Split the combined parquet into separate XRP/BTC/ETH DataFrames."""
+    xrp_cols = ["open", "high", "low", "close", "volume"]
+    xrp_df = df[xrp_cols].copy()
+
+    btc_df = None
+    eth_df = None
+    btc_raw = [c for c in df.columns if c.endswith("_btc")]
+    eth_raw = [c for c in df.columns if c.endswith("_eth")]
+
+    if btc_raw:
+        btc_df = df[btc_raw].copy()
+        btc_df.columns = [c.replace("_btc", "") for c in btc_raw]
+    if eth_raw:
+        eth_df = df[eth_raw].copy()
+        eth_df.columns = [c.replace("_eth", "") for c in eth_raw]
+
+    return xrp_df, btc_df, eth_df
+
+
+def train_mlx(data_path: str, time_weight_decay: float = 2.0) -> float:
    print(f"Loading data: {data_path}")
-    df = pd.read_parquet(data_path)
-    print(f"Candles: {len(df)}")
+    raw = pd.read_parquet(data_path)
+    print(f"Candles: {len(raw)}")
+
+    df, btc_df, eth_df = _split_combined(raw)
+    if btc_df is not None:
+        print("  BTC/ETH features enabled (21 features)")
+    else:
+        print("  XRP-only data (13 features)")

    print("\nBuilding dataset...")
    t0 = time.perf_counter()
-    dataset = generate_dataset_vectorized(df)
+    dataset = generate_dataset_vectorized(df, btc_df=btc_df, eth_df=eth_df, time_weight_decay=time_weight_decay)
    t1 = time.perf_counter()
    print(f"Dataset built: {t1 - t0:.1f}s, {len(dataset)} samples")

@@ -44,12 +70,38 @@ def train_mlx(data_path: str) -> float:
    if len(dataset) < 200:
        raise ValueError(f"Not enough training samples: {len(dataset)} (at least 200 required)")

+    actual_cols = [c for c in FEATURE_COLS if c in dataset.columns]
+    missing = [c for c in FEATURE_COLS if c not in dataset.columns]
+    if missing:
+        print(f"  Warning: features missing from dataset {missing} -> filled with 0 (no BTC/ETH data provided)")
+        for col in missing:
+            dataset[col] = 0.0
    X = dataset[FEATURE_COLS]
    y = dataset["label"]
+    w = dataset["sample_weight"].values

    split = int(len(X) * 0.8)
    X_train, X_val = X.iloc[:split], X.iloc[split:]
    y_train, y_val = y.iloc[:split], y.iloc[split:]
+    w_train = w[:split]
+
+    # --- Class imbalance: undersample negatives (weights stay aligned by index) ---
+    pos_idx = np.where(y_train == 1)[0]
+    neg_idx = np.where(y_train == 0)[0]
+
+    np.random.seed(42)  # seed unconditionally so the shuffle below is reproducible as well
+    if len(neg_idx) > len(pos_idx):
+        neg_idx = np.random.choice(neg_idx, size=len(pos_idx), replace=False)
+
+    balanced_idx = np.concatenate([pos_idx, neg_idx])
+    np.random.shuffle(balanced_idx)
+
+    X_train = X_train.iloc[balanced_idx]
+    y_train = y_train.iloc[balanced_idx]
+    w_train = w_train[balanced_idx]
+
+    print(f"\nTraining data after undersampling: {len(X_train)} (positive={y_train.sum()}, negative={(y_train==0).sum()})")
+    # --------------------------------------

    print("\nStarting MLX neural network training (GPU)...")
    t2 = time.perf_counter()
@@ -60,14 +112,32 @@ def train_mlx(data_path: str) -> float:
        epochs=100,
        batch_size=256,
    )
-    model.fit(X_train, y_train)
+    model.fit(X_train, y_train, sample_weight=w_train)
    t3 = time.perf_counter()
    print(f"Training finished: {t3 - t2:.1f}s")

    val_proba = model.predict_proba(X_val)
    auc = roc_auc_score(y_val, val_proba)
-    print(f"\nValidation AUC: {auc:.4f}")
-    print(classification_report(y_val, (val_proba >= 0.60).astype(int)))
+
+    # Threshold search: maximize precision subject to a minimum recall of 0.15
+    from sklearn.metrics import precision_recall_curve
+    precisions, recalls, thresholds = precision_recall_curve(y_val, val_proba)
+    precisions, recalls = precisions[:-1], recalls[:-1]
+
+    MIN_RECALL = 0.15
+    valid_idx = np.where(recalls >= MIN_RECALL)[0]
+    if len(valid_idx) > 0:
+        best_idx  = valid_idx[np.argmax(precisions[valid_idx])]
+        best_thr  = float(thresholds[best_idx])
+        best_prec = float(precisions[best_idx])
+        best_rec  = float(recalls[best_idx])
+    else:
+        best_thr, best_prec, best_rec = 0.50, 0.0, 0.0
+        print(f"  [Warning] no threshold satisfies recall >= {MIN_RECALL} -> using default 0.50")
+
+    print(f"\nValidation AUC: {auc:.4f}  |  best threshold: {best_thr:.4f} "
+          f"(Precision={best_prec:.3f}, Recall={best_rec:.3f})")
+    print(classification_report(y_val, (val_proba >= best_thr).astype(int), zero_division=0))

    MLX_MODEL_PATH.parent.mkdir(exist_ok=True)
    model.save(MLX_MODEL_PATH)
@@ -81,8 +151,12 @@ def train_mlx(data_path: str) -> float:
        "date": datetime.now().isoformat(),
        "backend": "mlx",
        "auc": round(auc, 4),
+        "best_threshold": round(best_thr, 4),
+        "best_precision": round(best_prec, 3),
+        "best_recall":    round(best_rec, 3),
        "samples": len(dataset),
        "train_sec": round(t3 - t2, 1),
+        "time_weight_decay": time_weight_decay,
        "model_path": str(MLX_MODEL_PATH),
    })
    with open(LOG_PATH, "w") as f:
@@ -91,11 +165,107 @@ def train_mlx(data_path: str) -> float:
    return auc


+def walk_forward_auc(
+    data_path: str,
+    time_weight_decay: float = 2.0,
+    n_splits: int = 5,
+    train_ratio: float = 0.6,
+) -> None:
+    """Walk-forward validation: train/validate n_splits times on an expanding training window."""
+    print(f"\n=== Walk-forward validation ({n_splits} folds, decay={time_weight_decay}) ===")
+    raw = pd.read_parquet(data_path)
+    df, btc_df, eth_df = _split_combined(raw)
+
+    dataset = generate_dataset_vectorized(
+        df, btc_df=btc_df, eth_df=eth_df, time_weight_decay=time_weight_decay
+    )
+    missing = [c for c in FEATURE_COLS if c not in dataset.columns]
+    for col in missing:
+        dataset[col] = 0.0
+
+    X_all = dataset[FEATURE_COLS].values.astype(np.float32)
+    y_all = dataset["label"].values.astype(np.float32)
+    w_all = dataset["sample_weight"].values.astype(np.float32)
+    n = len(dataset)
+
+    step = max(1, int(n * (1 - train_ratio) / n_splits))
+    train_end_start = int(n * train_ratio)
+
+    aucs = []
+    for i in range(n_splits):
+        tr_end = train_end_start + i * step
+        val_end = tr_end + step
+        if val_end > n:
+            break
+
+        X_tr_raw = X_all[:tr_end]
+        y_tr = y_all[:tr_end]
+        w_tr = w_all[:tr_end]
+        X_val_raw = X_all[tr_end:val_end]
+        y_val = y_all[tr_end:val_end]
+
+        pos_idx = np.where(y_tr == 1)[0]
+        neg_idx = np.where(y_tr == 0)[0]
+        if len(neg_idx) > len(pos_idx):
+            np.random.seed(42)
+            neg_idx = np.random.choice(neg_idx, size=len(pos_idx), replace=False)
+        bal_idx = np.sort(np.concatenate([pos_idx, neg_idx]))
+
+        X_tr_bal = X_tr_raw[bal_idx]
+        y_tr_bal = y_tr[bal_idx]
+        w_tr_bal = w_tr[bal_idx]
+
+        # Per-fold normalization (statistics from the training data, applied unchanged to validation)
+        mean = X_tr_bal.mean(axis=0)
+        std = X_tr_bal.std(axis=0) + 1e-8
+        X_tr_norm = (X_tr_bal - mean) / std
+        X_val_norm = (X_val_raw - mean) / std
+
+        # Wrap in DataFrames before passing to MLXFilter.fit().
+        # The data is already normalized here; after fit() the stored
+        # _mean/_std are pinned to 0/1 so prediction does not normalize a second time.
+        X_tr_df = pd.DataFrame(X_tr_norm, columns=FEATURE_COLS)
+        X_val_df = pd.DataFrame(X_val_norm, columns=FEATURE_COLS)
+
+        model = MLXFilter(
+            input_dim=len(FEATURE_COLS),
+            hidden_dim=128,
+            lr=1e-3,
+            epochs=100,
+            batch_size=256,
+        )
+        model.fit(X_tr_df, pd.Series(y_tr_bal), sample_weight=w_tr_bal)
+        # fit() normalizes internally again, so replace the stored mean/std with the identity transform
+        model._mean = np.zeros(len(FEATURE_COLS), dtype=np.float32)
+        model._std = np.ones(len(FEATURE_COLS), dtype=np.float32)
+
+        proba = model.predict_proba(X_val_df)
+        auc = roc_auc_score(y_val, proba) if len(np.unique(y_val)) > 1 else 0.5
+        aucs.append(auc)
+        print(
+            f"  Fold {i+1}/{n_splits}: train={tr_end}, "
+            f"val={tr_end}~{val_end} ({step}), AUC={auc:.4f}"
+        )
+
+    print(f"\n  Walk-forward mean AUC: {np.mean(aucs):.4f} ± {np.std(aucs):.4f}")
+    print(f"  Per fold: {[round(a, 4) for a in aucs]}")
+
+
 def main():
    parser = argparse.ArgumentParser()
-    parser.add_argument("--data", default="data/xrpusdt_1m.parquet")
+    parser.add_argument("--data", default="data/combined_15m.parquet")
+    parser.add_argument(
+        "--decay", type=float, default=2.0,
+        help="Time-weight decay strength (0 = uniform; 2.0 weights the newest sample ~7.4x the oldest)",
+    )
+    parser.add_argument("--wf", action="store_true", help="Run walk-forward validation")
+    parser.add_argument("--wf-splits", type=int, default=5, help="Number of walk-forward folds")
    args = parser.parse_args()
-    train_mlx(args.data)
+
+    if args.wf:
+        walk_forward_auc(args.data, time_weight_decay=args.decay, n_splits=args.wf_splits)
+    else:
+        train_mlx(args.data, time_weight_decay=args.decay)


 if __name__ == "__main__":
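
The precision-first threshold search added above can be sketched without scikit-learn. This is a minimal pure-Python illustration, not the commit's code: `pick_threshold` is a hypothetical name, and the loop over unique scores stands in for `precision_recall_curve`. It keeps the same rule (highest precision among thresholds with recall >= 0.15, lowest such threshold on ties, fallback to 0.50).

```python
def pick_threshold(scores, labels, min_recall=0.15, fallback=0.50):
    """Return (threshold, precision, recall) with the highest precision
    among thresholds whose recall is at least min_recall."""
    total_pos = sum(labels)
    best = None
    # Candidate thresholds are the unique scores, scanned in ascending order.
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        if total_pos == 0 or tp + fp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / total_pos
        # Strict ">" keeps the first (lowest) threshold on precision ties.
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (thr, precision, recall)
    # No threshold met the recall floor: fall back to the default cut-off.
    return best if best is not None else (fallback, 0.0, 0.0)


if __name__ == "__main__":
    print(pick_threshold([0.1, 0.4, 0.35, 0.8, 0.9], [0, 0, 1, 1, 1]))
```

Raising `min_recall` trades precision for coverage: with the sample data, a floor of 0.9 moves the chosen threshold from 0.8 down to 0.35.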
--- a/scripts/train_model.py
+++ b/scripts/train_model.py
@@ -53,7 +53,7 @@ def _cgroup_cpu_count() -> int:
    return cpu_count()


-LOOKAHEAD = 60
+LOOKAHEAD = 24  # 24 x 15-minute candles = 6 hours (kept in sync with dataset_builder.py)
 ATR_SL_MULT = 1.5
 ATR_TP_MULT = 3.0
 MODEL_PATH = Path("models/lgbm_filter.pkl")
@@ -146,7 +146,7 @@ def generate_dataset(df: pd.DataFrame, n_jobs: int | None = None) -> pd.DataFram
    return pd.DataFrame(rows)


-def train(data_path: str):
+def train(data_path: str, time_weight_decay: float = 2.0):
    print(f"Loading data: {data_path}")
    df_raw = pd.read_parquet(data_path)
    print(f"Candles: {len(df_raw)}, columns: {list(df_raw.columns)}")
@@ -169,7 +169,7 @@ def train(data_path: str):
    df = df_raw[base_cols].copy()

    print("Building dataset...")
-    dataset = generate_dataset_vectorized(df, btc_df=btc_df, eth_df=eth_df)
+    dataset = generate_dataset_vectorized(df, btc_df=btc_df, eth_df=eth_df, time_weight_decay=time_weight_decay)

    if dataset.empty or "label" not in dataset.columns:
        raise ValueError("Dataset generation failed: 0 samples. Check the error messages above.")
@@ -183,32 +183,77 @@ def train(data_path: str):
    print(f"Features used: {len(actual_feature_cols)} {actual_feature_cols}")
    X = dataset[actual_feature_cols]
    y = dataset["label"]
+    w = dataset["sample_weight"].values

    split = int(len(X) * 0.8)
    X_train, X_val = X.iloc[:split], X.iloc[split:]
    y_train, y_val = y.iloc[:split], y.iloc[split:]
+    w_train = w[:split]
+
+    # --- Class imbalance: undersample negatives (time weights stay aligned by index) ---
+    pos_idx = np.where(y_train == 1)[0]
+    neg_idx = np.where(y_train == 0)[0]
+
+    if len(neg_idx) > len(pos_idx):
+        np.random.seed(42)
+        neg_idx = np.random.choice(neg_idx, size=len(pos_idx), replace=False)
+
+    balanced_idx = np.sort(np.concatenate([pos_idx, neg_idx]))  # keep chronological order
+
+    X_train = X_train.iloc[balanced_idx]
+    y_train = y_train.iloc[balanced_idx]
+    w_train = w_train[balanced_idx]
+
+    print(f"\nTraining data after undersampling: {len(X_train)} (positive={y_train.sum()}, negative={(y_train==0).sum()})")
+    print(f"Validation data: {len(X_val)} (positive={int(y_val.sum())}, negative={int((y_val==0).sum())})")
+    # ---------------------------------------------------------------

    model = lgb.LGBMClassifier(
-        n_estimators=300,
+        n_estimators=500,
        learning_rate=0.05,
        num_leaves=31,
-        min_child_samples=20,
+        min_child_samples=15,
        subsample=0.8,
        colsample_bytree=0.8,
-        class_weight="balanced",
+        reg_alpha=0.05,
+        reg_lambda=0.1,
        random_state=42,
        verbose=-1,
    )
    model.fit(
        X_train, y_train,
+        sample_weight=w_train,
        eval_set=[(X_val, y_val)],
-        callbacks=[lgb.early_stopping(30, verbose=False), lgb.log_evaluation(50)],
+        eval_metric="auc",
+        callbacks=[
+            lgb.early_stopping(80, first_metric_only=True, verbose=False),
+            lgb.log_evaluation(50),
+        ],
    )

    val_proba = model.predict_proba(X_val)[:, 1]
    auc = roc_auc_score(y_val, val_proba)
-    print(f"\nValidation AUC: {auc:.4f}")
-    print(classification_report(y_val, (val_proba >= 0.60).astype(int)))
+
+    # Threshold search: maximize precision subject to a minimum recall of 0.15
+    from sklearn.metrics import precision_recall_curve
+    precisions, recalls, thresholds = precision_recall_curve(y_val, val_proba)
+    # The last element of precision_recall_curve is (precision=1.0, recall=0.0) with no threshold, so drop it
+    precisions, recalls = precisions[:-1], recalls[:-1]
+
+    MIN_RECALL = 0.15
+    valid_idx = np.where(recalls >= MIN_RECALL)[0]
+    if len(valid_idx) > 0:
+        best_idx  = valid_idx[np.argmax(precisions[valid_idx])]
+        best_thr  = float(thresholds[best_idx])
+        best_prec = float(precisions[best_idx])
+        best_rec  = float(recalls[best_idx])
+    else:
+        best_thr, best_prec, best_rec = 0.50, 0.0, 0.0
+        print(f"  [Warning] no threshold satisfies recall >= {MIN_RECALL} -> using default 0.50")
+
+    print(f"\nValidation AUC: {auc:.4f}  |  best threshold: {best_thr:.4f} "
+          f"(Precision={best_prec:.3f}, Recall={best_rec:.3f})")
+    print(classification_report(y_val, (val_proba >= best_thr).astype(int), zero_division=0))

    if MODEL_PATH.exists():
        import shutil
@@ -225,9 +270,14 @@ def train(data_path: str):
            log = json.load(f)
    log.append({
        "date": datetime.now().isoformat(),
+        "backend": "lgbm",
        "auc": round(auc, 4),
+        "best_threshold": round(best_thr, 4),
+        "best_precision": round(best_prec, 3),
+        "best_recall":    round(best_rec, 3),
        "samples": len(dataset),
        "features": len(actual_feature_cols),
+        "time_weight_decay": time_weight_decay,
        "model_path": str(MODEL_PATH),
    })
    with open(LOG_PATH, "w") as f:
@@ -236,11 +286,103 @@ def train(data_path: str):
    return auc


+def walk_forward_auc(
+    data_path: str,
+    time_weight_decay: float = 2.0,
+    n_splits: int = 5,
+    train_ratio: float = 0.6,
+) -> None:
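+    """Walk-forward validation: train/validate n_splits times on an expanding window.
+
+    Chronological order is preserved; each fold extends the training range
+    and validates on the block that immediately follows it.
+    Used to measure the mean AUC on genuinely unseen future data.
+    """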
+    import warnings
+
+    print(f"\n=== Walk-forward validation ({n_splits} folds, decay={time_weight_decay}) ===")
+    df_raw = pd.read_parquet(data_path)
+    base_cols = ["open", "high", "low", "close", "volume"]
+    btc_df = eth_df = None
+    if "close_btc" in df_raw.columns:
+        btc_df = df_raw[[c + "_btc" for c in base_cols]].copy()
+        btc_df.columns = base_cols
+    if "close_eth" in df_raw.columns:
+        eth_df = df_raw[[c + "_eth" for c in base_cols]].copy()
+        eth_df.columns = base_cols
+    df = df_raw[base_cols].copy()
+
+    dataset = generate_dataset_vectorized(
+        df, btc_df=btc_df, eth_df=eth_df, time_weight_decay=time_weight_decay
+    )
+    actual_feature_cols = [c for c in FEATURE_COLS if c in dataset.columns]
+    X = dataset[actual_feature_cols].values
+    y = dataset["label"].values
+    w = dataset["sample_weight"].values
+    n = len(dataset)
+
+    step = max(1, int(n * (1 - train_ratio) / n_splits))
+    train_end_start = int(n * train_ratio)
+
+    aucs = []
+    for i in range(n_splits):
+        tr_end = train_end_start + i * step
+        val_end = tr_end + step
+        if val_end > n:
+            break
+
+        X_tr, y_tr, w_tr = X[:tr_end], y[:tr_end], w[:tr_end]
+        X_val, y_val = X[tr_end:val_end], y[tr_end:val_end]
+
+        pos_idx = np.where(y_tr == 1)[0]
+        neg_idx = np.where(y_tr == 0)[0]
+        if len(neg_idx) > len(pos_idx):
+            np.random.seed(42)
+            neg_idx = np.random.choice(neg_idx, size=len(pos_idx), replace=False)
+        idx = np.sort(np.concatenate([pos_idx, neg_idx]))
+
+        model = lgb.LGBMClassifier(
+            n_estimators=500,
+            learning_rate=0.05,
+            num_leaves=31,
+            min_child_samples=15,
+            subsample=0.8,
+            colsample_bytree=0.8,
+            reg_alpha=0.05,
+            reg_lambda=0.1,
+            random_state=42,
+            verbose=-1,
+        )
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            model.fit(X_tr[idx], y_tr[idx], sample_weight=w_tr[idx])
+
+        proba = model.predict_proba(X_val)[:, 1]
+        auc = roc_auc_score(y_val, proba) if len(np.unique(y_val)) > 1 else 0.5
+        aucs.append(auc)
+        print(
+            f"  Fold {i+1}/{n_splits}: train={tr_end}, "
+            f"val={tr_end}~{val_end} ({step}), AUC={auc:.4f}"
+        )
+
+    print(f"\n  Walk-forward mean AUC: {np.mean(aucs):.4f} ± {np.std(aucs):.4f}")
+    print(f"  Per fold: {[round(a, 4) for a in aucs]}")
+
+
 def main():
    parser = argparse.ArgumentParser()
-    parser.add_argument("--data", default="data/xrpusdt_1m.parquet")
+    parser.add_argument("--data", default="data/combined_15m.parquet")
+    parser.add_argument(
+        "--decay", type=float, default=2.0,
+        help="Time-weight decay strength (0 = uniform; 2.0 weights the newest sample ~7.4x the oldest)",
+    )
+    parser.add_argument("--wf", action="store_true", help="Run walk-forward validation")
+    parser.add_argument("--wf-splits", type=int, default=5, help="Number of walk-forward folds")
    args = parser.parse_args()
-    train(args.data)
+
+    if args.wf:
+        walk_forward_auc(args.data, time_weight_decay=args.decay, n_splits=args.wf_splits)
+    else:
+        train(args.data, time_weight_decay=args.decay)


 if __name__ == "__main__":
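
The fold boundaries used by `walk_forward_auc` follow directly from `n`, `n_splits`, and `train_ratio`. The sketch below reproduces only that index arithmetic; `walk_forward_folds` is a hypothetical helper name and no model is trained. Each fold trains on `[0, tr_end)` and validates on `[tr_end, val_end)`, so the training range grows while the validation block slides forward.

```python
def walk_forward_folds(n, n_splits=5, train_ratio=0.6):
    """Return a list of (tr_end, val_end) index pairs for expanding-window folds."""
    # The last (1 - train_ratio) share of the data is cut into n_splits validation blocks.
    step = max(1, int(n * (1 - train_ratio) / n_splits))
    first_train_end = int(n * train_ratio)
    folds = []
    for i in range(n_splits):
        tr_end = first_train_end + i * step
        val_end = tr_end + step
        if val_end > n:  # not enough rows left for a full validation block
            break
        folds.append((tr_end, val_end))
    return folds


if __name__ == "__main__":
    for tr_end, val_end in walk_forward_folds(1000):
        print(f"train=[0, {tr_end})  val=[{tr_end}, {val_end})")
```

With 1000 samples and the defaults, the first fold trains on 600 rows and every validation block holds 80 rows.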
--- a/src/bot.py
+++ b/src/bot.py
@@ -20,7 +20,7 @@ class TradingBot:
        self.current_trade_side: str | None = None  # "LONG" | "SHORT"
        self.stream = MultiSymbolStream(
            symbols=[config.symbol, "BTCUSDT", "ETHUSDT"],
-            interval="1m",
+            interval="15m",
            on_candle=self._on_candle_closed,
        )

@@ -50,6 +50,8 @@ class TradingBot:
            logger.info("No existing position; waiting for a new entry")

    async def process_candle(self, df, btc_df=None, eth_df=None):
+        self.ml_filter.check_and_reload()
+
        if not self.risk.is_trading_allowed():
            logger.warning("Risk limit exceeded; trading halted")
            return
@@ -85,9 +87,11 @@ class TradingBot:
    async def _open_position(self, signal: str, df):
        balance = await self.exchange.get_balance()
        price = df["close"].iloc[-1]
+        margin_ratio = self.risk.get_dynamic_margin_ratio(balance)
        quantity = self.exchange.calculate_quantity(
-            balance=balance, price=price, leverage=self.config.leverage
+            balance=balance, price=price, leverage=self.config.leverage, margin_ratio=margin_ratio
        )
+        logger.info(f"Position size: balance={balance:.2f} USDT, margin ratio={margin_ratio:.1%}, quantity={quantity}")
        stop_loss, take_profit = Indicators(df).get_atr_stop(df, signal, price)

        notional = quantity * price
@@ -165,6 +169,9 @@ class TradingBot:
    async def run(self):
        logger.info(f"Bot started: {self.config.symbol}, leverage {self.config.leverage}x")
        await self._recover_position()
+        balance = await self.exchange.get_balance()
+        self.risk.set_base_balance(balance)
+        logger.info(f"Base balance set: {balance:.2f} USDT (reference point for the dynamic margin ratio)")
        await self.stream.start(
            api_key=self.config.api_key,
            api_secret=self.config.api_secret,
--- a/src/config.py
+++ b/src/config.py
@@ -11,17 +11,21 @@ class Config:
    api_secret: str = ""
    symbol: str = "XRPUSDT"
    leverage: int = 10
-    risk_per_trade: float = 0.02
    max_positions: int = 3
    stop_loss_pct: float = 0.015    # 1.5%
    take_profit_pct: float = 0.045  # 4.5% (3:1 RR)
    trailing_stop_pct: float = 0.01  # 1%
    discord_webhook_url: str = ""
+    margin_max_ratio: float = 0.50
+    margin_min_ratio: float = 0.20
+    margin_decay_rate: float = 0.0006

    def __post_init__(self):
        self.api_key = os.getenv("BINANCE_API_KEY", "")
        self.api_secret = os.getenv("BINANCE_API_SECRET", "")
        self.symbol = os.getenv("SYMBOL", "XRPUSDT")
        self.leverage = int(os.getenv("LEVERAGE", "10"))
-        self.risk_per_trade = float(os.getenv("RISK_PER_TRADE", "0.02"))
        self.discord_webhook_url = os.getenv("DISCORD_WEBHOOK_URL", "")
+        self.margin_max_ratio = float(os.getenv("MARGIN_MAX_RATIO", "0.50"))
+        self.margin_min_ratio = float(os.getenv("MARGIN_MIN_RATIO", "0.20"))
+        self.margin_decay_rate = float(os.getenv("MARGIN_DECAY_RATE", "0.0006"))
--- a/src/data_stream.py
+++ b/src/data_stream.py
@@ -5,13 +5,21 @@ import pandas as pd
 from binance import AsyncClient, BinanceSocketManager
 from loguru import logger

+# Minimum number of 15-minute candles needed before EMA50 stabilizes.
+# Largest of EMA50=50, StochRSI(14,14,3,3)=44, MACD(12,26,9)=33, plus headroom.
+_MIN_CANDLES_FOR_SIGNAL = 100
+
+# Number of historical candles fetched over the REST API at startup.
+# 200 x 15-minute candles = 50 hours, 4x the EMA50 span (12.5 h).
+_PRELOAD_LIMIT = 200
+


 class KlineStream:
    def __init__(
        self,
        symbol: str,
-        interval: str = "1m",
+        interval: str = "15m",
        buffer_size: int = 200,
        on_candle: Callable = None,
    ):
@@ -40,13 +48,13 @@ class KlineStream:
                self.on_candle(candle)

    def get_dataframe(self) -> pd.DataFrame | None:
-        if len(self.buffer) < 50:
+        if len(self.buffer) < _MIN_CANDLES_FOR_SIGNAL:
            return None
        df = pd.DataFrame(list(self.buffer))
        df.set_index("timestamp", inplace=True)
        return df

-    async def _preload_history(self, client: AsyncClient, limit: int = 200):
+    async def _preload_history(self, client: AsyncClient, limit: int = _PRELOAD_LIMIT):
        """Pre-fill the buffer with historical candles from the REST API."""
        logger.info(f"Loading {limit} historical candles...")
        klines = await client.futures_klines(
@@ -96,7 +104,7 @@ class MultiSymbolStream:
    def __init__(
        self,
        symbols: list[str],
-        interval: str = "1m",
+        interval: str = "15m",
        buffer_size: int = 200,
        on_candle: Callable = None,
    ):
@@ -142,13 +150,13 @@ class MultiSymbolStream:
    def get_dataframe(self, symbol: str) -> pd.DataFrame | None:
        key = symbol.lower()
        buf = self.buffers.get(key)
-        if buf is None or len(buf) < 50:
+        if buf is None or len(buf) < _MIN_CANDLES_FOR_SIGNAL:
            return None
        df = pd.DataFrame(list(buf))
        df.set_index("timestamp", inplace=True)
        return df

-    async def _preload_history(self, client: AsyncClient, limit: int = 200):
+    async def _preload_history(self, client: AsyncClient, limit: int = _PRELOAD_LIMIT):
        """Pre-fill the buffers with historical candles for every symbol from the REST API."""
        for symbol in self.symbols:
            logger.info(f"Loading {limit} historical candles for {symbol.upper()}...")
--- a/src/dataset_builder.py
+++ b/src/dataset_builder.py
@@ -11,10 +11,10 @@ import pandas_ta as ta

 from src.ml_features import FEATURE_COLS

-LOOKAHEAD    = 60
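+LOOKAHEAD    = 24   # 24 x 15-minute candles = 6-hour horizon
 ATR_SL_MULT  = 1.5
-ATR_TP_MULT  = 3.0
-WARMUP       = 60   # minimum number of rows needed for indicators to stabilize
+ATR_TP_MULT  = 2.0
+WARMUP       = 60   # 60 x 15-minute candles = 15 hours (enough for indicators to stabilize)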


 def _calc_indicators(df: pd.DataFrame) -> pd.DataFrame:
@@ -115,6 +115,18 @@ def _calc_signals(d: pd.DataFrame) -> np.ndarray:
    return signal_arr


+def _rolling_zscore(arr: np.ndarray, window: int = 288) -> np.ndarray:
+    """Rolling-window z-score normalization. NaN inputs stay NaN in the output (NaN-safe).
+    The window is 3 days of 15-minute candles (288). min_periods=1 keeps early rows usable."""
+    s = pd.Series(arr.astype(np.float64))
+    r = s.rolling(window=window, min_periods=1)
+    mean = r.mean()   # pandas rolling statistics skip NaN values automatically
+    std  = r.std(ddof=0)
+    std  = std.where(std >= 1e-8, other=1e-8)
+    z = (s - mean) / std
+    return z.values.astype(np.float32)
+
+
 def _calc_features_vectorized(
    d: pd.DataFrame,
    signal_arr: np.ndarray,
@@ -142,7 +154,7 @@ def _calc_features_vectorized(
    macd_sig = d["macd_signal"]

    bb_range = bb_upper - bb_lower
-    bb_pct = np.where(bb_range > 0, (close - bb_lower) / bb_range, 0.5)
+    bb_pct = (close - bb_lower) / (bb_range + 1e-8)

    ema_align = np.where(
        (ema9 > ema21) & (ema21 > ema50),  1,
@@ -151,13 +163,20 @@ def _calc_features_vectorized(
        )
    ).astype(np.float32)

-    atr_pct   = np.where(close > 0, atr / close, 0.0)
-    vol_ratio = np.where(vol_ma20 > 0, volume / vol_ma20, 1.0)
+    atr_pct   = atr / (close + 1e-8)
+    vol_ratio = volume / (vol_ma20 + 1e-8)

    ret_1 = close.pct_change(1).fillna(0).values
    ret_3 = close.pct_change(3).fillna(0).values
    ret_5 = close.pct_change(5).fillna(0).values

+    # Normalize absolute-scale features with a rolling z-score (more robust to regime changes)
+    atr_pct_z   = _rolling_zscore(atr_pct)
+    vol_ratio_z = _rolling_zscore(vol_ratio)
+    ret_1_z     = _rolling_zscore(ret_1)
+    ret_3_z     = _rolling_zscore(ret_3)
+    ret_5_z     = _rolling_zscore(ret_5)
+
    prev_macd     = macd.shift(1).fillna(0).values
    prev_macd_sig = macd_sig.shift(1).fillna(0).values

@@ -190,11 +209,11 @@ def _calc_features_vectorized(
        "ema_align":       ema_align,
        "stoch_k":         stoch_k.values.astype(np.float32),
        "stoch_d":         stoch_d.values.astype(np.float32),
-        "atr_pct":         atr_pct.astype(np.float32),
-        "vol_ratio":       vol_ratio.astype(np.float32),
-        "ret_1":           ret_1.astype(np.float32),
-        "ret_3":           ret_3.astype(np.float32),
-        "ret_5":           ret_5.astype(np.float32),
+        "atr_pct":         atr_pct_z,
+        "vol_ratio":       vol_ratio_z,
+        "ret_1":           ret_1_z,
+        "ret_3":           ret_3_z,
+        "ret_5":           ret_5_z,
        "signal_strength": strength,
        "side":            side,
        "_signal":         signal_arr,   # temporary column used for label computation
@@ -223,16 +242,37 @@ def _calc_features_vectorized(
        eth_r5 = _align(eth_ret_5, n).astype(np.float32)

        xrp_r1 = ret_1.astype(np.float32)
-        xrp_btc_rs = np.where(btc_r1 != 0, xrp_r1 / btc_r1, 0.0).astype(np.float32)
-        xrp_eth_rs = np.where(eth_r1 != 0, xrp_r1 / eth_r1, 0.0).astype(np.float32)
+        xrp_btc_rs_raw = (xrp_r1 / (btc_r1 + 1e-8)).astype(np.float32)
+        xrp_eth_rs_raw = (xrp_r1 / (eth_r1 + 1e-8)).astype(np.float32)

        extra = pd.DataFrame({
-            "btc_ret_1": btc_r1, "btc_ret_3": btc_r3, "btc_ret_5": btc_r5,
-            "eth_ret_1": eth_r1, "eth_ret_3": eth_r3, "eth_ret_5": eth_r5,
-            "xrp_btc_rs": xrp_btc_rs, "xrp_eth_rs": xrp_eth_rs,
+            "btc_ret_1":  _rolling_zscore(btc_r1),
+            "btc_ret_3":  _rolling_zscore(btc_r3),
+            "btc_ret_5":  _rolling_zscore(btc_r5),
+            "eth_ret_1":  _rolling_zscore(eth_r1),
+            "eth_ret_3":  _rolling_zscore(eth_r3),
+            "eth_ret_5":  _rolling_zscore(eth_r5),
+            "xrp_btc_rs": _rolling_zscore(xrp_btc_rs_raw),
+            "xrp_eth_rs": _rolling_zscore(xrp_eth_rs_raw),
        }, index=d.index)
        result = pd.concat([result, extra], axis=1)

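+    # OI change rate / funding rate features
+    # If the column is missing, the feature is all NaN; if present, 0.0 stretches (no data available) are masked to NaN
+    # LightGBM handles NaN natively; MLX handles it in fit() via nanmean/nanstd + nan_to_num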
+    if "oi_change" in d.columns:
+        oi_raw = np.where(d["oi_change"].values == 0.0, np.nan, d["oi_change"].values)
+    else:
+        oi_raw = np.full(len(d), np.nan)
+
+    if "funding_rate" in d.columns:
+        fr_raw = np.where(d["funding_rate"].values == 0.0, np.nan, d["funding_rate"].values)
+    else:
+        fr_raw = np.full(len(d), np.nan)
+
+    result["oi_change"]    = _rolling_zscore(oi_raw.astype(np.float64))
+    result["funding_rate"] = _rolling_zscore(fr_raw.astype(np.float64))
+
    return result


@@ -275,28 +315,26 @@ def _calc_labels_vectorized(
        fut_high = highs[idx + 1 : end]
        fut_low  = lows[idx + 1 : end]

-        label = None
+        label = 0  # neither level reached (timeout) counts as a failure
+
        for h, l in zip(fut_high, fut_low):
            if signal == "LONG":
-                if h >= tp:
-                    label = 1
-                    break
                if l <= sl:
                    label = 0
                    break
-            else:
-                if l <= tp:
+                if h >= tp:
                    label = 1
                    break
+            else:  # SHORT
                if h >= sl:
                    label = 0
                    break
+                if l <= tp:
+                    label = 1
+                    break

-        if label is None:
-            valid_mask.append(False)
-        else:
-            labels.append(label)
-            valid_mask.append(True)
+        labels.append(label)
+        valid_mask.append(True)

    return np.array(labels, dtype=np.int8), np.array(valid_mask, dtype=bool)

@@ -305,11 +343,17 @@ def generate_dataset_vectorized(
    df: pd.DataFrame,
    btc_df: pd.DataFrame | None = None,
    eth_df: pd.DataFrame | None = None,
+    time_weight_decay: float = 0.0,
 ) -> pd.DataFrame:
    """
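    Builds the training dataset by computing over the full time series once.
    Drop-in replacement for the existing generate_dataset().
    When btc_df and eth_df are provided, the feature set expands to 21 features.
+
+    time_weight_decay: strength of the exponential decay. 0 gives uniform weights.
+        Larger positive values put more weight on recent samples.
+        E.g. 2.0 -> the newest sample weighs e^2 ≈ 7.4 times the oldest sample.
+        Included in the resulting DataFrame as the 'sample_weight' column.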
    """
    print("  [1/3] Computing indicators over the full time series (single pass)...")
    d = _calc_indicators(df)
@@ -319,7 +363,12 @@ def generate_dataset_vectorized(
    feat_all   = _calc_features_vectorized(d, signal_arr, btc_df=btc_df, eth_df=eth_df)

    # Keep only indices with a signal, no NaN features, and enough future data
-    available_cols_for_nan_check = [c for c in FEATURE_COLS if c in feat_all.columns]
+    # oi_change/funding_rate are optional features (all NaN when the column is missing), so exclude them from the NaN check
+    OPTIONAL_COLS = {"oi_change", "funding_rate"}
+    available_cols_for_nan_check = [
+        c for c in FEATURE_COLS
+        if c in feat_all.columns and c not in OPTIONAL_COLS
+    ]
    valid_rows = (
        (signal_arr != "HOLD") &
        (~feat_all[available_cols_for_nan_check].isna().any(axis=1).values) &
@@ -338,4 +387,17 @@ def generate_dataset_vectorized(
    feat_final = feat_all.iloc[final_idx][available_feature_cols].copy()
    feat_final["label"] = labels

-    return feat_final.reset_index(drop=True)
+    # Time weights: older samples get lower weight, newer samples get higher weight
+    n = len(feat_final)
+    if time_weight_decay > 0 and n > 1:
+        weights = np.exp(time_weight_decay * np.linspace(0.0, 1.0, n)).astype(np.float32)
+        weights /= weights.mean()  # normalize to mean 1 so the effective learning-rate scale is unchanged
+        print(f"  Time weights applied (decay={time_weight_decay}): "
+              f"min={weights.min():.3f}, max={weights.max():.3f}")
+    else:
+        weights = np.ones(n, dtype=np.float32)
+
+    feat_final = feat_final.reset_index(drop=True)
+    feat_final["sample_weight"] = weights
+
+    return feat_final
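The time-weighting step in the hunk above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `exponential_time_weights` is a hypothetical helper name, and it assumes rows are ordered oldest first, as the diff's `np.linspace(0.0, 1.0, n)` implies.

```python
import numpy as np


def exponential_time_weights(n: int, decay: float) -> np.ndarray:
    """Mean-normalized exponential weights, oldest sample first (hypothetical helper)."""
    if decay <= 0 or n <= 1:
        # Same fallback as the diff: uniform weights.
        return np.ones(n, dtype=np.float32)
    # Position 0.0 is the oldest sample, 1.0 the newest.
    weights = np.exp(decay * np.linspace(0.0, 1.0, n)).astype(np.float32)
    # Dividing by the mean keeps the average weight at 1.
    return weights / weights.mean()


w = exponential_time_weights(1000, decay=2.0)
ratio = float(w[-1] / w[0])  # newest/oldest ratio; equals e^decay for any n > 1
print(f"newest/oldest = {ratio:.2f}, mean = {float(w.mean()):.3f}")
```

The newest-to-oldest ratio depends only on `decay`, not on the number of rows, which is why the docstring can quote e^2 ≈ 7.4 for `decay=2.0`.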
--- a/src/exchange.py
+++ b/src/exchange.py
@@ -15,14 +15,12 @@ class BinanceFuturesClient:

    MIN_NOTIONAL = 5.0  # Binance Futures minimum notional (USDT)

-    def calculate_quantity(self, balance: float, price: float, leverage: int) -> float:
-        """Risk-based position sizing (guarantees the $5 minimum notional)"""
-        risk_amount = balance * self.config.risk_per_trade
-        notional = risk_amount * leverage
+    def calculate_quantity(self, balance: float, price: float, leverage: int, margin_ratio: float) -> float:
+        """Position sizing from a dynamic margin ratio (guarantees the $5 minimum notional)"""
+        notional = balance * margin_ratio * leverage
        if notional < self.MIN_NOTIONAL:
            notional = self.MIN_NOTIONAL
        quantity = notional / price
-        # XRP uses 1 decimal place; re-check that the minimum notional is still met
        qty_rounded = round(quantity, 1)
        if qty_rounded * price < self.MIN_NOTIONAL:
            qty_rounded = round(self.MIN_NOTIONAL / price + 0.05, 1)
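The sizing rule in this hunk can be checked by hand with a standalone function. This is a sketch under assumptions: `position_quantity` is a hypothetical free function that mirrors the arithmetic shown in the diff (notional floor of 5 USDT, one-decimal rounding), not the `BinanceFuturesClient` method itself.

```python
MIN_NOTIONAL = 5.0  # USDT, same constant value as in the diff


def position_quantity(balance: float, price: float, leverage: int, margin_ratio: float) -> float:
    """Hypothetical standalone version of the sizing arithmetic."""
    # notional = balance * margin ratio * leverage, floored at the minimum notional
    notional = max(balance * margin_ratio * leverage, MIN_NOTIONAL)
    qty = round(notional / price, 1)
    # Rounding down can push the order below the minimum, so bump it up if needed.
    if qty * price < MIN_NOTIONAL:
        qty = round(MIN_NOTIONAL / price + 0.05, 1)
    return qty


basic = position_quantity(balance=22.0, price=2.5, leverage=10, margin_ratio=0.50)
floor = position_quantity(balance=1.0, price=2.5, leverage=1, margin_ratio=0.01)
print(basic, floor)
```

The first call uses the same inputs as `test_calculate_quantity_basic` (22 × 0.5 × 10 = 110 USDT notional); the second exercises the minimum-notional floor.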
--- a/src/label_builder.py
+++ b/src/label_builder.py
@@ -9,21 +9,17 @@ def build_labels(
    stop_loss: float,
    side: str,
 ) -> Optional[int]:
-    """
-    Walks the future candles in order after entry and checks whether TP or SL is reached.
-    LONG: high >= TP → 1, low <= SL → 0
-    SHORT: low <= TP → 1, high >= SL → 0
-    Neither reached → None (excluded from the training data)
-    """
    for high, low in zip(future_highs, future_lows):
        if side == "LONG":
-            if high >= take_profit:
-                return 1
+            # Conservative approach: check the stop-loss (SL) first
            if low <= stop_loss:
                return 0
-        else:  # SHORT
-            if low <= take_profit:
+            if high >= take_profit:
                return 1
+        else:  # SHORT
+            # Conservative approach: check the stop-loss (SL) first
            if high >= stop_loss:
                return 0
+            if low <= take_profit:
+                return 1
    return None
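The reordering above only matters when a single candle spans both levels, since OHLC data cannot say which was touched first. A minimal sketch of the SL-first rule follows; `label_sl_first` is a hypothetical name and the prices are made-up illustration values.

```python
from typing import Optional, Sequence


def label_sl_first(
    future_highs: Sequence[float],
    future_lows: Sequence[float],
    take_profit: float,
    stop_loss: float,
    side: str,
) -> Optional[int]:
    """Hypothetical re-statement of the SL-first labeling rule."""
    for high, low in zip(future_highs, future_lows):
        if side == "LONG":
            hit_sl, hit_tp = low <= stop_loss, high >= take_profit
        else:  # SHORT
            hit_sl, hit_tp = high >= stop_loss, low <= take_profit
        if hit_sl:  # pessimistic tie-break: a candle that hits both counts as a loss
            return 0
        if hit_tp:
            return 1
    return None  # neither level reached within the lookahead window


# One wide candle that spans both levels of a LONG trade.
both_hit = label_sl_first([1.10], [0.90], take_profit=1.05, stop_loss=0.95, side="LONG")
tp_only = label_sl_first([1.10], [0.99], take_profit=1.05, stop_loss=0.95, side="LONG")
no_hit = label_sl_first([1.01], [0.99], take_profit=1.05, stop_loss=0.95, side="LONG")
print(both_hit, tp_only, no_hit)
```

Under the old TP-first order the ambiguous candle would have been labeled 1, so the change can only move labels from win to loss, never the other way.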
--- a/src/ml_features.py
+++ b/src/ml_features.py
@@ -8,6 +8,9 @@ FEATURE_COLS = [
    "btc_ret_1", "btc_ret_3", "btc_ret_5",
    "eth_ret_1", "eth_ret_3", "eth_ret_5",
    "xrp_btc_rs", "xrp_eth_rs",
+    # Market microstructure: OI change rate (z-score), funding rate (z-score)
+    # If the parquet has no oi_change/funding_rate columns, dataset_builder leaves them as NaN
+    "oi_change", "funding_rate",
 ]


--- a/src/ml_filter.py
+++ b/src/ml_filter.py
@@ -1,32 +1,118 @@
 from pathlib import Path
 import joblib
+import numpy as np
 import pandas as pd
 from loguru import logger

+from src.ml_features import FEATURE_COLS
+
+ONNX_MODEL_PATH = Path("models/mlx_filter.weights.onnx")
+LGBM_MODEL_PATH = Path("models/lgbm_filter.pkl")
+
+
+def _mtime(path: Path) -> float:
+    """Return the file's mtime, or 0.0 if the file does not exist."""
+    try:
+        return path.stat().st_mtime
+    except FileNotFoundError:
+        return 0.0
+

 class MLFilter:
    """
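-    Loads the LightGBM model and decides whether to enter.
-    If the model file is missing, entry is always allowed (fallback).
+    ML filter. Loads ONNX (MLX neural network) first and falls back to LightGBM if it is missing.
+    If neither exists, entry is always allowed.
+
+    Priority: ONNX > LightGBM > fallback (always allow)
+
+    Calling check_and_reload() periodically reloads the model automatically when a model file changes.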
    """

-    def __init__(self, model_path: str = "models/lgbm_filter.pkl", threshold: float = 0.60):
-        self._model_path = Path(model_path)
+    def __init__(
+        self,
+        onnx_path: str = str(ONNX_MODEL_PATH),
+        lgbm_path: str = str(LGBM_MODEL_PATH),
+        threshold: float = 0.60,
+    ):
+        self._onnx_path = Path(onnx_path)
+        self._lgbm_path = Path(lgbm_path)
        self._threshold = threshold
-        self._model = None
+        self._onnx_session = None
+        self._lgbm_model = None
+        self._loaded_onnx_mtime: float = 0.0
+        self._loaded_lgbm_mtime: float = 0.0
        self._try_load()

    def _try_load(self):
-        if self._model_path.exists():
+        # Always record the current mtime of both files, whether or not they were loaded.
+        # That way the file that was not loaded triggers a reload only when it changes later.
+        self._loaded_onnx_mtime = _mtime(self._onnx_path)
+        self._loaded_lgbm_mtime = _mtime(self._lgbm_path)
+
+        # Try ONNX first
+        if self._onnx_path.exists():
            try:
-                self._model = joblib.load(self._model_path)
-                logger.info(f"ML filter model loaded: {self._model_path}")
+                import onnxruntime as ort
+                self._onnx_session = ort.InferenceSession(
+                    str(self._onnx_path),
+                    providers=["CPUExecutionProvider"],
+                )
+                self._lgbm_model = None
+                logger.info(
+                    f"ML filter loaded: ONNX ({self._onnx_path}) "
+                    f"| threshold={self._threshold}"
+                )
+                return
            except Exception as e:
-                logger.warning(f"Failed to load ML filter model: {e}")
-                self._model = None
+                logger.warning(f"Failed to load ONNX model: {e}")
+                self._onnx_session = None
+
+        # LightGBM fallback
+        if self._lgbm_path.exists():
+            try:
+                self._lgbm_model = joblib.load(self._lgbm_path)
+                logger.info(
+                    f"ML filter loaded: LightGBM ({self._lgbm_path}) "
+                    f"| threshold={self._threshold}"
+                )
+            except Exception as e:
+                logger.warning(f"Failed to load LightGBM model: {e}")
+                self._lgbm_model = None
+        else:
+            logger.warning("ML filter: no model file found → allowing all signals (fallback)")

    def is_model_loaded(self) -> bool:
-        return self._model is not None
+        return self._onnx_session is not None or self._lgbm_model is not None
+
+    @property
+    def active_backend(self) -> str:
+        if self._onnx_session is not None:
+            return "ONNX"
+        if self._lgbm_model is not None:
+            return "LightGBM"
+        return "fallback (none)"
+
+    def check_and_reload(self) -> bool:
+        """
+        Checks the model files' mtimes and reloads if either has changed.
+        Returns True if a reload actually happened.
+        """
+        onnx_changed = _mtime(self._onnx_path) != self._loaded_onnx_mtime
+        lgbm_changed = _mtime(self._lgbm_path) != self._loaded_lgbm_mtime
+
+        if onnx_changed or lgbm_changed:
+            changed_files = []
+            if onnx_changed:
+                changed_files.append(str(self._onnx_path))
+            if lgbm_changed:
+                changed_files.append(str(self._lgbm_path))
+            logger.info(f"ML filter: model file change detected → reloading ({', '.join(changed_files)})")
+            self._onnx_session = None
+            self._lgbm_model = None
+            self._try_load()
+            logger.info(f"ML filter hot reload complete: backend={self.active_backend}")
+            return True
+        return False

    def should_enter(self, features: pd.Series) -> bool:
        """
@@ -36,15 +122,28 @@ class MLFilter:
        if not self.is_model_loaded():
            return True
        try:
-            X = features.to_frame().T
-            proba = self._model.predict_proba(X)[0][1]
-            logger.debug(f"ML filter probability: {proba:.3f} (threshold: {self._threshold})")
+            if self._onnx_session is not None:
+                input_name = self._onnx_session.get_inputs()[0].name
+                X = features[FEATURE_COLS].values.astype(np.float32).reshape(1, -1)
+                proba = float(self._onnx_session.run(None, {input_name: X})[0][0])
+            else:
+                X = features.to_frame().T
+                proba = float(self._lgbm_model.predict_proba(X)[0][1])
+            logger.debug(
+                f"ML filter [{self.active_backend}] probability: {proba:.3f} "
+                f"(threshold: {self._threshold})"
+            )
            return bool(proba >= self._threshold)
        except Exception as e:
            logger.warning(f"ML filter prediction error (allowing entry as fallback): {e}")
            return True

    def reload_model(self):
-        """Hot-reloads the model after retraining."""
+        """Forces a reload from outside (kept for backward compatibility)."""
+        prev_backend = self.active_backend
+        self._onnx_session = None
+        self._lgbm_model = None
        self._try_load()
-        logger.info("ML filter model reload complete")
+        logger.info(
+            f"ML filter forced reload complete: {prev_backend} → {self.active_backend}"
+        )
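The hot-reload logic in `check_and_reload()` comes down to comparing stored mtimes against current ones, with 0.0 standing in for a missing file. Below is a minimal sketch of that detection step only. `FileWatcher` and `mtime_or_zero` are hypothetical names, the temporary files stand in for the real model files, and no model is actually loaded.

```python
import os
import tempfile
from pathlib import Path


def mtime_or_zero(path: Path) -> float:
    """Modification time of the file, or 0.0 when it does not exist."""
    try:
        return path.stat().st_mtime
    except FileNotFoundError:
        return 0.0


class FileWatcher:
    """Hypothetical helper: reports which watched files changed since the last check."""

    def __init__(self, *paths: Path):
        # Record every path up front, including files that do not exist yet.
        self._seen = {p: mtime_or_zero(p) for p in paths}

    def changed(self) -> list[Path]:
        changed = [p for p, seen in self._seen.items() if mtime_or_zero(p) != seen]
        for p in changed:
            self._seen[p] = mtime_or_zero(p)
        return changed


with tempfile.TemporaryDirectory() as tmp:
    onnx_file = Path(tmp) / "model.onnx"  # does not exist yet
    lgbm_file = Path(tmp) / "model.pkl"
    lgbm_file.write_bytes(b"v1")

    watcher = FileWatcher(onnx_file, lgbm_file)
    first = watcher.changed()             # nothing has changed yet
    onnx_file.write_bytes(b"v1")          # a new model file appears
    second = watcher.changed()
    os.utime(lgbm_file, (1, 1))           # force a different mtime on the other file
    third = watcher.changed()
```

Tracking both paths regardless of which backend is active is what lets a newly deployed ONNX file replace a running LightGBM model without a restart.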
--- a/src/mlx_filter.py
+++ b/src/mlx_filter.py
@@ -1,6 +1,7 @@
 """
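 Lightweight neural-network filter built on Apple MLX.
 Automatically uses the M4's integrated GPU.
+After training, the model is exported to ONNX so a Linux server can run inference with onnxruntime.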
 """
 import numpy as np
 import pandas as pd
@@ -12,6 +13,83 @@ from pathlib import Path
 from src.ml_features import FEATURE_COLS


+def _export_onnx(
+    weights_npz: Path,
+    meta_npz: Path,
+    onnx_path: Path,
+) -> None:
+    """
+    Reads MLX weights (.npz) and converts them into an ONNX graph.
+    Network structure: fc1(ReLU) → dropout (disabled at inference) → fc2(ReLU) → fc3 → sigmoid
+    """
+    import onnx
+    from onnx import helper, TensorProto, numpy_helper
+
+    meta = np.load(meta_npz)
+    mean: np.ndarray = meta["mean"].astype(np.float32)
+    std: np.ndarray  = meta["std"].astype(np.float32)
+    input_dim  = int(meta["input_dim"])
+    hidden_dim = int(meta["hidden_dim"])
+
+    w = np.load(weights_npz)
+    # MLX save_weights key pattern: fc1.weight, fc1.bias, ...
+    fc1_w = w["fc1.weight"].astype(np.float32)   # (hidden, input)
+    fc1_b = w["fc1.bias"].astype(np.float32)
+    fc2_w = w["fc2.weight"].astype(np.float32)   # (hidden//2, hidden)
+    fc2_b = w["fc2.bias"].astype(np.float32)
+    fc3_w = w["fc3.weight"].astype(np.float32)   # (1, hidden//2)
+    fc3_b = w["fc3.bias"].astype(np.float32)
+
+    def _t(name: str, arr: np.ndarray) -> onnx.TensorProto:
+        return numpy_helper.from_array(arr, name=name)
+
+    initializers = [
+        _t("mean",  mean),
+        _t("std",   std),
+        _t("fc1_w", fc1_w),
+        _t("fc1_b", fc1_b),
+        _t("fc2_w", fc2_w),
+        _t("fc2_b", fc2_b),
+        _t("fc3_w", fc3_w),
+        _t("fc3_b", fc3_b),
+    ]
+
+    nodes = [
+        # Normalization: (x - mean) / std
+        helper.make_node("Sub",     ["X", "mean"],      ["x_sub"]),
+        helper.make_node("Div",     ["x_sub", "std"],   ["x_norm"]),
+        # fc1: x_norm @ fc1_w.T + fc1_b
+        helper.make_node("Gemm",    ["x_norm", "fc1_w", "fc1_b"], ["fc1_out"],
+                         transB=1),
+        helper.make_node("Relu",    ["fc1_out"],         ["relu1"]),
+        # fc2: relu1 @ fc2_w.T + fc2_b
+        helper.make_node("Gemm",    ["relu1",  "fc2_w", "fc2_b"], ["fc2_out"],
+                         transB=1),
+        helper.make_node("Relu",    ["fc2_out"],         ["relu2"]),
+        # fc3: relu2 @ fc3_w.T + fc3_b  → (N, 1)
+        helper.make_node("Gemm",    ["relu2",  "fc3_w", "fc3_b"], ["logits"],
+                         transB=1),
+        # sigmoid → (N, 1)
+        helper.make_node("Sigmoid", ["logits"],          ["proba_2d"]),
+        # Flatten(axis=0): (N, 1) → (1, N); ONNX Flatten always returns a 2-D tensor, not (N,)
+        helper.make_node("Flatten", ["proba_2d"],        ["proba"], axis=0),
+    ]
+
+    graph = helper.make_graph(
+        nodes,
+        "mlx_filter",
+        inputs=[helper.make_tensor_value_info("X", TensorProto.FLOAT, [None, input_dim])],
+        outputs=[helper.make_tensor_value_info("proba", TensorProto.FLOAT, [None])],
+        initializer=initializers,
+    )
+    model_proto = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
+    model_proto.ir_version = 8
+    onnx.checker.check_model(model_proto)
+    onnx_path.parent.mkdir(exist_ok=True)
+    onnx.save(model_proto, str(onnx_path))
+    print(f"  ONNX export complete: {onnx_path}")
+
+
 class _Net(nn.Module):
    """3-layer MLP binary classifier."""

@@ -53,19 +131,36 @@ class MLXFilter:
        self._std: np.ndarray | None = None
        self._trained = False

-    def fit(self, X: pd.DataFrame, y: pd.Series) -> "MLXFilter":
+    def fit(
+        self,
+        X: pd.DataFrame,
+        y: pd.Series,
+        sample_weight: np.ndarray | None = None,
+    ) -> "MLXFilter":
        X_np = X[FEATURE_COLS].values.astype(np.float32)
        y_np = y.values.astype(np.float32)

-        self._mean = X_np.mean(axis=0)
-        self._std = X_np.std(axis=0) + 1e-8
+        # NaN-safe normalization: compute statistics with nanmean/nanstd, then replace NaN with 0.0
+        # (0.0 after z-scoring equals the mean, the most neutral imputation value for a neural network)
+        self._mean = np.nanmean(X_np, axis=0)
+        self._std  = np.nanstd(X_np, axis=0) + 1e-8
        X_np = (X_np - self._mean) / self._std
+        X_np = np.nan_to_num(X_np, nan=0.0)
+
+        w_np = sample_weight.astype(np.float32) if sample_weight is not None else None

        optimizer = optim.Adam(learning_rate=self.lr)

-        def loss_fn(model: _Net, x: mx.array, y: mx.array) -> mx.array:
+        def loss_fn(
+            model: _Net, x: mx.array, y: mx.array, w: mx.array | None
+        ) -> mx.array:
            logits = model(x)
-            return nn.losses.binary_cross_entropy(logits, y, with_logits=True)
+            per_sample = nn.losses.binary_cross_entropy(
+                logits, y, with_logits=True, reduction="none"
+            )
+            if w is not None:
+                return (per_sample * w).sum() / w.sum()
+            return per_sample.mean()

        loss_and_grad = nn.value_and_grad(self._model, loss_fn)

@@ -78,7 +173,8 @@ class MLXFilter:
                batch_idx = idx[start : start + self.batch_size]
                x_batch = mx.array(X_np[batch_idx])
                y_batch = mx.array(y_np[batch_idx])
-                loss, grads = loss_and_grad(self._model, x_batch, y_batch)
+                w_batch = mx.array(w_np[batch_idx]) if w_np is not None else None
+                loss, grads = loss_and_grad(self._model, x_batch, y_batch, w_batch)
                optimizer.update(self._model, grads)
                mx.eval(self._model.parameters(), optimizer.state)
                epoch_loss += loss.item()
@@ -93,6 +189,7 @@ class MLXFilter:
        X_np = X[FEATURE_COLS].values.astype(np.float32)
        if self._trained and self._mean is not None:
            X_np = (X_np - self._mean) / self._std
+            X_np = np.nan_to_num(X_np, nan=0.0)
        x = mx.array(X_np)
        self._model.eval()
        logits = self._model(x)
@@ -114,6 +211,12 @@ class MLXFilter:
            input_dim=np.array(self.input_dim),
            hidden_dim=np.array(self.hidden_dim),
        )
+        # ONNX export: convert so the Linux server can run inference with onnxruntime
+        try:
+            onnx_path = path.with_suffix(".onnx")
+            _export_onnx(weights_path, meta_path, onnx_path)
+        except ImportError:
+            print("  [warning] onnx package not installed → skipping ONNX export (pip install onnx)")

    @classmethod
    def load(cls, path: str | Path) -> "MLXFilter":
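The two numerical changes to `MLXFilter.fit` in this file, NaN-safe z-scoring and the weighted mean of per-sample losses, can be illustrated in plain NumPy. This is a sketch rather than the MLX code: `nan_safe_zscore` and `weighted_bce_with_logits` are hypothetical names, and the BCE formula is the standard numerically stable form, which is assumed (not confirmed by the source) to match what `nn.losses.binary_cross_entropy(..., with_logits=True)` computes.

```python
import numpy as np


def nan_safe_zscore(X: np.ndarray) -> np.ndarray:
    """Z-score each column while ignoring NaN, then impute NaN with 0.0 (the column mean)."""
    mean = np.nanmean(X, axis=0)
    std = np.nanstd(X, axis=0) + 1e-8  # epsilon avoids division by zero
    return np.nan_to_num((X - mean) / std, nan=0.0)


def weighted_bce_with_logits(logits, y, w=None) -> float:
    """Binary cross-entropy on logits, averaged with optional per-sample weights."""
    # Stable form: max(z, 0) - z*y + log(1 + exp(-|z|))
    per_sample = np.maximum(logits, 0) - logits * y + np.log1p(np.exp(-np.abs(logits)))
    if w is None:
        return float(per_sample.mean())
    # Weighted mean, same reduction as (per_sample * w).sum() / w.sum() in the diff
    return float((per_sample * w).sum() / w.sum())


X = np.array([[1.0, np.nan], [3.0, 2.0], [5.0, 4.0]])
Z = nan_safe_zscore(X)

logits = np.array([2.0, -1.0, 0.5])
y = np.array([1.0, 0.0, 1.0])
plain = weighted_bce_with_logits(logits, y)
uniform = weighted_bce_with_logits(logits, y, np.ones(3))
recent = weighted_bce_with_logits(logits, y, np.array([0.5, 1.0, 1.5]))
print(Z, plain, uniform, recent)
```

With uniform weights the weighted reduction collapses to the plain mean, so passing `sample_weight=None` and passing all-ones weights give the same loss.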
--- a/src/risk_manager.py
+++ b/src/risk_manager.py
@@ -34,3 +34,14 @@ class RiskManager:
        """Reset every day at midnight"""
        self.daily_pnl = 0.0
        logger.info("Daily PnL reset")
+
+    def set_base_balance(self, balance: float) -> None:
+        """Set the base balance at bot startup (reference point for the dynamic ratio)"""
+        self.initial_balance = balance
+
+    def get_dynamic_margin_ratio(self, balance: float) -> float:
+        """Return a margin ratio that decreases linearly as the balance grows"""
+        ratio = self.config.margin_max_ratio - (
+            (balance - self.initial_balance) * self.config.margin_decay_rate
+        )
+        return max(self.config.margin_min_ratio, min(self.config.margin_max_ratio, ratio))
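The clamped linear rule in `get_dynamic_margin_ratio` is easy to tabulate. The following is a minimal sketch using a hypothetical free function `dynamic_margin_ratio`; the default parameter values (0.50, 0.20, 0.0006) and the base balance of 22 are taken from the tests in this diff, not from the production configuration.

```python
def dynamic_margin_ratio(
    balance: float,
    base_balance: float,
    max_ratio: float = 0.50,
    min_ratio: float = 0.20,
    decay_rate: float = 0.0006,
) -> float:
    """Hypothetical standalone version of the clamped linear margin rule."""
    # Start at max_ratio and subtract decay_rate for every unit of balance above the base.
    ratio = max_ratio - (balance - base_balance) * decay_rate
    # Clamp into [min_ratio, max_ratio].
    return max(min_ratio, min(max_ratio, ratio))


for balance in (5.0, 22.0, 100.0, 300.0, 10_000.0):
    print(f"balance={balance:>8.1f}  ratio={dynamic_margin_ratio(balance, base_balance=22.0):.4f}")
```

With these values the ratio falls by 0.0006 per USDT of profit, so it travels from 50% to the 20% floor over (0.50 − 0.20) / 0.0006 = 500 USDT above the base balance; below the base it stays pinned at 50%.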
--- a/tests/test_config.py
+++ b/tests/test_config.py
@@ -6,16 +6,16 @@ from src.config import Config
 def test_config_loads_symbol():
    os.environ["SYMBOL"] = "XRPUSDT"
    os.environ["LEVERAGE"] = "10"
-    os.environ["RISK_PER_TRADE"] = "0.02"
    cfg = Config()
    assert cfg.symbol == "XRPUSDT"
    assert cfg.leverage == 10
-    assert cfg.risk_per_trade == 0.02


-def test_config_notion_keys():
-    os.environ["NOTION_TOKEN"] = "secret_test"
-    os.environ["NOTION_DATABASE_ID"] = "db_test_id"
+def test_config_dynamic_margin_params():
+    os.environ["MARGIN_MAX_RATIO"] = "0.50"
+    os.environ["MARGIN_MIN_RATIO"] = "0.20"
+    os.environ["MARGIN_DECAY_RATE"] = "0.0006"
    cfg = Config()
-    assert cfg.notion_token == "secret_test"
-    assert cfg.notion_database_id == "db_test_id"
+    assert cfg.margin_max_ratio == 0.50
+    assert cfg.margin_min_ratio == 0.20
+    assert cfg.margin_decay_rate == 0.0006
--- a/tests/test_dataset_builder.py
+++ b/tests/test_dataset_builder.py
@@ -91,3 +91,72 @@ def test_matches_original_generate_dataset(sample_df):
    assert 0.5 <= ratio <= 2.0, (
        f"Sample counts differ too much: vectorized={len(vec)}, original={len(orig)}, ratio={ratio:.2f}"
    )
+
+
+def test_epsilon_no_division_by_zero():
+    """No nan/inf may occur at the extremes bb_range=0, close=0, vol_ma20=0."""
+    import numpy as np
+    import pandas as pd
+    from src.dataset_builder import _calc_features_vectorized, _calc_signals, _calc_indicators
+
+    n = 100
+    # Make every close identical → forces bb_range=0
+    df = pd.DataFrame({
+        "open":   np.ones(n),
+        "high":   np.ones(n),
+        "low":    np.ones(n),
+        "close":  np.ones(n),
+        "volume": np.ones(n),
+    })
+    d = _calc_indicators(df)
+    sig = _calc_signals(d)
+    feat = _calc_features_vectorized(d, sig)
+
+    numeric_cols = feat.select_dtypes(include=[np.number]).columns
+    assert not feat[numeric_cols].isin([np.inf, -np.inf]).any().any(), \
+        "inf values must not be present"
+
+
+def test_oi_nan_masking_no_column():
+    """When the oi_change column is missing, the whole feature must be nan."""
+    import numpy as np
+    import pandas as pd
+    from src.dataset_builder import _calc_features_vectorized, _calc_signals, _calc_indicators
+
+    n = 100
+    np.random.seed(0)
+    df = pd.DataFrame({
+        "open":   np.random.uniform(1, 2, n),
+        "high":   np.random.uniform(2, 3, n),
+        "low":    np.random.uniform(0.5, 1, n),
+        "close":  np.random.uniform(1, 2, n),
+        "volume": np.random.uniform(1000, 5000, n),
+    })
+    d = _calc_indicators(df)
+    sig = _calc_signals(d)
+    feat = _calc_features_vectorized(d, sig)
+
+    assert feat["oi_change"].isna().all(), "oi_change must be all nan when the column is missing"
+
+
+def test_oi_nan_masking_with_zeros():
+    """Even when the oi_change column exists, 0.0 stretches must be masked as nan."""
+    import numpy as np
+    import pandas as pd
+    from src.dataset_builder import _calc_features_vectorized, _calc_signals, _calc_indicators
+
+    n = 100
+    np.random.seed(0)
+    df = pd.DataFrame({
+        "open":      np.random.uniform(1, 2, n),
+        "high":      np.random.uniform(2, 3, n),
+        "low":       np.random.uniform(0.5, 1, n),
+        "close":     np.random.uniform(1, 2, n),
+        "volume":    np.random.uniform(1000, 5000, n),
+        "oi_change": np.concatenate([np.zeros(50), np.random.uniform(-0.1, 0.1, 50)]),
+    })
+    d = _calc_indicators(df)
+    sig = _calc_signals(d)
+    feat = _calc_features_vectorized(d, sig)
+
+    assert feat["oi_change"].iloc[50:].notna().any(), "The stretch with real OI values must contain finite values"
--- a/tests/test_exchange.py
+++ b/tests/test_exchange.py
@@ -12,11 +12,19 @@ def config():
        "BINANCE_API_SECRET": "test_secret",
        "SYMBOL": "XRPUSDT",
        "LEVERAGE": "10",
-        "RISK_PER_TRADE": "0.02",
    })
    return Config()


+@pytest.fixture
+def client():
+    config = Config()
+    config.leverage = 10
+    c = BinanceFuturesClient.__new__(BinanceFuturesClient)
+    c.config = config
+    return c
+
+
 @pytest.mark.asyncio
 async def test_set_leverage(config):
    with patch("src.exchange.Client") as MockClient:
@@ -28,11 +36,21 @@ async def test_set_leverage(config):
        assert result is not None


-def test_calculate_quantity(config):
-    with patch("src.exchange.Client") as MockClient:
-        MockClient.return_value = MagicMock()
-        client = BinanceFuturesClient(config)
-        # Balance 1000 USDT, risk 2%, leverage 10, price 0.5
-        qty = client.calculate_quantity(balance=1000.0, price=0.5, leverage=10)
-        # 1000 * 0.02 * 10 / 0.5 = 400
-        assert qty == pytest.approx(400.0, rel=0.01)
+def test_calculate_quantity_basic(client):
+    """Balance 22, ratio 50%, leverage 10x → notional 110; XRP price 2.5 → quantity 44.0"""
+    qty = client.calculate_quantity(balance=22.0, price=2.5, leverage=10, margin_ratio=0.50)
+    # notional = 22 * 0.5 * 10 = 110, quantity = 110 / 2.5 = 44.0
+    assert qty == pytest.approx(44.0, abs=0.1)
+
+
+def test_calculate_quantity_min_notional(client):
+    """A notional below the minimum (5 USDT) is raised to the minimum"""
+    qty = client.calculate_quantity(balance=1.0, price=2.5, leverage=1, margin_ratio=0.01)
+    # notional = 1 * 0.01 * 1 = 0.01 < 5 → minimum 5 USDT
+    assert qty * 2.5 >= 5.0
+
+
+def test_calculate_quantity_zero_balance(client):
+    """A zero balance returns the quantity for the minimum notional"""
+    qty = client.calculate_quantity(balance=0.0, price=2.5, leverage=10, margin_ratio=0.50)
+    assert qty > 0
--- a/tests/test_mlx_filter.py
+++ b/tests/test_mlx_filter.py
@@ -65,6 +65,31 @@ def test_mlx_filter_fit_and_predict():
    assert np.all((proba >= 0.0) & (proba <= 1.0))


+def test_fit_with_nan_features():
+    """Training must complete normally when the oi_change feature contains nan."""
+    import numpy as np
+    import pandas as pd
+    from src.mlx_filter import MLXFilter
+    from src.ml_features import FEATURE_COLS
+
+    n = 300
+    np.random.seed(42)
+    X = pd.DataFrame(
+        np.random.randn(n, len(FEATURE_COLS)).astype(np.float32),
+        columns=FEATURE_COLS,
+    )
+    # Set the first half of oi_change to nan
+    X["oi_change"] = np.where(np.arange(n) < n // 2, np.nan, X["oi_change"])
+    y = pd.Series((np.random.rand(n) > 0.5).astype(np.float32))
+
+    model = MLXFilter(input_dim=len(FEATURE_COLS), hidden_dim=32, epochs=3)
+    model.fit(X, y)  # must finish without raising even with nan present
+
+    proba = model.predict_proba(X)
+    assert not np.any(np.isnan(proba)), "Predicted probabilities must not contain nan"
+    assert proba.min() >= 0.0 and proba.max() <= 1.0
+
+
 def test_mlx_filter_save_load(tmp_path):
    """A model loaded after saving must return identical predictions."""
    from src.mlx_filter import MLXFilter
--- a/tests/test_risk_manager.py
+++ b/tests/test_risk_manager.py
@@ -11,7 +11,6 @@ def config():
        "BINANCE_API_SECRET": "s",
        "SYMBOL": "XRPUSDT",
        "LEVERAGE": "10",
-        "RISK_PER_TRADE": "0.02",
    })
    return Config()

@@ -34,3 +33,51 @@ def test_position_size_capped(config):
    rm = RiskManager(config, max_daily_loss_pct=0.05)
    rm.open_positions = ["pos1", "pos2", "pos3"]
    assert rm.can_open_new_position() is False
+
+
+# --- Dynamic margin ratio tests ---
+
+@pytest.fixture
+def dynamic_config():
+    c = Config()
+    c.margin_max_ratio = 0.50
+    c.margin_min_ratio = 0.20
+    c.margin_decay_rate = 0.0006
+    return c
+
+
+@pytest.fixture
+def risk(dynamic_config):
+    r = RiskManager(dynamic_config)
+    r.set_base_balance(22.0)
+    return r
+
+
+def test_set_base_balance(risk):
+    assert risk.initial_balance == 22.0
+
+
+def test_ratio_at_base_balance(risk):
+    """Returns the maximum ratio (50%) at the base balance"""
+    ratio = risk.get_dynamic_margin_ratio(22.0)
+    assert ratio == pytest.approx(0.50, abs=1e-6)
+
+
+def test_ratio_decreases_as_balance_grows(risk):
+    """The ratio decreases as the balance grows"""
+    ratio_100 = risk.get_dynamic_margin_ratio(100.0)
+    ratio_300 = risk.get_dynamic_margin_ratio(300.0)
+    assert ratio_100 < 0.50
+    assert ratio_300 < ratio_100
+
+
+def test_ratio_clamped_at_min(risk):
+    """Never drops below the minimum ratio (20%), however large the balance"""
+    ratio = risk.get_dynamic_margin_ratio(10000.0)
+    assert ratio == pytest.approx(0.20, abs=1e-6)
+
+
+def test_ratio_clamped_at_max(risk):
+    """Never exceeds the maximum ratio (50%), even when the balance is below the base"""
+    ratio = risk.get_dynamic_margin_ratio(5.0)
+    assert ratio == pytest.approx(0.50, abs=1e-6)
Author	SHA1	Message	Date
21in7	0f6a22fcb5	feat: switch MLX threshold search to precision-first (subject to recall>=0.15) Made-with: Cursor	2026-03-01 23:54:38 +09:00
21in7	aa413f4d7c	feat: switch LightGBM threshold search to precision-first (subject to recall>=0.15) Made-with: Cursor	2026-03-01 23:54:13 +09:00
21in7	6ae0f9d81b	fix: apply NaN-safe normalization to MLXFilter fit/predict (nanmean + nan_to_num) Made-with: Cursor	2026-03-01 23:53:49 +09:00
21in7	820d8e0213	refactor: unify denominator arithmetic on the 1e-8 epsilon pattern Made-with: Cursor	2026-03-01 23:52:59 +09:00
21in7	417b8e3c6a	feat: mask missing OI/funding-rate stretches with np.nan (0.0 → nan) Made-with: Cursor	2026-03-01 23:52:19 +09:00
21in7	3b7ee3e890	chore: add .worktrees/ to gitignore Made-with: Cursor	2026-03-01 23:50:18 +09:00
21in7	24d3ba9411	feat: enhance data fetching and model training with OI and funding rate integration - Updated `fetch_history.py` to collect open interest (OI) and funding rate data from Binance, improving the dataset for model training. - Modified `train_and_deploy.sh` to include options for OI and funding rate collection during data fetching. - Enhanced `dataset_builder.py` to incorporate OI change and funding rate features with rolling z-score normalization. - Updated training logs to reflect new metrics and features, ensuring comprehensive tracking of model performance. - Adjusted feature columns in `ml_features.py` to include OI and funding rate for improved model robustness.	2026-03-01 22:25:38 +09:00
21in7	4245d7cdbf	feat: implement 15-minute timeframe upgrade for model training and data processing - Introduced a new markdown document detailing the plan to transition the entire pipeline from a 1-minute to a 15-minute timeframe, aiming to improve model AUC from 0.49-0.50 to over 0.53. - Updated key parameters across multiple scripts, including `LOOKAHEAD` adjustments and default data paths to reflect the new 15-minute interval. - Modified data fetching and training scripts to ensure compatibility with the new timeframe, including changes in `fetch_history.py`, `train_model.py`, and `train_and_deploy.sh`. - Enhanced the bot's data stream configuration to operate on a 15-minute interval, ensuring real-time data processing aligns with the new model training strategy. - Updated training logs to capture new model performance metrics under the revised timeframe.	2026-03-01 22:16:15 +09:00
21in7	a6697e7cca	feat: implement LightGBM model improvement plan with feature normalization and walk-forward validation - Added a new markdown document outlining the plan to enhance the LightGBM model's AUC from 0.54 to 0.57+ through feature normalization, strong time weighting, and walk-forward validation. - Implemented rolling z-score normalization for absolute value features in `src/dataset_builder.py` to improve model robustness against regime changes. - Introduced a walk-forward validation function in `scripts/train_model.py` to accurately measure future prediction performance. - Updated training log to include new model performance metrics and added ONNX model export functionality for compatibility. - Adjusted model training parameters for better performance and included detailed validation results in the training log.	2026-03-01 22:02:32 +09:00
21in7	c6428af64e	feat: enhance Jenkins pipeline with Discord notifications and model hot-reload functionality - Added a new stage to the Jenkins pipeline to notify Discord when a build starts, succeeds, or fails, improving communication during the CI/CD process. - Implemented model hot-reload functionality in the MLFilter class, allowing automatic reloading of models when file changes are detected, enhancing responsiveness to updates. - Updated deployment scripts to provide clearer messaging regarding model loading and container status, improving user experience and debugging capabilities.	2026-03-01 21:46:36 +09:00
21in7	d9238afaf9	feat: enhance MLX model training with combined data handling - Introduced a new function `_split_combined` to separate XRP, BTC, and ETH data from a combined DataFrame. - Updated `train_mlx` to utilize the new function, improving data management and feature handling. - Adjusted dataset generation to accommodate BTC and ETH features, with warnings for missing features. - Changed default data path in `train_mlx` and `train_model` to point to the combined dataset for consistency. - Increased `LOOKAHEAD` from 60 to 90 and adjusted `ATR_TP_MULT` for better model performance.	2026-03-01 21:43:27 +09:00
21in7	db144750a3	feat: enhance model training and deployment scripts with time-weighted sampling - Updated `train_model.py` and `train_mlx_model.py` to include a time weight decay parameter for improved sample weighting during training. - Modified dataset generation to incorporate sample weights based on time decay, enhancing model performance. - Adjusted deployment scripts to support new backend options and improved error handling for model file transfers. - Added new entries to the training log for better tracking of model performance metrics over time. - Included ONNX model export functionality in the MLX filter for compatibility with Linux servers.	2026-03-01 21:25:06 +09:00
21in7	301457ce57	chore: remove unused risk_per_trade references Made-with: Cursor	2026-03-01 20:39:26 +09:00
21in7	ab580b18af	feat: apply dynamic margin ratio in bot position sizing Made-with: Cursor	2026-03-01 20:39:07 +09:00
21in7	795689ac49	feat: replace risk_per_trade with margin_ratio in calculate_quantity Made-with: Cursor	2026-03-01 20:38:18 +09:00
21in7	fe9690698a	feat: add get_dynamic_margin_ratio to RiskManager Made-with: Cursor	2026-03-01 20:37:46 +09:00
21in7	95abac53a8	feat: add dynamic margin ratio config params Made-with: Cursor	2026-03-01 20:37:04 +09:00