Deploy Hermes Agent LXC (#118) on gihyeon + IaC hygiene #1

Merged
gihyeon merged 11 commits from hermes-agent-lxc into main 2026-06-19 11:04:17 +09:00

Summary

Adds and deploys the Hermes Agent (Nous Research) as unprivileged LXC #118 on the gihyeon node, managed by Terraform. Also brings main (stuck at the initial commit) up to current, including PBS refinements and IaC hygiene fixes discovered during the deploy.

Status: deployed & verified — container created via Terraform, Docker stack up, Discord connector responding end-to-end (Discord → Hermes → litellm/GLM-5.2).

What's included

  • hermes.tf, hermes-variables.tf — LXC #118 (2C/4GB/512swap, 24G local-lvm, DHCP@intra01, onboot, features { nesting }), Debian 12 template download.
  • scripts/hermes-bootstrap.sh — in-container bootstrap: Docker + compose, docker-compose.yml (image nousresearch/hermes-agent, /data+/fast workspace mounts), .env template.
  • outputs.tf updates, deploy docs under docs/.
  • Storage: /data ← /mnt/pve/hdd/nfs_shared/hermes (14TB HDD), /fast ← /media/2tb/hermes (2TB SSD).

Implementation notes / gotchas resolved

  • API token can only set nesting: for keyctl, Proxmox returns HTTP 403 ("changing feature flags (except nesting) is only allowed for root@pam"). keyctl and the bind mounts (mp0/mp1) are therefore applied out-of-band on the node console as root@pam.
  • download_file uses overwrite_unmanaged = true to adopt the Debian template already present on gihyeon's local datastore (avoids "refusing to override existing file").
  • Container has lifecycle { ignore_changes = [features, mount_point] } so routine plans don't try to strip the console-applied keyctl/mounts.
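
The out-of-band console step could look roughly like the sketch below. It assumes the stock Proxmox `pct set` CLI, run as root on the gihyeon node, with the host paths from the Storage entry. The `run` wrapper, the `DRY_RUN` variable and the function name are illustrative and not part of this PR. The script defaults to dry-run, so it only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical helper (not in this PR): prints or runs the root@pam-only
# settings for CT 118. Dry-run is the default so nothing changes by accident.
CTID=118

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_console_settings() {
    # The features string is set as a whole, so nesting is repeated here
    # alongside keyctl (which the API token cannot set: HTTP 403).
    run pct set "$CTID" -features nesting=1,keyctl=1
    # Bind mounts: host path first, then mp=<path inside the container>.
    run pct set "$CTID" -mp0 /mnt/pve/hdd/nfs_shared/hermes,mp=/data
    run pct set "$CTID" -mp1 /media/2tb/hermes,mp=/fast
}

apply_console_settings
```

Because the container resource ignores `features` and `mount_point`, a later plan should not report these console-applied settings as drift.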

Security

  • terraform.tfvars (holds the Proxmox API token) is removed from tracking and a .gitignore added (*.tfvars, keeping *.tfvars.example).
  • ⚠️ The token is still in git history on this remote — rotate the Proxmox API token to fully remediate.
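
As a rough sketch of the untracking step (not the exact commands used in this PR), the following writes the ignore rules and stops tracking the file while leaving it on disk. The function names, the `run` wrapper, the `DRY_RUN` variable and the commit message are made up for illustration. The negation line is not strictly required, since `*.tfvars` does not match `terraform.tfvars.example`, but it makes the intent explicit.

```shell
#!/bin/sh
# Illustrative only: git commands are printed unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

write_gitignore() {
    # $1: repository root. Appends so an existing .gitignore is preserved.
    cat >> "$1/.gitignore" <<'EOF'
*.tfvars
!*.tfvars.example
EOF
}

untrack_tfvars() {
    # --cached removes the file from the index but keeps the local copy.
    run git rm --cached terraform.tfvars
    run git add .gitignore
    run git commit -m "Stop tracking terraform.tfvars"
}

# Typical order inside the repo: write_gitignore . && untrack_tfvars
untrack_tfvars
```

None of this rewrites history, so the committed token remains readable in earlier commits until it is rotated in Proxmox.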

Follow-ups (not in this PR)

  • Do not run an untargeted terraform apply yet: pbs.tf declares a 16G disk while the live container has 48G, so a blanket apply would try to shrink it. Use -target or fix pbs.tf first.
  • Reconcile the console-applied keyctl/bind-mounts into IaC if/when desired.
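
Until the PBS drift is fixed, a targeted run might look like this sketch. The resource address `proxmox_virtual_environment_container.hermes` is an assumption inferred from the "container.hermes" shorthand in the commit messages; check it against `terraform state list` before relying on it. The `run` wrapper and `DRY_RUN` variable are illustrative, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Illustrative only: limits plan/apply to the Hermes container so the
# drifted PBS container is left untouched.
TARGET="proxmox_virtual_environment_container.hermes"  # assumed address

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

targeted_apply() {
    # Review the plan first; -target also pulls in the target's dependencies.
    run terraform plan -target="$TARGET"
    run terraform apply -target="$TARGET"
}

targeted_apply
```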

🤖 Generated with Claude Code

gihyeon added 11 commits 2026-06-19 11:01:33 +09:00
Design for deploying Nous Research Hermes Agent as an unprivileged Docker
LXC (#118) on node1, using litellm (10.1.10.22:4000) as the OpenAI-compatible
LLM gateway. Messaging-connector use (outbound-only, no inbound ports).
Large workspace via direct host bind mounts (hdd /data + 2tb /fast),
aligned with the Plan A same-host bind-mount decision.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Plan: 10 tasks splitting workstation Terraform (token-safe container skeleton)
from PVE-console host ops (features nesting/keyctl + bind mounts via pct set,
which the API token cannot do) and in-container Docker/hermes bootstrap.

Spec amended for the discovered API-token limitation: bind mounts AND container
features require root@pam/SSH, so both are applied via console pct set rather
than Terraform; terraform import tracked as follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Defines Hermes Agent LXC (VMID 118) on node gihyeon with 2 cores,
4 GB RAM, 24 GB disk, DHCP on intra01. Token-safe: nesting/keyctl
features and bind mounts are intentionally omitted and must be
applied via pct set after initial deploy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds scripts/hermes-bootstrap.sh which installs rootful Docker,
writes docker-compose.yml (nousresearch/hermes-agent) with bind mounts
for /data and /fast, and writes a .env template pointing at the
litellm gateway (#117, 10.1.10.22:4000). Run once inside LXC #118
console after pct set has applied bind mounts and features.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds hermes.tf / hermes-variables.tf / scripts/hermes-bootstrap.sh
rows to the structure table, and appends a Hermes Agent section with
the 4-step deploy sequence (host prep → terraform apply → pct set
bind mounts → in-container bootstrap). Notes that mp0/mp1 are outside
TF state and need a future terraform import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
terraform plan revealed proxmox_virtual_environment_container.pbs has disk
drift (live 48G vs code 16G). A blanket apply would shrink it, so the hermes
apply must be -targeted. Recorded in the plan.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Debian-12 (systemd 252) unprivileged create emits a "you may need to
enable nesting" warning, which Proxmox returns as TASK WARNINGS:1 and bpg
treats as a failed apply. nesting/keyctl on an unprivileged CT need only
VM.Allocate (which the API token has) — not root@pam — so set them in TF.
Only bind mounts genuinely require root@pam/console.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Correct README/plan/spec after the apply-failure root cause: nesting/keyctl
are settable by the API token on an unprivileged CT and are required at create
to avoid the systemd-252 TASK WARNINGS that fails apply. Console step reduced
to bind mounts only. README apply uses -target (PBS disk drift).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- download_file.debian12_template_gihyeon: overwrite_unmanaged=true to adopt the
  Debian template already present on gihyeon's local datastore (avoids 'refusing
  to override existing file')
- container.hermes: drop keyctl from features — API token gets HTTP 403
  ('changing feature flags (except nesting) is only allowed for root@pam'); keep
  nesting only so token-based create succeeds
- container.hermes: lifecycle ignore_changes=[features, mount_point] so the
  console-applied keyctl + bind mounts (mp0=/data, mp1=/fast; root@pam-only) do
  not show as drift on routine plans

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
terraform.tfvars holds the Proxmox root@pam API token and was committed since the
initial commit. Remove it from tracking and ignore *.tfvars (keeping
*.tfvars.example). NOTE: the token is still in git history on origin
(git.gihyeon.com) — rotate it in Proxmox to fully remediate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
gihyeon merged commit 71e056133e into main 2026-06-19 11:04:17 +09:00
Reference: gihyeon/proxmox-iac#1