
V3 Build Progress

Corvus v3 Clean-Slate Build — Progress Tracker

SINGLE SOURCE OF TRUTH for v3 build state. Every session that touches v3 reads this first and updates it as work completes.

Spec doc: nsfw/pipelines/v3-clean-slate/README.md

Why v3: v1 LoRAs were trained on synthetic Sana outputs (model collapse). The v2 spec was an incomplete patch. v3 = no LoRA training, PuLID identity injection from canonical references. Daniel approved 2026-05-01 with mandate: "I can't believe that is AI" quality across image/voice/video, full body + full nudity + explicit + BDSM tolerance. Cost ceiling ~$112/mo Corvus.

Quality bar: photoreal, identity-coherent, no melted hands, no broken anatomy, full explicit + BDSM/fetish for Voss PPV upsells.

AU compliance constraint: CivitAI banned AU. ALL model downloads happen on RunPod pod (US IP). Daniel never visits CivitAI front-end. No new front-end signups required from Daniel.


Day 1 — 2026-05-01 (in progress)

Day 2 — RunPod pod build (✅ COMPLETE 2026-05-01 evening)

Pod: 7bbbjincs9xi9g (NVIDIA RTX A5000, 24GB VRAM)
SSH: ssh -i ~/.ssh/id_ed25519 -p 22073 [email protected]
State file: nsfw/pipelines/v3-clean-slate/day2-pod-state.json
Total cost: ~$0.50 build (16GB models on a fresh A5000)
Spent during this session: ~$0.30 (rest depends on idle decision below)

Defects found + fixed during build (see commit 309719f):
- PuLID file is .safetensors not .bin
- IPAdapter at sdxl_models/ip-adapter-plus_sdxl_vit-h.safetensors not models/...
- NSFW LoRA mirror dead (commented out, Day-3 task to find live mirror)
- Custom node swap: cubiq/PuLID_ComfyUI (SDXL) instead of sipie800 FLUX-only node
- Sanity-check markers updated for ComfyUI 2026

Final pod state:
- ComfyUI 0.20.1, frontend 1.42.15, comfy-kitchen 0.2.8
- 6 custom nodes loaded clean (PuLID_ComfyUI, ControlNet aux, Impact Pack, essentials, mixlab, Manager)
- 16GB models downloaded: JuggernautXL, PuLID v1.1, IPAdapter+CLIP-vision, ControlNet OpenPose, DWPose, HandRefiner, UltraSharp, ADetailer YOLOs (face/hand/person)
- Sanity check PASSED (Total VRAM detected, no Tracebacks)

Already authored (ready-to-run when balance restored):
- [x] nsfw/pipelines/v3-clean-slate/build-pipeline.py — custom nodes + HuggingFace model downloads (no CivitAI, AU compliance)
- [x] nsfw/pipelines/v3-clean-slate/workflow-v3-canonical-refs.json — JuggernautXL + ADetailer face/hand + 4xUltraSharp upscale
- [x] nsfw/pipelines/v3-clean-slate/run-canonical-batch.py — Day-3 batch runner with rotation logic from canonical-prompts.md
- [x] nsfw/pipelines/v3-clean-slate/canonical-prompts.md — full identity anchors + rotation lists for Mia + Voss face/body refs

Cleanup done same day:
- [x] Terminated 4 EXITED RunPod pods (overnight-gen + 3× emily-sext) — recovered 110GB volume drip
- [x] Disabled 18 non-critical launchd agents (BlackPan-on-hold + Emily personal + ADHD-PA + Voss-future + grant-monitor + nightly-digest + monday-form-notify + receipts + review-server + health-monitor)
- [x] Loaded com.agency.corvus-compliance-monitor plist (daily 08:00 digest to Kip Telegram, smoke-tested OK)
- [x] Wired validate_prompt() into nsfw/overnight-generate.py (compliance gate before pod creation)
- [x] Wired log_post() into venus.py (Mia/Voss X, Mia TikTok, Mia IG, Reddit promo) and nsfw/reddit-warmup.py
- [x] Built nsfw/v3-curation-server.py — Day-4 curation portal (PASS/FAIL UI + auto-promote to PuLID/pose-ref dirs)

On Daniel resume — next steps when balance > $5:
1. Provision RunPod A5000 with ashleykza/comfyui base image (pre-authorized within $5)
2. SCP nsfw/pipelines/v3-clean-slate/ to /workspace/ on the pod
3. python3 build-pipeline.py — runs the prepared script; tail /workspace/build.log
4. Validate ComfyUI loads at http://<pod-ip>:8188
5. Trigger Day 3: python3 run-canonical-batch.py --persona mia --kind face --count 80 (then mia/body, voss/face, voss/body)
6. Sync results back to local for curation
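
A minimal sketch of step 5 as a loop, so the four batches fire in a fixed order. The script name and the `--persona` / `--kind` / `--count` flags come from step 5; the body count of 30 is taken from the Day 3 spec (80 face + 30 body per persona) and is an assumption for anything other than mia/face.

```python
import shlex

# (persona, kind, count) in the order listed in step 5.
# Counts other than mia/face are assumed from the Day 3 spec (80 face + 30 body).
BATCHES = [
    ("mia", "face", 80),
    ("mia", "body", 30),
    ("voss", "face", 80),
    ("voss", "body", 30),
]


def batch_commands(batches=BATCHES):
    """Return one argv list per Day-3 batch, ready for subprocess.run()."""
    return [
        ["python3", "run-canonical-batch.py",
         "--persona", persona, "--kind", kind, "--count", str(count)]
        for persona, kind, count in batches
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the plan can be reviewed first.
    for argv in batch_commands():
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to run locally; on the pod each argv list could be passed to `subprocess.run(argv, check=True)`.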

Estimated time: 1.5-2 hrs to provision + run | $3 build + $7 batch generation = $10 total
Blocker risks: NSFW LoRA HF mirrors not yet pinned in build-pipeline.py — Day-2 task is "confirm best-available HF mirror at build time"

Day 3 — Canonical reference generation (✅ COMPLETE 2026-05-01 22:10 AEST)

Started: Fri May 1 10:21 UTC (20:21 AEST), launcher PID 5831 on pod 7bbbjincs9xi9g
Finished: Fri May 1 12:06 UTC (22:06 AEST) — 1h45m elapsed
Cost: ~$1.10 actual (workflow simplified to 7 nodes; ADetailer + 4x upscale return at Day 6 production)
Output: 220/220 — mia/face 80 + mia/body 30 + voss/face 80 + voss/body 30
Local path: ~/agency/nsfw/v3-canonical-refs/{mia,voss}/{face,body}/
Pod: TERMINATED 22:10 AEST via overnight-generate.terminate_pod('7bbbjincs9xi9g')

Live fixes during launch (commit 42ada6f):
- Workflow JSON _comment/_inputs keys tripped ComfyUI validator → strip in queue_prompt
- ComfyUI-Impact-Subpack missing (UltralyticsDetectorProvider lives there) → installed live + added to build-pipeline.py
- FaceDetailer schema mismatch → simplified canonical workflow to 7 nodes (Checkpoint → CLIP → KSampler → VAE Decode → Save). ADetailer + upscale return at Day 6 production workflow.
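
The first fix above (stripping annotation keys before queueing) could look roughly like this. It is a sketch, not the code in commit 42ada6f: it assumes the workflow is an API-format dict of node id to node, and that annotation keys are exactly those starting with an underscore at the top level or directly inside a node. The function name is hypothetical.

```python
import json


def strip_annotation_keys(workflow: dict) -> dict:
    """Return a copy without "_"-prefixed keys at the top level or inside each node.

    Assumption: keys such as "_comment" and "_inputs" are human-only annotations
    that the ComfyUI prompt validator rejects. Node "inputs" dicts are left untouched.
    """
    cleaned = {}
    for node_id, node in workflow.items():
        if node_id.startswith("_"):
            continue  # top-level annotation, e.g. "_comment"
        if isinstance(node, dict):
            node = {k: v for k, v in node.items() if not k.startswith("_")}
        cleaned[node_id] = node
    return cleaned


example = {
    "_comment": "hand-authored notes",
    "3": {"class_type": "KSampler", "_inputs": "doc only", "inputs": {"seed": 1}},
}

if __name__ == "__main__":
    print(json.dumps(strip_annotation_keys(example), indent=2))
```

Returning a copy leaves the annotated JSON on disk intact, so the comments stay available to whoever edits the workflow next.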

Finalize chain (executed 22:08–22:10 AEST):
1. Monitor task b1hyfet3i polled pod every 60s, fired on ALL Day 3 batches complete marker
2. rsync /workspace/ComfyUI/output/v3-canonical/ → /tmp/v3-rsync/ with --include='*_00001_.png' filter (skipped 1 ComfyUI internal duplicate of MIA_face_000_00002_.png)
3. Sorted by filename prefix (MIA_face / MIA_body / VOSS_face / VOSS_body) into curation portal source dirs
4. Verified 80/30/80/30 in each bucket
5. Pod terminated via RunPod GraphQL podTerminate mutation
6. Curation portal at http://localhost:8421 (PID 22117) confirmed loading the synced refs
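
Steps 3 and 4 of the finalize chain (sort by filename prefix, then verify 80/30/80/30) can be sketched as below. The prefixes and expected counts are from this section; the function names are hypothetical and the real sort may have been done in shell.

```python
from collections import Counter

# Expected files per bucket, as verified in step 4.
EXPECTED = {"MIA_face": 80, "MIA_body": 30, "VOSS_face": 80, "VOSS_body": 30}


def bucket_counts(filenames):
    """Count files per known prefix; names matching no prefix are ignored."""
    counts = Counter()
    for name in filenames:
        for prefix in EXPECTED:
            if name.startswith(prefix + "_"):
                counts[prefix] += 1
                break
    return dict(counts)


def verify(filenames):
    """True only if every bucket holds exactly the expected number of files."""
    return bucket_counts(filenames) == EXPECTED
```

In practice the filename list would come from something like `[p.name for p in pathlib.Path("/tmp/v3-rsync").glob("*.png")]`, run before the pod is terminated so a short bucket can still be re-synced.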

Day 4 portal upgrade (during pod wait, per Daniel's mid-flight ask):
- Per-image notes textarea (auto-save 500ms debounce + blur + decide-flush)
- Hotkeys (P/F/→) suppressed inside textarea, Esc blurs
- /api/note POST + /api/export GET (full audit CSV)
- Promote also writes notes.csv into target dir for retraining/iteration signal
- Decisions JSON wiped clean for fresh curation run


Day 3 (original spec — for reference)

Generate 80 face refs + 30 body refs per persona using JuggernautXL + persona description prompts. NO LoRA, NO PuLID at this stage — these refs ARE the ground truth.

Per-persona prompt prep needed:
- Mia: late-20s Italian, slim athletic, dark hair, hazel eyes — full prompt in nsfw/persona/mia-persona-brief.md (verify exists, may need polish)
- Voss: 40s European matriarch, sharp featured, blonde-grey ombre, ice-blue eyes — full prompt in nsfw/persona/madame-voss-persona-brief.md

Save to nsfw/v3-canonical-refs/{mia,voss}/face/ and .../body/.

For Voss specifically: 30 explicit-NSFW body refs WITH NSFW activator at 0.4 (full body, BDSM-coded for tier prep). For Mia: 30 explicit-NSFW at 0.3 (softer, brand-positioned).

Estimated time: ~12hr A5000 batch ($7) — kick overnight, review next morning.

Day 4 — Daniel curation (✅ COMPLETE 2026-05-01 ~22:50 AEST)

Done via nsfw/v3-curation-server.py portal at http://localhost:8421 with new triplet "Compare 3" mode + per-image notes capture (both shipped mid-flight per Daniel feedback).

Results — promoted into pipeline:

| Persona / kind | PASS | FAIL | Notes captured | Promoted to |
|---|---|---|---|---|
| mia/face | 79 | 0 | 14 | pulid-face-refs/mia/ (top 20) |
| mia/body | 22 | 8 | 9 | pose-refs/mia/ (all 22) |
| voss/face | 44 | 36 | 1 | pulid-face-refs/voss/ (top 20) |
| voss/body | 23 | 7 | 8 | pose-refs/voss/ (all 23) |

Each target dir also has notes.csv with retraining/iteration signal.

Quality leap: Daniel's verdict — "SIGNIFICANTLY better than old models". Mia 99% pass-rate confirms identity tightness; Voss 55% pass-rate at the right age band confirms tournament-style age-banding worked. Architecture switch from collapsed-LoRA to PuLID+canonical-refs is validated.
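
For reference, the pass-rates quoted above can be reproduced from the results table. This sketch assumes the denominator is the generated batch size (80 face, 30 body), not PASS + FAIL; that assumption is what makes mia/face come out at 99% rather than 100%.

```python
# PASS counts from the results table; batch sizes from the Day 3 output line.
GENERATED = {"mia/face": 80, "mia/body": 30, "voss/face": 80, "voss/body": 30}
PASSED = {"mia/face": 79, "mia/body": 22, "voss/face": 44, "voss/body": 23}


def pass_rate(bucket: str) -> int:
    """Pass-rate as a whole percentage of generated images (assumed denominator)."""
    return round(100 * PASSED[bucket] / GENERATED[bucket])


if __name__ == "__main__":
    for bucket in GENERATED:
        print(f"{bucket}: {pass_rate(bucket)}%")
```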

Day-3 retro-fix in flight (2026-05-01 evening): Daniel caught the missing NSFW body batch — spec called for 30 SFW + 30 NSFW per persona, only SFW ran. Pod re-fired (69.30.85.197:22113) to generate 60 NSFW body refs (30 mia GFE + 30 voss tier 1-3). Portal extended with body-nsfw kind → promotes to pose-refs/{persona}-nsfw/. Day-5 PuLID workflow JSON hand-authored (will test on same pod; if validation fails, falls back to Daniel's manual ComfyUI GUI build).

Day-3 retro v2 fire (2026-05-02 18:40-19:30 AEST): Daniel reviewed v1 body-nsfw and flagged "only sternum to head, no arms or legs" — SDXL collapsed to bust framing despite "full body" keyword. Fired v2 batch with two-part fix (commit 35fe6fe):
1. Framing: (full body shot showing legs and feet:1.4), (head to toe framing:1.3) + crop-rejecting negatives (head shot / bust shot / cropped at chest etc.)
2. Pose vocabulary expanded to full anatomy spectrum per Daniel: body / breasts / vagina / anus / legs / spread wide / from behind / shower with water overlays. Mia 18 poses (3 teasing + 5 shower + 6 full-body explicit + 4 from-behind/spread). Voss 18 tier-tagged poses (2 teasing + 2 lingerie + 5 controlled-front-nude + 2 shower + 4 from-behind + 3 BDSM authority).

Pod cktl5dszammop6 (RTX A5000) provisioned 18:06. Build complete in ~7 min (warmer base image than first time). Mia 30/30 done 19:03. Voss 30/30 done 19:18. Total batch cost ~$0.40. v1 PNGs + decisions archived to nsfw/archive/body-nsfw-v1-2026-05-02/. v2 PNGs at v3-canonical-refs/{mia,voss}/body-nsfw/. Decisions reset for fresh re-curation. Pod terminated 19:35. Daniel's v1 audit (mia 27p/3f, voss 26p/4f) preserved in archive but not used downstream — v1 framing made them unusable as DWPose pose-refs anyway. Total elapsed: 55 min from "go forward" to terminated pod.

Day 5 — PuLID calibration (✅ COMPLETE 2026-05-02 ~01:00 AEST)

Production weight LOCKED: 0.7 — both Mia and Voss read on-brand at w0.7 across all 5 stress prompts (portrait, three-quarter+hand, profile, full-body walking, wardrobe-swap). w0.9 also strong but at scale risks "stamp" repetition. w0.5/0.6 drift; w0.8 borderline. Spec's prior held.

Gallery for audit: ~/agency/nsfw/v3-test-batch/index.html (150 imgs, 5 weights × 5 prompts × 3 refs × 2 personas).
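
The 150-image grid can be enumerated as a sanity check on the sweep size. The weights and prompt labels are the ones named in this section; the job dict layout and the use of a ref index (rather than a filename) are assumptions, not necessarily what run-test-batch.py does.

```python
from itertools import product

WEIGHTS = (0.5, 0.6, 0.7, 0.8, 0.9)
PROMPTS = ("portrait", "three-quarter+hand", "profile",
           "full-body walking", "wardrobe-swap")
PERSONAS = ("mia", "voss")
REFS_PER_PERSONA = 3  # which 3 refs were used is not recorded here


def sweep_jobs():
    """One job per (persona, ref, prompt, weight) combination."""
    return [
        {"persona": persona, "ref_index": ref, "prompt": prompt, "weight": weight}
        for persona, ref, prompt, weight in product(
            PERSONAS, range(REFS_PER_PERSONA), PROMPTS, WEIGHTS
        )
    ]


if __name__ == "__main__":
    print(len(sweep_jobs()))
```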

Pod terminated. Total Day-5 sweep cost: ~$1.25.

Day 5 — original spec section (preserved)

State at 23:25 AEST:
- ✅ Pulid-face-refs promoted (mia: 20, voss: 20)
- ✅ Pose-refs promoted (mia: 22, voss: 23 SFW; nsfw queue running)
- ✅ Pod live at 69.30.85.197:22113 — ComfyUI running, models cached, PuLID node loaded
- ✅ workflow-v3-pulid-test.json hand-authored (skipping the 20-min manual GUI step IF validation passes)
- ✅ run-test-batch.py authored — sweeps 5 weights × 5 prompts × 3 refs × 2 personas = 150 imgs
- 🟡 NSFW body batch in flight on same pod (~50 min, monitor task bxt13cxk4)
- ⏳ Day-5 fires after NSFW body completes: validate-only test first, then full sweep if green

Decision criteria (after 150-img sweep): pick the weight where identity is locked but creative flexibility holds — usually 0.7. Spec at nsfw/pipelines/v3-clean-slate/day5-test-batch-protocol.md.

Day 6-10 — see v3 README

Test batch → production batch → voice clones → video integration → bulk Fanvue upload + relaunch.

Day 6 — Tier 1 pose-fix batch (in progress, attempt #7 of 7+ as of 2026-05-06 21:30 AEST)

The Tier 1 batch has been blocked all week by orchestration + template-image bugs. Successive failures and fixes are documented here so future sessions don't rediscover them.

Cumulative cost on Tier 1 thrash: ~$5 USD across 7 attempts (each ~$0.50-0.70 — provision + partial build + terminate).

Fixes shipped to launch-tier1-pose-fix.py + build-pipeline.py 2026-05-06:

| # | Bug | Fix | File |
|---|---|---|---|
| 1 | scp -r src dst nests src inside dst when dst exists | switched to scp src/ /workspace/ after rm -rf dst | launch-tier1-pose-fix.py |
| 2 | return 1 on build-fail leaks the pod (no terminate) | raise RuntimeError(...) so outer except hits terminate_pod | launch-tier1-pose-fix.py |
| 3 | Build watchdog 900s too tight; sanity PASSED marker rotates out | bumped 900s→1500s + COMFYUP fallback if /system_stats responds | launch-tier1-pose-fix.py |
| 4 | RunPod template ypt3pl6coj retired ("Template not found") | replaced with 9eqyhd7vs0 (ashleykza/comfyui:cu124-py312-v0.20.1) across 4 files | launch-tier1-pose-fix.py + launch-day2.py + bpm-ai-image-gen.py + bpm-thumbnail-generator.py |
| 5 | ComfyUI v0.20.1 imports sqlalchemy at startup; not in base image | added pip install sqlalchemy aiosqlite to build-pipeline.py main() | build-pipeline.py |
| 6 | SSH watchdog 20s timeout fails on busy pod mid-model-download | bumped to 60s + try/except for transient TimeoutExpired | launch-tier1-pose-fix.py |
| 7 | ComfyUI v0.20.1 also imports alembic (sqlite migrations) + warns on missing blake3 | added alembic + blake3 to the dep install line | build-pipeline.py |
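
Fixes 2, 3 and 6 share one idea: a failed or slow build must never leave a billed pod running. The sketch below shows that pattern under stated assumptions. It uses try/finally instead of the raise-into-outer-except approach in the table, and `provision`, `build` and `terminate` are hypothetical callables standing in for the real launcher functions.

```python
import subprocess
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], deadline_s: float = 1500,
                     interval_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll probe() until it returns True or the deadline passes.

    A subprocess.TimeoutExpired from a slow SSH call is treated as
    "not ready yet" rather than as a build failure (fix 6).
    """
    start = time.monotonic()
    while time.monotonic() - start < deadline_s:
        try:
            if probe():
                return True
        except subprocess.TimeoutExpired:
            pass  # busy pod mid-download; try again on the next tick
        sleep(interval_s)
    return False


def run_with_pod(provision: Callable[[], str],
                 build: Callable[[str], bool],
                 terminate: Callable[[str], None]) -> None:
    """Provision, build, and always terminate, even when the build fails (fix 2)."""
    pod_id = provision()
    try:
        if not build(pod_id):
            raise RuntimeError(f"build failed on pod {pod_id}")
    finally:
        terminate(pod_id)  # runs on success, failure, and Ctrl-C alike
```

The `finally` form also covers exceptions the outer `except` might not catch, such as `KeyboardInterrupt`; the trade-off is that the pod is always torn down, so it does not suit a flow that wants to keep a healthy pod alive after the build.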

Outstanding suspect (not yet fixed):
- ModuleNotFoundError: No module named 'comfy_aimdo': appears to come from a custom node, not standard ComfyUI. It surfaced after the alembic failure, so it has not yet been observed in isolation. The re-fire after fix #7 will show whether it is a real blocker or downstream noise.

Strategy if attempt #7 also fails:
1. SSH directly into a fresh pod and run python3 -c "import main" from /workspace/ComfyUI to enumerate ALL missing deps in one pass — stop the firing-and-failing pattern
2. OR fall back to template cw3nka7d08 (runpod/comfyui:latest) which is the official RunPod image — known to be more reliable than ashleykza
3. OR roll a custom Docker image with all deps baked in — costs ~$0.50 to build, then pod fires are clean
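
As a complement to option 1: a plain `import` typically stops at the first missing top-level module, so one run reveals one gap. The sketch below checks a list of suspects in a single pass instead. The module names are the ones that have failed so far in this section; the list is not exhaustive and the check does not follow transitive imports.

```python
import importlib.util

# Modules that have been reported missing on the ashleykza image so far.
SUSPECT_MODULES = ("sqlalchemy", "aiosqlite", "alembic", "blake3", "comfy_aimdo")


def missing_modules(names=SUSPECT_MODULES):
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    gaps = missing_modules()
    if gaps:
        print("missing:", ", ".join(gaps))
    else:
        print("all suspect modules importable")
```

Run with the same interpreter ComfyUI uses on the pod (for example from its virtualenv), otherwise the result describes the wrong environment.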

Compliance moat — running checklist (post-Day-1)

Open questions / blocked-on-Daniel

Auto-resume contract

This file is read at every session start (added to CLAUDE.md MANDATORY bootstrap). The session reads:
1. Last SESSION CLOSE entries
2. MEMORY.md index
3. parking_lot_for_opus.md
4. THIS FILE if Corvus v3 work is active

Then jumps to "next unchecked item" in the current Day section. No re-orientation required.


🔴 IMAGE-STACK AUDIT — 2026-06-25 (Opus, verified against files on disk)

Verdict: NOT production-grade as-is. Three issues outrank the rest:

P1 — Split-brain architecture. Two contradictory stacks coexist:
- v3 PuLID/ComfyUI (intended, LOCKED) — abandoned mid-Day-5; last v3 sample 2026-05-02; ComfyUI graphs are hand-authored JSON never confirmed to run end-to-end on a pod (build stuck on deps incl. unresolved comfy_aimdo ModuleNotFoundError on ashleykza template).
- v2 A1111 + stacked persona LoRAs (DEPRECATED, README calls the LoRAs a "model-collapse vector", archived to contaminated-2026-05-01/) — but overnight-generate.py (1,800 lines) STILL executes it, and it produced the ONLY recent output (voss 2026-06-01). One-line launchd re-enable = contaminated LoRAs back in production. Biggest risk.

P1 — Production runner skips the quality stack. run-production-batch.py loads workflow-v3-pulid-test.json = [email protected] but NO ADetailer, NO DWPose, NO HandRefiner, NO upscale (bare 11-node graph). The full face/hand/pose pipeline lives only in workflow-v3-tier1-pose-fix.json, wired to a DIFFERENT runner (run-tier1-pose-fix-batch.py). So the script you'd fire for real PPV is the bare test graph → weak faces/hands on full-body.

P1 — v3 ComfyUI graph never validated on a pod. Smoke-test before any paid batch: SSH pod, python3 -c "import main" from /workspace/ComfyUI to enumerate missing deps in one pass; consider official runpod/comfyui template.

P2: ADetailer mask_blur rule unmet on ComfyUI path (FaceDetailer uses feather:5, no mask_blur — only the dead A1111 path sets 8/12) · HandRefiner (MeshGraphormer) shipped DISABLED (_handrefiner_disabled) despite hands being the known weak point · NO upscaler in any v3 workflow (output caps 832×1216 / 1024² — below PPV grade) · checkpoint drift: JSONs hardcode Juggernaut v9, README specifies v12.
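
The checkpoint-drift item can be checked mechanically. A minimal sketch, assuming API-format workflow JSON where the checkpoint is loaded by a `CheckpointLoaderSimple` node with a `ckpt_name` input; the filenames in the usage note are hypothetical, not the real ones on disk.

```python
def checkpoint_names(workflow: dict) -> set:
    """Collect ckpt_name from every CheckpointLoaderSimple node in a workflow."""
    return {
        node["inputs"]["ckpt_name"]
        for node in workflow.values()
        if isinstance(node, dict)
        and node.get("class_type") == "CheckpointLoaderSimple"
        and "ckpt_name" in node.get("inputs", {})
    }


def drifted(workflows: dict, expected_substring: str) -> dict:
    """Map workflow name to the checkpoint names that lack the expected marker."""
    report = {}
    for name, workflow in workflows.items():
        wrong = {c for c in checkpoint_names(workflow) if expected_substring not in c}
        if wrong:
            report[name] = wrong
    return report
```

Usage would be along the lines of loading each workflow with `json.load` and calling `drifted(loaded, "v12")`; an empty dict means every graph already points at the README's version.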

P3 hygiene: Telegram bot token + chat IDs hardcoded in mia-generate.py & overnight-generate.py · run-production-batch.py compliance validator falls back to a no-op lambda:(True,"") if the monitor script isn't found → compliance silently FAIL-OPENS on a fresh pod · stale TRAINING_POD_ID/IP constants · InsightFace on CPU (slow, minor).

Recommended sequence (when RunPod funded): (1) confirm v3 PuLID = THE path → neutralize/archive the A1111 generation path; (2) repoint production runner at the full quality workflow (tier1 nodes); (3) smoke-test ComfyUI graph on a pod (~$2); (4) enable HandRefiner + real mask_blur; (5) add 4x-UltraSharp upscaler; (6) align checkpoint version; (7) hygiene (secrets→.env, compliance fail-CLOSED).
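
For item (7), a fail-closed loader could look like the sketch below. `validate_prompt` and its `(ok, reason)` return shape follow what is described in this file; the module name `corvus_compliance_monitor` is a placeholder, since the real import path is not recorded here.

```python
import importlib
from typing import Callable, Tuple

Validator = Callable[[str], Tuple[bool, str]]


def load_validator(module_name: str = "corvus_compliance_monitor") -> Validator:
    """Return the real validate_prompt, or a validator that rejects everything.

    Fail-closed: if the monitor cannot be imported, no prompt is allowed through.
    The default module name is a placeholder, not the real path.
    """
    try:
        module = importlib.import_module(module_name)
        return module.validate_prompt
    except (ImportError, AttributeError) as exc:
        reason = f"compliance validator unavailable: {exc}"

        def reject_all(prompt: str) -> Tuple[bool, str]:
            return (False, reason)

        return reject_all


if __name__ == "__main__":
    validate = load_validator("module_that_does_not_exist")
    print(validate("any prompt"))
```

The difference from the current `lambda: (True, "")` fallback is only the default verdict, but it turns a missing monitor on a fresh pod into a visible, blocking error instead of silently skipped compliance.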