How the fund is built
An explainable, LLM-run paper hedge fund — in public. A LangGraph-orchestrated daily cycle sits on a hardened LLM gateway, with idempotent stores as the system of record, Qdrant as long-term memory, evals in CI, and every decision traced, cited, and scored.
The fund analyzes markets, debates, trades a simulated $1M book, and justifies itself on every run. Deterministic guardrails (position sizing, turnover, sector limits, stop-loss/take-profit) sit outside the model; the LLM must argue its case and respond to the bear thesis before anything executes. The differentiator isn't returns; it's radical transparency.
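To make "guardrails outside the model" concrete, here is a minimal sketch of a deterministic check applied to proposed orders. The `Order` type, the `check_guardrails` function, and the three limits (10% per position, 30% per sector, 20% turnover) are assumptions for illustration, not the fund's actual code or values. It assumes a long-only book and leaves out stop-loss/take-profit.

```python
from dataclasses import dataclass

# Hypothetical limits, chosen only for illustration.
MAX_POSITION_PCT = 0.10   # largest share of the book in one ticker
MAX_SECTOR_PCT = 0.30     # largest share of the book in one sector
MAX_TURNOVER_PCT = 0.20   # largest share of the book traded in one run


@dataclass(frozen=True)
class Order:
    ticker: str
    sector: str
    notional: float  # signed dollars: positive buys, negative sells


def check_guardrails(orders, positions, sectors, book_value):
    """Split orders into approved orders and (order, reason) rejections.

    positions maps ticker -> current notional; sectors maps sector -> current
    notional. Orders are checked in sequence, so an approved order counts
    against the limits seen by later orders.
    """
    approved, rejected = [], []
    pos, sec = dict(positions), dict(sectors)
    turnover = 0.0
    for order in orders:
        new_pos = pos.get(order.ticker, 0.0) + order.notional
        new_sec = sec.get(order.sector, 0.0) + order.notional
        new_turnover = turnover + abs(order.notional)
        if abs(new_pos) > MAX_POSITION_PCT * book_value:
            rejected.append((order, "position limit"))
        elif abs(new_sec) > MAX_SECTOR_PCT * book_value:
            rejected.append((order, "sector limit"))
        elif new_turnover > MAX_TURNOVER_PCT * book_value:
            rejected.append((order, "turnover limit"))
        else:
            approved.append(order)
            pos[order.ticker] = new_pos
            sec[order.sector] = new_sec
            turnover = new_turnover
    return approved, rejected
```

Because the check is a pure function of the orders and the current book, the same inputs always produce the same approvals, whatever argument the model makes for a trade.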
System architecture
One LLM gateway is the single choke point — routing, retries, tracing, and cost tracking are one-time costs, not per-agent ones.
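One possible shape for such a choke point is sketched below. The `LLMGateway` class, its `complete` method, and the `provider` callable are hypothetical names, and the in-memory `calls` list is only a stand-in for real tracing. The point is that retries and cost accounting are written once and shared by every agent.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Tuple


@dataclass
class LLMGateway:
    """Hypothetical single entry point that every agent calls."""

    provider: Callable[[str], Tuple[str, float]]  # prompt -> (text, cost in USD)
    max_retries: int = 3
    backoff_s: float = 0.0
    total_cost: float = 0.0
    calls: list = field(default_factory=list)  # minimal stand-in for a trace

    def complete(self, agent: str, prompt: str) -> str:
        last_error = None
        for attempt in range(1, self.max_retries + 1):
            try:
                text, cost = self.provider(prompt)
            except Exception as exc:
                # Treat every provider error as retryable in this sketch.
                last_error = exc
                time.sleep(self.backoff_s * attempt)
                continue
            # Cost and trace bookkeeping happen here, not in each agent.
            self.total_cost += cost
            self.calls.append({"agent": agent, "attempt": attempt, "cost": cost})
            return text
        raise RuntimeError(
            f"{agent}: provider failed after {self.max_retries} attempts"
        ) from last_error
```

An agent would then call something like `gateway.complete("analyst", prompt)` and never talk to a provider directly.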
Run state machine
Each daily run is a guarded LangGraph state machine. Guardrails run before execution; an empty or fully-rejected decision skips execution; any node error finalizes with diagnostics; a killed run resumes by reusing its run_id.
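The routing rules above can be sketched without LangGraph as a small driver loop. In a LangGraph graph the same branch would typically be a conditional edge. The node names (`decide`, `guardrails`, `execute`, `finalize`) and the state keys (`approved`, `status`, `diagnostics`) are assumptions for illustration.

```python
def next_step(step, state):
    """Pick the next node; skip execution when nothing was approved."""
    if step == "decide":
        return "guardrails"
    if step == "guardrails":
        return "execute" if state.get("approved") else "finalize"
    if step == "execute":
        return "finalize"
    return "done"


def run_cycle(state, nodes):
    """Drive one run through decide -> guardrails -> execute -> finalize.

    nodes maps a node name to a function that takes and returns the state dict.
    """
    step = "decide"
    while step != "done":
        try:
            state = nodes[step](state)
            step = next_step(step, state)
        except Exception as exc:
            # Any node error jumps to finalize with diagnostics attached;
            # a failure inside finalize itself simply ends the run.
            state = {**state, "status": "error", "diagnostics": f"{step}: {exc}"}
            step = "done" if step == "finalize" else "finalize"
    return state
```

Guardrails always run before `execute`, and an empty `approved` list sends the run straight to `finalize`, which covers both an empty decision and a fully rejected one.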
run_id: no duplicate trades or journal entries
One daily cycle: sequence
How the components talk during a single run. Solid arrows are calls; dashed are returns.
Key engineering decisions
The calls that shaped the project — foundational bets first, then the choices a constraint forced later.
| Decision | Why |
|---|---|
| One LLM gateway as the single choke point | The highest-leverage refactor in the repo: routing, retries, structured-output validation, tracing, and cost tracking become one-time costs instead of per-agent ones. Every agent flows through it — the later provider seam and tool loop were built on it. |
| Deterministic guardrails outside the LLM | The portfolio manager can be creative; the risk layer is boring on purpose. Position sizing, turnover, sector limits, and stop-loss/take-profit are enforced in code the model can't argue past. |
| LangGraph as the only runner | The legacy linear orchestrator was deleted once the graph reached parity. Two orchestrators is debt, not safety. |
| SQLite, not Postgres, as the system of record | Transactions, upserts, and queryability without running a server. Raw state is no longer committed to main; exports are still committed to Pages. |
| Langfuse over LangSmith for tracing | Self-hostable and open source — a better story for an in-public project, and a no-op unless keys are set. |
| Idempotency-based resume over LangGraph's native SqliteSaver | The graph state carries non-serializable live handles (engine, clients). Persisting progress and reusing the run_id leans on the idempotent stores for duplicate-free re-execution: correct, without a large, risky state refactor. |
| Chunking eval with a deterministic hashing embedder + in-memory Qdrant | Runs in CI with no API key, yet does real vector search. Cosine over TF vectors rewards term concentration — the actual reason chunking helps — so the 0.15 → 1.00 hit@1 win is honest, not rigged. |
| Grounding gate before publish | Both a daily decision and the weekly letter are checked against the facts they had; unsupported claims are blocked, never published or tweeted. |
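The SQLite and resume rows rest on writes that are safe to replay. A minimal sketch of that idea follows, assuming a hypothetical `trades` table keyed by `(run_id, ticker)` and using `INSERT OR IGNORE` in place of whatever upsert the project actually uses. The schema and function names are illustrative only.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a store with a hypothetical trades table keyed by (run_id, ticker)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS trades (
               run_id TEXT NOT NULL,
               ticker TEXT NOT NULL,
               qty    INTEGER NOT NULL,
               PRIMARY KEY (run_id, ticker)
           )"""
    )
    return conn


def record_trade(conn, run_id, ticker, qty):
    """Insert a trade once; return True if it was new, False on a replay."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO trades (run_id, ticker, qty) VALUES (?, ?, ?)",
            (run_id, ticker, qty),
        )
    return cur.rowcount == 1
```

With a key like this, a resumed run that reuses its run_id can re-execute every step: writes that already landed are ignored, so no trade is recorded twice.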
Deliberately not built
Scope is a feature — what was left out on purpose.