// OVERVIEW

OrganizedMarket

Hard price data from TastyTrade. Prediction market odds from Polymarket and Kalshi. Sentiment from Twitter/X and financial news — all wired into a correlation engine running inside ClawBox's isolated Linux VM, orchestrated through OpenClaw. Surfaces arbitrage-adjacent signals across venues in real time.

ClawBox · OpenClaw :18789 · TastyTrade DXLink · polymarket-cli · kalshi-cli · Twitter/X v2 · Claude Opus 4.6 × 4.7 · LLM Wiki · CF Pages
8 Agents · 3 Data Sources · RT Streaming
Signal Loop

The Core Idea

Prediction markets price real-world outcomes. Financial markets price risk. When they diverge — a Polymarket contract implying a Fed cut with probability X while /ZQ options price something different — there's signal. OrganizedMarket finds that gap continuously.

One source of truth
Cross-venue price divergence. Agent conclusions are derived from hard data — DXLink quotes, CLOB order books, contract resolution odds.
One metric
Signal confidence 0.0–1.0. High (>0.75) fires an alert immediately. Medium (0.45–0.75) queues for pattern confirmation. Low (<0.45) is logged to the flywheel.
One directive
Find correlation before causation. Twitter sentiment and news cycle events are leading indicators. Hard price data is ground truth. Never invert that hierarchy.
Human review gate
OrganizedMarket surfaces signals — it does not execute trades. All alerts include a confidence score, data provenance, and a plain-English summary before any action.
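
The confidence bands under "One metric" can be sketched as a small classifier. This is a minimal illustration, not the project's code: the names `Tier` and `classify` are hypothetical, and only the 0.75 and 0.45 cut-offs come from the text.

```python
from enum import Enum

HIGH_THRESHOLD = 0.75  # from the text: above this, alert immediately
MED_THRESHOLD = 0.45   # from the text: 0.45 to 0.75 queues for confirmation


class Tier(str, Enum):
    HIGH = "HIGH"
    MED = "MED"
    LOW = "LOW"


def classify(confidence: float) -> Tier:
    """Map a 0.0-1.0 confidence score onto the three tiers."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence > HIGH_THRESHOLD:   # strictly greater, matching ">0.75"
        return Tier.HIGH
    if confidence >= MED_THRESHOLD:
        return Tier.MED
    return Tier.LOW
```

A score of exactly 0.75 lands in MED here because the text defines HIGH as strictly above 0.75.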


// CURRENT BUILD — WHAT'S LIVE TODAY

Current Build

OrganizedMarket has grown well past the base 6-agent flow. What ships today is a 13-agent pipeline with deterministic trade plans, a 2-click agentic execution layer, a shadow-journal + hindsight + policy learning loop, six live dashboards, and a Claude Code ↔ Hermes handoff skill that delegates coding tasks to a remote agent on a dedicated Mac mini over Tailscale. Everything mock-first, everything kill-switched.

From signal to trade
Signals carry a full TradePlan — exact contract counts sized against TRADE_UNIT_USD × confidence, expiry month, venue deeplinks, copy-pastable ticket strings for both legs. Nothing opaque, nothing narrative — every number is deterministic.
Human-in-the-loop execution
Every HIGH signal runs through a QA agent (freshness, confidence, edge, cooldown, daily-budget, dedup). Passed items land on /actions with an Approve QA click, then a Place Trade click. Three layered kill switches (AUTO_EXEC_ENABLED + per-venue flags) keep everything paper-only by default.
Reps regardless of execution
Every HIGH signal journals as a SHADOW entry whether you trade it or not. Hindsight labels outcomes at horizon; Polymarket resolution overlays real $/$-risked PnL when markets settle. The policy aggregator turns those labels into per-bucket TP rates + confidence multipliers.
Delegation without lockout
The /hermes delegate skill rsyncs the project to claws-mac-mini, creates an isolated git worktree, runs Hermes chat in a detached tmux session, and brings the diff back into ./.hermes/incoming/ for review. Your Claude Code session keeps full read/write on the local tree throughout.
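
The deterministic sizing described under "From signal to trade" (contract counts sized against TRADE_UNIT_USD × confidence) might look like the sketch below. The rounding rule (floor to whole contracts), the default dollar value, and the function name `contract_count` are assumptions; the source only states that size scales with TRADE_UNIT_USD × confidence.

```python
import math

TRADE_UNIT_USD = 100.0  # hypothetical default; the real value is configuration


def contract_count(confidence: float, price_per_contract: float,
                   trade_unit_usd: float = TRADE_UNIT_USD) -> int:
    """Whole contracts affordable with a confidence-scaled budget."""
    if price_per_contract <= 0:
        raise ValueError("price_per_contract must be positive")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    budget = trade_unit_usd * confidence
    # Assumption: round down so the plan never exceeds the budget.
    return math.floor(budget / price_per_contract)
```

Because the inputs fully determine the output, the same signal always yields the same contract count, which is the property the TradePlan text emphasizes.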

Six live dashboards

All served by Wrangler Pages; deployed copy lives at organized-market-arch.pages.dev. Each polls its rolling JSON feed every 2s.

13-agent pipeline

signal  ──┐                                            ┌── dispatcher (sinks)
odds    ──┼─▶ correlator ─▶ signal ──▶ qa ──▶ action ──┤
sentiment─┘             │                              ├── journal ─▶ hindsight ─▶ policy ──┐
                        └──▶ arb.poly_tasty ───────────┤                                    │
                                                       └── bridge (HTTP) ─▶ tasty·poly·kalshi
                                                                                            │
autoresearch ◀── tier.lift ◀── tier_correlator                                              │
            ──▶ model.drift                                                                 │
            ──▶ journal.research ─── appended to entries                                    │
sniffer     ──▶ counterparty.fingerprint                                                    │
                                                                                            │
                                         policy feedback ──────────────── modifies next ────┘
  

Every box subscribes to one or more Pydantic-validated topics on a shared in-process asyncio bus. Full wiring + schemas are in the Agent Pipeline + Docs · Message Topics sections.

2-click execution layer

The QA agent takes any HIGH signal / profitable arb and produces a ProposedAction with six deterministic check rows. Passed items surface on /actions.

PROPOSED ──▶ QA_PASSED ──click 1──▶ APPROVED ──click 2──▶ EXECUTING ──▶ EXECUTED
         └─▶ QA_REJECTED                                           └─▶ FAILED
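
The lifecycle in the diagram above can be read as a transition table. The sketch below encodes exactly those arrows; the names `ActionState`, `TRANSITIONS`, and `advance` are illustrative and not taken from the repo.

```python
from enum import Enum


class ActionState(str, Enum):
    PROPOSED = "PROPOSED"
    QA_PASSED = "QA_PASSED"
    QA_REJECTED = "QA_REJECTED"
    APPROVED = "APPROVED"
    EXECUTING = "EXECUTING"
    EXECUTED = "EXECUTED"
    FAILED = "FAILED"


# Allowed transitions, read directly off the diagram.
TRANSITIONS = {
    ActionState.PROPOSED: {ActionState.QA_PASSED, ActionState.QA_REJECTED},
    ActionState.QA_PASSED: {ActionState.APPROVED},    # click 1
    ActionState.APPROVED: {ActionState.EXECUTING},    # click 2
    ActionState.EXECUTING: {ActionState.EXECUTED, ActionState.FAILED},
}


def advance(current: ActionState, target: ActionState) -> ActionState:
    """Return the new state, or raise if the diagram has no such arrow."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Terminal states (QA_REJECTED, EXECUTED, FAILED) have no outgoing entries, so any attempt to move out of them raises.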
  

The bridge server (scripts/bridge.py, http://127.0.0.1:18799) receives approve / execute POSTs and fires both legs through a fail-closed executor facade.

Master switch AUTO_EXEC_ENABLED=0 overrides everything — every place_* call returns status=manual with a specific "set X=1" message. Bridge still runs, but no orders ever reach the wire.
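
A fail-closed facade of this kind could be sketched as follows. AUTO_EXEC_ENABLED and the status=manual result come from the text; the per-venue flag naming (`AUTO_EXEC_<VENUE>`), the function name `place_order`, and the `would_submit` placeholder status are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def _enabled(name: str, env: Mapping[str, str]) -> bool:
    # Fail closed: anything other than an explicit "1" counts as off.
    return env.get(name, "0") == "1"


def place_order(venue: str, ticket: str,
                env: Optional[Mapping[str, str]] = None) -> dict:
    """Gate an order behind the master switch, then the venue switch."""
    env = os.environ if env is None else env
    if not _enabled("AUTO_EXEC_ENABLED", env):
        return {"status": "manual",
                "message": "set AUTO_EXEC_ENABLED=1 to allow execution"}
    venue_flag = f"AUTO_EXEC_{venue.upper()}"  # hypothetical flag name
    if not _enabled(venue_flag, env):
        return {"status": "manual",
                "message": f"set {venue_flag}=1 to allow {venue} orders"}
    # A real executor would submit to the venue here; this sketch stops short.
    return {"status": "would_submit", "venue": venue, "ticket": ticket}
```

With an empty environment every call returns status=manual, so a missing or misspelled flag can never result in a live order.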

Learning loop (Layers 1–4)

L1 · Shadow journal
Every HIGH Signal or profitable ArbOpportunity opens a SHADOW JournalEntry. If you later execute, it promotes in place to OPEN — same entry_id, full snapshot history preserved.
L2 · Hindsight evaluator
At HINDSIGHT_HORIZON_SECONDS (default 1h), labels entries with realized_capture, ideal_entry_lag_seconds, ideal_exit_lag_seconds. Separately polls gamma-api.polymarket.com for resolution — when a market settles, overwrites the proxy with real $/$-risked return.
L3 · Policy aggregator
Per pattern_key (symbol|venue|gap:bucket|z:bucket|sentiment:regime): TP rate, mean realized capture, suggested confidence multiplier (clipped [0.3, 1.3], shrunk toward 1.0 until n≥20), suggested exit rule. Read-only; visible on /policy.
L4 · Feedback
Correlator + arb consult the cache at emit time. LEARN_MODE=shadow (default) only stamps an audit. LEARN_MODE=active applies the multipliers live (size capped at ×1.2). Plan carries a PolicyAdjustment record either way so the dashboard always explains why a number moved.
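
The L3 multiplier and the L4 shadow/active switch can be illustrated together. The clip range [0.3, 1.3], the n≥20 cut-off, and the LEARN_MODE values come from the text; the linear shrinkage formula and the shape of the audit record are assumptions, and the ×1.2 size cap is not modeled here.

```python
MIN_MULT, MAX_MULT = 0.3, 1.3  # clip range from the text
FULL_TRUST_N = 20              # shrinkage ends at n >= 20, per the text


def confidence_multiplier(raw: float, n: int) -> float:
    """Shrink a raw multiplier toward 1.0 for small samples, then clip."""
    # Assumption: linear shrinkage weight; the real estimator may differ.
    weight = min(max(n, 0) / FULL_TRUST_N, 1.0)
    shrunk = 1.0 + weight * (raw - 1.0)
    return min(max(shrunk, MIN_MULT), MAX_MULT)


def apply_policy(confidence: float, multiplier: float,
                 learn_mode: str = "shadow"):
    """Return (confidence_to_use, audit_record)."""
    suggested = min(confidence * multiplier, 1.0)
    applied = learn_mode == "active"
    audit = {  # stands in for the PolicyAdjustment record
        "multiplier": multiplier,
        "learn_mode": learn_mode,
        "applied": applied,
        "suggested_confidence": suggested,
    }
    return (suggested if applied else confidence), audit
```

In shadow mode the input confidence passes through unchanged while the audit record still carries the suggested value, which mirrors the "stamps an audit" behavior described above.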

Hermes handoff skill

A Claude Code skill at ~/.claude/skills/hermes/ delegates build tasks to the Hermes agent running on claws-mac-mini over Tailscale while keeping your local Claude Code session fully active on the same project. Hermes works on an isolated git worktree on the remote side. When it's done, /hermes pull rsyncs the worktree into ./.hermes/incoming/<task-id>/ — you review the diff and selectively merge.

/hermes delegate "<task>"   → rsync project → tmux detached hermes chat → task-id
/hermes status [id]          → tail log + list worktree changes
/hermes pull  [id]           → rsync back → HANDOFF_RESULT.md + diff vs local
/hermes list                  → recent task-ids + state
/hermes cancel <id>          → tmux kill-session
  

Sits side-by-side with Claude Code. Both can work on the same project simultaneously — Hermes on the mini, Claude Code on your Mac — with zero mutation conflicts until you explicitly merge.


// ARCHITECTURE

System Architecture

OrganizedMarket runs entirely inside ClawBox — a Tauri-wrapped macOS app that spins up an isolated Ubuntu 24.04 VM via Lima. OpenClaw runs inside that VM as the agent gateway. Your Mac stays clean; only explicitly uploaded files are shared with the VM.

               ORGANIZEDMARKET — SYSTEM DIAGRAM

┌────────────────────────────────────────────────────────────────────┐
│  CLAWBOX (Tauri + React)                                           │
│  Native macOS UI · one-click VM lifecycle                          │
├────────────────────────────────────────────────────────────────────┤
│  LIMA VM MANAGER                                                   │
│  Ubuntu 24.04 · isolated from host Mac                             │
├────────────────────────────────────────────────────────────────────┤
│  OPENCLAW GATEWAY  :18789                                          │
│                                                                    │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐  │
│  │ agent-signal     │ │ agent-poly       │ │ agent-kalshi     │  │
│  │ TastyTrade REST  │ │ polymarket-cli   │ │ kalshi-cli       │  │
│  │ DXLink stream    │ │ subprocess JSON  │ │ subprocess JSON  │  │
│  │ options flow     │ │ clob midpoints   │ │ yes_bid/yes_ask  │  │
│  └─────────┬────────┘ └─────────┬────────┘ └────────┬─────────┘  │
│            │                    │                   │             │
│            └────────────────────┼───────────────────┘             │
│                                 │ quote.update · odds.update      │
│  ┌──────────────────────────────▼──────────────────────────────┐ │
│  │  agent-correlator          · pearson + lag                   │ │
│  │  cross-venue divergence · z-scored rolling stats             │ │
│  └──────────────────────────────┬──────────────────────────────┘ │
│                                 │ + sentiment + drift + lift     │
│  ┌──────────────────────────────┴──────────────────────────────┐ │
│  │  agent-sentiment    Twitter/X v2 · news NLP · Claude Sonnet │ │
│  └──────────────────────────────┬──────────────────────────────┘ │
│                                 │ signal (HIGH/MED/LOW)          │
│                                 ▼                                 │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │  agent-dispatcher    Slack · Discord · webhook · dashboard  │ │
│  │  confidence gate · HIGH-only fires · MED queues for review  │ │
│  └─────────────────────────────────────────────────────────────┘ │
│                                                                    │
│  ─── research + attribution loop ──────────────────────────────    │
│                                                                    │
│  ┌──────────────────┐ tier.lift  ┌─────────────────────────────┐ │
│  │ agent-tier-      │───────────▶│ agent-autoresearch           │ │
│  │ correlator       │            │ Opus 4.6 × 4.7 drift probes  │ │
│  │ MED→HIGH lift    │◀───────────│ model.drift → correlator     │ │
│  └────────┬─────────┘ feedback   └──────────────┬──────────────┘ │
│           │                                      │                │
│           │     signal_log mining                │ drift context   │
│           ▼                                      ▼                │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │  agent-sniffer                                              │ │
│  │  public-data venue cohorts · signal+drift fingerprinting    │ │
│  └─────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
         │                                     │
         ▼                                     ▼
  claws-mac-mini :11434               CF Pages dashboard
  Claude Sonnet via OpenClaw          organized-market-arch.pages.dev
  local inference (optional)
  

Why ClawBox as the Container

ClawBox's Lima-based VM isolation is ideal for a financial intelligence agent: API credentials never touch your host Mac, the VM can be snapshotted before experiments, and the OpenClaw gateway handles multi-agent orchestration without needing to wire up a separate orchestration layer. Everything is already there.

Component | Technology | Role
ClawBox GUI | Tauri + React | Native macOS wrapper, one-click VM start/stop
Lima VM | Ubuntu 24.04 ARM | Isolated execution environment, no host filesystem access
OpenClaw | Node.js + Python | Agent gateway :18789, multi-agent orchestration, tool registry
agent-signal | Python + DXLink WS | TastyTrade streaming quotes, options flow, greeks ingestion
agent-poly | Python + polymarket-cli | Polymarket CLOB via Rust CLI subprocess: markets get + clob midpoints, JSON stdout
agent-kalshi | Python + kalshi-cli | Kalshi event contracts via CLI subprocess: kalshi market <TICKER> --json, RSA-PSS auth handled by CLI
agent-sentiment | Python + Claude Sonnet | Twitter/X entity scoring, financial news NLP
agent-correlator | Python + NumPy | Cross-venue divergence, Pearson + lag correlation matrix, joins model.drift + tier.lift into tier scoring
agent-dispatcher | Python + webhooks | Confidence-gated alerts, Slack/Discord/ClawBox UI output
agent-autoresearch | Python + Anthropic SDK | Probes two frontier model versions (Opus 4.6 vs 4.7) on identical market context, emits model.drift. Subscribes to tier.lift for immediate triggered probes
agent-tier-correlator | Python + SQLAlchemy | Mines signal_log for MED→HIGH lift per pattern bucket, publishes tier.lift when a pattern crosses the promotion threshold
agent-sniffer | Python + public signal/drift joins | Emits venue-cohort fingerprints from recent signal + model.drift context and tags the likely lagging model for that cohort

Bus topics

All inter-agent communication flows through core.bus.Bus, an in-process asyncio pub/sub with Pydantic validation on publish. The full topic list, including the topics added by the research + attribution loop:

Topic | Schema | Publisher → Subscribers
quote.update | QuoteUpdate | signal → correlator
odds.update | OddsUpdate | poly, kalshi → correlator
sentiment | Sentiment | sentiment → correlator
signal | Signal | correlator → dispatcher, tier-correlator (persisted to signal_log)
arb.poly_tasty | ArbOpportunity | correlator → dispatcher
model.drift | ModelDriftEvent | autoresearch → correlator (tier boost on drift × divergence overlap)
market.microstructure | MarketMicrostructure | poly / kalshi → sniffer (public book spread, imbalance, activity)
tier.lift | TierTransitionStat | tier-correlator → autoresearch (immediate triggered probe on high-lift pattern)
counterparty.fingerprint | CounterpartyFingerprint | sniffer → dispatcher / downstream scoring (venue cohort + likely lagging model)
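
To make the "validate on publish" idea concrete, here is a minimal asyncio pub/sub sketch. It is not core.bus.Bus: the `subscribe`/`publish` interface is an assumption, the `OddsUpdate` fields are a guessed subset, and a stdlib dataclass stands in for the Pydantic model so the example has no third-party dependency.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OddsUpdate:
    """Stand-in for the Pydantic schema; fields are illustrative."""
    market_id: str
    yes_prob: float

    def __post_init__(self):
        if not 0.0 <= self.yes_prob <= 1.0:
            raise ValueError("yes_prob must be in [0, 1]")


class Bus:
    def __init__(self, schemas):
        self._schemas = schemas                  # topic -> message class
        self._subscribers = defaultdict(list)    # topic -> list of queues

    def subscribe(self, topic: str) -> asyncio.Queue:
        queue = asyncio.Queue()
        self._subscribers[topic].append(queue)
        return queue

    async def publish(self, topic: str, message) -> None:
        expected = self._schemas.get(topic)
        # Reject unknown topics and wrongly typed payloads before fan-out.
        if expected is None or not isinstance(message, expected):
            raise TypeError(f"{topic!r} does not accept {type(message).__name__}")
        for queue in self._subscribers[topic]:
            await queue.put(message)


async def demo():
    bus = Bus({"odds.update": OddsUpdate})
    inbox = bus.subscribe("odds.update")
    await bus.publish("odds.update", OddsUpdate("fed-cut", 0.62))
    return await inbox.get()
```

Running `asyncio.run(demo())` returns the published `OddsUpdate`; publishing a plain dict on the same topic would raise before any subscriber sees it.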

Research + attribution loop

The base signal pipeline produces tiered price-divergence signals. The research loop — tier-correlator → autoresearch → sniffer — converts those signals into attributed venue-cohort opportunities:

  1. Tier-correlator mines signal_log hourly for MED patterns that historically preceded HIGH within a window, computes lift, emits tier.lift when lift > 1.5×.
  2. Autoresearch wakes on tier.lift (instead of waiting for its 300s cadence), probes the active frontier models on the matching pattern, emits model.drift if they disagree.
  3. Sniffer joins recent signal structure with model.drift and emits a venue-cohort fingerprint tagged with the likely lagging model.
  4. Dispatcher persists those fingerprints to SQLite and dashboard JSON so the attribution layer is inspectable and testable.

The loop is self-correcting. When a triggered probe successfully promotes a MED to HIGH and resolves in the expected direction, the (pattern, model_version) pair strengthens in the Wiki. When it fails, the lift estimate for that pattern decays faster. No manual tuning — the system finds its own edge and forgets what no longer works.

// CLAWBOX SETUP

ClawBox Setup

ClawBox ships as a native macOS app. It manages the entire Lima VM lifecycle — no CLI required. Once installed, OpenClaw runs inside the Ubuntu 24.04 VM and exposes a gateway at localhost:18789 through a port-forwarded bridge.

macOS only. ClawBox requires macOS 13+ (Ventura or later) with Apple Silicon recommended. Lima requires Homebrew. The VM uses ~4GB RAM when fully running OrganizedMarket's signal stack plus the research loop.

Setup stages

Install ClawBox
Native macOS app via direct DMG or source build. Requires Homebrew for Lima/QEMU. ClawBox handles the rest of the VM lifecycle without a CLI.
Launch VM & install OpenClaw
Click Start VM in ClawBox — provisions Ubuntu 24.04 via Lima in roughly two minutes first-run. OpenClaw installs inside the VM and exposes a gateway on :18789.
Bring in the repo
Clone OrganizedMarket into the VM. Python + Node dependencies install inside the VM sandbox; the host Mac stays untouched.
Wire credentials
Populate the environment file with TastyTrade, Polymarket, Kalshi, Twitter/X, and news-API credentials. Claude routes through OpenClaw OAuth, not an Anthropic key. Optional Slack / Discord webhooks for alert delivery.
Register & start
Register the pipeline agents with the OpenClaw gateway and start the stack from the repo's OpenClaw config. Agent health + uptime are visible through the gateway's status view.

ClawBox Architecture Internals

┌──────────────────────────────────────────┐
│  ClawBox (Tauri + React)                 │
│  Native macOS UI                         │
├──────────────────────────────────────────┤
│  Lima VM Manager                         │
├──────────────────────────────────────────┤
│  Ubuntu 24.04 VM                         │
│  ┌──────────────────────────────────┐    │
│  │  OpenClaw Agent                  │    │
│  │  Browser · Terminal · File Sys   │    │
│  └──────────────────────────────────┘    │
└──────────────────────────────────────────┘
         ▲
         │  Your files stay on your Mac.
         │  You share only what you upload.
         └──────────────────────────────────
// DATA SOURCES

Data Sources

Three primary data sources feed OrganizedMarket: TastyTrade for live financial market data and options flow, Polymarket for binary prediction market odds, and Kalshi for regulated event contracts. Together they form a complete picture of market-implied probabilities versus hard derivative pricing.

TastyTrade API — Financial Market Data

TastyTrade provides a full open REST API with a WebSocket DXLink streamer for real-time quotes. The SDK supports equities, ETFs, options, futures, and futures options. OrganizedMarket uses it primarily for options chain data, implied volatility surfaces, and streaming quotes on macro-sensitive instruments (SPY, QQQ, /ES, /ZQ, TLT).

Polymarket — Prediction Market Odds

Polymarket uses a Central Limit Order Book (CLOB) model. OrganizedMarket pulls live order books and trades for politically and economically sensitive markets — Fed decisions, election outcomes, GDP prints, CPI surprises — and feeds the implied probabilities directly into the correlation engine for comparison against options-implied probabilities.

Kalshi — Regulated Event Contracts

Kalshi is CFTC-regulated, making it the cleanest source of prediction market data for US financial events. OrganizedMarket focuses on Kalshi's Fed rate, CPI, GDP, and jobs report markets — these map directly to instruments TastyTrade can stream.

Source Comparison

Feature | TastyTrade | Polymarket | Kalshi
Data type | Options, equities, futures | Binary outcome markets | Regulated event contracts
Streaming | Yes, DXLink WS | Yes, CLOB WS | Polling (REST)
Regulation | FINRA/SEC | CFTC (prediction mkt) | CFTC regulated ✓
Auth | Session token (24hr) | API key + L2 signing | RSA private key JWT
Free tier | Yes (account required) | Yes (read-only) | Yes (demo env)
Correlation use | Ground truth pricing | Implied probability | Regulated prob. signal
// AGENT PIPELINE

The 6-Agent Pipeline

Each agent runs as an independent OpenClaw sub-process inside the ClawBox VM. They communicate via OpenClaw's internal message bus. The correlator consumes outputs from all data agents simultaneously and scores divergence signals. The dispatcher is the only agent that writes to external systems.

ORGANIZEDMARKET — AGENT DATA FLOW

  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
  │ agent-signal │  │  agent-poly  │  │ agent-kalshi │
  │  TastyTrade  │  │  Polymarket  │  │    Kalshi    │
  │  DXLink/REST │  │  CLOB API    │  │  REST + JWT  │
  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘
         │                 │                  │
         │  quote_update   │  odds_update     │  contract_update
         └─────────────────┼──────────────────┘
                           │
                    ┌──────▼────────────────────────┐
                    │     agent-correlator           │
                    │  cross-venue divergence calc   │
                    │  Pearson · lag · Z-score       │
                    └──────┬────────────────────────┘
                           │
              ┌────────────┴───────────────┐
              │  agent-sentiment           │
              │  Twitter/X · news · Claude │
              │  score: -1.0 → +1.0        │
              └────────────┬───────────────┘
                           │ signal + context + confidence
                    ┌──────▼────────────────────────┐
                    │     agent-dispatcher           │
                    │  confidence gate (>0.75 alert) │
                    │  Slack · Discord · ClawBox UI  │
                    └───────────────────────────────┘

Agent Definitions

agent-signal
trigger: continuous (DXLink WebSocket stream + REST polling every 30s)
inputs: TastyTrade session, watchlist symbols (SPY, QQQ, /ES, /ZQ, TLT, GLD)
actions: Stream real-time quotes via DXLink · fetch option chain greeks every 5min · calculate options-implied probability for event dates
output: → message-bus: { symbol, price, iv_rank, options_implied_prob, timestamp }

agent-poly
trigger: WebSocket stream + REST polling every 15s for active markets
inputs: Polymarket CLOB API · market condition IDs from config watchlist
actions: Poll order books · calculate mid-market yes price · detect large order flow (>$10k notional) · flag sudden odds movement >5% in 60s
output: → message-bus: { market_id, question, yes_prob, volume_24h, odds_delta_1h, timestamp }

agent-kalshi
trigger: REST polling every 60s (no WebSocket on free tier)
inputs: Kalshi REST API · category: monetary-policy, economic-indicators
actions: Fetch open financial markets · extract yes price · track resolution dates · cross-reference against TastyTrade expiry calendar
output: → message-bus: { ticker, title, yes_price, close_time, category, timestamp }

agent-sentiment
trigger: Twitter/X stream continuous · NewsAPI polling every 5min
inputs: Twitter/X v2 filtered stream · NewsAPI financial sources · Claude Sonnet via OpenClaw
actions: Filter X stream by financial keywords + watchlist tickers · Claude scores sentiment per entity (-1 to +1) · extract event signals · calculate aggregate sentiment velocity
output: → message-bus: { entity, sentiment_score, velocity, top_keywords, source_count, timestamp }

agent-correlator
trigger: on every message-bus update from signal/poly/kalshi
inputs: Unified market state from message-bus · sentiment overlay from agent-sentiment
actions: Calculate Pearson correlation between options-implied prob and prediction market yes price · compute time-lag matrix (does sentiment precede price?) · Z-score divergence vs 30-day rolling mean · classify signal tier: HIGH / MED / LOW
output: → dispatcher: { signal_id, instruments, divergence_score, confidence, evidence, tier }

agent-dispatcher
trigger: on correlator output (gate: tier HIGH always; MED requires sentiment confirmation)
inputs: Correlated signal + confidence score + evidence summary
actions: Route HIGH signals to Slack/Discord immediately · queue MED signals for pattern confirmation window · log ALL signals to SQLite for flywheel · generate plain-English summary via Claude
output: Slack/Discord alert · ClawBox dashboard update · signal log entry
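
Two of the agent-poly actions listed above, the mid-market yes price and the "sudden odds movement >5% in 60s" flag, can be sketched in a few lines. This illustration assumes the 5% means five percentage points of probability and measures the move as the max-min range inside the window; `mid_price` and `OddsMoveDetector` are hypothetical names.

```python
from collections import deque


def mid_price(best_bid: float, best_ask: float) -> float:
    """Mid-market yes price from the top of the book."""
    if best_bid > best_ask:
        raise ValueError("crossed book: bid above ask")
    return (best_bid + best_ask) / 2.0


class OddsMoveDetector:
    """Flags a move larger than `threshold` within `window_seconds`."""

    def __init__(self, threshold: float = 0.05, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp_seconds, yes_prob)

    def update(self, timestamp: float, yes_prob: float) -> bool:
        self._history.append((timestamp, yes_prob))
        # Drop observations that have aged out of the window.
        while timestamp - self._history[0][0] > self.window_seconds:
            self._history.popleft()
        probs = [p for _, p in self._history]
        return max(probs) - min(probs) > self.threshold
```

Feeding each new mid price into `update` keeps memory bounded by the window length rather than by the lifetime of the market.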

Message Bus Schema

Full topic-by-topic wiring and Pydantic schema shapes live in the Architecture section's bus-topics table. Every quote, odds update, sentiment event, drift event, tier-lift stat, and final signal flows through one in-process asyncio bus validated against those schemas.

// INTELLIGENCE LAYER

Intelligence Layer

The intelligence layer sits inside agent-correlator and agent-sentiment. It translates raw price feeds and social signals into structured, scored intelligence that a trader can act on.

Correlation Engine

The correlator maintains a rolling state of implied probabilities across all three venues. It computes Pearson correlation between TastyTrade options-implied probabilities and prediction market yes prices, then Z-scores the current divergence against its 30-day history.
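
The two statistics named above are standard. The sketch below computes them with the standard library only, so it runs without NumPy (which the project lists as its numeric dependency); the function names and the choice of population standard deviation are assumptions.

```python
import math
from statistics import mean, pstdev


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def divergence_zscore(history, current: float) -> float:
    """Z-score of the current divergence against its rolling history."""
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0
    return (current - mean(history)) / sigma
```

Here `history` would hold the past gaps between the options-implied probability and the prediction-market yes price, and `current` is the latest gap.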

Twitter/X Sentiment Pipeline

The Twitter/X v2 filtered stream listens for tweets containing financial entities and watchlist ticker symbols. Claude Sonnet scores sentiment per entity and calculates velocity — the rate at which sentiment is shifting, which is often more predictive than absolute level.
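
Velocity here means the rate of change of the sentiment score. One simple way to estimate it, offered as an assumption rather than the pipeline's actual method, is the change between the oldest and newest sample inside a lookback window, expressed per minute.

```python
def sentiment_velocity(samples, window_seconds: float = 300.0) -> float:
    """Score change per minute over the most recent window.

    `samples` is a time-sorted list of (timestamp_seconds, score) pairs
    with scores in [-1, 1]. The 300s default window is hypothetical.
    """
    if not samples:
        return 0.0
    latest_ts, latest_score = samples[-1]
    recent = [(ts, s) for ts, s in samples if latest_ts - ts <= window_seconds]
    first_ts, first_score = recent[0]
    elapsed = latest_ts - first_ts
    if elapsed <= 0:
        return 0.0  # a single sample has no measurable velocity
    return (latest_score - first_score) / elapsed * 60.0
```

A positive value means sentiment is improving; a large magnitude with a small absolute score is the "shifting fast" case the paragraph describes.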

Signal Summary Generation

Before dispatching any alert, Claude generates a plain-English summary that includes data provenance, confidence rationale, and relevant context. The goal is a summary that a trader with no prior context could read and immediately understand the opportunity.

Intelligence Data Flow

SENTIMENT + PRICE → SIGNAL LIFECYCLE

Twitter/X stream ──┐
NewsAPI polling ───┼──→ agent-sentiment ──→ score (-1 to +1)
                   │    (Claude NLP)         velocity calc
                   │
TastyTrade DXLink ─┼──→ options-implied prob
                   │    IV rank + delta
                   │
Polymarket CLOB ───┼──→ yes price (mid)
                   │    order flow delta
                   │
Kalshi REST ───────┘──→ yes price
                        resolution timeline

All streams ──→ agent-correlator
               · Pearson correlation matrix
               · Z-score vs 30-day history
               · Lag analysis (sentiment → price)
               · Confidence + tier scoring
                    │
                 HIGH (>0.75) ───→ immediate alert + Claude summary
                 MED (0.45–0.75) → 15-min confirmation window
                 LOW (<0.45) ────→ flywheel log only
// STACK & CONVENTIONS

Stack & Conventions

Python-first, OpenClaw-native agent architecture. All agents are independent Python processes registered with the OpenClaw gateway. Shared types and utilities live in packages/. Same monorepo conventions as the broader Organized AI codebase.

Runtime

Python 3.12 · OpenClaw :18789 · ClawBox / Lima VM · Ubuntu 24.04 · NumPy · SQLite (signal log)

Key Dependencies

Package | Purpose
tastytrade | Official Python SDK: sessions, option chains, DXLink streamer
py-clob-client | Polymarket CLOB API: order books, markets, trades
tweepy | Twitter/X v2 filtered stream client
websockets | DXLink WebSocket streaming for TastyTrade real-time quotes
numpy | Pearson correlation, rolling statistics, Z-score calculation
cryptography / pyjwt | Kalshi RSA signature auth
newsapi-python | Financial news sources: Reuters, Bloomberg, WSJ aggregation
sqlalchemy | Signal log persistence: all signals stored for flywheel replay
pydantic | Message bus schema validation across all agents
anthropic | Claude Sonnet: sentiment NLP + signal summary (via OpenClaw OAuth)

Agent conventions

Every agent registers with the OpenClaw gateway on startup, subscribes to the bus topics it cares about, publishes only validated Pydantic messages, and persists raw data to SQLite so the flywheel can replay. Only the dispatcher writes to external systems — webhooks fire on HIGH, MED queues for human review, LOW logs only. The OpenClaw config file is the single source of truth for which agents boot, which envs each one reads, and their cadence.

// ROADMAP — NEXT-GEN SIGNALS

Roadmap

The base signal pipeline finds price divergence between venues. The next two agents find participant divergence: who's trading, what stack they are running, and when the frontier of available intelligence shifts under them. Autoresearch, tier-correlator, and the sniffer MVP are wired in the repo today. Named-wallet attribution and a full LLM Wiki remain roadmap work.

Agent 7 · autoresearch — frontier-model drift detector

Probes two Claude versions with the same prompt built from the freshest signal_log entries and rolling-stat context for each symbol, then diffs their decisions. Publishes model.drift events when direction, confidence, or rationale diverges beyond a threshold. The arbitrage thesis: counterparties still running the older model will misprice exactly the setups where the newer model disagrees, and that window closes the day they upgrade.

Inputs
Fresh signal_log summaries plus rolling-stat pairs for each symbol, served as the shared context so only the model id varies across calls.
Output schema
ModelDriftEvent { symbol, model_a, model_b, decision_a, decision_b, confidence_a, confidence_b, divergence_score, rationale_delta } on topic model.drift.
Cadence
Default 300s per probe cycle. Tunable via AUTORESEARCH_CADENCE_SECONDS. Live mode is wired behind AGENT_AUTORESEARCH_LIVE=1 + ANTHROPIC_API_KEY.
Edge window
Hours-to-days after a frontier release. Correlator fuses model.drift into signal tier boost when a drift event lines up with an existing cross-venue gap.
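
The probe-and-diff step can be sketched without any model calls. The ModelDriftEvent fields below follow the output schema listed above; the `Probe` type, the equal weighting of direction and confidence gaps, and the 0.25 threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Probe:
    model: str         # model id used for this call
    decision: str      # e.g. "long" / "short" / "flat" (assumed vocabulary)
    confidence: float
    rationale: str


@dataclass(frozen=True)
class ModelDriftEvent:
    symbol: str
    model_a: str
    model_b: str
    decision_a: str
    decision_b: str
    confidence_a: float
    confidence_b: float
    divergence_score: float
    rationale_delta: str


def diff_probes(symbol: str, a: Probe, b: Probe,
                threshold: float = 0.25) -> Optional[ModelDriftEvent]:
    """Return a drift event if two probes disagree beyond the threshold."""
    direction_gap = 1.0 if a.decision != b.decision else 0.0
    confidence_gap = abs(a.confidence - b.confidence)
    score = 0.5 * direction_gap + 0.5 * confidence_gap  # assumed weighting
    if score < threshold:
        return None
    delta = f"{a.model}: {a.rationale} | {b.model}: {b.rationale}"
    return ModelDriftEvent(symbol, a.model, b.model, a.decision, b.decision,
                           a.confidence, b.confidence, score, delta)
```

Because both probes share the same context and only the model id varies, any non-None result can be attributed to the model difference rather than to the inputs.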

Agent 8 · sniffer — counterparty fingerprinting

Literal counterparty hardware — model, GPU, OS — isn't observable from market data. What is observable are behavioral fingerprints from public data that proxy those things, and that's usually enough. The implemented MVP builds persistent venue cohorts from recent signal structure and drift disagreement, then surfaces counterparty.fingerprint events the correlator can join against drift and sentiment.

Observed inputs
Recent signal events, matching model.drift events, and public market.microstructure observations on the same symbol. The MVP does not invent wallet data; it fingerprints venue cohorts from public pipeline outputs already in hand.
Archetypes
Current classifications are heuristic: api_bot, latency_chaser, mean_reverter, and discretionary. They are inferred from confidence, gap size, z-score, and drift magnitude.
Output
CounterpartyFingerprint { cluster_key, venue, symbol, archetype, likely_model, confidence, evidence } persisted to SQLite and dashboard/data/fingerprints.json.
Drift × fingerprint join
When autoresearch emits a drift event on symbol X, the sniffer tags the freshest venue cohort on X with the likely lagging model based on the weaker side of the disagreement.
Horizon
Short-to-medium — the MVP is useful immediately for venue cohorts. Wallet-level attribution and trade-tape identity remain future expansion work.

Autoresearch × Sniffer — the closed loop

Autoresearch and the sniffer are two halves of the same instrument. Autoresearch maps what frontier models disagree on in a given market state; the sniffer maps which venue cohort looks exposed to that disagreement. Joining them turns drift events into inspectable attribution hints instead of anonymous divergence.

Signal context
Every sniffer output points back to the source signal, venue, gap, z-score, and model disagreement that generated it. The evidence is explicit and inspectable.
Attribution pass
The current attribution combines the live drift disagreement with a persisted llm_wiki_signature history per model / symbol / regime, then tags the active venue cohort with the model that scores as historically weaker in that regime.
Edge projection
When autoresearch emits model.drift on symbol X, the sniffer identifies which venue cohort looks most exposed to the lagging side. That gives the operator a practical lead even before wallet-level attribution exists.
Feedback
Each drift event updates a lightweight llm_wiki_signature table in SQLite. Future work is to enrich that store with resolved outcomes and wallet-level clustering.

Agent 9 · tier-correlator — MED→HIGH lift mining

The signal_log already records every tier transition; tier-correlator is the component that mines it. For each MED pattern, it computes P(HIGH within window W | MED pattern X) / P(HIGH baseline). Patterns with high lift become autoresearch triggers, so autoresearch no longer relies only on its fixed 300s cadence. When a MED with proven lift appears, autoresearch immediately probes frontier-model drift on that exact setup, and agreement promotes it toward HIGH before the window closes. This is how the pipeline manufactures more HIGH-tier arb opportunities instead of waiting for them.

Inputs
Rolling window of signal_log (default 14d). Groups MED signals into pattern buckets keyed on (instruments, divergence_gap bin, z_score bin, sentiment regime) and checks whether a HIGH followed on the same instrument within window W.
Output schema
TierTransitionStat { pattern_key, n_med, n_followed_high, lift, baseline_high_rate, window_seconds, last_med_at } on topic tier.lift. Emitted when lift crosses a promotion threshold (default 1.5×).
Cadence
Slow — default 1h rollup. Tunable via TIER_CORRELATOR_CADENCE_SECONDS. Runs against the DB, not the live bus; this is retrospective analysis, not real-time.
Trigger wiring
Autoresearch subscribes to tier.lift. On each event it caches the pattern_key → expected-lift mapping. When the live correlator next emits a MED signal whose pattern matches, autoresearch fires a targeted probe immediately instead of waiting for its cadence tick.
Learned signature feedback
When a triggered probe runs, the model / symbol / regime signature is updated in llm_wiki_signature. That learned history then feeds the sniffer's next attribution pass. Outcome-weighted decay is still future work.
Sample-size caveat
Lift estimates are noisy until enough MED signals exist per bucket. The live implementation enforces TIER_CORRELATOR_MIN_SAMPLES; early runs may emit nothing until the history window is rich enough.
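The rollup and trigger wiring described above can be sketched in a few dozen lines. The TierTransitionStat fields come from the output schema; everything else is an assumption for illustration: the signal dict keys (including a precomputed followed_high flag), the fixed-width bins, the min_samples default of 20, and the LiftTriggerCache class.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TierTransitionStat:
    pattern_key: tuple
    n_med: int
    n_followed_high: int
    lift: float
    baseline_high_rate: float
    window_seconds: int
    last_med_at: float


def pattern_key(sig, gap_width=0.05, z_width=0.5):
    """Bucket a MED signal; the bin widths are hypothetical defaults."""
    return (
        tuple(sorted(sig["instruments"])),
        int(sig["divergence_gap"] // gap_width),
        int(sig["z_score"] // z_width),
        sig["sentiment_regime"],
    )


def rollup(med_signals, baseline_high_rate, window_seconds,
           min_samples=20, promote_at=1.5):
    """Group MED signals by pattern and return stats that cross the threshold."""
    buckets = defaultdict(list)
    for sig in med_signals:
        buckets[pattern_key(sig)].append(sig)
    stats = []
    for key, sigs in buckets.items():
        n_med = len(sigs)
        if n_med < min_samples or baseline_high_rate <= 0:
            continue  # too few samples for a trustworthy lift estimate
        n_high = sum(1 for s in sigs if s["followed_high"])
        value = (n_high / n_med) / baseline_high_rate
        if value >= promote_at:
            stats.append(TierTransitionStat(
                key, n_med, n_high, value, baseline_high_rate,
                window_seconds, max(s["ts"] for s in sigs)))
    return stats


class LiftTriggerCache:
    """Autoresearch-side cache of pattern_key -> expected lift."""

    def __init__(self):
        self._lift = {}

    def on_tier_lift(self, stat):
        self._lift[stat.pattern_key] = stat.lift

    def should_probe(self, med_signal):
        return pattern_key(med_signal) in self._lift
```

In this sketch a bucket below min_samples is skipped silently, which matches the caveat that early runs may emit nothing.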

Scope boundary

The sniffer only analyses public market data — on-chain wallets, public order books, self-published social metadata. It never probes remote systems, never fingerprints hardware it doesn't own, and never executes trades off its own signals. Everything it emits flows through the same human-review gate as the base pipeline.

agents/autoresearch agents/sniffer model.drift counterparty.fingerprint polymarket-cli kalshi-cli Anthropic SDK LLM Wiki llm_wiki_signature /wiki
// AUTOAGENT — META-LOOP

AutoAgent

Everything up to this point is a runtime pipeline: it ingests data and emits signals. AutoAgent is a separate build-time loop that mutates the pipeline's heuristics against a measurable benchmark. The meta-agent reads a directive, rewrites a single frozen-signature Python function, scores it, and keeps the rewrite only if it beats best-so-far on a rotating benchmark without regressing on a sealed holdout. Inspiration and pattern: AutoAgent & Autoresearch Guide.

System flow

                        ORGANIZEDMARKET — FULL SYSTEM FLOW

    ┌──────────────────────── RUNTIME PIPELINE (agents/) ────────────────────────┐
    │                                                                             │
    │  TastyTrade ──┐                                                             │
    │  DXLink       │   quote.update                                              │
    │  Polymarket ──┼──────────────▶ agent-correlator ─── signal ──▶ agent-      │
    │  (cli)        │   odds.update     (score: z, gap,        (HIGH/MED/LOW)    │
    │  Kalshi    ───┘                    sentiment)                               │
    │  (cli)                                  ▲                        │         │
    │                                         │ sentiment              ▼         │
    │  Twitter/X ──▶ agent-sentiment ─────────┘              agent-dispatcher    │
    │  + News NLP     (Claude Sonnet)                         (webhooks · UI)    │
    │                                         ▲                                   │
    │                                         │ model.drift                       │
    │                                         │ tier.lift                         │
    │                             ┌───────────┴──────────────┐                    │
    │                             │                          │                    │
    │           agent-autoresearch ◀───── tier.lift ── agent-tier-correlator     │
    │           (Opus 4.6 × 4.7)                       (signal_log rollup)        │
    │                   │                                                         │
    │                   │ signatures                                              │
    │                   ▼                                                         │
    │           ╔═══════════════════╗        ┌────────────────────────┐          │
    │           ║     LLM Wiki      ║◀─────── agent-sniffer (roadmap) │          │
    │           ║ model×symbol×rgm  ║          counterparty fingerprint│          │
    │           ╚═══════════════════╝        └────────────────────────┘          │
    │                                                                             │
    └─────────────────────────────────────────────────────────────────────────────┘
                                         ▲
                                         │ manual promotion after N accepted rounds
                                         │ (meta/agent.py → agents/correlator/scoring.py)
    ┌────────────────────── META LOOP (meta/) ───────────────────────────────────┐
    │                                                                             │
    │   meta/program.md  ──▶ meta/mutate.py ──▶ (1) read meta/agent.py           │
    │   (directive,            six-step loop      (2) run meta/eval/harness.py   │
    │   frozen zones,          driver             (3) handoff to Hermes          │
    │   success gate)                             (4) fallback → Gemma on claws  │
    │                                             (5) frozen-zone diff check     │
    │                                             (6) rerun harness → keep/drop  │
    │                                                                             │
    │                          ┌─────────────────────────────────────────┐        │
    │                          │  HERMES   (claws-mac-mini · Codex OAuth) │       │
    │     primary ─────────────▶  handoff.sh → tmux worktree → pull.sh    │       │
    │                          │  edits meta/agent.py in an isolated wt   │       │
    │                          └─────────────────────────────────────────┘        │
    │                                          │                                   │
    │                                          │ on fail (ssh, creds, timeout)    │
    │                                          ▼                                   │
    │                          ┌─────────────────────────────────────────┐        │
    │                          │  GEMMA    (claws · ollama run gemma3:4b) │       │
    │     fallback ────────────▶  ssh claws → stdin prompt → raw body     │       │
    │                          │  driver wraps body with EDITABLE markers │       │
    │                          └─────────────────────────────────────────┘        │
    │                                                                             │
    │   meta/eval/                                     meta/results.tsv           │
    │   ├── harness.py         Brier + tier-agreement  append-only log            │
    │   ├── fixtures/          blended score           (round, mutator, task_id,  │
    │   │   ├── rotating/        per ISO week          rot_score, hold_score,     │
    │   │   └── holdout.jsonl    sealed                accepted, note)            │
    │                                                                             │
    └─────────────────────────────────────────────────────────────────────────────┘
  

Mutation target

First cut targets agents/correlator/scoring.py, the tier-scoring function that decides HIGH / MED / LOW. The hardcoded thresholds (HIGH > 0.75, MED ≥ 0.45), the sigmoid shape, and the sentiment-agreement weighting are all tunable knobs. The meta-agent hill-climbs them against the Brier-calibration + tier-agreement blend in meta/eval/harness.py.
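To make the knobs concrete, here is a minimal sketch of a scorer with the frozen signature and the two thresholds quoted above. The Scored fields, the logistic form, and the steepness, midpoint, and agreement_weight values are all hypothetical; the real function body is not shown in this document.

```python
import math
from dataclasses import dataclass

HIGH_THRESHOLD = 0.75  # HIGH when confidence > 0.75 (from the text)
MED_THRESHOLD = 0.45   # MED when confidence >= 0.45 (from the text)


@dataclass(frozen=True)
class Scored:
    # Hypothetical fields; the source only names the return type.
    confidence: float
    tier: str


def tier_for(confidence):
    if confidence > HIGH_THRESHOLD:
        return "HIGH"
    if confidence >= MED_THRESHOLD:
        return "MED"
    return "LOW"


def score(z, sentiment_score, sentiment_velocity, divergence):
    # ───── EDITABLE ─────
    # Assumed knobs a mutator might tune; sentiment_velocity is unused here.
    steepness, midpoint, agreement_weight = 1.2, 2.0, 0.5
    agrees = sentiment_score * divergence > 0  # sentiment points the same way
    x = abs(z) + (agreement_weight if agrees else -agreement_weight)
    confidence = 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))
    # ───── END EDITABLE ─────
    return Scored(confidence, tier_for(confidence))
```

A mutation round would only touch the lines between the two markers, for example by changing the midpoint or by bringing sentiment_velocity into the logistic input.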

Editable target
meta/agent.py — mirrors the runtime scorer but is decoupled. Only the block between # ───── EDITABLE ───── and # ───── END EDITABLE ───── may be mutated. Signature score(z, sentiment_score, sentiment_velocity, divergence) → Scored is frozen.
Benchmark
Blended score = 0.6·(1−Brier) + 0.4·tier_agreement against labeled fixture rows. Rotating set rolls weekly (fixtures/rotating/2026-Wnn.jsonl); sealed holdout never rotates and acts as the promotion gate.
Accept gate
A candidate is kept iff rotating_score > best AND holdout_score ≥ best. Frozen-zone byte change → automatic reject. Harness exit non-zero → reject + restore baseline.
Mutator
Primary: Hermes on claws-mac-mini via the Claude Code hermes skill — Codex OAuth, no Anthropic SDK in this repo. Fallback: Gemma 3 local via ollama run gemma3:4b over SSH to claws when Hermes is unreachable.
Safety rails
Harness aborts if any AGENT_*_LIVE=1 or if dispatcher webhook envs are set. Mutated code may import only math, statistics, numpy. No network, no file I/O, no subprocess from inside the EDITABLE block.
Audit trail
meta/results.tsv — append-only: round · timestamp · mutator · task_id · rotating_score · holdout_score · accepted · agent_sha · note. Every Hermes round also leaves a full worktree under .hermes/incoming/<task_id>/ for after-the-fact forensics.
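The benchmark and accept gate above reduce to a few small functions. This is a sketch under stated assumptions, not the contents of meta/eval/harness.py: it assumes fixture rows supply a predicted probability, a 0/1 outcome, and predicted and labeled tiers, that "best" is tracked separately for the rotating and holdout sets, and that the frozen zone is everything outside the two marker comments.

```python
BEGIN = "# ───── EDITABLE ─────"
END = "# ───── END EDITABLE ─────"


def brier(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def tier_agreement(predicted, labeled):
    """Fraction of rows where the predicted tier matches the labeled tier."""
    return sum(a == b for a, b in zip(predicted, labeled)) / len(labeled)


def blended_score(probs, outcomes, predicted_tiers, labeled_tiers):
    """0.6 * (1 - Brier) + 0.4 * tier_agreement, as stated in the text."""
    return (0.6 * (1.0 - brier(probs, outcomes))
            + 0.4 * tier_agreement(predicted_tiers, labeled_tiers))


def frozen_parts(source):
    """Return the text before the opening marker and after the closing one."""
    head, _, rest = source.partition(BEGIN)
    _, _, tail = rest.partition(END)
    return head, tail


def frozen_zone_changed(before, after):
    return frozen_parts(before) != frozen_parts(after)


def accept(rotating_score, holdout_score, best_rotating, best_holdout,
           frozen_changed, harness_exit_code):
    """Keep a candidate only if it passes every gate."""
    if frozen_changed or harness_exit_code != 0:
        return False
    return rotating_score > best_rotating and holdout_score >= best_holdout
```

If either marker is missing, frozen_parts treats the whole file as frozen, so any change is rejected. That is a deliberately conservative choice in this sketch.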

Next steps for deployment

  1. Seed real fixtures. The seed JSONL rows in meta/eval/fixtures/ are placeholders. First production round should replace them with a rolling export of signal_log joined to realized outcome probabilities (options expiry, prediction-market resolution) for the trailing 14 days. Add scripts/export_meta_fixtures.py that writes the weekly 2026-Wnn.jsonl every Monday.
  2. Provision Gemma on claws. One-time: ssh claws "ollama pull gemma3:4b". Verify with ssh claws "echo 'ping' | ollama run gemma3:4b". Override the tag via GEMMA_MODEL env if you want a different size.
  3. Schedule the loop. Add a launchd plist or cron entry on the developer Mac that runs python3 -m meta.mutate --rounds 5 nightly. Stream stdout to meta/results.tsv (already append-only) and tail into the dashboard.
  4. Promote accepted mutations. When meta/agent.py beats the previous runtime scorer on both rotating AND holdout across three consecutive rounds, copy the EDITABLE body back into agents/correlator/scoring.py, open a PR, run the full pytest suite, deploy to ClawBox. Manual step by design — the loop discovers, the human promotes.
  5. Dashboard surface. Add a /meta route that reads meta/results.tsv and renders a sparkline of rotating_score + holdout_score over time, with accepted rounds highlighted. Lives alongside /signals and the planned /wiki.
  6. Second mutation target. Once the correlator scorer plateaus, spin up meta/program_autoresearch.md pointed at agents/autoresearch/client_live.py prompt + parser. Same loop, same Hermes/Gemma split, different frozen zone. The meta/ directory is designed to hold multiple concurrent targets.
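For step 1, the weekly file name can be derived from the ISO calendar. The helper below is a hypothetical fragment of what scripts/export_meta_fixtures.py might contain, and it assumes the week number in 2026-Wnn.jsonl is zero-padded to two digits.

```python
from datetime import date


def rotating_fixture_name(day):
    """Build the rotating fixture file name for the ISO week containing `day`."""
    iso_year, iso_week, _weekday = day.isocalendar()
    return f"{iso_year}-W{iso_week:02d}.jsonl"


# Monday 5 January 2026 falls in ISO week 2 of 2026.
name = rotating_fixture_name(date(2026, 1, 5))
```

Using the ISO year rather than the calendar year matters around New Year: Monday 29 December 2025 belongs to ISO week 1 of 2026, so its file would be 2026-W01.jsonl.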
meta/program.md meta/mutate.py Hermes (Codex OAuth) Gemma 3 · ollama Brier + tier-agreement rotating + sealed holdout meta/results.tsv
// DEPLOY & RUN

Deploy & Run

Two deploy targets: the ClawBox agent stack (runs locally inside the VM) and the CF Pages dashboard (public-facing signal feed + docs, deployed via Wrangler CLI). No CI/CD required — run from your Mac.

Deployment stages

Credentials
Populate the environment file with TastyTrade, Polymarket, Kalshi, Twitter/X, and news-API credentials. Never commit the secrets file; keep it and the key directory in .gitignore.
Sandbox first
Point TastyTrade at its certification environment and Kalshi at its demo environment for the first run. Validate signal quality against recent historical events before swapping to production endpoints.
Start the stack
Inside the ClawBox VM, register the pipeline agents with OpenClaw and start everything from the repo's OpenClaw config. Agent status, logs, and live signal feed are all visible through the gateway.
Deploy the dashboard
The CF Pages dashboard (this page, the live-signals feed, and the docs) deploys via Wrangler from the host Mac. Custom domain is optional — the generated *.pages.dev URL works on day one.
Optional relay Worker
If running a Cloudflare Worker alongside the dashboard for live signal relay, push the OpenClaw URL and Codex token as Wrangler secrets so the Worker can proxy authenticated requests to the in-VM gateway.

Architecture Options

Option              A · Cloud only    B · Local only       C · ClawBox ✓     D · ExoClaw bridge
Claude cost         High ($20+/mo)    Free (local model)   Low ($4–8/mo)     Free
Signal quality      Best              Good                 Best              Good
Setup complexity    Low               Medium               Low               High
Reliability         Best              Mac-dependent        Good              Mac-dependent
Isolation           Full              None                 Full (Lima VM)    Partial

Operational Checklist

Never auto-trade
OrganizedMarket surfaces signals only. All alerts include confidence score and provenance. Trading decisions are human-only.
Sandbox first
Run TastyTrade sandbox + Kalshi demo for minimum 48 hours before switching to production credentials. Validate signal quality against known historical events.
Rotate TastyTrade sessions
Session tokens expire every 24 hours. agent-signal handles automatic re-auth. Monitor for auth failures in OpenClaw logs.
Twitter API rate limits
The v2 filtered stream allows one connection per account, so do not start multiple agent-sentiment instances. Basic tier: 500,000 tweets/month read.
Signal log review
Review SQLite signal log weekly. Feed high-confidence signals that preceded market moves into the flywheel for correlation weight tuning.
ClawBox VM snapshots
Snapshot the Lima VM before any major config change. Restore takes <60 seconds. Store snapshots outside ClawBox working directory.

Quick Links

Resource          URL                                        Notes
GitHub            github.com/Organized-AI/organizedmarket    agents, scripts, dashboard
TastyTrade API    developer.tastytrade.com                   REST docs + DXLink WS
Polymarket Docs   docs.polymarket.com                        CLOB API reference
Kalshi API        kalshi.com/docs/api                        REST + auth guide
ClawBox Repo      github.com/coderkk1992/clawbox             source + releases
OpenClaw          github.com/openclaw/openclaw               gateway + agent registry