Simulator Fidelity

The set of dimensions on which the backtest accounting in this repo matches — and does not match — what a real account on a real perp venue would experience for the same trade signals.

In plain English

When the leaderboard reports a strategy's return, that number is what the simulator's accounting produced. Whether a real perp account, fed the identical entry/exit signals, would have produced the same outcome depends on which aspects of "real trading" the simulator faithfully models and which it does not. This concept catalogs both columns so headline numbers can be interpreted correctly.

The short version, which the rest of this page expands:

The simulator MODELS: a flat 0.17% fee per leg (charged on both the entry and the exit — its only friction), liquidation (the venue force-closing a position when losses eat through its margin), and leverage (trading a position larger than the cash backing it).
The simulator does NOT model: the funding rate (a periodic payment between the long and short holders of a perpetual future), slippage (the gap between the price you expect and the price you actually fill at), and the maker/taker split (different fees for adding liquidity vs removing it).

Funding is the load-bearing gap. Funding penalizes hold time — the longer you sit in a position, the more funding ticks you pay. And the only rows in this dossier with any measurable edge are the long-hold 4h and 1h swing strategies (slow timeframes, positions held for many hours or days). Those are exactly the rows a missing funding model flatters most: the simulator lets them hold for free, when a real account would bleed funding the whole time. So the strategies that look best are the ones whose results are most inflated by the gap.

What the simulator DOES model (verified by code read 2026-05-17)

Aspect	Where	Notes
Per-(strategy, symbol) equity ledger	`apps/backend/src/evaluation/equity-ledger.ts:14`	Each pair starts at `initialEquity` ($200), evolves via `apply(netPnl)` on each closed trade. Independent per pair.
Equity-scaled position sizing	`evaluator.ts:1340-1342`	`positionSizeUsd = equity × positionSizePct/100` in equityMode. Losses shrink next position; wins grow it. Real compounding.
Soft margin floor	`evaluator.ts:1343-1379`, `MIN_POSITION_USD = $1`	If the computed position is below $1, the entry signal is written with `fillStatus: 'margin_insufficient'` and no position opens. At a 2% position size this triggers once equity drops below $50.
Liquidation distance	`evaluator.ts:143-158`	At leverage L, liquidation triggers on a 1/L adverse intra-candle move (long: low ≤ entry × (1 − 1/L); short: high ≥ entry × (1 + 1/L)).
Trading fees	`feeBps` on both entry and exit legs (`packages/shared/src/index.ts:304-318`)	A flat 0.17% per leg (charged on entry AND exit), applied to gross PnL. This is the simulator's only modeled friction.
Position lifecycle	`evaluation/position/position.ts`	Entry → exit (signal or TP/SL or liquidation), with frozen entry primitives.
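
The sizing, margin-floor, and fee rows above can be sketched together as follows. This is a minimal illustration, not the repo's code: the function names, the `"filled"` status value, and the choice to charge both fee legs on the same notional are assumptions made for the example.

```typescript
// Hypothetical sketch; names and the "filled" status are assumptions.
const MIN_POSITION_USD = 1; // soft margin floor from the table above

type SizingResult =
  | { fillStatus: "filled"; positionSizeUsd: number }
  | { fillStatus: "margin_insufficient" };

// Equity-scaled sizing: the position is a fixed percentage of current equity,
// so losses shrink the next position and wins grow it.
function sizePosition(equity: number, positionSizePct: number): SizingResult {
  const positionSizeUsd = equity * (positionSizePct / 100);
  if (positionSizeUsd < MIN_POSITION_USD) {
    return { fillStatus: "margin_insufficient" };
  }
  return { fillStatus: "filled", positionSizeUsd };
}

// Flat per-leg fee, assumed here to be charged on the same notional for
// both the entry leg and the exit leg.
function roundTripFeeUsd(notionalUsd: number, feeBps: number): number {
  return 2 * notionalUsd * (feeBps / 10000);
}

const fresh = sizePosition(200, 2);  // $4 position on the initial $200
const starved = sizePosition(49, 2); // $0.98 is under $1, so no position opens
```

With `feeBps = 17` (0.17% per leg), `roundTripFeeUsd(1000, 17)` comes to about $3.40 on a $1,000 notional.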

What the simulator does NOT model — the gaps to "real account"

1. Leverage DOES amplify PnL — but it never changes which trades fire

A common worry is that the simulator might "ignore" leverage. It does not. Leverage is applied correctly: the pure helper `computePnl` (`packages/shared/src/index.ts:304-318`) takes its size argument as the leveraged notional (the full position size, not the cash margin behind it), and the leverage multiply happens at the caller boundary in `apps/backend/src/evaluation/position/position.ts:198`. A position at higher leverage produces proportionally larger P&L on the same price move, as expected.

The subtle and important fact is what leverage does not touch: the set of trades that fire. Leverage scales the size of each win and loss, and therefore the drawdown — but it never moves when the EMA crossover triggers, so it never changes which trades happen.

The 1×≡2× identity pair (from this dossier) proves it. Two strategies in run 83 are the same EMA 21/50 SOL 1h long, differing only in leverage:

id659 — 1× leverage.
id523 — 2× leverage.

They are the same 436 trades: identical 139 wins and 297 losses, identical 31.9% win rate (the share of trades that closed in profit). Leverage changed only the drawdown (the deepest peak-to-trough equity drop): −5.06% at 1× versus −9.9% at 2×. Same trades, scaled losses. This is the cleanest demonstration in the fleet that leverage is a magnitude knob, not a selection knob — it never altered the strategy's actual decisions, only how hard each decision hit the account.
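
The separation between signal and magnitude can be sketched as below. This is a hedged illustration under stated assumptions: `grossPnl`, `crossedUp`, and `tradePnl` are hypothetical stand-ins, not the signatures of `computePnl` or the evaluator, and the $4 margin and 5% move are made-up inputs.

```typescript
type Side = "long" | "short";

// Hypothetical stand-in for a PnL helper: sizeUsd is the leveraged notional.
function grossPnl(entryPrice: number, exitPrice: number, sizeUsd: number, side: Side): number {
  const move = (exitPrice - entryPrice) / entryPrice;
  return sizeUsd * (side === "long" ? move : -move);
}

// The signal sees only the two EMA series, so leverage cannot change it.
function crossedUp(prevFast: number, prevSlow: number, fast: number, slow: number): boolean {
  return prevFast <= prevSlow && fast > slow;
}

// Caller boundary: margin times leverage becomes the notional passed to the helper.
function tradePnl(marginUsd: number, leverage: number, entryPrice: number, exitPrice: number): number {
  return grossPnl(entryPrice, exitPrice, marginUsd * leverage, "long");
}

const pnl1x = tradePnl(4, 1, 100, 105); // about $0.20
const pnl2x = tradePnl(4, 2, 100, 105); // about $0.40: same trade, twice the magnitude
```

Because `crossedUp` has no leverage parameter, the trade list is identical at every leverage; only the value returned by `tradePnl` scales.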

2. No maintenance margin / partial liquidation cascade

Real perps progressively deleverage a position as its equity approaches the maintenance margin (typically 0.5–2% of notional). The simulator applies only the binary 1/L liquidation rule, so a position is either fully open or fully liquidated.
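
For contrast, the binary rule the simulator does apply can be sketched as follows, using the liquidation-distance formula from the table earlier on this page. The type and function names are hypothetical; only the 1/L thresholds come from the text.

```typescript
type PositionSide = "long" | "short";

interface Candle {
  high: number;
  low: number;
}

// Liquidation price under the 1/L rule: a long is wiped out by a 1/L drop,
// a short by a 1/L rise.
function liquidationPrice(entry: number, leverage: number, side: PositionSide): number {
  return side === "long" ? entry * (1 - 1 / leverage) : entry * (1 + 1 / leverage);
}

// Binary check against the candle's intra-candle extremes. There is no
// intermediate maintenance-margin state and no partial deleveraging.
function isLiquidated(candle: Candle, entry: number, leverage: number, side: PositionSide): boolean {
  const liq = liquidationPrice(entry, leverage, side);
  return side === "long" ? candle.low <= liq : candle.high >= liq;
}

// At 50x, a long entered at 100 is liquidated near 98 (a 2% adverse move).
const wiped = isLiquidated({ high: 101, low: 97.5 }, 100, 50, "long");
const survived = isLiquidated({ high: 101, low: 98.5 }, 100, 50, "long");
```

A venue with maintenance margin would begin closing the position before the price reached this threshold, which is the gap this section describes.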

3. No funding rate — the load-bearing gap

Perps charge or credit funding (a periodic payment between long and short holders of a perpetual future) on a regular cadence, typically every 8 hours. Direction depends on the perp-vs-spot price gap. The simulator charges zero funding cost.

This is the single most consequential omission in the whole dossier, and the reason is structural. Funding penalizes hold time — every funding tick a position sits through is another payment. The only strategies in this fleet with any measurable edge are the long-hold 4h and 1h swings, which hold positions for many hours or days and so would absorb the most funding. So the missing funding model flatters exactly the rows that look best. A fast scalp that's in and out within a single funding window barely cares about funding; a multi-day swing cares enormously — and the swings are the dossier's headline performers. Read every edge in this dossier knowing its real-account version would be dragged down by funding the simulator never charged.
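
A rough way to see why hold time dominates is to count the funding ticks a position sits through. The sketch below is an estimate under loudly stated assumptions: an 8-hour cadence with ticks on multiples of 8 hours since the Unix epoch, a constant rate per tick, and a made-up rate of 0.01%. Real rates vary by venue and over time and can be negative (a credit to longs); none of these names exist in the repo.

```typescript
const FUNDING_INTERVAL_MS = 8 * 60 * 60 * 1000; // assumed 8-hour cadence
const HOUR_MS = 60 * 60 * 1000;

// Number of funding timestamps in (entryMs, exitMs], assuming ticks fall on
// multiples of the interval since the Unix epoch.
function fundingTicksCrossed(entryMs: number, exitMs: number): number {
  return Math.floor(exitMs / FUNDING_INTERVAL_MS) - Math.floor(entryMs / FUNDING_INTERVAL_MS);
}

// Cost to the position holder, assuming a constant rate per tick.
// With a positive rate longs pay and shorts receive (negative cost).
function fundingCostUsd(
  notionalUsd: number,
  ratePerTick: number,
  ticks: number,
  side: "long" | "short",
): number {
  const signed = notionalUsd * ratePerTick * ticks;
  return side === "long" ? signed : -signed;
}

// A 30-minute scalp entered at 01:00 crosses no funding tick.
const scalpTicks = fundingTicksCrossed(1 * HOUR_MS, 1.5 * HOUR_MS);
// A 3-day swing entered at 01:00 crosses nine.
const swingTicks = fundingTicksCrossed(1 * HOUR_MS, 73 * HOUR_MS);

// Hypothetical 0.01% per tick on an $8 notional long.
const swingCost = fundingCostUsd(8, 0.0001, swingTicks, "long");
```

Under these assumptions the scalp pays nothing and the swing pays nine ticks' worth, which is the asymmetry described above.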

3a. No slippage, no maker/taker split

The backtest fills at the candle's close price with no slippage (no gap between the expected and actual fill price) and no maker/taker split (it does not distinguish the cheaper "maker" fee for adding liquidity from the pricier "taker" fee for removing it). Both push real-account costs above what the simulator charges, in the same direction as funding.
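
The slippage half of this gap can be illustrated with a simple adverse-fill model. This is only a sketch: a fixed slippage in basis points is an assumption chosen for clarity (real slippage depends on order size and book depth), the function names are hypothetical, and the maker/taker split is not modeled here.

```typescript
type OrderAction = "buy" | "sell";

// Adverse fill: buys fill above the candle close, sells fill below it.
function slippedFillPrice(closePrice: number, slippageBps: number, action: OrderAction): number {
  const s = slippageBps / 10000;
  return action === "buy" ? closePrice * (1 + s) : closePrice * (1 - s);
}

// Gross return of a long round trip, before fees, with slippage on both legs.
function longRoundTripReturn(entryClose: number, exitClose: number, slippageBps: number): number {
  const entryFill = slippedFillPrice(entryClose, slippageBps, "buy");
  const exitFill = slippedFillPrice(exitClose, slippageBps, "sell");
  return (exitFill - entryFill) / entryFill;
}

const idealReturn = longRoundTripReturn(100, 101, 0);   // fills at the close, as the simulator does
const slippedReturn = longRoundTripReturn(100, 101, 5); // hypothetical 5 bps of slippage per leg
```

With zero slippage the function reproduces the simulator's close-price fill; any positive slippage lowers the return of the same trade.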

Refined 2026-06-22: funding does NOT flatter the buy-and-hold comparison; it cuts the other way. A tempting argument is "we don't charge funding, so we're unfairly inflating buy-and-hold." It's backwards. The buy-and-hold benchmark is a spot-like 1× hold, which pays no funding in reality, so its number is honest. The strategies hold perp positions across funding ticks but are charged nothing, so in reality a leveraged perp strategy held through a hot-funding bull would do worse than the sim shows, while spot hold is unchanged. Net: real funding widens the gap against hold-heavy strategies rather than narrowing it. The funding gap is also non-uniform: it scales with average holding time, hitting slow trend-followers hardest and fast scalpers least. The "missing variable" that actually makes the buy-and-hold comparison less stark is risk-adjusted return (hold's hidden 70–95% drawdown), not funding. Full reasoning: wiki/qa-sessions/2026-06-22-session.md#q1; concept: funding rate.

4. RESOLVED — Equity can go negative

Resolved in phase 121.1; see bankruptcy. The simulator now enforces a configurable `equityBankruptcyFloor` (default $0, with percent-of-initial-equity support) per (strategyId, symbol). A closing trade whose debit crosses the floor is tagged `exit_reason: 'bankruptcy'`, and subsequent entry signals are recorded with `fillStatus: 'account_bankrupt'`. The original gap description below is retained as historical record.

The $1 floor blocks new entries when 2% of equity drops below $1, but it does not prevent equity from being driven negative by debits from already-open positions or by accumulated losses. A real perp account would have been force-closed long before equity reached zero.
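
The resolved behaviour (a floor, a bankruptcy tag on the crossing trade, and blocked entries afterwards) might look roughly like the sketch below. It is an illustration, not the phase 121.1 implementation: the `PairLedger` shape and function names are hypothetical, treating "equity at or below the floor" as crossing is an assumption, and the sketch leaves equity unclamped because the text does not say whether the real ledger clamps it.

```typescript
interface PairLedger {
  equity: number;
  bankrupt: boolean;
}

// Apply a closed trade's net PnL to one (strategyId, symbol) ledger.
// Assumption: a debit that leaves equity at or below the floor counts as crossing it.
function applyClose(
  ledger: PairLedger,
  netPnl: number,
  floorUsd: number = 0,
): { ledger: PairLedger; exitReason: "bankruptcy" | null } {
  const equity = ledger.equity + netPnl;
  const crossed = netPnl < 0 && equity <= floorUsd;
  return {
    ledger: { equity, bankrupt: ledger.bankrupt || crossed },
    exitReason: crossed ? "bankruptcy" : null,
  };
}

// Once a pair is bankrupt, later entry signals are recorded but not filled.
function entryFillStatus(ledger: PairLedger): "account_bankrupt" | null {
  return ledger.bankrupt ? "account_bankrupt" : null;
}

const start: PairLedger = { equity: 200, bankrupt: false };
const first = applyClose(start, -150);        // equity 50, still solvent
const second = applyClose(first.ledger, -60); // equity -10 crosses the $0 floor
```

After `second`, `entryFillStatus(second.ledger)` returns `'account_bankrupt'`, so the pair stops opening positions instead of drifting further negative.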

5. No portfolio effects

Each (strategy, symbol) pair has its own independent $200. No shared bankroll, no correlated drawdown, no cross-pair margin call. Real account: all positions draw from the same equity pool, so losses on one strategy reduce capital available to others.

6. No trader behavior

No size-down after drawdown, no capitulation, no margin top-up, no human discretion overriding signal at panic moments. The simulator continues executing programmatic signals until the $1 floor stops new entries.

7. No regime-aware fills

Real venues see liquidity vanish in fast markets (flash crashes, news events), so fills slip badly exactly when it hurts most. The simulator fills at the candle close with no slippage at all, so it never reflects order-book collapse during high-volatility moments.

Why this matters for interpreting fleet metrics

Different fleet metrics have different fidelity-sensitivity:

Metric	Affected by simulator gaps?
Win Rate (WR)	Mostly accurate — entry/exit logic is faithful.
Profit Factor (PF)	Mostly accurate — same trades, same per-trade PnL ratios.
Sharpe Ratio	Mostly accurate — unitless ratio of return to volatility.
Trade Count (N)	Accurate.
Net PnL ($)	Correctly leveraged at the trade level. Distorted only by missing funding, maintenance-margin cascades, and venue minimums — not by leverage accounting.
Max DD$	Correctly leveraged. Real DD$ at high leverage is amplified by funding drag and partial-liquidation cascades, which the simulator does not model.
`/equity` ratio (RoR proxy)	Useful as a lower bound on risk — real risk is at least this bad, primarily due to venue-mechanics gaps (funding, maintenance margin, min order size).
Expectancy / R-multiple	Sizing-dependent; interpret only at active config snapshot.

Practical takeaways for strategy evaluation

Leverage IS applied to PnL, but never changes which trades fire. Higher leverage scales each win, loss, and the drawdown — the id659 (1×) ≡ id523 (2×) identity pair shows the same 436 trades at both leverages, with drawdown −5.06% vs −9.9%.
Trust win rate, profit factor (gross wins ÷ gross losses), Sharpe (return per unit of volatility), trade count, and the per-trade PnL ratios at the trade-accounting level across leverages. The remaining gaps are venue-mechanics, not leverage math.
Every cost figure in this dossier is a lower bound on real-world cost, which makes every reported edge an upper bound on the real one. The unmodeled gaps (funding on long-hold positions, slippage, the maker/taker split) push real costs up, so a real account's outcome is at least this bad and usually worse.
Funding is the dominant unmodeled hazard. Because it penalizes hold time, it hits the long-hold 4h/1h swing rows hardest — and those are the only rows with any measured edge, so the gap flatters exactly the strategies that look best.

Examples from the live fleet

id659 ≡ id523 (the 1×≡2× identity pair): both are EMA 21/50 · SOL · 1h · long, differing only in leverage (1× vs 2×). They share the same 436 trades (139 wins / 297 losses, 31.9% win rate, edge SIGNIFICANT). Leverage scaled only the drawdown: −5.06% at 1× vs −9.9% at 2×. This pair is the clean proof that leverage is a magnitude knob, not a selection knob: the trade-level accounting is faithful, so it is not where the danger hides.
id523 (the swing with a real in-sample edge) — EMA 21/50 · SOL · 1h · 2× · long. Profit factor 1.46, Sharpe 0.110, payoff ratio ≈ 3.13 (the average win is about 3.13× the average loss). It has a genuine in-sample edge — but it is a 1h swing that holds positions for hours, so it is exactly the kind of row the missing funding model flatters most. Its real-account version would carry funding drag the dossier never charged. The trade accounting is trustworthy; the completeness of the cost model is not.
id522 (where leverage masquerades as skill) — EMA 21/50 · ETH · 4h · 50× · long. It posts alpha +8123pp and is the only edge-significant row that beat buy-and-hold — but only because 50× leverage amplified the bull-market beta, paired with a brutal −60.9% max drawdown. A faithful leverage multiply does not make this a "good" strategy; it makes the risk honest. Leverage is doing the work, not skill.

Related concepts

risk of ruin
leverage
liquidation
drawdown
path dependence

Sources

wiki/qa-sessions/2026-05-17-session.md#q12 (first asked here)
apps/backend/src/evaluation/equity-ledger.ts
apps/backend/src/evaluation/evaluator.ts:143-158, 1340-1379
packages/shared/src/index.ts:304-318 (computePnl)
apps/backend/src/evaluation/position/position.ts:198,291 (caller-side leverage multiplication)
.planning/debug/resolved/leverage-ignored-in-pnl.md (Resolved 2026-05-17 — section 1 above corrected)
