Concept · Is it a real edge, or luck?

Edge

A true, repeatable statistical advantage in a strategy's rules over the market. Edge is the underlying truth that metrics like win rate, profit factor, and Sharpe estimate but never observe directly.

In plain English

If a strategy has edge, it will — on average over enough trades — make money. If it has no edge, it will eventually lose money (or break even minus costs).

You cannot see edge in any single backtest. You can only estimate it from the observed metrics. The larger your sample, the closer your estimate tends to be to the truth. A strategy with a profit factor (PF) of 1.5 over 1,000 trades almost certainly has real edge. A strategy with PF 1.5 over 30 trades might just be lucky.

This distinction — between observed performance and true edge — is the central epistemic question in strategy evaluation.
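The sample-size point can be checked with a small simulation. The sketch below is illustrative only: it assumes a hypothetical zero-edge strategy whose per-trade returns are independent draws from a normal distribution with mean 0 (not how the dossier's engine models trades), and it counts how often luck alone produces a profit factor of 1.5 or better. The function names and the run count are made up for this example.

```python
import random


def profit_factor(returns):
    """Gross profit divided by gross loss for a list of per-trade returns."""
    gross_profit = sum(r for r in returns if r > 0)
    gross_loss = -sum(r for r in returns if r < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def lucky_fraction(n_trades, n_runs=1000, threshold=1.5, seed=0):
    """Share of zero-edge backtests whose PF still reaches the threshold.

    Assumption: trades are i.i.d. normal with mean 0, i.e. no edge at all.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_runs):
        trades = [rng.gauss(0.0, 1.0) for _ in range(n_trades)]
        if profit_factor(trades) >= threshold:
            hits += 1
    return hits / n_runs


frac_30 = lucky_fraction(30)
frac_1000 = lucky_fraction(1000)
print(f"PF >= 1.5 by luck over 30 trades:   {frac_30:.1%}")
print(f"PF >= 1.5 by luck over 1000 trades: {frac_1000:.1%}")
```

Under these assumptions a noticeable share of the 30-trade runs reaches PF 1.5 by chance, while the 1,000-trade runs almost never do. Real trade returns are neither normal nor independent, so treat the exact percentages as indicative only.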

Why it matters for this fleet

Of 210 EMA Cross variants in this dossier, only 5 clear the edge test — and they resolve to just 3 distinct signals (all "21/50 crossover going long": SOL 1h, SOL 4h, ETH 4h). Here is the brutal part. Across 210 correlated variants, ~11 rows would pass an edge test by pure chance. Five is below eleven. So after the multiple-comparisons haircut (the correction you apply when you run many tests at once, because some will look significant just by luck), no edge in this fleet is distinguishable from luck.
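The "~11 by chance" figure is consistent with a 5% significance level, since 210 × 0.05 = 10.5. That threshold is an assumption here; the text above does not state which level the edge test uses. The sketch below reproduces the arithmetic and, under the further simplifying assumption that the 210 tests are independent (they are not, since the variants are correlated), estimates how likely it is to see five or more passes when no variant has any edge.

```python
from math import comb

N_VARIANTS = 210   # variants tested, from the text
ALPHA = 0.05       # ASSUMED per-test false-positive rate
OBSERVED = 5       # rows that cleared the edge test, from the text

# Expected number of rows that pass purely by chance.
expected_false_positives = N_VARIANTS * ALPHA


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); assumes independent tests."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_five_or_more = prob_at_least(OBSERVED, N_VARIANTS, ALPHA)

print(f"expected chance passes: {expected_false_positives:.1f}")
print(f"P(at least {OBSERVED} passes with zero edge): {p_five_or_more:.3f}")
```

Correlation between variants leaves the expected count unchanged but widens the spread around it, so the independent-tests probability is only a rough guide.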

The signals that suggest real edge:

Large sample size: more trades, sharper estimate.
A clearing significance test: the win rate or Sharpe is far enough from "no edge" that the sample can rule out luck.
Theoretical justification: the rule should work because of some structural reason X, not just because it did work.
Out-of-sample validation: the reserved real exam, not yet run on this fleet.

The signals that suggest lucky variance:

Low trade count.
Edge that appears only at extreme leverage (amplified market exposure, not skill).
Performance that vanishes in slightly different windows or symbols.

Examples from the live fleet

Three rows make the point that a clearing significance test still does not make a strategy good.

id523 (EMA 21/50 · SOL · 1h · 2× · long): N=436 trades, win rate (the share of trades that close in profit) 31.9%, profit factor (gross profit divided by gross loss) 1.46, Sharpe 0.110, edge SIGNIFICANT. This is a real in-sample edge — and yet it lost badly to simply holding SOL: alpha (return above buy-and-hold) of −2013pp (percentage points). A genuine statistical edge that still trailed the market by a mile.
id522 (EMA 21/50 · ETH · 4h · 50× · long): the only edge-significant row that beat buy-and-hold (alpha +8123pp). But it did so at 50× leverage — it beat the market through 50× amplified beta (raw market exposure), not skill. N=117, win rate 22.2%, max drawdown (worst peak-to-trough equity drop) −60.9%.
id511 (EMA 21/50 · BTC · 1h · 2× · long): the near-miss. N=469, win rate 24.9%, tightly pinned (Wilson confidence interval ±3.9pp). The win rate is measured precisely, but the edge is NOT-significant: to prove this thin a Sharpe (0.020) you would need N ≥ 9,213 trades, and it has 469. Same row, opposite answers to the two confidence questions: is the win rate measured precisely (yes), and is the edge distinguishable from zero (no).
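The two confidence questions in these rows can be sketched numerically. The code below is a rough reconstruction, not the dossier's actual method: it assumes a 95% level (z = 1.96), that the quoted Sharpe is a per-trade figure (mean trade return divided by its standard deviation), and that significance is judged with a simple z-test, so the statistic is Sharpe × √N and the trade count needed is about (z / Sharpe)². The function names are hypothetical.

```python
import math

Z_95 = 1.96  # ASSUMED two-sided 95% critical value


def wilson_interval(win_rate, n, z=Z_95):
    """Wilson score interval for a win rate observed over n trades."""
    denom = 1 + z * z / n
    center = (win_rate + z * z / (2 * n)) / denom
    half = z * math.sqrt(win_rate * (1 - win_rate) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def sharpe_z_stat(sharpe, n):
    """z-statistic for 'mean trade return > 0', given a per-trade Sharpe."""
    return sharpe * math.sqrt(n)


def trades_needed(sharpe, z=Z_95):
    """Approximate trade count at which this Sharpe would reach significance."""
    return math.ceil((z / sharpe) ** 2)


# id511: win rate 24.9% over 469 trades, Sharpe 0.020
lo, hi = wilson_interval(0.249, 469)
print(f"id511 win-rate half-width: {(hi - lo) / 2 * 100:.1f}pp")
print(f"id511 z-stat: {sharpe_z_stat(0.020, 469):.2f} (needs > {Z_95})")
print(f"id511 trades needed: about {trades_needed(0.020)}")

# id523: Sharpe 0.110 over 436 trades
print(f"id523 z-stat: {sharpe_z_stat(0.110, 436):.2f} (needs > {Z_95})")
```

With the rounded inputs, this simple formula puts the required sample for id511 in the mid-9,000s, the same order as the quoted 9,213. The gap presumably comes from rounding in the published Sharpe or from a different test in the engine; the text does not say which.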

Same "this looks promising" first impression — opposite epistemic status, and even the winners aren't worth deploying.

How to test for edge

Family pooling — same rule across leverages should produce the same edge. If it does, the edge is real. (Leverage scales drawdown and PnL but never changes which trades fire: id523 at 2× and id659 at 1× are the identical 436 trades.)
Symbol consistency — same rule across symbols should preserve edge if the signal is symbol-independent (i.e. the rule's edge does not depend on which symbol it's applied to). If it doesn't, the edge is symbol-specific (still real, but narrower).
Out-of-sample — same rule on a different time period should preserve edge. This is the gold standard. Everything in this dossier is in-sample (selected and measured on the same history); no hold-out re-test has been run.
Walk-forward — re-fit on a moving window and validate on the next window repeatedly. Catches overfitting.
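The walk-forward idea in the last item can be sketched as a generator of train/test index windows. This is a minimal illustration under stated assumptions: bars are indexed 0..n-1 in time order, the window sizes are hypothetical parameters, and the fitting and scoring steps are left to whatever engine is in use.

```python
def walk_forward_windows(n_bars, train_size, test_size):
    """Yield ((train_start, train_end), (test_start, test_end)) index pairs.

    End indices are exclusive. Each test window starts where its training
    window ends, and the whole frame then slides forward by test_size, so
    test windows never overlap and never precede their training data.
    """
    start = 0
    while start + train_size + test_size <= n_bars:
        train = (start, start + train_size)
        test = (start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


# Example: 10 bars, fit on 4, validate on the next 2, then slide forward.
for train, test in walk_forward_windows(10, 4, 2):
    print(f"fit on bars {train[0]}..{train[1] - 1}, "
          f"validate on bars {test[0]}..{test[1] - 1}")
```

A plain out-of-sample test is the special case of a single split: one training window followed by one held-out window that is never used for selection.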

Edge vs. PnL — the most common confusion

PnL is what happened. Edge is what would happen if you ran the strategy forever. The PnL of any single backtest is one sample from a distribution; edge is the mean of that distribution.

A high-PnL backtest with no edge will revert to losing in live trading. A modest-PnL backtest with real edge will compound over time. Identifying the latter is the goal of this entire exercise.

Sources

wiki/qa-sessions/2026-05-17-session.md#q2 (first asked here)
wiki/2026-05-17-ema-cross-symbol-breakdown.md
