What Is It?
A new signal is useful only to the extent that it adds information not already present in your existing book. Independence, or at least low correlation of forecast errors, is what turns many small alphas into a stronger combined alpha.
This is why quants spend so much time measuring overlap. Two signals can look different on paper while loading on the same hidden driver. If that driver breaks, both signals fail together.
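The link between correlation and combined strength can be made concrete with a stylized formula. It rests on simplifying assumptions that are not in the text above: N standardized signals, each with the same information coefficient IC against returns, a common pairwise correlation rho between any two signals, and equal weighting. Under those assumptions the IC of the equal-weighted combination is:

```latex
\mathrm{IC}_{\text{combined}} \;=\; \mathrm{IC}\,\sqrt{\frac{N}{1 + (N-1)\,\rho}}
```

With rho = 0 the combined IC grows like the square root of N. With rho > 0 it is capped at IC / sqrt(rho) no matter how many signals are added. As a hypothetical illustration, 50 signals with IC = 0.05 each give roughly 0.35 when uncorrelated, but only about 0.09 when rho = 0.3.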
How It Actually Works
Correlation measures how closely two signals move together, but raw score correlation is not enough. What matters is overlap in predictive content, turnover, and drawdown behaviour. Two signals that are only mildly correlated in scores may still be highly redundant once converted into positions.
A common workflow is to standardize signals, compute pairwise correlations, then regress a candidate signal on the existing signal set. The residual is the part not explained by the current library. If the residual still has predictive power, it adds incremental edge. If not, the candidate is mostly a relabelled version of what you already know.
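The workflow above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not a production pipeline: the function names (`zscore`, `residualize`, `ic`), the data-generating process, and the effect sizes are all assumptions chosen to make the contrast visible, and the ICs are far larger than anything realistic.

```python
import numpy as np

def zscore(x):
    """Standardize each column (or a 1-D series) to zero mean, unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)

def residualize(candidate, library):
    """Return the part of `candidate` not linearly explained by `library`."""
    y = zscore(candidate)
    X = np.column_stack([np.ones(len(y)), zscore(library)])  # intercept + signals
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def ic(signal, forward_returns):
    """Pearson correlation between a signal and subsequent returns."""
    return float(np.corrcoef(signal, forward_returns)[0, 1])

# --- Synthetic illustration (all numbers are hypothetical) ---
rng = np.random.default_rng(0)
n = 20_000
hidden = rng.standard_normal(n)  # driver the existing library already loads on
novel = rng.standard_normal(n)   # driver the library does not capture

library = np.column_stack([
    hidden + 0.5 * rng.standard_normal(n),
    hidden + 0.5 * rng.standard_normal(n),
])
returns = 0.2 * hidden + 0.2 * novel + rng.standard_normal(n)

# Candidate A is a repackaging of the library; candidate B carries new information.
redundant = 0.6 * library[:, 0] + 0.4 * library[:, 1] + 0.1 * rng.standard_normal(n)
additive = 0.5 * hidden + novel

# Pairwise correlations of the candidates with each library signal.
pairwise = np.corrcoef(np.column_stack([library, redundant, additive]), rowvar=False)

redundant_resid_ic = ic(residualize(redundant, library), returns)
additive_resid_ic = ic(residualize(additive, library), returns)

print("raw IC (redundant, additive):", ic(redundant, returns), ic(additive, returns))
print("residual IC (redundant, additive):", redundant_resid_ic, additive_resid_ic)
```

Both candidates show a positive raw IC, but only the second keeps predictive power after the library's contribution is regressed out. The regression here is a single in-sample OLS fit; a real study would also need out-of-sample checks, since residual IC estimated on the same data used to fit the regression can flatter the candidate.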
The deeper point is that effective diversification is about independent mistakes. Two weak signals with separate failure modes are more valuable than two statistically prettier signals that both die in the same regime.
The Jargon Decoded
- Pairwise correlation — How strongly two variables move together.
- Orthogonalization — Removing the component explained by another signal set.
- Residual — What remains after explained variation is removed.
- Multicollinearity — Predictors carrying overlapping information.
- Hidden factor — An underlying driver that several signals indirectly capture.
- Drawdown correlation — Tendency for strategies to lose money at the same time.
Why This Matters
Without this lens, a signal library bloats into an illusion of diversification: what looks like many independent bets is really one concentrated risk displayed across many dashboards. Prediction-market traders face the same problem when multiple models all rely on the same polling or narrative source.
What This Unlocks
You can build cleaner ensemble models, estimate true breadth more honestly, and identify where a new research idea actually expands the opportunity set.
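One rough way to put a number on "true breadth" is the participation ratio of the eigenvalues of the signal correlation matrix. This is one heuristic among several, offered as an assumption rather than the definition of breadth used in the sources; the helper names `effective_breadth` and `equicorrelated` are hypothetical.

```python
import numpy as np

def effective_breadth(corr):
    """Participation ratio of the eigenvalues of a correlation matrix.

    Equals N for N uncorrelated signals and 1 when all signals are
    perfectly correlated.
    """
    corr = np.asarray(corr, dtype=float)
    eig = np.linalg.eigvalsh(corr)   # symmetric matrix -> real eigenvalues
    eig = np.clip(eig, 0.0, None)    # guard against tiny negative round-off
    return float(eig.sum() ** 2 / (eig ** 2).sum())

def equicorrelated(n, rho):
    """n x n correlation matrix with a common off-diagonal correlation rho."""
    return np.full((n, n), rho) + (1.0 - rho) * np.eye(n)

independent = effective_breadth(np.eye(50))
overlapping = effective_breadth(equicorrelated(50, 0.3))

print("50 uncorrelated signals:", independent)
print("50 signals at rho = 0.3:", overlapping)
```

In this stylized case, 50 signals with a common pairwise correlation of 0.3 count as roughly 9 independent bets rather than 50. On real data the estimate inherits all the noise of the sample correlation matrix, so it is better read as an order of magnitude than as a precise count.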
What Still Breaks
Correlation matrices are unstable, regime-dependent, and often miss nonlinear dependencies. A signal can look independent in calm periods and then become highly correlated with everything else during stress.
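The regime problem is easy to reproduce on synthetic data. The sketch below assumes two return streams that are independent in calm periods but share a large common shock on stress days; the 10% stress frequency, the shock size, and the function name `regime_correlations` are all hypothetical choices for illustration.

```python
import numpy as np

def regime_correlations(a, b, stress_mask):
    """Correlation of two series over the full sample, calm periods, and stress periods."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    stress_mask = np.asarray(stress_mask, dtype=bool)

    def corr(x, y):
        return float(np.corrcoef(x, y)[0, 1])

    return {
        "full": corr(a, b),
        "calm": corr(a[~stress_mask], b[~stress_mask]),
        "stress": corr(a[stress_mask], b[stress_mask]),
    }

rng = np.random.default_rng(1)
n = 4000
stress = rng.random(n) < 0.1            # assumed: about 10% of periods are stress
shock = 3.0 * rng.standard_normal(n)    # common shock, switched on only in stress

# Independent noise in calm periods; a shared shock dominates in stress.
a = rng.standard_normal(n) + stress * shock
b = rng.standard_normal(n) + stress * shock

result = regime_correlations(a, b, stress)
print(result)
```

The full-sample figure sits between the two regime figures and describes neither of them well. Here the stress mask is known by construction; with real data the regime label has to be defined from observables, and that definition is itself a modelling choice.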
Sources
- The Math Behind Combining 50 Weak Signals Into One Winning Trade — Roan / RohOnChain — Modern intuition for weak alpha combination, IC/IR, and prediction-market adaptation.
- The Fundamental Law of Active Management — Grinold — Classic result linking skill, breadth, and expected information ratio.
- Advances in Financial Machine Learning — Marcos López de Prado — Practical guidance on features, backtests, leakage, and overfitting.
- The Statistics of Sharpe Ratios — Lo — Why risk-adjusted metrics need careful interpretation under real-world assumptions.