Portfolio Anomaly Detection 2.0: Identifying Security-Level Attribution

How I evolved a system from asking “Is this portfolio unusual?” to answering “Which stocks are driving the divergence?”


The Progression

Last month, I shared a portfolio anomaly detection system that used a dual-model ensemble (Autoencoder + Isolation Forest) to identify unusual portfolios. This approach achieved strong results—93% recall on synthetic anomalies, F1 score of 0.63—and provided some insight into portfolio-level risk.

But there was a key limitation: it could tell you that your portfolio was unusual, but not why, or whether that was even a problem.

The next iteration adds an enhancement that combines risk assessment with security-level attribution. In this context, “attribution” means identifying which holdings contribute most to the anomaly score.

The Problem with Portfolio-Level Scores

The original system aggregated features into a single portfolio-weighted vector before scoring. This was intentional—it captured the portfolio’s overall “personality” and detected when that personality was internally inconsistent or unusual relative to the market.

But this leaves unanswered questions:

  • “My portfolio is flagged as anomalous. Which stock is causing it?”
  • “I know tech is volatile right now. Is my portfolio more exposed than the market?”
  • “This stock moved 5% today. Is that unusual given what the market did?”

Individual features can only compare a stock to its own history, but these questions require a different approach—one that looks at individual securities in the context of the overall market.

The Solution: Cross-Sectional Features

I built a second model trained on cross-sectional features that characterize each stock relative to the market at a given point in time. These features capture deviations from normal cross-sectional relationships.

This distinction between individual and cross-sectional features is critical for attribution. A stock can appear normal on a standalone basis, but divergent in the cross-sectional market context.

Specifically, the features measure:

  • Relative returns: excess returns vs. market over multiple horizons
  • Dynamic factor exposure: time-varying beta and correlation
  • Market regime indicators: cross-sectional dispersion and rank position

Anomalies are identified as stocks that deviate significantly from these cross-sectional patterns—such as unusual relative performance, changing correlations, or outliers within the return distribution. Initial results with these features have shown promise, though further feature engineering and selection will be needed to optimize the model.

What Cross-Sectional Features Capture

Relative Returns:

  • How much did this stock move vs. the market?
  • Measured over 1-day, 5-day, and 20-day windows
  • Example: Stock down 5% when market up 2% = -7% relative return
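
As a minimal sketch of the relative-return calculation, assuming price series held in pandas (the function name `relative_returns` and the column naming are illustrative, not taken from the repository):

```python
import pandas as pd

def relative_returns(stock_prices: pd.Series, market_prices: pd.Series,
                     windows=(1, 5, 20)) -> pd.DataFrame:
    """Excess return of a stock over the market for each look-back window."""
    out = {}
    for w in windows:
        stock_ret = stock_prices.pct_change(w)    # w-day simple return
        market_ret = market_prices.pct_change(w)
        out[f"rel_ret_{w}d"] = stock_ret - market_ret
    return pd.DataFrame(out)

# Toy prices reproducing the example above: stock down 5%, market up 2%.
stock = pd.Series([100.0, 95.0])
market = pd.Series([100.0, 102.0])
rel = relative_returns(stock, market, windows=(1,))
print(round(rel["rel_ret_1d"].iloc[-1], 4))  # -0.07
```

The first `w` rows of each column are NaN because there is not yet enough history for that window.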

Rolling Correlation:

  • Is this stock moving with or against the market?
  • Measured over 20-, 60-, and 90-day windows
  • Captures changing co-movement: +1 (perfect positive) to -1 (perfect negative)
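
One way this could be computed is with pandas rolling windows. The sketch below assumes daily return series and uses synthetic data; the function name `rolling_correlations` is hypothetical:

```python
import numpy as np
import pandas as pd

def rolling_correlations(stock_ret: pd.Series, market_ret: pd.Series,
                         windows=(20, 60, 90)) -> pd.DataFrame:
    """Pearson correlation of stock vs. market returns over each trailing window."""
    return pd.DataFrame({
        f"corr_{w}d": stock_ret.rolling(w).corr(market_ret) for w in windows
    })

# Synthetic daily returns, for illustration only.
rng = np.random.default_rng(42)
market_ret = pd.Series(rng.normal(0, 0.01, 250))
stock_ret = 1.2 * market_ret + pd.Series(rng.normal(0, 0.01, 250))

corr = rolling_correlations(stock_ret, market_ret)
print(corr.tail(1))
```

Comparing the short window against the longer ones is a simple way to spot a recent change in co-movement.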

Beta Dynamics:

  • How sensitive is this stock to market movements?
  • Trailing 60-day beta (standard CAPM measure)
  • β > 1: amplified market response | β < 1: dampened response
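
A trailing beta can be sketched as rolling covariance divided by rolling market variance. The example below is an assumption about how the feature might be built, with a hypothetical `rolling_beta` helper and synthetic returns:

```python
import numpy as np
import pandas as pd

def rolling_beta(stock_ret: pd.Series, market_ret: pd.Series,
                 window: int = 60) -> pd.Series:
    """Trailing beta: Cov(stock, market) / Var(market) over the window."""
    cov = stock_ret.rolling(window).cov(market_ret)
    var = market_ret.rolling(window).var()
    return cov / var

# Synthetic data: the stock is exactly 1.4x the market, so beta is 1.4 by construction.
rng = np.random.default_rng(7)
market_ret = pd.Series(rng.normal(0, 0.01, 120))
stock_ret = 1.4 * market_ret

beta = rolling_beta(stock_ret, market_ret)
print(round(beta.iloc[-1], 2))  # 1.4
```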

Cross-Sectional Rank:

  • Where does this stock rank in the universe?
  • Percentile based on relative returns (0-100 scale)
  • Shows relative positioning within peer group
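
The percentile rank can be sketched per date across the universe. This version follows the appendix definition (share of stocks with strictly lower returns); the tickers and values are placeholders, and `cross_sectional_percentile` is a hypothetical name:

```python
import pandas as pd

def cross_sectional_percentile(rel_returns: pd.DataFrame) -> pd.DataFrame:
    """Rows are dates, columns are tickers. Returns a 0-100 percentile per cell."""
    # method="min" gives rank = 1 + number of strictly lower values in the row.
    lower_count = rel_returns.rank(axis=1, method="min") - 1
    return lower_count.div(rel_returns.count(axis=1), axis=0) * 100

# Placeholder tickers and relative returns for a single date.
rel_returns = pd.DataFrame(
    {"AAA": [0.01], "BBB": [-0.02], "CCC": [0.03], "DDD": [0.00]},
    index=pd.to_datetime(["2024-01-02"]),
)
pct = cross_sectional_percentile(rel_returns)
# CCC has three of the four stocks below it, so it lands at the 75th percentile;
# BBB has none below it, so it lands at 0.
print(pct)
```

Note that under this definition the top stock never reaches 100; `rank(pct=True)` is a common alternative with a slightly different scale.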

Dispersion Features:

  • How dispersed are returns across the market?
  • Cross-sectional standard deviation of returns
  • High dispersion → stock-specific factors driving returns
  • Low dispersion → market-wide factors dominating
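
Dispersion is a one-liner in pandas once returns are arranged as dates by tickers. The sketch below uses made-up returns to contrast a low-dispersion day with a high-dispersion day; note that pandas defaults to the sample standard deviation (ddof=1):

```python
import pandas as pd

def cross_sectional_dispersion(returns: pd.DataFrame) -> pd.Series:
    """Standard deviation of returns across all stocks, one value per date."""
    return returns.std(axis=1)

# Made-up returns: day 1 moves together, day 2 is driven by stock-specific moves.
returns = pd.DataFrame(
    {"AAA": [0.010, 0.05], "BBB": [0.011, -0.04], "CCC": [0.009, 0.01]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03"]),
)
dispersion = cross_sectional_dispersion(returns)
print(dispersion)
```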

The Architecture

Model 1: Individual Risk Assessment

  • Purpose: Is this portfolio’s behavior unusual?
  • Features: Stock-level features (momentum, volatility, RSI, MACD, etc.) aggregated to a portfolio-weighted vector
  • Output: Portfolio-level anomaly score and risk level
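
To make the portfolio-weighted vector concrete, here is a minimal sketch of the aggregation step. The weights are the ones from the example portfolio later in this post; the feature names and values are illustrative placeholders, and `portfolio_vector` is a hypothetical helper rather than the repository's function:

```python
import pandas as pd

weights = pd.Series({"META": 0.30, "AAPL": 0.20, "MSFT": 0.30, "TSLA": 0.15, "NVDA": 0.05})

# Illustrative feature values only; not real market data.
features = pd.DataFrame(
    {
        "momentum_20d": [0.04, 0.01, 0.03, -0.08, 0.10],
        "volatility_20d": [0.25, 0.20, 0.18, 0.55, 0.45],
        "rsi_14": [58.0, 51.0, 55.0, 35.0, 68.0],
    },
    index=["META", "AAPL", "MSFT", "TSLA", "NVDA"],
)

def portfolio_vector(features: pd.DataFrame, weights: pd.Series) -> pd.Series:
    """Collapse per-stock features into one weight-averaged feature vector."""
    w = weights.reindex(features.index)
    if w.isna().any():
        raise ValueError("every holding needs a weight")
    w = w / w.sum()  # normalize in case the weights do not sum exactly to 1
    return features.mul(w, axis=0).sum(axis=0)

print(portfolio_vector(features, weights))
```

Because every holding is averaged into one vector, the model sees the portfolio as a whole, which is exactly why it cannot say which position is responsible for a high score.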

Model 2: Cross-Sectional Attribution

  • Purpose: Which stocks are driving divergence?
  • Features: Market-relative characteristics (relative returns, correlation, beta, rank)
  • Output: Per-security anomaly scores, weighted portfolio health score

Models Used in Current System:

  • Autoencoder: Learns normal patterns; flags high reconstruction error (MSE)
  • Isolation Forest: Statistically isolates outliers
  • Ensemble Weighting: Currently using 85% Autoencoder / 15% Isolation Forest (based on validation performance)
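
The 85/15 weighting can be illustrated with a small sketch. This assumes both model scores are first rescaled to a common 0-1 range, with higher meaning more anomalous; that normalization step is my assumption for the example, and the score arrays are placeholders rather than real model output:

```python
import numpy as np

def min_max(x) -> np.ndarray:
    """Rescale scores to [0, 1]; returns zeros if all scores are equal."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def ensemble_score(ae_error, if_score, w_ae: float = 0.85, w_if: float = 0.15) -> np.ndarray:
    """Weighted blend of autoencoder error and isolation-forest anomaly score."""
    return w_ae * min_max(ae_error) + w_if * min_max(if_score)

# Placeholder scores for four portfolios (higher = more anomalous).
ae_error = np.array([0.010, 0.012, 0.0234, 0.011])   # reconstruction MSE
if_score = np.array([0.40, 0.45, 0.70, 0.42])        # isolation-forest anomaly score

combined = ensemble_score(ae_error, if_score)
print(combined.argmax())  # 2
```

If the isolation-forest score comes from a library where lower means more abnormal, it would need to be negated before blending.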

Real-World Example: Decoding Attribution

Let’s analyze a tech-heavy portfolio to see the system in action:

python scripts/analyze_portfolio.py META:0.3 AAPL:0.2 MSFT:0.3 TSLA:0.15 NVDA:0.05

Risk Assessment Output (Individual Model):

Portfolio Risk Level: MEDIUM
- Autoencoder Score: 0.0234 (threshold: 0.0198)
- Models Agree: Yes
- Interpretation: Portfolio exhibits unusual characteristics

The portfolio is flagged as anomalous. But why? This is where the cross-sectional model provides the crucial insight.

Attribution Analysis: Reading the Chart

Figure: Portfolio Anomaly Attribution by Security

This visualization summarizes the attribution clearly. The bars show each holding’s contribution to the portfolio divergence score.

The X-Axis: Attribution Scores

  • Negative scores (left): Securities that decrease portfolio anomaly (moving with the market normally)
  • Positive scores (right): Securities that increase portfolio anomaly (diverging from market patterns)
  • The further from zero, the stronger the contribution

What This Reveals:

  1. META (-0.77) and MSFT (-0.74) are the “normal” holdings

    • Large negative scores mean they’re behaving as expected relative to the market
    • These stocks are moving in sync with market patterns
    • They’re actually reducing the portfolio’s overall anomaly score
  2. AAPL (-0.52) and NVDA (-0.35) are slightly unusual

    • Still negative, but closer to zero
    • Mild deviations from typical market behavior
    • Not major concerns, but worth monitoring
  3. TSLA (+0.30) is the outlier

    • The only positive score in the portfolio
    • This stock is diverging significantly from market patterns
    • It’s the primary driver of the portfolio’s anomalous classification

The Health (Divergence) Score: +2.72

This indicates significant divergence from normal market behavior. The positive sign tells us the portfolio is moving differently than the overall market would predict.

What’s Actually Happening?

When we dig into TSLA’s behavior on this date:

  • Correlation Breakdown: TSLA’s 20-day correlation with SPY dropped from 0.65 to 0.23
  • Contra-Directional Movement: Market up 2%, TSLA down 5% (relative return: -7%)
  • Elevated Beta: Beta spiked from a typical 1.4 to 2.1

Meanwhile, META and MSFT are tracking the market closely, which is why they have large negative attribution scores—they’re behaving “normally” and thus reducing the overall portfolio anomaly.

The Key Insight

Without attribution: “Your portfolio is unusual” (not very helpful)

With attribution: “TSLA is experiencing a correlation break and moving contra to the market, while your other holdings are tracking normally. This single position is driving your elevated risk score.”

This tells the portfolio holder where to investigate first, which makes for better-informed decisions.

Performance Validation

Individual Model (Original System)

  • F1 Score: 0.63 on synthetic anomalies
  • Recall: 93%
  • Model Agreement: >94%
  • Validation Scenarios: Market crashes, momentum bubbles, volatility spikes, correlation breakdowns, liquidity crises

Cross-Sectional Model (New Addition)

  • Trained on S&P 500 universe
  • 5 years of market-relative behavior data
  • Reconstruction error (MSE) used for anomaly scoring
  • Provides attribution-style contribution scoring
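
To show what scoring by reconstruction error looks like per security, here is a self-contained sketch. It swaps in a linear autoencoder (a PCA projection computed with NumPy) as a stand-in for the trained neural autoencoder, and uses synthetic feature rows; all names and sizes are assumptions for illustration:

```python
import numpy as np

def fit_linear_autoencoder(X: np.ndarray, n_components: int):
    """Stand-in 'autoencoder': learn a low-dimensional linear subspace via SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_mse(X: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Per-row mean squared error between the input and its reconstruction."""
    centered = X - mean
    reconstructed = centered @ components.T @ components  # encode, then decode
    return ((centered - reconstructed) ** 2).mean(axis=1)

# Synthetic "normal" feature rows: 5 features driven by 2 latent factors.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(2, 5))
X_train = rng.normal(size=(500, 2)) @ mixing + 0.01 * rng.normal(size=(500, 5))
mean, comps = fit_linear_autoencoder(X_train, n_components=2)

# One row that follows the learned pattern, and one with a single feature pushed off it.
normal_row = rng.normal(size=(1, 2)) @ mixing
odd_row = normal_row + np.array([[0.0, 0.0, 3.0, 0.0, 0.0]])
errors = reconstruction_mse(np.vstack([normal_row, odd_row]), mean, comps)
print(errors)  # the second (perturbed) row has the larger error
```

Rows that fit the learned cross-sectional pattern reconstruct well and score low; rows that break the pattern reconstruct poorly and score high.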

Combined System Benefits

  • Complementary Views: Risk assessment + attribution
  • Unified Interface: Single command analyzes both perspectives
  • Actionable Insights: From “something’s wrong” to “here’s what to investigate”

The Microcosm Notebook

I created a Jupyter notebook that represents the complete system on a small, easy-to-digest scale:

  1. Load S&P 500 Universe Data: 60+ securities, 5 years of history
  2. Calculate Cross-Sectional Features: Relative returns, correlation, beta, ranks
  3. Train Autoencoder: Learn normal patterns of market-relative behavior
  4. Score a Portfolio: Calculate per-security anomaly scores
  5. Visualize Attribution: See which stocks drive portfolio divergence

The notebook serves as both documentation and a practical starting point: you can run it with your own portfolio in under 5 minutes.

Conclusion

The initial goal of this portfolio anomaly detection system was to answer a single question: “Is this portfolio unusual?”

Version 2.0 answers a more valuable question: “Which stocks are making it unusual, and why?”

By combining individual risk assessment with cross-sectional attribution, the system provides both the “what” and the “why” of portfolio divergence.

Want to try it yourself? Check out the GitHub repository.

Tags: #MachineLearning #Finance #PortfolioManagement #Python #AnomalyDetection #QuantitativeFinance #OpenSource


Appendix: Feature Calculations

Relative Returns:

Relative Return = Stock Return - Market Return

Rolling Correlation:

Correlation = Pearson correlation coefficient between stock and market returns over specified window

Beta Dynamics:

Beta = Covariance(Stock Returns, Market Returns) / Variance(Market Returns)

Cross-Sectional Rank:

Rank Percentile = (Number of stocks with lower returns / Total stocks) × 100

Dispersion Features:

Cross-Sectional Dispersion = Standard Deviation of all stock returns at time t
