The Financial Market & Performance Analysis
Python · SAS · Pearson Correlation · Stepwise Regression · Feature Engineering · OLS
74 financial variables. One stock. A multi-stage modeling pipeline that cut through the noise to explain 98% of Citigroup's daily price movements.
The Question
Financial data is noisy by nature. With 74 candidate variables that might influence Citigroup's stock performance, the challenge wasn't building a model; it was knowing which signals to trust and which to discard. A two-stage pipeline was applied: Pearson correlation screening first eliminated redundant variables, and stepwise regression then iteratively identified the most predictive subset. The result was a lean model built only on variables with genuine explanatory power: accurate enough for forecasting and clean enough for executive reporting. Can disciplined feature selection turn market volatility into a structured, interpretable data story?
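To make the two-stage idea concrete, here is a minimal sketch in Python using only NumPy. It is not the project's actual code. The data is synthetic rather than Citigroup data, the feature names (`f1`, `f1_copy`, and so on) are hypothetical, and the 0.9 correlation cutoff and 0.01 R² gain threshold are assumed values chosen for illustration. The stepwise stage uses forward selection on R² improvement as a stand-in; a p-value based entry and exit rule, as commonly used in SAS, would be an equally valid reading of "stepwise regression".

```python
import numpy as np

def correlation_screen(X, names, threshold=0.9):
    """Stage 1: keep a feature only if its |Pearson r| with every
    already-kept feature is at or below the threshold (assumed cutoff)."""
    corr = np.corrcoef(X, rowvar=False)
    keep = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) <= threshold for k in keep):
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]

def r_squared(X, y):
    """R^2 of an OLS fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

def forward_stepwise(X, y, names, min_gain=0.01):
    """Stage 2: greedily add the feature that raises R^2 the most;
    stop when the best gain falls below min_gain (assumed rule)."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        r2, j = max((r_squared(X[:, selected + [c]], y), c) for c in remaining)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2
    return [names[j] for j in selected], best_r2

# Synthetic demo: two real drivers, one near-duplicate, two noise columns.
rng = np.random.default_rng(0)
n = 500
f1, f2, f3, f4 = rng.normal(size=(4, n))
f1_copy = f1 + 0.01 * rng.normal(size=n)       # redundant with f1
X = np.column_stack([f1, f1_copy, f2, f3, f4])
names = ["f1", "f1_copy", "f2", "f3", "f4"]
y = 3.0 * f1 - 2.0 * f2 + 0.1 * rng.normal(size=n)

X_screened, kept = correlation_screen(X, names)
selected, r2 = forward_stepwise(X_screened, y, kept)
print("kept after screening:", kept)
print("selected by stepwise:", selected, "R^2 =", round(float(r2), 4))
```

Running the screen first keeps the stepwise search from having to choose between near-identical columns, which is the same reason the pipeline above orders its two stages that way.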
