Forecasting at Scale: MathCo’s skCATS Model Ranks Among Top 10 in the M6 Competition

By MathCo Team
April 5, 2023

MathCo, a global provider of advanced analytics business solutions, finished in the Top 10 of the 6th edition of the highly successful time-series forecasting M competition. MathCo’s Innovation team was able to outperform the baseline and predict the future performance of selected stocks and ETFs (Exchange Traded Funds) using its original internal model named skCATS (Complete Automated Time-Series).

From the invention of astronomy to modern-day weather predictions, the science of forecasting has been an object of human pursuit since time immemorial. Today, organizations everywhere leverage analytics tools to analyze historical data and forecast trends, aiding in business planning and critical decision-making.

Yet, time-series forecasting—a technique that makes scientific predictions based on historical time-stamped data—remains a complex analytical task for most enterprises even today. Transient anomalies, the adaptivity of forecasting methods, and the scalability of the data and pipeline infrastructure have been some of the persistent challenges of scaling time-series forecasting in business contexts.

Financial markets are one of the foremost areas in which time-series forecasting is applied today, which brings us to the competition under discussion. The 6th iteration of the M competition, known as the M6 competition, focused on evaluating the accuracy and value of time-series forecasts as a test of the Efficient Markets Hypothesis (EMH). This year, MathCo’s skCATS time-series forecasting model secured 8th position overall on the global M6 leaderboard.

Check out the story behind the development of the skCATS model and why it worked, along with a few tips for next year’s participants from our victorious Innovation team.

The M6 competition

In 1982, Spyros Makridakis, one of the world’s leading experts on forecasting, began a series of competitions to monitor forecasts in real life and rate their accuracy. The vision was to develop breakthroughs and solutions in response to real-world challenges. The latest edition of this competition, held from March 2022 to February 2023, was the first to put the EMH to the test. The EMH posits that asset prices reflect all available information and that the market is therefore perfectly efficient.

The M6 competition sought empirical evidence on how investors can improve the accuracy of their forecasts and use those forecasts to build resilient and lucrative portfolios. However, given the vastness of the field, the many questions forecasting tries to answer, and the countless forecasting approaches available, beating the benchmark is no small feat. The M6 competition pits different methods against each other to determine which performs best in different real-world scenarios.

The duathlon challenged the participants in two ways:

1. Provide probabilities for the performance quintile of 100 selected stocks and ETFs for the upcoming month.

2. Provide an investment portfolio each month based on their performance expectation of the same stocks and ETFs.
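To make the first task concrete, the sketch below shows one way to turn a month of asset returns into performance quintiles, the quantity participants had to assign probabilities to. It is an illustration only: the function name and the sample returns are hypothetical, ties are ignored, and it uses 10 assets instead of 100 to keep the example short.

```python
def quintile_labels(returns):
    """Label each asset 1 (worst fifth) to 5 (best fifth) by its return rank.

    Simplification: ties are broken by position, not shared between quintiles.
    """
    n = len(returns)
    # Asset indices ordered from lowest to highest return.
    order = sorted(range(n), key=lambda i: returns[i])
    labels = [0] * n
    for position, asset in enumerate(order):
        labels[asset] = position * 5 // n + 1
    return labels


# Hypothetical monthly returns for 10 assets (made-up numbers).
monthly_returns = [0.05, -0.02, 0.01, 0.10, -0.07, 0.03, 0.00, 0.08, -0.01, 0.02]
print(quintile_labels(monthly_returns))  # [4, 1, 3, 5, 1, 4, 2, 5, 2, 3]
```

A forecast for one asset is then a list of five probabilities, one per quintile, that sum to 1.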

skCATS & financial forecasting

The skCATS model was developed to forecast time series at scale, delivering fast, highly accurate, and cost-effective results. The model uses a two-model strategy, where the first model is focused on generating ranked forecasts of the chosen 100 ETFs and stocks based on past performance, and the second model is focused on determining if the results could beat the market baseline. Predictions were produced by combining the two sets of model results. The investment decision model used an ensemble approach, combining portfolio optimization theory and business fundamentals. The development processes to satisfy the dual challenges are explained below.

Development of the skCATS model

a. Forecasting challenge:

The Innovation team began testing and refining its methods prior to the release of the competition’s stock and ETF list. To do so, a random sample of 100 assets from different sectors of the S&P 500 was selected. The team then generated input variables using fundamental analysis (operational variables like profit/loss ratio, liquidity, etc.) as well as the macro drivers (inflation, volatility index, unemployment, etc.) of each industry and applied traditional time-series models to generate the results.

However, it soon became clear that this approach would not suffice, as its performance failed to beat the baseline consistently. The problem was further complicated by the need to generate forecasts and confidence intervals for the input variables as well as the output variables before any future value could be predicted. With some minor adjustments, the Innovation team transformed the task into a classification problem that yields a simple relative ranking of the 100 assets.

Under this new framing, the team hypothesized that the historical relative ranking of assets would not differ significantly from their ranking in the following month. This assumption was tested by creating rolling relative rank probabilities over windows ranging from 40 to 56 months. After calculating the normalized count of each asset falling into a specific rank, the team combined all 17 rolling probabilities, resulting in 85 rows of features for the model.
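The feature construction described above can be sketched roughly as follows. This is not MathCo’s code: it assumes that the 85 features are 17 window lengths (40 to 56 months inclusive) times 5 quintile probabilities, which matches the numbers in the text but is not stated outright, and the function and variable names are hypothetical.

```python
from collections import Counter

WINDOWS = range(40, 57)   # 40..56 months inclusive -> 17 window lengths
QUINTILES = range(1, 6)   # rank buckets 1..5


def rolling_rank_features(quintile_history):
    """Build rank-probability features for one asset.

    quintile_history: the asset's past monthly quintiles (1..5), oldest first.
    For every window length, the share of the most recent months spent in
    each quintile is computed, and all shares are concatenated.
    """
    if len(quintile_history) < max(WINDOWS):
        raise ValueError("need at least 56 months of history")
    features = []
    for window in WINDOWS:
        counts = Counter(quintile_history[-window:])
        # Normalized count: fraction of the window spent in each quintile.
        features.extend(counts[q] / window for q in QUINTILES)
    return features


# Hypothetical history: an asset cycling through all five quintiles.
history = [(month % 5) + 1 for month in range(60)]
print(len(rolling_rank_features(history)))  # 85
```

Each block of five values sums to 1, so a classifier trained on these features sees how stable an asset’s relative rank has been across slightly different look-back horizons.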

MathCo’s team then defined 15 unique sectors within the M6 universe and tuned 15 skCATS classification models with the objective of accurately classifying the performance of stocks and ETFs for the following month. These sector-wise models significantly outperformed the baseline during the back tests. The team finalized the models after 10 months of rigorous back testing.

Additionally, since the skCATS model training period did not account for inflation-related volatilities, the team trained a second skCATS model that assessed the inflation volatility each sector faced and re-adjusted the results of the first model accordingly throughout the competition.

b. Investment challenge:

To inform the investment decisions, the team developed multiple independent methods for assigning weights to the assets. Some methods relied purely on historical data while others were based on fundamental ratios. The team created a weighted ensemble of all methods to inform their decisions for the following month and fine-tuned the weighting system over a 10-month back-testing period to maximize the Sharpe ratio. As with the forecasting process, the team adjusted investment weights with a second model that assessed potential sector volatility based on the sentiments expressed in the Federal Open Market Committee (FOMC) minutes.
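As a rough illustration of tuning an ensemble to maximize the Sharpe ratio, the sketch below blends the back-tested monthly returns of several weighting methods and grid-searches the blend. Everything here is an assumption made for the example: the article does not say how the ensemble was tuned, the Sharpe ratio is computed as mean over standard deviation with a zero risk-free rate, and the return series are made up.

```python
import itertools
import statistics


def sharpe_ratio(returns):
    """Mean return divided by its standard deviation (risk-free rate assumed 0)."""
    spread = statistics.stdev(returns)
    return statistics.mean(returns) / spread if spread > 0 else 0.0


def blend(method_returns, mix):
    """Monthly returns of the ensemble for a given mix of methods."""
    return [sum(w * r for w, r in zip(mix, month)) for month in zip(*method_returns)]


def best_mix(method_returns, steps=10):
    """Grid-search non-negative mixes that sum to 1 and keep the best Sharpe."""
    best_score, best_weights = float("-inf"), None
    for combo in itertools.product(range(steps + 1), repeat=len(method_returns)):
        if sum(combo) != steps:
            continue
        mix = [c / steps for c in combo]
        score = sharpe_ratio(blend(method_returns, mix))
        if score > best_score:
            best_score, best_weights = score, mix
    return best_score, best_weights


# Hypothetical 10-month back-test returns for two weighting methods.
historical = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04, 0.00, 0.01, 0.02, -0.01]
fundamental = [0.01, 0.02, -0.01, 0.00, 0.03, -0.02, 0.02, 0.01, -0.01, 0.02]
score, mix = best_mix([historical, fundamental])
print(mix, round(score, 3))
```

With only 10 monthly observations, any blend chosen this way is prone to overfitting, which is one reason a second, independent adjustment model can be a useful check.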

How skCATS was modified to tackle real-world problems

Reliable forecasts are needed for multiple applications such as capacity planning, merchandising, and web traffic forecasting. However, scaling forecasts to large volumes of data presents unique challenges, such as the need to parallelize model execution for faster run times, to choose the right modelling approach based on the time-series pattern, and to make trade-offs between explainability and accuracy.

MathCo’s proprietary AI/ML master engine, NucliOS, provides a no-code interface to scale forecasts reliably. The skCATS forecasting approaches have now been integrated with this AI-powered analytics platform. NucliOS can be used for demand forecasting at scale, as it automatically provisions the right infrastructure and selects the right model for each time series. MathCo thus blends state-of-the-art deep learning processes, statistical forecasting methods, and specialized techniques for intermittent forecasting to arrive at optimal forecasts for thousands of time series. With the updated NucliOS engine, the company’s internal benchmarks show time savings of up to 60% in generating reliable forecasts for multiple time series.

Conclusions and recommendations

Any analytics solutions provider should be focused on productionizing complex data systems for clients. To achieve this, the provider’s products must work well in live settings, which are characterized by incomplete datasets with varying frequencies, continuous random shocks, multiple sources of noise, and complex feedback loops. Live-testing products prior to launch is, therefore, critical to ensuring process optimization, excellent software quality, and operational efficiency.

As a successful global consulting firm that caters to Fortune 500 firms, MathCo continuously pushes technological boundaries to help clients address concerns and overcome challenges. The live nature of the M6 contest was, therefore, an ideal scenario to benchmark the performance of the skCATS model. Because the overall forecasting results were similar to those observed during the back tests, skCATS has shown that it can deliver consistent results under live conditions.

However, it is important to remain circumspect and not over-extend the conclusions. Although the overall forecast results were below the baseline in both the back tests and the actual competition, a month-by-month analysis shows that the MathCo approach beat the baseline in only 6 to 8 months of the year. As the M6 winners managed to keep their predictions below the baseline for all 12 months, there was certainly scope to improve on the skCATS models.
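For readers unfamiliar with why "below the baseline" is good: M6 scored forecasts with the Ranked Probability Score (RPS), where lower is better, and a forecast that assigns 0.2 to every quintile averages 0.16. The sketch below computes the RPS for a single asset and counts the months in which a model beats a baseline. The monthly scores are made-up placeholders, not MathCo’s results, and ties between quintiles are ignored.

```python
def rps(forecast, actual_quintile):
    """Ranked Probability Score for one asset (lower is better).

    forecast: five probabilities for quintiles 1..5, summing to 1.
    actual_quintile: the quintile (1..5) the asset actually landed in.
    """
    cumulative_forecast = 0.0
    total = 0.0
    for quintile, probability in enumerate(forecast, start=1):
        cumulative_forecast += probability
        cumulative_outcome = 1.0 if quintile >= actual_quintile else 0.0
        total += (cumulative_forecast - cumulative_outcome) ** 2
    return total / len(forecast)


def months_beating_baseline(model_scores, baseline_scores):
    """Count months where the model's score is strictly below the baseline's."""
    return sum(m < b for m, b in zip(model_scores, baseline_scores))


# A uniform forecast, averaged over all five possible outcomes.
uniform = [0.2] * 5
print(round(sum(rps(uniform, q) for q in range(1, 6)) / 5, 4))  # 0.16

# Hypothetical monthly scores, for illustration only.
model = [0.155, 0.162, 0.158, 0.161]
baseline = [0.16, 0.16, 0.16, 0.16]
print(months_beating_baseline(model, baseline))  # 2
```

Looking at the monthly count as well as the overall average shows whether an edge is persistent or driven by a few good months.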

At the same time, although the EMH stipulates that no one can consistently outperform the market, MathCo’s Innovation team saw clear opportunities to outperform the market over the duration of the competition. This was primarily due to a mismatch between the Federal Reserve’s inflation expectations and interest rate path on the one hand and the market’s outlook on the same on the other. The team found that the markets continuously underestimated, and then readjusted to, the Federal Reserve’s assumptions and associated actions in this regard.

In every iteration, the M competition witnesses transformative developments in the arena of financial forecasting. MathCo’s Innovation team is proud to have taken part in and been placed in the Top 10 of the prestigious M6 competition. As a firm that is dedicated to optimization and innovation, MathCo continues to use the competition as a benchmark to improve its products for the benefit of its clients.
