This article describes the recommended approach to building market timing models. Synergy is feature rich and quite flexible, so completely valid variations on the workflow presented here are possible.
The goal of the example is to build models for trading the SP futures on the trading day following the day the systems are updated. All trades are placed market on close (MOC). You will need daily data for the SP futures and daily CBOE VIX data to recreate the modeling run. The SP data is non-adjusted and continuously linked. Both the SP futures and VIX data are provided by Pinnacle Data Corp.
Part 1. Exploratory Modeling Run
The first task is to do an exploratory modeling run. Exploratory modeling runs can be completed in a relatively short period of time. After analyzing the results of the exploratory modeling run we can decide if we want to pursue the use of the VIX for modeling the SP futures any further.
The SP futures series is loaded first and is the traded series. The VIX index is an additional data series. You can load as many additional data series as you like, but it is best to investigate the usefulness of one series, or one group of series (like the NASDAQ breadth data), per modeling run.
All function blocks are selected by default. None have been deselected.
There are quite a few settings that can optionally be modified. The main settings pertinent to the example are displayed. Default settings were applied where not shown.
We are seeking to build models that trade MOC with a 1-day trading delay. That is, update the data and systems today and place the trades market on close on the following trading day.
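To make the timing concrete, here is a minimal Python sketch of how a 1-day MOC delay maps signals to fills. This is not Synergy code; the function name and data are hypothetical.

```python
def delayed_moc_pnl(closes, signals):
    """Toy illustration of a 1-day MOC trading delay: the signal computed
    after day t's update is filled at day t+1's close and marked to the
    next close (day t+2), so each trade earns a close-to-close move that
    starts one bar after the signal."""
    pnl = []
    for t, sig in enumerate(signals):
        entry, exit_ = t + 1, t + 2
        if exit_ < len(closes):
            pnl.append(sig * (closes[exit_] - closes[entry]))
    return pnl

# Example: the long signal on day 0 fills at close[1] and exits at close[2].
trade_pnl = delayed_moc_pnl([100.0, 102.0, 101.0, 103.0], [1, -1])
```

The key point is the one-bar offset between the signal index and the fill index; a backtest without it would overstate performance.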
Use of the columns belonging to the traded series has been disabled. This is done because we want to evaluate the use of the VIX series only.
Some intermarket function blocks can use a combination of the traded series columns and columns belonging to additional series. If we wanted to preclude the use of columns belonging to the traded series entirely then we would have checked the Exclude Traded Intmrkt checkbox as well.
The model structure will not be optimized using the genetic optimizer.
The default percentage split for the modeling periods is 70% for the build period and a cumulative 90% for the end of the validation period, which leaves the final 10% as out-of-sample data. Approximately 65% of the data rows are being used for the build period in this case.
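The split can be pictured as cumulative cut points over the data rows. A minimal sketch, assuming the percentages are cumulative fractions of the row count (the function name is hypothetical and Synergy's exact row arithmetic may differ):

```python
def split_periods(n_rows, build_pct=0.70, validation_pct=0.90):
    """Return (start, end) row ranges for the build, validation and
    out-of-sample periods, where build_pct and validation_pct are
    cumulative fractions of the data (70% / 90% defaults)."""
    build_end = int(n_rows * build_pct)
    valid_end = int(n_rows * validation_pct)
    return (0, build_end), (build_end, valid_end), (valid_end, n_rows)
```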
Note that 2 major bull and bear market periods are included in the build period.
One of the keys to executing an exploratory modeling run is to disable the PSO. A much larger variety of models will then be evaluated per unit of time. The PSO is enabled or disabled by checking or unchecking the Use PSO checkbox.
There is no need to enable the Autosave function for the exploratory modeling run, although there is no harm in doing so.
Autovalidation of models based on performance over the validation period will be done.
Note that the automatic walk-forward test has been disabled. There is no need to run the walk-forward test at this stage. The Min Trade Returns Skewness setting is disabled, but it would not matter if it were enabled.
The next step is to click the Run button on the Control Panel and let Synergy run for 5 to 10 minutes before clicking the stop button.
At the completion of the exploratory modeling run we want to know if Synergy was able to produce any models and if those models look like the types of models that we want. Ignore the out-of-sample performance as it is not important at this point in the procedure.
Plenty of models were produced: the typical in-sample percent of perfect is in the 7% to 9% range, and the average traded period is around 7 to 8 trading days. Executing a production modeling run is justified. That is all there is to an exploratory modeling run.
Part 2. Configuring a Production Modeling Run
The goal of a production modeling run is to produce models that will be used as a component of a trading system. Production modeling runs need to run for significant periods of time. At AdaptiveTradingSystems.com we have run the model building process continuously for up to 3 days, although a decent set of models will often be produced overnight.
The main differences between an exploratory modeling run and a production modeling run are:
- The PSO is enabled for a production modeling run and disabled for an exploratory modeling run.
- The walk-forward autovalidation test is enabled for a production modeling run and disabled for an exploratory modeling run. If the walk-forward autovalidation test is not enabled for a production modeling run, the WF Simulator can be used to test how well the models perform walking forward once the run completes.
An optimization target is going to be applied to influence the characteristics of the models that are produced. We would like the percentage of profitable trades to be 60% or greater. If a given model achieves only 58% profitable trades but appears to be robust, then that is fine.
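The statistic being targeted is simple to state. A hypothetical sketch (not Synergy's internal code):

```python
def percent_profitable(trade_returns):
    """Percentage of trades with a positive return: the statistic the
    optimization target pushes toward 60% or better."""
    wins = sum(1 for r in trade_returns if r > 0)
    return 100.0 * wins / len(trade_returns)
```

A model with 3 winners out of 5 trades scores 60%, right at the target; the target shapes the search rather than acting as a hard cutoff.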
The PSO has been enabled by ensuring the Use PSO checkbox is checked.
The walk-forward autovalidation test is now enabled.
The walk-forward autovalidation test has been configured so that at least 2 scenarios must perform at 7% of perfect or better over the validation period.
Note that different scenarios are distinguished by their lookback period. The above walk-forward test will run scenarios with lookback periods ranging from 500 to 2500 trading days in increments of 500.
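A minimal sketch of that scenario grid and the pass rule described above (function names are hypothetical; Synergy's internal logic is not public):

```python
def scenario_lookbacks(start=500, stop=2500, step=500):
    """Lookback periods, in trading days, for the walk-forward scenarios."""
    return list(range(start, stop + 1, step))

def passes_autovalidation(scenario_scores, threshold=7.0, min_passing=2):
    """scenario_scores maps a lookback period to that scenario's percent
    of perfect over the validation period; the model passes when at least
    min_passing scenarios reach the threshold."""
    return sum(1 for s in scenario_scores.values() if s >= threshold) >= min_passing
```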
The production modeling run was left to run overnight. After approximately 8 hours, 24 models were produced.
Part 3. Analyzing the Ensemble Report for a Production Modeling Run
The first task is to analyze the Ensemble Report. If the input series are useful for modeling the traded series, then there will be a high bias toward profitability over the out-of-sample period.
The percent of models that were hypothetically profitable over the out-of-sample period is 87.5% and the median percent of perfect out-of-sample is 5.3%. This is a decent result. Ideally we would like to see greater than 90% of models profitable out-of-sample with the median percent of perfect around 7%.
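These two ensemble statistics can be computed directly from each model's out-of-sample score. A sketch under the assumption that a positive percent of perfect means the model was profitable (the function name is hypothetical):

```python
import statistics

def ensemble_summary(oos_scores):
    """Given each model's out-of-sample percent of perfect, return the
    percentage of models that were profitable (score > 0) and the median
    score across the ensemble."""
    profitable = sum(1 for s in oos_scores if s > 0)
    return 100.0 * profitable / len(oos_scores), statistics.median(oos_scores)
```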
If the Ensemble Report looks good, then the next step is to analyze the individual models. If only about half of the models had produced a hypothetical profit out-of-sample, then the entire modeling run would be considered suspect and no further analysis would be done.
Part 4. Analyzing the Models Produced by a Production Modeling Run
When multiple models have highly correlated signals it is usually desirable to export only one model from the set. The first step when analyzing individual models is to select the first model in the Models list and turn on the Model Correlation Filter.
When the filter is applied, only models with signals that are correlated to the selected model’s signal are listed. The minimum correlation is specified in the Min R text box. For this example, the minimum correlation level is 0.92.
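The filtering idea can be sketched with a plain Pearson correlation over the models' signal series. This is an illustration only; the function names are hypothetical and Synergy's correlation measure may differ in detail:

```python
def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length signal series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def correlated_models(selected_signal, other_signals, min_r=0.92):
    """Names of models whose signal correlates with the selected model's
    signal at min_r or higher (0.92 in the example)."""
    return [name for name, sig in other_signals.items()
            if pearson_r(selected_signal, sig) >= min_r]
```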
According to the Model Correlation Filter, no other models have signals that are highly correlated with model 1's signal. Although the walk-forward test was applied during model building, I like to run the walk-forward test manually as well.
Leaving the Model Correlation filter on, the next step is to run the WF Simulator for each model in the set. In this case there is only one model.
The walk-forward simulation for model 1 is reasonable, though not brilliant. The model is probably robust. However, I have seen better models and am not tempted to export it.
Note that at this point we have viewed the out-of-sample performance for a particular model. If we accept and reject individual models based on their out-of-sample performance, then the out-of-sample period is no longer out-of-sample.
Model 1 appears to be robust so it is marked as robust.
Note that if a model does not appear to be robust then it should be marked as analyzed, but not as robust. The purpose of marking a model as analyzed is to keep track of which models have been analyzed and which have not.
That completes the analysis of the first set of models, noting that the set contained only one model.
The next step is to turn off the Model Correlation Filter and select the next model in the list that has not been analyzed. When working through a list of models, following these steps makes it easy to stay organized.
After going back to the Models list, turning off the Model Correlation Filter, selecting the next model in the list that has not been analyzed (model 2) and turning on the Model Correlation Filter again we see a larger set of correlated models.
After analyzing each model in the group, comparing the statistics and walk-forward performance, model 2 was identified as the preferred model and marked for export.
At this stage the Model Correlation Filter is still applied and all models in the second set have been analyzed.
The next step is to turn off the Model Correlation Filter and select the next model that has not been analyzed before turning on the filter again. The process is repeated until all models have been analyzed.
Working through a list of over 100 models is very manageable using the presented workflow. A summary of the workflow for analyzing individual models follows.
1. Select the next model in the list that has not been analyzed.
2. Turn on the Model Correlation Filter.
3. Analyze all models in the list of models with highly correlated signals. If a model does not appear to be robust, mark it as analyzed only. If it is considered robust, mark it as robust. Mark the preferred model in the set for export.
4. Turn off the correlation filter.
Go to step 1 and repeat until all models have been analyzed.
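The loop above can be sketched in Python. The structure is entirely hypothetical (Synergy tracks these flags through its UI); it only illustrates how the analyzed/robust/export flags keep the walk through the list organized:

```python
def analyze_models(models, correlated_with, review):
    """models: model ids in listed order.
    correlated_with(m): ids whose signals correlate highly with m's signal.
    review(m): returns (is_robust, is_preferred) for one model."""
    flags = {m: {"analyzed": False, "robust": False, "export": False} for m in models}
    for m in models:
        if flags[m]["analyzed"]:
            continue                      # step 1: next unanalyzed model
        # step 2: the correlation filter yields the set containing m
        group = [m] + [c for c in correlated_with(m) if not flags[c]["analyzed"]]
        for g in group:                   # step 3: analyze the correlated set
            robust, preferred = review(g)
            flags[g]["analyzed"] = True
            flags[g]["robust"] = robust
            flags[g]["export"] = preferred
        # step 4: filter off; the outer loop picks the next unanalyzed model
    return flags
```

Every model ends up marked as analyzed, and at most one model per correlated set is marked for export.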
The modeling run used for illustration is in the Synergy online database and can be downloaded directly into the Synergy application. All Synergy users have access to the database of modeling runs.