in Modeling Software - 17 Oct, 2016
by James
Model Building and Analysis Tutorial

This article describes the recommended approach to building market timing models. Synergy is feature rich and quite flexible, so completely valid variations on the workflow presented here are possible.

The goal of the example is to build models for trading the SP futures on the trading day following the day the systems are updated. All trades are placed market on close (MOC). You will need daily data for the SP futures and daily CBOE VIX data to recreate the modeling run. The SP data is non-adjusted and continuously linked. Both the SP futures and VIX data are provided by Pinnacle Data Corp.
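Synergy aligns the traded series and any additional series by date internally, but it is worth seeing why this matters: every model input row needs both an SP close and a VIX close, so dates present in only one series must be dropped. A minimal sketch of that idea in plain Python (the function name and toy values are mine, not Synergy's):

```python
# Hypothetical sketch of date alignment between a traded series and an
# additional series. Dates missing from either series are discarded.

def align_series(sp, vix):
    """Return date-sorted rows (date, sp_close, vix_close) for common dates."""
    common = sorted(set(sp) & set(vix))
    return [(d, sp[d], vix[d]) for d in common]

# Toy data: the VIX series is missing one date the SP series has.
sp_close = {"2016-10-10": 2157.0, "2016-10-11": 2130.5, "2016-10-12": 2132.0}
vix_close = {"2016-10-10": 13.4, "2016-10-12": 15.6}

rows = align_series(sp_close, vix_close)  # only the two common dates survive
```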

Part 1. Exploratory Modeling Run

The first task is to do an exploratory modeling run. Exploratory modeling runs can be completed in a relatively short period of time. After analyzing the results of the exploratory modeling run we can decide if we want to pursue the use of the VIX for modeling the SP futures any further.

Figure 1. Data for Exploratory Modeling Run

The SP futures is the first series loaded and is the traded series. The VIX index is an additional data series. You can load as many additional data series as you like, but it is best to investigate the usefulness of one series, or one group of series (like the NASDAQ breadth data), per modeling run.

All function blocks are selected by default. None have been deselected.

Figure 2. Function Blocks for Exploratory Modeling Run

There are quite a few settings that can optionally be modified. The main settings pertinent to the example are displayed. Default settings were applied where not shown.

Figure 3. Trade settings for Exploratory Modeling Run

We are seeking to build models that trade MOC with a 1-day trading delay; that is, update the data and systems today and place the trades market on close on the following trading day.
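The 1-day delay matters because a signal computed after today's update cannot be acted on until tomorrow's close; shifting the signal forward one bar before computing returns is what keeps a backtest free of lookahead bias. A sketch of the arithmetic (my own illustration, not Synergy's code):

```python
# Sketch: returns of trading MOC with a 1-bar delay. signals[t] is the
# position decided after bar t closes; the trade is entered at the close of
# bar t+1, so it earns the close-to-close return from bar t+1 to bar t+2.

def delayed_moc_returns(closes, signals):
    rets = []
    for t in range(len(closes) - 2):
        bar_ret = (closes[t + 2] - closes[t + 1]) / closes[t + 1]
        rets.append(signals[t] * bar_ret)
    return rets

closes = [100.0, 101.0, 102.0, 100.0]
signals = [1, -1]          # long decided after bar 0, short after bar 1
rets = delayed_moc_returns(closes, signals)
```

Both toy trades are profitable here: the long catches the rise from 101 to 102, and the short catches the fall from 102 to 100.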

Use of the columns belonging to the traded series has been disabled. This is done because we want to evaluate the use of the VIX series only.

Figure 4. Model Structure for Exploratory Modeling Run

Some intermarket function blocks can use a combination of the traded series columns and columns belonging to additional series. If we wanted to preclude the use of columns belonging to the traded series entirely then we would have checked the Exclude Traded Intmrkt checkbox as well.

The model structure will not be optimized using the genetic optimizer.

Figure 5. Model Structure Optimization for Exploratory Modeling Run

The default percentage split for the modeling periods is 70% for the build period and 90% for the validation period, which leaves the final 10% as out-of-sample data. In other words, the build period covers the first 70% of the data, the validation period spans from the 70% mark to the 90% mark, and the last 10% is held out-of-sample. Approximately 65% of the data rows are being used for the build period in this case.
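The split is configured in the UI, but the underlying arithmetic is simple. A sketch under the default 70%/90% settings (function and variable names are mine):

```python
# Sketch of the default modeling-period split: build = first 70% of rows,
# validation = rows from the 70% to the 90% mark, out-of-sample = last 10%.

def split_periods(n_rows, build_pct=0.70, valid_pct=0.90):
    """Row index ranges for the build, validation and out-of-sample periods."""
    build_end = round(n_rows * build_pct)
    valid_end = round(n_rows * valid_pct)
    return (range(0, build_end),
            range(build_end, valid_end),
            range(valid_end, n_rows))

build, valid, oos = split_periods(5000)   # e.g. 5000 daily rows
```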

Figure 6. Modeling Periods for Exploratory Modeling Run

Note that 2 major bull and bear market periods are included in the build period.

Figure 7. Modeling Periods Chart for Exploratory Modeling Run

One of the keys to executing an exploratory modeling run is to disable the PSO. A much larger variety of models will then be evaluated per unit of time. The PSO is enabled or disabled by checking or unchecking the Use PSO checkbox.
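The trade-off is that the PSO spends its time refining the parameters of one candidate model, whereas with it disabled each candidate gets a single evaluation and the run moves on. Synergy's internal PSO settings are not public, but the algorithm itself is standard; a minimal sketch on a toy 1-D objective, using common textbook coefficients (all names and values here are mine):

```python
import random

# Minimal particle swarm optimizer: a swarm of candidate parameter values is
# iteratively pulled toward each particle's personal best and the swarm's
# global best. Illustrative only; not Synergy's implementation or settings.

def pso_minimize(f, lo, hi, n_particles=10, iters=50, seed=7):
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]                       # each particle's best position so far
    pbest_val = [f(p) for p in pos]
    gi = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[gi], pbest_val[gi]
    for _ in range(iters):
        for i in range(n_particles):
            # inertia plus pulls toward the personal and global bests
            vel[i] = (0.7 * vel[i]
                      + 1.4 * rng.random() * (pbest[i] - pos[i])
                      + 1.4 * rng.random() * (gbest - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i], val
    return gbest, gbest_val

# Toy objective with its minimum at x = 3.
best_x, best_val = pso_minimize(lambda x: (x - 3.0) ** 2, -10.0, 10.0)
```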

Figure 8. PSO Settings for Exploratory Modeling Run

There is no need to enable the Autosave function for the exploratory modeling run, although there is no harm in doing so.

Figure 9. Control Panel settings for Exploratory Modeling Run

Autovalidation of models based on performance over the validation period will be done.

Figure 10. Autovalidation Rules for Exploratory Modeling Run

Note that the automatic walk-forward test has been disabled. There is no need to run the walk-forward test at this stage. The Min Trade Returns Skewness rule is disabled, but it would not matter if it were enabled.

The next step is to click the Run button on the Control Panel and let Synergy run for 5 to 10 minutes before clicking the stop button.

Figure 11. Exploratory Modeling Run Stopped After 5 Minutes

At the completion of the exploratory modeling run we want to know if Synergy was able to produce any models and if those models look like the types of models that we want. Ignore the out-of-sample performance as it is not important at this point in the procedure.

Figure 12. Models Produced by Exploratory Modeling Run

Plenty of models were produced: the typical in-sample percent of perfect is in the 7% to 9% range, and the average traded period is around 7 to 8 trading days. Executing a production modeling run is justified. That is all there is to an exploratory modeling run.

Part 2. Configuring a Production Modeling Run

The goal of a production modeling run is to produce models that will be used as a component of a trading system. Production modeling runs need to run for significant periods of time. At AdaptiveTradingSystems.com we have run the model building process continuously for up to 3 days, although a decent set of models will often be produced overnight.

The main differences between an exploratory modeling run and a production modeling run are:

  • The PSO is enabled for a production modeling run and disabled for an exploratory modeling run.
  • The walk-forward autovalidation test is enabled for a production modeling run and disabled for an exploratory modeling run. If the walk-forward autovalidation test is not enabled for a production modeling run, then the WF Simulator can be used to test how well the models perform walking forward once the run completes.

An optimization target is going to be applied to influence the characteristics of the models that are produced. We would like to see the percentage of trades that are profitable equal 60% or greater. If a given model achieves 58% profitable trades but appears to be robust, then that is fine.
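The target statistic itself is just the win rate over a model's closed trades. A sketch of the computation (not Synergy's internal code):

```python
# Sketch: percentage of trades with a positive return, the statistic the
# 60%-profitable optimization target above refers to.

def percent_profitable(trade_returns):
    """Fraction of trades with a positive return, as a percentage."""
    if not trade_returns:
        return 0.0
    wins = sum(1 for r in trade_returns if r > 0)
    return 100.0 * wins / len(trade_returns)
```

A model with 6 winners in 10 trades meets the 60% target exactly; one with 58 winners in 100 trades falls just short but, as noted above, may still be acceptable if it appears robust.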

Figure 13. Optimization Targets for Production Modeling Run

The PSO has been enabled by ensuring the Use PSO checkbox is checked.

Figure 14. Production Modeling Run PSO Settings

The walk-forward autovalidation test is now enabled.

Figure 15. Autovalidation Rules for Production Modeling Run

The walk-forward autovalidation test has been configured so that at least 2 scenarios must perform at 7% of perfect or better over the validation period.

Figure 16. Production Modeling Run Walk-Forward Test Settings

Note that different scenarios are distinguished by their lookback period. The above walk-forward test will run scenarios with lookback periods ranging from 500 to 2500 trading days in increments of 500.
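The scenario grid and the "at least 2 scenarios at 7% of perfect or better" rule can be sketched as follows (function names and the sample scores are mine, purely illustrative):

```python
# Sketch of the walk-forward scenario grid: one scenario per lookback
# period, 500 to 2500 trading days in increments of 500, with a model
# passing autovalidation when enough scenarios clear the threshold.

def lookback_scenarios(start=500, stop=2500, step=500):
    return list(range(start, stop + 1, step))

def passing_scenarios(scores_by_lookback, min_pct_of_perfect=7.0):
    """Lookbacks whose walk-forward percent of perfect meets the threshold."""
    return [lb for lb, score in scores_by_lookback.items()
            if score >= min_pct_of_perfect]

scenarios = lookback_scenarios()          # [500, 1000, 1500, 2000, 2500]
# Hypothetical scores: three of five scenarios clear the 7% bar, so the
# "at least 2 scenarios" autovalidation rule would be satisfied.
scores = {500: 8.1, 1000: 6.2, 1500: 7.4, 2000: 9.0, 2500: 5.5}
passed = passing_scenarios(scores)
```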

The production modeling run was left to run overnight. After approximately 8 hours, 24 models were produced.

Figure 17. Models produced by Production Modeling Run

Part 3. Analyzing the Ensemble Report for a Production Modeling Run

The first task is to analyze the Ensemble Report. If the input series are useful for modeling the traded series, then there will be a high bias toward profitability over the out-of-sample period.

Figure 18. Ensemble Report for Production Modeling Run

The percent of models that were hypothetically profitable over the out-of-sample period is 87.5% and the median percent of perfect out-of-sample is 5.3%. This is a decent result. Ideally we would like to see greater than 90% of models profitable out-of-sample with the median percent of perfect around 7%.
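The two Ensemble Report statistics quoted above are straightforward to compute. A sketch with illustrative field names (not the report's internal format):

```python
# Sketch of the two ensemble statistics: the share of models profitable
# out-of-sample, and the median out-of-sample percent of perfect.

def ensemble_stats(models):
    """models: list of (oos_profit, oos_pct_of_perfect) pairs."""
    n = len(models)
    profitable = sum(1 for profit, _ in models if profit > 0)
    pcts = sorted(p for _, p in models)
    mid = n // 2
    median = pcts[mid] if n % 2 else (pcts[mid - 1] + pcts[mid]) / 2.0
    return 100.0 * profitable / n, median

# Toy ensemble: 3 of 4 models profitable, median percent of perfect 5.5.
pct_profitable, median_pop = ensemble_stats(
    [(100, 5.0), (-50, 2.0), (80, 7.0), (30, 6.0)])
```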

If the Ensemble Report looks good, then the next step is to analyze the individual models. If about half of the models had produced a hypothetical profit out-of-sample, then the entire modeling run would be considered suspect and no further analysis would be done.

Part 4. Analyzing the Models Produced by a Production Modeling Run

When multiple models have highly correlated signals it is usually desirable to export only one model from the set. The first step when analyzing individual models is to select the first model in the Models list and turn on the Model Correlation Filter.

Figure 19. Correlation Filter Applied to Model 1

When the filter is applied, only models with signals that are correlated to the selected model’s signal are listed. The minimum correlation is specified in the Min R text box. For this example, the minimum correlation level is 0.92.
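The filter's criterion is the correlation between two models' signal series. A plain-Python sketch of that test (Pearson correlation here is my assumption about the statistic used; the function names are mine):

```python
# Sketch of a model correlation filter: keep only models whose signal
# series correlates with the selected model's signal at r >= Min R.

def pearson_r(x, y):
    """Pearson correlation between two equal-length signal series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def correlated_models(selected_signal, other_signals, min_r=0.92):
    """Indices of models whose signal correlates with the selected one."""
    return [i for i, sig in enumerate(other_signals)
            if pearson_r(selected_signal, sig) >= min_r]

sig_a = [1, -1, 1, 1, -1, -1, 1, -1]   # toy daily long/short signals
sig_b = [-s for s in sig_a]            # perfectly inverted signal
kept = correlated_models(sig_a, [sig_a, sig_b])   # only index 0 passes
```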

According to the Model Correlation Filter no other models have signals that are highly correlated to model 1’s signal. The walk-forward test was applied during model building. I like to manually run the walk-forward test as well.

Leaving the Model Correlation filter on, the next step is to run the WF Simulator for each model in the set. In this case there is only one model.

Figure 20. Walk-Forward Simulation for Model 1

The walk-forward simulation for model 1 is reasonable, though not brilliant. The model is probably robust. However, I've seen better models and I'm not tempted to export it.

Note that at this point we have viewed the out-of-sample performance for a particular model. If we accept and reject individual models based on their out-of-sample performance, then the out-of-sample period is no longer out-of-sample.

Model 1 appears to be robust so it is marked as robust.

Figure 21. Marking a Model as Robust

Note that if a model does not appear to be robust then it should be marked as analyzed, but not as robust. The purpose of marking a model as analyzed is to keep track of which models have been analyzed and which have not.

That completes the analysis of the first set of models, noting that the set contained only one model.

The next step is to turn off the Model Correlation Filter and select the next model in the list that has not been analyzed. When working through a list of models, following these steps makes remaining organized easy.

Figure 22. Model Correlation Filter Off Again

After going back to the Models list, turning off the Model Correlation Filter, selecting the next model in the list that has not been analyzed (model 2) and turning on the Model Correlation Filter again we see a larger set of correlated models.

Figure 23. Model Correlation Filter Applied to Model 2

After analyzing each model in the group, comparing the statistics and walk-forward performance, model 2 was identified as the preferred model and marked for export.

Figure 24. Single Scenario WF Simulation for Model 2

At this stage the Model Correlation Filter is still applied and all models in the second set have been analyzed.

Figure 25. Analysis of Models in Second Set Complete

The next step is to turn off the Model Correlation Filter and select the next model that has not been analyzed before turning on the filter again. The process is repeated until all models have been analyzed.

Figure 26. Turning Off the Model Correlation Filter Again

Working through a list of over 100 models is very manageable using the presented workflow. A summary of the workflow for analyzing individual models follows.

  1. Select the next model in the list that has not been analyzed.
  2. Turn on the Model Correlation Filter.
  3. Analyze all models in the list of models with highly correlated signals. If a model does not appear to be robust then mark it as analyzed only. If it is considered robust then mark it as robust. Mark the preferred model in the set for export.
  4. Turn off the correlation filter.

Go to step 1 and repeat until all models have been analyzed.

The modeling run used for illustration is in the Synergy online database and can be downloaded directly into the Synergy application. All Synergy users have access to the database of modeling runs.

Kind Regards,

James
