Systematic futures traders are constrained by the small set of daily data available for developing and testing strategies. The last data revolution was the introduction of 24-hour electronic contracts starting in 2000. Only a few symbols go back that far, and many start much later. Although some futures have a longer history, back-testing prior to 2000 is not helpful because markets changed radically with the introduction of electronic futures. Consequently, systematic back-tests have relatively few trading opportunities, often taking only 50 to 150 trades in the last 15 years. That is simply not enough to build robust strategies, because with so few trades the probability of curve fitting is high.

Volatility Signature

I dreamt of building synthetic daily futures data to mimic actual daily price movement, but it seemed impossible because each futures symbol behaves very differently from the others. However, research in the area of options led me to develop tools to predict the price movement of futures contracts on daily data. I discovered that each futures symbol has a unique volatility signature that can be captured and defined algorithmically. The mechanics of doing this are beyond the scope of this article, but I have provided a complete, detailed explanation on my website for those more academically inclined. Suffice it to say that it is a sophisticated technique that requires considerable work, and the result does not fall out directly from the data.

The volatility signature of each symbol is based on the proposition that if we know the current level of volatility and whether price closed higher or lower than the prior day's close, we can use a look-up table to predict the possible range of price movement for tomorrow. The table is broken down into 8 libraries of data: 4 to capture and predict price movement if today was an up close from the prior bar, and 4 if today was a down close. Each library contains roughly the same number of measurements taken from the original set of data. Each measurement records a price movement and the accompanying volatility reading. Within each set of 4, a library collects price movements with similar volatility readings. Library 1 holds the lowest volatility readings and library 4 the highest, with libraries 2 and 3 in between covering the mid-low and mid-high readings.
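To make the look-up table concrete, here is a minimal sketch of how the 8 libraries could be organized. It is not the actual method, which is described on the website. It assumes the volatility reading is a 20-day rolling standard deviation of log returns (annualized, in percent), that price movement is stored as the next day's log return, and that the quartile cuts are taken separately for up closes and down closes. The names `build_libraries` and `vol_window` are hypothetical.

```python
import numpy as np

def build_libraries(closes, vol_window=20):
    """Split next-day moves into 8 libraries keyed by (direction, level).

    direction is "up" or "down" (today's close versus the prior close);
    level runs from 1 (lowest volatility) to 4 (highest volatility).
    """
    closes = np.asarray(closes, dtype=float)
    returns = np.diff(np.log(closes))  # returns[i]: log move from day i to day i+1
    n = len(returns)

    # "Today" must have a full volatility window behind it and a next-day move ahead.
    today = np.arange(vol_window - 1, n - 1)

    # ASSUMPTION: volatility reading = rolling std of log returns, annualized, in percent.
    vol = np.array([returns[t - vol_window + 1 : t + 1].std() for t in today])
    vol = vol * np.sqrt(252) * 100

    next_move = returns[today + 1]
    up_today = returns[today] > 0  # unchanged closes are treated as "down" here

    libraries = {}
    for direction, mask in (("up", up_today), ("down", ~up_today)):
        # Quartile cuts give each of the 4 libraries roughly the same size.
        cuts = np.quantile(vol[mask], [0.25, 0.50, 0.75])
        level = np.searchsorted(cuts, vol[mask], side="right")  # 0..3
        for q in range(4):
            libraries[(direction, q + 1)] = {
                "moves": next_move[mask][level == q],
                "vols": vol[mask][level == q],
            }
    return libraries

# Usage on made-up data (a random walk standing in for real closes).
rng = np.random.default_rng(0)
demo_closes = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 2000)))
demo_libraries = build_libraries(demo_closes)
for key in sorted(demo_libraries):
    print(key, len(demo_libraries[key]["moves"]))
```

Given today's direction and volatility level, the matching library then supplies the range of moves observed historically under similar conditions.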

Monte Carlo Simulation Technique

The only way I know of to predict price movement is Monte Carlo simulation. With considerable effort I built a Monte Carlo engine that repeatedly samples the libraries of data I had collected in order to predict price movement up to 70 days into the future. A by-product of this effort was the realization that I could create synthetic data with the tools I had built. The synthetic data, if built properly, should represent realistic potential price movement based on the libraries for the chosen symbol.

I fully expected the randomness of a Monte Carlo simulation to be adequate for making accurate predictions of price movement. It was not until I built synthetic data for the euro that I saw a flaw in my project. I was creating numerous synthetic data series, each with over 50 years of data. If my tool had been behaving properly, I would have seen volatility move through periods of low, mid-low, mid-high and high volatility. Instead, the Monte Carlo simulator never allowed volatility to reach the mid-high and high levels. Most of the synthetic simulations that I ran started with the euro at 1.2 and ended the 50-year simulation with the euro over 15 or 20. This means that the bulk of the euro's trend-following moves occur under low volatility conditions.

Low Volatility Library

This is astonishing, since Monte Carlo simulation should not produce this kind of bias toward low volatility unless markets are less random than one would expect and are instead influenced in large part by economic events that can have an extended duration. I went back to the Monte Carlo simulation tool and introduced a routine that randomly selects one of the volatility libraries and a number of days to use it, before switching randomly to a different volatility library for a different random duration. So, for instance, the lowest volatility library might be used for 95 days, then the highest for 26 days, then mid-high for 194 days, and so on. In essence this simulates economic events and their impact, and it was the key to generating data that looked and behaved much like the original data set.
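The switching routine can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual engine: the libraries here are made-up normal draws, moves are treated as log returns, the run length is drawn uniformly between 20 and 200 days, and the names `simulate_closes`, `toy_libraries`, `min_run` and `max_run` are hypothetical.

```python
import numpy as np

LEVELS = (1, 2, 3, 4)  # 1 = lowest volatility library, 4 = highest

def simulate_closes(libraries, start_price, n_days, rng, min_run=20, max_run=200):
    """Sample daily moves while holding one volatility level for a random run of days."""
    closes = [float(start_price)]
    levels_used = []
    direction = "up"  # ASSUMPTION: arbitrary direction for the day before the start
    level = None
    days_left = 0
    for _ in range(n_days):
        if days_left == 0:
            # Switch to a *different* volatility level for a new random duration.
            choices = [lv for lv in LEVELS if lv != level]
            level = int(rng.choice(choices))
            days_left = int(rng.integers(min_run, max_run + 1))
        # Draw tomorrow's move from the library matching today's direction and level.
        move = float(rng.choice(libraries[(direction, level)]))
        closes.append(closes[-1] * np.exp(move))
        levels_used.append(level)
        direction = "up" if move > 0 else "down"
        days_left -= 1
    return np.array(closes), np.array(levels_used)

def toy_libraries(rng, size=500):
    """Made-up libraries for demonstration only: wider moves at higher levels."""
    libs = {}
    for level in LEVELS:
        for direction in ("up", "down"):
            libs[(direction, level)] = rng.normal(0.0, 0.002 * level, size)
    return libs

rng = np.random.default_rng(42)
series, levels = simulate_closes(toy_libraries(rng), start_price=1.2,
                                 n_days=50 * 252, rng=rng)
print(len(series), series[0], series[-1])
```

Holding a level for an extended run is what lets the series spend time in each volatility regime, instead of averaging the regimes together day by day.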

Quality of Synthetic Data

b1.jpg

So how good is the synthetic data? This can be quantified by seeing how closely the volatility signature of the synthetic data matches the original volatility signature of the underlying symbol. A volatility signature is defined by the three volatility levels that divide the readings into quartiles. In the case of the Japanese Yen, the signature of the underlying data is 9.96, 11.82, and 13.83. This means that one quarter of the original data sampled has a volatility measurement less than 9.96, one quarter between 9.96 and 11.82, one quarter between 11.82 and 13.83, and one quarter greater than 13.83. I then computed the average volatility signature of 20 of the 50-year synthetic daily data files. The corresponding average signature was 9.91, 11.54 and 13.50. The standard deviations of the 3 data points across those files were 0.195, 0.271, and 0.350 respectively, so each average falls within roughly one standard deviation of the original value, which is clearly very good. At the end of the article I have posted some screenshots of the data. Two of them are from the original data set and two of them are synthetic. See if you can tell which is which.
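The comparison above can be sketched as follows, assuming a signature is simply the 25th, 50th and 75th percentiles of the volatility readings. The function names are hypothetical. The last lines reuse the Yen figures quoted above to express the gap between the original and synthetic signatures in units of standard deviation.

```python
import numpy as np

def volatility_signature(vol_readings):
    """Return the three volatility levels that split the readings into quartiles."""
    return np.percentile(vol_readings, [25, 50, 75])

def compare_signatures(original_vols, synthetic_vol_sets):
    """Original signature, plus mean and sample std of the synthetic signatures."""
    original = volatility_signature(original_vols)
    sigs = np.array([volatility_signature(v) for v in synthetic_vol_sets])
    return original, sigs.mean(axis=0), sigs.std(axis=0, ddof=1)

# Figures quoted in the article for the Japanese Yen.
original_sig = np.array([9.96, 11.82, 13.83])
synthetic_avg = np.array([9.91, 11.54, 13.50])
synthetic_sd = np.array([0.195, 0.271, 0.350])

# Gap between original and synthetic average, in standard deviations.
gap_in_sd = (original_sig - synthetic_avg) / synthetic_sd
print(np.round(gap_in_sd, 2))
```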

 b2.jpg

This research allowed me to build a series of synthetic data files that simulate the volatility signature of the original symbol. The data is in ASCII CSV form with DATE and CLOSE fields only. There is no attempt yet to include open, high and low price points. For interested readers, I have made available on my website 5000 years of data for each of the major currency futures symbols. Specifically, 100 50-year files can be downloaded free of charge for each symbol from the link on the homepage of my website at http://insideedge.net.
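A file in this DATE, CLOSE layout could be read with a few lines of Python. This is a sketch only: the exact date format and whether the files carry a header row are assumptions, so dates are kept as plain strings and any row whose second field is not a number is skipped. The sample rows are made up for illustration and the name `read_close_series` is hypothetical.

```python
import csv
import io

def read_close_series(text):
    """Parse DATE,CLOSE rows; skip a header or any row without a numeric close."""
    dates, closes = [], []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue
        try:
            close = float(row[1])
        except ValueError:
            continue  # header line or malformed row
        dates.append(row[0].strip())
        closes.append(close)
    return dates, closes

# Made-up sample rows, not taken from the downloadable files.
sample = "DATE,CLOSE\n2001-01-02,1.2000\n2001-01-03,1.2035\n"
sample_dates, sample_closes = read_close_series(sample)
print(sample_dates, sample_closes)
```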

b3.jpg
