September 20, 2011
JMS: How did you get into trading?
KF: I received an advanced degree in stochastic estimation through the Air Force — that’s the science of extracting information from noisy time series data. I thought that the commodity markets would be the perfect place to test if I could find a way to make money with my education. I bought some commodity data on 6 instruments and went to work. That was in the early 80’s. I learned a lot just playing with the computer on the data, and in 1986 I had something I thought would work. I started trading and learned even more through some real-trading ups and downs.
JMS: What’s the biggest challenge in developing trading systems?
KF: Far and away, it’s avoiding excessive curve-fitting. Every strategy has a degree of curve-fitting because you’re using historical data to build a decision engine. It’s not like developing a mathematical model such as the law of gravity, but there are ways to minimize it. Unfortunately, today’s development packages are uniformly built around the one-chart, one-system solution. The user is encouraged to build his strategy on the data for one chart. With the hundreds of point-and-click indicators available in the packages and the ability to optimize across entries, stops, filters, etc., the user is led into a trap. He winds up with a strategy that is almost all winners, with very small losers, and an equity curve that rises from the bottom left to the top right of the graph in almost a straight line. Unfortunately, the likelihood that the strategy will trade that way going forward is almost nil. The solution has a relatively small number of trades and many rules, filters, and risk control measures that wind up cherry-picking trades out of the data; it’s totally curve-fit.
JMS: You released Aberration in the early 90’s. Has your approach to system development changed since then?
KF: My basic approach is to develop trading strategies that are minimally curve-fit. That hasn’t changed from the beginning. What has changed is the process I use from start to end. With the power of today’s computers, you can do things you could only dream of in the 80’s or even the early 90’s. Today, you can do a run on 50 commodities using 30 years of history in minutes — that includes money management. It took me hours to make the same run in the 80’s, and that run had fewer commodities and less history. I take advantage of this time savings by including money management in every run. My system development metric used to be profit per trade. Now it’s profit per year divided by average max draw-down per year — a form of gain-to-pain. I’ve found that the strategy that maximizes profit-per-trade usually isn’t the best trading solution because the draw-downs are relatively large. The strategy that offers the best gain-to-pain is the one I want to trade so now every run is measured that way.
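The gain-to-pain metric described above can be sketched in Python. This is a minimal illustration, not KF’s actual code: the function names, the `(year, dollars)` input format, and the choice to measure each max draw-down from that year’s own starting equity are all assumptions.

```python
from collections import defaultdict

def max_drawdown(pnl):
    """Largest peak-to-valley drop of the running sum of pnl (equity starts at 0)."""
    equity = peak = worst = 0.0
    for p in pnl:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

def gain_to_pain(daily_pnl):
    """Average yearly profit divided by average yearly max draw-down.

    daily_pnl: iterable of (year, dollars) pairs in chronological order
    (a hypothetical input format chosen for this sketch).
    """
    by_year = defaultdict(list)
    for year, dollars in daily_pnl:
        by_year[year].append(dollars)
    profits = [sum(v) for v in by_year.values()]
    drawdowns = [max_drawdown(v) for v in by_year.values()]
    avg_profit = sum(profits) / len(profits)
    avg_drawdown = sum(drawdowns) / len(drawdowns)
    if avg_drawdown == 0:
        # No losing stretch at all: the ratio is undefined, report infinity.
        return float("inf")
    return avg_profit / avg_drawdown

# Two toy "years": profits of 250 and 200, max draw-downs of 50 and 100.
example = [(2010, 100), (2010, -50), (2010, 200), (2011, -100), (2011, 300)]
print(gain_to_pain(example))  # 225 / 75
```

Ranking candidate strategies by this ratio instead of by profit per trade penalizes solutions whose profits come with large draw-downs, which is the trade-off described in the answer above.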
JMS: OK, I understand that your process has changed, but what approach do you use to keep your strategies from being curve-fit?
KF: I believe that the degree of curve-fitting in a system is inversely proportional to the number of trades in the development sample. If you develop a system that has 100 total trades, it will be much more curve-fit than one that has 1,000 trades in its back-test. I try to develop systems that have thousands of trades in the back-test. If you use daily bar data, it’s almost impossible to generate thousands of trades on one commodity because of the relatively short history of most commodities. So to get the thousands of trades I want, I use all the liquid commodities and use the same rules across the whole basket.
JMS: Are you saying that the best way to trade commodities is to use the same rules and parameter values for all of them?
KF: No. I know that the [factors] that drive price up and down vary from one to another. If I had enough data on one to yield thousands of trades, I’d develop a strategy to just trade that commodity. [However,] that data doesn’t exist so the next best thing is to develop across all of them. What you wind up with probably isn’t the best solution for any of them, but it is the best across all; and if the return to draw-down ratio is high enough, it’s worth trading.
JMS: Why do you think you need thousands of trades to minimize curve-fitting? There are references in the trading literature saying that 30 is enough.
KF: Yes, and some people I respect [have] said that, but they’re confusing what statisticians say is the number of samples required to start using normal distribution statistics with what’s required to characterize the sample.
Suppose you had a barrel of socks and pulled out 30 pairs. If 15 were black and 15 white, you’d probably be pretty confident that the distribution in the barrel was about 50-50 black and white. But if I stuffed the barrel with a random number of socks for each of Crayola’s 120 colors, there’s no way you’d have any idea of the distribution with 30 draws. You’d need hundreds, maybe thousands, to know.
Trading is like the second sock case. Trades range from relatively big losers to relatively big winners. There’s no way 30 trades can characterize the distribution. I illustrate this in my seminars by generating a lot of trades on a basket of commodities. I characterize that group of trades by the average profit-per-trade and the standard deviation of the profit-per-trade. Then I sample from the big distribution and compute the statistics of the smaller sample and compare it with the “true statistics” of the big sample. The comparison is done by computing a statistic called the “standard error”. As the size of the smaller sample gets bigger, the standard error decreases until it is “close” to 0. It’s not until many thousands of trades that the standard error approaches zero.
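The sampling comparison described above can be approximated with a short Python experiment. This is a sketch under stated assumptions: the trade population is synthetic (normally distributed with a hypothetical $50 mean and $1,000 standard deviation), and "standard error" is taken to be the textbook standard error of the mean, population standard deviation divided by the square root of the sample size, which may not be the exact statistic used in the seminars.

```python
import random
import statistics

def sampling_experiment(population, sizes, seed=0):
    """For each sample size, draw a random sample (with replacement) and report
    (size, sample mean minus population mean, standard error of the mean)."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    true_sd = statistics.pstdev(population)
    rows = []
    for n in sizes:
        sample = rng.choices(population, k=n)
        error = statistics.fmean(sample) - true_mean
        rows.append((n, error, true_sd / n ** 0.5))
    return rows

# Synthetic "big distribution" of trade results in dollars (hypothetical numbers).
rng = random.Random(1)
trades = [rng.gauss(50.0, 1000.0) for _ in range(20000)]

rows = sampling_experiment(trades, sizes=(30, 300, 3000))
for n, error, se in rows:
    print(f"n={n:5d}  sample mean error={error:9.1f}  standard error={se:7.1f}")
```

Because the standard error shrinks with the square root of the sample size, going from 30 to 3,000 trades cuts it by a factor of 10. With a per-trade standard deviation that is many times the average profit per trade, a 30-trade sample says very little about the true average.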
JMS: OK, thousands of trades. But how do you know it isn’t curve-fit? Do you do out-of-sample testing?
KF: No. Out-of-sample testing has two problems. First, you’re leaving out precious data when you develop the strategy. That means there are fewer trades in your development sample than there would have been if you used all the data. The result will be more curve-fit than if you used all the data. Now maybe that isn’t a problem if you’re developing a stock strategy. In that case, you’ll easily be able to generate hundreds of thousands of trades. But for commodities, I think it is a problem.
The second problem is that you have no reference for performance. Suppose a non-curve-fit solution was developed that had the following performance:
- Year 1: $100,000
- Year 2: $60,000
- Year 3: $140,000
Now let’s assume you developed the strategy on two years of data and used the third year for out-of-sample testing. In this case, you have:
- Year 1: $100,000
- Year 2: $60,000
- Year 3: $80,000
With these results, you might conclude that the strategy was minimally curve-fit because the third year return was the average of the first two years. But in reality, the strategy was heavily curve-fit. The third year return should have been $140,000 but you only got $80,000, an under-performance of more than 40 percent. In that case, you’d probably keep the curve-fit strategy. An out-of-sample test could also lead you to discard a non-curve-fit strategy if the out-of-sample results under-performed the average.
JMS: So there’s no definitive way to know a system’s not curve-fit?
KF: I use a process I call BRAC. That stands for Build, Rebuild, and Compare. I use all the data and build a strategy step-by-step, keeping track of each step and the metrics I use to evaluate the logic and parameter values. That step is Build. If the strategy turns out to be something I want to trade, I’ll redevelop it with some of the data withheld. During the redevelopment process, I’ll go through the same development steps and use the same metrics as in the first Build. That’s the Rebuild step. Then, I compare the results of the two strategies over the time-frame of the withheld data. If the results over that time-frame closely match, I know the original strategy is minimally curve-fit.
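The Build, Rebuild, and Compare sequence can be expressed as a small Python skeleton. This is only a sketch: `develop` and `evaluate` are hypothetical callables standing in for the full step-by-step development process and the back-test, and the 25 percent `tolerance` used to decide whether two results "closely match" is an assumption, since the interview gives no numeric threshold.

```python
def brac(develop, evaluate, data, holdout_start, tolerance=0.25):
    """Build on all data, rebuild with the tail withheld, compare on the tail.

    develop(data) -> strategy       (hypothetical: runs the same steps and metrics)
    evaluate(strategy, data) -> $   (hypothetical: result over the given data)
    Returns (built_result, rebuilt_result, results_are_close).
    """
    built = develop(data)                      # Build: uses every bar
    rebuilt = develop(data[:holdout_start])    # Rebuild: withheld data removed
    holdout = data[holdout_start:]
    built_result = evaluate(built, holdout)    # Compare: both strategies are
    rebuilt_result = evaluate(rebuilt, holdout)  # scored on the withheld span
    scale = max(abs(built_result), abs(rebuilt_result), 1e-12)
    close = abs(built_result - rebuilt_result) / scale <= tolerance
    return built_result, rebuilt_result, close

# Toy stand-ins: the "strategy" is just a direction (+1 or -1) chosen from the
# sign of the total price change, and its result is direction * holdout change.
def toy_develop(changes):
    return 1 if sum(changes) >= 0 else -1

def toy_evaluate(direction, changes):
    return direction * sum(changes)

daily_changes = [1, 2, -1, 3, 2, -1, 2]
print(brac(toy_develop, toy_evaluate, daily_changes, holdout_start=5))
```

The key design point is that `develop` is called twice with identical logic, so any difference in the withheld period comes from what the first build learned from the withheld data.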
JMS: I think I need an example.
KF: OK, suppose I want to build a moving average trend-following system across 50 commodities and the last 30 years. My steps and logic might look like this:
- In 10-day steps from 10 days to 100 days, find the number of days n that maximizes average annual return divided by the average of the 30 annual max draw-downs. Long entry occurs when there is a close above the n-day average, and reversal to short occurs when there is a close below the n-day average. One contract is traded at each signal.
- Using the best n-day average from step 1, add a fixed dollar stop away from the entry. Test from $250 to $1,500 in $250 increments. If stopped out, you cannot re-enter in the same direction until a trade is entered in the opposite direction.
- Using the results of steps 1 and 2, add a volatility filter to prohibit entry if the market volatility is too high. Use the average of true range for the last 10 days as baseline volatility. If the true range for the day of the signal is higher than x times the average, don’t enter. Test x from 1 to 5 in half-point increments.
Suppose after this analysis, the resultant strategy is the one you want to trade. Go back and redevelop a similar strategy using the same steps on 29 years of data. Leave the last year out. At the end of that development process, see what the redeveloped strategy made in the last year and compare that against what the first strategy made. If the results are “close”, the first strategy is minimally curve-fit. If they aren’t close, the first strategy is curve-fit.
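Step 1 of the example, the sweep over the moving-average length, might look like the following in Python. This is a simplified sketch, not the actual development code: it runs on one synthetic random-walk price series instead of 50 commodities, treats every 250 bars as a "year", fills at the signal close with no slippage or commission, and omits the stop and volatility filter of steps 2 and 3. All function names are hypothetical.

```python
import random

def ma_reversal_pnl(closes, n, first_bar=None, point_value=1.0):
    """Daily P&L of a one-contract reversal system: long after a close above
    the n-day average, short after a close below it, held to the next close."""
    first_bar = n if first_bar is None else first_bar
    pnl, position = [], 0
    for i in range(first_bar - 1, len(closes) - 1):
        average = sum(closes[i - n + 1:i + 1]) / n
        if closes[i] > average:
            position = 1
        elif closes[i] < average:
            position = -1
        pnl.append(position * (closes[i + 1] - closes[i]) * point_value)
    return pnl

def max_drawdown(pnl):
    """Largest peak-to-valley drop of the running sum of pnl."""
    equity = peak = worst = 0.0
    for p in pnl:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

def gain_to_pain(pnl, bars_per_year=250):
    """Average yearly profit divided by average yearly max draw-down."""
    years = [pnl[i:i + bars_per_year] for i in range(0, len(pnl), bars_per_year)]
    avg_profit = sum(sum(y) for y in years) / len(years)
    avg_drawdown = sum(max_drawdown(y) for y in years) / len(years)
    if avg_drawdown == 0:
        return float("inf") if avg_profit > 0 else 0.0
    return avg_profit / avg_drawdown

def best_lookback(closes, lookbacks=range(10, 101, 10)):
    """Score every lookback over the same bars and return (best n, all scores)."""
    first_bar = max(lookbacks)  # start all runs on the same day
    scores = {n: gain_to_pain(ma_reversal_pnl(closes, n, first_bar))
              for n in lookbacks}
    return max(scores, key=scores.get), scores

# Synthetic prices: 2,600 closes, giving exactly ten 250-bar "years" of P&L.
rng = random.Random(42)
closes = [100.0]
for _ in range(2599):
    closes.append(closes[-1] + rng.gauss(0.02, 1.0))

best_n, scores = best_lookback(closes)
print(best_n, round(scores[best_n], 3))
```

For the redevelopment pass, the same `best_lookback` call would be repeated on `closes` with the final year removed, and the two resulting strategies compared over that final year.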
JMS: That’s not as difficult as I thought. Once you have a minimally curve-fit strategy, do you expect it to hold up forever?
KF: I used to think that was true for the commodity markets. The longer-term supply and demand cycles that drive prices tend to cause sweeping trends that a system like Aberration can pick out. However, I believe things started to change around the year 2000. I measure overall market volatility by computing the standard deviation of closing prices over some number of days and then multiplying that result by the dollars per point the exchange has set up to get a dollar-volatility number for each commodity. From 1980 until the end of 1999, the average dollar volatility across a large basket of commodities was about $3,250. From the start of the year 2000 until September 2011, that number has risen to about $4,850. That’s nearly a 50 percent increase. I suspect that the change is caused by the dramatic increase in managed money entering the futures market, including sector funds that aren’t trend-followers, just asset-class collectors. But it doesn’t matter what the cause is; it’s a reality that’s probably here to stay. That kind of structural change in a market can cause a strategy to stop working or not work as well.
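The dollar-volatility measure described above reduces to a one-line calculation. In this Python sketch the 20-day lookback, the use of the population standard deviation, and the sample prices and $50 point value are assumptions, since the interview only says "some number of days".

```python
import statistics

def dollar_volatility(closes, dollars_per_point, lookback=20):
    """Standard deviation of the last `lookback` closes, converted to dollars
    with the contract's dollars-per-point multiplier."""
    window = closes[-lookback:]
    return statistics.pstdev(window) * dollars_per_point

# Hypothetical closes for a contract worth $50 per point.
sample_closes = [100, 102, 104, 106, 108]
print(dollar_volatility(sample_closes, dollars_per_point=50, lookback=5))

# The increase quoted in the interview, from about $3,250 to about $4,850.
increase = (4850 - 3250) / 3250
print(f"{increase:.1%}")
```

Averaging this number across a basket of commodities for each period gives a basket-level figure comparable to the two quoted above.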
JMS: Aberration is still ranked by Futures Truth as “One of the Top Ten Trading Systems of All Time”. Has it managed to perform well through this period of increased volatility?
KF: Aberration has continued to make money. In fact, 2008 was probably its best year ever, but the increased volatility has changed the risk-reward profile of the strategy. The initial volatility-based stop is now 50 percent larger on average than it was pre-2000. That’s increased risk. In 2006, I thought I needed to address the increased risk so I made a few simple changes and now market it as the Aberration Strategy.
JMS: I know you’ve been working on stock strategies. Is that for your own trading? Or are you going to sell stock systems?
KF: I’ve found some very good end-of-day stock strategies that I’m trading and have started to offer them on my website on a subscription basis. I offer 20 long and sell-short signals each day that have a 2-day hold. If you balance your long and short exposure, the returns are very high and the draw-downs relatively low. Additionally, I expect to offer stock option signals in the future.
JMS: Last question. There are rumors out that you’ve written a book. Is that true?
KF: Yes, I wrote a book about developing trading strategies and submitted it to Wiley. They’ve agreed to publish it, but I have a bit of a problem. Most of the book contains techniques and processes that have never been published, such as the BRAC process I described earlier. I don’t feel I can write about these development tools if there’s no software that can implement them. [I’ve spoken] to some companies that have trading system development software but have yet to do a deal.
JMS: OK, thanks Keith.