# Cracking The Horse Racing Code

June 6, 2008

Yes, I have been away for quite some time now. I have been saying that I had something worthwhile up my sleeve, but it has not been ready to put into words... until now.

For months I have been researching horse racing. It has not been research to find great horses, jockeys and trainers, although that was an inevitable side effect along the way, or to find a better feel and respect for the tradition, another glorious side effect, but to predict horse races more accurately than the morning line odds, and find the probabilities of each horse coming in first. In other words, I approached horse racing with a gambler's eye, a microscope, a Daily Racing Form, and a calculator, to see just how close I could come to predicting the outcome of a horse race on a regular basis.

I completely understand that succeeding in this endeavor could ruin horse racing as it exists. If the tracks cannot make money, they will no longer hold races, and the sport could collapse. Really though, it doesn't have much further to fall. Name the jockey who rode the winner of last year's Kentucky Derby. Name the horse that won last year's Kentucky Derby. Precisely. I'm sure Street Sense doesn't care that you do not remember him, but Calvin Borel should be upset. Seriously, only the Triple Crown gets people talking, and only a terrible accident like Barbaro's can get people to remember a horse anymore.

On the other hand, if the betting public shows an interest in the sport (making money has a tendency to do that to people), then horse racing will have a revival. Did the MIT blackjack team destroy cards? Absolutely not. In fact, the time following the MIT blackjack team saw a resurgence in the game. Horse racing can have that resurgence too, and it won't take a Triple Crown winner to do it. People just need to think they are smarter than the game and believe they can win. If people think they have an edge over the house, they will pay very close attention, and whatever money the tracks lose can be made up to a degree in television ratings and advertising. As soon as the system starts getting beaten, the people in charge of the track will change it. The house is always designed to win, and the tracks get their rake no matter what. Horse racing will be fine, especially after finding a new casual audience of supporters.

All it takes is a formula that will produce winners with consistency. A formula that can pick horses better than just looking at the morning line odds would make people feel like they have a huge advantage. Now I have my task.

The base of my research was academic journals. Curiously enough, there are more than a few scholars who have been enthralled enough by the sport and the math involved to have published their own works on the subject. In “Searching for Positive Returns at the Track: A Multinomial Logit Model for Handicapping Horse Races”, by Ruth N. Bolton and Randall G. Chapman, I found the information that would carry me through this project. Bolton and Chapman set out to find which variables would be most important to consider when evaluating the horses in a race and predicting the outcome. Their research suggested that "average amount of money earned per race in the current year" and "average speed rating over the last four races" were the two most important factors. "Lifetime win percentage" was also considered a significant variable, but not so much as the first two. The shocker to me was that jockeys, post position, and weight were deemed inconsequential for the most part. The best jockeys were often put on the best horses, and that correlation nullified much of the jockeys' value as a separate variable.

Armed with this new information, it was time to make an equation of my own. I had the variables deemed most important by Bolton and Chapman, and needed to weigh them. In the case of betting on a Maiden race, where none of the competing horses has ever won a race in its life, I decided to change the "lifetime win percentage" to "lifetime in the money percentage", or the share of its races in which the horse has finished in the top three. (Please note: All of the statistics that I use in this formula can be found in the Daily Racing Form, and you can use this the next time you go to the track.)

Of my three important variables, "money earned per race in the current year" and "average speed rating over the last four races" were the most important, while "lifetime win percentage" was significant. To weigh them out of ten, the first two, which I will label "$/race" and "AVSPDRT", were given the value of 4, while the last, "LifeWin%", was given the value of 2. 4+4+2=10, and all is right with the world.

Each horse is then rated by these past performances in comparison to the other horses in the race. Suppose that, in a three horse race, horse #1 has an AVSPDRT of 64, horse #2 has an AVSPDRT of 61, and horse #3 has an AVSPDRT of 58. Horse #1 earns 3 points for having the highest AVSPDRT, while horse #2 would earn 2 points and horse #3 would earn 1 point. (If it were a four horse race, the top horse would earn 4 points, a five horse race, 5 points, etc.) The same point-based ranking would then be done for $/race and LifeWin%.
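For anyone who would rather let a computer do the ranking, here is a minimal Python sketch of the point system. The function name `rank_points` is my own invention for illustration, and the tie rule (tied horses share the lower point value) is an assumption, not something the Daily Racing Form dictates.

```python
def rank_points(values):
    """Give each horse 1 point, plus 1 more for every rival it strictly beats.

    Assumption: horses with equal values share the same (lower) point value.
    """
    return [1 + sum(other < v for other in values) for v in values]


# The three horse example: AVSPDRT of 64, 61, and 58.
avspdrt = {1: 64, 2: 61, 3: 58}
points = dict(zip(avspdrt, rank_points(list(avspdrt.values()))))
print(points)  # {1: 3, 2: 2, 3: 1}
```

The same function would be called once per variable, so a race needs three calls: one each for $/race, AVSPDRT, and LifeWin%.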

Ok, are we all still together on this one? It's ok to go back and go over that again. It took me a few tries myself. I'd hate for someone to fall astray, get a bad ranking system, and then come back and yell at me because they lost a lot of money. We're good? Moving on then.

Let us suggest that the equations were performed for all the horses in an upcoming five horse race. After each horse was given points based on its ranking in those variables, they stand as such:

| Horse Number | pts from $/race | pts from AVSPDRT | pts from LifeWin% |
|---|---|---|---|
| 1 | 1 | 5 | 1 |
| 2 | 4 | 3 | 5 |
| 3 | 3 | 1 | 2 |
| 4 | 5 | 2 | 4 |
| 5 | 2 | 4 | 2 |

This is based on an actual race from the Aqueduct on April 24th. The #3 and #5 horses both have a 2 in the last column because they have the same LifeWin%. Now, remember how we weighted the values from before? I wasn't just screwing around with nonsense. Multiply each horse's points in each category by the values they were given earlier. Pts from $/race are multiplied by 4. Pts from AVSPDRT are multiplied by 4. Pts from LifeWin% are multiplied by 2. Let us look at how they stack up now.

| Horse Number | New Point Total |
|---|---|
| 1 | 26 |
| 2 | 38 |
| 3 | 20 |
| 4 | 36 |
| 5 | 28 |
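The weighting step can be sketched in a few lines of Python. The point values are the ones from the five horse race above, and the weights are my 4, 4, and 2. The names `WEIGHTS`, `rank_pts`, and `weighted_total` are just labels I made up for this sketch.

```python
# Weights out of ten for the three variables.
WEIGHTS = {"$/race": 4, "AVSPDRT": 4, "LifeWin%": 2}

# Rank points per horse, copied from the five horse race.
rank_pts = {
    1: {"$/race": 1, "AVSPDRT": 5, "LifeWin%": 1},
    2: {"$/race": 4, "AVSPDRT": 3, "LifeWin%": 5},
    3: {"$/race": 3, "AVSPDRT": 1, "LifeWin%": 2},
    4: {"$/race": 5, "AVSPDRT": 2, "LifeWin%": 4},
    5: {"$/race": 2, "AVSPDRT": 4, "LifeWin%": 2},
}


def weighted_total(pts):
    """Multiply each category's rank points by its weight and add them up."""
    return sum(WEIGHTS[name] * pts[name] for name in WEIGHTS)


totals = {horse: weighted_total(pts) for horse, pts in rank_pts.items()}
print(totals)  # {1: 26, 2: 38, 3: 20, 4: 36, 5: 28}

# Horses ordered from highest total to lowest.
ranking = sorted(totals, key=totals.get, reverse=True)
print(ranking)  # [2, 4, 5, 1, 3]
```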

Horse #2 is the favorite, with horse #4 being the second favorite, and the rest of the horses straggling behind. This gives you an independent way to rate the horses on your own. The best part is that now we can take it one step further. With the information you have in that last table, you can calculate what percentage of the total points each horse has, which gives you a rough estimate of each horse's chance of winning the race.

In the case of horse #1, it has 26 of the 148 total points, or 17.6%. Should we have it in a table? I say yes.

| Horse Number | % chance of winning |
|---|---|
| 1 | 17.6% |
| 2 | 25.7% |
| 3 | 13.5% |
| 4 | 24.3% |
| 5 | 18.9% |
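Turning point totals into percentages is one division per horse. Here is a rough Python sketch using the totals from this race. Treating a share of the points as a chance of winning is my own rough estimate, not a proven probability, and the variable names are made up for illustration.

```python
# Weighted point totals for the five horses in the example race.
totals = {1: 26, 2: 38, 3: 20, 4: 36, 5: 28}

# Sum of all points awarded in the race.
grand_total = sum(totals.values())

# Each horse's share of the points, as a percentage.
shares = {horse: 100 * pts / grand_total for horse, pts in totals.items()}

for horse, share in shares.items():
    print(f"Horse {horse}: {share:.1f}%")
```

Because every share is divided by the same grand total, the percentages always add up to 100.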

In the case of a small race like this, every horse has a significant chance of winning.

**Important Note**: Morning line odds can also be put into percentages like this. The equation is even easy to remember. When the odds are quoted against 1, add 1 to the first number, and then divide 100 by the result. (Example #1, 3-1 odds: 3+1=4, and 100/4=25% chance.) When the second number is not 1, add the two numbers together, divide that sum by the second number, and then divide 100 by the result. (Example #2, which is trickier, 5-2 odds: 5+2=7, 7/2=3.5, and 100/3.5=28.57% chance.)

After all the odds have been converted to percentages, you must add them all together. They will always come out to more than 100%, because the track takes into account its own rake, plus about a one point margin of error per horse in the race. In order to get the total back to 100, divide 100 by the sum of the morning line odds percentages, and multiply each percentage by the result. (Example: if the sum of the morning line odds percentages is 125, then 100/125=0.8. Multiply all the morning line odds percentages by 0.8 to find their true value, so 25% becomes 20%.)
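The odds conversion and the scaling back to 100 can be sketched in Python as follows. The helper names `implied_pct` and `normalize` are hypothetical, and the code assumes the odds are written as two whole numbers separated by a hyphen, like "5-2".

```python
def implied_pct(odds):
    """Convert odds written like '3-1' or '5-2' into a percentage."""
    first, second = (int(part) for part in odds.split("-"))
    # 100 / ((first + second) / second) is the same as 100 * second / (first + second).
    return 100 * second / (first + second)


def normalize(percentages):
    """Scale a list of percentages so that they add up to 100."""
    scale = 100 / sum(percentages)
    return [pct * scale for pct in percentages]


print(implied_pct("3-1"))            # 25.0
print(round(implied_pct("5-2"), 2))  # 28.57

# Percentages that add up to 125, as in the example: 25% is scaled by 0.8.
print(normalize([25, 40, 60])[0])    # 20.0
```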

Now you can compare the track percentages to our own, to see which horses you rate differently. If you have a horse that you feel has a 35% chance to win, and the track is giving it a 25% chance to win, a winning bet on that horse pays better than a winning bet on a horse the track rates higher than you do.
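The comparison itself is a subtraction per horse. The Python sketch below lists the horses that your numbers like more than the track does, biggest gap first. The function name `overlays` and all of the percentages in the example are made up for illustration. They are not from a real race.

```python
def overlays(own_pct, track_pct):
    """Return (horse, gap) pairs where our percentage beats the track's."""
    gaps = {horse: own_pct[horse] - track_pct[horse] for horse in own_pct}
    positive = [(horse, gap) for horse, gap in gaps.items() if gap > 0]
    # Largest disagreement in our favor comes first.
    return sorted(positive, key=lambda pair: pair[1], reverse=True)


# Hypothetical three horse race: we say 35%, the track says 25% for horse 1.
own = {1: 35.0, 2: 40.0, 3: 25.0}
track = {1: 25.0, 2: 45.0, 3: 30.0}
print(overlays(own, track))  # [(1, 10.0)]
```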

Finally, when betting exotics, such as exactas, trifectas and superfectas, take into account each horse's "lifetime in the money percentage". A horse may not win many races, but often finish in the top 3. That horse, even if it ranks low on your list of winners, may need to be put in your exotic bet, and forgetting those horses could cost you. Also, if after all this analysis you are still deadlocked on which horse is the favorite, then go to the jockeys and trainers. In a close race, a better jockey could make the difference. Even if they had been an afterthought until now, you would be wise not to forget them when you need just one more variable.

Now that I made you read the entire thing without telling you if it works, allow me to say... kind of. I have only tested it over the course of nine races, and much more data needs to be collected. It is basically a lock that I will be running this strategy every day that Saratoga is open, and a definitive answer should be ready by the beginning of September.

Until then, I'll let you know that over the course of nine races at the Aqueduct, my strategy yielded more horses finishing in the money three times, the morning line odds yielded more horses finishing in the money four times, and we had the same number of horses finish in the money two times. My total number of horses finishing in the money over the course of the day was 14 of a possible 27. (One was a technicality: I picked the 1A horse and the 1 horse placed, but since they are tied together, I would have still gotten the money.) The morning line odds had 14 1/2 of a possible 27 horses finish in the money. (They get a half point because in the second race, three horses scratched, leaving four to run. The morning line odds gave their bottom two horses the same odds. Both finished in the bottom two, but since one had to be third, they were right on a technicality.)

So with a technicality on both sides, the track beat my number of in the money selections by 1/2. Only time and more races can determine which system is really better, and since I have time to tweak mine before Saratoga, I am feeling pretty good about coming out on top.

If this all works out, and people get wind of my system and start using it, racing could find the interest it has been losing slowly since Affirmed last won the Triple Crown in 1978. If I were you, I'd back this horse to win.