deTERMINATOR: Confidence Level vs. Score Problem

Dave Schwartz

I thought some of you guys might find this interesting.

The way The deTERMINATOR works under the hood is that it has created and optimized about 600 systems.
When it handicaps it uses its rule set to sort the systems into an Order of Preference (OOP) by score.
It also creates a Level of Confidence (LOC) for the highest-ranking systems.
This is done uniquely for each race that it handicaps.
The LOC is determined by the AI based upon a set of internally developed weights considering things such as success rate in its own mind (so to speak) and sample size.
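
To make the OOP/LOC split concrete, here is a minimal sketch in Python. It is not the actual deTERMINATOR code: the `System` fields, the `prior_plays` constant, and the shrinkage formula are all assumptions chosen only to show how a confidence value can fall when the sample is small, even if the score is high.

```python
from dataclasses import dataclass

@dataclass
class System:
    name: str
    score: float  # hypothetical per-race score from the rule set
    wins: int     # hypothetical count of past successes in similar races
    plays: int    # hypothetical sample size in similar races

def level_of_confidence(s: System, prior_plays: float = 50.0) -> float:
    """Illustrative LOC: the success rate, shrunk toward 0 for small samples."""
    if s.plays == 0:
        return 0.0
    success_rate = s.wins / s.plays
    return success_rate * s.plays / (s.plays + prior_plays)

def order_of_preference(systems):
    """Sort systems best score first (the OOP)."""
    return sorted(systems, key=lambda s: s.score, reverse=True)
```

In this sketch two systems with the same 30% success rate get very different LOC values if one has 10 plays and the other has 200, which is the kind of gap between score and confidence the rest of this post is about.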

Here's where a problem developed.
The system uses a reward & penalty system - an internal economy, if you will.

This could best be described as SUCCEED & BREED vs. FAIL & DIE.

(Yes, men; it's all about sex.)

The problem was that the AI learned that if it gave each race a low confidence level it would not be held as accountable for bad picks!

I actually saw this problem about 30 years ago when I was writing what I called Neural Net #4.
That Neural Network had an output that represented its confidence level on a particular bet.

It quickly learned that IF IT DOESN'T MAKE ANY BETS, IT WON'T DIE!

This was actually one of the main reasons that I stopped writing Neural Networks.

In DET I demanded that no more than a small percentage of races could be tossed for lack of confidence. I believe it was set at 6.5% during the training process.

Any Ant that skipped more was summarily executed on the spot. (Yes, I am a heartless bastard with digital ants.)
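
A rough sketch of that skip cap, in Python. Only the 6.5% figure comes from the post; the function name, the use of `None` for a skipped race, and the use of total profit as the reward are assumptions made for illustration.

```python
from typing import Optional, Sequence

MAX_SKIP_RATE = 0.065  # the 6.5% training cap mentioned above

def fitness(outcomes: Sequence[Optional[float]]) -> Optional[float]:
    """Return total profit, or None if the candidate skipped too many races.

    Each entry is the profit on one race, or None if the race was skipped.
    """
    if not outcomes:
        return None
    skipped = sum(1 for o in outcomes if o is None)
    if skipped / len(outcomes) > MAX_SKIP_RATE:
        return None  # "executed": never reaches the breeding pool
    return sum(o for o in outcomes if o is not None)
```

The point of the cap is that "never bet" stops being a safe strategy: a candidate that skips everything gets no fitness at all rather than a harmless zero.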
_____________________

Now to the actual problem in deTERMINATOR.

The AI compares the confidence level for its top systems and rejects the ones with a LEVEL OF CONFIDENCE that is below its minimum.

It then begins going down from its top pick until it finds one that qualifies.

If it had to go down too far from the top of its 600 systems to find a qualified system, it would REJECT the race and simply REFUSE TO HANDICAP IT.
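
The walk-down described above can be sketched like this in Python. The function name, the `(name, loc)` tuples, and the `max_depth` parameter are hypothetical; in the real program the depth limit is chosen by the AI.

```python
def pick_system(ranked, min_loc, max_depth):
    """Walk down a score-ranked list and return the first qualifying system.

    ranked:    list of (name, loc) tuples, already sorted best score first
    min_loc:   minimum acceptable Level of Confidence
    max_depth: how many systems from the top may be examined
    Returns the system name, or None if the race is rejected.
    """
    for name, loc in ranked[:max_depth]:
        if loc >= min_loc:
            return name
    return None  # nothing qualified within the limit: refuse to handicap
```

Returning `None` here is the case that caused the trouble: a qualifying system may exist further down the list, but the race is rejected before it is ever reached.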

Now, the user can still handicap it manually, but I didn't foresee the ramifications.

Imagine that you're a pick 4 player and, right in the middle of the card, there is a race that is pivotal to writing your tickets, and you've lost your primary approach!
_____________________

We quickly learned that the races it targets are maiden races on the turf, especially for lightly raced but somewhat older horses. (2-year-olds and young 3-year-olds are generally not an issue. I don't know why. Perhaps they are easier to handicap because breeding means more?)

The driving force behind the low LEVEL OF CONFIDENCE is probably sample size. There simply aren't that many races for it to look at in this category spread throughout the year. (Time of year can be a factor as well.)
_____________________

The Solution
There is a limit to how far down from best that it can go to select its system. This number is chosen by the AI - remember Succeed & Breed vs. the unpleasant alternative.

So, I forced it to go south looking for a play.

Never considered the fact that there might be some races where the entire system list was low confidence! (It actually was limited to going down a maximum of 21% of the systems.)

I mean, really? Not a single pitifully bad system that qualified?

In the end the solution was to take the top 21% of the systems and build a new sorting metric that included both confidence and score together and have the AI pick the BEST.
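
A sketch of that final approach in Python. The post does not say how confidence and score were combined, so the simple product used here is purely an assumption, as are the function name and the tuple layout. Only the 21% figure comes from the post.

```python
import math

def pick_best(ranked, top_fraction=0.21):
    """Pick one system from the top slice using a combined metric.

    ranked: list of (name, score, loc) tuples, sorted by score descending
    Returns the name of the system with the best score * loc in the top
    slice, or None if the list is empty.
    """
    if not ranked:
        return None
    pool_size = max(1, math.ceil(len(ranked) * top_fraction))
    pool = ranked[:pool_size]
    # Hypothetical combined metric: product of score and confidence.
    return max(pool, key=lambda s: s[1] * s[2])[0]
```

Because something is always chosen from the pool, there is no longer a minimum-confidence gate that can reject the race outright. A low-confidence race still gets handicapped, just with the best available compromise between score and confidence.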

Jim Pommier

↪Dave Schwartz

Here's what I'm seeing. I want to make sure that I'm doing this correctly.
1. The first pop-up message: Loc: Glo_Filter_Name=NM-sp-s-turf. (17) Abort or Ignore. I select Ignore.
2. The next pop-up message: The AI Engine WILL handicap this race. Warning: Low Confidence Threshold. I select OK.
3. I then see at the bottom of the screen "LOW CONF" and "1". If you click the "1" you get "When you change the default Low Confidence System the race must be closed & reopened so the AI can re-handicap." The new pop-up is now: Loc: Glo_Filter_Name=Y-sp-s-turf. (21). When I reopened, the "1" was now a "2".

Dave Schwartz

Dang.
Upgrade in 15 minutes

Dave Schwartz

It's working but I left in a stop.
I'll explain all of this tonight.

But Upgrade coming to get rid of the UNNECESSARY stop.

Jim Pommier

Thanks. I'll be on the call tonight.

Dave Schwartz

35 fixes it and is in your DropBox.

Daniel Boals

Hi Dave,
This is my first post here, so hopefully it won't be too much.

I take it you are using your own neural network in DET? I know you have experience with neural networks going back years, and when you started this project, there really weren't any open LLMs like we have today. I am no expert, but it seems like the neural nets in DeepSeek and Alibaba's Qwen are both pretty good and they have resources to train that you probably can't get without a billion dollar investment. Have you thought about using one of their models to check your DET? The new Qwen 3.6 29b is supposed to be really good. It is a dense model and it seems like what you are doing is more of a MoE, but then they do have Qwen 3.6 35b, which uses MoE-style logic under the hood.

I remember talking to you over lunch at the Atlantis and I was very skeptical that AIs would improve quickly and you were totally right on that one. I did not see the release of DeepSeek coming at all. It really caught me by surprise. I figured that AIs would be a SaaS scam for years before anything really improved, but now with so many people running local AIs, things are moving at light speed. Kudos to you for making the jump. You really had vision.

Hope you are doing well. Can't wait to see DET take the NHC. That will be a huge step, when someone working with an AI wins the big one and takes home the $800k+.

Daniel Boals

Also, and this is kind of a strange idea...

Do you know the paper "Attention Is All You Need"? Well, the way they keep words from inflating themselves to the detriment of the whole is that they require each matrix row to sum to 1. Words are still typically more attuned to themselves than the rest of the words in the array/sentence, but they are attentive enough to the words around them that there is meaning. Have you tried a mathematical approach in that manner with your genetic style algorithms that are limiting their guesses to "game" the rules? I wonder if there is a linear algebraic solution to your issue.
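
For anyone who has not seen it, the row-sum-to-1 idea is just a softmax applied to each row of a score matrix. Here is a small Python sketch with made-up numbers. It only illustrates the normalization itself and is not a suggestion of how it would plug into a genetic algorithm.

```python
import math

def softmax_rows(scores):
    """Turn each row of raw scores into non-negative weights that sum to 1."""
    out = []
    for row in scores:
        m = max(row)  # subtract the row max for numerical stability
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Hypothetical raw scores: each row leans toward its own position.
weights = softmax_rows([[2.0, 1.0, 0.5],
                        [0.3, 2.5, 0.7]])
```

Since every row has a fixed budget of 1, raising one weight necessarily lowers the others in that row.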

Dave Schwartz

I take it you are using your own neural network in DET? I know you have experience with neural networks going back years, and when you started this project, there really weren't any open LLMs like we have today. I am no expert, but it seems like the neural nets in DeepSeek and Alibaba's Qwen are both pretty good and they have resources to train that you probably can't get without a billion dollar investment. Have you thought about using one of their models to check your DET? The new Qwen 3.6 29b is supposed to be really good. It is a dense model and it seems like what you are doing is more of a MoE, but then they do have Qwen 3.6 35b, which uses MoE-style logic under the hood. — Daniel Boals

First, I stopped using Neural Nets about 1994. They're not very good at "predictive AI." That is, they're not very good at accepting they can't get everything right.

(I write extremely advanced Genetic Algorithms.)

Second, when you upload racing data to an LLM, it absolutely DOES NOT use a Neural Network on it. Instead, it uses statistical methods.

Think of an LLM as a TRAINED tool. The LLM itself is not a NN. It is the OUTPUT of a NN.

Have you thought about using one of their models to check your DET?
Not for a second. LOL

...but it seems like the neural nets in DeepSeek and Alibaba's Qwen are both pretty good and they have resources to train that you probably can't get without a billion dollar investment.

The billion dollar investment could certainly have been used in other directions - but the goal was LANGUAGE, which is far different than making any kind of "investment prediction."

Dave Schwartz

Also, and this is kind of a strange idea...

Do you know the paper "Attention Is All You Need"? Well, the way they keep words from inflating themselves to the detriment of the whole is that they require each matrix row to sum to 1. Words are still typically more attuned to themselves than the rest of the words in the array/sentence, but they are attentive enough to the words around them that there is meaning. Have you tried a mathematical approach in that manner with your genetic style algorithms that are limiting their guesses to "game" the rules? I wonder if there is a linear algebraic solution to your issue. — Daniel Boals

I'm sorry, but I do not understand what you mean.

Perhaps we should put together one of those lunches at the Atlantis real soon.

BTW, we ate at the Chinese restaurant next to the buffet on Friday night. While the menu is very limited, the food was phenomenal.

Daniel Boals

Hey Dave !!!
Your website emailed me as soon as you replied, pretty cool.

In the Attention idea of LLMs, they update the matrix continuously so that the words are not allowed to focus too much on themselves. Like you said, that is language, not handicapping. I was wondering whether, if you used a procedure that continually evolved to rate the genetic evolutions of your "ants," they would then be unable to game the rules. They would learn the previous generation of rules and then be judged by the current. Perhaps this would get the better ones to not worry too much about the rules, but instead try to improve the outcome so that no matter what the rules for "SUCCEED & BREED vs. FAIL & DIE" get updated to, they are consistently turning out successful iterations and labeling the confidence as accurately as possible.

I would love to have lunch again. Red Bloom it is if you like that one. I had their beef dish and it was very good. Their veggie dishes were good when we tried those as well.

I am proud of you and the hard work you have put into yourself getting healthy. These last two years have been the sickest of my life. In 2024, I got that tumor that was spawning infections and had to be cut out. Last year seemed like nothing but recovery from the operation and the infections... they had me on two different doses of super powerful antibiotics, one was the Lyme disease regimen and the other was the flesh-eating bacteria regimen. I think they did a number on my body, killed off all the helpful bacteria. This last February, I got a bad head injury that I am just now getting over. It was hell for months, with the room spinning every time I got up or lay down. Anyways, I am trying hard to follow your good example and get my health together now, we shall see. You have about fifteen years on me if I remember right and you are definitely in better health, so that gives me a clear goal. With luck, when I catch you on years, I will also catch up to your good health :)

Would love lunch. This week is bad, my car is in the shop. Maybe next week, the 25th is Memorial Day, right? Maybe later that week if you can do that, otherwise the first week of June?

It is always great talking to you, you have a truly unique insight into a lot of things. There are very few iconoclasts left in our society. Let me know, I am up for sushi, buffet, Red Bloom, deli, whatever you prefer...

Dave Schwartz

↪Daniel Boals

Beth leaves next Sunday for a trip to grandkids-land for a week.

Give me a call on my cell and we'll make a plan.

Daniel Boals

sounds good, will do
