I take it you are using your own neural network in DET? I know you have experience with neural networks going back years, and when you started this project there really weren't any open LLMs like we have today. I am no expert, but it seems like the neural nets in DeepSeek and Alibaba's Qwen are both pretty good, and they have training resources you probably can't get without a billion-dollar investment. Have you thought about using one of their models to check your DET? The new Qwen 3.6 29B is supposed to be really good. It is a dense model, and it seems like what you are doing is more of a MoE, but then they do have Qwen 3.6 35B, which has MoE-style logic under the hood. — Daniel Boals
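To make the dense-vs-MoE contrast concrete, here is a toy sketch. The sizes, weights, and routing below are all made up for illustration; this is not how DeepSeek or Qwen actually implement their layers.

```python
# Toy contrast between a dense feed-forward layer and a Mixture-of-Experts
# (MoE) layer. All dimensions and weights are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, n_experts, top_k = 8, 16, 4, 2

def dense_ffn(x, W1, W2):
    """Dense block: every parameter is used for every token."""
    return np.maximum(x @ W1, 0.0) @ W2

def moe_ffn(x, experts, gate_W, top_k=2):
    """MoE block: a gate routes each token to its top-k experts,
    so only a fraction of the total parameters fire per token."""
    logits = x @ gate_W                    # one score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # normalized gate weights
    out = np.zeros_like(x)
    for w, i in zip(weights, top):
        W1, W2 = experts[i]
        out += w * dense_ffn(x, W1, W2)
    return out

x = rng.standard_normal(d_model)
dense_W = (rng.standard_normal((d_model, d_ff)),
           rng.standard_normal((d_ff, d_model)))
experts = [(rng.standard_normal((d_model, d_ff)),
            rng.standard_normal((d_ff, d_model)))
           for _ in range(n_experts)]
gate_W = rng.standard_normal((d_model, n_experts))

print("dense:", dense_ffn(x, *dense_W)[:4])
print("moe:  ", moe_ffn(x, experts, gate_W, top_k)[:4])
```

The point of the comparison: a dense model spends all of its parameters on every input, while an MoE model of the same total size activates only a few experts per input.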
Also, and this is kind of a strange idea...
Do you know the paper "Attention Is All You Need"? The way they keep individual words from inflating their own importance to the detriment of the whole is by requiring each row of the attention matrix to sum to 1. Words are still typically more attuned to themselves than to the rest of the words in the array/sentence, but they are attentive enough to the words around them that meaning emerges. Have you tried a mathematical constraint like that on your genetic-style algorithms, which are limiting their guesses to "game" the rules? I wonder if there is a linear-algebraic solution to your issue. — Daniel Boals
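For reference, the row normalization Daniel describes is the softmax step in the scaled dot-product attention of that paper: the weights are softmax(QKᵀ/√d), so each row is a probability distribution. A minimal NumPy sketch, with made-up token embeddings and projection matrices:

```python
# Row normalization from "Attention Is All You Need":
# attention weights are softmax(Q K^T / sqrt(d)), so every row sums to 1.
# The 3-token "sentence" and projections below are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n_tokens, d = 3, 4
X = rng.standard_normal((n_tokens, d))   # toy token embeddings
Wq = rng.standard_normal((d, d))         # query projection
Wk = rng.standard_normal((d, d))         # key projection
Q, K = X @ Wq, X @ Wk

scores = Q @ K.T / np.sqrt(d)            # raw compatibility scores
A = np.exp(scores)
A /= A.sum(axis=1, keepdims=True)        # row-wise softmax

print(A.round(3))        # each row is a distribution over the tokens
print(A.sum(axis=1))     # -> [1. 1. 1.]: no token can inflate itself
```

The analogue for a genetic-style algorithm would be constraining each candidate's factor weights to sum to 1 the same way, so no single factor can be inflated to game the fitness rules.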