I saw that this project on GitHub has been evaluated on Kaggle datasets and won 22 gold medals. Could there be a data contamination problem here? Could we try it on live Kaggle competitions instead?
I roughly get it now, and it's very interesting. It feels like an area nobody has explored deeply yet. I'm still curious what exactly this real-time data consists of.
I may not have fully understood: do you deploy an agent onto the GPU machine you want to monitor, and then have that agent detect problems the GPU may run into during operation?
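For context on what such a host-side agent could look like, here is a minimal sketch under my own assumptions (not the project's actual implementation): a small process that polls GPU metrics through the pynvml bindings (nvidia-ml-py) and flags simple threshold-based problems.

```python
# Minimal sketch of a host-side GPU monitoring agent, assuming the
# nvidia-ml-py package and an NVIDIA driver are present on the machine.
# This is an illustration of the idea, not the project's actual agent.
import time
import pynvml


def monitor(device_index=0, temp_limit_c=85, poll_seconds=5):
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
    try:
        while True:
            temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            # Flag simple threshold-based problems; a real agent would report
            # these to a central service rather than just printing them.
            if temp >= temp_limit_c:
                print(f"WARNING: GPU {device_index} temperature {temp}C")
            if mem.used / mem.total > 0.95:
                print(f"WARNING: GPU {device_index} memory nearly full")
            print(f"util={util.gpu}% temp={temp}C mem={mem.used // 2**20}MiB")
            time.sleep(poll_seconds)
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    monitor()
```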
Even though this approach doesn't solve the problems in the current evaluation, it is still a very good move in terms of decentralization and mobilizing the community to build things together.
Since skills have become such a big boost to model capabilities, could we try distilling skills the way we used to do model distillation? I think this could be achieved through multiple rounds of iteration.
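One way to read this "iterative skill distillation" idea is the loop sketched below. Everything here is hypothetical: the teacher_write_skill and student_run_with_skill callables are placeholders I introduce for illustration, not functions from the project.

```python
# Sketch of iterative skill distillation: a teacher model rewrites a skill,
# a student model's score on the task feeds back into the next rewrite.
# teacher_write_skill and student_run_with_skill are hypothetical callables.
from typing import Callable, List


def distill_skill(
    teacher_write_skill: Callable[[str, List[str]], str],  # (task, feedback) -> skill text
    student_run_with_skill: Callable[[str, str], float],   # (task, skill) -> score in [0, 1]
    task: str,
    rounds: int = 3,
) -> str:
    skill = teacher_write_skill(task, [])
    feedback: List[str] = []
    for _ in range(rounds):
        score = student_run_with_skill(task, skill)
        # Feedback nudges the teacher to compress the skill to what the
        # student actually needed, mirroring how distillation shrinks models.
        feedback.append(f"score={score:.2f}; keep only the steps the student used")
        skill = teacher_write_skill(task, feedback)
    return skill
```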
upskill's current features are already fairly complete, but I wonder if we could have it generate a compatibility matrix across multiple skills, so that the combined effect is greater than the sum of the parts. Beyond that, Model A could generate skills while Model B looks for counterexamples, so the two evolve together.
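To make the "compatibility matrix" idea concrete, here is a small sketch under my own assumptions: a hypothetical evaluate callable runs the benchmark with a given set of skills enabled, and the matrix records whether each pair of skills gains more together than the sum of their individual gains.

```python
# Sketch of a pairwise skill compatibility matrix. The evaluate callable is
# hypothetical: it should run the benchmark with the given skills enabled
# and return a score. A positive entry means the pair has synergy.
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def compatibility_matrix(
    skills: List[str],
    evaluate: Callable[[List[str]], float],
) -> Dict[Tuple[str, str], float]:
    baseline = evaluate([])
    solo_gain = {s: evaluate([s]) - baseline for s in skills}
    matrix: Dict[Tuple[str, str], float] = {}
    for a, b in combinations(skills, 2):
        pair_gain = evaluate([a, b]) - baseline
        # Synergy = gain of the pair minus the sum of the individual gains.
        matrix[(a, b)] = pair_gain - (solo_gain[a] + solo_gain[b])
    return matrix
```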
Actually, I think one important point is that most independent developers don't have enough case studies to back up their work, and on top of that, the cost of deploying online is fairly high.