Instructions to use codemateai/CodeMate-v0.1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use codemateai/CodeMate-v0.1 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="codemateai/CodeMate-v0.1")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("codemateai/CodeMate-v0.1")
model = AutoModelForCausalLM.from_pretrained("codemateai/CodeMate-v0.1")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use codemateai/CodeMate-v0.1 with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "codemateai/CodeMate-v0.1"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "codemateai/CodeMate-v0.1",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Use Docker:
```shell
docker model run hf.co/codemateai/CodeMate-v0.1
```
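The curl request above can also be issued from Python. The sketch below uses a small helper (hypothetical, not part of vLLM) to build the JSON body for the OpenAI-compatible `/v1/completions` endpoint; the actual POST is shown in a comment so the snippet does not require a running server.

```python
import json

# Hypothetical helper: builds the JSON body for the OpenAI-compatible
# /v1/completions endpoint, mirroring the curl example above.
def build_completion_payload(prompt, model="codemateai/CodeMate-v0.1",
                             max_tokens=512, temperature=0.5):
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

payload = build_completion_payload("Once upon a time,")
print(json.dumps(payload, indent=2))

# To call a running vLLM server (assumes the `requests` package is installed):
# import requests
# r = requests.post("http://localhost:8000/v1/completions", json=payload)
# print(r.json())
```

The same payload works against any server exposing the OpenAI completions API; only the base URL changes.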
- SGLang
How to use codemateai/CodeMate-v0.1 with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "codemateai/CodeMate-v0.1" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "codemateai/CodeMate-v0.1",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Use Docker images:
```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
    --model-path "codemateai/CodeMate-v0.1" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "codemateai/CodeMate-v0.1",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

- Docker Model Runner
How to use codemateai/CodeMate-v0.1 with Docker Model Runner:
```shell
docker model run hf.co/codemateai/CodeMate-v0.1
```
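Both the vLLM and SGLang servers above speak the OpenAI completions API, so their responses share the same JSON shape. A minimal sketch of pulling the generated text out of a completions response (the field values below are illustrative, not real model output):

```python
import json

# Illustrative response in the OpenAI-compatible /v1/completions shape;
# the id and text values here are made up for demonstration.
raw = '''
{
  "id": "cmpl-123",
  "object": "text_completion",
  "model": "codemateai/CodeMate-v0.1",
  "choices": [
    {"index": 0, "text": " there was a model.", "finish_reason": "length"}
  ]
}
'''

response = json.loads(raw)
# The generated continuation lives in choices[0].text:
text = response["choices"][0]["text"]
print(text)
```

With a real server you would apply the same extraction to `r.json()` from the curl-equivalent POST.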
HumanEval scores?
If you release a coding model, you have to tell us how good it is at coding. Saying it produces "error-free" code means nothing. Please upload HumanEval and HumanEval+ scores.
Also, this is clearly a CodeLlama finetune; I would add that to the model card so people know.
Greetings @rombodawg ,
Thanks for bringing all the valid points to our attention.
We'd like to mention that this is an early test release (v0.1). We're working on our next model, for which we will release all the metrics.
Regards
Thanks for the info. One more question: is there a date or release window for the next model? I'm quite interested in this model and would like to know how soon it will be released, so I'm reopening for this question.
Thanks for showing interest in CodeMate-v0.1
For the next version, you can expect it to be out by the second week of February.
However, there can be delays in case the model doesn't pass the internal testing.
Anything else I can help you out with?
Actually, since you asked: what do you plan to be different between this v0.1 release and the one that will have benchmarks? How is it an upgrade? And one last question: do you have any plans to make models at the other CodeLlama sizes, 7B, 13B, and the newly released 70B?
With code models, we have noticed that their ability to engage in chat is quite degraded compared to generic models like Llama, Mistral, and others.
We're focusing on making the code model as capable in contextual conversation as those models, without degrading its code generation capabilities.
As for the other sizes, we do have those in our pipeline.