---
license: apache-2.0
tags:
- code
---

# Fine-tuned Qwen2.5-Coder-7B for Function Writing

## Model Description

This model is a fine-tuned version of Qwen2.5-Coder-7B, optimized specifically for function writing tasks. The base model is part of the Qwen2.5-Coder family, which was trained on 5.5 trillion tokens spanning source code, text-code grounding data, and synthetic data.

### Base Model Details

* **Type**: Causal Language Model
* **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and attention QKV bias
* **Parameters**: 7.61B (6.53B non-embedding)
* **Layers**: 28
* **Attention Heads**: 28 for Q and 4 for KV
* **Context Length**: up to 131,072 tokens

## Fine-tuning Specifications

The model was fine-tuned using LoRA (Low-Rank Adaptation) with the configuration below; an illustrative code sketch follows each list.

### Training Parameters

* **Training Data**: 30,000 examples
* **Batch Size**: 1 per device
* **Gradient Accumulation Steps**: 24 (effective batch size of 24 per device)
* **Learning Rate**: 1e-6
* **Number of Epochs**: 2
* **Warmup Ratio**: 0.05
* **Maximum Sequence Length**: 4,096 tokens
* **Weight Decay**: 0.01
* **Maximum Gradient Norm**: 0.5
* **Learning Rate Scheduler**: Cosine

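A minimal sketch of how these hyperparameters might map onto `transformers.TrainingArguments`; the output directory is a placeholder, and the 4,096-token maximum sequence length would be applied at tokenization time rather than here:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen-coder-function-writing",  # placeholder, not the actual run's path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=24,  # effective batch size of 24 per device
    learning_rate=1e-6,
    num_train_epochs=2,
    warmup_ratio=0.05,
    weight_decay=0.01,
    max_grad_norm=0.5,
    lr_scheduler_type="cosine",
    bf16=True,                       # BF16 mixed precision (see LoRA Configuration)
    gradient_checkpointing=True,     # see Training Infrastructure
)
```
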
### LoRA Configuration

* **Rank (r)**: 32
* **Alpha**: 32
* **Dropout**: 0.05
* **Target Modules**: q_proj, v_proj, o_proj, gate_proj, up_proj
* **Training Mode**: BF16 mixed precision
* **RS-LoRA**: Enabled

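The same settings expressed as a `peft.LoraConfig`, as a sketch; RS-LoRA rescales the LoRA update by alpha/sqrt(r) instead of alpha/r, which tends to stabilize training at higher ranks:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj", "o_proj", "gate_proj", "up_proj"],
    use_rslora=True,  # rank-stabilized LoRA scaling
    task_type="CAUSAL_LM",
)
```
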
### Training Infrastructure

* **Quantization**: 4-bit quantization (NF4)
* **Attention Implementation**: Flash Attention 2
* **Memory Optimization**: Gradient checkpointing enabled

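A sketch of loading the base model under this setup (QLoRA-style 4-bit NF4 quantization with Flash Attention 2); it assumes a CUDA GPU and that the `bitsandbytes` and `flash-attn` packages are installed:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 4-bit quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the BF16 training mode
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B",
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
)
model.gradient_checkpointing_enable()  # memory optimization
```
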
## Usage

This model is optimized for function writing tasks and can be loaded with the Hugging Face Transformers library. A basic example:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "path_to_your_model",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "path_to_your_model",
    trust_remote_code=True
)

# Generate text
input_text = "Write a function that..."
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```

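Note that `generate` returns the prompt tokens followed by the completion, so `response` above includes the prompt text. To decode only the newly generated portion:

```python
# Slice off the prompt tokens before decoding
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
completion = tokenizer.decode(new_tokens, skip_special_tokens=True)
```
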
## Limitations

* The model is specifically fine-tuned for function writing tasks and may not perform optimally for general code generation or other tasks
* The maximum context length during fine-tuning was limited to 4,096 tokens
* While the base model supports contexts up to 128K tokens, behavior beyond 4,096 tokens has not been validated for this fine-tune

## License

This model inherits the Apache 2.0 license from its base model, Qwen2.5-Coder-7B.

## Citation

If you use this model, please cite the Qwen2.5-Coder technical report and acknowledge the fine-tuning work:

```bibtex
@article{hui2024qwen2,
  title={Qwen2.5-Coder Technical Report},
  author={Hui, Binyuan and Yang, Jian and Cui, Zeyu and Yang, Jiaxi and Liu, Dayiheng and Zhang, Lei and Liu, Tianyu and Zhang, Jiajun and Yu, Bowen and Dang, Kai and others},
  journal={arXiv preprint arXiv:2409.12186},
  year={2024}
}
```