Gemma 2B Code Generation (Fine-tuned)
Fine-tuned Google Gemma 2B model for code generation using QLoRA on the CodeAlpaca dataset.
Performance
Evaluated on 100 test samples from CodeAlpaca (72 Python, 28 other languages):
| Metric | Baseline | Fine-tuned | Improvement |
|---|---|---|---|
| BLEU Score | 11.00 | 16.83 | +53% ✅ |
| Syntax Correctness | 81% | 76% | -5% |
Key Achievement: 53% improvement in code similarity (BLEU score), demonstrating the model learned to generate code closer to reference solutions.
Head-to-Head Comparison (Python tasks only):
- Both models pass: 45/72 (62.5%)
- Both models fail: 25/72 (34.7%)
- Baseline wins: 0
- Fine-tuned wins: 2 ✅
Model Details
- Base Model: google/gemma-2-2b-it
- Training Method: QLoRA (4-bit quantization with LoRA adapters)
- Dataset: CodeAlpaca-20k (18,000 training examples)
- Checkpoint: Step 2000 (~1.8 epochs, selected for best BLEU score)
- Training Platform: Google Colab (T4 GPU, free tier)
- Training Cost: $0
Training Configuration
- LoRA Rank: 16
- LoRA Alpha: 32
- LoRA Dropout: 0.05
- Target Modules: q_proj, v_proj
- Quantization: 4-bit NF4
- Learning Rate: 2e-4
- Batch Size: 16 (effective)
- Optimizer: Paged AdamW 8-bit
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
"google/gemma-2-2b-it",
device_map="auto",
torch_dtype=torch.float16,
load_in_4bit=True
)
# Load fine-tuned adapter
model = PeftModel.from_pretrained(base_model, "nvhuynh16/gemma-2b-code-alpaca-best")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
# Generate code
instruction = "Write a function to check if a number is prime"
prompt = f"""### Instruction:
{instruction}
### Input:
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=512,
temperature=0.7,
top_p=0.9,
do_sample=True
)
code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(code.split("### Response:")[-1].strip())
Example Output
Instruction: Write a function to check if a number is prime
Generated Code:
def is_prime(num):
if num <= 1:
return False
for i in range(2, num):
if num % i == 0:
return False
return True
Supported Languages
Primarily trained on:
- Python (majority)
- SQL
- JavaScript
- Java
- HTML/CSS
Limitations
- Syntax correctness slightly lower than base model (-5%) due to sampling variance
- Best for algorithmic/utility functions
- May require prompt tuning for optimal results
- Not optimized for framework-specific code (Django, FastAPI, etc.)
Evaluation Details
The model was evaluated on a held-out test set of 100 examples from CodeAlpaca:
- 72 Python tasks: Evaluated using Python AST parser for syntax validation
- 28 Non-Python tasks (SQL, JavaScript, Java, HTML): Validated by language detection
BLEU scores were calculated using SacreBLEU with smoothing to measure code similarity to reference implementations.
License
This model is based on Gemma 2B and follows the Gemma License.
Citation
@misc{gemma-2b-code-alpaca-best,
title={Gemma 2B Code Generation - Fine-tuned},
author={nvhuynh16},
year={2025},
publisher={HuggingFace},
howpublished={\url{https://huggingface.co/nvhuynh16/gemma-2b-code-alpaca-best}}
}
Acknowledgments
- Google for the Gemma model
- HuggingFace for the transformers and PEFT libraries
- CodeAlpaca dataset creators
- Google Colab for free GPU access
- Downloads last month
- 1