Gemma 2B Code Generation (Fine-tuned)

Google's Gemma 2B model fine-tuned for code generation with QLoRA on the CodeAlpaca dataset.

Performance

Evaluated on 100 test samples from CodeAlpaca (72 Python, 28 other languages):

Metric               Baseline   Fine-tuned   Improvement
BLEU Score           11.00      16.83        +53% ✅
Syntax Correctness   81%        76%          -5%

Key Achievement: a 53% relative improvement in code similarity (BLEU), indicating that the fine-tuned model generates code closer to the reference solutions.

Head-to-Head Comparison (Python tasks only):

  • Both models pass: 45/72 (62.5%)
  • Both models fail: 25/72 (34.7%)
  • Baseline wins: 0
  • Fine-tuned wins: 2 ✅

Model Details

  • Base Model: google/gemma-2-2b-it
  • Training Method: QLoRA (4-bit quantization with LoRA adapters)
  • Dataset: CodeAlpaca-20k (18,000 training examples)
  • Checkpoint: Step 2000 (~1.8 epochs, selected for best BLEU score)
  • Training Platform: Google Colab (T4 GPU, free tier)
  • Training Cost: $0

Training Configuration

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • LoRA Dropout: 0.05
  • Target Modules: q_proj, v_proj
  • Quantization: 4-bit NF4
  • Learning Rate: 2e-4
  • Batch Size: 16 (effective)
  • Optimizer: Paged AdamW 8-bit
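
A training setup along these lines can be reconstructed with peft, bitsandbytes, and the HF Trainer. The sketch below mirrors the hyperparameters listed above; the per-device batch size / gradient-accumulation split, the output directory name, and the remaining Trainer arguments are not documented here and are assumptions.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization for the QLoRA base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter matching the configuration listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Assumption: 4 x 4 gradient accumulation for the effective batch size of 16
training_args = TrainingArguments(
    output_dir="gemma-2b-code-alpaca",
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    fp16=True,
)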

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit NF4, matching the quantization used during training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    device_map="auto",
    quantization_config=bnb_config,
)

# Load fine-tuned adapter
model = PeftModel.from_pretrained(base_model, "nvhuynh16/gemma-2b-code-alpaca-best")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Generate code
instruction = "Write a function to check if a number is prime"
prompt = f"""### Instruction:
{instruction}

### Input:


### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(code.split("### Response:")[-1].strip())
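
The snippet above samples (do_sample=True), so outputs vary run to run; the Limitations section attributes the small syntax-correctness drop to this variance. For reproducible side-by-side comparisons, greedy decoding can be used instead (a usage suggestion, not part of the published setup):

# Greedy decoding: deterministic output at the cost of diversity
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)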

Example Output

Instruction: Write a function to check if a number is prime

Generated Code:

def is_prime(num):
    if num <= 1:
        return False
    for i in range(2, num):
        if num % i == 0:
            return False
    return True
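
As a minimal, hypothetical sanity check (separate from the published evaluation), a generated snippet can be parsed and exercised before use:

import ast

generated = """
def is_prime(num):
    if num <= 1:
        return False
    for i in range(2, num):
        if num % i == 0:
            return False
    return True
"""

ast.parse(generated)    # raises SyntaxError if the model produced invalid Python
scope = {}
exec(generated, scope)  # only execute generated code in a sandboxed environment
assert scope["is_prime"](13)
assert not scope["is_prime"](12)

Note that the model reproduces plain trial division: correct, but O(n) per call, whereas checking divisors only up to √num would suffice.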

Supported Languages

Primarily trained on:

  • Python (majority)
  • SQL
  • JavaScript
  • Java
  • HTML/CSS

Limitations

  • Syntax correctness is 5 percentage points lower than the base model's (76% vs. 81%), attributed to sampling variance
  • Best for algorithmic/utility functions
  • May require prompt engineering for optimal results
  • Not optimized for framework-specific code (Django, FastAPI, etc.)

Evaluation Details

The model was evaluated on a held-out test set of 100 examples from CodeAlpaca:

  • 72 Python tasks: evaluated with Python's ast parser for syntax validation
  • 28 non-Python tasks (SQL, JavaScript, Java, HTML): validated via language detection

BLEU scores were calculated using SacreBLEU with smoothing to measure code similarity to reference implementations.
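
The exact evaluation scripts are not published here; a minimal sketch of both checks, assuming sacrebleu's default exponential smoothing and hypothetical example data, might look like:

import ast
import sacrebleu

def is_valid_python(code: str) -> bool:
    # Syntax validation as used for the 72 Python tasks
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

# Corpus BLEU between generations and references (hypothetical examples)
predictions = [
    "def add(a, b):\n    return a + b",
    "def square(x):\n    return x * x",
]
references = [[
    "def add(x, y):\n    return x + y",
    "def square(n):\n    return n ** 2",
]]  # sacrebleu expects a list of reference streams
bleu = sacrebleu.corpus_bleu(predictions, references, smooth_method="exp")
print(f"BLEU: {bleu.score:.2f}")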

License

This model is based on Gemma 2B and follows the Gemma License.

Citation

@misc{gemma-2b-code-alpaca-best,
  title={Gemma 2B Code Generation - Fine-tuned},
  author={nvhuynh16},
  year={2025},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/nvhuynh16/gemma-2b-code-alpaca-best}}
}

Acknowledgments

  • Google for the Gemma model
  • HuggingFace for the transformers and PEFT libraries
  • CodeAlpaca dataset creators
  • Google Colab for free GPU access