Qwen2.5-1.5B Math LoRA Collection

This directory aggregates all LoRA checkpoints produced by the train_lora pipeline. Every subfolder corresponds to one math dataset and contains 10 independent 100-shot LoRA runs (group 00–09) trained on Qwen2.5‑1.5B-Instruct with identical hyperparameters. The adapters here are the source of truth for downstream evaluation (../评估体系) and for the parameter_generator project, which learns to map prompts to LoRA weights.

If you are new to the project, this document explains where the data comes from, how the LoRAs are produced, and how you can reuse them for inference, evaluation, or further training.

Provenance

  • Base model: Qwen2.5-1.5B-Instruct
  • Datasets: sampled from ../../prepare/data/math/*.json. Each JSON is a list of {prompt, response, system?} records. dataset_sampler.py draws 10 disjoint groups of 100 samples (unless the dataset has <1β€―000 examples, in which case sampling with replacement keeps the group size fixed) using a deterministic seed derived from the dataset name.
  • Training recipe (from config/default.yaml):
    • sequence length 4β€―096; LoRA r=64, alpha=128, dropout=0.05, target modules = {q,k,v,o,gate,up,down}_proj
    • 12 epochs / max 1β€―800 steps, learning rate 1e-4, batch size per device 2, gradient accumulation 16, BF16 training, gradient checkpointing on, weight decay 0.01, warmup ratio 0.03, checkpoints saved every 300 steps (keeping at most 6) plus a final adapter export
    • Tokenizers are cloned from the base model (pad token defaults to EOS if missing)
  • Monitoring & reproducibility:
    • Trainer logs (loss, LR, throughput) are in ../logs/<dataset>/group_xx/.
    • Slurm stdout/err for each shard live in ../logs/slurm/.
    • metadata.json captures the git commit (if GIT_COMMIT was set), timestamps, seeds, and the effective batch size so any experiment can be repeated exactly.

End-to-end data flow

  1. Raw JSON data comes from ../../prepare/data/math. Each file is a list of dict objects with keys:
    {
      "prompt": "...question...",
      "response": "...reference answer...",
      "system": "optional system message"
    }
    
  2. python -m train_lora.dataset_sampler --config config/default.yaml reads every dataset, filters out GSM8K_test.json, and deterministically samples 10Γ—100 items per dataset. The samples plus metadata (indices, seeds, timestamps) are written to ../prompt_groups/<dataset>/group_xx.json; a simplified sketch of this sampling logic follows this list.
  3. python -m train_lora.run_tasks --run (or the Slurm array) iterates dataset/group pairs, loads the corresponding prompt group, and performs LoRA fine-tuning with Hugging Face Trainer.
  4. After training finishes, the following artifacts land in outputs/<dataset>/group_xx/:
    • a ready-to-use LoRA adapter (adapter/)
    • intermediate checkpoints for analysis/resume
    • tokenizers and metadata
  5. The evaluation stacks (../评估体系, ../parameter_generator/θ―„δΌ°) and the LoRA parameter generator both consume these directories directly.
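
As a rough illustration of step 2, the sampling behaves like the sketch below (a simplified reimplementation; the hashing scheme for the seed is an assumption, and the authoritative logic is dataset_sampler.py):

import hashlib
import json
import random
from pathlib import Path

def sample_groups(dataset_file, n_groups=10, group_size=100):
    """Simplified sketch of the 10x100 sampling; the real logic lives in dataset_sampler.py."""
    records = json.loads(Path(dataset_file).read_text(encoding="utf-8"))
    # Deterministic seed derived from the dataset name (hashing scheme assumed here).
    seed = int(hashlib.sha256(Path(dataset_file).stem.encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    if len(records) >= n_groups * group_size:
        # Enough data: one shuffled permutation yields disjoint groups.
        order = rng.sample(range(len(records)), n_groups * group_size)
        index_groups = [order[i * group_size:(i + 1) * group_size] for i in range(n_groups)]
    else:
        # Fewer than 1,000 examples: sample with replacement to keep every group at 100 items.
        index_groups = [[rng.randrange(len(records)) for _ in range(group_size)]
                        for _ in range(n_groups)]
    return [[records[i] for i in idx] for idx in index_groups]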

Directory layout

outputs/
β”œβ”€β”€ Competition_Math/
β”œβ”€β”€ GSM8K_train/
β”œβ”€β”€ MATH/
β”œβ”€β”€ Math-IIO-68K-Mini/
β”œβ”€β”€ Math-Plus/
β”œβ”€β”€ Math_QA/
β”œβ”€β”€ Mu-Math/
└── ToT-Math-V1/

Each dataset directory contains group_00 … group_09. Inside every group:

| Item | Description |
| --- | --- |
| adapter/ | Final LoRA export (adapter_model.safetensors, adapter_config.json, tokenizer + chat template snapshots, and HF training_args.bin). This is the folder you will load for inference. |
| checkpoints/checkpoint-xxxx/ | Intermediate Trainer checkpoints saved every 300 steps (300–1,800). They include optimizer, scheduler, RNG state, and tokenizer copies for resuming or studying training dynamics. |
| tokenizer/ | Standalone tokenizer snapshot identical to the one used during training; useful if you need a self-contained deployment without referencing the base model directory. |
| prompt_group.json | The exact 100-shot dataset used for this training run (a copy of prompt_groups/<dataset>/group_xx.json). Contains metadata such as sampled indices, original source file, and timestamp. |
| metadata.json | Provenance record with training loss, Trainer metrics, LoRA config, effective batch size/world size, timestamps, git commit (if exported), and file paths. |
| metadata.json -> trainer_state | Full training log history (per-step metrics). Disable via metadata.save_training_state: false if you want lighter metadata. |

Tip: Use metadata.json to find the latest checkpoint, to confirm which base model/tokenizer were used, or to drive automated uploads/evaluations.
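
For example, a helper along these lines can locate the newest checkpoint of a run (a sketch that assumes the checkpoint_root field documented in the metadata reference below):

import json
import re
from pathlib import Path

def latest_checkpoint(group_dir):
    """Return the highest-numbered checkpoint directory recorded for a run."""
    meta = json.loads(Path(group_dir, "metadata.json").read_text(encoding="utf-8"))
    ckpt_root = Path(meta.get("checkpoint_root", Path(group_dir, "checkpoints")))
    ckpts = sorted(ckpt_root.glob("checkpoint-*"),
                   key=lambda p: int(re.search(r"\d+", p.name).group()))
    return ckpts[-1] if ckpts else None

print(latest_checkpoint("outputs/Math_QA/group_00"))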

Dataset overview

| Dataset dir | Source file (relative to prepare/data/math) | Notes |
| --- | --- | --- |
| Competition_Math | Competition_Math.json | 100-shot groups drawn from Competition Math practice problems. |
| GSM8K_train | GSM8K_train.json | Standard GSM8K train split; the public test set (GSM8K_test.json) is filtered out. |
| MATH | MATH.json | High-school and olympiad math benchmark. |
| Math-IIO-68K-Mini | Math-IIO-68K-Mini.json | Mini version of the Math IIO dataset. |
| Math-Plus | Math-Plus.json | Composed of challenging math word problems. |
| Math_QA | Math_QA.json | Multiple-choice MathQA dataset reformatted as open-ended QA. |
| Mu-Math | Mu-Math.json | MuSR-style math reasoning set. |
| ToT-Math-V1 | ToT-Math-V1.json | Tree-of-Thought flavored math prompts. |

All datasets follow the same JSON schema, so swapping between them only changes topical coverage.
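
If you want to sanity-check that schema before training, a quick scan like the following works (a sketch; adjust data_root to your checkout):

import json
from pathlib import Path

data_root = Path("../../prepare/data/math")  # adjust to your checkout

for path in sorted(data_root.glob("*.json")):
    records = json.loads(path.read_text(encoding="utf-8"))
    bad = [i for i, r in enumerate(records) if "prompt" not in r or "response" not in r]
    print(f"{path.name}: {len(records)} records, {len(bad)} missing prompt/response")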

How to navigate a single group

Math_QA/
└── group_00/
    β”œβ”€β”€ adapter/
    β”‚   β”œβ”€β”€ adapter_config.json
    β”‚   β”œβ”€β”€ adapter_model.safetensors
    β”‚   β”œβ”€β”€ tokenizer/… (extra copies of merges, vocab, chat_template.jinja)
    β”‚   └── training_args.bin
    β”œβ”€β”€ checkpoints/
    β”‚   β”œβ”€β”€ checkpoint-300/
    β”‚   β”œβ”€β”€ checkpoint-600/
    β”‚   └── …
    β”œβ”€β”€ tokenizer/         # same as base tokenizer but pinned to this run
    β”œβ”€β”€ prompt_group.json  # 100-shot data
    └── metadata.json

When inspecting or sharing a run, the minimum file set is adapter/ + prompt_group.json + metadata.json. Everything else speeds up resuming or auditing.
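
A small sketch for exporting just that minimal set (paths are illustrative):

import shutil
from pathlib import Path

def export_minimal(group_dir, dest):
    """Copy only adapter/, prompt_group.json, and metadata.json for sharing."""
    group_dir, dest = Path(group_dir), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    shutil.copytree(group_dir / "adapter", dest / "adapter", dirs_exist_ok=True)
    for name in ("prompt_group.json", "metadata.json"):
        shutil.copy2(group_dir / name, dest / name)

export_minimal("outputs/Math_QA/group_00", "share/Math_QA/group_00")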

Using the adapters

0. Environment prerequisites

  • Python β‰₯ 3.10, transformers >= 4.37, peft >= 0.8, accelerate, safetensors, torch (GPU build).
  • The base model directory must be accessible; otherwise download Qwen2.5-1.5B-Instruct from Hugging Face and update base_model path.
  • Optional: set HF_HOME, TRANSFORMERS_CACHE to avoid repeated downloads.

0.5. Reproduce the training pipeline (optional)

To regenerate any adapter from scratch:

cd train_lora
python -m train_lora.dataset_sampler --overwrite   # regenerates prompt groups
python -m train_lora.train_single --dataset Math_QA --group 0
# or run the full queue
python -m train_lora.run_tasks --run

These commands will rebuild prompt_groups/ and outputs/ with exactly the same seeds and configuration documented above. Slurm users should submit sbatch run_lora_multinode.sh.

1. Load adapter with PEFT

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen2.5-1.5B-Instruct"   # local path or Hub id "Qwen/Qwen2.5-1.5B-Instruct"
adapter_dir = "outputs/Math_QA/group_00/adapter"

# Load the tokenizer from the adapter folder so the chat template matches training.
tokenizer = AutoTokenizer.from_pretrained(adapter_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,   # the adapters were trained in BF16
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_dir)

prompt = "Solve 3x + 7 = 22."
# The base model is instruction-tuned, so format the prompt with the chat template.
text = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))

Notes:

  • Loading the tokenizer from adapter/ ensures identical chat template and additional tokens (if any). You can also point to the base tokenizer path if you prefer.
  • For batch inference, wrap the model with model.merge_and_unload() if you need a single combined set of weights (at the cost of losing LoRA toggling).
  • If you want maximal throughput on a single GPU, also call model.half() or model.to(torch.bfloat16) depending on your hardware; the adapters were trained with BF16 so keeping BF16 is the safest choice.

2. Resume or continue training

python -m train_lora.train_single \
  --dataset Math_QA \
  --group 0 \
  --group-file outputs/Math_QA/group_00/prompt_group.json

Set --group-file to reuse the same 100 samples, and point the Trainer at checkpoints/checkpoint-XXXX via resume_from_checkpoint to continue from an intermediate state. This lets you reproduce a group exactly or extend its training.

To resume manually:

trainer.train(resume_from_checkpoint="outputs/Math_QA/group_00/checkpoints/checkpoint-1500")

3. Evaluate with Math-Verify

The evaluation stack in ../评估体系 and ../parameter_generator/θ―„δΌ° expects this directory layout. Example:

cd 评估体系
python scripts/run_all_evals.py \
  --config configs/eval_config.yaml \
  --datasets Math_QA \
  --groups 0 1

4. Packaging for distribution

  • Upload only adapter/ and metadata.json when sharing publicly (e.g., Hugging Face) to avoid huge checkpoint directories.
  • Keep prompt_group.json if you want consumers to understand the training data or to regenerate LoRA weights with the same samples.
  • When exporting, include a README snippet that references this document so downstream users know the provenance.
  • Suggested Hugging Face layout:
    Math_QA/
      group_00/
        adapter/
        prompt_group.json
        metadata.json
    README.md (copy sections describing provenance + usage)
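
One way to push that minimal layout with huggingface_hub (a sketch; the repo id is a placeholder and assumes you are already authenticated):

from huggingface_hub import HfApi

repo_id = "your-username/qwen2.5-1.5b-math-qa-lora"   # placeholder repo id
api = HfApi()
api.create_repo(repo_id, repo_type="model", exist_ok=True)

# Upload only the lightweight artifacts; checkpoints/ and tokenizer/ are skipped.
api.upload_folder(
    folder_path="outputs/Math_QA/group_00",
    repo_id=repo_id,
    path_in_repo="Math_QA/group_00",
    allow_patterns=["adapter/*", "prompt_group.json", "metadata.json"],
)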
    

File reference (metadata.json)

Key fields you may want to automate against (a small aggregation sketch follows the table):

| Field | Meaning |
| --- | --- |
| dataset_name, group_index | Identify the run. |
| prompt_group_file | Absolute path back to the sampled dataset. |
| checkpoint_root | Where all intermediate checkpoints live. |
| train_loss, metrics | Final loss and Trainer metrics dict. |
| trainer_state | Full log history (can be large; disable via metadata.save_training_state). |
| training_args | Exact HF TrainingArguments snapshot. |
| lora_config | Copy of the LoRA hyperparameters used. |
| effective_batch_size | world_size Γ— per_device_batch_size Γ— grad_accum; useful for scaling comparisons. |
| git_commit | Populated if the GIT_COMMIT env var was set before training. |
| metrics.train_runtime, metrics.train_samples_per_second | Throughput stats. |
| generated_at | UTC timestamp when the metadata was written. |
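
For instance, to compare final losses across the ten groups of a dataset (a sketch built on the fields above):

import json
from pathlib import Path

for meta_path in sorted(Path("outputs/Math_QA").glob("group_*/metadata.json")):
    meta = json.loads(meta_path.read_text(encoding="utf-8"))
    print(meta["dataset_name"], meta["group_index"], meta.get("train_loss"))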

Best practices

  • Always match BF16 or FP16 settings between base model loading and adapter training; these adapters were trained in BF16.
  • If you edit files inside this directory, keep structure intactβ€”other scripts rely on relative paths (adapter, tokenizer, metadata.json).
  • Before deploying a new LoRA, verify it with the evaluation suite and consider merging multiple groups (e.g., ensemble or checkpoint averaging) only after confirming stability.
  • Use prompt_group.json and metadata.json as documentation when presenting results; they already include seeds, sample indices, and environment details.
  • If you build new LoRAs with different configs (e.g., higher rank, more steps), add a sibling directory (e.g., outputs_v2/) or annotate the README so collaborators know which adapters correspond to which experiment.

Happy finetuning! If you extend this collection (new datasets, extra groups, or different hyperparameters), add another section here describing the changes so downstream consumers stay informed.
