# Qwen2.5-1.5B Math LoRA Collection
This directory aggregates all LoRA checkpoints produced by the `train_lora` pipeline. Every subfolder corresponds to one math dataset and contains 10 independent 100-shot LoRA runs (groups 00–09) trained on Qwen2.5-1.5B-Instruct with identical hyperparameters. The adapters here are the source of truth for downstream evaluation (`../评估体系`) and for the `parameter_generator` project, which learns to map prompts to LoRA weights.
If you are new to the project, this document explains where the data comes from, how the LoRAs are produced, and how you can reuse them for inference, evaluation, or further training.
## Provenance
- Base model: `Qwen2.5-1.5B-Instruct`
- Datasets: sampled from `../../prepare/data/math/*.json`. Each JSON is a list of `{prompt, response, system?}` records. `dataset_sampler.py` draws 10 disjoint groups of 100 samples (unless the dataset has <1,000 examples, in which case sampling with replacement keeps the group size fixed) using a deterministic seed derived from the dataset name.
- Training recipe (from `config/default.yaml`):
  - sequence length 4,096; LoRA `r=64`, `alpha=128`, `dropout=0.05`, target modules = `{q,k,v,o,gate,up,down}_proj`
  - 12 epochs / max 1,800 steps, learning rate `1e-4`, batch size per device `2`, gradient accumulation `16`, BF16 training, gradient checkpointing on, weight decay `0.01`, warmup ratio `0.03`, checkpoints saved every 300 steps (keeping at most 6) plus a final adapter export
  - tokenizers are cloned from the base model (pad token defaults to EOS if missing)
- Monitoring & reproducibility:
  - Trainer logs (loss, LR, throughput) are in `../logs/<dataset>/group_xx/`.
  - Slurm stdout/err for each shard live in `../logs/slurm/`.
  - `metadata.json` captures the git commit (if `GIT_COMMIT` was set), timestamps, seeds, and the effective batch size so any experiment can be repeated exactly.
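For reference, the recipe above maps roughly onto the following PEFT / Hugging Face settings. This is a sketch reconstructed from the values listed here, not a copy of `config/default.yaml` (the pipeline builds its own objects; argument names below are the standard `peft`/`transformers` ones):

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA hyperparameters as documented above.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Trainer settings matching the recipe; max_steps caps the 12 epochs at 1,800 steps.
training_args = TrainingArguments(
    output_dir="outputs/Math_QA/group_00",   # illustrative path
    num_train_epochs=12,
    max_steps=1800,
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    bf16=True,
    gradient_checkpointing=True,
    weight_decay=0.01,
    warmup_ratio=0.03,
    save_steps=300,
    save_total_limit=6,
)
```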
## End-to-end data flow
- Raw JSON data comes from `../../prepare/data/math`. Each file is a list of dict objects with keys:

  ```json
  {
    "prompt": "...question...",
    "response": "...reference answer...",
    "system": "optional system message"
  }
  ```

- `python -m train_lora.dataset_sampler --config config/default.yaml` reads every dataset, filters out `GSM8K_test.json`, and deterministically samples 10×100 items per dataset. The samples plus metadata (indices, seeds, timestamps) are written to `../prompt_groups/<dataset>/group_xx.json`.
- `python -m train_lora.run_tasks --run` (or the Slurm array) iterates dataset/group pairs, loads the corresponding prompt group, and performs LoRA fine-tuning with the Hugging Face `Trainer`.
- After training finishes, the following artifacts land in `outputs/<dataset>/group_xx/`:
  - a ready-to-use LoRA adapter (`adapter/`)
  - intermediate checkpoints for analysis/resume
  - tokenizers and metadata
- The evaluation stacks (`../评估体系`, `../parameter_generator/评估`) and the LoRA parameter generator both consume these directories directly.
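To make the record format concrete, here is a small sketch that reads one sampled group and assembles chat-style messages from its records. The `samples` wrapper key is an assumption; check the JSON actually written by `dataset_sampler.py` and adjust accordingly:

```python
import json

# One sampled 100-shot group (schema: a list of {prompt, response, system?} records,
# possibly wrapped together with sampling metadata).
with open("prompt_groups/Math_QA/group_00.json", encoding="utf-8") as f:
    group = json.load(f)

# Assumption: if the file is a dict, the records live under a "samples" key.
records = group["samples"] if isinstance(group, dict) else group

for rec in records[:3]:
    messages = []
    if rec.get("system"):
        messages.append({"role": "system", "content": rec["system"]})
    messages.append({"role": "user", "content": rec["prompt"]})
    messages.append({"role": "assistant", "content": rec["response"]})
    print(messages)
```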
## Directory layout
```
outputs/
├── Competition_Math/
├── GSM8K_train/
├── MATH/
├── Math-IIO-68K-Mini/
├── Math-Plus/
├── Math_QA/
├── Mu-Math/
└── ToT-Math-V1/
```
Each dataset directory contains `group_00` … `group_09`. Inside every group:
| Item | Description |
|---|---|
| `adapter/` | Final LoRA export (`adapter_model.safetensors`, `adapter_config.json`, tokenizer + chat template snapshots, and HF `training_args.bin`). This is the folder you will load for inference. |
| `checkpoints/checkpoint-xxxx/` | Intermediate Trainer checkpoints saved every 300 steps (300–1,800). They include optimizer, scheduler, RNG state, and tokenizer copies for resuming or studying training dynamics. |
| `tokenizer/` | Standalone tokenizer snapshot identical to the one used during training; useful if you need a self-contained deployment without referencing the base model directory. |
| `prompt_group.json` | The exact 100-shot dataset used for this training run (a copy of `prompt_groups/<dataset>/group_xx.json`). Contains metadata such as sampled indices, original source file, and timestamp. |
| `metadata.json` | Provenance record with training loss, Trainer metrics, LoRA config, effective batch size/world size, timestamps, git commit (if exported), and file paths. |
| `metadata.json` -> `trainer_state` | Full training log history (per-step metrics). Disable via `metadata.save_training_state: false` if you want lighter metadata. |
Tip: Use `metadata.json` to find the latest checkpoint, to confirm which base model/tokenizer were used, or to drive automated uploads/evaluations.
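For example, a small sketch that pulls a few of these fields and locates the newest checkpoint on disk. Field names follow the "File reference (metadata.json)" table below; treat any missing key as possible:

```python
import json
from pathlib import Path

run_dir = Path("outputs/Math_QA/group_00")
meta = json.loads((run_dir / "metadata.json").read_text(encoding="utf-8"))

# Key provenance fields (see the field reference table further down).
print("dataset:", meta.get("dataset_name"), "group:", meta.get("group_index"))
print("train loss:", meta.get("train_loss"))
print("effective batch size:", meta.get("effective_batch_size"))

# Find the latest intermediate checkpoint (directories are named checkpoint-<step>).
ckpt_root = Path(meta.get("checkpoint_root", run_dir / "checkpoints"))
checkpoints = sorted(ckpt_root.glob("checkpoint-*"), key=lambda p: int(p.name.split("-")[-1]))
print("latest checkpoint:", checkpoints[-1] if checkpoints else None)
```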
## Dataset overview
| Dataset dir | Source file (relative to `prepare/data/math`) | Notes |
|---|---|---|
| `Competition_Math` | `Competition_Math.json` | 100-shot groups drawn from Competition Math practice problems. |
| `GSM8K_train` | `GSM8K_train.json` | Standard GSM8K train split, excluding the public test set (`GSM8K_test.json` was filtered out). |
| `MATH` | `MATH.json` | High-school & olympiad math benchmark. |
| `Math-IIO-68K-Mini` | `Math-IIO-68K-Mini.json` | Mini version of the Math IIO dataset. |
| `Math-Plus` | `Math-Plus.json` | Composed of challenging math word problems. |
| `Math_QA` | `Math_QA.json` | Multiple-choice MathQA dataset reformatted to open-ended QA. |
| `Mu-Math` | `Mu-Math.json` | MuSR-style math reasoning set. |
| `ToT-Math-V1` | `ToT-Math-V1.json` | Tree-of-Thought flavored math prompts. |
All datasets follow the same JSON schema, so swapping between them only changes topical coverage.
## How to navigate a single group
```
Math_QA/
└── group_00/
    ├── adapter/
    │   ├── adapter_config.json
    │   ├── adapter_model.safetensors
    │   ├── tokenizer/…          (extra copies of merges, vocab, chat_template.jinja)
    │   └── training_args.bin
    ├── checkpoints/
    │   ├── checkpoint-300/
    │   ├── checkpoint-600/
    │   └── …
    ├── tokenizer/               # same as base tokenizer but pinned to this run
    ├── prompt_group.json        # 100-shot data
    └── metadata.json
```
When inspecting or sharing a run, the minimum file set is `adapter/` + `prompt_group.json` + `metadata.json`. Everything else speeds up resuming or auditing.
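A quick way to copy exactly that minimal set into a separate folder (paths are illustrative; the destination layout is just a suggestion):

```python
import shutil
from pathlib import Path

src = Path("outputs/Math_QA/group_00")
dst = Path("share/Math_QA/group_00")
dst.mkdir(parents=True, exist_ok=True)

# Minimal, self-describing subset: the adapter plus its training data and provenance record.
shutil.copytree(src / "adapter", dst / "adapter", dirs_exist_ok=True)
shutil.copy2(src / "prompt_group.json", dst / "prompt_group.json")
shutil.copy2(src / "metadata.json", dst / "metadata.json")
```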
## Using the adapters
### 0. Environment prerequisites
- Python ≥ 3.10, `transformers >= 4.37`, `peft >= 0.8`, `accelerate`, `safetensors`, `torch` (GPU build).
- The base model directory must be accessible; otherwise download `Qwen2.5-1.5B-Instruct` from Hugging Face and update the `base_model` path.
- Optional: set `HF_HOME` / `TRANSFORMERS_CACHE` to avoid repeated downloads.
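A quick sanity check of the environment before loading anything (purely illustrative):

```python
# Confirm the core libraries import and a GPU is visible.
import torch
import transformers
import peft

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("transformers:", transformers.__version__)
print("peft:", peft.__version__)
```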
### 0.5. Reproduce the training pipeline (optional)
If someone wants to regenerate any adapter from scratch:
```bash
cd train_lora
python -m train_lora.dataset_sampler --overwrite   # regenerates prompt groups
python -m train_lora.train_single --dataset Math_QA --group 0
# or run the full queue
python -m train_lora.run_tasks --run
```
These commands will rebuild `prompt_groups/` and `outputs/` with exactly the same seeds and configuration documented above. Slurm users should submit `sbatch run_lora_multinode.sh`.
### 1. Load adapter with PEFT
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "Qwen2.5-1.5B-Instruct"              # local path to the base model (or HF id)
adapter_dir = "outputs/Math_QA/group_00/adapter"

# Load the tokenizer from the adapter so the chat template matches the one used in training.
tokenizer = AutoTokenizer.from_pretrained(adapter_dir, trust_remote_code=True)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_dir)

prompt = "Solve 3x + 7 = 22."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
Notes:

- Loading the tokenizer from `adapter/` ensures an identical chat template and additional tokens (if any). You can also point to the base tokenizer path if you prefer.
- For batch inference or deployment, call `model.merge_and_unload()` if you need a single combined set of weights (at the cost of losing LoRA toggling); see the sketch below.
- If you want maximal throughput on a single GPU, also call `model.half()` or `model.to(torch.bfloat16)` depending on your hardware; the adapters were trained with BF16, so keeping BF16 is the safest choice.
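A minimal sketch of the merge-and-save path mentioned above, assuming `model` and `tokenizer` come from the loading example (the output directory is illustrative):

```python
# Fold the LoRA deltas into the base weights and export a standalone model.
merged = model.merge_and_unload()
merged.save_pretrained("merged/Math_QA_group_00")
tokenizer.save_pretrained("merged/Math_QA_group_00")
```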
### 2. Resume or continue training
```bash
python -m train_lora.train_single \
  --dataset Math_QA \
  --group 0 \
  --group-file outputs/Math_QA/group_00/prompt_group.json
```
Set `--group-file` to reuse the same 100 samples, and initialize the `Trainer` from `checkpoints/checkpoint-XXXX` via `TrainingArguments.resume_from_checkpoint`. This lets you reproduce a group exactly or extend its training.
To resume manually:
```python
trainer.train(resume_from_checkpoint="outputs/Math_QA/group_00/checkpoints/checkpoint-1500")
```
### 3. Evaluate with Math-Verify
The evaluation stacks in `../评估体系` and `../parameter_generator/评估` expect this directory layout. Example:
```bash
cd 评估体系
python scripts/run_all_evals.py \
  --config configs/eval_config.yaml \
  --datasets Math_QA \
  --groups 0 1
```
### 4. Packaging for distribution
- Upload only `adapter/` and `metadata.json` when sharing publicly (e.g., on Hugging Face) to avoid huge checkpoint directories.
- Keep `prompt_group.json` if you want consumers to understand the training data or to regenerate LoRA weights with the same samples.
- When exporting, include a README snippet that references this document so downstream users know the provenance.
- Suggested Hugging Face layout:

  ```
  Math_QA/
    group_00/
      adapter/
      prompt_group.json
      metadata.json
      README.md   (copy sections describing provenance + usage)
  ```
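If you prefer to script the upload, something along these lines works with `huggingface_hub` (a sketch; the repo id is a placeholder and you must be authenticated, e.g. via `huggingface-cli login`):

```python
from huggingface_hub import HfApi

api = HfApi()
run_dir = "outputs/Math_QA/group_00"

# Hypothetical target repo; replace with your own namespace.
repo_id = "your-org/qwen2.5-1.5b-math-qa-lora-group00"
api.create_repo(repo_id, repo_type="model", exist_ok=True)

# Upload only the lightweight, shareable pieces.
api.upload_folder(folder_path=f"{run_dir}/adapter", repo_id=repo_id, path_in_repo="adapter")
api.upload_file(path_or_fileobj=f"{run_dir}/metadata.json", path_in_repo="metadata.json", repo_id=repo_id)
api.upload_file(path_or_fileobj=f"{run_dir}/prompt_group.json", path_in_repo="prompt_group.json", repo_id=repo_id)
```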
## File reference (`metadata.json`)
Key fields you may want to automate against:
| Field | Meaning |
|---|---|
| `dataset_name`, `group_index` | Identify the run. |
| `prompt_group_file` | Absolute path back to the sampled dataset. |
| `checkpoint_root` | Where all intermediate checkpoints live. |
| `train_loss`, `metrics` | Final loss and Trainer metrics dict. |
| `trainer_state` | Full log history (can be large; disable via `metadata.save_training_state`). |
| `training_args` | Exact HF `TrainingArguments` snapshot. |
| `lora_config` | Copy of the LoRA hyperparameters used. |
| `effective_batch_size` | `world_size × per_device_batch_size × grad_accum`; useful for scaling comparisons. |
| `git_commit` | Populated if the `GIT_COMMIT` env var was set before training. |
| `metrics.train_runtime`, `metrics.train_samples_per_second` | Throughput stats. |
| `generated_at` | UTC timestamp when the metadata was written. |
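As an example of automating against these fields, here is a sketch that sweeps every run in the collection and prints its final training loss (it assumes the field names in the table above and skips nothing, so missing fields simply print as `None`):

```python
import json
from pathlib import Path

rows = []
for meta_path in sorted(Path("outputs").glob("*/group_*/metadata.json")):
    meta = json.loads(meta_path.read_text(encoding="utf-8"))
    rows.append((meta.get("dataset_name"), meta.get("group_index"), meta.get("train_loss")))

for dataset, group, loss in rows:
    print(f"{dataset!s:>20}  group {group!s:>2}  train_loss={loss}")
```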
## Best practices
- Always match the precision used when loading the base model to the precision the adapters were trained with; these adapters were trained in BF16.
- If you edit files inside this directory, keep the structure intact; other scripts rely on relative paths (`adapter`, `tokenizer`, `metadata.json`).
- Before deploying a new LoRA, verify it with the evaluation suite, and consider merging multiple groups (e.g., ensembling or checkpoint averaging) only after confirming stability; see the sketch after this list.
- Use `prompt_group.json` and `metadata.json` as documentation when presenting results; they already include seeds, sample indices, and environment details.
- If you build new LoRAs with different configs (e.g., higher rank, more steps), add a sibling directory (e.g., `outputs_v2/`) or annotate the README so collaborators know which adapters correspond to which experiment.
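For the checkpoint-averaging idea mentioned above, a minimal sketch that averages `adapter_model.safetensors` across a few groups of the same dataset. This is not something the pipeline does for you; it assumes the groups share an identical LoRA config (they do within one dataset here), and the result should be validated with the evaluation suite before use:

```python
import shutil
from pathlib import Path

import torch
from safetensors.torch import load_file, save_file

# Any subset of groups from the same dataset (paths are illustrative).
group_dirs = [Path(f"outputs/Math_QA/group_{i:02d}/adapter") for i in range(3)]

# Average the LoRA weight tensors element-wise across the selected groups.
state_dicts = [load_file(str(d / "adapter_model.safetensors")) for d in group_dirs]
averaged = {
    key: torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0).to(state_dicts[0][key].dtype)
    for key in state_dicts[0]
}

# Reuse the config/tokenizer from the first group and write the averaged weights next to them.
out_dir = Path("outputs/Math_QA/group_avg/adapter")
shutil.copytree(group_dirs[0], out_dir, dirs_exist_ok=True)
save_file(averaged, str(out_dir / "adapter_model.safetensors"))
```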
Happy finetuning! If you extend this collection (new datasets, extra groups, or different hyperparameters), add another section here describing the changes so downstream consumers stay informed.