# IBLM - GPT2-Small (FineWeb 10B)
A custom GPT model trained on 10B tokens of the FineWeb dataset.
## Model Details
- Architecture: Custom GPT with value residual connections and lambda mixing (sketched after this list)
- Parameters: ~124M (GPT2-small scale)
- Training Data: FineWeb 10B tokens
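
The exact layer code ships with the repository and is loaded via `trust_remote_code=True`. As a rough, hypothetical sketch only (the class name `ValueResidualAttention` and the scalar `self.lam` are illustrative, not taken from this repo), value residual connections with learnable lambda mixing are commonly implemented along these lines: each layer's value tensor is blended with the value tensor computed by the first layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ValueResidualAttention(nn.Module):
    """Hypothetical sketch: causal self-attention whose value tensor is
    mixed with the first layer's value tensor via a learnable lambda."""

    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # Learnable mixing coefficient (lambda), initialised at 0.5 (assumption).
        self.lam = nn.Parameter(torch.tensor(0.5))

    def forward(self, x, v_first=None):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Split heads: (B, n_heads, T, head_dim)
        q, k, v = (t.view(B, T, self.n_heads, -1).transpose(1, 2) for t in (q, k, v))
        if v_first is None:
            v_first = v  # the first layer keeps its own values
        else:
            # Value residual: mix current values with first-layer values.
            v = self.lam * v + (1.0 - self.lam) * v_first
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y), v_first
```

In designs like this, `v_first` produced by layer 0 is threaded through all subsequent layers, and each layer learns its own lambda; the actual model may differ in details.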
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Ksgk-fy/iblm",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Generate text
input_ids = tokenizer("Hello, world", return_tensors="pt").input_ids
with torch.no_grad():
    outputs = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(outputs[0]))
```
## Citation
If you use this model, please cite...