Model Card for harpertoken/harpertokenGPT2

A GPT-2 small model trained from scratch on the WikiText-2-raw-v1 dataset for English text generation.

Model Details

Model Description

This is a GPT-2 small model (117M parameters) trained from random initialization on the WikiText-2-raw-v1 dataset. It can generate coherent text continuations.

  • Developed by: Niladri Das
  • Model type: GPT-2
  • Language(s) (NLP): English
  • License: Apache-2.0

Uses

Direct Use

Use for text generation tasks, such as completing sentences or generating stories.

Out-of-Scope Use

Not suitable for tasks requiring factual accuracy, safety-critical applications, or languages other than English.

Bias, Risks, and Limitations

The model was trained on WikiText-2, which is derived from Wikipedia and may carry biases present in the source articles. It may generate factually incorrect, inappropriate, or biased content.

Recommendations

Use with caution and apply content filtering for production use; a minimal illustrative filter is sketched below.
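
A minimal keyword-based filter is sketched here for illustration. The BLOCKLIST contents and the generate_filtered helper are hypothetical, not part of this model; production systems should use a dedicated moderation model or service instead.

from transformers import pipeline

# Hypothetical blocklist; replace with a real moderation list or model.
BLOCKLIST = {'example_blocked_term'}

def generate_filtered(prompt, model_id='harpertoken/harpertokenGPT2'):
    generator = pipeline('text-generation', model=model_id)
    text = generator(prompt, max_new_tokens=50)[0]['generated_text']
    # Reject outputs containing blocked terms (a crude stand-in for real filtering).
    if any(term in text.lower() for term in BLOCKLIST):
        return None
    return text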

How to Get Started with the Model

from transformers import pipeline

# Load the text-generation pipeline with this model and generate a continuation
generator = pipeline('text-generation', model='harpertoken/harpertokenGPT2')
print(generator("The quick brown fox"))

Training Details

Training Data

The WikiText-2-raw-v1 dataset, a raw (untokenized) collection of Wikipedia articles.
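
For reference, the dataset is available on the Hugging Face Hub and can be loaded with the datasets library (a minimal sketch; the exact loading code used for training is not shown in this card):

from datasets import load_dataset

# Raw (untokenized) WikiText-2 as distributed on the Hugging Face Hub
dataset = load_dataset('wikitext', 'wikitext-2-raw-v1')
print(dataset['train'][0]['text'])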

Training Procedure

The model was trained from scratch (random initialization) using PyTorch and the Hugging Face Transformers library; a minimal training sketch follows the hyperparameters below.

Training Hyperparameters

  • Epochs: 3
  • Batch size: 1
  • Learning rate: 5e-5
  • Max sequence length: 512 tokens
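
The sketch below shows how these hyperparameters could be used with the Transformers Trainer. It assumes the standard GPT-2 tokenizer and the dataset loading shown earlier; it is illustrative, not the exact script used to produce this model.

from datasets import load_dataset
from transformers import (DataCollatorForLanguageModeling, GPT2Config,
                          GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained('gpt2')
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Random initialization with the default (GPT-2 small) configuration
model = GPT2LMHeadModel(GPT2Config())

raw = load_dataset('wikitext', 'wikitext-2-raw-v1')

def tokenize(batch):
    return tokenizer(batch['text'], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=['text'])
tokenized = tokenized.filter(lambda ex: len(ex['input_ids']) > 0)  # drop empty lines

args = TrainingArguments(
    output_dir='out',
    num_train_epochs=3,
    per_device_train_batch_size=1,
    learning_rate=5e-5,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized['train'],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()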

Evaluation

Evaluation was qualitative, based on inspecting generated text continuations for coherence.
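
The coherence check can be reproduced with a simple sampling loop (an illustrative sketch, not the exact evaluation procedure; the prompts below are arbitrary):

from transformers import pipeline

generator = pipeline('text-generation', model='harpertoken/harpertokenGPT2')
prompts = ['The history of the city', 'In the early morning,', 'The quick brown fox']
for prompt in prompts:
    out = generator(prompt, max_new_tokens=40, do_sample=True)[0]['generated_text']
    print('---', prompt)
    print(out)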

Results

Generates plausible text continuations.

Environmental Impact

  • Hardware Type: CPU / Apple MPS
  • Hours used: ~0.17 (about 10 minutes)
  • Carbon Emitted: Negligible (short local training run)

Technical Specifications

Model Architecture and Objective

GPT-2 decoder-only transformer for causal language modeling.
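
The architecture corresponds to the default Transformers GPT-2 configuration, i.e. GPT-2 small (a sketch for illustration):

from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config()  # defaults: 12 layers, 12 heads, 768-dim embeddings, 50257-token vocab
model = GPT2LMHeadModel(config)  # randomly initialized decoder-only model with a causal LM head
print(f'{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters')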

Compute Infrastructure

  • Hardware: Mac with MPS
  • Software: PyTorch, Transformers