---
language:
- en
tags:
- spiking-neural-network
- SNN
- neuromorphic
- language-model
- from-scratch
- energy-efficient
---
# Nord: Spiking Neural Network Language Model (144M)
The first pure SNN language model with a fully original architecture, trained from scratch.
## Model Description
Nord is a 144M-parameter Spiking Neural Network (SNN) for text generation. It uses biologically inspired neurons with membrane potentials, firing thresholds, and binary spikes. Unlike other SNN language models, Nord was trained entirely from scratch: no transformer teacher, no distillation, no ANN-to-SNN conversion.
## Key Features
| Feature | Details |
|---|---|
| Parameters | 144.3M |
| Architecture | Original (not RWKV, not Transformer) |
| Training method | From scratch with surrogate gradients |
| Training data | FineWeb-Edu |
| Sparsity (training) | 97% |
| Sparsity (inference) | 97-99.8% |
| Online learning | STDP active during inference |
| Mobile deployment | Android via Termux |
| Training cost | ~$10 USD |
## Architecture
Nord combines five mechanisms from different subfields:
- **LeakyClamp**: prevents gradient death in deep SNN layers
- **Multi-Scale Temporal Encoding**: `T_fast = 8` plus `T_slow = 2` timesteps
- **Associative Cascade**: chain reactions keep sparse networks alive
- **Temporal Co-firing Resonance**: feature binding without attention
- **Reward-Modulated STDP**: aligns Hebbian learning with backprop
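This card does not include Nord's neuron code, but the general pattern behind "from scratch with surrogate gradients" can be sketched as a leaky integrate-and-fire neuron whose Heaviside spike receives a smooth gradient in the backward pass. This is a generic illustration; Nord's actual LeakyClamp neuron, threshold, and decay values are assumptions here, not the real implementation:

```python
import torch


class SurrogateSpike(torch.autograd.Function):
    """Heaviside step in the forward pass, sigmoid-derivative surrogate
    in the backward pass (slope 4.0 is a placeholder, not Nord's value)."""

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        sg = torch.sigmoid(4.0 * v)
        return grad_out * 4.0 * sg * (1 - sg)


class LIFNeuron(torch.nn.Module):
    """Generic leaky integrate-and-fire neuron over T timesteps."""

    def __init__(self, threshold: float = 1.0, decay: float = 0.9):
        super().__init__()
        self.threshold = threshold
        self.decay = decay

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (T, batch, features) input current over T timesteps
        mem = torch.zeros_like(x[0])
        spikes = []
        for t in range(x.shape[0]):
            mem = self.decay * mem + x[t]                    # leaky integration
            spike = SurrogateSpike.apply(mem - self.threshold)
            mem = mem - spike * self.threshold               # soft reset on firing
            spikes.append(spike)
        return torch.stack(spikes)                           # binary (T, batch, features)
```

The surrogate trick is what makes backprop through binary spikes possible at all: the forward pass stays a hard 0/1 decision, while the backward pass pretends it was a steep sigmoid.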
## Model Configuration

```yaml
d_model: 512
n_layers: 6
n_heads: 8
d_ff: 1024
T_fast: 8
T_slow: 2
max_seq_len: 512
vocab_size: 128256
tokenizer: Llama-3.2 (meta-llama/Llama-3.2-1B)
```
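As a sanity check, the listed configuration roughly reproduces the stated 144.3M parameters if one assumes untied input/output embeddings and standard attention/FFN-shaped projections. The layer breakdown below is an assumption for illustration; Nord's real layer shapes may differ:

```python
# Rough parameter budget for the listed config (assumed layer shapes).
d_model, n_layers, d_ff, vocab = 512, 6, 1024, 128_256

embed = vocab * d_model        # input embedding table
head = vocab * d_model         # output projection (assumed untied)
attn = 4 * d_model * d_model   # q, k, v, out projections per layer
ffn = 2 * d_model * d_ff       # up + down projections per layer
layers = n_layers * (attn + ffn)

total = embed + head + layers
print(f"{total / 1e6:.1f}M")   # prints 143.9M; the small gap to 144.3M
                               # would come from norms, biases, etc.
```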
## Training
- Dataset: FineWeb-Edu (~950M tokens, 10GB subset)
- Hardware: RTX A5000 24GB (rented on Vast.ai at $0.177/hr)
- Optimizer: AdamW (lr=6e-4, weight_decay=0.05)
- Batch size: 48 (micro-batch 4 × grad_accum 12)
- Sequence length: 512
- Loss progression: 6.28 (1k) → 5.30 (5k) → 5.00 (10k) → 4.59 (34k)
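A quick back-of-the-envelope calculation from the numbers above shows the 34k-step run covers slightly less than one pass over the ~950M-token subset:

```python
# Tokens seen at step 34k (the last reported loss), from batch size 48
# and sequence length 512 as listed above.
tokens_per_step = 48 * 512                 # 24,576 tokens per optimizer step
tokens_seen = 34_000 * tokens_per_step
print(f"{tokens_seen / 1e6:.0f}M tokens")  # prints 836M tokens, i.e. just
                                           # under one epoch of ~950M tokens
```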
## Usage

```python
import torch
from nord_core import NordConfig, NordModel
from transformers import AutoTokenizer

# Load the checkpoint and rebuild the model
ckpt = torch.load("nord_latest.pt", map_location="cuda")
cfg = NordConfig(**ckpt["config"])
model = NordModel(cfg).cuda()
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
```
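The card does not show a generation helper, so here is a minimal temperature-sampling loop. It assumes `NordModel`'s forward returns logits of shape `(batch, seq, vocab)`; the real forward signature may differ:

```python
import torch

@torch.no_grad()
def generate(model, tokenizer, prompt, max_new_tokens=50, temperature=0.8):
    """Sample tokens autoregressively (sketch; assumes model(ids) -> logits)."""
    device = next(model.parameters()).device
    ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :] / temperature   # last-position logits
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=-1)
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

After loading as above, something like `generate(model, tokenizer, "Spiking neural networks are")` would produce a continuation.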
Or use the interactive chat:
```bash
python chat.py
```
## Generation Examples (step 34k, loss 4.59)
**Prompt:** "If you don't write properly I will delete your file"

**Nord:** "(I'm really careful) that a project is pretty good for the following reasons... The file will then turn this data into your code. It's not an algorithm to say that all the file is a project called 'read'. In order to get a lot of information on you are using this program, and we'll have no more knowledge about them!"
**Prompt:** "bro my laptop have problems and off wifi how on wifi"

**Nord:** "The project has been carried out by researchers at a site in 2013, including three main projects that work with wireless phones... using Wi-Fi sensors to monitor their devices such as wireless LANs... network that would have been created using more than 2% of 4 million people to use Bluetooth technology."
## Spike Statistics
| Context | Sparsity | Interpretation |
|---|---|---|
| Familiar topic | 99.8% | Confident: minimal neural activity |
| Training | 97% | Active learning: neurons spiking |
| Out-of-distribution | 77% | Uncertain: massive activation |
Sparsity functions as a built-in uncertainty detector; no separate calibration is needed.
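For reference, sparsity as used in the table above is simply the fraction of silent entries in a recorded spike tensor. How Nord exposes its spike activations is not shown in this card, so the helper below is a hypothetical hook:

```python
import torch

def spike_sparsity(spikes: torch.Tensor) -> float:
    """Fraction of silent (zero) entries in a binary spike tensor.
    Illustrative helper; Nord's actual instrumentation is not shown here."""
    return (spikes == 0).float().mean().item()
```

A tensor where 1 neuron in 100 fires would report 0.99, matching the "familiar topic" regime above.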
## Limitations
- Repetition remains an issue (mitigated with repetition penalty in decoding)
- Not competitive with GPT-2 in raw quality
- Scaling above 144M is untested
- No formal benchmark evaluation yet
- Hallucination present (generates plausible but fictional details)
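The repetition-penalty mitigation mentioned above can follow the common recipe of dividing positive logits of already-generated tokens by the penalty and multiplying negative ones. This is a standard decoding trick, not necessarily Nord's exact decoder code; the penalty value is a typical default:

```python
import torch

def apply_repetition_penalty(logits: torch.Tensor, generated_ids, penalty: float = 1.2):
    """Discourage re-sampling of tokens already generated.
    logits: 1-D tensor over the vocabulary; generated_ids: token ids so far."""
    for tok in set(generated_ids):
        if logits[tok] > 0:
            logits[tok] = logits[tok] / penalty   # shrink positive logits
        else:
            logits[tok] = logits[tok] * penalty   # push negative logits lower
    return logits
```

In a sampling loop this would be applied to the last-position logits before the softmax.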
## Comparison with Other SNN Language Models
| Model | Params | From Scratch? | Architecture |
|---|---|---|---|
| Nord | 144M | ✅ | Fully original |
| SpikeGPT | 216M | ❌ | Modified RWKV |
| SpikeLLM | 7–70B | ❌ | Converted LLaMA |
| SpikeBERT | ~110M | ❌ | Distilled from BERT |
| BrainTransformers | 3B | ❌ | Converted Qwen2 |
## Citation

```bibtex
@misc{nord2025,
  title={Nord: A Spiking Neural Network Language Model Trained from Scratch},
  author={zerdovzad},
  year={2025},
  url={https://github.com/gtausa197-svg/-Project-Nord-Spiking-Neural-Network-Language-Model}
}
```
## About
Built by an 18-year-old electronics student from Ukraine, studying in Norway. No PhD, no team, no funding.