language:
- en
tags:
- spiking-neural-network
- SNN
- neuromorphic
- language-model
- from-scratch
- energy-efficient

⚑ Nord: Spiking Neural Network Language Model (144M)

The first pure SNN language model with a fully original architecture, trained from scratch.

Model Description

Nord is a 144M-parameter Spiking Neural Network (SNN) for text generation. It uses biologically inspired neurons with membrane potentials, firing thresholds, and binary spikes. Unlike most other SNN language models, Nord was trained entirely from scratch: no transformer teacher, no distillation, no ANN-to-SNN conversion.
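For readers new to these terms, here is a minimal sketch of a textbook leaky integrate-and-fire (LIF) neuron. It is not Nord's actual neuron: the function name `lif_run`, the decay factor `beta`, the threshold value, and the reset-to-zero rule are all illustrative assumptions.

```python
def lif_run(inputs, beta=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron over a list of input currents.

    beta and threshold are illustrative values, not Nord's settings.
    Returns one binary spike (0 or 1) per timestep.
    """
    v = 0.0  # membrane potential
    spikes = []
    for x in inputs:
        v = beta * v + x                    # leak, then integrate the input
        fired = 1 if v >= threshold else 0  # binary spike when the threshold is reached
        spikes.append(fired)
        if fired:
            v = 0.0                         # hard reset after a spike
    return spikes


print(lif_run([0.5, 0.5, 0.5, 0.0, 1.2]))  # [0, 0, 1, 0, 1]
```

The neuron stays silent until enough input has accumulated, which is why most entries in a spike train are zero.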

Key Features

| Feature | Details |
| --- | --- |
| Parameters | 144.3M |
| Architecture | Original (not RWKV, not Transformer) |
| Training method | From scratch with surrogate gradients |
| Training data | FineWeb-Edu |
| Sparsity (training) | 97% |
| Sparsity (inference) | 97-99.8% |
| Online learning | STDP active during inference |
| Mobile deployment | Android via Termux |
| Training cost | ~$10 USD |
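Surrogate gradients are what make training from scratch possible: the spike function is a hard step whose true derivative is zero almost everywhere, so backpropagation substitutes a smooth stand-in. The sketch below shows the general idea with a sigmoid-derivative surrogate. The choice of surrogate, the `slope` value, and the function names are assumptions, since the card does not say which surrogate Nord uses.

```python
import math


def spike(v, threshold=1.0):
    """Forward pass: Heaviside step, 1.0 once the potential reaches the threshold."""
    return 1.0 if v >= threshold else 0.0


def surrogate_grad(v, threshold=1.0, slope=4.0):
    """Backward-pass stand-in: derivative of sigmoid(slope * (v - threshold)).

    slope is an illustrative value; larger values give a narrower bump.
    """
    s = 1.0 / (1.0 + math.exp(-slope * (v - threshold)))
    return slope * s * (1.0 - s)


for v in (0.0, 0.9, 1.0, 1.1, 2.0):
    print(f"v={v:.1f} spike={spike(v):.0f} surrogate_grad={surrogate_grad(v):.3f}")
```

The surrogate peaks at the threshold and decays on both sides, so neurons close to firing receive the largest learning signal.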

Architecture

Nord combines five mechanisms from different subfields:

  • LeakyClamp: Prevents gradient death in deep SNN layers
  • Multi-Scale Temporal Encoding: T_fast=8 + T_slow=2 timesteps
  • Associative Cascade: Chain reactions keep sparse networks alive
  • Temporal Co-firing Resonance: Feature binding without attention
  • Reward-Modulated STDP: Aligns Hebbian learning with backprop
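Of the five mechanisms, reward-modulated STDP has a well-known textbook form, sketched below for a single pre/post spike pair. This is the generic pair-based rule scaled by a reward signal, not Nord's implementation. The function name `stdp_delta` and every constant (`a_plus`, `a_minus`, `tau`) are illustrative assumptions.

```python
import math


def stdp_delta(t_pre, t_post, reward=1.0, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for one pre/post spike pair, scaled by a reward signal.

    A pre-before-post pair strengthens the synapse, post-before-pre weakens it,
    and the reward scales (or, if negative, flips) the update.
    All constants are illustrative.
    """
    dt = t_post - t_pre
    if dt > 0:
        dw = a_plus * math.exp(-dt / tau)    # causal pair: potentiation
    elif dt < 0:
        dw = -a_minus * math.exp(dt / tau)   # anti-causal pair: depression
    else:
        dw = 0.0                             # simultaneous spikes: no change
    return reward * dw


print(stdp_delta(t_pre=10, t_post=15))              # positive: pre fired first
print(stdp_delta(t_pre=15, t_post=10))              # negative: post fired first
print(stdp_delta(t_pre=10, t_post=15, reward=0.0))  # zero reward, no update
```

One plausible reading of "aligns Hebbian learning with backprop" is that the reward term is derived from the training loss, but the card does not spell this out.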

Model Configuration

d_model: 512
n_layers: 6
n_heads: 8
d_ff: 1024
T_fast: 8
T_slow: 2
max_seq_len: 512
vocab_size: 128256
tokenizer: Llama-3.2 (meta-llama/Llama-3.2-1B)
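To put the configuration in perspective, the sketch below mirrors the listed values in a hypothetical dataclass (`NordConfigSketch`, not the real `NordConfig`) and computes the size of a token embedding table. It assumes the embedding is a single `vocab_size` x `d_model` matrix, which the card does not confirm.

```python
from dataclasses import dataclass


@dataclass
class NordConfigSketch:
    """Hypothetical mirror of the listed values; not the real NordConfig."""
    d_model: int = 512
    n_layers: int = 6
    n_heads: int = 8
    d_ff: int = 1024
    T_fast: int = 8
    T_slow: int = 2
    max_seq_len: int = 512
    vocab_size: int = 128_256


cfg = NordConfigSketch()
embedding_params = cfg.vocab_size * cfg.d_model  # one row of d_model floats per token
print(f"token embedding parameters: {embedding_params:,}")             # 65,667,072
print(f"share of 144.3M total: {embedding_params / 144.3e6:.1%}")      # 45.5%
```

Under that assumption, the large Llama-3.2 vocabulary accounts for a substantial share of the parameter budget at this model width.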

Training

  • Dataset: FineWeb-Edu (~950M tokens, 10GB subset)
  • Hardware: RTX A5000 24GB (rented on Vast.ai at $0.177/hr)
  • Optimizer: AdamW (lr=6e-4, weight_decay=0.05)
  • Batch size: 48 (4 × grad_accum=12)
  • Sequence length: 512
  • Loss progression: 6.28 (1k) → 5.30 (5k) → 5.00 (10k) → 4.59 (34k)
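The numbers above can be cross-checked with a little arithmetic. The sketch below assumes that "34k" counts optimizer steps (not micro-batches), that every sequence is a full 512 tokens, and that the reported loss is mean cross-entropy in nats. None of these are stated explicitly, so treat the results as rough estimates.

```python
import math

micro_batch = 4
grad_accum = 12
seq_len = 512
steps = 34_000
final_loss = 4.59
price_per_hour = 0.177  # USD, Vast.ai rate quoted above
budget = 10.0           # USD, approximate training cost from the feature list

effective_batch = micro_batch * grad_accum   # sequences per optimizer step
tokens_per_step = effective_batch * seq_len
tokens_seen = tokens_per_step * steps        # upper bound under the assumptions above
perplexity = math.exp(final_loss)            # valid only if the loss is in nats
gpu_hours = budget / price_per_hour

print(f"effective batch: {effective_batch} sequences")
print(f"tokens per step: {tokens_per_step:,}")
print(f"tokens seen by step 34k: {tokens_seen:,}")
print(f"perplexity at loss 4.59: {perplexity:.1f}")
print(f"GPU hours for ~$10: {gpu_hours:.1f}")
```

Under these assumptions the run has seen roughly 836M tokens, slightly less than one pass over the ~950M-token subset, and the budget corresponds to about 56 GPU hours.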

Usage

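import torch
from nord_core import NordConfig, NordModel
from transformers import AutoTokenizer

# Use the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the checkpoint and rebuild the model from its saved config
ckpt = torch.load("nord_latest.pt", map_location=device)
cfg = NordConfig(**ckpt["config"])
model = NordModel(cfg).to(device)
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

# Nord reuses the Llama-3.2 tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")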

Or use the interactive chat:

python chat.py

Generation Examples (Step 34k, loss 4.59)

Prompt: "If you don't write properly I will delete your file"

Nord: "(I'm really careful) that a project is pretty good for the following reasons... The file will then turn this data into your code. It's not an algorithm to say that all the file is a project called 'read'. In order to get a lot of information on you are using this program, and we'll have no more knowledge about them!"

Prompt: "bro my laptop have problems and off wifi how on wifi"

Nord: "The project has been carried out by researchers at a site in 2013, including three main projects that work with wireless phones... using Wi-Fi sensors to monitor their devices such as wireless LANs... network that would have been created using more than 2% of 4 million people to use Bluetooth technology."

Spike Statistics

| Context | Sparsity | Interpretation |
| --- | --- | --- |
| Familiar topic | 99.8% | Confident: minimal neural activity |
| Training | 97% | Active learning: neurons spiking |
| Out-of-distribution | 77% | Uncertain: massive activation |

Sparsity functions as a built-in uncertainty detector, with no separate calibration needed.
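As a rough illustration of how such a figure can be computed, sparsity is the fraction of neuron-timestep entries that did not spike. The sketch below works on a nested list of binary spikes. The helper names and the 0.9 cut-off are hypothetical, and how Nord exposes its spike tensors is not covered here.

```python
def sparsity(spikes):
    """Fraction of silent entries in a nested list of binary spikes (timesteps x neurons)."""
    total = sum(len(row) for row in spikes)
    fired = sum(sum(row) for row in spikes)
    return 1.0 - fired / total


def looks_uncertain(spikes, cutoff=0.9):
    """Flag inputs whose sparsity falls below a hypothetical cut-off."""
    return sparsity(spikes) < cutoff


# Toy example: 4 timesteps x 5 neurons with 3 spikes in total
spikes = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
print(f"sparsity: {sparsity(spikes):.0%}")  # 85%, since 17 of 20 entries are silent
print(looks_uncertain(spikes))              # True, because 0.85 is below the 0.9 cut-off
```

Any real cut-off would need to be chosen from measured sparsity on known in-distribution and out-of-distribution prompts.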

Limitations

  • Repetition remains an issue (mitigated with repetition penalty in decoding)
  • Not competitive with GPT-2 in raw quality
  • Scaling above 144M is untested
  • No formal benchmark evaluation yet
  • Hallucination present (generates plausible but fictional details)
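The repetition penalty mentioned in the first item is usually applied to the logits before sampling. The sketch below follows the common CTRL-style rule on a plain list of logits. How `chat.py` actually implements it is not documented here, so the function name and the penalty values are assumptions.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Return a copy of logits with already-generated tokens made less likely.

    CTRL-style rule: positive logits are divided by the penalty and
    non-positive logits are multiplied by it. penalty=1.2 is illustrative.
    """
    out = list(logits)                # leave the caller's list untouched
    for tok in set(generated_ids):    # penalize each seen token once
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out


logits = [2.0, -1.0, 0.5, 3.0]
print(apply_repetition_penalty(logits, generated_ids=[0, 1, 0], penalty=2.0))
# [1.0, -2.0, 0.5, 3.0]
```

Tokens 0 and 1 have already been generated, so their logits drop, while tokens 2 and 3 are left unchanged.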

Comparison with Other SNN Language Models

| Model | Params | From Scratch? | Architecture |
| --- | --- | --- | --- |
| Nord | 144M | ✅ | Fully original |
| SpikeGPT | 216M | ✅ | Modified RWKV |
| SpikeLLM | 7-70B | ❌ | Converted LLaMA |
| SpikeBERT | ~110M | ❌ | Distilled from BERT |
| BrainTransformers | 3B | ❌ | Converted Qwen2 |

Citation

@misc{nord2025,
  title={Nord: A Spiking Neural Network Language Model Trained from Scratch},
  author={zerdovzad},
  year={2025},
  url={https://github.com/gtausa197-svg/-Project-Nord-Spiking-Neural-Network-Language-Model}
}

About

Built by an 18-year-old electronics student from Ukraine, studying in Norway. No PhD, no team, no funding.
