threshold-4to2-compressor

4:2 compressor for high-speed multiplier trees. Reduces 4 input bits plus carry-in to 2 output bits plus carry-out while preserving arithmetic value.

Circuit

   x      y      z      w      cin
   │      │      │      │       │
   └──┬───┴──┬───┴──┬───┘       │
      │      │      │           │
      ▼      │      │           │
   ┌─────┐   │      │           │
   │XOR  │   │      │           │
   │(x,y)│   │      │           │
   └──┬──┘   │      │           │
      │      │      │           │
      ▼      ▼      │           │
   ┌─────────────┐  │           │
   │  XOR(xy,z)  │  │           │
   └──────┬──────┘  │           │
          │         │           │
          ▼         ▼           │
       ┌──────────────┐         │
       │  XOR(xyz,w)  │         │
       └──────┬───────┘         │
              │                 │
              ▼                 ▼
           ┌─────────────────────┐
           │    XOR(xyzw, cin)   │───► Sum
           └─────────────────────┘

   cout = MAJ(x,y,z)     (independent of w, cin)
   carry = MAJ(XOR(x,y,z), w, cin)

Function

compress_4to2(x, y, z, w, cin) -> (sum, carry, cout)

Invariant: x + y + z + w + cin = sum + 2*carry + 2*cout
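The invariant can be checked exhaustively, since there are only 32 input combinations. The sketch below uses plain Boolean operators rather than the threshold network; `maj` and `compress_4to2_ref` are illustrative names and are not taken from model.py.

```python
from itertools import product

def maj(a, b, c):
    """Majority of three bits."""
    return (a & b) | (b & c) | (a & c)

def compress_4to2_ref(x, y, z, w, cin):
    """Reference 4:2 compressor in Boolean logic (hypothetical helper)."""
    sum1 = x ^ y ^ z            # stage 1 sum
    cout = maj(x, y, z)         # stage 1 carry, exported sideways
    s = sum1 ^ w ^ cin          # stage 2 sum
    carry = maj(sum1, w, cin)   # stage 2 carry
    return s, carry, cout

checked = 0
for x, y, z, w, cin in product((0, 1), repeat=5):
    s, carry, cout = compress_4to2_ref(x, y, z, w, cin)
    assert x + y + z + w + cin == s + 2 * carry + 2 * cout
    checked += 1

print(f"invariant holds for {checked} input combinations")
```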

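Truth Table (partial: 6 of the 32 input combinations)

| x | y | z | w | cin | sum | carry | cout | verify |
|---|---|---|---|-----|-----|-------|------|--------|
| 0 | 0 | 0 | 0 | 0   | 0   | 0     | 0    | 0=0    |
| 0 | 0 | 0 | 0 | 1   | 1   | 0     | 0    | 1=1    |
| 1 | 1 | 0 | 0 | 0   | 0   | 0     | 1    | 2=2    |
| 1 | 1 | 1 | 0 | 0   | 1   | 0     | 1    | 3=3    |
| 1 | 1 | 1 | 1 | 0   | 0   | 1     | 1    | 4=4    |
| 1 | 1 | 1 | 1 | 1   | 1   | 1     | 1    | 5=5    |

The verify column compares the input bit count with sum + 2*carry + 2*cout.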

Input sum range: 0 to 5

Output encoding: sum + 2*carry + 2*cout (range 0 to 5)

Mechanism

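The 4:2 compressor is built from two cascaded 3:2 compressors (full adders). Only the first stage's sum feeds the second stage; the first stage's carry leaves the cell as cout instead of being consumed internally: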

Stage 1: Compress (x, y, z)

  • sum1 = x XOR y XOR z
  • cout = MAJ(x, y, z) ← This goes to next column

Stage 2: Compress (sum1, w, cin)

  • sum = sum1 XOR w XOR cin
  • carry = MAJ(sum1, w, cin) ← This goes to next column
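The invariant follows from applying the full-adder identity a + b + c = (a XOR b XOR c) + 2*MAJ(a, b, c) once per stage. A short derivation using the stage names above:

```latex
\begin{aligned}
x + y + z &= \mathrm{sum1} + 2\,\mathrm{cout} && \text{(Stage 1)}\\
\mathrm{sum1} + w + \mathrm{cin} &= \mathrm{sum} + 2\,\mathrm{carry} && \text{(Stage 2)}\\
x + y + z + w + \mathrm{cin}
  &= (\mathrm{sum1} + w + \mathrm{cin}) + 2\,\mathrm{cout}
   = \mathrm{sum} + 2\,\mathrm{carry} + 2\,\mathrm{cout}
\end{aligned}
```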

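Key insight: cout depends only on x, y, and z, so it is ready after a single layer and can propagate horizontally to the neighbouring column's cin while sum and carry are still being computed. Because cout never depends on cin, the horizontal chain does not ripple.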
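To make the horizontal propagation concrete, here is a minimal sketch that places one compressor per bit column and wires each column's cout into the next column's cin, reducing four operands to two rows. The helper names (`compress_4to2_ref`, `reduce_four_rows`) and the choice to fold the final cout into the sum row are assumptions for illustration, not part of model.py.

```python
from itertools import product

def maj(a, b, c):
    return (a & b) | (b & c) | (a & c)

def compress_4to2_ref(x, y, z, w, cin):
    """Boolean reference: returns (sum, carry, cout)."""
    sum1 = x ^ y ^ z
    return sum1 ^ w ^ cin, maj(sum1, w, cin), maj(x, y, z)

def reduce_four_rows(a, b, c, d, width):
    """Compress four `width`-bit operands into two rows with the same total."""
    row_sum, row_carry, cin = 0, 0, 0
    for i in range(width):
        bits = [(v >> i) & 1 for v in (a, b, c, d)]
        s, carry, cout = compress_4to2_ref(*bits, cin)
        row_sum |= s << i
        row_carry |= carry << (i + 1)   # carry has weight 2: next column
        cin = cout                      # cout feeds the neighbour's cin
    row_sum |= cin << width             # last cout lands in the free top bit
    return row_sum, row_carry

# cout is the same for cin = 0 and cin = 1, so the chain cannot ripple.
for x, y, z, w in product((0, 1), repeat=4):
    assert compress_4to2_ref(x, y, z, w, 0)[2] == compress_4to2_ref(x, y, z, w, 1)[2]

a, b, c, d = 13, 7, 9, 15
row_sum, row_carry = reduce_four_rows(a, b, c, d, width=4)
assert row_sum + row_carry == a + b + c + d
```

A single conventional adder can then add the two remaining rows.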

Architecture

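| Component      | Function   | Neurons | Layers |
|----------------|------------|---------|--------|
| XOR(x,y)       | First pair | 3       | 2      |
| XOR(xy,z)      | Add third  | 3       | 2      |
| MAJ(x,y,z)     | cout       | 1       | 1      |
| XOR(xyz,w)     | Add fourth | 3       | 2      |
| XOR(xyzw,cin)  | sum        | 3       | 2      |
| MAJ(xyz,w,cin) | carry      | 1       | 1      |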

Total: 14 neurons
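The neuron counts in the table are consistent with a common threshold-logic construction: XOR as AND(OR, NAND) (three neurons, two layers) and MAJ as a single neuron. The sketch below wires the six components that way. The specific weights and biases are assumptions chosen for illustration and are not read from model.safetensors.

```python
from itertools import product

def neuron(weights, bias, inputs):
    """Threshold neuron: outputs 1 when the weighted sum plus bias is >= 0."""
    return int(sum(w * v for w, v in zip(weights, inputs)) + bias >= 0)

def xor_gate(a, b):
    """XOR from 3 neurons in 2 layers: AND(OR(a, b), NAND(a, b))."""
    h_or = neuron((1, 1), -1, (a, b))          # layer 1: a + b >= 1
    h_nand = neuron((-1, -1), 1, (a, b))       # layer 1: a + b <= 1
    return neuron((1, 1), -2, (h_or, h_nand))  # layer 2: both hidden units fire

def maj_gate(a, b, c):
    """Majority from 1 neuron: fires when at least two inputs are 1."""
    return neuron((1, 1, 1), -2, (a, b, c))

def compress_4to2_threshold(x, y, z, w, cin):
    xy = xor_gate(x, y)            # XOR(x,y)
    xyz = xor_gate(xy, z)          # XOR(xy,z)
    cout = maj_gate(x, y, z)       # MAJ(x,y,z)
    xyzw = xor_gate(xyz, w)        # XOR(xyz,w)
    s = xor_gate(xyzw, cin)        # XOR(xyzw,cin)
    carry = maj_gate(xyz, w, cin)  # MAJ(xyz,w,cin)
    return s, carry, cout

# 4 XOR gates * 3 neurons + 2 MAJ gates * 1 neuron = 14 neurons.
for bits in product((0, 1), repeat=5):
    s, carry, cout = compress_4to2_threshold(*bits)
    assert sum(bits) == s + 2 * carry + 2 * cout
```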

Parameters

| Property   | Value                 |
|------------|-----------------------|
| Inputs     | 5 (x, y, z, w, cin)   |
| Outputs    | 3 (sum, carry, cout)  |
| Neurons    | 14                    |
| Layers     | 8                     |
| Parameters | 44                    |
| Magnitude  | 46                    |
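The neuron and parameter totals can be reproduced with a short tally, assuming each XOR uses three two-input neurons, each MAJ uses one three-input neuron, and every neuron carries one bias. That layout is an assumption that happens to match the stated totals; the actual tensor layout in model.safetensors is not confirmed here.

```python
# Assumed per-gate cost: weights plus one bias per neuron.
XOR_NEURONS = 3                # two hidden neurons + one output neuron
XOR_PARAMS = 3 * (2 + 1)       # each neuron: 2 weights + 1 bias
MAJ_NEURONS = 1
MAJ_PARAMS = 3 + 1             # 3 weights + 1 bias

N_XOR, N_MAJ = 4, 2            # gate counts from the Architecture section

neurons = N_XOR * XOR_NEURONS + N_MAJ * MAJ_NEURONS
params = N_XOR * XOR_PARAMS + N_MAJ * MAJ_PARAMS
print(neurons, params)  # 14 44
```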

Delay Analysis

  • Critical path for sum: 4 XOR stages = 8 layers
  • Critical path for carry: 2 XOR stages + 1 MAJ = 5 layers (carry taps XOR(x,y,z), not the full XOR chain)
  • Critical path for cout: 1 MAJ = 1 layer (very fast!)

The early cout enables fast horizontal carry propagation in multiplier arrays.
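The layer counts can be derived mechanically from the gate wiring. The sketch below encodes the circuit as a small dependency table (XOR = 2 layers, MAJ = 1 layer, as in the Architecture section; carry = MAJ(XOR(x,y,z), w, cin), as in the Circuit section) and computes the longest path to each output. The `GATES` table and `depth` helper are illustrative names.

```python
# Layers contributed by each gate type.
LAYERS = {"XOR": 2, "MAJ": 1}

# Signal name -> (gate type, input signals).
GATES = {
    "xy":    ("XOR", ("x", "y")),
    "xyz":   ("XOR", ("xy", "z")),
    "xyzw":  ("XOR", ("xyz", "w")),
    "sum":   ("XOR", ("xyzw", "cin")),
    "cout":  ("MAJ", ("x", "y", "z")),
    "carry": ("MAJ", ("xyz", "w", "cin")),
}

def depth(signal):
    """Layers on the longest path from any primary input to `signal`."""
    if signal not in GATES:          # primary input: x, y, z, w, cin
        return 0
    kind, inputs = GATES[signal]
    return LAYERS[kind] + max(depth(i) for i in inputs)

for out in ("sum", "carry", "cout"):
    print(out, depth(out))
# sum 8
# carry 5
# cout 1
```

The longest path (sum, 8 layers) matches the layer count listed under Parameters.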

Usage

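from safetensors.torch import load_file
import torch

# Threshold-network weights; the network forward pass lives in model.py.
weights = load_file('model.safetensors')

def compress_4to2(x, y, z, w_in, cin):
    # Boolean reference for the expected outputs. The threshold-network
    # implementation that uses `weights` is in model.py.
    sum1 = x ^ y ^ z
    cout = (x & y) | (y & z) | (x & z)
    s = sum1 ^ w_in ^ cin
    carry = (sum1 & w_in) | (w_in & cin) | (sum1 & cin)
    return s, carry, cout

# Example: sum of 5 bits
s, carry, cout = compress_4to2(1, 1, 1, 1, 1)
print(f"1+1+1+1+1 = {s} + 2*{carry} + 2*{cout} = {s + 2*carry + 2*cout}")
# Output: 1+1+1+1+1 = 1 + 2*1 + 2*1 = 5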

Applications

  • Booth multipliers (radix-4)
  • Wallace/Dadda tree reduction
  • FMA (fused multiply-add) units
  • High-performance DSP

Comparison with 3:2 Compressor

| Property              | 3:2          | 4:2            |
|-----------------------|--------------|----------------|
| Inputs                | 3            | 5 (4 + cin)    |
| Outputs               | 2            | 3 (2 + cout)   |
| Reduction ratio       | 3→2          | 4→2 per column |
| Neurons               | 7            | 14             |
| Tree depth for n bits | O(log₁.₅ n)  | O(log₂ n)      |

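4:2 compressors halve the number of partial-product rows at each tree level, whereas 3:2 compressors reduce it to two-thirds, so a 4:2 tree needs fewer levels in a multiplier.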
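The difference in tree depth can be illustrated by counting levels under a simple greedy grouping. This is a sketch, not a Wallace or Dadda scheduler: it assumes every full group of 3 (or 4) rows becomes 2 rows per level, leftover rows pass through, and a leftover group of three rows in the 4:2 case is handled by tying one compressor input to 0. The function name is hypothetical.

```python
def levels_to_two_rows(rows, group):
    """Tree levels needed to reduce `rows` to 2 when each full group of
    `group` rows (3 or 4) is compressed to 2 rows per level."""
    levels = 0
    while rows > 2:
        full, rest = divmod(rows, group)
        if rest > 2:      # only for group=4: pad 3 leftover rows with a zero row
            rest = 2
        rows = 2 * full + rest
        levels += 1
    return levels

for n in (8, 16, 32, 64):
    print(n, levels_to_two_rows(n, 3), levels_to_two_rows(n, 4))
# 8 4 2
# 16 6 3
# 32 8 4
# 64 10 5
```

These counts are tree levels, not layers; each 4:2 level is deeper than a 3:2 level, as the Delay Analysis section shows.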

Files

threshold-4to2-compressor/
├── model.safetensors
├── model.py
├── create_safetensors.py
├── config.json
└── README.md

License

MIT
