- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25
Collections
Discover the best community collections!
Collections including paper arxiv:2402.17764
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 627
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 301
- Group Sequence Policy Optimization
  Paper • 2507.18071 • Published • 315
- Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
  Paper • 2509.03867 • Published • 210
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 627
- nanonets/Nanonets-OCR-s
  Image-Text-to-Text • 4B • Updated • 88.9k • 1.56k
- black-forest-labs/FLUX.1-Kontext-dev
  Image-to-Image • Updated • 317k • 2.46k
- DeepSite v3
  16k • Generate any application by Vibe Coding
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 627
- Hierarchical Reasoning Model
  Paper • 2506.21734 • Published • 46
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 497
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 24
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  Paper • 2208.07339 • Published • 5
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
  Paper • 2210.17323 • Published • 8
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
  Paper • 2211.10438 • Published • 6
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper • 2305.14314 • Published • 57
- crystalai/thoth-guardian-ai-auto-train-cybersecurity-shield
  Updated • 1
- crystalai/thoth-guardian-cybersecurity-shield
  Updated • 1
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 627
- Zenml Server
  8 • Create reproducible ML pipelines with ZenML