BlackList – Prompt Enhancer AI
BlackList is a lightweight decoder-only Transformer designed to transform simple visual keywords into masterpiece-level generative prompts for text-to-image engines such as Stable Diffusion, Flux, and Midjourney.
Model Details
Model Description
BlackList is a task-specific text-to-text generative AI model trained to enhance short visual keywords into rich, structured, high-aesthetic prompts optimized for modern text-to-image systems.
Instead of generating long-form text, BlackList specializes in prompt enhancement, automatically enriching minimal input with:
- Lighting descriptions
- Rendering quality (8K, ultra-detailed, cinematic, etc.)
- Artistic medium (digital painting, concept art, illustration)
- Stylistic elements
- Composition cues
The model is optimized for speed, low latency, and efficient deployment.
- Developed by: Bl4ckSpaces
- Model type: Decoder-only Transformer (Mini GPT-2 Base inspired)
- Language(s): English (optimized for visual art terminology)
- License: Apache 2.0
- Finetuned from model: None (trained from scratch)
Intended Use
Direct Use
BlackList is designed to:
- Convert short keyword input into enhanced prompts
- Improve text-to-image prompt quality
- Serve as a backend API model for image generator applications
- Be embedded into lightweight or mobile-first AI systems
Example:
Input: [SIMPLE] cyberpunk girl
Output: [ENHANCED] cyberpunk girl, neon lighting, ultra detailed, cinematic composition, 8k resolution, digital painting, concept art, highly detailed background
Downstream Use
- Integration into Stable Diffusion pipelines
- Prompt preprocessing layer before large T2I engines
- Web-based AI image generation platforms
- Mobile AI creative tools
Out-of-Scope Use
- Long-form content generation
- Conversational AI
- Factual QA systems
- Sensitive or high-risk decision making
This model is not designed for general language understanding tasks.
Architecture & Technical Specifications
Model Architecture
- Architecture: Custom Decoder-only Transformer
- Parameters: 23,273,472 (~23.27M)
- Hidden Size: 512
- Layers: 6
- Attention Heads: 8
- Max Context Length: 512 tokens
This architecture provides a balance between expressive capacity and low-latency inference.
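The parameter count above can be reproduced from these hyperparameters, assuming a GPT-2-style layout (learned positional embeddings, a 4× MLP expansion, and a weight-tied output head). The following is a sketch of that arithmetic, not the model's actual code:

```python
vocab, d_model, n_layers, max_ctx = 8000, 512, 6, 512
d_ff = 4 * d_model                         # GPT-2-style 4x MLP expansion

token_emb = vocab * d_model                # token embedding (tied with the LM head)
pos_emb = max_ctx * d_model                # learned positional embedding

per_layer = (
    2 * d_model                            # pre-attention LayerNorm (gain + bias)
    + d_model * 3 * d_model + 3 * d_model  # fused QKV projection + bias
    + d_model * d_model + d_model          # attention output projection + bias
    + 2 * d_model                          # pre-MLP LayerNorm
    + d_model * d_ff + d_ff                # MLP up-projection + bias
    + d_ff * d_model + d_model             # MLP down-projection + bias
)

final_ln = 2 * d_model                     # final LayerNorm
total = token_emb + pos_emb + n_layers * per_layer + final_ln
print(total)  # 23273472, matching the reported count
```

That the arithmetic lands exactly on 23,273,472 suggests the model follows this standard GPT-2 layout.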
Tokenizer & Language Design
- Tokenizer: Custom Byte-Pair Encoding (BPE)
- Vocab Size: 8,000 tokens
- Special Tokens:
  - `[SIMPLE]` → input trigger
  - `[ENHANCED]` → output activation
The tokenizer was trained from scratch with a strong focus on visual art terminology, including rendering engines, lighting styles, artistic mediums, and aesthetic descriptors.
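For readers unfamiliar with how BPE builds its vocabulary, the core merge loop can be illustrated in plain Python. This is a toy sketch on a hypothetical three-word keyword corpus, not the actual tokenizer-training code:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the (word -> frequency) corpus."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus of visual-art keywords: symbol tuple -> frequency.
corpus = {
    tuple("lighting"): 5,
    tuple("light"): 3,
    tuple("painting"): 4,
}
for _ in range(3):  # three merge steps
    corpus = merge_pair(corpus, most_frequent_pair(corpus))
```

After three merges the corpus already shares subword units such as `in` and `ting` across "lighting" and "painting", which is why a domain-trained BPE vocabulary compresses art terminology efficiently.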
Training Details
Training Data
- ~30,000 curated and filtered high-quality "masterpiece" prompt pairs
- Custom extracted dataset optimized for visual aesthetics
Training Procedure
- Hardware: NVIDIA T4 Tensor Core GPU
- Training Duration: ~7 minutes
- Optimization Steps: 3,748
- Epochs: 4
- Final Loss: ~3.37
Training was stopped once the loss plateaued around 3.37, preserving keyword fluency while limiting grammatical drift and overfitting.
Training Regime
- Mixed precision training (FP16)
Capabilities
Zero-Shot Aesthetic Enrichment
The model can interpret 1–3 base keywords and automatically inject:
- Lighting (cinematic lighting, soft shadows, neon glow)
- Render quality (8k, ultra-detailed)
- Artistic medium (digital painting, illustration, concept art)
- Composition and detail enhancements
Embedded Artist Knowledge
The model has internalized stylistic patterns inspired by legendary artists such as:
- Greg Rutkowski
- Artgerm
- Alphonse Mucha
This allows stylistic richness without requiring explicit artist invocation.
Lightweight & Fast
- Model Size: ~93MB
- Low-latency inference
- Suitable for API deployment
- Mobile-first friendly
Evaluation
Evaluation was performed qualitatively on:
- Aesthetic richness
- Prompt structure coherence
- Stability of enhancement
- Overfitting detection
The model demonstrates strong consistency in structured aesthetic enrichment while maintaining controlled expansion.
Bias, Risks & Limitations
- The model inherits stylistic bias from the curated dataset.
- It may overuse common high-quality descriptors (e.g., "8k", "ultra-detailed").
- It is not suitable for non-visual text tasks.
- It does not guarantee compatibility with every image model configuration.
Users should test and calibrate outputs depending on their target engine.
Environmental Impact
- Hardware: NVIDIA T4 GPU
- Training Time: ~7 minutes
- Compute Region: Unknown
- Carbon Emission: Minimal due to short training duration
How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Bl4ckSpaces/BlackList-Prompt-Enhancer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

input_text = "[SIMPLE] fantasy warrior"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
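Because the model continues from the input, the decoded text may still contain the original keywords and, depending on whether the markers are registered as special tokens, the `[ENHANCED]` tag itself. A small hypothetical post-processing helper (`extract_enhanced` is not part of the released model) can isolate the enhanced portion:

```python
def extract_enhanced(generated: str, marker: str = "[ENHANCED]") -> str:
    """Return the text after the [ENHANCED] marker, or the whole string
    if the marker was stripped during decoding."""
    _, sep, tail = generated.partition(marker)
    return tail.strip() if sep else generated.strip()

print(extract_enhanced("[SIMPLE] fantasy warrior [ENHANCED] fantasy warrior, cinematic lighting"))
```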
Citation
If you use BlackList in your project, please credit:
Bl4ckSpaces – BlackList Prompt Enhancer AI
Model Card Contact
Developer: Bl4ckSpaces
Hugging Face: https://huggingface.co/Bl4ckSpaces