BlackList – Prompt Enhancer AI

BlackList is a lightweight decoder-only Transformer designed to transform simple visual keywords into masterpiece-level generative prompts for text-to-image engines such as Stable Diffusion, Flux, and Midjourney.


Model Details

Model Description

BlackList is a task-specific text-to-text generative AI model trained to enhance short visual keywords into rich, structured, high-aesthetic prompts optimized for modern text-to-image systems.

Instead of generating long-form text, BlackList specializes in prompt enhancement, automatically enriching minimal input with:

  • Lighting descriptions
  • Rendering quality (8K, ultra-detailed, cinematic, etc.)
  • Artistic medium (digital painting, concept art, illustration)
  • Stylistic elements
  • Composition cues

The model is optimized for speed, low latency, and efficient deployment.

  • Developed by: Bl4ckSpaces
  • Model type: Decoder-only Transformer (inspired by a scaled-down GPT-2 Base)
  • Language(s): English (optimized for visual art terminology)
  • License: Apache 2.0
  • Finetuned from model: None (trained from scratch)

Intended Use

Direct Use

BlackList is designed to:

  • Convert short keyword input into enhanced prompts
  • Improve text-to-image prompt quality
  • Serve as a backend API model for image generator applications
  • Be embedded into lightweight or mobile-first AI systems

Example:

Input: [SIMPLE] cyberpunk girl

Output: [ENHANCED] cyberpunk girl, neon lighting, ultra detailed, cinematic composition, 8k resolution, digital painting, concept art, highly detailed background
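The `[SIMPLE]`/`[ENHANCED]` markers define a fixed prompt template around the model. A minimal sketch of wrapping input and extracting output (the helper names `build_input` and `extract_enhanced` are illustrative, not part of the model's API):

```python
def build_input(keywords: str) -> str:
    """Wrap raw keywords in the [SIMPLE] trigger the model expects."""
    return f"[SIMPLE] {keywords.strip()}"

def extract_enhanced(generated: str) -> str:
    """Pull the enhanced prompt out of the model's [ENHANCED] output."""
    marker = "[ENHANCED]"
    idx = generated.find(marker)
    return generated[idx + len(marker):].strip() if idx != -1 else generated.strip()

text = "[SIMPLE] cyberpunk girl [ENHANCED] cyberpunk girl, neon lighting, ultra detailed"
print(extract_enhanced(text))  # cyberpunk girl, neon lighting, ultra detailed
```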

Downstream Use

  • Integration into Stable Diffusion pipelines
  • Prompt preprocessing layer before large T2I engines
  • Web-based AI image generation platforms
  • Mobile AI creative tools

Out-of-Scope Use

  • Long-form content generation
  • Conversational AI
  • Factual QA systems
  • Sensitive or high-risk decision making

This model is not designed for general language understanding tasks.


Architecture & Technical Specifications

Model Architecture

  • Architecture: Custom Decoder-only Transformer
  • Parameters: 23,273,472 (~23.27M)
  • Hidden Size: 512
  • Layers: 6
  • Attention Heads: 8
  • Max Context Length: 512 tokens
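The stated parameter count is consistent with a GPT-2-style block layout with tied input/output embeddings. A back-of-the-envelope check, assuming learned positional embeddings and a 4x MLP expansion (both assumptions, not confirmed by the card):

```python
d, layers, vocab, ctx = 512, 6, 8000, 512

tok_emb = vocab * d   # token embeddings (tied with the output head)
pos_emb = ctx * d     # learned positional embeddings
ln = 2 * d            # one LayerNorm: scale + bias
# per GPT-2-style layer: 2 LayerNorms, fused QKV + output projection, 4x MLP
attn = (d * 3 * d + 3 * d) + (d * d + d)
mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)
per_layer = 2 * ln + attn + mlp
total = tok_emb + pos_emb + layers * per_layer + ln  # + final LayerNorm
print(total)  # 23273472, matching the reported ~23.27M
```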

This architecture provides a balance between expressive capacity and low-latency inference.


Tokenizer & Language Design

  • Tokenizer: Custom Byte-Pair Encoding (BPE)
  • Vocab Size: 8,000 tokens
  • Special Tokens:
    • [SIMPLE] → Input trigger
    • [ENHANCED] → Output activation

The tokenizer was trained from scratch with a strong focus on visual art terminology, including rendering engines, lighting styles, artistic mediums, and aesthetic descriptors.
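To illustrate the BPE mechanism behind the custom tokenizer, here is one toy merge iteration in pure Python (the two-word corpus and the merge logic are purely illustrative; the actual 8,000-token vocabulary was trained on art-domain text):

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across a frequency-weighted corpus."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with its merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: word (split into characters) -> frequency
corpus = {tuple("neon"): 5, tuple("noir"): 3}
pair = most_frequent_pair(corpus)
corpus = merge_pair(corpus, pair)  # one merge; BPE repeats this until the vocab is full
print(pair, corpus)
```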


Training Details

Training Data

  • ~30,000 curated and filtered high-quality "masterpiece" prompt pairs
  • Custom extracted dataset optimized for visual aesthetics

Training Procedure

  • Hardware: NVIDIA T4 Tensor Core GPU
  • Training Duration: ~7 minutes
  • Optimization Steps: 3,748
  • Epochs: 4
  • Final Loss: ~3.37

Training was stopped at this loss level to preserve keyword fluency while limiting grammatical drift and overfitting to the relatively small dataset.

Training Regime

  • Mixed precision training (FP16)

Capabilities

Zero-Shot Aesthetic Enrichment

The model can interpret 1–3 base keywords and automatically inject:

  • Lighting (cinematic lighting, soft shadows, neon glow)
  • Render quality (8k, ultra-detailed)
  • Artistic medium (digital painting, illustration, concept art)
  • Composition and detail enhancements

Embedded Artist Knowledge

The model has internalized stylistic patterns inspired by legendary artists such as:

  • Greg Rutkowski
  • Artgerm
  • Alphonse Mucha

This allows stylistic richness without requiring explicit artist invocation.

Lightweight & Fast

  • Model Size: ~93MB
  • Low-latency inference
  • Suitable for API deployment
  • Mobile-first friendly

Evaluation

Evaluation was performed qualitatively on:

  • Aesthetic richness
  • Prompt structure coherence
  • Stability of enhancement
  • Overfitting detection

The model demonstrates strong consistency in structured aesthetic enrichment while maintaining controlled expansion.


Bias, Risks & Limitations

  • The model inherits stylistic bias from the curated dataset.
  • It may overuse common high-quality descriptors (e.g., "8k", "ultra-detailed").
  • It is not suitable for non-visual text tasks.
  • It does not guarantee compatibility with every image model configuration.

Users should test and calibrate outputs depending on their target engine.
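One simple calibration step, suggested by the descriptor-overuse limitation above, is deduplicating repeated quality tags before handing the prompt to the target engine. A hedged sketch (the tag handling is illustrative, not a shipped utility):

```python
def dedupe_descriptors(prompt: str) -> str:
    """Collapse repeated comma-separated tags, keeping first occurrences in order."""
    seen, kept = set(), []
    for tag in (t.strip() for t in prompt.split(",")):
        key = tag.lower()
        if tag and key not in seen:
            seen.add(key)
            kept.append(tag)
    return ", ".join(kept)

raw = "cyberpunk girl, 8k, ultra detailed, 8K, ultra detailed, neon lighting"
print(dedupe_descriptors(raw))  # cyberpunk girl, 8k, ultra detailed, neon lighting
```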


Environmental Impact

  • Hardware: NVIDIA T4 GPU
  • Training Time: ~7 minutes
  • Compute Region: Unknown
  • Carbon Emission: Minimal due to short training duration

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Bl4ckSpaces/BlackList-Prompt-Enhancer"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Wrap the keywords in the [SIMPLE] trigger the model was trained on
input_text = "[SIMPLE] fantasy warrior"

inputs = tokenizer(input_text, return_tensors="pt")
# Cap newly generated tokens rather than total length so longer inputs are not cut short
outputs = model.generate(**inputs, max_new_tokens=100)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Citation

If you use BlackList in your project, please credit:

Bl4ckSpaces – BlackList Prompt Enhancer AI

Model Card Contact

  • Developer: Bl4ckSpaces
  • Hugging Face: https://huggingface.co/Bl4ckSpaces