diff --git "a/deployment/backend/README_Version_Three.md" "b/deployment/backend/README_Version_Three.md"
deleted file mode 100644--- "a/deployment/backend/README_Version_Three.md"
+++ /dev/null
@@ -1,6861 +0,0 @@
-# RoDLA Document Layout Analysis API
-
-
-
-
-
-
-
-
-
-**A Production-Ready API for Robust Document Layout Analysis with Perturbation Testing**
-
-[Features](#-features) • [Installation](#-installation) • [Quick Start](#-quick-start) • [API Reference](#-api-reference) • [Perturbations](#-perturbation-system) • [Architecture](#-architecture)
-
-
-
----
-
-## 📋 Table of Contents
-
-1. [Overview](#-overview)
-2. [Features](#-features)
-3. [System Requirements](#-system-requirements)
-4. [Installation](#-installation)
-5. [Quick Start](#-quick-start)
-6. [Project Structure](#-project-structure)
-7. [Architecture Deep Dive](#-architecture-deep-dive)
-8. [Configuration](#-configuration)
-9. [API Reference](#-api-reference)
-10. [Perturbation System](#-perturbation-system)
-11. [Metrics System](#-metrics-system)
-12. [Visualization Engine](#-visualization-engine)
-13. [Services Layer](#-services-layer)
-14. [Utilities Reference](#-utilities-reference)
-15. [Error Handling](#-error-handling)
-16. [Performance Optimization](#-performance-optimization)
-17. [Security Considerations](#-security-considerations)
-18. [Testing](#-testing)
-19. [Deployment](#-deployment)
-20. [Troubleshooting](#-troubleshooting)
-21. [Contributing](#-contributing)
-22. [Citation](#-citation)
-23. [License](#-license)
-
----
-
-## 🎯 Overview
-
-### What is RoDLA?
-
-RoDLA (Robust Document Layout Analysis) is a state-of-the-art deep learning model for detecting and classifying layout elements in document images. Published at **CVPR 2024**, it focuses on robustness to various perturbations including noise, blur, and geometric distortions.
-
-The model achieves:
-- **70.0% mAP** on clean M6Doc dataset
-- **61.7% average mAP** on perturbed images
-- **147.6 mRD** (Mean Robustness Degradation) score
-- Detection of **74 document element classes**
-
-### What is this API?
-
-This repository provides a **production-ready FastAPI wrapper** around the RoDLA model, featuring:
-
-#### Core Detection Features
-- 🔍 RESTful API endpoints for document analysis
-- 📊 Comprehensive metrics calculation (20+ metrics)
-- 📈 Automated visualization generation (8 chart types)
-- 🛡️ Robustness assessment based on the RoDLA paper
-- 🧠 Human-readable interpretation of results
-- 📁 Flexible output formats (JSON, annotated images)
-
-#### NEW: Perturbation Testing Features
-- 🎨 **12 perturbation types** across 5 categories
-- 🔬 **Apply-only mode** - Test perturbations without detection
-- 🎯 **Combined mode** - Perturb then detect in one request
-- 📊 **Perturbation analytics** - Track success rates and effects
-- 💾 **Save perturbed images** - Keep transformed images for analysis
-- 🔄 **Sequential application** - Apply multiple perturbations in order
-
-### Key Statistics
-
-| Metric | Value |
-|--------|-------|
-| Clean mAP (M6Doc) | 70.0% |
-| Perturbed Average mAP | 61.7% |
-| mRD Score | 147.6 |
-| Max Detections/Image | 300 |
-| Supported Classes | 74 (M6Doc) |
-| **Perturbation Types** | **12** |
-| **Perturbation Categories** | **5** |
-| **Intensity Levels** | **3 (mild/moderate/severe)** |
-
----
-
-## ✨ Features
-
-### Core Capabilities
-
-| Feature | Description |
-|---------|-------------|
-| 🔍 **Multi-class Detection** | Detect 74+ document element types |
-| 📊 **Comprehensive Metrics** | 20+ analytical metrics per image |
-| 📈 **Auto Visualization** | 8 chart types generated automatically |
-| 🛡️ **Robustness Analysis** | mPE and mRD estimation |
-| 🧠 **Smart Interpretation** | Human-readable analysis summaries |
-| ⚡ **GPU Acceleration** | CUDA support for fast inference |
-| 📁 **Flexible Output** | JSON, annotated images, or both |
-
-### NEW: Perturbation Capabilities
-
-| Feature | Description |
-|---------|-------------|
-| 🎨 **12 Perturbation Types** | Blur, noise, content, inconsistency, spatial |
-| 🔬 **Independent Testing** | Apply perturbations without detection |
-| 🎯 **Integrated Pipeline** | Perturb + detect in single request |
-| 📊 **Success Tracking** | Monitor which perturbations succeed/fail |
-| 💾 **Save Outputs** | Persist perturbed images to disk |
-| 🔄 **Sequential Processing** | Apply multiple effects in order |
-| 📈 **Robustness Testing** | Evaluate model performance under stress |
-
-### Document Element Types
-
-The model can detect various document elements including:
-
-```
-Text Elements: Structural Elements: Visual Elements:
-├── Paragraph ├── Header ├── Figure
-├── Title ├── Footer ├── Table
-├── Caption ├── Page Number ├── Chart
-├── List ├── Section ├── Logo
-├── Footnote ├── Column ├── Stamp
-└── Abstract └── Margin └── Equation
-```
-
-### Perturbation Categories
-
-```
-Blur Effects: Noise Effects: Content Effects:
-├── Defocus ├── Speckle ├── Watermark
-└── Vibration └── Texture └── Background
-
-Inconsistency: Spatial Transform:
-├── Ink Holdout ├── Rotation
-├── Ink Bleeding ├── Keystoning
-└── Illumination └── Warping
-```
-
----
-
-## 💻 System Requirements
-
-### Hardware Requirements
-
-| Component | Minimum | Recommended |
-|-----------|---------|-------------|
-| CPU | 4 cores | 8+ cores |
-| RAM | 16 GB | 32 GB |
-| GPU | 8 GB VRAM | 16+ GB VRAM |
-| Storage | 10 GB | 20 GB |
-
-### Software Requirements
-
-| Software | Version |
-|----------|---------|
-| Python | 3.8 - 3.10 |
-| CUDA | 11.7+ |
-| cuDNN | 8.5+ |
-| OS | Linux (Ubuntu 20.04+) / WSL2 |
-
-### Python Dependencies
-
-```python
-# Core Framework
-fastapi>=0.100.0
-uvicorn>=0.23.0
-python-multipart>=0.0.6
-
-# ML/Deep Learning
-torch>=2.0.0
-mmdet>=3.0.0
-mmcv>=2.0.0
-detectron2>=0.6
-
-# Data Processing
-numpy>=1.24.0
-pillow>=9.5.0
-opencv-python>=4.8.0
-
-# Visualization
-matplotlib>=3.7.0
-seaborn>=0.12.0
-
-# Perturbation Dependencies (NEW)
-ocrodeg>=0.3.0
-imgaug>=0.4.0
-pyiqa>=0.1.7
-
-# Utilities
-pydantic>=2.0.0
-```
-
----
-
-## 🚀 Installation
-
-### Step 1: Clone the Repository
-
-```bash
-git clone https://github.com/yourusername/rodla-api.git
-cd rodla-api/deployment
-```
-
-### Step 2: Create Virtual Environment
-
-```bash
-# Using conda (recommended)
-conda create -n rodla python=3.9
-conda activate rodla
-
-# Or using venv
-python -m venv venv
-source venv/bin/activate # Linux/Mac
-.\venv\Scripts\activate # Windows
-```
-
-### Step 3: Install PyTorch with CUDA
-
-```bash
-# For CUDA 11.8
-pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
-
-# For CUDA 12.1
-pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
-
-# Verify installation
-python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
-```
-
-### Step 4: Install MMDetection
-
-```bash
-pip install -U openmim
-mim install mmengine
-mim install mmcv>=2.0.0
-mim install mmdet>=3.0.0
-```
-
-### Step 5: Install Detectron2
-
-```bash
-# Install pre-built version
-pip install detectron2 -f \
- https://dl.fbaipublicfiles.com/detectron2/wheels/cu118/torch2.0/index.html
-
-# Or build from source
-git clone https://github.com/facebookresearch/detectron2.git
-python -m pip install -e detectron2
-```
-
-### Step 6: Install Perturbation Dependencies
-
-```bash
-# Install perturbation-specific libraries
-pip install ocrodeg>=0.3.0
-pip install imgaug>=0.4.0
-pip install pyiqa>=0.1.7
-
-# Verify installation
-python -c "import ocrodeg, imgaug; print('Perturbation dependencies OK')"
-```
-
-### Step 7: Install Project Dependencies
-
-```bash
-pip install -r requirements.txt
-```
-
-### Step 8: Download Model Weights
-
-```bash
-# Create weights directory
-mkdir -p weights
-
-# Download from official source
-wget https://path-to-weights/rodla_internimage_xl_m6doc.pth \
- -O weights/rodla_internimage_xl_m6doc.pth
-
-# Verify file integrity
-ls -lh weights/rodla_internimage_xl_m6doc.pth
-```
-
-### Step 9: Copy Perturbation Modules
-
-```bash
-# Copy perturbation files from RoDLA root to deployment
-cd /path/to/RoDLA
-cp perturbation/blur.py deployment/perturbations/
-cp perturbation/content.py deployment/perturbations/
-cp perturbation/noise.py deployment/perturbations/
-cp perturbation/inconsistency.py deployment/perturbations/
-cp perturbation/spatial.py deployment/perturbations/
-
-# Verify files are copied
-ls deployment/perturbations/
-```
-
-### Step 10: Setup Background Images (Optional)
-
-```bash
-# For background perturbation, setup background images
-mkdir -p perturbation/background_image
-
-# Copy or download background textures
-cp /path/to/backgrounds/*.jpg perturbation/background_image/
-
-# Create background file index
-cd perturbation/background_image
-python -c "from content import create_background_file; create_background_file('.')"
-```
-
-### Step 11: Configure Paths
-
-Edit `config/settings.py`:
-
-```python
-from pathlib import Path
-
-# Update these paths to match your setup
-REPO_ROOT = Path("/path/to/your/RoDLA")
-MODEL_CONFIG = REPO_ROOT / "model/configs/m6doc/rodla_internimage_xl_m6doc.py"
-MODEL_WEIGHTS = REPO_ROOT / "rodla_internimage_xl_m6doc.pth"
-
-# Output directories (created automatically)
-OUTPUT_DIR = Path("outputs")
-PERTURBATION_OUTPUT_DIR = OUTPUT_DIR / "perturbations"
-
-# Background folder for perturbations
-DEFAULT_BACKGROUND_FOLDER = REPO_ROOT / "perturbation" / "background_image"
-```
-
-### Step 12: Verify Installation
-
-```bash
-# Test imports
-python -c "
-import torch
-import mmdet
-import detectron2
-import ocrodeg
-import imgaug
-from fastapi import FastAPI
-print('✅ All dependencies installed successfully!')
-print(f'PyTorch version: {torch.__version__}')
-print(f'CUDA available: {torch.cuda.is_available()}')
-"
-
-# Test API startup
-python backend.py &
-sleep 5
-curl http://localhost:8000/api/model-info
-```
-
----
-
-## ⚡ Quick Start
-
-### Starting the Server
-
-```bash
-# Development mode (with auto-reload)
-python backend.py
-
-# Production mode with uvicorn
-uvicorn backend:app --host 0.0.0.0 --port 8000 --workers 1
-
-# With specific log level
-uvicorn backend:app --host 0.0.0.0 --port 8000 --log-level info
-```
-
-**Expected Output:**
-```
-============================================================
-Starting RoDLA Document Layout Analysis API
-============================================================
-📁 Creating output directories...
- ✓ Main output: outputs
- ✓ Perturbations: outputs/perturbations
-
-🔧 Loading RoDLA model...
- Loading checkpoint...
- ✓ Model loaded successfully
-
-============================================================
-✅ API Ready!
-============================================================
-🌐 Main API: http://0.0.0.0:8000
-📚 Docs: http://0.0.0.0:8000/docs
-📖 ReDoc: http://0.0.0.0:8000/redoc
-
-🎯 Available Endpoints:
- • GET /api/model-info - Model information
- • POST /api/detect - Standard detection
- • GET /api/perturbations/info - Perturbation info (NEW)
- • POST /api/perturb - Apply perturbations (NEW)
- • POST /api/detect-with-perturbation - Detect with perturbations (NEW)
-============================================================
-```
-
-### Making Your First Request
-
-#### 1. Get Model Information
-
-```bash
-curl http://localhost:8000/api/model-info
-```
-
-#### 2. Basic Detection
-
-```bash
-curl -X POST "http://localhost:8000/api/detect" \
- -H "accept: application/json" \
- -F "file=@document.jpg" \
- -F "score_thr=0.3" \
- -F "generate_visualizations=true"
-```
-
-#### 3. Get Perturbation Information (NEW)
-
-```bash
-curl http://localhost:8000/api/perturbations/info
-```
-
-#### 4. Apply Perturbation (NEW)
-
-```bash
-curl -X POST "http://localhost:8000/api/perturb" \
- -F "file=@document.jpg" \
- -F 'perturbations=[{"type":"defocus","degree":2}]' \
- -F "save_image=true"
-```
-
-#### 5. Detect with Perturbation (NEW)
-
-```bash
-curl -X POST "http://localhost:8000/api/detect-with-perturbation" \
- -F "file=@document.jpg" \
- -F 'perturbations=[{"type":"rotation","degree":1}]' \
- -F "score_thr=0.3"
-```
-
-### Python Client Examples
-
-#### Basic Detection
-
-```python
-import requests
-
-# Upload and analyze document
-with open("document.pdf", "rb") as f:
- response = requests.post(
- "http://localhost:8000/api/detect",
- files={"file": f},
- data={
- "score_thr": "0.3",
- "return_image": "false",
- "generate_visualizations": "true"
- }
- )
-
-result = response.json()
-print(f"Detected {result['core_results']['summary']['total_detections']} elements")
-print(f"Average confidence: {result['core_results']['summary']['average_confidence']:.2%}")
-```
-
-#### Apply Perturbations Only (NEW)
-
-```python
-import requests
-import json
-
-# Apply multiple perturbations
-with open("document.jpg", "rb") as f:
- perturbations = [
- {"type": "defocus", "degree": 2},
- {"type": "speckle", "degree": 1},
- {"type": "rotation", "degree": 1}
- ]
-
- response = requests.post(
- "http://localhost:8000/api/perturb",
- files={"file": f},
- data={
- "perturbations": json.dumps(perturbations),
- "save_image": "true",
- "return_base64": "false"
- }
- )
-
- result = response.json()
- print(f"Success: {result['success']}")
- print(f"Applied: {len(result['perturbations_applied'])}")
- print(f"Failed: {len(result['perturbations_failed'])}")
- print(f"Success rate: {result['success_rate']:.1%}")
-
- if result['saved_path']:
- print(f"Saved to: {result['saved_path']}")
-```
-
-#### Detect with Perturbations (NEW)
-
-```python
-import requests
-import json
-
-# Perturb then detect
-with open("document.jpg", "rb") as f:
- perturbations = [
- {"type": "illumination", "degree": 2},
- {"type": "texture", "degree": 1}
- ]
-
- response = requests.post(
- "http://localhost:8000/api/detect-with-perturbation",
- files={"file": f},
- data={
- "perturbations": json.dumps(perturbations),
- "score_thr": "0.3",
- "save_perturbed_image": "true",
- "generate_visualizations": "true"
- }
- )
-
- result = response.json()
-
- # Detection results
- print(f"Detections: {result['core_results']['summary']['total_detections']}")
- print(f"Confidence: {result['core_results']['summary']['average_confidence']:.2%}")
-
- # Perturbation results
- if 'perturbation_info' in result:
- print(f"\nPerturbations applied: {len(result['perturbation_info']['applied'])}")
- print(f"Success rate: {result['perturbation_info']['success_rate']:.1%}")
-```
-
----
-
-## 📁 Project Structure
-
-```
-deployment/
-├── backend.py # 🚀 Main FastAPI application entry point
-├── requirements.txt # 📦 Python dependencies
-├── README.md # 📖 This comprehensive documentation
-├── test_perturbations.html # 🧪 Interactive web-based testing interface
-│
-├── config/ # ⚙️ Configuration Layer
-│ ├── __init__.py # Package initializer
-│ └── settings.py # All configuration constants (UPDATED)
-│
-├── core/ # 🧠 Core Application Layer
-│ ├── __init__.py # Package initializer
-│ ├── model_loader.py # Singleton model management
-│ └── dependencies.py # FastAPI dependency injection
-│
-├── api/ # 🌐 API Layer
-│ ├── __init__.py # Package initializer
-│ ├── routes.py # API endpoint definitions (UPDATED)
-│ └── schemas.py # Pydantic request/response models (UPDATED)
-│
-├── services/ # 🔧 Business Logic Layer
-│ ├── __init__.py # Package initializer
-│ ├── detection.py # Core detection logic
-│ ├── processing.py # Result aggregation
-│ ├── visualization.py # Chart generation (350+ lines)
-│ ├── interpretation.py # Human-readable insights
-│ └── perturbation.py # 🆕 Perturbation service (NEW)
-│
-├── perturbations/ # 🎨 Perturbation Module (NEW)
-│ ├── __init__.py # Exports all perturbation functions
-│ ├── apply.py # Main orchestration logic
-│ ├── blur.py # Defocus, vibration effects
-│ ├── content.py # Watermark, background additions
-│ ├── noise.py # Speckle, texture noise
-│ ├── inconsistency.py # Ink effects, illumination
-│ └── spatial.py # Rotation, keystoning, warping
-│
-├── utils/ # 🛠️ Utility Layer
-│ ├── __init__.py # Package initializer
-│ ├── helpers.py # General helper functions
-│ ├── serialization.py # JSON conversion utilities
-│ └── metrics/ # Metrics calculation modules
-│ ├── __init__.py # Metrics package initializer
-│ ├── core.py # Core detection metrics
-│ ├── rodla.py # RoDLA-specific metrics
-│ ├── spatial.py # Spatial distribution analysis
-│ └── quality.py # Quality & complexity metrics
-│
-└── outputs/ # 📤 Output Directory (auto-created)
- ├── *.json # Detection results
- ├── *.png # Visualization images
- └── perturbations/ # 🆕 Perturbed images (NEW)
- └── *.png # Saved perturbed images
-```
-
-### File Count Summary
-
-| Layer | Files | Purpose | Status |
-|-------|-------|---------|--------|
-| Config | 2 | Configuration management | Updated |
-| Core | 3 | Model and dependency management | Stable |
-| API | 3 | HTTP endpoints and schemas | Updated |
-| Services | 6 | Business logic implementation | +1 New |
-| **Perturbations** | **7** | **Perturbation operations** | **NEW** |
-| Utils | 7 | Helper functions and metrics | Stable |
-| **Total** | **28** | **Complete modular architecture** | **+7 files** |
-
----
-
-## 🏗️ Architecture Deep Dive
-
-### Layered Architecture with Perturbations
-
-```
-┌─────────────────────────────────────────────────────────────────┐
-│ CLIENT LAYER │
-│ (Web Browser / API Clients) │
-└─────────────────────────┬───────────────────────────────────────┘
- │ HTTP Requests
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ API LAYER │
-│ api/routes.py │
-│ ┌──────────────┐ ┌──────────────┐ ┌─────────────────────────┐ │
-│ │GET /model- │ │POST /api/ │ │GET /api/perturbations/ │ │
-│ │ info │ │ detect │ │ info (NEW) │ │
-│ └──────────────┘ └──────────────┘ └─────────────────────────┘ │
-│ ┌─────────────────────────────────────────────────────────────┤
-│ │POST /api/perturb (NEW) POST /api/detect-with- │ │
-│ │ perturbation (NEW) │ │
-│ └─────────────────────────────────────────────────────────────┘ │
-└─────────────────────────┬───────────────────────────────────────┘
- │ Validated Requests
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ SERVICES LAYER │
-│ ┌──────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
-│ │detection.py │ │processing.py │ │visualization.py │ │
-│ │• Inference │ │• Aggregate │ │• 8 Chart Types │ │
-│ │• Processing │ │• Save JSON │ │• Base64 Encoding │ │
-│ └──────────────┘ └──────────────┘ └────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────────────┐ │
-│ │ interpretation.py │ │
-│ │ • Human-readable insights │ │
-│ └──────────────────────────────────────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────────────┐ │
-│ │ 🆕 perturbation.py (NEW) │ │
-│ │ • Apply perturbations • Save perturbed images │ │
-│ │ • Track success/failure • Generate metadata │ │
-│ └──────────────────────────────────────────────────────────┘ │
-└─────────────────────────┬───────────────────────────────────────┘
- │ Data Processing
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ UTILITIES LAYER │
-│ ┌────────────────────────────────────────────────────────────┐ │
-│ │ utils/metrics/ │ │
-│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────────┐ │ │
-│ │ │core.py │ │rodla.py │ │spatial. │ │quality.py │ │ │
-│ │ │ │ │ │ │ py │ │ │ │ │
-│ │ └─────────┘ └─────────┘ └─────────┘ └─────────────┘ │ │
-│ └────────────────────────────────────────────────────────────┘ │
-│ ┌──────────────┐ ┌──────────────���─────────────────────┐ │
-│ │helpers.py │ │serialization.py │ │
-│ └──────────────┘ └────────────────────────────────────┘ │
-└─────────────────────────┬───────────────────────────────────────┘
- │ Perturbation Operations
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ 🆕 PERTURBATIONS LAYER (NEW) │
-│ perturbations/ │
-│ ┌────────────────────────────────────────────────────────────┐ │
-│ │ apply.py │ │
-│ │ • Orchestrates all perturbation types │ │
-│ │ • Sequential application logic │ │
-│ │ • Error handling and validation │ │
-│ └────────────────────────────────────────────────────────────┘ │
-│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────────────┐ │
-│ │blur.py │ │noise.py │ │content. │ │inconsistency.py │ │
-│ │• Defocus │ │• Speckle │ │ py │ │• Ink holdout │ │
-│ │•Vibration│ │• Texture │ │•Watermark│ │• Ink bleeding │ │
-│ │ │ │ │ │•Backgrnd │ │• Illumination │ │
-│ └──────────┘ └──────────┘ └──────────┘ └────────────────────┘ │
-│ ┌──────────────────────────────────────────────────────────────│
-│ │ spatial.py │ │
-│ │ • Rotation • Keystoning • Warping │ │
-│ └──────────────────────────────────────────────────────────────│
-└─────────────────────────┬───────────────────────────────────────┘
- │ Model Operations
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ CORE LAYER │
-│ ┌─────────────────────────┐ ┌───────────────────────────┐ │
-│ │ model_loader.py │ │ dependencies.py │ │
-│ │ │ │ │ │
-│ │ • Singleton Pattern │ │ • FastAPI DI │ │
-│ │ • GPU Management │ │ • Model Injection │ │
-│ │ • Lazy Loading │ │ │ │
-│ └─────────────────────────┘ └───────────────────────────┘ │
-└─────────────────────────┬───────────────────────────────────────┘
- │ Configuration
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ CONFIG LAYER │
-│ config/settings.py (UPDATED) │
-│ • Paths • Constants • Baseline Metrics • Thresholds │
-│ 🆕 • Perturbation Config • Background Folders • Categories │
-└─────────────────────────────────────────────────────────────────┘
-```
-
-### Design Patterns Used
-
-| Pattern | Location | Purpose |
-|---------|----------|---------|
-| **Singleton** | `model_loader.py` | Single model instance for efficiency |
-| **Factory** | `visualization.py` | Create multiple chart types dynamically |
-| **Dependency Injection** | `dependencies.py` | Inject model into routes cleanly |
-| **Repository** | `processing.py` | Abstract data persistence layer |
-| **Facade** | `routes.py` | Simplify complex subsystem interactions |
-| **Strategy** | `metrics/`, `perturbations/` | Interchangeable algorithms |
-| **🆕 Chain of Responsibility** | `perturbations/apply.py` | Sequential perturbation application |
-| **🆕 Template Method** | `perturbations/*.py` | Common perturbation interface |
-
-### Data Flow Diagram - Standard Detection
-
-```
-┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
-│ Image │───▶│ Upload │───▶│ Temp │───▶│ Model │
-│ File │ │ Handler │ │ File │ │ Inference│
-└──────────┘ └──────────┘ └──────────┘ └────┬─────┘
- │
- ┌───────────────────────────────────────────────┘
- ▼
-┌──────────┐ ┌──────────┐ ┌──────────┐
-│ Generate │───▶│ Assemble │───▶│ JSON │
-│ Interp. │ │ Response │ │ Response │
-└──────────┘ └──────────┘ └──────────┘
-```
-
-### Data Flow Diagram - Perturbation Pipeline (NEW)
-
-```
- PERTURBATION-ONLY MODE
-┌──────────┐ ┌──────────┐ ┌──────────────────────┐
-│ Image │───▶│ Upload │───▶│ Perturbation Config │
-│ File │ │ Handler │ │ (Type + Degree) │
-└──────────┘ └──────────┘ └──────────┬───────────┘
- │
- ┌─────────────────────────────────────┘
- ▼
-┌──────────────┐ ┌──────────────┐ ┌──────────────┐
-│ Apply Pert 1 │───▶│ Apply Pert 2 │───▶│ Apply Pert N │
-│ (Defocus) │ │ (Speckle) │ │ (Rotation) │
-└──────────────┘ └──────────────┘ └──────┬───────┘
- │
- ┌────────────────────────────────────────┘
- ▼
-┌──────────────┐ ┌──────────────┐ ┌──────────────┐
-│ Track │───▶│ Save Image │───▶│ Return │
-│ Success/Fail │ │ (Optional) │ │ Results │
-└──────────────┘ └──────────────┘ └──────────────┘
-
- PERTURBATION + DETECTION MODE
-┌──────────┐ ┌──────────┐ ┌──────────────────────┐
-│ Image │───▶│ Upload │───▶│ Perturbation Config │
-│ File │ │ Handler │ │ + Detection Config │
-└──────────┘ └──────────┘ └──────────┬───────────┘
- │
- ┌─────────────────────────────────────┘
- ▼
-┌──────────────────┐ ┌──────────────┐ ┌──────────────┐
-│ Apply │───▶│ Save │───▶│ Run Model │
-│ Perturbations │ │ Temp Image │ │ Inference │
-└──────────────────┘ └────────────��─┘ └──────┬───────┘
- │
- ┌─────────────────────────────────────────────┘
- ▼
-┌──────────────┐ ┌──────────────┐ ┌──────────────┐
-│ Process │───▶│ Calculate │───▶│ Add Pert │
-│ Detections │ │ All Metrics │ │ Metadata │
-└──────────────┘ └──────────────┘ └──────┬───────┘
- │
- ┌────────────────────────────────────────┘
- ▼
-┌──────────────┐ ┌──────────────┐ ┌──────────────┐
-│ Generate │───▶│ Assemble │───▶│ Return Full │
-│ Visualize │ │ Response │ │ Results │
-└──────────────┘ └──────────────┘ └──────────────┘
-```
-
----
-
-## ⚙️ Configuration
-
-### config/settings.py (UPDATED)
-
-This file centralizes all configuration parameters including new perturbation settings.
-
-```python
-"""
-Configuration Settings Module
-=============================
-All application constants and configuration in one place.
-Now includes perturbation configuration!
-"""
-
-from pathlib import Path
-
-# =============================================================================
-# PATH CONFIGURATION
-# =============================================================================
-
-# Root directory of the RoDLA model repository
-REPO_ROOT = Path("/mnt/d/MyStuff/University/Current/CV/Project/RoDLA")
-
-# Model configuration file path
-MODEL_CONFIG = REPO_ROOT / "model/configs/m6doc/rodla_internimage_xl_m6doc.py"
-
-# Pre-trained model weights path
-MODEL_WEIGHTS = REPO_ROOT / "rodla_internimage_xl_m6doc.pth"
-
-# Output directory for results and visualizations
-OUTPUT_DIR = Path("outputs")
-
-# 🆕 NEW: Perturbation output directory
-PERTURBATION_OUTPUT_DIR = OUTPUT_DIR / "perturbations"
-
-# =============================================================================
-# API CONFIGURATION
-# =============================================================================
-
-# CORS settings
-CORS_ORIGINS = ["*"] # Restrict in production
-CORS_METHODS = ["*"]
-CORS_HEADERS = ["*"]
-
-# API metadata
-API_TITLE = "RoDLA Object Detection API"
-API_VERSION = "2.1.0" # 🆕 Bumped for perturbation feature
-API_DESCRIPTION = "Production-ready API for Robust Document Layout Analysis with Perturbation Testing"
-API_HOST = "0.0.0.0"
-API_PORT = 8000
-
-# =============================================================================
-# MODEL CONFIGURATION
-# =============================================================================
-
-# Default confidence threshold for detections
-DEFAULT_SCORE_THRESHOLD = 0.3
-
-# Maximum number of detections per image
-MAX_DETECTIONS = 300
-
-# Model metadata
-MODEL_INFO = {
- "name": "RoDLA InternImage-XL",
- "paper": "RoDLA: Benchmarking the Robustness of Document Layout Analysis Models",
- "conference": "CVPR 2024",
- "backbone": "InternImage-XL",
- "framework": "DINO with Channel Attention + Average Pooling",
- "dataset": "M6Doc-P"
-}
-
-# =============================================================================
-# 🆕 PERTURBATION CONFIGURATION (NEW)
-# =============================================================================
-
-# Maximum number of perturbations per request
-MAX_PERTURBATIONS_PER_REQUEST = 5
-
-# Default background images folder
-DEFAULT_BACKGROUND_FOLDER = REPO_ROOT / "perturbation" / "background_image"
-
-# Perturbation categories and their types
-PERTURBATION_CATEGORIES = {
- "blur": {
- "types": ["defocus", "vibration"],
- "description": "Blur effects simulating optical issues",
- "typical_use": "Test robustness to camera/scanning quality"
- },
- "noise": {
- "types": ["speckle", "texture"],
- "description": "Noise patterns and texture artifacts",
- "typical_use": "Test robustness to paper quality and printing defects"
- },
- "content": {
- "types": ["watermark", "background"],
- "description": "Content additions like watermarks and backgrounds",
- "typical_use": "Test robustness to document modifications"
- },
- "inconsistency": {
- "types": ["ink_holdout", "ink_bleeding", "illumination"],
- "description": "Print quality issues and lighting variations",
- "typical_use": "Test robustness to printing and lighting conditions"
- },
- "spatial": {
- "types": ["rotation", "keystoning", "warping"],
- "description": "Geometric transformations",
- "typical_use": "Test robustness to document positioning and distortion"
- }
-}
-
-# Flatten for validation
-ALL_PERTURBATIONS = [
- p for category in PERTURBATION_CATEGORIES.values()
- for p in category["types"]
-]
-
-# Degree intensity descriptions
-PERTURBATION_DEGREES = {
- 1: "Mild - Subtle effect, barely noticeable",
- 2: "Moderate - Noticeable effect, moderate degradation",
- 3: "Severe - Strong effect, significant degradation"
-}
-
-# Perturbation-specific parameters
-PERTURBATION_PARAMS = {
- "defocus": {
- "base_kernel_size": 1,
- "description": "Gaussian blur simulating out-of-focus camera"
- },
- "vibration": {
- "base_size": 3,
- "description": "Motion blur simulating camera shake"
- },
- "speckle": {
- "base_density": 1e-4,
- "description": "Random black/white spots"
- },
- "texture": {
- "base_fibers": 300,
- "description": "Paper texture and fiber patterns"
- },
- "watermark": {
- "default_text": "Watermark_test",
- "description": "Semi-transparent text overlay"
- },
- "background": {
- "requires_folder": True,
- "description": "Background image overlay"
- },
- "ink_holdout": {
- "description": "Ink not reaching paper (erosion effect)"
- },
- "ink_bleeding": {
- "description": "Ink spreading beyond intended areas"
- },
- "illumination": {
- "description": "Uneven lighting and shadows"
- },
- "rotation": {
- "max_angle_deg": [5, 10, 15],
- "description": "Document rotation"
- },
- "keystoning": {
- "max_perspective": [0.02, 0.04, 0.06],
- "description": "Perspective distortion"
- },
- "warping": {
- "description": "Elastic deformation"
- }
-}
-
-# =============================================================================
-# BASELINE PERFORMANCE METRICS
-# =============================================================================
-
-# Clean performance baselines from the RoDLA paper (mAP scores)
-BASELINE_MAP = {
- "M6Doc": 70.0, # Main evaluation dataset
- "PubLayNet": 96.0, # Scientific documents
- "DocLayNet": 80.5 # Diverse document types
-}
-
-# State-of-the-art performance metrics
-SOTA_PERFORMANCE = {
- "clean_mAP": 70.0,
- "perturbed_avg_mAP": 61.7,
- "mRD_score": 147.6
-}
-
-# =============================================================================
-# ANALYSIS THRESHOLDS
-# =============================================================================
-
-# Size distribution thresholds (as percentage of image area)
-SIZE_THRESHOLDS = {
- "tiny": 0.005, # < 0.5% of image
- "small": 0.02, # 0.5% - 2%
- "medium": 0.1, # 2% - 10%
- "large": 1.0 # >= 10%
-}
-
-# Confidence level thresholds
-CONFIDENCE_THRESHOLDS = {
- "very_high": 0.9,
- "high": 0.8,
- "medium": 0.6,
- "low": 0.4
-}
-
-# Robustness assessment thresholds
-ROBUSTNESS_THRESHOLDS = {
- "mPE_low": 20,
- "mPE_medium": 40,
- "mRD_excellent": 100,
- "mRD_good": 150,
- "cv_stable": 0.15,
- "cv_moderate": 0.30
-}
-
-# Complexity scoring weights
-COMPLEXITY_WEIGHTS = {
- "class_diversity": 30,
- "detection_count": 30,
- "density": 20,
- "clustering": 20
-}
-
-# =============================================================================
-# VISUALIZATION CONFIGURATION
-# =============================================================================
-
-# Figure sizes for different chart types
-FIGURE_SIZES = {
- "bar_chart": (12, 6),
- "histogram": (10, 6),
- "heatmap": (10, 8),
- "boxplot": (12, 6),
- "scatter": (10, 6),
- "pie": (8, 8)
-}
-
-# Color schemes
-COLOR_SCHEMES = {
- "primary": "steelblue",
- "secondary": "forestgreen",
- "accent": "coral",
- "heatmap": "YlOrRd",
- "scatter": "viridis"
-}
-
-# DPI for saved images
-VISUALIZATION_DPI = 100
-```
-
-### Environment Variables
-
-For production deployments, use environment variables:
-
-```bash
-# .env file
-RODLA_REPO_ROOT=/path/to/RoDLA
-RODLA_MODEL_CONFIG=model/configs/m6doc/rodla_internimage_xl_m6doc.py
-RODLA_MODEL_WEIGHTS=rodla_internimage_xl_m6doc.pth
-RODLA_OUTPUT_DIR=outputs
-RODLA_PERTURBATION_DIR=outputs/perturbations
-RODLA_DEFAULT_THRESHOLD=0.3
-RODLA_API_HOST=0.0.0.0
-RODLA_API_PORT=8000
-RODLA_BACKGROUND_FOLDER=/path/to/backgrounds # NEW
-RODLA_MAX_PERTURBATIONS=5 # NEW
-```
-
----
-
-## 🌐 API Reference
-
-### Endpoints Overview
-
-| Method | Endpoint | Description | Status |
-|--------|----------|-------------|--------|
-| GET | `/api/model-info` | Get model metadata | Stable |
-| POST | `/api/detect` | Analyze document image | Stable |
-| **GET** | **`/api/perturbations/info`** | **Get perturbation information** | **🆕 NEW** |
-| **POST** | **`/api/perturb`** | **Apply perturbations only** | **🆕 NEW** |
-| **POST** | **`/api/detect-with-perturbation`** | **Perturb then detect** | **🆕 NEW** |
-| GET | `/health` | Health check | Optional |
-| GET | `/docs` | Swagger UI documentation | Auto-generated |
-| GET | `/redoc` | ReDoc documentation | Auto-generated |
-
----
-
-### GET /api/model-info
-
-Returns comprehensive information about the loaded model.
-
-#### Request
-
-```http
-GET /api/model-info HTTP/1.1
-Host: localhost:8000
-```
-
-#### Response
-
-```json
-{
- "model_name": "RoDLA InternImage-XL",
- "paper": "RoDLA: Benchmarking the Robustness of Document Layout Analysis Models (CVPR 2024)",
- "num_classes": 74,
- "classes": [
- "paragraph", "title", "figure", "table", "caption",
- "header", "footer", "page_number", "list", "abstract",
- // ... 64 more classes
- ],
- "backbone": "InternImage-XL",
- "detection_framework": "DINO with Channel Attention + Average Pooling",
- "dataset": "M6Doc-P",
- "max_detections_per_image": 300,
- "state_of_the_art_performance": {
- "clean_mAP": 70.0,
- "perturbed_avg_mAP": 61.7,
- "mRD_score": 147.6
- }
-}
-```
-
----
-
-### POST /api/detect
-
-Analyzes a document image and returns comprehensive detection results.
-
-#### Request
-
-```http
-POST /api/detect HTTP/1.1
-Host: localhost:8000
-Content-Type: multipart/form-data
-
-file:
-score_thr: "0.3"
-return_image: "false"
-save_json: "true"
-generate_visualizations: "true"
-```
-
-#### Parameters
-
-| Parameter | Type | Default | Description |
-|-----------|------|---------|-------------|
-| `file` | File | Required | Image file (JPEG, PNG, PDF, etc.) |
-| `score_thr` | string | "0.3" | Confidence threshold (0.0-1.0) |
-| `return_image` | string | "false" | Return annotated image instead of JSON |
-| `save_json` | string | "true" | Save results to disk |
-| `generate_visualizations` | string | "true" | Generate visualization charts |
-
-#### Response Structure
-
-The response includes:
-- **Core results**: Detection summary and individual detections
-- **RoDLA metrics**: Perturbation effect estimates
-- **Spatial analysis**: Distribution patterns
-- **Class analysis**: Per-class statistics
-- **Confidence analysis**: Confidence distribution
-- **Robustness indicators**: Stability metrics
-- **Layout complexity**: Complexity assessment
-- **Quality metrics**: Detection quality
-- **Visualizations**: 8 chart types (base64)
-- **Interpretation**: Human-readable insights
-
-*(Full response JSON too large to include - see original documentation or API docs)*
-
----
-
-### 🆕 GET /api/perturbations/info (NEW)
-
-Returns comprehensive information about all available perturbation types.
-
-#### Request
-
-```http
-GET /api/perturbations/info HTTP/1.1
-Host: localhost:8000
-```
-
-#### Response
-
-```json
-{
- "total_perturbations": 12,
- "categories": {
- "blur": {
- "types": ["defocus", "vibration"],
- "description": "Blur effects simulating optical issues",
- "typical_use": "Test robustness to camera/scanning quality"
- },
- "noise": {
- "types": ["speckle", "texture"],
- "description": "Noise patterns and texture artifacts",
- "typical_use": "Test robustness to paper quality and printing defects"
- },
- "content": {
- "types": ["watermark", "background"],
- "description": "Content additions like watermarks and backgrounds",
- "typical_use": "Test robustness to document modifications"
- },
- "inconsistency": {
- "types": ["ink_holdout", "ink_bleeding", "illumination"],
- "description": "Print quality issues and lighting variations",
- "typical_use": "Test robustness to printing and lighting conditions"
- },
- "spatial": {
- "types": ["rotation", "keystoning", "warping"],
- "description": "Geometric transformations",
- "typical_use": "Test robustness to document positioning and distortion"
- }
- },
- "all_types": [
- "defocus", "vibration", "speckle", "texture",
- "watermark", "background", "ink_holdout", "ink_bleeding",
- "illumination", "rotation", "keystoning", "warping"
- ],
- "degree_levels": {
- "1": "Mild - Subtle effect, barely noticeable",
- "2": "Moderate - Noticeable effect, moderate degradation",
- "3": "Severe - Strong effect, significant degradation"
- },
- "notes": {
- "background": "Requires background_folder parameter",
- "spatial": "May change image dimensions slightly",
- "sequential": "Multiple perturbations applied in order specified",
- "max_per_request": "Maximum 5 perturbations per request"
- }
-}
-```
-
----
-
-### 🆕 POST /api/perturb (NEW)
-
-Apply perturbations to an image without performing detection. Perfect for testing perturbation effects or generating augmented datasets.
-
-#### Request
-
-```http
-POST /api/perturb HTTP/1.1
-Host: localhost:8000
-Content-Type: multipart/form-data
-
-file:
-perturbations: '[{"type":"defocus","degree":2},{"type":"speckle","degree":1}]'
-return_base64: "false"
-save_image: "true"
-background_folder: "/path/to/backgrounds" (optional)
-```
-
-#### Parameters
-
-| Parameter | Type | Default | Required | Description |
-|-----------|------|---------|----------|-------------|
-| `file` | File | - | ✅ Yes | Image file to perturb |
-| `perturbations` | JSON string | - | ✅ Yes | Array of perturbation configs |
-| `return_base64` | string | "false" | ❌ No | Return perturbed image as base64 |
-| `save_image` | string | "false" | ❌ No | Save perturbed image to disk |
-| `background_folder` | string | None | ❌ No | Path for 'background' perturbation |
-
-#### Perturbation Config Format
-
-```json
-[
- {
- "type": "defocus", // Required: perturbation type
- "degree": 2, // Required: 1 (mild), 2 (moderate), 3 (severe)
- "background_folder": "..." // Optional: only for 'background' type
- },
- {
- "type": "speckle",
- "degree": 1
- }
-]
-```
-
-#### Response
-
-```json
-{
- "success": true,
- "message": "Applied 3/3 perturbations successfully",
- "perturbations_applied": [
- {
- "order": 1,
- "type": "defocus",
- "degree": 2,
- "message": "Successfully applied defocus (degree 2)"
- },
- {
- "order": 2,
- "type": "speckle",
- "degree": 1,
- "message": "Successfully applied speckle (degree 1)"
- },
- {
- "order": 3,
- "type": "rotation",
- "degree": 1,
- "message": "Successfully applied rotation (degree 1)"
- }
- ],
- "perturbations_failed": [],
- "total_perturbations": 3,
- "success_rate": 1.0,
- "original_shape": [3508, 2480, 3],
- "perturbed_shape": [3508, 2480, 3],
- "image_base64": null, // Only if return_base64=true
- "saved_path": "outputs/perturbations/document_pert_defocus2_speckle1_rotation1_20241115_143022.png"
-}
-```
-
-#### Example Usage
-
-```bash
-# Single perturbation
-curl -X POST "http://localhost:8000/api/perturb" \
- -F "file=@document.jpg" \
- -F 'perturbations=[{"type":"defocus","degree":2}]' \
- -F "save_image=true"
-
-# Multiple perturbations with base64 return
-curl -X POST "http://localhost:8000/api/perturb" \
- -F "file=@document.jpg" \
- -F 'perturbations=[{"type":"defocus","degree":2},{"type":"speckle","degree":1},{"type":"rotation","degree":1}]' \
- -F "return_base64=true" \
- -F "save_image=true"
-
-# Background perturbation (requires folder)
-curl -X POST "http://localhost:8000/api/perturb" \
- -F "file=@document.jpg" \
- -F 'perturbations=[{"type":"background","degree":2}]' \
- -F "background_folder=/path/to/backgrounds"
-```
-
----
-
-### 🆕 POST /api/detect-with-perturbation (NEW)
-
-Apply perturbations to an image, then perform RoDLA detection on the perturbed result. This endpoint combines both operations in a single request.
-
-#### Request
-
-```http
-POST /api/detect-with-perturbation HTTP/1.1
-Host: localhost:8000
-Content-Type: multipart/form-data
-
-file:
-perturbations: '[{"type":"rotation","degree":1}]'
-score_thr: "0.3"
-return_image: "false"
-save_json: "true"
-generate_visualizations: "true"
-save_perturbed_image: "false"
-background_folder: "/path/to/backgrounds" (optional)
-```
-
-#### Parameters
-
-| Parameter | Type | Default | Required | Description |
-|-----------|------|---------|----------|-------------|
-| `file` | File | - | ✅ Yes | Image file to process |
-| `perturbations` | JSON string | None | ❌ No | Array of perturbation configs |
-| `score_thr` | string | "0.3" | ❌ No | Confidence threshold |
-| `return_image` | string | "false" | ❌ No | Return annotated image |
-| `save_json` | string | "true" | ❌ No | Save results to disk |
-| `generate_visualizations` | string | "true" | ❌ No | Generate charts |
-| `save_perturbed_image` | string | "false" | ❌ No | Save perturbed image |
-| `background_folder` | string | None | ❌ No | Path for 'background' perturbation |
-
-#### Response
-
-Standard detection response PLUS perturbation metadata:
-
-```json
-{
- "success": true,
- "timestamp": "2024-01-15T10:30:45.123456",
- "filename": "document.jpg",
-
- // Standard detection fields...
- "image_info": {...},
- "detection_config": {...},
- "core_results": {...},
- "rodla_metrics": {...},
- "spatial_analysis": {...},
- // ... all other detection fields ...
-
- // 🆕 NEW: Perturbation information
- "perturbation_info": {
- "applied": [
- {
- "order": 1,
- "type": "rotation",
- "degree": 1,
- "message": "Successfully applied rotation (degree 1)"
- },
- {
- "order": 2,
- "type": "illumination",
- "degree": 2,
- "message": "Successfully applied illumination (degree 2)"
- }
- ],
- "failed": [],
- "success_rate": 1.0
- }
-}
-```
-
-#### Example Usage
-
-```bash
-# Detect with single perturbation
-curl -X POST "http://localhost:8000/api/detect-with-perturbation" \
- -F "file=@document.jpg" \
- -F 'perturbations=[{"type":"defocus","degree":2}]' \
- -F "score_thr=0.3"
-
-# Detect with multiple perturbations
-curl -X POST "http://localhost:8000/api/detect-with-perturbation" \
- -F "file=@document.jpg" \
- -F 'perturbations=[{"type":"illumination","degree":2},{"type":"speckle","degree":1}]' \
- -F "score_thr=0.3" \
- -F "save_perturbed_image=true"
-
-# Detect without perturbations (same as /api/detect)
-curl -X POST "http://localhost:8000/api/detect-with-perturbation" \
- -F "file=@document.jpg" \
- -F "score_thr=0.3"
-```
-
----
-
-## 🎨 Perturbation System
-
-### Overview
-
-The perturbation system allows you to apply various image transformations that simulate real-world document degradation and variations. This is crucial for:
-
-- **Robustness Testing**: Evaluate how well the model performs under adverse conditions
-- **Data Augmentation**: Generate training data variations
-- **Benchmark Creation**: Create standardized test sets
-- **Research**: Study model behavior under controlled perturbations
-
-### Perturbation Categories
-
-#### 1. Blur Effects
-
-Simulates optical issues from cameras or scanners.
-
-| Type | Description | Use Case | Degree Effects |
-|------|-------------|----------|----------------|
-| **defocus** | Gaussian blur | Out-of-focus camera | 1: σ=1, 2: σ=3, 3: σ=5 |
-| **vibration** | Motion blur | Camera shake | 1: 3px, 2: 9px, 3: 15px |
-
-**Example:**
-```json
-[
- {"type": "defocus", "degree": 2},
- {"type": "vibration", "degree": 1}
-]
-```
-
-#### 2. Noise Effects
-
-Simulates paper and printing quality issues.
-
-| Type | Description | Use Case | Degree Effects |
-|------|-------------|----------|----------------|
-| **speckle** | Random black/white spots | Paper defects | 1-3: Increasing density |
-| **texture** | Fibrous paper texture | Paper grain | 1: 300 fibers, 3: 900 fibers |
-
-**Example:**
-```json
-[
- {"type": "speckle", "degree": 1},
- {"type": "texture", "degree": 2}
-]
-```
-
-#### 3. Content Effects
-
-Adds overlays and modifications.
-
-| Type | Description | Use Case | Degree Effects |
-|------|-------------|----------|----------------|
-| **watermark** | Semi-transparent text | Document marking | 1-3: Increasing opacity |
-| **background** | Background image overlay | Texture addition | 1-3: More background images |
-
-**Example:**
-```json
-[
- {"type": "watermark", "degree": 2},
- {"type": "background", "degree": 1, "background_folder": "/path/to/backgrounds"}
-]
-```
-
-**Note**: `background` requires `background_folder` parameter.
-
-#### 4. Inconsistency Effects
-
-Simulates printing and lighting issues.
-
-| Type | Description | Use Case | Degree Effects |
-|------|-------------|----------|----------------|
-| **ink_holdout** | Ink not reaching paper | Print quality | 1-3: Increasing erosion |
-| **ink_bleeding** | Ink spreading | Poor paper quality | 1-3: Increasing dilation |
-| **illumination** | Uneven lighting | Shadow/highlights | 1-3: Stronger effect |
-
-**Example:**
-```json
-[
- {"type": "ink_holdout", "degree": 1},
- {"type": "ink_bleeding", "degree": 2},
- {"type": "illumination", "degree": 1}
-]
-```
-
-#### 5. Spatial Transformations
-
-Geometric distortions and positioning issues.
-
-| Type | Description | Use Case | Degree Effects |
-|------|-------------|----------|----------------|
-| **rotation** | Document rotation | Scanning angle | 1: ±5°, 2: ±10°, 3: ±15° |
-| **keystoning** | Perspective distortion | Camera angle | 1-3: Increasing distortion |
-| **warping** | Elastic deformation | Page curl | 1-3: Increasing warping |
-
-**Example:**
-```json
-[
- {"type": "rotation", "degree": 1},
- {"type": "keystoning", "degree": 2},
- {"type": "warping", "degree": 1}
-]
-```
-
-**Note**: Spatial perturbations may modify bounding box annotations if provided.
-
-### Intensity Levels
-
-All perturbations support 3 intensity levels:
-
-| Degree | Name | Description | Visual Impact |
-|--------|------|-------------|---------------|
-| **1** | Mild | Subtle, barely noticeable | Minimal degradation |
-| **2** | Moderate | Noticeable, moderate effect | Visible but readable |
-| **3** | Severe | Strong, significant effect | Clearly degraded |
-
-### Sequential Application
-
-Perturbations are applied in the order specified:
-
-```json
-[
- {"type": "defocus", "degree": 2}, // Applied first
- {"type": "speckle", "degree": 1}, // Applied to defocused image
- {"type": "rotation", "degree": 1} // Applied to defocused+speckled image
-]
-```
-
-**Important**: Order matters! Different orders produce different results.
-
-### Perturbation Module Structure
-
-```python
-perturbations/
-├── __init__.py # Exports all functions
-├── apply.py # Main orchestration
-│ ├── apply_perturbation() # Apply single perturbation
-│ ├── apply_multiple_perturbations() # Apply sequence
-│ └── get_perturbation_info() # Get available perturbations
-├── blur.py # Defocus, vibration
-├── content.py # Watermark, background
-├── noise.py # Speckle, texture
-├── inconsistency.py # Ink effects, illumination
-└── spatial.py # Rotation, keystoning, warping
-```
-
-### Usage Examples
-
-#### Python Client
-
-```python
-import requests
-import json
-
-# Test different perturbation categories
-perturbation_tests = {
- "blur_test": [
- {"type": "defocus", "degree": 2},
- {"type": "vibration", "degree": 1}
- ],
- "noise_test": [
- {"type": "speckle", "degree": 2},
- {"type": "texture", "degree": 1}
- ],
- "spatial_test": [
- {"type": "rotation", "degree": 1},
- {"type": "keystoning", "degree": 2}
- ],
- "combined_test": [
- {"type": "illumination", "degree": 2},
- {"type": "speckle", "degree": 1},
- {"type": "rotation", "degree": 1}
- ]
-}
-
-for test_name, perturbations in perturbation_tests.items():
- with open("document.jpg", "rb") as f:
- response = requests.post(
- "http://localhost:8000/api/perturb",
- files={"file": f},
- data={
- "perturbations": json.dumps(perturbations),
- "save_image": "true"
- }
- )
-
- result = response.json()
- print(f"\n{test_name}:")
- print(f" Success rate: {result['success_rate']:.1%}")
- print(f" Saved to: {result.get('saved_path', 'N/A')}")
-```
-
-#### Batch Processing
-
-```python
-import os
-from pathlib import Path
-
-def batch_perturb_documents(input_dir, perturbations):
- """Apply same perturbations to all images in directory."""
- input_path = Path(input_dir)
- results = []
-
- for image_file in input_path.glob("*.jpg"):
- with open(image_file, "rb") as f:
- response = requests.post(
- "http://localhost:8000/api/perturb",
- files={"file": f},
- data={
- "perturbations": json.dumps(perturbations)
- }
- )
-
- # Should return error or limit to max
- assert response.status_code in [400, 200]
-```
-
-#### Integration Tests
-
-```python
-# tests/test_integration/test_perturbation_pipeline.py
-
-import pytest
-from fastapi.testclient import TestClient
-import json
-import numpy as np
-import cv2
-from backend import app
-
-client = TestClient(app)
-
-class TestPerturbationPipeline:
- @pytest.fixture
- def test_image(self, tmp_path):
- """Create realistic test document image."""
- # Create white background
- img = np.ones((500, 700, 3), dtype=np.uint8) * 255
-
- # Add some text-like rectangles
- cv2.rectangle(img, (50, 50), (650, 100), (0, 0, 0), -1)
- cv2.rectangle(img, (50, 150), (650, 200), (0, 0, 0), -1)
- cv2.rectangle(img, (50, 250), (650, 450), (128, 128, 128), 2)
-
- img_path = tmp_path / "test_document.jpg"
- cv2.imwrite(str(img_path), img)
-
- return img_path
-
- def test_full_pipeline_perturb_then_detect(self, test_image):
- """Test complete pipeline: perturb then detect."""
- perturbations = [
- {"type": "defocus", "degree": 1},
- {"type": "speckle", "degree": 1}
- ]
-
- with open(test_image, "rb") as f:
- response = client.post(
- "/api/detect-with-perturbation",
- files={"file": ("test.jpg", f, "image/jpeg")},
- data={
- "perturbations": json.dumps(perturbations),
- "score_thr": "0.3",
- "generate_visualizations": "false"
- }
- )
-
- assert response.status_code == 200
- data = response.json()
-
- # Check detection results exist
- assert "core_results" in data
- assert "summary" in data["core_results"]
-
- # Check perturbation info exists
- assert "perturbation_info" in data
- assert len(data["perturbation_info"]["applied"]) == 2
-
- def test_compare_clean_vs_perturbed(self, test_image):
- """Compare detection results: clean vs perturbed."""
- # Get clean results
- with open(test_image, "rb") as f:
- clean_response = client.post(
- "/api/detect",
- files={"file": ("test.jpg", f, "image/jpeg")},
- data={"score_thr": "0.3"}
- )
-
- clean_data = clean_response.json()
- clean_detections = clean_data["core_results"]["summary"]["total_detections"]
- clean_confidence = clean_data["core_results"]["summary"]["average_confidence"]
-
- # Get perturbed results
- with open(test_image, "rb") as f:
- pert_response = client.post(
- "/api/detect-with-perturbation",
- files={"file": ("test.jpg", f, "image/jpeg")},
- data={
- "perturbations": json.dumps([{"type": "defocus", "degree": 3}]),
- "score_thr": "0.3"
- }
- )
-
- pert_data = pert_response.json()
- pert_detections = pert_data["core_results"]["summary"]["total_detections"]
- pert_confidence = pert_data["core_results"]["summary"]["average_confidence"]
-
- # Severe perturbation should affect results
- # (Either fewer detections or lower confidence)
- assert pert_detections <= clean_detections or pert_confidence < clean_confidence
-```
-
-### Mocking Dependencies
-
-```python
-# tests/conftest.py
-
-import pytest
-from unittest.mock import Mock, patch
-import numpy as np
-
-@pytest.fixture
-def mock_model():
- """Create a mock detection model."""
- model = Mock()
- model.CLASSES = ['paragraph', 'title', 'figure', 'table']
-
- # Mock inference result
- def mock_inference(img):
- # Return fake detections
- return [
- np.array([[100, 100, 500, 300, 0.95]]), # paragraph
- np.array([[100, 50, 500, 90, 0.90]]), # title
- np.array([]), # figure (none)
- np.array([[200, 350, 450, 480, 0.85]]) # table
- ]
-
- model.side_effect = mock_inference
- return model
-
-@pytest.fixture
-def mock_perturbation_functions():
- """Mock perturbation functions for testing."""
- with patch('perturbations.blur.apply_defocus') as mock_defocus, \
- patch('perturbations.noise.apply_speckle') as mock_speckle:
-
- # Return slightly modified image
- mock_defocus.side_effect = lambda img, **kwargs: img + 10
- mock_speckle.side_effect = lambda img, **kwargs: img + 5
-
- yield {
- 'defocus': mock_defocus,
- 'speckle': mock_speckle
- }
-```
-
----
-
-## 🚢 Deployment
-
-### Development Server
-
-```bash
-# Simple development server
-python backend.py
-
-# With uvicorn and auto-reload
-uvicorn backend:app --reload --host 0.0.0.0 --port 8000
-
-# With specific log level
-uvicorn backend:app --reload --log-level debug
-```
-
-### Production with Gunicorn
-
-```bash
-# Single worker (recommended for GPU)
-gunicorn backend:app \
- -w 1 \
- -k uvicorn.workers.UvicornWorker \
- --bind 0.0.0.0:8000 \
- --timeout 120 \
- --keep-alive 5 \
- --access-logfile - \
- --error-logfile -
-
-# With systemd service
-# /etc/systemd/system/rodla-api.service
-[Unit]
-Description=RoDLA Document Analysis API
-After=network.target
-
-[Service]
-User=rodla
-WorkingDirectory=/opt/rodla-api/deployment
-Environment="PATH=/opt/rodla-api/venv/bin"
-ExecStart=/opt/rodla-api/venv/bin/gunicorn backend:app \
- -w 1 -k uvicorn.workers.UvicornWorker \
- --bind 0.0.0.0:8000 \
- --timeout 120
-Restart=always
-
-[Install]
-WantedBy=multi-user.target
-```
-
-**Note:** Use `workers=1` for GPU models to avoid CUDA initialization issues across processes.
-
-### Docker Deployment
-
-```dockerfile
-# Dockerfile
-FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04
-
-# Install system dependencies
-RUN apt-get update && apt-get install -y \
- python3.9 \
- python3-pip \
- libgl1-mesa-glx \
- libglib2.0-0 \
- && rm -rf /var/lib/apt/lists/*
-
-# Set working directory
-WORKDIR /app
-
-# Copy requirements first for layer caching
-COPY requirements.txt .
-RUN pip3 install --no-cache-dir -r requirements.txt
-
-# Copy application code
-COPY . .
-
-# Create output directories
-RUN mkdir -p outputs outputs/perturbations
-
-# Expose port
-EXPOSE 8000
-
-# Health check
-HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
- CMD curl -f http://localhost:8000/health || exit 1
-
-# Run application
-CMD ["uvicorn", "backend:app", "--host", "0.0.0.0", "--port", "8000"]
-```
-
-```yaml
-# docker-compose.yml
-version: '3.8'
-
-services:
- rodla-api:
- build: .
- ports:
- - "8000:8000"
- volumes:
- - ./outputs:/app/outputs
- - ./weights:/app/weights
- - ./perturbation:/app/perturbation # NEW: Mount perturbation resources
- environment:
- - RODLA_API_KEY=${RODLA_API_KEY}
- - RODLA_BACKGROUND_FOLDER=/app/perturbation/background_image
- deploy:
- resources:
- reservations:
- devices:
- - driver: nvidia
- count: 1
- capabilities: [gpu]
- restart: unless-stopped
- healthcheck:
- test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
- interval: 30s
- timeout: 10s
- retries: 3
- start_period: 60s
-```
-
-**Build and Run:**
-```bash
-# Build image
-docker-compose build
-
-# Run container
-docker-compose up -d
-
-# View logs
-docker-compose logs -f
-
-# Stop container
-docker-compose down
-```
-
-### Kubernetes Deployment
-
-```yaml
-# k8s/deployment.yaml
-apiVersion: apps/v1
-kind: Deployment
-metadata:
- name: rodla-api
- labels:
- app: rodla-api
-spec:
- replicas: 1 # Single replica for GPU
- selector:
- matchLabels:
- app: rodla-api
- template:
- metadata:
- labels:
- app: rodla-api
- spec:
- containers:
- - name: rodla-api
- image: your-registry/rodla-api:latest
- ports:
- - containerPort: 8000
- name: http
- env:
- - name: RODLA_API_KEY
- valueFrom:
- secretKeyRef:
- name: rodla-secrets
- key: api-key
- resources:
- limits:
- nvidia.com/gpu: 1
- memory: "16Gi"
- cpu: "4"
- requests:
- memory: "8Gi"
- cpu: "2"
- volumeMounts:
- - name: outputs
- mountPath: /app/outputs
- - name: weights
- mountPath: /app/weights
- - name: perturbation-resources
- mountPath: /app/perturbation
- livenessProbe:
- httpGet:
- path: /health
- port: 8000
- initialDelaySeconds: 60
- periodSeconds: 30
- readinessProbe:
- httpGet:
- path: /health
- port: 8000
- initialDelaySeconds: 30
- periodSeconds: 10
- volumes:
- - name: outputs
- persistentVolumeClaim:
- claimName: rodla-outputs-pvc
- - name: weights
- persistentVolumeClaim:
- claimName: rodla-weights-pvc
- - name: perturbation-resources
- persistentVolumeClaim:
- claimName: rodla-perturbation-pvc
----
-apiVersion: v1
-kind: Service
-metadata:
- name: rodla-api-service
-spec:
- selector:
- app: rodla-api
- ports:
- - protocol: TCP
- port: 80
- targetPort: 8000
- type: LoadBalancer
-```
-
-### Nginx Reverse Proxy
-
-```nginx
-# /etc/nginx/sites-available/rodla-api
-upstream rodla_backend {
- server 127.0.0.1:8000;
- keepalive 32;
-}
-
-server {
- listen 80;
- server_name api.yourdomain.com;
-
- # Redirect to HTTPS
- return 301 https://$server_name$request_uri;
-}
-
-server {
- listen 443 ssl http2;
- server_name api.yourdomain.com;
-
- # SSL certificates
- ssl_certificate /etc/letsencrypt/live/api.yourdomain.com/fullchain.pem;
- ssl_certificate_key /etc/letsencrypt/live/api.yourdomain.com/privkey.pem;
-
- # SSL configuration
- ssl_protocols TLSv1.2 TLSv1.3;
- ssl_ciphers HIGH:!aNULL:!MD5;
- ssl_prefer_server_ciphers on;
-
- # Security headers
- add_header Strict-Transport-Security "max-age=31536000" always;
- add_header X-Frame-Options "SAMEORIGIN" always;
- add_header X-Content-Type-Options "nosniff" always;
-
- # File upload size (important for large images)
- client_max_body_size 50M;
-
- # Timeouts (important for slow perturbations)
- proxy_connect_timeout 120s;
- proxy_send_timeout 120s;
- proxy_read_timeout 120s;
-
- location / {
- proxy_pass http://rodla_backend;
- proxy_set_header Host $host;
- proxy_set_header X-Real-IP $remote_addr;
- proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
- proxy_set_header X-Forwarded-Proto $scheme;
-
- # WebSocket support (if needed)
- proxy_http_version 1.1;
- proxy_set_header Upgrade $http_upgrade;
- proxy_set_header Connection "upgrade";
- }
-
- # Rate limiting
- limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/m;
- limit_req zone=api_limit burst=20 nodelay;
-
- # Access logs
- access_log /var/log/nginx/rodla-api-access.log;
- error_log /var/log/nginx/rodla-api-error.log;
-}
-```
-
----
-
-## 🔧 Troubleshooting
-
-### Common Issues
-
-#### 1. Model Loading Failures
-
-**Symptom:** `RuntimeError: CUDA out of memory`
-
-**Solutions:**
-```bash
-# Clear GPU memory
-nvidia-smi --gpu-reset
-
-# Or in Python
-import torch
-torch.cuda.empty_cache()
-gc.collect()
-
-# Check available memory
-nvidia-smi
-
-# Reduce model precision (if supported)
-model = model.half() # Use FP16
-```
-
-**Symptom:** `ModuleNotFoundError: No module named 'mmdet'`
-
-**Solution:**
-```bash
-pip install -U openmim
-mim install mmengine mmcv mmdet
-```
-
-**Symptom:** `FileNotFoundError: Config file not found`
-
-**Solution:**
-```python
-# Verify paths in config/settings.py
-from pathlib import Path
-from config.settings import MODEL_CONFIG, MODEL_WEIGHTS
-
-print(f"Config exists: {Path(MODEL_CONFIG).exists()}")
-print(f"Weights exist: {Path(MODEL_WEIGHTS).exists()}")
-```
-
-#### 2. Perturbation Failures (NEW)
-
-**Symptom:** `ModuleNotFoundError: No module named 'ocrodeg'`
-
-**Solution:**
-```bash
-# Install perturbation dependencies
-pip install ocrodeg imgaug pyiqa
-
-# Verify installation
-python -c "import ocrodeg, imgaug; print('OK')"
-```
-
-**Symptom:** `ValueError: background_folder required for 'background' perturbation`
-
-**Solution:**
-```bash
-# Setup background folder
-mkdir -p perturbation/background_image
-
-# Add background images
-cp /path/to/textures/*.jpg perturbation/background_image/
-
-# Create index file
-python -c "
-from content import create_background_file
-create_background_file('perturbation/background_image/')
-"
-
-# Or provide folder in API call
-curl -X POST "http://localhost:8000/api/perturb" \
- -F "file=@image.jpg" \
- -F 'perturbations=[{"type":"background","degree":2}]' \
- -F "background_folder=/path/to/backgrounds"
-```
-
-**Symptom:** `ImportError: cannot import name 'apply_rotation' from 'spatial'`
-
-**Solution:**
-```bash
-# Ensure all perturbation files are copied
-ls -la deployment/perturbations/
-
-# Should contain:
-# blur.py, content.py, noise.py, inconsistency.py, spatial.py
-
-# Re-copy from RoDLA root if missing
-cp /path/to/RoDLA/perturbation/*.py deployment/perturbations/
-```
-
-#### 3. Inference Errors
-
-**Symptom:** `RuntimeError: Input type and weight type should be the same`
-
-**Solution:**
-```python
-# Ensure model and input on same device
-model = model.to('cuda')
-# or
-model = model.to('cpu')
-
-# Check device
-print(f"Model device: {next(model.parameters()).device}")
-```
-
-**Symptom:** `ValueError: could not broadcast input array`
-
-**Solution:**
-```python
-# Check image dimensions
-from PIL import Image
-img = Image.open(image_path)
-print(f"Image size: {img.size}") # (width, height)
-print(f"Image mode: {img.mode}") # Should be RGB or L
-
-# Ensure proper format
-img = img.convert('RGB')
-```
-
-#### 4. Visualization Errors
-
-**Symptom:** `RuntimeError: main thread is not in main loop`
-
-**Solution:**
-```python
-# Set matplotlib backend before importing pyplot
-import matplotlib
-matplotlib.use('Agg') # Non-interactive backend
-import matplotlib.pyplot as plt
-```
-
-**Symptom:** Memory grows with each request
-
-**Solution:**
-```python
-# Always close figures
-fig, ax = plt.subplots()
-# ... plotting code ...
-plt.savefig(buffer, format='png')
-plt.close(fig) # CRITICAL
-plt.close('all') # Nuclear option
-
-# Monitor memory
-import psutil
-print(f"Memory usage: {psutil.virtual_memory().percent}%")
-```
-
-#### 5. API Errors
-
-**Symptom:** `422 Unprocessable Entity`
-
-**Cause:** Invalid request format
-
-**Solution:**
-```bash
-# Correct format with proper content type
-curl -X POST "http://localhost:8000/api/detect" \
- -H "accept: application/json" \
- -F "file=@image.jpg;type=image/jpeg" \
- -F "score_thr=0.3"
-
-# For perturbations, ensure valid JSON
-curl -X POST "http://localhost:8000/api/perturb" \
- -F "file=@image.jpg" \
- -F 'perturbations=[{"type":"defocus","degree":2}]' # Note single quotes around JSON
-```
-
-**Symptom:** `413 Request Entity Too Large`
-
-**Solution:**
-```python
-# Increase in Nginx
-# client_max_body_size 50M;
-
-# Or check FastAPI limits
-from fastapi import FastAPI, File, UploadFile
-
-# No explicit limit in FastAPI, controlled by server
-```
-
-### Debugging Tips
-
-#### 1. Enable Debug Logging
-
-```python
-import logging
-
-logging.basicConfig(
- level=logging.DEBUG,
- format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
-)
-logger = logging.getLogger(__name__)
-
-# In your code
-logger.debug(f"Processing image: {filename}")
-logger.debug(f"Detections found: {len(detections)}")
-logger.debug(f"Perturbations applied: {perturbations}")
-```
-
-#### 2. GPU Monitoring
-
-```bash
-# Real-time monitoring
-watch -n 1 nvidia-smi
-
-# With gpustat (more readable)
-pip install gpustat
-gpustat -i 1
-
-# Log to file
-nvidia-smi -l 1 > gpu_usage.log &
-```
-
-#### 3. Memory Profiling
-
-```python
-# Install profiler
-pip install memory_profiler
-
-# Decorate functions
-from memory_profiler import profile
-
-@profile
-def detect_objects(...):
- ...
-
-# Run with profiling
-python -m memory_profiler backend.py
-```
-
-#### 4. Request Timing
-
-```python
-import time
-from functools import wraps
-
-def timer(func):
- @wraps(func)
- async def wrapper(*args, **kwargs):
- start = time.time()
- result = await func(*args, **kwargs)
- elapsed = time.time() - start
- logger.info(f"{func.__name__} took {elapsed:.2f}s")
- return result
- return wrapper
-
-@app.post("/api/detect")
-@timer
-async def detect_objects(...):
- ...
-```
-
-#### 5. Perturbation Debugging (NEW)
-
-```python
-# Test perturbations individually
-from perturbations import apply_perturbation
-import cv2
-
-image = cv2.imread("test.jpg")
-
-# Test each perturbation
-for pert_type in ["defocus", "speckle", "rotation"]:
- for degree in [1, 2, 3]:
- result, success, msg = apply_perturbation(image, pert_type, degree)
- print(f"{pert_type} degree {degree}: {msg}")
-
- if success:
- cv2.imwrite(f"debug_{pert_type}_{degree}.jpg", result)
-```
-
-### Health Checks
-
-```python
-# Add comprehensive health check endpoint
-@app.get("/health")
-async def health_check():
- health_status = {
- "status": "healthy",
- "timestamp": datetime.now().isoformat(),
- "components": {}
- }
-
- # Check model
- try:
- health_status["components"]["model"] = {
- "status": "ok" if model is not None else "error",
- "loaded": model is not None
- }
- except Exception as e:
- health_status["components"]["model"] = {
- "status": "error",
- "error": str(e)
- }
-
- # Check GPU
- try:
- import torch
- health_status["components"]["gpu"] = {
- "status": "ok" if torch.cuda.is_available() else "warning",
- "available": torch.cuda.is_available(),
- "device_count": torch.cuda.device_count() if torch.cuda.is_available() else 0,
- "memory_allocated": f"{torch.cuda.memory_allocated(0) / 1024**3:.2f} GB" if torch.cuda.is_available() else "N/A"
- }
- except Exception as e:
- health_status["components"]["gpu"] = {
- "status": "error",
- "error": str(e)
- }
-
- # Check perturbation dependencies
- try:
- import ocrodeg
- import imgaug
- health_status["components"]["perturbations"] = {
- "status": "ok",
- "dependencies_loaded": True
- }
- except ImportError as e:
- health_status["components"]["perturbations"] = {
- "status": "warning",
- "dependencies_loaded": False,
- "error": str(e)
- }
-
- # Check disk space
- try:
- import shutil
- total, used, free = shutil.disk_usage("/")
- health_status["components"]["disk"] = {
- "status": "ok" if free / total > 0.1 else "warning",
- "free_gb": f"{free / 1024**3:.2f}",
- "total_gb": f"{total / 1024**3:.2f}",
- "usage_percent": f"{(used / total) * 100:.1f}"
- }
- except Exception as e:
- health_status["components"]["disk"] = {
- "status": "error",
- "error": str(e)
- }
-
- # Overall status
- component_statuses = [c["status"] for c in health_status["components"].values()]
- if "error" in component_statuses:
- health_status["status"] = "unhealthy"
- elif "warning" in component_statuses:
- health_status["status"] = "degraded"
-
- return health_status
-```
-
----
-
-## 🤝 Contributing
-
-### Getting Started
-
-1. **Fork the repository**
- ```bash
- # On GitHub, click "Fork"
- git clone https://github.com/yourusername/rodla-api.git
- cd rodla-api
- ```
-
-2. **Create a feature branch**
- ```bash
- git checkout -b feature/amazing-feature
- # or
- git checkout -b fix/bug-description
- ```
-
-3. **Make your changes**
- - Follow code style guidelines
- - Add tests for new features
- - Update documentation
-
-4. **Run tests**
- ```bash
- pytest
- black .
- isort .
- flake8 .
- ```
-
-5. **Commit your changes**
- ```bash
- git add .
- git commit -m "Add: Brief description of changes"
- ```
-
-6. **Push to your fork**
- ```bash
- git push origin feature/amazing-feature
- ```
-
-7. **Open a Pull Request**
- - Describe your changes
- - Link any related issues
- - Wait for review
-
-### Code Style
-
-```bash
-# Install development dependencies
-pip install black isort flake8 mypy pytest
-
-# Format code
-black . --line-length 100
-isort . --profile black
-
-# Check style
-flake8 . --max-line-length 100 --ignore E203,W503
-
-# Type checking
-mypy . --ignore-missing-imports
-```
-
-### Pre-commit Hooks
-
-```yaml
-# .pre-commit-config.yaml
-repos:
- - repo: https://github.com/psf/black
- rev: 23.7.0
- hooks:
- - id: black
- language_version: python3.9
- args: ['--line-length=100']
-
- - repo: https://github.com/pycqa/isort
- rev: 5.12.0
- hooks:
- - id: isort
- args: ['--profile', 'black']
-
- - repo: https://github.com/pycqa/flake8
- rev: 6.1.0
- hooks:
- - id: flake8
- args: ['--max-line-length=100', '--ignore=E203,W503']
-
- - repo: https://github.com/pre-commit/pre-commit-hooks
- rev: v4.4.0
- hooks:
- - id: trailing-whitespace
- - id: end-of-file-fixer
- - id: check-yaml
- - id: check-added-large-files
-```
-
-```bash
-# Install pre-commit
-pip install pre-commit
-pre-commit install
-
-# Run manually
-pre-commit run --all-files
-```
-
-### Adding New Features
-
-#### Adding a New Perturbation Type
-
-1. **Implement the perturbation function** in appropriate module:
- ```python
- # perturbations/blur.py (example)
-
- def apply_new_blur(image, degree, save_path=None):
- """
- Apply new blur effect.
-
- Args:
- image: Input BGR image
- degree: Intensity (1-3)
- save_path: Optional path to save result
-
- Returns:
- Blurred image
- """
- if degree == 0:
- return image
-
- # Implementation here
- degree_value = 2 * degree - 1
- # ... blur logic ...
-
- if save_path:
- cv2.imwrite(save_path, result)
-
- return result
- ```
-
-2. **Export from `__init__.py`**:
- ```python
- # perturbations/__init__.py
- from .blur import apply_defocus, apply_vibration, apply_new_blur # Add new,
- "save_image": "true"
- }
- )
-
- result = response.json()
- results.append({
- "filename": image_file.name,
- "success": result["success"],
- "success_rate": result["success_rate"]
- })
-
- return results
-
-# Apply consistent perturbations to entire dataset
-perturbations = [
- {"type": "defocus", "degree": 1},
- {"type": "speckle", "degree": 1}
-]
-
-results = batch_perturb_documents("input_images/", perturbations)
-print(f"Processed {len(results)} images")
-print(f"Overall success: {sum(r['success'] for r in results)}/{len(results)}")
-```
-
-#### Robustness Testing Pipeline
-
-```python
-def test_model_robustness(image_path, perturbation_types, degrees=[1, 2, 3]):
- """
- Test model performance across different perturbation intensities.
- """
- results = {}
-
- for pert_type in perturbation_types:
- results[pert_type] = {}
-
- for degree in degrees:
- with open(image_path, "rb") as f:
- response = requests.post(
- "http://localhost:8000/api/detect-with-perturbation",
- files={"file": f},
- data={
- "perturbations": json.dumps([{"type": pert_type, "degree": degree}]),
- "score_thr": "0.3"
- }
- )
-
- result = response.json()
-
- results[pert_type][f"degree_{degree}"] = {
- "detections": result["core_results"]["summary"]["total_detections"],
- "avg_confidence": result["core_results"]["summary"]["average_confidence"],
- "robustness_score": result["robustness_indicators"]["robustness_rating"]["score"]
- }
-
- return results
-
-# Test all blur perturbations at all intensities
-robustness_results = test_model_robustness(
- "test_document.jpg",
- perturbation_types=["defocus", "vibration", "speckle"],
- degrees=[1, 2, 3]
-)
-
-# Analyze results
-for pert_type, degrees in robustness_results.items():
- print(f"\n{pert_type}:")
- for degree, metrics in degrees.items():
- print(f" {degree}: {metrics['detections']} detections, "
- f"{metrics['avg_confidence']:.2%} confidence")
-```
-
----
-
-## 📊 Metrics System
-
-### Metrics Architecture
-
-```
-utils/metrics/
-├── __init__.py # Exports all metric functions
-├── core.py # Core detection metrics
-├── rodla.py # RoDLA-specific robustness metrics
-├── spatial.py # Spatial distribution analysis
-└── quality.py # Quality and complexity metrics
-```
-
-### Core Metrics (utils/metrics/core.py)
-
-#### `calculate_core_metrics(detections, img_width, img_height)`
-
-Computes fundamental detection statistics.
-
-| Metric | Type | Description | Range |
-|--------|------|-------------|-------|
-| `total_detections` | int | Number of detected elements | 0 - 300 |
-| `unique_classes` | int | Number of distinct element types | 0 - 74 |
-| `average_confidence` | float | Mean confidence score | 0.0 - 1.0 |
-| `median_confidence` | float | Median confidence score | 0.0 - 1.0 |
-| `min_confidence` | float | Lowest confidence | 0.0 - 1.0 |
-| `max_confidence` | float | Highest confidence | 0.0 - 1.0 |
-| `coverage_percentage` | float | % of image covered | 0.0 - 100.0 |
-| `average_detection_area` | float | Mean area per detection | pixels² |
-
-#### `calculate_class_metrics(detections)`
-
-Per-class statistical analysis.
-
-```python
-{
- "paragraph": {
- "count": 15,
- "percentage": 31.91,
- "confidence_stats": {
- "mean": 0.8234,
- "std": 0.0876,
- "min": 0.6543,
- "max": 0.9654
- },
- "area_stats": {
- "mean": 125432.5,
- "std": 45678.2,
- "total": 1881487.5
- },
- "aspect_ratio_stats": {
- "mean": 2.345,
- "orientation": "horizontal" # horizontal/vertical/square
- }
- },
- "title": {...},
- "figure": {...}
- // ... other classes
-}
-```
-
-#### `calculate_confidence_metrics(detections)`
-
-Detailed confidence distribution analysis.
-
-**Confidence Bins:**
-- **Very High**: 0.9 - 1.0 (highly certain)
-- **High**: 0.8 - 0.9 (confident)
-- **Medium**: 0.6 - 0.8 (acceptable)
-- **Low**: 0.4 - 0.6 (uncertain)
-- **Very Low**: 0.0 - 0.4 (very uncertain)
-
-**Output:**
-```python
-{
- "distribution": {
- "mean": 0.7823,
- "median": 0.8156,
- "std": 0.1234,
- "min": 0.3012,
- "max": 0.9876,
- "q1": 0.7012,
- "q3": 0.8967
- },
- "binned_distribution": {
- "very_high": 15,
- "high": 20,
- "medium": 10,
- "low": 2,
- "very_low": 0
- },
- "percentages": {
- "very_high": 31.91,
- "high": 42.55,
- "medium": 21.28,
- "low": 4.26,
- "very_low": 0.0
- },
- "entropy": 2.3456 # Shannon entropy
-}
-```
-
----
-
-### RoDLA Metrics (utils/metrics/rodla.py)
-
-These metrics estimate robustness based on the RoDLA paper's methodology.
-
-#### `calculate_rodla_metrics(detections, core_metrics)`
-
-Estimates perturbation effects and robustness degradation.
-
-| Metric | Formula | Range | Interpretation |
-|--------|---------|-------|----------------|
-| `estimated_mPE` | `(std × 100) + (range × 50)` | 0-100+ | Mean Perturbation Effect |
-| `estimated_mRD` | `(degradation / mPE) × 100` | 0-200+ | Mean Robustness Degradation |
-| `robustness_score` | `(1 - mRD/200) × 100` | 0-100 | Overall robustness |
-
-**mPE Interpretation:**
-```
-low: mPE < 20 → Minimal perturbation effect
- Predictions are very consistent
-
-medium: 20 ≤ mPE < 40 → Moderate perturbation effect
- Some variability in predictions
-
-high: mPE ≥ 40 → Significant perturbation effect
- High prediction variability
-```
-
-**mRD Interpretation:**
-```
-excellent: mRD < 100 → Highly robust model
- Minimal performance degradation
-
-good: 100 ≤ mRD < 150 → Acceptable robustness
- Moderate degradation
-
-needs_improvement: mRD ≥ 150 → Robustness concerns
- Significant degradation
-```
-
-#### `calculate_robustness_indicators(detections, core_metrics)`
-
-Stability and consistency metrics.
-
-```python
-{
- "stability_score": 87.65, # (1 - CV) × 100
- "coefficient_of_variation": 0.12, # std / mean
- "high_confidence_ratio": 0.72, # % detections with conf ≥ 0.8
- "prediction_consistency": "high", # Based on CV thresholds
- "model_certainty": "medium", # Based on avg confidence
- "robustness_rating": {
- "rating": "good", # excellent/good/fair/poor
- "score": 72.34 # Composite score
- }
-}
-```
-
-**Robustness Rating Formula:**
-```
-score = (avg_confidence × 40) +
- ((1 - CV) × 30) +
- (high_conf_ratio × 30)
-
-Rating categories:
-- excellent: score ≥ 80 (very robust)
-- good: 60 ≤ score < 80 (robust)
-- fair: 40 ≤ score < 60 (acceptable)
-- poor: score < 40 (concerning)
-```
-
----
-
-### Spatial Metrics (utils/metrics/spatial.py)
-
-#### `calculate_spatial_analysis(detections, img_width, img_height)`
-
-Comprehensive spatial distribution analysis.
-
-##### Horizontal Distribution
-```python
-{
- "mean": 1240.5, # Mean x-coordinate
- "std": 456.7, # Standard deviation
- "skewness": -0.234, # Distribution asymmetry
- # Negative = left-skewed
- # Positive = right-skewed
- "left_third": 12, # Count in left 33%
- "center_third": 25, # Count in center 33%
- "right_third": 10 # Count in right 33%
-}
-```
-
-##### Vertical Distribution
-```python
-{
- "mean": 1754.2, # Mean y-coordinate
- "std": 892.4, # Standard deviation
- "skewness": 0.156, # Distribution asymmetry
- # Negative = top-heavy
- # Positive = bottom-heavy
- "top_third": 8, # Count in top 33%
- "middle_third": 22, # Count in middle 33%
- "bottom_third": 17 # Count in bottom 33%
-}
-```
-
-##### Quadrant Distribution
-
-Document divided into 4 equal quadrants:
-
-```
-┌─────────────┬─────────────┐
-│ Q1 │ Q2 │
-│ (top-left) │ (top-right) │
-│ │ │
-├─────────────┼─────────────┤
-│ Q3 │ Q4 │
-│ (bot-left) │ (bot-right) │
-│ │ │
-└─────────────┴─────────────┘
-```
-
-```python
-{
- "Q1": 12, # Top-left
- "Q2": 15, # Top-right
- "Q3": 10, # Bottom-left
- "Q4": 10 # Bottom-right
-}
-```
-
-##### Size Distribution
-
-Elements categorized by area relative to image size:
-
-| Category | Threshold | Description | Typical Elements |
-|----------|-----------|-------------|------------------|
-| **tiny** | < 0.5% | Very small | Footnotes, page numbers |
-| **small** | 0.5% - 2% | Small | Captions, list items |
-| **medium** | 2% - 10% | Medium | Paragraphs, titles |
-| **large** | ≥ 10% | Large | Figures, tables |
-
-```python
-{
- "tiny": 5,
- "small": 15,
- "medium": 20,
- "large": 7
-}
-```
-
-##### Density Metrics
-
-```python
-{
- "average_nearest_neighbor_distance": 234.56, # pixels
- "spatial_clustering_score": 0.67 # 0-1 scale
- # Higher = more clustered
- # Lower = more dispersed
-}
-```
-
-**Clustering Interpretation:**
-- **0.0 - 0.3**: Highly dispersed (scattered layout)
-- **0.3 - 0.7**: Moderately clustered (typical documents)
-- **0.7 - 1.0**: Highly clustered (dense regions)
-
----
-
-### Quality Metrics (utils/metrics/quality.py)
-
-#### `calculate_layout_complexity(detections, img_width, img_height)`
-
-Quantifies document structure complexity.
-
-**Complexity Score Formula:**
-```
-score = (class_diversity / 20) × 30 # Max 20 unique classes
- + min(detections / 50, 1) × 30 # Detection count normalized
- + min(density / 10, 1) × 20 # Elements per megapixel
- + (1 - min(avg_dist / 500, 1)) × 20 # Spatial clustering
-```
-
-**Complexity Levels:**
-
-| Level | Score Range | Description | Examples |
-|-------|-------------|-------------|----------|
-| **simple** | < 30 | Basic layout | Single-column text, minimal structure |
-| **moderate** | 30 - 60 | Average complexity | Multi-column, some figures/tables |
-| **complex** | ≥ 60 | Complex layout | Academic papers, magazines, forms |
-
-**Layout Characteristics:**
-```python
-{
- "class_diversity": 12, # Number of unique classes
- "total_elements": 47, # Total detections
- "detection_density": 5.41, # Elements per megapixel
- "average_element_distance": 234.56, # Mean nearest neighbor distance
- "complexity_score": 58.23, # Computed score
- "complexity_level": "moderate", # simple/moderate/complex
- "layout_characteristics": {
- "is_dense": True, # density > 5 elements/megapixel
- "is_diverse": True, # unique_classes ≥ 10
- "is_structured": False # avg_distance < 200 pixels
- }
-}
-```
-
-#### `calculate_quality_metrics(detections, img_width, img_height)`
-
-Detection quality assessment.
-
-##### Overlap Analysis
-
-Measures how many detections overlap (potential errors).
-
-```python
-{
- "total_overlapping_pairs": 5, # Number of pairs with IoU > 0
- "overlap_percentage": 10.64, # % of detections involved in overlaps
- "average_iou": 0.1234 # Mean IoU of overlapping pairs
-}
-```
-
-**Overlap Interpretation:**
-- **< 5%**: Excellent (minimal overlaps)
-- **5% - 15%**: Good (acceptable overlaps)
-- **15% - 30%**: Fair (moderate overlaps)
-- **> 30%**: Poor (excessive overlaps)
-
-##### Size Consistency
-
-Measures variability in detection sizes.
-
-```python
-{
- "coefficient_of_variation": 0.876, # std/mean of areas
- "consistency_level": "medium" # high/medium/low
-}
-```
-
-**Consistency Levels:**
-- **high** (CV < 0.5): Very consistent sizes
-- **medium** (0.5 ≤ CV < 1.0): Moderate variation
-- **low** (CV ≥ 1.0): High variation
-
-##### Detection Quality Score
-
-Overall quality assessment combining overlap and size consistency.
-
-```
-score = (1 - min(overlap_% / 100, 1)) × 50 +
- (1 - min(size_cv, 1)) × 50
-```
-
-**Score Interpretation:**
-- **80-100**: Excellent quality
-- **60-80**: Good quality
-- **40-60**: Fair quality
-- **< 40**: Poor quality
-
----
-
-## 📈 Visualization Engine
-
-### services/visualization.py
-
-Generates 8 distinct chart types providing comprehensive visual analysis.
-
-### Chart Types
-
-#### 1. Class Distribution Bar Chart
-
-**Purpose**: Show count of detections per class
-
-**Features:**
-- Vertical bars sorted by count (descending)
-- Value labels on top of each bar
-- Rotated x-axis labels (45°) for readability
-- Grid lines for easy counting
-- Color: Steel blue
-
-**Code:**
-```python
-fig, ax = plt.subplots(figsize=(12, 6))
-class_counts = sorted(class_metrics.items(), key=lambda x: x[1]['count'], reverse=True)
-classes = [item[0] for item in class_counts]
-counts = [item[1]['count'] for item in class_counts]
-
-ax.bar(classes, counts, color='steelblue')
-ax.set_xlabel('Class')
-ax.set_ylabel('Count')
-ax.set_title('Detection Count by Class')
-plt.xticks(rotation=45, ha='right')
-plt.tight_layout()
-```
-
-#### 2. Confidence Distribution Histogram
-
-**Purpose**: Show distribution of confidence scores
-
-**Features:**
-- 20 bins spanning 0.0 to 1.0
-- Red dashed line for mean
-- Orange dashed line for median
-- Legend with exact values
-- Grid for readability
-
-**Code:**
-```python
-confidences = [d['confidence'] for d in detections]
-fig, ax = plt.subplots(figsize=(10, 6))
-ax.hist(confidences, bins=20, color='steelblue', alpha=0.7, edgecolor='black')
-ax.axvline(np.mean(confidences), color='red', linestyle='--', label=f'Mean: {np.mean(confidences):.3f}')
-ax.axvline(np.median(confidences), color='orange', linestyle='--', label=f'Median: {np.median(confidences):.3f}')
-ax.set_xlabel('Confidence Score')
-ax.set_ylabel('Frequency')
-ax.set_title('Confidence Distribution')
-ax.legend()
-plt.tight_layout()
-```
-
-#### 3. Spatial Distribution Heatmap
-
-**Purpose**: Visualize where detections are concentrated
-
-**Features:**
-- 2D histogram (50x50 bins)
-- YlOrRd colormap (yellow → orange → red)
-- Colorbar showing density
-- Axes in pixel coordinates
-
-**Code:**
-```python
-x_coords = [d['bbox']['center_x'] for d in detections]
-y_coords = [d['bbox']['center_y'] for d in detections]
-
-fig, ax = plt.subplots(figsize=(10, 8))
-h = ax.hist2d(x_coords, y_coords, bins=50, cmap='YlOrRd')
-plt.colorbar(h[3], ax=ax, label='Density')
-ax.set_xlabel('X Coordinate (pixels)')
-ax.set_ylabel('Y Coordinate (pixels)')
-ax.set_title('Spatial Distribution Heatmap')
-plt.tight_layout()
-```
-
-#### 4. Confidence by Class Box Plot
-
-**Purpose**: Compare confidence distributions across classes
-
-**Features:**
-- Box plot for top 10 classes (by count)
-- Shows median, Q1, Q3, outliers
-- Sample sizes in x-axis labels
-- Light blue boxes with black edges
-
-**Code:**
-```python
-top_classes = sorted(class_metrics.items(), key=lambda x: x[1]['count'], reverse=True)[:10]
-data_to_plot = []
-labels = []
-
-for class_name, metrics in top_classes:
- class_detections = [d for d in detections if d['class_name'] == class_name]
- confidences = [d['confidence'] for d in class_detections]
- data_to_plot.append(confidences)
- labels.append(f"{class_name}\n(n={len(confidences)})")
-
-fig, ax = plt.subplots(figsize=(12, 6))
-bp = ax.boxplot(data_to_plot, labels=labels, patch_artist=True)
-for patch in bp['boxes']:
- patch.set_facecolor('lightblue')
-ax.set_ylabel('Confidence Score')
-ax.set_title('Confidence Distribution by Class (Top 10)')
-plt.xticks(rotation=45, ha='right')
-plt.tight_layout()
-```
-
-#### 5. Area vs Confidence Scatter Plot
-
-**Purpose**: Examine relationship between size and confidence
-
-**Features:**
-- Each point = one detection
-- Color-coded by confidence (viridis colormap)
-- Colorbar showing confidence scale
-- Logarithmic x-axis for better spread
-
-**Code:**
-```python
-areas = [d['area'] for d in detections]
-confidences = [d['confidence'] for d in detections]
-
-fig, ax = plt.subplots(figsize=(10, 6))
-scatter = ax.scatter(areas, confidences, c=confidences, cmap='viridis', alpha=0.6)
-plt.colorbar(scatter, label='Confidence')
-ax.set_xlabel('Detection Area (pixels²)')
-ax.set_ylabel('Confidence Score')
-ax.set_title('Area vs Confidence')
-ax.set_xscale('log')
-ax.grid(True, alpha=0.3)
-plt.tight_layout()
-```
-
-#### 6. Quadrant Distribution Pie Chart
-
-**Purpose**: Show spatial distribution by quadrant
-
-**Features:**
-- 4 segments (Q1, Q2, Q3, Q4)
-- Percentage labels with counts
-- Distinct colors per quadrant
-- Automatic percentage calculation
-
-**Code:**
-```python
-quadrants = spatial_metrics['quadrant_distribution']
-labels = [f'Q{i}\n({quadrants[f"Q{i}"]} elements)' for i in range(1, 5)]
-sizes = [quadrants[f"Q{i}"] for i in range(1, 5)]
-colors = ['#ff9999', '#66b3ff', '#99ff99', '#ffcc99']
-
-fig, ax = plt.subplots(figsize=(8, 8))
-ax.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=90)
-ax.set_title('Spatial Distribution by Quadrant')
-plt.tight_layout()
-```
-
-#### 7. Size Distribution Bar Chart
-
-**Purpose**: Show distribution of detection sizes
-
-**Features:**
-- 4 categories (tiny, small, medium, large)
-- Distinct color per category
-- Value labels on bars
-- Horizontal grid lines
-
-**Code:**
-```python
-size_dist = spatial_metrics['size_distribution']
-categories = ['tiny', 'small', 'medium', 'large']
-counts = [size_dist[cat] for cat in categories]
-colors_map = ['#ff6b6b', '#ffd93d', '#6bcf7f', '#4d96ff']
-
-fig, ax = plt.subplots(figsize=(10, 6))
-bars = ax.bar(categories, counts, color=colors_map)
-for bar in bars:
- height = bar.get_height()
- ax.text(bar.get_x() + bar.get_width()/2., height,
- f'{int(height)}', ha='center', va='bottom')
-ax.set_xlabel('Size Category')
-ax.set_ylabel('Count')
-ax.set_title('Size Distribution')
-ax.grid(axis='y', alpha=0.3)
-plt.tight_layout()
-```
-
-#### 8. Top Classes by Average Confidence
-
-**Purpose**: Identify most confidently detected classes
-
-**Features:**
-- Horizontal bars (easier to read class names)
-- Top 15 classes only
-- Sorted by average confidence
-- Value labels at bar ends
-- Coral color scheme
-
-**Code:**
-```python
-class_conf = [(name, metrics['confidence_stats']['mean'])
- for name, metrics in class_metrics.items()]
-class_conf_sorted = sorted(class_conf, key=lambda x: x[1], reverse=True)[:15]
-classes = [item[0] for item in class_conf_sorted]
-confidences = [item[1] for item in class_conf_sorted]
-
-fig, ax = plt.subplots(figsize=(12, 8))
-bars = ax.barh(classes, confidences, color='coral')
-for i, bar in enumerate(bars):
- width = bar.get_width()
- ax.text(width, bar.get_y() + bar.get_height()/2.,
- f'{confidences[i]:.3f}', ha='left', va='center', fontsize=9)
-ax.set_xlabel('Average Confidence')
-ax.set_title('Top 15 Classes by Average Confidence')
-ax.invert_yaxis()
-plt.tight_layout()
-```
-
-### Technical Implementation
-
-```python
-def generate_comprehensive_visualizations(
- detections: List[dict],
- class_metrics: dict,
- confidence_metrics: dict,
- spatial_metrics: dict,
- img_width: int,
- img_height: int
-) -> dict:
- """
- Generate all visualization types.
-
- Args:
- detections: List of detection dictionaries
- class_metrics: Per-class statistics
- confidence_metrics: Confidence distribution data
- spatial_metrics: Spatial analysis results
- img_width: Original image width
- img_height: Original image height
-
- Returns:
- Dictionary mapping visualization names to base64-encoded PNG images
- """
- visualizations = {}
-
- # Set matplotlib to non-interactive backend
- import matplotlib
- matplotlib.use('Agg')
- import matplotlib.pyplot as plt
-
- # Each visualization wrapped in try-except for isolation
- viz_functions = {
- 'class_distribution': lambda: generate_class_distribution(class_metrics),
- 'confidence_distribution': lambda: generate_confidence_histogram(detections),
- 'spatial_heatmap': lambda: generate_spatial_heatmap(detections),
- 'confidence_by_class': lambda: generate_confidence_boxplot(detections, class_metrics),
- 'area_vs_confidence': lambda: generate_area_scatter(detections),
- 'quadrant_distribution': lambda: generate_quadrant_pie(spatial_metrics),
- 'size_distribution': lambda: generate_size_bars(spatial_metrics),
- 'top_classes_confidence': lambda: generate_top_confidence(class_metrics)
- }
-
- for viz_name, viz_func in viz_functions.items():
- try:
- fig = viz_func()
- visualizations[viz_name] = fig_to_base64(fig)
- plt.close(fig) # CRITICAL: Prevents memory leaks
- except Exception as e:
- print(f"Error generating {viz_name}: {e}")
- visualizations[viz_name] = None
-
- return visualizations
-```
-
-### Base64 Encoding
-
-```python
-from io import BytesIO
-import base64
-
-def fig_to_base64(fig) -> str:
- """
- Convert matplotlib figure to base64 data URI.
-
- Args:
- fig: Matplotlib figure object
-
- Returns:
- Base64-encoded string with data URI prefix
- """
- buffer = BytesIO()
- fig.savefig(buffer, format='png', dpi=100, bbox_inches='tight')
- buffer.seek(0)
- image_base64 = base64.b64encode(buffer.read()).decode()
- buffer.close()
- return f"data:image/png;base64,{image_base64}"
-```
-
-### Usage in HTML/Frontend
-
-```html
-
-
-
-
Class Distribution
-

-
-
-
-
Confidence Distribution
-

-
-
-
-
Spatial Heatmap
-

-
-
-
-
-```
-
----
-
-## 🔧 Services Layer
-
-### services/detection.py
-
-Core detection logic and result processing.
-
-#### `process_detections(result, score_thr=0.3)`
-
-Converts raw MMDetection output to structured format.
-
-**Input**: Raw model result (list of numpy arrays per class)
-
-**Processing Steps:**
-1. Iterate through each class's detection array
-2. Filter detections by confidence threshold
-3. Extract bounding box coordinates
-4. Calculate derived metrics (area, aspect ratio, center point)
-5. Format as structured dictionaries
-6. Sort by confidence (descending)
-
-**Output:**
-```python
-[
- {
- "class_id": 0,
- "class_name": "paragraph",
- "bbox": {
- "x1": 100.5,
- "y1": 200.3,
- "x2": 500.8,
- "y2": 350.2,
- "width": 400.3,
- "height": 149.9,
- "center_x": 300.65,
- "center_y": 275.25
- },
- "confidence": 0.9234,
- "area": 60005.0,
- "aspect_ratio": 2.67
- },
- // ... more detections sorted by confidence
-]
-```
-
----
-
-### services/processing.py
-
-Result aggregation and persistence.
-
-#### `aggregate_results(...)`
-
-Assembles all analysis components into the final response.
-
-**Parameters:**
-- `detections`: Processed detection list
-- `core_metrics`: Basic statistics
-- `rodla_metrics`: Robustness estimates
-- `spatial_metrics`: Spatial analysis
-- `class_metrics`: Per-class stats
-- `confidence_metrics`: Confidence distribution
-- `robustness_indicators`: Stability measures
-- `layout_complexity`: Complexity assessment
-- `quality_metrics`: Quality scores
-- `visualizations`: Base64 charts
-- `interpretation`: Human-readable insights
-- `file_info`: Image metadata
-- `config`: Detection configuration
-
-**Returns**: Complete JSON response dictionary
-
-#### `save_results(results, filename, output_dir)`
-
-Persists results to disk with optimizations.
-
-**Process:**
-1. Remove large base64 visualizations from JSON
-2. Convert numpy types to Python native types
-3. Save JSON with pretty formatting
-4. Optionally save visualizations as separate PNG files
-5. Return path to saved JSON
-
----
-
-### 🆕 services/perturbation.py (NEW)
-
-Business logic for image perturbations.
-
-#### `perturb_image_service(image, perturbations, filename, ...)`
-
-Main service function for applying perturbations.
-
-**Parameters:**
-```python
-image: np.ndarray # Input BGR image
-perturbations: List[dict] # Perturbation configurations
-filename: str # Original filename
-save_image: bool = False # Save perturbed image
-return_base64: bool = False # Return as base64
-background_folder: str = None # Background images path
-```
-
-**Returns:**
-```python
-(response_dict, perturbed_image)
-```
-
-**Process:**
-1. Store original image shape
-2. Apply perturbations sequentially via `apply_multiple_perturbations()`
-3. Track success/failure for each perturbation
-4. Optionally convert to base64
-5. Optionally save to disk with descriptive filename
-6. Return results and perturbed image
-
-#### `image_to_base64(image)`
-
-Converts OpenCV image to base64 data URI.
-
-**Process:**
-1. Encode image to PNG format
-2. Convert bytes to base64 string
-3. Add data URI prefix
-4. Return complete data URI
-
-#### `save_perturbed_image(image, original_filename, perturbations)`
-
-Saves perturbed image with descriptive filename.
-
-**Filename Format:**
-```
-{original_name}_pert_{pert1type}{degree}_{pert2type}{degree}_{timestamp}.png
-```
-
-**Example:**
-```
-document_pert_defocus2_speckle1_rotation1_20241115_143022.png
-```
-
----
-
-### services/interpretation.py
-
-Human-readable insight generation.
-
-#### `generate_comprehensive_interpretation(...)`
-
-Creates natural language analysis of results.
-
-**Output Sections:**
-
-| Section | Type | Description |
-|---------|------|-------------|
-| `overview` | string | High-level summary paragraph |
-| `top_elements` | string | Most common element description |
-| `rodla_analysis` | string | Robustness assessment |
-| `layout_complexity` | string | Complexity analysis |
-| `key_findings` | list | Important observations (bullet points) |
-| `perturbation_assessment` | string | Perturbation effect analysis |
-| `recommendations` | list | Actionable suggestions |
-| `confidence_summary` | dict | Confidence level breakdown |
-
-**Example Output:**
-```python
-{
- "overview": """Document Analysis Summary:
-Detected 47 layout elements across 12 different classes.
-The model achieved an average confidence of 78.2%,
-indicating medium certainty in predictions. The detected
-elements cover 68.5% of the document area.""",
-
- "top_elements": """The most common elements are:
-- Paragraph (15 occurrences, 31.9%)
-- Title (8 occurrences, 17.0%)
-- Figure (6 occurrences, 12.8%)""",
-
- "rodla_analysis": """RoDLA Robustness Analysis:
-Estimated mPE: 18.45 (low perturbation effect)
-Estimated mRD: 87.32 (excellent robustness)
-The model shows excellent robustness with minimal
-predicted degradation under perturbations.""",
-
- "layout_complexity": """Layout Complexity: Moderate
-The document has moderate structural complexity with
-12 different element types and 47 total elements.
-Detection density: 5.41 elements per megapixel.""",
-
- "key_findings": [
- "✓ Excellent detection confidence - model is highly certain",
- "✓ High document coverage - most of page contains elements",
- "ℹ Complex document structure with diverse element types",
- "⚠ Some overlapping detections detected (10.6%)"
- ],
-
- "perturbation_assessment": """Based on confidence
-variability, the model is expected to maintain good
-performance under mild perturbations.""",
-
- "recommendations": [
- "No specific recommendations - detection quality is good"
- ],
-
- "confidence_summary": {
- "very_high_count": 15,
- "high_count": 20,
- "medium_count": 10,
- "low_count": 2,
- "very_low_count": 0
- }
-}
-```
-
----
-
-## 🛠️ Utilities Reference
-
-### utils/helpers.py
-
-General-purpose mathematical and utility functions.
-
-#### Mathematical Functions
-
-##### `calculate_skewness(data)`
-
-Measures distribution asymmetry.
-
-**Formula:** `mean(((x - μ) / σ)³)`
-
-**Interpretation:**
-- **Negative**: Left-skewed (tail on left)
-- **Zero**: Symmetric
-- **Positive**: Right-skewed (tail on right)
-
-##### `calculate_entropy(values)`
-
-Measures information content/uncertainty.
-
-**Formula:** `-Σ(p × log₂(p))`
-
-**Range:** 0 (certain) to log₂(n) (maximum uncertainty)
-
-##### `calculate_avg_nn_distance(xs, ys)`
-
-Average distance to nearest neighbor.
-
-**Process:**
-1. For each point, find closest other point
-2. Calculate Euclidean distance
-3. Return mean of all distances
-
-##### `calculate_clustering_score(xs, ys)`
-
-Spatial clustering measure.
-
-**Formula:** `1 - (std / mean)` of nearest neighbor distances
-
-**Range:** 0-1 (higher = more clustered)
-
-##### `calculate_iou(bbox1, bbox2)`
-
-Intersection over Union for bounding boxes.
-
-**Formula:** `intersection_area / union_area`
-
-**Range:** 0 (no overlap) to 1 (complete overlap)
-
-#### Utility Functions
-
-##### `calculate_detection_overlaps(detections)`
-
-Finds all overlapping detection pairs.
-
-**Returns:**
-```python
-{
- 'count': int, # Number of overlapping pairs
- 'percentage': float, # % of detections with overlaps
- 'avg_iou': float # Mean IoU of overlaps
-}
-```
-
----
-
-### utils/serialization.py
-
-JSON conversion utilities for numpy types.
-
-#### `convert_to_json_serializable(obj)`
-
-Recursively converts numpy types to Python native types.
-
-**Conversions:**
-
-| NumPy Type | Python Type | Example |
-|------------|-------------|---------|
-| `np.integer` (int8, int32, etc.) | `int` | 42 |
-| `np.floating` (float32, float64, etc.) | `float` | 3.14 |
-| `np.ndarray` | `list` | [1, 2, 3] |
-| `np.bool_` | `bool` | True |
-
-**Implementation:**
-```python
-def convert_to_json_serializable(obj):
- """
- Recursively convert numpy types for JSON serialization.
-
- Handles:
- - Dictionaries (recursive on values)
- - Lists (recursive on items)
- - NumPy scalars and arrays
- - Native Python types (pass-through)
- """
- if isinstance(obj, dict):
- return {k: convert_to_json_serializable(v) for k, v in obj.items()}
- elif isinstance(obj, list):
- return [convert_to_json_serializable(item) for item in obj]
- elif isinstance(obj, np.integer):
- return int(obj)
- elif isinstance(obj, np.floating):
- return float(obj)
- elif isinstance(obj, np.ndarray):
- return obj.tolist()
- elif isinstance(obj, np.bool_):
- return bool(obj)
- return obj
-```
-
----
-
-## ⚠️ Error Handling
-
-### Exception Hierarchy
-
-```
-Exception
-├── HTTPException (FastAPI)
-│ ├── 400 Bad Request
-│ │ ├── Invalid file type
-│ │ ├── Invalid perturbation config (NEW)
-│ │ └── Missing required parameters (NEW)
-│ └── 500 Internal Server Error
-│ ├── Model not loaded
-│ ├── Inference failed
-│ ├── Perturbation failed (NEW)
-│ └── Processing error
-└── Standard Exceptions
- ├── FileNotFoundError
- ├── ValueError
- ├── RuntimeError
- └── ImportError (NEW - perturbation dependencies)
-```
-
-### Error Handling Strategy
-
-#### API Level
-
-```python
-@app.post("/api/detect")
-async def detect_objects(...):
- tmp_path = None
-
- try:
- # Validate file type
- if not file.content_type.startswith('image/'):
- raise HTTPException(400, "File must be an image")
-
- # Main processing logic
- ...
-
- except HTTPException:
- # Re-raise HTTP exceptions unchanged
- if tmp_path and os.path.exists(tmp_path):
- os.unlink(tmp_path)
- raise
-
- except Exception as e:
- # Handle unexpected errors
- if tmp_path and os.path.exists(tmp_path):
- os.unlink(tmp_path)
-
- # Log full traceback
- import traceback
- traceback.print_exc()
-
- # Return structured error response
- return JSONResponse(
- {"success": False, "error": str(e)},
- status_code=500
- )
-
- finally:
- # Always cleanup temp files
- if tmp_path and os.path.exists(tmp_path):
- os.unlink(tmp_path)
-```
-
-#### Perturbation Level (NEW)
-
-```python
-def apply_perturbation(image, perturbation_type, degree):
- """Apply single perturbation with error handling."""
- try:
- # Validate inputs
- if perturbation_type not in ALL_PERTURBATIONS:
- raise ValueError(f"Invalid perturbation: {perturbation_type}")
-
- if not 1 <= degree <= 3:
- raise ValueError(f"Degree must be 1-3, got: {degree}")
-
- # Apply perturbation
- result_image = ...
-
- return result_image, True, "Success"
-
- except Exception as e:
- # Log error but don't crash
- print(f"Error in {perturbation_type}: {e}")
- return image, False, str(e)
-```
-
-### Visualization Error Isolation
-
-Each visualization wrapped individually to prevent cascade failures:
-
-```python
-for viz_name, viz_func in visualization_functions.items():
- try:
- visualizations[viz_name] = viz_func()
- except Exception as e:
- print(f"Error generating {viz_name}: {e}")
- visualizations[viz_name] = None
- # Continue with other visualizations
-```
-
-### Resource Cleanup
-
-Guaranteed cleanup using try-finally:
-
-```python
-def process_image(file):
- tmp_path = None
- perturbed_tmp_path = None
-
- try:
- # Processing logic
- ...
- finally:
- # Always cleanup temp files
- for path in [tmp_path, perturbed_tmp_path]:
- if path and os.path.exists(path):
- os.unlink(path)
-```
-
----
-
-## ⚡ Performance Optimization
-
-### GPU Memory Management
-
-```python
-# At startup - clear GPU cache
-if torch.cuda.is_available():
- torch.cuda.empty_cache()
- gc.collect()
- print(f"GPU Memory: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GB")
-
-# During inference - monitor usage
-def detect_with_memory_tracking(model, image):
- initial_memory = torch.cuda.memory_allocated(0)
-
- result = model(image)
-
- peak_memory = torch.cuda.max_memory_allocated(0)
- print(f"Memory used: {(peak_memory - initial_memory) / 1024**2:.2f} MB")
-
- return result
-```
-
-### Memory-Efficient Visualizations
-
-```python
-def generate_chart():
- fig, ax = plt.subplots()
-
- # ... generate chart ...
-
- # Convert to base64
- base64_str = fig_to_base64(fig)
-
- # CRITICAL: Close figure to free memory
- plt.close(fig)
-
- return base64_str
-
-# After generating all charts
-plt.close('all') # Nuclear option if needed
-```
-
-### Response Size Optimization
-
-```python
-def save_results(results, filename):
- # Remove large base64 images from saved JSON
- json_results = {
- k: v for k, v in results.items()
- if k != "visualizations"
- }
-
- # Save lightweight JSON
- json_path = output_dir / f"results_{filename}.json"
- with open(json_path, 'w') as f:
- json.dump(json_results, f, indent=2)
-
- # Save visualizations as separate files
- for viz_name, viz_data in results['visualizations'].items():
- if viz_data:
- save_visualization_to_file(viz_data, f"{filename}_{viz_name}.png")
-```
-
-### Lazy Model Loading
-
-```python
-# Model loaded once at startup, reused for all requests
-@app.on_event("startup")
-async def startup_event():
- global model
- model = init_detector(config, weights, device='cuda')
- print("Model loaded and ready")
-
-# Each request uses the same model instance
-@app.post("/api/detect")
-async def detect_objects(model=Depends(get_model)):
- # No model loading overhead per request
- result = inference_detector(model, image)
-```
-
-### Perturbation Optimization (NEW)
-
-```python
-def apply_multiple_perturbations(image, perturbations):
- """Apply perturbations efficiently."""
- current_image = image.copy() # One copy at start
-
- for pert in perturbations:
- # Apply in-place when possible
- current_image, success, msg = apply_perturbation(
- current_image,
- pert['type'],
- pert['degree']
- )
-
- if not success:
- # Log but continue with remaining perturbations
- print(f"Perturbation failed: {msg}")
-
- return current_image, results
-```
-
-### Performance Benchmarks
-
-| Operation | Time (GPU) | Time (CPU) | Memory (GPU) |
-|-----------|------------|------------|--------------|
-| Model loading | 10-15s | 20-30s | 6-8 GB |
-| Single inference | 0.3-0.5s | 2-5s | +2-3 GB |
-| Metrics calculation | 0.1-0.2s | 0.1-0.2s | Minimal |
-| Visualization (all 8) | 1-2s | 1-2s | Minimal |
-| **Perturbation (single)** | **0.1-0.5s** | **0.1-0.5s** | **Minimal** |
-| **Perturbation (3x)** | **0.3-1.5s** | **0.3-1.5s** | **Minimal** |
-| **Total (detect only)** | **1.5-3s** | **4-8s** | **8-11 GB** |
-| **Total (pert + detect)** | **2-4.5s** | **4.5-9.5s** | **8-11 GB** |
-
----
-
-## 🔒 Security Considerations
-
-### Current Security Status
-
-| Aspect | Status | Risk | Recommendation |
-|--------|--------|------|----------------|
-| Authentication | ❌ None | High | Add API key auth |
-| CORS | ⚠️ Permissive | Medium | Restrict origins |
-| Rate Limiting | ❌ None | Medium | Add throttling |
-| Input Validation | ⚠️ Basic | Low | Add size limits |
-| Path Handling | ⚠️ Hardcoded | Low | Use env vars |
-| **Perturbation Files** | **⚠️ No validation** | **Medium** | **Validate file access** |
-| **Background Folder** | **⚠️ User-provided** | **High** | **Sanitize paths** |
-
-### Recommended Security Enhancements
-
-#### 1. API Key Authentication
-
-```python
-from fastapi import Security, HTTPException
-from fastapi.security.api_key import APIKeyHeader
-import os
-
-API_KEY = os.environ.get("RODLA_API_KEY")
-api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)
-
-async def verify_api_key(api_key: str = Security(api_key_header)):
- if not api_key or api_key != API_KEY:
- raise HTTPException(
- status_code=403,
- detail="Invalid or missing API key"
- )
- return api_key
-
-# Apply to all endpoints
-@app.post("/api/detect")
-async def detect_objects(
- ...,
- api_key: str = Depends(verify_api_key)
-):
- ...
-```
-
-#### 2. Rate Limiting
-
-```python
-from slowapi import Limiter
-from slowapi.util import get_remote_address
-from slowapi.errors import RateLimitExceeded
-
-limiter = Limiter(key_func=get_remote_address)
-app.state.limiter = limiter
-
-@app.exception_handler(RateLimitExceeded)
-async def rate_limit_handler(request, exc):
- return JSONResponse(
- status_code=429,
- content={"error": "Rate limit exceeded"}
- )
-
-@app.post("/api/detect")
-@limiter.limit("10/minute")
-async def detect_objects(...):
- ...
-
-@app.post("/api/perturb")
-@limiter.limit("20/minute") # Higher limit for perturbation-only
-async def perturb_image(...):
- ...
-```
-
-#### 3. File Size Limits
-
-```python
-MAX_FILE_SIZE = 10 * 1024 * 1024 # 10MB
-
-@app.post("/api/detect")
-async def detect_objects(file: UploadFile = File(...)):
- # Read file content
- content = await file.read()
-
- # Check size
- if len(content) > MAX_FILE_SIZE:
- raise HTTPException(
- status_code=413,
- detail=f"File too large. Maximum size: {MAX_FILE_SIZE / 1024**2:.1f}MB"
- )
-
- # Continue processing
- ...
-```
-
-#### 4. Path Sanitization (NEW - Critical for Perturbations)
-
-```python
-from pathlib import Path
-import os
-
-def sanitize_path(user_path: str, base_dir: Path) -> Path:
- """
- Sanitize user-provided paths to prevent directory traversal.
-
- Args:
- user_path: User-provided path string
- base_dir: Allowed base directory
-
- Returns:
- Sanitized absolute path
-
- Raises:
- ValueError: If path escapes base directory
- """
- # Resolve to absolute path
- requested_path = Path(user_path).resolve()
- base_dir = base_dir.resolve()
-
- # Check if path is within base directory
- try:
- requested_path.relative_to(base_dir)
- except ValueError:
- raise ValueError(
- f"Access denied: Path outside allowed directory"
- )
-
- # Check if path exists
- if not requested_path.exists():
- raise ValueError(f"Path does not exist: {requested_path}")
-
- return requested_path
-
-# Usage in perturbation endpoint
-@app.post("/api/perturb")
-async def perturb_image(
- background_folder: Optional[str] = Form(None)
-):
- if background_folder:
- try:
- # Sanitize background folder path
- safe_path = sanitize_path(
- background_folder,
- REPO_ROOT / "perturbation"
- )
- except ValueError as e:
- raise HTTPException(400, str(e))
-```
-
-#### 5. Input Validation (NEW - Perturbation Configs)
-
-```python
-from pydantic import BaseModel, validator
-
-class PerturbationConfig(BaseModel):
- type: str
- degree: int
-
- @validator('type')
- def validate_type(cls, v):
- if v not in ALL_PERTURBATIONS:
- raise ValueError(
- f"Invalid perturbation type. "
- f"Must be one of: {ALL_PERTURBATIONS}"
- )
- return v
-
- @validator('degree')
- def validate_degree(cls, v):
- if not 1 <= v <= 3:
- raise ValueError("Degree must be 1, 2, or 3")
- return v
-
-# Validate perturbation array
-def validate_perturbations(perturbations: List[dict]) -> List[PerturbationConfig]:
- if len(perturbations) > MAX_PERTURBATIONS_PER_REQUEST:
- raise HTTPException(
- 400,
- f"Maximum {MAX_PERTURBATIONS_PER_REQUEST} perturbations allowed"
- )
-
- validated = []
- for pert in perturbations:
- try:
- validated.append(PerturbationConfig(**pert))
- except Exception as e:
- raise HTTPException(400, f"Invalid perturbation config: {e}")
-
- return validated
-```
-
-#### 6. Restricted CORS
-
-```python
-# Development
-CORS_ORIGINS = ["http://localhost:3000", "http://localhost:8080"]
-
-# Production
-CORS_ORIGINS = ["https://yourdomain.com", "https://app.yourdomain.com"]
-
-app.add_middleware(
- CORSMiddleware,
- allow_origins=CORS_ORIGINS,
- allow_credentials=True,
- allow_methods=["GET", "POST"],
- allow_headers=["X-API-Key", "Content-Type"],
-)
-```
-
----
-
-## 🧪 Testing
-
-### Test Structure
-
-```
-tests/
-├── __init__.py
-├── conftest.py # Pytest fixtures
-├── test_api/
-│ ├── test_routes.py # Endpoint tests
-│ ├── test_perturbation_routes.py # 🆕 Perturbation endpoint tests
-│ └── test_schemas.py # Pydantic model tests
-├── test_services/
-│ ├── test_detection.py # Detection logic tests
-│ ├── test_processing.py # Processing tests
-│ ├── test_visualization.py # Chart generation tests
-│ └── test_perturbation.py # 🆕 Perturbation service tests
-├── test_perturbations/ # 🆕 NEW
-│ ├── test_blur.py # Blur perturbation tests
-│ ├── test_noise.py # Noise perturbation tests
-│ ├── test_content.py # Content perturbation tests
-│ ├── test_inconsistency.py # Inconsistency tests
-│ ├── test_spatial.py # Spatial transform tests
-│ └── test_apply.py # Orchestration tests
-├── test_utils/
-│ ├── test_helpers.py # Helper function tests
-│ ├── test_metrics.py # Metrics calculation tests
-│ └── test_serialization.py # Serialization tests
-└── test_integration/
- ├── test_full_pipeline.py # End-to-end tests
- └── test_perturbation_pipeline.py # 🆕 Perturbation E2E tests
-```
-
-### Running Tests
-
-```bash
-# Run all tests
-pytest
-
-# Run with coverage
-pytest --cov=. --cov-report=html
-
-# Run specific test file
-pytest tests/test_perturbations/test_blur.py
-
-# Run specific test
-pytest tests/test_perturbations/test_blur.py::test_defocus_degree_1
-
-# Run with verbose output
-pytest -v
-
-# Run only fast tests (no model loading)
-pytest -m "not slow"
-
-# Run only perturbation tests
-pytest tests/test_perturbations/
-
-# Run with print output
-pytest -s
-```
-
-### Example Test Cases
-
-#### Testing Perturbation Functions
-
-```python
-# tests/test_perturbations/test_blur.py
-
-import pytest
-import numpy as np
-import cv2
-from perturbations.blur import apply_defocus, apply_vibration
-
-class TestDefocus:
- @pytest.fixture
- def sample_image(self):
- """Create a test image."""
- return np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)
-
- def test_defocus_degree_0(self, sample_image):
- """Degree 0 should return unchanged image."""
- result = apply_defocus(sample_image, degree=0)
- np.testing.assert_array_equal(result, sample_image)
-
- def test_defocus_degree_1(self, sample_image):
- """Degree 1 should apply mild blur."""
- result = apply_defocus(sample_image, degree=1)
- assert result.shape == sample_image.shape
- assert not np.array_equal(result, sample_image)
-
- def test_defocus_degree_3(self, sample_image):
- """Degree 3 should apply stronger blur than degree 1."""
- result_1 = apply_defocus(sample_image, degree=1)
- result_3 = apply_defocus(sample_image, degree=3)
-
- # Degree 3 should be more different from original
- diff_1 = np.sum(np.abs(sample_image.astype(float) - result_1.astype(float)))
- diff_3 = np.sum(np.abs(sample_image.astype(float) - result_3.astype(float)))
- assert diff_3 > diff_1
-
- def test_defocus_saves_file(self, sample_image, tmp_path):
- """Test saving to file."""
- save_path = tmp_path / "test_defocus.jpg"
- result = apply_defocus(sample_image, degree=2, save_path=str(save_path))
-
- assert save_path.exists()
- loaded = cv2.imread(str(save_path))
- np.testing.assert_array_equal(loaded, result)
-
-class TestVibration:
- @pytest.fixture
- def sample_image(self):
- return np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)
-
- def test_vibration_degree_0(self, sample_image):
- result = apply_vibration(sample_image, degree=0)
- np.testing.assert_array_equal(result, sample_image)
-
- def test_vibration_creates_motion_blur(self, sample_image):
- result = apply_vibration(sample_image, degree=2)
- assert result.shape == sample_image.shape
- assert not np.array_equal(result, sample_image)
-```
-
-#### Testing API Endpoints
-
-```python
-# tests/test_api/test_perturbation_routes.py
-
-import pytest
-from fastapi.testclient import TestClient
-import json
-from backend import app
-
-client = TestClient(app)
-
-class TestPerturbationInfo:
- def test_get_perturbation_info(self):
- """Test perturbation info endpoint."""
- response = client.get("/api/perturbations/info")
-
- assert response.status_code == 200
- data = response.json()
-
- assert "total_perturbations" in data
- assert data["total_perturbations"] == 12
- assert "categories" in data
- assert len(data["categories"]) == 5
-
-class TestPerturbEndpoint:
- @pytest.fixture
- def sample_image_file(self, tmp_path):
- """Create a sample image file."""
- import numpy as np
- import cv2
-
- img = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)
- img_path = tmp_path / "test.jpg"
- cv2.imwrite(str(img_path), img)
-
- return img_path
-
- def test_perturb_single_perturbation(self, sample_image_file):
- """Test applying single perturbation."""
- with open(sample_image_file, "rb") as f:
- response = client.post(
- "/api/perturb",
- files={"file": ("test.jpg", f, "image/jpeg")},
- data={
- "perturbations": json.dumps([{"type": "defocus", "degree": 2}]),
- "save_image": "false"
- }
- )
-
- assert response.status_code == 200
- data = response.json()
-
- assert data["success"] == True
- assert len(data["perturbations_applied"]) == 1
- assert data["perturbations_applied"][0]["type"] == "defocus"
-
- def test_perturb_multiple_perturbations(self, sample_image_file):
- """Test applying multiple perturbations."""
- perturbations = [
- {"type": "defocus", "degree": 2},
- {"type": "speckle", "degree": 1},
- {"type": "rotation", "degree": 1}
- ]
-
- with open(sample_image_file, "rb") as f:
- response = client.post(
- "/api/perturb",
- files={"file": ("test.jpg", f, "image/jpeg")},
- data={
- "perturbations": json.dumps(perturbations),
- "save_image": "false"
- }
- )
-
- assert response.status_code == 200
- data = response.json()
-
- assert data["success"] == True
- assert len(data["perturbations_applied"]) == 3
-
- def test_perturb_invalid_type(self, sample_image_file):
- """Test with invalid perturbation type."""
- with open(sample_image_file, "rb") as f:
- response = client.post(
- "/api/perturb",
- files={"file": ("test.jpg", f, "image/jpeg")},
- data={
- "perturbations": json.dumps([{"type": "invalid", "degree": 2}])
- }
- )
-
- # Should handle gracefully
- assert response.status_code in [400, 200]
-
- def test_perturb_too_many(self, sample_image_file):
- """Test exceeding max perturbations."""
- perturbations = [
- {"type": "defocus", "degree": i % 3 + 1}
- for i in range(10) # More than MAX_PERTURBATIONS_PER_REQUEST
- ]
-
- with open(sample_image_file, "rb") as f:
- response = client.post(
- "/api/perturb",
- files={"file": ("test.jpg", f, "image/jpeg")},
- data={
- "perturbations": json.dumps(perturbations)
- └──────────┘ └────┬─────┘
- │
- ┌───────────────────────────────────────────────┘
- ▼
-┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
-│ Raw │───▶│ Process │───▶│ Calculate│───▶│ Generate │
-│ Results │ │Detections│ │ Metrics │ │ Viz │
-└──────────┘ └──────────┘
-
-
-
-
-# Complete Multi-Image Batch Processing Implementation Plan for RoDLA API
-
-## 📋 Table of Contents
-
-1. [Executive Summary](#executive-summary)
-2. [Architecture Overview](#architecture-overview)
-3. [System Design Decisions](#system-design-decisions)
-4. [Component Specifications](#component-specifications)
-5. [Data Flow Diagrams](#data-flow-diagrams)
-6. [API Endpoint Specifications](#api-endpoint-specifications)
-7. [Database/Storage Schema](#databasestorage-schema)
-8. [Error Handling Strategy](#error-handling-strategy)
-9. [Implementation Checklist](#implementation-checklist)
-10. [Testing Strategy](#testing-strategy)
-11. [Migration Path](#migration-path)
-
----
-
-## 1. Executive Summary
-
-### 1.1 Project Goal
-Add multi-image batch processing capabilities to the existing RoDLA Document Layout Analysis API while maintaining backward compatibility with single-image processing.
-
-### 1.2 Key Requirements Met
-- ✅ **Multiple Image Upload**: Users can select and upload 1-300 images in a single request
-- ✅ **Async Processing**: Long-running batch jobs processed in background with job ID
-- ✅ **Progress Tracking**: Real-time progress via polling endpoint
-- ✅ **Flexible Perturbations**: Support both shared and per-image perturbations
-- ✅ **Flexible Visualizations**: None, per-image, summary, or both modes
-- ✅ **Robust Error Handling**: Continue processing on individual failures
-- ✅ **Backward Compatible**: Existing `/api/detect` unchanged
-- ✅ **Modular Design**: New endpoints separate from existing code
-
-### 1.3 Technical Stack
-- **Framework**: FastAPI with BackgroundTasks
-- **Storage**: In-memory dict + JSON file persistence
-- **Processing**: Sequential (one image at a time)
-- **Upload**: Multipart form-data with List[UploadFile]
-
----
-
-## 2. Architecture Overview
-
-### 2.1 High-Level Architecture
-
-```
-┌─────────────────────────────────────────────────────────────────┐
-│ CLIENT LAYER │
-│ (Frontend / API Client / Postman) │
-└────────────────┬────────────────────────────────────────────────┘
- │
- │ Step 1: POST /api/detect-batch
- │ (Upload files + config)
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ API ENDPOINT LAYER │
-│ api/routes.py (UPDATED) │
-│ ┌──────────────────────────────────────────────────────────┐ │
-│ │ POST /api/detect-batch │ │
-│ │ - Validate files (1-300 images) │ │
-│ │ - Generate unique job_id (UUID4) │ │
-│ │ - Create job metadata │ │
-│ │ - Launch BackgroundTask │ │
-│ │ - Return job_id immediately (202 Accepted) │ │
-│ └──────────────────────────────────────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────────────┐ │
-│ │ GET /api/batch-job/{job_id} │ │
-│ │ - Return current job status and progress │ │
-│ │ - 200 if exists, 404 if not found │ │
-│ └──────────────────────────────────────────────────────────┘ │
-└────────────────┬────────────────────────────────────────────────┘
- │
- │ Step 2: Background processing starts
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ BACKGROUND PROCESSING │
-│ FastAPI BackgroundTasks + Job Manager │
-│ ┌──────────────────────────────────────────────────────────┐ │
-│ │ BatchJobManager (NEW) │ │
-│ │ - Manages in-memory job registry │ │
-│ │ - Persists jobs to JSON files │ │
-│ │ - Thread-safe operations │ │
-│ └──────────────────────────────────────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────────────┐ │
-│ │ process_batch_job() (NEW) │ │
-│ │ - Loop through each image sequentially │ │
-│ │ - Apply perturbations (if specified) │ │
-│ │ - Run detection │ │
-│ │ - Calculate metrics │ │
-│ │ - Generate visualizations (if requested) │ │
-│ │ - Update job progress after each image │ │
-│ │ - Handle individual image failures gracefully │ │
-│ └──────────────────────────────────────────────────────────┘ │
-└────────────────┬────────────────────────────────────────────────┘
- │
- │ Step 3: Results stored
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ STORAGE LAYER │
-│ │
-│ outputs/ │
-│ ├── jobs/ ← NEW: Job metadata │
-│ │ ├── job_abc123.json Job status, progress │
-│ │ └── job_def456.json Results summary │
-│ │ │
-│ ├── batch_20241123_143022/ ← NEW: Batch results │
-│ │ ├── summary.json Overall batch statistics │
-│ │ ├── image1/ │
-│ │ │ ├── results.json Individual results │
-│ │ │ └── visualizations/ 8 PNG charts (optional) │
-│ │ ├── image2/ │
-│ │ │ ├── results.json │
-│ │ │ └── visualizations/ │
-│ │ └── summary_visualizations/ ← Aggregate charts │
-│ │ ├── combined_class_dist.png │
-│ │ └── ... │
-│ │ │
-│ └── perturbations/ ← Existing (unchanged) │
-└─────────────────────────────────────────────────────────────────┘
- │
- │ Step 4: Client polls for status
- ▼
-┌─────────────────────────────────────────────────────────────────┐
-│ CLIENT POLLING │
-│ │
-│ while job.status != "completed": │
-│ response = GET /api/batch-job/{job_id} │
-│ print(f"Progress: {response.processed}/{response.total}") │
-│ sleep(2 seconds) │
-│ │
-│ # Job complete - fetch results │
-│ final_response = GET /api/batch-job/{job_id} │
-│ results = final_response.results │
-└─────────────────────────────────────────────────────────────────┘
-```
-
-### 2.2 Component Hierarchy
-
-```
-deployment/
-├── backend.py ← Main app (minimal changes)
-├── config/
-│ └── settings.py ← UPDATED: Add batch config
-├── api/
-│ ├── routes.py ← UPDATED: Add 2 new endpoints
-│ └── schemas.py ← UPDATED: Add batch schemas
-├── services/
-│ ├── detection.py ← UNCHANGED
-│ ├── processing.py ← UPDATED: Add batch functions
-│ ├── perturbation.py ← UNCHANGED
-│ ├── visualization.py ← UNCHANGED
-│ ├── interpretation.py ← UNCHANGED
-│ └── batch_job_manager.py ← NEW: Job management
-└── outputs/
- ├── jobs/ ← NEW: Job metadata storage
- └── batch_*/ ← NEW: Batch result directories
-```
-
----
-
-## 3. System Design Decisions
-
-### 3.1 Async Processing with Job IDs
-
-#### Why Async?
-- **User Experience**: 300 images × 3 seconds = 15 minutes. Can't block HTTP request.
-- **HTTP Timeout**: Most proxies/browsers timeout after 60-120 seconds.
-- **Scalability**: Allows concurrent batch submissions (future enhancement).
-
-#### Why FastAPI BackgroundTasks?
-- **Simplicity**: Built-in, no external dependencies (Redis, Celery).
-- **Sufficient**: For single-worker, single-GPU setup.
-- **Easy Migration**: Can upgrade to Celery later without changing API contract.
-
-#### Job ID Generation
-```python
-import uuid
-
-job_id = str(uuid.uuid4()) # e.g., "3fa85f64-5717-4562-b3fc-2c963f66afa6"
-```
-- **Unique**: Collision probability negligible
-- **URL-safe**: No special characters
-- **Unguessable**: Cannot enumerate other users' jobs
-
-### 3.2 Storage Strategy: In-Memory + JSON Files
-
-#### In-Memory Registry
-```python
-# Global dictionary (thread-safe with locks)
-job_registry = {
- "job_abc123": {
- "job_id": "abc123",
- "status": "processing", # queued, processing, completed, failed
- "progress": {"current": 5, "total": 10},
- "results": [...],
- # ... more fields
- }
-}
-```
-
-**Advantages:**
-- ✅ Fast reads (no disk I/O)
-- ✅ Simple implementation
-- ✅ Sufficient for single-instance deployment
-
-**Disadvantages:**
-- ❌ Lost on restart (mitigated by JSON persistence)
-- ❌ Not shared across multiple API instances (future concern)
-
-#### JSON File Persistence
-```python
-# Saved to: outputs/jobs/job_abc123.json
-{
- "job_id": "abc123",
- "status": "completed",
- "created_at": "2024-11-23T14:30:22",
- "updated_at": "2024-11-23T14:45:56",
- "config": {...},
- "progress": {...},
- "results": [...]
-}
-```
-
-**When to Persist:**
-- ✅ Job creation (initial state)
-- ✅ After each image processed (progress update)
-- ✅ Job completion (final state)
-- ✅ On error (for debugging)
-
-**Recovery on Restart:**
-- Load all JSON files from `outputs/jobs/` into memory
-- Mark incomplete jobs as "failed" with note "Server restarted"
-
-### 3.3 Sequential Processing
-
-#### Processing Loop
-```python
-for i, image_file in enumerate(uploaded_files):
- try:
- # 1. Save temp file
- # 2. Apply perturbations (if any)
- # 3. Run detection
- # 4. Calculate metrics
- # 5. Generate visualizations (if requested)
- # 6. Save results
- # 7. Update job progress
- # 8. Persist to JSON
- except Exception as e:
- # Log error, mark image as failed, continue
- pass
-```
-
-#### Why Sequential, Not Parallel?
-- **GPU Limitation**: Single GPU, MMDetection not designed for concurrent inference
-- **Memory Safety**: Avoids OOM errors from loading multiple images
-- **Predictability**: Easier to debug, trace, and test
-- **Progress Tracking**: Simple to implement (current = i+1)
-
-#### Future Parallelization (Not in MVP)
-- Could process on multiple GPUs if available
-- Would require semaphore to limit concurrent workers
-- Would need more complex progress tracking
-
-### 3.4 Perturbation Handling
-
-#### Format Support
-```python
-# Option 1: Single set for all images
-perturbations = [{"type": "defocus", "degree": 2}]
-# Applied to image1, image2, ..., imageN
-
-# Option 2: Per-image perturbations
-perturbations = [
- [{"type": "defocus", "degree": 2}], # For image1
- [{"type": "speckle", "degree": 1}], # For image2
- # ... must match number of images
-]
-```
-
-#### Detection Logic
-```python
-if perturbations:
- if isinstance(perturbations[0], list):
- # Per-image mode
- if len(perturbations) != len(files):
- raise ValueError("Perturbations count must match files count")
- pert_for_image = perturbations[i]
- else:
- # Shared mode
- pert_for_image = perturbations
-```
-
-#### Validation
-- **Type Check**: Must be list (not dict, not string)
-- **Length Check**: If nested, must match file count
-- **Perturbation Validation**: Reuse existing validation from `/api/perturb`
-
-### 3.5 Visualization Modes
-
-#### Four Modes
-```python
-visualization_mode: str = Form("none")
-# Options: "none", "per_image", "summary", "both"
-```
-
-#### Mode Behavior
-
-| Mode | Per-Image Charts | Summary Charts | Use Case |
-|------|------------------|----------------|----------|
-| `none` | ❌ No | ❌ No | Fast processing, no analysis |
-| `per_image` | ✅ 8 charts × N images | ❌ No | Detailed per-image inspection |
-| `summary` | ❌ No | ✅ 8 aggregate charts | Quick batch overview |
-| `both` | ✅ Per-image | ✅ Summary | Complete analysis |
-
-#### Storage Impact
-
-**For 100 images with `per_image` mode:**
-```
-batch_20241123_143022/
-├── image001/
-│ └── visualizations/ (8 PNGs × ~200KB = 1.6 MB)
-├── image002/
-│ └── visualizations/ (8 PNGs × ~200KB = 1.6 MB)
-...
-├── image100/
-│ └── visualizations/ (8 PNGs × ~200KB = 1.6 MB)
-└── summary_visualizations/ (optional, +1.6 MB)
-
-Total: ~160 MB (800 PNG files)
-```
-
-#### Summary Statistics Aggregation
-```python
-# Aggregate across all images
-summary_stats = {
- "total_images": 100,
- "total_detections": 4752,
- "average_detections_per_image": 47.52,
- "confidence_distribution": {
- "mean": 0.78,
- "std": 0.12,
- "min": 0.30,
- "max": 0.99
- },
- "class_distribution": {
- "paragraph": 1523,
- "title": 845,
- # ... aggregated across all images
- },
- # ... more aggregate metrics
-}
-```
-
-#### Summary Visualizations
-```python
-# Example: Combined class distribution chart
-# X-axis: Classes
-# Y-axis: Total count across all images
-# Bar colors: Same as single-image charts
-
-# 8 Summary Charts:
-1. combined_class_distribution.png
-2. combined_confidence_histogram.png
-3. average_confidence_by_class.png
-4. detection_count_per_image.png (NEW)
-5. confidence_trend_across_images.png (NEW)
-6. processing_time_per_image.png (NEW)
-7. success_failure_summary.png (NEW)
-8. top_classes_across_batch.png
-```
-
-### 3.6 Error Handling Philosophy
-
-#### Per-Image Error Isolation
-```python
-for image in images:
- try:
- result = process_single_image(image)
- results.append({"status": "success", "data": result})
- except Exception as e:
- results.append({
- "status": "failed",
- "error": str(e),
- "traceback": traceback.format_exc()
- })
- # Continue to next image
-```
-
-#### Job-Level Status
-```python
-if all_images_successful:
- job_status = "completed"
-elif any_images_successful:
- job_status = "partial" # Some succeeded, some failed
-else:
- job_status = "failed" # All images failed
-```
-
-#### Error Categories
-
-| Error Type | HTTP Status | Job Status | Action |
-|------------|-------------|------------|--------|
-| Invalid file format | 400 | Not created | Return error immediately |
-| Too many files (>300) | 400 | Not created | Return error immediately |
-| Model inference error | 500 | Partial | Mark image failed, continue |
-| Perturbation error | 500 | Partial | Mark image failed, continue |
-| Disk space exhausted | 500 | Failed | Stop job, return error |
-| Out of memory | 500 | Partial | Mark image failed, GC, continue |
-
----
-
-## 4. Component Specifications
-
-### 4.1 BatchJobManager Class
-
-**Purpose:** Central manager for all batch job operations.
-
-**Location:** `services/batch_job_manager.py` (NEW FILE)
-
-#### Class Structure
-```python
-import threading
-import json
-from pathlib import Path
-from typing import Dict, Optional, List
-from datetime import datetime
-import uuid
-
-class BatchJobManager:
- """
- Thread-safe manager for batch processing jobs.
-
- Responsibilities:
- - Create new jobs
- - Update job progress
- - Retrieve job status
- - Persist jobs to disk
- - Load jobs from disk (on startup)
- """
-
- def __init__(self, jobs_dir: Path):
- self.jobs_dir = jobs_dir
- self.jobs_dir.mkdir(parents=True, exist_ok=True)
-
- # In-memory registry
- self._jobs: Dict[str, dict] = {}
-
- # Thread safety
- self._lock = threading.Lock()
-
- # Load existing jobs on init
- self._load_jobs_from_disk()
-
- def create_job(self, config: dict) -> str:
- """Create new job and return job_id"""
-
- def get_job(self, job_id: str) -> Optional[dict]:
- """Get job by ID"""
-
- def update_job_progress(self, job_id: str, current: int, total: int):
- """Update processing progress"""
-
- def add_job_result(self, job_id: str, image_name: str, result: dict):
- """Add result for single image"""
-
- def mark_job_completed(self, job_id: str):
- """Mark job as completed"""
-
- def mark_job_failed(self, job_id: str, error: str):
- """Mark job as failed"""
-
- def _persist_job(self, job_id: str):
- """Save job to JSON file"""
-
- def _load_jobs_from_disk(self):
- """Load all jobs from disk on startup"""
-```
-
-#### Job Schema (In-Memory & Persisted)
-```python
-{
- "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
- "status": "processing", # queued, processing, completed, partial, failed
- "created_at": "2024-11-23T14:30:22.123456",
- "updated_at": "2024-11-23T14:35:45.789012",
- "started_at": "2024-11-23T14:30:23.456789",
- "completed_at": None, # Set when finished
-
- # Input configuration
- "config": {
- "total_images": 10,
- "score_threshold": 0.3,
- "visualization_mode": "summary",
- "perturbations": [...], # or None
- "perturbation_mode": "shared", # or "per_image"
- "save_json": True,
- "filenames": ["img1.jpg", "img2.jpg", ...]
- },
-
- # Progress tracking
- "progress": {
- "current": 5, # Images processed so far
- "total": 10, # Total images
- "percentage": 50.0,
- "successful": 4, # Successfully processed
- "failed": 1, # Failed to process
- "processing_times": [2.3, 1.8, 3.1, 2.0, 2.5] # Seconds per image
- },
-
- # Results (per image)
- "results": [
- {
- "image": "img1.jpg",
- "status": "success",
- "processing_time": 2.3,
- "detections_count": 47,
- "average_confidence": 0.82,
- "result_path": "outputs/batch_xxx/img1/results.json"
- },
- {
- "image": "img2.jpg",
- "status": "failed",
- "error": "Model inference failed: CUDA out of memory",
- "traceback": "..."
- },
- # ... more results
- ],
-
- # Batch output directory
- "output_dir": "outputs/batch_20241123_143022",
-
- # Summary statistics (filled after completion)
- "summary": {
- "total_detections": 456,
- "average_confidence": 0.79,
- "processing_time_total": 23.5,
- "processing_time_average": 2.35,
- # ... more aggregated stats
- }
-}
-```
-
-#### Thread Safety Implementation
-```python
-def update_job_progress(self, job_id: str, current: int, total: int):
- """Thread-safe progress update"""
- with self._lock:
- if job_id not in self._jobs:
- raise ValueError(f"Job {job_id} not found")
-
- job = self._jobs[job_id]
- job["progress"]["current"] = current
- job["progress"]["total"] = total
- job["progress"]["percentage"] = (current / total) * 100
- job["updated_at"] = datetime.now().isoformat()
-
- # Persist to disk
- self._persist_job(job_id)
-```
-
-### 4.2 Batch Processing Function
-
-**Purpose:** Background task that processes all images in a job.
-
-**Location:** `services/processing.py` (ADD NEW FUNCTION)
-
-#### Function Signature
-```python
-def process_batch_job(
- job_id: str,
- uploaded_files: List[UploadFile],
- config: dict,
- job_manager: BatchJobManager
-):
- """
- Process a batch of images in the background.
-
- Args:
- job_id: Unique job identifier
- uploaded_files: List of UploadFile objects
- config: Job configuration dict
- job_manager: BatchJobManager instance
-
- Returns:
- None (updates job status via job_manager)
- """
-```
-
-#### Processing Logic Flow
-```python
-async def process_batch_job(job_id, uploaded_files, config, job_manager):
- # 1. Mark job as started
- job_manager.mark_job_started(job_id)
-
- # 2. Create batch output directory
- timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
- batch_dir = OUTPUT_DIR / f"batch_{timestamp}"
- batch_dir.mkdir(parents=True, exist_ok=True)
-
- # 3. Process each image sequentially
- for i, uploaded_file in enumerate(uploaded_files):
- try:
- # 3.1 Save uploaded file to temp location
- temp_path = save_temp_file(uploaded_file)
-
- # 3.2 Read image
- image = cv2.imread(temp_path)
-
- # 3.3 Apply perturbations (if configured)
- if config.get("perturbations"):
- perturbations = get_perturbations_for_image(
- config, i
- )
- image, pert_result = apply_perturbations(
- image, perturbations, ...
- )
-
- # 3.4 Save perturbed image to temp location
- perturbed_temp_path = save_perturbed_temp(image)
-
- # 3.5 Run detection
- result, img_w, img_h = run_inference(perturbed_temp_path)
- detections = process_detections(result, config["score_threshold"])
-
- # 3.6 Calculate metrics
- results_dict = create_comprehensive_results(
- detections,
- img_w,
- img_h,
- uploaded_file.filename,
- config["score_threshold"],
- generate_viz=(config["visualization_mode"] in ["per_image", "both"])
- )
-
- # 3.7 Save individual result
- image_dir = batch_dir / f"image_{i+1:03d}"
- image_dir.mkdir(exist_ok=True)
- save_individual_result(results_dict, image_dir)
-
- # 3.8 Add to job results
- job_manager.add_job_result(job_id, uploaded_file.filename, {
- "status": "success",
- "processing_time": processing_time,
- "detections_count": len(detections),
- "average_confidence": results_dict["core_results"]["summary"]["average_confidence"],
- "result_path": str(image_dir / "results.json")
- })
-
- # 3.9 Update progress
- job_manager.update_job_progress(
- job_id,
- current=i+1,
- total=len(uploaded_files)
- )
-
- # 3.10 Cleanup temp files
- cleanup_temp_files(temp_path, perturbed_temp_path)
-
- except Exception as e:
- # Handle error for this image
- job_manager.add_job_result(job_id, uploaded_file.filename, {
- "status": "failed",
- "error": str(e),
- "traceback": traceback.format_exc()
- })
-
- # Update progress (still count as processed)
- job_manager.update_job_progress(
- job_id,
- current=i+1,
- total=len(uploaded_files)
- )
-
- # 4. Generate summary statistics
- if config["visualization_mode"] in ["summary", "both"]:
- summary_stats, summary_viz = generate_batch_summary(
- job_id, job_manager
- )
- save_summary(batch_dir, summary_stats, summary_viz)
-
- # 5. Mark job as completed
- job_manager.mark_job_completed(job_id)
-```
-
-#### Helper Functions
-```python
-def get_perturbations_for_image(config: dict, image_index: int) -> List[dict]:
- """
- Get perturbations for specific image based on mode.
-
- Args:
- config: Job config with perturbations
- image_index: Current image index (0-based)
-
- Returns:
- List of perturbation dicts for this image
- """
- perturbations = config.get("perturbations")
- if not perturbations:
- return []
-
- if config["perturbation_mode"] == "per_image":
- return perturbations[image_index]
- else: # shared
- return perturbations
-
-
-def save_individual_result(results_dict: dict, image_dir: Path):
- """
- Save individual image results to directory.
-
- Structure:
- image_dir/
- ├── results.json (Full results)
- └── visualizations/ (If generated)
- ├── class_distribution.png
- ├── confidence_histogram.png
- └── ...
- """
- # Save JSON (without base64 visualizations)
- json_path = image_dir / "results.json"
- save_results(results_dict, json_path.stem, save_visualizations=True)
-
- # Save visualizations if present
- if results_dict.get("visualizations"):
- viz_dir = image_dir / "visualizations"
- viz_dir.mkdir(exist_ok=True)
-
- for viz_name, viz_base64 in results_dict["visualizations"].items():
- if viz_base64:
- viz_path = viz_dir / f"{viz_name}.png"
- save_base64_image(viz_base64, viz_path)
-
-
-def generate_batch_summary(job_id: str, job_manager: BatchJobManager) -> tuple:
- """
- Generate aggregate statistics and visualizations for entire batch.
-
- Returns:
- (summary_stats_dict, summary_visualizations_dict)
- """
- job = job_manager.get_job(job_id)
-
- # Aggregate metrics across all successful results
- all_detections = []
- all_confidences = []
- class_counts = {}
-
- for result in job["results"]:
- if result["status"] == "success":
- # Load individual result file
- result_data = load_json(result["result_path"])
-
- # Collect detections
- all_detections.extend(result_data["all_detections"])
-
- # Collect confidences
- for det in result_data["all_detections"]:
- all_confidences.append(det["confidence"])
-
- # Aggregate class counts
- for class_name, class_data in result_data["class_analysis"].items():
- if class_name not in class_counts:
- class_counts[class_name] = 0
- class_counts[class_name] += class_data["count"]
-
- # Calculate summary statistics
- summary_stats = {
- "total_images": job["config"]["total_images"],
- "successful_images": job["progress"]["successful"],
- "failed_images": job["progress"]["failed"],
- "total_detections": len(all_detections),
- "average_detections_per_image": len(all_detections) / max(job["progress"]["successful"], 1),
- "average_confidence": np.mean(all_confidences) if all_confidences else 0,
- "confidence_std": np.std(all_confidences) if all_confidences else 0,
- "class_distribution": class_counts,
- "processing_time_total": sum(job["progress"]["processing_times"]),
- "processing_time_average": np.mean(job["progress"]["processing_times"]) if job["progress"]["processing_times"] else 0,
- # ... more aggregated metrics
- }
-
- # Generate summary visualizations
- summary_viz = generate_summary_visualizations(
- summary_stats,
- all_detections,
- job["progress"]["processing_times"]
- )
-
- return summary_stats, summary_viz
-
-
-def generate_summary_visualizations(
- summary_stats: dict,
- all_detections: List[dict],
- processing_times: List[float]
-) -> dict:
- """
- Generate 8 summary charts for the entire batch.
-
- Returns:
- Dict mapping chart names to base64 PNGs
- """
- visualizations = {}
-
- # 1. Combined class distribution (bar chart)
- fig, ax = plt.subplots(figsize=(12, 6))
- classes = list(summary_stats["class_distribution"].keys())
- counts = list(summary_stats["class_distribution"].values())
- ax.bar(classes, counts, color='steelblue')
- ax.set_title("Class Distribution Across All Images")
- ax.set_xlabel("Class")
- ax.set_ylabel("Total Count")
- plt.xticks(rotation=45, ha='right')
- plt.tight_layout()
- visualizations['combined_class_distribution'] = fig_to_base64(fig)
- plt.close(fig)
-
- # 2. Combined confidence histogram
- fig, ax = plt.subplots(figsize=(10, 6))
- all_confidences = [d['confidence'] for d in all_detections]
- ax.hist(all_confidences, bins=20, color='steelblue', edgecolor='black')
- ax.axvline(np.mean(all_confidences), color='red', linestyle='--',
- label=f'Mean: {np.mean(all_confidences):.3f}')
- ax.set_title("Confidence Distribution Across All Images")
- ax.set_xlabel("Confidence")
- ax.set_ylabel("Frequency")
- ax.legend()
- plt.tight_layout()
- visualizations['combined_confidence_histogram'] = fig_to_base64(fig)
- plt.close(fig)
-
- # 3. Detection count per image (line chart)
- # Shows trend of detections across images
-
- # 4. Confidence trend across images
- # Shows average confidence per image
-
- # 5. Processing time per image (bar chart)
- fig, ax = plt.subplots(figsize=(12, 6))
- image_indices = list(range(1, len(processing_times) + 1))
- ax.bar(image_indices, processing_times, color='coral')
- ax.axhline(np.mean(processing_times), color='red', linestyle='--',
- label=f'Average: {np.mean(processing_times):.2f}s')
- ax.set_title("Processing Time Per Image")
- ax.set_xlabel("Image Number")
- ax.set_ylabel("Time (seconds)")
- ax.legend()
- plt.tight_layout()
- visualizations['processing_time_per_image'] = fig_to_base64(fig)
- plt.close(fig)
-
- # 6. Success/Failure summary (pie chart)
- # 7. Top classes across batch
- # 8. Average confidence by class
-
- return visualizations
-```
-
----
-
-### 4.3 API Route Updates
-
-**Location:** `api/routes.py` (ADD TWO NEW ENDPOINTS)
-
-#### Endpoint 1: POST /api/detect-batch
-
-```python
-@router.post("/api/detect-batch")
-async def detect_batch(
- files: List[UploadFile] = File(..., description="1-300 image files"),
- score_thr: str = Form("0.3", description="Confidence threshold (0-1)"),
- perturbations: Optional[str] = Form(None, description="JSON: single list or list of lists"),
- visualization_mode: str = Form("none", description="none|per_image|summary|both"),
- save_json: str = Form("true", description="Save results to disk"),
- background_folder: Optional[str] = Form(None, description="Background images folder"),
- background_tasks: BackgroundTasks = None
-):
- """
- Process multiple images in batch mode with async processing.
-
- **Request Parameters:**
- - **files**: 1-300 image files (multipart/form-data)
- - **score_thr**: Confidence threshold (0.0-1.0), default 0.3
- - **perturbations**: Optional JSON string
- - Single list: `[{"type":"defocus","degree":2}]` (applied to all)
- - List of lists: `[[...], [...]]` (per-image, must match file count)
- - **visualization_mode**: Visualization generation mode
- - `none`: No visualizations (fastest)
- - `per_image`: Generate 8 charts per image (slowest)
- - `summary`: Generate 8 aggregate charts (moderate)
- - `both`: Per-image + summary (comprehensive)
- - **save_json**: Save results to disk (default: true)
- - **background_folder**: Path to background images (for 'background' perturbation)
-
- **Response (202 Accepted):**
- ```json
- {
- "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
- "status": "queued",
- "message": "Batch processing started",
- "total_images": 10,
- "status_endpoint": "/api/batch-job/3fa85f64-5717-4562-b3fc-2c963f66afa6"
- }
- ```
-
- **Poll the status endpoint to track progress.**
- """
-
- try:
- # 1. Validate file count
- if not files or len(files) == 0:
- raise HTTPException(400, "At least one file is required")
-
- if len(files) > 300:
- raise HTTPException(400, f"Maximum 300 files allowed, got {len(files)}")
-
- # 2. Validate file types
- for file in files:
- if not file.content_type.startswith('image/'):
- raise HTTPException(400, f"File {file.filename} is not an image")
-
- # 3. Parse and validate score threshold
- try:
- score_threshold = float(score_thr)
- if not 0 <= score_threshold <= 1:
- raise ValueError("Must be between 0 and 1")
- except ValueError as e:
- raise HTTPException(400, f"Invalid score_thr: {e}")
-
- # 4. Parse and validate perturbations
- perturbation_config = None
- perturbation_mode = "shared" # or "per_image"
-
- if perturbations:
- try:
- pert_data = json.loads(perturbations)
-
- if not isinstance(pert_data, list):
- raise ValueError("Perturbations must be a list")
-
- # Check if per-image mode (list of lists)
- if pert_data and isinstance(pert_data[0], list):
- perturbation_mode = "per_image"
-
- # Validate count matches
- if len(pert_data) != len(files):
- raise ValueError(
- f"Per-image perturbations count ({len(pert_data)}) "
- f"must match file count ({len(files)})"
- )
-
- # Validate each sub-list
- for i, pert_list in enumerate(pert_data):
- if not isinstance(pert_list, list):
- raise ValueError(f"Perturbations[{i}] must be a list")
- # TODO: Validate each perturbation config
- else:
- # Shared mode - validate single list
- # TODO: Validate perturbation configs
- pass
-
- perturbation_config = pert_data
-
- except json.JSONDecodeError:
- raise HTTPException(400, "Invalid JSON in perturbations parameter")
- except ValueError as e:
- raise HTTPException(400, str(e))
-
- # 5. Validate visualization mode
- valid_viz_modes = ["none", "per_image", "summary", "both"]
- if visualization_mode not in valid_viz_modes:
- raise HTTPException(
- 400,
- f"Invalid visualization_mode. Must be one of: {valid_viz_modes}"
- )
-
- # 6. Create job configuration
- job_config = {
- "total_images": len(files),
- "score_threshold": score_threshold,
- "visualization_mode": visualization_mode,
- "perturbations": perturbation_config,
- "perturbation_mode": perturbation_mode,
- "save_json": save_json.lower() == "true",
- "background_folder": background_folder,
- "filenames": [f.filename for f in files]
- }
-
- # 7. Create job in job manager
- job_manager = get_job_manager() # Singleton instance
- job_id = job_manager.create_job(job_config)
-
- # 8. Save uploaded files to temporary location
- # (BackgroundTask needs file paths, not UploadFile objects)
- temp_file_paths = []
- for file in files:
- temp_path = save_uploaded_file_temp(file)
- temp_file_paths.append(temp_path)
-
- # 9. Start background processing
- background_tasks.add_task(
- process_batch_job,
- job_id=job_id,
- temp_file_paths=temp_file_paths,
- config=job_config,
- job_manager=job_manager
- )
-
- # 10. Return job ID immediately
- return JSONResponse(
- content={
- "job_id": job_id,
- "status": "queued",
- "message": f"Batch processing started for {len(files)} images",
- "total_images": len(files),
- "status_endpoint": f"/api/batch-job/{job_id}",
- "estimated_time_seconds": len(files) * 3 # Rough estimate
- },
- status_code=202 # Accepted
- )
-
- except HTTPException:
- raise
- except Exception as e:
- import traceback
- traceback.print_exc()
- return JSONResponse(
- content={"success": False, "error": str(e)},
- status_code=500
- )
-```
-
-#### Endpoint 2: GET /api/batch-job/{job_id}
-
-```python
-@router.get("/api/batch-job/{job_id}")
-async def get_batch_job_status(job_id: str):
- """
- Get status and results of a batch processing job.
-
- **Path Parameters:**
- - **job_id**: Unique job identifier returned by /api/detect-batch
-
- **Response States:**
-
- **1. Queued (just started):**
- ```json
- {
- "job_id": "abc123",
- "status": "queued",
- "created_at": "2024-11-23T14:30:22",
- "progress": {
- "current": 0,
- "total": 10,
- "percentage": 0.0
- }
- }
- ```
-
- **2. Processing (in progress):**
- ```json
- {
- "job_id": "abc123",
- "status": "processing",
- "progress": {
- "current": 5,
- "total": 10,
- "percentage": 50.0,
- "successful": 4,
- "failed": 1
- },
- "results": [
- {"image": "img1.jpg", "status": "success", ...},
- {"image": "img2.jpg", "status": "success", ...},
- {"image": "img3.jpg", "status": "success", ...},
- {"image": "img4.jpg", "status": "success", ...},
- {"image": "img5.jpg", "status": "failed", "error": "..."}
- ],
- "estimated_time_remaining_seconds": 15
- }
- ```
-
- **3. Completed (finished successfully):**
- ```json
- {
- "job_id": "abc123",
- "status": "completed",
- "created_at": "2024-11-23T14:30:22",
- "completed_at": "2024-11-23T14:45:56",
- "progress": {
- "current": 10,
- "total": 10,
- "percentage": 100.0,
- "successful": 10,
- "failed": 0
- },
- "results": [...], // All 10 image results
- "summary": {
- "total_detections": 456,
- "average_confidence": 0.79,
- "processing_time_total": 23.5,
- ...
- },
- "output_dir": "outputs/batch_20241123_143022"
- }
- ```
-
- **4. Partial (some images failed):**
- ```json
- {
- "job_id": "abc123",
- "status": "partial",
- "progress": {
- "current": 10,
- "total": 10,
- "percentage": 100.0,
- "successful": 8,
- "failed": 2
- },
- "results": [...],
- "summary": {...}
- }
- ```
-
- **5. Failed (all images failed or critical error):**
- ```json
- {
- "job_id": "abc123",
- "status": "failed",
- "error": "Critical error description",
- "progress": {...}
- }
- ```
- """
-
- try:
- # Get job from manager
- job_manager = get_job_manager()
- job = job_manager.get_job(job_id)
-
- if not job:
- raise HTTPException(404, f"Job {job_id} not found")
-
- # Return current job state
- return JSONResponse(content=job)
-
- except HTTPException:
- raise
- except Exception as e:
- import traceback
- traceback.print_exc()
- return JSONResponse(
- content={"success": False, "error": str(e)},
- status_code=500
- )
-```
-
-### 4.4 Schema Updates
-
-**Location:** `api/schemas.py` (ADD NEW MODELS)
-
-```python
-# ============================================================================
-# BATCH PROCESSING SCHEMAS
-# ============================================================================
-
-class BatchJobConfig(BaseModel):
- """Configuration for a batch processing job."""
- total_images: int = Field(..., ge=1, le=300)
- score_threshold: float = Field(0.3, ge=0.0, le=1.0)
- visualization_mode: str = Field("none", pattern="^(none|per_image|summary|both)$")
- perturbations: Optional[List] = None # Can be List[dict] or List[List[dict]]
- perturbation_mode: str = Field("shared", pattern="^(shared|per_image)$")
- save_json: bool = True
- background_folder: Optional[str] = None
- filenames: List[str]
-
-
-class BatchJobProgress(BaseModel):
- """Progress tracking for batch job."""
- current: int = Field(..., ge=0)
- total: int = Field(..., ge=1)
- percentage: float = Field(..., ge=0.0, le=100.0)
- successful: int = Field(0, ge=0)
- failed: int = Field(0, ge=0)
- processing_times: List[float] = []
-
-
-class BatchImageResult(BaseModel):
- """Result for a single image in batch."""
- image: str
- status: str # "success" or "failed"
- processing_time: Optional[float] = None
- detections_count: Optional[int] = None
- average_confidence: Optional[float] = None
- result_path: Optional[str] = None
- error: Optional[str] = None
- traceback: Optional[str] = None
-
-
-class BatchJobSummary(BaseModel):
- """Aggregate statistics for completed batch."""
- total_images: int
- successful_images: int
- failed_images: int
- total_detections: int
- average_detections_per_image: float
- average_confidence: float
- confidence_std: float
- class_distribution: Dict[str, int]
- processing_time_total: float
- processing_time_average: float
-
-
-class BatchJobStatus(BaseModel):
- """Complete status of a batch job."""
- job_id: str
- status: str # queued, processing, completed, partial, failed
- created_at: str
- updated_at: str
- started_at: Optional[str] = None
- completed_at: Optional[str] = None
- config: BatchJobConfig
- progress: BatchJobProgress
- results: List[BatchImageResult]
- summary: Optional[BatchJobSummary] = None
- output_dir: Optional[str] = None
- error: Optional[str] = None
-
-
-class BatchJobCreateResponse(BaseModel):
- """Response when creating a new batch job."""
- job_id: str
- status: str = "queued"
- message: str
- total_images: int
- status_endpoint: str
- estimated_time_seconds: int
-```
-
-### 4.5 Configuration Updates
-
-**Location:** `config/settings.py` (ADD NEW SETTINGS)
-
-```python
-# ============================================================================
-# BATCH PROCESSING CONFIGURATION
-# ============================================================================
-
-# Job storage directory
-JOBS_DIR = OUTPUT_DIR / "jobs"
-JOBS_DIR.mkdir(parents=True, exist_ok=True)
-
-# Batch processing limits
-MAX_BATCH_SIZE = 300 # Maximum images per batch
-MIN_BATCH_SIZE = 1 # Minimum images per batch
-
-# Job retention
-JOB_RETENTION_HOURS = 48 # Keep jobs for 48 hours
-MAX_CONCURRENT_JOBS = 10 # Maximum active jobs (future use)
-
-# Batch output directory prefix
-BATCH_OUTPUT_PREFIX = "batch_"
-
-# Visualization modes
-VISUALIZATION_MODES = ["none", "per_image", "summary", "both"]
-
-# Default batch settings
-DEFAULT_BATCH_CONFIG = {
- "score_threshold": 0.3,
- "visualization_mode": "none",
- "save_json": True,
- "perturbation_mode": "shared"
-}
-
-# Processing time estimation (seconds per image)
-ESTIMATED_TIME_PER_IMAGE = 3.0
-ESTIMATED_TIME_PER_VIZ = 0.5 # Additional time for visualizations
-```
-
----
-
-## 5. Data Flow Diagrams
-
-### 5.1 Request Flow: Submit Batch Job
-
-```
-┌──────────┐
-│ Client │
-└────┬─────┘
- │
- │ 1. POST /api/detect-batch
- │ files: [img1, img2, img3, ..., img10]
- │ score_thr: "0.3"
- │ visualization_mode: "summary"
- │ perturbations: '[{"type":"defocus","degree":2}]'
- ▼
-┌─────────────────────────────────────┐
-│ FastAPI Router (routes.py) │
-│ @router.post("/api/detect-batch") │
-└────┬────────────────────────────────┘
- │
- │ 2. Validate inputs
- │ - Check file count (1-300)
- │ - Validate file types (images only)
- │ - Parse score_thr (0.0-1.0)
- │ - Parse perturbations JSON
- │ - Validate visualization_mode
- ▼
-┌─────────────────────────────────────┐
-│ BatchJobManager │
-│ .create_job(config) │
-└────┬────────────────────────────────┘
- │
- │ 3. Generate job_id (UUID4)
- │ Create job dict
- │ Save to memory: self._jobs[job_id] = {...}
- │ Persist to disk: jobs/job_abc123.json
- ▼
-┌─────────────────────────────────────┐
-│ Save Uploaded Files to Temp │
-│ /tmp/upload_img1_xyz.jpg │
-│ /tmp/upload_img2_abc.jpg │
-│ ... │
-└────┬────────────────────────────────┘
- │
- │ 4. Launch BackgroundTask
- │ process_batch_job(job_id, temp_paths, config)
- ▼
-┌─────────────────────────────────────┐
-│ FastAPI BackgroundTasks │
-│ (Runs in separate thread) │
-└────┬────────────────────────────────┘
- │
- │ 5. Return 202 Accepted immediately
- ▼
-┌──────────┐
-│ Client │ Receives:
-└──────────┘ {
- "job_id": "abc123",
- "status": "queued",
- "status_endpoint": "/api/batch-job/abc123"
- }
-```
-
-### 5.2 Background Processing Flow
-
-```
-┌─────────────────────────────────────┐
-│ BackgroundTask Started │
-│ process_batch_job() │
-└────┬────────────────────────────────┘
- │
- │ 1. Mark job as "processing"
- │ job_manager.mark_job_started(job_id)
- ▼
-┌─────────────────────────────────────┐
-│ Create Batch Output Directory │
-│ outputs/batch_20241123_143022/ │
-└────┬────────────────────────────────┘
- │
- │ 2. Loop: for i, temp_path in enumerate(temp_file_paths):
- ▼
-┌──────────────────────────────────────────────────────────┐
-│ Process Image i │
-│ │
-│ ┌──────────────────────────────────────────────────┐ │
-│ │ a. Load image: cv2.imread(temp_path) │ │
-│ └──────────────────────────────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────┐ │
-│ │ b. Apply perturbations (if configured) │ │
-│ │ - Get perturbations for this image │ │
-│ │ - Call perturb_image_service() │ │
-│ └──────────────────────────────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────┐ │
-│ │ c. Run detection │ │
-│ │ - result, w, h = run_inference(image_path) │ │
-│ │ - detections = process_detections(result) │ │
-│ └──────────────────────────────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────┐ │
-│ │ d. Calculate metrics │ │
-│ │ - create_comprehensive_results(...) │ │
-│ └──────────────────────────────────────────────────┘ │
-│ ┌────────────────────────────────��─────────────────┐ │
-│ │ e. Generate visualizations (if mode != "none") │ │
-│ │ - Per-image charts if needed │ │
-│ └──────────────────────────────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────┐ │
-│ │ f. Save individual result │ │
-│ │ - batch_dir/image_001/results.json │ │
-│ │ - batch_dir/image_001/visualizations/ │ │
-│ └──────────────────────────────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────┐ │
-│ │ g. Update job progress │ │
-│ │ - job_manager.add_job_result(...) │ │
-│ │ - job_manager.update_job_progress(i+1, N) │ │
-│ │ - Persist to disk │ │
-│ └──────────────────────────────────────────────────┘ │
-│ ┌──────────────────────────────────────────────────┐ │
-│ │ h. Cleanup temp files │ │
-│ └──────────────────────────────────────────────────┘ │
-│ │
-│ [If error occurs] │
-│ ┌──────────────────────────────────────────────────┐ │
-│ │ - Log error │ │
-│ │ - Add failed result to job │ │
-│ │ - Continue to next image │ │
-│ └──────────────────────────────────────────────────┘ │
-└────┬─────────────────────────────────────────────────────┘
- │
- │ 3. All images processed
- ▼
-┌─────────────────────────────────────┐
-│ Generate Summary (if requested) │
-│ - Aggregate metrics │
-│ - Generate summary visualizations │
-│ - Save to batch_dir/summary.json │
-└────┬────────────────────────────────┘
- │
- │ 4. Mark job completed
- │ job_manager.mark_job_completed(job_id)
- │ - status = "completed" or "partial"
- │ - completed_at = timestamp
- │ - Persist to disk
- ▼
-┌─────────────────────────────────────┐
-│ Background Task Complete │
-└─────────────────────────────────────┘
-```
-
-### 5.3 Client Polling Flow
-
-```
-┌──────────┐
-│ Client │
-└────┬─────┘
- │
- │ Loop until job complete:
- │
- │ 1. GET /api/batch-job/abc123
- ▼
-┌─────────────────────────────────────┐
-│ FastAPI Router │
-│ @router.get("/api/batch-job/...")│
-└────┬────────────────────────────────┘
- │
- │ 2. Fetch from manager
- │ job = job_manager.get_job(job_id)
- ▼
-┌─────────────────────────────────────┐
-│ BatchJobManager │
-│ - Load from memory (fast) │
-│ - If not in memory, load from disk│
-└────┬────────────────────────────────┘
- │
- │ 3. Return job status
- ▼
-┌──────────┐ Receives:
-│ Client │ {
-└──────────┘ "job_id": "abc123",
- "status": "processing",
- "progress": {
- "current": 5,
- "total": 10,
- "percentage": 50.0
- },
- "results": [...]
- }
- │
- │ 4. Check status
- │ if status == "completed" or "partial":
- │ break
- │ else:
- │ sleep(2 seconds)
- │ continue loop
- ▼
-┌──────────┐
-│ Client │ Job complete!
-└──────────┘ Process final results
-```
-
----
-
-## 6. API Endpoint Specifications
-
-### 6.1 Complete API Surface
-
-```
-EXISTING ENDPOINTS (Unchanged):
-├── GET /health
-├── GET /api/model-info
-├── POST /api/detect
-├── GET /api/perturbations/info
-├── POST /api/perturb
-└── POST /api/detect-with-perturbation
-
-NEW ENDPOINTS (Batch Processing):
-├── POST /api/detect-batch ← Main batch submission
-└── GET /api/batch-job/{job_id} ← Status polling
-```
-
-### 6.2 POST /api/detect-batch - Detailed Specification
-
-#### Request Format
-
-**Content-Type:** `multipart/form-data`
-
-**Form Fields:**
-
-| Field | Type | Required | Default | Description |
-|-------|------|----------|---------|-------------|
-| `files` | File[] | ✅ Yes | - | 1-300 image files |
-| `score_thr` | string | ❌ No | `"0.3"` | Confidence threshold (0.0-1.0) |
-| `perturbations` | string | ❌ No | `null` | JSON array (shared or per-image) |
-| `visualization_mode` | string | ❌ No | `"none"` | `none`, `per_image`, `summary`, `both` |
-| `save_json` | string | ❌ No | `"true"` | Save results to disk |
-| `background_folder` | string | ❌ No | `null` | Background images path |
-
-**Example cURL:**
-
-```bash
-curl -X POST "http://localhost:8000/api/detect-batch" \
- -F "files=@image1.jpg" \
- -F "files=@image2.jpg" \
- -F "files=@image3.jpg" \
- -F "score_thr=0.3" \
- -F 'perturbations=[{"type":"defocus","degree":2}]' \
- -F "visualization_mode=summary"
-```
-
-**Example Python:**
-
-```python
-import requests
-
-files = [
- ('files', ('img1.jpg', open('img1.jpg', 'rb'), 'image/jpeg')),
- ('files', ('img2.jpg', open('img2.jpg', 'rb'), 'image/jpeg')),
- ('files', ('img3.jpg', open('img3.jpg', 'rb'), 'image/jpeg')),
-]
-
-data = {
- 'score_thr': '0.3',
- 'perturbations': '[{"type":"defocus","degree":2}]',
- 'visualization_mode': 'summary'
-}
-
-response = requests.post(
- 'http://localhost:8000/api/detect-batch',
- files=files,
- data=data
-)
-
-job = response.json()
-job_id = job['job_id']
-print(f"Job started: {job_id}")
-```
-
-#### Response Format (202 Accepted)
-
-```json
-{
- "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
- "status": "queued",
- "message": "Batch processing started for 10 images",
- "total_images": 10,
- "status_endpoint": "/api/batch-job/3fa85f64-5717-4562-b3fc-2c963f66afa6",
- "estimated_time_seconds": 30
-}
-```
-
-#### Error Responses
-
-| Status | Condition | Response Body |
-|--------|-----------|---------------|
-| 400 | No files provided | `{"detail": "At least one file is required"}` |
-| 400 | Too many files (>300) | `{"detail": "Maximum 300 files allowed, got 350"}` |
-| 400 | Invalid file type | `{"detail": "File xyz.pdf is not an image"}` |
-| 400 | Invalid score_thr | `{"detail": "Invalid score_thr: Must be between 0 and 1"}` |
-| 400 | Invalid perturbations JSON | `{"detail": "Invalid JSON in perturbations parameter"}` |
-| 400 | Perturbation count mismatch | `{"detail": "Per-image perturbations count (5) must match file count (10)"}` |
-| 500 | Server error | `{"success": false, "error": "Internal server error"}` |
-
-### 6.3 GET /api/batch-job/{job_id} - Detailed Specification
-
-#### Request Format
-
-**Path Parameter:**
-- `job_id` (string): UUID returned by `/api/detect-batch`
-
-**Example cURL:**
-
-```bash
-curl "http://localhost:8000/api/batch-job/3fa85f64-5717-4562-b3fc-2c963f66afa6"
-```
-
-**Example Python (with polling):**
-
-```python
-import requests
-import time
-
-job_id = "3fa85f64-5717-4562-b3fc-2c963f66afa6"
-url = f"http://localhost:8000/api/batch-job/{job_id}"
-
-while True:
- response = requests.get(url)
- job = response.json()
-
- status = job['status']
- progress = job['progress']
-
- print(f"Status: {status} - {progress['current']}/{progress['total']} "
- f"({progress['percentage']:.1f}%) - "
- f"Success: {progress['successful']}, Failed: {progress['failed']}")
-
- if status in ['completed', 'partial', 'failed']:
- print("\nJob finished!")
- print(f"Results saved to: {job.get('output_dir')}")
- break
-
- time.sleep(2) # Poll every 2 seconds
-```
-
-#### Response Format - By Status
-
-**Status: queued (initial state)**
-```json
-{
- "job_id": "abc123",
- "status": "queued",
- "created_at": "2024-11-23T14:30:22.123456",
- "updated_at": "2024-11-23T14:30:22.123456",
- "config": {
- "total_images": 10,
- "score_threshold": 0.3,
- "visualization_mode": "summary",
- "perturbation_mode": "shared",
- "filenames": ["img1.jpg", "img2.jpg", ...]
- },
- "progress": {
- "current": 0,
- "total": 10,
- "percentage": 0.0,
- "successful": 0,
- "failed": 0,
- "processing_times": []
- },
- "results": []
-}
-```
-
-**Status: processing (in progress)**
-```json
-{
- "job_id": "abc123",
- "status": "processing",
- "created_at": "2024-11-23T14:30:22.123456",
- "updated_at": "2024-11-23T14:32:45.789012",
- "started_at": "2024-11-23T14:30:23.456789",
- "config": {...},
- "progress": {
- "current": 5,
- "total": 10,
- "percentage": 50.0,
- "successful": 4,
- "failed": 1,
- "processing_times": [2.3, 1.8, 3.1, 2.0, 2.5]
- },
- "results": [
- {
- "image": "img1.jpg",
- "status": "success",
- "processing_time": 2.3,
- "detections_count": 47,
- "average_confidence": 0.82,
- "result_path": "outputs/batch_20241123_143022/image_001/results.json"
- },
- {
- "image": "img2.jpg",
- "status": "success",
- "processing_time": 1.8,
- "detections_count": 52,
- "average_confidence": 0.79,
- "result_path": "outputs/batch_20241123_143022/image_002/results.json"
- },
- {
- "image": "img3.jpg",
- "status": "failed",
- "error": "Model inference failed: CUDA out of memory",
- "traceback": "Traceback (most recent call last):\n File ..."
- },
- {
- "image": "img4.jpg",
- "status": "success",
- "processing_time": 2.0,
- "detections_count": 41,
- "average_confidence": 0.85,
- "result_path": "outputs/batch_20241123_143022/image_004/results.json"
- },
- {
- "image": "img5.jpg",
- "status": "success",
- "processing_time": 2.5,
- "detections_count": 38,
- "average_confidence": 0.77,
- "result_path": "outputs/batch_20241123_143022/image_005/results.json"
- }
- ]
-}
-```
-
-**Status: completed (all successful)**
-```json
-{
- "job_id": "abc123",
- "status": "completed",
- "created_at": "2024-11-23T14:30:22.123456",
- "updated_at": "2024-11-23T14:35:56.789012",
- "started_at": "2024-11-23T14:30:23.456789",
- "completed_at": "2024-11-23T14:35:56.789012",
- "config": {...},
- "progress": {
- "current": 10,
- "total": 10,
- "percentage": 100.0,
- "successful": 10,
- "failed": 0,
- "processing_times": [2.3, 1.8, 3.1, 2.0, 2.5, 2.2, 1.9, 2.8, 2.1, 2.4]
- },
- "results": [
- // ... all 10 image results with status "success"
- ],
- "summary": {
- "total_images": 10,
- "successful_images": 10,
- "failed_images": 0,
- "total_detections": 456,
- "average_detections_per_image": 45.6,
- "average_confidence": 0.79,
- "confidence_std": 0.12,
- "class_distribution": {
- "paragraph": 152,
- "title": 84,
- "figure": 67,
- "table": 43,
- // ... more classes
- },
- "processing_time_total": 23.5,
- "processing_time_average": 2.35
- },
- "output_dir": "outputs/batch_20241123_143022"
-}
-```
-
-**Status: partial (some failed)**
-```json
-{
- "job_id": "abc123",
- "status": "partial",
- "created_at": "2024-11-23T14:30:22.123456",
- "completed_at": "2024-11-23T14:35:56.789012",
- "config": {...},
- "progress": {
- "current": 10,
- "total": 10,
- "percentage": 100.0,
- "successful": 8,
- "failed": 2
- },
- "results": [
- // ... 8 successful, 2 failed
- ],
- "summary": {
- // ... summary based on successful images only
- "total_images": 10,
- "successful_images": 8,
- "failed_images": 2
- },
- "output_dir": "outputs/batch_20241123_143022"
-}
-```
-
-**Status: failed (critical error or all failed)**
-```json
-{
- "job_id": "abc123",
- "status": "failed",
- "created_at": "2024-11-23T14:30:22.123456",
- "updated_at": "2024-11-23T14:31:45.123456",
- "error": "Critical error: Model failed to load",
- "progress": {
- "current": 0,
- "total": 10,
- "percentage": 0.0,
- "successful": 0,
- "failed": 0
- },
- "results": []
-}
-```
-
-#### Error Responses
-
-| Status | Condition | Response Body |
-|--------|-----------|---------------|
-| 404 | Job not found | `{"detail": "Job abc123 not found"}` |
-| 500 | Server error | `{"success": false, "error": "..."}` |
-
----
-
-## 7. Database/Storage Schema
-
-### 7.1 In-Memory Storage
-
-**Global Singleton Instance:**
-```python
-# In services/batch_job_manager.py
-_job_manager_instance = None
-
-def get_job_manager() -> BatchJobManager:
- global _job_manager_instance
- if _job_manager_instance is None:
- _job_manager_instance = BatchJobManager(JOBS_DIR)
- return _job_manager_instance
-```
-
-**Thread-Safe Dictionary:**
-```python
-class BatchJobManager:
- def __init__(self, jobs_dir: Path):
- self._jobs: Dict[str, dict] = {}
- self._lock = threading.Lock()
-
- def get_job(self, job_id: str) -> Optional[dict]:
- with self._lock:
- return self._jobs.get(job_id)
-
- def update_job_progress(self, job_id: str, current: int, total: int):
- with self._lock:
- # ... update logic
- self._persist_job(job_id)
-```
-
-### 7.2 File System Storage
-
-**Directory Structure:**
-```
-outputs/
-├── jobs/ # Job metadata
-│ ├── job_abc123.json # Job state file
-│ ├── job_def456.json
-│ └── job_ghi789.json
-│
-├── batch_20241123_143022/ # Batch results
-│ ├── summary.json # Aggregate statistics
-│ │
-│ ├── image_001/ # First image
-│ │ ├── results.json # Full RoDLA results
-│ │ └── visualizations/ # (if per_image mode)
-│ │ ├── class_distribution.png
-│ │ ├── confidence_distribution.png
-│ │ ├── spatial_heatmap.png
-│ │ ├── confidence_by_class.png
-│ │ ├── area_vs_confidence.png
-│ │ ├── quadrant_distribution.png
-│ │ ├── size_distribution.png
-│ │ └── top_classes_confidence.png
-│ │
-│ ├── image_002/
-│ │ ├── results.json
-│ │ └── visualizations/
-│ │
-│ ├── image_003/
-│ │ └── results.json # (failed image, no viz)
-│ │
-│ ├── ... (more images)
-│ │
-│ └── summary_visualizations/ # (if summary or both mode)
-│ ├── combined_class_distribution.png
-│ ├── combined_confidence_histogram.png
-│ ├── average_confidence_by_class.png
-│ ├── detection_count_per_image.png
-│ ├── confidence_trend_across_images.png
-│ ├── processing_time_per_image.png
-│ ├── success_failure_summary.png
-│ └── top_classes_across_batch.png
-│
-└── perturbations/ # (existing, unchanged)
- └── ...
-```
-
-**Job File Schema (job_abc123.json):**
-```json
-{
- "job_id": "abc123",
- "status": "completed",
- "created_at": "2024-11-23T14:30:22.123456",
- "updated_at": "2024-11-23T14:35:56.789012",
- "started_at": "2024-11-23T14:30:23.456789",
- "completed_at": "2024-11-23T14:35:56.789012",
-
- "config": {
- "total_images": 10,
- "score_threshold": 0.3,
- "visualization_mode": "summary",
- "perturbations": [{"type": "defocus", "degree": 2}],
- "perturbation_mode": "shared",
- "save_json": true,
- "background_folder": null,
- "filenames": ["img1.jpg", "img2.jpg", ...]
- },
-
- "progress": {
- "current": 10,
- "total": 10,
- "percentage": 100.0,
- "successful": 10,
- "failed": 0,
- "processing_times": [2.3, 1.8, 3.1, 2.0, 2.5, 2.2, 1.9, 2.8, 2.1, 2.4]
- },
-
- "results": [
- {
- "image": "img1.jpg",
- "status": "success",
- "processing_time": 2.3,
- "detections_count": 47,
- "average_confidence": 0.82,
- "result_path": "outputs/batch_20241123_143022/image_001/results.json"
- },
- // ... more results
- ],
-
- "summary": {
- "total_images": 10,
- "successful_images": 10,
- "failed_images": 0,
- "total_detections": 456,
- "average_detections_per_image": 45.6,
- "average_confidence": 0.79,
- "confidence_std": 0.12,
- "class_distribution": {
- "paragraph": 152,
- "title": 84,
- // ... more classes
- },
- "processing_time_total": 23.5,
- "processing_time_average": 2.35
- },
-
- "output_dir": "outputs/batch_20241123_143022"
-}
-```
-
-**Individual Image Result (image_001/results.json):**
-```json
-{
- // Standard RoDLA result format (unchanged from single-image API)
- "success": true,
- "timestamp": "2024-11-23T14:30:25.123456",
- "filename": "img1.jpg",
- "image_info": {...},
- "detection_config": {...},
- "core_results": {...},
- "rodla_metrics": {...},
- "spatial_analysis": {...},
- "class_analysis": {...},
- "confidence_analysis": {...},
- "robustness_indicators": {...},
- "layout_complexity": {...},
- "quality_metrics": {...},
- "interpretation": {...},
- "all_detections": [...]
-}
-```
-
-**Batch Summary (summary.json):**
-```json
-{
- "batch_id": "batch_20241123_143022",
- "created_at": "2024-11-23T14:30:22.123456",
- "completed_at": "2024-11-23T14:35:56.789012",
-
- "statistics": {
- "total_images": 10,
- "successful_images": 10,
- "failed_images": 0,
- "total_detections": 456,
- "average_detections_per_image": 45.6,
- "average_confidence": 0.79,
- "confidence_std": 0.12,
- "min_confidence": 0.31,
- "max_confidence": 0.98,
- "processing_time_total": 23.5,
- "processing_time_average": 2.35,
- "processing_time_min": 1.8,
- "processing_time_max": 3.1
- },
-
- "class_distribution": {
- "paragraph": {
- "total_count": 152,
- "percentage": 33.3,
- "average_confidence": 0.81
- },
- "title": {
- "total_count": 84,
- "percentage": 18.4,
- "average_confidence": 0.87
- },
- // ... more classes
- },
-
- "per_image_summary": [
- {
- "image": "img1.jpg",
- "detections": 47,
- "avg_confidence": 0.82,
- "processing_time": 2.3,
- "status": "success"
- },
- // ... more images
- ],
-
- "visualizations_generated": ["combined_class_distribution", "..."]
-}
-```
-
-### 7.3 Recovery on Restart
-
-**Startup Logic:**
-```python
-class BatchJobManager:
- def _load_jobs_from_disk(self):
- """Load all existing jobs from disk on startup."""
- print(f"Loading jobs from {self.jobs_dir}...")
-
- job_files = list(self.jobs_dir.glob("job_*.json"))
- loaded_count = 0
-
- for job_file in job_files:
- try:
- with open(job_file, 'r') as f:
- job = json.load(f)
-
- job_id = job["job_id"]
-
- # Mark incomplete jobs as failed
- if job["status"] in ["queued", "processing"]:
- job["status"] = "failed"
- job["error"] = "Server restarted during processing"
- job["updated_at"] = datetime.now().isoformat()
-
- self._jobs[job_id] = job
- loaded_count += 1
-
- except Exception as e:
- print(f"Error loading {job_file}: {e}")
-
- print(f"Loaded {loaded_count} jobs from disk")
-```
-
----
-
-## 8. Error Handling Strategy
-
-### 8.1 Error Categories and Responses
-
-**1. Validation Errors (HTTP 400)**
-```python
-# File count validation
-if len(files) > MAX_BATCH_SIZE:
- raise HTTPException(400, f"Maximum {MAX_BATCH_SIZE} files allowed")
-
-# File type validation
-for file in files:
- if not file.content_type.startswith('image/'):
- raise HTTPException(400, f"{file.filename} is not an image")
-
-# Perturbation validation
-if pert_mode == "per_image" and len(perturbations) != len(files):
- raise HTTPException(400, "Perturbation count must match file count")
-```
-
-**2. Per-Image Errors (Continue Processing)**
-```python
-for i, image_path in enumerate(image_paths):
- try:
- # Process image
- result = process_single_image(image_path, config)
-
- job_manager.add_job_result(job_id, filename, {
- "status": "success",
- "data": result
- })
-
- except Exception as e:
- # Log but don't stop batch
- logger.error(f"Image {i} failed: {e}")
-
- job_manager.add_job_result(job_id, filename, {
- "status": "failed",
- "error": str(e),
- "traceback": traceback.format_exc()
- })
-
- # Always update progress
- job_manager.update_job_progress(job_id, i+1, total)
-```
-
-**3. Critical Errors (Stop Batch)**
-```python
-try:
- # Batch processing loop
- for image in images:
- process_image(image)
-
-except Exception as e:
- # Critical error - stop entire batch
- logger.critical(f"Critical error in batch {job_id}: {e}")
-
- job_manager.mark_job_failed(job_id, f"Critical error: {str(e)}")
-
- # Optionally notify client (future: webhook)
-```
-
-### 8.2 Retry Logic (Future Enhancement)
-
-**Per-Image Retry (Not in MVP):**
-```python
-MAX_RETRIES = 3
-
-for attempt in range(MAX_RETRIES):
- try:
- result = process_image(image)
- break # Success
- except Exception as e:
- if attempt == MAX_RETRIES - 1:
- # Final attempt failed
- mark_failed(image, error=e)
- else:
- # Retry with backoff
- time.sleep(2 ** attempt)
-```
-
-### 8.3 Resource Cleanup
-
-**Temp File Cleanup:**
-```python
-def process_batch_job(job_id, temp_paths, config, job_manager):
- temp_files_to_cleanup = temp_paths.copy()
-
- try:
- # Process images
- for temp_path in temp_paths:
- try:
- # ... processing ...
- temp_files_to_cleanup.remove(temp_path)
- os.unlink(temp_path)
- except Exception:
- pass # Cleanup in finally
-
- finally:
- # Ensure all temp files deleted
- for temp_path in temp_files_to_cleanup:
- try:
- if os.path.exists(temp_path):
- os.unlink(temp_path)
- except Exception as e:
- logger.warning(f"Failed to cleanup {temp_path}: {e}")
-```
-
-**GPU Memory Cleanup:**
-```python
-import gc
-import torch
-
-def process_image(image_path):
- try:
- # ... inference ...
- return result
- finally:
- # Clean up GPU memory after each image
- if torch.cuda.is_available():
- torch.cuda.empty_cache()
- gc.collect()
-```
-
----
-
-## 9. Implementation Checklist
-
-### Phase 1: Core Infrastructure (2-3 hours)
-
-- [ ] **Create `services/batch_job_manager.py`**
- - [ ] Implement `BatchJobManager` class
- - [ ] Implement `create_job()` method
- - [ ] Implement `get_job()` method
- - [ ] Implement `update_job_progress()` method
- - [ ] Implement `add_job_result()` method
- - [ ] Implement `mark_job_completed()` method
- - [ ] Implement `mark_job_failed()` method
- - [ ] Implement `_persist_job()` method (JSON save)
- - [ ] Implement `_load_jobs_from_disk()` method
- - [ ] Add thread safety (locks)
- - [ ] Test in isolation (unit tests)
-
-- [ ] **Update `config/settings.py`**
- - [ ] Add `JOBS_DIR` constant
- - [ ] Add `MAX_BATCH_SIZE = 300`
- - [ ] Add `MIN_BATCH_SIZE = 1`
- - [ ] Add `JOB_RETENTION_HOURS = 48`
- - [ ] Add `BATCH_OUTPUT_PREFIX = "batch_"`
- - [ ] Add `VISUALIZATION_MODES` list
- - [ ] Add `ESTIMATED_TIME_PER_IMAGE = 3.0`
- - [ ] Create `JOBS_DIR` on import
-
-- [ ] **Update `api/schemas.py`**
- - [ ] Add `BatchJobConfig` model
- - [ ] Add `BatchJobProgress` model
- - [ ] Add `BatchImageResult` model
- - [ ] Add `BatchJobSummary` model
- - [ ] Add `BatchJobStatus` model
- - [ ] Add `BatchJobCreateResponse` model
-
-### Phase 2: Batch Processing Logic (3-4 hours)
-
-- [ ] **Update `services/processing.py`**
- - [ ] Add `process_batch_job()` async function
- - [ ] Add `get_perturbations_for_image()` helper
- - [ ] Add `save_individual_result()` helper
- - [ ] Add `save_temp_file()` helper
- - [ ] Add `cleanup_temp_files()` helper
- - [ ] Add `generate_batch_summary()` function
- - [ ] Add `generate_summary_visualizations()` function
- - [ ] Add `save_summary()` function
- - [ ] Test batch processing with 3-5 images
-
-### Phase 3: API Endpoints (2-3 hours)
-
-- [ ] **Update `api/routes.py`**
- - [ ] Add `POST /api/detect-batch` endpoint
- - [ ] Validate file count (1-300)
- - [ ] Validate file types
- - [ ] Parse score_thr
- - [ ] Parse and validate perturbations
- - [ ] Validate visualization_mode
- - [ ] Create job config
- - [ ] Create job in manager
- - [ ] Save uploaded files to temp
- - [ ] Launch BackgroundTask
- - [ ] Return 202 response with job_id
- - [ ] Add `GET /api/batch-job/{job_id}` endpoint
- - [ ] Get job from manager
- - [ ] Return 404 if not found
- - [ ] Return current job status
- - [ ] Add `get_job_manager()` dependency function
- - [ ] Test endpoints with Postman/cURL
-
-### Phase 4: Visualization Enhancements (2-3 hours)
-
-- [ ] **Update `services/visualization.py`**
- - [ ] Add `generate_summary_visualizations()` function
- - [ ] Implement combined_class_distribution chart
- - [ ] Implement combined_confidence_histogram chart
- - [ ] Implement detection_count_per_image chart
- - [ ] Implement confidence_trend_across_images chart
- - [ ] Implement processing_time_per_image chart
- - [ ] Implement success_failure_summary chart
- - [ ] Implement top_classes_across_batch chart
- - [ ] Implement average_confidence_by_class chart
- - [ ] Test with sample batch results
-
-### Phase 5: Integration & Testing (2-3 hours)
-
-- [ ] **Integration Testing**
- - [ ] Test single image batch (edge case)
- - [ ] Test small batch (5 images)
- - [ ] Test medium batch (50 images)
- - [ ] Test large batch (200 images)
- - [ ] Test maximum batch (300 images)
- - [ ] Test with shared perturbations
- - [ ] Test with per-image perturbations
- - [ ] Test visualization_mode="none"
- - [ ] Test visualization_mode="per_image"
- - [ ] Test visualization_mode="summary"
- - [ ] Test visualization_mode="both"
- - [ ] Test error scenarios (invalid files, OOM, etc.)
- - [ ] Test job recovery after restart
-
-- [ ] **Performance Testing**
- - [ ] Measure processing time per image
- - [ ] Monitor GPU memory usage
- - [ ] Monitor disk space usage
- - [ ] Test concurrent batch submissions (if needed)
-
-- [ ] **Error Testing**
- - [ ] Test with non-image files
- - [ ] Test with corrupted images
- - [ ] Test with too many files (>300)
- - [ ] Test with invalid perturbations
- - [ ] Test with mismatched perturbation counts
- - [ ] Test GPU OOM recovery
- - [ ] Test disk full scenario
-
-### Phase 6: Documentation (1-2 hours)
-
-- [ ] **Update README.md**
- - [ ] Add batch processing section
- - [ ] Add API endpoint documentation
- - [ ] Add usage examples (cURL + Python)
- - [ ] Add visualization mode explanation
- - [ ] Add error handling documentation
- - [ ] Add performance benchmarks
-
-- [ ] **Create API Examples**
- - [ ] Python client example
- - [ ] cURL examples
- - [ ] Postman collection (optional)
-
-### Phase 7: Backend Updates (1 hour)
-
-- [ ] **Update `backend.py`**
- - [ ] Initialize job manager on startup
- - [ ] Load existing jobs from disk
- - [ ] Log startup messages
- - [ ] No other changes needed (routes auto-included)
-
----
-
-## 10. Testing Strategy
-
-### 10.1 Unit Tests
-
-**Test `BatchJobManager`:**
-```python
-# tests/test_services/test_batch_job_manager.py
-
-def test_create_job():
- manager = BatchJobManager(tmp_path / "jobs")
- config = {"total_images": 5}
-
- job_id = manager.create_job(config)
-
- assert job_id is not None
- assert len(job_id) == 36 # UUID4 length
-
- job = manager.get_job(job_id)
- assert job["status"] == "queued"
- assert job["config"] == config
-
-
-def test_update_progress():
- manager = BatchJobManager(tmp_path / "jobs")
- job_id = manager.create_job({"total_images": 10})
-
- manager.update_job_progress(job_id, 5, 10)
-
- job = manager.get_job(job_id)
- assert job["progress"]["current"] == 5
- assert job["progress"]["percentage"] == 50.0
-
-
-def test_persist_and_load():
- jobs_dir = tmp_path / "jobs"
- manager1 = BatchJobManager(jobs_dir)
- job_id = manager1.create_job({"total_images": 3})
-
- # Create new manager (simulates restart)
- manager2 = BatchJobManager(jobs_dir)
- job = manager2.get_job(job_id)
-
- assert job is not None
- assert job["status"] == "failed" # Marked as failed on restart
-```
-
-### 10.2 Integration Tests
-
-**Test Full Batch Pipeline:**
-```python
-# tests/test_integration/test_batch_pipeline.py
-
-@pytest.mark.slow
-def test_small_batch_end_to_end(test_client, sample_images):
- # Submit batch
- files = [
- ('files', ('img1.jpg', open(sample_images[0], 'rb'), 'image/jpeg')),
- ('files', ('img2.jpg', open(sample_images[1], 'rb'), 'image/jpeg')),
- ('files', ('img3.jpg', open(sample_images[2], 'rb'), 'image/jpeg')),
- ]
-
- response = test_client.post(
- '/api/detect-batch',
- files=files,
- data={'score_thr': '0.3', 'visualization_mode': 'summary'}
- )
-
- assert response.status_code == 202
- job_id = response.json()['job_id']
-
- # Poll until complete
- max_polls = 30
- for _ in range(max_polls):
- status_response = test_client.get(f'/api/batch-job/{job_id}')
- job = status_response.json()
-
- if job['status'] in ['completed', 'partial', 'failed']:
- break
-
- time.sleep(2)
-
- # Verify completion
- assert job['status'] == 'completed'
- assert job['progress']['successful'] == 3
- assert len(job['results']) == 3
- assert job['summary'] is not None
-
-
-@pytest.mark.slow
-def test_batch_with_perturbations(test_client, sample_images):
- files = [
- ('files', ('img1.jpg', open(sample_images[0], 'rb'), 'image/jpeg')),
- ('files', ('img2.jpg', open(sample_images[1], 'rb'), 'image/jpeg')),
- ]
-
- perturbations = [
- [{"type": "defocus", "degree": 2}], # For img1
- [{"type": "speckle", "degree": 1}], # For img2
- ]
-
- response = test_client.post(
- '/api/detect-batch',
- files=files,
- data={
- 'perturbations': json.dumps(perturbations),
- 'visualization_mode': 'none'
- }
- )
-
- assert response.status_code == 202
- job_id = response.json()['job_id']
-
- # Wait and verify
- # ... (similar polling logic)
-```
-
-### 10.3 Load Tests (Optional)
-
-**Test Multiple Concurrent Batches:**
-```python
-import concurrent.futures
-
-def submit_batch(client, num_images):
- files = [...] # Create files
- response = client.post('/api/detect-batch', files=files)
- return response.json()['job_id']
-
-def test_concurrent_batches():
- with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
- futures = [
- executor.submit(submit_batch, client, 10)
- for _ in range(5)
- ]
-
- job_ids = [f.result() for f in futures]
-
- # Verify all jobs eventually complete
- # ...
-```
-
----
-
-## 11. Migration Path
-
-###
\ No newline at end of file