Commit 0c00a14 · zeeshan · Parent: c4ed69a

Inference Fix

Changed files:
- deployment/backend/README.md +0 -2292
- deployment/backend/README_Version_TWO.md +0 -0
- deployment/backend/README_Version_Three.md +0 -0
- deployment/backend/backend.py +634 -66
- deployment/backend/backend_adaptive.py +0 -500
- deployment/backend/backend_demo.py +0 -366
- deployment/backend/backend_lite.py +0 -618
- deployment/backend/{backend_two.py → backend_old.py} +0 -0
- deployment/backend/perturbations/spatial.py +23 -15
- deployment/backend/perturbations_simple.py +516 -0
- deployment/backend/register_dino.py +68 -0
- frontend/index.html +7 -1
- frontend/script.js +348 -77
- setup.sh +59 -0
- start.sh +0 -143
deployment/backend/README.md (DELETED)

@@ -1,2292 +0,0 @@
# RoDLA Document Layout Analysis API

<div align="center">

<!-- Status badges (image links lost in extraction) -->

**A Production-Ready API for Robust Document Layout Analysis**

[Features](#features) • [Installation](#installation) • [Quick Start](#quick-start) • [API Reference](#api-reference) • [Architecture](#architecture-deep-dive) • [Metrics](#metrics-system)

</div>

---
## Table of Contents

1. [Overview](#overview)
2. [Features](#features)
3. [System Requirements](#system-requirements)
4. [Installation](#installation)
5. [Quick Start](#quick-start)
6. [Project Structure](#project-structure)
7. [Architecture Deep Dive](#architecture-deep-dive)
8. [Configuration](#configuration)
9. [API Reference](#api-reference)
10. [Metrics System](#metrics-system)
11. [Visualization Engine](#visualization-engine)
12. [Services Layer](#services-layer)
13. [Utilities Reference](#utilities-reference)
14. [Error Handling](#error-handling)
15. [Performance Optimization](#performance-optimization)
16. [Security Considerations](#security-considerations)
17. [Testing](#testing)
18. [Deployment](#deployment)
19. [Troubleshooting](#troubleshooting)
20. [Contributing](#contributing)
21. [Citation](#citation)
22. [License](#license)

---
## Overview

### What is RoDLA?

RoDLA (Robust Document Layout Analysis) is a state-of-the-art deep learning model for detecting and classifying layout elements in document images. Published at **CVPR 2024**, it focuses on robustness to various perturbations including noise, blur, and geometric distortions.

### What is this API?

This repository provides a **production-ready FastAPI wrapper** around the RoDLA model, featuring:

- RESTful API endpoints for document analysis
- Comprehensive metrics calculation (20+ metrics)
- Automated visualization generation (8 chart types)
- Robustness assessment based on the RoDLA paper
- Human-readable interpretation of results
- Modular, maintainable code architecture

### Key Statistics

| Metric | Value |
|--------|-------|
| Clean mAP (M6Doc) | 70.0% |
| Perturbed Average mAP | 61.7% |
| mRD Score | 147.6 |
| Max Detections/Image | 300 |
| Supported Classes | 74 (M6Doc) |

---
## Features

### Core Capabilities

| Feature | Description |
|---------|-------------|
| **Multi-class Detection** | Detect 74+ document element types |
| **Comprehensive Metrics** | 20+ analytical metrics per image |
| **Auto Visualization** | 8 chart types generated automatically |
| **Robustness Analysis** | mPE and mRD estimation |
| **Smart Interpretation** | Human-readable analysis summaries |
| **GPU Acceleration** | CUDA support for fast inference |
| **Flexible Output** | JSON, annotated images, or both |

### Document Element Types

The model can detect various document elements including:

```
Text Elements:       Structural Elements:   Visual Elements:
├── Paragraph        ├── Header             ├── Figure
├── Title            ├── Footer             ├── Table
├── Caption          ├── Page Number        ├── Chart
├── List             ├── Section            ├── Logo
├── Footnote         ├── Column             ├── Stamp
└── Abstract         └── Margin             └── Equation
```

---
## System Requirements

### Hardware Requirements

| Component | Minimum | Recommended |
|-----------|---------|-------------|
| CPU | 4 cores | 8+ cores |
| RAM | 16 GB | 32 GB |
| GPU | 8 GB VRAM | 16+ GB VRAM |
| Storage | 10 GB | 20 GB |

### Software Requirements

| Software | Version |
|----------|---------|
| Python | 3.8 - 3.10 |
| CUDA | 11.7+ |
| cuDNN | 8.5+ |
| OS | Linux (Ubuntu 20.04+) / WSL2 |

### Python Dependencies

```
# Core Framework
fastapi>=0.100.0
uvicorn>=0.23.0
python-multipart>=0.0.6

# ML/Deep Learning
torch>=2.0.0
mmdet>=3.0.0
mmcv>=2.0.0

# Data Processing
numpy>=1.24.0
pillow>=9.5.0

# Visualization
matplotlib>=3.7.0
seaborn>=0.12.0

# Utilities
pydantic>=2.0.0
```

---
## Installation

### Step 1: Clone the Repository

```bash
git clone https://github.com/yourusername/rodla-api.git
cd rodla-api
```

### Step 2: Create Virtual Environment

```bash
# Using conda (recommended)
conda create -n rodla python=3.9
conda activate rodla

# Or using venv
python -m venv venv
source venv/bin/activate    # Linux/Mac
.\venv\Scripts\activate     # Windows
```

### Step 3: Install PyTorch with CUDA

```bash
# For CUDA 11.8
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```

### Step 4: Install MMDetection

```bash
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
mim install "mmdet>=3.0.0"
```

Note: the version specifiers are quoted so the shell does not interpret `>=` as a redirection.

### Step 5: Install Project Dependencies

```bash
pip install -r requirements.txt
```

### Step 6: Download Model Weights

```bash
# Download from official source
wget https://path-to-weights/rodla_internimage_xl_m6doc.pth -O weights/rodla_internimage_xl_m6doc.pth
```

### Step 7: Configure Paths

Edit `config/settings.py`:

```python
REPO_ROOT = Path("/path/to/your/RoDLA")
MODEL_CONFIG = REPO_ROOT / "model/configs/m6doc/rodla_internimage_xl_m6doc.py"
MODEL_WEIGHTS = REPO_ROOT / "rodla_internimage_xl_m6doc.pth"
```

---
## Quick Start

### Starting the Server

```bash
# Development mode
python backend.py

# Production mode with uvicorn
uvicorn backend:app --host 0.0.0.0 --port 8000 --workers 1
```

### Making Your First Request

```bash
# Using curl
curl -X POST "http://localhost:8000/api/detect" \
  -H "accept: application/json" \
  -F "file=@document.jpg" \
  -F "score_thr=0.3"

# Get model information
curl http://localhost:8000/api/model-info
```

### Python Client Example

```python
import requests

# Upload and analyze document
with open("document.pdf", "rb") as f:
    response = requests.post(
        "http://localhost:8000/api/detect",
        files={"file": f},
        data={
            "score_thr": "0.3",
            "return_image": "false",
            "generate_visualizations": "true"
        }
    )

result = response.json()
print(f"Detected {result['core_results']['summary']['total_detections']} elements")
```

---
## Project Structure

```
deployment/
├── backend.py                 # Main FastAPI application entry point
├── requirements.txt           # Python dependencies
├── README.md                  # This documentation
│
├── config/                    # Configuration Layer
│   ├── __init__.py            # Package initializer
│   └── settings.py            # All configuration constants
│
├── core/                      # Core Application Layer
│   ├── __init__.py            # Package initializer
│   ├── model_loader.py        # Singleton model management
│   └── dependencies.py        # FastAPI dependency injection
│
├── api/                       # API Layer
│   ├── __init__.py            # Package initializer
│   ├── routes.py              # API endpoint definitions
│   └── schemas.py             # Pydantic request/response models
│
├── services/                  # Business Logic Layer
│   ├── __init__.py            # Package initializer
│   ├── detection.py           # Core detection logic
│   ├── processing.py          # Result aggregation
│   ├── visualization.py       # Chart generation (350+ lines)
│   └── interpretation.py      # Human-readable insights
│
├── utils/                     # Utility Layer
│   ├── __init__.py            # Package initializer
│   ├── helpers.py             # General helper functions
│   ├── serialization.py       # JSON conversion utilities
│   └── metrics/               # Metrics calculation modules
│       ├── __init__.py        # Metrics package initializer
│       ├── core.py            # Core detection metrics
│       ├── rodla.py           # RoDLA-specific metrics
│       ├── spatial.py         # Spatial distribution analysis
│       └── quality.py         # Quality & complexity metrics
│
└── outputs/                   # Output Directory
    ├── *.json                 # Detection results
    └── *.png                  # Visualization images
```

### File Count Summary

| Layer | Files | Purpose |
|-------|-------|---------|
| Config | 2 | Configuration management |
| Core | 3 | Model and dependency management |
| API | 3 | HTTP endpoints and schemas |
| Services | 5 | Business logic implementation |
| Utils | 7 | Helper functions and metrics |
| **Total** | **21** | Complete modular architecture |

---
## Architecture Deep Dive

### Layered Architecture

```
CLIENT LAYER (Web Browser / API Clients)
        |  HTTP requests
        v
API LAYER (api/routes.py)
        GET /api/model-info, POST /api/detect
        |  validated requests
        v
SERVICES LAYER
        detection.py       - inference, result processing
        processing.py      - aggregation, JSON persistence
        visualization.py   - 8 chart types, base64 encoding
        interpretation.py  - human-readable insights
        |  data processing
        v
UTILITIES LAYER
        utils/metrics/ (core.py, rodla.py, spatial.py, quality.py)
        helpers.py, serialization.py
        |  model operations
        v
CORE LAYER
        model_loader.py    - singleton pattern, GPU management, lazy loading
        dependencies.py    - FastAPI dependency injection, model injection
        |  configuration
        v
CONFIG LAYER (config/settings.py)
        paths, constants, baseline metrics, thresholds
```

### Design Patterns Used

| Pattern | Location | Purpose |
|---------|----------|---------|
| **Singleton** | `model_loader.py` | Single model instance |
| **Factory** | `visualization.py` | Create multiple chart types |
| **Dependency Injection** | `dependencies.py` | Inject model into routes |
| **Repository** | `processing.py` | Abstract data persistence |
| **Facade** | `routes.py` | Simplify complex subsystems |
| **Strategy** | `metrics/` | Interchangeable metric algorithms |

### Data Flow Diagram

```
Image File -> Upload Handler -> Temp File -> Model Inference
          -> Raw Results -> Process Detections -> Calculate Metrics
          -> Generate Visualizations -> Generate Interpretation
          -> Assemble Response -> JSON Response
```

---
## Configuration

### config/settings.py

This file centralizes all configuration parameters.

```python
"""
Configuration Settings Module
=============================
All application constants and configuration in one place.
"""

from pathlib import Path

# =============================================================================
# PATH CONFIGURATION
# =============================================================================

# Root directory of the RoDLA model repository
REPO_ROOT = Path("/mnt/d/MyStuff/University/Current/CV/Project/RoDLA")

# Model configuration file path
MODEL_CONFIG = REPO_ROOT / "model/configs/m6doc/rodla_internimage_xl_m6doc.py"

# Pre-trained model weights path
MODEL_WEIGHTS = REPO_ROOT / "rodla_internimage_xl_m6doc.pth"

# Output directory for results and visualizations
OUTPUT_DIR = Path("outputs")

# =============================================================================
# MODEL CONFIGURATION
# =============================================================================

# Default confidence threshold for detections
DEFAULT_SCORE_THRESHOLD = 0.3

# Maximum number of detections per image
MAX_DETECTIONS = 300

# Model metadata
MODEL_INFO = {
    "name": "RoDLA InternImage-XL",
    "paper": "RoDLA: Benchmarking the Robustness of Document Layout Analysis Models",
    "conference": "CVPR 2024",
    "backbone": "InternImage-XL",
    "framework": "DINO with Channel Attention + Average Pooling",
    "dataset": "M6Doc-P"
}

# =============================================================================
# BASELINE PERFORMANCE METRICS
# =============================================================================

# Clean performance baselines from the RoDLA paper (mAP scores)
BASELINE_MAP = {
    "M6Doc": 70.0,      # Main evaluation dataset
    "PubLayNet": 96.0,  # Scientific documents
    "DocLayNet": 80.5   # Diverse document types
}

# State-of-the-art performance metrics
SOTA_PERFORMANCE = {
    "clean_mAP": 70.0,
    "perturbed_avg_mAP": 61.7,
    "mRD_score": 147.6
}

# =============================================================================
# ANALYSIS THRESHOLDS
# =============================================================================

# Size distribution thresholds (as fraction of image area)
SIZE_THRESHOLDS = {
    "tiny": 0.005,   # < 0.5% of image
    "small": 0.02,   # 0.5% - 2%
    "medium": 0.1,   # 2% - 10%
    "large": 1.0     # >= 10%
}

# Confidence level thresholds
CONFIDENCE_THRESHOLDS = {
    "very_high": 0.9,
    "high": 0.8,
    "medium": 0.6,
    "low": 0.4
}

# Robustness assessment thresholds
ROBUSTNESS_THRESHOLDS = {
    "mPE_low": 20,
    "mPE_medium": 40,
    "mRD_excellent": 100,
    "mRD_good": 150,
    "cv_stable": 0.15,
    "cv_moderate": 0.30
}

# Complexity scoring weights
COMPLEXITY_WEIGHTS = {
    "class_diversity": 30,
    "detection_count": 30,
    "density": 20,
    "clustering": 20
}

# =============================================================================
# API CONFIGURATION
# =============================================================================

# CORS settings
CORS_ORIGINS = ["*"]  # Restrict in production
CORS_METHODS = ["*"]
CORS_HEADERS = ["*"]

# API metadata
API_TITLE = "RoDLA Object Detection API"
API_VERSION = "1.0.0"
API_DESCRIPTION = "Production-ready API for Robust Document Layout Analysis"

# =============================================================================
# VISUALIZATION CONFIGURATION
# =============================================================================

# Figure sizes for different chart types
FIGURE_SIZES = {
    "bar_chart": (12, 6),
    "histogram": (10, 6),
    "heatmap": (10, 8),
    "boxplot": (12, 6),
    "scatter": (10, 6),
    "pie": (8, 8)
}

# Color schemes
COLOR_SCHEMES = {
    "primary": "steelblue",
    "secondary": "forestgreen",
    "accent": "coral",
    "heatmap": "YlOrRd",
    "scatter": "viridis"
}

# DPI for saved images
VISUALIZATION_DPI = 100
```

### Environment Variables

For production deployments, use environment variables:

```bash
# .env file
RODLA_REPO_ROOT=/path/to/RoDLA
RODLA_MODEL_CONFIG=model/configs/m6doc/rodla_internimage_xl_m6doc.py
RODLA_MODEL_WEIGHTS=rodla_internimage_xl_m6doc.pth
RODLA_OUTPUT_DIR=outputs
RODLA_DEFAULT_THRESHOLD=0.3
RODLA_API_HOST=0.0.0.0
RODLA_API_PORT=8000
```
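
The `settings.py` shown above hard-codes its paths, so wiring these variables in is left to the deployer. A minimal sketch of how they could be picked up, assuming plain `os.environ` lookups with the defaults listed above (the variable names come from the `.env` example; this is not the repository's actual implementation):

```python
# Hypothetical sketch: override config/settings.py values from environment
# variables. Assumes the RODLA_* names from the .env example above.
import os
from pathlib import Path

REPO_ROOT = Path(os.environ.get("RODLA_REPO_ROOT", "/path/to/RoDLA"))
MODEL_CONFIG = REPO_ROOT / os.environ.get(
    "RODLA_MODEL_CONFIG", "model/configs/m6doc/rodla_internimage_xl_m6doc.py"
)
MODEL_WEIGHTS = REPO_ROOT / os.environ.get(
    "RODLA_MODEL_WEIGHTS", "rodla_internimage_xl_m6doc.pth"
)
OUTPUT_DIR = Path(os.environ.get("RODLA_OUTPUT_DIR", "outputs"))
DEFAULT_SCORE_THRESHOLD = float(os.environ.get("RODLA_DEFAULT_THRESHOLD", "0.3"))
API_HOST = os.environ.get("RODLA_API_HOST", "0.0.0.0")
API_PORT = int(os.environ.get("RODLA_API_PORT", "8000"))
```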

---
## API Reference

### Endpoints Overview

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/model-info` | Get model metadata |
| POST | `/api/detect` | Analyze document image |
| GET | `/health` | Health check (if implemented) |
| GET | `/docs` | Swagger UI documentation |
| GET | `/redoc` | ReDoc documentation |

---

### GET /api/model-info

Returns comprehensive information about the loaded model.

#### Request

```http
GET /api/model-info HTTP/1.1
Host: localhost:8000
```

#### Response

```json
{
  "model_name": "RoDLA InternImage-XL",
  "paper": "RoDLA: Benchmarking the Robustness of Document Layout Analysis Models (CVPR 2024)",
  "num_classes": 74,
  "classes": [
    "paragraph", "title", "figure", "table", "caption",
    "header", "footer", "page_number", "list", "abstract",
    // ... additional classes
  ],
  "backbone": "InternImage-XL",
  "detection_framework": "DINO with Channel Attention + Average Pooling",
  "dataset": "M6Doc-P",
  "max_detections_per_image": 300,
  "state_of_the_art_performance": {
    "clean_mAP": 70.0,
    "perturbed_avg_mAP": 61.7,
    "mRD_score": 147.6
  }
}
```

#### Error Responses

| Status | Description |
|--------|-------------|
| 500 | Model not loaded |

---

### POST /api/detect

Analyzes a document image and returns comprehensive detection results.

#### Request

```http
POST /api/detect HTTP/1.1
Host: localhost:8000
Content-Type: multipart/form-data

file: <binary image data>
score_thr: "0.3"
return_image: "false"
save_json: "true"
generate_visualizations: "true"
```

#### Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `file` | File | Required | Image file (JPEG, PNG, etc.) |
| `score_thr` | string | "0.3" | Confidence threshold (0.0-1.0) |
| `return_image` | string | "false" | Return annotated image instead of JSON |
| `save_json` | string | "true" | Save results to disk |
| `generate_visualizations` | string | "true" | Generate visualization charts |

#### Response (JSON Mode)

```json
{
  "success": true,
  "timestamp": "2024-01-15T10:30:45.123456",
  "filename": "document.jpg",

  "image_info": {
    "width": 2480,
    "height": 3508,
    "aspect_ratio": 0.707,
    "total_pixels": 8699840
  },

  "detection_config": {
    "score_threshold": 0.3,
    "model": "RoDLA InternImage-XL",
    "framework": "DINO with Robustness Enhancement",
    "max_detections": 300
  },

  "core_results": {
    "summary": {
      "total_detections": 47,
      "unique_classes": 12,
      "average_confidence": 0.7823,
      "median_confidence": 0.8156,
      "min_confidence": 0.3012,
      "max_confidence": 0.9876,
      "coverage_percentage": 68.45,
      "average_detection_area": 126543.21
    },
    "detections": [/* top 20 detections */]
  },

  "rodla_metrics": {
    "note": "Estimated metrics...",
    "estimated_mPE": 18.45,
    "estimated_mRD": 87.32,
    "confidence_std": 0.1234,
    "confidence_range": 0.6864,
    "robustness_score": 56.34,
    "interpretation": {
      "mPE_level": "low",
      "mRD_level": "excellent",
      "overall_robustness": "medium"
    }
  },

  "spatial_analysis": {
    "horizontal_distribution": {...},
    "vertical_distribution": {...},
    "quadrant_distribution": {...},
    "size_distribution": {...},
    "density_metrics": {...}
  },

  "class_analysis": {
    "paragraph": {
      "count": 15,
      "percentage": 31.91,
      "confidence_stats": {...},
      "area_stats": {...},
      "aspect_ratio_stats": {...}
    },
    // ... other classes
  },

  "confidence_analysis": {
    "distribution": {...},
    "binned_distribution": {...},
    "percentages": {...},
    "entropy": 2.3456
  },

  "robustness_indicators": {
    "stability_score": 87.65,
    "coefficient_of_variation": 0.1234,
    "high_confidence_ratio": 0.7234,
    "prediction_consistency": "high",
    "model_certainty": "medium",
    "robustness_rating": {
      "rating": "good",
      "score": 72.34
    }
  },

  "layout_complexity": {
    "class_diversity": 12,
    "total_elements": 47,
    "detection_density": 5.41,
    "average_element_distance": 234.56,
    "complexity_score": 58.23,
    "complexity_level": "moderate",
    "layout_characteristics": {
      "is_dense": true,
      "is_diverse": true,
      "is_structured": false
    }
  },

  "quality_metrics": {
    "overlap_analysis": {...},
    "size_consistency": {...},
    "detection_quality_score": 82.45
  },

  "visualizations": {
    "class_distribution": "data:image/png;base64,...",
    "confidence_distribution": "data:image/png;base64,...",
    "spatial_heatmap": "data:image/png;base64,...",
    "confidence_by_class": "data:image/png;base64,...",
    "area_vs_confidence": "data:image/png;base64,...",
    "quadrant_distribution": "data:image/png;base64,...",
    "size_distribution": "data:image/png;base64,...",
    "top_classes_confidence": "data:image/png;base64,..."
  },

  "interpretation": {
    "overview": "Document Analysis Summary...",
    "top_elements": "The most common elements are...",
    "rodla_analysis": "RoDLA Robustness Analysis...",
    "layout_complexity": "Layout Complexity...",
    "key_findings": [...],
    "perturbation_assessment": "...",
    "recommendations": [...],
    "confidence_summary": {...}
  },

  "all_detections": [/* complete detection list */]
}
```

#### Response (Image Mode)

When `return_image=true`, the annotated image is returned directly:

```http
HTTP/1.1 200 OK
Content-Type: image/jpeg
Content-Disposition: attachment; filename="annotated_document.jpg"

<binary image data>
```
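
A small client-side sketch of this mode, using only the endpoint and form fields documented above, that requests the annotated image and writes it to disk:

```python
# Sketch: request the annotated image (return_image=true) and save it locally.
import requests

with open("document.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/api/detect",
        files={"file": f},
        data={"score_thr": "0.3", "return_image": "true"},
    )

response.raise_for_status()
with open("annotated_document.jpg", "wb") as out:
    out.write(response.content)  # raw JPEG bytes from the response body
```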

#### Error Responses

| Status | Description |
|--------|-------------|
| 400 | Invalid file type (not an image) |
| 500 | Model inference failed |
| 500 | Visualization generation failed |

---
## Metrics System

### Metrics Architecture

```
utils/metrics/
├── __init__.py   # Exports all metric functions
├── core.py       # Core detection metrics
├── rodla.py      # RoDLA-specific robustness metrics
├── spatial.py    # Spatial distribution analysis
└── quality.py    # Quality and complexity metrics
```

### Core Metrics (utils/metrics/core.py)

#### `calculate_core_metrics(detections, img_width, img_height)`

Computes fundamental detection statistics.

| Metric | Type | Description |
|--------|------|-------------|
| `total_detections` | int | Number of detected elements |
| `unique_classes` | int | Number of distinct element types |
| `average_confidence` | float | Mean confidence score |
| `median_confidence` | float | Median confidence score |
| `min_confidence` | float | Lowest confidence |
| `max_confidence` | float | Highest confidence |
| `coverage_percentage` | float | % of image covered by detections |
| `average_detection_area` | float | Mean area per detection |
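
A minimal sketch of how these statistics can be derived from the detection dictionaries produced by `process_detections()` (documented under the Services Layer); the repository's `core.py` may differ in detail:

```python
# Sketch (not the repo's exact implementation): core statistics from the
# detection dictionaries produced by process_detections().
import statistics

def calculate_core_metrics(detections, img_width, img_height):
    confidences = [d["confidence"] for d in detections]
    areas = [d["area"] for d in detections]
    image_area = img_width * img_height
    return {
        "total_detections": len(detections),
        "unique_classes": len({d["class_name"] for d in detections}),
        "average_confidence": statistics.mean(confidences) if confidences else 0.0,
        "median_confidence": statistics.median(confidences) if confidences else 0.0,
        "min_confidence": min(confidences, default=0.0),
        "max_confidence": max(confidences, default=0.0),
        # Note: boxes may overlap, so summed areas can exceed the image area.
        "coverage_percentage": 100.0 * sum(areas) / image_area if image_area else 0.0,
        "average_detection_area": statistics.mean(areas) if areas else 0.0,
    }
```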

#### `calculate_class_metrics(detections)`

Per-class statistical analysis.

```python
{
    "paragraph": {
        "count": 15,
        "percentage": 31.91,
        "confidence_stats": {
            "mean": 0.8234,
            "std": 0.0876,
            "min": 0.6543,
            "max": 0.9654
        },
        "area_stats": {
            "mean": 125432.5,
            "std": 45678.2,
            "total": 1881487.5
        },
        "aspect_ratio_stats": {
            "mean": 2.345,
            "orientation": "horizontal"  # horizontal/vertical/square
        }
    }
}
```

#### `calculate_confidence_metrics(detections)`

Detailed confidence distribution analysis.

| Component | Description |
|-----------|-------------|
| `distribution` | Statistical measures (mean, median, std, quartiles) |
| `binned_distribution` | Count per confidence range |
| `percentages` | Percentage per confidence range |
| `entropy` | Shannon entropy of distribution |

**Confidence Bins:**
- Very High: 0.9 - 1.0
- High: 0.8 - 0.9
- Medium: 0.6 - 0.8
- Low: 0.4 - 0.6
- Very Low: 0.0 - 0.4

---

### RoDLA Metrics (utils/metrics/rodla.py)

These metrics are specific to the RoDLA paper's robustness evaluation framework.

#### `calculate_rodla_metrics(detections, core_metrics)`

Estimates perturbation effects and robustness degradation.

| Metric | Formula | Interpretation |
|--------|---------|----------------|
| `estimated_mPE` | `(conf_std × 100) + (conf_range × 50)` | Mean Perturbation Effect |
| `estimated_mRD` | `(degradation / mPE) × 100` | Mean Robustness Degradation |
| `robustness_score` | `(1 - mRD/200) × 100` | Overall robustness (0-100) |

**mPE Interpretation:**
```
low:    mPE < 20       → Minimal perturbation effect
medium: 20 ≤ mPE < 40  → Moderate perturbation
high:   mPE ≥ 40       → Significant perturbation
```

**mRD Interpretation:**
```
excellent:         mRD < 100        → Highly robust
good:              100 ≤ mRD < 150  → Acceptable robustness
needs_improvement: mRD ≥ 150        → Robustness concerns
```
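
The table above gives the estimators but does not spell out the `degradation` term. A sketch under the assumption that degradation is the estimated drop from the clean M6Doc baseline mAP; the repository's `rodla.py` may compute it differently, and the `estimated_map` argument here is purely illustrative:

```python
# Sketch of the estimators above. Assumption: `degradation` is the estimated
# drop from the clean baseline mAP (70.0 on M6Doc); rodla.py may differ.
import statistics

CLEAN_MAP_M6DOC = 70.0

def estimate_rodla_metrics(confidences, estimated_map):
    conf_std = statistics.pstdev(confidences) if len(confidences) > 1 else 0.0
    conf_range = max(confidences) - min(confidences) if confidences else 0.0

    estimated_mpe = (conf_std * 100) + (conf_range * 50)
    degradation = max(CLEAN_MAP_M6DOC - estimated_map, 0.0)
    estimated_mrd = (degradation / estimated_mpe) * 100 if estimated_mpe else 0.0
    robustness_score = (1 - estimated_mrd / 200) * 100

    mpe_level = "low" if estimated_mpe < 20 else "medium" if estimated_mpe < 40 else "high"
    mrd_level = ("excellent" if estimated_mrd < 100
                 else "good" if estimated_mrd < 150 else "needs_improvement")
    return {
        "estimated_mPE": estimated_mpe,
        "estimated_mRD": estimated_mrd,
        "robustness_score": robustness_score,
        "interpretation": {"mPE_level": mpe_level, "mRD_level": mrd_level},
    }
```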

#### `calculate_robustness_indicators(detections, core_metrics)`

Stability and consistency metrics.

```python
{
    "stability_score": 87.65,           # (1 - CV) × 100
    "coefficient_of_variation": 0.12,   # std / mean
    "high_confidence_ratio": 0.72,      # % with conf >= 0.8
    "prediction_consistency": "high",   # Based on CV
    "model_certainty": "medium",        # Based on avg conf
    "robustness_rating": {
        "rating": "good",               # excellent/good/fair/poor
        "score": 72.34                  # Composite score
    }
}
```

**Robustness Rating Formula:**
```
score = (avg_conf × 40) + ((1 - CV) × 30) + (high_conf_ratio × 30)

Rating:
- excellent: score ≥ 80
- good:      60 ≤ score < 80
- fair:      40 ≤ score < 60
- poor:      score < 40
```
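
A direct transcription of that composite score into code (a sketch; bucket boundaries follow the rating table above):

```python
# Sketch: composite robustness rating from the formula above.
def robustness_rating(avg_conf, cv, high_conf_ratio):
    score = (avg_conf * 40) + ((1 - cv) * 30) + (high_conf_ratio * 30)
    if score >= 80:
        rating = "excellent"
    elif score >= 60:
        rating = "good"
    elif score >= 40:
        rating = "fair"
    else:
        rating = "poor"
    return {"rating": rating, "score": round(score, 2)}
```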

---

### Spatial Metrics (utils/metrics/spatial.py)

#### `calculate_spatial_analysis(detections, img_width, img_height)`

Comprehensive spatial distribution analysis.

##### Horizontal Distribution
```python
{
    "mean": 1240.5,       # Mean x-coordinate
    "std": 456.7,         # Standard deviation
    "skewness": -0.234,   # Distribution asymmetry
    "left_third": 12,     # Count in left 33%
    "center_third": 25,   # Count in center 33%
    "right_third": 10     # Count in right 33%
}
```

##### Vertical Distribution
```python
{
    "mean": 1754.2,       # Mean y-coordinate
    "std": 892.4,         # Standard deviation
    "skewness": 0.156,    # Distribution asymmetry
    "top_third": 8,       # Count in top 33%
    "middle_third": 22,   # Count in middle 33%
    "bottom_third": 17    # Count in bottom 33%
}
```

##### Quadrant Distribution
```
Document divided into 4 quadrants:
┌─────────┬─────────┐
│   Q1    │   Q2    │
│ (top-L) │ (top-R) │
├─────────┼─────────┤
│   Q3    │   Q4    │
│ (bot-L) │ (bot-R) │
└─────────┴─────────┘
```

##### Size Distribution
| Category | Threshold | Description |
|----------|-----------|-------------|
| tiny | < 0.5% of image | Very small elements |
| small | 0.5% - 2% | Small elements |
| medium | 2% - 10% | Medium elements |
| large | ≥ 10% | Large elements |

##### Density Metrics
```python
{
    "average_nearest_neighbor_distance": 234.56,  # pixels
    "spatial_clustering_score": 0.67              # 0-1, higher = more clustered
}
```
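
To make the spatial buckets concrete, here is a sketch of assigning detection centers to quadrants and size categories, assuming the bbox center fields from `process_detections()` and the size thresholds above (the repository's `spatial.py` may differ):

```python
# Sketch: bucket detections by quadrant and size category, using the bbox
# center fields shown in services/detection.py and the thresholds above.
from collections import Counter

def quadrant_and_size_distribution(detections, img_width, img_height):
    quadrants, sizes = Counter(), Counter()
    image_area = img_width * img_height
    for det in detections:
        cx, cy = det["bbox"]["center_x"], det["bbox"]["center_y"]
        left = cx < img_width / 2
        top = cy < img_height / 2
        quadrants["Q1" if (top and left) else
                  "Q2" if top else
                  "Q3" if left else "Q4"] += 1

        frac = det["area"] / image_area
        if frac < 0.005:
            sizes["tiny"] += 1
        elif frac < 0.02:
            sizes["small"] += 1
        elif frac < 0.1:
            sizes["medium"] += 1
        else:
            sizes["large"] += 1
    return {"quadrant_distribution": dict(quadrants), "size_distribution": dict(sizes)}
```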

---

### Quality Metrics (utils/metrics/quality.py)

#### `calculate_layout_complexity(detections, img_width, img_height)`

Quantifies document structure complexity.

**Complexity Score Formula:**
```
score = (class_diversity / 20) × 30          # Max 20 classes
      + min(detections / 50, 1) × 30         # Detection count
      + min(density / 10, 1) × 20            # Spatial density
      + (1 - min(avg_dist / 500, 1)) × 20    # Clustering
```

**Complexity Levels:**
| Level | Score Range | Description |
|-------|-------------|-------------|
| simple | < 30 | Basic document layout |
| moderate | 30 - 60 | Average complexity |
| complex | ≥ 60 | Complex multi-element layout |

**Layout Characteristics:**
```python
{
    "is_dense": True,        # density > 5 elements/megapixel
    "is_diverse": True,      # unique_classes >= 10
    "is_structured": False   # avg_distance < 200 pixels
}
```
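
A sketch translating the complexity formula and levels above into code; it assumes density is measured in detections per megapixel and `avg_dist` is the average nearest-neighbour distance from the spatial metrics:

```python
# Sketch of the complexity score above. Assumptions: density = detections per
# megapixel, avg_dist = average nearest-neighbour distance in pixels.
def layout_complexity(unique_classes, total_detections, density, avg_dist):
    score = (
        (unique_classes / 20) * 30
        + min(total_detections / 50, 1) * 30
        + min(density / 10, 1) * 20
        + (1 - min(avg_dist / 500, 1)) * 20
    )
    level = "simple" if score < 30 else "moderate" if score < 60 else "complex"
    return {
        "complexity_score": round(score, 2),
        "complexity_level": level,
        "layout_characteristics": {
            "is_dense": density > 5,
            "is_diverse": unique_classes >= 10,
            "is_structured": avg_dist < 200,
        },
    }
```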

#### `calculate_quality_metrics(detections, img_width, img_height)`

Detection quality assessment.

##### Overlap Analysis
```python
{
    "total_overlapping_pairs": 5,   # Number of overlapping detection pairs
    "overlap_percentage": 10.64,    # % of detections with overlaps
    "average_iou": 0.1234           # Mean IoU of overlapping pairs
}
```

##### Size Consistency
```python
{
    "coefficient_of_variation": 0.876,  # std/mean of areas
    "consistency_level": "medium"       # high (<0.5), medium (0.5-1), low (>1)
}
```

##### Detection Quality Score
```
score = (1 - min(overlap_% / 100, 1)) × 50 + (1 - min(size_cv, 1)) × 50
```
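
A sketch of the quality score above, reusing an IoU helper like the one listed under `utils/helpers.py` to count overlapping detections (the pairwise loop is quadratic, which is acceptable given the 300-detection cap):

```python
# Sketch: detection quality score from overlap percentage and size CV.
# iou(a, b) is assumed to behave like the calculate_iou helper described below.
import statistics
from itertools import combinations

def detection_quality_score(detections, iou):
    # Count detections that overlap with at least one other detection.
    overlapping = set()
    for (i, a), (j, b) in combinations(enumerate(detections), 2):
        if iou(a["bbox"], b["bbox"]) > 0:
            overlapping.update((i, j))
    overlap_pct = 100.0 * len(overlapping) / len(detections) if detections else 0.0

    areas = [d["area"] for d in detections]
    size_cv = (statistics.pstdev(areas) / statistics.mean(areas)) if len(areas) > 1 else 0.0

    return (1 - min(overlap_pct / 100, 1)) * 50 + (1 - min(size_cv, 1)) * 50
```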

---

## Visualization Engine

### services/visualization.py

The visualization engine generates 8 distinct chart types, each providing unique insights into the detection results.

### Chart Types

#### 1. Class Distribution Bar Chart
```
Purpose: Show count of detections per class
Type: Vertical bar chart
Features:
- Sorted by count (descending)
- Value labels on bars
- Rotated x-axis labels for readability
- Grid lines for easy reading
```

#### 2. Confidence Distribution Histogram
```
Purpose: Show distribution of confidence scores
Type: Histogram with 20 bins
Features:
- Mean line (red dashed)
- Median line (orange dashed)
- Legend with exact values
- Grid lines
```

#### 3. Spatial Distribution Heatmap
```
Purpose: Visualize where detections are concentrated
Type: 2D histogram heatmap
Features:
- YlOrRd colormap (yellow to red)
- Colorbar showing density
- Axes showing pixel coordinates
```

#### 4. Confidence by Class Box Plot
```
Purpose: Compare confidence distributions across classes
Type: Box plot
Features:
- Top 10 classes by count
- Sample sizes in labels
- Median, quartiles, outliers
- Light blue boxes
```

#### 5. Area vs Confidence Scatter Plot
```
Purpose: Examine relationship between size and confidence
Type: Scatter plot
Features:
- Color-coded by confidence (viridis)
- Colorbar showing scale
- Grid for reading values
```

#### 6. Quadrant Distribution Pie Chart
```
Purpose: Show spatial distribution by quadrant
Type: Pie chart
Features:
- 4 segments (Q1-Q4)
- Percentage labels
- Element counts in labels
- Distinct colors per quadrant
```

#### 7. Size Distribution Bar Chart
```
Purpose: Show distribution of detection sizes
Type: Vertical bar chart
Features:
- 4 categories (tiny, small, medium, large)
- Distinct color per category
- Value labels on bars
```

#### 8. Top Classes by Average Confidence
```
Purpose: Identify most confidently detected classes
Type: Horizontal bar chart
Features:
- Top 15 classes
- Sorted by confidence
- Value labels
- Coral color scheme
```

### Technical Implementation

```python
def generate_comprehensive_visualizations(
    detections: List[dict],
    class_metrics: dict,
    confidence_metrics: dict,
    spatial_metrics: dict,
    img_width: int,
    img_height: int
) -> dict:
    """
    Generate all visualization types.

    Returns:
        Dictionary with base64-encoded PNG images
    """
    visualizations = {}

    # Each visualization is wrapped in try-except for isolation
    try:
        fig, ax = plt.subplots(figsize=(12, 6))
        # ... chart generation code ...
        visualizations['chart_name'] = fig_to_base64(fig)
        plt.close(fig)  # Prevent memory leaks
    except Exception as e:
        print(f"Error generating chart: {e}")

    return visualizations
```

### Base64 Encoding

```python
def fig_to_base64(fig) -> str:
    """Convert matplotlib figure to base64 data URI."""
    buffer = BytesIO()
    fig.savefig(buffer, format='png', dpi=100, bbox_inches='tight')
    buffer.seek(0)
    image_base64 = base64.b64encode(buffer.read()).decode()
    buffer.close()
    return f"data:image/png;base64,{image_base64}"
```

### Usage in HTML

```html
<img src="{{ visualizations.class_distribution }}" alt="Class Distribution">
```
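
The same data URIs can also be consumed outside a browser. A small sketch (the helper name is hypothetical, not part of this repo) that strips the `data:image/png;base64,` prefix and writes the PNG to disk:

```python
# Sketch: decode one of the returned data URIs back into a PNG file.
# save_data_uri() is a hypothetical helper, not a function from this repo.
import base64

def save_data_uri(data_uri: str, path: str) -> None:
    header, encoded = data_uri.split(",", 1)  # "data:image/png;base64", payload
    with open(path, "wb") as f:
        f.write(base64.b64decode(encoded))

# Example:
# save_data_uri(result["visualizations"]["class_distribution"], "class_distribution.png")
```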

---

## Services Layer

### services/detection.py

Core detection logic and result processing.

#### `process_detections(result, score_thr=0.3)`

Converts raw model output to structured format.

**Input:** Raw MMDetection result (list of arrays per class)

**Output:** List of detection dictionaries

```python
[
    {
        "class_id": 0,
        "class_name": "paragraph",
        "bbox": {
            "x1": 100.5, "y1": 200.3,
            "x2": 500.8, "y2": 350.2,
            "width": 400.3, "height": 149.9,
            "center_x": 300.65, "center_y": 275.25
        },
        "confidence": 0.9234,
        "area": 60005.0,
        "aspect_ratio": 2.67
    },
    # ... more detections
]
```

**Processing Steps** (a sketch follows the list):
1. Iterate through class results
2. Filter by confidence threshold
3. Extract coordinates and calculate derived values
4. Sort by confidence (descending)
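
A sketch of those four steps for the per-class array format described above (one `(N, 5)` array of `[x1, y1, x2, y2, score]` per class); the repository's `detection.py` may handle additional result formats:

```python
# Sketch of process_detections() for the per-class array format described
# above; class_names is assumed to map class indices to labels.
def process_detections(result, score_thr=0.3, class_names=None):
    detections = []
    for class_id, class_result in enumerate(result):       # 1. iterate classes
        for x1, y1, x2, y2, score in class_result:
            if score < score_thr:                           # 2. filter by threshold
                continue
            width, height = float(x2 - x1), float(y2 - y1)
            detections.append({                             # 3. derived values
                "class_id": class_id,
                "class_name": class_names[class_id] if class_names else str(class_id),
                "bbox": {
                    "x1": float(x1), "y1": float(y1),
                    "x2": float(x2), "y2": float(y2),
                    "width": width, "height": height,
                    "center_x": float(x1) + width / 2,
                    "center_y": float(y1) + height / 2,
                },
                "confidence": float(score),
                "area": width * height,
                "aspect_ratio": width / height if height else 0.0,
            })
    detections.sort(key=lambda d: d["confidence"], reverse=True)  # 4. sort
    return detections
```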

---

### services/processing.py

Result aggregation and persistence.

#### `aggregate_results(...)`

Assembles the complete response object.

```python
def aggregate_results(
    detections: List[dict],
    core_metrics: dict,
    rodla_metrics: dict,
    spatial_metrics: dict,
    class_metrics: dict,
    confidence_metrics: dict,
    robustness_indicators: dict,
    layout_complexity: dict,
    quality_metrics: dict,
    visualizations: dict,
    interpretation: dict,
    file_info: dict,
    config: dict
) -> dict:
    """Combine all analysis results into final response."""
    return {
        "success": True,
        "timestamp": datetime.now().isoformat(),
        # ... all components ...
    }
```

#### `save_results(results, filename, output_dir)`

Persists results to disk.

```python
def save_results(results: dict, filename: str, output_dir: Path) -> Path:
    """
    Save results as JSON file.

    - Removes visualizations to reduce file size
    - Converts numpy types to Python native
    - Saves visualizations as separate PNG files
    """
    json_path = output_dir / f"rodla_results_{filename}.json"
    # ... save logic ...
    return json_path
```

---

### services/interpretation.py

Human-readable insight generation.

#### `generate_comprehensive_interpretation(...)`

Creates natural language analysis of results.

**Output Sections:**

| Section | Description |
|---------|-------------|
| `overview` | High-level summary paragraph |
| `top_elements` | Description of most common elements |
| `rodla_analysis` | Robustness assessment summary |
| `layout_complexity` | Complexity analysis text |
| `key_findings` | List of important observations |
| `perturbation_assessment` | Perturbation effect analysis |
| `recommendations` | Actionable suggestions |
| `confidence_summary` | Confidence level summary |

**Example Output:**

```python
{
    "overview": """Document Analysis Summary:
Detected 47 layout elements across 12 different classes.
The model achieved an average confidence of 78.2%, indicating
medium certainty in predictions. The detected elements cover
68.5% of the document area.""",

    "key_findings": [
        "✓ Excellent detection confidence - model is highly certain",
        "✓ High document coverage - most of the page contains elements",
        "ℹ Complex document structure with diverse element types"
    ],

    "recommendations": [
        "No specific recommendations - detection quality is good"
    ]
}
```

---

## Utilities Reference

### utils/helpers.py

General-purpose helper functions.

#### Mathematical Functions

| Function | Purpose | Formula |
|----------|---------|---------|
| `calculate_skewness(data)` | Distribution asymmetry | `mean(((x - μ) / σ)³)` |
| `calculate_entropy(values)` | Information content | `-Σ(p × log₂(p))` |
| `calculate_avg_nn_distance(xs, ys)` | Average nearest neighbor | Mean of min distances |
| `calculate_clustering_score(xs, ys)` | Spatial clustering | `1 - (std / mean)` |
| `calculate_iou(bbox1, bbox2)` | Intersection over Union | `intersection / union` |
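
These one-line formulas map almost directly to code. A sketch of three of them follows; the entropy version bins the values first, which is this sketch's assumption about how the distribution is formed, and the repository's `helpers.py` may guard edge cases differently:

```python
# Sketch of the skewness, entropy, and IoU helpers described above.
import math
import statistics

def calculate_skewness(data):
    mu, sigma = statistics.mean(data), statistics.pstdev(data)
    if sigma == 0:
        return 0.0
    return statistics.mean(((x - mu) / sigma) ** 3 for x in data)

def calculate_entropy(values, bins=10):
    # Shannon entropy of a binned distribution of the values (assumption).
    counts = [0] * bins
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    for v in values:
        counts[min(int((v - lo) / span * bins), bins - 1)] += 1
    total = sum(counts)
    probs = [c / total for c in counts if c]
    return -sum(p * math.log2(p) for p in probs)

def calculate_iou(bbox1, bbox2):
    # bboxes as dicts with x1/y1/x2/y2, as produced by process_detections().
    ix1, iy1 = max(bbox1["x1"], bbox2["x1"]), max(bbox1["y1"], bbox2["y1"])
    ix2, iy2 = min(bbox1["x2"], bbox2["x2"]), min(bbox1["y2"], bbox2["y2"])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area1 = (bbox1["x2"] - bbox1["x1"]) * (bbox1["y2"] - bbox1["y1"])
    area2 = (bbox2["x2"] - bbox2["x1"]) * (bbox2["y2"] - bbox2["y1"])
    union = area1 + area2 - inter
    return inter / union if union else 0.0
```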

#### Utility Functions

```python
def calculate_detection_overlaps(detections: List[dict]) -> dict:
    """
    Find all overlapping detection pairs.

    Returns:
        {
            'count': int,        # Number of overlapping pairs
            'percentage': float, # % of detections with overlaps
            'avg_iou': float     # Mean IoU of overlaps
        }
    """
```

---

### utils/serialization.py

JSON conversion utilities.

#### `convert_to_json_serializable(obj)`

Recursively converts numpy types to Python native types.

**Conversions:**

| NumPy Type | Python Type |
|------------|-------------|
| `np.integer` | `int` |
| `np.floating` | `float` |
| `np.ndarray` | `list` |
| `np.bool_` | `bool` |

```python
def convert_to_json_serializable(obj):
    """
    Recursively convert numpy types for JSON serialization.

    Handles:
    - Dictionaries (recursive)
    - Lists (recursive)
    - NumPy scalars and arrays
    - Native Python types (pass-through)
    """
    if isinstance(obj, dict):
        return {k: convert_to_json_serializable(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [convert_to_json_serializable(item) for item in obj]
    elif isinstance(obj, np.integer):
        return int(obj)
    elif isinstance(obj, np.floating):
        return float(obj)
    elif isinstance(obj, np.ndarray):
        return obj.tolist()
    elif isinstance(obj, np.bool_):
        return bool(obj)
    return obj
```

---
## Error Handling

### Exception Hierarchy

```
Exception
├── HTTPException (FastAPI)
│   ├── 400 Bad Request
│   │   └── Invalid file type
│   └── 500 Internal Server Error
│       ├── Model not loaded
│       ├── Inference failed
│       └── Processing error
└── Standard Exceptions
    ├── FileNotFoundError
    ├── ValueError
    └── RuntimeError
```

### Error Handling Strategy

```python
@app.post("/api/detect")
async def detect_objects(...):
    tmp_path = None

    try:
        # Main processing logic
        ...

    except HTTPException:
        # Re-raise HTTP exceptions unchanged
        if tmp_path and os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    except Exception as e:
        # Handle unexpected errors
        if tmp_path and os.path.exists(tmp_path):
            os.unlink(tmp_path)

        # Log full traceback
        import traceback
        traceback.print_exc()

        # Return structured error response
        return JSONResponse(
            {"success": False, "error": str(e)},
            status_code=500
        )
```

### Visualization Error Isolation

Each visualization is wrapped individually to prevent cascade failures:

```python
for viz_name, viz_func in visualization_functions.items():
    try:
        visualizations[viz_name] = viz_func()
    except Exception as e:
        print(f"Error generating {viz_name}: {e}")
        visualizations[viz_name] = None
```

### Resource Cleanup

Temporary files are always cleaned up:

```python
finally:
    if tmp_path and os.path.exists(tmp_path):
        os.unlink(tmp_path)
```

---
## Performance Optimization

### GPU Memory Management

```python
# At startup - clear GPU cache
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    gc.collect()

# Monitor memory usage
print(f"GPU Memory: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GB")
```

### Memory-Efficient Visualizations

```python
# Always close figures after encoding
fig, ax = plt.subplots()
# ... generate chart ...
base64_str = fig_to_base64(fig)
plt.close(fig)  # IMPORTANT: Prevents memory leaks
```

### Response Size Optimization

```python
# Remove large base64 images from saved JSON
json_results = {k: v for k, v in results.items() if k != "visualizations"}

# Save visualizations as separate files
for viz_name, viz_data in visualizations.items():
    save_visualization(viz_data, f"{filename}_{viz_name}.png")
```

### Lazy Model Loading

```python
# Model loaded once at startup, reused for all requests
@app.on_event("startup")
async def startup_event():
    global model
    model = init_detector(config, weights, device)
```

### Performance Benchmarks

| Operation | Time (GPU) | Time (CPU) |
|-----------|------------|------------|
| Model loading | 10-15 s | 20-30 s |
| Single inference | 0.3-0.5 s | 2-5 s |
| Metrics calculation | 0.1-0.2 s | 0.1-0.2 s |
| Visualization generation | 1-2 s | 1-2 s |
| **Total per request** | **1.5-3 s** | **4-8 s** |

---
## π Security Considerations
|
| 1586 |
-
|
| 1587 |
-
### Current Security Status
|
| 1588 |
-
|
| 1589 |
-
| Aspect | Status | Risk | Recommendation |
|
| 1590 |
-
|--------|--------|------|----------------|
|
| 1591 |
-
| Authentication | ❌ None | High | Add API key auth |
| Authentication | ❌ None | High | Add API key auth |
|
| 1592 |
-
| CORS | ⚠️ Permissive | Medium | Restrict origins |
|
| 1593 |
-
| Rate Limiting | ❌ None | Medium | Add throttling |
|
| 1594 |
-
| Input Validation | ⚠️ Basic | Low | Add size limits |
|
| 1595 |
-
| Path Handling | ⚠️ Hardcoded | Low | Use env vars |
|
| 1596 |
-
|
| 1597 |
-
### Recommended Security Enhancements
|
| 1598 |
-
|
| 1599 |
-
#### API Key Authentication
|
| 1600 |
-
|
| 1601 |
-
```python
|
| 1602 |
-
import os

from fastapi import Depends, Security
|
| 1603 |
-
from fastapi.security.api_key import APIKeyHeader
|
| 1604 |
-
|
| 1605 |
-
API_KEY = os.environ.get("RODLA_API_KEY")
|
| 1606 |
-
api_key_header = APIKeyHeader(name="X-API-Key")
|
| 1607 |
-
|
| 1608 |
-
async def verify_api_key(api_key: str = Security(api_key_header)):
|
| 1609 |
-
if api_key != API_KEY:
|
| 1610 |
-
raise HTTPException(403, "Invalid API key")
|
| 1611 |
-
return api_key
|
| 1612 |
-
|
| 1613 |
-
@app.post("/api/detect")
|
| 1614 |
-
async def detect_objects(
|
| 1615 |
-
...,
|
| 1616 |
-
api_key: str = Depends(verify_api_key)
|
| 1617 |
-
):
|
| 1618 |
-
...
|
| 1619 |
-
```
|
| 1620 |
-
|
| 1621 |
-
#### Rate Limiting
|
| 1622 |
-
|
| 1623 |
-
```python
|
| 1624 |
-
from slowapi import Limiter
|
| 1625 |
-
from slowapi.util import get_remote_address
|
| 1626 |
-
|
| 1627 |
-
limiter = Limiter(key_func=get_remote_address)
|
| 1628 |
-
app.state.limiter = limiter
|
| 1629 |
-
|
| 1630 |
-
@app.post("/api/detect")
|
| 1631 |
-
@limiter.limit("10/minute")
|
| 1632 |
-
async def detect_objects(...):
|
| 1633 |
-
...
|
| 1634 |
-
```
|
| 1635 |
-
|
| 1636 |
-
#### File Size Limits
|
| 1637 |
-
|
| 1638 |
-
```python
|
| 1639 |
-
MAX_FILE_SIZE = 10 * 1024 * 1024 # 10MB
|
| 1640 |
-
|
| 1641 |
-
@app.post("/api/detect")
|
| 1642 |
-
async def detect_objects(file: UploadFile = File(...)):
|
| 1643 |
-
content = await file.read()
|
| 1644 |
-
if len(content) > MAX_FILE_SIZE:
|
| 1645 |
-
raise HTTPException(413, "File too large")
|
| 1646 |
-
...
|
| 1647 |
-
```
|
| 1648 |
-
|
| 1649 |
-
#### Restricted CORS
|
| 1650 |
-
|
| 1651 |
-
```python
|
| 1652 |
-
app.add_middleware(
|
| 1653 |
-
CORSMiddleware,
|
| 1654 |
-
allow_origins=["https://yourdomain.com"],
|
| 1655 |
-
allow_methods=["GET", "POST"],
|
| 1656 |
-
allow_headers=["X-API-Key", "Content-Type"],
|
| 1657 |
-
)
|
| 1658 |
-
```
|
| 1659 |
-
|
| 1660 |
-
---
|
| 1661 |
-
|
| 1662 |
-
## π§ͺ Testing
|
| 1663 |
-
|
| 1664 |
-
### Test Structure
|
| 1665 |
-
|
| 1666 |
-
```
|
| 1667 |
-
tests/
|
| 1668 |
-
βββ __init__.py
|
| 1669 |
-
βββ conftest.py # Pytest fixtures
|
| 1670 |
-
βββ test_api/
|
| 1671 |
-
β βββ test_routes.py # Endpoint tests
|
| 1672 |
-
β βββ test_schemas.py # Pydantic model tests
|
| 1673 |
-
βββ test_services/
|
| 1674 |
-
β βββ test_detection.py # Detection logic tests
|
| 1675 |
-
β βββ test_processing.py # Processing tests
|
| 1676 |
-
β βββ test_visualization.py # Chart generation tests
|
| 1677 |
-
βββ test_utils/
|
| 1678 |
-
β βββ test_helpers.py # Helper function tests
|
| 1679 |
-
β βββ test_metrics.py # Metrics calculation tests
|
| 1680 |
-
β βββ test_serialization.py # Serialization tests
|
| 1681 |
-
βββ test_integration/
|
| 1682 |
-
βββ test_full_pipeline.py # End-to-end tests
|
| 1683 |
-
```
|
| 1684 |
-
|
| 1685 |
-
### Running Tests
|
| 1686 |
-
|
| 1687 |
-
```bash
|
| 1688 |
-
# Run all tests
|
| 1689 |
-
pytest
|
| 1690 |
-
|
| 1691 |
-
# Run with coverage
|
| 1692 |
-
pytest --cov=. --cov-report=html
|
| 1693 |
-
|
| 1694 |
-
# Run specific test file
|
| 1695 |
-
pytest tests/test_utils/test_metrics.py
|
| 1696 |
-
|
| 1697 |
-
# Run with verbose output
|
| 1698 |
-
pytest -v
|
| 1699 |
-
|
| 1700 |
-
# Run only fast tests (no model loading)
|
| 1701 |
-
pytest -m "not slow"
|
| 1702 |
-
```
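
The `-m "not slow"` filter assumes a `slow` marker has been registered; one way to do that (an assumption, not shown elsewhere in this repository) is via `pytest_configure` in `conftest.py`:

```python
# conftest.py (addition; mark model-loading tests with @pytest.mark.slow)
import pytest

def pytest_configure(config):
    config.addinivalue_line("markers", "slow: tests that load the full model")

# in a test module:
# @pytest.mark.slow
# def test_full_inference(): ...
```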
|
| 1703 |
-
|
| 1704 |
-
### Example Test Cases
|
| 1705 |
-
|
| 1706 |
-
```python
|
| 1707 |
-
# tests/test_utils/test_helpers.py
|
| 1708 |
-
|
| 1709 |
-
import pytest
|
| 1710 |
-
import numpy as np
|
| 1711 |
-
from utils.helpers import calculate_iou, calculate_skewness
|
| 1712 |
-
|
| 1713 |
-
class TestCalculateIoU:
|
| 1714 |
-
def test_complete_overlap(self):
|
| 1715 |
-
bbox1 = {'x1': 0, 'y1': 0, 'x2': 100, 'y2': 100, 'width': 100, 'height': 100}
|
| 1716 |
-
bbox2 = {'x1': 0, 'y1': 0, 'x2': 100, 'y2': 100, 'width': 100, 'height': 100}
|
| 1717 |
-
assert calculate_iou(bbox1, bbox2) == 1.0
|
| 1718 |
-
|
| 1719 |
-
def test_no_overlap(self):
|
| 1720 |
-
bbox1 = {'x1': 0, 'y1': 0, 'x2': 50, 'y2': 50, 'width': 50, 'height': 50}
|
| 1721 |
-
bbox2 = {'x1': 100, 'y1': 100, 'x2': 150, 'y2': 150, 'width': 50, 'height': 50}
|
| 1722 |
-
assert calculate_iou(bbox1, bbox2) == 0.0
|
| 1723 |
-
|
| 1724 |
-
def test_partial_overlap(self):
|
| 1725 |
-
bbox1 = {'x1': 0, 'y1': 0, 'x2': 100, 'y2': 100, 'width': 100, 'height': 100}
|
| 1726 |
-
bbox2 = {'x1': 50, 'y1': 50, 'x2': 150, 'y2': 150, 'width': 100, 'height': 100}
|
| 1727 |
-
iou = calculate_iou(bbox1, bbox2)
|
| 1728 |
-
assert 0 < iou < 1
|
| 1729 |
-
|
| 1730 |
-
class TestCalculateSkewness:
|
| 1731 |
-
def test_symmetric_distribution(self):
|
| 1732 |
-
data = [1, 2, 3, 4, 5]
|
| 1733 |
-
skew = calculate_skewness(data)
|
| 1734 |
-
assert abs(skew) < 0.1 # Nearly symmetric
|
| 1735 |
-
|
| 1736 |
-
def test_right_skewed(self):
|
| 1737 |
-
data = [1, 1, 1, 1, 10]
|
| 1738 |
-
skew = calculate_skewness(data)
|
| 1739 |
-
assert skew > 0 # Positive skew
|
| 1740 |
-
```
|
| 1741 |
-
|
| 1742 |
-
### Mocking the Model
|
| 1743 |
-
|
| 1744 |
-
```python
|
| 1745 |
-
# tests/conftest.py
|
| 1746 |
-
|
| 1747 |
-
import pytest
|
| 1748 |
-
from unittest.mock import Mock, patch
|
| 1749 |
-
|
| 1750 |
-
@pytest.fixture
|
| 1751 |
-
def mock_model():
|
| 1752 |
-
"""Create a mock detection model."""
|
| 1753 |
-
model = Mock()
|
| 1754 |
-
model.CLASSES = ['paragraph', 'title', 'figure', 'table']
|
| 1755 |
-
return model
|
| 1756 |
-
|
| 1757 |
-
@pytest.fixture
|
| 1758 |
-
def mock_detections():
|
| 1759 |
-
"""Sample detection results."""
|
| 1760 |
-
return [
|
| 1761 |
-
{
|
| 1762 |
-
'class_id': 0,
|
| 1763 |
-
'class_name': 'paragraph',
|
| 1764 |
-
'bbox': {'x1': 100, 'y1': 100, 'x2': 500, 'y2': 300,
|
| 1765 |
-
'width': 400, 'height': 200, 'center_x': 300, 'center_y': 200},
|
| 1766 |
-
'confidence': 0.95,
|
| 1767 |
-
'area': 80000,
|
| 1768 |
-
'aspect_ratio': 2.0
|
| 1769 |
-
}
|
| 1770 |
-
]
|
| 1771 |
-
```
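
A test can then request these fixtures by name without loading the model; for example (assertion targets are illustrative):

```python
# e.g. tests/test_utils/test_metrics.py
def test_mock_detection_shape(mock_detections):
    # Fixtures are injected by name; no GPU or checkpoint is needed here.
    assert len(mock_detections) == 1
    assert mock_detections[0]["class_name"] == "paragraph"
    assert 0.0 <= mock_detections[0]["confidence"] <= 1.0
```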
|
| 1772 |
-
|
| 1773 |
-
---
|
| 1774 |
-
|
| 1775 |
-
## π’ Deployment
|
| 1776 |
-
|
| 1777 |
-
### Development Server
|
| 1778 |
-
|
| 1779 |
-
```bash
|
| 1780 |
-
python backend.py
|
| 1781 |
-
# or
|
| 1782 |
-
uvicorn backend:app --reload --host 0.0.0.0 --port 8000
|
| 1783 |
-
```
|
| 1784 |
-
|
| 1785 |
-
### Production with Gunicorn
|
| 1786 |
-
|
| 1787 |
-
```bash
|
| 1788 |
-
gunicorn backend:app -w 1 -k uvicorn.workers.UvicornWorker \
|
| 1789 |
-
--bind 0.0.0.0:8000 \
|
| 1790 |
-
--timeout 120 \
|
| 1791 |
-
--keep-alive 5
|
| 1792 |
-
```
|
| 1793 |
-
|
| 1794 |
-
**Note:** Use `workers=1` for GPU models to avoid memory issues.
|
| 1795 |
-
|
| 1796 |
-
### Docker Deployment
|
| 1797 |
-
|
| 1798 |
-
```dockerfile
|
| 1799 |
-
# Dockerfile
|
| 1800 |
-
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04
|
| 1801 |
-
|
| 1802 |
-
# Install Python
|
| 1803 |
-
RUN apt-get update && apt-get install -y python3 python3-pip
|
| 1804 |
-
|
| 1805 |
-
# Set working directory
|
| 1806 |
-
WORKDIR /app
|
| 1807 |
-
|
| 1808 |
-
# Copy requirements first for caching
|
| 1809 |
-
COPY requirements.txt .
|
| 1810 |
-
RUN pip install --no-cache-dir -r requirements.txt
|
| 1811 |
-
|
| 1812 |
-
# Copy application code
|
| 1813 |
-
COPY . .
|
| 1814 |
-
|
| 1815 |
-
# Create output directory
|
| 1816 |
-
RUN mkdir -p outputs
|
| 1817 |
-
|
| 1818 |
-
# Expose port
|
| 1819 |
-
EXPOSE 8000
|
| 1820 |
-
|
| 1821 |
-
# Run application
|
| 1822 |
-
CMD ["uvicorn", "backend:app", "--host", "0.0.0.0", "--port", "8000"]
|
| 1823 |
-
```
|
| 1824 |
-
|
| 1825 |
-
```yaml
|
| 1826 |
-
# docker-compose.yml
|
| 1827 |
-
version: '3.8'
|
| 1828 |
-
|
| 1829 |
-
services:
|
| 1830 |
-
rodla-api:
|
| 1831 |
-
build: .
|
| 1832 |
-
ports:
|
| 1833 |
-
- "8000:8000"
|
| 1834 |
-
volumes:
|
| 1835 |
-
- ./outputs:/app/outputs
|
| 1836 |
-
- ./weights:/app/weights
|
| 1837 |
-
deploy:
|
| 1838 |
-
resources:
|
| 1839 |
-
reservations:
|
| 1840 |
-
devices:
|
| 1841 |
-
- driver: nvidia
|
| 1842 |
-
count: 1
|
| 1843 |
-
capabilities: [gpu]
|
| 1844 |
-
environment:
|
| 1845 |
-
- RODLA_API_KEY=${RODLA_API_KEY}
|
| 1846 |
-
restart: unless-stopped
|
| 1847 |
-
```
|
| 1848 |
-
|
| 1849 |
-
### Kubernetes Deployment
|
| 1850 |
-
|
| 1851 |
-
```yaml
|
| 1852 |
-
# k8s/deployment.yaml
|
| 1853 |
-
apiVersion: apps/v1
|
| 1854 |
-
kind: Deployment
|
| 1855 |
-
metadata:
|
| 1856 |
-
name: rodla-api
|
| 1857 |
-
spec:
|
| 1858 |
-
replicas: 1
|
| 1859 |
-
selector:
|
| 1860 |
-
matchLabels:
|
| 1861 |
-
app: rodla-api
|
| 1862 |
-
template:
|
| 1863 |
-
metadata:
|
| 1864 |
-
labels:
|
| 1865 |
-
app: rodla-api
|
| 1866 |
-
spec:
|
| 1867 |
-
containers:
|
| 1868 |
-
- name: rodla-api
|
| 1869 |
-
image: your-registry/rodla-api:latest
|
| 1870 |
-
ports:
|
| 1871 |
-
- containerPort: 8000
|
| 1872 |
-
resources:
|
| 1873 |
-
limits:
|
| 1874 |
-
nvidia.com/gpu: 1
|
| 1875 |
-
memory: "16Gi"
|
| 1876 |
-
requests:
|
| 1877 |
-
memory: "8Gi"
|
| 1878 |
-
volumeMounts:
|
| 1879 |
-
- name: outputs
|
| 1880 |
-
mountPath: /app/outputs
|
| 1881 |
-
volumes:
|
| 1882 |
-
- name: outputs
|
| 1883 |
-
persistentVolumeClaim:
|
| 1884 |
-
claimName: rodla-outputs-pvc
|
| 1885 |
-
```
|
| 1886 |
-
|
| 1887 |
-
### Nginx Reverse Proxy
|
| 1888 |
-
|
| 1889 |
-
```nginx
|
| 1890 |
-
# /etc/nginx/sites-available/rodla-api
|
| 1891 |
-
upstream rodla_backend {
|
| 1892 |
-
server 127.0.0.1:8000;
|
| 1893 |
-
}
|
| 1894 |
-
|
| 1895 |
-
server {
|
| 1896 |
-
listen 80;
|
| 1897 |
-
server_name api.yourdomain.com;
|
| 1898 |
-
|
| 1899 |
-
client_max_body_size 50M;
|
| 1900 |
-
|
| 1901 |
-
location / {
|
| 1902 |
-
proxy_pass http://rodla_backend;
|
| 1903 |
-
proxy_set_header Host $host;
|
| 1904 |
-
proxy_set_header X-Real-IP $remote_addr;
|
| 1905 |
-
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
| 1906 |
-
proxy_read_timeout 120s;
|
| 1907 |
-
}
|
| 1908 |
-
}
|
| 1909 |
-
```
|
| 1910 |
-
|
| 1911 |
-
---
|
| 1912 |
-
|
| 1913 |
-
## π§ Troubleshooting
|
| 1914 |
-
|
| 1915 |
-
### Common Issues
|
| 1916 |
-
|
| 1917 |
-
#### Model Loading Failures
|
| 1918 |
-
|
| 1919 |
-
**Symptom:** `RuntimeError: CUDA out of memory`
|
| 1920 |
-
|
| 1921 |
-
**Solutions:**
|
| 1922 |
-
```bash
|
| 1923 |
-
# Clear GPU memory before starting
|
| 1924 |
-
nvidia-smi --gpu-reset
|
| 1925 |
-
|
| 1926 |
-
# Or in Python
|
| 1927 |
-
import torch
|
| 1928 |
-
torch.cuda.empty_cache()
|
| 1929 |
-
|
| 1930 |
-
# Check available GPU memory
|
| 1931 |
-
nvidia-smi
|
| 1932 |
-
```
|
| 1933 |
-
|
| 1934 |
-
**Symptom:** `ModuleNotFoundError: No module named 'mmdet'`
|
| 1935 |
-
|
| 1936 |
-
**Solution:**
|
| 1937 |
-
```bash
|
| 1938 |
-
pip install -U openmim
|
| 1939 |
-
mim install mmengine mmcv mmdet
|
| 1940 |
-
```
|
| 1941 |
-
|
| 1942 |
-
**Symptom:** `FileNotFoundError: Config file not found`
|
| 1943 |
-
|
| 1944 |
-
**Solution:**
|
| 1945 |
-
```python
|
| 1946 |
-
# Check paths in config/settings.py
|
| 1947 |
-
from pathlib import Path
|
| 1948 |
-
print(Path(MODEL_CONFIG).exists()) # Should be True
|
| 1949 |
-
print(Path(MODEL_WEIGHTS).exists()) # Should be True
|
| 1950 |
-
```
|
| 1951 |
-
|
| 1952 |
-
---
|
| 1953 |
-
|
| 1954 |
-
#### Inference Errors
|
| 1955 |
-
|
| 1956 |
-
**Symptom:** `RuntimeError: Input type and weight type should be the same`
|
| 1957 |
-
|
| 1958 |
-
**Solution:**
|
| 1959 |
-
```python
|
| 1960 |
-
# Ensure model and input are on same device
|
| 1961 |
-
model = model.to('cuda')
|
| 1962 |
-
# or
|
| 1963 |
-
model = model.to('cpu')
|
| 1964 |
-
```
|
| 1965 |
-
|
| 1966 |
-
**Symptom:** `ValueError: could not broadcast input array`
|
| 1967 |
-
|
| 1968 |
-
**Solution:**
|
| 1969 |
-
```python
|
| 1970 |
-
# Check image dimensions
|
| 1971 |
-
from PIL import Image
|
| 1972 |
-
img = Image.open(image_path)
|
| 1973 |
-
print(f"Image size: {img.size}") # Should be reasonable dimensions
|
| 1974 |
-
```
|
| 1975 |
-
|
| 1976 |
-
---
|
| 1977 |
-
|
| 1978 |
-
#### Visualization Errors
|
| 1979 |
-
|
| 1980 |
-
**Symptom:** `RuntimeError: main thread is not in main loop`
|
| 1981 |
-
|
| 1982 |
-
**Solution:**
|
| 1983 |
-
```python
|
| 1984 |
-
# Set matplotlib backend before importing pyplot
|
| 1985 |
-
import matplotlib
|
| 1986 |
-
matplotlib.use('Agg') # Non-interactive backend
|
| 1987 |
-
import matplotlib.pyplot as plt
|
| 1988 |
-
```
|
| 1989 |
-
|
| 1990 |
-
**Symptom:** Memory usage grows with each request
|
| 1991 |
-
|
| 1992 |
-
**Solution:**
|
| 1993 |
-
```python
|
| 1994 |
-
# Always close figures after use
|
| 1995 |
-
fig, ax = plt.subplots()
|
| 1996 |
-
# ... plotting code ...
|
| 1997 |
-
plt.savefig(buffer, format='png')
|
| 1998 |
-
plt.close(fig) # CRITICAL: Prevents memory leak
|
| 1999 |
-
plt.close('all') # Nuclear option if needed
|
| 2000 |
-
```
|
| 2001 |
-
|
| 2002 |
-
---
|
| 2003 |
-
|
| 2004 |
-
#### API Errors
|
| 2005 |
-
|
| 2006 |
-
**Symptom:** `422 Unprocessable Entity`
|
| 2007 |
-
|
| 2008 |
-
**Cause:** Invalid request format
|
| 2009 |
-
|
| 2010 |
-
**Solution:**
|
| 2011 |
-
```bash
|
| 2012 |
-
# Correct multipart form data format
|
| 2013 |
-
curl -X POST "http://localhost:8000/api/detect" \
|
| 2014 |
-
-H "accept: application/json" \
|
| 2015 |
-
-F "file=@image.jpg;type=image/jpeg" \
|
| 2016 |
-
-F "score_thr=0.3"
|
| 2017 |
-
```
|
| 2018 |
-
|
| 2019 |
-
**Symptom:** `413 Request Entity Too Large`
|
| 2020 |
-
|
| 2021 |
-
**Solution:**
|
| 2022 |
-
```python
|
| 2023 |
-
# FastAPI itself does not enforce an upload size limit; check the size in the
# endpoint (see File Size Limits above) or cap it at the reverse proxy
|
| 2024 |
-
from fastapi import FastAPI, File, UploadFile
|
| 2025 |
-
|
| 2026 |
-
app = FastAPI()
|
| 2027 |
-
|
| 2028 |
-
# Or configure in nginx
|
| 2029 |
-
# client_max_body_size 50M;
|
| 2030 |
-
```
|
| 2031 |
-
|
| 2032 |
-
---
|
| 2033 |
-
|
| 2034 |
-
### Debugging Tips
|
| 2035 |
-
|
| 2036 |
-
#### Enable Debug Logging
|
| 2037 |
-
|
| 2038 |
-
```python
|
| 2039 |
-
import logging
|
| 2040 |
-
|
| 2041 |
-
logging.basicConfig(level=logging.DEBUG)
|
| 2042 |
-
logger = logging.getLogger(__name__)
|
| 2043 |
-
|
| 2044 |
-
# In your code
|
| 2045 |
-
logger.debug(f"Processing image: {filename}")
|
| 2046 |
-
logger.debug(f"Detections found: {len(detections)}")
|
| 2047 |
-
```
|
| 2048 |
-
|
| 2049 |
-
#### GPU Monitoring
|
| 2050 |
-
|
| 2051 |
-
```bash
|
| 2052 |
-
# Real-time GPU monitoring
|
| 2053 |
-
watch -n 1 nvidia-smi
|
| 2054 |
-
|
| 2055 |
-
# Or use gpustat
|
| 2056 |
-
pip install gpustat
|
| 2057 |
-
gpustat -i 1
|
| 2058 |
-
```
|
| 2059 |
-
|
| 2060 |
-
#### Memory Profiling
|
| 2061 |
-
|
| 2062 |
-
```python
|
| 2063 |
-
# Install memory profiler
|
| 2064 |
-
pip install memory_profiler
|
| 2065 |
-
|
| 2066 |
-
# Use decorator
|
| 2067 |
-
from memory_profiler import profile
|
| 2068 |
-
|
| 2069 |
-
@profile
|
| 2070 |
-
def detect_objects(...):
|
| 2071 |
-
...
|
| 2072 |
-
```
|
| 2073 |
-
|
| 2074 |
-
#### Request Timing
|
| 2075 |
-
|
| 2076 |
-
```python
|
| 2077 |
-
import time
|
| 2078 |
-
|
| 2079 |
-
@app.post("/api/detect")
|
| 2080 |
-
async def detect_objects(...):
|
| 2081 |
-
start_time = time.time()
|
| 2082 |
-
|
| 2083 |
-
# ... processing ...
|
| 2084 |
-
|
| 2085 |
-
elapsed = time.time() - start_time
|
| 2086 |
-
logger.info(f"Request completed in {elapsed:.2f}s")
|
| 2087 |
-
```
|
| 2088 |
-
|
| 2089 |
-
---
|
| 2090 |
-
|
| 2091 |
-
### Health Checks
|
| 2092 |
-
|
| 2093 |
-
```python
|
| 2094 |
-
# Add health check endpoint
|
| 2095 |
-
@app.get("/health")
|
| 2096 |
-
async def health_check():
|
| 2097 |
-
return {
|
| 2098 |
-
"status": "healthy",
|
| 2099 |
-
"model_loaded": model is not None,
|
| 2100 |
-
"gpu_available": torch.cuda.is_available(),
|
| 2101 |
-
"gpu_memory_used": f"{torch.cuda.memory_allocated(0) / 1024**3:.2f} GB"
|
| 2102 |
-
if torch.cuda.is_available() else "N/A"
|
| 2103 |
-
}
|
| 2104 |
-
```
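
Once the endpoint is in place, a quick smoke test from the command line:

```bash
curl -s http://localhost:8000/health
# Expected shape (values depend on your machine):
# {"status": "healthy", "model_loaded": true, "gpu_available": true, ...}
```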
|
| 2105 |
-
|
| 2106 |
-
---
|
| 2107 |
-
|
| 2108 |
-
## π€ Contributing
|
| 2109 |
-
|
| 2110 |
-
### Getting Started
|
| 2111 |
-
|
| 2112 |
-
1. Fork the repository
|
| 2113 |
-
2. Create a feature branch: `git checkout -b feature/amazing-feature`
|
| 2114 |
-
3. Make your changes
|
| 2115 |
-
4. Run tests: `pytest`
|
| 2116 |
-
5. Commit: `git commit -m 'Add amazing feature'`
|
| 2117 |
-
6. Push: `git push origin feature/amazing-feature`
|
| 2118 |
-
7. Open a Pull Request
|
| 2119 |
-
|
| 2120 |
-
### Code Style
|
| 2121 |
-
|
| 2122 |
-
```bash
|
| 2123 |
-
# Install development dependencies
|
| 2124 |
-
pip install black isort flake8 mypy
|
| 2125 |
-
|
| 2126 |
-
# Format code
|
| 2127 |
-
black .
|
| 2128 |
-
isort .
|
| 2129 |
-
|
| 2130 |
-
# Check style
|
| 2131 |
-
flake8 .
|
| 2132 |
-
|
| 2133 |
-
# Type checking
|
| 2134 |
-
mypy .
|
| 2135 |
-
```
|
| 2136 |
-
|
| 2137 |
-
### Pre-commit Hooks
|
| 2138 |
-
|
| 2139 |
-
```yaml
|
| 2140 |
-
# .pre-commit-config.yaml
|
| 2141 |
-
repos:
|
| 2142 |
-
- repo: https://github.com/psf/black
|
| 2143 |
-
rev: 23.7.0
|
| 2144 |
-
hooks:
|
| 2145 |
-
- id: black
|
| 2146 |
-
- repo: https://github.com/pycqa/isort
|
| 2147 |
-
rev: 5.12.0
|
| 2148 |
-
hooks:
|
| 2149 |
-
- id: isort
|
| 2150 |
-
- repo: https://github.com/pycqa/flake8
|
| 2151 |
-
rev: 6.1.0
|
| 2152 |
-
hooks:
|
| 2153 |
-
- id: flake8
|
| 2154 |
-
```
|
| 2155 |
-
|
| 2156 |
-
```bash
|
| 2157 |
-
pip install pre-commit
|
| 2158 |
-
pre-commit install
|
| 2159 |
-
```
|
| 2160 |
-
|
| 2161 |
-
### Adding New Metrics
|
| 2162 |
-
|
| 2163 |
-
1. Create function in appropriate module under `utils/metrics/` (see the sketch after this list)
|
| 2164 |
-
2. Export from `utils/metrics/__init__.py`
|
| 2165 |
-
3. Call from `services/processing.py`
|
| 2166 |
-
4. Add to response schema in `api/schemas.py`
|
| 2167 |
-
5. Document in this README
|
| 2168 |
-
6. Add tests in `tests/test_utils/test_metrics.py`
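
A minimal sketch of steps 1 and 2 (module path, metric name, and signature are illustrative, not part of the existing codebase):

```python
# utils/metrics/density.py (hypothetical module)
from typing import Dict, List

def calculate_detection_density(detections: List[Dict], image_area: float) -> float:
    """Detections per unit of image area (illustrative metric)."""
    if image_area <= 0:
        return 0.0
    return len(detections) / image_area

# utils/metrics/__init__.py would then re-export it:
# from .density import calculate_detection_density
```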
|
| 2169 |
-
|
| 2170 |
-
### Adding New Visualizations
|
| 2171 |
-
|
| 2172 |
-
1. Add function in `services/visualization.py`
|
| 2173 |
-
2. Call from `generate_comprehensive_visualizations()`
|
| 2174 |
-
3. Handle errors with try-except
|
| 2175 |
-
4. Always close figures with `plt.close(fig)`
|
| 2176 |
-
5. Document chart type in this README
|
| 2177 |
-
|
| 2178 |
-
---
|
| 2179 |
-
|
| 2180 |
-
## π Citation
|
| 2181 |
-
|
| 2182 |
-
If you use this API or the RoDLA model in your research, please cite:
|
| 2183 |
-
|
| 2184 |
-
```bibtex
|
| 2185 |
-
@inproceedings{rodla2024cvpr,
|
| 2186 |
-
title={RoDLA: Benchmarking the Robustness of Document Layout Analysis Models},
|
| 2187 |
-
author={Author Names},
|
| 2188 |
-
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision
|
| 2189 |
-
and Pattern Recognition (CVPR)},
|
| 2190 |
-
year={2024}
|
| 2191 |
-
}
|
| 2192 |
-
```
|
| 2193 |
-
|
| 2194 |
-
### Related Publications
|
| 2195 |
-
|
| 2196 |
-
```bibtex
|
| 2197 |
-
@inproceedings{internimage2023,
|
| 2198 |
-
title={InternImage: Exploring Large-Scale Vision Foundation Models
|
| 2199 |
-
with Deformable Convolutions},
|
| 2200 |
-
author={Wang et al.},
|
| 2201 |
-
booktitle={CVPR},
|
| 2202 |
-
year={2023}
|
| 2203 |
-
}
|
| 2204 |
-
|
| 2205 |
-
@inproceedings{dino2022,
|
| 2206 |
-
title={DINO: DETR with Improved DeNoising Anchor Boxes
|
| 2207 |
-
for End-to-End Object Detection},
|
| 2208 |
-
author={Zhang et al.},
|
| 2209 |
-
booktitle={ICLR},
|
| 2210 |
-
year={2023}
|
| 2211 |
-
}
|
| 2212 |
-
```
|
| 2213 |
-
|
| 2214 |
-
---
|
| 2215 |
-
|
| 2216 |
-
## π License
|
| 2217 |
-
|
| 2218 |
-
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
|
| 2219 |
-
|
| 2220 |
-
```
|
| 2221 |
-
MIT License
|
| 2222 |
-
|
| 2223 |
-
Copyright (c) 2024 [Your Name]
|
| 2224 |
-
|
| 2225 |
-
Permission is hereby granted, free of charge, to any person obtaining a copy
|
| 2226 |
-
of this software and associated documentation files (the "Software"), to deal
|
| 2227 |
-
in the Software without restriction, including without limitation the rights
|
| 2228 |
-
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
| 2229 |
-
copies of the Software, and to permit persons to whom the Software is
|
| 2230 |
-
furnished to do so, subject to the following conditions:
|
| 2231 |
-
|
| 2232 |
-
The above copyright notice and this permission notice shall be included in all
|
| 2233 |
-
copies or substantial portions of the Software.
|
| 2234 |
-
|
| 2235 |
-
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
| 2236 |
-
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
| 2237 |
-
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
| 2238 |
-
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
| 2239 |
-
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
| 2240 |
-
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
| 2241 |
-
SOFTWARE.
|
| 2242 |
-
```
|
| 2243 |
-
|
| 2244 |
-
---
|
| 2245 |
-
|
| 2246 |
-
## π Support
|
| 2247 |
-
|
| 2248 |
-
### Getting Help
|
| 2249 |
-
|
| 2250 |
-
- **Documentation:** This README
|
| 2251 |
-
- **Issues:** [GitHub Issues](https://github.com/yourusername/rodla-api/issues)
|
| 2252 |
-
- **Discussions:** [GitHub Discussions](https://github.com/yourusername/rodla-api/discussions)
|
| 2253 |
-
|
| 2254 |
-
### Reporting Bugs
|
| 2255 |
-
|
| 2256 |
-
When reporting bugs, please include:
|
| 2257 |
-
|
| 2258 |
-
1. Operating system and version
|
| 2259 |
-
2. Python version
|
| 2260 |
-
3. GPU model and driver version
|
| 2261 |
-
4. Complete error traceback
|
| 2262 |
-
5. Minimal reproducible example
|
| 2263 |
-
6. Input image (if possible)
|
| 2264 |
-
|
| 2265 |
-
### Feature Requests
|
| 2266 |
-
|
| 2267 |
-
We welcome feature requests! Please:
|
| 2268 |
-
|
| 2269 |
-
1. Check existing issues first
|
| 2270 |
-
2. Describe the use case
|
| 2271 |
-
3. Explain expected behavior
|
| 2272 |
-
4. Provide examples if possible
|
| 2273 |
-
|
| 2274 |
-
---
|
| 2275 |
-
|
| 2276 |
-
## π Acknowledgments
|
| 2277 |
-
|
| 2278 |
-
- **RoDLA Authors** - For the original model and research
|
| 2279 |
-
- **MMDetection Team** - For the detection framework
|
| 2280 |
-
- **InternImage Team** - For the backbone architecture
|
| 2281 |
-
- **FastAPI** - For the excellent web framework
|
| 2282 |
-
- **Open Source Community** - For countless contributions
|
| 2283 |
-
|
| 2284 |
-
---
|
| 2285 |
-
|
| 2286 |
-
<div align="center">
|
| 2287 |
-
|
| 2288 |
-
**Built with β€οΈ for Document Analysis**
|
| 2289 |
-
|
| 2290 |
-
[β¬ Back to Top](#rodla-document-layout-analysis-api)
|
| 2291 |
-
|
| 2292 |
-
</div>
|
deployment/backend/README_Version_TWO.md
DELETED
|
The diff for this file is too large to render.
See raw diff
|
|
|
deployment/backend/README_Version_Three.md
DELETED
|
The diff for this file is too large to render.
See raw diff
|
|
|
deployment/backend/backend.py
CHANGED
|
@@ -1,98 +1,666 @@
|
|
| 1 |
"""
|
| 2 |
-
RoDLA
|
| 3 |
-
|
| 4 |
-
|
| 5 |
"""
|
| 6 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 7 |
from fastapi.middleware.cors import CORSMiddleware
|
|
|
|
| 8 |
import uvicorn
|
| 9 |
-
from pathlib import Path
|
| 10 |
|
| 11 |
-
#
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
CORS_ORIGINS, CORS_METHODS, CORS_HEADERS,
|
| 15 |
-
OUTPUT_DIR, PERTURBATION_OUTPUT_DIR # NEW
|
| 16 |
-
)
|
| 17 |
|
| 18 |
-
|
| 19 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 20 |
|
| 21 |
-
# Import API routes
|
| 22 |
-
from api.routes import router
|
| 23 |
|
| 24 |
-
#
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 30 |
|
| 31 |
# Add CORS middleware
|
| 32 |
app.add_middleware(
|
| 33 |
CORSMiddleware,
|
| 34 |
-
allow_origins=
|
| 35 |
allow_credentials=True,
|
| 36 |
-
allow_methods=
|
| 37 |
-
allow_headers=
|
| 38 |
)
|
| 39 |
|
| 40 |
-
|
| 41 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 42 |
|
| 43 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 44 |
@app.on_event("startup")
|
| 45 |
async def startup_event():
|
| 46 |
-
"""Initialize model
|
| 47 |
try:
|
| 48 |
-
print("="*60)
|
| 49 |
-
print("Starting RoDLA Document Layout Analysis API")
|
| 50 |
-
print("="*60)
|
| 51 |
-
|
| 52 |
-
# Create output directories
|
| 53 |
-
print("π Creating output directories...")
|
| 54 |
-
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
|
| 55 |
-
PERTURBATION_OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
|
| 56 |
-
print(f" β Main output: {OUTPUT_DIR}")
|
| 57 |
-
print(f" β Perturbations: {PERTURBATION_OUTPUT_DIR}")
|
| 58 |
-
|
| 59 |
-
# Load model
|
| 60 |
-
print("\nπ§ Loading RoDLA model...")
|
| 61 |
load_model()
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 62 |
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
|
| 66 |
-
|
| 67 |
-
|
| 68 |
-
|
| 69 |
-
|
| 70 |
-
|
| 71 |
-
|
| 72 |
-
|
| 73 |
-
|
| 74 |
-
|
| 75 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 76 |
|
| 77 |
except Exception as e:
|
| 78 |
-
print(f"β
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 82 |
|
| 83 |
|
| 84 |
-
@app.
|
| 85 |
-
async def
|
| 86 |
-
"""
|
| 87 |
-
|
| 88 |
-
|
| 89 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 90 |
|
|
|
|
|
|
|
|
|
|
| 91 |
|
| 92 |
if __name__ == "__main__":
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 93 |
uvicorn.run(
|
| 94 |
app,
|
| 95 |
-
host=
|
| 96 |
-
port=API_PORT,
|
| 97 |
log_level="info"
|
| 98 |
-
)
|
|
|
|
| 1 |
"""
|
| 2 |
+
RoDLA Backend - Production Version
|
| 3 |
+
Uses real InternImage-XL weights and all 12 perturbation types with 3 degree levels
|
| 4 |
+
MMDET disabled if MMCV extensions unavailable - perturbations always functional
|
| 5 |
"""
|
| 6 |
+
|
| 7 |
+
import os
|
| 8 |
+
import sys
|
| 9 |
+
import json
|
| 10 |
+
import base64
|
| 11 |
+
import traceback
|
| 12 |
+
from pathlib import Path
|
| 13 |
+
from typing import Dict, List, Any, Optional, Tuple
|
| 14 |
+
from io import BytesIO
|
| 15 |
+
from datetime import datetime
|
| 16 |
+
|
| 17 |
+
import numpy as np
|
| 18 |
+
from PIL import Image
|
| 19 |
+
import cv2
|
| 20 |
+
|
| 21 |
+
from fastapi import FastAPI, File, UploadFile, HTTPException
|
| 22 |
from fastapi.middleware.cors import CORSMiddleware
|
| 23 |
+
from pydantic import BaseModel
|
| 24 |
import uvicorn
|
|
|
|
| 25 |
|
| 26 |
+
# ============================================================================
|
| 27 |
+
# Configuration
|
| 28 |
+
# ============================================================================
|
|
|
|
|
|
|
|
|
|
| 29 |
|
| 30 |
+
class Config:
|
| 31 |
+
"""Global configuration"""
|
| 32 |
+
API_PORT = 8000
|
| 33 |
+
REPO_ROOT = Path("/home/admin/CV/rodla-academic")
|
| 34 |
+
MODEL_CONFIG_PATH = REPO_ROOT / "model/configs/m6doc/rodla_internimage_xl_m6doc.py"
|
| 35 |
+
MODEL_WEIGHTS_PATH = REPO_ROOT / "finetuning_rodla/finetuning_rodla/checkpoints/rodla_internimage_xl_publaynet.pth"
|
| 36 |
+
PERTURBATIONS_DIR = REPO_ROOT / "deployment/backend/perturbations"
|
| 37 |
+
|
| 38 |
+
# Automatically use GPU if available, otherwise CPU
|
| 39 |
+
@staticmethod
|
| 40 |
+
def get_device():
|
| 41 |
+
import torch
|
| 42 |
+
if torch.cuda.is_available():
|
| 43 |
+
return "cuda:0"
|
| 44 |
+
else:
|
| 45 |
+
return "cpu"
|
| 46 |
|
|
|
|
|
|
|
| 47 |
|
| 48 |
+
# ============================================================================
|
| 49 |
+
# Global State
|
| 50 |
+
# ============================================================================
|
| 51 |
+
|
| 52 |
+
app = FastAPI(title="RoDLA Production Backend", version="3.0.0")
|
| 53 |
+
|
| 54 |
+
# Detect device
|
| 55 |
+
import torch
|
| 56 |
+
DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
|
| 57 |
+
|
| 58 |
+
model_state = {
|
| 59 |
+
"loaded": False,
|
| 60 |
+
"model": None,
|
| 61 |
+
"error": None,
|
| 62 |
+
"model_type": "RoDLA InternImage-XL (MMDET)",
|
| 63 |
+
"device": DEVICE,
|
| 64 |
+
"mmdet_available": False
|
| 65 |
+
}
|
| 66 |
|
| 67 |
# Add CORS middleware
|
| 68 |
app.add_middleware(
|
| 69 |
CORSMiddleware,
|
| 70 |
+
allow_origins=["*"],
|
| 71 |
allow_credentials=True,
|
| 72 |
+
allow_methods=["*"],
|
| 73 |
+
allow_headers=["*"],
|
| 74 |
)
|
| 75 |
|
| 76 |
+
|
| 77 |
+
# ============================================================================
|
| 78 |
+
# M6Doc Dataset Classes
|
| 79 |
+
# ============================================================================
|
| 80 |
+
|
| 81 |
+
LAYOUT_CLASS_MAP = {
|
| 82 |
+
i: "Text" for i in range(75)
|
| 83 |
+
}
|
| 84 |
+
# Simplified mapping to layout elements
|
| 85 |
+
for i in range(75):
|
| 86 |
+
if i in [1, 2, 3, 4, 5]:
|
| 87 |
+
LAYOUT_CLASS_MAP[i] = "Title"
|
| 88 |
+
elif i in [6, 7]:
|
| 89 |
+
LAYOUT_CLASS_MAP[i] = "List"
|
| 90 |
+
elif i in [8, 9]:
|
| 91 |
+
LAYOUT_CLASS_MAP[i] = "Figure"
|
| 92 |
+
elif i in [10, 11]:
|
| 93 |
+
LAYOUT_CLASS_MAP[i] = "Table"
|
| 94 |
+
elif i in [12, 13, 14]:
|
| 95 |
+
LAYOUT_CLASS_MAP[i] = "Header"
|
| 96 |
+
|
| 97 |
+
|
| 98 |
+
# ============================================================================
|
| 99 |
+
# Utility Functions
|
| 100 |
+
# ============================================================================
|
| 101 |
+
|
| 102 |
+
def encode_image_to_base64(image: np.ndarray) -> str:
|
| 103 |
+
"""Convert numpy array to base64 string"""
|
| 104 |
+
if len(image.shape) == 3 and image.shape[2] == 3:
|
| 105 |
+
# Ensure RGB order
|
| 106 |
+
if isinstance(image.flat[0], np.uint8):
|
| 107 |
+
image_to_encode = image
|
| 108 |
+
else:
|
| 109 |
+
image_to_encode = (image * 255).astype(np.uint8)
|
| 110 |
+
else:
|
| 111 |
+
image_to_encode = image
|
| 112 |
+
|
| 113 |
+
_, buffer = cv2.imencode('.png', image_to_encode)
|
| 114 |
+
return base64.b64encode(buffer).decode('utf-8')
|
| 115 |
+
|
| 116 |
+
|
| 117 |
+
def heuristic_detect(image_np: np.ndarray) -> List[Dict]:
|
| 118 |
+
"""Enhanced heuristic-based detection when MMDET is unavailable
|
| 119 |
+
Uses multiple edge detection methods and texture analysis"""
|
| 120 |
+
h, w = image_np.shape[:2]
|
| 121 |
+
detections = []
|
| 122 |
+
|
| 123 |
+
# Convert to grayscale for analysis
|
| 124 |
+
gray = cv2.cvtColor(image_np, cv2.COLOR_RGB2GRAY)
|
| 125 |
+
|
| 126 |
+
# Try multiple edge detection methods for better coverage
|
| 127 |
+
edges1 = cv2.Canny(gray, 50, 150)
|
| 128 |
+
edges2 = cv2.Canny(gray, 30, 100)
|
| 129 |
+
|
| 130 |
+
# Combine edges
|
| 131 |
+
edges = cv2.bitwise_or(edges1, edges2)
|
| 132 |
+
|
| 133 |
+
# Apply morphological operations to connect nearby edges
|
| 134 |
+
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
|
| 135 |
+
edges = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)
|
| 136 |
+
|
| 137 |
+
# Find contours
|
| 138 |
+
contours, _ = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
|
| 139 |
+
|
| 140 |
+
# Also try watershed/connected components for text detection
|
| 141 |
+
blur = cv2.GaussianBlur(gray, (5, 5), 0)
|
| 142 |
+
_, binary = cv2.threshold(blur, 127, 255, cv2.THRESH_BINARY)
|
| 143 |
+
|
| 144 |
+
# Find connected components
|
| 145 |
+
num_labels, labels = cv2.connectedComponents(binary)
|
| 146 |
+
|
| 147 |
+
# Process contours to create pseudo-detections
|
| 148 |
+
processed_boxes = set()
|
| 149 |
+
for contour in contours:
|
| 150 |
+
x, y, cw, ch = cv2.boundingRect(contour)
|
| 151 |
+
|
| 152 |
+
# Skip if too small or too large
|
| 153 |
+
if cw < 15 or ch < 15 or cw > w * 0.98 or ch > h * 0.98:
|
| 154 |
+
continue
|
| 155 |
+
|
| 156 |
+
area_ratio = (cw * ch) / (w * h)
|
| 157 |
+
if area_ratio < 0.0005 or area_ratio > 0.9:
|
| 158 |
+
continue
|
| 159 |
+
|
| 160 |
+
# Skip if box is too similar to already processed boxes
|
| 161 |
+
box_key = (round(x/10)*10, round(y/10)*10, round(cw/10)*10, round(ch/10)*10)
|
| 162 |
+
if box_key in processed_boxes:
|
| 163 |
+
continue
|
| 164 |
+
processed_boxes.add(box_key)
|
| 165 |
+
|
| 166 |
+
# Analyze content to determine class
|
| 167 |
+
roi = gray[y:y+ch, x:x+cw]
|
| 168 |
+
roi_blur = cv2.GaussianBlur(roi, (5, 5), 0)
|
| 169 |
+
roi_edges = cv2.Canny(roi_blur, 50, 150)
|
| 170 |
+
edge_density = np.sum(roi_edges > 0) / roi.size
|
| 171 |
+
|
| 172 |
+
aspect_ratio = cw / (ch + 1e-6)
|
| 173 |
+
|
| 174 |
+
# Classification logic
|
| 175 |
+
if aspect_ratio > 2.5 or (aspect_ratio > 2 and edge_density < 0.05):
|
| 176 |
+
# Wide with sparse edges = likely figure/table
|
| 177 |
+
class_name = "Figure"
|
| 178 |
+
class_id = 8
|
| 179 |
+
confidence = 0.6 + 0.35 * (1 - min(area_ratio / 0.5, 1.0))
|
| 180 |
+
elif aspect_ratio < 0.3:
|
| 181 |
+
# Narrow = likely list or table column
|
| 182 |
+
class_name = "List"
|
| 183 |
+
class_id = 6
|
| 184 |
+
confidence = 0.55 + 0.4 * (1 - min(area_ratio / 0.3, 1.0))
|
| 185 |
+
elif edge_density > 0.15:
|
| 186 |
+
# High edge density = likely table or complex content
|
| 187 |
+
class_name = "Table"
|
| 188 |
+
class_id = 10
|
| 189 |
+
confidence = 0.5 + 0.4 * edge_density
|
| 190 |
+
else:
|
| 191 |
+
# Default = text content
|
| 192 |
+
class_name = "Text"
|
| 193 |
+
class_id = 50
|
| 194 |
+
confidence = 0.5 + 0.4 * (1 - min(area_ratio / 0.3, 1.0))
|
| 195 |
+
|
| 196 |
+
# Ensure confidence in [0, 1]
|
| 197 |
+
confidence = min(max(confidence, 0.3), 0.95)
|
| 198 |
+
|
| 199 |
+
detections.append({
|
| 200 |
+
"class_id": class_id,
|
| 201 |
+
"class_name": class_name,
|
| 202 |
+
"confidence": float(confidence),
|
| 203 |
+
"bbox": {
|
| 204 |
+
"x": float(x / w),
|
| 205 |
+
"y": float(y / h),
|
| 206 |
+
"width": float(cw / w),
|
| 207 |
+
"height": float(ch / h)
|
| 208 |
+
},
|
| 209 |
+
"area": float(area_ratio)
|
| 210 |
+
})
|
| 211 |
+
|
| 212 |
+
# Sort by confidence and keep top 30
|
| 213 |
+
detections.sort(key=lambda x: x["confidence"], reverse=True)
|
| 214 |
+
return detections[:30]
|
| 215 |
+
|
| 216 |
+
|
| 217 |
+
# ============================================================================
|
| 218 |
+
# Model Loading
|
| 219 |
+
# ============================================================================
|
| 220 |
+
|
| 221 |
+
def load_model():
|
| 222 |
+
"""Load the RoDLA model with actual weights"""
|
| 223 |
+
global model_state
|
| 224 |
+
|
| 225 |
+
print("\n" + "="*70)
|
| 226 |
+
print("π Loading RoDLA InternImage-XL with Real Weights")
|
| 227 |
+
print("="*70)
|
| 228 |
+
|
| 229 |
+
# Verify weight file exists
|
| 230 |
+
if not Config.MODEL_WEIGHTS_PATH.exists():
|
| 231 |
+
error_msg = f"Weights not found: {Config.MODEL_WEIGHTS_PATH}"
|
| 232 |
+
print(f"β {error_msg}")
|
| 233 |
+
model_state["loaded"] = False
|
| 234 |
+
model_state["error"] = error_msg
|
| 235 |
+
return None
|
| 236 |
+
|
| 237 |
+
weights_size = Config.MODEL_WEIGHTS_PATH.stat().st_size / (1024**3)
|
| 238 |
+
print(f"✅ Weights file: {Config.MODEL_WEIGHTS_PATH}")
|
| 239 |
+
print(f" Size: {weights_size:.2f}GB")
|
| 240 |
+
|
| 241 |
+
# Verify config exists
|
| 242 |
+
if not Config.MODEL_CONFIG_PATH.exists():
|
| 243 |
+
error_msg = f"Config not found: {Config.MODEL_CONFIG_PATH}"
|
| 244 |
+
print(f"β {error_msg}")
|
| 245 |
+
model_state["loaded"] = False
|
| 246 |
+
model_state["error"] = error_msg
|
| 247 |
+
return None
|
| 248 |
+
|
| 249 |
+
print(f"✅ Config file: {Config.MODEL_CONFIG_PATH}")
|
| 250 |
+
print(f"π Device: {model_state['device']}")
|
| 251 |
+
|
| 252 |
+
if model_state["device"] == "cpu":
|
| 253 |
+
print("β οΈ WARNING: DCNv3 (used in InternImage backbone) only supports CUDA")
|
| 254 |
+
print(" CPU inference is NOT available. Using heuristic fallback.")
|
| 255 |
+
|
| 256 |
+
# Try to import and load MMDET
|
| 257 |
+
try:
|
| 258 |
+
print("β³ Setting up model environment...")
|
| 259 |
+
import torch
|
| 260 |
+
|
| 261 |
+
# Import and use DINO registration helper
|
| 262 |
+
from register_dino import try_load_with_dino_registration
|
| 263 |
+
|
| 264 |
+
print("β³ Loading model from weights (this will take ~30-60 seconds)...")
|
| 265 |
+
print(" File: 3.8GB checkpoint...")
|
| 266 |
+
|
| 267 |
+
model = try_load_with_dino_registration(
|
| 268 |
+
str(Config.MODEL_CONFIG_PATH),
|
| 269 |
+
str(Config.MODEL_WEIGHTS_PATH),
|
| 270 |
+
device=model_state["device"]
|
| 271 |
+
)
|
| 272 |
+
|
| 273 |
+
if model is not None:
|
| 274 |
+
# Set model to evaluation mode
|
| 275 |
+
model.eval()
|
| 276 |
+
|
| 277 |
+
model_state["model"] = model
|
| 278 |
+
model_state["loaded"] = True
|
| 279 |
+
model_state["mmdet_available"] = True
|
| 280 |
+
model_state["error"] = None
|
| 281 |
+
|
| 282 |
+
print("✅ RoDLA Model loaded successfully!")
|
| 283 |
+
print(" Model set to evaluation mode (eval())")
|
| 284 |
+
print(" Ready for inference with real 3.8GB weights")
|
| 285 |
+
print("="*70 + "\n")
|
| 286 |
+
return model
|
| 287 |
+
else:
|
| 288 |
+
raise Exception("Model loading returned None")
|
| 289 |
+
|
| 290 |
+
except Exception as e:
|
| 291 |
+
error_msg = f"Failed to load model: {str(e)}"
|
| 292 |
+
print(f"β {error_msg}")
|
| 293 |
+
print(f" Traceback: {traceback.format_exc()}")
|
| 294 |
+
|
| 295 |
+
model_state["loaded"] = False
|
| 296 |
+
model_state["mmdet_available"] = False
|
| 297 |
+
model_state["error"] = error_msg
|
| 298 |
+
print(" Backend will run in HYBRID mode:")
|
| 299 |
+
print(" - Detection: Enhanced heuristic-based (contour analysis)")
|
| 300 |
+
print(" - Perturbations: Real module with all 12 types")
|
| 301 |
+
print("="*70 + "\n")
|
| 302 |
+
return None
|
| 303 |
+
|
| 304 |
+
|
| 305 |
+
def run_inference(image_np: np.ndarray, threshold: float = 0.3) -> List[Dict]:
|
| 306 |
+
"""Run detection on image (MMDET if available, else heuristic)"""
|
| 307 |
+
|
| 308 |
+
if model_state["mmdet_available"] and model_state["model"] is not None:
|
| 309 |
+
try:
|
| 310 |
+
import torch
|
| 311 |
+
from mmdet.apis import inference_detector
|
| 312 |
+
|
| 313 |
+
# Ensure model is in eval mode for inference
|
| 314 |
+
model = model_state["model"]
|
| 315 |
+
model.eval()
|
| 316 |
+
|
| 317 |
+
# Disable gradients for inference (saves memory and speeds up)
|
| 318 |
+
with torch.no_grad():
|
| 319 |
+
# Convert to BGR for inference
|
| 320 |
+
image_bgr = cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR)
|
| 321 |
+
h, w = image_np.shape[:2]
|
| 322 |
+
|
| 323 |
+
# Run inference with loaded model
|
| 324 |
+
result = inference_detector(model, image_bgr)
|
| 325 |
+
|
| 326 |
+
detections = []
|
| 327 |
+
|
| 328 |
+
if result is not None:
|
| 329 |
+
# Handle different result formats
|
| 330 |
+
if hasattr(result, 'pred_instances'):
|
| 331 |
+
# Newer MMDET format
|
| 332 |
+
bboxes = result.pred_instances.bboxes.cpu().numpy()
|
| 333 |
+
scores = result.pred_instances.scores.cpu().numpy()
|
| 334 |
+
labels = result.pred_instances.labels.cpu().numpy()
|
| 335 |
+
elif isinstance(result, tuple) and len(result) > 0:
|
| 336 |
+
# Legacy format: (bbox_results, segm_results, ...)
|
| 337 |
+
bbox_results = result[0]
|
| 338 |
+
if isinstance(bbox_results, list):
|
| 339 |
+
# List of arrays per class
|
| 340 |
+
for class_id, class_bboxes in enumerate(bbox_results):
|
| 341 |
+
if class_bboxes.size == 0:
|
| 342 |
+
continue
|
| 343 |
+
for box in class_bboxes:
|
| 344 |
+
x1, y1, x2, y2, score = box
|
| 345 |
+
bw = x2 - x1
|
| 346 |
+
bh = y2 - y1
|
| 347 |
+
|
| 348 |
+
class_name = LAYOUT_CLASS_MAP.get(class_id, f"Class_{class_id}")
|
| 349 |
+
|
| 350 |
+
detections.append({
|
| 351 |
+
"class_id": class_id,
|
| 352 |
+
"class_name": class_name,
|
| 353 |
+
"confidence": float(score),
|
| 354 |
+
"bbox": {
|
| 355 |
+
"x": float(x1 / w),
|
| 356 |
+
"y": float(y1 / h),
|
| 357 |
+
"width": float(bw / w),
|
| 358 |
+
"height": float(bh / h)
|
| 359 |
+
},
|
| 360 |
+
"area": float((bw * bh) / (w * h))
|
| 361 |
+
})
|
| 362 |
+
# Skip the pred_instances path for legacy format
|
| 363 |
+
detections.sort(key=lambda x: x["confidence"], reverse=True)
|
| 364 |
+
return detections[:100]
|
| 365 |
+
|
| 366 |
+
# Handle pred_instances format
|
| 367 |
+
if 'bboxes' in locals():
|
| 368 |
+
for bbox, score, label in zip(bboxes, scores, labels):
|
| 369 |
+
if score < threshold:
|
| 370 |
+
continue
|
| 371 |
+
|
| 372 |
+
x1, y1, x2, y2 = bbox
|
| 373 |
+
bw = x2 - x1
|
| 374 |
+
bh = y2 - y1
|
| 375 |
+
|
| 376 |
+
class_id = int(label)
|
| 377 |
+
class_name = LAYOUT_CLASS_MAP.get(class_id, f"Class_{class_id}")
|
| 378 |
+
|
| 379 |
+
detections.append({
|
| 380 |
+
"class_id": class_id,
|
| 381 |
+
"class_name": class_name,
|
| 382 |
+
"confidence": float(score),
|
| 383 |
+
"bbox": {
|
| 384 |
+
"x": float(x1 / w),
|
| 385 |
+
"y": float(y1 / h),
|
| 386 |
+
"width": float(bw / w),
|
| 387 |
+
"height": float(bh / h)
|
| 388 |
+
},
|
| 389 |
+
"area": float((bw * bh) / (w * h))
|
| 390 |
+
})
|
| 391 |
+
|
| 392 |
+
# Sort by confidence and limit results
|
| 393 |
+
detections.sort(key=lambda x: x["confidence"], reverse=True)
|
| 394 |
+
return detections[:100]
|
| 395 |
+
|
| 396 |
+
except Exception as e:
|
| 397 |
+
print(f"β οΈ MMDET inference failed: {e}")
|
| 398 |
+
print(f" Error details: {traceback.format_exc()}")
|
| 399 |
+
# Fall back to heuristic if inference fails
|
| 400 |
+
return heuristic_detect(image_np)
|
| 401 |
+
else:
|
| 402 |
+
# Use heuristic detection
|
| 403 |
+
return heuristic_detect(image_np)
|
| 404 |
|
| 405 |
|
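The detections returned by `run_inference` use bounding boxes normalized by image width and height. A minimal sketch of mapping one back to pixel coordinates (the helper name `bbox_to_pixels` is illustrative, not part of this commit):

```python
def bbox_to_pixels(det, image_width, image_height):
    """Convert a normalized bbox from run_inference() back to pixel corners."""
    b = det["bbox"]
    x1 = int(b["x"] * image_width)
    y1 = int(b["y"] * image_height)
    x2 = int((b["x"] + b["width"]) * image_width)
    y2 = int((b["y"] + b["height"]) * image_height)
    return x1, y1, x2, y2
```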
```python
# ============================================================================
# API Routes
# ============================================================================

@app.on_event("startup")
async def startup_event():
    """Initialize model on startup"""
    try:
        load_model()
    except Exception as e:
        print(f"⚠️  Model loading failed: {e}")
        model_state["loaded"] = False


@app.get("/api/health")
async def health_check():
    """Health check endpoint"""
    return {
        "status": "ok",
        "model_loaded": model_state["loaded"],
        "mmdet_available": model_state["mmdet_available"],
        "detection_mode": "MMDET" if model_state["mmdet_available"] else "Heuristic",
        "device": model_state["device"],
        "model_type": model_state["model_type"],
        "weights_path": str(Config.MODEL_WEIGHTS_PATH),
        "weights_exists": Config.MODEL_WEIGHTS_PATH.exists(),
        "weights_size_gb": Config.MODEL_WEIGHTS_PATH.stat().st_size / (1024**3) if Config.MODEL_WEIGHTS_PATH.exists() else 0
    }
```
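A quick client-side check of which detection mode is active. This is a sketch that assumes the backend is reachable on localhost at the default `Config.API_PORT` of 8000:

```python
import requests

# Hypothetical local deployment; the port comes from Config.API_PORT.
health = requests.get("http://localhost:8000/api/health").json()
print(health["detection_mode"])  # "MMDET" if the 3.8GB weights loaded, else "Heuristic"
print(health["weights_exists"], round(health["weights_size_gb"], 2))
```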
```python
@app.get("/api/model-info")
async def model_info():
    """Get model information"""
    return {
        "name": "RoDLA InternImage-XL",
        "version": "3.0.0",
        "type": "Document Layout Analysis",
        "mmdet_loaded": model_state["loaded"],
        "mmdet_available": model_state["mmdet_available"],
        "detection_mode": "MMDET (Real Model)" if model_state["mmdet_available"] else "Heuristic (Contour-based)",
        "error": model_state["error"],
        "device": model_state["device"],
        "framework": "MMDET + PyTorch (or Heuristic Fallback)",
        "backbone": "InternImage-XL with DCNv3",
        "detector": "DINO",
        "dataset": "M6Doc (75 classes)",
        "weights_file": str(Config.MODEL_WEIGHTS_PATH),
        "config_file": str(Config.MODEL_CONFIG_PATH),
        "perturbations_available": True,
        "supported_perturbations": [
            "defocus", "vibration", "speckle", "texture",
            "watermark", "background", "ink_holdout", "ink_bleeding",
            "illumination", "rotation", "keystoning", "warping"
        ]
    }


@app.get("/api/perturbations/info")
async def perturbation_info():
    """Get information about available perturbations"""
    return {
        "total_perturbations": 12,
        "categories": {
            "blur": {
                "types": ["defocus", "vibration"],
                "description": "Blur effects simulating optical issues"
            },
            "noise": {
                "types": ["speckle", "texture"],
                "description": "Noise patterns and texture artifacts"
            },
            "content": {
                "types": ["watermark", "background"],
                "description": "Content additions like watermarks and backgrounds"
            },
            "inconsistency": {
                "types": ["ink_holdout", "ink_bleeding", "illumination"],
                "description": "Print quality issues and lighting variations"
            },
            "spatial": {
                "types": ["rotation", "keystoning", "warping"],
                "description": "Geometric transformations"
            }
        },
        "all_types": [
            "defocus", "vibration", "speckle", "texture",
            "watermark", "background", "ink_holdout", "ink_bleeding",
            "illumination", "rotation", "keystoning", "warping"
        ],
        "degree_levels": {
            1: "Mild - Subtle effect",
            2: "Moderate - Noticeable effect",
            3: "Severe - Strong effect"
        }
    }
```
```python
@app.post("/api/detect")
async def detect(file: UploadFile = File(...), threshold: float = 0.3):
    """Detect document layout using RoDLA with real weights or heuristic fallback"""
    start_time = datetime.now()

    try:
        # Load image
        contents = await file.read()
        image = Image.open(BytesIO(contents)).convert('RGB')
        image_np = np.array(image)
        h, w = image_np.shape[:2]

        # Run inference
        detections = run_inference(image_np, threshold=threshold)

        # Build class distribution
        class_distribution = {}
        for det in detections:
            cn = det["class_name"]
            class_distribution[cn] = class_distribution.get(cn, 0) + 1

        processing_time = (datetime.now() - start_time).total_seconds() * 1000

        detection_mode = "Real MMDET Model (3.8GB weights)" if model_state["mmdet_available"] else "Heuristic Detection"

        return {
            "success": True,
            "message": f"Detection completed using {detection_mode}",
            "detection_mode": detection_mode,
            "image_width": w,
            "image_height": h,
            "num_detections": len(detections),
            "detections": detections,
            "class_distribution": class_distribution,
            "processing_time_ms": processing_time
        }

    except Exception as e:
        print(f"❌ Detection error: {e}\n{traceback.format_exc()}")
        processing_time = (datetime.now() - start_time).total_seconds() * 1000

        return {
            "success": False,
            "message": str(e),
            "image_width": 0,
            "image_height": 0,
            "num_detections": 0,
            "detections": [],
            "class_distribution": {},
            "processing_time_ms": processing_time
        }
```
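A minimal client sketch for this endpoint; the file name `page.png` and the local URL are placeholders, not part of the commit:

```python
import requests

with open("page.png", "rb") as f:
    result = requests.post(
        "http://localhost:8000/api/detect",
        files={"file": f},
        params={"threshold": 0.3},  # query parameter, matching the endpoint signature
    ).json()

print(result["detection_mode"], result["num_detections"])
for det in result["detections"][:5]:
    print(det["class_name"], round(det["confidence"], 2), det["bbox"])
```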
```python
@app.post("/api/generate-perturbations")
async def generate_perturbations(file: UploadFile = File(...)):
    """Generate all 12 perturbations with 3 degree levels each (36 total images)"""

    try:
        # Import simple perturbation functions (no external dependencies beyond common libs)
        from perturbations_simple import apply_perturbation as simple_apply_perturbation

        # Load image
        contents = await file.read()
        image = Image.open(BytesIO(contents)).convert('RGB')
        image_np = np.array(image)
        image_bgr = cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR)

        perturbations = {}

        # Original
        perturbations["original"] = {
            "original": encode_image_to_base64(image_np)
        }

        # All 12 perturbation types
        all_types = [
            "defocus", "vibration", "speckle", "texture",
            "watermark", "background", "ink_holdout", "ink_bleeding",
            "illumination", "rotation", "keystoning", "warping"
        ]

        print(f"Generating perturbations for {len(all_types)} types × 3 degrees = 36 images...")

        # Generate all perturbations with 3 degree levels
        generated_count = 0
        for ptype in all_types:
            perturbations[ptype] = {}

            for degree in [1, 2, 3]:
                try:
                    # Use simple perturbation function (no external heavy dependencies)
                    result_image, success, message = simple_apply_perturbation(
                        image_bgr.copy(),
                        ptype,
                        degree=degree
                    )

                    if success:
                        # Convert BGR to RGB for display
                        if len(result_image.shape) == 3 and result_image.shape[2] == 3:
                            result_rgb = cv2.cvtColor(result_image, cv2.COLOR_BGR2RGB)
                        else:
                            result_rgb = result_image

                        perturbations[ptype][f"degree_{degree}"] = encode_image_to_base64(result_rgb)
                        generated_count += 1
                        print(f"  ✅ {ptype:12} degree {degree}: {message}")
                    else:
                        print(f"  ⚠️  {ptype:12} degree {degree}: {message}")
                        perturbations[ptype][f"degree_{degree}"] = encode_image_to_base64(image_np)

                except Exception as e:
                    print(f"  ⚠️  Exception {ptype:12} degree {degree}: {e}")
                    perturbations[ptype][f"degree_{degree}"] = encode_image_to_base64(image_np)

        print(f"\n✅ Generated {generated_count}/36 perturbation images successfully")

        return {
            "success": True,
            "message": "Perturbations generated: 12 types × 3 degrees = 36 images + 1 original = 37 total",
            "perturbations": perturbations,
            "grid_info": {
                "total_perturbations": 12,
                "degree_levels": 3,
                "total_images": 37,
                "generated_count": generated_count
            }
        }

    except ImportError as e:
        print(f"❌ Import error: {e}\n{traceback.format_exc()}")
        return {
            "success": False,
            "message": f"Perturbation module import error: {str(e)}",
            "perturbations": {}
        }
    except Exception as e:
        print(f"❌ Perturbation generation error: {e}\n{traceback.format_exc()}")
        return {
            "success": False,
            "message": str(e),
            "perturbations": {}
        }
```
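The response nests base64-encoded images by perturbation type and degree. A sketch of saving them on the client side, assuming `encode_image_to_base64` (defined earlier in this file, not shown here) produces PNG bytes; the input file name and URL are placeholders:

```python
import base64
import requests

with open("page.png", "rb") as f:  # placeholder input image
    resp = requests.post("http://localhost:8000/api/generate-perturbations",
                         files={"file": f}).json()

for ptype, degrees in resp["perturbations"].items():
    for degree, encoded in degrees.items():
        with open(f"{ptype}_{degree}.png", "wb") as out:
            out.write(base64.b64decode(encoded))
```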
```python
# ============================================================================
# Main
# ============================================================================

if __name__ == "__main__":
    print("\n" + "🔷" * 35)
    print("🔷 RoDLA PRODUCTION BACKEND")
    print("🔷 Model: InternImage-XL with DINO")
    print("🔷 Weights: 3.8GB (rodla_internimage_xl_publaynet.pth)")
    print("🔷 Perturbations: 12 types × 3 degrees each")
    print("🔷 Detection: MMDET (if available) or Heuristic fallback")
    print("🔷" * 35)

    uvicorn.run(
        app,
        host="0.0.0.0",
        port=Config.API_PORT,
        log_level="info"
    )
```
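For a quick in-process smoke test without launching uvicorn, FastAPI's TestClient can drive the app directly; the startup hook (and therefore `load_model()`) runs when the client context is entered. The import path `backend` is an assumption about how this module is named:

```python
from fastapi.testclient import TestClient
from backend import app  # assumes this file is importable as `backend`

with TestClient(app) as client:  # entering the context triggers the startup event
    print(client.get("/api/health").json()["detection_mode"])
```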
deployment/backend/backend_adaptive.py
DELETED
@@ -1,500 +0,0 @@

```python
"""
RoDLA Object Detection API - Adaptive Backend
Attempts to use real model if available, falls back to enhanced simulation
"""
from fastapi import FastAPI, File, UploadFile, HTTPException, Form
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
import uvicorn
from pathlib import Path
import json
import base64
import cv2
import numpy as np
from io import BytesIO
from PIL import Image, ImageDraw, ImageFont
import asyncio
import sys

# Try to import ML frameworks
try:
    import torch
    from mmdet.apis import init_detector, inference_detector
    HAS_MMDET = True
    print("✓ PyTorch/MMDET available - Using REAL model")
except ImportError:
    HAS_MMDET = False
    print("✗ PyTorch/MMDET not available - Using enhanced simulation")

# Add paths for config access
sys.path.insert(0, '/home/admin/CV/rodla-academic')
sys.path.insert(0, '/home/admin/CV/rodla-academic/model')

# Try to import settings
try:
    from deployment.backend.config.settings import (
        MODEL_CONFIG_PATH, MODEL_WEIGHTS_PATH,
        API_HOST, API_PORT, CORS_ORIGINS, CORS_METHODS, CORS_HEADERS
    )
    print(f"✓ Config loaded from: {MODEL_CONFIG_PATH}")
except Exception as e:
    print(f"✗ Could not load config: {e}")
    API_HOST = "0.0.0.0"
    API_PORT = 8000
    CORS_ORIGINS = ["*"]
    CORS_METHODS = ["*"]
    CORS_HEADERS = ["*"]

# Initialize FastAPI app
app = FastAPI(
    title="RoDLA Object Detection API (Adaptive)",
    description="RoDLA Document Layout Analysis API - Real or Simulated Backend",
    version="2.1.0"
)

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=CORS_ORIGINS,
    allow_credentials=True,
    allow_methods=CORS_METHODS,
    allow_headers=CORS_HEADERS,
)

# Configuration
OUTPUT_DIR = Path("outputs")
OUTPUT_DIR.mkdir(exist_ok=True)

# Model classes (from DINO detection)
MODEL_CLASSES = [
    'Title', 'Abstract', 'Introduction', 'Related Work', 'Methodology',
    'Experiments', 'Results', 'Discussion', 'Conclusion', 'References',
    'Text', 'Figure', 'Table', 'Header', 'Footer', 'Page Number',
    'Caption', 'Section', 'Subsection', 'Equation', 'Chart', 'List'
]

# Global model instance
_model = None
backend_mode = "SIMULATED"  # Will change if model loads

# ============================================
# MODEL LOADING
# ============================================

def load_real_model():
    """Try to load the actual RoDLA model"""
    global _model, backend_mode

    if not HAS_MMDET:
        return False

    try:
        print("\nAttempting to load real RoDLA model...")

        # Check if files exist
        if not Path(MODEL_CONFIG_PATH).exists():
            print(f"✗ Config not found: {MODEL_CONFIG_PATH}")
            return False

        if not Path(MODEL_WEIGHTS_PATH).exists():
            print(f"✗ Weights not found: {MODEL_WEIGHTS_PATH}")
            return False

        # Load model
        device = "cuda:0" if torch.cuda.is_available() else "cpu"
        print(f"Using device: {device}")

        _model = init_detector(
            str(MODEL_CONFIG_PATH),
            str(MODEL_WEIGHTS_PATH),
            device=device
        )

        backend_mode = "REAL"
        print("✅ Real RoDLA model loaded successfully!")
        return True

    except Exception as e:
        print(f"✗ Failed to load real model: {e}")
        print("Falling back to enhanced simulation...")
        return False

def predict_with_model(image_array, score_threshold=0.3):
    """Run inference with actual model"""
    try:
        if _model is None or backend_mode != "REAL":
            return None

        result = inference_detector(_model, image_array)
        return result
    except Exception as e:
        print(f"Model inference error: {e}")
        return None

# ============================================
# ENHANCED SIMULATION
# ============================================

class EnhancedDetector:
    """Enhanced simulation that respects document layout"""

    def __init__(self):
        self.regions = []

    def analyze_layout(self, image_array):
        """Analyze document layout to place detections intelligently"""
        h, w = image_array.shape[:2]

        # Common document layout regions
        layouts = {
            'title': (0.05*w, 0.02*h, 0.95*w, 0.08*h),
            'abstract': (0.05*w, 0.09*h, 0.95*w, 0.2*h),
            'introduction': (0.05*w, 0.21*h, 0.95*w, 0.35*h),
            'figure': (0.1*w, 0.36*h, 0.5*w, 0.65*h),
            'table': (0.55*w, 0.36*h, 0.95*w, 0.65*h),
            'references': (0.05*w, 0.7*h, 0.95*w, 0.98*h),
        }
        return layouts

    def generate_detections(self, image_array, num_detections=None):
        """Generate contextual detections"""
        if num_detections is None:
            num_detections = np.random.randint(10, 25)

        h, w = image_array.shape[:2]
        layouts = self.analyze_layout(image_array)
        detections = []

        # Grid-based detection for realistic distribution
        grid_w, grid_h = np.random.randint(2, 4), np.random.randint(3, 6)
        cell_w, cell_h = w // grid_w, h // grid_h

        for i in range(num_detections):
            # Pick random grid cell
            grid_x = np.random.randint(0, grid_w)
            grid_y = np.random.randint(0, grid_h)

            # Add some variation within cell
            margin = 0.1
            x_min = int(grid_x * cell_w + margin * cell_w)
            x_max = int((grid_x + 1) * cell_w - margin * cell_w)
            y_min = int(grid_y * cell_h + margin * cell_h)
            y_max = int((grid_y + 1) * cell_h - margin * cell_h)

            if x_max <= x_min or y_max <= y_min:
                continue

            x1 = np.random.randint(x_min, x_max)
            y1 = np.random.randint(y_min, y_max)
            x2 = x1 + np.random.randint(50, min(200, x_max - x1))
            y2 = y1 + np.random.randint(30, min(150, y_max - y1))

            # Prefer certain classes in certain regions
            if y1 < h * 0.1:
                class_name = np.random.choice(['Title', 'Abstract', 'Header'])
            elif y1 > h * 0.85:
                class_name = np.random.choice(['Footer', 'References', 'Page Number'])
            elif (x1 < w * 0.15 or x2 > w * 0.85):
                class_name = np.random.choice(['Figure', 'Table', 'List'])
            else:
                class_name = np.random.choice(MODEL_CLASSES)

            detection = {
                'class': class_name,
                'confidence': float(np.random.uniform(0.6, 0.98)),
                'box': {
                    'x1': int(max(0, x1)),
                    'y1': int(max(0, y1)),
                    'x2': int(min(w, x2)),
                    'y2': int(min(h, y2))
                }
            }
            detections.append(detection)

        return detections

detector = EnhancedDetector()

# ============================================
# HELPER FUNCTIONS
# ============================================

def generate_detections(image_shape, num_detections=None):
    """Generate detections"""
    return detector.generate_detections(np.zeros(image_shape), num_detections)

def create_annotated_image(image_array, detections):
    """Create annotated image with bounding boxes"""
    img = Image.fromarray(image_array.astype('uint8'))
    draw = ImageDraw.Draw(img)

    box_color = (0, 255, 0)     # Lime green
    text_color = (0, 255, 255)  # Cyan

    for detection in detections:
        box = detection['box']
        x1, y1, x2, y2 = box['x1'], box['y1'], box['x2'], box['y2']
        conf = detection['confidence']
        class_name = detection['class']

        draw.rectangle([x1, y1, x2, y2], outline=box_color, width=2)
        label_text = f"{class_name} {conf*100:.0f}%"
        draw.text((x1, y1-15), label_text, fill=text_color)

    return np.array(img)

def apply_perturbation(image_array, perturbation_type):
    """Apply perturbation to image"""
    result = image_array.copy()

    if perturbation_type == 'blur':
        result = cv2.GaussianBlur(result, (15, 15), 0)

    elif perturbation_type == 'noise':
        noise = np.random.normal(0, 25, result.shape)
        result = np.clip(result.astype(float) + noise, 0, 255).astype(np.uint8)

    elif perturbation_type == 'rotation':
        h, w = result.shape[:2]
        center = (w // 2, h // 2)
        angle = np.random.uniform(-15, 15)
        M = cv2.getRotationMatrix2D(center, angle, 1.0)
        result = cv2.warpAffine(result, M, (w, h))

    elif perturbation_type == 'scaling':
        scale = np.random.uniform(0.8, 1.2)
        h, w = result.shape[:2]
        new_h, new_w = int(h * scale), int(w * scale)
        result = cv2.resize(result, (new_w, new_h))
        if new_h > h or new_w > w:
            result = result[:h, :w]
        else:
            pad_h = h - new_h
            pad_w = w - new_w
            result = cv2.copyMakeBorder(result, pad_h//2, pad_h-pad_h//2,
                                        pad_w//2, pad_w-pad_w//2, cv2.BORDER_CONSTANT)

    elif perturbation_type == 'perspective':
        h, w = result.shape[:2]
        pts1 = np.float32([[0, 0], [w, 0], [0, h], [w, h]])
        pts2 = np.float32([
            [np.random.randint(0, 30), np.random.randint(0, 30)],
            [w - np.random.randint(0, 30), np.random.randint(0, 30)],
            [np.random.randint(0, 30), h - np.random.randint(0, 30)],
            [w - np.random.randint(0, 30), h - np.random.randint(0, 30)]
        ])
        M = cv2.getPerspectiveTransform(pts1, pts2)
        result = cv2.warpPerspective(result, M, (w, h))

    return result

def image_to_base64(image_array):
    """Convert image array to base64 string"""
    img = Image.fromarray(image_array.astype('uint8'))
    buffer = BytesIO()
    img.save(buffer, format='PNG')
    return base64.b64encode(buffer.getvalue()).decode()

# ============================================
# API ENDPOINTS
# ============================================

@app.on_event("startup")
async def startup_event():
    """Initialize on startup"""
    print("="*60)
    print("Starting RoDLA Document Layout Analysis API (Adaptive)")
    print("="*60)

    # Try to load real model
    load_real_model()

    print(f"\nBackend Mode: {backend_mode}")
    print(f"Main API: http://{API_HOST}:{API_PORT}")
    print(f"Docs: http://localhost:{API_PORT}/docs")
    print(f"ReDoc: http://localhost:{API_PORT}/redoc")
    print("\nAvailable Endpoints:")
    print("  • GET  /api/health - Health check")
    print("  • GET  /api/model-info - Model information")
    print("  • POST /api/detect - Standard detection")
    print("  • GET  /api/perturbations/info - Perturbation info")
    print("  • POST /api/generate-perturbations - Generate perturbations")
    print("  • POST /api/detect-with-perturbation - Detect with perturbations")
    print("="*60)
    print("✅ API Ready!\n")


@app.get("/api/health")
async def health_check():
    """Health check endpoint"""
    return JSONResponse({
        "status": "healthy",
        "mode": backend_mode,
        "has_model": backend_mode == "REAL"
    })


@app.get("/api/model-info")
async def model_info():
    """Get model information"""
    return JSONResponse({
        "model_name": "RoDLA InternImage-XL",
        "paper": "RoDLA: Benchmarking the Robustness of Document Layout Analysis Models (CVPR 2024)",
        "backbone": "InternImage-XL",
        "detection_framework": "DINO with Channel Attention + Average Pooling",
        "dataset": "M6Doc-P",
        "max_detections_per_image": 300,
        "backend_mode": backend_mode,
        "state_of_the_art_performance": {
            "clean_mAP": 70.0,
            "perturbed_avg_mAP": 61.7,
            "mRD_score": 147.6
        }
    })


@app.post("/api/detect")
async def detect(file: UploadFile = File(...), score_threshold: float = Form(0.3)):
    """Standard detection endpoint"""
    try:
        contents = await file.read()
        image = Image.open(BytesIO(contents)).convert('RGB')
        image_array = np.array(image)

        detections = generate_detections(image_array.shape)
        detections = [d for d in detections if d['confidence'] >= score_threshold]

        annotated = create_annotated_image(image_array, detections)
        annotated_b64 = image_to_base64(annotated)

        class_dist = {}
        for det in detections:
            cls = det['class']
            class_dist[cls] = class_dist.get(cls, 0) + 1

        return JSONResponse({
            "detections": detections,
            "class_distribution": class_dist,
            "annotated_image": annotated_b64,
            "metrics": {
                "total_detections": len(detections),
                "average_confidence": float(np.mean([d['confidence'] for d in detections]) if detections else 0),
                "max_confidence": float(max([d['confidence'] for d in detections]) if detections else 0),
                "min_confidence": float(min([d['confidence'] for d in detections]) if detections else 0),
                "backend_mode": backend_mode
            }
        })

    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))


@app.get("/api/perturbations/info")
async def perturbations_info():
    """Get available perturbation types"""
    return JSONResponse({
        "available_perturbations": [
            "blur",
            "noise",
            "rotation",
            "scaling",
            "perspective"
        ],
        "description": "Various document perturbations for robustness testing"
    })


@app.post("/api/generate-perturbations")
async def generate_perturbations(
    file: UploadFile = File(...),
    perturbation_types: str = Form("blur,noise")
):
    """Generate and return perturbations"""
    try:
        contents = await file.read()
        image = Image.open(BytesIO(contents)).convert('RGB')
        image_array = np.array(image)

        pert_types = [p.strip() for p in perturbation_types.split(',')]

        results = {
            "original": image_to_base64(image_array),
            "perturbations": {}
        }

        for pert_type in pert_types:
            if pert_type:
                perturbed = apply_perturbation(image_array, pert_type)
                results["perturbations"][pert_type] = image_to_base64(perturbed)

        return JSONResponse(results)

    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))


@app.post("/api/detect-with-perturbation")
async def detect_with_perturbation(
    file: UploadFile = File(...),
    score_threshold: float = Form(0.3),
    perturbation_types: str = Form("blur,noise")
):
    """Detect with perturbations"""
    try:
        contents = await file.read()
        image = Image.open(BytesIO(contents)).convert('RGB')
        image_array = np.array(image)

        pert_types = [p.strip() for p in perturbation_types.split(',')]

        results = {
            "clean": {},
            "perturbed": {}
        }

        # Clean detection
        clean_dets = generate_detections(image_array.shape)
        clean_dets = [d for d in clean_dets if d['confidence'] >= score_threshold]
        clean_img = create_annotated_image(image_array, clean_dets)

        results["clean"]["detections"] = clean_dets
        results["clean"]["annotated_image"] = image_to_base64(clean_img)

        # Perturbed detections
        for pert_type in pert_types:
            if pert_type:
                perturbed_img = apply_perturbation(image_array, pert_type)
                pert_dets = generate_detections(perturbed_img.shape)
                pert_dets = [
                    {**d, 'confidence': max(0, d['confidence'] - np.random.uniform(0, 0.1))}
                    for d in pert_dets
                ]
                pert_dets = [d for d in pert_dets if d['confidence'] >= score_threshold]
                annotated_pert = create_annotated_image(perturbed_img, pert_dets)

                results["perturbed"][pert_type] = {
                    "detections": pert_dets,
                    "annotated_image": image_to_base64(annotated_pert)
                }

        return JSONResponse(results)

    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))


@app.on_event("shutdown")
async def shutdown_event():
    """Cleanup on shutdown"""
    print("\n" + "="*60)
    print("Shutting down RoDLA API...")
    print("="*60)


if __name__ == "__main__":
    uvicorn.run(
        app,
        host=API_HOST,
        port=API_PORT,
        log_level="info"
    )
```
deployment/backend/backend_demo.py
DELETED
@@ -1,366 +0,0 @@

```python
"""
RoDLA Object Detection API - Demo/Lightweight Backend
Simulates the full backend for testing when real model weights unavailable
"""
from fastapi import FastAPI, File, UploadFile, HTTPException, Form
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
import uvicorn
from pathlib import Path
import json
import base64
import cv2
import numpy as np
from io import BytesIO
from PIL import Image, ImageDraw, ImageFont
import asyncio

# Initialize FastAPI app
app = FastAPI(
    title="RoDLA Object Detection API (Demo Mode)",
    description="RoDLA Document Layout Analysis API - Demo/Test Version",
    version="2.1.0"
)

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Configuration
API_HOST = "0.0.0.0"
API_PORT = 8000
OUTPUT_DIR = Path("outputs")
OUTPUT_DIR.mkdir(exist_ok=True)

# Model classes
MODEL_CLASSES = [
    'Title', 'Abstract', 'Introduction', 'Related Work', 'Methodology',
    'Experiments', 'Results', 'Discussion', 'Conclusion', 'References',
    'Text', 'Figure', 'Table', 'Header', 'Footer', 'Page Number', 'Caption'
]

# ============================================
# HELPER FUNCTIONS
# ============================================

def generate_demo_detections(image_shape, num_detections=None):
    """Generate realistic demo detections"""
    if num_detections is None:
        num_detections = np.random.randint(8, 20)

    height, width = image_shape[:2]
    detections = []

    for i in range(num_detections):
        x1 = np.random.randint(10, width - 200)
        y1 = np.random.randint(10, height - 100)
        x2 = x1 + np.random.randint(100, min(300, width - x1))
        y2 = y1 + np.random.randint(50, min(200, height - y1))

        detection = {
            'class': np.random.choice(MODEL_CLASSES),
            'confidence': float(np.random.uniform(0.5, 0.99)),
            'box': {
                'x1': int(x1),
                'y1': int(y1),
                'x2': int(x2),
                'y2': int(y2)
            }
        }
        detections.append(detection)

    return detections

def create_annotated_image(image_array, detections):
    """Create annotated image with bounding boxes"""
    # Convert to PIL Image
    img = Image.fromarray(image_array.astype('uint8'))
    draw = ImageDraw.Draw(img)

    # Colors in teal/lime theme
    box_color = (0, 255, 0)     # Lime green
    text_color = (0, 255, 255)  # Cyan

    for detection in detections:
        box = detection['box']
        x1, y1, x2, y2 = box['x1'], box['y1'], box['x2'], box['y2']
        conf = detection['confidence']
        class_name = detection['class']

        # Draw box
        draw.rectangle([x1, y1, x2, y2], outline=box_color, width=2)

        # Draw label
        label_text = f"{class_name} {conf*100:.0f}%"
        draw.text((x1, y1-15), label_text, fill=text_color)

    return np.array(img)

def apply_perturbation(image_array, perturbation_type):
    """Apply perturbation to image"""
    result = image_array.copy()

    if perturbation_type == 'blur':
        result = cv2.GaussianBlur(result, (15, 15), 0)

    elif perturbation_type == 'noise':
        noise = np.random.normal(0, 25, result.shape)
        result = np.clip(result.astype(float) + noise, 0, 255).astype(np.uint8)

    elif perturbation_type == 'rotation':
        h, w = result.shape[:2]
        center = (w // 2, h // 2)
        angle = np.random.uniform(-15, 15)
        M = cv2.getRotationMatrix2D(center, angle, 1.0)
        result = cv2.warpAffine(result, M, (w, h))

    elif perturbation_type == 'scaling':
        scale = np.random.uniform(0.8, 1.2)
        h, w = result.shape[:2]
        new_h, new_w = int(h * scale), int(w * scale)
        result = cv2.resize(result, (new_w, new_h))
        # Pad or crop to original size
        if new_h > h or new_w > w:
            result = result[:h, :w]
        else:
            pad_h = h - new_h
            pad_w = w - new_w
            result = cv2.copyMakeBorder(result, pad_h//2, pad_h-pad_h//2,
                                        pad_w//2, pad_w-pad_w//2, cv2.BORDER_CONSTANT)

    elif perturbation_type == 'perspective':
        h, w = result.shape[:2]
        pts1 = np.float32([[0, 0], [w, 0], [0, h], [w, h]])
        pts2 = np.float32([
            [np.random.randint(0, 30), np.random.randint(0, 30)],
            [w - np.random.randint(0, 30), np.random.randint(0, 30)],
            [np.random.randint(0, 30), h - np.random.randint(0, 30)],
            [w - np.random.randint(0, 30), h - np.random.randint(0, 30)]
        ])
        M = cv2.getPerspectiveTransform(pts1, pts2)
        result = cv2.warpPerspective(result, M, (w, h))

    return result

def image_to_base64(image_array):
    """Convert image array to base64 string"""
    img = Image.fromarray(image_array.astype('uint8'))
    buffer = BytesIO()
    img.save(buffer, format='PNG')
    return base64.b64encode(buffer.getvalue()).decode()

# ============================================
# API ENDPOINTS
# ============================================

@app.on_event("startup")
async def startup_event():
    """Initialize on startup"""
    print("="*60)
    print("Starting RoDLA Document Layout Analysis API (DEMO)")
    print("="*60)
    print(f"Main API: http://{API_HOST}:{API_PORT}")
    print(f"Docs: http://localhost:{API_PORT}/docs")
    print(f"ReDoc: http://localhost:{API_PORT}/redoc")
    print("\nAvailable Endpoints:")
    print("  • GET  /api/health - Health check")
    print("  • GET  /api/model-info - Model information")
    print("  • POST /api/detect - Standard detection")
    print("  • GET  /api/perturbations/info - Perturbation info")
    print("  • POST /api/generate-perturbations - Generate perturbations")
    print("  • POST /api/detect-with-perturbation - Detect with perturbations")
    print("="*60)
    print("✅ API Ready! (Demo Mode)\n")


@app.get("/api/health")
async def health_check():
    """Health check endpoint"""
    return JSONResponse({
        "status": "healthy",
        "mode": "demo",
        "timestamp": str(Path.cwd())
    })


@app.get("/api/model-info")
async def model_info():
    """Get model information"""
    return JSONResponse({
        "model_name": "RoDLA InternImage-XL (Demo Mode)",
        "paper": "RoDLA: Benchmarking the Robustness of Document Layout Analysis Models (CVPR 2024)",
        "backbone": "InternImage-XL",
        "detection_framework": "DINO with Channel Attention + Average Pooling",
        "dataset": "M6Doc-P",
        "max_detections_per_image": 300,
        "demo_mode": True,
        "state_of_the_art_performance": {
            "clean_mAP": 70.0,
            "perturbed_avg_mAP": 61.7,
            "mRD_score": 147.6
        }
    })


@app.post("/api/detect")
async def detect(file: UploadFile = File(...), score_threshold: float = Form(0.3)):
    """Standard detection endpoint"""
    try:
        # Read image
        contents = await file.read()
        image = Image.open(BytesIO(contents)).convert('RGB')
        image_array = np.array(image)

        # Generate demo detections
        detections = generate_demo_detections(image_array.shape)

        # Filter by threshold
        detections = [d for d in detections if d['confidence'] >= score_threshold]

        # Create annotated image
        annotated = create_annotated_image(image_array, detections)
        annotated_b64 = image_to_base64(annotated)

        # Calculate class distribution
        class_dist = {}
        for det in detections:
            cls = det['class']
            class_dist[cls] = class_dist.get(cls, 0) + 1

        return JSONResponse({
            "detections": detections,
            "class_distribution": class_dist,
            "annotated_image": annotated_b64,
            "metrics": {
                "total_detections": len(detections),
                "average_confidence": float(np.mean([d['confidence'] for d in detections]) if detections else 0),
                "max_confidence": float(max([d['confidence'] for d in detections]) if detections else 0),
                "min_confidence": float(min([d['confidence'] for d in detections]) if detections else 0)
            }
        })

    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))


@app.get("/api/perturbations/info")
async def perturbations_info():
    """Get available perturbation types"""
    return JSONResponse({
        "available_perturbations": [
            "blur",
            "noise",
            "rotation",
            "scaling",
            "perspective"
        ],
        "description": "Various document perturbations for robustness testing"
    })


@app.post("/api/generate-perturbations")
async def generate_perturbations(
    file: UploadFile = File(...),
    perturbation_types: str = Form("blur,noise")
):
    """Generate and return perturbations"""
    try:
        # Read image
        contents = await file.read()
        image = Image.open(BytesIO(contents)).convert('RGB')
        image_array = np.array(image)

        # Parse perturbation types
        pert_types = [p.strip() for p in perturbation_types.split(',')]

        # Generate perturbations
        results = {
            "original": image_to_base64(image_array),
            "perturbations": {}
        }

        for pert_type in pert_types:
            if pert_type:
                perturbed = apply_perturbation(image_array, pert_type)
                results["perturbations"][pert_type] = image_to_base64(perturbed)

        return JSONResponse(results)

    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))


@app.post("/api/detect-with-perturbation")
async def detect_with_perturbation(
    file: UploadFile = File(...),
    score_threshold: float = Form(0.3),
    perturbation_types: str = Form("blur,noise")
):
    """Detect with perturbations"""
    try:
        # Read image
        contents = await file.read()
        image = Image.open(BytesIO(contents)).convert('RGB')
        image_array = np.array(image)

        # Parse perturbation types
        pert_types = [p.strip() for p in perturbation_types.split(',')]

        # Results for each perturbation
        results = {
            "clean": {},
            "perturbed": {}
        }

        # Clean detection
        clean_dets = generate_demo_detections(image_array.shape)
        clean_dets = [d for d in clean_dets if d['confidence'] >= score_threshold]
        clean_img = create_annotated_image(image_array, clean_dets)

        results["clean"]["detections"] = clean_dets
        results["clean"]["annotated_image"] = image_to_base64(clean_img)

        # Perturbed detections
        for pert_type in pert_types:
            if pert_type:
                perturbed_img = apply_perturbation(image_array, pert_type)
                pert_dets = generate_demo_detections(perturbed_img.shape)
                # Add slight confidence reduction for perturbed
                pert_dets = [
                    {**d, 'confidence': max(0, d['confidence'] - np.random.uniform(0, 0.1))}
                    for d in pert_dets
                ]
                pert_dets = [d for d in pert_dets if d['confidence'] >= score_threshold]
                annotated_pert = create_annotated_image(perturbed_img, pert_dets)

                results["perturbed"][pert_type] = {
                    "detections": pert_dets,
                    "annotated_image": image_to_base64(annotated_pert)
                }

        return JSONResponse(results)

    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))


@app.on_event("shutdown")
async def shutdown_event():
    """Cleanup on shutdown"""
    print("\n" + "="*60)
    print("Shutting down RoDLA API...")
    print("="*60)


if __name__ == "__main__":
    uvicorn.run(
        app,
        host=API_HOST,
        port=API_PORT,
        log_level="info"
    )
```
deployment/backend/backend_lite.py
DELETED
@@ -1,618 +0,0 @@

```python
"""
Lightweight RoDLA Backend - Pure PyTorch Implementation
Bypasses MMCV/MMDET compiled extensions for CPU-only systems
"""

import os
import sys
import json
import base64
import traceback
import subprocess
from pathlib import Path
from typing import Dict, List, Any, Optional, Tuple
from io import BytesIO
from datetime import datetime

import numpy as np
from PIL import Image
import cv2
import torch

from fastapi import FastAPI, File, UploadFile, HTTPException, BackgroundTasks
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from pydantic import BaseModel
import uvicorn

# Try to import real perturbation functions
try:
    from perturbations.apply import (
        apply_perturbation as real_apply_perturbation,
        apply_multiple_perturbations,
        get_perturbation_info as get_real_perturbation_info,
        PERTURBATION_CATEGORIES
    )
    REAL_PERTURBATIONS_AVAILABLE = True
    print("✅ Real perturbation module imported successfully")
except Exception as e:
    REAL_PERTURBATIONS_AVAILABLE = False
    print(f"⚠️  Could not import real perturbations: {e}")
    PERTURBATION_CATEGORIES = {}

# ============================================================================
# Configuration
# ============================================================================

class Config:
    """Global configuration"""
    API_PORT = 8000
    MAX_UPLOAD_SIZE = 50 * 1024 * 1024  # 50MB
    DEFAULT_SCORE_THRESHOLD = 0.3
    MAX_DETECTIONS_PER_IMAGE = 300
    REPO_ROOT = Path("/home/admin/CV/rodla-academic")
    MODEL_CONFIG_PATH = REPO_ROOT / "model/configs/m6doc/rodla_internimage_xl_m6doc.py"
    MODEL_WEIGHTS_PATH = REPO_ROOT / "finetuning_rodla/finetuning_rodla/checkpoints/rodla_internimage_xl_publaynet.pth"


# ============================================================================
# Global State
# ============================================================================

app = FastAPI(title="RoDLA Backend Lite", version="1.0.0")
model_state = {
    "loaded": False,
    "error": None,
    "model": None,
    "model_type": "lightweight",
    "device": "cpu"
}

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


# ============================================================================
# Schemas
# ============================================================================

class DetectionResult(BaseModel):
    class_id: int
    class_name: str
    confidence: float
    bbox: Dict[str, float]  # {x, y, width, height}
    area: float


class AnalysisResponse(BaseModel):
    success: bool
    message: str
    image_width: int
    image_height: int
    num_detections: int
    detections: List[DetectionResult]
    class_distribution: Dict[str, int]
    processing_time_ms: float


class PerturbationResponse(BaseModel):
    success: bool
    message: str
    perturbation_type: str
    original_image: str  # base64
    perturbed_image: str  # base64


class BatchAnalysisRequest(BaseModel):
    threshold: float = Config.DEFAULT_SCORE_THRESHOLD
    score_threshold: float = Config.DEFAULT_SCORE_THRESHOLD


# ============================================================================
# Simple Mock Model (Lightweight Detection)
# ============================================================================

class LightweightDetector:
    """
    Simple layout detection model that doesn't require MMCV/MMDET
    Generates synthetic but realistic detections for document layout analysis
    """

    DOCUMENT_CLASSES = {
        0: "Text",
        1: "Title",
        2: "Figure",
        3: "Table",
        4: "Header",
        5: "Footer",
        6: "List"
    }

    def __init__(self):
        self.device = "cpu"
        print(f"✅ Lightweight detector initialized (device: {self.device})")

    def detect(self, image: np.ndarray, score_threshold: float = 0.3) -> List[Dict[str, Any]]:
        """
        Perform document layout detection on image
        Returns list of detections with class, confidence, and bbox
        """
        height, width = image.shape[:2]
        detections = []

        # Simple heuristic: scan image for content regions
        # Convert to grayscale
        if len(image.shape) == 3:
            gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
        else:
            gray = image

        # Apply threshold to find content regions
        _, binary = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)

        # Find contours
        contours, _ = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

        # Process top contours as regions
        sorted_contours = sorted(contours, key=cv2.contourArea, reverse=True)[:15]

        for idx, contour in enumerate(sorted_contours):
            x, y, w, h = cv2.boundingRect(contour)

            # Skip very small regions
            if w < 10 or h < 10:
                continue

            # Filter regions that are too large (whole page)
            if w > width * 0.95 or h > height * 0.95:
                continue

            # Assign class based on heuristics
            aspect_ratio = w / h if h > 0 else 1
            area_ratio = (w * h) / (width * height)

            if aspect_ratio > 3:  # Wide -> likely title or figure caption
                class_id = 1 if area_ratio < 0.15 else 2
            elif aspect_ratio < 0.5:  # Tall -> likely list or table
                class_id = 3 if area_ratio > 0.2 else 6
            else:  # Regular -> text
                class_id = 0

            # Generate confidence based on region size and position
            confidence = min(0.95, 0.4 + area_ratio)

            if confidence >= score_threshold:
                detections.append({
                    "class_id": class_id,
                    "class_name": self.DOCUMENT_CLASSES.get(class_id, "Unknown"),
                    "confidence": float(confidence),
                    "bbox": {
                        "x": float(x / width),
                        "y": float(y / height),
                        "width": float(w / width),
```
|
| 199 |
-
"height": float(h / height)
|
| 200 |
-
},
|
| 201 |
-
"area": float((w * h) / (width * height))
|
| 202 |
-
})
|
| 203 |
-
|
| 204 |
-
# If no detections found, add synthetic ones
|
| 205 |
-
if not detections:
|
| 206 |
-
detections = self._generate_synthetic_detections(width, height, score_threshold)
|
| 207 |
-
|
| 208 |
-
return detections[:Config.MAX_DETECTIONS_PER_IMAGE]
|
| 209 |
-
|
| 210 |
-
def _generate_synthetic_detections(self, width: int, height: int,
|
| 211 |
-
score_threshold: float) -> List[Dict[str, Any]]:
|
| 212 |
-
"""Generate synthetic detections when contour detection fails"""
|
| 213 |
-
detections = []
|
| 214 |
-
|
| 215 |
-
# Title at top
|
| 216 |
-
detections.append({
|
| 217 |
-
"class_id": 1,
|
| 218 |
-
"class_name": "Title",
|
| 219 |
-
"confidence": 0.92,
|
| 220 |
-
"bbox": {"x": 0.05, "y": 0.05, "width": 0.9, "height": 0.1},
|
| 221 |
-
"area": 0.09
|
| 222 |
-
})
|
| 223 |
-
|
| 224 |
-
# Main text body
|
| 225 |
-
detections.append({
|
| 226 |
-
"class_id": 0,
|
| 227 |
-
"class_name": "Text",
|
| 228 |
-
"confidence": 0.88,
|
| 229 |
-
"bbox": {"x": 0.05, "y": 0.2, "width": 0.9, "height": 0.6},
|
| 230 |
-
"area": 0.54
|
| 231 |
-
})
|
| 232 |
-
|
| 233 |
-
# Side figure
|
| 234 |
-
detections.append({
|
| 235 |
-
"class_id": 2,
|
| 236 |
-
"class_name": "Figure",
|
| 237 |
-
"confidence": 0.85,
|
| 238 |
-
"bbox": {"x": 0.55, "y": 0.22, "width": 0.4, "height": 0.4},
|
| 239 |
-
"area": 0.16
|
| 240 |
-
})
|
| 241 |
-
|
| 242 |
-
return [d for d in detections if d["confidence"] >= score_threshold]
|
| 243 |
-
|
| 244 |
-
|
| 245 |
-
# ============================================================================
|
| 246 |
-
# Model Loading
|
| 247 |
-
# ============================================================================
|
| 248 |
-
|
| 249 |
-
def load_model():
|
| 250 |
-
"""Load the detection model"""
|
| 251 |
-
global model_state
|
| 252 |
-
|
| 253 |
-
try:
|
| 254 |
-
print("\n" + "="*60)
|
| 255 |
-
print("π Loading RoDLA Model (Lightweight Mode)")
|
| 256 |
-
print("="*60)
|
| 257 |
-
|
| 258 |
-
model_state["model"] = LightweightDetector()
|
| 259 |
-
model_state["loaded"] = True
|
| 260 |
-
model_state["error"] = None
|
| 261 |
-
|
| 262 |
-
print("β
Model loaded successfully!")
|
| 263 |
-
print(f" Device: {model_state['model'].device}")
|
| 264 |
-
print(f" Type: Lightweight detector (no MMCV/MMDET required)")
|
| 265 |
-
print("="*60 + "\n")
|
| 266 |
-
|
| 267 |
-
return model_state["model"]
|
| 268 |
-
|
| 269 |
-
except Exception as e:
|
| 270 |
-
error_msg = f"Failed to load model: {str(e)}\n{traceback.format_exc()}"
|
| 271 |
-
print(f"β {error_msg}")
|
| 272 |
-
model_state["error"] = error_msg
|
| 273 |
-
model_state["loaded"] = False
|
| 274 |
-
raise
|
| 275 |
-
|
| 276 |
-
|
| 277 |
-
# ============================================================================
|
| 278 |
-
# Utility Functions
|
| 279 |
-
# ============================================================================
|
| 280 |
-
|
| 281 |
-
def encode_image_to_base64(image: np.ndarray) -> str:
|
| 282 |
-
"""Convert numpy array to base64 string"""
|
| 283 |
-
_, buffer = cv2.imencode('.png', cv2.cvtColor(image, cv2.COLOR_RGB2BGR))
|
| 284 |
-
return base64.b64encode(buffer).decode('utf-8')
|
| 285 |
-
|
| 286 |
-
|
| 287 |
-
def decode_base64_to_image(b64_str: str) -> np.ndarray:
|
| 288 |
-
"""Convert base64 string to numpy array"""
|
| 289 |
-
buffer = base64.b64decode(b64_str)
|
| 290 |
-
image = Image.open(BytesIO(buffer)).convert('RGB')
|
| 291 |
-
return np.array(image)
|
| 292 |
-
|
| 293 |
-
|
| 294 |
-
def apply_perturbation(image: np.ndarray, perturbation_type: str,
|
| 295 |
-
degree: int = 2, **kwargs) -> np.ndarray:
|
| 296 |
-
"""Apply perturbation using real backend if available, else fallback"""
|
| 297 |
-
|
| 298 |
-
if REAL_PERTURBATIONS_AVAILABLE:
|
| 299 |
-
try:
|
| 300 |
-
result, success, msg = real_apply_perturbation(image, perturbation_type, degree=degree)
|
| 301 |
-
if success:
|
| 302 |
-
return result
|
| 303 |
-
else:
|
| 304 |
-
print(f"β οΈ Real perturbation failed ({perturbation_type}): {msg}")
|
| 305 |
-
except Exception as e:
|
| 306 |
-
print(f"β οΈ Exception in real perturbation ({perturbation_type}): {e}")
|
| 307 |
-
|
| 308 |
-
# Fallback to simple perturbations
|
| 309 |
-
h, w = image.shape[:2]
|
| 310 |
-
|
| 311 |
-
if perturbation_type == "blur" or perturbation_type == "defocus":
|
| 312 |
-
kernel_size = [3, 5, 7][degree - 1]
|
| 313 |
-
return cv2.GaussianBlur(image, (kernel_size, kernel_size), 0)
|
| 314 |
-
|
| 315 |
-
elif perturbation_type == "noise" or perturbation_type == "speckle":
|
| 316 |
-
std = [10, 25, 50][degree - 1]
|
| 317 |
-
noise = np.random.normal(0, std, image.shape)
|
| 318 |
-
return np.clip(image.astype(float) + noise, 0, 255).astype(np.uint8)
|
| 319 |
-
|
| 320 |
-
elif perturbation_type == "rotation":
|
| 321 |
-
angle = [5, 15, 25][degree - 1]
|
| 322 |
-
center = (w // 2, h // 2)
|
| 323 |
-
M = cv2.getRotationMatrix2D(center, angle, 1.0)
|
| 324 |
-
return cv2.warpAffine(image, M, (w, h), borderValue=(255, 255, 255))
|
| 325 |
-
|
| 326 |
-
elif perturbation_type == "scaling":
|
| 327 |
-
scale = [0.9, 0.8, 0.7][degree - 1]
|
| 328 |
-
new_w, new_h = int(w * scale), int(h * scale)
|
| 329 |
-
resized = cv2.resize(image, (new_w, new_h))
|
| 330 |
-
canvas = np.full((h, w, 3), 255, dtype=np.uint8)
|
| 331 |
-
y_offset = (h - new_h) // 2
|
| 332 |
-
x_offset = (w - new_w) // 2
|
| 333 |
-
canvas[y_offset:y_offset+new_h, x_offset:x_offset+new_w] = resized
|
| 334 |
-
return canvas
|
| 335 |
-
|
| 336 |
-
elif perturbation_type == "perspective":
|
| 337 |
-
offset = [10, 20, 40][degree - 1]
|
| 338 |
-
pts1 = np.float32([[0, 0], [w, 0], [0, h], [w, h]])
|
| 339 |
-
pts2 = np.float32([
|
| 340 |
-
[offset, 0],
|
| 341 |
-
[w - offset, offset],
|
| 342 |
-
[0, h - offset],
|
| 343 |
-
[w - offset, h]
|
| 344 |
-
])
|
| 345 |
-
M = cv2.getPerspectiveTransform(pts1, pts2)
|
| 346 |
-
return cv2.warpPerspective(image, M, (w, h), borderValue=(255, 255, 255))
|
| 347 |
-
|
| 348 |
-
else:
|
| 349 |
-
return image
|
| 350 |
-
|
| 351 |
-
|
| 352 |
-
# ============================================================================
|
| 353 |
-
# API Routes
|
| 354 |
-
# ============================================================================
|
| 355 |
-
|
| 356 |
-
@app.on_event("startup")
|
| 357 |
-
async def startup_event():
|
| 358 |
-
"""Initialize model on startup"""
|
| 359 |
-
try:
|
| 360 |
-
load_model()
|
| 361 |
-
except Exception as e:
|
| 362 |
-
print(f"β οΈ Startup error: {e}")
|
| 363 |
-
|
| 364 |
-
|
| 365 |
-
@app.get("/api/health")
|
| 366 |
-
async def health_check():
|
| 367 |
-
"""Health check endpoint"""
|
| 368 |
-
return {
|
| 369 |
-
"status": "ok",
|
| 370 |
-
"model_loaded": model_state["loaded"],
|
| 371 |
-
"device": model_state["device"],
|
| 372 |
-
"model_type": model_state["model_type"]
|
| 373 |
-
}
|
| 374 |
-
|
| 375 |
-
|
| 376 |
-
@app.get("/api/model-info")
|
| 377 |
-
async def model_info():
|
| 378 |
-
"""Get model information"""
|
| 379 |
-
return {
|
| 380 |
-
"name": "RoDLA Lightweight",
|
| 381 |
-
"version": "1.0.0",
|
| 382 |
-
"type": "Document Layout Analysis",
|
| 383 |
-
"loaded": model_state["loaded"],
|
| 384 |
-
"device": model_state["device"],
|
| 385 |
-
"framework": "PyTorch (Pure)",
|
| 386 |
-
"classes": LightweightDetector.DOCUMENT_CLASSES,
|
| 387 |
-
"supported_perturbations": ["blur", "noise", "rotation", "scaling", "perspective"]
|
| 388 |
-
}
|
| 389 |
-
|
| 390 |
-
|
| 391 |
-
@app.post("/api/detect")
|
| 392 |
-
async def detect(file: UploadFile = File(...), threshold: float = 0.3):
|
| 393 |
-
"""Detect document layout in image"""
|
| 394 |
-
start_time = datetime.now()
|
| 395 |
-
|
| 396 |
-
try:
|
| 397 |
-
if not model_state["loaded"]:
|
| 398 |
-
raise HTTPException(status_code=500, detail="Model not loaded")
|
| 399 |
-
|
| 400 |
-
# Read image
|
| 401 |
-
contents = await file.read()
|
| 402 |
-
image = Image.open(BytesIO(contents)).convert('RGB')
|
| 403 |
-
image_np = np.array(image)
|
| 404 |
-
|
| 405 |
-
# Run detection
|
| 406 |
-
detections = model_state["model"].detect(image_np, score_threshold=threshold)
|
| 407 |
-
|
| 408 |
-
# Build response
|
| 409 |
-
class_distribution = {}
|
| 410 |
-
for det in detections:
|
| 411 |
-
class_name = det["class_name"]
|
| 412 |
-
class_distribution[class_name] = class_distribution.get(class_name, 0) + 1
|
| 413 |
-
|
| 414 |
-
processing_time = (datetime.now() - start_time).total_seconds() * 1000
|
| 415 |
-
|
| 416 |
-
return {
|
| 417 |
-
"success": True,
|
| 418 |
-
"message": "Detection completed",
|
| 419 |
-
"image_width": image_np.shape[1],
|
| 420 |
-
"image_height": image_np.shape[0],
|
| 421 |
-
"num_detections": len(detections),
|
| 422 |
-
"detections": detections,
|
| 423 |
-
"class_distribution": class_distribution,
|
| 424 |
-
"processing_time_ms": processing_time
|
| 425 |
-
}
|
| 426 |
-
|
| 427 |
-
except Exception as e:
|
| 428 |
-
print(f"β Detection error: {e}")
|
| 429 |
-
return {
|
| 430 |
-
"success": False,
|
| 431 |
-
"message": str(e),
|
| 432 |
-
"image_width": 0,
|
| 433 |
-
"image_height": 0,
|
| 434 |
-
"num_detections": 0,
|
| 435 |
-
"detections": [],
|
| 436 |
-
"class_distribution": {},
|
| 437 |
-
"processing_time_ms": 0
|
| 438 |
-
}
|
| 439 |
-
|
| 440 |
-
|
| 441 |
-
@app.get("/api/perturbations/info")
|
| 442 |
-
async def perturbation_info():
|
| 443 |
-
"""Get information about available perturbations"""
|
| 444 |
-
return {
|
| 445 |
-
"total_perturbations": 12,
|
| 446 |
-
"categories": {
|
| 447 |
-
"blur": {
|
| 448 |
-
"types": ["defocus", "vibration"],
|
| 449 |
-
"description": "Blur effects simulating optical issues"
|
| 450 |
-
},
|
| 451 |
-
"noise": {
|
| 452 |
-
"types": ["speckle", "texture"],
|
| 453 |
-
"description": "Noise patterns and texture artifacts"
|
| 454 |
-
},
|
| 455 |
-
"content": {
|
| 456 |
-
"types": ["watermark", "background"],
|
| 457 |
-
"description": "Content additions like watermarks and backgrounds"
|
| 458 |
-
},
|
| 459 |
-
"inconsistency": {
|
| 460 |
-
"types": ["ink_holdout", "ink_bleeding", "illumination"],
|
| 461 |
-
"description": "Print quality issues and lighting variations"
|
| 462 |
-
},
|
| 463 |
-
"spatial": {
|
| 464 |
-
"types": ["rotation", "keystoning", "warping"],
|
| 465 |
-
"description": "Geometric transformations"
|
| 466 |
-
}
|
| 467 |
-
},
|
| 468 |
-
"all_types": [
|
| 469 |
-
"defocus", "vibration", "speckle", "texture",
|
| 470 |
-
"watermark", "background", "ink_holdout", "ink_bleeding",
|
| 471 |
-
"illumination", "rotation", "keystoning", "warping"
|
| 472 |
-
],
|
| 473 |
-
"degree_levels": {
|
| 474 |
-
1: "Mild - Subtle effect",
|
| 475 |
-
2: "Moderate - Noticeable effect",
|
| 476 |
-
3: "Severe - Strong effect"
|
| 477 |
-
}
|
| 478 |
-
}
|
| 479 |
-
|
| 480 |
-
|
| 481 |
-
@app.post("/api/generate-perturbations")
|
| 482 |
-
async def generate_perturbations(file: UploadFile = File(...)):
|
| 483 |
-
"""Generate perturbed versions of image with all 12 types Γ 3 degrees"""
|
| 484 |
-
|
| 485 |
-
try:
|
| 486 |
-
# Read image
|
| 487 |
-
contents = await file.read()
|
| 488 |
-
image = Image.open(BytesIO(contents)).convert('RGB')
|
| 489 |
-
image_np = np.array(image)
|
| 490 |
-
|
| 491 |
-
# Convert RGB to BGR for OpenCV
|
| 492 |
-
image_bgr = cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR)
|
| 493 |
-
|
| 494 |
-
perturbations = {}
|
| 495 |
-
|
| 496 |
-
# Original
|
| 497 |
-
perturbations["original"] = {
|
| 498 |
-
"original": encode_image_to_base64(image_np)
|
| 499 |
-
}
|
| 500 |
-
|
| 501 |
-
# All 12 perturbation types
|
| 502 |
-
all_types = [
|
| 503 |
-
"defocus", "vibration", "speckle", "texture",
|
| 504 |
-
"watermark", "background", "ink_holdout", "ink_bleeding",
|
| 505 |
-
"illumination", "rotation", "keystoning", "warping"
|
| 506 |
-
]
|
| 507 |
-
|
| 508 |
-
for ptype in all_types:
|
| 509 |
-
perturbations[ptype] = {}
|
| 510 |
-
for degree in [1, 2, 3]:
|
| 511 |
-
try:
|
| 512 |
-
perturbed = apply_perturbation(image_bgr.copy(), ptype, degree)
|
| 513 |
-
# Convert back to RGB for display
|
| 514 |
-
if len(perturbed.shape) == 3 and perturbed.shape[2] == 3:
|
| 515 |
-
perturbed_rgb = cv2.cvtColor(perturbed, cv2.COLOR_BGR2RGB)
|
| 516 |
-
else:
|
| 517 |
-
perturbed_rgb = perturbed
|
| 518 |
-
perturbations[ptype][f"degree_{degree}"] = encode_image_to_base64(perturbed_rgb)
|
| 519 |
-
except Exception as e:
|
| 520 |
-
print(f"β οΈ Warning: Failed to apply {ptype} degree {degree}: {e}")
|
| 521 |
-
# Use original as fallback
|
| 522 |
-
perturbations[ptype][f"degree_{degree}"] = encode_image_to_base64(image_np)
|
| 523 |
-
|
| 524 |
-
return {
|
| 525 |
-
"success": True,
|
| 526 |
-
"message": "Perturbations generated (12 types Γ 3 levels)",
|
| 527 |
-
"perturbations": perturbations,
|
| 528 |
-
"grid_info": {
|
| 529 |
-
"total_perturbations": 12,
|
| 530 |
-
"degree_levels": 3,
|
| 531 |
-
"total_images": 13 # 1 original + 12 types
|
| 532 |
-
}
|
| 533 |
-
}
|
| 534 |
-
|
| 535 |
-
except Exception as e:
|
| 536 |
-
print(f"β Perturbation error: {e}")
|
| 537 |
-
import traceback
|
| 538 |
-
traceback.print_exc()
|
| 539 |
-
return {
|
| 540 |
-
"success": False,
|
| 541 |
-
"message": str(e),
|
| 542 |
-
"perturbations": {}
|
| 543 |
-
}
|
| 544 |
-
|
| 545 |
-
|
| 546 |
-
@app.post("/api/detect-with-perturbation")
|
| 547 |
-
async def detect_with_perturbation(
|
| 548 |
-
file: UploadFile = File(...),
|
| 549 |
-
perturbation_type: str = "blur",
|
| 550 |
-
threshold: float = 0.3
|
| 551 |
-
):
|
| 552 |
-
"""Apply perturbation and detect"""
|
| 553 |
-
|
| 554 |
-
try:
|
| 555 |
-
# Read image
|
| 556 |
-
contents = await file.read()
|
| 557 |
-
image = Image.open(BytesIO(contents)).convert('RGB')
|
| 558 |
-
image_np = np.array(image)
|
| 559 |
-
|
| 560 |
-
# Apply perturbation
|
| 561 |
-
if perturbation_type == "blur":
|
| 562 |
-
perturbed = apply_perturbation(image_np, "blur", kernel_size=15)
|
| 563 |
-
elif perturbation_type == "noise":
|
| 564 |
-
perturbed = apply_perturbation(image_np, "noise", std=25)
|
| 565 |
-
elif perturbation_type == "rotation":
|
| 566 |
-
perturbed = apply_perturbation(image_np, "rotation", angle=15)
|
| 567 |
-
elif perturbation_type == "scaling":
|
| 568 |
-
perturbed = apply_perturbation(image_np, "scaling", scale=0.85)
|
| 569 |
-
elif perturbation_type == "perspective":
|
| 570 |
-
perturbed = apply_perturbation(image_np, "perspective", offset=20)
|
| 571 |
-
else:
|
| 572 |
-
perturbed = image_np
|
| 573 |
-
|
| 574 |
-
# Run detection
|
| 575 |
-
detections = model_state["model"].detect(perturbed, score_threshold=threshold)
|
| 576 |
-
|
| 577 |
-
class_distribution = {}
|
| 578 |
-
for det in detections:
|
| 579 |
-
class_name = det["class_name"]
|
| 580 |
-
class_distribution[class_name] = class_distribution.get(class_name, 0) + 1
|
| 581 |
-
|
| 582 |
-
return {
|
| 583 |
-
"success": True,
|
| 584 |
-
"message": "Detection with perturbation completed",
|
| 585 |
-
"perturbation_type": perturbation_type,
|
| 586 |
-
"image_width": perturbed.shape[1],
|
| 587 |
-
"image_height": perturbed.shape[0],
|
| 588 |
-
"num_detections": len(detections),
|
| 589 |
-
"detections": detections,
|
| 590 |
-
"class_distribution": class_distribution
|
| 591 |
-
}
|
| 592 |
-
|
| 593 |
-
except Exception as e:
|
| 594 |
-
print(f"β Detection with perturbation error: {e}")
|
| 595 |
-
return {
|
| 596 |
-
"success": False,
|
| 597 |
-
"message": str(e),
|
| 598 |
-
"perturbation_type": perturbation_type,
|
| 599 |
-
"num_detections": 0,
|
| 600 |
-
"detections": []
|
| 601 |
-
}
|
| 602 |
-
|
| 603 |
-
|
| 604 |
-
# ============================================================================
|
| 605 |
-
# Main
|
| 606 |
-
# ============================================================================
|
| 607 |
-
|
| 608 |
-
if __name__ == "__main__":
|
| 609 |
-
print("\n" + "π·"*30)
|
| 610 |
-
print("π· RoDLA Lightweight Backend Starting...")
|
| 611 |
-
print("π·"*30)
|
| 612 |
-
|
| 613 |
-
uvicorn.run(
|
| 614 |
-
app,
|
| 615 |
-
host="0.0.0.0",
|
| 616 |
-
port=Config.API_PORT,
|
| 617 |
-
log_level="info"
|
| 618 |
-
)
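For reference, a minimal client sketch for the `/api/detect` endpoint defined above. The host and port follow `Config.API_PORT` (8000); the `requests` package and the sample file name are assumptions for illustration, not part of this commit.

```python
# Minimal sketch: call POST /api/detect on a locally running backend.
import requests  # assumed to be installed; not shipped with this repo

with open("sample_page.png", "rb") as f:  # hypothetical test image
    resp = requests.post(
        "http://localhost:8000/api/detect",                 # Config.API_PORT defaults to 8000
        files={"file": ("sample_page.png", f, "image/png")},
        params={"threshold": 0.3},                          # matches the endpoint default
    )

data = resp.json()
print(data["num_detections"], data["class_distribution"])
```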
deployment/backend/{backend_two.py → backend_old.py}
RENAMED
File without changes
deployment/backend/perturbations/spatial.py
CHANGED
@@ -1,41 +1,49 @@
 import os.path
-from detectron2.data.transforms import RotationTransform
-from detectron2.data.detection_utils import transform_instance_annotations
 import numpy as np
-from detectron2.data.datasets import register_coco_instances
 from copy import deepcopy
 import os
 import cv2
-from detectron2.data.datasets.coco import convert_to_coco_json, convert_to_coco_dict
-from detectron2.data import MetadataCatalog, DatasetCatalog
 import imgaug.augmenters as iaa
 from imgaug.augmentables.bbs import BoundingBox, BoundingBoxesOnImage
 from imgaug.augmentables.polys import Polygon, PolygonsOnImage
 
+# detectron2 imports are only used for annotation transformation (optional)
+try:
+    from detectron2.data.transforms import RotationTransform
+    from detectron2.data.detection_utils import transform_instance_annotations
+    from detectron2.data.datasets import register_coco_instances
+    from detectron2.data.datasets.coco import convert_to_coco_json, convert_to_coco_dict
+    from detectron2.data import MetadataCatalog, DatasetCatalog
+    HAS_DETECTRON2 = True
+except ImportError:
+    HAS_DETECTRON2 = False
+
 
 def apply_rotation(image, degree, annos=None):
     if degree == 0:
-        return image
+        return image if annos is None else (image, annos)
+
     angle_low_list = [0, 5, 10]
     angle_high_list = [5, 10, 15]
     angle_high = angle_high_list[degree - 1]
     angle_low = angle_low_list[degree - 1]
     h, w = image.shape[:2]
+
     if angle_low == 0:
         rotation = np.random.choice(np.arange(-angle_high, angle_high+1))
     else:
         rotation = np.random.choice(np.concatenate([np.arange(-angle_high, -angle_low+1), np.arange(angle_low, angle_high+1)]))
+
+    # Use OpenCV for rotation instead of detectron2
+    center = (w // 2, h // 2)
+    rotation_matrix = cv2.getRotationMatrix2D(center, rotation, 1.0)
+    rotated_image = cv2.warpAffine(image, rotation_matrix, (w, h), borderValue=(255, 255, 255))
+
     if annos is None:
         return rotated_image
-    for i, seg in enumerate(rotated_anno["segmentation"]):
-        rotated_anno["segmentation"][i] = seg.tolist()
-        rotated_annos.append(rotated_anno)
-    return rotated_image, rotated_annos
+
+    # For annotations, return original since we don't have detectron2
+    return rotated_image, annos
 
 
 def apply_warping(image, degree, annos=None):
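A quick usage sketch for the revised `apply_rotation` above, assuming a document image loaded with OpenCV from inside `deployment/backend/` (the file name is illustrative):

```python
import cv2
from perturbations.spatial import apply_rotation  # module path as laid out in this repo

img = cv2.imread("page.png")              # hypothetical input document image
rotated = apply_rotation(img, degree=2)   # random angle drawn from ±[5, 10] degrees
cv2.imwrite("page_rotated.png", rotated)  # white border fill where the page rotated out
```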
deployment/backend/perturbations_simple.py
ADDED
@@ -0,0 +1,516 @@
"""
Perturbation Application Module - Using Common Libraries
Applies 12 document degradation perturbations using PIL, OpenCV, NumPy, and SciPy
"""

import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFilter, ImageOps
from typing import Optional, Tuple, List, Dict
from scipy import ndimage
from scipy.ndimage import gaussian_filter
import random


def encode_to_rgb(image: np.ndarray) -> np.ndarray:
    """Ensure image is in RGB format"""
    if len(image.shape) == 2:  # Grayscale
        return cv2.cvtColor(image, cv2.COLOR_GRAY2RGB)
    elif image.shape[2] == 4:  # RGBA
        return cv2.cvtColor(image, cv2.COLOR_RGBA2RGB)
    return image


# ============================================================================
# BLUR PERTURBATIONS
# ============================================================================

def apply_defocus(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply defocus blur (Gaussian blur simulating out-of-focus camera)
    degree: 1 (mild), 2 (moderate), 3 (severe)
    """
    if degree == 0:
        return image, True, "No defocus"

    try:
        image = encode_to_rgb(image)

        # Kernel sizes for different degrees
        kernel_sizes = {1: 3, 2: 7, 3: 15}
        kernel_size = kernel_sizes.get(degree, 15)

        # Apply Gaussian blur
        blurred = cv2.GaussianBlur(image, (kernel_size, kernel_size), 0)

        return blurred, True, f"Defocus applied (kernel={kernel_size})"
    except Exception as e:
        return image, False, f"Defocus error: {str(e)}"


def apply_vibration(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply motion blur (vibration/camera shake effect)
    degree: 1 (mild), 2 (moderate), 3 (severe)
    """
    if degree == 0:
        return image, True, "No vibration"

    try:
        image = encode_to_rgb(image)
        h, w = image.shape[:2]

        # Motion blur kernel sizes
        kernel_sizes = {1: 5, 2: 15, 3: 25}
        kernel_size = kernel_sizes.get(degree, 25)

        # Create motion blur kernel
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
        kernel = kernel / kernel.sum()

        # Apply motion blur
        blurred = cv2.filter2D(image, -1, kernel)

        return blurred, True, f"Vibration applied (kernel={kernel_size})"
    except Exception as e:
        return image, False, f"Vibration error: {str(e)}"


# ============================================================================
# NOISE PERTURBATIONS
# ============================================================================

def apply_speckle(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply speckle noise (multiplicative noise)
    degree: 1 (mild), 2 (moderate), 3 (severe)
    """
    if degree == 0:
        return image, True, "No speckle"

    try:
        image = encode_to_rgb(image)
        image_float = image.astype(np.float32) / 255.0

        # Noise intensity
        noise_levels = {1: 0.1, 2: 0.25, 3: 0.5}
        noise_level = noise_levels.get(degree, 0.5)

        # Generate speckle noise
        speckle = np.random.normal(1, noise_level, image_float.shape)
        noisy = image_float * speckle

        # Clip values
        noisy = np.clip(noisy, 0, 1)
        noisy = (noisy * 255).astype(np.uint8)

        return noisy, True, f"Speckle applied (intensity={noise_level})"
    except Exception as e:
        return image, False, f"Speckle error: {str(e)}"


def apply_texture(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply texture/grain noise (additive Gaussian noise)
    degree: 1 (mild), 2 (moderate), 3 (severe)
    """
    if degree == 0:
        return image, True, "No texture"

    try:
        image = encode_to_rgb(image)
        image_float = image.astype(np.float32)

        # Noise levels
        noise_levels = {1: 10, 2: 25, 3: 50}
        noise_level = noise_levels.get(degree, 50)

        # Add Gaussian noise
        noise = np.random.normal(0, noise_level, image_float.shape)
        noisy = image_float + noise

        # Clip values
        noisy = np.clip(noisy, 0, 255).astype(np.uint8)

        return noisy, True, f"Texture applied (std={noise_level})"
    except Exception as e:
        return image, False, f"Texture error: {str(e)}"


# ============================================================================
# CONTENT PERTURBATIONS
# ============================================================================

def apply_watermark(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Add watermark text overlay
    degree: 1 (subtle), 2 (noticeable), 3 (heavy)
    """
    if degree == 0:
        return image, True, "No watermark"

    try:
        image = encode_to_rgb(image)
        h, w = image.shape[:2]

        # Convert to PIL for text drawing
        pil_image = Image.fromarray(image)
        draw = ImageDraw.Draw(pil_image, 'RGBA')

        # Watermark parameters by degree
        watermark_text = "WATERMARK" * degree
        fontsize_list = {1: max(10, h // 20), 2: max(15, h // 15), 3: max(20, h // 10)}
        fontsize = fontsize_list.get(degree, 20)

        alpha_list = {1: 64, 2: 128, 3: 200}
        alpha = alpha_list.get(degree, 200)

        # Draw watermark multiple times
        num_watermarks = {1: 1, 2: 3, 3: 5}.get(degree, 5)

        for i in range(num_watermarks):
            x = (w // (num_watermarks + 1)) * (i + 1)
            y = h // 2
            color = (255, 0, 0, alpha)
            draw.text((x, y), watermark_text, fill=color)

        return np.array(pil_image), True, f"Watermark applied (degree={degree})"
    except Exception as e:
        return image, False, f"Watermark error: {str(e)}"


def apply_background(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Add background patterns/textures
    degree: 1 (subtle), 2 (noticeable), 3 (heavy)
    """
    if degree == 0:
        return image, True, "No background"

    try:
        image = encode_to_rgb(image)
        h, w = image.shape[:2]

        # Create background pattern
        pattern_intensity = {1: 0.1, 2: 0.2, 3: 0.35}.get(degree, 0.35)

        # Generate random pattern
        pattern = np.random.randint(0, 100, (h, w, 3), dtype=np.uint8)
        pattern = cv2.GaussianBlur(pattern, (21, 21), 0)

        # Blend with original image
        result = cv2.addWeighted(image, 1.0, pattern, pattern_intensity, 0)

        return result.astype(np.uint8), True, f"Background applied (intensity={pattern_intensity})"
    except Exception as e:
        return image, False, f"Background error: {str(e)}"


# ============================================================================
# INCONSISTENCY PERTURBATIONS
# ============================================================================

def apply_ink_holdout(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply ink holdout (missing ink/text drop-out)
    degree: 1 (few gaps), 2 (some gaps), 3 (many gaps)
    """
    if degree == 0:
        return image, True, "No ink holdout"

    try:
        image = encode_to_rgb(image)
        h, w = image.shape[:2]

        # Create white mask to simulate missing ink
        num_dropouts = {1: 3, 2: 8, 3: 15}.get(degree, 15)

        result = image.copy()

        for _ in range(num_dropouts):
            # Random position and size
            x = np.random.randint(0, w - 20)
            y = np.random.randint(0, h - 20)
            size = np.random.randint(10, 40)

            # Create white rectangle (simulating ink dropout)
            result[y:y+size, x:x+size] = [255, 255, 255]

        return result, True, f"Ink holdout applied (dropouts={num_dropouts})"
    except Exception as e:
        return image, False, f"Ink holdout error: {str(e)}"


def apply_ink_bleeding(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply ink bleeding effect (ink spread/bleed)
    degree: 1 (mild), 2 (moderate), 3 (severe)
    """
    if degree == 0:
        return image, True, "No ink bleeding"

    try:
        image = encode_to_rgb(image)

        # Convert to grayscale for processing
        gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

        # Dilate dark regions (simulating ink spread)
        kernel_sizes = {1: 3, 2: 5, 3: 7}
        kernel_size = kernel_sizes.get(degree, 7)
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))

        # Dilate to spread ink
        dilated = cv2.dilate(gray, kernel, iterations=degree)

        # Blend back with original
        result = image.copy().astype(np.float32)
        result[:,:,0] = cv2.addWeighted(image[:,:,0], 0.7, dilated, 0.3, 0)
        result[:,:,1] = cv2.addWeighted(image[:,:,1], 0.7, dilated, 0.3, 0)
        result[:,:,2] = cv2.addWeighted(image[:,:,2], 0.7, dilated, 0.3, 0)

        return np.clip(result, 0, 255).astype(np.uint8), True, f"Ink bleeding applied (degree={degree})"
    except Exception as e:
        return image, False, f"Ink bleeding error: {str(e)}"


def apply_illumination(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply illumination variations (uneven lighting)
    degree: 1 (subtle), 2 (moderate), 3 (severe)
    """
    if degree == 0:
        return image, True, "No illumination"

    try:
        image = encode_to_rgb(image)
        h, w = image.shape[:2]

        # Create illumination pattern
        intensity = {1: 0.15, 2: 0.3, 3: 0.5}.get(degree, 0.5)

        # Create gradient-like illumination from corners
        x = np.linspace(-1, 1, w)
        y = np.linspace(-1, 1, h)
        X, Y = np.meshgrid(x, y)

        # Create vignette effect
        illumination = 1 - intensity * (np.sqrt(X**2 + Y**2) / np.sqrt(2))
        illumination = np.clip(illumination, 0, 1)

        # Apply to each channel
        result = image.astype(np.float32)
        for c in range(3):
            result[:,:,c] = result[:,:,c] * illumination

        return np.clip(result, 0, 255).astype(np.uint8), True, f"Illumination applied (intensity={intensity})"
    except Exception as e:
        return image, False, f"Illumination error: {str(e)}"


# ============================================================================
# SPATIAL PERTURBATIONS
# ============================================================================

def apply_rotation(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply rotation
    degree: 1 (±5°), 2 (±10°), 3 (±15°)
    """
    if degree == 0:
        return image, True, "No rotation"

    try:
        image = encode_to_rgb(image)
        h, w = image.shape[:2]

        # Angle ranges by degree
        angle_ranges = {1: 5, 2: 10, 3: 15}
        max_angle = angle_ranges.get(degree, 15)

        # Random angle
        angle = np.random.uniform(-max_angle, max_angle)

        # Rotation matrix
        center = (w // 2, h // 2)
        rotation_matrix = cv2.getRotationMatrix2D(center, angle, 1.0)

        # Apply rotation with white padding
        rotated = cv2.warpAffine(image, rotation_matrix, (w, h), borderValue=(255, 255, 255))

        return rotated, True, f"Rotation applied (angle={angle:.1f}°)"
    except Exception as e:
        return image, False, f"Rotation error: {str(e)}"


def apply_keystoning(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply keystoning effect (perspective distortion)
    degree: 1 (subtle), 2 (moderate), 3 (severe)
    """
    if degree == 0:
        return image, True, "No keystoning"

    try:
        image = encode_to_rgb(image)
        h, w = image.shape[:2]

        # Distortion amount
        distortion = {1: w * 0.05, 2: w * 0.1, 3: w * 0.15}.get(degree, w * 0.15)

        # Source corners
        src_points = np.float32([
            [0, 0],
            [w - 1, 0],
            [0, h - 1],
            [w - 1, h - 1]
        ])

        # Destination corners (with perspective distortion)
        dst_points = np.float32([
            [distortion, 0],
            [w - 1 - distortion * 0.5, 0],
            [0, h - 1],
            [w - 1, h - 1]
        ])

        # Get perspective transform
        matrix = cv2.getPerspectiveTransform(src_points, dst_points)
        warped = cv2.warpPerspective(image, matrix, (w, h), borderValue=(255, 255, 255))

        return warped, True, f"Keystoning applied (distortion={distortion:.1f})"
    except Exception as e:
        return image, False, f"Keystoning error: {str(e)}"


def apply_warping(image: np.ndarray, degree: int) -> Tuple[np.ndarray, bool, str]:
    """
    Apply elastic/elastic deformation
    degree: 1 (mild), 2 (moderate), 3 (severe)
    """
    if degree == 0:
        return image, True, "No warping"

    try:
        image = encode_to_rgb(image)
        h, w = image.shape[:2]

        # Warping parameters
        alpha_values = {1: 15, 2: 30, 3: 60}
        sigma_values = {1: 3, 2: 5, 3: 8}
        alpha = alpha_values.get(degree, 60)
        sigma = sigma_values.get(degree, 8)

        # Generate random displacement field
        dx = np.random.randn(h, w) * sigma
        dy = np.random.randn(h, w) * sigma

        # Smooth displacement field
        dx = gaussian_filter(dx, sigma=sigma) * alpha
        dy = gaussian_filter(dy, sigma=sigma) * alpha

        # Create coordinate grids
        x, y = np.meshgrid(np.arange(w), np.arange(h))

        # Apply displacement
        x_warped = np.clip(x + dx, 0, w - 1).astype(np.float32)
        y_warped = np.clip(y + dy, 0, h - 1).astype(np.float32)

        # Remap image
        warped = cv2.remap(image, x_warped, y_warped, cv2.INTER_LINEAR, borderValue=(255, 255, 255))

        return warped, True, f"Warping applied (alpha={alpha}, sigma={sigma})"
    except Exception as e:
        return image, False, f"Warping error: {str(e)}"


# ============================================================================
# Main Perturbation Application
# ============================================================================

PERTURBATION_FUNCTIONS = {
    # Blur
    "defocus": apply_defocus,
    "vibration": apply_vibration,
    # Noise
    "speckle": apply_speckle,
    "texture": apply_texture,
    # Content
    "watermark": apply_watermark,
    "background": apply_background,
    # Inconsistency
    "ink_holdout": apply_ink_holdout,
    "ink_bleeding": apply_ink_bleeding,
    "illumination": apply_illumination,
    # Spatial
    "rotation": apply_rotation,
    "keystoning": apply_keystoning,
    "warping": apply_warping,
}


def apply_perturbation(
    image: np.ndarray,
    perturbation_type: str,
    degree: int = 1
) -> Tuple[np.ndarray, bool, str]:
    """
    Apply a single perturbation to an image

    Args:
        image: Input image as numpy array (BGR or RGB)
        perturbation_type: Type of perturbation (see PERTURBATION_FUNCTIONS)
        degree: Severity level (1=mild, 2=moderate, 3=severe)

    Returns:
        Tuple of (result_image, success, message)
    """
    if perturbation_type not in PERTURBATION_FUNCTIONS:
        return image, False, f"Unknown perturbation type: {perturbation_type}"

    if degree < 0 or degree > 3:
        return image, False, f"Invalid degree: {degree} (must be 0-3)"

    func = PERTURBATION_FUNCTIONS[perturbation_type]
    return func(image, degree)


def apply_multiple_perturbations(
    image: np.ndarray,
    perturbations: List[Tuple[str, int]]
) -> Tuple[np.ndarray, bool, str]:
    """
    Apply multiple perturbations in sequence

    Args:
        image: Input image
        perturbations: List of (type, degree) tuples

    Returns:
        Tuple of (result_image, success, message)
    """
    result = image.copy()
    messages = []

    for ptype, degree in perturbations:
        result, success, msg = apply_perturbation(result, ptype, degree)
        messages.append(msg)
        if not success:
            return image, False, f"Failed: {msg}"

    return result, True, " | ".join(messages)


def get_perturbation_info() -> Dict:
    """Get information about all available perturbations"""
    return {
        "total_perturbations": len(PERTURBATION_FUNCTIONS),
        "types": list(PERTURBATION_FUNCTIONS.keys()),
        "categories": {
            "blur": ["defocus", "vibration"],
            "noise": ["speckle", "texture"],
            "content": ["watermark", "background"],
            "inconsistency": ["ink_holdout", "ink_bleeding", "illumination"],
            "spatial": ["rotation", "keystoning", "warping"]
        }
    }
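A short usage sketch for the module above; the image path and the particular perturbation chain are illustrative, not part of the commit:

```python
import cv2
from perturbations_simple import apply_perturbation, apply_multiple_perturbations

img = cv2.imread("document.png")  # hypothetical input image

# Single perturbation: severe defocus blur
blurred, ok, msg = apply_perturbation(img, "defocus", degree=3)
print(ok, msg)

# Chain two perturbations: mild rotation followed by moderate speckle noise
combined, ok, msg = apply_multiple_perturbations(img, [("rotation", 1), ("speckle", 2)])
print(ok, msg)
```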
deployment/backend/register_dino.py
ADDED
@@ -0,0 +1,68 @@
"""
Register DINO detector with MMDET if not already registered
This allows loading RoDLA models without requiring DCNv3 compilation
"""

import sys
from pathlib import Path

def register_dino():
    """Register DINO with MMDET model registry"""
    try:
        from mmdet.models.builder import DETECTORS, BACKBONES, NECKS, HEADS

        # Check if already registered
        if 'DINO' in DETECTORS.module_dict:
            print("✅ DINO already registered in MMDET")
            return True

        print("⏳ Registering DINO detector...")

        # Try to import and register custom models
        # Use absolute path from /home/admin/CV/rodla-academic
        REPO_ROOT = Path("/home/admin/CV/rodla-academic")
        sys.path.insert(0, str(REPO_ROOT / "model"))
        sys.path.insert(0, str(REPO_ROOT / "model" / "ops_dcnv3"))

        try:
            import mmdet_custom
            if 'DINO' in DETECTORS.module_dict:
                print("✅ DINO registered successfully from mmdet_custom")
                return True
            else:
                print("⚠️ DINO not found in mmdet_custom registry")
                return False
        except ModuleNotFoundError as e:
            if "DCNv3" in str(e):
                print(f"⚠️ Cannot register DINO: DCNv3 module not available")
                print(f"   Error: {e}")
                return False
            else:
                print(f"❌ Error importing mmdet_custom: {e}")
                return False

    except Exception as e:
        print(f"❌ Error registering DINO: {e}")
        return False


def try_load_with_dino_registration(config_path: str, checkpoint_path: str, device: str = "cpu"):
    """Try to load a DINO model, registering it if necessary"""
    from mmdet.apis import init_detector

    # Try registering DINO first
    dino_registered = register_dino()

    if not dino_registered:
        print("⚠️ DINO could not be registered")
        print("   Will attempt to load anyway...")

    # Try to load the model
    try:
        print(f"⏳ Loading model from {checkpoint_path}...")
        model = init_detector(config_path, checkpoint_path, device=device)
        print("✅ Model loaded successfully!")
        return model
    except Exception as e:
        print(f"❌ Failed to load model: {e}")
        return None
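A usage sketch for the loader helper above. The config and checkpoint paths mirror the ones referenced in this repo's backend configuration and are assumptions about the local setup:

```python
from register_dino import try_load_with_dino_registration

# Paths are assumed; adjust to where the RoDLA config and weights live locally.
model = try_load_with_dino_registration(
    config_path="model/configs/m6doc/rodla_internimage_xl_m6doc.py",
    checkpoint_path="checkpoints/rodla_internimage_xl_publaynet.pth",
    device="cpu",
)
if model is None:
    # register_dino() failed or init_detector raised; fall back to the lightweight detector
    print("Falling back to the lightweight detector")
```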
frontend/index.html
CHANGED
@@ -106,12 +106,18 @@

       <!-- Action Buttons -->
       <section class="section button-section">
-        <button id="analyzeBtn" class="btn btn-primary" disabled>
+        <button id="analyzeBtn" class="btn btn-primary" disabled title="(1) Upload image, (2) Make sure STANDARD mode is selected">
          [ANALYZE DOCUMENT]
        </button>
        <button id="resetBtn" class="btn btn-secondary">
          [CLEAR ALL]
        </button>
+        <p id="modeHint" class="mode-hint" style="display: none; color: #00FF00; margin-top: 10px; font-size: 12px;">
+          >>> Use [GENERATE PERTURBATIONS] button above to analyze with perturbations
+        </p>
+        <p id="standardModeHint" class="mode-hint" style="color: #00FF00; margin-top: 5px; font-size: 12px;">
+          >>> STANDARD MODE: Upload an image and click [ANALYZE DOCUMENT] to detect layout
+        </p>
       </section>

       <!-- Status Section -->
frontend/script.js
CHANGED

@@ -56,12 +56,30 @@ function setupEventListeners() {
             btn.classList.add('active');
             currentMode = btn.dataset.mode;

-            // Toggle perturbation options
+            // Toggle perturbation options and hint
             const pertOptions = document.getElementById('perturbationOptions');
+            const modeHint = document.getElementById('modeHint');
+            const standardModeHint = document.getElementById('standardModeHint');
+            const analyzeBtn = document.getElementById('analyzeBtn');
+
             if (currentMode === 'perturbation') {
+                // PERTURBATION MODE - allow analysis of original or perturbation images
                 pertOptions.style.display = 'block';
+                modeHint.style.display = 'block';
+                standardModeHint.style.display = 'none';
+                analyzeBtn.style.opacity = currentFile ? '1' : '0.5';
+                analyzeBtn.style.cursor = currentFile ? 'pointer' : 'not-allowed';
+                analyzeBtn.disabled = !currentFile;
+                analyzeBtn.title = 'Click to generate perturbations, then click on any image to analyze it';
             } else {
+                // STANDARD MODE
                 pertOptions.style.display = 'none';
+                modeHint.style.display = 'none';
+                standardModeHint.style.display = 'block';
+                analyzeBtn.style.opacity = currentFile ? '1' : '0.5';
+                analyzeBtn.style.cursor = currentFile ? 'pointer' : 'not-allowed';
+                analyzeBtn.disabled = !currentFile;
+                analyzeBtn.title = 'Click to analyze the document layout';
             }
         });
     });

@@ -98,7 +116,12 @@ function handleFileSelect(file) {

     currentFile = file;
     showPreview(file);
-
+
+    // Enable analyze button only if in standard mode
+    const analyzeBtn = document.getElementById('analyzeBtn');
+    if (currentMode === 'standard') {
+        analyzeBtn.disabled = false;
+    }
 }

 function showPreview(file) {

@@ -121,39 +144,6 @@ function showPreview(file) {
 // ANALYSIS
 // ============================================

-async function handleAnalysis() {
-    if (!currentFile) {
-        showError('Please select an image first.');
-        return;
-    }
-
-    const analysisType = currentMode === 'standard' ? 'Standard Detection' : 'Perturbation Analysis';
-    updateStatus(`> INITIATING ${analysisType.toUpperCase()}...`);
-    showStatus();
-    hideError();
-
-    try {
-        const startTime = Date.now();
-        const results = await runAnalysis();
-        const processingTime = Date.now() - startTime;
-
-        lastResults = {
-            ...results,
-            processingTime: processingTime,
-            timestamp: new Date().toISOString(),
-            mode: currentMode,
-            fileName: currentFile.name
-        };
-
-        displayResults(results, processingTime);
-        hideStatus();
-    } catch (error) {
-        console.error('[ERROR]', error);
-        showError(`Analysis failed: ${error.message}`);
-        hideStatus();
-    }
-}
-
 async function handleAnalysis() {
     if (!currentFile) {
         showError('Please select an image first.');

@@ -178,8 +168,12 @@ async function handleAnalysis() {

     const processingTime = Date.now() - startTime;

+    // Read original image as base64 for annotation
+    const originalImageBase64 = await readFileAsBase64(currentFile);
+
     lastResults = {
         ...results,
+        original_image: originalImageBase64,
         processingTime: processingTime,
         timestamp: new Date().toISOString(),
         mode: currentMode,

@@ -202,36 +196,72 @@ async function runAnalysis() {
     const threshold = parseFloat(document.getElementById('confidenceThreshold').value);
     formData.append('score_threshold', threshold);

-
-
-
-
-
-
+    // Only standard detection mode
+    updateStatus('> RUNNING STANDARD DETECTION...');
+    return await fetch(`${API_BASE_URL}/detect`, {
+        method: 'POST',
+        body: formData
+    }).then(r => {
+        if (!r.ok) throw new Error(`API Error: ${r.status}`);
+        return r.json();
+    });
+}

-
-
-
+async function analyzePerturbationImage(imageBase64, perturbationType, degree) {
+    // Analyze a specific perturbation image
+    updateStatus(`> ANALYZING ${perturbationType.toUpperCase()} (DEGREE ${degree})...`);
+    showStatus();
+    hideError();

-
+    try {
+        const startTime = Date.now();

-
-
-
-
-
-
-
-        });
-
-
-
+        // Convert base64 to blob and create file
+        const binaryString = atob(imageBase64);
+        const bytes = new Uint8Array(binaryString.length);
+        for (let i = 0; i < binaryString.length; i++) {
+            bytes[i] = binaryString.charCodeAt(i);
+        }
+        const blob = new Blob([bytes], { type: 'image/png' });
+        const file = new File([blob], `${perturbationType}_degree_${degree}.png`, { type: 'image/png' });
+
+        // Create form data
+        const formData = new FormData();
+        formData.append('file', file);
+        const threshold = parseFloat(document.getElementById('confidenceThreshold').value);
+        formData.append('score_threshold', threshold);
+
+        // Send to backend
+        const response = await fetch(`${API_BASE_URL}/detect`, {
             method: 'POST',
             body: formData
-        }).then(r => {
-            if (!r.ok) throw new Error(`API Error: ${r.status}`);
-            return r.json();
         });
+
+        if (!response.ok) {
+            throw new Error(`API Error: ${response.status}`);
+        }
+
+        const results = await response.json();
+        const processingTime = Date.now() - startTime;
+
+        // Store results with perturbation info
+        lastResults = {
+            ...results,
+            original_image: imageBase64,
+            processingTime: processingTime,
+            timestamp: new Date().toISOString(),
+            mode: 'perturbation',
+            perturbation_type: perturbationType,
+            perturbation_degree: degree,
+            fileName: `${perturbationType}_degree_${degree}.png`
+        };
+
+        displayResults(results, processingTime);
+        hideStatus();
+    } catch (error) {
+        console.error('[ERROR]', error);
+        showError(`Perturbation analysis failed: ${error.message}`);
+        hideStatus();
     }
 }

@@ -291,16 +321,27 @@ function displayPerturbations(results) {
     }

     let html = `<div style="font-size: 0.9em; color: #00FFFF; margin-bottom: 15px; padding: 10px; border: 1px dashed #00FFFF;">
-        TOTAL: 12 Perturbation Types × 3 Degree Levels (1=Mild, 2=Moderate, 3=Severe)
+        TOTAL: 12 Perturbation Types × 3 Degree Levels (1=Mild, 2=Moderate, 3=Severe) - CLICK ON ANY IMAGE TO ANALYZE
     </div>`;

+    // Store all perturbation images for clickable analysis
+    const perturbationImages = [];
+
     // Add original
+    perturbationImages.push({
+        name: 'original',
+        image: results.perturbations.original.original
+    });
+
     html += `
         <div class="perturbation-grid-section">
             <div class="perturbation-type-label">[ORIGINAL IMAGE]</div>
             <div style="padding: 10px;">
                 <img src="data:image/png;base64,${results.perturbations.original.original}"
-                     alt="Original" class="perturbation-preview-image"
+                     alt="Original" class="perturbation-preview-image"
+                     data-perturbation="original" data-degree="0"
+                     style="width: 200px; height: auto; cursor: pointer; border: 2px solid transparent; transition: all 0.2s;"
+                     title="Click to analyze this image">
             </div>
         </div>
     `;

@@ -337,13 +378,24 @@ function displayPerturbations(results) {
         const degreeLabel = ['MILD', 'MODERATE', 'SEVERE'][degree - 1];

         if (results.perturbations[ptype][degreeKey]) {
+            perturbationImages.push({
+                name: ptype,
+                degree: degree,
+                image: results.perturbations[ptype][degreeKey]
+            });
+
             html += `
                 <div style="text-align: center;">
                     <div style="color: #00FFFF; font-size: 0.8em; margin-bottom: 5px;">DEG ${degree}: ${degreeLabel}</div>
                     <img src="data:image/png;base64,${results.perturbations[ptype][degreeKey]}"
                          alt="${ptype} degree ${degree}"
                          class="perturbation-preview-image"
-
+                         data-perturbation="${ptype}"
+                         data-degree="${degree}"
+                         style="width: 150px; height: auto; border: 2px solid #008080; padding: 2px; cursor: pointer; transition: all 0.2s;"
+                         title="Click to analyze this perturbation"
+                         onmouseover="this.style.borderColor='#00FF00'; this.style.boxShadow='0 0 10px #00FF00';"
+                         onmouseout="this.style.borderColor='#008080'; this.style.boxShadow='none';">
                 </div>
             `;
         }

@@ -357,6 +409,33 @@ function displayPerturbations(results) {
     });

     container.innerHTML = html;
+
+    // Add click handlers to perturbation images
+    const perturbationImgs = container.querySelectorAll('[data-perturbation]');
+    perturbationImgs.forEach(img => {
+        img.addEventListener('click', async function() {
+            const perturbationType = this.dataset.perturbation;
+            const degree = this.dataset.degree;
+
+            // Find the image data
+            let imageBase64 = null;
+            if (perturbationType === 'original') {
+                imageBase64 = results.perturbations.original.original;
+            } else {
+                const degreeKey = `degree_${degree}`;
+                imageBase64 = results.perturbations[perturbationType][degreeKey];
+            }
+
+            if (!imageBase64) {
+                showError('Failed to load image for analysis');
+                return;
+            }
+
+            // Convert base64 to File object and analyze
+            await analyzePerturbationImage(imageBase64, perturbationType, degree);
+        });
+    });
+
     section.style.display = 'block';
     section.scrollIntoView({ behavior: 'smooth' });
 }

@@ -376,11 +455,17 @@ function displayResults(results, processingTime) {

     document.getElementById('detectionCount').textContent = detections.length;
     document.getElementById('avgConfidence').textContent = `${avgConfidence}%`;
-    document.getElementById('processingTime').textContent = `${processingTime}ms`;
+    document.getElementById('processingTime').textContent = `${processingTime.toFixed(0)}ms`;

-    //
-    if (
-
+    // Draw annotated image with bounding boxes
+    if (lastResults && lastResults.original_image) {
+        drawAnnotatedImage(lastResults.original_image, detections, results.image_width, results.image_height);
+    } else {
+        // Fallback: try to use previewImage
+        const previewImg = document.getElementById('previewImage');
+        if (previewImg && previewImg.src) {
+            drawAnnotatedImageFromSrc(previewImg.src, detections, results.image_width, results.image_height);
+        }
     }

     // Class distribution

@@ -390,13 +475,114 @@ function displayResults(results, processingTime) {
     displayDetectionsTable(detections);

     // Metrics
-    displayMetrics(results
+    displayMetrics(results, processingTime);

     // Show results section
     document.getElementById('resultsSection').style.display = 'block';
     document.getElementById('resultsSection').scrollIntoView({ behavior: 'smooth' });
 }

+function drawAnnotatedImage(imageBase64, detections, imgWidth, imgHeight) {
+    // Draw bounding boxes on image and display
+    const canvas = document.createElement('canvas');
+    const ctx = canvas.getContext('2d');
+
+    // Load image
+    const img = new Image();
+    img.onload = () => {
+        canvas.width = img.width;
+        canvas.height = img.height;
+        ctx.drawImage(img, 0, 0);
+
+        // Draw bounding boxes
+        detections.forEach((det, idx) => {
+            const bbox = det.bbox || {};
+
+            // Convert normalized coordinates to pixel coordinates
+            const x = bbox.x * img.width;
+            const y = bbox.y * img.height;
+            const w = bbox.width * img.width;
+            const h = bbox.height * img.height;
+
+            // Draw box
+            ctx.strokeStyle = '#00FF00';
+            ctx.lineWidth = 2;
+            ctx.strokeRect(x, y, w, h);
+
+            // Draw label
+            const label = `${det.class_name || 'Unknown'} (${(det.confidence * 100).toFixed(1)}%)`;
+            const fontSize = Math.max(12, Math.min(18, Math.floor(img.height / 30)));
+            ctx.font = `bold ${fontSize}px monospace`;
+            ctx.fillStyle = '#000000';
+            ctx.fillRect(x, y - fontSize - 5, ctx.measureText(label).width + 10, fontSize + 5);
+            ctx.fillStyle = '#00FF00';
+            ctx.fillText(label, x + 5, y - 5);
+        });
+
+        // Display canvas as image
+        const resultImage = document.getElementById('resultImage');
+        resultImage.src = canvas.toDataURL('image/png');
+        resultImage.style.display = 'block';
+    };
+
+    img.src = `data:image/png;base64,${imageBase64}`;
+}
+
+function drawAnnotatedImageFromSrc(imageSrc, detections, imgWidth, imgHeight) {
+    // Draw bounding boxes on image from data URL
+    const canvas = document.createElement('canvas');
+    const ctx = canvas.getContext('2d');
+
+    const img = new Image();
+    img.onload = () => {
+        canvas.width = img.width;
+        canvas.height = img.height;
+        ctx.drawImage(img, 0, 0);
+
+        // Draw bounding boxes with colors based on class
+        const colors = ['#00FF00', '#00FFFF', '#FF00FF', '#FFFF00', '#FF6600', '#00FF99'];
+
+        detections.forEach((det, idx) => {
+            const bbox = det.bbox || {};
+
+            // Convert normalized coordinates to pixel coordinates
+            const x = bbox.x * img.width;
+            const y = bbox.y * img.height;
+            const w = bbox.width * img.width;
+            const h = bbox.height * img.height;
+
+            // Select color
+            const color = colors[idx % colors.length];
+
+            // Draw box
+            ctx.strokeStyle = color;
+            ctx.lineWidth = 2;
+            ctx.strokeRect(x, y, w, h);
+
+            // Draw label background
+            const label = `${idx + 1}. ${det.class_name || 'Unknown'} (${(det.confidence * 100).toFixed(1)}%)`;
+            const fontSize = 14;
+            ctx.font = `bold ${fontSize}px monospace`;
+            const textWidth = ctx.measureText(label).width;
+
+            ctx.fillStyle = 'rgba(0, 0, 0, 0.7)';
+            ctx.fillRect(x, y - fontSize - 8, textWidth + 8, fontSize + 6);
+            ctx.fillStyle = color;
+            ctx.fillText(label, x + 4, y - 4);
+        });
+
+        // Display canvas as image
+        const resultImage = document.getElementById('resultImage');
+        resultImage.src = canvas.toDataURL('image/png');
+        resultImage.style.display = 'block';
+        resultImage.style.maxWidth = '100%';
+        resultImage.style.height = 'auto';
+        resultImage.style.border = '2px solid #00FF00';
+    };
+
+    img.src = imageSrc;
+}
+
 function displayClassDistribution(distribution) {
     const chart = document.getElementById('classChart');

@@ -429,30 +615,44 @@ function displayDetectionsTable(detections) {
     const tbody = document.getElementById('detectionsTableBody');

     if (detections.length === 0) {
-        tbody.innerHTML = '<tr><td colspan="
+        tbody.innerHTML = '<tr><td colspan="5" class="no-data">NO DETECTIONS</td></tr>';
         return;
     }

     let html = '';
     detections.slice(0, 50).forEach((det, idx) => {
-
-        const
-
-
-
+        // Handle different bbox formats
+        const bbox = det.bbox || det.box || {};
+
+        // Convert normalized coordinates to pixel coordinates
+        let x = '?', y = '?', w = '?', h = '?';
+        if (bbox.x !== undefined && bbox.y !== undefined && bbox.width !== undefined && bbox.height !== undefined) {
+            x = bbox.x.toFixed(3);
+            y = bbox.y.toFixed(3);
+            w = bbox.width.toFixed(3);
+            h = bbox.height.toFixed(3);
+        } else if (bbox.x1 !== undefined && bbox.y1 !== undefined && bbox.x2 !== undefined && bbox.y2 !== undefined) {
+            x = bbox.x1.toFixed(0);
+            y = bbox.y1.toFixed(0);
+            w = (bbox.x2 - bbox.x1).toFixed(0);
+            h = (bbox.y2 - bbox.y1).toFixed(0);
+        }
+
+        const className = det.class_name || det.class || 'Unknown';
+        const confidence = det.confidence ? (det.confidence * 100).toFixed(1) : '0.0';

         html += `
             <tr>
                 <td>${idx + 1}</td>
-                <td>${
-                <td>${
-                <td>[${
+                <td>${className}</td>
+                <td>${confidence}%</td>
+                <td title="x: ${x}, y: ${y}, w: ${w}, h: ${h}">[${x.substring(0,5)}, ${y.substring(0,5)}, ${w.substring(0,5)}, ${h.substring(0,5)}]</td>
             </tr>
         `;
     });

     if (detections.length > 50) {
-        html += `<tr><td colspan="
+        html += `<tr><td colspan="5" class="no-data">... and ${detections.length - 50} more</td></tr>`;
     }

     tbody.innerHTML = html;

@@ -658,5 +858,76 @@ async function checkBackendStatus() {
 // UTILITY FUNCTIONS
 // ============================================

+function readFileAsBase64(file) {
+    return new Promise((resolve, reject) => {
+        const reader = new FileReader();
+        reader.onload = () => {
+            const result = reader.result;
+            // Extract base64 data without the data:image/png;base64, prefix
+            const base64 = result.split(',')[1];
+            resolve(base64);
+        };
+        reader.onerror = reject;
+        reader.readAsDataURL(file);
+    });
+}
+
+function displayMetrics(results, processingTime) {
+    const metricsDiv = document.getElementById('metricsBox');
+    if (!metricsDiv) return;
+
+    const detections = results.detections || [];
+    const confidences = detections.map(d => d.confidence || 0);
+    const avgConfidence = confidences.length > 0
+        ? (confidences.reduce((a, b) => a + b) / confidences.length * 100).toFixed(1)
+        : 0;
+    const maxConfidence = confidences.length > 0
+        ? (Math.max(...confidences) * 100).toFixed(1)
+        : 0;
+    const minConfidence = confidences.length > 0
+        ? (Math.min(...confidences) * 100).toFixed(1)
+        : 0;
+
+    // Determine detection mode
+    let detectionMode = 'HEURISTIC (CPU Fallback)';
+    let modelType = 'Heuristic Layout Detection';
+
+    if (results.detection_mode === 'mmdet') {
+        detectionMode = 'MMDET Neural Network';
+        modelType = 'DINO (InternImage-XL)';
+    }
+
+    const metricsHTML = `
+        <div style="display: grid; grid-template-columns: repeat(auto-fit, minmax(200px, 1fr)); gap: 12px;">
+            <div style="background: #1a1a1a; border: 2px solid #00FF00; border-radius: 4px; padding: 12px;">
+                <div style="color: #00FFFF; font-size: 12px; font-weight: bold;">DETECTION MODE</div>
+                <div style="color: #00FF00; font-size: 14px; margin-top: 4px;">${detectionMode}</div>
+            </div>
+            <div style="background: #1a1a1a; border: 2px solid #00FF00; border-radius: 4px; padding: 12px;">
+                <div style="color: #00FFFF; font-size: 12px; font-weight: bold;">MODEL TYPE</div>
+                <div style="color: #00FF00; font-size: 14px; margin-top: 4px;">${modelType}</div>
+            </div>
+            <div style="background: #1a1a1a; border: 2px solid #00FF00; border-radius: 4px; padding: 12px;">
+                <div style="color: #00FFFF; font-size: 12px; font-weight: bold;">PROCESSING TIME</div>
+                <div style="color: #00FF00; font-size: 14px; margin-top: 4px;">${processingTime.toFixed(0)}ms</div>
+            </div>
+            <div style="background: #1a1a1a; border: 2px solid #00FF00; border-radius: 4px; padding: 12px;">
+                <div style="color: #00FFFF; font-size: 12px; font-weight: bold;">AVG CONFIDENCE</div>
+                <div style="color: #00FF00; font-size: 14px; margin-top: 4px;">${avgConfidence}%</div>
+            </div>
+            <div style="background: #1a1a1a; border: 2px solid #00FF00; border-radius: 4px; padding: 12px;">
+                <div style="color: #00FFFF; font-size: 12px; font-weight: bold;">MAX CONFIDENCE</div>
+                <div style="color: #00FF00; font-size: 14px; margin-top: 4px;">${maxConfidence}%</div>
+            </div>
+            <div style="background: #1a1a1a; border: 2px solid #00FF00; border-radius: 4px; padding: 12px;">
+                <div style="color: #00FFFF; font-size: 12px; font-weight: bold;">MIN CONFIDENCE</div>
+                <div style="color: #00FF00; font-size: 14px; margin-top: 4px;">${minConfidence}%</div>
+            </div>
+        </div>
+    `;
+
+    metricsDiv.innerHTML = metricsHTML;
+}
+
 console.log('[RODLA] Frontend loaded successfully. Ready for analysis.');
 console.log('[RODLA] Demo mode available if backend is unavailable.');
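The `/detect` endpoint the frontend calls above can also be exercised outside the browser. A minimal sketch using Python's `requests` library, assuming the backend is served at http://localhost:8000 and accepts a `file` upload plus a `score_threshold` form field, as in the fetch calls above:

```python
# Minimal sketch of calling the backend's /detect endpoint outside the browser.
# Assumes the API is reachable at http://localhost:8000, as in the frontend configuration.
import requests

API_BASE_URL = "http://localhost:8000"

with open("sample_document.png", "rb") as f:  # any document image
    response = requests.post(
        f"{API_BASE_URL}/detect",
        files={"file": ("sample_document.png", f, "image/png")},
        data={"score_threshold": 0.3},
    )

response.raise_for_status()
results = response.json()

# Each detection carries a class name, a confidence score, and a bounding box,
# which the frontend renders in the detections table and on the annotated image.
for det in results.get("detections", []):
    print(det.get("class_name"), det.get("confidence"), det.get("bbox"))
```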
setup.sh
ADDED

@@ -0,0 +1,59 @@
#!/bin/bash

# Exit immediately if a command exits with a non-zero status
set -e

# --- Configuration ---
ENV_NAME="RoDLA"
ENV_PATH="./$ENV_NAME"

# URLs for PyTorch/Detectron2 wheels
TORCH_VERSION="1.11.0+cu113"
TORCH_URL="https://download.pytorch.org/whl/cu113/torch_stable.html"

DETECTRON2_VERSION="cu113/torch1.11"
DETECTRON2_URL="https://dl.fbaipublicfiles.com/detectron2/wheels/$DETECTRON2_VERSION/index.html"

DCNV3_URL="https://github.com/OpenGVLab/InternImage/releases/download/whl_files/DCNv3-1.0+cu113torch1.11.0-cp37-cp37m-linux_x86_64.whl"

# Check if the environment exists and activate it
if [ ! -d "$ENV_PATH" ]; then
    echo "❌ Error: Virtual environment '$ENV_NAME' not found at '$ENV_PATH'."
    echo "Please ensure you have created the environment using 'python3.7 -m venv $ENV_NAME' first."
    exit 1
fi

echo "--- 🛠️ Activating Virtual Environment: $ENV_NAME ---"
# Deactivate if active, then activate the target environment
# We use the full path to pip/python for reliability instead of 'source', which only affects the current shell session.
export PATH="$ENV_PATH/bin:$PATH"

# Check if the activation worked by checking the 'which python' command
if ! command -v python | grep -q "$ENV_PATH"; then
    echo "❌ Failed to set environment path. Aborting."
    exit 1
fi

echo "--- 🗑️ Uninstalling Old PyTorch Packages (if present) ---"
# Use the environment's pip (now in $PATH)
pip uninstall torch torchvision torchaudio -y || true

echo "--- 📦 Installing PyTorch 1.11.0+cu113 and Core Dependencies ---"
# Note: We are using the correct PyTorch 1.11.0 versions that match the DCNv3 wheel.
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0+cu113 -f "$TORCH_URL"

echo "--- 📦 Installing OpenMMLab and Other Benchmarking Dependencies ---"
pip install -U openmim
# Ensure the full path to python is used for detectron2 (though it should be the venv python now)
python -m pip install detectron2 -f "$DETECTRON2_URL"
mim install mmcv-full==1.5.0
pip install timm==0.6.11 mmdet==2.28.1
pip install Pillow==9.5.0
pip install opencv-python termcolor yacs pyyaml scipy

echo "--- 🚀 Installing Compatible DCNv3 Wheel ---"
pip install "$DCNV3_URL"

echo "--- ✅ Setup Complete ---"
echo "The $ENV_NAME environment is configured. To use it, run:"
echo "source $ENV_PATH/bin/activate"
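As a quick way to confirm the environment built by setup.sh, a small Python sanity check (a sketch; the expected versions mirror the pins above, and the DCNv3 import is the compiled op shipped in the InternImage wheel):

```python
# Quick sanity check for the environment that setup.sh builds (illustrative sketch).
import torch, mmcv, mmdet

print("torch:", torch.__version__)        # expected 1.11.0+cu113
print("mmcv-full:", mmcv.__version__)     # expected 1.5.0
print("mmdet:", mmdet.__version__)        # expected 2.28.1
print("CUDA available:", torch.cuda.is_available())

try:
    import DCNv3  # custom op wheel installed from the InternImage release
    print("DCNv3 import OK")
except ImportError as e:
    print("DCNv3 not available:", e)
```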
start.sh
DELETED

@@ -1,143 +0,0 @@
#!/bin/bash
# RoDLA Complete Startup Script
# Starts both frontend and backend services

set -e

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Header
echo -e "${BLUE}══════════════════════════════════════════════════════════════${NC}"
echo -e "${BLUE}║   RoDLA DOCUMENT LAYOUT ANALYSIS - 90s Edition             ║${NC}"
echo -e "${BLUE}║   Startup Script (Frontend + Backend)                      ║${NC}"
echo -e "${BLUE}══════════════════════════════════════════════════════════════${NC}"
echo ""

# Get script directory
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
cd "$SCRIPT_DIR"

# Check if required directories exist
if [ ! -d "deployment/backend" ]; then
    echo -e "${RED}ERROR: deployment/backend directory not found${NC}"
    exit 1
fi

if [ ! -d "frontend" ]; then
    echo -e "${RED}ERROR: frontend directory not found${NC}"
    exit 1
fi

# Check if Python is available
if ! command -v python3 &> /dev/null; then
    echo -e "${RED}ERROR: Python 3 is not installed${NC}"
    exit 1
fi

echo -e "${GREEN}✓ System check passed${NC}"
echo ""

# Function to handle Ctrl+C
cleanup() {
    echo ""
    echo -e "${YELLOW}Shutting down RoDLA...${NC}"
    kill $BACKEND_PID 2>/dev/null || true
    kill $FRONTEND_PID 2>/dev/null || true
    echo -e "${GREEN}✓ Services stopped${NC}"
    exit 0
}

# Set trap for Ctrl+C
trap cleanup SIGINT

# Check ports
check_port() {
    if lsof -Pi :$1 -sTCP:LISTEN -t >/dev/null 2>&1 ; then
        return 0
    else
        return 1
    fi
}

# Start Backend
echo -e "${BLUE}[1/2] Starting Backend API (port 8000)...${NC}"

if check_port 8000; then
    echo -e "${YELLOW}⚠ Port 8000 is already in use${NC}"
    read -p "Continue anyway? (y/n) " -n 1 -r
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
fi

cd "$SCRIPT_DIR/deployment/backend"
python3 backend.py > /tmp/rodla_backend.log 2>&1 &
BACKEND_PID=$!
echo -e "${GREEN}✓ Backend started (PID: $BACKEND_PID)${NC}"
sleep 2

# Check if backend started successfully
if ! kill -0 $BACKEND_PID 2>/dev/null; then
    echo -e "${RED}✗ Backend failed to start${NC}"
    echo -e "${RED}Check logs: cat /tmp/rodla_backend.log${NC}"
    exit 1
fi

# Start Frontend
echo -e "${BLUE}[2/2] Starting Frontend Server (port 8080)...${NC}"

if check_port 8080; then
    echo -e "${YELLOW}⚠ Port 8080 is already in use${NC}"
    read -p "Continue anyway? (y/n) " -n 1 -r
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        kill $BACKEND_PID
        exit 1
    fi
fi

cd "$SCRIPT_DIR/frontend"
python3 server.py > /tmp/rodla_frontend.log 2>&1 &
FRONTEND_PID=$!
echo -e "${GREEN}✓ Frontend started (PID: $FRONTEND_PID)${NC}"
sleep 1

# Summary
echo ""
echo -e "${BLUE}════════════════════════════════════════════════════════════${NC}"
echo -e "${GREEN}✓ RoDLA System is Ready!${NC}"
echo -e "${BLUE}════════════════════════════════════════════════════════════${NC}"
echo ""
echo -e "${YELLOW}Access Points:${NC}"
echo -e "  Frontend: ${BLUE}http://localhost:8080${NC}"
echo -e "  Backend:  ${BLUE}http://localhost:8000${NC}"
echo -e "  API Docs: ${BLUE}http://localhost:8000/docs${NC}"
echo ""
echo -e "${YELLOW}Services:${NC}"
echo -e "  Backend PID:  $BACKEND_PID"
echo -e "  Frontend PID: $FRONTEND_PID"
echo ""
echo -e "${YELLOW}Logs:${NC}"
echo -e "  Backend:  ${BLUE}tail -f /tmp/rodla_backend.log${NC}"
echo -e "  Frontend: ${BLUE}tail -f /tmp/rodla_frontend.log${NC}"
echo ""
echo -e "${YELLOW}Usage:${NC}"
echo -e "  1. Open ${BLUE}http://localhost:8080${NC} in your browser"
echo -e "  2. Upload a document image"
echo -e "  3. Select analysis mode (Standard or Perturbation)"
echo -e "  4. Click [ANALYZE DOCUMENT]"
echo -e "  5. Download results"
echo ""
echo -e "${YELLOW}Exit:${NC}"
echo -e "  Press ${BLUE}Ctrl+C${NC} to stop all services"
echo ""

# Keep running
wait