# ResNet-18 (CIFAR-10 Baseline)
This model is a baseline ResNet-18 trained on the CIFAR-10 dataset. It serves as the "Control" model for the VisionDev-Copilot project, designed to benchmark model robustness against common image distortions (Blur, Noise, Compression).
## Model Details
| Feature | Specification |
|---|---|
| Architecture | ResNet-18 (Pretrained on ImageNet, Finetuned on CIFAR-10) |
| Input Size | 32×32 pixels (RGB) |
| Number of Classes | 10 |
| Framework | PyTorch |
| Parameters | ~11 million |
| Training Dataset | CIFAR-10 |
| Baseline Accuracy | ~80% |
### Classes

The model classifies images into the following 10 categories:

airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck
## Quick Start

### Installation

```bash
pip install torch torchvision huggingface_hub pillow requests
```
### Load and Use the Model

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from huggingface_hub import hf_hub_download
from PIL import Image

# Configuration
REPO_ID = "Phoenix21/resnet18-cifar10-baseline"
FILENAME = "resnet18_cifar10_baseline.pth"

# 1. Initialize model architecture
model = models.resnet18(weights=None)  # the `pretrained=` argument is deprecated
model.fc = nn.Linear(model.fc.in_features, 10)  # CIFAR-10 has 10 classes

# 2. Download and load weights
model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))

# 3. Set to evaluation mode
model.eval()

# 4. Prepare image transformation
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.4914, 0.4822, 0.4465],
                         std=[0.2023, 0.1994, 0.2010])  # CIFAR-10 stats
])

# 5. Load and preprocess image
image = Image.open("your_image.jpg").convert("RGB")
input_tensor = transform(image).unsqueeze(0)

# 6. Make prediction
with torch.no_grad():
    output = model(input_tensor)
    probabilities = torch.nn.functional.softmax(output[0], dim=0)

# 7. Get results
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']
confidence, prediction = torch.max(probabilities, 0)
print(f"Prediction: {classes[prediction]} ({confidence.item()*100:.2f}%)")
```
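Beyond the single top prediction, the same `probabilities` vector also yields a top-k readout. The sketch below uses a random stand-in tensor in place of a real model output so it runs on its own:

```python
import torch

# Top-3 readout sketch; `probabilities` here is a random stand-in for
# the softmax output produced by the snippet above.
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']
probabilities = torch.softmax(torch.randn(10), dim=0)  # stand-in output

top3 = torch.topk(probabilities, k=3)
for p, idx in zip(top3.values, top3.indices):
    print(f"{classes[idx]}: {p.item()*100:.2f}%")
```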
## Performance

### Baseline Accuracy

- Clean Test Accuracy: ~80.0%
- Training Details: Fine-tuned from an ImageNet-pretrained ResNet-18
- Epochs: 50
- Optimizer: SGD with momentum
- Learning Rate: 0.01 with cosine annealing
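The optimizer and schedule above can be sketched as follows. This is a reconstruction under assumptions: `momentum=0.9` is not stated on the card, and the actual training script is not published with this model.

```python
import torch
import torch.nn as nn

# Sketch of the recipe above: SGD (assumed momentum=0.9), lr 0.01,
# cosine annealing over the 50 training epochs.
model = nn.Linear(512, 10)  # stand-in for the fine-tuned classifier head
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

lrs = []
for epoch in range(50):
    lrs.append(optimizer.param_groups[0]["lr"])
    # ... one pass over the CIFAR-10 training set would run here ...
    scheduler.step()

print(f"lr starts at {lrs[0]:.4f} and decays to {lrs[-1]:.6f}")
```

Cosine annealing decays the learning rate smoothly from 0.01 toward zero across the 50 epochs, which tends to stabilize the final epochs of fine-tuning.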
## Intended Use
This model is specifically designed for research and benchmarking purposes. It serves as:
- Baseline Model: For comparing against more robust architectures
- Control Model: In the VisionDev-Copilot project's robustness evaluation
- Test Subject: For studying failure modes under image distortions
## Research Application

### Robustness Benchmarking
This model is meant to be "stressed" with various image distortions to evaluate failure modes:
```python
# Example: Applying Gaussian blur to test robustness
from torchvision.transforms import GaussianBlur

distortion_transform = transforms.Compose([
    transforms.Resize((32, 32)),
    GaussianBlur(kernel_size=5, sigma=2.0),  # Add distortion
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.4914, 0.4822, 0.4465],
                         std=[0.2023, 0.1994, 0.2010])
])

# Test model with distorted images
distorted_tensor = distortion_transform(image).unsqueeze(0)
with torch.no_grad():
    distorted_output = model(distorted_tensor)
```
**Common Distortions to Test:**

- Gaussian Blur (σ = 0.5-2.0)
- Shot/Impulse Noise (var = 0.01-0.1)
- JPEG Compression (quality = 10-90)
- Brightness/Contrast adjustments
- Pixelation effects
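Two of the distortions listed above (impulse noise and JPEG compression) can be sketched at the PIL level, applied before the normal transform pipeline. The parameter values here are assumptions within the stated ranges, not the benchmark's exact settings:

```python
import io

import numpy as np
from PIL import Image

def impulse_noise(img: Image.Image, prob: float = 0.05) -> Image.Image:
    """Salt-and-pepper noise: flip a fraction `prob` of pixels to 0 or 255."""
    arr = np.asarray(img).copy()
    mask = np.random.rand(arr.shape[0], arr.shape[1])
    arr[mask < prob / 2] = 0        # "pepper" pixels
    arr[mask > 1 - prob / 2] = 255  # "salt" pixels
    return Image.fromarray(arr)

def jpeg_compress(img: Image.Image, quality: int = 10) -> Image.Image:
    """Round-trip the image through an in-memory JPEG at the given quality."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

# Demo on a synthetic 32x32 image; real use would load a CIFAR-10 sample.
img = Image.new("RGB", (32, 32), color=(120, 80, 200))
distorted = jpeg_compress(impulse_noise(img), quality=10)
print(distorted.size)
```

Because both functions return a plain `PIL.Image`, they compose with the `transform` pipeline from the Quick Start section unchanged.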
## Files

| File | Description |
|---|---|
| `resnet18_cifar10_baseline.pth` | Model weights (PyTorch state dict) |
| `README.md` | This documentation file |
| `config.json` | Model configuration (if applicable) |
## Contributing
This model is part of the VisionDev-Copilot project. If you're interested in:
- Testing additional distortion types
- Comparing with other architectures
- Improving baseline performance
Please feel free to fork the repository and submit pull requests.
## Citation
If you use this model in your research, please acknowledge it as:
```bibtex
@misc{resnet18-cifar10-baseline,
  author    = {VisionDev-Copilot Project},
  title     = {ResNet-18 CIFAR-10 Baseline Model},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Phoenix21/resnet18-cifar10-baseline}
}
```
## Limitations

- Low Resolution: Trained on 32×32 images, not suitable for high-resolution inputs
- Limited Classes: Only recognizes the 10 CIFAR-10 categories
- Sensitivity: Vulnerable to adversarial attacks and common image distortions
- Domain Specific: Optimized for CIFAR-10, may not generalize well to other datasets
## Contact
For questions or issues regarding this model:
- Open an issue on the Hugging Face model page