
ResNet-18 (CIFAR-10 Baseline)


This model is a baseline ResNet-18 trained on the CIFAR-10 dataset. It serves as the "Control" model for the VisionDev-Copilot project, designed to benchmark model robustness against common image distortions (Blur, Noise, Compression).

πŸ“‹ Model Details

Feature             Specification
Architecture        ResNet-18 (pretrained on ImageNet, fine-tuned on CIFAR-10)
Input Size          32Γ—32 pixels (RGB)
Number of Classes   10
Framework           PyTorch
Parameters          ~11 million
Training Dataset    CIFAR-10
Baseline Accuracy   ~80%

Classes

The model classifies images into the following 10 categories:

  • airplane ✈️
  • automobile πŸš—
  • bird 🐦
  • cat 🐱
  • deer 🦌
  • dog πŸ•
  • frog 🐸
  • horse 🐎
  • ship 🚒
  • truck 🚚

πŸš€ Quick Start

Installation

pip install torch torchvision huggingface_hub pillow requests

Load and Use the Model

import torch
import torch.nn as nn
from torchvision import models, transforms
from huggingface_hub import hf_hub_download
from PIL import Image

# Configuration
REPO_ID = "Phoenix21/resnet18-cifar10-baseline"
FILENAME = "resnet18_cifar10_baseline.pth"

# 1. Initialize model architecture
model = models.resnet18(weights=None)  # architecture only; fine-tuned weights are loaded below
model.fc = nn.Linear(model.fc.in_features, 10)  # CIFAR-10 has 10 classes

# 2. Download and load weights
model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))

# 3. Set to evaluation mode
model.eval()

# 4. Prepare image transformation
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.4914, 0.4822, 0.4465], 
                         std=[0.2023, 0.1994, 0.2010])  # CIFAR-10 stats
])

# 5. Load and preprocess image
image = Image.open("your_image.jpg").convert("RGB")
input_tensor = transform(image).unsqueeze(0)

# 6. Make prediction
with torch.no_grad():
    output = model(input_tensor)
    probabilities = torch.nn.functional.softmax(output[0], dim=0)

# 7. Get results
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 
           'dog', 'frog', 'horse', 'ship', 'truck']
confidence, prediction = torch.max(probabilities, 0)
print(f"Prediction: {classes[prediction]} ({confidence.item()*100:.2f}%)")
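Beyond the single top prediction, inspecting the full ranking is often useful when benchmarking. A small helper built on `torch.topk` (`top_k_predictions` is illustrative, not part of this repo):

```python
import torch

def top_k_predictions(probabilities, classes, k=3):
    """Return the k most likely (class, confidence) pairs, best first."""
    confidences, indices = torch.topk(probabilities, k)
    return [(classes[i], c.item()) for c, i in zip(confidences, indices)]

# e.g. top_k_predictions(probabilities, classes, k=3) after the Quick Start code above
```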

πŸ“Š Performance

Baseline Accuracy

  • Clean Test Accuracy: ~80.0%
  • Training Details: Fine-tuned from ImageNet-pretrained ResNet-18
  • Epochs: 50
  • Optimizer: SGD with momentum
  • Learning Rate: 0.01 with cosine annealing

Intended Use

This model is specifically designed for research and benchmarking purposes. It serves as:

  1. Baseline Model: For comparing against more robust architectures
  2. Control Model: In the VisionDev-Copilot project's robustness evaluation
  3. Test Subject: For studying failure modes under image distortions
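For the benchmarking roles above, clean (or distorted) test accuracy reduces to one evaluation loop. `evaluate_accuracy` is an illustrative helper, not part of this repo:

```python
import torch

def evaluate_accuracy(model, loader, device="cpu"):
    """Fraction of correctly classified samples over any iterable of (images, labels) batches."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.size(0)
    return correct / total
```

Pass a `DataLoader` over `torchvision.datasets.CIFAR10(train=False, ...)` with the transform from the Quick Start to reproduce the clean baseline figure, or swap in a distortion transform for robustness numbers.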

πŸ”¬ Research Application

Robustness Benchmarking

This model is meant to be "stressed" with various image distortions to evaluate failure modes:

# Example: Applying Gaussian blur to test robustness
from torchvision.transforms import GaussianBlur

distortion_transform = transforms.Compose([
    transforms.Resize((32, 32)),
    GaussianBlur(kernel_size=5, sigma=2.0),  # Add distortion
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.4914, 0.4822, 0.4465], 
                         std=[0.2023, 0.1994, 0.2010])
])

# Test model with distorted images
distorted_tensor = distortion_transform(image).unsqueeze(0)
with torch.no_grad():
    distorted_output = model(distorted_tensor)

Common Distortions to Test:

  • Gaussian Blur (Οƒ = 0.5-2.0)
  • Shot/Impulse Noise (var = 0.01-0.1)
  • JPEG Compression (quality = 10-90)
  • Brightness/Contrast adjustments
  • Pixelation effects
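Of these, JPEG compression is easy to simulate without extra dependencies by re-encoding the image through an in-memory buffer. A sketch (`jpeg_compress` is not part of this repo):

```python
import io
from PIL import Image

def jpeg_compress(image, quality=30):
    """Round-trip a PIL image through JPEG at the given quality (1-95) to add compression artifacts."""
    buffer = io.BytesIO()
    image.convert("RGB").save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    return Image.open(buffer).convert("RGB")

# distorted = jpeg_compress(image, quality=10); then apply `transform` and predict as above
```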

πŸ“ Files

File                            Description
resnet18_cifar10_baseline.pth   Model weights (PyTorch state dict)
README.md                       This documentation file
config.json                     Model configuration (if applicable)

🀝 Contributing

This model is part of the VisionDev-Copilot project. If you're interested in:

  • Testing additional distortion types
  • Comparing with other architectures
  • Improving baseline performance

Please feel free to fork the repository and submit pull requests.

πŸ“ Citation

If you use this model in your research, please acknowledge it as:

@misc{resnet18-cifar10-baseline,
  author = {VisionDev-Copilot Project},
  title = {ResNet-18 CIFAR-10 Baseline Model},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Phoenix21/resnet18-cifar10-baseline}
}

⚠️ Limitations

  1. Low Resolution: Trained on 32Γ—32 images, not suitable for high-resolution inputs
  2. Limited Classes: Only recognizes the 10 CIFAR-10 categories
  3. Sensitivity: Vulnerable to adversarial attacks and common image distortions
  4. Domain Specific: Optimized for CIFAR-10, may not generalize well to other datasets

πŸ“§ Contact

For questions or issues regarding this model:

