# ResNet-18 (CIFAR-10 Baseline)
This model is a baseline ResNet-18 trained on the CIFAR-10 dataset. It serves as the "Control" model for the VisionDev-Copilot project, designed to benchmark model robustness against common image distortions (Blur, Noise, Compression).
## Model Details
| Feature | Specification |
|---|---|
| Architecture | ResNet-18 (Pretrained on ImageNet, Finetuned on CIFAR-10) |
| Input Size | 32×32 pixels (RGB) |
| Number of Classes | 10 |
| Framework | PyTorch |
| Parameters | ~11 million |
| Training Dataset | CIFAR-10 |
| Baseline Accuracy | ~80% |
### Classes

The model classifies images into the following 10 categories:

airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck
## Quick Start

### Installation

```bash
pip install torch torchvision huggingface_hub pillow requests
```
### Load and Use the Model

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from huggingface_hub import hf_hub_download
from PIL import Image

# Configuration
REPO_ID = "Phoenix21/resnet18-cifar10-baseline"
FILENAME = "resnet18_cifar10_baseline.pth"

# 1. Initialize model architecture
model = models.resnet18(weights=None)  # the `pretrained=` argument is deprecated
model.fc = nn.Linear(model.fc.in_features, 10)  # CIFAR-10 has 10 classes

# 2. Download and load weights
model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))

# 3. Set to evaluation mode
model.eval()

# 4. Prepare image transformation
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.4914, 0.4822, 0.4465],
                         std=[0.2023, 0.1994, 0.2010])  # CIFAR-10 stats
])

# 5. Load and preprocess image
image = Image.open("your_image.jpg").convert("RGB")
input_tensor = transform(image).unsqueeze(0)

# 6. Make prediction
with torch.no_grad():
    output = model(input_tensor)
    probabilities = torch.nn.functional.softmax(output[0], dim=0)

# 7. Get results
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']
confidence, prediction = torch.max(probabilities, 0)
print(f"Prediction: {classes[prediction]} ({confidence.item()*100:.2f}%)")
```
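Beyond the single top prediction, the same `probabilities` vector also yields a top-k readout. The sketch below uses a random stand-in tensor in place of a real model output so it runs on its own:

```python
import torch

# Top-3 readout sketch; `probabilities` here is a random stand-in for
# the softmax output produced by the snippet above.
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']
probabilities = torch.softmax(torch.randn(10), dim=0)  # stand-in output

top3 = torch.topk(probabilities, k=3)
for p, idx in zip(top3.values, top3.indices):
    print(f"{classes[idx]}: {p.item()*100:.2f}%")
```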
## Performance

### Baseline Accuracy

- Clean Test Accuracy: ~80.0%
- Training Details: Fine-tuned from an ImageNet-pretrained ResNet-18
- Epochs: 50
- Optimizer: SGD with momentum
- Learning Rate: 0.01 with cosine annealing
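The optimizer and schedule above can be sketched as follows. This is a reconstruction under assumptions: `momentum=0.9` is not stated on the card, and the actual training script is not published with this model.

```python
import torch
import torch.nn as nn

# Sketch of the recipe above: SGD (assumed momentum=0.9), lr 0.01,
# cosine annealing over the 50 training epochs.
model = nn.Linear(512, 10)  # stand-in for the fine-tuned classifier head
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

lrs = []
for epoch in range(50):
    lrs.append(optimizer.param_groups[0]["lr"])
    # ... one pass over the CIFAR-10 training set would run here ...
    scheduler.step()

print(f"lr starts at {lrs[0]:.4f} and decays to {lrs[-1]:.6f}")
```

Cosine annealing decays the learning rate smoothly from 0.01 toward zero across the 50 epochs, which tends to stabilize the final epochs of fine-tuning.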
## Intended Use
This model is specifically designed for research and benchmarking purposes. It serves as:
- Baseline Model: For comparing against more robust architectures
- Control Model: In the VisionDev-Copilot project's robustness evaluation
- Test Subject: For studying failure modes under image distortions
## Research Application

### Robustness Benchmarking
This model is meant to be "stressed" with various image distortions to evaluate failure modes:
```python
# Example: Applying Gaussian blur to test robustness
from torchvision.transforms import GaussianBlur

distortion_transform = transforms.Compose([
    transforms.Resize((32, 32)),
    GaussianBlur(kernel_size=5, sigma=2.0),  # Add distortion
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.4914, 0.4822, 0.4465],
                         std=[0.2023, 0.1994, 0.2010])
])

# Test model with distorted images
distorted_tensor = distortion_transform(image).unsqueeze(0)
with torch.no_grad():
    distorted_output = model(distorted_tensor)
```
**Common Distortions to Test:**

- Gaussian Blur (σ = 0.5-2.0)
- Shot/Impulse Noise (var = 0.01-0.1)
- JPEG Compression (quality = 10-90)
- Brightness/Contrast adjustments
- Pixelation effects
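Two of the distortions listed above (impulse noise and JPEG compression) can be sketched at the PIL level, applied before the normal transform pipeline. The parameter values here are assumptions within the stated ranges, not the benchmark's exact settings:

```python
import io

import numpy as np
from PIL import Image

def impulse_noise(img: Image.Image, prob: float = 0.05) -> Image.Image:
    """Salt-and-pepper noise: flip a fraction `prob` of pixels to 0 or 255."""
    arr = np.asarray(img).copy()
    mask = np.random.rand(arr.shape[0], arr.shape[1])
    arr[mask < prob / 2] = 0        # "pepper" pixels
    arr[mask > 1 - prob / 2] = 255  # "salt" pixels
    return Image.fromarray(arr)

def jpeg_compress(img: Image.Image, quality: int = 10) -> Image.Image:
    """Round-trip the image through an in-memory JPEG at the given quality."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

# Demo on a synthetic 32x32 image; real use would load a CIFAR-10 sample.
img = Image.new("RGB", (32, 32), color=(120, 80, 200))
distorted = jpeg_compress(impulse_noise(img), quality=10)
print(distorted.size)
```

Because both functions return a plain `PIL.Image`, they compose with the `transform` pipeline from the Quick Start section unchanged.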
## Files

| File | Description |
|---|---|
| `resnet18_cifar10_baseline.pth` | Model weights (PyTorch state dict) |
| `README.md` | This documentation file |
| `config.json` | Model configuration (if applicable) |
## Contributing
This model is part of the VisionDev-Copilot project. If you're interested in:
- Testing additional distortion types
- Comparing with other architectures
- Improving baseline performance
Please feel free to fork the repository and submit pull requests.
## Citation
If you use this model in your research, please acknowledge it as:
```bibtex
@misc{resnet18-cifar10-baseline,
  author    = {VisionDev-Copilot Project},
  title     = {ResNet-18 CIFAR-10 Baseline Model},
  year      = {2024},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Phoenix21/resnet18-cifar10-baseline}
}
```
## Limitations

- Low Resolution: Trained on 32×32 images, not suitable for high-resolution inputs
- Limited Classes: Only recognizes the 10 CIFAR-10 categories
- Sensitivity: Vulnerable to adversarial attacks and common image distortions
- Domain Specific: Optimized for CIFAR-10, may not generalize well to other datasets
## Contact
For questions or issues regarding this model:
- Open an issue on the Hugging Face model page