CVE Risk Scoring Model (Mistral-7B QLoRA)

📋 Model Description

This model is a QLoRA fine-tune of mistralai/Mistral-7B-Instruct-v0.2 for CVE (Common Vulnerabilities and Exposures) risk assessment and CVSS score prediction.

Key Features

🎯 Predicts CVSS scores (0-10 scale)
🚨 Classifies severity levels (Critical, High, Medium, Low)
🔍 Analyzes attack vectors (Network, Adjacent, Local, Physical)
⚡ Assesses exploitability (Access complexity, authentication requirements)
🛡️ Evaluates impact (Confidentiality, Integrity, Availability)
🏷️ Identifies CWE categories (Common Weakness Enumeration)
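
The severity labels relate to the numeric score through a qualitative rating scale. The sketch below assumes the CVSS v3.x bands; the function name `severity_from_score` is illustrative, and the model's own severity output may not follow these bands exactly (CVSS v2, for instance, has no Critical band).

```python
def severity_from_score(score: float) -> str:
    """Map a CVSS base score to a CVSS v3.x qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score must be in [0, 10], got {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


print(severity_from_score(9.8))
```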

🚀 Quick Start

Installation

pip install transformers torch peft bitsandbytes accelerate

Basic Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

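# Load the tokenizer and model (peft must be installed if the repository
# stores only the LoRA adapter rather than merged weights)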
model_id = "Swapnanil09/cve-risk-scoring-mistral-qlora"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Prepare prompt (no literal <s>: the tokenizer adds the BOS token itself)
prompt = """[INST] You are a cybersecurity risk assessment model.

Analyze the following CVE and provide:
- CVSS score
- Severity
- Attack Vector
- Access Complexity
- Impact (Confidentiality, Integrity, Availability)

CVE Description:
A remote code execution vulnerability exists in Apache Log4j2 when configured to use a JNDI Lookup. An attacker can exploit this by sending a crafted request containing a malicious JNDI lookup string.
[/INST]"""

# Generate prediction
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    top_p=0.9,
    repetition_penalty=1.1
)

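# Decode only the newly generated tokens so the prompt is not echoed back
prompt_length = inputs["input_ids"].shape[1]
prediction = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(prediction)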
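
For downstream tooling, the response text can be converted into a dictionary. The sketch below assumes the model emits one "Field: value" pair per line, as in the sample response in this card; `parse_prediction` is a hypothetical helper and is not part of the model repository.

```python
import re


def parse_prediction(text: str) -> dict:
    """Parse 'Field: value' lines from a model response into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons or dashes
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            fields[key.strip()] = value.strip()
    # Convert the score to a float when it looks numeric
    score = fields.get("CVSS Score")
    if score is not None and re.fullmatch(r"\d+(\.\d+)?", score):
        fields["CVSS Score"] = float(score)
    return fields


sample = """CVSS Score: 9.8
Severity: Critical
Attack Vector: NETWORK
CWE: CWE-502 - Deserialization of Untrusted Data"""

parsed = parse_prediction(sample)
print(parsed["CVSS Score"], parsed["Severity"])
```

Lines without a value after the colon are skipped, so stray prompt text in the response does not produce empty fields.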

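Example Output

Generation uses sampling (do_sample=True), so exact values can vary between runs. A representative response: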

CVSS Score: 9.8
Severity: Critical
Attack Vector: NETWORK
Access Complexity: LOW
Access Authentication: NONE
Impact - Confidentiality: COMPLETE
Impact - Integrity: COMPLETE
Impact - Availability: COMPLETE
CWE: CWE-502 - Deserialization of Untrusted Data

📊 Performance Metrics

Metric	Value
MAE (CVSS Score)	0.85
RMSE (CVSS Score)	1.23
Accuracy within ±1.0	78.9%
Accuracy within ±2.0	92.6%
Severity Classification Accuracy	85.0%
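
For reference, the score metrics in this table can be computed as in the sketch below. This is an assumption about how such numbers are typically derived, not the author's evaluation script; `score_metrics` is an illustrative name and the toy values are not taken from the evaluation set.

```python
import math


def score_metrics(y_true, y_pred, tolerances=(1.0, 2.0)):
    """Compute MAE, RMSE, and within-tolerance accuracy for CVSS scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(errors)
    metrics = {
        "mae": sum(errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }
    # Fraction of predictions whose absolute error is within each tolerance
    for tol in tolerances:
        metrics[f"within_{tol}"] = sum(e <= tol for e in errors) / n
    return metrics


# Toy values for illustration only
truth = [9.8, 7.5, 5.0, 4.3]
preds = [9.0, 7.5, 6.5, 1.3]
metrics = score_metrics(truth, preds)
print(metrics)
```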

Detailed Classification Report

Severity	Precision	Recall	F1-Score
Critical	0.89	0.85	0.87
High	0.83	0.88	0.85
Medium	0.82	0.79	0.80
Low	0.88	0.92	0.90
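
Each F1 value is the harmonic mean of the precision and recall in its row; for the Critical row, 2 × 0.89 × 0.85 / (0.89 + 0.85) ≈ 0.87. The sketch below shows how a per-class report of this shape can be computed from label pairs. `per_class_report` is an illustrative name and the labels are toy data.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[t] += 1  # true label t was missed
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[label] = (precision, recall, f1)
    return report


# Toy labels for illustration only
truth = ["High", "High", "Low", "Critical"]
preds = ["High", "Critical", "Low", "Critical"]
report = per_class_report(truth, preds)
for label, (p, r, f1) in report.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```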

🔧 Training Details

Base Model

Model: mistralai/Mistral-7B-Instruct-v0.2
Architecture: Decoder-only transformer (7B parameters)

Fine-Tuning Method

Technique: QLoRA (Quantized Low-Rank Adaptation)
Quantization: 4-bit NF4 with double quantization
LoRA Configuration:
- Rank (r): 16
- Alpha: 32
- Target modules: q_proj, k_proj, v_proj, o_proj
- Dropout: 0.05
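
The settings above map onto the `BitsAndBytesConfig` (transformers) and `LoraConfig` (peft) classes roughly as sketched below. The values marked as assumptions are not stated in this card, and the author's actual training script may differ.

```python
# Keyword arguments mirroring the quantization and LoRA settings listed above
BNB_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",  # assumption: matches training precision
}

LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "lora_dropout": 0.05,
    "bias": "none",            # assumption: not stated in this card
    "task_type": "CAUSAL_LM",  # assumption: usual choice for decoder-only models
}


def build_qlora_configs():
    """Instantiate the config objects (requires transformers, peft, bitsandbytes)."""
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**BNB_KWARGS), LoraConfig(**LORA_KWARGS)
```

The imports are deferred into the function so the settings can be inspected without the GPU stack installed.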

Training Hyperparameters

Optimizer: AdamW (8-bit paged)
Learning Rate: 3e-4
Batch Size: 8 (per device)
Gradient Accumulation: 1
Epochs: 1
Max Sequence Length: 256 tokens
Precision: bfloat16
Warmup Steps: 10
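
Expressed as `transformers.TrainingArguments` keyword arguments, these hyperparameters would look roughly like the sketch below. The output directory is a hypothetical path, and the mapping is an assumption rather than the author's confirmed configuration.

```python
# Keyword arguments mirroring the hyperparameters listed above
TRAINING_KWARGS = {
    "output_dir": "./cve-qlora-out",  # hypothetical path
    "optim": "paged_adamw_8bit",
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "num_train_epochs": 1,
    "bf16": True,
    "warmup_steps": 10,
}

# The sequence limit is applied when tokenizing, not through TrainingArguments
MAX_SEQ_LENGTH = 256


def build_training_args():
    """Instantiate TrainingArguments (requires transformers and torch)."""
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)
```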

Dataset

Source: Custom CVE dataset with CVSS v2/v3 annotations
Size: ~80,000 CVE entries
Train/Eval Split: 90/10
Features:
- CVE descriptions
- CVSS base scores
- Attack vectors
- Access complexity
- Impact metrics (CIA triad)
- CWE classifications
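
Combining the dataset size with the training hyperparameters gives a rough estimate of the number of optimizer steps. The sketch below assumes a single device (the device count is not stated) and treats the approximate figure of 80,000 entries as exact; `optimizer_steps` is an illustrative name.

```python
import math


def optimizer_steps(n_examples, train_fraction, per_device_batch,
                    grad_accum=1, n_devices=1, epochs=1):
    """Estimate training-set size, effective batch size, and total steps."""
    n_train = int(round(n_examples * train_fraction))
    effective_batch = per_device_batch * grad_accum * n_devices
    steps_per_epoch = math.ceil(n_train / effective_batch)
    return n_train, effective_batch, steps_per_epoch * epochs


n_train, eff_batch, steps = optimizer_steps(80_000, 0.9, per_device_batch=8)
print(n_train, eff_batch, steps)  # 72000 8 9000
```

Under these assumptions, the 10 warmup steps cover roughly 0.1% of training.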

📚 Use Cases

1. Security Operations Centers (SOC)

Automatically triage and prioritize vulnerability alerts based on predicted severity and CVSS scores.
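
A minimal triage sketch, assuming predictions have already been parsed into records; the `Finding` class, the identifiers, and the scores are placeholders for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cve_id: str
    predicted_score: float
    severity: str


def triage(findings, top_n=None):
    """Order findings from highest to lowest predicted CVSS score."""
    ranked = sorted(findings, key=lambda f: f.predicted_score, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Placeholder identifiers and scores
queue = [
    Finding("CVE-0000-0001", 5.3, "Medium"),
    Finding("CVE-0000-0002", 9.8, "Critical"),
    Finding("CVE-0000-0003", 7.5, "High"),
]
for finding in triage(queue):
    print(finding.cve_id, finding.predicted_score, finding.severity)
```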

2. Vulnerability Management

Assess newly discovered vulnerabilities before official CVSS scores are published.

3. Threat Intelligence

Analyze threat reports and security advisories to extract risk metrics.

4. DevSecOps Automation

Integrate into CI/CD pipelines for automated security assessment of dependencies.
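
One way such a pipeline step could work is a gate that fails the build when any dependency's predicted score reaches a threshold. The sketch below is an assumption about how a team might wire this up: the threshold of 9.0 is arbitrary, and the dependency names and scores are hypothetical.

```python
def gate(predictions, fail_at=9.0):
    """Return (exit_code, blocking) for a mapping of dependency -> predicted score."""
    blocking = sorted(
        name for name, score in predictions.items() if score >= fail_at
    )
    return (1 if blocking else 0), blocking


# Hypothetical dependency names and predicted scores
exit_code, blocking = gate({"libfoo==1.2.0": 9.8, "libbar==0.4.1": 4.3})
print(exit_code, blocking)  # 1 ['libfoo==1.2.0']
```

Given the reported MAE of 0.85, a threshold set somewhat below the severity band of interest may be worth considering.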

5. Security Research

Analyze patterns in vulnerability characteristics and predict potential impact.

⚠️ Limitations and Biases

Known Limitations

Training data bias: The model is trained primarily on historical CVE data (pre-2025), which may not fully represent emerging vulnerability classes.
Context window: Inputs are limited to 256-512 tokens (training used a 256-token maximum sequence length), so very detailed CVE descriptions may be truncated.
CVSS version: The model is trained primarily on CVSS v2 data; performance on CVSS v3.1/v4.0 may vary.
Language: The model is optimized for English-language CVE descriptions only.

Recommended Best Practices

⚠️ Do not use as sole source of truth - Always validate predictions with official CVSS scores when available
🔍 Human review required - Critical decisions should involve security expert review
📊 Confidence thresholds - Implement confidence scoring for production use
🔄 Regular updates - Retrain periodically on new CVE data to maintain accuracy
🎯 Domain-specific tuning - Consider fine-tuning on organization-specific vulnerability data
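
The model does not emit a confidence value itself. One possible proxy, sketched below as an assumption rather than a documented feature, is to sample several generations for the same CVE and flag the case for human review when the predicted scores disagree. The `max_spread` threshold of 1.0 is arbitrary and `aggregate_samples` is an illustrative name.

```python
import statistics


def aggregate_samples(scores, max_spread=1.0):
    """Summarize repeated score predictions for a single CVE."""
    if not scores:
        raise ValueError("need at least one sampled score")
    spread = max(scores) - min(scores)
    return {
        "score": statistics.median(scores),
        "spread": spread,
        "needs_review": spread > max_spread,
    }


consistent = aggregate_samples([9.8, 9.8, 9.1])
divergent = aggregate_samples([9.8, 6.5, 7.2])
print(consistent["score"], consistent["needs_review"])
print(divergent["score"], divergent["needs_review"])
```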

🔒 Ethical Considerations

This model is designed for defensive cybersecurity purposes only. Users are responsible for ensuring compliance with:

Applicable laws and regulations
Responsible disclosure practices
Ethical security research guidelines

Prohibited Uses:

Facilitating malicious attacks or exploitation
Circumventing security measures without authorization
Weaponizing vulnerability information

📖 Citation

@misc{cve-risk-scoring-mistral-qlora,
  author = {Swapnanil Chatterjee},
  title = {CVE Risk Scoring Model: Fine-tuned Mistral-7B for Vulnerability Assessment},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/Swapnanil09/cve-risk-scoring-mistral-qlora}}
}

📜 License

This model is released under the Apache 2.0 License, inheriting from the base Mistral-7B model.

🙏 Acknowledgments

Mistral AI for the base Mistral-7B-Instruct model
NIST NVD for CVE database
MITRE for CWE classification system
Hugging Face for model hosting and tools

📞 Contact & Support

Email: swapnanilchatterjee09@gmail.com
Twitter/X: @your_handle

🔄 Model Updates

Version	Date	Changes
v1.0	2025-01-XX	Initial release

⚡ Built with QLoRA • 🤗 Hosted on HuggingFace • 🔒 For Defensive Security Only
