MedGemma-4B Cardiology (GGUF)

uaritm: Domain-Finetuned Clinical LLM for Cardiology Based on Google MedGemma-4B-IT


Model Description

MedGemma-4B-Cardiology is a domain-adapted medical language model built on
google/medgemma-4b-it and fine-tuned via supervised instruction tuning (SFT) on anonymized cardiology clinical data:

  • discharge summaries
  • history of illness
  • surgical procedures
  • treatment pathways
  • physician recommendations

The model is optimized for:

  • generating clinically grounded cardiology recommendations
  • structuring and summarizing medical documentation
  • rewriting clinical narratives
  • assisting in cardiology decision support

GGUF format ensures compatibility with:

  • llama.cpp
  • Ollama
  • LM Studio
  • GPT4All
  • koboldcpp

Architecture

  • Base model: Google MedGemma-4B-IT
  • Parameters: ~4B
  • Fine-tuning: LoRA + SFT
  • Precision: Q4_K_M / Q5_K_M GGUF
  • Context window: up to 4096 tokens
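For choosing between the two quantizations, a rough back-of-envelope sketch of on-disk size can help. The bits-per-weight figures below are approximate averages for llama.cpp k-quants (assumed here, not stated in this card; metadata and embedding tensors shift the real numbers slightly):

```python
# Rough GGUF file-size estimate from parameter count and bits per weight.
# Bits-per-weight values are approximate llama.cpp k-quant averages,
# assumed for illustration only.

def gguf_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GiB for a quantized GGUF."""
    return n_params * bits_per_weight / 8 / 1024**3

N_PARAMS = 4e9  # ~4B parameters

for quant, bpw in [("Q4_K_M", 4.85), ("Q5_K_M", 5.69)]:
    print(f"{quant}: ~{gguf_size_gib(N_PARAMS, bpw):.1f} GiB")
```

Expect roughly 2–3 GiB per file; actual sizes depend on the exact tensor layout.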

Repository Contents

  • medgemma-4b-cardiology.Q4_K_M.gguf
  • medgemma-4b-cardiology.Q5_K_M.gguf
  • tokenizer.json
  • README.md


Usage Examples

llama.cpp

./main -m medgemma-4b-cardiology.Q4_K_M.gguf \
  -p "You are a cardiologist. Based on the clinical details, generate treatment recommendations."


Ollama

Create a file named Modelfile:

FROM ./medgemma-4b-cardiology.Q4_K_M.gguf
TEMPLATE "<start_of_turn>user\n{{ .Prompt }}<end_of_turn>\n<start_of_turn>model\n"
PARAMETER temperature 0.3

Then run:

ollama create medcardio -f Modelfile
ollama run medcardio


Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama(
    model_path="medgemma-4b-cardiology.Q4_K_M.gguf",
    n_ctx=4096,
    n_gpu_layers=35,
)

prompt = "You are a cardiologist. Patient: 67-year-old male with stable angina. Provide evidence-based recommendations."

output = llm(prompt, max_tokens=400)
print(output["choices"][0]["text"])


Prompt Format (MedGemma Chat Template)

The model expects chat-structured input using the Gemma turn markers:

<start_of_turn>user
[Your instruction or clinical text]<end_of_turn>
<start_of_turn>model

Example:

<start_of_turn>user
You are a cardiologist. Generate recommendations for:
Sex: Male
Age: 67
Length of stay: 10 days
[discharge summary...]<end_of_turn>
<start_of_turn>model
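When building prompts programmatically, the turn structure above can be assembled with a small helper. This is a minimal sketch assuming the standard Gemma `<start_of_turn>`/`<end_of_turn>` markers; `build_prompt` is a hypothetical helper name, not part of this repository:

```python
# Minimal helper (hypothetical, for illustration) that wraps a clinical
# instruction in the Gemma chat turn structure.

def build_prompt(instruction: str) -> str:
    return (
        "<start_of_turn>user\n"
        f"{instruction}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

case = (
    "You are a cardiologist. Generate recommendations for:\n"
    "Sex: Male\nAge: 67\nLength of stay: 10 days"
)
prompt = build_prompt(case)
print(prompt)
```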


Fine-Tuning Details

The training dataset included more than 12,000 cardiology-specific instruction–response pairs:

{
  "prompt": "...",
  "response": "..."
}

These were converted into Gemma-style messages:

{
  "messages": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ]
}
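The conversion step can be sketched as a one-record transform; `to_messages` is an illustrative helper, with field names taken from the examples in this card:

```python
# Sketch of the dataset conversion described above: each
# {"prompt", "response"} pair becomes a Gemma-style "messages" record.

def to_messages(pair: dict) -> dict:
    return {
        "messages": [
            {"role": "user", "content": pair["prompt"]},
            {"role": "assistant", "content": pair["response"]},
        ]
    }

record = to_messages({
    "prompt": "Summarize this discharge note...",
    "response": "The patient was admitted with...",
})
print(record["messages"][0]["role"])  # user
```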

Training setup:

  • LoRA r=16, alpha=32
  • BF16 training
  • max_length: 2048–4096
  • learning rate sweep: 5e-5 → 3e-4
  • warmup_ratio: 0.03–0.10
  • gradient checkpointing: enabled
  • optimizer and scheduler from TRL SFTTrainer
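The effect of `warmup_ratio` can be illustrated with a simple schedule sketch. TRL's SFTTrainer delegates scheduling to the transformers library, so the exact curve depends on `lr_scheduler_type`; the linear-warmup-plus-cosine-decay below is one common choice, shown for illustration only:

```python
import math

# Illustration of warmup_ratio: the learning rate ramps linearly over
# the first warmup_ratio fraction of steps, then follows a cosine decay
# to zero. Not the exact schedule used in training; shown as a sketch.

def lr_at_step(step: int, total_steps: int, peak_lr: float,
               warmup_ratio: float = 0.03) -> float:
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

total = 1000
print(lr_at_step(0, total, 3e-4))      # 0.0 at the start of warmup
print(lr_at_step(30, total, 3e-4))     # peak LR at the end of warmup
print(lr_at_step(total, total, 3e-4))  # ~0 after the full decay
```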

Disclaimer

This model is NOT a medical device and must not be used for autonomous diagnosis or treatment decisions.
Outputs require interpretation and verification by certified medical professionals.
Intended solely for research and decision-support augmentation.


Limitations

  • Optimized for cardiology and cardiac surgery
  • Reduced accuracy outside these domains
  • No vision capabilities (text-only MedGemma IT)
  • May generate incomplete or generalized recommendations

Citing & Authors

If you use this model in your research, please cite:

@misc{Ostashko2025MedGemmaCardiology,
  title  = {MedGemma-4B-Cardiology: A Domain-Finetuned Clinical LLM for Cardiology},
  author = {Ostashko, Vitaliy},
  year   = {2025},
  url    = {ai.esemi.org}
}

Project homepage: https://ai.esemi.org


HuggingFace

If you found this model useful, please give it a star on HuggingFace:

https://huggingface.co/uaritm/medgemma-4b-cardiology-gguf
