OpenMedLLM-70B
by deepcog-ai · Updated 2 days ago · 48.2k downloads · 1,241 likes
Tags: Medical · Clinical · Apache 2.0 · PyTorch · GGUF · AWQ

📊 Benchmark Performance

  • GeneTuring (avg): 78.9%
  • ClinVar VUS: 82.4%
  • Gene-Disease Assoc.: 84.1%

OpenMedLLM-70B achieves state-of-the-art results on the benchmarks listed above, surpassing GPT-4o by 10.5 points on GeneTuring (78.9% average) and outperforming all prior open-source medical models on ClinVar VUS classification (82.4%).

🏥 About OpenMedLLM-70B

OpenMedLLM-70B is an open-source large language model built by DeepCog.ai specifically for medical data analysis, clinical decision support, and clinical reporting. It bridges the gap between raw medical data and clinically actionable insights through a natural-language interface.

Unlike general DNA sequence models (DNABERT, Evo2), which operate on raw nucleotide sequences, OpenMedLLM-70B is designed for clinical reasoning over variant-level medical data. It answers complex questions such as "What is the clinical significance of this BRCA1 variant?" or "Generate an ACMG classification report for this VCF."

  • Built on Llama-3 70B with continued domain-adaptive pre-training
  • Trained on ClinVar, NCBI dbSNP, OMIM, gnomAD, Ensembl, and 8M+ PubMed medical research abstracts
  • Aligned with DPO using 2.4M expert-annotated clinical decision support preference pairs
  • Supports a 128K-token context, enough to process full clinical reports in one pass
  • Native VCF file input and structured JSON/ACMG report output
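The card does not document how VCF input is passed to the model, so the following is only a minimal sketch of one way to turn VCF data lines into text prompts by hand. It relies on the eight fixed tab-separated columns of the VCF format; the function names, the prompt layout, and the example coordinates are hypothetical placeholders, not part of the OpenMedLLM API.

```python
from typing import Dict, Iterator, List

# The eight fixed columns that open every VCF data line.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_lines(lines: List[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per VCF data line, keyed by the fixed column names."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, "##" meta lines, and the "#CHROM" column header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        yield dict(zip(VCF_COLUMNS, fields[:8]))


def record_to_prompt(record: Dict[str, str]) -> str:
    """Format one record as a plain-text prompt (hypothetical layout)."""
    return (
        "Interpret the following variant and provide ACMG classification:\n"
        f"Location: {record['CHROM']}:{record['POS']}\n"
        f"Change: {record['REF']}>{record['ALT']}\n"
        f"INFO: {record['INFO']}"
    )


# Placeholder record with made-up coordinates, for illustration only.
example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "17\t12345\t.\tA\tAC\t50\tPASS\tDP=100\n",
]
records = list(parse_vcf_lines(example_vcf))
print(record_to_prompt(records[0]))
```

A production pipeline would more likely use an established VCF library and handle multi-allelic sites and sample columns, which this sketch ignores.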

💻 Quick Start

python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model
tokenizer = AutoTokenizer.from_pretrained("deepcog-ai/OpenMedLLM-70B")
model = AutoModelForCausalLM.from_pretrained(
    "deepcog-ai/OpenMedLLM-70B",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example 1: Variant interpretation
prompt = """Interpret the following variant and provide ACMG classification:
Gene: BRCA1
Variant: c.5266dupC (p.Gln1756ProfsTer74)
Allele frequency (gnomAD): 0.000004
ClinVar submissions: 142 pathogenic, 0 benign"""

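# Send inputs to the model's device (device_map="auto" decides placement)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
output = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.1)
print(tokenizer.decode(output[0], skip_special_tokens=True))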
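
When many variants need to be interpreted, it can help to build the prompt from structured fields rather than by hand. The helper below is a small sketch that reproduces the layout of the example prompt above; `build_variant_prompt` and its parameter names are illustrative assumptions, not part of the model's API.

```python
def build_variant_prompt(gene: str, variant: str, allele_frequency: float,
                         pathogenic: int, benign: int) -> str:
    """Assemble a variant-interpretation prompt in the layout of the Quick Start example."""
    return (
        "Interpret the following variant and provide ACMG classification:\n"
        f"Gene: {gene}\n"
        f"Variant: {variant}\n"
        # Fixed-point formatting avoids scientific notation such as 4e-06.
        f"Allele frequency (gnomAD): {allele_frequency:.6f}\n"
        f"ClinVar submissions: {pathogenic} pathogenic, {benign} benign"
    )


prompt = build_variant_prompt("BRCA1", "c.5266dupC", 0.000004, 142, 0)
print(prompt)
```

The resulting string can be passed to the tokenizer exactly as in the example above.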

bash (CLI)
# Install and run via CLI
pip install openmedllm

openmedllm download deepcog-ai/OpenMedLLM-70B
openmedllm interpret --vcf patient.vcf --output report.json
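
The schema of `report.json` is not described in this card, so the sketch below assumes only that the file is valid JSON. It summarizes the top-level structure as a first step before writing any schema-specific code; `summarize_report` is a hypothetical helper name.

```python
import json
from pathlib import Path


def summarize_report(path: str) -> dict:
    """Load a JSON report and describe its top-level shape without assuming a schema."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {"type": "object", "keys": sorted(data.keys())}
    if isinstance(data, list):
        return {"type": "array", "length": len(data)}
    return {"type": type(data).__name__}


if __name__ == "__main__":
    report = Path("report.json")
    if report.exists():
        print(summarize_report(str(report)))
    else:
        print("report.json not found; run the CLI command above first")
```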

📚 Training Data

  • ClinVar: 2M+ clinically annotated variants with pathogenicity classifications and submitter evidence
  • gnomAD: Population allele frequencies across 125,000 exomes and 15,000 genomes
  • OMIM: 7,000+ disease diagnoses with molecular mechanisms and inheritance patterns
  • PubMed: 8M+ medical research abstracts and 240K full-text papers from 2000–2025
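
To put the gnomAD cohort size in context, an allele frequency is the allele count divided by the allele number (AF = AC / AN). The short calculation below is a rough sketch that assumes a diploid autosomal site with every individual successfully genotyped, which real gnomAD sites rarely achieve. It shows that the frequency of 0.000004 used in the Quick Start prompt corresponds to roughly one observed allele across the whole cohort.

```python
EXOMES = 125_000   # cohort size from the gnomAD entry above
GENOMES = 15_000   # cohort size from the gnomAD entry above


def max_allele_number(individuals: int, ploidy: int = 2) -> int:
    """Upper bound on AN: every individual contributes `ploidy` alleles."""
    return individuals * ploidy


def expected_allele_count(allele_frequency: float, allele_number: int) -> float:
    """Rearranged AF = AC / AN, solved for AC."""
    return allele_frequency * allele_number


an = max_allele_number(EXOMES + GENOMES)
ac = expected_allele_count(0.000004, an)
print(an)            # 280000
print(round(ac, 2))  # 1.12
```

At this scale a single additional carrier changes the reported frequency substantially, which is one reason very rare variants are hard to classify from population data alone.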

⚠️ Limitations and Safety

OpenMedLLM-70B is intended for research and clinical decision support only. It should not replace the judgment of a board-certified clinical geneticist.

  • Performance degrades for rare variants with <5 ClinVar submissions
  • Not validated for clinical decision support on somatic variants in oncology
  • Should be used with appropriate clinical informatics infrastructure
  • All outputs must be reviewed by a qualified clinician before patient use
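
The first and last limitations can be enforced mechanically in a downstream pipeline. The sketch below is one possible guard, not part of the OpenMedLLM tooling: it flags every result for clinician review and adds a second flag when a variant has fewer than 5 ClinVar submissions. `VariantResult`, `review_flags`, and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Threshold taken from the limitations list: reliability drops below 5 submissions.
MIN_CLINVAR_SUBMISSIONS = 5


@dataclass
class VariantResult:
    variant: str
    clinvar_submissions: int
    model_classification: str  # placeholder for whatever label the model returns


def review_flags(result: VariantResult) -> List[str]:
    """Return the review flags that apply to one model output."""
    # Every output needs clinician review, regardless of evidence level.
    flags = ["clinician review required"]
    if result.clinvar_submissions < MIN_CLINVAR_SUBMISSIONS:
        flags.append("low ClinVar evidence: reduced model reliability expected")
    return flags


well_supported = VariantResult("BRCA1 c.5266dupC", 142, "placeholder label")
rare = VariantResult("hypothetical rare variant", 2, "placeholder label")
print(review_flags(well_supported))
print(review_flags(rare))
```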

📄 Citation

bibtex
@article{deepcog2026openmedllm,
  title   = {OpenMedLLM: An Open-Source LLM for Medical
             Data Analysis and Clinical Variant Interpretation},
  author  = {DeepCog AI Research Team},
  journal = {arXiv preprint},
  year    = {2026},
  url     = {https://openmedllm.org}
}