SA-Med2D-20M
SA-Med2D-20M is a large-scale 2D medical image segmentation dataset containing 4.6 million 2D medical images and 19.7 million corresponding masks. It covers nearly the entire body and exhibits substantial diversity. The dataset is intended to help researchers build medical vision foundation models or apply their models to downstream medical applications.
Description
SAM‑Med2D Dataset Overview
🌤️ Highlights
- Collected and organized the largest medical image segmentation dataset to date (4.6 million images and 19.7 million masks).
- Conducted the most comprehensive fine‑tuning based on the Segment Anything Model (SAM).
- Performed a thorough evaluation of SAM‑Med2D on a large‑scale dataset.
🔥 Updates
- (2023.12.05) Dataset released for download on the Hugging Face platform.
- (2023.11.23) Released the SA‑Med2D‑20M dataset.
- (2023.11.21) Published a paper introducing the SA‑Med2D‑20M dataset.
- (2023.10.24) Released SAM‑Med3D, focusing on 3D medical image segmentation.
- (2023.09.14) Released training code.
- (2023.09.02) Released testing code.
- (2023.08.31) Released pretrained model.
- (2023.08.31) Released paper.
- (2023.08.26) Released online demo.
👉 Dataset
SAM‑Med2D was trained and evaluated on a dataset containing 4.6 million images and 19.7 million masks. The dataset covers 10 medical data modalities, 4 anatomical structures + lesion categories, and 31 major human organs. To the best of our knowledge, this is the largest medical image segmentation dataset in terms of volume and category coverage.
👉 Framework
SAM‑Med2D workflow. We freeze the image encoder and insert learnable adapter layers into each Transformer block to acquire domain‑specific knowledge. The prompt encoder is fine‑tuned with point, bounding‑box (Bbox), and mask information, and the mask decoder is updated through interactive training.
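The adapter idea above can be illustrated with a small numerical sketch: a bottleneck MLP with a residual connection, inserted alongside the frozen block. Dimensions and initialization here are illustrative, not the paper's exact configuration:

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

class Adapter:
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual add."""
    def __init__(self, dim, bottleneck, rng=None):
        rng = rng or np.random.default_rng(0)
        self.w_down = rng.normal(0, 0.02, (dim, bottleneck))
        self.w_up = np.zeros((bottleneck, dim))  # zero-init: adapter starts as identity

    def __call__(self, x):
        return x + gelu(x @ self.w_down) @ self.w_up

# With a zero-initialized up-projection the adapter is an identity map at the
# start of training, so inserting it does not perturb the frozen pretrained features.
tokens = np.random.default_rng(1).normal(size=(196, 768))  # 14x14 patch tokens, ViT-B width
out = Adapter(768, 64)(tokens)
```

Only the adapter weights are trained; the surrounding Transformer block stays frozen.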
👉 Results
Quantitative Comparison
| Model | Resolution | Bbox (%) | 1 pt (%) | 3 pts (%) | 5 pts (%) | FPS | Checkpoint |
|---|---|---|---|---|---|---|---|
| SAM | $256\times256$ | 61.63 | 18.94 | 28.28 | 37.47 | 51 | Official |
| SAM | $1024\times1024$ | 74.49 | 36.88 | 42.00 | 47.57 | 8 | Official |
| FT‑SAM | $256\times256$ | 73.56 | 60.11 | 70.95 | 75.51 | 51 | FT‑SAM |
| SAM‑Med2D | $256\times256$ | 79.30 | 70.01 | 76.35 | 78.68 | 35 | SAM‑Med2D |
Generalization Validation
| Dataset | SAM (Bbox, %) | SAM‑Med2D* (Bbox, %) | SAM‑Med2D (Bbox, %) | SAM (1 pt, %) | SAM‑Med2D* (1 pt, %) | SAM‑Med2D (1 pt, %) |
|---|---|---|---|---|---|---|
| CrossMoDA23 | 78.12 | 86.26 | 88.42 | 33.84 | 65.85 | 85.26 |
| KiTS23 | 81.52 | 86.14 | 89.89 | 31.36 | 56.67 | 83.71 |
| FLARE23 | 73.20 | 77.18 | 85.09 | 19.87 | 32.01 | 77.17 |
| ATLAS2023 | 76.98 | 79.09 | 82.59 | 29.07 | 45.25 | 64.76 |
| SEG2023 | 64.82 | 81.85 | 85.09 | 21.15 | 34.71 | 72.08 |
| LNQ2023 | 53.02 | 57.37 | 58.01 | 7.05 | 7.21 | 37.64 |
| CAS2023 | 61.53 | 78.20 | 81.10 | 22.75 | 46.85 | 78.46 |
| TDSC‑ABUS2023 | 64.31 | 69.00 | 66.14 | 8.24 | 18.98 | 43.55 |
| ToothFairy2023 | 43.40 | 39.13 | 41.23 | 5.47 | 5.27 | 12.93 |
| Weighted sum | 73.49 | 77.67 | 84.88 | 20.88 | 34.30 | 76.63 |
👉 Training
Prepare your own dataset and replace the examples in SAM‑Med2D/data_demo. Before running train.py, generate the image2label_train.json file.
cd ./SAM-Med2D
python train.py
Parameters
- work_dir: directory for training outputs (default: workdir).
- image_size: default 256.
- mask_num: number of masks per image (default 5).
- data_path: dataset directory, e.g., data_demo.
- resume: path to pretrained weights; if provided, ignores sam_checkpoint.
- sam_checkpoint: load a SAM checkpoint.
- iter_point: iteration count for the mask decoder.
- multimask: whether to output multiple masks (default True).
- encoder_adapter: whether to fine-tune adapter layers; set to False when only the decoder is fine-tuned.
- use_amp: use mixed-precision training.
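The image2label_train.json file referenced above maps each image to its mask files. A minimal generation sketch, assuming a hypothetical layout with images in data_demo/images and masks named `<image-stem>_<k>.png` in data_demo/masks (check the examples in SAM-Med2D/data_demo for the exact convention):

```python
import json
from pathlib import Path

def build_image2label(data_root, out_file="image2label_train.json"):
    """Map each training image to the list of mask files sharing its stem."""
    root = Path(data_root)
    mapping = {}
    for img in sorted((root / "images").glob("*.png")):
        masks = sorted((root / "masks").glob(f"{img.stem}_*.png"))
        if masks:  # skip images without any annotated mask
            mapping[str(img)] = [str(m) for m in masks]
    (root / out_file).write_text(json.dumps(mapping, indent=2))
    return mapping
```

Run it once over your dataset root before launching train.py.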
👉 Testing
Prepare your own dataset and replace the examples in SAM‑Med2D/data_demo. Before running test.py, generate the label2image_test.json file.
cd ./SAM-Med2D
python test.py
Parameters
- work_dir: directory for testing outputs (default: workdir).
- batch_size: default 1.
- image_size: default 256.
- boxes_prompt: use Bbox prompts for segmentation.
- point_num: number of points (default 1).
- iter_point: iterations for point prompts.
- sam_checkpoint: load a SAM or SAM-Med2D checkpoint.
- encoder_adapter: set to True when using SAM-Med2D pretrained weights.
- save_pred: save prediction results.
- prompt_path: path to a fixed prompt file; if None, prompts are generated automatically during inference.
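When prompt_path is None, point prompts can be derived from the ground-truth masks. A minimal sketch of sampling one foreground point (illustrative; not necessarily the repository's exact sampling logic):

```python
import numpy as np

def sample_point_prompt(mask, rng=None):
    """Return one (x, y) foreground point and its label (1) from a binary mask."""
    rng = rng or np.random.default_rng()
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        raise ValueError("mask has no foreground pixels")
    i = rng.integers(len(xs))
    # SAM convention: label 1 marks a foreground click, label 0 a background click
    return np.array([xs[i], ys[i]]), 1

mask = np.zeros((256, 256), dtype=np.uint8)
mask[100:120, 40:60] = 1  # a small rectangular object
point, label = sample_point_prompt(mask, np.random.default_rng(0))
```

Iterative point prompting (iter_point > 1) repeats this kind of sampling, conditioning each new click on the previous prediction's error region.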
👉 Deployment
Export to ONNX
- Export encoder model
python3 scripts/export_onnx_encoder_model.py --sam_checkpoint /path/to/sam-med2d_b.pth --output /path/to/sam-med2d_b.encoder.onnx --model-type vit_b --image_size 256 --encoder_adapter True
- Export decoder model
python3 scripts/export_onnx_model.py --checkpoint /path/to/sam-med2d_b.pth --output /path/to/sam-med2d_b.decoder.onnx --model-type vit_b --return-single-mask
- Run inference with onnxruntime
# cd examples/SAM-Med2D-onnxruntime
python3 main.py --encoder_model /path/to/sam-med2d_b.encoder.onnx --decoder_model /path/to/sam-med2d_b.decoder.onnx
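The exported encoder expects a normalized 256×256 input. A minimal preprocessing sketch, assuming the ImageNet-style pixel statistics SAM uses for normalization (verify against the transform in the onnxruntime example before relying on it):

```python
import numpy as np

# Pixel statistics SAM uses for input normalization (RGB order).
PIXEL_MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
PIXEL_STD = np.array([58.395, 57.12, 57.375], dtype=np.float32)

def preprocess(image, size=256):
    """HWC uint8 RGB image -> NCHW float32 tensor for the ONNX encoder."""
    h, w = image.shape[:2]
    # nearest-neighbor resize via index sampling (a real pipeline would interpolate)
    ys = (np.arange(size) * h // size).clip(0, h - 1)
    xs = (np.arange(size) * w // size).clip(0, w - 1)
    resized = image[ys][:, xs].astype(np.float32)
    normed = (resized - PIXEL_MEAN) / PIXEL_STD
    return normed.transpose(2, 0, 1)[None]  # shape (1, 3, size, size)

x = preprocess(np.zeros((512, 384, 3), dtype=np.uint8))
```

The resulting tensor is fed to the encoder session; its image embedding is then passed, together with the encoded prompts, to the decoder session.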
🚀 Try SAM‑Med2D
- 🏆 Gradio online demo: Available on OpenXLab.
- 🏆 Notebook demo: Use predictor_example.ipynb locally to view predictions under different prompts.
- 🏆 Local Gradio deployment: Deploy app.ipynb locally and upload test cases.
🗓️ Ongoing
- Dataset release
- Dataset paper release
- Training code release
- Testing code release
- Pretrained model release
- Paper release
- Online demo release
🎫 License
The project is released under the Apache 2.0 License.
💬 Discussion Group
For any questions about SAM‑Med2D, add the WeChat ID to join the discussion group.
🤝 Acknowledgements
- Thanks to all medical professionals and dataset owners for providing public datasets.
- Thanks to the open‑source projects: Segment Anything.
👋 Recruitment & Global Collaboration
- Recruitment: Our General Vision team at the Shanghai AI Lab is hiring researchers, engineers, and interns. If you are interested in medical foundation models and general medical AI, contact us.
- Global Collaboration: We aim to redefine medical research and advance the medical community. Partners are welcome to increase competitiveness, reduce risk, and expand markets.
- Contact: Junjun He (hejunjun@pjlab.org.cn), Jin Ye (yejin@pjlab.org.cn), and Tianbin Li (litianbin@pjlab.org.cn).
References
@misc{cheng2023sammed2d,
      title={SAM-Med2D},
      author={Junlong Cheng and Jin Ye and Zhongying Deng and Jianpin Chen and Tianbin Li and Haoyu Wang and Yanzhou Su and Ziyan Huang and Jilong Chen and Lei Jiang and Hui Sun and Junjun He and Shaoting Zhang and Min Zhu and Yu Qiao},
      year={2023},
      eprint={2308.16184},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
@misc{ye2023samed2d20m,
      title={...}
}
Source
Organization: arXiv
Created: 11/21/2023