Model Overview
Description:
NVIDIA MAISI (Medical AI for Synthetic Imaging) is a three-dimensional (3D) Latent Diffusion Model designed for generating high-quality synthetic CT images with or without anatomical annotations. It is suited to data augmentation and to creating realistic medical imaging data that supplements datasets limited by privacy concerns or by the rarity of a condition. The diverse, realistic training data it generates can also improve the performance of other medical imaging AI models.
MAISI offers several key features:
- Generates high-resolution 3D CT images up to 512 × 512 × 768 voxels
- Supports variable voxel sizes ranging from 0.5 mm to 5.0 mm
- Capable of annotating up to 127 anatomical classes, including organs and tumors
- Allows controllable anatomy size for 10 specific classes
- Produces paired segmentation masks
With these capabilities, MAISI is a valuable tool for researchers advancing AI applications in healthcare. This model is intended for research purposes only and not for clinical use.
Terms of Use
By using this model, you are agreeing to the terms and conditions of the license.
References:
[1] Rombach, Robin, et al. "High-resolution image synthesis with latent diffusion models." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022. https://openaccess.thecvf.com/content/CVPR2022/papers/Rombach_High-Resolution_Image_Synthesis_With_Latent_Diffusion_Models_CVPR_2022_paper.pdf
[2] Zhang, Lvmin, Anyi Rao, and Maneesh Agrawala. "Adding conditional control to text-to-image diffusion models." Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2023, pp. 3836-3847. https://openaccess.thecvf.com/content/ICCV2023/papers/Zhang_Adding_Conditional_Control_to_Text-to-Image_Diffusion_Models_ICCV_2023_paper.pdf
Model Architecture:
Architecture Type: Convolutional Neural Network (CNN)
Network Architecture: 3D UNet + attention blocks
Input:
num_output_samples
Input Type: Integer
Input Format: Single integer value
Input Parameters: Required. The number of synthetic images the model will generate.
body_region
Input Type: List
Input Format: Array of Strings
Input Parameters: Required. The body region(s) the generated CT will focus on.
- Options: ["head", "chest", "thorax", "abdomen", "pelvis", "lower"]
anatomy_list
Input Type: List
Input Format: Array of Strings
Input Parameters: Optional list of anatomical class names to annotate, chosen from the 127 classes listed in the Additional Information section.
output_size
Input Type: List
Input Format: Array of 3 Integers
Input Parameters: Optional list of 3 numbers that indicate the x, y, and z size of the CT image.
- x- and y-axes: 128, 256, 384, 512
- z-axis: 128, 256, 384, 512, 640, 768
spacing
Input Type: List
Input Format: Array of 3 Floats
Input Parameters: Optional list of 3 floats that indicate the voxel spacing of the CT image in mm
- Each element must be in the range: 0.5 to 5.0
controllable_anatomy_size
Input Type: List
Input Format: Array of Tuples (String, Float)
Input Parameters: Optional list of tuples for up to 10 different anatomies. Each tuple consists of an (organ_name, size_value) pair.
- organ_name options: ["liver", "gallbladder", "stomach", "pancreas", "colon", "lung tumor", "bone lesion", "hepatic tumor", "colon cancer primaries", "pancreatic tumor"]
- size_value range: 0.0 to 1.0, or -1 to remove the anatomy from the generated image
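The constraints listed above can be checked before a request is submitted. The following is a minimal sketch and not part of MAISI: the function `validate_request` and the dictionary-based request layout are hypothetical, and the rule that `num_output_samples` must be at least 1 is an assumption. The allowed values are copied from the parameter descriptions in this section.

```python
from typing import Any

# Allowed values copied from the Input section of this model card.
BODY_REGIONS = {"head", "chest", "thorax", "abdomen", "pelvis", "lower"}
XY_SIZES = {128, 256, 384, 512}
Z_SIZES = {128, 256, 384, 512, 640, 768}
CONTROLLABLE = {
    "liver", "gallbladder", "stomach", "pancreas", "colon",
    "lung tumor", "bone lesion", "hepatic tumor",
    "colon cancer primaries", "pancreatic tumor",
}


def validate_request(req: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    errors = []

    n = req.get("num_output_samples")
    # Assumption: at least one image must be requested.
    if not isinstance(n, int) or isinstance(n, bool) or n < 1:
        errors.append("num_output_samples must be a positive integer")

    regions = req.get("body_region")
    if not regions or not set(regions) <= BODY_REGIONS:
        errors.append("body_region must be a non-empty list of known regions")

    size = req.get("output_size")
    if size is not None:
        if (len(size) != 3 or size[0] not in XY_SIZES
                or size[1] not in XY_SIZES or size[2] not in Z_SIZES):
            errors.append("output_size is not a supported x, y, z combination")

    spacing = req.get("spacing")
    if spacing is not None:
        if len(spacing) != 3 or not all(0.5 <= s <= 5.0 for s in spacing):
            errors.append("spacing must be 3 floats, each in [0.5, 5.0]")

    for name, value in req.get("controllable_anatomy_size", []):
        if name not in CONTROLLABLE:
            errors.append(f"{name!r} is not a controllable anatomy")
        elif value != -1 and not 0.0 <= value <= 1.0:
            errors.append(f"size for {name!r} must be in [0.0, 1.0] or -1")

    return errors


example_request = {
    "num_output_samples": 1,
    "body_region": ["abdomen"],
    "anatomy_list": ["liver", "hepatic tumor"],
    "output_size": [512, 512, 768],
    "spacing": [1.0, 1.0, 1.5],
    "controllable_anatomy_size": [("liver", 0.5), ("hepatic tumor", -1)],
}
problems = validate_request(example_request)
print(problems)
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in a request at once.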
Output:
Output Type(s): Image(s)
Output Format: NIfTI (Neuroimaging Informatics Technology Initiative), DICOM (Digital Imaging and Communications in Medicine), and NRRD (Nearly Raw Raster Data)
Output Parameters: Three-Dimensional (3D)
Output Description: Synthetic CT image with dimensions up to 512 × 512 × 768 voxels and voxel spacing between 0.5 mm and 5.0 mm, reflecting the controllable anatomy sizes specified in the input. If requested in the input parameters, an additional NIfTI file containing the corresponding label map for the anatomy_list is also provided.
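To relate `output_size` and `spacing` to the physical volume that is generated, the sketch below multiplies the voxel count along each axis by the spacing along that axis. This is the usual convention for volumetric images rather than something this model card states. The helper names are hypothetical, and the 2-bytes-per-voxel figure is an assumption (16-bit storage) used only to give a rough sense of raw volume size.

```python
def physical_extent_mm(output_size, spacing):
    """Extent along x, y, z in mm: number of voxels times voxel spacing."""
    return tuple(n * s for n, s in zip(output_size, spacing))


def voxel_count(output_size):
    """Total number of voxels in the volume."""
    x, y, z = output_size
    return x * y * z


largest = [512, 512, 768]
extent = physical_extent_mm(largest, [0.5, 0.5, 0.5])
count = voxel_count(largest)
# Assumption: 2 bytes per voxel (16-bit storage), uncompressed.
raw_mib = count * 2 / 2**20

print(extent)   # (256.0, 256.0, 384.0)
print(count)    # 201326592
print(raw_mib)  # 384.0
```

At the finest spacing the largest supported volume covers only 256 mm in-plane, so a coarser spacing is needed to cover a wider field of view at the same voxel dimensions.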
Software Integration:
Runtime Engine(s):
MONAI Core v.1.3.2
Supported Hardware Microarchitecture Compatibility:
- NVIDIA Ampere
- NVIDIA Hopper
Supported Operating System(s):
- Linux
Model Version(s):
0.3.1
Inference:
Engine: PyTorch
Test Hardware:
A100 with at least 80GB memory for 512x512x512 images
H100 with at least 80GB memory for 512x512x512 images
Additional Information:
The current list of classes available within MAISI:
"liver": 1,
"spleen": 3,
"pancreas": 4,
"right kidney": 5,
"aorta": 6,
"inferior vena cava": 7,
"right adrenal gland": 8,
"left adrenal gland": 9,
"gallbladder": 10,
"esophagus": 11,
"stomach": 12,
"duodenum": 13,
"left kidney": 14,
"bladder": 15,
"portal vein and splenic vein": 17,
"small bowel": 19,
"brain": 22,
"lung tumor": 23,
"pancreatic tumor": 24,
"hepatic vessel": 25,
"hepatic tumor": 26,
"colon cancer primaries": 27,
"left lung upper lobe": 28,
"left lung lower lobe": 29,
"right lung upper lobe": 30,
"right lung middle lobe": 31,
"right lung lower lobe": 32,
"vertebrae L5": 33,
"vertebrae L4": 34,
"vertebrae L3": 35,
"vertebrae L2": 36,
"vertebrae L1": 37,
"vertebrae T12": 38,
"vertebrae T11": 39,
"vertebrae T10": 40,
"vertebrae T9": 41,
"vertebrae T8": 42,
"vertebrae T7": 43,
"vertebrae T6": 44,
"vertebrae T5": 45,
"vertebrae T4": 46,
"vertebrae T3": 47,
"vertebrae T2": 48,
"vertebrae T1": 49,
"vertebrae C7": 50,
"vertebrae C6": 51,
"vertebrae C5": 52,
"vertebrae C4": 53,
"vertebrae C3": 54,
"vertebrae C2": 55,
"vertebrae C1": 56,
"trachea": 57,
"left iliac artery": 58,
"right iliac artery": 59,
"left iliac vena": 60,
"right iliac vena": 61,
"colon": 62,
"left rib 1": 63,
"left rib 2": 64,
"left rib 3": 65,
"left rib 4": 66,
"left rib 5": 67,
"left rib 6": 68,
"left rib 7": 69,
"left rib 8": 70,
"left rib 9": 71,
"left rib 10": 72,
"left rib 11": 73,
"left rib 12": 74,
"right rib 1": 75,
"right rib 2": 76,
"right rib 3": 77,
"right rib 4": 78,
"right rib 5": 79,
"right rib 6": 80,
"right rib 7": 81,
"right rib 8": 82,
"right rib 9": 83,
"right rib 10": 84,
"right rib 11": 85,
"right rib 12": 86,
"left humerus": 87,
"right humerus": 88,
"left scapula": 89,
"right scapula": 90,
"left clavicula": 91,
"right clavicula": 92,
"left femur": 93,
"right femur": 94,
"left hip": 95,
"right hip": 96,
"sacrum": 97,
"left gluteus maximus": 98,
"right gluteus maximus": 99,
"left gluteus medius": 100,
"right gluteus medius": 101,
"left gluteus minimus": 102,
"right gluteus minimus": 103,
"left autochthon": 104,
"right autochthon": 105,
"left iliopsoas": 106,
"right iliopsoas": 107,
"left atrial appendage": 108,
"brachiocephalic trunk": 109,
"left brachiocephalic vein": 110,
"right brachiocephalic vein": 111,
"left common carotid artery": 112,
"right common carotid artery": 113,
"costal cartilages": 114,
"heart": 115,
"left kidney cyst": 116,
"right kidney cyst": 117,
"prostate": 118,
"pulmonary vein": 119,
"skull": 120,
"spinal cord": 121,
"sternum": 122,
"left subclavian artery": 123,
"right subclavian artery": 124,
"superior vena cava": 125,
"thyroid gland": 126,
"vertebrae S1": 127,
"bone lesion": 128,
"airway": 132
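The label indices above are not contiguous (for example, 2, 16, and 18 are unused, and the list jumps from 128 to 132), so an index should be looked up by name rather than inferred from position. The sketch below shows one way to translate an `anatomy_list` into label indices. The names `LABELS` and `labels_for` are hypothetical, and only a few entries of the table are reproduced for brevity.

```python
# Excerpt of the class table above; a real lookup would include every entry.
LABELS = {
    "liver": 1,
    "spleen": 3,
    "pancreas": 4,
    "hepatic tumor": 26,
    "colon": 62,
    "bone lesion": 128,
    "airway": 132,
}


def labels_for(anatomy_list):
    """Map class names to label indices, rejecting unknown names."""
    unknown = [name for name in anatomy_list if name not in LABELS]
    if unknown:
        raise KeyError(f"unknown anatomy names: {unknown}")
    return [LABELS[name] for name in anatomy_list]


indices = labels_for(["liver", "hepatic tumor", "airway"])
print(indices)  # [1, 26, 132]
```

The same mapping can be inverted to turn values found in a returned label map back into class names.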
Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns here.