Type: Dataset
Tags: TNT
Bibtex:
@article{, title= {BRATS2013 Tumor-NoTumor Dataset (T-NT)}, keywords= {TNT}, author= {}, abstract= {This dataset (called T-NT) contains images that either do or do not contain a tumor, along with a segmentation of the brain matter and the tumor. The goal is to allow bias in data to be simulated in a controlled fashion.

# Dataset Construction

The dataset is constructed from the synthetic data of the BRATS2013 dataset. Each brain contains a tumor, but it is typically on only one side. Only the right side of each image is taken, so that some examples do not contain a tumor. Each image is filtered to ensure it contains enough brain (in the snippet below, the ratio of brain pixels to background pixels must exceed 0.30). If the tumor takes up more than 1% of the brain pixels, the image is considered to have a tumor.

Here is a snippet from the code used to construct the dataset:

```
def get_labels(rightside):
    met = {}
    # ratio of brain (non-zero) pixels to background (zero) pixels
    met['brain'] = (
        1. * (rightside != 0).sum() / (rightside == 0).sum())
    # fraction of brain pixels whose value is greater than 2 (counted as tumor)
    met['tumor'] = (
        1. * (rightside > 2).sum() / ((rightside != 0).sum() + 1e-10))
    met['has_enough_brain'] = met['brain'] > 0.30
    met['has_tumor'] = met['tumor'] > 0.01
    return met
```

# File and Folder structure

The files are named as follows:

PatientID-SlideNumber-HasTumor.png

For example:

```
HG0011-118-False.png
HG0015-65-True.png
HG0019-95-False.png
```

The pixel values of the segmentation images correspond to the following 6 classes:

```
Non Tumor classes: 0, 10, 20
Tumor classes: 40
Unknown classes: 30, 50
```

A Tumor example https://i.imgur.com/WIKFhO1.png

A NoTumor example https://i.imgur.com/AbkTw5L.png

The folders are divided into training and testing by patient. Each is then divided into flair, t1, and segmentation images.
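
The file naming scheme and segmentation classes described above can be handled with a short helper. This is a minimal sketch and not part of the original dataset code: the names `parse_filename` and `SEGMENTATION_CLASSES` are hypothetical, and it assumes every file name follows `PatientID-SlideNumber-HasTumor.png` exactly as in the examples.

```python
from pathlib import Path

# Pixel values in the segmentation images, grouped as listed above.
# (Hypothetical name; the grouping itself comes from the dataset description.)
SEGMENTATION_CLASSES = {
    "non_tumor": (0, 10, 20),
    "tumor": (40,),
    "unknown": (30, 50),
}


def parse_filename(path):
    """Split 'PatientID-SlideNumber-HasTumor.png' into its three fields."""
    # Assumes exactly two hyphens in the file name, as in the examples.
    patient_id, slide_number, has_tumor = Path(path).stem.split("-")
    if has_tumor not in ("True", "False"):
        raise ValueError(f"unexpected HasTumor field: {has_tumor!r}")
    return patient_id, int(slide_number), has_tumor == "True"


print(parse_filename("HG0015-65-True.png"))  # prints ('HG0015', 65, True)
```

Because the label is encoded in the file name, a loader built on such a helper would not need a separate label file; the patient ID also makes it possible to check that no patient appears in both splits.
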
The folder layout is:

```
train (2125 images, 1421 tumor, 704 notumor)
├── flair
├── segmentation
└── t1
holdout (1415 images, 1051 tumor, 364 notumor)
├── flair
├── segmentation
└── t1
```

Patients in training:
['HG0018' 'HG0019' 'HG0012' 'HG0013' 'HG0010' 'HG0011' 'HG0016' 'HG0017' 'HG0014' 'HG0015' 'HG0023' 'HG0022' 'HG0021' 'LG0005' 'LG0004' 'LG0007' 'LG0006' 'LG0001' 'LG0003' 'LG0002' 'LG0025' 'LG0024' 'LG0009' 'LG0022' 'LG0021' 'LG0020' 'HG0009' 'HG0008' 'HG0002' 'HG0025']

Patients in test:
['HG0001' 'HG0003' 'HG0024' 'HG0005' 'HG0004' 'HG0007' 'HG0006' 'HG0020' 'LG0023' 'LG0008' 'LG0016' 'LG0017' 'LG0014' 'LG0015' 'LG0012' 'LG0013' 'LG0010' 'LG0011' 'LG0018' 'LG0019']

Sample Flair Images

| Tumor | NoTumor |
|:----------:|:-------------:|
| https://i.imgur.com/3305V4u.png | https://i.imgur.com/QDVB4fo.png |
| https://i.imgur.com/kGHfa8Q.png | https://i.imgur.com/MKA9vxK.png |

# Citation

If you use this dataset, please cite:

```
Distribution Matching Losses Can Hallucinate Features in Medical Image Translation
Joseph Paul Cohen, Margaux Luck, Sina Honari
Medical Image Computing & Computer Assisted Intervention (MICCAI)
https://arxiv.org/abs/1805.08841
```

```
@article{cohen2018distribution,
  author = {Cohen, Joseph Paul and Luck, Margaux and Honari, Sina},
  journal = {Medical Image Computing & Computer Assisted Intervention (MICCAI)},
  title = {Distribution Matching Losses Can Hallucinate Features in Medical Image Translation},
  year = {2018}
}
```

## License

The original files are shared under the following license, so our dataset is shared under the same license.

"Except where otherwise noted, content is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Switzerland License. http://creativecommons.org/licenses/by-nc-sa/3.0/ch/deed.en"

The following papers describe the original dataset:

Menze et al., The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS), IEEE Trans. Med. Imaging, 2015.

Kistler et al., The virtual skeleton database: an open access repository for biomedical research and collaboration. JMIR, 2013.
}, terms= {}, license= {Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)}, superseded= {}, url= {https://github.com/ieee8023/dist-bias} }