Download Data¶
Written by Luke Chang
Throughout this course we will be using two openly shared naturalistic datasets to demonstrate how to use different analytic techniques. If you end up using any of this data in a paper, be sure to cite the original papers.
The Sherlock dataset contains 16 participants who watched 50 minutes of Sherlock across two scanning runs (i.e., Part1 & Part2) and then verbally recalled the narrative in the scanner using a noise-cancelling microphone. The TR was 1.5 s. If you would like to access the stimuli, the video from Part1 (the first 25 min) and an audio recording of the show can be downloaded from the stimuli folder of this OpenNeuro page. We have preprocessed the data using fmriprep and performed denoising. See the Preprocessing tutorial for more details. Note that we have also cropped the viewing files so that each subject has the same number of TRs and is aligned in time at the start of the movie. The Recall data has not been cropped, but we have included the details of when the subjects recalled specific scenes in the Sherlock_Recall_Scene_n50_Onsets.csv file in the onsets folder. Finally, we have also included some scene annotations shared by the authors in the Sherlock_Segments_1000_NN_2017.xlsx file in the onsets folder.
Chen, J., Leong, Y., Honey, C. et al. Shared memories reveal shared structure in neural activity across individuals. Nat Neurosci 20, 115–125 (2017). https://doi.org/10.1038/nn.4450
The Paranoia dataset contains 23 participants who listened to a 22-minute original narrative that describes an ambiguous social scenario. It was written such that some individuals might find it highly suspicious. The transcript and audio recording can be downloaded from the stimuli folder on OpenNeuro. The TR was 1 s, and there was a 3 s fixation period before the beginning of each run.
Finn, E.S., Corlett, P.R., Chen, G. et al. Trait paranoia shapes inter-subject synchrony in brain activity during an ambiguous social narrative. Nat Commun 9, 2043 (2018). https://doi.org/10.1038/s41467-018-04387-2
The datasets are being shared using DataLad on the German Neuroinformatics Node (GIN), which is an international forum for sharing experimental data and analysis tools.
In this notebook, we will walk through how to access the datasets using DataLad.
DataLad¶
The easiest way to access the data is using DataLad, which is an open source version control system for data built on top of git-annex. Think of it like git for data. It provides a handy command line interface for downloading data, tracking changes, and sharing it with others.
While DataLad offers a number of useful features for working with datasets, there are three in particular that we think make it worth the effort to install for this course.

1. Cloning a DataLad repository can be completed with a single line of code, datalad clone <repository>, and provides the full directory structure in the form of symbolic links. This allows you to explore all of the files in the dataset without having to download the entire dataset at once.
2. Specific files can be easily downloaded using datalad get <filename>, and files can be removed from your computer at any time using datalad drop <filename>. As these datasets are large, this will allow you to work with only the data you need for a specific tutorial, and you can drop the rest when you are done with it.
3. All of the DataLad commands can be run within Python using the DataLad Python API (see the sketch below).
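To preview that third point, here is a minimal sketch of the clone → get → drop cycle using the Python API (covered in detail at the end of this notebook). The target path Sherlock and the example confound file are illustrative choices, not requirements:

import datalad.api as dl

# Clone the dataset metadata only (fast; no file content is downloaded yet)
ds = dl.clone(source='https://gin.g-node.org/ljchang/Sherlock', path='Sherlock')

# Fetch the content of a single file, then free the space when finished
target = 'fmriprep/sub-01/func/sub-01_task-sherlockPart1_desc-confounds_regressors.tsv'
ds.get(target)
ds.drop(target)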
We will only be covering a few basic DataLad functions to get and drop data. We encourage the interested reader to read the very comprehensive DataLad User Handbook for more details and troubleshooting.
Installing DataLad¶
DataLad can be easily installed using pip.
pip install datalad
Unfortunately, it currently requires manually installing the git-annex dependency, which is not automatically installed using pip.
If you are using macOS, we recommend installing git-annex using the Homebrew package manager.
brew install git-annex
If you are on Debian/Ubuntu, we recommend enabling the NeuroDebian repository and installing with apt-get.
sudo apt-get install datalad
For more installation options, we recommend reading the DataLad installation instructions.
!pip install datalad
Download Data with DataLad¶
Download Sherlock¶
The Sherlock dataset can be accessed at the following location: https://gin.g-node.org/ljchang/Sherlock. To download the Sherlock dataset, run datalad install https://gin.g-node.org/ljchang/Sherlock in a terminal in the location where you would like to install the dataset. The full dataset is approximately 109 GB.
You can run this from the notebook using the ! cell magic.
!datalad install https://gin.g-node.org/ljchang/Sherlock
Download Paranoia¶
The Paranoia dataset can be accessed at the following location: https://gin.g-node.org/ljchang/Paranoia. To download the Paranoia dataset, run datalad clone https://gin.g-node.org/ljchang/Paranoia. The full dataset is approximately 100 GB.
!datalad install https://gin.g-node.org/ljchang/Paranoia
DataLad Basics¶
You might be surprised to find that, after cloning, the dataset barely takes up any space (you can verify this with du -sh). This is because cloning only downloads the metadata of the dataset, which describes what files are included.
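For example, assuming you cloned the Sherlock dataset into a folder named Sherlock in the current directory, you can confirm this from the notebook:

!du -sh Sherlock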
You can check how big the entire dataset would be if you downloaded everything using datalad status --annex.
!datalad status --annex
Getting Data¶
One of the really nice features of DataLad is that you can see all of the files in a dataset without actually storing them on your computer. When you want a specific file, you use datalad get <filename> to download it. Importantly, you do not need to download all of the data at once; you can fetch each file only when you need it.
Now that we have cloned the repository, we can grab individual files. For example, suppose we wanted to grab the first subject’s confound regressors generated by fmriprep.
!datalad get fmriprep/sub-01/func/sub-01_task-sherlockPart1_desc-confounds_regressors.tsv
Now we can check how much of the total dataset we have downloaded using datalad status --annex all.
!datalad status --annex all
If you would like to download all of the files, you can use datalad get . (the period refers to the current directory, i.e., the dataset root). Depending on the size of the dataset and the speed of your internet connection, this might take a while. One really nice thing about DataLad is that if your connection is interrupted, you can simply run datalad get . again, and it will resume where it left off.
You can also install the dataset and download all of the files with a single command: datalad install -g https://gin.g-node.org/ljchang/Sherlock. You may want to do this if you have a lot of storage available and a fast internet connection. For most people, we recommend only downloading the files you need for a specific tutorial.
Dropping Data¶
Most people do not have unlimited space on their hard drives and are constantly looking for ways to free up space when they are no longer actively working with files. Any file in a dataset can be removed using datalad drop <filename>. Importantly, this does not delete the file from the dataset; it only removes the content from your computer. You will still be able to see the file's metadata after it has been dropped, in case you want to download it again in the future.
As an example, let’s drop the Sherlock confound regressor .tsv file.
!datalad drop fmriprep/sub-01/func/sub-01_task-sherlockPart1_desc-confounds_regressors.tsv
DataLad has a Python API!¶
One particularly nice aspect of DataLad is that it has a Python API, which means that anything you can do with DataLad on the command line can also be run in Python. See the details of the DataLad Python API.
For example, suppose you would like to clone a data repository, such as the Sherlock dataset. You can run dl.clone(source=url, path=location). Make sure you set sherlock_path to the location where you would like the Sherlock repository installed.
import os
import glob
import datalad.api as dl
import pandas as pd
sherlock_path = '/Users/lukechang/Downloads/Sherlock'
dl.clone(source='https://gin.g-node.org/ljchang/Sherlock', path=sherlock_path)
<Dataset path=/Users/lukechang/Downloads/Sherlock>
We can now create a dataset instance using dl.Dataset(path_to_data).
ds = dl.Dataset(sherlock_path)
How much of the dataset have we downloaded? We can check the status of the annex using ds.status(annex='all').
results = ds.status(annex='all')
1349 annex'd files (0.0 B/109.0 GB present/total size)
Looks like it’s empty, which makes sense since we only cloned the dataset.
Now we need to get some data. Let’s start with something small to play with first.
Let’s use glob to find all of the tab-delimited confound data generated by fmriprep.
file_list = glob.glob(os.path.join(sherlock_path, 'fmriprep', '*', 'func', '*tsv'))
file_list.sort()
file_list
['/Users/lukechang/Downloads/Sherlock/fmriprep/sub-01/func/sub-01_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-01/func/sub-01_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-01/func/sub-01_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-02/func/sub-02_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-02/func/sub-02_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-02/func/sub-02_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-03/func/sub-03_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-03/func/sub-03_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-03/func/sub-03_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-04/func/sub-04_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-04/func/sub-04_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-04/func/sub-04_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-05/func/sub-05_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-05/func/sub-05_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-05/func/sub-05_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-06/func/sub-06_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-06/func/sub-06_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-06/func/sub-06_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-07/func/sub-07_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-07/func/sub-07_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-07/func/sub-07_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-08/func/sub-08_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-08/func/sub-08_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-08/func/sub-08_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-09/func/sub-09_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-09/func/sub-09_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-09/func/sub-09_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-10/func/sub-10_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-10/func/sub-10_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-10/func/sub-10_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-11/func/sub-11_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-11/func/sub-11_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-11/func/sub-11_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-12/func/sub-12_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-12/func/sub-12_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-12/func/sub-12_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-13/func/sub-13_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-13/func/sub-13_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-13/func/sub-13_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-14/func/sub-14_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-14/func/sub-14_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-14/func/sub-14_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-15/func/sub-15_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-15/func/sub-15_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-15/func/sub-15_task-sherlockPart2_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-16/func/sub-16_task-freerecall_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-16/func/sub-16_task-sherlockPart1_desc-confounds_regressors.tsv',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-16/func/sub-16_task-sherlockPart2_desc-confounds_regressors.tsv']
Notice that glob can search the file tree and list all of the relevant files, even though none of their content has been downloaded yet.
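You can verify this directly. The following is a quick sketch, assuming the annexed files are stored as symbolic links (the git-annex default on macOS and Linux): a file that has not been downloaded is a broken symlink, so it appears in glob results even though its content is not present.

import os

# The path is listed in the file tree, but its content has not been fetched yet
print(os.path.islink(file_list[0]))  # True: it is an annex symlink
print(os.path.exists(file_list[0]))  # False: the link target (the content) is missing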
Let’s now download the first subject’s confound regressor file and load it using pandas.
result = ds.get(file_list[0])
confounds = pd.read_csv(file_list[0], sep='\t')
confounds.head()
  | csf | csf_derivative1 | csf_derivative1_power2 | csf_power2 | white_matter | white_matter_derivative1 | white_matter_power2 | white_matter_derivative1_power2 | global_signal | global_signal_derivative1 | ... | motion_outlier127 | motion_outlier128 | motion_outlier129 | motion_outlier130 | motion_outlier131 | motion_outlier132 | motion_outlier133 | motion_outlier134 | motion_outlier135 | motion_outlier136
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
0 | 897.394539 | NaN | NaN | 805316.958867 | 695.564895 | NaN | 483810.522970 | NaN | 635.368732 | NaN | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
1 | 889.168301 | -8.226238 | 67.670997 | 790620.267181 | 695.581487 | 0.016592 | 483833.605371 | 0.000275 | 633.147820 | -2.220912 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
2 | 886.763733 | -2.404568 | 5.781946 | 786349.918340 | 694.135813 | -1.445674 | 481824.526707 | 2.089974 | 629.315476 | -3.832344 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | 879.587127 | -7.176606 | 51.503680 | 773673.513408 | 693.405760 | -0.730052 | 480811.548543 | 0.532977 | 626.664425 | -2.651051 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4 | 881.602440 | 2.015313 | 4.061486 | 777222.861394 | 692.787848 | -0.617913 | 479955.002077 | 0.381816 | 628.319190 | 1.654765 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
5 rows × 389 columns
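As a quick example of working with these confounds (a sketch: it assumes the file includes fmriprep's standard framewise_displacement column, which is outside the truncated preview above):

# Summarize overall head motion for this run; the first value is NaN by construction
print(confounds['framewise_displacement'].describe())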
What if we wanted to drop that file? Just like the CLI, we can use ds.drop(file_name).
result = ds.drop(file_list[0])
To confirm that it is actually removed, let’s try to load it again with pandas.
confounds = pd.read_csv(file_list[0], sep='\t')
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-16-58d68c1a1fbf> in <module>
----> 1 confounds = pd.read_csv(file_list[0], sep='\t')
~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
674 )
675
--> 676 return _read(filepath_or_buffer, kwds)
677
678 parser_f.__name__ = name
~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
446
447 # Create the parser.
--> 448 parser = TextFileReader(fp_or_buf, **kwds)
449
450 if chunksize or iterator:
~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
878 self.options["has_index_names"] = kwds["has_index_names"]
879
--> 880 self._make_engine(self.engine)
881
882 def close(self):
~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
1112 def _make_engine(self, engine="c"):
1113 if engine == "c":
-> 1114 self._engine = CParserWrapper(self.f, **self.options)
1115 else:
1116 if engine == "python":
~/anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
1889 kwds["usecols"] = self.usecols
1890
-> 1891 self._reader = parsers.TextReader(src, **kwds)
1892 self.unnamed_cols = self._reader.unnamed_cols
1893
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()
FileNotFoundError: [Errno 2] File /Users/lukechang/Downloads/Sherlock/fmriprep/sub-01/func/sub-01_task-freerecall_desc-confounds_regressors.tsv does not exist: '/Users/lukechang/Downloads/Sherlock/fmriprep/sub-01/func/sub-01_task-freerecall_desc-confounds_regressors.tsv'
Looks like it was successfully removed.
We can also download the entire dataset in one command if we want, using ds.get(dataset='.', recursive=True). We are not going to do that right now, as it will take a while and require lots of free hard disk space.
Let’s actually download one of the files we will be using in the tutorial. First, let’s use glob to get a list of all of the functional data that has been preprocessed by fmriprep, denoised, and smoothed.
file_list = glob.glob(os.path.join(sherlock_path, 'fmriprep', '*', 'func', '*crop*nii.gz'))
file_list.sort()
file_list
['/Users/lukechang/Downloads/Sherlock/fmriprep/sub-01/func/sub-01_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-01/func/sub-01_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-02/func/sub-02_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-02/func/sub-02_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-03/func/sub-03_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-03/func/sub-03_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-04/func/sub-04_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-04/func/sub-04_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-05/func/sub-05_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-05/func/sub-05_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-06/func/sub-06_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-06/func/sub-06_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-07/func/sub-07_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-07/func/sub-07_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-08/func/sub-08_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-08/func/sub-08_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-09/func/sub-09_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-09/func/sub-09_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-10/func/sub-10_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-10/func/sub-10_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-11/func/sub-11_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-11/func/sub-11_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-12/func/sub-12_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-12/func/sub-12_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-13/func/sub-13_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-13/func/sub-13_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-14/func/sub-14_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-14/func/sub-14_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-15/func/sub-15_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-15/func/sub-15_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-16/func/sub-16_denoise_crop_smooth6mm_task-sherlockPart1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz',
'/Users/lukechang/Downloads/Sherlock/fmriprep/sub-16/func/sub-16_denoise_crop_smooth6mm_task-sherlockPart2_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz']
Now let’s download the first subject’s file using ds.get(). This file is 825 MB, so it might take a few minutes depending on your internet speed.
result = ds.get(file_list[0])
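As an optional sanity check (a sketch: it assumes nibabel is installed, which later tutorials in this course rely on for reading NIfTI images), we can confirm the content arrived by loading the image:

import nibabel as nib

# nibabel loads lazily, so this reads the header without pulling all voxel data into memory
img = nib.load(file_list[0])
print(img.shape)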
How much of the dataset have we downloaded? We can check the status of the annex using ds.status(annex='all').
result = ds.status(annex='all')
1349 annex'd files (825.0 MB/109.0 GB present/total size)
Ok, that concludes our tutorial on how to download data for this course with DataLad, using both the command line interface and the Python API.