fMRI motion correction RAM error

khemmerling · September 20, 2021, 10:45pm

Hello!

I am getting an error with sct_fmri_moco (OSError: [Errno 24] Too many open files) that I believe has to do with RAM. My dataset is 300 volumes, which produces this error. If I motion correct a subset of this dataset (e.g. 200 volumes), motion correction is successful.

I recognize that this type of memory issue has happened before (https://github.com/spinalcordtoolbox/spinalcordtoolbox/issues/2661), but I can’t find a solution there, so I am wondering whether you have any insight.

Thank you for your help!
Kim

Commands and terminal output:

--
Spinal Cord Toolbox (5.3.0)

sct_fmri_moco -i func.nii.gz -m mask.nii.gz -g 2
--


Input parameters:
  Input file ............ /Users/user/Desktop/func.nii.gz
  Group size ............ 2
Creating temporary folder (/var/folders/z0/hm1j1z1512v85rj04nw_0hp80000gp/T/sct-20210920104828.701384-moco-1rnel45h)

Copying input data to tmp folder and convert to nii...

Get dimensions of data...
  128 x 44 x 25

Data orientation: RPI
  Treated as axial

Set suffix of transformation file name, which depends on the orientation:
Orientation is axial, suffix is 'Warp.nii.gz'. The estimated transformation is a 3D warping field, which is composed of a stack of 2D Tx-Ty transformations

Split along T dimension...
Merge within groups: 100%|██████████████████| 150/150 [00:04<00:00, 35.76iter/s]

Merge across groups...

-------------------------------------------------------------------------------
  Estimating motion across groups...
-------------------------------------------------------------------------------

Input parameters:
  Input file ............ datasub-groups.nii
  Reference file ........ datasub_0_mean.nii.gz
  Polynomial degree ..... 2
  Smoothing kernel ...... 0
  Gradient step ......... 1
  Metric ................ MeanSquares
  Sampling .............. None
  Todo .................. estimate_and_apply
  Mask  ................. mask.nii
  Output mat folder ..... mat_groups

Data dimensions:
  128 x 44 x 25 x 150

Copy file_target to a temporary file...

Register. Loop across Z (note: there is only one Z if orientation is axial)
Z=0/0: 100%|████████████████████████████████| 150/150 [04:10<00:00,  1.67s/iter]

-------------------------------------------------------------------------------
  Apply moco
-------------------------------------------------------------------------------

Input parameters:
  Input file ............ data.nii
  Reference file ........ datasub_0_mean.nii.gz
  Polynomial degree ..... 2
  Smoothing kernel ...... 0
  Gradient step ......... 1
  Metric ................ MeanSquares
  Sampling .............. None
  Todo .................. apply
  Mask  ................. mask.nii
  Output mat folder ..... mat_final/

Data dimensions:
  128 x 44 x 25 x 300

Copy file_target to a temporary file...

Register. Loop across Z (note: there is only one Z if orientation is axial)
Z=0/0: 100%|████████████████████████████████| 300/300 [00:34<00:00,  8.57iter/s]
Traceback (most recent call last):
  File "/Users/user/sct_5.3.0/spinalcordtoolbox/scripts/sct_fmri_moco.py", line 201, in <module>
    main(sys.argv[1:])
  File "/Users/user/sct_5.3.0/spinalcordtoolbox/scripts/sct_fmri_moco.py", line 186, in main
    fname_output_image = moco_wrapper(param)
  File "/Users/user/sct_5.3.0/spinalcordtoolbox/moco.py", line 375, in moco_wrapper
    file_mat_data, im_moco = moco(param_moco)
  File "/Users/user/sct_5.3.0/spinalcordtoolbox/moco.py", line 659, in moco
    im_data_splitZ_splitT_moco = [Image(fname) for fname in file_data_splitZ_splitT_moco]
  File "/Users/user/sct_5.3.0/spinalcordtoolbox/moco.py", line 659, in <listcomp>
    im_data_splitZ_splitT_moco = [Image(fname) for fname in file_data_splitZ_splitT_moco]
  File "/Users/user/sct_5.3.0/spinalcordtoolbox/image.py", line 285, in __init__
    self.loadFromPath(param, verbose)
  File "/Users/user/sct_5.3.0/spinalcordtoolbox/image.py", line 406, in loadFromPath
    self.data = self.im_file.get_data()
  File "/Users/user/sct_5.3.0/python/envs/venv_sct/lib/python3.6/site-packages/nibabel/deprecator.py", line 183, in deprecated_func
    return func(*args, **kwargs)
  File "/Users/user/sct_5.3.0/python/envs/venv_sct/lib/python3.6/site-packages/nibabel/dataobj_images.py", line 207, in get_data
    data = np.asanyarray(self._dataobj)
  File "/Users/user/sct_5.3.0/python/envs/venv_sct/lib/python3.6/site-packages/numpy/core/_asarray.py", line 136, in asanyarray
    return array(a, dtype, copy=False, order=order, subok=True)
  File "/Users/user/sct_5.3.0/python/envs/venv_sct/lib/python3.6/site-packages/nibabel/arrayproxy.py", line 391, in __array__
    arr = self._get_scaled(dtype=dtype, slicer=())
  File "/Users/user/sct_5.3.0/python/envs/venv_sct/lib/python3.6/site-packages/nibabel/arrayproxy.py", line 358, in _get_scaled
    scaled = apply_read_scaling(self._get_unscaled(slicer=slicer), scl_slope, scl_inter)
  File "/Users/user/sct_5.3.0/python/envs/venv_sct/lib/python3.6/site-packages/nibabel/arrayproxy.py", line 337, in _get_unscaled
    mmap=self._mmap)
  File "/Users/user/sct_5.3.0/python/envs/venv_sct/lib/python3.6/site-packages/nibabel/volumeutils.py", line 507, in array_from_file
    offset=offset)
  File "/Users/user/sct_5.3.0/python/envs/venv_sct/lib/python3.6/site-packages/numpy/core/memmap.py", line 264, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
OSError: [Errno 24] Too many open files

Note: I’ve edited some of the paths above to delete personal info.

Check dependencies output:

SCT info:
- version: 5.3.0
- path: /Users/user/sct_5.3.0
OS: osx (Darwin-19.0.0-x86_64-i386-64bit)
CPU cores: Available: 8, Used by ITK functions: 8
RAM: Total: 16384MB, Used: 9668MB, Available: 6512MB
Check Python executable.............................[OK]
Using bundled python 3.6.13 |Anaconda, Inc.| (default, Feb 23 2021, 12:58:59) 
[GCC Clang 10.0.0 ] at /Users/user/sct_5.3.0/python/envs/venv_sct/bin/python
Check if data are installed.........................[OK]
Check if colored is installed.......................[OK] (1.4.2)
Check if dipy is installed..........................[OK] (1.4.0)
Check if futures is installed.......................[OK]
Check if h5py is installed..........................[OK] (2.10.0)
Check if Keras (2.1.5) is installed.................[OK] (2.1.5)
Check if ivadomed is installed......................[OK] (2.7.4)
Check if matplotlib is installed....................[OK] (3.3.4)
Check if nibabel is installed.......................[OK] (3.2.1)
Check if numpy is installed.........................[OK] (1.19.5)
Check if onnxruntime (1.4.0) is installed...........[OK] (1.4.0)
Check if pandas is installed........................[OK] (1.1.5)
Check if psutil is installed........................[OK] (5.8.0)
Check if pyqt5 (5.11.3) is installed................[OK] (5.11.3)
Check if pytest is installed........................[OK] (6.2.3)
Check if pytest-cov is installed....................[OK] (__version__ = '2.11.1')
Check if raven is installed.........................[OK]
Check if requests is installed......................[OK] (2.25.1)
Check if requirements-parser is installed...........[OK] (0.2.0)
Check if scipy is installed.........................[OK] (1.5.4)
Check if scikit-image is installed..................[OK] (0.17.2)
Check if scikit-learn is installed..................[OK] (0.24.1)
Check if tensorflow (1.5.0) is installed............[OK] (1.5.0)
Check if torch (1.5.0) is installed.................[OK] (1.5.0)
Check if torchvision (0.6.0) is installed...........[OK] (0.6.0)
Check if xlwt is installed..........................[OK] (1.3.0)
Check if tqdm is installed..........................[OK] (4.60.0)
Check if transforms3d is installed..................[OK] (0.3.1)
Check if urllib3 is installed.......................[OK] (1.26.4)
Check if pytest_console_scripts is installed........[OK]
Check if wquantiles is installed....................[OK] (0.4)
Check if spinalcordtoolbox is installed.............[OK]
Check ANTs compatibility with OS ...................[OK]
Check PropSeg compatibility with OS ................[OK]
Check if DISPLAY variable is set....................[OK]
Check if figure can be opened with PyQt.............[OK]

jcohenadad · September 21, 2021, 1:40am

Hi @khemmerling,

Thank you for reaching out, and sorry you are experiencing this issue. Hmm… I’m not sure what would be a good short-term solution. I see that your system is using about 10GB of RAM (out of 16GB total). Would it be possible to close some RAM-hungry software before running SCT and see if that solves the issue for this dataset?

Maybe @joshuacwnewton you have some ideas?

joshuacwnewton · September 21, 2021, 5:03pm

I believe there are two separate (but related) issues across both this post and the older SCT issues:

Issue 1: “Too many open files”. Subtly, I believe this is actually a “file descriptor limit” issue, rather than a RAM issue. There is a quick workaround to address this:
- Run “ulimit -Sn” in your terminal to check your system’s soft limit on file descriptors.
- Run “ulimit -Sn <number>” to temporarily increase the limit during the current session. (So, for example, if “ulimit -Sn” outputs 1024, you could try “ulimit -Sn 2048” or “ulimit -Sn 4096”.)
- Note that this will need to be run each time you start a new terminal session, but this gives the assurance that there won’t be any permanent effects on your system.
- While this command may fix the “Too many open files” error, it could then lead to further RAM issues, since a higher limit allows more files to be open at the same time.
Issue 2: “Killed”. This is a result of the Out-of-Memory (OOM) Killer, which gets invoked when the system is critically low on memory. This is the issue that was reported in Issue #2661 (as was linked above).
- I’m not sure that there are any quick workarounds for this issue.
  - Off the top of my head, I had been thinking about splitting up the dataset using sct_image -split, performing motion correction in batches, then concatenating the volumes using sct_image -concat. But, I wonder if that may introduce discontinuities between batches…
  - If SCT had a “sliding window”-style approach to only look at N volumes at a time, then perhaps this could reduce our memory usage. Conceptually, the “-g” argument for sct_fmri_moco does work this way, but I don’t know offhand how increasing “-g” would affect memory usage.
- So, the best solution for the second issue would likely require looking at how SCT handles resources internally, and seeing if there are any optimizations that could be made.

I would say to first try the workaround for Issue 1. If you then encounter Issue 2 (or any other issues), please let us know.
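
For scripted pipelines, the same soft limit can also be inspected and raised from within Python, using the standard-library resource module (Unix only). The sketch below is only an illustration: raise_soft_fd_limit is a hypothetical helper, not part of SCT, and the change applies only to the current Python process and any child processes it starts.

```python
import resource


def raise_soft_fd_limit(target):
    """Raise the soft file-descriptor limit to `target`, capped at the hard limit.

    Returns the soft limit in effect afterwards. Never lowers the limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # the soft limit cannot exceed the hard limit
    if target > soft:
        # Note: some systems (e.g. macOS) may reject very large values
        # even when the hard limit is reported as unlimited.
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft limit now:", raise_soft_fd_limit(1024))
```

Calling raise_soft_fd_limit(1024) before launching sct_fmri_moco through subprocess should have the same effect for that run as typing “ulimit -Sn 1024” in the terminal first.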

Kind regards,
Joshua

khemmerling · September 21, 2021, 7:31pm

@jcohenadad @joshuacwnewton Thank you both for your responses. Closing all other software before running SCT did not work, but increasing my ulimit did! I increased it from 256 to 1024, and was then able to run motion correction successfully.

I see the difference now between issues 1 and 2, and I don’t think I’ve encountered issue 2.

I really appreciate the thoughtful response and suggestions!

Best,
Kim

jcohenadad · September 21, 2021, 7:37pm

This is awesome! @joshuacwnewton maybe we should consider checking the “ulimit” value on the user’s system to catch this possible issue? I’m sure many people have encountered it but didn’t report it on the forum.
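
A pre-flight check along these lines might look like the sketch below. It is only an illustration, not SCT code: fd_limit_warning and the reserved margin of 32 descriptors are assumptions, and it presumes roughly one open descriptor per volume, as the traceback in the first post suggests.

```python
import resource


def fd_limit_warning(n_files, reserved=32, soft_limit=None):
    """Return a warning string if opening `n_files` at once may exceed the soft limit.

    `reserved` is an assumed margin for descriptors the process already uses
    (stdin/stdout, libraries, temporary files). Returns None if the limit looks fine.
    """
    if soft_limit is None:
        soft_limit = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
    if soft_limit == resource.RLIM_INFINITY:
        return None
    needed = n_files + reserved
    if needed <= soft_limit:
        return None
    return (
        f"About {needed} file descriptors may be needed, but the soft limit is "
        f"{soft_limit}. Consider running 'ulimit -Sn {needed}' (or higher) first."
    )


if __name__ == "__main__":
    # Numbers from this thread: 300 volumes with a soft limit of 256.
    print(fd_limit_warning(300, soft_limit=256))
```

With the values reported in this thread, the check flags 300 volumes at a limit of 256, and stays silent for 200 volumes at 256 or for 300 volumes at 1024.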

joshuacwnewton · September 21, 2021, 8:00pm

I’m so glad to hear it’s working now.

I agree that this warrants further investigation, either by checking ulimit, or through optimizing how we use resources during motion correction.
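
On the resource-usage side, here is one possible direction, sketched under assumptions. The traceback in the first post shows every volume being loaded through a memory map inside a single list comprehension, and in CPython each live mmap object keeps its own duplicate of the file descriptor. The standard-library sketch below (load_all and reproduce are hypothetical names, not SCT functions) temporarily lowers the soft limit to reproduce Errno 24 with memory maps, then shows that reading each file fully and closing it stays within the same limit. Whether SCT’s Image class can load data this way is not something this thread confirms.

```python
import errno
import mmap
import os
import resource
import tempfile


def load_all(paths, keep_mapped):
    """Load every file; return how many were loaded.

    keep_mapped=True keeps each file memory-mapped (one descriptor held per file).
    keep_mapped=False reads the bytes into memory and releases the descriptor.
    """
    loaded = []
    try:
        for path in paths:
            with open(path, "rb") as f:
                if keep_mapped:
                    # The mmap object holds a duplicate descriptor until it is closed.
                    loaded.append(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ))
                else:
                    loaded.append(f.read())
        return len(loaded)
    finally:
        for item in loaded:
            if isinstance(item, mmap.mmap):
                item.close()


def reproduce(n_files=100, soft_limit=64):
    """Return (mapped_ok, eager_ok) under a temporarily lowered soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"vol_{i:04d}.bin") for i in range(n_files)]
        for path in paths:
            with open(path, "wb") as f:
                f.write(b"\0" * 64)  # mmap needs a non-empty file
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
        try:
            try:
                load_all(paths, keep_mapped=True)
                mapped_ok = True
            except OSError as err:
                if err.errno != errno.EMFILE:  # EMFILE is "Too many open files"
                    raise
                mapped_ok = False
            eager_ok = load_all(paths, keep_mapped=False) == n_files
        finally:
            resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
    return mapped_ok, eager_ok


if __name__ == "__main__":
    print(reproduce())
```

The trade-off is that eager reads copy all of the data into RAM rather than paging it in on demand, so this addresses the descriptor limit and not the memory pressure behind the “Killed” case.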

Since we have a short-term workaround, I can mark this as resolved for now, then follow up on this discussion in the corresponding GitHub issue (#2149).