Dusql
dusql is a disk usage analysis tool developed by CMS to help manage data on our storage areas at NCI.
dusql is installed in the 'unstable' CMS conda environment. To use it, run:
module use /g/data/hh5/public/modules
module load conda/analysis3-unstable
dusql --help
Commands
ncdu: Finding Files Interactively
The simplest way to find files is to use the interactive viewer, dusql ncdu. This is a basic text interface that shows how many files match a given condition in each directory.
Say you want to find big files in your /short directory. You might run dusql ncdu /short/$PROJECT/$USER --size=10gb to find all the files larger than 10 GB.
du: Summarising a Directory
dusql du works the same way as ncdu: it shows the total size and file count of files matching some constraint under a directory, but rather than opening the text interface it prints a summary for each directory to the screen. You can give it multiple directories as well, e.g. to find files under the current directory older than 3 years:
$ dusql du * --mtime=-3y | sort -hr
304.99GB, 6624 files, um-ostia
4.76GB, 223 files, wrf-era
3.41GB, 1003 files, access-cm2-ukca
1.94GB, 98 files, mpas
919.57MB, 1 files, nu-wrf_v8-wrf371-lis71rp7.tgz
It's helpful to pipe the output of dusql du to sort -hr, as shown above, to order the paths by size, or to sort -nr -k 2 to order them by file count.
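To see what the two sort invocations do without rescanning a directory, here is a small sketch that replays the sample lines from the dusql du example above. The function name sample_du_output is hypothetical and exists only for this illustration; in practice you would pipe the real dusql du command into sort instead. The -h option assumes a sort that understands human-readable suffixes, such as GNU sort.

```shell
#!/bin/sh
# Hypothetical helper: replays the sample `dusql du` lines shown earlier so the
# sorting can be tried without access to the real filesystem.
sample_du_output() {
cat <<'EOF'
4.76GB, 223 files, wrf-era
919.57MB, 1 files, nu-wrf_v8-wrf371-lis71rp7.tgz
304.99GB, 6624 files, um-ostia
1.94GB, 98 files, mpas
3.41GB, 1003 files, access-cm2-ukca
EOF
}

# Order by size: -h compares the leading number together with its unit suffix
# (M, G, ...), and -r puts the largest entry first.
sample_du_output | sort -hr

echo "---"

# Order by file count: -k 2 starts the sort key at the second
# whitespace-separated field (the count), and -n compares it numerically.
sample_du_output | sort -nr -k 2
```

With real data the equivalent would be something like dusql du * --mtime=-3y | sort -nr -k 2, using only the options already shown in this section.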
find: Listing Individual Files
dusql find prints the paths of all matching files. It can be helpful if there are just a few files you're trying to track down:
$ dusql find . --mtime=-7y | head
/short/w35/saw562/scratch/spherepack3.2/Makefile
/short/w35/saw562/scratch/wrf-era/FILE:2006-03-02_18
/short/w35/saw562/scratch/wrf-era/SST:2006-03-03_12
/short/w35/saw562/scratch/wrf-era/FILE:2006-03-01_00
/short/w35/saw562/scratch/wrf-era/SST:2006-03-01_12
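Because dusql find prints one path per line, its output can be passed to ordinary shell tools. The sketch below counts matching files per parent directory, using the sample paths from the listing above as stand-in input. The function names sample_find_output and count_by_dir are hypothetical and are not part of dusql, and the approach assumes that no path contains a newline.

```shell
#!/bin/sh
# Hypothetical helper: replays the sample `dusql find` paths shown above.
sample_find_output() {
cat <<'EOF'
/short/w35/saw562/scratch/spherepack3.2/Makefile
/short/w35/saw562/scratch/wrf-era/FILE:2006-03-02_18
/short/w35/saw562/scratch/wrf-era/SST:2006-03-03_12
/short/w35/saw562/scratch/wrf-era/FILE:2006-03-01_00
/short/w35/saw562/scratch/wrf-era/SST:2006-03-01_12
EOF
}

# Hypothetical helper: reads paths on stdin and prints "count directory",
# with the directory holding the most matches first.
count_by_dir() {
    # Strip the final path component, then count how often each directory appears.
    sed 's|/[^/]*$||' | sort | uniq -c | sort -nr
}

sample_find_output | count_by_dir
```

With real data, the here-document would be replaced by the actual command, for example dusql find . --mtime=-7y | count_by_dir.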