Accounting at NCI

Revision as of 01:58, 13 April 2021 by A.heerdegen (Added cost)
This is a stub page and needs expansion.

This page describes the tools available for accounting of computing and storage resources at NCI. These tools were developed by NCI and the CMS team.

Grafana

CMS has put in place a Grafana server for visualising a range of accounting statistics for CLEx. You can access this server using your NCI credentials at: https://accessdev.nci.org.au/grafana/login

Unfortunately, the collection of statistics for Gadi is not yet complete. Currently, you can see time series of your storage and SU usage in the User Report.

We will update this section as more statistics become available.

Computing resources

nci_account

You can see the current allocation and the usage so far in the quarter for any computing project you are a member of. Using the -v option, you can see the usage per user:

$ nci_account -v

nqstat

qstat allows you to monitor your jobs at NCI. nqstat allows you to see the jobs currently submitted:

  • for a given project
  • for a given queue and project
  • for a given user and project

uqstat

The information output by qstat and nqstat isn't always enough. CMS has developed uqstat, which outputs more information by default, such as the job efficiency (cpu%), the queueing time, the walltime and the cost in SU.

To use this command, please load the nci-scripts module:

module use /g/data/hh5/public/modules
module load nci-scripts

Contrary to what we usually recommend, it should be safe to load this module in your .bashrc file so this module is loaded by default.
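
As a sketch, the two lines above could be added to .bashrc as follows. The guard around them is an assumption added here, not part of the CMS instructions: it only skips the lines in shells where the module command is not defined.

```shell
# ~/.bashrc excerpt (sketch): make uqstat and qcost available in every new shell.
# The `type module` guard is an assumption added for safety; it is not part of
# the original instructions.
if type module >/dev/null 2>&1; then
    module use /g/data/hh5/public/modules
    module load nci-scripts
fi
```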

$ uqstat

The -x option shows the same information for current jobs and for jobs that finished within the last 24 hours.

qcost

CMS has also developed qcost, a tool to help you choose the best PBS configuration for your job. qcost calculates the cost, in SU, of a job submitted to the PBS system. NCI provides the PBS queue information, but it can be tedious to work out which combination of queue and memory request minimises the job cost. qcost was created to make this process easier.

To use this command, please load the nci-scripts module (see above). Usage:

 $ qcost -h 
 usage: qcost [-h] -q QUEUE -n NCPUS -m MEM [-t TIME]
 
 Return the cost (in SUs) for a PBS job submitted on gadi. No checking is done to ensure requests are within queue limits.
 
 optional arguments:
   -h, --help                show this help message and exit
   -q QUEUE, --queue QUEUE   PBS queue
   -n NCPUS, --ncpus NCPUS   Number of cpus
   -m MEM, --mem MEM         Requested memory
   -t TIME, --time TIME      Job walltime (hours)
 
 e.g. qcost -q normal -n 4 -m 60GB -t 3:00:00

Note that if no time is specified it defaults to 1 hour (1:00:00). Walltime can be specified as H:M or H:M:S. Memory must be given with a byte unit (B), e.g. 160GB or 2000MB. For example:

  $ qcost -q normal -n 4 -m 60GB -t 3:00:00
  90 SU
  $ qcost -q express -n 4 -m 60GB -t 3:00:00
  270 SU
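
To make the two results above easier to interpret, here is a minimal sketch of how such a cost could be estimated. It is not the qcost implementation. The charging model (rate × the larger of the CPU request and the memory request divided by 4 GB × walltime in hours) and the rates of 2 SU (normal) and 6 SU (express) per CPU-hour are assumptions, chosen because they reproduce the figures above. The function name estimate_su is hypothetical, and memory is accepted in GB only.

```shell
# Rough SU estimate for a job; a sketch, NOT the qcost implementation.
# Assumed model: SU = rate * max(ncpus, mem_GB / 4) * walltime_hours
# Assumed rates (SU per CPU-hour): normal = 2, express = 6.
estimate_su() {
    # usage: estimate_su QUEUE NCPUS MEM_GB [H:M[:S]]   (walltime defaults to 1:00:00)
    local rate
    case "$1" in
        normal)  rate=2 ;;
        express) rate=6 ;;
        *) echo "no assumed rate for queue: $1" >&2; return 1 ;;
    esac
    awk -v r="$rate" -v n="$2" -v m="${3%GB}" -v t="${4:-1:00:00}" 'BEGIN {
        split(t, p, ":")                  # hours, minutes, optional seconds
        hours = p[1] + p[2] / 60 + p[3] / 3600
        cpus = m / 4                      # CPUs implied by the memory request
        if (n > cpus) cpus = n            # charge for the larger of the two
        printf "%g SU\n", r * cpus * hours
    }'
}

estimate_su normal 4 60GB 3:00:00     # prints "90 SU"
estimate_su express 4 60GB 3:00:00    # prints "270 SU"
```

Under these assumptions the 60GB request, not the 4 CPUs, drives the cost: 60 GB / 4 GB is charged as 15 CPUs, so lowering the memory request is what reduces the SU figure.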

Storage resources

lquota

This command will list the quota and usage for all the projects you are a member of on both /scratch and /g/data. It will give usage and quota for both space and number of files.

nci-files-report

This command will list the usage in both space and number of files along different dimensions depending on options.

Usage owned by a project per user

If you want to know who owns files in your project or where those files are, you should use:

$ nci-files-report --group w35 --filesystem gdata

This is probably the most useful option for project managers who want to find which users own the most data in a project.

Usage in a project directory per user

If you want to know who owns files in the main directory of one of your projects, you should use:

$ nci-files-report --project w35 --filesystem gdata

This is probably the least useful option of this command.

Usage per user

If you want to know your data footprint across all your projects in a filesystem, you should use:

$ nci-files-report --user --filesystem gdata

Why are the totals different in nci-files-report and Grafana?

CMS collects the data shown in Grafana, and we can only scan files that have group read permission, so files without it are missing from the Grafana totals. We ask everyone to set this permission unless your scientific project has restrictions that prevent it.
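
One way to set group read permission is sketched below. The helper name grant_group_read and the path in the usage comment are hypothetical, so substitute a directory you own. Check that the data has no access restrictions before running it.

```shell
# Sketch: give the owning group read access to everything under a directory.
# g+r lets the group read files; the capital X adds execute permission only to
# directories (so they can be traversed) and to files that are already
# executable by someone.
grant_group_read() {
    chmod -R g+rX -- "$1"
}

# Usage (hypothetical path, adjust to your own project directory):
#   grant_group_read "/g/data/w35/$USER"
```

The capital X is used instead of x so that ordinary data files do not become executable.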