Downloading from MASS

To help researchers run models from the UK Met Office and compare results with them, we have limited access to the Met Office data archives. On request, and with approval from the UKMO, we are able to download data from their research runs to NCI.

The following is documentation for the CMS team; researchers affiliated with the centre just need to get in touch by emailing the helpdesk at climate_help@nci.org.au with the ID of the run they want to access.

Setup

  1. Register for Jasmin - https://accounts.jasmin.ac.uk/
  2. Register for a MASS account with the UKMO - the collaboration team can sponsor the application - http://help.ceda.ac.uk/article/228-how-to-apply-for-a-met-office-mass-account
  3. Register for access to the mass-cli1 server on Jasmin - http://help.ceda.ac.uk/article/229-how-to-apply-for-access-to-the-met-office-mass-client-machine

After this is processed you will get an email from the Met Office storage team with a credentials file, and an email from Jasmin saying you can access mass-cli1. If there are any issues, contact monsoon@metoffice.gov.uk.

The `moose` credentials file needs to be installed from the mass-cli1 server. First copy it to the Jasmin login node (the home directory is shared across all Jasmin servers)

scp moose swales@jasmin-login1.ceda.ac.uk:~/

Connect to mass-cli1 (via the login node)

ssh -A swales@jasmin-login1.ceda.ac.uk
ssh mass-cli1

and install and check the credentials

moo install
moo si -v

Getting data from MASS

To see the datasets we have access to

moo projinfo -l project-jasmin-umcollab

If the desired project isn't available, contact the collaboration team and ask that it be authorised.

To list a dataset's contents

moo ls moose:/crum/u-ai718

UM outputs are organised into a directory for each output stream; within each directory are timestamped files.
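
For example, listing one of the stream directories shows the timestamped files it contains (this is the same stream used in the `moo get` example below)

moo ls moose:/crum/u-ai718/ap6.pp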

Data should be extracted into a 'project workspace' before copying to NCI (the home directory is too small). Your Met Office contact should be able to recommend a location (here I'm using mo_gc3)

mkdir /group_workspaces/jasmin2/mo_gc3/swales/u-ai718/p6
moo get moose:/crum/u-ai718/ap6.pp/ai718a.p61950\* /group_workspaces/jasmin2/mo_gc3/swales/u-ai718/p6
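
If only a subset of fields is needed, MASS also supports filtered retrievals with `moo select` and a record query file. This is a minimal sketch only - the query file name and STASH code are illustrative, check `moo help select` for the exact query syntax

# Query file selecting a single STASH field (24 = surface temperature); contents are illustrative
cat > temp.query << EOF
begin
  stash=24
end
EOF

moo select temp.query moose:/crum/u-ai718/ap6.pp \
    /group_workspaces/jasmin2/mo_gc3/swales/u-ai718/p6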

Getting data to NCI

To copy data across to NCI, swap to the server `jasmin-xfer3`, which has a fast connection to Australia. `bbcp` is the recommended transfer tool; it behaves like a parallel version of `rsync`. Copy the files to the Gadi data mover, `gadi-dm.nci.org.au`.

ssh jasmin-xfer3

bbcp -a -k -s 10 -T "ssh -x -a -oFallBackToRsh=no %I -l %U %H module load bbcp ; bbcp" -v -4 -P 5 -r --port 50000:51000 \
    /group_workspaces/jasmin2/mo_gc3/swales/u-ai718/p6/ \
    saw562@gadi-dm.nci.org.au:/g/data1/w35/saw562/HighResMIP/u-ai718/p6

The data transfer rate should be in the ballpark of 50 MB/s. Remember to clean up once the data is transferred.
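
Before deleting the workspace copy it's worth confirming the transfer actually completed. A minimal sketch, comparing file counts and total sizes on each end (paths as in the transfer above)

# On jasmin-xfer3: source file count and total size
ls /group_workspaces/jasmin2/mo_gc3/swales/u-ai718/p6/ | wc -l
du -sh /group_workspaces/jasmin2/mo_gc3/swales/u-ai718/p6/

# On Gadi: the destination should match
ls /g/data1/w35/saw562/HighResMIP/u-ai718/p6/ | wc -l
du -sh /g/data1/w35/saw562/HighResMIP/u-ai718/p6/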

rm -r /group_workspaces/jasmin2/mo_gc3/swales/u-ai718/p6/

With everything on Gadi, process the output into CF-compliant NetCDF.

ssh gadi
module load conda
cd /g/data1/w35/saw562/HighResMIP/u-ai718/p6
for file in *.pp; do
    iris2netcdf "$file"
done
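
If there are many files the conversion can take a while, so it may be better to submit it as a PBS job rather than run it on the login node. A minimal sketch, assuming a hypothetical job script convert.pbs - the project code, storage flag and resource requests are placeholders, adjust them for your own project

#!/bin/bash
#PBS -P w35
#PBS -q normal
#PBS -l ncpus=1,mem=8GB,walltime=02:00:00
#PBS -l storage=gdata/w35

# Convert each UM output file to netCDF
module load conda
cd /g/data1/w35/saw562/HighResMIP/u-ai718/p6
for file in *.pp; do
    iris2netcdf "$file"
done

Submit it with `qsub convert.pbs` and check the job log once it finishes.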