Data projects update

Revision as of 00:29, 26 April 2021 by P.petrelli

NCI is reviewing all the data projects and how they are organised. The first step was to remove "world" readable access to these projects, so everyone who wants access has to join the project. This allows NCI to keep track of dataset usage and provide usage figures to its funding agency. To make sure NCI can say exactly which datasets are used, projects such as ua8, which hosts a heterogeneous collection of datasets, will be re-organised, and their datasets will be either assigned a new project code or moved to a different collection. Below is a summary of potential changes to data projects hosting climate datasets. NB: all these changes are still under discussion. If you have any concerns, please let us know and we will pass your feedback on to NCI.
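Since access now depends on project membership, it can be useful to check quickly which of the directories mentioned on this page are readable from your account. The following is a minimal sketch, not an NCI tool: the three ua8 folders are the ones named on this page, while the rr7 and qv56 roots are assumed to follow the same /g/data/<project> pattern, and `check_read_access` is a hypothetical helper name.

```python
import os

# Directories named on this page. The rr7 and qv56 project roots are an
# assumption based on the /g/data/<project> pattern used for ua8.
DATA_DIRS = [
    "/g/data/ua8/Precipitation",
    "/g/data/ua8/Publishing",
    "/g/data/ua8/synda/CREATE-IP",
    "/g/data/rr7",   # assumed path
    "/g/data/qv56",  # assumed path
]


def check_read_access(paths):
    """Return {path: bool}: True if the path is a directory the current user can list."""
    return {
        p: os.path.isdir(p) and os.access(p, os.R_OK | os.X_OK)
        for p in paths
    }


if __name__ == "__main__":
    for path, ok in check_read_access(DATA_DIRS).items():
        status = "readable" if ok else "not readable (missing, or project not joined?)"
        print(f"{path}: {status}")
```

A directory reported as not readable may simply not exist on the system where the script runs, so the result is only a hint that you may need to join the corresponding project.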


ua8 - ARCCSS/CLEX data 

ua8 is a project whose function has changed many times over its existence. We currently use ua8 to:

  • prepare CLEX datasets for publication
  • host replicas of datasets we copied for our users. We often group here datasets for which a project does not yet exist, datasets of which we copied only a small subset, and datasets that are very small in size

Once the reorganisation is complete, only incomplete copies and data we are actively working on should be left in ua8.

Proposed changes:

C20C+ - this dataset will either have its own project code or be managed together with other similar model ensembles (i.e. CESM and CCSM model ensembles)

CMORPH, TRMM, GPCP, GHNC, FROGs, GSMaP and other precipitation data will together form a new collection with its own project code. We are currently updating all the versions available in ua8. We have also collected these datasets in the /g/data/ua8/Precipitation folder.

JRA55-do - the official version is now available from the ana4MIPs collection hosted in qv56. The ua8 copy will be phased out, with the exception of the modified version

OISST, CAMS, CERES, gimmsndvi3g, CMEMS_SeaLevel, HadISST, NCEP_Polar, GLACE_CMIP5, CanadaExtremeIndexes... we still need to find a suitable project for these datasets

GLEAM v3.5 should be moved to the CABLE data project wd9, which should also be re-organised

ostia is marked for removal

ocean_color is a new collection of satellite chlorophyll data that we are still organising.

MERRA2 - most of MERRA2 is still hosted in rr7, but since the rr7 storage is limited and we cannot request more, since 2020 we have been adding newly requested MERRA2 products to ua8

Changes executed:

COSIMA - NCI assigned a new project code "cj50" for published COSIMA data, and the data has been removed from ua8

ARCCSS and CLEX published datasets - these have been moved to project "ks32". The original directories have been moved to /g/data/ua8/Publishing and are used as staging storage for data preparation


rr7 - atmospheric and climate re-analysis

This was originally a data project managed by BoM; we are now also helping with its management. Much of the BoM legacy data was monthly regridded re-analysis data. Many of these datasets have not been updated for years and no documentation is available.

Following a meeting with NCI, these changes were proposed:

  • these datasets have now been deleted:  AWS, BIOS2, CFSv2, COREv2, CRU, ERA40, ERA40c, ERSST, JRA-55AMIP, JRA-55C, MA_APHRODITE, NCEP1, NCEP2, OAFLUX, PERSIANN, SOC, SRB
  • these datasets will be replaced by CREATE-IP in project qv56: ERA-interim (monthly), JRA-55 (including the 6hr version), CFSR, 20CR, MERRA, MERRA2 (only monthly, sub-daily still provided in rr7), GPCC (daily). They have all been quarantined except for JRA-55 (6hr)
  • these datasets will be updated by BoM: GISS, HadCRU, HadSLP2, HOAPS*, HadCRUT4, ISCCP
  • these datasets will still be available in rr7 as they are: MERRA2
  • precipitation data has been merged with the ua8 datasets and is now in /g/data/ua8/Precipitation
  • HadISST is now in ua8

We are temporarily hosting some of the CREATE-IP data in the ua8 project while waiting for NCI to take this over. Currently available are:

  • MERRA2 monthly
  • MERRA monthly
  • JRA25 monthly
  • JRA55 6-hourly data on pressure levels, and monthly
  • ERA-Interim monthly
  • 20CRv2c monthly
  • CFSR monthly

The folder is /g/data/ua8/synda/CREATE-IP
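To see what is currently available under this folder, you can list its immediate subdirectories. This is a minimal sketch: the internal layout of the folder is not described on this page, so the code makes no assumption about subdirectory names, and `list_subdirs` is a hypothetical helper name.

```python
from pathlib import Path

# Folder named on this page.
CREATE_IP_DIR = Path("/g/data/ua8/synda/CREATE-IP")


def list_subdirs(root):
    """Return the sorted names of the immediate subdirectories of root.

    Returns an empty list if root is missing or cannot be read.
    """
    try:
        return sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    except OSError:
        return []


if __name__ == "__main__":
    for name in list_subdirs(CREATE_IP_DIR):
        print(name)
```

An empty result means either that the folder does not exist on the current system or that your account cannot read it, for example because you have not joined ua8.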