Data projects update

Revision as of 17:08, 24 February 2020 by P.petrelli

NCI is reviewing all its data projects and how they are organised. The first step is to remove "world" readable access to these projects, so that everyone who wants access has to join the project. This will allow NCI to track dataset usage and report numbers to its funding agency. So that NCI can say exactly which datasets are used, projects such as ua8, which host a heterogeneous collection of datasets, will be reorganised: each dataset will either be assigned a new project code or moved to a different collection. Below is a summary of potential changes to data projects hosting climate datasets. NB: all these changes are still under discussion. If you have any concerns, please let us know and we'll pass your feedback on to NCI.

ua8 - ARCCSS data 

ua8 is a project whose function has changed many times over its lifetime. We currently use ua8 to host:

  • datasets published by the centre (both ARCCSS and CLEX)
  • replicas of datasets we copied for our users; this often includes datasets we haven't yet organised to be shared more widely, datasets of which we copied only a small subset, and datasets that are very small
  • datasets we are in the process of publishing

Once the reorganisation is complete, only incomplete copies and data we are actively working on should be left in ua8.

Proposed changes:

C20C+ - this dataset will have its own project code or will be managed together with other similar model ensembles (i.e. CESM and CCSM model ensembles)

COSIMA - NCI will assign a new project code for published COSIMA data

ARCCSS_Data and CLEX_Data - both these collections will be assigned a new project code

CMORPH, TRMM, GPCP, GHNC, GSMaP - together with other precipitation data, these will form a new collection with its own project code. We are currently updating all the available versions in ua8.

JRA55-do - the official version is now available from the ana4MIPs collection hosted in qv56. The ua8 copy will be phased out, with the exception of the modified version.

OISST, ostia, CERES, gimmsndvi3g, CMEMS_SeaLevel ... - we still need to find a suitable project for these datasets

rr7 - atmospheric and climate re-analysis

This was originally a data project managed by BoM; we are now also helping with its management. Much of the BoM legacy was monthly regridded re-analysis data. Many of these datasets have not been updated for years and there is no documentation available.

After a meeting with NCI, the following changes were proposed:

  • these datasets have now been quarantined and will be deleted:  AWS, BIOS2, CFSv2, COREv2, CRU, ERA40, ERA40c, ERSST, JRA-55AMIP, JRA-55C, MA_APHRODITE, NCEP1, NCEP2, OAFLUX, PERSIANN, SOC, SRB
  • these datasets will be replaced by CREATE-IP in project qv56: ERA-interim (monthly), JRA-55 (including the 6hr version), CFSR, 20CR, MERRA, MERRA2 (only monthly, sub-daily still provided in rr7), GPCC (daily). All of them have been quarantined except for JRA-55 (6hr).
  • these datasets will be updated by BoM: GISS, HadCRU, HadISST, HadSLP2, HOAPS, HadCRUT4, ISCCP
  • these datasets will still be available in rr7 as they are: MERRA2
  • precipitation data will be joined with the ua8 datasets and will probably be hosted in a separate project: GPCP, GSMaP, TRMM, CMORPH ...