Data projects update

Latest revision as of 19:17, 16 July 2020

NCI is reviewing all the data projects and how they are organised. The first step is to remove "world" readable access to these projects, so that everyone who wants access has to join the project. This will allow NCI to keep track of dataset usage and provide some numbers to their funding agency. To make sure they can say exactly which datasets are used, projects such as ua8, which host a heterogeneous collection of datasets, will be re-organised, and their datasets will either be assigned a new project code or moved to a different collection. Below is a summary of potential changes to data projects hosting climate datasets. NB all these changes are still under discussion; if you have any concerns please let us know and we'll pass your feedback on to NCI.
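
Once "world" readable access is removed, you can only read a dataset if you have joined the project that hosts it. The following is a minimal Python sketch for checking which projects you may still need to join. It assumes that project membership shows up as a Unix group named after the project code, which you should confirm for your own account. The helper names and the list of codes (taken from this page) are illustrative only.

```python
import grp
import os


def missing_projects(required, user_groups):
    """Return the codes in `required` that are absent from `user_groups`, in order."""
    have = set(user_groups)
    return [code for code in required if code not in have]


def current_user_groups():
    """Return the names of the Unix groups the current process belongs to."""
    names = []
    for gid in os.getgroups():
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # Group id without a name entry on this system: skip it.
            continue
    return names


if __name__ == "__main__":
    # Project codes mentioned on this page; adjust to the datasets you use.
    needed = ["ua8", "rr7", "qv56", "cj50", "ks32"]
    # Assumption: membership of a project appears as a group with the same name.
    print("Projects you may still need to join:",
          missing_projects(needed, current_user_groups()))
```

An empty list suggests you already belong to all the listed projects, under the assumption above.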

 

ua8 - ARCCSS data 

ua8 is a project whose function has changed many times over its existence. We currently use ua8 to host:

  • datasets published by the centre (both ARCCSS and CLEX)
  • replicas of datasets we copied for our users. We often group here datasets we haven't yet organised to be shared more widely, datasets for which we copied only a small subset, and datasets which are really small in size
  • datasets we are in the process of publishing

Once the reorganisation is complete, only incomplete copies and data we are actively working on should be left in ua8.

Proposed changes:

C20C+ - this dataset will have its own project code or will be managed together with other similar model ensembles (i.e. CESM and CCSM model ensembles)

CMORPH, TRMM, GPCP, GHNC, GSMaP - these will, together with other precipitation data, form a new collection with its own project code. We are currently updating all the available versions in ua8, and we have also collected the datasets in the /g/data/ua8/Precipitation folder.

JRA55-do - the official version is now available from the ana4MIPs collection hosted in qv56. The ua8 copy will be phased out, with the exception of the modified version.

OISST, ostia, CERES, gimmsndvi3g, CMEMS_SeaLevel ... we still need to find a suitable project for these datasets

Changes executed:

COSIMA - NCI assigned a new project code, "cj50", for published COSIMA data. The data has been removed from ua8.

ARCCSS_Data and CLEX_Data - both have been moved to project "ks32". The original directories have been moved to /g/data/ua8/Publishing and are used as staging storage for data preparation.

 

rr7 - atmospheric and climate re-analysis

This was originally a data project managed by BoM; we are now also helping with its management. Much of the BoM legacy is monthly regridded re-analysis data. Many of these datasets have not been updated for years, and no documentation is available.

After a meeting with NCI, the following changes were proposed:

  • these datasets have now been quarantined and will be deleted:  AWS, BIOS2, CFSv2, COREv2, CRU, ERA40, ERA40c, ERSST, JRA-55AMIP, JRA-55C, MA_APHRODITE, NCEP1, NCEP2, OAFLUX, PERSIANN, SOC, SRB
  • these datasets will be replaced by CREATE-IP in project qv56: ERA-interim (monthly), JRA-55 (including the 6hr version), CFSR, 20CR, MERRA, MERRA2 (only monthly, sub-daily still provided in rr7), GPCC (daily). They have all been quarantined except for JRA-55 (6hr)
  • these datasets will be updated by BoM: GISS, HadCRU, HadISST, HadSLP2, HOAPS*, HadCRUT4, ISCCP
  • these datasets will remain available in rr7 as they are: MERRA2
  • precipitation data will be joined with the ua8 datasets and will probably be hosted in a separate project: GPCP, GSMaP, TRMM, CMORPH ...

* HOAPS v4.0 can be accessed via OPeNDAP at https://icdc.cen.uni-hamburg.de/1/daten/atmosphere/hoaps/

We are temporarily hosting some of the CREATE-IP data in the ua8 project while waiting for NCI to take it over. Currently available are:

  • MERRA2 monthly
  • MERRA monthly
  • JRA25 monthly
  • JRA55 6-hourly data on pressure levels, and monthly
  • ERA-Interim monthly
  • 20CRv2c monthly
  • CFSR monthly

The folder is /g/data/ua8/synda/CREATE-IP
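
To see what is actually present in that folder at any given time, you can list its immediate subdirectories. This is only a sketch: the internal layout of the CREATE-IP folder is not described on this page, so the function makes no assumption about it beyond the root path given above, and the function name is illustrative.

```python
from pathlib import Path


def list_subdirs(root):
    """Return the sorted names of the immediate subdirectories of `root`.

    Returns an empty list if `root` does not exist or is not a directory,
    for example when run on a machine without access to /g/data.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Path taken from this page; reading it requires membership of ua8.
    for name in list_subdirs("/g/data/ua8/synda/CREATE-IP"):
        print(name)
```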