UNSW data requirements and tools

Revision as of 17:40, 22 March 2021 by P.petrelli

Archive data at UNSW

UNSW has recently launched a new website that allows students and researchers to manage and store their data entirely through an online workflow. The first step is to submit a Research Data Management Plan (RDMP). You can then upload your data directly through the ResData web portal, by sftp, or using a command-line tool, depending on the size and complexity of the data. Data can then be accessed through the web portal. All the necessary information, links and video tutorials are available on the UNSW data archive services website.


Student requirements in regard to data storing

ARC funding rules now include various stipulations about the correct handling of data and the collection and storage of metadata. The UNSW ResData system provides a means to meet them. As such, all supervisors of PhD students probably ought to have RDMPs linked to their grants. If so, there is an option to link those plans to any students they supervise if the data is shared.

HDR students can also set up their own accounts and data plans. Whether it is beneficial for a student to have a separate plan is best discussed between student and supervisor, e.g. if the student is working with and/or generating data that is quite distinct from the supervisor's.

In the meantime, ticking “no” to the question about RDMPs is not detrimental to upcoming PG reviews, but it should prompt both staff and students to look into this, as it is an emerging compliance issue.


When you leave

You need a valid and active zID and zPass to access and use the UNSW Data Archive. If you leave UNSW, you can gain access via the Lead Chief Investigator (LCI) and the Research Project Manager as defined in the Research Data Management Plan (RDMP). If the LCI and the Research Project Manager have also left UNSW, access to the project can be gained through the Head of School. You should contact your local IT support or the UNSW IT Service Centre to arrange access to data stored in the Data Archive.


Known issues with the transfer to data storage

Some users have reported trouble using the sftp command-line option to transfer data to the UNSW Data Storage. The official instructions are in the sftp-client-guide.

There are two issues we've been told about: 1) If you're trying to connect from one of the CCRC servers, such as Storm or Maelstrom, using the command in the instructions

sftp -oPort=8022 -r UNSW_RDS:<z-user-id>@rds.unsw.edu.au

you will get the following error message:

sftp: illegal option -- r
usage: sftp [-1Cv] [-B buffer_size] [-b batchfile] [-F ssh_config] [-o ssh_option] [-P sftp_server_path] [-R num_requests] [-S program] [-s subsystem | sftp_server] host
       sftp [user@]host[:file ...]
       sftp [user@]host[:dir[/]]
       sftp -b batchfile [user@]host

This is because the servers are using an old version of sftp that doesn't have the -r flag. The -r flag is used to transfer files recursively, so you can transfer entire directories in one go. If you omit this flag then you shouldn't have any issues connecting. If you need to transfer lots of files, alternative options are listed below.
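One possible workaround on a server without -r, not taken from the official instructions, is to bundle a directory into a single tar file, so that one plain put at the sftp prompt is enough. The sketch below assumes tar with gzip support is installed and that a single archive file is acceptable for your data plan; the function name pack_dir and the directory names are hypothetical.

```shell
# pack_dir: bundle a directory into one gzip-compressed tar file so it can
# be uploaded with a single sftp "put" (no -r flag needed).
# Usage: pack_dir <directory> <output.tar.gz>
# NOTE: pack_dir is an illustrative helper, not part of any UNSW tooling.
pack_dir() {
    src_dir=$1
    out_file=$2
    if [ ! -d "$src_dir" ]; then
        echo "pack_dir: not a directory: $src_dir" >&2
        return 1
    fi
    # -C switches to the parent directory first, so the archive stores
    # relative paths such as "run01/file.nc" instead of absolute ones.
    tar -czf "$out_file" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Example (hypothetical paths):
#   pack_dir /path/to/run01 run01.tar.gz
#   then, at the sftp> prompt:  put run01.tar.gz
```

The trade-off is that individual files can no longer be browsed or fetched separately in the archive, so this suits completed datasets better than working directories.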

2) Depending on the ssh/sftp version and Unix/Linux distribution you're using, you might hit a bug that stops you connecting to the server. A couple of users experienced this when trying to connect from their laptops/desktops. Here's an example:

sftp -oPort=8022 -r UNSW_RDS:<z-user-id>@rds.unsw.edu.au
hash mismatch
key_verify failed for server_host_key
Couldn't read packet: Connection reset by peer

IT services are working to solve this and the new release is scheduled for the 17th-18th of November. If you can't or don't want to wait, you still have other options to transfer the data. As well as the web interface, which should work just fine from your own computer, you can use FileZilla or lftp.

lftp is available on the "storm" server and can be used in the following way:

lftp -p 8022 s[[1]]

Then at the lftp prompt:

user UNSW_RDS:<your zID>

You are then prompted for your password. Then you can transfer data. To upload:

mirror -R <local dir> <archive dir>

To download:

mirror <archive dir> <local dir>
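If you repeat the same transfer often, the interactive steps can probably be folded into one command line. The sketch below only prints such a command so you can inspect it before running it. It relies on lftp's standard -p (port), -u (user) and -e (commands to run) options, and assumes that -u behaves like the user command at the prompt for this server, which has not been checked here. The function name lftp_mirror_cmd is hypothetical, and the archive URL is deliberately left as an argument you supply yourself.

```shell
# lftp_mirror_cmd: print (do not run) an lftp command line that logs in,
# performs one mirror transfer and then exits.
# Usage: lftp_mirror_cmd <upload|download> <zID> <archive url> <source dir> <target dir>
# NOTE: illustrative helper only; paths containing spaces are not handled.
lftp_mirror_cmd() {
    direction=$1
    zid=$2
    url=$3
    src=$4
    dst=$5
    case $direction in
        upload)   mirror="mirror -R" ;;   # -R reverses the mirror: local to archive
        download) mirror="mirror" ;;      # default direction: archive to local
        *)
            echo "lftp_mirror_cmd: direction must be upload or download" >&2
            return 1
            ;;
    esac
    # No password is given to -u, so lftp should prompt for it when it connects.
    # "bye" closes lftp once the mirror has finished.
    printf 'lftp -p 8022 -u "UNSW_RDS:%s" -e "%s %s %s; bye" %s\n' \
        "$zid" "$mirror" "$src" "$dst" "$url"
}

# Example (all arguments are placeholders):
#   lftp_mirror_cmd upload <your zID> <archive url> <local dir> <archive dir>
```

Printing rather than executing keeps the sketch safe to try: copy the printed line into your shell once it looks right.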

NB: put a / after the destination directory if you want lftp to create the source directory inside the target; otherwise it puts the contents of the source directly in the target.
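To make the trailing-slash rule concrete, here is a small sketch that mimics it, based on the note above and on my reading of lftp's mirror documentation. It does not call lftp at all; it only computes where the transferred files would end up. The function name mirror_effective_target and the paths in the examples are hypothetical, and edge cases such as a bare / target are not handled.

```shell
# mirror_effective_target: print the directory that would receive the files
# for "mirror <source> <target>", following the trailing-slash rule.
# Usage: mirror_effective_target <source dir> <target dir>
# NOTE: illustrative model of the rule, not a call into lftp.
mirror_effective_target() {
    src=$1
    dst=$2
    case $dst in
        */)
            # Target ends with /: the source's base name is appended,
            # so the source directory itself is recreated inside the target.
            printf '%s%s\n' "$dst" "$(basename "$src")"
            ;;
        *)
            # No trailing /: the contents of the source go straight into the target.
            printf '%s\n' "$dst"
            ;;
    esac
}

# Examples (hypothetical paths):
#   mirror_effective_target run01 /archive/project/   prints /archive/project/run01
#   mirror_effective_target run01 /archive/project    prints /archive/project
```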

NB: lftp can have issues when using ls, which can cause lftp to hang! Here: [[2]] there's a suggestion of using "ftp" instead of "sftp", i.e. lftp -p 8022 [[3]]

The IT service responsible for the data storage is testing other ways to use lftp, which has a lot of available options. Finally, they can help anyone who needs to automate a more complex data transfer to set up a java script, which is currently available on storm. If you're interested, let me know (Paola) and I will put you in contact, or send an e-mail to their helpdesk (details should be in their documentation).