Difference between revisions of "UNSW data requirements and tools"

Line 1: Line 1:
 +
 
'''Archive data at UNSW'''
 
'''Archive data at UNSW'''
UNSW has just recently launched a new website that allows students and researchers to managed and store their data completely through an online workflow.
 
First step is the submission of a Research Data Management Plan (RDMP) and then you can upload your data directly on the [https://rds.unsw.edu.au/ | RDS web portal] or by sftp or using a command line, depending on size and complexity of data. Data can then be accessed by the web portal. All the necessary information, links and video tutorials are available on the <span class="s1">[http://www.dataarchive.unsw.edu.au/ | UNSW data archive services] website.</span>
 
  
**<span class="s1">Student requirements in regard to data storing</span>'''
+
UNSW has just recently launched a new website that allows students and researchers to managed and store their data completely through an online workflow. First step is the submission of a Research Data Management Plan (RDMP) and then you can upload your data directly on the [https://rds.unsw.edu.au/ RDS web portal] or by sftp or using a command line, depending on size and complexity of data. Data can then be accessed by the web portal. All the necessary information, links and video tutorials are available on the <span class="s1">[http://www.dataarchive.unsw.edu.au/ UNSW data archive services] website.</span>
  
 +
==== &nbsp; ====
  
<span class="s1">ARC funding rules now have various stipulations about correct handling of data and collection and storage of metadata. The UNSW ResData system provides a means to do this. As such, all supervisors of PhD students probably ought to have RDMPs linked to their grants. If so, there’s an option to link plan/s to any students they supervise if the data is shared. </span>
+
==== '''<span class="s1">Student requirements in regard to data storing</span>''' ====
  
<span class="s1">HDR students can also set up their own accounts and data plans. It’s a discussion best had between students and supervisors as to whether it’s beneficial for students to have their own separate plan e.g. If you the student is working with and/or generating data that is quite distinct to the supervisor’s. </span>
+
<span class="s1">ARC funding rules now have various stipulations about correct handling of data and collection and storage of metadata. The UNSW ResData system provides a means to do this. As such, all supervisors of PhD students probably ought to have RDMPs linked to their grants. If so, there’s an option to link plan/s to any students they supervise if the data is shared.</span>
 +
 
 +
<span class="s1">HDR students can also set up their own accounts and data plans. It’s a discussion best had between students and supervisors as to whether it’s beneficial for students to have their own separate plan e.g. If you the student is working with and/or generating data that is quite distinct to the supervisor’s.</span>
  
 
<span class="s1">In the meantime, it is not detrimental to upcoming PG reviews to tick “no” to the question about RDMPs, but it should be a prompt to look into this (staff and students) as it is an emerging compliance issue.</span>
 
<span class="s1">In the meantime, it is not detrimental to upcoming PG reviews to tick “no” to the question about RDMPs, but it should be a prompt to look into this (staff and students) as it is an emerging compliance issue.</span>
  
**<span class="s1">When you leave</span>'''
+
&nbsp;
 +
 
 +
==== &nbsp; ====
 +
 
 +
==== '''<span class="s1">When you leave</span>''' ====
 +
 
 +
<span style="font-size:small;"><span style="font-family:Arial,Helvetica,sans-serif;"><span style="color: rgb(51, 51, 51);">You need a valid and active zID and zPass to access and use the UNSW Data Archive. If you leave UNSW you can gain access via the Lead Chief Investigator (LCI) and the Research Project Manager as defined in the Research Data Management Plan (RDMP) </span> <span style="color: rgb(51, 51, 51);">If the Lead Chief Investigator (LCI) and the Research Project Manager have also left UNSW, access to the project can be gained through the Head of School.</span> <span style="color: rgb(51, 51, 51);">You should contact your local IT support or the UNSW IT Service Centre to arrange access to data stored in the Data Archive.</span></span></span>
  
<span style="color: #333333; font-family: Arial,Helvetica,'Nimbus Sans L',sans-serif; font-size: 18px;">You need a valid and active zID and zPass to access and use the UNSW Data Archive. If you leave UNSW you can gain access via the <span style="font-family: Arial,Helvetica,;">Lead Chief Investigator (LCI) and the Research Project Manager as defined in the Research Data Management Plan (RDMP) </span></span>
+
&nbsp;
<span style="color: #333333; font-family: Arial,Helvetica,'Nimbus Sans L',sans-serif; font-size: 18px;">If the Lead Chief Investigator (LCI) and the Research Project Manager have also left UNSW, access to the project can be gained through the Head of School.</span>
 
<span style="color: #333333; font-family: Arial,Helvetica,'Nimbus Sans L',sans-serif; font-size: 18px;">You should contact your local IT support or the UNSW IT Service Centre to arrange access to data stored in the Data Archive. </span>
 
  
<span class="s1">'''Known issues with the transfer to data storage''' </span>
+
<span class="s1">'''Known issues with the transfer to data storage'''</span>
  
<span class="s1">Some users have reported having trouble using the sftp command line option to transfer data to the UNSW Data Storage.</span>
+
<span class="s1">Some users have reported having trouble using the sftp command line option to transfer data to the UNSW Data Storage.</span> <span class="s1">The official documentation is following the instruction for the [http://www.dataarchive.unsw.edu.au/help/sftp-client-guide#UploadDownload sftp-client-guide]</span>
<span class="s1">The official documentation is following the instruction for the [http://www.dataarchive.unsw.edu.au/help/sftp-client-guide#UploadDownload | sftp-client-guide]</span>
 
  
<span class="s1">There are two issues we've been told about: </span>
+
<span class="s1">There are two issues we've been told about: </span> <span class="s1">1) If you're trying to connect from one of the CCRC server like Storm or Maelstrom using the command in the instructions</span>
<span class="s1">1) If you're trying to connect from one of the CCRC server like Storm or Maelstrom using the command in the instructions</span>
 
  
<span class="s1"> sftp -oPort=8022 -r <span class="s2">UNSW_RDS:<z-user-id>@rds.unsw.edu.au</span></span>
+
<span class="s1">sftp -oPort=8022 -r <span class="s2">UNSW_RDS:<z-user-id>@rds.unsw.edu.au</span></span>
  
 
<span class="s1"><span class="s2">you will get the following error message:</span></span>
 
<span class="s1"><span class="s2">you will get the following error message:</span></span>
  
<span class="s3"> sftp: illegal option -- r</span>
+
<span class="s3">sftp: illegal option -- r</span> <span class="s3">usage: sftp [-1Cv] [-B buffer_size] [-b batchfile] [-F ssh_config]</span> <span class="s3">[-o ssh_option] [-P sftp_server_path] [-R num_requests]</span> <span class="s3">[-S program] [-s subsystem | sftp_server] host</span> <span class="s3">sftp [user@]host[:file ...]</span> <span class="s3">sftp [user@]host[:dir[/]]</span> <span class="s3">sftp -b batchfile [user@]host</span>
<span class="s3"> usage: sftp [-1Cv] [-B buffer_size] [-b batchfile] [-F ssh_config]</span>
 
<span class="s3"> [-o ssh_option] [-P sftp_server_path] [-R num_requests]</span>
 
<span class="s3"> [-S program] [-s subsystem | sftp_server] host</span>
 
<span class="s3"> sftp [user@]host[:file ...]</span>
 
<span class="s3"> sftp [user@]host[:dir[/]]</span>
 
<span class="s3"> sftp -b batchfile [user@]host</span>
 
  
<span class="s3">This is because the servers are using an old version of sftp that doesn't have the -r flag.</span>
+
<span class="s3">This is because the servers are using an old version of sftp that doesn't have the -r flag.</span> <span class="s3">The -r flag is used to transfer files recursively, so you can transfer entire directories in one go.</span> <span class="s3">If you omit this flag than you shouldn't have any issues connecting. If you need to transfer lots of files alternative option are listed below.</span>
<span class="s3">The -r flag is used to transfer files recursively, so you can transfer entire directories in one go.</span>
 
<span class="s3">If you omit this flag than you shouldn't have any issues connecting. If you need to transfer lots of files alternative option are listed below.</span>
 
  
<span class="s1">2) Depending on the ssh/sftp and unix/linux distribution you're using you might experience a bug that will stop you connecting to the server.</span>
+
<span class="s1">2) Depending on the ssh/sftp and unix/linux distribution you're using you might experience a bug that will stop you connecting to the server.</span> <span class="s1">A couple of users experienced this when trying to connect from their laptop/desktops</span> <span class="s1">Here's an example:</span>
<span class="s1"> A couple of users experienced this when trying to connect from their laptop/desktops</span>
 
<span class="s1">Here's an example:</span>
 
  
sftp -oPort=8022 -r <span class="s2">UNSW_RDS:<z-user-id>@rds.unsw.edu.au</span>
+
sftp -oPort=8022 -r <span class="s2">UNSW_RDS:<z-user-id>@rds.unsw.edu.au</span> <span class="s1">hash mismatch</span> <span class="s1">key_verify failed for server_host_key</span> <span class="s1">Couldn't read packet: Connection reset by peer</span>
<span class="s1"> hash mismatch</span>
 
<span class="s1"> key_verify failed for server_host_key</span>
 
<span class="s1"> Couldn't read packet: Connection reset by peer</span>
 
  
The IT services are working to solve this and the new release is scheduled for the 17th-18th of November.
+
The IT services are working to solve this and the new release is scheduled for the 17th-18th of November. If you can't or don't want to wait you still have other options to transfer the data. As well as the web interface, which should work just fine from your own computer, you can use filezilla or lftp
If you can't or don't want to wait you still have other options to transfer the data. As well as the web interface, which should work just fine from your own computer, you can use filezilla or lftp
 
  
 
lftp is actually available on the "storm" server and can be used in the following way
 
lftp is actually available on the "storm" server and can be used in the following way
  
<span class="s1">lftp -p 8022 s<span class="s2">[[ftp://rds.unsw.edu.au]]</span></span>
+
<span class="s1">lftp -p 8022 s<span class="s2">[[ftp://rds.unsw.edu.au [1]]]</span></span> <span class="s3">Then at the lftp prompt:</span> <span class="s3">user UNSW_RDS:<your zID></span> <span class="s3">You are then prompted for your password.</span> <span class="s3">Then you can transfer data. To upload:</span> <span class="s3">mirror –R <local dir> <archive dir></span>
<span class="s3">Then at the lftp prompt:</span>
 
<span class="s3">user UNSW_RDS:<your zID></span>
 
<span class="s3">You are then prompted for your password.</span>
 
<span class="s3">Then you can transfer data. To upload:</span>
 
<span class="s3">mirror –R <local dir> <archive dir></span>
 
  
<span class="s3">To download:</span>
+
<span class="s3">To download:</span> <span class="s3">mirror <archive dir> <local dir></span>
<span class="s3">mirror <archive dir> <local dir></span>
 
  
 
<span class="s1">NB you need to put a / after the destination directory if you want it to create the source directory in the target, otherwise it puts the contents of the source in the target</span>
 
<span class="s1">NB you need to put a / after the destination directory if you want it to create the source directory in the target, otherwise it puts the contents of the source in the target</span>
  
NB lftp can have issues when using ls which cause lftp to hang!!
+
NB lftp can have issues when using ls which cause lftp to hang!! Here: [[http://www.mail-archive.com/lftp@uniyar.ac.ru/msg03949.html [2]]] there's a suggestion of using "ftp" instead of "sftp" ie lftp -p 8022 <span class="s2">[[ftp://rds.unsw.edu.au [3]]]</span> The IT service responsible for the data storage is testing other ways to use lftp, which has a lot of available options. Finally he can help anyone who needs a more complex data transfer to be automated to use set up a java script which is currently available on storm. If you're interested let me know (Paola) and I will put you in contact. Or send an e-mail to their helpdesk (details should be in their documentation).
Here: [[http://www.mail-archive.com/lftp@uniyar.ac.ru/msg03949.html]] there's a suggestion of using "ftp" instead of "sftp"
 
ie
 
lftp -p 8022 <span class="s2">[[ftp://rds.unsw.edu.au]]</span>
 
The IT service responsible for the data storage is testing other ways to use lftp, which has a lot of available options. Finally he can help anyone who needs a more complex data transfer to be automated to use set up a java script which is currently available on storm. If you're interested let me know (Paola) and I will put you in contact. Or send an e-mail to their helpdesk (details should be in their documentation).
 
  
[[Category: Institution data requirements]]
+
[[Category:Institution data requirements]]

Revision as of 20:21, 23 November 2020

Archive data at UNSW

UNSW has recently launched a website that lets students and researchers manage and store their data entirely through an online workflow. The first step is to submit a Research Data Management Plan (RDMP); you can then upload your data directly through the RDS web portal (https://rds.unsw.edu.au/), by sftp, or using a command line, depending on the size and complexity of the data. The data can then be accessed through the web portal. All the necessary information, links and video tutorials are available on the UNSW data archive services website (http://www.dataarchive.unsw.edu.au/).

 

Student requirements in regard to data storing

ARC funding rules now have various stipulations about correct handling of data and collection and storage of metadata. The UNSW ResData system provides a means to do this. As such, all supervisors of PhD students probably ought to have RDMPs linked to their grants. If so, there’s an option to link plan/s to any students they supervise if the data is shared.

HDR students can also set up their own accounts and data plans. Whether it is beneficial for a student to have a separate plan is best discussed between student and supervisor, e.g. if the student is working with and/or generating data that is quite distinct from the supervisor's.

In the meantime, it is not detrimental to upcoming PG reviews to tick “no” to the question about RDMPs, but it should be a prompt to look into this (staff and students) as it is an emerging compliance issue.

 

 

When you leave

You need a valid and active zID and zPass to access and use the UNSW Data Archive. If you leave UNSW you can gain access via the Lead Chief Investigator (LCI) and the Research Project Manager as defined in the Research Data Management Plan (RDMP). If the LCI and the Research Project Manager have also left UNSW, access to the project can be gained through the Head of School. You should contact your local IT support or the UNSW IT Service Centre to arrange access to data stored in the Data Archive.

 

Known issues with the transfer to data storage

Some users have reported trouble using the sftp command line option to transfer data to the UNSW Data Storage. The official instructions are in the sftp-client-guide (http://www.dataarchive.unsw.edu.au/help/sftp-client-guide#UploadDownload).

There are two issues we've been told about:

1) If you're trying to connect from one of the CCRC servers, like Storm or Maelstrom, using the command in the instructions

sftp -oPort=8022 -r UNSW_RDS:<z-user-id>@rds.unsw.edu.au

you will get the following error message:

sftp: illegal option -- r
usage: sftp [-1Cv] [-B buffer_size] [-b batchfile] [-F ssh_config]
            [-o ssh_option] [-P sftp_server_path] [-R num_requests]
            [-S program] [-s subsystem | sftp_server] host
       sftp [user@]host[:file ...]
       sftp [user@]host[:dir[/]]
       sftp -b batchfile [user@]host

This is because the servers are running an old version of sftp that doesn't have the -r flag. The -r flag transfers files recursively, so that you can transfer entire directories in one go. If you omit this flag then you shouldn't have any issues connecting. If you need to transfer lots of files, alternative options are listed below.
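For the first issue, one possible workaround when -r is unavailable is to generate the list of sftp commands for a directory tree and feed it to a plain sftp session. The following is only a minimal sketch, not part of the official instructions: the function name make_sftp_batch and the file name batch.txt are hypothetical, the remote target directory is assumed to exist already, and the installed sftp is assumed to accept double-quoted paths.

```shell
# Print sftp commands that recreate a local directory tree on the remote side.
# Usage: make_sftp_batch <local_dir> <remote_dir>
# Assumption: <remote_dir> already exists on the server.
make_sftp_batch() {
    local_dir=${1%/}     # strip one trailing slash, if any
    remote_dir=${2%/}

    # Sub-directories first; a C-locale sort lists parents before children.
    (cd "$local_dir" && find . -type d ! -name . | LC_ALL=C sort) |
    while IFS= read -r d; do
        printf 'mkdir "%s/%s"\n' "$remote_dir" "${d#./}"
    done

    # Then one put command per regular file.
    (cd "$local_dir" && find . -type f | LC_ALL=C sort) |
    while IFS= read -r f; do
        printf 'put "%s/%s" "%s/%s"\n' \
            "$local_dir" "${f#./}" "$remote_dir" "${f#./}"
    done
}

# Example (not executed here):
#   make_sftp_batch ./results archive/results > batch.txt
#   sftp -oPort=8022 UNSW_RDS:<z-user-id>@rds.unsw.edu.au < batch.txt
```

The commands are fed on standard input rather than with -b batchfile because batch mode generally expects non-interactive authentication, whereas a redirected session should still prompt for the zPass on the terminal. Check the generated file before running it, since file names containing double quotes or newlines are not handled.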

2) Depending on the ssh/sftp version and the Unix/Linux distribution you're using, you might hit a bug that stops you connecting to the server. A couple of users experienced this when trying to connect from their laptops or desktops. Here's an example:

sftp -oPort=8022 -r UNSW_RDS:<z-user-id>@rds.unsw.edu.au
hash mismatch
key_verify failed for server_host_key
Couldn't read packet: Connection reset by peer

IT services are working to solve this, and the new release is scheduled for the 17th-18th of November. If you can't or don't want to wait, you still have other options for transferring the data. As well as the web interface, which should work just fine from your own computer, you can use FileZilla or lftp.

lftp is available on the "storm" server and can be used as follows:

lftp -p 8022 sftp://rds.unsw.edu.au

Then at the lftp prompt:

user UNSW_RDS:<your zID>

You are then prompted for your password. Once logged in, you can transfer data. To upload:

mirror -R <local dir> <archive dir>

To download:

mirror <archive dir> <local dir>

NB: put a / after the destination directory if you want lftp to create the source directory inside the target; otherwise it puts the contents of the source directly in the target.
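The interactive upload steps can probably also be collapsed into a single lftp command using its -u (user) and -e (commands to run) options. The sketch below only prints the command so that you can inspect it first; the function name lftp_upload_cmd and the zID z1234567 are hypothetical placeholders, and it has not been checked against the RDS server.

```shell
# Print (do not run) an lftp command line that uploads <local_dir>
# into <archive_dir> on the RDS server.
# Usage: lftp_upload_cmd <zID> <local_dir> <archive_dir>
# Assumption: paths contain no spaces or double quotes.
lftp_upload_cmd() {
    zid=$1
    local_dir=$2
    archive_dir=$3
    printf 'lftp -p 8022 -u "UNSW_RDS:%s" -e "mirror -R %s %s; quit" sftp://rds.unsw.edu.au\n' \
        "$zid" "$local_dir" "$archive_dir"
}

# Example: the trailing / on the target asks lftp to create "results" inside "archive".
lftp_upload_cmd z1234567 results archive/
```

Because no password is given after the user name, lftp should still prompt for it when the printed command is run.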

NB: lftp can hang when you use ls. A mailing-list post (http://www.mail-archive.com/lftp@uniyar.ac.ru/msg03949.html) suggests using "ftp" instead of "sftp", i.e.

lftp -p 8022 ftp://rds.unsw.edu.au

The IT service responsible for the data storage is testing other ways to use lftp, which has a lot of available options. They can also help anyone who needs to automate a more complex data transfer to set up a java script, which is currently available on storm. If you're interested, let me (Paola) know and I will put you in contact, or send an e-mail to their helpdesk (details should be in their documentation).