Difference between revisions of "Data induction"

 
(3 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
{{Working on}}
 
  
 
 
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">As a researcher or a postgraduate student in climate science, you do most of your work on a computer. If you were working in a laboratory, you would follow a set of rules and protocols and document your experiments so that you could reproduce them later or describe them to others in a detailed and exact way. When we are sitting at our desks, it is easy to forget that what we are doing will eventually be shared with others. When you write a paper, you are used to receiving a set of guidelines, to justifying and referencing everything you write, and to making sure it is readable and well presented. We are not used to sharing our data and code, but they are just another [[Why_should_I_care?|research product]]. As such, they are also covered by requirements; however, many of these have been introduced only recently, and the guidelines are continuously evolving.</span></span>
  
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">This is why we prepared this data induction, as an attempt to clarify what your responsibilities are, by covering all the applicable data policies and requirements, but also providing a set of guidelines and services which will help you satisfy them.</span></span>
+
This is why we prepared this data induction: it aims to clarify your responsibilities by covering all the applicable data policies and requirements, and it also provides a set of guidelines and services to help you satisfy them.
  
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">This induction is structured in three&nbsp;parts, starting from your responsibility as a researcher.</span></span>
+
This induction is structured in three&nbsp;parts, starting from your responsibility as a researcher.
  
=== <span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">'''Part 1: Policy'''&nbsp;</span></span> ===
+
=== '''Part 1: Policy'''&nbsp; ===
  
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">In the first part we are covering the ARC and institutional requirements, the&nbsp;journal publishers' requirements and the&nbsp;CLEX data policy.&nbsp;Trying to put all these requirements together can be confusing and overwhelming. The key is to&nbsp;remember that the&nbsp;CLEX data policy is an interpretation of how these requirements can be applied to&nbsp;climate science and all our guidelines are aiming to help you satisfying them. We also try to cover gaps in data services and are&nbsp;always available to help you.</span></span>
+
The first part covers the [[Institution_data_requirements|ARC and institutional requirements]], the [[Publisher_policies|journal publishers' requirements]] and the [[CLEX_Data_policy|CLEX data policy]]. Trying to put all these requirements together can be confusing and overwhelming. The key is to remember that the CLEX data policy is an interpretation of how these requirements can be applied to climate science, and all our guidelines aim to help you satisfy them. We also try to cover gaps in data services and are always available to help you.
  
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">All the policies are based upon the [[FAIR|FAIR]] principles so this is the best place to start to understand better what your data and code should look like and why this is important.</span></span>
+
All the policies are based on the [[FAIR|FAIR]] principles, so these are the best place to start to better understand what your data and code should look like and why this is important.
  
=== <span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">'''Part 2: Publishing'''</span></span> ===
+
=== '''Part 2: Publishing''' ===
  
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">You can satisfy most of the requirements by publishing your data, which is why in the second part we focus on publishing. Publishing is also the best way of sharing your data with others. While just putting your data online somewhere might seems the quickest solution, if your data is not properly described and formatted, it is of little use.</span></span>
+
You can satisfy most of the requirements by publishing your data, which is why the second part focuses on publishing. Publishing is also the best way of sharing your data with others. While just putting your data online somewhere might seem the quickest solution, data that is not properly described and formatted is of little use.
  
=== <span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">'''Part 3: Best practices'''</span></span> ===
+
=== '''Part 3: Best practices''' ===
  
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">The final part is covering the best practices to manage your data and code. Publishing is just the last step in your data workflow, proper planning and adoption of best practices from the start of your project makes publishing much easier as well as optimising&nbsp;the use of our shared storage and computational resources.</span></span>
+
The final part covers the best practices for managing your data and code. Publishing is just the last step in your data workflow: [[Data_management|proper planning]] and adopting best practices from the start of your project make publishing much easier and optimise the use of our shared storage and computational resources.
  
=== <span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">'''Induction outcomes'''</span></span> ===
+
=== '''Induction outcomes''' ===
  
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">At the end of this induction you should have all the information necessary to manage, share and publish your data. In particular you will:</span></span>
+
At the end of this induction you should have all the information necessary to manage, share and publish your data. In particular, you will:
  
*<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">know what data you are required to archive, share or publish according to your institution, the ARC or in order to publish a paper.&nbsp;&nbsp;</span></span>
+
*know what data you are required to archive, share or publish by [[Institution_data_requirements|your institution, the ARC]] or in order [[Publisher_policies|to publish a paper]]
*<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">be familiar with the key data concepts and terms</span></span>
+
*be familiar with the [[Data_terminology|key data concepts and terms]]
*<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">know how to choose and apply data and software licensing</span></span>
+
*know how to choose and apply [[Open_access_license|data and software licensing]]
*<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">be familiar with and know how to use conventions and standard&nbsp;used by the climate community</span></span>
+
*be familiar with and know how to use [[Conventions|conventions]]&nbsp;and other [[Controlled_vocabularies|standards]] used by the climate community  
*<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">know&nbsp;how to manage storage and computational resources available to you</span></span>
+
*know&nbsp;how to manage [[Storage|storage]] and [[Accounting_at_NCI|computational resources]] available to you  
*<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">be ready when asked to publish data to make all the relevant choices:</span></span>
+
*be ready to make all the relevant choices when asked to publish data and code:
**<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">which data to publish,</span></span>
+
**[[Which_data_should_I_publish|which data to publish]],  
**<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">where to publish,</span></span>
+
**[[Publishing_options|where to publish]],  
**<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">how to prepare your data</span></span>
+
**[[Data_management|how to prepare your data]]    
**<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">and finally how to advertise your newly published dataset</span></span>    
 
  
&nbsp;
+
Managing your data properly, in a way that makes your research easier to reproduce and share and therefore more valuable, is not complicated; it will eventually become just another working habit. However, you might encounter concepts and terminology that are new to you. To help you, we created a list of definitions of the most commonly used [[Data_terminology|terms, key concepts and acronyms]].
  
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Managing your data&nbsp;properly, in a way that makes your research easier to reproduce and share and so more valuable, it is not complicated, it will eventually become just another working habit. However, you might encounter concepts and terminology that are new for you.&nbsp;To help you we listed the definitions of all the most commonly used [[Data_terminology|terms, key concepts and&nbsp;acronyms]].</span></span>
+
[[Category:Data induction]]

Latest revision as of 20:24, 2 August 2021

As a researcher or a postgraduate student in climate science, you do most of your work on a computer. If you were working in a laboratory, you would follow a set of rules and protocols and document your experiments so that you could reproduce them later or describe them to others in a detailed and exact way. When we are sitting at our desks, it is easy to forget that what we are doing will eventually be shared with others. When you write a paper, you are used to receiving a set of guidelines, to justifying and referencing everything you write, and to making sure it is readable and well presented. We are not used to sharing our data and code, but they are just another research product. As such, they are also covered by requirements; however, many of these have been introduced only recently, and the guidelines are continuously evolving.

This is why we prepared this data induction: it aims to clarify your responsibilities by covering all the applicable data policies and requirements, and it also provides a set of guidelines and services to help you satisfy them.

This induction is structured in three parts, starting from your responsibility as a researcher.

Part 1: Policy 

The first part covers the ARC and institutional requirements, the journal publishers' requirements and the CLEX data policy. Trying to put all these requirements together can be confusing and overwhelming. The key is to remember that the CLEX data policy is an interpretation of how these requirements can be applied to climate science, and all our guidelines aim to help you satisfy them. We also try to cover gaps in data services and are always available to help you.

All the policies are based on the FAIR principles, so these are the best place to start to better understand what your data and code should look like and why this is important.

Part 2: Publishing

You can satisfy most of the requirements by publishing your data, which is why the second part focuses on publishing. Publishing is also the best way of sharing your data with others. While just putting your data online somewhere might seem the quickest solution, data that is not properly described and formatted is of little use.

Part 3: Best practices

The final part covers the best practices for managing your data and code. Publishing is just the last step in your data workflow: proper planning and adopting best practices from the start of your project make publishing much easier and optimise the use of our shared storage and computational resources.

Induction outcomes

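At the end of this induction you should have all the information necessary to manage, share and publish your data. In particular, you will:

*know what data you are required to archive, share or publish by your institution, the ARC or in order to publish a paper
*be familiar with the key data concepts and terms
*know how to choose and apply data and software licensing
*be familiar with and know how to use conventions and other standards used by the climate community
*know how to manage storage and computational resources available to you
*be ready to make all the relevant choices when asked to publish data and code:
**which data to publish,
**where to publish,
**how to prepare your data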

Managing your data properly, in a way that makes your research easier to reproduce and share and therefore more valuable, is not complicated; it will eventually become just another working habit. However, you might encounter concepts and terminology that are new to you. To help you, we created a list of definitions of the most commonly used terms, key concepts and acronyms.