Difference between revisions of "FAIR - Reusable"

Latest revision as of 19:44, 25 July 2021

Data should be accompanied by enough information on how it was collected or processed to establish its quality and hence make it usable by others. It should have a license that allows and facilitates reuse.

Data has a detailed provenance

Provenance indicates the history of your data, so it should include as much information as possible on the code, datasets and processes used to produce the data. Provenance can be recorded as a separate document, but it is often composed of several elements. The metadata attached to the publication is part of the provenance, as is metadata available in the data files, any relevant technical report, links to source code and data documentation and other references. They can all contribute to your data provenance. 

Provenance is central to data reproducibility and hence to building trust in the data.
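
As a minimal sketch of what a separate provenance document could contain, the Python below gathers the code, input datasets and processing steps into one JSON record. The field names, the URL, the version tag and the file name are all hypothetical, chosen for illustration only; a community provenance standard may prescribe different ones.

```python
import hashlib
import json
from datetime import datetime, timezone


def checksum(data: bytes) -> str:
    """SHA-256 digest, so a reader can verify that an input is unchanged."""
    return hashlib.sha256(data).hexdigest()


def build_provenance(code_url, code_version, inputs, steps):
    """Collect the pieces of a dataset's history into one record."""
    return {
        "created": datetime.now(timezone.utc).isoformat(),
        "code": {"url": code_url, "version": code_version},
        "inputs": inputs,
        "processing_steps": steps,
    }


record = build_provenance(
    code_url="https://example.org/my-analysis-code",  # hypothetical URL
    code_version="v1.2.0",  # hypothetical version tag
    inputs=[
        # hypothetical input file; the bytes stand in for its real contents
        {"name": "raw_observations.csv", "sha256": checksum(b"example bytes")}
    ],
    steps=["removed missing values", "computed monthly means"],
)

# Write the record out as a document that can travel with the data.
print(json.dumps(record, indent=2))
```

A record like this complements, rather than replaces, the metadata in the data files and the publication.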

Data has a license

A dataset without a license cannot be used. A potential user would have to contact the owner and ask for permission to use the data. A license immediately tells a user what can be done with the data.

The license should be clear. It is always better to use an internationally recognised license rather than a custom one. Widely used licenses are easily recognised by other users, so they know what the license covers without having to read it. These licenses are also more machine-readable, as software that runs queries on repositories will recognise them.
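
To illustrate why a recognised license is easier for software to handle, here is a rough Python sketch that looks up a license identifier in a small table. The identifiers follow the SPDX naming style, but the table is only an illustrative subset, and the `license` metadata key and the `describe_license` function are assumptions made for this example.

```python
# Small, illustrative subset of SPDX-style license identifiers (not a full list).
RECOGNISED_LICENSES = {
    "CC0-1.0": "public domain dedication",
    "CC-BY-4.0": "reuse allowed with attribution",
    "CC-BY-SA-4.0": "reuse allowed with attribution, share-alike",
    "CC-BY-NC-4.0": "non-commercial reuse with attribution",
}


def describe_license(metadata: dict) -> str:
    """Summarise what a dataset's license allows, if it can be recognised."""
    # "license" is a hypothetical metadata key used for this sketch.
    license_id = (metadata.get("license") or "").strip()
    if not license_id:
        return "no license: contact the owner for permission"
    if license_id in RECOGNISED_LICENSES:
        return RECOGNISED_LICENSES[license_id]
    return "unrecognised license: terms must be read manually"


print(describe_license({"license": "CC-BY-4.0"}))
print(describe_license({"license": "My Lab Custom Terms v2"}))
print(describe_license({}))
```

Only the first dataset can be handled automatically; the custom license and the missing license both require a person to step in.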

Data uses community standards

Data that uses file formats, standards and conventions adopted by the community is more reusable. Applying discipline conventions makes the data more readable both by other researchers and by software developed for the same community. Discipline-specific software modules often adopt the same conventions and make assumptions about how data is structured or how variables are named.

Using accepted vocabularies, for example to name variables, reduces the risk of the data being misinterpreted and misused.
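
A check of variable names against an accepted vocabulary can be sketched in a few lines of Python. The vocabulary below is a tiny excerpt written in the style of the CF standard names used in climate science, which is an assumption about the community in question; a real check would load that community's full list.

```python
# Illustrative excerpt of a controlled vocabulary (CF standard name style).
ACCEPTED_NAMES = {
    "air_temperature",
    "precipitation_flux",
    "sea_surface_temperature",
}


def unknown_variables(variable_names):
    """Return the names not found in the accepted vocabulary, in input order."""
    return [name for name in variable_names if name not in ACCEPTED_NAMES]


# Hypothetical variable names taken from a data file.
names_in_file = ["air_temperature", "temp2", "precipitation_flux", "rain"]
print(unknown_variables(names_in_file))  # prints ['temp2', 'rain']
```

Names flagged this way are the ones most likely to be misread by other researchers or skipped by community software.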


Related pages

Open Access licenses

Standard Conventions

Provenance

Data management plans