Difference between revisions of "Data terminology"


Revision as of 22:24, 11 July 2021

This page lists some key data management concepts, along with frequently recurring terms and acronyms.

NB: this is a work in progress, so the list is not yet exhaustive.

 

Key concepts    

FAIR

The FAIR Data Principles:

  • Findable: data should be easy to find and identify.
  • Accessible: data should be openly accessible whenever possible.
  • Interoperable: data should be well formatted and use discipline conventions and vocabularies, for both the data itself and the metadata used to describe it.
  • Reusable: data should be accompanied by enough information on how it was collected or processed to guarantee its quality and hence make it usable by others.

File Management

Methods for storing, organising, naming, discovering and retrieving files in a structured, consistent manner.
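As an illustration of consistent naming, here is a minimal Python sketch. The pattern `project_description_YYYYMMDD_vNN.ext` and the helper name `build_filename` are assumptions made for this example, not a convention prescribed by this page; adapt them to your discipline or group.

```python
import re
from datetime import date


def build_filename(project: str, description: str, when: date, version: int, ext: str) -> str:
    """Build a name like 'project_description_YYYYMMDD_v01.ext' (hypothetical pattern)."""

    def slug(text: str) -> str:
        # Lowercase the text and collapse every run of other characters into one hyphen
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # Zero-padded versions and ISO-style dates keep alphabetical order equal to logical order
    return f"{slug(project)}_{slug(description)}_{when:%Y%m%d}_v{version:02d}.{ext.lstrip('.')}"


print(build_filename("Rainfall Study", "Daily totals (raw)", date(2021, 7, 11), 3, ".csv"))
```

Generating names with one function, rather than typing them by hand, is one way to keep a project's files sortable and searchable.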

Data Storage

The location and/or system you use to store your data during a research project. This could include disks on personal computers, disk or tape on a shared server, external storage devices such as hard drives or SD cards, networked drives managed by your institution, and commercial or research cloud storage.

Data Back Up

The process of saving your data to protect against data loss. This can be an automatic process, where the storage location automatically retains previous versions of your data, or a manual process, where you need to actively save the data in another location.
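A manual back up can be sketched in Python as a copy followed by a checksum comparison. The function names `sha256_of` and `back_up` are hypothetical, and this is only one possible approach, assuming the backup location is reachable as an ordinary directory.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy 'source' into 'backup_dir' and verify that the copy matches the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also tries to preserve timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Backup of {source} does not match the original")
    return target
```

Comparing checksums after the copy guards against a silently corrupted backup, which would otherwise only be discovered when the data is needed.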

Data Archiving or Preservation

The process of putting your data in long-term storage, for a minimum of 5 years, following the completion of a project or publication. This includes identifying who can access the data and how it can be accessed. Many institutions have repositories that staff and students can use.

Data Sharing

Making your data available for use by other researchers for their own research projects. This requires good-quality metadata that records the data source and any changes made, so that the data can be reused. The best way to share data is to publish it: published data is more discoverable and is assigned a persistent identifier (such as a DOI), which helps others to cite it.

Data Provenance

Data provenance describes the journey data goes through. It documents the evolution of a dataset from its original source, including all the processes and methodology by which it was produced.
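One lightweight way to keep such a record is a structured log saved alongside the data. The sketch below is an assumption for illustration only: the class names, field names and example values are hypothetical, not a standard provenance schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProvenanceStep:
    action: str      # what was done, e.g. "regridded to 1 degree"
    tool: str        # software or script used
    notes: str = ""  # any extra detail needed to repeat the step


@dataclass
class ProvenanceRecord:
    dataset: str
    source: str      # where the original data came from
    steps: list = field(default_factory=list)

    def add_step(self, action: str, tool: str, notes: str = "") -> None:
        self.steps.append(ProvenanceStep(action, tool, notes))

    def to_json(self) -> str:
        # asdict() converts the nested step objects into plain dictionaries
        return json.dumps(asdict(self), indent=2)


record = ProvenanceRecord(dataset="example-rainfall", source="hypothetical station archive")
record.add_step("removed duplicate rows", "clean.py")
record.add_step("converted units to mm/day", "convert.py", notes="input was in inches")
print(record.to_json())
```

Keeping the steps in order, with the tool used for each, lets someone else trace how the final dataset was derived from the original source.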

Data Management Plan (DMP)

A tool to help you manage the data for a specific research project. It can take different forms depending on the stage of your project; for example, a DMP submitted with a grant application will be different from the DMP required to publish your data. A DMP evolves with your project and is a useful place to record your data provenance.

Open Access

A set of principles and a range of practices through which research outputs are distributed online, free of cost or other access barriers.

 

Other terms

attribution - the act of recognising the author/s of a piece of work that you used in your research. It is a common requirement of licenses.

citation - the way you attribute a piece of work. It should contain all the information necessary to locate the original work.

copyright - a form of intellectual property meant to protect the right of the author of a creative work to control how the work is used. More comprehensive but readable information on copyright is available online (https://smartcopying.edu.au/guidelines/copyright-basics/what-is-copyright/).

license - a copyright license is a legal document stating what someone else is allowed or not allowed to do with a research product.
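To make the attribution and citation entries above concrete, here is a small Python sketch that assembles a data citation from its parts. The layout "Authors (Year): Title. Publisher. DOI link" is an assumption, since repositories and journals each prescribe their own style, and every value in the example, including the DOI, is a made-up placeholder.

```python
def format_data_citation(authors, year, title, publisher, doi):
    """Join the pieces needed to locate a dataset into one citation string.

    The ordering and punctuation are illustrative only.
    """
    author_text = "; ".join(authors)
    # https://doi.org/ is the public DOI resolver; the DOI itself is supplied by the caller
    return f"{author_text} ({year}): {title}. {publisher}. https://doi.org/{doi}"


# All values below are hypothetical placeholders, not a real dataset
print(format_data_citation(
    ["Smith, J.", "Lee, K."],
    2021,
    "Example rainfall dataset",
    "Example Repository",
    "10.1234/example",
))
```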

 

Acronyms

ARDC (ex ANDS) - Australian Research Data Commons (https://ardc.edu.au) is an NCRIS project that aims to give the Australian research community and industry access to nationally significant, data-intensive digital research infrastructure, platforms, skills and collections of high-quality data.

CC - Creative Commons (https://creativecommons.org/), a non-profit organisation that produces licenses to encourage sharing of knowledge, commonly used for data products.

CF - Climate and Forecast conventions (http://climate-cms.wikis.unsw.edu.au/Conventions#CF_Conventions), used to set metadata attributes in NetCDF files.

DMP - Data Management Plan

FAIR - see definition in key concepts

 

RDA - Research Data Australia (https://researchdata.edu.au/) is the data discovery service of the Australian Research Data Commons (ARDC).

RDA - Research Data Alliance (https://www.rd-alliance.org/) is a global community-driven initiative with the goal of building the social and technical infrastructure to enable open sharing and re-use of data.
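As a rough illustration of the CF entry above, the Python sketch below shows the kind of metadata attributes the conventions define, held in plain dictionaries rather than an actual NetCDF file. The variable, its values and the helper `missing_attrs` are hypothetical; the list of attributes it checks is an assumption chosen for this example, not the full set of CF requirements.

```python
# Attributes for a hypothetical air temperature variable
tas_attrs = {
    "standard_name": "air_temperature",  # name taken from the CF standard name table
    "long_name": "Near-surface air temperature",
    "units": "K",
}

# File-level (global) attribute stating which CF version the file claims to follow
global_attrs = {"Conventions": "CF-1.8"}


def missing_attrs(attrs, expected=("standard_name", "units")):
    """Return the expected attribute names that are absent from 'attrs'.

    The default 'expected' tuple is an assumption made for this sketch.
    """
    return [name for name in expected if name not in attrs]


print(missing_attrs(tas_attrs))
print(missing_attrs({"units": "K"}))
```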