Difference between revisions of "CF checker"


Latest revision as of 01:49, 14 July 2021

There are a few options for checking whether your files are CF compliant. You can use online checkers such as these:

https://compliance.ioos.us/index.html

http://cfconventions.org/compliance-checker.html

You upload your netCDF file on the web and you get back a report.

This is fine if you have one small file, but if you want to check a lot of files you have to use a checker on your own computer. We installed the IOOS checker (https://github.com/ioos/compliance-checker) in our conda environments. The IOOS checker is Python based and checks for both the CF and the ACDD conventions. NCI uses this checker as part of their publishing process, since both conventions are required to publish on their data portal.

Running IOOS at NCI

Using it is very simple: just load the conda module:

$ module use /g/data/hh5/public/modules

$ module load conda/analysis3

You can now call the checker on a netCDF file:

$ cchecker.py test_file.nc

which will produce:

Running Compliance Checker on the datasets from: ['test_file.nc']
--------------------------------------------------------------------------------
                        IOOS Compliance Checker Report                         
                                    acdd:1.3                                    
http://wiki.esipfed.org/index.php?title=Category:Attribute_Conventions_Dataset_Discovery
--------------------------------------------------------------------------------
                               Corrective Actions                               
test_file.nc has 4 potential issues
                               Highly Recommended                               
--------------------------------------------------------------------------------
Global Attributes
* Conventions does not contain 'ACDD-1.3'
* summary not present
.........

As the example shows, when no option is passed the tool checks for ACDD 1.3 (the latest version) compliance and prints the report to the screen.

$ cchecker.py --help

will list all the available options. The main ones are:

 -t/--test  to choose the test;

     e.g. -t=cf  will test the file against the latest available version of the CF conventions.

 -l/--list  to list the available tests.

 -c/--criteria  to set the level at which to run the test;

      possible options are < lenient, normal, strict >, defaults to <normal>

 -f/--format  the output format;

      possible options are < text, html, json, json_new >

 -o/--output optional file/s to redirect output to.

 

Checking multiple files

You can also run the checker on several files at one time. For example:

$ cchecker.py -t=cf -c strict -o cf_test.txt test_data/test*.nc 

This command tests all the files matching <test_data/test*.nc> against the CF conventions at the <strict> level and writes the report to the file cf_test.txt.
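If you want to launch the same check from a Python script rather than from the shell, the command can be assembled as an argument list. The following is only a minimal sketch: the function name build_cchecker_command and its parameters are hypothetical, and it uses only the cchecker.py options described above. Note that the shell normally expands test*.nc for you, so a script has to do that expansion itself with glob.

```python
import glob
import shlex


def build_cchecker_command(files, test="cf", criteria="strict",
                           output_format=None, output_file=None,
                           executable="cchecker.py"):
    """Return the argument list for one checker run over several files.

    Hypothetical helper: only the -t, -c, -f and -o options listed
    on this page are used.
    """
    if not files:
        raise ValueError("no input files given")
    cmd = [executable, f"-t={test}", "-c", criteria]
    if output_format is not None:
        cmd += ["-f", output_format]
    if output_file is not None:
        cmd += ["-o", output_file]
    return cmd + list(files)


if __name__ == "__main__":
    # Expand the pattern ourselves, since no shell is involved.
    matches = sorted(glob.glob("test_data/test*.nc"))
    if matches:
        cmd = build_cchecker_command(matches, output_file="cf_test.txt")
        print(shlex.join(cmd))
        # To actually run the checker:
        # import subprocess; subprocess.run(cmd, check=True)
    else:
        print("no files match test_data/test*.nc")
```

Building a list instead of a single string avoids quoting problems when file names contain spaces.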

When you test multiple files, the tool prints a report for each file, so if you have a lot of files you will end up with a long and repetitive report.

We created a simple Python script you can run to summarise the report so you get each error or warning reported only once.

You can access the script here:

https://gist.github.com/paolap/e37447c9c00e8894437b13a76021c857

This script creates a summary of CF/ACDD tests run by the IOOS checker on multiple files.

First run the checker generating JSON output, for example:

$ cchecker.py -t=cf -c strict -f json_new -o cf_test.json test_data/test*.nc  

Then pass the JSON file as input to the script:

$ python parse_checker.py cf_test.json

The reports will be summarised:

Results for cf checks
3 files were checked

High priority results

3 files failed:
§2.2 Data Types: The variable time failed because the datatype is int64

3 files failed:
§3.3 Standard Name: Attribute long_name or/and standard_name is highly recommended for variable time

...

If you want to save the summary in a file, just redirect the output:

$ python parse_checker.py cf_test.json > checks_summary.txt
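To illustrate the idea behind such a summary, here is a minimal sketch of grouping identical messages across files. It is not the script from the gist. It assumes that the json_new report is a dictionary keyed by file name, then by checker name, with a "high_priorities" list whose entries carry "name" and "msgs" fields; check your own report before relying on that layout. The function name summarise and the sample data are hypothetical.

```python
from collections import defaultdict


def summarise(report, checker="cf", level="high_priorities"):
    """Map each distinct message to the sorted list of files reporting it.

    Assumed layout: report[filename][checker][level] is a list of
    {"name": ..., "msgs": [...]} entries.
    """
    files_by_message = defaultdict(set)
    for filename, checks in report.items():
        for result in checks.get(checker, {}).get(level, []):
            for msg in result.get("msgs", []):
                files_by_message[f"{result['name']}: {msg}"].add(filename)
    return {msg: sorted(names) for msg, names in files_by_message.items()}


# Hypothetical sample standing in for json.load(open("cf_test.json")).
sample_report = {
    "test1.nc": {"cf": {"high_priorities": [
        {"name": "§2.2 Data Types",
         "msgs": ["The variable time failed because the datatype is int64"]},
        {"name": "§3.3 Standard Name",
         "msgs": ["Attribute long_name or/and standard_name is highly "
                  "recommended for variable time"]},
    ]}},
    "test2.nc": {"cf": {"high_priorities": [
        {"name": "§2.2 Data Types",
         "msgs": ["The variable time failed because the datatype is int64"]},
    ]}},
}

if __name__ == "__main__":
    for message, names in summarise(sample_report).items():
        print(f"{len(names)} files failed:")
        print(message)
        print()
```

Using a set per message means a file is counted once even if the checker repeats the same message for it.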