Difference between revisions of "Coding best practices"

(How)
 
(13 intermediate revisions by 3 users not shown)
Line 1: Line 1:
  
{{Template:Working on}}  
+
= Introduction =
 +
There are code standards and conventions available depending on the language you are using, and sometimes also conventions adopted for specific collaborative projects. These can be quite complex and out of scope if you are writing code for your own analysis; however, there are a few simple things you can do to make your code much more readable and less prone to bugs. In the video linked below, kindly provided by DataTAS, the presenter gives some useful tips which can be applied to any language:
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">There are code standards and conventions available depending on the language you are using, and sometimes also conventions adopted for specific&nbsp;collaborative projects. While below we provide a few links to these, here&nbsp;we are just going to focus on&nbsp;some basic tips that can help you make&nbsp;your code more readable and less prone to bugs. These can be applied to any language.</span></span> &nbsp;
+
"[https://www.facebook.com/1354624204600925/videos/138577474926767 Reproducible&nbsp;research&nbsp;how&nbsp;to&nbsp;write&nbsp;code&nbsp;that&nbsp;is&nbsp;built&nbsp;to&nbsp;last]"&nbsp;
  
=== <span style="font-size:large;">'''<span style="font-family:Arial,Helvetica,sans-serif;">Naming</span>'''</span> ===
+
It is worth watching the video (the actual presentation is about half of the video ~35&nbsp;minutes) to understand fully how valuable these tips are and also to get a perspective from someone who went from a science background to a commercial software engineering position.
  
==== <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">'''Use long and descriptive names for variables and functions'''</span></span> ====
+
Below is a list of best practices discussed in the video.
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">There is no advantage in using&nbsp;short names for variables and functions. It is good practice instead to use names that are descriptive, even if this makes them longer. For functions in particular, it is good to specify in the name what they do.</span></span>
+
=== Naming ===
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">For example, for a function that calculates an anomaly use&nbsp;calculate_anomaly() rather than calc().</span></span> <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">This also reduces the chances of using a reserved word.</span></span>
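The contrast between calc() and calculate_anomaly() can be sketched in Python as follows. This is only an illustration: the argument names and the sample values are hypothetical and do not come from the video.

```python
# Hypothetical example: all names and values are illustrative only.

def calc(a, b):
    # Unclear: what is being calculated, and what are a and b?
    return a - b


def calculate_anomaly(temperature, climatological_mean):
    """Return the departure of a temperature from its long-term mean."""
    return temperature - climatological_mean


sea_surface_temperature = 18.5
climatological_mean = 17.0
sst_anomaly = calculate_anomaly(sea_surface_temperature, climatological_mean)
```

Both functions do the same arithmetic, but only the second one tells the reader what the result means without opening the function body.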
+
*Use descriptive names for variables and functions
 +
*Use consistent naming across the code
 +
*Avoid hard-coding values
 +
*Initialising variables
  
&nbsp;<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">NB If you're using an IDE to edit your code then you can easily auto complete the names</span></span>&nbsp;&nbsp;
+
=== Code structure ===
  
==== '''<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Avoid reserved keywords and names of common functions</span></span>''' ====
+
*Indents
 +
*Comments
 +
*Use functions to organise your code&nbsp;
 +
*Don't&nbsp;Repeat Yourself (DRY) code
 +
*One statement per line
 +
*Write explicit code
 +
*Keep your files a reasonable length
 +
*Clear flow: try to have only one exit point in a function
 +
*Test important parts of your code&nbsp;&nbsp;  
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Reserved keywords are words that have a special meaning in the language. They vary between languages, but are usually used to control the flow of the code, such as "if", "for" and "import", or for declarations and constants, such as "None", "True" and "global", so they cannot be used to name anything else. Any other word can be used as a name, but be careful: in [https://www.geeksforgeeks.org/why-python-is-called-dynamically-typed/ dynamically typed languages] like Python the type of a variable is determined while the code is running, it is not declared at the start. This is because a variable&nbsp;name is just a link&nbsp;to an object, so the same name can refer to different objects in the same code.</span></span>
+
&nbsp;
<div style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;"><span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">#For example, if I have a function called mean</span></span></div> <div style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;"><span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">def mean(values):</span></span></div> <div style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;"><span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">&nbsp;&nbsp;&nbsp;&nbsp;...</span></span></div> <div style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;"><span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">#and then use mean as a variable name</span></span></div> <div style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;">mean = 45.3</div> <div style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;">#if I then try to call the function I will get an error (TypeError: 'float' object is not callable)</div> <div style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;">sst_mean = mean(sst)</div> <div style="background:#eeeeee;border:1px solid #cccccc;padding:5px 10px;">#as "mean" is now referring to the float object "45.3", because I have overwritten the link to the mean function</div>
 
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">There are also some words that can be used as names, like 'file', 'format', 'int', 'list' and 'dict', but that are already the names of&nbsp;existing functions or types, so it is still best to avoid them. This will of course also depend on which modules you are using and how you import them.</span></span>
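One way to check a candidate name in Python is sketched below, using the standard library modules keyword and builtins. The helper is_risky_name is a hypothetical name introduced here for illustration, and the set of built-in names depends on the Python version you are running.

```python
import builtins
import keyword


def is_risky_name(name):
    """Return True if name is a reserved keyword or shadows a built-in."""
    return keyword.iskeyword(name) or hasattr(builtins, name)


print(is_risky_name("for"))       # reserved keyword
print(is_risky_name("list"))      # name of a built-in type
print(is_risky_name("sst_mean"))  # ordinary descriptive name
```

This check does not cover names imported from other modules, which is why how you import them matters too.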
+
= Writing Reusable Code =
  
==== <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">'''Use consistent naming across the code'''</span></span> ====
+
There are many definitions of reusable code, and the details often depend on the use to which the code is being put. From&nbsp;[https://en.wikipedia.org/wiki/Code_reuse Wikipedia]:
 +
<blockquote>
 +
The key idea in reuse is that parts of a computer program written at one time can be or should be used in the construction of other programs written at a later time.
 +
</blockquote>  
 +
&nbsp;
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Try to be consistent in the way you name your variables, constants and functions. There are a few conventions out there; among the most common are:</span></span>
+
=== Why ===
  
*<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">lowercase_words_with_underscores,</span></span>
+
Why write reusable code? This excellent&nbsp;[https://www.frontiersin.org/articles/10.3389/fninf.2017.00069/full article]&nbsp;articulates what it takes to transform code into a fully fledged scientific contribution, and reusability is a key component:
*<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">CapitalisedWords (also known as&nbsp;CamelCase),</span></span>  
+
<blockquote>
*<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">ALL_CAPITALS for constant values&nbsp;</span></span>  
+
Making your program reusable means it can be easily used, and modified, by you and other people, inside and outside your lab
*<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">mixedCase</span></span>
+
</blockquote>  
 +
The process of making your code reusable will also make it better, less error prone, saving time and increasing productivity.&nbsp;
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Whatever you choose,&nbsp;try to be consistent: use the same convention for the same type of object.</span></span>
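As a rough illustration, the sketch below applies one convention per type of object, following the Python PEP 8 style guide. All names are hypothetical. Naming the constant also avoids hard-coding the value 86400 wherever it is needed.

```python
# Hypothetical names, chosen only to illustrate the conventions.

SECONDS_PER_DAY = 86400  # ALL_CAPITALS for a constant value


class TemperatureRecord:  # CapitalisedWords for a class
    def __init__(self, station_name, daily_mean):
        # lowercase_words_with_underscores for variables and attributes
        self.station_name = station_name
        self.daily_mean = daily_mean


def convert_days_to_seconds(number_of_days):
    # lowercase_words_with_underscores for functions
    return number_of_days * SECONDS_PER_DAY
```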
+
=== How ===
  
Avoid hard-coding values
+
How do you write&nbsp;reusable code? There are levels of reusability, from re-using your own code to a fully published library/module for others to use. Start with the basics and with experience add more reproducible practices. If you're already doing the basics, try some of the intermediate or advanced ideas.
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Initialising variables</span></span>
+
&nbsp;
  
You
+
'''Basic'''
  
[https://www.python.org/dev/peps/pep-0526/#global-and-local-variable-annotations https://www.python.org/dev/peps/pep-0526/#global-and-local-variable-annotations]
+
The basics of code reusability involve easily readable&nbsp;code and DRY (don't repeat yourself) principles, so using functions/subroutines/procedures to avoid copying blocks of code and modifying each block. The tips in the introduction can all help with code reusability as well as those listed below.
  
=== <span style="font-size:large;"><span style="font-family:Arial,Helvetica,sans-serif;">'''Code structure'''</span></span> ===
+
Ten tips for writing readable code (based on PHP but principles are universal):
  
==== <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">'''Indents'''</span></span> ====
+
[https://dzone.com/articles/10-tips-how-to-improve-the-readability-of-your-sof https://dzone.com/articles/10-tips-how-to-improve-the-readability-of-your-sof]&nbsp;
  
&nbsp;<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Some languages like Python enforce indenting, but even where it is not required, indenting your code helps to outline its structure. Again, try&nbsp;to be consistent: use either tabs or spaces. Some languages, like Python, have a preference for spaces.</span></span>
+
A good (python specific) section of a course&nbsp;from Software Carpentry about writing functions:
  
==== <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">'''Comments'''</span></span> ====
+
[https://swcarpentry.github.io/python-novice-inflammation/08-func/index.html https://swcarpentry.github.io/python-novice-inflammation/08-func/index.html]
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Comments are extremely useful to document what your code is doing. You do not need to comment every line, as that might actually make the code less readable, but it is good practice to have:</span></span>
+
For the beginner python programmer it can be difficult to know exactly '''how''' to go about reusing code, how to organise it and import it into your notebooks or programs. This is a short, clear&nbsp;article&nbsp;about the python specific details on reusing your code:
  
*<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">&nbsp;a block of comments at the start of the code listing the author, the license, the date of the last update,&nbsp;what the code does and how to use it.</span></span>
+
[https://towardsdatascience.com/creating-reusable-code-for-data-science-projects-740391ec7bad https://towardsdatascience.com/creating-reusable-code-for-data-science-projects-740391ec7bad]
*<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">similarly, a few lines of comments for each function or&nbsp;before a coherent block of code in the main program, for example before an "if/else" block.</span></span>
 
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">It can also be useful to start by writing what you want to do as comments. For example:</span></span> &nbsp;
+
In a similar vein (and also python specific), how and when to use a <tt>main</tt> function in python:
<nowiki># Assign arguments
 
# Open data files
 
# Calculate ...
 
# Plot ...
 
# Save output to file</nowiki>
 
  
&nbsp;<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">In this way you have a draft of your comments and you can work out the best structure, and identify blocks that can be included in a function,&nbsp;before you even start coding. It can save you a lot of time.</span></span>
+
[https://realpython.com/python-main-function/ https://realpython.com/python-main-function/]
  
==== <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">'''Subdivide&nbsp;your code using functions or sub-modules'''</span></span> ====
+
&nbsp;
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Ideally a function should execute one coherent operation. As none of us is a software developer, we are not suggesting that every line&nbsp;of code should be a function, rather that a logical block of lines should be included in a function. This is particularly useful if it is a block of code you might want to repeat in other parts of the same program or in other programs.</span></span> <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Another good reason to enclose a block of code in a function is to make it easier to add a test.</span></span> &nbsp; <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Use tests</span></span> &nbsp; <font face="Arial, Helvetica, sans-serif" size="3">It is a good idea to test at least the critical parts of your code. You want to be sure that&nbsp;a&nbsp;calculation which is critical to your results produces consistent and correct results, no matter what changes you introduce to the same function or&nbsp;to other parts of your code.</font> <font face="Arial, Helvetica, sans-serif" size="3">No matter how many tests you write, it is hard to foresee all the possible ways code can be used, and it is often the case that as soon as you use a different set of data you will find some bugs. Every time you fix a bug, make sure you add a test to capture it, so you will know if you accidentally re-introduce it later on.</font>
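A minimal sketch of this idea in Python is shown below. The function, the test names and the empty-input bug are hypothetical and are used only to show the pattern: one test for the expected behaviour, and one test added after a bug fix so the bug cannot come back unnoticed. The test functions use plain assert statements; a test runner such as pytest would also collect functions whose names start with test_.

```python
def calculate_anomaly(values, baseline):
    """Subtract the baseline from every value in the list."""
    return [value - baseline for value in values]


def test_calculate_anomaly_subtracts_baseline():
    # Checks the calculation that the results depend on.
    assert calculate_anomaly([2.0, 4.0], 1.0) == [1.0, 3.0]


def test_calculate_anomaly_handles_empty_input():
    # Hypothetical regression test, added after fixing a bug with empty input.
    assert calculate_anomaly([], 1.0) == []


# Run the tests directly; each call raises AssertionError on failure.
test_calculate_anomaly_subtracts_baseline()
test_calculate_anomaly_handles_empty_input()
```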
+
'''Intermediate'''
  
==== DRY code ====
+
Once code style, readability and DRY principles have been mastered, the next step is improving what you're already doing and using&nbsp;the more advanced language features.
  
==== <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">'''One statement per line'''</span></span> ====
+
This is a really nice and clear (FORTRAN specific)&nbsp;explanation of how to&nbsp;move&nbsp;from a purely procedural approach, explaining progressively more advanced features of FORTRAN functions and&nbsp;subroutines, and finishing with a real-world scientific example program:
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Cramming a lot of instructions into one line of code won't make your code faster, just less readable. It will also make it more difficult to pinpoint what is causing an error if you have several instructions on the same line.</span></span>
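For example, in Python the two versions below compute the same values. The variable names and readings are hypothetical. If the division failed, the error message for the first version would point at a line containing three statements, while for the second it would point at the exact statement.

```python
readings = [12.1, 13.4, 11.8]  # hypothetical sample data

# Harder to read and to debug: three statements crammed into one line
total = sum(readings); count = len(readings); mean_reading = total / count

# Clearer: one statement per line
total = sum(readings)
count = len(readings)
mean_reading = total / count
```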
+
[https://livebook.manning.com/book/modern-fortran/chapter-3/62 https://livebook.manning.com/book/modern-fortran/chapter-3/62]
  
==== '''<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Keep your files a reasonable length</span></span>''' ====
+
Unfortunately the above link is to a book (Modern Fortran) only part of which is freely viewable. Ideally the book may be available institutionally, but if there is an equivalent freely available link let us know.
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">The main message here is to be consistent and descriptive; in the end, the aim of any convention is consistency. Even&nbsp;if you are not using an established convention, being consistent across all of your code will help make it more&nbsp;readable and reusable.</span></span>
+
&nbsp;
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Order of preference</span></span>
+
'''Advanced'''
  
<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">make your intentions clear even if it is not necessary</span></span>
+
To see the advice above put into practice, you can [https://youtu.be/8bZh5LMaSmE watch this video]. It is a bit long (18min). It is under the Advanced resources because the code it is using is written in Rails. Not many of you will have experience in Rails but the purpose of the video is to talk about the principles and not the specificity of the code. It also involves some more advanced concepts such as classes and modules.
  
=== '''<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Functions</span></span>''' ===
+
Documentation plays a critical role in code reusability. Any effort to document code is worthwhile and will improve reusability, but a large documentation effort is likely to be made only in advanced code reuse scenarios, like a published module or library. For that scenario, this is an excellent introduction, primarily about taking into account the audiences for different aspects of code documentation:
  
==== '''<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Try to have only one exit point in a function</span></span>''' ====
+
[https://documentation.divio.com/introduction/ https://documentation.divio.com/introduction/]
  
<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">Return example</span></span>
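A possible illustration of a single exit point is sketched below in Python. The function name, the labels and the default threshold are hypothetical. Each branch only sets a variable, and the function returns in one place at the end.

```python
def classify_anomaly(anomaly, threshold=1.0):
    """Label an anomaly as 'warm', 'cold' or 'neutral' (hypothetical labels)."""
    if anomaly > threshold:
        label = "warm"
    elif anomaly < -threshold:
        label = "cold"
    else:
        label = "neutral"
    # Single exit point: every branch ends up here.
    return label
```

With one return statement it is easy to add a check or a log message on the result, because there is only one place where the function can finish.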
+
&nbsp;
  
=== '''<span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Style guides</span></span>''' ===
+
=== '''Style guides''' ===
  
*<span style="font-size:medium;"><span style="font-family:Arial,Helvetica,sans-serif;">[https://www.python.org/dev/peps/pep-0008/ Python:&nbsp;pep8]</span>&nbsp; &nbsp;</span>Python Enhancement Proposal  
+
*[https://www.python.org/dev/peps/pep-0008/ Python:&nbsp;pep8]&nbsp; &nbsp;Python Enhancement Proposal  
 
**[https://docs.python-guide.org/writing/style/ https://docs.python-guide.org/writing/style/]  
 
**[https://docs.python-guide.org/writing/style/ https://docs.python-guide.org/writing/style/]  
 
**[https://realpython.com/python-pep8/ https://realpython.com/python-pep8/]   
 
**[https://realpython.com/python-pep8/ https://realpython.com/python-pep8/]   
*[https://realpython.com/lessons/reserved-keywords/ <span style="font-family:Arial,Helvetica,sans-serif;"><span style="font-size:medium;">Python reserved keywords</span></span>]  
+
*[https://realpython.com/lessons/reserved-keywords/ Python reserved keywords]  
*<span style="font-size:medium;">[https://docs.julialang.org/en/v1/manual/style-guide/ Julia: style&nbsp;guide]</span>
+
*[https://docs.julialang.org/en/v1/manual/style-guide/ Julia: style&nbsp;guide]  
*<span style="font-size:medium;">[https://cran.r-project.org/web/packages/AirSensor/vignettes/Developer_Style_Guide.html R style guide]</span>
+
*[https://cran.r-project.org/web/packages/AirSensor/vignettes/Developer_Style_Guide.html R style guide]  
*[https://www.datamentor.io/r-programming/reserved-words/ <span style="font-size:medium;">R reserved keywords</span>]  
+
*[https://www.datamentor.io/r-programming/reserved-words/ R reserved keywords]  
 
&nbsp;
 
 
 
  
 
&nbsp;
 
&nbsp;
 
References
 
 
This page was inspired by the seminar "[https://www.eventbrite.com.au/e/reproducible-research-how-to-write-code-that-is-built-to-last-tickets-153241109283# Reproducible&nbsp;research&nbsp;how&nbsp;to&nbsp;write&nbsp;code&nbsp;that&nbsp;is&nbsp;built&nbsp;to&nbsp;last]"&nbsp;organised by DataTas. A&nbsp;recording is available on their Facebook page.
 
  
 
[[Category:Data induction]]
 
[[Category:Data induction]]

Latest revision as of 01:32, 30 August 2021

Introduction

There are code standards and conventions available depending on the language you are using, and sometimes also conventions adopted for specific collaborative projects. These can be quite complex and out of scope if you are writing code for your own analysis; however, there are a few simple things you can do to make your code much more readable and less prone to bugs. In the video linked below, kindly provided by DataTAS, the presenter gives some useful tips which can be applied to any language:

"Reproducible research how to write code that is built to last"

It is worth watching the video (the actual presentation is about half of the video ~35 minutes) to understand fully how valuable these tips are and also to get a perspective from someone who went from a science background to a commercial software engineering position.

Below is a list of best practices discussed in the video.

Naming

  • Use descriptive names for variables and functions
  • Use consistent naming across the code
  • Avoid hard-coding values
  • Initialising variables

Code structure

  • Indents
  • Comments
  • Use functions to organise your code 
  • Don't Repeat Yourself (DRY) code
  • One statement per line
  • Write explicit code
  • Keep your files a reasonable length
  • Clear flow: try to have only one exit point in a function
  • Test important parts of your code  

 

Writing Reusable Code

There are many definitions of reusable code, and the details often depend on the use to which the code is being put. From Wikipedia:

The key idea in reuse is that parts of a computer program written at one time can be or should be used in the construction of other programs written at a later time.

 

Why

Why write reusable code? This excellent article articulates what it takes to transform code into a fully fledged scientific contribution, and reusability is a key component:

Making your program reusable means it can be easily used, and modified, by you and other people, inside and outside your lab

The process of making your code reusable will also make it better, less error prone, saving time and increasing productivity. 

How

How do you write reusable code? There are levels of reusability, from re-using your own code to a fully published library/module for others to use. Start with the basics and with experience add more reproducible practices. If you're already doing the basics, try some of the intermediate or advanced ideas.

 

Basic

The basics of code reusability involve easily readable code and DRY (don't repeat yourself) principles, so using functions/subroutines/procedures to avoid copying blocks of code and modifying each block. The tips in the introduction can all help with code reusability as well as those listed below.
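As a small sketch of the DRY principle in Python, the scaling logic below is written once in a function and reused for two datasets, instead of being copied and edited for each one. The function name and the data are hypothetical, and the sketch assumes the values in a list are not all equal.

```python
def normalise(values):
    """Scale values to the range 0 to 1 (assumes values are not all equal)."""
    lowest = min(values)
    highest = max(values)
    return [(value - lowest) / (highest - lowest) for value in values]


temperature = [10.0, 15.0, 20.0]  # hypothetical data
rainfall = [0.0, 5.0, 10.0]       # hypothetical data

# The same function is reused instead of repeating the block for each variable.
normalised_temperature = normalise(temperature)
normalised_rainfall = normalise(rainfall)
```

If the scaling ever needs to change, there is only one place to edit.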

Ten tips for writing readable code (based on PHP but principles are universal):

https://dzone.com/articles/10-tips-how-to-improve-the-readability-of-your-sof 

A good (python specific) section of a course from Software Carpentry about writing functions:

https://swcarpentry.github.io/python-novice-inflammation/08-func/index.html

For the beginner python programmer it can be difficult to know exactly how to go about reusing code, how to organise it and import it into your notebooks or programs. This is a short, clear article about the python specific details on reusing your code:

https://towardsdatascience.com/creating-reusable-code-for-data-science-projects-740391ec7bad

In a similar vein (and also python specific), how and when to use a main function in python:

https://realpython.com/python-main-function/
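A minimal sketch of the pattern is given below. The function names and the sample data are hypothetical. The check on __name__ means main() runs when the file is executed as a script, but not when the file is imported from a notebook or another program to reuse calculate_mean.

```python
def calculate_mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def main():
    sample_values = [1.0, 2.0, 3.0]  # hypothetical data
    print(f"Mean: {calculate_mean(sample_values)}")


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    main()
```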

 

Intermediate

Once code style, readability and DRY principles have been mastered, the next step is improving what you're already doing and using the more advanced language features.

This is a really nice and clear (FORTRAN specific) explanation of how to move from a purely procedural approach, explaining progressively more advanced features of FORTRAN functions and subroutines, and finishing with a real-world scientific example program:

https://livebook.manning.com/book/modern-fortran/chapter-3/62

Unfortunately the above link is to a book (Modern Fortran) only part of which is freely viewable. Ideally the book may be available institutionally, but if there is an equivalent freely available link let us know.

 

Advanced

To see the advice above put into practice, you can watch this video. It is a bit long (18min). It is under the Advanced resources because the code it is using is written in Rails. Not many of you will have experience in Rails but the purpose of the video is to talk about the principles and not the specificity of the code. It also involves some more advanced concepts such as classes and modules.

Documentation plays a critical role in code reusability. Any effort to document code is worthwhile and will improve reusability, but a large documentation effort is likely to be made only in advanced code reuse scenarios, like a published module or library. For that scenario, this is an excellent introduction, primarily about taking into account the audiences for different aspects of code documentation:

https://documentation.divio.com/introduction/

 

Style guides