Difference between revisions of "Coding best practices"


Revision as of 02:01, 16 July 2021

New page under construction

 

Code standards and conventions exist for most languages, and some collaborative projects adopt their own. We provide a few links to these below, but here we focus on some basic tips that can help you make your code more readable and less prone to bugs. These tips can be applied to any language.

Naming

Use descriptive names for variables and functions

There is no advantage in using short names for variables and functions. It is good practice to use descriptive names instead, even if this makes them longer. For functions in particular, the name should say what the function does.

For example, for a function that calculates an anomaly use calculate_anomaly() rather than calc(). This also reduces the chance of accidentally using a reserved word. NB: if you are using an IDE to edit your code, it can auto-complete long names for you.
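As a minimal sketch of the difference a name makes, the function below computes an anomaly as the departure of each value from the mean of the series. This definition of an anomaly, the argument name and the sample values are assumptions made for illustration only.

```python
def calculate_anomaly(values):
    """Return each value minus the mean of all the values."""
    mean_value = sum(values) / len(values)
    return [value - mean_value for value in values]


# Hypothetical sea surface temperatures, in degrees Celsius
sea_surface_temperature = [14.0, 15.0, 16.0]
temperature_anomaly = calculate_anomaly(sea_surface_temperature)
```

A reader who meets calculate_anomaly(sea_surface_temperature) elsewhere in the code can guess what it does without opening the function, which would not be true of calc(x).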

Avoid reserved keywords and names of common functions

Reserved keywords are words that have a special meaning in the language. They differ between languages, but they are usually used to control the flow of the code, such as "if", "for" and "import", for built-in constants like "None" and "True", or for declarations like "global", so they cannot be used to name anything else. Any other word can be used as a name, but be careful: in dynamically typed languages like Python a variable is not declared at the start, its type is determined while the code is running. This is because a variable name is just a link to an object, so the same name can refer to different objects at different points in the same code.

For example, suppose I have a function called mean and later on in the code I assign a float value to the same name.

I won't get an error at that point, but I will when I next try to call mean as a function:

# For example, if I have a function called mean
def mean(variable):
    ...
    return value

mean = 45.3
# If I then try to call the function I will get an error
sst_mean = mean(sst)
# because "mean" now refers to the float object 45.3: I have overwritten the link to the function

Some names are not reserved keywords and so can be used, like 'file', 'format', 'int', 'list' and 'dict', but they are already the names of existing functions and are still best avoided. Which names are already taken will of course also depend on which modules you are using and how you import them.
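The sketch below shows what can go wrong when the name of a built-in function is reused, taking Python's list() as the example. The two helper functions and their names are hypothetical, written only to show the effect.

```python
def count_letters_badly(word):
    # "list" is not a reserved keyword, so this assignment is allowed,
    # but inside this function the name no longer refers to the built-in
    list = [1, 2, 3]
    return len(list(word))  # fails: a list object cannot be called


def count_letters(word):
    # A descriptive name leaves the built-in list() available
    letter_list = list(word)
    return len(letter_list)


try:
    count_letters_badly("abc")
    error_message = ""
except TypeError as error:
    error_message = str(error)

number_of_letters = count_letters("abc")
```

Here error_message holds the text of the TypeError raised by the first function, while the second function works as intended.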

 

Use consistent naming across the code

Try to be consistent in the way you name your variables, constants and functions. There are different opinions on what the best convention is:

  • lowercase_words_with_underscores
  • CapitalisedWords (also known as CamelCase)
  • mixedCase
  • ALL_CAPITALS, usually used for constant values

Whatever you choose, be consistent: use the same convention for the same type of object.
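As an illustration, the sketch below applies one convention per type of object, following the choices recommended for Python by PEP 8. All the names in it are hypothetical.

```python
SECONDS_PER_DAY = 86400  # constant: ALL_CAPITALS


class WeatherStation:  # class: CapitalisedWords
    def __init__(self, station_name):
        # variables and attributes: lowercase_words_with_underscores
        self.station_name = station_name


def convert_days_to_seconds(number_of_days):  # function: lowercase_words_with_underscores
    return number_of_days * SECONDS_PER_DAY


station = WeatherStation("Hobart")
seconds_in_two_days = convert_days_to_seconds(2)
```

Other languages and projects make different choices. What matters is that a reader can tell the type of an object from the way its name is written.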

Avoid hard-coding values

Define constant values only once: at the start of the code, in a configuration file, or as arguments to your function or program.

This includes paths and other string variables, not just numbers. Defining all these values in one place makes them easier to update: you do not have to change a value in multiple places, which reduces the chance of errors. It also makes your code more readable, makes your intentions clearer, and makes the code more generic and faster to adapt to different configurations.
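A minimal sketch of this approach is shown below. The directory, the reference temperature, the file suffix and the function names are all made-up values for illustration.

```python
from pathlib import Path

# Everything that might change is defined once, at the top (hypothetical values)
DATA_DIR = Path("/path/to/data")
REFERENCE_TEMPERATURE = 15.0  # degrees Celsius
OUTPUT_SUFFIX = "_anomaly.csv"


def build_output_path(input_name, data_dir=DATA_DIR, suffix=OUTPUT_SUFFIX):
    """Return the output path that corresponds to an input file name."""
    return data_dir / (Path(input_name).stem + suffix)


def subtract_reference(values, reference=REFERENCE_TEMPERATURE):
    """Return the values relative to a reference value."""
    return [value - reference for value in values]


output_path = build_output_path("sst_2020.nc")
relative_values = subtract_reference([16.0, 14.0])
```

Because the constants are also passed as default arguments, the same functions can be reused with a different directory or reference value without editing their bodies.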

Initialising variables

You

https://www.python.org/dev/peps/pep-0526/#global-and-local-variable-annotations

Code structure

Indents

Some languages, like Python, enforce indentation, but even where it is not required, indenting your code helps to outline its structure. Again, try to be consistent: use either tabs or spaces, not a mixture. Some languages, like Python, have a preference for spaces.

Comments

Comments are extremely useful to document what your code is doing. You do not need to comment every line, as that can actually make the code less readable, but it is good practice to have:

  • a block of comments at the start of the code listing the author, the license, the date of the last update, what the code does and how to use it.
  • a few lines of comments for each function, or before a coherent block of code in the main program, for example before an "if/else" block.

It can also be useful to start a code by writing what you want to do as comments. For example:

# Assign arguments
# Open data files
# Calculate ...
# Plot ...
# Save output to file

In this way you have a draft of your comments and you can work out the best structure, identifying blocks that can be moved into a function and so on, before you even start coding. It can save you a lot of time.
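As a rough sketch of how such an outline can turn into code, the example below keeps the comments as the skeleton of a main function and moves two of the steps into their own functions. The function names are hypothetical, the "data file" is just a list of text lines, and a sum stands in for the real calculation.

```python
def read_values(text_lines):
    """Open data files: in this sketch the 'file' is a list of text lines."""
    return [float(line) for line in text_lines]


def calculate_total(values):
    """Calculate ...: a sum stands in for the real calculation."""
    return sum(values)


def main(text_lines):
    # Assign arguments (passed in directly in this sketch)
    # Open data files
    values = read_values(text_lines)
    # Calculate ...
    total = calculate_total(values)
    # Save output to file (returned instead, to keep the sketch self-contained)
    return total


result = main(["1.5", "2.5"])
```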

Use functions to organise your code / DRY code

Ideally a function should execute one coherent operation. As none of us is a software developer, we are not suggesting that you turn every single line of code into a function, but try to put each logical block of lines in a function. This is particularly useful for a block of code that you might want to reuse in other parts of the same program or in other programs. Another good reason to enclose a block of code in a function is that it makes it easier to add a test.
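For instance, a calculation that is needed for more than one variable can be written once as a function and then called wherever it is needed, rather than copied and pasted. The function name and the sample values below are hypothetical.

```python
def calculate_range(values):
    """Return the difference between the largest and the smallest value."""
    return max(values) - min(values)


# The same logic is reused instead of being written out twice
temperature_range = calculate_range([14.0, 18.5, 16.0])
rainfall_range = calculate_range([0.0, 12.0, 3.5])
```

If the calculation ever needs to change, it only has to be changed in one place.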

 

DRY code

One statement per line

Cramming a lot of instructions in one line of code won't make your code faster, just less readable. It will also make it more difficult to pinpoint what is causing an error if you have a few instructions in the same line.
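The two versions below do exactly the same work. The values are made up for illustration.

```python
values = [3.0, 5.0, 10.0]

# Harder to read and to debug: several statements crammed into one line
total = sum(values); count = len(values); average = total / count

# Clearer: one statement per line, so an error message points at the exact step
total = sum(values)
count = len(values)
average = total / count
```

If values were empty, the error in the first version would point at a line containing three statements, while in the second it would point directly at the division.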

Keep your files a reasonable length

The main message here is to be consistent and descriptive: in the end, the aim of any convention is consistency. Even if you are not using an established convention, being consistent across all of your codes will help make them more readable and re-usable.

Order of precedence

Make your intentions clear, even where it is not strictly necessary.
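Assuming this heading refers to operator precedence, one way to make intentions clear is to add parentheses even where the language rules already give the result you want. The variable names and values below are arbitrary.

```python
a, b, c = 2, 3, 4

# Correct, but the reader has to remember that * is evaluated before +
implicit = a + b * c

# Same result, with the intended order spelled out
explicit = a + (b * c)

# Parentheses are required when a different order is wanted
different = (a + b) * c
```

The first two expressions give the same value. The parentheses in the second one cost nothing at run time and save the reader from having to check the precedence rules.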

Functions

Clear flow: try to have only one exit point in a function

Return example
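One possible return example is sketched below: each branch sets a variable and the function returns it once, at the end. The function name and the thresholds are hypothetical.

```python
def classify_temperature(temperature):
    """Return a label for a temperature in degrees Celsius."""
    if temperature < 10.0:
        label = "cold"
    elif temperature < 25.0:
        label = "mild"
    else:
        label = "hot"
    # Single exit point: every path through the function ends here
    return label


labels = [classify_temperature(t) for t in [5.0, 20.0, 30.0]]
```

With a single return it is easy to see what the function gives back, and any final step, such as a check on the result, only needs to be added in one place.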

Use tests  

It is a good idea to test at least the critical parts of your code. You want to be sure that a calculation which is critical to your results produces consistent and correct values, no matter what changes you introduce to the same function or to other parts of your code. However many tests you write, it is hard to foresee all the possible ways a code can be used, and it is often the case that as soon as you use a different set of data you find some bugs. Every time you fix a bug, make sure you add a test to capture it, so you will know if you accidentally re-introduce it later on.
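A minimal sketch of what such tests can look like is given below, using plain assert statements. The function, the test names and the bug being guarded against are all hypothetical. A test runner such as pytest would collect functions named test_* automatically, but here they are simply called at the end.

```python
def calculate_mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if len(values) == 0:
        raise ValueError("cannot calculate the mean of an empty sequence")
    return sum(values) / len(values)


def test_calculate_mean_known_values():
    # A case where the correct answer is known in advance
    assert calculate_mean([1.0, 2.0, 3.0]) == 2.0


def test_calculate_mean_rejects_empty_input():
    # Regression test for a hypothetical bug: empty input should give a clear error
    try:
        calculate_mean([])
    except ValueError:
        raised = True
    else:
        raised = False
    assert raised


test_calculate_mean_known_values()
test_calculate_mean_rejects_empty_input()
```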

 

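Style guides

  • Python: pep8, Python Enhancement Proposal (https://www.python.org/dev/peps/pep-0008/)
      • https://docs.python-guide.org/writing/style/
      • https://realpython.com/python-pep8/
  • Python reserved keywords (https://realpython.com/lessons/reserved-keywords/)
  • Julia: style guide (https://docs.julialang.org/en/v1/manual/style-guide/)
  • R style guide (https://cran.r-project.org/web/packages/AirSensor/vignettes/Developer_Style_Guide.html)
  • R reserved keywords (https://www.datamentor.io/r-programming/reserved-words/)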

 

 

References

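This page was inspired by the seminar "Reproducible research how to write code that is built to last" (https://www.eventbrite.com.au/e/reproducible-research-how-to-write-code-that-is-built-to-last-tickets-153241109283#), organised by DataTas. A recording is available on their Facebook page.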