Introduction
This document compiles a set of tips for developing to your full potential in Python, in SQL and in cloud application development.
A further part focuses on the developer’s posture, with advice on workstation ergonomics.
Write clean code
Limit yourself to useful comments
Comments have their pros and cons, but overall, when you are new to a project, reading comments makes the code easier to understand and helps you integrate into the team.
The amount of commenting is essential to its usefulness. A comment compensates for the complexity of the code; verbose comments become tiresome to read when the code itself is clear enough.
Juniors and seniors assess the usefulness of a comment differently, so it helps to adopt a basic rule for deciding when to insert one. It is up to you to define that rule, although we give some clues below.
Besides adding a description to each function (in Python, a docstring), remember that a comment should not compensate for poorly named variables. Note also that your comments can embed your unit tests, which makes them particularly useful when the code evolves or is maintained, especially by someone other than you.
Finally, well-structured code, a suitable design pattern, and good names for functions, parameters and variables can make many comments unnecessary. Comments will nevertheless often help you write the accompanying documentation.
A function does only one thing, but well
This rule reduces complexity and keeps each function sized to the single action it performs.
In the following example, the same code is presented first in an expanded form, then in a factored form.
The call to the function remains identical, but in the factored form each step of the program calls a single function.
Although the factored code has more lines, it will be easier to debug and maintain.
# Expanded form
import string

def encryption(message, cypher):
    """Returns an encrypted message"""
    char_list = string.printable
    char_dict = {y: x for x, y in enumerate(char_list)}
    modulo = len(char_list)
    crypted_message = []
    for char in message:
        token = char_list[(char_dict[char] + cypher) % modulo]
        crypted_message.append(token)
    return ''.join(crypted_message)

>>> encryption("I love Python", 123456789)
'x@adk3@Eni6dc'

# Factored form
import string

def get_tables():
    """Returns tables of character values"""
    ordered_chars = string.printable
    chars_dictionary = {y: x for x, y in enumerate(ordered_chars)}
    length_of_chars_list = len(ordered_chars)
    return ordered_chars, chars_dictionary, length_of_chars_list

def encoder(character_to_encrypt, shift):
    """Returns the encoded character"""
    chars_list, chars_dict, length = get_tables()
    crypted_char = chars_list[(chars_dict[character_to_encrypt] + shift) % length]
    return crypted_char

def encryption(message, cypher):
    """Returns a message by encrypting it"""
    crypted_message = []
    for char in message:
        token = encoder(char, cypher)
        crypted_message.append(token)
    return ''.join(crypted_message)

>>> encryption("I love Python", 123456789)
'x@adk3@Eni6dc'
The code is called the same way and returns the same result, but it is easier to read thanks to the simplicity of each function.
If a function becomes too complex, reduce it by factoring. Factoring often means adding functions or new parameters to functions.
Object programming allows you to encapsulate complexity
When a program groups several entities together, we can start talking about objects. Objects are instances of classes; they have characteristics that they can inherit from a parent class and pass on to child classes.
There are also abstract classes, which cannot be instantiated themselves but serve as a base for implementing functionality in their subclasses.
Objects can be related to one another in two main ways: aggregation or composition.
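As a minimal sketch of these notions, the example below combines an abstract class, a concrete subclass and an aggregation. The names Shape, Rectangle and Drawing are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: it cannot be instantiated directly."""

    @abstractmethod
    def area(self) -> float:
        """Each concrete subclass must provide its own area."""


class Rectangle(Shape):
    """Concrete class inheriting from Shape."""

    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


class Drawing:
    """Aggregation: the drawing holds shapes that were created elsewhere."""

    def __init__(self, shapes: list[Shape]) -> None:
        self.shapes = shapes

    def total_area(self) -> float:
        return sum(shape.area() for shape in self.shapes)


drawing = Drawing([Rectangle(2, 3), Rectangle(1, 1)])
print(drawing.total_area())
```

Trying to call Shape() directly raises a TypeError, which is how Python enforces that the abstract class only serves as a base.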
Design patterns (or DPs) grew out of object programming: they are models that encapsulate the complexity of computer code.
There is a DP for most recurring object programming problems. They can be combined, and they bring a lot of elegance to code by hiding its complexity.
They are classified into three main families:
- Creational Patterns
- Structural Patterns
- Behavioral Patterns
If you want to start using DP in Python, I recommend this website.
Give sensible names to variables and parameters
Without belaboring the point, everyone will agree that code is easier to read when variables are not just single letters, even more so when their names reflect the role of the variable in the application.
Although Python is dynamically typed (there is no need to declare the types of your variables), it can be a good idea to end the name of a variable with its type.
Of course, you can go further by annotating the type of the variable, as in the example below, which mixes both approaches in Python:
from io import TextIOWrapper
filename_str: str = "my_file.csv"
file: TextIOWrapper = open(filename_str, "r")
Declaring types is not pointless: even if it takes additional effort, it helps catch potential bugs, in particular with the mypy library in Python.
Indent your code
One of the strengths of the Python language is that it makes end-of-line semicolons and function braces superfluous, whereas they are everywhere in other languages such as C or Java.
It is through the indentation of the code that the Python interpreter understands what it must do.
For us coders, indentation is often the key to a better understanding of the code.
Today, there are tools that indent code automatically. Their main advantage is that the coder does not have to remember all the formatting rules of the language, which can be tedious to learn.
In the Python code
Delete unused variables
During development, the coder tries many things, and not everyone uses the debugger in their IDE. There are many reasons why unnecessary variables appear as the code evolves, even in the normal course of work.
It is easy to imagine that unused variables, like dead code, are a hindrance to understanding the code. Therefore, every coder should take the time to go through the code and eliminate what is no longer useful to the program.
Do not write exceptions without an Error type
Catching exceptions in Python is the royal road to keeping a program running despite whatever may happen at runtime.
Exceptions cover a wide spectrum of cases. To convince yourself, just take a look at the Python documentation. The Zeal software is a valuable asset here, since it lets you always have that documentation with you, offline.
try:
    with open("file.txt", "r") as file:
        content = file.read()
except FileNotFoundError as e:
    print("File does not exist: ", e)
finally:
    with open("file.txt", "w") as file:
        file.write("Hello World!")
Remember that you are writing code that must be readable and understandable by other coders. Naming the precise exception, rather than catching everything, tells the reader which failure the code is meant to handle.
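To make the difference concrete, here is a small sketch comparing an except clause without an Error type to one that names the expected exception. The functions parse_port_bare and parse_port are hypothetical names used only for this illustration.

```python
def parse_port_bare(text):
    """Discouraged: the bare except swallows every error, bugs included."""
    try:
        return int(text)
    except:  # also hides TypeError, KeyboardInterrupt, etc.
        return None


def parse_port(text):
    """Preferred: only the expected failure is handled."""
    try:
        return int(text)
    except ValueError:
        # The text was not a number; any other error still surfaces.
        return None


print(parse_port("8080"))
print(parse_port("not a number"))
print(parse_port_bare(None))  # the programming error is silently hidden
```

With the typed version, calling parse_port(None) raises a TypeError, so the mistake in the calling code is noticed instead of being masked.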
Avoid writing infinite loops
Although our systems are multi-tasking, infinite loops are still considered bugs by programmers.
Loops such as:
while True:
    (action code)
or even:
while 1 == 1:
    (action code)
are frowned upon by any coder with a minimum of experience. It is therefore important to always take the trouble to write a coherent, meaningful loop condition, even during the development phase of a program, when many experiments are made. This reduces the risk of leaving an infinite loop in the final code.
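One possible way to follow this advice is to put the exit condition, and a bound on the number of iterations, in the loop test itself. The sketch below assumes a hypothetical poll function and a max_attempts parameter; neither comes from a library.

```python
def poll(check, max_attempts=5):
    """Call check() until it returns True or the attempt budget is used up."""
    attempts = 0
    done = False
    # The condition states explicitly when the loop stops.
    while not done and attempts < max_attempts:
        done = check()
        attempts += 1
    return done, attempts


# Simulated answers from a service that becomes ready on the third call.
responses = iter([False, False, True])
print(poll(lambda: next(responses)))

# A check that never succeeds still terminates after max_attempts calls.
print(poll(lambda: False, max_attempts=4))
```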
Writing unit tests
In my opinion, unit tests are the least fun part of writing a program. Not only do they take some effort to come up with, but choosing the right tool among the existing ones is a challenge.
It is no accident that program tester is a profession. However, let’s not forget that automated tests are a guarantee of greater stability of the program in the face of the challenges of team development, which is Agile most of the time.
Among the technical solutions available in Python, the standard library provides doctest and unittest, and the third-party library pytest goes further still.
We have presented the most important ones, but an overview of the existing solutions is available by clicking on this link.
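As an illustration, the sketch below tests one small function in two ways: with a doctest written in its docstring and with a unittest test case. The function slugify and the class SlugifyTest are hypothetical examples.

```python
import unittest


def slugify(title):
    """Turn a title into a lowercase, dash-separated slug.

    >>> slugify("Write Clean Code")
    'write-clean-code'
    """
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    """unittest version: one method per expected behavior."""

    def test_extra_spaces_are_collapsed(self):
        self.assertEqual(slugify("  Good   Code "), "good-code")

    def test_single_word_is_lowercased(self):
        self.assertEqual(slugify("Python"), "python")
```

Assuming the code is saved in a file of your choice, python -m doctest followed by the file name runs the docstring example, and python -m unittest followed by the file name runs the test case.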
Using mypy to type variables
Python relies on duck typing, which could have confined it to the rank of languages ideal for prototyping but not safe enough for production.
However, it is possible to declare types explicitly in the code. The interpreter itself ignores your annotations, but a tool will check that types are used consistently across the functions of your code: this is mypy.
Mypy analyzes the annotations on variables and lets you check them. Typing your variables will help you write better-quality code that is more robust to failures.
To go further than mypy, the program itself must take the types into account, since the interpreter does not enforce annotations. This can be done by testing the type of variables at runtime.
The code will be more verbose, but the program will only run on the inputs it was designed for, which rules out a whole class of errors.
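A runtime type test can be sketched as follows. The function circle_area is a hypothetical example; the annotation is what a tool such as mypy reads, while the isinstance test is what actually stops the program at runtime.

```python
import math


def circle_area(radius: float) -> float:
    """Return the area of a circle, refusing anything that is not a number."""
    # The annotation alone is ignored by the interpreter,
    # so the type is checked explicitly here.
    if not isinstance(radius, (int, float)):
        raise TypeError(f"radius must be a number, got {type(radius).__name__}")
    return math.pi * radius ** 2


print(circle_area(1))

try:
    circle_area("2")
except TypeError as error:
    print("Rejected:", error)
```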
To get started with mypy, you can ask it to display the type it infers for an expression with reveal_type, as in the following example:
# reveal.py
import math
reveal_type(math.pi)
mypy reveal.py
/home/user/project/.venv/bin/python -m mypy reveal.py
reveal.py:3: note: Revealed type is "builtins.float"
The quality of your programs will improve thanks to mypy, and you will be able to increase the complexity of your processing with fewer type-related bugs to worry about.
Indenting your code with the black library
Python is a language with a refined syntax, thanks to rules on code indentation. The interpreter understands line breaks and indentation, where a language like C requires semicolons and braces.
However, the rules of the interpreter are not enough to make Python code readable. This is why the style guide PEP 8 was written by Python developers Guido van Rossum, Barry Warsaw and Alyssa Coghlan.
Many IDEs already integrate the PEP 8 rules and will allow you to format your code with one click. If you don’t have an IDE at hand, you can use the Black library.
Just install it and run it on your file or directory:
pip install black
python -m black {source_file_or_directory}
Since this library is also compatible with Jupyter, you can use it wherever you write Python code.
Do not code a shallow copy, if a deep copy is necessary.
Copying an object, in computing, can mean copying a reference or copying a value. Strictly speaking, a plain assignment copies only the reference: both names point to the same object in memory. A shallow copy creates a new container whose elements are still references to the original elements, and a deep copy recursively duplicates the values themselves.
Concretely, this difference can lead to abnormal behavior in a program where data is moved before being modified. Indeed, assignment gives a second variable the memory position of the first. The behavior is as follows:
>>> a = [1, 2, 3]
>>> b = a
>>> a.pop()
3
>>> print(b)
[1, 2]
This behavior is due to copying the reference of a into b and not the values of a.
Sharing a reference can nevertheless be useful, for instance to update one variable through a second one. To choose between the two copy modes, the copy module of Python contains a copy function, which returns a shallow copy, and a deepcopy function, which returns a deep copy of an object. A shallow copy is enough for a flat list like the one above; a deep copy becomes necessary as soon as the object contains other mutable objects.
>>> import copy
>>> a = [1, 2, 3]
>>> b = copy.deepcopy(a)
>>> a.pop()
3
>>> print(b)
[1, 2, 3]
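The difference between the copy modes only shows up with nested objects. The following sketch, using illustrative variable names, compares a plain assignment, copy.copy and copy.deepcopy on a list of lists.

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                 # same object, nothing is copied
shallow = copy.copy(original)    # new outer list, shared inner lists
deep = copy.deepcopy(original)   # fully independent copy

original.append([5, 6])    # changes the outer list: seen by alias only
original[0].append(99)     # changes an inner list: seen by alias and shallow

print(alias)    # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```

Only the deep copy is untouched by both modifications, which is why it is the safe choice when the copied data must stay frozen.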
In the SQL code
Correctly type the columns according to the data
There are many different types for categorizing data according to how it is used. Some types consume more bytes than others. The impact grows when storing blobs, or simply when the number of rows exceeds a million.
Since the price of storage depends on the size of the tables, it is good for the wallet to type the fields as close as possible to the type of data they represent. This means typing a boolean as a boolean and not as an integer, among other things. The difference in big data can be colossal, around 800%, for example where an integer is stored on 8 bytes and a boolean on 1 byte.
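The order of magnitude can be checked with a rough calculation. The sketch below assumes fixed-width encodings (1 byte for a boolean, 8 bytes for a 64-bit integer, as reported by the struct module) and a hypothetical table of one million rows; real database engines add their own overhead and compression, so the figures are only indicative.

```python
import struct

ROW_COUNT = 1_000_000  # hypothetical table size

bool_bytes = struct.calcsize("<?")    # standard size of a boolean
int64_bytes = struct.calcsize("<q")   # standard size of a 64-bit integer

bool_column_bytes = ROW_COUNT * bool_bytes
int_column_bytes = ROW_COUNT * int64_bytes
ratio = int_column_bytes / bool_column_bytes

print(f"boolean column: {bool_column_bytes} bytes")
print(f"integer column: {int_column_bytes} bytes")
print(f"the integer column is {ratio:.0f} times larger")
```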
Do not insert an Order By if it is not useful
When a query is run as a test, it is often useful to order the results by a column, or to bring null values to the front. This sorting is done at the end of the query, but it adds a significant amount of computation at execution.
As a general rule, data sits unordered in tables, and in views as well. There is very little need to order it.
Given that billing is based on the amount of data processed by the user, the ORDER BY clause should be used judiciously, according to the real needs of your data engineering.
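The extra work caused by a sort can be observed in a query plan. The sketch below uses SQLite through Python's sqlite3 module as a stand-in, with a hypothetical events table; a cloud data warehouse plans and bills queries differently, so this only illustrates the principle.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE events (id INTEGER, label TEXT)")
connection.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(3, "c"), (1, "a"), (2, "b")],
)


def plan(query):
    """Return the human-readable steps of the query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


plain_plan = plan("SELECT id, label FROM events")
sorted_plan = plan("SELECT id, label FROM events ORDER BY label")
connection.close()

print(plain_plan)
print(sorted_plan)  # contains an additional sorting step
```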
In hardware
Measuring program consumption
After optimizing a program, a balance must be found with the resources allocated to it. Fortunately, what a program consumes is measurable, so finding the right match remains an achievable goal.
The resources consumed by a program are:
- The execution time
- The memory used
- The disk space used
- The number of cores
Although the code has a large influence, the type of processing can by itself dictate a required resource. There may also be incompressible constraints depending on the case, such as the maximum number of simultaneous calls to a server, which can hold the execution time of the program at a floor that cannot be reduced.
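Execution time and memory, two of the resources listed above, can be measured with the standard library alone. This is a minimal sketch in which build_squares is a hypothetical workload standing in for your own code.

```python
import os
import time
import tracemalloc


def build_squares(count):
    """Hypothetical workload: build a list of squares."""
    return [n * n for n in range(count)]


tracemalloc.start()                      # start tracking Python allocations
start = time.perf_counter()              # high-resolution timer

squares = build_squares(100_000)

elapsed_seconds = time.perf_counter() - start
current_bytes, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"execution time: {elapsed_seconds:.4f} s")
print(f"peak memory: {peak_bytes / 1024:.0f} KiB")
print(f"cores available: {os.cpu_count()}")
```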
Reduce execution time
The ways to reduce execution time are:
- Simplify the complexity of the program
- Analyze and optimize the process of each function
- Opt for multithreading or multiprocessing
- Use libraries already optimized for your operations
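As an example of the first point in the list above, the sketch below solves the same problem with a quadratic algorithm and with a linear one based on a set. Both function names are hypothetical, and the linear version assumes the items are hashable.

```python
def has_duplicates_quadratic(items):
    """Compare every pair of items: the work grows with the square of the size."""
    for index, left in enumerate(items):
        for right in items[index + 1:]:
            if left == right:
                return True
    return False


def has_duplicates_linear(items):
    """Remember the items already seen: each item is examined once."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


values = list(range(2_000)) + [0]
print(has_duplicates_quadratic(values), has_duplicates_linear(values))
```

Both functions return the same answer, but on large inputs the second one performs far fewer comparisons.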
Reduce memory consumption
The ways to reduce memory consumption are:
- Write intermediate data to disk
- Delete variables that are no longer needed
- Compress data
- Choose a compact data format
- Limit the number of libraries loaded
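One simple way to apply these ideas in Python is to avoid holding a whole collection in memory when its values are only needed once. The sketch below, with illustrative variable names, compares a list with a generator; note that sys.getsizeof reports the size of the container only, not of the integers it references.

```python
import sys

squares_list = [n * n for n in range(100_000)]     # all values held in memory
squares_stream = (n * n for n in range(100_000))   # values produced on demand

list_bytes = sys.getsizeof(squares_list)
stream_bytes = sys.getsizeof(squares_stream)

total_from_list = sum(squares_list)
total_from_stream = sum(squares_stream)

del squares_list  # release the large list once it is no longer needed

print(f"list container: {list_bytes} bytes")
print(f"generator object: {stream_bytes} bytes")
```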
Locate servers as close as possible to clients
Deploying a cloud service server as close as possible to the users of the service has many advantages in terms of failure resistance or for regulatory compliance.
From an optimization standpoint, let’s just note that latency is reduced, so a nearby server significantly improves the user experience.
Send only the data actually used by the client
Since data transfer is counted in bytes, an important lever for improvement is to ensure that the data sent to the client is really needed by it.
Similarly, it is worth choosing carefully which operations run on the client’s terminal and which are computed by the server.
This ties in with the architecture of your program and should be considered at design time, because it has many consequences for how the program is used.
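The effect of trimming a response can be sketched as follows. The record, its field names and the helper to_client_payload are all hypothetical; the point is only to compare the serialized sizes.

```python
import json


def to_client_payload(record, fields):
    """Keep only the fields the client has asked for."""
    return {name: record[name] for name in fields if name in record}


# Hypothetical server-side record with more data than the client displays.
full_record = {
    "id": 42,
    "name": "Ada",
    "internal_notes": "x" * 500,
    "audit_trail": ["created", "updated", "exported"],
}

payload = to_client_payload(full_record, ["id", "name"])

full_size = len(json.dumps(full_record).encode("utf-8"))
slim_size = len(json.dumps(payload).encode("utf-8"))

print(f"full record: {full_size} bytes, client payload: {slim_size} bytes")
```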
The developer’s posture without curvature
Avoid taking your hands off the keyboard
To code well, a relaxed posture is essential. The ideal posture is the one that causes the least movement of the shoulders. In this way, they can stay relaxed.
Keeping your elbows resting on your desk, or on the elbow rests of your chair, will allow you to relax your shoulders.
To protect your wrists, you will need to rest them too, this time on a wrist rest.
VIM
Using a Vim plugin in your IDE and learning a few keyboard shortcuts will allow you to do without the mouse when moving around your integrated development environment (IDE) and performing tasks such as copying and pasting or searching and replacing character sequences.
You can install it on PyCharm by following this documentation, or on VSCode with this other documentation.
Learn an ergonomic keyboard layout
BÉPO
To go further in keyboard ergonomics, it pays to switch to a more ergonomic keyboard layout, because AZERTY was not designed to be ergonomic.
Indeed, AZERTY dates back to the era of typewriters, when the layout kept the hammers of the machine from getting tangled as professionals typed at full speed.
For more than 20 years, there has been an ergonomic keyboard layout adapted to the French language: the BÉPO layout.
BÉPO limits the movement of the arm, shoulder, wrists and fingers, in particular by placing the most common letters of French in the center of the keyboard and the less used ones on its periphery.
5 minutes a day for 2 weeks is enough to start typing in BÉPO quickly. Typing tutors will allow you to practice for free: All the touch typing tutors
Then you will no longer need to look at your hands to use your keyboard, which will rest your eyes. Moreover, you will use all your fingers, which will increase your typing speed up to 45 words per minute.
To install the BÉPO layout on Windows, this site will show you the method to follow.
Using a static code analysis tool
Qodana
To do real in-depth work on your code, if you use a JetBrains IDE, you will be able to use Qodana, the static code analysis tool.
Qodana is based on JetBrains' native inspections, offering more than 2500 code inspections. It will allow you to detect a wide variety of problems and can present them in the form of a report.
Conclusion
The goal of this book is to deliver, in about fifteen pages, the essentials of writing clean code.
The principle has been extended to posture because, although indirectly, it greatly influences the quality of the code.
I really hope that you enjoyed this document and that it was not painful to read.
If you have any remarks or comments, you can send them to me at the following address: romain@boyrie.email.
All that remains for me is to wish you "Good Code!"