Learning Pieces
Docker
Test deployment service
During development, I wanted an easy way to start the Docker services I've created based on the most recent changes in the code. The solution I came up with involved creating a production and a test environment.
My first approach was to create a test pypi-server and a test environment for the journal-manager. The idea was that the journal-manager services would install the journal-manager packages from the test pypi-server.
Info
A neat thing that docker-compose does is to create a network among the services described in the docker-compose.yml file.
I put both the test pypi-server and the test journal-manager services in the same docker-compose file to take
advantage of the network Docker creates for the services contained in that file. That didn't
work: I needed to run pip install pointing at the test pypi-server during the image build, but the network
between the services is only created after the build step.
Note
A solution would be to set up the test pypi-server in a different docker-compose file and have it started before building the image.
Upload the wheel and extract it during image build
I decided to take a simpler approach.
- Build the journal package in my dev machine, producing a wheel.
- Copy the wheel to a deploy-packages folder in the deployment server. This folder is part of the image context.
- Install the journal-manager from the wheel available in the deploy-packages folder during the build.

The deploy-packages folder has a subfolder for production packages and another for test packages. The deploy can be configured to use one or the other. Also, depending on the deploy mode, we build either the test services or the production services.
Bash
Executing commands via ssh in a remote machine
I had to manually set the PATH variable to contain the path /usr/local/bin.
ssh -p "${SSH_PORT}" "${DEPLOY_USER}@${DEPLOY_HOST}" "export PATH=\$PATH:/usr/local/bin; echo \$PATH; ${DEPLOY_SCRIPT} ${BUILD_MODE} ${DEPLOY_ACTION}"
Info
I've set up ssh and scp to use public/private key pair identification file. More info here
- Fermat's little theorem: For any integer \(a\) and any prime number \(p\): \(a^p - a\) is a multiple of \(p\).
- Totient function: \(\phi(n)\) denotes the number of integers in \([1, n]\) that are coprime with \(n\).
- Totient of prime product: Given prime numbers \(p,q\): \(\phi(pq) = \phi(p)\phi(q) = (p-1)(q-1)\).
- Euler's theorem: For any integer \(n\) and any \(a\) coprime with \(n\): \(a^{\phi(n)} \equiv 1 \mod n\).
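These identities are easy to spot-check numerically. A minimal sketch (the primes 11 and 13 and the base 7 are arbitrary choices; the totient is computed by brute force):

```python
from math import gcd

def phi(n):
    # Totient by direct counting: integers in [1, n] coprime with n
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

p, q = 11, 13
assert phi(p) == p - 1 and phi(q) == q - 1
assert phi(p * q) == (p - 1) * (q - 1)   # totient of a prime product

a, n = 7, p * q                          # gcd(7, 143) == 1
assert pow(a, p, p) == a % p             # Fermat's little theorem
assert pow(a, phi(n), n) == 1            # Euler's theorem
```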
Let keyword
The `let` keyword allows you to evaluate arithmetic expressions.
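A quick sketch of `let` in action, alongside the `$(( ))` arithmetic expansion, which is the more portable alternative:

```shell
let "x = 2 + 3"
echo "$x"   # 5
let "x++"
echo "$x"   # 6
# $(( )) is the POSIX-portable alternative
y=$(( x * 2 ))
echo "$y"   # 12
```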
Error handling
Bash uses the error code returned by programs to handle errors. A zero code means success and a non-zero code means error.
One can individually handle errors in bash by using the logical OR operator ||.
You can also trigger cleanup functions or special treatment for some types of errors using trap.
Bash has settings that change how it reacts to errors:
- `-e`: immediately exit if a command fails.
- `-x`: print every command that is executed.
- `-u`: any reference to an unset variable is an error.
- `-o pipefail`: if any command in a pipeline fails, the whole pipeline returns a non-zero code.
More in Bash error handling manual.
Danger
Be aware that set -e has several pitfalls. Your
program may exit unexpectedly because of them: some commands return a
non-zero code even when no error occurred (conditionals, test).
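A small sketch combining the idioms above: `set -e` with an `||` branch for local error handling and a `trap` for cleanup:

```shell
#!/usr/bin/env bash
set -euo pipefail

cleanup() { echo "cleanup executed"; }
# EXIT fires both on normal termination and on errors caught by set -e
trap cleanup EXIT

# An || branch keeps set -e from aborting on this particular failure
false || echo "handled the failure locally"

echo "reached the end"
```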
Vim
Pasting to vim command line
- Yank a text.
- Go to command mode, press `<CTRL> + R` and then press `"`.
- Type `:registers` to display the registers' contents.
- Type `:help registers` to get more information.
MyPy
Static type checking in test files
To locate a module, mypy relies on `__init__.py` files.
- `--no-namespace-packages`: in `pkg/a/b/mod.py` you will need an `__init__.py` in each folder of `mod.py`'s path.
- `--namespace-packages` on and `--explicit-package-bases` off: in this case, you only need an `__init__.py` in the top-level folder of the module path.
- `--namespace-packages` on and `--explicit-package-bases` on: you don't need any `__init__.py`, but mypy will only recognize modules that are inside the folders specified in `mypy_path`.
Pytest
A note about naming convention
- Not only the functions but also the files need to follow the naming convention. I had
`t_setup.py` and nothing was collected until I changed it to `test_setup.py`.
Setting PYTHONPATH in the configuration
The `pythonpath` configuration attribute contains the directories Python will look at during module importing.
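A minimal sketch of how that could look in a `pytest.ini` (the `src`/`tests` layout is an assumption; the `pythonpath` option requires pytest >= 7.0):

```ini
[pytest]
pythonpath = src
testpaths = tests
```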
General python
Logging
The Python logging library is a very complete solution for your logging problems. Most of the information can be found in the official logging documentation.
Some features I discovered recently are:
And also some bad patterns:
importlib_resources
The importlib_resources package is built on top of the Python import
system to facilitate the use of package resources in your packages.
```python
import shutil

from importlib_resources import files, as_file
from jinja2 import Environment, PackageLoader

class BuildIndexPage:
    assets = files("danoan.journal_manager.templates").joinpath("material-index", "assets")

    def build(self):
        build_result = super().build()
        if isinstance(build_result, FailedStep):
            return build_result
        if not self.build_instructions.build_index:
            return self
        env = Environment(
            loader=PackageLoader("danoan.journal_manager", package_path="templates")
        )
        # as_file yields a real filesystem path even when the package
        # is distributed as a zip archive
        with as_file(BuildIndexPage.assets) as assets_path:
            shutil.copytree(assets_path, self.journals_site_folder.joinpath("assets"))
        return self
```
More on the importlib_resources package can be found here
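As a self-contained illustration of the same API, here is a sketch using the stdlib `importlib.resources` (the standard-library counterpart of the backport since Python 3.9); the stdlib `email` package is used purely as a stand-in for a package shipping resources:

```python
from importlib.resources import files, as_file

# A Traversable pointing at a resource inside a package
resource = files("email").joinpath("__init__.py")

with as_file(resource) as path:
    # Inside the with-block we have a concrete filesystem path,
    # even if the package were distributed as a zip archive
    print(path.exists())  # True
```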
Type hinting of classmethods
I want the return type hint of a classmethod to be the very class in which the classmethod is defined.
To do that, we need to use `TypeVar`:
```python
from typing import Optional, TypeVar, Type

class TomlDataClassIO:
    """
    Base class for a simple dataclass (i.e. with no mapping types)
    """

    T = TypeVar("T", bound="TomlDataClassIO")

    @classmethod
    def read(cls: Type[T], filepath: str) -> Optional[T]:
        pass
```
Type variables are useful for generic programming.
Info
Notice that once the type variable is bound to a concrete type during a call, that
type does not change. You can also explicitly constrain the type variable, as was
done in the example above with the parameter `bound`.
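To see why the type variable matters, consider a hypothetical subclass (the `JournalData` class and the trivial `read` body below are illustrative only):

```python
from typing import Optional, Type, TypeVar

T = TypeVar("T", bound="TomlDataClassIO")

class TomlDataClassIO:
    @classmethod
    def read(cls: Type[T], filepath: str) -> Optional[T]:
        # illustrative only: a real implementation would parse the TOML file
        return cls()

class JournalData(TomlDataClassIO):
    pass

journal = JournalData.read("journal.toml")
# mypy infers Optional[JournalData], not Optional[TomlDataClassIO]
```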
isinstance, type, mro and getattr
```python
class Shape:
    def __init__(self, name: str):
        self.name = name

class Square(Shape):
    def __init__(self, name: str, length: int):
        super().__init__(name)
        self.length = length

sh = Shape("my-shape")
sq = Square("my-square", 10)

t_sh = type(sh)
t_sq = type(sq)

isinstance(sq, Shape)  # True
type(sh) == t_sh       # True (<class 'Shape'>)
type(sq) == t_sq       # True (<class 'Square'>)

# Given a type, how to check if an instance of this type is an
# instance of a base class?

# 1. Instantiate the type
isinstance(t_sq("my-other-square", 5), Shape)  # True

# 2. Use mro()
Shape in t_sq.mro()  # True
```
Note
MRO stands for Method Resolution Order.
Some types do not have the "mro" attribute. Make sure to check for it
with the getattr function.
```python
from typing import Optional

ts = str
to = Optional[str]

getattr(ts, "mro", None)  # returns the mro builtin method
getattr(to, "mro", None)  # None — typing constructs have no mro
```
Origins of Python Functional Features
Take aways:
- Python has had functions as first-class objects since the beginning, but it was not intended to be a functional programming language.
- Lambda is more of a syntactic feature. It is less powerful than lambdas in other programming languages. Be careful when using variables from the scope where the lambda is defined.
- The bike shed and the atomic bomb plant
Inheritance x Composition
Takeaways
- `super()` returns a proxy object.
- You can interfere in the hierarchy lookup by passing `super` some parameters. But doing so might be an indication of a design issue.
- *is a* (base and derived) and *has a* (component and composite) relationships.
- Duck typing makes the declaration of interfaces unnecessary. But you can still create abstract classes for that (what about Protocols?).
- When doing multiple inheritance, think about inheriting from one base class and possibly implementing several interfaces.
- Multiple inheritance can lead to unexpected resolution of `__init__` due to MRO. This is particularly problematic if you run into a class hierarchy design issue called the diamond problem.
- Composition is a loosely-coupled alternative to class inheritance. Usually, this design is more flexible than class inheritance (which is vulnerable to class explosion and the diamond problem).
- Mixins are a kind of component, but implemented via inheritance. They are more strongly coupled, but can be useful to inherit an interface implementation.
Danger
Mixins should not be viewed as base classes: they do not model an *is a* relationship.
A mixin is also not a component. It is more like a *skill*.
Indeed, it is good to make clear in the class name that the class is a Mixin.
Note
The TomlDataClassIO fits better in the concept of a Mixin.
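A sketch of a mixin as a "skill" (the `JsonMixin` name and the `Journal` dataclass are made up for illustration):

```python
import dataclasses
import json

class JsonMixin:
    """Adds a JSON-serialization skill to any dataclass that inherits it."""
    def to_json(self) -> str:
        return json.dumps(dataclasses.asdict(self))

@dataclasses.dataclass
class Journal(JsonMixin):
    name: str
    pages: int

Journal("travel", 42).to_json()  # '{"name": "travel", "pages": 42}'
```

`Journal` *is not* a `JsonMixin` conceptually; it merely *includes* the serialization behavior.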
Superclass and subclass terminology
A derived class D inherits all the methods of its base class B and
eventually implements some others. The set of methods of D is a superset of
the set of methods of B. This could lead to confusion with respect to the
superclass and subclass terminology.
However, the terminology is actually correct. The super and sub terminology
refers to the set of instances, not the set of methods or attributes: every
instance of D is also an instance of (or can be thought of as an instance of) B.
In a concrete example: every Mammal is an Animal, but not every Animal is a Mammal. Therefore, the set of Animals is a superset of the set of Mammals, which makes the Mammals a subset of the Animals.
This was once asked on Stack Overflow.
Danger
The confusion is not out of place though. In OOP there is a technique called
mixin which works the same way as regular inheritance. But conceptually, it is
different. When a class D inherits a mixin B, we are not modelling the
relation *is-a*. Instead, we are modelling *includes-a*. Under this concept,
it makes much more sense to say that D is a superclass of B.
Method Resolution Order
Every Python class has a `__mro__` attribute that tells you the order in which methods are
going to be resolved. It is an ordered list of classes in which Python will look, in order, while
searching for a method after a call.
One of the things that interferes with the MRO is the order of classes in multiple inheritance designs: the lookup goes bottom-to-top and left-to-right.
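A minimal diamond hierarchy makes the bottom-to-top, left-to-right order visible:

```python
class A:
    def who(self): return "A"

class B(A):
    def who(self): return "B"

class C(A):
    def who(self): return "C"

# Diamond: D inherits from both B and C, which share the base A
class D(B, C):
    pass

print([k.__name__ for k in D.__mro__])  # ['D', 'B', 'C', 'A', 'object']
print(D().who())                        # 'B' — the left-most parent wins
```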
Special type hinting in Python
- Callback protocols in mypy
- New features in the typing module in Python 3.10
- Python properties
- Descriptor Protocol
Takeaways
- Protocols are the formalization of duck typing. You can think of a Protocol P as a type specified in the form of a class with attributes and methods. Every class that has the attributes and methods specified in the Protocol P is said to be of type P.
- Particularly useful in static type checking.
- Lightweight interfaces (also more flexible than the tight-coupled regular class interfaces).
- Use properties for:
    - read-only attributes;
    - lazy computation;
    - API compatibility.
- Properties are descriptors. A descriptor is any class that implements the descriptor protocol,
that is, one or more of the methods below:
    - `__get__(self, obj, type=None) -> object`
    - `__set__(self, obj, value) -> None`
    - `__delete__(self, obj) -> None`
    - `__set_name__(self, owner, name)`
- If it implements only `__get__`, then it is a non-data descriptor. If it also implements `__set__` or `__delete__`, then it is a data descriptor.
- Lookup chain: the order in which Python accesses attributes:
    1. Data descriptor's `__get__`;
    2. Object's `__dict__`;
    3. Non-data descriptor's `__get__`;
    4. Class' `__dict__`;
    5. Parent classes' `__dict__`, repeated until there are no more base classes;
    6. Raise `AttributeError`.
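A sketch of a non-data descriptor interacting with the lookup chain (the `Lazy` class and `Circle` example are hypothetical, not from the project):

```python
class Lazy:
    """Non-data descriptor: computes a value once and caches it in the
    instance __dict__, which shadows the descriptor on later lookups."""

    def __init__(self, func):
        self.func = func

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        obj.__dict__[self.name] = value  # instance __dict__ wins next time
        return value

class Circle:
    def __init__(self, r):
        self.r = r

    @Lazy
    def area(self):
        return 3.14159 * self.r ** 2

c = Circle(2)
print(c.area)                # computed on first access
print("area" in c.__dict__)  # True — cached; descriptor no longer consulted
```

Because `Lazy` defines only `__get__`, the instance `__dict__` entry takes precedence afterwards, exactly as the lookup chain above predicts.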
Python multiprocessing library
- https://docs.python.org/3/library/multiprocessing.html
- https://docs.python.org/3/library/asyncio-subprocess.html
- https://itnext.io/practical-guide-to-async-threading-multiprocessing-958e57d7bbb8
Signal handling
- https://superfastpython.com/kill-a-process-in-python/
- Testing sys.exit with pytest
Dictionary expansion in function calls
```python
# Expanding a dictionary in a function call can also fill
# positional arguments.

def f(p1, n1=None, n2=None):
    print(f"{p1},{n1},{n2}")

d1 = {"p1": "positional", "n1": "named 1", "n2": "named 2"}

# Calling with a dict expansion only
f(**d1)
# positional,named 1,named 2

# This raises an error
f(p1="new positional", **d1)
# TypeError: f() got multiple values for keyword argument 'p1'

# This one also fails
f("does not work", **d1)
# TypeError: f() got multiple values for argument 'p1'

# Also this one
f(n1="new named 1", **d1)
# TypeError: f() got multiple values for keyword argument 'n1'
```
Python Importing System
Modules - Python Tutorial Python Import System
There are some subtleties in the Python import system that I should master.
My initial model for the importing system is the following:
- Package: A collection of modules (you need to put an `__init__.py` in a directory for Python to recognize it as a package)
- Module: A collection of python instructions (usually grouped in a .py file)
It is based on the filesystem analogy in which directories are packages and modules are files. But this is not
quite correct. From the Python Import System documentation page:
It’s important to keep in mind that all packages are modules, but not all modules are packages. Or put another way, packages are just a special kind of module. Specifically, any module that contains a __path__ attribute is considered a package.
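The `__path__` criterion is easy to verify with a package that ships submodules (here the stdlib `email` package):

```python
import email
import email.mime.text

print(hasattr(email, "__path__"))            # True — email is a package
print(hasattr(email.mime.text, "__path__"))  # False — a plain module
```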
Import examples
```python
# In this form, you have to call it by the complete name
import package.subpackage.module
package.subpackage.module.function_A()

# In this form, you can call function_A or function_B directly
from package.subpackage.module import function_A, function_B
```
Import a package as a namespace
Consider the following file hierarchy:
```
danoan/journal_manager/commands/
    __init__.py
    journal_commands/
        __init__.py
        activate.py
        create.py
        deactivate.py
```
Let us say that I want to access the modules in `journal_commands` from a `jm` namespace, e.g., `jm.activate`.
Then I need to add import statements in the `__init__.py` file of the `journal_commands` package. That is,
# __init__.py
from .activate import activate
from .create import create
from .deactivate import deactivate
Somewhat odd behaviour
There is an odd behaviour though. Assume we have the __init__.py as the one above.
import danoan.journal_manager.commands.journal_commands as jm
from danoan.journal_manager.commands.journal_commands import activate
If we check the globals() function we get
jm: <module 'danoan.journal_manager.commands.journal_commands' from '/home/daniel/Projects/Git/journal-manager/src/danoan/journal_manager/commands/journal_commands/__init__.py'>
'activate': <function activate at 0x7fe10eabc3a0>
I was expecting `activate` to be resolved to the module `activate`. Indeed, it is resolved to the module if
we have an empty `__init__.py` in the `journal_commands` package.
It is not that odd if you consider that the import statements in the `__init__.py` are executed,
and the functions imported there collide with the module names, which are the same. What happens is that
the import overwrites the entry that pointed to the module so that it points to the function instead.
If we write like this:
# __init__.py
from .activate import activate_journal
from .create import create_journal
from .deactivate import deactivate_journal
no collision occurs. jm.activate_journal is a function and activate is a module.
Multiprocessing and Signal Handling
In order to correctly terminate the processes I started during the build
subcommand with the --with-http-server flag (namely the node http-server
and the entr file monitor), I had to register handlers for the SIGINT and
SIGTERM signals, both in the Python and Bash scripts.
```python
import multiprocessing
import signal

t1 = multiprocessing.Process(
    target=node_wrapper.start_server, args=[http_server_folder.joinpath("init.js")]
)
t2 = multiprocessing.Process(target=app_call.start, args=[file_monitor_script])

t1.start()
t2.start()

def terminate_processes(sig, frame):
    print("Terminating http server")
    t1.terminate()
    print("Terminating file-monitor")
    t2.terminate()

signal.signal(signal.SIGINT, terminate_processes)
signal.signal(signal.SIGTERM, terminate_processes)

t1.join()
t2.join()
```
Pypi-server
Setting up the Pypi-server
Set up a service in your docker-compose.yml
```yaml
version: '3.7'
services:
  pypi-server:
    image: pypiserver/pypiserver:latest
    ports:
      - 4962:8080
    volumes:
      - type: bind
        source: /Users/capitu/Services/pypi-server/auth
        target: /data/auth
      - type: volume
        source: pypi-server
        target: /data/packages
    command: -P /data/auth/.htpasswd -a update,download,list /data/packages
    restart: always
volumes:
  pypi-server:
```
After starting the server, you should be able to access the index at capitu_home:4962/packages
Setting up authentication
Create the authentication files with htpasswd
Note
htpasswd is a utility tool to manage user authentication in web servers. It stores usernames and
their password digests (SHA-1 in this case) in a text file.
Uploading your package
Build the distribution with your front-end tool (for example, `build`).
We use twine to upload the package to the server:
pipx install twine
twine upload --repository-url http://192.168.1.14:4962 dev/extract-sdist/output/dist/*
More information can be found here.
Sphinx
Sphinx Templates
It is possible to modify the output generated by sphinx-apidoc by using templates.
- apidoc: Sphinx extension that adds directives able to render information from Python source code.
- sphinx-apidoc: tool that uses the apidoc Sphinx extension to generate API-style documentation.