
01 - Cool new features in Python 3.8

Link to the material: RealPython

Assignment Expressions: The Walrus Operator

It assigns a value to a variable and returns that value in a single expression.

print( current := "Hello")
# Hello

Some use cases for the walrus operator:

While Loops Test

inputs = list()
while ( current := input("Write Something")) != "quit":
    inputs.append(current)

Regex

import re

advertisement = "Today only: 10% discount"  # sample text, made up for illustration
discount = 0.0
if (mo := re.search(r'(\d+)% discount', advertisement)):
    discount = float(mo.group(1)) / 100.0

Stream Reading

# Loop over fixed length blocks
while (block := f.read(256)) != '':
    process(block)
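
A minimal, self-contained sketch of the block-reading loop above. It assumes an in-memory text stream (io.StringIO) in place of the file f, and read_blocks is a hypothetical helper name that stands in for process.

```python
import io


def read_blocks(stream, size=256):
    """Collect fixed-length blocks until read() returns an empty string."""
    blocks = []
    # The walrus operator assigns the block and tests it in one expression.
    while (block := stream.read(size)) != "":
        blocks.append(block)
    return blocks


blocks = read_blocks(io.StringIO("a" * 600))
print([len(b) for b in blocks])  # [256, 256, 88]
```

For a stream opened in binary mode, the sentinel to compare against would be b"" instead of "".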

Positional Only Arguments

You can use / in a function definition to denote that all arguments before it must be specified by position; passing them by keyword raises a TypeError.

def sum(a, b, /):
    return a+b

sum(3,5)
# 8
sum(a=3,b=5)
# Error

It could be handy in situations in which we want to have a default value for a function argument but it doesn't make sense to give a name for it.
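
A small sketch of that situation, assuming a hypothetical incr function whose only argument has a default value but no meaningful public name:

```python
def incr(x=0, /):
    """Return x + 1; x is positional-only but still has a default value."""
    return x + 1


print(incr())    # 1
print(incr(41))  # 42

try:
    incr(x=41)
except TypeError as exc:
    # The name x is not part of the function's public interface.
    print(type(exc).__name__)  # TypeError
```

Because callers cannot use the name x, it can be renamed later without breaking any call site.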

The Keyword Only Argument

You can use * in a function definition to denote that all arguments following it must be keyword arguments.

def to_fahrenheit(*, celsius=0):
    return 32 + celsius*9/5

to_fahrenheit(celsius=40)
# 104.0
to_fahrenheit(40)
# Error

Combining all three types

You can combine positional-only, regular and keyword-only arguments.

def headline(text, /, border="~", *, width=50):
    pass
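
One possible body for headline, as a sketch: it assumes the intent is to center the text in a line of width characters padded with the border character (str.center requires border to be a single character).

```python
def headline(text, /, border="~", *, width=50):
    """Center text in a line of `width` characters padded with `border`."""
    return f" {text} ".center(width, border)


# text is positional-only, border can be passed either way, width is keyword-only.
print(headline("Positional-only Arguments"))
print(headline("Python 3.8", "=", width=38))
print(headline("Python 3.8", border="=", width=38))

try:
    headline("Python 3.8", "=", 38)  # width passed by position
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```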

More precise types

Literal

Use Literal to restrict the values an argument or a return value can take.

def draw_line(type:Literal['horizontal','vertical']) -> None:
    pass

Combined with @overload, Literal lets the static type checker pick the return type of a call from the value of one of its arguments.

from typing import Literal, Union, overload

@overload
def add(a:int,b:int, /, to_roman:Literal[True]) -> str: ...

@overload
def add(a:int,b:int, /, to_roman:Literal[False]) -> int: ...


def add(a:int,b:int, /, to_roman:bool = True) -> Union[str, int]:
    pass

The overload decorator gives more information to the type checker about the behaviour of the function. Each overload "instance" tells the type checker about a valid way to call the function. The decorated declarations are called overloads, and the non-decorated is called the implementation. Ultimately, the implementation is the one to be called, and it should be declared in a way to cover all the cases specified by the declared overloads.

Danger

Notice that after defining overloads, mypy will only accept calls to add that match one of the overloaded declarations
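
A runnable sketch of the add example with a filled-in implementation. The helper _to_roman is hypothetical (it is not in the original material) and only handles positive integers.

```python
from typing import Literal, Union, overload


def _to_roman(number: int) -> str:
    """Hypothetical helper: convert a positive integer to a Roman numeral."""
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in numerals:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


@overload
def add(a: int, b: int, /, to_roman: Literal[True]) -> str: ...
@overload
def add(a: int, b: int, /, to_roman: Literal[False]) -> int: ...


def add(a: int, b: int, /, to_roman: bool = True) -> Union[str, int]:
    """Implementation: covers both overloads declared above."""
    total = a + b
    return _to_roman(total) if to_roman else total


print(add(2, 3, to_roman=True))   # V
print(add(2, 3, to_roman=False))  # 5
```

With the overloads in place, a type checker should infer int for the to_roman=False call, so an expression such as 100 / add(2, 3, to_roman=False) is accepted.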

Union x TypeVar

from typing import TypeVar
AnyStr = TypeVar('AnyStr', str, bytes)

def f1(x: AnyStr, y: AnyStr, /):
    pass

def f2(x: Union[str,bytes], y: Union[str,bytes]):
    pass

# Accepted
f1("hello", "world")
f1(b"hello", b"world")

# Error
f1("hello", b"world")
f1(b"hello", "world")

# Accepted
f2("hello", "world")
f2(b"hello", b"world")
f2("hello", b"world")
f2(b"hello", "world")
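
Why the stricter TypeVar version matters can be seen at runtime. The sketch below assumes hypothetical function names concat and concat_loose whose bodies simply join their arguments; the annotations alone do not prevent the mixed call from running.

```python
from typing import TypeVar, Union

AnyStr = TypeVar("AnyStr", str, bytes)


def concat(x: AnyStr, y: AnyStr, /) -> AnyStr:
    # Both arguments must resolve to the same type, so `+` is always valid.
    return x + y


def concat_loose(x: Union[str, bytes], y: Union[str, bytes]) -> Union[str, bytes]:
    # The annotation admits mixed str/bytes calls, which fail at runtime.
    return x + y  # type: ignore[operator]


print(concat("hello", "world"))     # helloworld
print(concat(b"hello", b"world"))   # b'helloworld'

try:
    concat_loose("hello", b"world")
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```

The typing module already ships an AnyStr type variable defined over str and bytes, so it can be imported instead of redefined.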

Final

It tells the type checker that the variable can be assigned only once.

from typing import Final
from pathlib import Path

base_dir: Final[Path] = Path("/home/user")
...
base_dir = Path("/home/other_user")
# Error

You can use the decorator @final on classes and methods in order to mark them as non-subclassable and non-overridable, respectively.

from typing import final

@final
class Base:
    pass

TypedDict

It is a way to define a dictionary type with a fixed set of string keys, each with its own value type.

from typing import TypedDict

class PythonVersion(TypedDict):
    version: str
    release_year: int

py38: PythonVersion = {"version":"3.8", "release_year":2019}
py39: PythonVersion = PythonVersion(version="3.8", release_year=2019)

Hint

This puts an end to the common idiom

d: Dict[str, Any]
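
A short sketch of how a TypedDict behaves at runtime: the result is a plain dict, and the declared value types are only enforced by the static type checker, not by the interpreter. The variable names py38_kw and unchecked are made up for illustration.

```python
from typing import TypedDict, get_type_hints


class PythonVersion(TypedDict):
    version: str
    release_year: int


py38: PythonVersion = {"version": "3.8", "release_year": 2019}
py38_kw = PythonVersion(version="3.8", release_year=2019)

print(type(py38_kw) is dict)          # True
print(py38 == py38_kw)                # True
print(get_type_hints(PythonVersion))  # the declared key/type pairs

# No runtime validation: this runs, but a type checker would reject it.
unchecked = PythonVersion(version=3.8, release_year="2019")  # type: ignore[typeddict-item]
print(unchecked)
```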

Protocol

Protocols define a sort of interface. If a variable is type-hinted with a protocol, the contents of this variable must have the attributes defined in the protocol.

from typing import Protocol
class Named(Protocol):
    name: str

class Dog:
    pass

class NamedDog:
    name: str

def fn_1(a1: Named = Dog()):
    pass
# Type checker error

def fn_2(a1: Named = NamedDog()):
    pass
# Ok

Protocol x Class Inheritance

Protocol offers an alternative to class inheritance. In which cases is Protocol a more suitable choice than class inheritance? Some references are:

A good example of Protocol application is Iterable. This class is a Protocol. It checks if an object has the __iter__ method.
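
A sketch of that structural check at runtime. collections.abc.Iterable recognizes any class that defines __iter__, and a user-defined protocol can opt into isinstance checks with @runtime_checkable. Countdown is a hypothetical class, and NamedDog is given a concrete name value here (an assumption) because the runtime check looks for the attribute itself, not its annotation.

```python
from collections.abc import Iterable
from typing import Protocol, runtime_checkable


class Countdown:
    """Not a subclass of Iterable, but it defines __iter__."""

    def __init__(self, start: int) -> None:
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


@runtime_checkable
class Named(Protocol):
    name: str


class Dog:
    pass


class NamedDog:
    name: str = "Rex"


print(isinstance(Countdown(3), Iterable))  # True
print(list(Countdown(3)))                  # [3, 2, 1]
print(isinstance(NamedDog(), Named))       # True
print(isinstance(Dog(), Named))            # False
```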

Hint

Predefined protocol in typing reference

References

Simpler Debugging With f-Strings

F-string at a glance (Python 3.6)

import math
from typing import Final

r: Final[float] = 3.8

s = f"Diameter {(diam:=2*r)} gives circumference {math.pi * diam:.2f}"
print(s)

Info

If you want to assign values in the f-string, you need to enclose the expression in parentheses.

Here is what was added in Python 3.8

name="Daniel"
print(f"{name=}")
# name='Daniel'
print(f"{name = :>10}")
# name =     Daniel
print(f"{name.upper()[::-1] = }")
# name.upper()[::-1] = 'LEINAD'

Negative step in list slicing

A negative step makes the slice walk the sequence from the end to the beginning, so [::-1] returns it reversed.

The = specifier is syntactic sugar that allows you to print the expression and its value simultaneously.
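
The behaviour of the = specifier can be checked with a few assignments. This is a sketch, and the variable names plain, padded, reverse and as_str are made up for illustration.

```python
name = "Daniel"

# Without a format spec, `=` shows repr() of the value, hence the quotes.
plain = f"{name=}"
# With a format spec, the value goes through format() instead of repr().
padded = f"{name = :>10}"
# Any expression works, and the spaces around `=` are preserved.
reverse = f"{name.upper()[::-1] = }"
# `!s` forces str() when the quotes are not wanted.
as_str = f"{name=!s}"

print(plain)    # name='Daniel'
print(padded)   # name =     Daniel
print(reverse)  # name.upper()[::-1] = 'LEINAD'
print(as_str)   # name=Daniel
```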

The python steering council

Not much to say about it here, but a podcast was cited and I am curious to know its format:

importlib.metadata

This is a companion library to importlib.resources and it allows you to get metadata information about packages.

from importlib import metadata

metadata.version("pip")
# 19.3.1
metadata.metadata("pip")["Home-page"]
# https://pip.pypa.io
metadata.requires("realpython-reader")
# ['feed-parser', 'html2text', 'importlib-resources', 'typing']

Hint

The underscore gives you the last returned value in a python REPL interpreter.

Improved math and statistics

# Math
math.prod() # Product of the factors in an iterable
math.isqrt() # Integer part of the square root
math.dist() # Euclidean distance between two points
math.hypot() # Euclidean norm

# Statistics
statistics.fmean # Mean of floating-point numbers
statistics.geometric_mean # Geometric mean
statistics.multimode # Returns the most frequent values in a sequence
statistics.quantiles # Calculates cut points for dividing data into n continuous intervals with equal probability

Statistic Libraries

Check out statsmodels and scipy.stats
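
A quick tour of the functions listed above with small, made-up inputs (a sketch, not taken from the course material):

```python
import math
import statistics

print(math.prod([2, 3, 4]))       # 24
print(math.isqrt(17))             # 4
print(math.dist((0, 0), (3, 4)))  # 5.0
print(math.hypot(3, 4))           # 5.0

data = [1, 2, 3, 4, 5]
print(statistics.fmean(data))             # 3.0
print(statistics.multimode("aabbc"))      # ['a', 'b']
print(statistics.quantiles(data, n=4))    # [1.5, 3.0, 4.5]
print(statistics.geometric_mean([1, 4]))  # approximately 2.0
```

quantiles uses its default "exclusive" method here; passing method="inclusive" gives different cut points.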

Warnings about dangerous syntax

# Warning about dubious use of `is` where `==` was intended

version = "3.7"
if version == "3.7":
    pass  # Value equality

if version is "3.7":
    # Checks if version is the same object as the literal "3.7" (same address in memory)
    # A SyntaxWarning is emitted in this example
    pass

# Warning about missing commas in a list of tuples
v = [
    (1,2)
    (3,4)
]
# Previously, it simply raised a TypeError saying that a tuple is not callable.
# Now a SyntaxWarning also hints at the missing comma.
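
These warnings are raised when the source is compiled, so they can be observed without running the dubious code. A sketch using a hypothetical helper named syntax_warnings:

```python
import warnings


def syntax_warnings(source: str) -> list:
    """Compile `source` and return the SyntaxWarning messages it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<demo>", "exec")
    return [str(w.message) for w in caught if issubclass(w.category, SyntaxWarning)]


is_literal = syntax_warnings('version = "3.7"\nversion is "3.7"')
missing_comma = syntax_warnings("v = [(1, 2) (3, 4)]")
equality = syntax_warnings('version = "3.7"\nversion == "3.7"')

print(is_literal)     # warning about `is` with a literal
print(missing_comma)  # warning about a tuple being called
print(equality)       # []
```

The exact message text varies between Python versions, so the sketch only collects it instead of comparing against a fixed string.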

Optimizations

  • Faster namedtuples
  • Smaller memory footprint for lists initialized with a known length

Exercises

  • In the course they mention bpython, which is an alternative Python REPL. It is really cool! It has nice features such as syntax highlighting and instantaneous autocompletion.

Note: It seems that it is not maintained anymore, though. The last version dates back to 2014. Maybe IPython is a better option. I also could not use bpython with an arbitrary Python version (from what I understood, I need to install it in a different venv to use a different Python version).

pipx install bpython
  • Write a program using bpython that reads a string stream 256 bytes at time.
  • Check if it is easier to use IPython with different Python versions from the command line.

Note: It is not possible to do that with either IPython or bpython. I tried to isolate the installation of IPython with pipx, but pipx itself is not isolated, so I couldn't manage different versions of IPython from pipx.

  • Check out pyenv and evaluate whether it is worth installing.

Note: No. The application asdf already covers the main feature of pyenv that I am interested in, that is, switching between Python versions.

  • Read this article on how to use the @overload decorator.

Note: The article highlights another reason to use it. You may have side-effects on mypy type checking if you don't use the overload decorator. In the code example in section Literal, if we don't use the overload decorator, the return type of add according to mypy would be Union[str, int]. Therefore, the following command would be marked as an error by mypy:

result = 100/add(2,3)

  • Create a function that returns the value of a transaction in number of cents (int) or dollar quantity (float).

Note: sandbox/literal-and-overload.py. This reference was helpful.

  • What should we consider when defining a method as a classmethod or staticmethod? Why should someone prefer one over the other?

Note: A staticmethod receives neither the class nor the instance, so it has no implicit access to class or instance attributes; a classmethod has access to class attributes via its first parameter, which is the calling class itself. It can be used in the factory pattern, for example. Finally, we have instance methods, which have access to the attributes of an instance of the class.
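
The three kinds of methods side by side, as a sketch with hypothetical Temperature and LabTemperature classes:

```python
class Temperature:
    unit = "C"  # class attribute shared by all instances

    def __init__(self, degrees: float) -> None:
        self.degrees = degrees  # instance attribute

    @classmethod
    def from_fahrenheit(cls, fahrenheit: float) -> "Temperature":
        # Factory: `cls` is the calling class, so subclasses build their own type.
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def is_valid(degrees: float) -> bool:
        # Receives neither `self` nor `cls`; a plain function in the class namespace.
        return degrees >= -273.15

    def describe(self) -> str:
        # Instance method: reads instance and class attributes through `self`.
        return f"{self.degrees:.1f}{self.unit}"


class LabTemperature(Temperature):
    pass


boiling = LabTemperature.from_fahrenheit(212)
print(type(boiling).__name__)        # LabTemperature
print(boiling.describe())            # 100.0C
print(Temperature.is_valid(-300.0))  # False
```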

  • Protocol is said to be a way to formalize duck typing in python. Look for answers to the question posed at the end of section Protocol.

Note: Consider the following example.

class A:
    name: str
    age: str

class B:
    name: str
    age: str

class P(Protocol):
    name: str
    age: str

According to nominal subtyping, A and B are different types. According to structural subtyping, defined by the protocol P, they represent the same type.

Info

In some structural subtyping systems, the attribute identifiers themselves are not taken into account. The protocol P could have any names as attributes. All that would matter is that the subtype holds two string values. Can I give an example of this? I can't think of a use case in which this would be useful...

Both approaches are valid but they should be used in different scenarios.

class ShopItem:
    id: int
    name: str

class BookItem(ShopItem):
    author: str
    pages: int

class BoardGame(ShopItem):
    min_players: int
    max_players: int

# ------------------------------------------------

class Employee:
    id: int
    name: str

class RemoteEmployee(Employee):
    contract_duration: datetime

class OfficeEmployee(Employee):
    office_location: str    

# Think about conceptually distinct classes but with the same attributes
# Even being conceptually distinct, they may implement a kind of a general/universal method such as __iter__

Notice that ShopItem and Employee hold the same information but they are semantically different. We could model the problem above using protocols.

class Identifiable(Protocol):
    id: int
    name: str

Both ShopItem and Employee implement the Identifiable protocol. This could be used in a display function, for example:

def display(thing_to_display: Identifiable):
    print(f"Id: {thing_to_display.id}, Name: {thing_to_display.name}")

Hint

Protocols look less verbose than a class hierarchy. Personally, I would choose Protocols to define interfaces (equivalent to fully abstract classes in C++). Whenever it makes sense to offer a basic implementation, I would choose a class hierarchy.
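
A runnable sketch of the Identifiable idea. The classes get simple __init__ methods (an assumption, the notes only declare the attributes) and the function is named describe to avoid shadowing the built-in print.

```python
from typing import Protocol


class Identifiable(Protocol):
    id: int
    name: str


class ShopItem:
    def __init__(self, id: int, name: str) -> None:
        self.id = id
        self.name = name


class Employee:
    def __init__(self, id: int, name: str) -> None:
        self.id = id
        self.name = name


def describe(thing: Identifiable) -> str:
    # Accepts any object with `id` and `name`; no common base class is needed.
    return f"Id: {thing.id}, Name: {thing.name}"


print(describe(ShopItem(1, "Chess set")))  # Id: 1, Name: Chess set
print(describe(Employee(7, "Daniel")))     # Id: 7, Name: Daniel
```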

  • Read the f-string guide and write down a maximum of 5 new things learned. Note: Here is my top 5 list:
  • Multiline strings:
    message = (
                f"Good morning, Mr. {name}! "
                f"Welcome to our precincts."
    )
    print(message)
    # Good morning, Mr. Daniel! Welcome to our precincts.
    
  • f-strings are faster than their alternatives format and %.

  • Python has a lot of string formatting features implemented in the form of a mini-language. Read the specification and list the top 5 instructions (with examples) that are the most useful in your opinion.

Formatted string literals specification

f_string          ::=  (literal_char | "{{" | "}}" | replacement_field)*
replacement_field ::=  "{" f_expression ["="] ["!" conversion] [":" format_spec] "}"
f_expression      ::=  (conditional_expression | "*" or_expr)
                         ("," conditional_expression | "," "*" or_expr)* [","]
                       | yield_expression
conversion        ::=  "s" | "r" | "a"
format_spec       ::=  (literal_char | NULL | replacement_field)*
literal_char      ::=  <any code point except "{", "}" or NULL>

Format Specification

format_spec     ::=  [[fill]align][sign][z][#][0][width][grouping_option][.precision][type]
fill            ::=  <any character>
align           ::=  "<" | ">" | "=" | "^"
sign            ::=  "+" | "-" | " "
width           ::=  digit+
grouping_option ::=  "_" | ","
precision       ::=  digit+
type            ::=  "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"

s = "Daniel"
n = 11

print(f"{n:b}")
# 1011
print(f"{s:>10}")
#    Daniel<end>
print(f"{s:=^20}")
#=======Daniel=======<end>
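
A few more format specs that exercise the grammar above (grouping, precision, sign, alternate form, zero padding). The values are made up for illustration.

```python
n = 1234567
ratio = 0.256

examples = {
    "comma grouping": f"{n:,}",
    "underscore grouping": f"{n:_}",
    "percentage": f"{ratio:.1%}",
    "hex with prefix": f"{255:#x}",
    "explicit sign": f"{42:+d}",
    "zero padded float": f"{3.14159:08.3f}",
}

for label, text in examples.items():
    print(f"{label:<20}{text}")
# comma grouping      1,234,567
# underscore grouping 1_234_567
# percentage          25.6%
# hex with prefix     0xff
# explicit sign       +42
# zero padded float   0003.142
```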
  • Python Data Structures looks like a cool reading.
  • Listen to an episode of Talk Python to me. (The episodes are 1 hour long. That's too much!)
  • Listen to an episode of Python Bytes
  • Find out why the new fmean function is 80x faster than the previous mean implementation.
  • Program the example in the section Combining all three types.
  • Use the package timeit to measure the execution time of some function (you may need to pass globals=globals() to the timeit function).
  • Use statistics.NormalDist to create the corresponding normal distribution given your measures.
  • What's the difference between a namedtuple and a dictionary?

Note: A namedtuple is a tuple with extended features. Namely, its members can be accessed by name using the . operator (as in a class). A namedtuple is based on the tuple implementation, which is written in C and is fast (hashing and comparing, for example).
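
A small comparison of the two, as a sketch with a hypothetical Point type:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)
d = {"x": 1, "y": 2}

# A namedtuple supports access by name and by position; a dict only by key.
print(p.x, p[0], d["x"])  # 1 1 1

# A namedtuple is immutable, so "changing" a field creates a new tuple.
moved = p._replace(x=10)
print(p, moved)  # Point(x=1, y=2) Point(x=10, y=2)

# Being a tuple, it is hashable and can be used as a dict key or set member.
labels = {p: "start"}
print(labels[Point(1, 2)])  # start

# Converting between the two is straightforward.
print(p._asdict() == d)  # True
print(Point(**d) == p)   # True
```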

A Dataclass is written in Python and it is based on the dict data structure.

  • You have fast access in a dict, but a tuple needs less space to be in the memory.
  • Dataclasses are mutable; namedtuples are not.
  • Dataclasses vs namedtuples
  • What are the advantages of using pickle over toml or json?

Note: Whenever you need your data to be readable or even manually modified, go with toml. If you don't need a human to read it (and also if the data to be serialized is complex), go with pickle.
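
What "complex" can mean in practice, as a sketch with made-up data: pickle round-trips Python objects such as dates, sets and tuple keys, while json rejects them unless they are converted first.

```python
import json
import pickle
from datetime import date

release = {
    "version": "3.8",
    "released": date(2019, 10, 14),
    "tags": {"walrus", "typing"},
    ("major", "minor"): (3, 8),
}

# pickle restores the same Python objects, types included.
restored = pickle.loads(pickle.dumps(release))
print(restored == release)  # True

# json only knows str, numbers, bool, None, lists and dicts with string keys.
try:
    json.dumps({"released": release["released"]})
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```

One caveat worth keeping in mind: unpickling data from an untrusted source can execute arbitrary code, so pickle is only suitable for data you produced yourself.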