<img src="https://lund-observatory-teaching.github.io/lundpython/imgs/front_4.jpeg" width="1400">

<h1><center> Course website </center></h1>

To download all lecture files and see the schedule, please visit:

[lund-observatory-teaching.github.io/lundpython/](https://lund-observatory-teaching.github.io/lundpython/)


Each lecture contains the following (as notebooks):
- Manual 
- Exercises
- Presentation

---

#### Today we will use a selection of tools to work through an example of improving a piece of code.

In [None]:
def sieve(n):
    primes = []
    test = list(range(2, n + 1))
    while test[0] < n**0.5:
        p = test.pop(0)
        primes.append(p)
        new_list = []
        for n in test:
            if n % p:
                new_list.append(n)
        test = new_list
    return primes + test


primes = sieve(50)
print(primes)

# Testing

#### "If debugging is the process of removing bugs, then programming must be the process of putting them in."
- Edsger W. Dijkstra

In order to find bugs, we test our code. Just running the code is a form of testing, but we can do it in a more structured way.

Since we are inside a Jupyter notebook, we are going to use [ipytest](https://pypi.org/project/ipytest/); see the manual for details.

In [None]:
import ipytest

ipytest.autoconfig()

In [None]:
%%ipytest

def test_sieve():
    assert sieve(19) == [2, 3, 5, 7, 11, 13, 17, 19]
    assert len(sieve(100)) == 25

# So you found a bug

In [None]:
print(sieve(9))

What to do now? 
 - Update test
 - Fix the bug

In [None]:
%%ipytest

# The test that reveals the problems
def test_sieve():
    assert sieve(19) == [2, 3, 5, 7, 11, 13, 17, 19]
    assert sieve(9) == [2, 3, 5, 7] # test for when n is a square of a prime
    assert len(sieve(100)) == 25

When `n` is the square of a prime, the output incorrectly includes `n` as its last element.

This happens because when `test[0] == n**0.5`, the while loop stops one iteration early, so multiples of that last prime (including `n` itself) are never removed.

The solution is to change `while test[0] < n**0.5: ` to `while test[0] <= n**0.5:`. 

In [None]:
def sieve(n):
    primes = []
    test = list(range(2, n + 1))
    while test[0] <= n**0.5:
        p = test.pop(0)
        primes.append(p)
        new_list = []
        for n in test:
            if n % p:
                new_list.append(n)
        test = new_list
    return primes + test


primes = sieve(9)
print(primes)
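
One way to gain confidence in the fix is to compare the sieve against a slow but obviously correct reference. The sketch below restates the fixed function with a list comprehension and assumes a hypothetical helper, `is_prime_slow` (not part of the lecture code), that uses plain trial division. It checks every `n` up to 200, which covers the prime squares 4, 9, 25, 49, 121, and 169.

```python
def sieve(n):
    """Fixed sieve, written with a list comprehension for the filtering step."""
    primes = []
    test = list(range(2, n + 1))
    while test[0] <= n**0.5:
        p = test.pop(0)
        primes.append(p)
        test = [m for m in test if m % p]  # keep only non-multiples of p
    return primes + test


def is_prime_slow(k):
    """Hypothetical reference helper: trial division by every smaller number."""
    return k >= 2 and all(k % d for d in range(2, k))


# Compare the sieve with the brute-force reference for every n in the range.
for n in range(2, 201):
    expected = [k for k in range(2, n + 1) if is_prime_slow(k)]
    assert sieve(n) == expected, f"sieve({n}) is wrong"
```

If the `<=` is changed back to `<`, the loop above fails at `n = 4`, the smallest square of a prime.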

# Performance optimization & profiling

You might find that code you've written runs (very) slowly. To identify the source of the slowness, you'll want to use a profiler.  

You've already encountered [timeit](https://docs.python.org/3/library/timeit.html) so let's go over some more extensive alternatives.  

First, let's import the things we need.

In [None]:
%load_ext line_profiler
import matplotlib.pyplot as plt

We'll keep working on the `sieve()` function.

In [None]:
x = range(3, 5000)
primes_under_n = [len(sieve(n)) for n in x]

plt.plot(x, primes_under_n)
plt.xlabel("$n$")
plt.ylabel("Number of primes smaller than $n$")

In Jupyter we can use [line_profiler](https://github.com/pyutils/line_profiler) to show which lines take the most time.

In [None]:
def sieve(n):
    primes = []
    test = list(range(2, n + 1))
    while test[0] <= n**0.5:
        p = test.pop(0)
        primes.append(p)
        test = [n for n in test if n % p]  # Overwrite test each loop
    return primes + test

In [None]:
%lprun -f sieve sieve(5000)
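
To check whether a rewrite actually helps, the two variants can also be timed against each other with `timeit`. This is a minimal sketch: the names `sieve_loop` and `sieve_comprehension` are hypothetical labels for the two versions shown so far, and the timings will differ from machine to machine.

```python
import timeit


def sieve_loop(n):
    """Sieve with an explicit inner loop that builds the filtered list."""
    primes = []
    test = list(range(2, n + 1))
    while test[0] <= n**0.5:
        p = test.pop(0)
        primes.append(p)
        new_list = []
        for m in test:
            if m % p:
                new_list.append(m)
        test = new_list
    return primes + test


def sieve_comprehension(n):
    """Same algorithm, with the inner loop replaced by a list comprehension."""
    primes = []
    test = list(range(2, n + 1))
    while test[0] <= n**0.5:
        p = test.pop(0)
        primes.append(p)
        test = [m for m in test if m % p]
    return primes + test


# Both versions must agree before a speed comparison means anything.
assert sieve_loop(5000) == sieve_comprehension(5000)

repeats = 20
for func in (sieve_loop, sieve_comprehension):
    seconds = timeit.timeit(lambda: func(5000), number=repeats)
    print(f"{func.__name__}: {seconds / repeats * 1e3:.2f} ms per call")
```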

In [None]:
%%ipytest

def test_sieve():
    assert sieve(19) == [2, 3, 5, 7, 11, 13, 17, 19]
    assert sieve(9) == [2, 3, 5, 7] # test for when n is a perfect square of a prime
    assert len(sieve(100)) == 25

# Command line
When profiling on the command line, I again encourage you to use `line_profiler`. 

We will need the following command.

`kernprof -l -v sieve.py`

##### Run in command line

In [None]:
@profile
def sieve(n):
    primes = [2]
    test = list(range(3, n + 1, 2))
    while test[0] ** 2 <= n:
        p = test.pop(0)
        primes.append(p)
        test = [n for n in test if n % p]  # Overwrite test each loop
    return primes + test


primes = sieve(5000)
print(len(primes))
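
Note that `@profile` is only defined when the script is started through `kernprof -l`, which injects the name into Python's builtins. A common workaround, sketched here under the assumption that the script is saved as `sieve.py`, is to fall back to a do-nothing decorator so that the same file also runs with plain `python sieve.py`.

```python
# sieve.py (assumed file name, matching the kernprof command above)
try:
    profile  # defined only when the script is run via `kernprof -l`
except NameError:

    def profile(func):
        """Fallback no-op decorator used when kernprof is not active."""
        return func


@profile
def sieve(n):
    primes = [2]
    test = list(range(3, n + 1, 2))  # odd candidates only
    while test[0] ** 2 <= n:
        p = test.pop(0)
        primes.append(p)
        test = [m for m in test if m % p]
    return primes + test


if __name__ == "__main__":
    print(len(sieve(5000)))
```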

In [None]:
%%ipytest

def test_sieve():
    assert sieve(19) == [2, 3, 5, 7, 11, 13, 17, 19]
    assert sieve(9) == [2, 3, 5, 7] # test for when n is a perfect square of a prime
    assert len(sieve(100)) == 25

# Spyder
Spyder also supports `line_profiler`, through the [spyder-line-profiler](https://github.com/spyder-ide/spyder-line-profiler) plugin.

Once installed you can use it by placing a `@profile` decorator in front of the functions that you want to be profiled. Then either press Shift + F10 or go to `Run > Profile line by line` to start the profiler.

A quick demonstration!

<video controls width="900" src="https://lund-observatory-teaching.github.io/lundpython/imgs/spyder_line_profiler.mov" />

# PEP8

Recall from lecture 1 [The Python Style Guide](https://www.python.org/dev/peps/pep-0008/) a.k.a. PEP8.  

It is a lengthy document that can be hard to memorize. Instead, there are nifty tools one can use to check the PEP8 compliance of a script and/or fix issues automatically. Here we will be looking at two of the most prominent such tools and demonstrating them against the code below.

##### Code example:

In [None]:
a='This code is not PEP 8 compliant! Not only will the linter get very upset, it will make sure you will be upset too.'
for sentence in a.split( '! ' ):
  print(sentence ,end ='\n\n')# Notice how the Python interpreter does not require 4 space indents"

# [`Ruff`](https://docs.astral.sh/ruff/)

Ruff is an extremely fast Python linter that implements existing style rules and appears to be becoming the de facto standard. 

It can fix some issues automatically and has several editor integrations. To run it (once installed), use the following command.

`ruff check is_this_pep8.py`
##### Run in terminal.
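
For comparison, a hand-corrected version of the code example might look like the sketch below. It is not the output of any tool, and the names `text` and `sentences` are assumptions chosen for readability.

```python
# The long string is split across two lines to stay within the line-length limit.
text = (
    "This code is not PEP 8 compliant! Not only will the linter get very upset, "
    "it will make sure you will be upset too."
)
sentences = text.split("! ")
for sentence in sentences:
    # Four-space indentation, no space before commas, none around `=` in keywords.
    print(sentence, end="\n\n")
```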

# `Ruff` formatter

Ruff also has a formatter. It rewrites a script in Ruff's own style, taking the formatting decisions away from the user. This means that code formatted this way has a very consistent look.

All the code demonstrated in this course has been formatted with Ruff (except for the deliberately bad examples). 

Automatic code linters and formatters like Ruff are very useful for collaborative projects. Ruff is already used by Pandas, Jupyter, pytest, SciPy, Amazon, CERN, Godot, Hugging Face, IBM, Mozilla, Netflix, and many more.

##### Before applying `ruff`
```python
from seven_dwarfs import Grumpy, Happy, Sleepy, Bashful, Sneezy, Dopey, Doc
x = {  'a':37,'b':42,

'c':927}

x = 123456789.123456789E300

if very_long_variable_name is not None and \
 very_long_variable_name.field > 0 or \
 very_long_variable_name.is_debug:
 z = 'hello '+'world'
else:
 world = 'world'
 a = 'hello {}'.format(world)
 f = rf'hello {world}'
if (this
and that): y = 'hello ''world'#Comment
```


##### After applying `ruff`
```python
from seven_dwarfs import Grumpy, Happy, Sleepy, Bashful, Sneezy, Dopey, Doc

x = {"a": 37, "b": 42, "c": 927}

x = 123456789.123456789e300

if (
    very_long_variable_name is not None
    and very_long_variable_name.field > 0
    or very_long_variable_name.is_debug
):
    z = "hello " + "world"
else:
    world = "world"
    a = "hello {}".format(world)
    f = rf"hello {world}"
if this and that:
    y = "hello world"  # Comment
```

# Docstrings

Docstrings contain documentation information for different functions in Python and we have a few ways of accessing them. But first, let's write our own docstring. We recommend using the [NumPy docstring format](https://numpydoc.readthedocs.io/en/latest/format.html).

In [None]:
def sieve(n):
    """Generate a list of prime numbers smaller than or equal to n.

    Parameters
    ----------
    n : int
        The upper limit (inclusive) of numbers to search.

    Returns
    -------
    list
        List of prime numbers smaller than or equal to n.
    """
    primes = [2]
    to_test = list(range(3, n + 1, 2))
    while to_test[0] ** 2 <= n:
        p = to_test.pop(0)
        primes.append(p)
        to_test = [n for n in to_test if n % p]  # Overwrite to_test each loop
    return primes + to_test

With the `help()` function we can access the docstring, which tells us what a function does and how to call it. Writing docstrings is especially important when we work with other people.

In [None]:
help(sieve)
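
The docstring is also available programmatically. The sketch below uses a small hypothetical function, `double`, to show the `__doc__` attribute and the standard library's `inspect.getdoc`, which returns the same text with the indentation cleaned up.

```python
import inspect


def double(x):
    """Return twice the input.

    Parameters
    ----------
    x : float
        The number to double.

    Returns
    -------
    float
        The value 2 * x.
    """
    return 2 * x


print(double.__doc__)  # the docstring as stored on the function object
print(inspect.getdoc(double))  # the docstring with indentation normalized
```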

Let's also access the docstring of some existing function!

In [None]:
from numpy import identity

help(identity)

Jupyter Notebook also has some useful ways of accessing docstrings. For example, we can use <code style="color:#AA29FF"><b>?</b></code> and <code style="color:#AA29FF"><b>??</b></code> to access the docstring and the source code, respectively.

In [None]:
?identity

In [None]:
??identity

We can also utilize `Shift + Tab` inside a function.  
1 `Tab` brings up a brief docstring.  
2 `Tab` makes it bigger.  
3 `Tab` makes it linger for 10 seconds.  
4 `Tab` opens the pager.

JupyterLab only shows the long description.

In [None]:
identity()

# Progress bars

Recommendation: [`tqdm`](https://tqdm.github.io/)

In [None]:
from time import sleep

from tqdm import tqdm, trange

for t in tqdm((0.5, 1, 0.5, 1)):
    sleep(t)
for _ in trange(5):
    sleep(0.5)
for _ in trange(3, desc="Doing important work: "):
    sleep(0.5)

Inside a Jupyter notebook you might prefer to use the versions of the functions defined in `tqdm.notebook`.

In [None]:
from tqdm.notebook import tqdm, trange

for t in tqdm((0.5, 1, 0.5, 1)):
    sleep(t)
for _ in trange(5):
    sleep(0.5)
for _ in trange(3, desc="Doing important work: "):
    sleep(0.5)

# Git

Git is a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

Using version control will let you keep track of changes and go back if something goes wrong. It will also make group projects much easier to manage. 

It is available for all operating systems and can make your lives a lot easier. A good place to start learning it is with the [Git book](https://git-scm.com/book/en/v2) (it's not super long). 

Knowing how to use Git is also a marketable skill that can go on anyone's CV.

<h1><center> RISE </center></h1>