<img src="https://lund-observatory-teaching.github.io/lundpython/imgs/front_1.jpeg" width="1400">

<h1><center> The course </center></h1>

#### Prior lecturer - Eero Vaher
<br>
<br>

### This lecture mini-series is a companion to:

**ASTM28**: Dynamical Astronomy

**ASTM29**: Statistical tools in astrophysics
<br>
<br>
<br>
<br>

## Today's Lecture: Basics

<h1><center> Course website </center></h1>

To download all lecture files and see the schedule, please visit:

[lund-observatory-teaching.github.io/lundpython/](https://lund-observatory-teaching.github.io/lundpython/)


Each lecture consists of three notebooks:
- Manual 
- Exercises
- Presentation

---

# Go to www.menti.com and enter code: xxxx xxxx

Python versions and what they bring:

 - 3.9   Required for ``astropy`` 5.3 or newer
 - 3.10  More helpful error messages and pattern matching
 - 3.11  Even better error messages and better performance
 - 3.12  Better f-strings and better error messages again
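
To find out which of these features are available, one can ask the running interpreter for its version. The following is a minimal sketch using the standard-library `sys` module; the threshold `(3, 9)` is simply the oldest version in the list above, and the variable names are only illustrative.

```python
import sys

# sys.version_info compares like a tuple, e.g. (3, 11, 4, 'final', 0).
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor}")

# Illustrative threshold: 3.9 is the oldest version in the list above.
meets_minimum = sys.version_info >= (3, 9)
print(f"At least Python 3.9: {meets_minimum}")
```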

A comparison of old and new error messages:
```python
expected = {9: 1, 18: 2, 19: 2, 27: 3, 28: 3, 29: 3, 36: 4, 37: 4,
            38: 4, 39: 4, 45: 5, 46: 5, 47: 5, 48: 5, 49: 5, 54: 6,
some_other_code = foo()
```
Pre 3.10 output:
```python
File "example.py", line 3
    some_other_code = foo()
                    ^
SyntaxError: invalid syntax
```
3.10 output:
```python
File "example.py", line 1
    expected = {9: 1, 18: 2, 19: 2, 27: 3, 28: 3, 29: 3, 36: 4, 37: 4,
               ^
SyntaxError: '{' was never closed
```

# What is Python good for?

Python is (arguably) the world's most popular programming language, and one of the most versatile.
The table below is from a [2023 Stackscale blog post](https://www.stackscale.com/blog/most-popular-programming-languages/).

<img src="https://lund-observatory-teaching.github.io/lundpython/imgs/languages.png" width="900">

Knowing this, let's review some of the things Python is used for!

## 1) Financial sector

HackerRank showed in a [blog post from 2016](https://blog.hackerrank.com/emerging-languages-still-overshadowed-by-incumbents-java-python-in-coding-interviews/) that Python consistently ranked high among the programming languages prioritized when hiring developers in finance.

<img src="https://lund-observatory-teaching.github.io/lundpython/imgs/finance.png"   width=1400 height=500/>  

## 2) Web development & Building web apps
Python is an excellent language for web development and has pre-built libraries and web frameworks like [Django](https://www.djangoproject.com/) and [Flask](https://flask.palletsprojects.com/en/1.1.x/).  

In fact, **reddit** is written in Python and available on GitHub (https://github.com/reddit-archive/reddit)!

<img src="https://lund-observatory-teaching.github.io/lundpython/imgs/reddit.png" width="1500">

## 3) Startups

It is very common for startups to use Python.
[Dropbox](https://www.dropbox.com/?landing=dbv2) was started by Drew Houston, a student who kept forgetting his flash drive, so he built it for himself in Python.

## 4) Data science
An impressive range of science can be done using Python.
For example, we have the libraries:

<img src="https://lund-observatory-teaching.github.io/lundpython/imgs/astropy.png"    width="400" align="right" style="margin: 0px 0px 0px 0px;"/> 

- [Astropy](https://www.astropy.org): Astronomy

<img src="https://lund-observatory-teaching.github.io/lundpython/imgs/biopython.png"  width="260" align="right" style="margin: 0px 140px 0px 0px;"/>  


- [Biopython](https://biopython.org): Biology & Bioinformatics

<img src="https://lund-observatory-teaching.github.io/lundpython/imgs/graph-tool.png" width="400" align="right" style="margin: 0px 0px 10px 0px;"/>  

- [Graph-tool](https://graph-tool.skewed.de): Statistical analysis of graphs

<img src="https://lund-observatory-teaching.github.io/lundpython/imgs/psychopy.png"   width="290" align="right" style="margin: 0px 110px 0px 0px;"/>

- [Psychopy](https://www.psychopy.org): Neuroscience & Experimental Psychology

<img src="https://s3.amazonaws.com/aasie/images/0004-637X/935/2/null/apjac7c74f1_hr.jpg" width="1000" align="top">

Number of astronomy papers that refer to different programming languages. Python has been the most commonly referenced language since 2016. Figure by [Astropy Collaboration (2022)](https://ui.adsabs.harvard.edu/abs/2022ApJ...935..167A/abstract).

<h1><center> Writing Python </center></h1>

We recommend following the guide made by NumPy for installing Python:

[https://numpy.org/install/](https://numpy.org/install/#python-numpy-install-guide)

It covers all cases with different operating systems and user experience levels. 

There are two common programs you might want to use for writing Python:

### [Jupyter Notebook](https://jupyter.org/)
> The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.

### [Spyder](https://www.spyder-ide.org/)
> Spyder is a powerful scientific environment written in Python, for Python, and designed by and for scientists, engineers and data analysts. It features a unique combination of the advanced editing, analysis, debugging, and profiling functionality of a comprehensive development tool with the data exploration, interactive execution, deep inspection, and beautiful visualization capabilities of a scientific package.

# Let's get started with a brief introduction!
## Hello, World!

Let's make Python say `Hello, World!`

In [None]:
print("Hello, World!")

In [None]:
print('Hello, World!')

And now we try something more complicated.

In [None]:
a = 4
b = 5
print(f"The sum of {a=} and {b=} is {a+b = }.")
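
The `=` specifier used above (available since Python 3.8) prints both the expression and its value. The sketch below shows it next to a few related f-string features; the variable names are purely illustrative.

```python
a = 4
b = 5

# "=" echoes the expression text followed by its value.
with_equals = f"{a=} and {b=}"
print(with_equals)

# Without "=" only the value is inserted.
plain = f"{a} + {b} is {a + b}"
print(plain)

# A format specification after ":" controls the presentation.
third = f"{1 / 3:.3f}"      # three decimal places
print(third)
big = f"{433_494_437:,}"    # thousands separators
print(big)
```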

# Built-in types

Some of the Python built-in types you will use a lot are:

In [None]:
for elem in (5, 5.0, "5", [5], (5,)):
    print(f"{elem} is {type(elem).__name__}")

## Booleans

There are two Boolean values: `True` and `False`.

If the result of converting an object to a Boolean is `True`, the object is called truthy; falsy objects are those that get converted to `False`.

Truthy and falsy objects can be used directly in if-statements and while-loops.

In [None]:
for elem in (2, -1, "", 0.0, [], [0], (), ((),)):
    if elem:
        print(f"{elem} is truthy")
    else:
        print(f"{elem} is falsy")

Numerical representations of 0 and empty containers are falsy.

But `numpy` arrays are an exception: using an array with more than one element in a Boolean context raises a `ValueError`!
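
To illustrate the `numpy` caveat, here is a small sketch (assuming `numpy` is installed) of what happens when an array with several elements is used where a Boolean is expected, together with explicit alternatives. The variable names are only illustrative.

```python
import numpy as np

arr = np.array([0, 0])

# The truth value of a multi-element array is ambiguous,
# so NumPy raises an error instead of guessing.
try:
    bool(arr)
    ambiguous = False
except ValueError as err:
    ambiguous = True
    print(f"ValueError: {err}")

# Be explicit about the question being asked instead.
print(arr.size == 0)  # is the array empty?
print(arr.any())      # is at least one element non-zero?
print(arr.all())      # are all elements non-zero?
```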

## Sequences

Sequences contain elements that can be accessed with an index. Indexing starts at 0, and negative indices count from the end.

In [None]:
for seq in ([0, 1], (2, 3), range(5), "Hello, World!"):
    print(
        f'The {type(seq).__name__} "{seq}" starts with {seq[0]} and ends with {seq[-1]}',
    )

But the index is not needed for looping over the elements.

In [None]:
for letter in "word":
    print(letter)

# Let's write a primality test
## And learn not to repeat ourselves

A prime number is an integer greater than 1 that cannot be written as a product of two smaller positive integers.
Primality can be tested through trial division.

In [None]:
n = 4
if not n % 2:
    print(f"{n} is not prime")
elif not n % 3:
    print(f"{n} is not prime")
else:
    print(f"{n} is prime")

n = 5
if not n % 2:
    print(f"{n} is not prime")
elif not n % 3:
    print(f"{n} is not prime")
elif not n % 4:
    print(f"{n} is not prime")
else:
    print(f"{n} is prime")

But what if we want to test a larger number, e.g. 23?
We can avoid repeating code if we use a while-loop.

In [None]:
n = 23
i = 2
while i < n:
    if not n % i:
        print(f"{n} is not prime")
        break
    i += 1
if i == n:
    print(f"{n} is prime")

In [None]:
n = 1023
i = 2
while i < n:
    if not n % i:
        print(f"{n} is not prime")
        break
    i += 1
if i == n:
    print(f"{n} is prime")

Apparently 1023 has non-trivial factors. Let's see what they are.

In [None]:
n = 1023
i = 2
while i < n:
    if not n % i:
        print(f"{n} is not prime, it can be written as {i}*{n//i}")
        break
    i += 1
if i == n:
    print(f"{n} is prime")

What about 341, can it be factorised?

In [None]:
n = 341
i = 2
while i < n:
    if not n % i:
        print(f"{n} is not prime, it can be written as {i}*{n//i}")
        break
    i += 1
if i == n:
    print(f"{n} is prime")

Let's try an even larger number.
Inside a notebook we can use the `timeit` magic command to see how much time it takes to run.

In [None]:
%%timeit -n1 -r1

n = 433_494_437
i = 2
while i < n:
    if not n % i:
        print(f"{n} is not prime, it can be written as {i}*{n//i}")
        break
    i += 1
if i == n:
    print(f"{n} is prime")

The while-loop means we don't have to write out every single trial division separately, but we are still repeating code each time we test a new number.
We can save ourselves a lot of trouble if we define a function.

In [None]:
def is_prime(n, verbose=True):
    if n < 2:
        if verbose:
            print(f"{n} is not prime")
        return False
    i = 2
    while i < n:
        if not n % i:
            if verbose:
                print(f"{n} is not prime, it can be written as {i}*{n//i}")
            return False
        i += 1
    if verbose:
        print(f"{n} is prime")
    return True


is_prime(47)
is_prime(48)
huge_prime = 433_494_437
%timeit -n1 -r1 is_prime(huge_prime)

Let's see if we can make the function run any faster.

In [None]:
def is_prime(n, verbose=True):
    if n < 2:
        if verbose:
            print(f"{n:,} is not prime")
        return False
    if not n % 2 and n != 2:
        if verbose:
            print(f"{n:,} is not prime, it can be written as 2*{n//2:,}")
        return False
    i = 3
    while i * i <= n:  # i*i is the smallest possible composite we haven't checked yet
        if not n % i:
            if verbose:
                print(f"{n:,} is not prime, it can be written as {i:,}*{n//i:,}")
            return False
        i += 2
    if verbose:
        print(f"{n:,} is prime")
    return True


is_prime(huge_prime)
%timeit is_prime(huge_prime, verbose=False)

This way our changes are applied to every instance where we use `is_prime()`.
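
Stopping at `i * i <= n` is safe because if `n = a * b` with `a <= b`, then `a * a <= n`, so every composite number has a divisor no larger than its square root. When optimising a function it is also worth checking that the result has not changed. The sketch below compares condensed versions of the two implementations on small inputs; the names `is_prime_slow` and `is_prime_fast` are hypothetical, and the `verbose` printing is left out for brevity.

```python
def is_prime_slow(n):
    """Trial division by every integer from 2 to n - 1."""
    if n < 2:
        return False
    return all(n % i for i in range(2, n))


def is_prime_fast(n):
    """Trial division by 2 and by odd integers up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


# Compare the two implementations on a range of small inputs.
mismatches = [n for n in range(-5, 2000) if is_prime_slow(n) != is_prime_fast(n)]
print(f"Mismatches: {mismatches}")
```

An empty list of mismatches does not prove the fast version correct, but it catches the most common mistakes, such as mishandling 2 or perfect squares.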

# Comments

Too few comments will make your code obscure.

Too many comments will bury useful information in clutter.

Better code together with better function and variable names means fewer comments are needed.

Overly complicated and overcommented code together with bad names:

In [None]:
def norm_lst(list_):
    lst=list_.copy()           # Copy the input
    total=0                    # Initialize the total variable
    for i in range(len(lst)):  # Loop over the elements of the list
        total=total+lst[i]     # Add values to total
    lst_av=total/len(lst)      # Divide total by len(lst)

    for i in range(len(lst)):  # Loop over the elements of the list
        lst[i]/=lst_av         # Divide

    return lst                 # Return the result


start_seq = [20, 24, 28, 29, 300, 560, 578]
print(norm_lst(start_seq))

We named the parameter `list_` to avoid shadowing the built-in `list`.
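
To see why reusing the name `list` is a problem, consider this small sketch. The function `shadowing_demo` is a hypothetical example; it reassigns `list` locally so that the built-in stays intact everywhere else.

```python
def shadowing_demo():
    # Inside this function the name "list" now refers to our own data,
    # not to the built-in type.
    list = [20, 24, 28]
    try:
        list("abc")  # would normally build ['a', 'b', 'c']
    except TypeError as err:
        return str(err)
    return "no error"


message = shadowing_demo()
print(message)
```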

The same code but with better comments.

In [None]:
def norm_lst(list_):
    """Return a normalized list of numbers with mean 1"""
    lst = list_.copy()

    # Compute the mean
    total = 0
    for i in range(len(lst)):
        total = total + lst[i]
    lst_av = total / len(lst)

    # Normalize
    for i in range(len(lst)):
        lst[i] /= lst_av

    return lst


print(norm_lst(start_seq))

Good naming and clear code mean comments might not be needed.

In [None]:
def normalize_list_by_mean(list_):
    """Return a normalized list of numbers with mean 1"""
    mean = sum(list_) / len(list_)
    return [elem / mean for elem in list_]


print(normalize_list_by_mean(start_seq))
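
The docstring promises a result with mean 1, which is easy to check. The sketch below repeats the function so that it runs on its own. It compares with `math.isclose`, because floating-point rounding means the mean is not guaranteed to be exactly 1.

```python
import math
from statistics import mean


def normalize_list_by_mean(list_):
    """Return a normalized list of numbers with mean 1"""
    average = sum(list_) / len(list_)
    return [elem / average for elem in list_]


start_seq = [20, 24, 28, 29, 300, 560, 578]
normalized = normalize_list_by_mean(start_seq)

# Compare with a tolerance rather than with ==.
print(math.isclose(mean(normalized), 1.0))
# The input list is left unchanged.
print(start_seq)
```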


# Absolute and relative paths

A **path** is a string that specifies the location of a file in the file system hierarchy.

An **absolute path** specifies the location independently of the current working directory.

A **relative path** specifies the location relative to the current working directory.

For example, if this notebook were located in `/home/username/Documents/course/project/` and the directory also contained a subdirectory `data/` with the file `input1.csv`, then the absolute path to that file would be `/home/username/Documents/course/project/data/input1.csv` and the relative path would be `data/input1.csv`.

**NB! The path separator is OS-dependent: "/" on Unix-like systems, "\\" on Windows!**

In [None]:
from pathlib import Path

print(Path("data", "input1.csv"))
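
The absolute and relative paths from the example above can be related to each other with `pathlib`. This sketch uses `PurePosixPath` so that the output looks the same on every operating system and no files need to exist. In real code `Path` is usually the better choice.

```python
from pathlib import PurePosixPath

# Directory and file names taken from the example in the text.
project = PurePosixPath("/home/username/Documents/course/project")
relative = PurePosixPath("data", "input1.csv")

# The "/" operator joins path components.
absolute = project / relative

print(relative.is_absolute())         # False
print(absolute.is_absolute())         # True
print(absolute)                       # /home/username/Documents/course/project/data/input1.csv
print(absolute.relative_to(project))  # data/input1.csv
```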


### PEP-8

[PEP-8](https://www.python.org/dev/peps/pep-0008/) is the Python style guide.
The Python interpreter does not care if you follow PEP-8, but people and bots do.
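
As a small illustration, the two hypothetical functions below behave identically, but only the second follows PEP-8 conventions for naming, spacing and layout.

```python
# Legal Python, but it ignores several PEP-8 recommendations:
# CamelCase function name, no spaces around operators, body on the def line.
def MeanOf(x):return sum(x)/len(x)


# The same function following PEP-8: snake_case name, spaces around
# operators, body on its own line.
def mean_of(values):
    return sum(values) / len(values)


data = [1, 2, 3, 4]
print(MeanOf(data) == mean_of(data))
```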

## These topics (and more) are available in greater detail in the accompanying manual notebook!

# www.menti.com code: xxxx xxxx

### Exit question 1: Mark which things below are sequences

$\quad$<b>A)</b> `[1, 2, 3]`<br>
$\quad$<b>B)</b> `(1, 2, 3)`<br>
$\quad$<b>C)</b> `'123'`<br>
$\quad$<b>D)</b> `1.23`

Correct answer: A & B & C

### Exit question 2: What does the following block print?

```python
def add_five(a, b):
    return a+5, b+5

result = add_five(3, 2)
print(result)
```

$\quad$<b>A)</b> 5<br>
$\quad$<b>B)</b> 15<br>
$\quad$<b>C)</b> (8, 7)<br>
$\quad$<b>D)</b> `SyntaxError`

Correct answer: C

### Exit question 3: Mark the correct statements.

$\quad$<b>A)</b> Arithmetic expressions can include ints and floats together.<br>
$\quad$<b>B)</b> All strings must be declared with single quotation marks ( ' )<br>
$\quad$<b>C)</b> PEP-8 is the Python definition. Adherence is required for the code to work.<br>
$\quad$<b>D)</b> The first element of a sequence is always 0.

Correct answer: A
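
The statements in exit question 3 can be checked directly in Python. This is a short sketch with illustrative variable names:

```python
# A) ints and floats can be mixed; the result is a float.
mixed = 1 + 2.5
print(mixed, type(mixed).__name__)

# B) single and double quotation marks create the same string.
same_string = 'Hello' == "Hello"
print(same_string)

# D) the first *index* is 0, but the first element can be anything.
seq = [7, 8, 9]
first = seq[0]
print(first)
```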

# Now it's time to use the manual to solve the exercises.