QTM 151 - Introduction to Statistical Computing II

Lecture 06 - For Loops

Danilo Freire

Emory University

18 September, 2024

And here we go again! Nice to see you all! 😊

Brief recap of last class 📚

Last class we covered:

  • Boolean operators: & and |
    • & is the logical AND operator: both conditions must be true
    • | is the logical OR operator: at least one condition must be true
  • if statements: conditional execution
    • if statements allow you to execute a block of code only if a condition is true
  • else statements: alternative execution
    • else statements allow you to execute a block of code if the condition is false
  • elif statements: multiple conditions
    • elif statements allow you to check multiple conditions
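
As a quick refresher, here is a minimal sketch that puts these pieces together. The variable names and values (age, has_id, group) are made up for illustration and are not from last class's notebook.

```python
# Hypothetical values, chosen only for illustration
age = 31
has_id = True

# & is True only when both conditions are True
both = (age >= 18) & has_id

# | is True when at least one condition is True
either = (age < 18) | has_id

# if / elif / else: only the first matching branch runs
if age < 18:
    group = "minor"
elif age < 65:
    group = "adult"
else:
    group = "senior"

print(both, either, group)
```

With these values the code prints True True adult.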

Something that kept me thinking last class

Was it & or |? 🤔

  • Do you remember this exercise from last class?
    • Check whether age (age = 31) is strictly less than 20, or greater than 30
    • Not in the age range 25-27
  • The first answer was (and we agreed):
age = 31

(age < 20) | (age > 30) 
  • The original answer for the second part was:
(age < 25) | (age > 27)
  • But in class we thought the answer should be:
(age < 25) & (age > 27)
  • However, the original answer was indeed correct! 🤯
  • Can you guys explain why? 🤔
  • Appendix 01

Today’s plan 📋

  • Today, we will talk about
    • Manipulation of lists
    • None to create lists with placeholder elements
    • list.append() method to add elements to a list
    • for loops, a useful tool to iterate over lists
    • How to use for loops to create multiple graphs

A bit more about lists in Python 📝

Lists with blank elements

How to create a list with blank elements?

  • You can create a list of blank (placeholder) elements using the None object
  • None is a special object in Python that represents the absence of a value
  • The type of None is NoneType. It is the only instance of this type
  • None is often used to represent missing values or, in our case today, placeholders
  • Please note that None is not the same as 0, False, or an empty string ''
  • You should also not use quotes with None, as it is not a string
  • Let’s see how to create a list of placeholders using None:
# Simply type "None"
list_answers = [None,None,None]
print(list_answers)
[None, None, None]
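
The points above about None can be checked directly. This is a small sketch; the variable name placeholder is made up for illustration.

```python
placeholder = None

print(type(placeholder))       # <class 'NoneType'>
print(placeholder is None)     # True

# None is not the same as 0, False, or an empty string
print(placeholder == 0)        # False
print(placeholder == False)    # False
print(placeholder == "")       # False

# With quotes, "None" is just a string, not the None object
print("None" == placeholder)   # False
```

The usual way to test for None is with "is None", as in the second print.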

Note

You can read more about None at https://realpython.com/null-in-python/.

Assigning or replacing values to lists

  • You can assign or replace values in a list using the index of the element
  • The index of a list starts at 0
  • We use the following syntax:
# What's the name of your hometown?
list_answers[0] = "Nashville"

print(list_answers)
['Nashville', None, None]

Appending values to lists

The list.append() method

  • You can add elements to a list using the list.append() command
  • This command adds the element to the end of the list
  • You can only add one element at a time
# We can start an empty list with []
# Use the method "new_list.append(item)"
# with an arbitrary value of "item"

new_list = []
new_list.append("Nashville")
new_list.append("Bogota")
# new_list.append()

print(new_list)
['Nashville', 'Bogota']

Extending lists

The list.extend() method

  • You can also add multiple elements to a list using the list.extend() command
  • Here you can add multiple elements at once
my_list = ["Nashville", "Bogota"]
my_list.extend(["Atlanta", "São Paulo", "Rio de Janeiro"])
print(my_list)
['Nashville', 'Bogota', 'Atlanta', 'São Paulo', 'Rio de Janeiro']
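
To see the difference between the two methods, here is a short sketch that passes the same list to append() and to extend(). The list names cities_append and cities_extend are made up for illustration.

```python
# append() adds its argument as ONE element, even if it is a list
cities_append = ["Nashville", "Bogota"]
cities_append.append(["Atlanta", "São Paulo"])
print(cities_append)
print(len(cities_append))   # 3

# extend() adds each element of its argument separately
cities_extend = ["Nashville", "Bogota"]
cities_extend.extend(["Atlanta", "São Paulo"])
print(cities_extend)
print(len(cities_extend))   # 4
```

After append(), the third element is itself a list (a nested list), which is usually not what you want when adding several cities.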

Lists with repetition 🔄

Lists with repeated values

  • You can create a list with repeated values using the * operator
  • The syntax is very simple and is as follows:
    • new_list = [value] * n
# Check our previous list
list_answers
['Nashville', None, None]
  • Now, let’s create a list with repeated values
    • Repeat a single value 30 times
    • Repeat a list 4 times
    • Repeat 8 null values
# Repeat a single value 30 times
list_two_rep = [7] * 30
print(list_two_rep)
[7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7]
# Repeat a list 4 times
list_answers_rep = list_answers * 4 
print(list_answers_rep)
['Nashville', None, None, 'Nashville', None, None, 'Nashville', None, None, 'Nashville', None, None]
# Repeat 8 null values
list_none_rep = [None] * 8 
print(list_none_rep)
[None, None, None, None, None, None, None, None]

Common pitfalls with lists

  • A common mistake is to confuse lists and np.array objects when doing operations
  • Lists are not arrays: arithmetic operators do not act element by element on lists
  • With lists, the + operator concatenates and the * operator repeats
# When you multiply a list by a number, you repeat the list
list_a = [1,2,3]
print(list_a * 4)
[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
# When you add two lists, you concatenate them
list_b = [4,5,6]
print(list_a + list_b)
[1, 2, 3, 4, 5, 6]
# When you multiply an array by a number, you multiply each element
import numpy as np

vec_a = np.array(list_a)
print(vec_a * 4)
[ 4  8 12]
  • Is that clear? 🤓
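
If you want to multiply each element of a plain list without NumPy, one option is a for loop combined with list.append(). This is only a sketch; the name list_times_four is made up for illustration.

```python
list_a = [1, 2, 3]

# Start with an empty list and fill it one element at a time
list_times_four = []
for value in list_a:
    list_times_four.append(value * 4)

print(list_times_four)   # [4, 8, 12]
```

The result matches what vec_a * 4 gives with a NumPy array, but it is stored in a list.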

Counting the length of a list

  • You can count the length of a list using the len() function
# Count the length of the list
print(len(list_answers))
3
print(len(list_two_rep))
30
print(len(list_answers_rep))
12

Try it yourself! 😊

  • Create an empty list called “list_personal”
  • Add two more values using “.append”
  • Find the total length of the list
  • Change the last value to “Last element” using the index
  • Appendix 02

For loops in Python 🔄🐍

What is a for loop?

  • A for loop is a way to iterate over a sequence of elements
  • It is a very useful tool to automate repetitive tasks
  • The syntax is similar to an if statement, including the colon : and the necessary indentation (4 spaces)
  • The syntax is as follows:
for element in sequence:
    do something
variable = [1,2,3,4]

for i in variable:
    print(i)
1
2
3
4

What can you do with a for loop?

  • You can iterate over a list of elements
    • Numbers, strings, or any other type of object
  • You can also iterate over a range of numbers
  • You can even iterate over a list of lists (nested lists) 🤯
    • This is very useful when working with dataframes or matrices
students_scores = [
    ["Alice", 85, 90, 88],
    ["Bob", 78, 82, 84],
    ["Charlie", 92, 95, 93]
]

for i in students_scores:
    print(i)
['Alice', 85, 90, 88]
['Bob', 78, 82, 84]
['Charlie', 92, 95, 93]
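
Inside the loop, each element is itself a list, so you can index into it. The sketch below computes each student's average score. It assumes the first item is the name and the rest are scores, and it uses slicing (student[1:]), which we have not formally covered yet.

```python
students_scores = [
    ["Alice", 85, 90, 88],
    ["Bob", 78, 82, 84],
    ["Charlie", 92, 95, 93]
]

averages = []
for student in students_scores:
    name = student[0]       # first item: the name
    scores = student[1:]    # remaining items: the scores
    average = sum(scores) / len(scores)
    averages.append(average)
    print(name, round(average, 2))
```

This prints one line per student, for example Alice 87.67.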

Some examples

list_ids = ["KIA", "Ferrari", "Ford", "Tesla"]
print("Dear customer, we are writing about your " + list_ids[0] + " car.")
print("Dear customer, we are writing about your " + list_ids[1] + " car.")
print("Dear customer, we are writing about your " + list_ids[2] + " car.")
print("Dear customer, we are writing about your " + list_ids[3] + " car.")
Dear customer, we are writing about your KIA car.
Dear customer, we are writing about your Ferrari car.
Dear customer, we are writing about your Ford car.
Dear customer, we are writing about your Tesla car.
  • This is very boring! 🥱
list_ids = ["KIA", "Ferrari", "Ford", "Tesla"]

for i in list_ids:
    print("Dear customer, we are writing about your " + i + " car.")
Dear customer, we are writing about your KIA car.
Dear customer, we are writing about your Ferrari car.
Dear customer, we are writing about your Ford car.
Dear customer, we are writing about your Tesla car.
  • This is much better! 😎

You can also iterate over additional elements

  • This code iterates over each element in list_ids and three additional elements: ‘a’, ‘b’, and ‘c’.

  • The + operator is used to concatenate list_ids with another list ['a', 'b', 'c'].

  • For each element in this combined list, the loop assigns the element to the variable id and then prints it out using the print function.

for id in list_ids + ['a', 'b', 'c']:
    print(id)
KIA
Ferrari
Ford
Tesla
a
b
c

Customised messages + numbering

  • You can also include numbers in your messages
  • Initialise index = 1 before the loop (just to start at 1, since Python indexes start at 0)
  • Add index = index + 1 at the end of the body
list_ids = ["KIA", "Ferrari", "Ford", "Tesla"]

index = 1
print('We are out of the loop', index)
for id in list_ids:
    print("Dear customer, your position is " + str(index) + " on the waitlist" +
           " and your car brand is " + id )
    index = index + 1
    print('We are inside the loop', index)
We are out of the loop 1
Dear customer, your position is 1 on the waitlist and your car brand is KIA
We are inside the loop 2
Dear customer, your position is 2 on the waitlist and your car brand is Ferrari
We are inside the loop 3
Dear customer, your position is 3 on the waitlist and your car brand is Ford
We are inside the loop 4
Dear customer, your position is 4 on the waitlist and your car brand is Tesla
We are inside the loop 5
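
Python also has a built-in function, enumerate(), that does this counting for you. Here is a sketch of the same waitlist message with it. The names position, brand, and messages are made up for illustration.

```python
list_ids = ["KIA", "Ferrari", "Ford", "Tesla"]

messages = []
# enumerate() returns pairs (counter, element); start=1 begins counting at 1
for position, brand in enumerate(list_ids, start=1):
    message = ("Dear customer, your position is " + str(position) +
               " on the waitlist and your car brand is " + brand)
    messages.append(message)
    print(message)
```

There is no counter to create before the loop and no index = index + 1 line to forget.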

Another example

for i in range(len(list_ids)):
    print("Dear customer, your position is " + str(i+1) + " on the waitlist" +
           " and your car brand is " + list_ids[i])
Dear customer, your position is 1 on the waitlist and your car brand is KIA
Dear customer, your position is 2 on the waitlist and your car brand is Ferrari
Dear customer, your position is 3 on the waitlist and your car brand is Ford
Dear customer, your position is 4 on the waitlist and your car brand is Tesla
  • Who can explain to me what range(len(list_ids)) does?
  • And what does str(i+1) do?
  • Appendix 03

Saving time while coding!

How to make your work easier with for loops

  • Boring 🥱
import pandas as pd
import matplotlib.pyplot as plt

carfeatures = pd.read_csv("data/features.csv")
list_vars = ["acceleration","weight"]

variable_name = "acceleration"
plt.scatter(x = carfeatures[variable_name],
            y = carfeatures["mpg"])
plt.ylabel("mpg")
plt.xlabel(variable_name)
plt.show()

variable_name = "weight"
plt.scatter(x = carfeatures[variable_name], 
            y = carfeatures["mpg"])
plt.ylabel("mpg")
plt.xlabel(variable_name)
plt.show()

Cool 😎

carfeatures = pd.read_csv("data/features.csv")
list_vars = ["acceleration","weight"]

for variable_name in list_vars:
    plt.scatter(x = carfeatures[variable_name],
                y = carfeatures["mpg"])
    plt.ylabel("mpg")
    plt.xlabel(variable_name)
    plt.show()

Even cooler! 🤩

carfeatures = pd.read_csv("data/features.csv")
list_vars   = ["acceleration","weight"]

index = 1
for variable_name in list_vars:
    plt.scatter(x= carfeatures[variable_name], y = carfeatures["mpg"])
    plt.ylabel("mpg")
    plt.xlabel(variable_name)
    plt.title("Figure " + str(index))
    plt.show()
    index = index + 1

Solving many equations at once

  • Compute \(y = x^2 + 2x\) for \(x = 1,2,4,5,6,7,8,9,10\)
# Create a list of x-values
# Create a list of y-values to fill in later.

list_x = [1,2,4,5,6,7,8,9,10]
list_y = [None] * len(list_x)

# Create an index to track the position in list_y
index = 0
for x in list_x:
    list_y[index] = x**2 + 2*x
    index = index + 1

# Display results visually
print(list_y)
plt.scatter(list_x, list_y)
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Scatter plot of Y = X^2 + 2X")
plt.show()
[3, 8, 24, 35, 48, 63, 80, 99, 120]
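
The same values can also be built with list.append() instead of a pre-filled list of None. This is an alternative sketch, not a replacement for the version above.

```python
list_x = [1, 2, 4, 5, 6, 7, 8, 9, 10]

# Start with an empty list and append one y-value per x-value
list_y = []
for x in list_x:
    list_y.append(x**2 + 2*x)

print(list_y)   # [3, 8, 24, 35, 48, 63, 80, 99, 120]
```

With append() you do not need an index variable, because each new value goes to the end of the list.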

And that’s it for today! 🎉

Thank you all for your attention! 🙏🏽😊

Appendix 01

Why was the original answer correct?

  • The original answer was (age < 25) | (age > 27)
  • We thought the answer should be (age < 25) & (age > 27)
  • But the correct answer is the original one
  • Why? 🤔
  • Think about it with me:
    • The & operator checks if both conditions are true simultaneously. So, someone should be less than 25 and greater than 27 at the same time, which is impossible.
    • Therefore, the correct operator is |, which checks if at least one of the conditions is true. For the age to be not in the range 25-27, it must be either less than 25 or greater than 27 (and here the two conditions can never both be true).
    • Remember, the trick was in the not in the range part of the question! 🤓
  • Does that make sense? 😅
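
You can check this yourself by trying both expressions on a few ages. This is a small sketch; the list of test ages is made up for illustration.

```python
# Hypothetical ages: some inside the range 25-27, some outside
ages = [20, 24, 25, 26, 27, 28, 31]

results_or = []
results_and = []
for age in ages:
    results_or.append((age < 25) | (age > 27))
    results_and.append((age < 25) & (age > 27))

print(results_or)    # [True, True, False, False, False, True, True]
print(results_and)   # [False, False, False, False, False, False, False]
```

The | version is True exactly for the ages outside 25-27, while the & version is never True.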

Back to the main text

Appendix 02

  • Create an empty list called “list_personal”
  • Add two more values using “.append”
  • Find the total length of the list
  • Change the last value to “Last element”
list_personal = []

list_personal.append("First element")
list_personal.append("Second element")

print(len(list_personal))
2
# Here I used the index -1 to change the last element
# You could also use the index 1
list_personal[-1] = "Last element" 
print(list_personal)
['First element', 'Last element']
  • Did you get it right? 🤓

Appendix 02 - Continued

  • Here is another way to solve the exercise, now using list.extend() and [1] index
list_personal = []

list_personal.extend(["First element", "Second element"])
print(list_personal)
print(len(list_personal))
['First element', 'Second element']
2
list_personal[1] = "Last element"
print(list_personal)
['First element', 'Last element']

Back to the main text

Appendix 03

  1. range(len(list_ids)):
    • len(list_ids) gets the length (number of items) in the list_ids list.
    • range() then creates a sequence of numbers from 0 up to (but not including) that length.
    • This allows the loop to iterate over each index of the list.
  2. str(i+1):
    • i is the current loop index, starting at 0.
    • i+1 adds 1 to that index so that the position number shown to customers starts at 1. (Remember: Python indexes start at 0.)
    • str() converts the resulting number to a string, so it can be joined to the rest of the message with +.

So, this code loops through each item in list_ids, printing a message for each customer that includes:

  • Their position (index + 1)
  • The corresponding car brand from list_ids

The loop will run once for each item in list_ids, with i taking on values from 0 to len(list_ids) - 1.
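
To make this concrete, here is a small sketch that shows the numbers range() produces and the strings that str(i+1) creates. The names indexes and labels are made up for illustration.

```python
list_ids = ["KIA", "Ferrari", "Ford", "Tesla"]

# Wrapping range() in list() lets us see the numbers it generates
indexes = list(range(len(list_ids)))
print(indexes)   # [0, 1, 2, 3]

# str(i + 1) turns each shifted index into text
labels = []
for i in indexes:
    labels.append(str(i + 1))
print(labels)    # ['1', '2', '3', '4']
```

Without str(), joining a piece of text and a number with + raises a TypeError.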

Back to the main text