QTM 151 - Introduction to Statistical Computing II

Lecture 06 - While and For Loops

Danilo Freire

Emory University

Nice to see you all! 😊

Brief recap of the last class

Recap: conditional statements

In our last lecture, we covered conditional statements:

  • if statement: Executes a block of code if a condition is True.
  • elif statement: Checks a new condition if previous if or elif conditions were False.
  • else statement: Executes a block of code if all preceding if and elif conditions were False.

We also looked at:

  • Combining conditions: using the logical operators and, or, not.
  • Nested if statements: placing one if statement inside another for more complex decision-making.

Recap: Conditional Example

Remember this kind of structure?

age = 22
is_student = True
has_scholarship = False

# Eligible if (under 25 AND a student) OR 
# if has a scholarship
if (age < 25 and is_student) or has_scholarship:
    print("Eligible for program discount.")
else:
    print("Not eligible for program discount.")

This ability to control program flow based on conditions is fundamental, and loops will build upon this by allowing us to repeat actions.

For loops in Python 🔄🐍

What is a for loop?

  • A for loop is a way to iterate over a sequence of elements
  • It is a very useful tool to automate repetitive tasks
  • The syntax is similar to an if statement, including the colon : and the necessary indentation (4 spaces)
  • The syntax is as follows:
for element in sequence:
    do something
variable = [1,2,3,4]

for i in variable:
    print(i)
1
2
3
4

What can you do with a for loop?

  • You can iterate over a list of elements
    • Numbers, strings, or any other type of object
  • You can also iterate over a range of numbers
  • You can even iterate over a list of lists (nested lists) 🤯
    • This is very useful when working with dataframes or matrices
students_scores = [
    ["Alice", 85, 90, 88],
    ["Bob", 78, 82, 84],
    ["Charlie", 92, 95, 93]
]

for i in students_scores:
    print(i)
['Alice', 85, 90, 88]
['Bob', 78, 82, 84]
['Charlie', 92, 95, 93]
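
Each element of a nested list is itself a list, so it can be indexed or sliced inside the loop body. The sketch below assumes the first entry of each row is the name and the rest are scores; the names row, scores, and list_averages are hypothetical and only used for illustration.

```python
students_scores = [
    ["Alice", 85, 90, 88],
    ["Bob", 78, 82, 84],
    ["Charlie", 92, 95, 93],
]

list_averages = []  # hypothetical list to store one average per student
for row in students_scores:
    name = row[0]       # first element: the student's name
    scores = row[1:]    # remaining elements: the three scores
    average = sum(scores) / len(scores)
    list_averages.append(average)
    print(name, round(average, 2))
```

Slicing with row[1:] keeps the loop body the same no matter how many scores each student has.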

Some examples

list_ids = ["KIA", "Ferrari", "Ford", "Tesla"]
print("Dear customer, we are writing about your " + list_ids[0] + " car.")
print("Dear customer, we are writing about your " + list_ids[1] + " car.")
print("Dear customer, we are writing about your " + list_ids[2] + " car.")
print("Dear customer, we are writing about your " + list_ids[3] + " car.")
Dear customer, we are writing about your KIA car.
Dear customer, we are writing about your Ferrari car.
Dear customer, we are writing about your Ford car.
Dear customer, we are writing about your Tesla car.
  • This is very boring! 🥱
list_ids = ["KIA", "Ferrari", "Ford", "Tesla"]

for i in list_ids:
    print("Dear customer, we are writing about your " + i + " car.")
Dear customer, we are writing about your KIA car.
Dear customer, we are writing about your Ferrari car.
Dear customer, we are writing about your Ford car.
Dear customer, we are writing about your Tesla car.
  • This is much better! 😎

You can also iterate over additional elements

  • This code iterates over each element in list_ids and three additional elements: ‘a’, ‘b’, and ‘c’.

  • The + operator is used to concatenate list_ids with another list ['a', 'b', 'c'].

  • For each element in this combined list, the loop assigns the element to the variable id and then prints it out using the print function.

for id in list_ids + ['a', 'b', 'c']:
    print(id)
KIA
Ferrari
Ford
Tesla
a
b
c

Iterating with range()

The range() function is often used with for loops to generate a sequence of numbers.

  • range(stop): Generates numbers from 0 up to (but not including) stop.

    for i in range(5): # Generates 0, 1, 2, 3, 4
        print(i)
    0
    1
    2
    3
    4
  • range(start, stop): Generates numbers from start up to (but not including) stop.

  • range(start, stop, step): Generates numbers from start up to (but not including) stop, with an increment of step.

    for i in range(2, 8, 2): # Generates 2, 4, 6
        print(i)
    2
    4
    6
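
The two-argument form range(start, stop) is described above without an example, so here is a minimal sketch. Wrapping a range in list() is one convenient way to see all of its values at once; the variable names are only illustrative.

```python
# range(start, stop): counts from start up to (but not including) stop
for i in range(3, 6):
    print(i)

# list() turns a range into a regular list, which is handy for inspection
values_two_args = list(range(3, 6))
values_with_step = list(range(2, 8, 2))
print(values_two_args)
print(values_with_step)
```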

Iterating over strings

Strings are sequences of characters, so you can iterate over them directly:

course_name = "QTM 151"
for char in course_name:
    print(char)
Q
T
M
 
1
5
1

This is useful for character-by-character processing.
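
As a small illustration of character-by-character processing, the sketch below counts how many characters in course_name are digits. The counter name num_digits is an assumption made for this example; str.isdigit() is a built-in string method.

```python
course_name = "QTM 151"

num_digits = 0  # hypothetical counter
for char in course_name:
    if char.isdigit():  # True for characters such as "1" or "5"
        num_digits = num_digits + 1

print(num_digits)
```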

Using enumerate()

Sometimes you need both the index and the value of an item in a sequence. The enumerate() function provides this:

Syntax: for index, value in enumerate(sequence):

fruits = ["apple", "banana", "cherry"]
for index, fruit in enumerate(fruits):
    print(f"Index {index}: {fruit}")
Index 0: apple
Index 1: banana
Index 2: cherry

You can also specify a starting index for enumerate(): enumerate(fruits, start=1)
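
A short sketch of the start argument, assuming we want a numbered list that begins at 1. The names position and numbered_fruits are illustrative only.

```python
fruits = ["apple", "banana", "cherry"]

numbered_fruits = []  # hypothetical list collecting the formatted lines
for position, fruit in enumerate(fruits, start=1):
    line = f"{position}. {fruit}"
    numbered_fruits.append(line)
    print(line)
```

With start=1 there is no need to create and update a separate counter variable by hand.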

Customised messages + numbering

  • You can also include numbers in your messages
  • Initialise index = 1 before the loop (so the numbering starts at 1, even though Python indices start at 0)
  • Add index = index + 1 at the end of the body
list_ids = ["KIA", "Ferrari", "Ford", "Tesla"]

index = 1
print('We are out of the loop', index)
for id in list_ids:
    print("Dear customer, your position is " + str(index) + " on the waitlist" +
           " and your car brand is " + id )
    index = index + 1
    print('We are inside the loop', index)
We are out of the loop 1
Dear customer, your position is 1 on the waitlist and your car brand is KIA
We are inside the loop 2
Dear customer, your position is 2 on the waitlist and your car brand is Ferrari
We are inside the loop 3
Dear customer, your position is 3 on the waitlist and your car brand is Ford
We are inside the loop 4
Dear customer, your position is 4 on the waitlist and your car brand is Tesla
We are inside the loop 5

Another example

for i in range(len(list_ids)):
    print("Dear customer, your position is " + str(i+1) + " on the waitlist" +
           " and your car brand is " + list_ids[i])
Dear customer, your position is 1 on the waitlist and your car brand is KIA
Dear customer, your position is 2 on the waitlist and your car brand is Ferrari
Dear customer, your position is 3 on the waitlist and your car brand is Ford
Dear customer, your position is 4 on the waitlist and your car brand is Tesla
  • Who can explain to me what range(len(list_ids)) does?
  • And what does str(i+1) do?
  • Appendix 01

Saving time while coding!

How to make your work easier with for loops

  • Boring 🥱
import pandas as pd
import matplotlib.pyplot as plt

carfeatures = pd.read_csv("data/features.csv")
list_vars = ["acceleration","weight"]

variable_name = "acceleration"
plt.scatter(x = carfeatures[variable_name],
            y = carfeatures["mpg"])
plt.ylabel("mpg")
plt.xlabel(variable_name)
plt.show()

variable_name = "weight"
plt.scatter(x = carfeatures[variable_name], 
            y = carfeatures["mpg"])
plt.ylabel("mpg")
plt.xlabel(variable_name)
plt.show()

Cool 😎

carfeatures = pd.read_csv("data/features.csv")
list_vars = ["acceleration","weight"]

for variable_name in list_vars:
    plt.scatter(x = carfeatures[variable_name],
                y = carfeatures["mpg"])
    plt.ylabel("mpg")
    plt.xlabel(variable_name)
    plt.show()

Even cooler! 🤩

carfeatures = pd.read_csv("data/features.csv")
list_vars   = ["acceleration","weight"]

index = 1
for variable_name in list_vars:
    plt.scatter(x= carfeatures[variable_name], y = carfeatures["mpg"])
    plt.ylabel("mpg")
    plt.xlabel(variable_name)
    plt.title("Figure " + str(index))
    plt.show()
    index = index + 1

Solving many equations at once

  • Evaluate the equation \(y = x^2 + 2x\) at \(x = 1,2,4,5,6,7,8,9,10\)
# Create a list of x-values
# Create a list of y-values to fill in later.

list_x = [1,2,4,5,6,7,8,9,10]
list_y = [None] * len(list_x)

# Create an index to track the position in list_y
index = 0
for x in list_x:
    list_y[index] = x**2 + 2*x
    index = index + 1

# Display results visually
print(list_y)
plt.scatter(list_x, list_y)
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Scatter plot of Y = X^2 + 2X")
plt.show()
[3, 8, 24, 35, 48, 63, 80, 99, 120]

Math equations using append

# Create a list of x-values
# Start an empty list of y-values with []

list_x = [1,2,4,5,6,7,8,9,10]
list_y = []

# Loop over the x-values and append each result to list_y
for x in list_x:
    y = x**2 + 2*x
    list_y.append(y)

# Display results visually
print(list_y)
plt.scatter(list_x, list_y)
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
[3, 8, 24, 35, 48, 63, 80, 99, 120]

Try it yourself! 🧠

  • There are two datasets in the data folder:
    • features.csv
    • worldbank_wdi_2019.csv
  • Create a new object called list_datasets and assign it a list with the two dataset names
  • Run a for loop over this list:
    • Read each of the datasets using pd.read_csv()
    • Print a table of descriptive statistics for each dataset
  • Appendix 02

While loops in Python ⏳🐍

What is a while loop?

  • A while loop repeats a block of code as long as the test expression (condition) is True
  • We use while loops when we don’t know the number of times to iterate beforehand, or when the iteration depends on a condition being met
  • Syntax:
while test_expression:
    Body of while loop
    (must include logic to eventually make test_expression False)

When to Use while vs for

  • for loop:
    • Use when you know the number of iterations or you are iterating over a known sequence (list, string, range, etc.).
    • Example: “Process each item in this list.”
  • while loop:
    • Use when the number of iterations is not known in advance and depends on a condition becoming False.
    • Example: “Keep doing this until a certain condition is met (e.g., user enters ‘quit’, or a calculation converges).”
    • Requires careful management of the condition to avoid infinite loops.
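
To make the distinction concrete, here is a hedged sketch of a case where the number of iterations is not chosen in advance: a value is halved until it drops below a threshold. The variable names and the numbers 100.0 and 1.0 are assumptions picked for illustration.

```python
value = 100.0      # hypothetical starting value
threshold = 1.0    # hypothetical stopping point
steps = 0

# We do not say how many times to loop; the condition decides
while value >= threshold:
    value = value / 2   # this update eventually makes the condition False
    steps = steps + 1

print(steps, value)
```

A for loop would need the number of steps up front, whereas here the loop discovers it.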

while loop: simple counter example

This while loop acts like for i in range(5):

count = 0  # 1. Initialise a counter variable
while count < 5:  # 2. Set the condition
    print(f"Count is: {count}")
    count = count + 1  # 3. Update the counter (SERIOUSLY! 😂)
print("Loop finished.")
Count is: 0
Count is: 1
Count is: 2
Count is: 3
Count is: 4
Loop finished.

If count = count + 1 is forgotten, it creates an infinite loop!

while loop: user input validation

while loops are great for repeatedly asking for user input until it’s valid

# Example: Get a positive number from the user
number = -1 # Initialise to a value that fails the condition
while number <= 0:
    try:
        user_input = input("Please enter a positive number: ")
        number = float(user_input) # Try to convert to float
        if number <= 0:
            print("That's not a positive number. Try again.")
    except ValueError:
        print("Invalid input. Please enter a number.")
print(f"You entered a valid positive number: {number}")

(This code is best run in an interactive Python session.)

Caution: Infinite Loops!

A common pitfall with while loops is creating an infinite loop!

  • This happens if the condition in the while statement never becomes False
  • Cause: often due to forgetting to update the variable(s) involved in the condition within the loop

Example of an infinite loop (DO NOT RUN THIS CELL AS IS 😅):

count = 0
while count < 5:
    print("Still looping...")
    # Missing count = count + 1
  • If you accidentally run an infinite loop, you usually need to interrupt the execution: press Ctrl+C if the script is running in a terminal, or use the “Interrupt Kernel” button in Jupyter.
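
One possible safeguard while experimenting (a sketch, not the only approach) is to add a second condition that caps the number of iterations. The names iterations and max_iterations, and the limit of 10, are assumptions made for this example.

```python
count = 0
iterations = 0
max_iterations = 10  # hypothetical safety limit

# The loop stops when EITHER condition becomes False
while count < 5 and iterations < max_iterations:
    print("Still looping...")
    iterations = iterations + 1  # count is deliberately never updated

print(f"Stopped after {iterations} iterations.")
```

Because count never changes, only the safety limit ends this loop. That is a hint that the original condition needs fixing.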

Try it yourself!

  • You have the following list of sports: list_sports = ["tennis","golf","golf","tennis","golf","golf"]
  • Write a while loop to find the second occurrence of “tennis” in the list
  • You can start with the following code:
list_sports = ["tennis","golf","golf","tennis","golf","golf"]
candidate_list = []
num_candidates = 0

while...
    # Add your code here
    # Use candidate_list.append() to add the found elements
    # Use num_candidates to count the number of occurrences

Loop control statements 🚦

Introduction to loop control

Python provides statements to control the flow of execution within loops:

  • break: Terminates the current loop prematurely and transfers control to the statement immediately following the loop.
  • continue: Skips the rest of the code inside the current iteration of the loop and proceeds to the next iteration.
  • else clause in loops: An optional block that executes if the loop completes all its iterations without encountering a break statement.

The break statement

  • The break statement is used to exit a for or while loop immediately, regardless of whether the loop’s condition is still true or if there are more items to iterate over

Example: Find the first non-integer element in a list and stop the loop

list_mixed = [1, 2, "text_message", 5, 10.5]
print(f"Original list: {list_mixed}")

for value in list_mixed:
    if not isinstance(value, int): # Check if value is NOT an integer
        print(f"Stopped: Found a non-integer element: '{value}' (type: {type(value).__name__})")
        break # Exit the loop
    print(f"Processing integer: {value}")
print("Loop finished or broken.")
Original list: [1, 2, 'text_message', 5, 10.5]
Processing integer: 1
Processing integer: 2
Stopped: Found a non-integer element: 'text_message' (type: str)
Loop finished or broken.

The continue statement

  • The continue statement is used to skip the remaining code inside the current iteration of a loop and move to the next iteration

Example: Print only integers from a list and skip non-integer elements

list_mixed = [1, 2, "text_message", 5, 10.5, "another_string", 7]
print(f"Original list: {list_mixed}\nPrinting only integers:")

for value in list_mixed:
    if not isinstance(value, int):
        print(f"  Skipping non-integer: '{value}'")
        continue # Skip to the next iteration
    print(f"Integer found: {value}")
print("Loop finished processing all elements.")
Original list: [1, 2, 'text_message', 5, 10.5, 'another_string', 7]
Printing only integers:
Integer found: 1
Integer found: 2
  Skipping non-integer: 'text_message'
Integer found: 5
  Skipping non-integer: '10.5'
  Skipping non-integer: 'another_string'
Integer found: 7
Loop finished processing all elements.

Loops with an else clause

  • Both for and while loops can have an optional else clause
  • The else block is executed if and only if the loop terminates normally (i.e., not by a break statement)

Example with for and else: Searching for an item

my_numbers = [1, 3, 5, 7, 9]
search_for = 6

for num in my_numbers:
    if num == search_for:
        print(f"Found {search_for} in the list.")
        break
else: # Executed if the loop completes without a break
    print(f"{search_for} was not found in the list.")

search_for = 5 # Try again with a number that is in the list
for num in my_numbers:
    if num == search_for:
        print(f"Found {search_for} in the list.")
        break
else:
    print(f"{search_for} was not found in the list.")
6 was not found in the list.
Found 5 in the list.
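
The else clause works the same way with a while loop. The sketch below repeats the search above using an index; the variable result is a hypothetical name used to hold the message.

```python
my_numbers = [1, 3, 5, 7, 9]
search_for = 6

index = 0
while index < len(my_numbers):
    if my_numbers[index] == search_for:
        result = f"Found {search_for} at index {index}."
        break
    index = index + 1
else:  # Runs only if the while condition became False without a break
    result = f"{search_for} was not found in the list."

print(result)
```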

List comprehensions ⚡️

What are list comprehensions?

  • List comprehensions provide a concise and readable way to create lists
  • They often achieve the same result as a for loop (and sometimes an if condition) but in a single line of code.
  • Basic Syntax: new_list = [expression for item in iterable]

Advantages:

  • More compact and often easier to read for simple list generation
  • Can be more efficient than using a for loop with append()
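
To see the correspondence, the sketch below builds the same list twice: once with a for loop and append(), and once with a list comprehension. The list numbers and both result names are illustrative assumptions.

```python
numbers = [1, 2, 3, 4, 5]

# Version 1: for loop with append()
doubled_loop = []
for n in numbers:
    doubled_loop.append(n * 2)

# Version 2: list comprehension (same expression, same iterable)
doubled_comprehension = [n * 2 for n in numbers]

print(doubled_loop)
print(doubled_comprehension)
print(doubled_loop == doubled_comprehension)
```

The expression n * 2 and the clause for n in numbers appear in both versions; the comprehension just places them inside the brackets.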

List comprehension examples

Example 1: Customised Messages

id_list = ["KIA", "Ferrari", "Ford", "Tesla"]
message_list = ["Your car model is: " + car_id for car_id in id_list]
print(message_list)
['Your car model is: KIA', 'Your car model is: Ferrari', 'Your car model is: Ford', 'Your car model is: Tesla']

Example 2: Math operations

x_list_lc = [1, 2, 3, 4, 5, 6, 7]
x_sqr_list = [x**2 for x in x_list_lc]
print(f"Original list: {x_list_lc}")
print(f"Squared list:  {x_sqr_list}")
Original list: [1, 2, 3, 4, 5, 6, 7]
Squared list:  [1, 4, 9, 16, 25, 36, 49]

List comprehensions with conditionals

You can add an if condition to filter items during list creation.

  • Syntax with if: new_list = [expression for item in iterable if condition]

Example: Create a list of squares of only the even numbers from 0 to 9.

squares_of_evens = [x**2 for x in range(10) if x % 2 == 0]
print(f"Squares of even numbers from 0-9: {squares_of_evens}")

# Equivalent for loop:
# squares_of_evens_loop = []
# for x in range(10):
#     if x % 2 == 0:
#         squares_of_evens_loop.append(x**2)
# print(squares_of_evens_loop)
Squares of even numbers from 0-9: [0, 4, 16, 36, 64]

Try it yourself! 🧠

You have a list of names: names = ["Alice", "Bob", "Charlie", "David", "Eve", "Fiona", "George"]

  1. Using a list comprehension, create a new list called short_names that contains only the names from the names list that have 5 or fewer characters.
  2. Using a list comprehension, create a new list called uppercase_long_names that contains the uppercase versions of names from the names list that have more than 5 characters.

And that’s all for today! 🎉

Thank you for your attention! 🙏🏽😊

Appendix 01

  1. range(len(list_ids)):
    • len(list_ids) gets the length (number of items) in the list_ids list.
    • range() then creates a sequence of numbers from 0 up to (but not including) that length.
    • This allows the loop to iterate over each index of the list.
  2. str(i+1):
    • i is the current loop index, starting at 0.
    • i+1 adds 1 to that index so that the position number starts at 1. (Remember: Python indexes start at 0.)
    • str() converts the resulting number to a string.
    • This is done because i starts at 0, but we want to display position numbers starting at 1 for the customers.

So, this code loops through each item in list_ids, printing a message for each customer that includes:

  • Their position (index + 1)
  • The corresponding car brand from list_ids

The loop will run once for each item in list_ids, with i taking on values from 0 to len(list_ids) - 1.
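
The pieces can also be inspected one at a time. This is a small sketch in which the names n_items, indices, and labels are assumptions introduced only for illustration.

```python
list_ids = ["KIA", "Ferrari", "Ford", "Tesla"]

n_items = len(list_ids)          # number of items in the list
indices = list(range(n_items))   # the index values the loop will use
print(n_items)
print(indices)

labels = []  # hypothetical list of position labels as strings
for i in indices:
    labels.append(str(i + 1))    # shift by 1, then convert to text
print(labels)
```

Converting with str() matters because the + operator cannot join a string and an integer directly.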


Appendix 02

import pandas as pd
list_datasets = ["features.csv", "worldbank_wdi_2019.csv"]
for dataset in list_datasets:
    data = pd.read_csv("data/" + dataset)
    print(f"Descriptive statistics for {dataset}:")
    print(data.describe())
    print("\n")
Descriptive statistics for features.csv:
              mpg   cylinders  displacement       weight  acceleration
count  398.000000  398.000000    398.000000   398.000000    398.000000
mean    23.514573    5.454774    193.427136  2970.424623     15.568090
std      7.815984    1.701004    104.268683   846.841774      2.757689
min      9.000000    3.000000     68.000000  1613.000000      8.000000
25%     17.500000    4.000000    104.250000  2223.750000     13.825000
50%     23.000000    4.000000    148.500000  2803.500000     15.500000
75%     29.000000    8.000000    262.000000  3608.000000     17.175000
max     46.600000    8.000000    455.000000  5140.000000     24.800000


Descriptive statistics for worldbank_wdi_2019.csv:
       life_expectancy  gdp_per_capita_usd
count       252.000000          255.000000
mean         72.682931        17230.949757
std           7.382636        25792.183785
min          52.910000          216.972968
25%          67.109750         2186.046581
50%          73.599000         6837.717826
75%          78.234892        19809.323135
max          85.078049       199377.481800

Appendix 03

list_sports = ["tennis","golf","golf","tennis","golf","golf"]
candidate_list = []
num_candidates = 0
index = 0
while index < len(list_sports):
    if list_sports[index] == "tennis":
        candidate_list.append(list_sports[index])
        num_candidates = num_candidates + 1
        if num_candidates == 2:
            break
    index = index + 1
print(f"Second occurrence of 'tennis' found at index {index}: {candidate_list}")
Second occurrence of 'tennis' found at index 3: ['tennis', 'tennis']

Appendix 04: Solution for List Comprehensions Exercise

Solution for the list comprehension “Try it yourself!” exercise:

names = ["Alice", "Bob", "Charlie", "David", "Eve", "Fiona", "George"]

# 1. Names with 5 or fewer characters
short_names = [name for name in names if len(name) <= 5]
print(f"Original names: {names}")
print(f"Short names (<= 5 chars): {short_names}")

# 2. Uppercase versions of names longer than 5 characters
uppercase_long_names = [name.upper() for name in names if len(name) > 5]
print(f"Uppercase long names (> 5 chars): {uppercase_long_names}")
Original names: ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Fiona', 'George']
Short names (<= 5 chars): ['Alice', 'Bob', 'David', 'Eve', 'Fiona']
Uppercase long names (> 5 chars): ['CHARLIE', 'GEORGE']
