QTM 151 - Introduction to Statistical Computing II

Lecture 08 - Custom Functions

Danilo Freire

Emory University

25 September, 2024

I hope you’re having a great day! 😊

Brief recap of last class 📚

Last class we learned about:

  • Using np.random.seed() to set the seed for reproducibility
  • Running simulations to estimate probabilities and expected values
    • np.random.<distribution>(), where <distribution> is a placeholder for a sampling function such as normal or binomial
  • How to simulate random variables from different distributions
    • np.random.normal(), np.random.binomial(), np.random.poisson(), etc
  • Creating subplots with plt.subplots()
  • Using for loops to iterate over a range of values

Today’s plan 📋

What we will do today:

  • Learn about functions in Python
  • Understand the difference between arguments, parameters, and return values
  • Define functions with def and return
  • Use functions to encapsulate repetitive code

But first…

Let’s just finish the last example from last class

  • We were simulating how the mean of draws from a uniform distribution behaves as we increase the sample size
  • We will finish this example and then move on to functions
  • According to the Central Limit Theorem, the sample mean of a large number of independent, identically distributed random variables with finite variance is approximately normally distributed, regardless of the distribution of the original variables
  • We will write simulations and plot a histogram for each sample size to see how the distribution of the sample mean approaches the normal distribution

Finishing the last example from last class

## Importing libraries
import numpy as np
import matplotlib.pyplot as plt

## Setting the seed
np.random.seed(151)

## Number of simulations
num_simulations = 2000

# Simulate with sample size one
sample_size = 1
vec_xbar = [None] * num_simulations
for iteration in range(num_simulations):
    vec_unif  = np.random.uniform(low = -2, high=2, size = sample_size)
    vec_xbar[iteration] = vec_unif.mean()
plt.hist(vec_xbar)
plt.title("Distribution of Xbar with size 1")
plt.ylabel("Frequency")
plt.xlabel("Values of Xbar")
plt.show()

Now using a for loop to simulate the sample mean for different sample sizes

num_simulations = 2000
sample_size_list = [1,10,50,100,200]

for sample_size in sample_size_list:

    # The following command creates a vector of null values, of length "num_simulations"
    vec_xbar = [None] * num_simulations
    
    for iteration in range(num_simulations):
        vec_unif  = np.random.uniform(low = -2, high=2, size = sample_size)
        vec_xbar[iteration] = vec_unif.mean()
    plt.hist(vec_xbar)
    plt.title("Distribution of Xbar when n is " + str(sample_size))
    plt.ylabel("Frequency")
    plt.xlabel("Values of Xbar")
    plt.show()
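
The histograms get narrower as the sample size grows. One way to check this numerically is to compare the spread of the simulated sample means with the value that theory predicts, sigma / sqrt(n). The sketch below assumes the same Uniform(-2, 2) setup as above; the names sigma, draws, and results are introduced here for illustration only. It also draws all samples at once with a two-dimensional size, which is a vectorised alternative to the inner loop.

```python
import numpy as np

np.random.seed(151)

num_simulations = 2000
sample_size_list = [1, 10, 50, 100, 200]

# Standard deviation of a Uniform(-2, 2) variable: (high - low) / sqrt(12)
sigma = (2 - (-2)) / np.sqrt(12)

results = {}
for sample_size in sample_size_list:
    # Each row is one simulated sample; the mean of a row is one value of Xbar
    draws = np.random.uniform(low=-2, high=2, size=(num_simulations, sample_size))
    vec_xbar = draws.mean(axis=1)

    simulated_sd = vec_xbar.std()
    theoretical_sd = sigma / np.sqrt(sample_size)
    results[sample_size] = (simulated_sd, theoretical_sd)

    print(sample_size, round(simulated_sd, 3), round(theoretical_sd, 3))
```

The two columns should be close to each other for every sample size, and both shrink as n increases.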

Functions in Python 🐍

What is a function?

A function is a block of code that performs a specific task

  • Functions are used to organise code and make it readable and reusable
  • The main idea behind writing and using functions is that, if you have to do the same task multiple times, you can write a function to do that task and then call it whenever you want
  • A (somewhat silly) rule of thumb is that if you do the same task more than three times, you should write a function for it
  • As your code grows, functions will help you keep it maintainable and scalable
  • We have already seen lots of functions in Python
    • print(), np.mean(), plt.hist(), type(), etc
  • These functions are built-in, but you can also create your own functions as we will see today

  • Functions have parameters, which are the variables that the function expects to receive
    • For example, np.random.normal() has three parameters: the mean (loc), the standard deviation (scale), and the number of draws (size). All three have default values, so each one is optional
  • Functions can take arguments and return values
    • For example, np.random.normal(0, 1) takes two arguments and returns a random number from a normal distribution with mean 0 and standard deviation 1
  • Parameters can also have default values, which makes the corresponding arguments optional
    • If you don’t provide a value for such a parameter, the function will use the default value
    • Example: np.random.normal() returns a single number drawn from a normal distribution with mean 0 and standard deviation 1 if you don’t provide any arguments
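
A short sketch of the three ways of calling np.random.normal() described above may help. The variable names x_default, x_positional, and x_named are made up for this illustration.

```python
import numpy as np

np.random.seed(151)

# No arguments: loc, scale, and size all take their default values
x_default = np.random.normal()

# Two positional arguments: loc = 0 and scale = 1; size keeps its default
x_positional = np.random.normal(0, 1)

# All three arguments passed by name; size = 5 returns an array of 5 draws
x_named = np.random.normal(loc=0, scale=1, size=5)

print(x_default)
print(x_positional)
print(x_named)
```

The first two calls return a single number each, while the third returns an array because size was provided.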

Some examples

# Argument: "Hello " + str(24), which evaluates to the string "Hello 24"
# Effect: Shows the message on screen (print() itself returns None)

print("Hello "+str(24))
Hello 24

# Argument: "ABC"
# Return: The type of object, e.g. int, str, bool, float, etc.

type("ABC")
str

# First Argument: np.pi (a numeric value)
# Second Argument: 10 (the number of decimals)
# Return: The first argument rounded to the number of decimals given in the second argument

round(np.pi,  10)
3.1415926536

list_fruits = ["Apple","Orange","Pear"]

# Argument: list_fruits
# Return: The number of elements in the list
len(list_fruits)
3

So far, so good? 😊

Enter arguments by assignment

  • You can pass arguments to a function by position or by name
  • Passing by name means assigning each argument to its parameter with the = sign, as in size = 20
  • When you pass arguments by name, you can change the order of the arguments
    • Many functions in Python accept named arguments, which makes calls easier to read and remember
  • You can also rely on default values if you don’t want to pass a specific value
# Here "df" and "size" are both parameters
# They get assigned the arguments "2" and "20", respectively
# The return is a vector of random variables

vec_x = np.random.chisquare(df = 2, size = 20)
print(vec_x)
[1.26494504 4.46114373 1.17076231 6.91479626 3.36189123 2.33285233
 1.00960541 0.47839223 0.80520588 0.15343627 0.80128118 0.94337907
 8.547391   0.98529665 1.20953449 1.2501058  7.89879179 1.96852874
 1.39020263 2.7088297 ]
# Another example
vec_y = np.random.normal(loc = 2, scale = 1, size = 20)
print(vec_y)
[1.35356383 0.2155341  2.07374213 2.71737325 0.71210129 1.31526156
 2.65836538 1.16327072 1.79217552 1.67364679 1.0316952  2.2727245
 0.79098905 1.6456399  0.63350419 3.35936216 3.32088958 1.53761036
 2.73834784 3.21157881]
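
To see that passing by position and passing by name are two ways of writing the same call, the sketch below resets the seed before each call and compares the results. The names vec_by_position and vec_by_name are hypothetical and used only here.

```python
import numpy as np

# Arguments passed by position: loc, scale, size (in that order)
np.random.seed(151)
vec_by_position = np.random.normal(2, 1, 5)

# The same arguments passed by name, in a different order
np.random.seed(151)
vec_by_name = np.random.normal(size=5, scale=1, loc=2)

# Both calls receive the same arguments, so the draws are identical
print(np.array_equal(vec_by_position, vec_by_name))
```

Positional arguments must follow the order in which the parameters are defined, whereas named arguments can appear in any order.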

What are the parameters, arguments, and return values in these examples? 🤓

Custom functions in Python 🐍

Defining a function

You can define your own functions using the def keyword

  • The syntax is as follows:
#---- DEFINE
def my_function(parameter):
    body
    return expression

#---- RUN (argument passed by name)
my_function(parameter = argument) 

#---- RUN (argument passed by position)
my_function(argument)
  • The function name should be descriptive, that is, its name should reflect what the function does
  • The parameters are the variables that the function expects to receive
    • In our case, the parameter is parameter (duh! 😅)
  • The body is the code that the function will run
    • Please don’t forget that the body should be indented!
  • The return statement is optional
    • If you don’t provide a return statement, the function will return None
    • So it’s a good practice to always return something!
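
As a minimal sketch of the difference a return statement makes, compare the two hypothetical functions below (fn_double and fn_print_double are names made up for this example). The first returns its result, so it can be stored and reused; the second only prints it, so the stored value is None.

```python
# A function with a return statement: the result can be stored and reused
def fn_double(x):
    return 2 * x

# A function without a return statement: it prints, but returns None
def fn_print_double(x):
    print(2 * x)

stored = fn_double(4)
nothing = fn_print_double(4)

print(stored)
print(nothing)
```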

Let’s create a function!

  • Let’s create a function that solves this equation for any combination of numbers:

\[V=P\left(1+{\frac {r}{n}}\right)^{nt}\]

Appendix 01 explains what each parameter means

def fn_compound_interest(P, r, n, t):
    V = P*(1 + r/n)**(n*t)
    return V

Let’s test our function

  • Now that we have defined our function, we can use it to calculate the future value of an investment
# You can now compute the formula with different values
# Let's see how much one can gain by investing 50k and 100k
# Earning 10% a year, compounded monthly, for 10 years

V1 = fn_compound_interest(P = 50000, r = 0.10, n = 12, t = 10)
V2 = fn_compound_interest(100000, 0.10, 12, 10)
V3 = fn_compound_interest(r = 0.10, P = 100000, t = 10, n = 12)

print(V1)
print(V2)
print(V3)
135352.0745431122
270704.1490862244
270704.1490862244
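
Because the calculation now lives in a function, it is easy to reuse inside a loop. The sketch below repeats the function definition so that it runs on its own, and then varies the compounding frequency n while keeping the other arguments fixed. The list of frequencies (yearly, quarterly, monthly, daily) is an assumption chosen for illustration.

```python
def fn_compound_interest(P, r, n, t):
    V = P*(1 + r/n)**(n*t)
    return V

# Hypothetical compounding frequencies: yearly, quarterly, monthly, daily
list_n = [1, 4, 12, 365]

list_values = []
for n in list_n:
    V = fn_compound_interest(P = 50000, r = 0.10, n = n, t = 10)
    list_values.append(V)
    print("n = " + str(n) + ": " + str(round(V, 2)))
```

The future value grows as interest is compounded more often, although the gains from each additional increase in n get smaller.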

Try it yourself! 🤓

  • Now it’s your turn to try it out!
  • Write a function that calculates

\(f(x) = x^2 + 2x + 1\)

  • Test your function with \(x = 2\) and \(x = 3\)
  • Solution: Appendix 02

Try it yourself! 🤓

  • Write a function with a parameter numeric_grade
  • Inside the function write an if/else statement for \(numeric\_grade \ge 55\)
  • If it’s true, then assign status = "pass"
  • If it’s false, then assign status = "fail"
  • Return the value of status
  • Test your function with \(numeric\_grade = 60\)
  • Solution: Appendix 03

Lambda functions

  • Lambda functions are short functions, which you can write in one line
  • They can have any number of arguments but only one expression (no return statement)
  • They are used when you need a simple function for a short period of time
  • They are also known as anonymous functions, although you can assign them to a variable
  • Format: my_function = lambda parameters: expression
    • Example: fn_squared = lambda x: x**2

Lambda functions

  • Example: calculate \(x + y + z\) using a lambda function
  • The function will take three arguments: \(x\), \(y\), and \(z\)
fn_sum = lambda x, y, z: x + y + z

result = fn_sum(1, 2, 3)
print(result)
6
fn_v = lambda P, r, n, t: P*(1+(r/n))**(n*t)

result = fn_v(50000, 0.10, 12, 10)
print(result)
135352.0745431122
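
The following sketch illustrates two points made about lambda functions: a lambda is equivalent to a one-line function written with def, and it can be used without a name when a simple function is needed only once. The names fn_squared_def and list_numbers are hypothetical; sorted() and its key parameter are part of standard Python.

```python
# The same function written with def and as a lambda
def fn_squared_def(x):
    return x**2

fn_squared = lambda x: x**2

print(fn_squared_def(4) == fn_squared(4))

# A lambda used once, without a name, as the sorting rule for sorted()
list_numbers = [3, -7, 1, -2]
sorted_by_abs = sorted(list_numbers, key = lambda x: abs(x))
print(sorted_by_abs)
```

Here the numbers are ordered by their absolute value, and the lambda exists only for the duration of the sorted() call.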

Try it yourself! 🤓

Boolean + Functions

  • Write a function called fn_iseligible_vote
  • This function returns a boolean value indicating whether \(age \ge 18\)
  • Test your function with \(age = 20\)
  • Solution: Appendix 04

Last one! 🤓

For loop + Functions

  • Create list_ages = [18,29,15,32,6]
  • Write a loop that checks whether each of the ages above is eligible to vote
  • Use the function from the previous exercise
  • Solution: Appendix 05

And that’s it for today! 🎉

Have a great day! 😊

Appendix 01: Compound Interest Equation

  • \(V\) is the future value of the investment/loan, including interest
  • \(P\) is the principal investment amount (the initial deposit or loan amount)
  • \(r\) is the annual interest rate (decimal)
  • \(n\) is the number of times that interest is compounded per year
  • \(t\) is the time the money is invested/borrowed for, in years

Appendix 02: Quadratic Equation

def fn_quadratic(x):
    f = x**2 + 2*x + 1
    return f

f1 = fn_quadratic(2)
f2 = fn_quadratic(3)

print(f1)
print(f2)
9
16

Appendix 03: Pass/Fail Function

def fn_pass_fail(numeric_grade):
    if numeric_grade >= 55:
        status = "pass"
    else:
        status = "fail"
    return status

status = fn_pass_fail(60)
print(status)
pass

Appendix 04: Lambda Function

fn_iseligible_vote = lambda age: age >= 18

result = fn_iseligible_vote(20)
print(result)
True

Appendix 05: For loop + Function

list_ages = [18,29,15,32,6]

for age in list_ages:
    result = fn_iseligible_vote(age)
    print(f"Age: {age} - Eligible to vote: {result}")
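
A possible variation, sketched here under the assumption that you want to keep the eligible ages rather than print them, is to combine the function with an if statement and collect the results in a list. The name list_eligible is made up for this example, and the function is repeated so that the block runs on its own.

```python
fn_iseligible_vote = lambda age: age >= 18

list_ages = [18,29,15,32,6]

# Start with an empty list and append every age that passes the check
list_eligible = []
for age in list_ages:
    if fn_iseligible_vote(age):
        list_eligible.append(age)

print(list_eligible)
```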
