Today we start a crash course on programming using Julia
We will cover the main features to get you started…
Of course, we’ll be just scratching the surface. You’ll learn a lot more as we go
Unless you told me you are using a different language, by now you hopefully have installed
Julia
Visual Studio Code with the Julia extension
Relevant sources for this lecture
Software Carpentry
QuantEcon
Lecture notes for Grant McDermott’s Data Science for Economists (Oregon) and Ivan Rudik’s Dynamic Optimization (Cornell)
Julia documentation
Why learn Julia?
Reason 1: It is easy to learn and use
Julia is a high-level language
Low-level means you write instructions that are closer to what the hardware understands (Assembly, C++, Fortran)
These are usually the fastest because there is little to translate (translating is the compiler's job) and you can optimize your code for your hardware
High-level means you write in something closer to human language (Julia, R, Python)
The compiler has to do a lot more work to translate your instructions
Why learn Julia?
Reason 2: Julia delivers C++ and Fortran speed
Sounds like magic, but it’s just a clever combination of design choices targeting numerical methods
In this graph, time to execute in C++ is 1
Why learn Julia?
Reason 3: Julia is free, open-source, and popular
You don’t need expensive licenses to use it (unlike MATLAB)
The people who want to use or verify what you did also don’t have to pay
There is a large and active community of users and developers
So it’s easy to get help and new packages
Tools for programming in Julia
There are 2 Integrated Development Environments (IDEs) I generally recommend
Visual Studio Code (VS Code)
Jupyter Lab notebooks
In this course, we will only write plain .jl files, so I highly recommend you get familiar with VS Code
At the end of this unit, we will talk about using AI tools to help you learn to code and become a more productive programmer
Intro to programming
Programming \(\equiv\) writing a set of instructions
There are hard rules you can’t break if you want your code to work
There are elements of style (e.g. Strunk and White) that make your code easier to read, modify, and maintain
There are elements that make your code more efficient
Using less time or space (memory)
Intro to programming
If you will be doing computational work, there are:
Language-independent coding basics you should know
Arrays are stored in memory in particular ways
Language-independent best practices you should use
Indent to convey program structure, naming conventions
Language-dependent idiosyncrasies that matter for function, speed, etc.
Vectorizing, memory management
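As one illustration of the point that arrays are stored in memory in particular ways: Julia keeps arrays in column-major order, so the entries of each column sit next to each other. The sketch below is only an example; the names `A`, `memory_order`, and `sum_by_columns` are made up for it.

```julia
# A 2×3 matrix, written row by row in the source code
A = [1 2 3;
     4 5 6]

# vec(A) lists the entries in the order they sit in memory:
# Julia is column-major, so each column is stored contiguously.
memory_order = vec(A)
println(memory_order)  # [1, 4, 2, 5, 3, 6]

# Putting the row index in the inner loop walks memory in storage order.
function sum_by_columns(M)
    total = zero(eltype(M))
    for j in axes(M, 2)      # columns (outer loop)
        for i in axes(M, 1)  # rows (inner loop, contiguous in memory)
            total += M[i, j]
        end
    end
    return total
end

println(sum_by_columns(A))  # 21
```

For small arrays the loop order makes no visible difference; it tends to matter for large arrays that do not fit in the processor cache.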
Intro to programming
Learning these early will:
Make coding a lot easier
Reduce total programmer time
Reduce total computer time
Make your code understandable by someone else or your future self
Make your code flexible
A broad view of programming
Your goal is to make a program
A program is made of different components and sub-components
The most basic component is a statement, more commonly called a line of code
A broad view of programming
Here is an example of a pseudoprogram:
using Random  # provides shuffle
deck = ["4 of hearts", "King of clubs", "Ace of spades"]
shuffled_deck = shuffle(deck)
first_card = shuffled_deck[1]
println("The first drawn card was " * first_card * ".")
This program is very simple:
Create a deck of cards
Shuffle the deck
Draw the top card
Print it
A broad view of programming
using Random  # provides shuffle
deck = ["4 of hearts", "King of clubs", "Ace of spades"]
shuffled_deck = shuffle(deck)
first_card = shuffled_deck[1]
println("The first drawn card was " * first_card * ".")
What are the parentheses and why are they different from square brackets?
How does shuffle work?
What’s println?
It’s important to know that a good program has understandable code
Julia specifics
We will discuss coding in the context of Julia but a lot of this ports to Python, MATLAB, etc.
We will review
Types
Functions
Iterating
Broadcasting/vectorization
And some slightly more advanced aspects to help you debug
Scope
Multiple dispatch
1. Types
Types: boolean
All languages have some kind of variable types like integers or arrays
The first type you will often use is a boolean (Bool) variable that takes on a value of true or false:
x = true
true
typeof(x)
Bool
Types: boolean
We can save the boolean value of actual statements in variables this way:
@show y = 1 > 2
y = 1 > 2 = false
false
@show is a Julia macro for showing the operation.
You can think of a macro as a shortcut name that calls a bunch of other things to run
Quick detour: logical operators
Logical operators work like you’d think
== (equal equal) tests for equality
1 == 1
true
!= (exclamation point equal) tests for inequality
2 != 2
false
Quick detour: logical operators
You can also test for approximate equality with \(\approx\) (type \approx<TAB>)
1.00000001 ≈ 1
true
We will see why this can be super useful in the next unit
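A short sketch of why approximate equality matters with floating point numbers. It uses only `==`, `≈`, and `isapprox` from Base Julia; the variable name `a` and the tolerance value are arbitrary choices for the example.

```julia
a = 0.1 + 0.2

println(a == 0.3)           # false: a is stored as 0.30000000000000004
println(a ≈ 0.3)            # true: equal up to a small relative tolerance
println(isapprox(a, 0.3))   # the same test written as a function call

# An explicit absolute tolerance can be passed as a keyword argument
println(isapprox(1.0, 1.1; atol = 0.2))  # true
```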
Now back to types
Types: numbers
Two other data types you will use frequently are integers
typeof(1)
Int64
and floating point numbers
typeof(1.0)
Float64
64 means 64 bits of storage for the number, which is probably the default on your machine
Types: numbers
You can always instantiate alternative floating point number types
Declaring concrete types for the fields of a composite type (a struct) lets the compiler generate efficient code because it knows the types of the fields when you construct a FoobarType
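The two points above can be sketched as follows. The first part shows alternative floating point types from Base Julia. The second part is a hypothetical stand-in for the `FoobarType` named in the text: these notes do not show its definition, so the field names `count` and `value` are assumptions made only for illustration.

```julia
# Alternative floating point types: fewer bits, less precision
a = Float32(1.0)     # 32-bit float
b = 1.0f0            # literal syntax for the same Float32 value
c = Float16(1.0)     # 16-bit float
println(typeof(a))   # Float32
println(sizeof(a))   # 4 (bytes), versus 8 for a Float64

# Hypothetical definition: every field has a concrete declared type
struct FoobarType
    count::Int64
    value::Float64
end

foo = FoobarType(3, 2.5)
println(foo.value)               # 2.5
println(fieldtypes(FoobarType))  # (Int64, Float64)
```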
2. Functions
Functions: why?
Functions are an essential part of programming. But why use functions?
To avoid duplicating code
If you have the same set of instructions repeated in multiple parts of your code, whenever you need to change something, you have to search through the code and change in many places. This is prone to bugs!
Rule of thumb: if you are using the same (or a very similar) block of instructions more than twice, turn that block into a function
Functions: why?
To make our program more efficient
Julia optimizes functions in the background, but not code outside functions (more on that soon)
To make our code easier to read
Functions give meaningful names to blocks of code that do a specific task
Also, it can generalize the operation, letting that block take in different values
Functions: defining them
Creating functions in Julia is easy
function my_function(argument_1, argument_2)
    # Do something here
end;
typeof(my_function)
typeof(my_function) (singleton type of function my_function, subtype of Function)
You can also define functions with no arguments. This is useful, for example, for a calculation that prints results, saves them to a file, or manipulates objects somewhere in memory
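A minimal sketch of both kinds of function described above. The names `rectangle_area` and `print_greeting` are made up for the example.

```julia
# A function with two arguments and an explicit return value
function rectangle_area(width, height)
    return width * height
end

# A function with no arguments: it only prints (a side effect)
function print_greeting()
    println("Hello from a function with no arguments")
end

println(rectangle_area(2, 3))  # 6
print_greeting()
```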
Just-in-time compilation (JIT) is one of the tricks Julia does to make things run faster
It translates your code to processor language the first time you run it and uses the translated version every time you call it again
When we run the function for the first time, it may take longer because of this compilation step
When we run it again, it will be much faster
This is one of the reasons why putting your code inside functions is important: Julia can optimize functions better than code outside functions
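One way to see the compilation step is to time the same call twice. This is a rough sketch: `sum_of_squares` is a made-up function, `@elapsed` is the Base macro that returns the elapsed seconds, and the actual timings depend on your machine, so none are shown here.

```julia
function sum_of_squares(n)
    total = 0
    for i in 1:n
        total += i^2
    end
    return total
end

first_call = @elapsed sum_of_squares(1_000)   # includes compilation time
second_call = @elapsed sum_of_squares(1_000)  # reuses the compiled code

println("First call:  $first_call seconds")
println("Second call: $second_call seconds")
println(sum_of_squares(10))  # 385
```

On a fresh session the first number is typically much larger than the second, although for a function this small both are tiny.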
3. Iteration
Iterating
As in other languages we have loops at our disposal:
for loops iterate over containers
for count in 1:10
    random_number = rand()
    if random_number > 0.2
        println("We drew a $random_number.")
    end
end
We drew a 0.3461218609384674.
We drew a 0.5016784482103657.
We drew a 0.7113269730434086.
We drew a 0.34513819092176823.
We drew a 0.6499745943134212.
We drew a 0.7734888105779526.
Iterating
while loops iterate until a logical expression is false
x = 1;
while x < 50
    x = x * 2
    println("After doubling, x is now equal to $x.")
end
After doubling, x is now equal to 2.
After doubling, x is now equal to 4.
After doubling, x is now equal to 8.
After doubling, x is now equal to 16.
After doubling, x is now equal to 32.
After doubling, x is now equal to 64.
Iterating
An Iterable is something you can loop over, like arrays
actions = ["codes well", "skips class"];
for action in actions
    println("Charlie $action")
end
Charlie codes well
Charlie skips class
Iterating
The type Iterator is a particularly convenient subset of Iterables
These include things like the dictionary keys:
for key in keys(d1)
    println(d1[key])
end
ACE592
97
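The loop above uses a dictionary `d1` whose definition does not appear in these notes. The sketch below assumes a dictionary that would produce the same two values; the keys `"class"` and `"grade"` are guesses made only so the example runs.

```julia
# Hypothetical dictionary: the keys are assumptions, the values match
# the output shown above.
d1 = Dict("class" => "ACE592", "grade" => 97)

# Loop over the keys and look up each value
for key in keys(d1)
    println(d1[key])
end

# Iterating over the dictionary itself yields key => value pairs
for (key, value) in d1
    println("$key => $value")
end
```

A `Dict` does not guarantee any particular ordering, so the lines may print in a different order from one dictionary to another.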
Iterating
Iterating on Iterators is more memory efficient than iterating on arrays
Here’s a very simple example. The top function iterates on an Array, the bottom function iterates on an Iterator:
function show_array_speed()
    m = 1
    for i = [1, 2, 3, 4, 5, 6]
        m = m * i
    end
end;
function show_iterator_speed()
    m = 1
    for i = 1:6
        m = m * i
    end
end;
Use this (and the compact nested loop) sparingly since it’s hard to read and understand
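To make the array-versus-iterator comparison concrete, here is a small sketch of why a range such as `1:6` needs less memory than the equivalent array: the range stores only its endpoints, while the array stores every element. The variable names `r` and `v` are arbitrary, and the printed type names assume a 64-bit machine.

```julia
r = 1:6                  # a range: stores only its start and stop
v = [1, 2, 3, 4, 5, 6]   # an array: stores all six elements

println(typeof(r))        # UnitRange{Int64}
println(typeof(v))        # Vector{Int64}
println(collect(r) == v)  # true: both represent the same values

# The memory used grows with the array length but not with the range length
println(sizeof(1:1_000_000) == sizeof(1:6))                   # true
println(sizeof(collect(1:1_000_000)) > sizeof(collect(1:6)))  # true
```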
4. Broadcasting/Vectorization
Vectorization
Iterating over elements one at a time is usually an inefficient approach
Another way is to do operations over an entire array. This is called vectorization
It’s faster because your processor can do some operations over multiple values with one instruction
Dot syntax: broadcasting/vectorization
Vectorizing operations is easy in Julia: just use dot syntax (like in MATLAB)
g(x) = x^2;
squared_2 = g.(1:2:11)
6-element Vector{Int64}:
1
9
25
49
81
121
This is actually called broadcasting in Julia
Dot syntax: math intuition
There is a mathematical reason to make this distinction
\(g(x) = x^2\) is a function \(\mathbb{R} \rightarrow \mathbb{R}\), i.e., mapping a scalar to a scalar
But if \(z\) is, say, a \(6 \times 1\) vector: \(z \in \mathbb{R}^6\), it’s unclear what \(g(z)\) is
What does the square of a \(6 \times 1\) vector even mean? Is it the square of each element? Is it a dot product with the vector itself? Something else?
When you use the dot syntax g.(1:2:11), you are telling Julia: apply function g to each element in vector [1, 3, 5, 7, 9, 11]
If we needed a function to do something else with the whole vector, we would need to write a different function for that
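The distinction above can be sketched in a few lines. The element-wise square uses the dot syntax, while the dot product of the vector with itself is a different operation with its own function (`dot`, from the LinearAlgebra standard library). The name `z` follows the text.

```julia
using LinearAlgebra  # standard library, provides dot

g(x) = x^2
z = [1, 3, 5, 7, 9, 11]

println(g.(z))       # element-wise square: [1, 9, 25, 49, 81, 121]
println(z .^ 2)      # the same operation with a dotted operator
println(dot(z, z))   # dot product of z with itself: 286
```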
Dot syntax: we must pay attention to definitions!
Julia will generally be picky about this: if you call a function that is defined for a scalar and give it a vector, you will get an error message
This strictness (Julia is a “strongly typed” language) is actually one of the reasons it is so fast: it doesn’t need to spend time figuring out what kind of variable you give to it
try
    g(1:2:11)
catch e
    println(e)
end
MethodError(^, (1:2:11, 2), 0x00000000000097fa)
The try/catch block lets you handle error messages within your program
If anything fails, you can program ways to handle errors
We won’t be using it in the course; it’s here just because I have to handle it this way to generate slides. If you run this code, Julia REPL will give you a more informative error message
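For reference only, here is a sketch of what programming a way to handle an error can look like. The helper `square_safely` is hypothetical: it tries the scalar call and, if Julia raises a `MethodError`, falls back to broadcasting. Any other error is passed along unchanged with `rethrow`.

```julia
g(x) = x^2

# Hypothetical helper: try g on the input as given, and fall back to
# element-wise application if no method of g accepts that input.
function square_safely(input)
    try
        return g(input)
    catch err
        if err isa MethodError
            println("g is not defined for $(typeof(input)); broadcasting instead")
            return g.(input)
        else
            rethrow()
        end
    end
end

println(square_safely(3))       # 9
println(square_safely(1:2:11))  # prints the notice, then [1, 9, 25, 49, 81, 121]
```

In everyday code it is usually clearer to write `g.(x)` directly when element-wise behavior is what you want.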
5. Scope
Scope
The scope of a variable name determines when it is valid to refer to that variable
E.g.: if you create a variable inside a function, can you reference that variable outside the function?
You can think of scope as different contexts within your program
The two basic scopes are local and global
Scope can be a frustrating concept to grasp at first. But understanding how scopes work can save you a lot of debugging time
Let’s walk through some simple examples to see how it works
Scope
First, functions have their own local scope
ff(xx) = xx^2;
yy = 5;
ff(yy)
25
xx isn’t bound to any values outside the function ff
It is only used inside the function
Scope
Local function scope allows us to do things like:
xx = 10;
fff(xx) = xx^2;
fff(5)
25
Although xx was declared equal to 10 outside the function, the function still evaluated xx within its own scope at 5 (the value passed as argument)
Scope
But this type of scoping also has (initially) counterintuitive results like:
zz = 0;
function do_some_iteration()
    for ii = 1:10
        zz = ii
    end
end
do_some_iteration()
println("zz = $zz")
zz = 0
What happened?
The zz outside the for loop has a different scope: it’s in the global scope
The global scope is the outermost scope, outside all functions
The zz inside the function has a scope local to the loop
Since the outside zz has global scope, the locally scoped variables in the loop can’t change it
Scope
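But hold on. If you try the same loop outside a function in an interactive session (the REPL or a notebook), zz ends up equal to 10, not 0. (Run as a script file, Julia 1.5 and later instead print an ambiguity warning and treat the zz inside the loop as a new local variable.)
zz = 0;
for ii = 1:10
    zz = ii
end
println("zz = $zz")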
zz = 10
That’s because this for loop sits in global scope
It can get more complicated than that because there are soft and hard local scopes… but we don’t need to dwell on that
Generally, you want to avoid global scope because it can cause conflicts, slowness, etc. But you can use global to force it if you want something to have global scope
This is almost always a bad practice, though!
zz = 0;
function do_some_iteration()
    for ii = 1:10
        global zz = ii
    end
end
do_some_iteration()
println("zz = $zz")
zz = 10
Scope
Local scope kicks in whenever you are defining variables inside a function
Global variables inside a local scope are inherited for reading, not writing
x, y = 1, 2;
function foo()
    x = 2         # assignment introduces a new local
    return x + y  # y refers to the global
end;
foo()
x
1
Scope
We can fix looping issues with global scope by using a wrapper function that doesn’t do anything but change the parent scope so it is not global
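A minimal sketch of the wrapper-function idea, reusing the zz loop from the earlier examples. The function name `run_iteration` is made up; the point is only that zz now lives in the function's local scope, so the loop can update it without `global`.

```julia
# Hypothetical wrapper: every variable below is local to the function
function run_iteration()
    zz = 0
    for ii in 1:10
        zz = ii   # updates the zz of the enclosing function, not a global
    end
    return zz
end

println("zz = $(run_iteration())")  # zz = 10
```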
6. Multiple dispatch
Multiple dispatch
Julia lets you define multiple functions with the same name but different types of input variables
This is useful because some operations have different steps depending on the context. For example
Multiplication (*) can work on scalars, vectors, matrices, and more complex types
By allowing different instructions depending on what type of variable is given, Julia makes it easier for user to use functions consistently
But if you try to call a function with types of input variables it doesn’t know how to handle, it will throw an error
This is usually the most common error you will encounter while learning Julia
Multiple dispatch
/ has MANY different methods for division depending on the input types! Each of these is a specialized function that treats the inputs differently
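The same mechanism works for functions you write yourself. In this sketch, `describe` is a hypothetical function with one method per input type; calling it with a type that no method accepts raises the `MethodError` mentioned earlier.

```julia
# Hypothetical function with three methods, one per input type
describe(x::Integer) = "an integer: $x"
describe(x::AbstractFloat) = "a float: $x"
describe(x::AbstractString) = "a string of length $(length(x))"

println(describe(1))        # an integer: 1
println(describe(1.5))      # a float: 1.5
println(describe("Julia"))  # a string of length 5

# Julia keeps track of every method defined for a function
println(length(methods(describe)))  # 3

# No method accepts a vector, so this call fails with a MethodError
try
    describe([1, 2])
catch err
    println(err isa MethodError)  # true
end
```

Running `methods(/)` in the REPL lists the methods of division in the same way.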
Project Euler is a series of challenging mathematical/computer programming problems that will require more than just mathematical insights to solve. Although mathematics will help you arrive at elegant and efficient methods, the use of a computer and programming skills will be required to solve most problems.
Examples of problems
If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.
The prime factors of 13195 are 5, 7, 13 and 29. What is the largest prime factor of the number 600851475143?
You can type in your answer and it will tell you if it’s correct
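One possible way to set up these two problems is sketched below, checked only against the small cases given in the problem statements. The function names are made up, and the approach (a plain loop and repeated trial division) is just one straightforward option, not necessarily the most efficient one.

```julia
# Sum of the multiples of 3 or 5 strictly below `limit`
function sum_multiples(limit)
    total = 0
    for n in 1:(limit - 1)
        if n % 3 == 0 || n % 5 == 0
            total += n
        end
    end
    return total
end

# Largest prime factor of n, found by repeated trial division
function largest_prime_factor(n)
    factor = 2
    largest = 1
    while n > 1
        if n % factor == 0
            largest = factor   # factor divides n, so record it
            n = n ÷ factor     # and divide it out
        else
            factor += 1        # otherwise try the next candidate
        end
    end
    return largest
end

println(sum_multiples(10))            # 23, as in the problem statement
println(largest_prime_factor(13195))  # 29, as in the problem statement
```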
Some concluding words on programming
More on coding practices and efficiency:
See JuliaPraxis for best practices for naming, spacing, comments, etc
You are lucky! You’re among the first cohorts learning to program with an available AI language model that is advanced enough to understand, explain, and generate code
Currently, one of the best available services for that is called GitHub Copilot. It’s paid, but you can get it for free with an .edu email
What about ChatGPT, Claude, and other LLMs?
But hold on. Don’t use this powerful resource without careful consideration
This must be a complement, not a substitute for your programming skills
Why?
For specialized scientific use, AI often produces buggy, incomplete, or outright incorrect code
Before you can use it effectively, you need to be familiar enough with programming logic and the language you are using to know when things are wrong
These tools are improving fast, but they will always be imperfect
There is an inherent limitation in translating ambiguous (natural) languages to non-ambiguous (formal) languages
Treat it like a very smart intern who can do well-defined tasks very quickly
But, to correctly supervise an intern, you need to know the job well yourself!
Advice on AI coding assistants
Here is my personal advice to you, focusing on your medium/long-term career as a researcher
Do not use AI assistants to generate code you still cannot write and understand
There’s too big of a risk of producing incorrect code
It will place a low cap on your logical thinking for computational methods
Once you advance and become familiar with programming structures, you can start relying on AI to speed up your coding
For this course, coding assignments are relatively short, so I expect you to know what every line of code you write does
Advice on AI coding assistants
Here is my personal advice to you, focusing on your medium/long-term career as a researcher
Use AI assistants to explain code to you or generate examples
Throughout the semester, you will see many examples of algorithms
AI can offer tremendous help explaining the inner workings of algorithms
If you are a good programmer in one language, AI tools can also help you translate code
Even in that case, I’d still recommend you start using it to explain code in the “new” language rather than simply generate code for you