Macro 318: Tutorial #3 ¶

Optimisation and the consumer problem ¶

Lecturer:
Dawie van Lill (dvanlill@sun.ac.za)

Introduction¶

Before getting started with this tutorial I want to acknowledge the excellent work being done by Jeppe Druedahl and his team at Copenhagen. The following tutorial is basically a port of the Python lecture for his course on numerical methods. Much of the material is directly copied, but there are also some smaller amendments to fit the 318 course. The original lecture can be found here.

import Pkg

Pkg.add("DataFrames");
Pkg.add("ForwardDiff")
Pkg.add("IntervalRootFinding")
Pkg.add("Ipopt");
Pkg.add("JuMP"); # Optimisation package
Pkg.add("LaTeXStrings")
Pkg.add("Optim");
Pkg.add("Plots");
Pkg.add("Symbolics")

   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`

using DataFrames
using ForwardDiff
using IntervalRootFinding
using Ipopt
using JuMP
using LaTeXStrings
using Optim
using Plots
using Symbolics

Learning objectives¶

Unconstrained optimisation¶

In the previous tutorial we looked at derivatives and conditions under which a maximum or minimum for a function can be achieved. This is essentially a problem in unconstrained optimisation. We will reiterate some of the main points from the previous tutorial, and also continue from where we concluded. When we are done with unconstrained optimisation we will move to the setting of constrained optimisation. This is a very important topic and will form the basis for two crucial examples that we will cover in the span of the following two tutorials.

Let us restate the optimisation problem from the previous tutorial, as a quick reminder. In the case of unconstrained optimisation we want to optimise (let us stick with minimization for this example) a function on the whole space $X = \mathbb{R}^{n}$. The optimisation problem is then to

$$ \min_{x \in \mathbb{R}^n} f(x). $$

The optimal point would be a global minimum, which is a point $x\in X = \mathbb{R}^{n}$ which satisfies,

$$ f(x) \le f(y) \text{ for all }y\in X. $$

Note that since $x \in \mathbb{R}^{n}$ it can be seen as an $n$ element vector.

Optimisation and gradients?¶

Remember that the gradient of the function $f: \mathbb{R}^{n} \rightarrow \mathbb{R}$ is denoted $\nabla f: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n} $ and returns a vector of values,

$$ \nabla f(x) = \left(\frac{\partial f}{\partial x_1}(x),\frac{\partial f}{\partial x_2}(x),\dots,\frac{\partial f}{\partial x_n}(x) \right) $$

Here we are taking partial derivatives with respect to each of the components in the $x$ vector.

Consider a differentiable function $f$ over $X=\mathbb{R}^n$. If $x$ is its local minimum, then $\nabla f(x)=0$. In other words, in the case of $n = 2$, both partial derivatives need to be zero. This is a direct analogue of the $n = 1$ case (which is the univariate case).

Points with $\nabla f(x)=0$ are known as stationary points.

Optimization algorithms often try to find local minima or stationary points, hoping to minimize the function $f$.

function minmax()
	
	v = collect(range(-10, stop = 10, length = 30))  # values
	mini = [x^2 + y^2 for x in v, y in v]
	maxi = -mini   # max is just negative min
	saddle = [x^2 + y^3 for x in v, y in v]
	
	return Dict(:x => v,:min => mini, :max => maxi, :saddle => saddle)
end;

function mmplotter(s::Symbol)
	
    d = minmax()

    surface(d[:x], d[:x], d[s], fillalpha = 0.7, legend = false, fillcolor =:viridis)
end;

mmplotter(:min)

In the figure above we have an example of a function, $f(x, y) = x^2 + y^2$, that has a local minimum. We can determine where the stationary point is by setting the gradient equal to zero. If we want to know whether this is in fact a minimum we need to perform a second partial derivative test. For this test we need to calculate the Hessian matrix of the function. This is a $2 \times 2$ matrix of partial derivatives of $f$ whose determinant is given by $d = f_{xx}f_{yy} - f_{xy}^2$.

The function has a local minimum if $f_{xx} < 0$ and $d > 0$. The second derivative test is similar to the one we employed in the univariate case, without considering the determinant of the Hessian matrix.

What does mean for us? It means that we need to compute two quantities in order to determine if our function has a local minimum. Let us consider an example to see how this works.

Example: Profit maximisation¶

Suppose that we have the following profit function,

$$ \Pi = -5 + 30x - 3x^2 + 25y - 5y^2 +xy $$

At what level of output will profit be maximized?

We can start by plotting the function to get an idea of what it looks like.

function profit_function()
	
	v = collect(range(0, stop = 8, length = 30))  # values
	profit = [-5 + 30x - 3x^2 + 25y - 5y^2 + x*y for x in v, y in v]
	
	return Dict(:x => v, :profit => profit)
end;

function profit_plotter(s::Symbol)
	
    d = profit_function()

    surface(d[:x], d[:x], d[s], fillalpha = 0.7, legend = false, fillcolor =:viridis)
end;

profit_plotter(:profit)

The function seems like it will have a maximum. We first need to look at the gradient and then check the second order conditions to determine if this is true. In order to determine the gradient we need the partial derivatives with respect to $x$ and $y$.

Exercise¶

Conduct the following steps to determine if we have a local maximum.

Calculate the partial derivatives with respect to $x$ and $y$ of the profit function.
Calculate $f_{xx}$
Calculate $d = f_{xx}f_{yy} - f_{xy}^2$

If $f_{xx} < 0$ and $d > 0$, then we have a local maximum. We will be calculating these quantities with Julia to determine if there is a maximum and also where the maximum is located.

Local optima via Julia (optional)¶

In most cases we will not be calculating the Hessian and it's determinant by hand. We would rather use the computer to do these complicated things for us. Let us look at the example that we solved by hand above and try and solve the same problem, but this time with our software package. We will use the ForwardDiff package here.

x = rand(2)

Π(x) = -5 + 30 * x[1] - 3 * x[1] ^ 2 + 25 * x[2] - 5 * x[2] ^ 2 + x[1] * x[2] 

∇Π(x) = ForwardDiff.gradient(Π, x) # gradient

H(x) = ForwardDiff.hessian(Π, x) # Hessian

H (generic function with 1 method)

∇Π(x) # both partial derivatives are positive

2-element Vector{Float64}:
 29.42686888223324
 17.443228729154622

H(x)

2×2 Matrix{Float64}:
 -6.0    1.0
  1.0  -10.0

From this we see that both partial derivatives are positive. The value for $f_{xx} = -6 < 0$ and the value for $d = f_{xx}f_{yy} - f_{xy}^2 = (-6)*(-10) - (1.0)^2 = 59 > 0$. This means that we have a maximum. Next we are going to try and figure out where the gradient is equal to zero. We will use the IntervalRootFinding package for this.

f( (x, y) ) = -5 + 30 * x - 3 * x ^ 2 + 25 * y - 5 * y ^ 2 + x * y
∇f = ∇(f)
rts = roots(∇f, IntervalBox(-5..6, 2), IntervalRootFinding.Newton, 1e-5)

1-element Vector{Root{IntervalBox{2, Float64}}}:
 Root([5.50847, 5.50848] × [3.05084, 3.05085], :unique)

midpoints = mid.(interval.(rts))

xs = first.(midpoints)
ys = last.(midpoints)

surface(0:0.5:10, 0:0.5:10, (x,y) -> f([x, y]), fillalpha = 0.6, legend = false, fillcolor =:viridis, camera = (50, 10))
scatter!(xs, ys, f.(midpoints), color = :black)

In the graph you can see that the scatter point is the maximum.

As an aside, it is possible to have a function that has neither a local maximum or minimum. This is called a saddle point. The following figure has a saddle point.

mmplotter(:saddle)

`Optim` for optimisation¶

We can use the excellent Optim package to find the answer to unconstrained optimisation problems, both univariate and multivariate optimisation. In the univariate case we could optimise a function like our quadratic utility function from the previous tutorial,

$$ U(x) = x - α x ^ 2 $$

The optimisation routine in this case is called Brent's method. If you are interested in numerical optimisation this is worthwhile taking a look at.

α = 0.2
result = maximize(x -> x - α * x ^ 2, -10, 10) # x -> x - α * x ^ 2 is the same as U(x) = x - α * x ^ 2. It is called an anonymous function representation.

Results of Maximization Algorithm
 * Algorithm: Brent's Method
 * Search Interval: [-10.000000, 10.000000]
 * Maximizer: 2.500000e+00
 * Maximum: 1.250000e+00
 * Iterations: 5
 * Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
 * Objective Function Calls: 6

maximum(result)

1.25

This gives the same answer as in Tutorial 2!

This was super easy right!?!

I think it is much easier than working out derivatives and setting them to zero and checking all those conditions. What I want you to take away from this is that if you can use an optimisation routine to try and solve a problem, use it! There are going to be instances where it isn't as easy as simply using a solver, but most of the time this is going to be your best case scenario. Don't reinvent the wheel!

Now let's move on to the multivariate case. Univariate is always the simplest, so perhaps multivariate is going to give us a few headaches?

Let us consider the profit maximization problem from the previous section. Remember that we want to maximize profit, where profit is given by,

$$ \Pi = -5 + 30x - 3x^2 + 25y - 5y^2 +xy $$

Now let us try and code up an answer to this problem,

Π(x) = -5 + 30 * x[1] - 3 * x[1] ^ 2 + 25 * x[2] - 5 * x[2] ^ 2 + x[1] * x[2] 
results = maximize(Π, [0.0, 0.0], LBFGS(), autodiff = :forward)

Results of Optimization Algorithm
 * Algorithm: L-BFGS
 * Starting Point: [0.0,0.0]
 * Maximizer: [5.508474576271187,3.0508474576271185]
 * Maximum: 1.157627e+02
 * Iterations: 2
 * Convergence: true
   * |x - x'| ≤ 0.0e+00: false 
     |x - x'| = 1.00e+00 
   * |f(x) - f(x')| ≤ 0.0e+00 |f(x)|: false
     |f(x) - f(x')| = 5.36e-02 |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 3.55e-15 
   * Stopped by an decreasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 5
 * Gradient Calls: 5

You might notice the L-BFGS algorithm is used here. In addition, we used forward mode auto differentiation. The reason is that the L-BFGS algorithm requires derivatives to operate. If we don't specify the auto differentiation method then the algorithm will use finite differences, which is less accurate than auto differentiation. We could have also supplied analytical derivatives here, the ones that we worked out by hand earlier. This would have worked equally well. In this case, the code would have looked like this,

# Provide analytical partial derivatives

function g!(G, x)
    G[1] = 30.0 - 6.0 * x[1] + x[2] 
    G[2] = 25.0 - 10.0 * x[2] + x[1]
end

results_1 = maximize(Π, g!, [0.0, 0.0], LBFGS()) # use analytical derivatives

Results of Optimization Algorithm
 * Algorithm: L-BFGS
 * Starting Point: [0.0,0.0]
 * Maximizer: [5.508474576271187,3.0508474576271185]
 * Maximum: 1.157627e+02
 * Iterations: 2
 * Convergence: true
   * |x - x'| ≤ 0.0e+00: false 
     |x - x'| = 1.00e+00 
   * |f(x) - f(x')| ≤ 0.0e+00 |f(x)|: false
     |f(x) - f(x')| = 5.36e-02 |f(x)|
   * |g(x)| ≤ 1.0e-08: true 
     |g(x)| = 3.55e-15 
   * Stopped by an decreasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 5
 * Gradient Calls: 5

We see that the result is basically the same across these methods.

Constrained optimisation¶

In this section we talk about the possibility of adding a constraint to the unconstrained problem from before. These can be either equality or inequality constraints. The normal formulation of a constrained optimisation problem is,

$$ \begin{align*} \min_{x \in \mathbb{R}^n}\qquad &f(x) \\ \text{subject to}\qquad &g_i(x) \le 0,\ i=1,\dots,I, \\ &h_j(x) = 0,\ j=1,\dots,J. \end{align*} $$

where functions $g_i$ generate inequality constraints, while functions $h_j$ generate equality constraints.

Important point: We can potentially have equality and inequality constraints in our constrained optimisation problem.

Some times the constrained solution is going to coincide with the unconstrained solution. This will be the case when the constraint is not binding. We will showcase this with an example soon.

We start with an example where we have a single equality constraint.

How would we go about solving this problem by hand, and how would we do this via Julia?

Equality constraint: Profit maximization revisited¶

In the following example we will optimize a function subject to a single equality constraint,

$$ \begin{align*} \min_{x \in \mathbb{R}^n}\qquad &f(x) \\ &h(x) = 0 \end{align*} $$

We are looking for the contour lines of the function $f(x)$ that are aligned with the contours of $h(x) = 0$.

This means that we want to find the optimal value for $x$ such that $h(x) = 0$.

In terms of gradients we are going to be looking for the point where $\nabla f(x) = \lambda \nabla h(x)$.

The scalar value $\lambda$ is known as the Lagrange multiplier.

We can reformulate our constrained optimisation problem in the form of a Lagrangian as follows,

$$ \mathcal{L}(x,\lambda) = f(x) - \lambda h(x) $$

where $\mathcal{L}$ means Lagrangian.

Let us consider our profit maximization example from before but this time with a constraint, so that we can get more comfortable with the idea of a Lagrangian,

$$ \begin{align*} \min_x \qquad & -5 + 30x_1 - 3x_1^2 + 25x_2 - 5x_2 \\ \text{subject to } \qquad & x_1 - x_2 = 0 \end{align*} $$

From this formulation we can form the Lagrangian,

$$ \mathcal{L}(x_1,x_2,\lambda) = \left(-5 + 30x_1 - 3x_1^2 + 25x_2 - 5x_2^2 \right) - \lambda(x_1 - x_2) $$

In order to solve the problem we have to compute the gradient with respect to $x_1$, $x_2$, $\lambda$ and set to zero. Then we need to solve that system of equations. We will quickly do this problem by hand, but in the future we will let the computer do all of the hard work.

Lagrangian 'by hand'¶

We have to get the first order conditions, which means that we have to calculate partial derivatives with respect to $x_1$, $x_2$, $\lambda$.

I am going to cheat a bit here and calculate the partial derivatives with the Symbolics package.

Symbolics.@variables x1 x2 λ

g = -5 + 30 * x1 - 3 * x1 ^ 2 + 25 * x2 - 5 * x2 ^ 2 - λ * (x1 - x2)

I = Differential(x1);
J = Differential(x2);
K = Differential(λ);

expand_derivatives(I(g)) # partial derivative of f wrt to x1

expand_derivatives(J(g)) # partial derivative of f wrt to x2

expand_derivatives(K(g)) # partial derivative of f wrt to λ

This means that we have following partial derivatives,

$$ \begin{align*} \frac{\partial \mathcal{L}}{\partial x_1} & \rightarrow 30 + x_2 - 6x_1 -\lambda = 0 \\ \frac{\partial \mathcal{L}}{\partial x_2} & \rightarrow 25 + x_1 + \lambda - 10x_2 = 0 \\ \frac{\partial \mathcal{L}}{\partial \lambda} & \rightarrow -x_1 + x_2 = 0 \end{align*} $$

Now we have three equations and three unknowns, which means that we could solve for $x_1$, $x_2$ and $\lambda$. We see that this is a system of linear equations, which means that we can actually use the computer to solve the problem for us. How can we do this? Well this system can be written in matrix format. Once it is in matrix format, we can easily solve it via the backslash \ operator in Julia.

To write the system in terms of a matrix we will need to reorder our equations,

$$ \begin{align*} - 6x_1 + x_2 -\lambda & = -30 \\ x_1 - 10x_2 + \lambda & = -25 \\ -x_1 + x_2 & = 0 \end{align*} $$

Now we can write it in matrix form,

$$ \begin{bmatrix} -6 & 1 & -1 \\ 1 & -10 & 1 \\ -1 & 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \\ \lambda \end{bmatrix} = \begin{bmatrix} -30 \\ -25 \\ 0 \end{bmatrix} $$

This can be written in compact mathematical form as $Ax = b$. We then need to solve for $x = A^{-1}b$. We can solve this with Julia as follows,

A = [-6 1 -1; 1 -10 1; -1 1 0]
b = [-30; -25; 0]

x_ans = A \ b

3-element Vector{Float64}:
  3.9285714285714284
  3.928571428571429
 10.35714285714286

The solution here shows that the optimal point will be where $x_1 = 3.93, x_2 = 3.93$. Below we will draw a surface plot that shows the profit function, and next to it we have a contour plot which is a two dimensional projection of the three dimensional surface plot. We first write out the objective function and constraint as well as the range of $x_1$ and $x_2$ values over which we are going to be plotting the functions.

f(x1, x2) = -5 .+ 30 .* x1 .- 3 .* x1 .^ 2 .+ 25 .* x2 .- 5 .* x2 .^ 2 .+ x1 .* x2 # objective function
c(z) = z # constraint
x = -5:0.05:10	# grid of x values (x1 and x2)

-5.0:0.05:10.0

Next up is the plotting code. For now it is good enough to simply take a look at the code, you don't have to understand what is happening here. We will do something similar for the consumer problem which we cover next. In that instance we will discuss the plotting functionality in more detail.

p1 = surface(x, x, (x, y) -> f(x, y), xlab = L"x_1", ylab = L"x_2", legend = false, alpha = 0.9)

scatter3d!(p1, [x_ans[1]], [x_ans[2]], [f(x_ans[1], x_ans[2])], markercolor = :red)

p2 = contour(x, x, (x,y) -> f(x, y), lw = 1.5, xlab = L"x_1", ylab = L"x_2", legend = false)
plot!(p2, c, -5, 10, label = "", lw = 2, color = :black)
scatter!(p2, [x_ans[1]], [x_ans[2]], markersize = 5, markercolor = :red)

plot(p1, p2, size = (800, 400))

UndefVarError: surface not defined

Stacktrace:
 [1] top-level scope
   @ In[25]:1
 [2] eval
   @ ./boot.jl:373 [inlined]
 [3] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
   @ Base ./loading.jl:1196

Inequality constraint: Consumer problem¶

We have worked with a problem where there was an equality constraint, now we are going to introduce the idea of an inequality constraint. This section will be a bit more tricky, since we are going to employ several methods to solve the problem. I will indicate which methods are worthwhile spending a lot of time on and which you can just simply browse through. The code here will become a bit more complex, but I will try and explain everything as clearly as possible.

Consider the following consumer problem. You should be quite familiar with this, since it is very similar to the one that we covered in chapter 4 of Williamson last year.

We have an economy with two goods and the following components,

utility function $u(x_1,x_2):\mathbb{R}^2_{+}\rightarrow\mathbb{R}$,
exogenous income $I$, and
price-vector $(p_1,p_2)$,

In the textbook, from Chapter 4, the inputs into the utility function are consumption and leisure. You might recognise the problem the components $x_1 = C_1$ and $x_2 = l_1$. These two goods form part of a consumption bundle. The problem that we are posing in this notebook is more general, but also applicable to the textbook chapters.

We can write this problem as follows,

$$ \begin{aligned} V(p_{1},p_{2},I) & = \max_{x_{1},x_{2}}u(x_{1},x_{2})\\ \text{s.t.}\\ p_{1}x_{1}+p_{2}x_{2} & \leq I,\,\,\,p_{1},p_{2},I>0\\ x_{1},x_{2} & \geq 0 \end{aligned} $$

Since this is quite a simple problem, we can solve it by hand. The requires that we setup a Lagrangian and then work out the first order conditions.

Utility function¶

We have worked with the quadratic utility function before. However, that is function is not frequently used. We will use the Cobb-Douglas utility function, which is given by,

$$ u(x_1,x_2) = x_1^{\alpha}x_2^{1-\alpha} $$

function u_func(x1, x2, α = 0.5)
    return x1 .^ α .* x2 .^ (1 .- α)
end;

We can evaluate this utility function for different values of the inputs $x_1$ and $x_2$.

x1 = 1
x2 = 3

u = u_func(x1, x2)

println("x1 = $x1, x2 = $x2 -> u = $u")

x1 = 1, x2 = 3 -> u = 1.7320508075688772

We can also loop through multiple values for $x_1$ to see how the values for the utility function changes. This reflects the elasticity of utility with respect to $x_1$

x1_values = [2, 4, 6, 8, 10, 12, 14, 16]
x2 = 3

for x1 in x1_values
    u = u_func(x1, x2)

    println("x1 = $x1, x2 = $x2 -> u = $u")
end

x1 = 2, x2 = 3 -> u = 2.4494897427831783
x1 = 4, x2 = 3 -> u = 3.4641016151377544
x1 = 6, x2 = 3 -> u = 4.242640687119285
x1 = 8, x2 = 3 -> u = 4.898979485566357
x1 = 10, x2 = 3 -> u = 5.477225575051661
x1 = 12, x2 = 3 -> u = 5.999999999999999
x1 = 14, x2 = 3 -> u = 6.4807406984078595
x1 = 16, x2 = 3 -> u = 6.928203230275509

For a little nicer output we can use the enumerate command,

for (i, x1) in enumerate(x1_values)
    u = u_func(x1, x2)

    println("$i: x1 = $x1, x2 = $x2 -> u = $u")
end

1: x1 = 2, x2 = 3 -> u = 2.4494897427831783
2: x1 = 4, x2 = 3 -> u = 3.4641016151377544
3: x1 = 6, x2 = 3 -> u = 4.242640687119285
4: x1 = 8, x2 = 3 -> u = 4.898979485566357
5: x1 = 10, x2 = 3 -> u = 5.477225575051661
6: x1 = 12, x2 = 3 -> u = 5.999999999999999
7: x1 = 14, x2 = 3 -> u = 6.4807406984078595
8: x1 = 16, x2 = 3 -> u = 6.928203230275509

We can now see how utility changes with different values for $x_1$. In this case it seems that there is an increase in utility associated with an increase in $x_1$.

In mathematical terms, it seems that the utility function is monotonically increasing with respect to changes in $x_1$. The way that we would determine the change in the utility function with respect to $x_1$ over the entire function is by using the tools of calculus.

Plotting the utility function¶

The next thing we might want to do in our analysis is to plot the utility function. We want some idea of what the function looks like graphically. We could draw it by hand, but it is much easier to utilise the computer. We have also done this before with the profit function.

First, we are going to replicate the indifference curve similar to Figure 4.1 in the textbook.

function indifference_plot()

    x = 0:0.01:5 # Define the range of possible values for x1 and x2. 
    b = [0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4]

    plot(x, x, (x, y) -> u_func(x, y), st = :contour, colour = :viridis, levels = b, legend = false, lw = 2)
    
end;

p_ind = indifference_plot()

Here we have a few indifference curves for the consumer. Each indifference curve represents a set of consumption bundles among which the consumer is indifferent. Higher indifference curves represent higher welfare for the consumer.

Plotting utility in 3D¶

Next we are going to plot a 3D version of the utility function. We start by constructing our grids of values against which we can evaluate the function. We create equally spaced grids for $x_1$ and $x_2$ in the following way,

x1_values = collect(0:0.5:15);
x2_values = collect(0:0.5:15);

plot(x1_values, x2_values, (x1_values, x2_values) -> u_func(x1_values, x2_values), st = :surface, xlabel = L"x1", ylabel = L"x2", zlabel = L"u(x_1, x_2)", camera = (40, 60), colour = :viridis, legend = false, alpha = 0.9)

We have the graph for the utility function, now we need to think about finding a solution to our constrained optimisation problem. We want to achieve the highest level of utility, given the constraint that we are provided.

Budget and preferences¶

We have already plotted the utility function, so let us now plot the utility function with our budget constraint and the optimal point. Remember, the budget constraint is given by,

$$ p_{1}x_{1}+p_{2}x_{2} \leq I,\,\,\,p_{1},p_{2},I>0,\,\, x_{1},x_{2} \geq 0 $$

We can combine the budget and utility together in a contour plot. For this problem we are going to assume that $p1 = 1$, $p2 = 2$ and that $I = 10$.

p1 = 1;
p2 = 2;
I = 10;

x = 0:0.1:10

b = collect(0:0.5:5);
c(z) = 10 - 2 .* z;

plot_utility = plot(x, x, (x, y) -> u_func(x, y), st = :contour, colour = :ice, levels = b, legend = false, lw = 1.5);

plot!(plot_utility, c, 0.01, 10, ylims = (0,10), lw = 1.5, color = :black, fill=(0,0.5,:blue))

Lagrangian 'by hand'¶

As a reminder, the Cobb-Douglas utility function is

$$ u(x_1,x_2) = x_1^{\alpha}x_2^{1-\alpha} $$

then we can pose the Lagrangian as follows,

$$ \mathcal{L}(x_1,x_2,\mu) = \left(x_1^{\alpha}x_2^{1-\alpha} \right) - \mu(p_1x_1 - p_2x_2 - I) $$

This means that we have following partial derivatives,

$$ \begin{align*} \frac{\partial \mathcal{L}}{\partial x_1} & \rightarrow \alpha x_1^{\alpha - 1}x_2^{1-\alpha} - \mu p_1 = 0 \\ \frac{\partial \mathcal{L}}{\partial x_2} & \rightarrow (1 - \alpha) x_2^{-\alpha}x_1^{\alpha} + \mu p_2 = 0 \\ \frac{\partial \mathcal{L}}{\partial \lambda} & \rightarrow - p_1x_1 + p_2x_2 + I = 0 \end{align*} $$

Previously we had that the system was one of linear equations. However, now we have some nonlinearity in our equations. Terms like $x_1^{\alpha - 1}$ are nonlinear. This means that we cannot use our previous solution method of putting things into a matrix. At this stage we can solve the system of nonlinear equations with the NLsolve package in Julia. However, let us try and solve this by hand first. We have three equations and three unknowns.

We can rearrange the last equation and then we have $x_1 = \frac{p_2x_2}{p_1} + \frac{I}{p_1}$. We can also set the first and second equation equal to each other since they both share the term $\mu$.

$$ \begin{align*} \frac{\alpha x_1^{\alpha - 1}x_2^{1-\alpha}}{p_1} & = - \frac{(1 - \alpha) x_2^{-\alpha}x_1^{\alpha}}{p_2} \\ \frac{\alpha x_2}{x_1p_1} & = - \frac{(1 - \alpha)}{p_2} \\ x_2 & = - \frac{(1 - \alpha)x_1p_1}{\alpha p_2} \\ \end{align*} $$

Substitute this into our equation for $x_1$ and then we have,

$$ \begin{align*} & x_1 = -\frac{p_2\frac{(1 - \alpha)x_1p_1}{\alpha p_2}}{p_1} + \frac{I}{p_1} \\ & x_1 = -\frac{\frac{(1 - \alpha)x_1p_1}{\alpha}}{p_1} + \frac{I}{p_1} \\ & x_1 = -\frac{(1 - \alpha)x_1}{\alpha} + \frac{I}{p_1} \\ & x_1 + \frac{x_1 - \alpha x_1}{\alpha} = \frac{I}{p_1} \\ & x_1 + \frac{x_1}{\alpha} - x_1 = \frac{I}{p_1} \\ & x_1^{*} = \alpha\frac{I}{p_1} \end{align*} $$

Now we finish off by substituting $x_1^{*}$ into $x_2$,

$$ \begin{align*} x_2^{*} & = - \frac{(1 - \alpha)\alpha\frac{I p_1}{p_1}}{\alpha p_2} \\ & = - (1 - \alpha)\frac{I}{p_2} \end{align*} $$

The solution is then provided algebraically as,

$$ \begin{aligned} x_1^{\ast} &= \alpha \frac{I}{p_1} \\ x_2^{\ast} &= -(1 - \alpha) \frac{I}{p_2} \end{aligned} $$

which implies that $\alpha$ is the budget share of the first good and $1-\alpha$ is the budget share of the second good.

While this is not a difficult problem to solve by hand, it would be way more convenient to be able to do this on a computer. One of the main reasons is that if the utility function changes we don't need to do the whole process again, we simply need to alter the utility function in our code and we can calculate the new solution.

Optimisation with Julia¶

In this next section we talk about two ways (algorithms) for finding a solution to our consumer optimisation problem. Once again, remember that the problem that we want to solve is the following,

$$ \begin{aligned} V(p_{1},p_{2},I) & = \max_{x_{1},x_{2}}u(x_{1},x_{2})\\ & \text{s.t.}\\ p_{1}x_{1}+p_{2}x_{2} & \leq I,\,\,\,p_{1},p_{2},I>0\\ x_{1},x_{2} & \geq 0 \end{aligned} $$

Algorithm 1: Using a Solver (Optim)¶

The best option is to try and use code that someone else has written.

Sometimes it is useful to write your own routines if no other option is available. However, if you can make use of a package from a professional software developer that is the first prize.

Do not try and reinvent the wheel. Some of you might end of being software engineers, and then you can spend countless hours trying to write performant code. However, most of you are going to use code for a specific goal.

My ideal world is one where everyone knows something about software engineering and we students are aware of best programming practices. However, I am aware not everyone is interested in these topics.

In this section we will be making use of Optim.jl. This package does not allow constrained optimisation, so we are going to have to be a bit creative with respect to the construction of constraints for the problem.

We will be working with a package that deals directly with constrained optimisation for problems with nonlinear objective functions in the next section, but it is worthwhile seeing what Optim.jl can do in the meantime.

We will be wrapping our code block in a function. This is generally good practice. We want to operate by writing functions whenever possible.

function value_of_choice(x)

    α = 0.25
    I = 10
    p1 = 1
    p2 = 2

    x1 = x[1]
    x2 = x[2]

    penalty = 0

    exp = p1 * x1 + p2 * x2 # total expenses

    if exp > I # expenses > income is not allowed
        fac = I/exp
        penalty = penalty + 1000 * (exp - I) # calculate the penalty
        x1 = x1 * fac
        x2 = x2 * fac
    end

    return -u_func(x1, x2, α) + penalty
end

value_of_choice (generic function with 1 method)

I = 10
p1 = 1
p2 = 2

initial_x = [I/p1/2, I/p2/2]

res = Optim.optimize(value_of_choice, initial_x)

 * Status: success

 * Candidate solution
    Final objective value:     -3.388508e+00

 * Found with
    Algorithm:     Nelder-Mead

 * Convergence measures
    √(Σ(yᵢ-ȳ)²)/n ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    77
    f(x) calls:    145

Optim.minimizer(res)

2-element Vector{Float64}:
 2.4997703025669873
 3.7501148390997785

Algorithm 2: Using a Solver (JuMP)¶

In the next section we will use a really cool package called JuMP.jl to solve the model.

model = Model(Ipopt.Optimizer)
@variable(model, x1 >= 0)
@variable(model, x2 >= 0)
@NLobjective(model, Max, x1 ^ 0.25 * x2 ^ 0.75)
@NLconstraint(model, x1 + 2x2 <= 10)
print(model)

optimize!(model)

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

This is Ipopt version 3.14.4, running with linear solver MUMPS 5.4.1.

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:        2
Number of nonzeros in Lagrangian Hessian.............:        3

Total number of variables............................:        2
                     variables with only lower bounds:        2
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:        1
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        1

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  9.9999900e-03 0.00e+00 2.92e-01  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  1.1520025e-01 0.00e+00 6.71e-01  -1.0 3.15e-01    -  6.17e-01 1.00e+00f  1
   2  1.6555282e+00 0.00e+00 3.84e-01  -1.0 4.54e+00    -  1.58e-01 1.00e+00f  1
   3  3.1361665e+00 0.00e+00 2.81e-02  -1.0 4.38e+00    -  7.25e-01 1.00e+00f  1
   4  3.3631647e+00 0.00e+00 1.06e-03  -1.7 6.57e-01    -  1.00e+00 1.00e+00f  1
   5  3.3880704e+00 0.00e+00 5.88e-05  -3.8 7.32e-02    -  9.95e-01 1.00e+00f  1
   6  3.3885056e+00 0.00e+00 7.18e-09  -5.7 1.28e-03    -  1.00e+00 1.00e+00f  1
   7  3.3885075e+00 0.00e+00 1.69e-13  -8.6 5.54e-06    -  1.00e+00 1.00e+00f  1

Number of Iterations....: 7

                                   (scaled)                 (unscaled)
Objective...............:  -3.3885075144174759e+00    3.3885075144174759e+00
Dual infeasibility......:   1.6871971243448143e-13    1.6871971243448143e-13
Constraint violation....:   0.0000000000000000e+00    0.0000000000000000e+00
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   2.5079494571310116e-09    2.5079494571310116e-09
Overall NLP error.......:   2.5079494571310116e-09    2.5079494571310116e-09


Number of objective function evaluations             = 8
Number of objective gradient evaluations             = 8
Number of equality constraint evaluations            = 0
Number of inequality constraint evaluations          = 8
Number of equality constraint Jacobian evaluations   = 0
Number of inequality constraint Jacobian evaluations = 8
Number of Lagrangian Hessian evaluations             = 7
Total seconds in IPOPT                               = 2.194

EXIT: Optimal Solution Found.

@show value(x1)

value(x1) = 2.500000004350334

2.500000004350334

@show value(x2)

value(x2) = 3.7499999991241637

3.7499999991241637

x_ans = value(x1), value(x2)

(2.500000004350334, 3.7499999991241637)

If we compare these values to our solution from before we will see that they are quite similar.

Plotting the solution¶

Next up we plot the solution to the problem.

function u_func(x1, x2, α = 0.5)
    return x1 .^ α .* x2 .^ (1 .- α)
end;

x = 0:0.01:10

b = collect(0:0.1:5);
c(z) = 5 .- 1/2 .* z;

p1 = contour(x, x, (x, y) -> u_func(x, y), lw = 1.5, xlab = L"x_1", ylab = L"x_2", legend = false)
plot!(p1, c, 0, 10, lw = 2, color = :black, label = "", fill = (0, 0.5, :blue))
scatter!(p1, [x_ans[1]], [x_ans[2]], markersize = 5, markercolor = :red)

Macro 318: Tutorial #3 ¶

Optimisation and the consumer problem ¶

Lecturer: Dawie van Lill (dvanlill@sun.ac.za)

Introduction¶

Learning objectives¶

Unconstrained optimisation¶

Optimisation and gradients?¶

Example: Profit maximisation¶

Exercise¶

Local optima via Julia (optional)¶

Optim for optimisation¶

Constrained optimisation¶

Equality constraint: Profit maximization revisited¶

Lagrangian 'by hand'¶

Inequality constraint: Consumer problem¶

Utility function¶

Plotting the utility function¶

Plotting utility in 3D¶

Budget and preferences¶

Lagrangian 'by hand'¶

Optimisation with Julia¶

Algorithm 1: Using a Solver (Optim)¶

Algorithm 2: Using a Solver (JuMP)¶

Plotting the solution¶

Lecturer:
Dawie van Lill (dvanlill@sun.ac.za)

`Optim` for optimisation¶