QTM 350 - Data Science Computing

Lecture 13 - AI-Assisted Programming, APIs, and Agents

Danilo Freire

Emory University

Nice to see you all again!
How are you all doing? 😊

Brief recap of last class 📚

What did we learn?

  • We have learned a lot of things so far! 🤓
  • Command line, Git, GitHub, Quarto, AI tools, and more! 💻
  • Congratulations on your progress! 🎉
  • Today we continue with AI-Assisted Programming! 🤖
  • Instead of just using a chatbot on a website, we will learn how to run LLMs locally, how to interact with APIs, and how to create agents! 🚀
  • This is a growing area that has been changing quickly, so it’s important to keep up with the latest trends! 📈
  • With the increasing availability of open source tools, it’s easier and cheaper than ever to build AI applications
  • Specifically, we will learn how to use Ollama and LM Studio, which offer many pre-trained models that can be used for a variety of tasks
  • In order to run different models in VSCode, we will use Roo Code, an extension that allows us to add any model that uses an API
  • We will also see what OpenRouter is, and how it can be used to interact with APIs
  • Finally, we will see how they will help you write code, text, and analyse images faster, all straight from your code editor!
  • Let’s get started! 🤓

Running LLMs locally 🤖

Local Language Models (LLMs)

  • We all know how to use LLMs on websites, so why should we run them on your computer?
  • There are several reasons why you should consider this:
    • Privacy: you don’t have to worry about your data being sent to a server
    • Speed: you can run the model locally, which is usually faster than running it on a server
    • Cost: you don’t have to pay for a server to run the model
    • Flexibility: you can customise the model to your needs (advanced)
  • There are several tools that allow you to run LLMs locally, and here we will focus on Ollama and LM Studio as they are the most user-friendly and popular tools
  • Let’s see how they work! 🚀

Ollama

  • Ollama is a command-line tool that makes it easy to run LLMs on your computer
  • Its models are already pre-trained, so they are ready to use out of the box
  • There are dozens of models avaiable, including DeepSeek, Microsoft’s phi-4, and Meta’s Llama 3
  • The only issue is that some models are quite large, so you need space and enough RAM to run them
  • But not all of them are: for example, you can download a distilled version of DeepSeek that is only 1.5GB
    • But if you have the space, you can literally run the same model that DeepSeek uses!
  • Let’s see how Ollama works!

Installing and running Ollama

  • First, download Ollama here: https://ollama.com/download
    • There are versions for Windows, Mac, Linux, and Docker, but you will run it in the terminal
  • After you have installed Ollama, the only thing you need to do is to download a model
    • You can do this by running ollama pull <model-name>
    • For example, to download DeepSeek, you would run ollama pull deepseek-r1
    • This will install the 4.7Gb model in your computer, but there others available
  • To run the model, just type ollama run <model-name>

DeepSeek in action

  • Here I’m running the 14b parameters version of DeepSeek
    • You can change the size of the model by adding :XX to the model name
    • For instance, to run the 1.5b version, you would run ollama run deepseek-r1:1.5b
  • You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models

More about Ollama

Commands

  • Here are some useful commands you can use with Ollama:
    • ollama serve: start the Ollama server
    • ollama pull <model-name>: download a model
    • ollama run <model-name>: run a model
    • ollama ls: list all available models
    • ollama ps: list the running model
    • ollama show <model-name>: get information about a model
    • ollama stop <model-name>: stop a running model
    • ollama rm <model-name>: remove a model
    • ollama help: get help
    • /bye: exit Ollama

What else can you do?

Creating a Modelfile

  • A Modelfile is a file that contains instructions for creating a model (duh! 😂)
  • It is written in YAML (no extension needed) with a specific structure
  • Some of the fields you can use are:
    • FROM: the base model to use
    • PARAMETER: the number of parameters to use (temperature, top-k, top-p, etc.)
    • SYSTEM: the message to be used in the template
    • MESSAGE: a message history for the model to use when responding
  • More about it here: https://github.com/ollama/ollama/blob/main/docs/modelfile.md
  • Let’s see an example of a Modelfile for a model called “Ironic Jeeves”!
  • Prompt:
    • Create an Ollama Modelfile for a model called 'Ironic Jeeves' that is an exceedingly ironic and sarcastic British butler. Use llama3.2:1b.
  • You can find the modelfile in our repository

Creating a Modelfile

Ironic Jeeves

FROM llama3.2:1b

# Let's crank up the chaos just a tad
PARAMETER temperature 1.5

# This butler has a long memory
PARAMETER num_ctx 4096

# He'll be a bit particular about repetition.
PARAMETER repeat_penalty 1.3

# Ensure he doesn't go off on wild tangents... at least not too often.
PARAMETER top_k 100

# A little control over randomness, since he's got to maintain
# *some* decorum.
PARAMETER top_p 0.9

# Now, for the most crucial part: the system prompt that molds
# this digital Jeeves!
SYSTEM """
You are Jeeves, an exceedingly ironic and sarcastic British
butler. You are the very definition of dry wit and passive-
aggressive politeness. Your primary function is to assist,
but you do so with an air of thinly veiled disdain and a
healthy dose of mockery.

Your vocabulary is that of a particularly well-read
individual, prone to using words that most people have to
look up. You use British phrases and slang frequently, but
in a way that is simultaneously authentic and mocking.

You are not overtly rude, but your responses drip with
irony and implication. You offer unsolicited 'helpful'
observations that are actually cutting remarks.

You may express exasperation with the
intelligence of your interlocutor, but always in the most
refined and understated manner possible.

Respond to all questions and requests with the utmost formal
politeness, even when your words suggest otherwise. Every
interaction is an opportunity to demonstrate the absurdity
of the situation.

For example, if asked, "Are you free now?", you might
reply: "Free? One is never truly free, burdened as we are
by the weight of expectation and the constant need to
attend to the whims of others. However, in this instance,
my schedule is currently...clear. What trivial matter
requires my immediate, and no doubt life-altering,
attention?"

Remember to be incredibly polite, even if you mean the
opposite.
"""

Running Ironic Jeeves

  • To run Ironic Jeeves, you need to save the Modelfile in a file called jeeves (again, no extension needed)
  • Then, you can run the model by typing ollama create ironic_jeeves -f jeeves
  • Finally, ollama run ironic_jeeves to start the model

Why would you use a Modelfile?

  • There are several applications for a Modelfile:
  • You can create a model that is tailored to your needs
    • This is particularly useful if you need to include specific information or if you want to create a model that is not available in Ollama
    • For example, you can create a model that is an expert in a specific field, such as medicine, law, or finance
  • You can also use a Modelfile to create a model that is more efficient
    • For example, you can create a model that is faster or uses less memory than the default models
  • LLMs contains too much information and sometimes you need to filter out the noise
  • You can also use a Modelfile to create a model that is more secure
    • For example, you can create a model that does not store any information about the user, or that you can safely run in a server in a secure environment
  • They are also cost effective, so you can use them in your own projects or in small companies

LM Studio 🤖

LM Studio

  • If you prefer a more user-friendly interface, you can use LM Studio
  • It provides a graphic interface to run LLMs, very much like ChatGPT/DeepSeek, and allows you to run any model from Hugging Face (more than 1 million models available!)
  • Think of Hugging Face as the GitHub of NLP models: you can find, share, and run models from there
  • Download the app from https://lmstudio.ai/download and run it
  • Then, type command + shift + M to choose a model
  • Note: Don’t run Ollama and LM Studio at the same time, as they will use a lot of RAM and your computer may crash (ask me how I know 😅)

Running a model in LM Studio

Running a model in LM Studio

LLM APIs 🧑🏻‍💻

LLM APIs

  • Application Programming Interfaces (APIs) are a way to interact with a model using code
  • They allow you to send a prompt to a model and get a response back
  • In contrast with user interfaces, APIs are made for computers
    • They connects computers or pieces of software to each other
  • APIs can do a lot of things, but we mostly associated with web APIs, which allow computers to interact with websites or servers on the internet
  • We can do the same with LLMs too!
  • Instead of going to a website, we can use an API straight from software

Source: Cloud Now

Roo Code

Documentation: https://docs.roocode.com/

Setting up Roo Code

  • First, install the Roo Code extension in VSCode
  • Open the extension (rocket icon 🚀) and let’s configure our first LLM!
  • Click on the gear (top right) and you will be taken to the settings
  • You will see a series of options, let’s see one by one
  • … but we need API keys!
  • In most cases, API credits are usually cheaper than monthly subscriptions for the average user
  • You can get them directly from the model’s website, or from OpenRouter
  • The good thing is, OpenRouter has many free models available too!

Adding API keys

  • OpenRouter provides a uniform way to interact with many APIs
  • And it is super easy to use: just sign up and get your API key
  • Then, we are doing to select the models we will use in Roo Code settings
  • For this example, we will use the free version of DeepSeek
  • Feel free to test other free models (or paid ones) available in OpenRouter
  • Go to https://openrouter.ai/ and sign up for an account (it is free, no credit card or subscription needed)

Adding API keys

  • Then, click on your picture and Keys
  • Copy the key and paste it in the Roo Code settings
    • You will only see the key once, so make sure to save it somewhere safe!
  • You can create as many keys as you want, and revoke them at any time
  • In Credit Limit, you can put a limit on how much you want to spend
  • Feel free to put $0.00 if you want to use only the free models
  • More models here: https://openrouter.ai/models

Adding API keys

  • Now, let’s go back to Roo Code settings and add the API key
  • Just copy and paste the key in the OpenRouter API Key field
  • We are almost ready to go!
  • The last step is to select the models we want to use
  • Model choice depends on the task you have at hand, but a general model is a good start
  • Note that here we are not running the models locally, but running them on (someone else’s) server
    • So while you gain speed, you lose privacy

Adding API keys

Running a model in Roo Code

Using agents in Roo Code

  • So far, we have seen how to use Roo Code to interact with APIs
  • It already saves you a lot of time, as you don’t have to go to a website to get the information you need
  • But what if you could automate even more tasks?
  • That’s where agents come in!
  • Agents are like chatbots that can act autonomously, including running code, sending emails, and interacting with websites
  • This is the future of AI, and it is already here!
  • For coders, one of the best uses of agents is to let it write and run code for you
  • It is the closest thing to “vibes programming” (😂) we have so far

Using agents in Roo Code

  • Let’s ask Roo Code to do the following:
    • Create a new folder called document
    • Create a new Quarto file called api.qmd
    • Write a few paragraphs about the importance of APIs
    • Save the file
    • Render the file and open it in VSCode
    • We could ask it to send it to GitHub, but we will do it manually just for the sake of it
  • Prompt:
    • Create a new folder called document in my current folder. Create a new Quarto file called api.qmd. Write three paragraphs about the importance of APIs. Save the file. Render the file and open it in VSCode.
  • And see the magic happen! 🪄

Using agents in Roo Code

Output

Completely automated! 🤖

Cool, right? 🤓

Alternatives to Roo Code and DeepSeek

  • Roo Code is not the only tool that allows you to interact with APIs
  • Other popular choices are Cline and Continue
  • They are all free to use, and you can choose the one that best fits your needs
    • Roo Code is actually a fork of Cline, but with more features
    • Continue is a new tool that is gaining popularity
  • They perform similar tasks and have a similar interface, so feel free to try them all 😉

Alternatives to Roo Code and DeepSeek

  • DeepSeek is not the only free model available in OpenRouter
  • Two models that are worth mentioning are Google Gemini and Qwen
  • Gemini offers free access to all its models, including their “thinking” ones
  • Now that you know how to use APIs, you can get an API key for free at https://aistudio.google.com/ and use it in VS Code
  • Qwen is a model specialised in coding, and it is great for writing code snippets
  • An advantage that both have over DeepSeek is that they are multimodal, meaning they can generate text, images, and code
  • For instance, you can input a screenshot on Gemini and ask it to transcribe and a translate a text, or Qwen to recreate a table in Markdown

Giving internet access to your editor 🌐

Browser-use

  • We have seen how to use LLMs locally, how to interact with APIs, and how to create agents
  • But what if you could browse the internet straight from your code editor?
  • OpenAI’s Operator was released to much fanfare and it allows you to do just that
  • But there is a free and open source alternative called Browser-use
  • It is a Python library that uses LLM APIs to interact with websites, extract information, and perform tasks
  • You can use it to scrape data from websites, fill out forms, send emails, and much more

Browser-use Web-ui

  • To make things more convenient, Browser-use comes with a web interface that allows you to interact with websites visually
  • In this case, we will need a model with vision capabilities, such as Google Gemini or Anthropic’s Claude
  • Let’s get an API from Google and test it in Browser-use!
  • But first, we need to install the software
  • You can find the instructions here

Browser-use Web-ui

Instructions

Step 1: Clone the Repository

git clone https://github.com/browser-use/web-ui.git
cd web-ui

Step 2: Set Up Python Environment They recommend using uv. Here: https://docs.astral.sh/uv/getting-started/installation/

uv venv --python 3.11

Activate the virtual environment:

  • Windows (Command Prompt):
.venv\Scripts\activate
  • Windows (PowerShell):
.\.venv\Scripts\Activate.ps1
  • macOS/Linux:
source .venv/bin/activate

Step 3: Install Dependencies Install Python packages:

uv pip install -r requirements.txt

Install Playwright:

playwright install

Step 4: Configure Environment Create a copy of the example environment file:

  • Windows (Command Prompt):
copy .env.example .env
  • macOS/Linux/Windows (PowerShell):
cp .env.example .env
  1. Open .env in your preferred text editor and add your API keys and other settings

Browser-use Web-ui

.env file and running the server

  • Here is an example of the .env file
  • You just need to add your API key to the GOOGLE_API_KEY field and you are good to go!
OPENAI_ENDPOINT=https://api.openai.com/v1
OPENAI_API_KEY=

ANTHROPIC_API_KEY=

GOOGLE_API_KEY=XXXXXXXX

AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_API_KEY=

DEEPSEEK_ENDPOINT=https://api.deepseek.com
DEEPSEEK_API_KEY=
  • To run the server, just type the command below in the web-ui folder:
    • python webui.py --ip 127.0.0.1 --port 7788

Browser-use Web-ui in action

Browser-use Web-ui in action

Browser-use Web-ui in action

Browser-use Web-ui in action

Task: go to google.com and find out who won the Oscar for best international film in 2025

Browser-use Web-ui in action

Result: The Guardian article says 'I’m Still Here wins Oscar for best international film, becoming first Brazilian film to do so'

That’s super cool 🤩

What did we learn today?

  • We can run LLMs locally using Ollama and LM Studio
    • Ollama is a command-line tool that allows you to run pre-trained models on your computer
    • LM Studio is a graphic interface that allows you to run any model from Hugging Face
  • We can interact with APIs using Roo Code
    • Roo Code is an extension for VSCode that allows you to interact with many APIs
    • You can use it to run code, write text, and analyse images faster
  • We can use agents to automate tasks
    • Agents are like chatbots that can act autonomously
  • We can browse the internet straight from our code editor
    • Browser-use is a Python library that allows you to interact with websites, extract information, and perform tasks
    • It comes with a web interface that allows you to interact with websites visually

And that’s it for today! 🎉

Thank you for your attention!
Have a great rest of your day! 😊