QTM 350 - Data Science Computing

Lecture 15 - AI-Assisted Programming, APIs, and Agents (Continued)

Danilo Freire

Emory University

Nice to see you all again!
How are you all doing? 😊

Tavily discount code

LLM APIs 🧑🏻‍💻

LLM APIs

  • Application Programming Interfaces (APIs) are a way to interact with a model using code
  • They allow you to send a prompt to a model and get a response back
  • In contrast with user interfaces, APIs are made for computers
    • They connects computers or pieces of software to each other
  • APIs can do a lot of things, but we mostly associated with web APIs, which allow computers to interact with websites or servers on the internet
  • We can do the same with LLMs too!
  • Instead of going to a website, we can use an API straight from software

Source: Cloud Now

Roo Code

Documentation: https://docs.roocode.com/

Setting up Roo Code

  • First, install the Roo Code extension in VSCode
  • Open the extension (rocket icon 🚀) and let’s configure our first LLM!
  • Click on the gear (top right) and you will be taken to the settings
  • You will see a series of options, let’s see one by one
  • … but we need API keys!
  • In most cases, API credits are usually cheaper than monthly subscriptions for the average user
  • You can get them directly from the model’s website, or from OpenRouter
  • The good thing is, OpenRouter has many free models available too!

Adding API keys

  • OpenRouter provides a uniform way to interact with many APIs
  • And it is super easy to use: just sign up and get your API key
  • Then, we are doing to select the models we will use in Roo Code settings
  • For this example, we will use the free version of DeepSeek
  • Feel free to test other free models (or paid ones) available in OpenRouter
  • Go to https://openrouter.ai/ and sign up for an account (it is free, no credit card or subscription needed)

Adding API keys

  • Then, click on your picture and Keys
  • Copy the key and paste it in the Roo Code settings
    • You will only see the key once, so make sure to save it somewhere safe!
  • You can create as many keys as you want, and revoke them at any time
  • In Credit Limit, you can put a limit on how much you want to spend
  • Feel free to put $0.00 if you want to use only the free models
  • More models here: https://openrouter.ai/models

Adding API keys

  • Now, let’s go back to Roo Code settings and add the API key
  • Just copy and paste the key in the OpenRouter API Key field
  • We are almost ready to go!
  • The last step is to select the models we want to use
  • Model choice depends on the task you have at hand, but a general model is a good start
  • Note that here we are not running the models locally, but running them on (someone else’s) server
    • So while you gain speed, you lose privacy

Adding API keys

Running a model in Roo Code

Using agents in Roo Code

  • So far, we have seen how to use Roo Code to interact with APIs
  • It already saves you a lot of time, as you don’t have to go to a website to get the information you need
  • But what if you could automate even more tasks?
  • That’s where agents come in!
  • Agents are like chatbots that can act autonomously, including running code, sending emails, and interacting with websites
  • This is the future of AI, and it is already here!
  • For coders, one of the best uses of agents is to let it write and run code for you
  • It is the closest thing to “vibes programming” (😂) we have so far

Using agents in Roo Code

  • Let’s ask Roo Code to do the following:
    • Create a new folder called document
    • Create a new Quarto file called api.qmd
    • Write a few paragraphs about the importance of APIs
    • Save the file
    • Render the file and open it in VSCode
    • We could ask it to send it to GitHub, but we will do it manually just for the sake of it
  • Prompt:
    • Create a new folder called document in my current folder. Create a new Quarto file called api.qmd. Write three paragraphs about the importance of APIs. Save the file. Render the file and open it in VSCode.
  • And see the magic happen! 🪄

Using agents in Roo Code

Output

Completely automated! 🤖

Cool, right? 🤓

Alternatives to Roo Code and DeepSeek

  • Roo Code is not the only tool that allows you to interact with APIs
  • Other popular choices are Cline and Continue
  • They are all free to use, and you can choose the one that best fits your needs
    • Roo Code is actually a fork of Cline, but with more features
    • Continue is a new tool that is gaining popularity
  • They perform similar tasks and have a similar interface, so feel free to try them all 😉

Alternatives to Roo Code and DeepSeek

  • DeepSeek is not the only free model available in OpenRouter
  • Two models that are worth mentioning are Google Gemini and Qwen
  • Gemini offers free access to all its models, including their “thinking” ones
  • Now that you know how to use APIs, you can get an API key for free at https://aistudio.google.com/ and use it in VS Code
  • Qwen is a model specialised in coding, and it is great for writing code snippets
  • An advantage that both have over DeepSeek is that they are multimodal, meaning they can generate text, images, and code
  • For instance, you can input a screenshot on Gemini and ask it to transcribe and a translate a text, or Qwen to recreate a table in Markdown

Giving internet access to your editor 🌐

Browser-use

  • We have seen how to use LLMs locally, how to interact with APIs, and how to create agents
  • But what if you could browse the internet straight from your code editor?
  • OpenAI’s Operator was released to much fanfare and it allows you to do just that
  • But there is a free and open source alternative called Browser-use
  • It is a Python library that uses LLM APIs to interact with websites, extract information, and perform tasks
  • You can use it to scrape data from websites, fill out forms, send emails, and much more

Browser-use Web-ui

  • To make things more convenient, Browser-use comes with a web interface that allows you to interact with websites visually
  • In this case, we will need a model with vision capabilities, such as Google Gemini or Anthropic’s Claude
  • Let’s get an API from Google and test it in Browser-use!
  • But first, we need to install the software
  • You can find the instructions here

Browser-use Web-ui

Instructions

Step 1: Clone the Repository

git clone https://github.com/browser-use/web-ui.git
cd web-ui

Step 2: Set Up Python Environment They recommend using uv. Here: https://docs.astral.sh/uv/getting-started/installation/

curl -LsSf https://astral.sh/uv/install.sh | sh 
# pip install uv
uv venv --python 3.11

Activate the virtual environment:

  • Windows (Command Prompt):
.venv\Scripts\activate
  • Windows (PowerShell):
.\.venv\Scripts\Activate.ps1
  • macOS/Linux:
source .venv/bin/activate

Step 3: Install Dependencies Install Python packages:

uv pip install -r requirements.txt

Install Playwright:

playwright install --with-deps

Step 4: Configure Environment Create a copy of the example environment file:

  • Windows (Command Prompt):
copy .env.example .env
  • macOS/Linux/Windows (PowerShell):
cp .env.example .env
  1. Open .env in your preferred text editor and add your API keys and other settings

How to create a Google API key

  • It’s super easy to create a Google API key! You can do it in a few steps:
  • Go to the Google Cloud Console and sign in with your Google account
  • Create a new project or select an existing one (you can name it whatever you want)
  • Now, go to https://aistudio.google.com/apikey and click on “Get API Key” and “Create API Key”
  • Select the project you created or selected earlier
  • Copy the API key and paste it in the .env file in the GOOGLE_API_KEY field
  • Check if you are using the Free Plan (no credit card needed)
  • And that’s it! You are ready to use Google Gemini in Browser-use! 🤖

How to create a Google API key

https://console.cloud.google.com/

How to create a Google API key

How to create a Google API key

How to create a Google API key

https://aistudio.google.com/apikey

Browser-use Web-ui

.env file and running the server

  • Here is an example of the .env file
  • You just need to add your API key to the GOOGLE_API_KEY field and you are good to go!
OPENAI_ENDPOINT=https://api.openai.com/v1
OPENAI_API_KEY=

ANTHROPIC_API_KEY=

GOOGLE_API_KEY=XXXXXXXX

AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_API_KEY=

DEEPSEEK_ENDPOINT=https://api.deepseek.com
DEEPSEEK_API_KEY=
  • To run the server, just type the command below in the web-ui folder:
    • python webui.py --ip 127.0.0.1 --port 7788
    • If you need to install dotenv, you can do it by running uv pip install python-dotenv
    • Then run .venv/bin/python webui.py --ip 127.0.0.1 --port 7788

Browser-use Web-ui in action

Browser-use Web-ui in action

Browser-use Web-ui in action

Browser-use Web-ui in action

Task: go to google.com and find out who won the Oscar for best international film in 2025

Browser-use Web-ui in action

Result: The Guardian article says 'I’m Still Here wins Oscar for best international film, becoming first Brazilian film to do so'

That’s super cool 🤩

What did we learn today?

  • We can run LLMs locally using Ollama and LM Studio
    • Ollama is a command-line tool that allows you to run pre-trained models on your computer
    • LM Studio is a graphic interface that allows you to run any model from Hugging Face
  • We can interact with APIs using Roo Code
    • Roo Code is an extension for VSCode that allows you to interact with many APIs
    • You can use it to run code, write text, and analyse images faster
  • We can use agents to automate tasks
    • Agents are like chatbots that can act autonomously
  • We can browse the internet straight from our code editor
    • Browser-use is a Python library that allows you to interact with websites, extract information, and perform tasks
    • It comes with a web interface that allows you to interact with websites visually

And that’s it for now! 🎉

Thank you for your attention! 😊