DATASCI 101: Introduction to AI Applications

Lecture 08: Quiz 01 Review Session

Danilo Freire

Department of Data and Decision Sciences
Emory University

Welcome back! 🧠

Today’s plan

  • Quiz 01 is on Thursday, September 24 (50 minutes, in class)
  • Open laptop and notes; you can search the web
  • The quiz tests understanding and critical evaluation, not memorisation
  • Today is your session, not mine. I will ask, you will answer
  • We will work through:
    1. A rapid recap of the seven lectures
    2. One realistic scenario, broken down together
    3. Practice questions in pairs
    4. Open Q&A
  • Stop me whenever something is unclear. That is the whole point of today

Quiz logistics

  • Format: 5 short essay questions plus one bonus
  • Length: 50 minutes, in person
  • Material allowed: laptops, notes, lecture slides, web search, AI assistants
  • What is graded: your reasoning, not the polish of your AI’s prose
  • Honour code: write your own answers; if you used an AI, say which one
  • Coverage: Lectures 01 to 07

Why open laptop?

The job is not to remember definitions. The job is to recognise problems, judge trade-offs, and write something useful faster than the AI can.

Part 1: rapid recap 🚀

How this part works

  • I will name a concept from one of the seven lectures
  • One of you explains it in your own words (about 30 seconds)
  • The rest of us push back, add nuance, or correct
  • I will keep it moving. Wrong answers are useful; silence is not
  • We have about 12 minutes for this section

Concept 1: hallucination

  • Volunteer or victim? 🙋
  • In one sentence: what is an AI hallucination, and why does it happen?
  • Follow-ups for the room:
    • Is a hallucination the same as a lie?
    • Is a hallucination the same as a factual error?
    • Why might an open-laptop quiz punish someone who copy-pastes an LLM answer?

Concept 2: dataset and labels

  • Pick someone new
  • What does it mean to “label” a dataset, and why is it usually the most expensive step?
  • Quick room poll: in each case, is the label easy, hard, or contested?
    • “Is this email spam?”
    • “Is this tweet toxic?”
    • “Is this CT scan showing pneumonia?”
    • “Is this résumé a good fit for the role?”
  • Connect to: garbage in, garbage out

Concept 3: the three learning paradigms

  • Three students, one paradigm each. One sentence per person:
    • Supervised learning — ?
    • Unsupervised learning — ?
    • Reinforcement learning — ?
  • Then back to the room: name one real product that uses each
  • Trick question: where does training a chatbot with thumbs up / thumbs down fit? (answer: reinforcement learning from human feedback, RLHF; a minimal sketch of all three paradigms follows this list)
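
If it helps to see the three side by side, here is a minimal sketch. It assumes numpy and scikit-learn are installed; the two-armed bandit at the end is a toy stand-in for reinforcement learning, not RLHF itself.

    # Minimal sketch of the three paradigms (toy data; assumes numpy and scikit-learn)
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))                 # 200 examples, 2 features

    # 1. Supervised: labels y exist, and the model learns the map X -> y
    y = (X[:, 0] + X[:, 1] > 0).astype(int)       # toy label
    clf = LogisticRegression().fit(X, y)

    # 2. Unsupervised: no labels; the model finds structure in X alone
    clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)

    # 3. Reinforcement: no labels, only rewards after actions
    #    (toy two-armed bandit with an epsilon-greedy policy)
    true_payout = [0.3, 0.7]                      # hidden reward probabilities
    value, count = [0.0, 0.0], [0, 0]
    for _ in range(1000):
        arm = rng.integers(2) if rng.random() < 0.1 else int(np.argmax(value))
        reward = float(rng.random() < true_payout[arm])
        count[arm] += 1
        value[arm] += (reward - value[arm]) / count[arm]  # running mean of reward

    print(value)                                  # estimates approach the true payouts

The thumbs up / thumbs down case is the same loop with one swap: the reward comes from a human rating rather than a fixed payout.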

Concept 4: precision vs recall

  • This is the single most quiz-worthy concept of the block. 🎯
  • Volunteer to draw the confusion matrix on the board (or describe it)
  • Then the room debates:
    • A cancer screening model — should you optimise precision or recall? Why?
    • A spam filter for your inbox — same question
    • A criminal sentencing risk score — same question
  • Punchline: the answer depends on which mistake is more costly, and who pays the cost (the sketch below puts numbers on it)
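
To pin the definitions down, here is a back-of-the-envelope example. The numbers are invented for a screening-style classifier; the point is the formulas, not the data.

    # Precision and recall from a confusion matrix (invented numbers)
    # A screening model evaluated on 1,000 patients, 50 of whom are sick
    tp = 40    # sick and flagged          (true positives)
    fn = 10    # sick but missed           (false negatives)
    fp = 90    # healthy but flagged       (false positives)
    tn = 860   # healthy and not flagged   (true negatives)

    precision = tp / (tp + fp)  # of everything flagged, how much was right?
    recall    = tp / (tp + fn)  # of everything that should be flagged, how much was caught?
    accuracy  = (tp + tn) / (tp + fn + fp + tn)

    print(f"precision={precision:.2f}  recall={recall:.2f}  accuracy={accuracy:.2f}")
    # precision=0.31  recall=0.80  accuracy=0.90

Notice what the numbers do: accuracy sits at a comfortable 0.90 while precision says most flags are false alarms. Which metric you report is itself a choice about whose mistakes count.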

Concept 5: overfitting and validation

  • Cold call: what is overfitting, in plain English?
  • Follow-up: why do we split data into training and validation sets? (see the sketch after this list)
  • Room scenario:
    • You train a model on Emory student grades from 2018-2024
    • It scores 99% accuracy on those students
    • Should you trust it on the 2026 cohort? Why or why not?
  • Connect to: Goodhart’s law — when a measure becomes a target, it ceases to be a good measure
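
Here is the split in miniature, assuming numpy and scikit-learn. The labels are deliberately pure noise, so any training accuracy above chance is memorisation, and the validation set exposes it.

    # Overfitting caught by a validation split (labels are noise on purpose)
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 5))
    y = rng.integers(0, 2, size=300)       # random labels: there is nothing to learn

    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.3, random_state=0
    )

    model = DecisionTreeClassifier().fit(X_train, y_train)  # an unconstrained tree memorises

    print("training accuracy:  ", model.score(X_train, y_train))  # ~1.00
    print("validation accuracy:", model.score(X_val, y_val))      # ~0.50, i.e. chance

That is the grades scenario in two lines of output: the 99% you saw on the students you trained on is the first number, and it tells you nothing about the second.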

Concept 6: tokens and embeddings

  • Pair up with the person next to you. One minute. 🗣️
    • One of you explains tokenisation
    • The other explains embeddings
  • Then I pick two pairs to share
  • Rapid-fire room questions (a short sketch follows this list):
    • Why do LLM providers charge by the token and not by the word?
    • Why does the order of words in a sentence matter, even after embedding?
    • What is a context window, and why does it cost so much to make it bigger?
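
A minimal sketch of both halves of the pair exercise. It assumes the tiktoken package is installed (pip install tiktoken); the embedding vectors are invented four-dimensional toys, not weights from any real model.

    # Tokenisation with tiktoken, plus a toy picture of embeddings
    import numpy as np
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")    # encoding used by GPT-4-era models
    ids = enc.encode("Emory students love embeddings")
    print(ids)                                    # integer token ids, not words
    print([enc.decode([i]) for i in ids])         # the text piece behind each id

    # Embeddings: each token id maps to a vector; related meanings end up nearby
    cat = np.array([0.9, 0.1, 0.30, 0.00])        # invented for illustration
    dog = np.array([0.8, 0.2, 0.35, 0.05])
    car = np.array([0.0, 0.9, 0.10, 0.70])

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    print(cosine(cat, dog))   # high: related concepts sit close together
    print(cosine(cat, car))   # low: unrelated concepts sit far apart

This also answers the billing question: the unit of work is the token id, and one word is often several tokens. A context window is a cap on how many of those ids the model can attend to at once; in a standard transformer, attention cost grows roughly quadratically with that count, which is why bigger windows are expensive.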

Concept 7: multimodal

  • Last concept of the recap
  • What does “multimodal” mean for an AI system?
  • Room:
    • Name three modalities we discussed
    • Why is audio harder than text in some ways and easier in others?
    • When you upload a photo to ChatGPT, what is happening under the hood? (see the sketch below)
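
One common recipe, sketched here with random weights purely to check the shapes: a ViT-style encoder cuts the image into patches, flattens each patch, and projects it into the same embedding space the language model reads. Real systems use trained weights and positional information; take only the general shape of the pipeline from this sketch.

    # Toy ViT-style patch embedding (random weights, illustration only)
    import numpy as np

    rng = np.random.default_rng(0)
    image = rng.random((224, 224, 3))                 # stand-in for the uploaded photo

    patch = 16                                        # 224 / 16 = 14 patches per side
    patches = image.reshape(14, patch, 14, patch, 3).transpose(0, 2, 1, 3, 4)
    patches = patches.reshape(-1, patch * patch * 3)  # 196 patches, 768 numbers each

    W = rng.normal(size=(patch * patch * 3, 1024))    # a real encoder learns this matrix
    tokens = patches @ W

    print(tokens.shape)   # (196, 1024): a sequence the language model can attend to

From the language model's point of view, your photo becomes roughly 196 extra tokens in the context window, which is also why an image costs more to process than a short sentence.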

Part 2: a real scenario 🏥

The setup

A startup, HealthScribe, sells an AI tool to hospitals. The tool listens to doctor-patient conversations and writes the clinical note.

  • The model is a fine-tuned LLM
  • Training data: 80,000 doctor-patient transcripts from a single large hospital network in California, 2019-2024
  • The labels are the final clinical notes that doctors wrote and signed off on
  • HealthScribe reports 92% agreement with doctor-written notes on a held-out test set
  • They are now selling the product to hospitals across the country

🩺

We will work through this together.

I will not move on until I get answers from at least three different people on each question.

Question A: the dataset

  • What problems can you spot with the training data?
  • Hints, if the room is quiet:
    • Where did the data come from? Who is in it? Who is not in it?
    • The labels are doctor-written notes — is that the truth, or one person’s interpretation?
    • The data is from 2019-2024 — does medicine change over five years?
  • Try to name at least three distinct issues

Question B: the metric

  • HealthScribe reports 92% agreement with doctor-written notes
  • Volunteer: what is being measured here?
  • Then the room:
    • Is “agreement with the doctor” the same as getting it right?
    • What are the model and the doctor agreeing on when they agree?
    • On what kind of patient might the 8% disagreement concentrate?
  • This is a classic case of Goodhart’s law: optimising agreement is not the same as optimising care (the toy arithmetic below shows how)
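
A quick illustration with invented numbers, nothing from HealthScribe’s actual reporting: an aggregate agreement score can sit at 92% while the hardest cases do far worse.

    # Aggregate agreement can hide subgroup failure (all numbers invented)
    # Suppose 90% of test transcripts are routine visits and 10% are complex cases
    routine_share, complex_share = 0.90, 0.10
    routine_agreement, complex_agreement = 0.96, 0.56   # hypothetical subgroup scores

    overall = routine_share * routine_agreement + complex_share * complex_agreement
    print(overall)   # 0.92: the headline number, with the errors piled on the hardest patients

So the quiz-worthy move is to ask for the breakdown, not the headline.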

Question C: precision and recall

  • Imagine the AI’s job is to flag every mention of an allergy in the conversation, so it can be added to the patient record
  • The room decides:
    • Would you optimise precision (only flag what you are very sure of) or recall (flag everything that might be an allergy)?
    • Defend your choice with a real harm that comes from each kind of mistake
  • This is the single most likely structure for a quiz question. ✏️ The sketch below shows the dial you would actually turn.
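
In practice, choosing precision or recall is not choosing a different model; it is choosing a threshold on the model’s confidence. A toy sweep with invented scores:

    # The precision/recall trade-off is a threshold choice (invented scores)
    # Ten snippets with the model's allergy score; 1 = genuinely mentions an allergy
    scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.35, 0.20, 0.10, 0.05]
    truth  = [1,    1,    0,    1,    1,    0,    0,    1,    0,    0]

    def precision_recall(threshold):
        flagged = [t for s, t in zip(scores, truth) if s >= threshold]
        tp = sum(flagged)
        precision = tp / len(flagged) if flagged else 1.0
        recall = tp / sum(truth)
        return precision, recall

    for th in (0.85, 0.50, 0.15):
        p, r = precision_recall(th)
        print(f"threshold={th:.2f}  precision={p:.2f}  recall={r:.2f}")
    # threshold=0.85: precision 1.00, recall 0.40
    # threshold=0.15: precision 0.62, recall 1.00

Arguing for recall on the allergy flagger means arguing for the low threshold and accepting the false alarms that come with it; that trade is what your answer should defend.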

Question D: a deployment decision

  • Your hospital is considering buying HealthScribe
  • The room votes by show of hands:
    • Yes, buy it
    • No, do not buy it
    • Buy it, but with conditions
  • Whatever the room votes, I want one defender of each position to argue for 60 seconds
  • Then the room: what evidence would change your mind?

Part 3: practice questions 📝

How this part works

  • I will show you three practice questions, one at a time
  • They are written in the same style as the real Quiz 01
  • Pair up with the person next to you
  • You have two minutes per question to discuss your answer
  • Then I will pick a pair to share, and the room critiques
  • These are not the real quiz questions, but the format is identical

Practice question 1

A friend tells you: “ChatGPT is just autocomplete on steroids. It doesn’t actually understand anything.” Another friend says: “It got 90% on the bar exam, so clearly it understands law.” 🤷

Your task: who is closer to the truth, and why? In your answer:

  1. Explain what tokenisation and next-token prediction have to do with the first friend’s claim
  2. Explain why a high benchmark score is not the same as understanding (cite Goodhart’s law if you can)
  3. Take a position. Defend it in 3-4 sentences

Two minutes. Discuss with your partner. ⏱️

Practice question 2

A university wants to build an AI that flags students who are “at risk of dropping out.” They have:

  • Five years of student records (grades, attendance, financial aid status)
  • The label is “did this student drop out by the end of year 2”

Your task:

  1. What kind of learning problem is this (supervised, unsupervised, reinforcement)?
  2. What is the difference between precision and recall for this model, and which would you optimise? Defend your choice
  3. Name one ethical concern that does not appear anywhere in the technical metric

Two minutes. ⏱️

Practice question 3

You upload a photo of a handwritten recipe to ChatGPT and ask it to convert it to a typed shopping list. It gets the recipe mostly right, but lists “2 cups of sugar” when the recipe clearly says “2 cups of salt.” 🧂

Your task:

  1. What is happening at the tokeniser and vision encoder stage that could cause this kind of mistake?
  2. Is this a hallucination, a factual error, or both? Explain
  3. If you were Anthropic or OpenAI, what is one thing you would change to make this kind of error less likely?

Two minutes. ⏱️

Part 4: open Q&A 💬

Anything unclear

  • This is your last chance before Thursday
  • No question is too small. If you are unsure, others are too
  • Topics most often asked about in past semesters:
    • Embeddings — what does the vector “mean”?
    • Validation vs test set — what is the difference?
    • Multimodal — how does the model “see” an image?
    • The proxy problem — when is a metric a bad proxy?

Last reminders before Thursday

  • Thursday, September 24, 2:30 pm, this room
  • Bring your laptop and your charger 🔌
  • The quiz is on Canvas — log in before you arrive
  • Read each question twice before you start writing
  • If you are stuck, skip and come back — do not lose the easy points
  • Cite the AI tool you used (if any). Honesty over polish
  • You are ready. 💪

See you Thursday! 👋