use this skill /analyze-data
Using the analyze-data skill
The user mentioned wanting to "use this skill," which I interpret as referring to the /analyze-data skill. I need to follow instructions and invoke the Skill tool to start the analysis. However, I'm not seeing any specific questions or data from them. Since we already have the Titanic table loaded, perhaps the skill could profile it. I'll go ahead and call the Skill tool now to get things moving!
{
"skill": "analyze-data",
"args": ""
}
Launching skill: analyze-data
Planning SQL Queries
I'm considering using the parallel tool to run multiple SQL queries at once, which should speed things up. I want to show tables, describe the Titanic table, and check counts, sample data, and null rates. The wrapper only allows certain developer tools, but it looks like I can call multi_tool_use.parallel with five execute_sql commands. After getting the results, I can summarize the schema and quality, but I’ll check with the user for specific interests related to the data.
SHOW TABLES;

| name |
|---|
| titanic |
DESCRIBE titanic;

| column_name | column_type | null | key | default | extra |
|---|---|---|---|---|---|
| PassengerId | BIGINT | YES | | | |
| Survived | BIGINT | YES | | | |
| Pclass | BIGINT | YES | | | |
| Name | VARCHAR | YES | | | |
| Sex | VARCHAR | YES | | | |
| Age | DOUBLE | YES | | | |
| SibSp | BIGINT | YES | | | |
| Parch | BIGINT | YES | | | |
| Ticket | VARCHAR | YES | | | |
| Fare | DOUBLE | YES | | | |
| Cabin | VARCHAR | YES | | | |
| Embarked | VARCHAR | YES | | | |
SELECT COUNT(*) AS row_count FROM titanic;

| row_count |
|---|
| 891 |
SELECT * FROM titanic LIMIT 5;

| PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | | S |
| 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | | S |
| 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | | S |
SELECT 'PassengerId' AS column_name, COUNT(*)-COUNT(PassengerId) AS nulls, ROUND(100.0*(COUNT(*)-COUNT(PassengerId))/COUNT(*),1) AS null_pct FROM titanic
UNION ALL SELECT 'Survived', COUNT(*)-COUNT(Survived), ROUND(100.0*(COUNT(*)-COUNT(Survived))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'Pclass', COUNT(*)-COUNT(Pclass), ROUND(100.0*(COUNT(*)-COUNT(Pclass))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'Name', COUNT(*)-COUNT(Name), ROUND(100.0*(COUNT(*)-COUNT(Name))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'Sex', COUNT(*)-COUNT(Sex), ROUND(100.0*(COUNT(*)-COUNT(Sex))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'Age', COUNT(*)-COUNT(Age), ROUND(100.0*(COUNT(*)-COUNT(Age))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'SibSp', COUNT(*)-COUNT(SibSp), ROUND(100.0*(COUNT(*)-COUNT(SibSp))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'Parch', COUNT(*)-COUNT(Parch), ROUND(100.0*(COUNT(*)-COUNT(Parch))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'Ticket', COUNT(*)-COUNT(Ticket), ROUND(100.0*(COUNT(*)-COUNT(Ticket))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'Fare', COUNT(*)-COUNT(Fare), ROUND(100.0*(COUNT(*)-COUNT(Fare))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'Cabin', COUNT(*)-COUNT(Cabin), ROUND(100.0*(COUNT(*)-COUNT(Cabin))/COUNT(*),1) FROM titanic
UNION ALL SELECT 'Embarked', COUNT(*)-COUNT(Embarked), ROUND(100.0*(COUNT(*)-COUNT(Embarked))/COUNT(*),1) FROM titanic
ORDER BY null_pct DESC, column_name;

| column_name | nulls | null_pct |
|---|---|---|
| Cabin | 687 | 77.1 |
| Age | 177 | 19.9 |
| Embarked | 2 | 0.2 |
| Fare | 0 | 0 |
| Name | 0 | 0 |
| Parch | 0 | 0 |
| PassengerId | 0 | 0 |
| Pclass | 0 | 0 |
| Sex | 0 | 0 |
| SibSp | 0 | 0 |
| Survived | 0 | 0 |
| Ticket | 0 | 0 |
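
The null-rate query above depends on `COUNT(column)` skipping NULLs while `COUNT(*)` counts every row, so their difference is the number of NULLs. A minimal sketch of that pattern, assuming Python's built-in `sqlite3` module as a stand-in engine (not the engine used in this session) and a hypothetical five-row `sample` table with three missing Cabin values:

```python
import sqlite3

# Hypothetical stand-in table; None is stored as SQL NULL.
# sqlite3 is an assumption here, not the session's actual SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (PassengerId INTEGER, Cabin TEXT)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [(1, None), (2, "C85"), (3, None), (4, "C123"), (5, None)],
)

# COUNT(*) counts all rows; COUNT(Cabin) counts only non-NULL values,
# so the difference is the NULL count for that column.
nulls, null_pct = conn.execute(
    "SELECT COUNT(*) - COUNT(Cabin), "
    "ROUND(100.0 * (COUNT(*) - COUNT(Cabin)) / COUNT(*), 1) "
    "FROM sample"
).fetchone()
conn.close()

print(nulls, null_pct)
```

The `100.0` literal forces floating-point division before rounding; with an integer literal, some engines would truncate the percentage.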
{
"question": "What would you like to analyze in the Titanic data next? (You can pick more than one.)",
"options": [
{
"label": "Survival rate breakdowns",
"description": "Survival rates by sex, class, embarkation port, family size, etc."
},
{
"label": "Key drivers of survival",
"description": "Quantify how features (sex, class, age, fare) relate to survival (e.g., odds ratios / simple model-style summary)."
},
{
"label": "Distributions & outliers",
"description": "Age and fare distributions, missing-age patterns, extreme fares, etc."
},
{
"label": "Data quality / missingness",
"description": "Deeper look at nulls, duplicates, inconsistent values, and suggested cleaning steps."
}
],
"multi_select": true
}
{"answers": ["Distributions & outliers"]}
WITH bins AS (
SELECT
CASE
WHEN Age IS NULL THEN 'Missing'
WHEN Age < 10 THEN '0-9'
WHEN Age < 20 THEN '10-19'
WHEN Age < 30 THEN '20-29'
WHEN Age < 40 THEN '30-39'
WHEN Age < 50 THEN '40-49'
WHEN Age < 60 THEN '50-59'
WHEN Age < 70 THEN '60-69'
ELSE '70+'
END AS age_bin,
CASE
WHEN Age IS NULL THEN 99
WHEN Age < 10 THEN 0
WHEN Age < 20 THEN 10
WHEN Age < 30 THEN 20
WHEN Age < 40 THEN 30
WHEN Age < 50 THEN 40
WHEN Age < 60 THEN 50
WHEN Age < 70 THEN 60
ELSE 70
END AS sort_key
FROM titanic
)
SELECT age_bin, COUNT(*) AS passengers
FROM bins
GROUP BY age_bin, sort_key
ORDER BY sort_key, age_bin;

| age_bin | passengers |
|---|---|
| 0-9 | 62 |
| 10-19 | 102 |
| 20-29 | 220 |
| 30-39 | 167 |
| 40-49 | 89 |
| 50-59 | 48 |
| 60-69 | 19 |
| 70+ | 7 |
| Missing | 177 |
WITH fare_bins AS (
SELECT
CASE
WHEN Fare IS NULL THEN 'Missing'
WHEN Fare = 0 THEN '0'
WHEN Fare <= 7.91 THEN '(0, 7.91]'
WHEN Fare <= 14.45 THEN '(7.91, 14.45]'
WHEN Fare <= 31.00 THEN '(14.45, 31.00]'
ELSE '31.00+'
END AS fare_bin,
CASE
WHEN Fare IS NULL THEN 99
WHEN Fare = 0 THEN 0
WHEN Fare <= 7.91 THEN 1
WHEN Fare <= 14.45 THEN 2
WHEN Fare <= 31.00 THEN 3
ELSE 4
END AS sort_key
FROM titanic
)
SELECT fare_bin, COUNT(*) AS passengers
FROM fare_bins
GROUP BY fare_bin, sort_key
ORDER BY sort_key, fare_bin;

| fare_bin | passengers |
|---|---|
| 0 | 15 |
| (0, 7.91] | 208 |
| (7.91, 14.45] | 217 |
| (14.45, 31.00] | 229 |
| 31.00+ | 222 |
SELECT
MIN(Fare) AS min_fare,
APPROX_QUANTILE(Fare, 0.25) AS p25_fare,
APPROX_QUANTILE(Fare, 0.50) AS median_fare,
APPROX_QUANTILE(Fare, 0.75) AS p75_fare,
MAX(Fare) AS max_fare,
AVG(Fare) AS avg_fare
FROM titanic;

| min_fare | p25_fare | median_fare | p75_fare | max_fare | avg_fare |
|---|---|---|---|---|---|
| 0 | 7.910897727272727 | 14.445692261904762 | 30.9377125 | 512.3292 | 32.2042079685746 |
SELECT
MIN(Age) AS min_age,
APPROX_QUANTILE(Age, 0.25) AS p25_age,
APPROX_QUANTILE(Age, 0.50) AS median_age,
APPROX_QUANTILE(Age, 0.75) AS p75_age,
MAX(Age) AS max_age,
AVG(Age) AS avg_age,
COUNT(*) FILTER (WHERE Age IS NULL) AS missing_age,
COUNT(*) FILTER (WHERE Age IS NOT NULL) AS nonmissing_age
FROM titanic;

| min_age | p25_age | median_age | p75_age | max_age | avg_age | missing_age | nonmissing_age |
|---|---|---|---|---|---|---|---|
| 0.42 | 20.351851851851855 | 28.185950413223143 | 38.333333333333336 | 80 | 29.69911764705882 | 177 | 714 |
SELECT PassengerId, Name, Pclass, Sex, Age, Fare, Ticket, Cabin, Embarked
FROM titanic
ORDER BY Fare DESC
LIMIT 10;

| PassengerId | Name | Pclass | Sex | Age | Fare | Ticket | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|
| 259 | Ward, Miss. Anna | 1 | female | 35 | 512.3292 | PC 17755 | | C |
| 680 | Cardeza, Mr. Thomas Drake Martinez | 1 | male | 36 | 512.3292 | PC 17755 | B51 B53 B55 | C |
| 738 | Lesurer, Mr. Gustave J | 1 | male | 35 | 512.3292 | PC 17755 | B101 | C |
| 28 | Fortune, Mr. Charles Alexander | 1 | male | 19 | 263 | 19950 | C23 C25 C27 | S |
| 89 | Fortune, Miss. Mabel Helen | 1 | female | 23 | 263 | 19950 | C23 C25 C27 | S |
| 342 | Fortune, Miss. Alice Elizabeth | 1 | female | 24 | 263 | 19950 | C23 C25 C27 | S |
| 439 | Fortune, Mr. Mark | 1 | male | 64 | 263 | 19950 | C23 C25 C27 | S |
| 312 | Ryerson, Miss. Emily Borie | 1 | female | 18 | 262.375 | PC 17608 | B57 B59 B63 B66 | C |
| 743 | Ryerson, Miss. Susan Parker "Suzette" | 1 | female | 21 | 262.375 | PC 17608 | B57 B59 B63 B66 | C |
| 300 | Baxter, Mrs. James (Helene DeLaudeniere Chaput) | 1 | female | 50 | 247.5208 | PC 17558 | B58 B60 | C |
SELECT
SUM(CASE WHEN Fare IS NULL THEN 1 ELSE 0 END) AS missing_fare,
SUM(CASE WHEN Fare = 0 THEN 1 ELSE 0 END) AS zero_fare,
ROUND(100.0 * SUM(CASE WHEN Fare = 0 THEN 1 ELSE 0 END) / COUNT(*), 1) AS zero_fare_pct
FROM titanic;

| missing_fare | zero_fare | zero_fare_pct |
|---|---|---|
| 0 | 15 | 1.7 |
SELECT
Sex,
COUNT(*) AS passengers,
COUNT(*) FILTER (WHERE Age IS NULL) AS missing_age,
ROUND(100.0 * COUNT(*) FILTER (WHERE Age IS NULL) / COUNT(*), 1) AS missing_age_pct
FROM titanic
GROUP BY Sex
ORDER BY passengers DESC;

| Sex | passengers | missing_age | missing_age_pct |
|---|---|---|---|
| male | 577 | 124 | 21.5 |
| female | 314 | 53 | 16.9 |
SELECT
Pclass,
COUNT(*) AS passengers,
COUNT(*) FILTER (WHERE Age IS NULL) AS missing_age,
ROUND(100.0 * COUNT(*) FILTER (WHERE Age IS NULL) / COUNT(*), 1) AS missing_age_pct
FROM titanic
GROUP BY Pclass
ORDER BY Pclass;

| Pclass | passengers | missing_age | missing_age_pct |
|---|---|---|---|
| 1 | 216 | 30 | 13.9 |
| 2 | 184 | 11 | 6 |
| 3 | 491 | 136 | 27.7 |
SELECT
PassengerId,
Age,
(SibSp + Parch) AS family_count,
Pclass,
Sex,
Fare
FROM titanic
WHERE Age IS NOT NULL
ORDER BY Age ASC
LIMIT 10;

| PassengerId | Age | family_count | Pclass | Sex | Fare |
|---|---|---|---|---|---|
| 804 | 0.42 | 1 | 3 | male | 8.5167 |
| 756 | 0.67 | 2 | 2 | male | 14.5 |
| 470 | 0.75 | 3 | 3 | female | 19.2583 |
| 645 | 0.75 | 3 | 3 | female | 19.2583 |
| 79 | 0.83 | 2 | 2 | male | 29 |
| 832 | 0.83 | 2 | 2 | male | 18.75 |
| 306 | 0.92 | 3 | 1 | male | 151.55 |
| 184 | 1 | 3 | 2 | male | 39 |
| 382 | 1 | 2 | 3 | female | 15.7417 |
| 387 | 1 | 7 | 3 | male | 46.9 |
SELECT
PassengerId,
Age,
(SibSp + Parch) AS family_count,
Pclass,
Sex,
Fare
FROM titanic
WHERE Age IS NOT NULL
ORDER BY Age DESC
LIMIT 10;

| PassengerId | Age | family_count | Pclass | Sex | Fare |
|---|---|---|---|---|---|
| 631 | 80 | 0 | 1 | male | 30 |
| 852 | 74 | 0 | 3 | male | 7.775 |
| 97 | 71 | 0 | 1 | male | 34.6542 |
| 494 | 71 | 0 | 1 | male | 49.5042 |
| 117 | 70.5 | 0 | 3 | male | 7.75 |
| 673 | 70 | 0 | 2 | male | 10.5 |
| 746 | 70 | 2 | 1 | male | 71 |
| 34 | 66 | 0 | 2 | male | 10.5 |
| 281 | 65 | 0 | 3 | male | 7.75 |
| 457 | 65 | 0 | 1 | male | 26.55 |
The age histogram shows the dataset is concentrated in young adults (220 passengers aged 20-29 and 167 aged 30-39), with a sizable missing-age bucket (177 of 891 passengers, 19.9%).
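Summary stats (non-missing ages, n = 714): min 0.42, approximate 25th percentile 20.4, approximate median 28.2, approximate 75th percentile 38.3, max 80, mean 29.7.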
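Missing age is not evenly distributed: by sex, 21.5% of males (124 of 577) versus 16.9% of females (53 of 314) lack an age; by class, 13.9% of 1st class (30 of 216), 6.0% of 2nd class (11 of 184), and 27.7% of 3rd class (136 of 491).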
That means any age-based analysis can be biased toward the groups with better-recorded ages (notably 2nd class).
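
To make the bias concrete, the sketch below recomputes the per-class missing-age rates from the counts in the query results and compares each class's share of all passengers with its share of passengers whose age is known. The helper name `missing_pct` and the share comparison are illustrative additions, not part of the session's queries.

```python
# (passengers, missing_age) per Pclass, copied from the query results above.
counts = {1: (216, 30), 2: (184, 11), 3: (491, 136)}


def missing_pct(total: int, missing: int) -> float:
    """Percentage of rows with a missing value, rounded to one decimal."""
    return round(100.0 * missing / total, 1)


rates = {p: missing_pct(total, missing) for p, (total, missing) in counts.items()}

# Passengers with a recorded age, per class and overall.
known = {p: total - missing for p, (total, missing) in counts.items()}
known_total = sum(known.values())
all_total = sum(total for total, _ in counts.values())

# A class whose share among known ages differs from its share of all
# passengers is over- or under-represented in any age-based analysis.
for p in counts:
    share_all = round(100.0 * counts[p][0] / all_total, 1)
    share_known = round(100.0 * known[p] / known_total, 1)
    print(f"Pclass {p}: missing {rates[p]}%, "
          f"share of all {share_all}%, share of known ages {share_known}%")
```

If the two shares differ noticeably for a class, reweighting or imputing ages within class may be worth considering before drawing age-based conclusions.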
The fare chart (quartile-ish bins + a zero bin) shows fares are fairly spread across low-to-mid ranges, with a heavy right tail.
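
The binning behind that chart can be sketched in Python as a direct translation of the `CASE` expression in the fare query. The function name `fare_bin` is hypothetical; the cut points 7.91, 14.45, and 31.00 are the ones used in the query and sit close to the approximate quartiles it reported.

```python
from typing import Optional


def fare_bin(fare: Optional[float]) -> str:
    """Assign a fare to the same labelled bin as the SQL CASE expression."""
    if fare is None:
        return "Missing"
    if fare == 0:
        return "0"            # zero fares get their own bin
    if fare <= 7.91:
        return "(0, 7.91]"
    if fare <= 14.45:
        return "(7.91, 14.45]"
    if fare <= 31.00:
        return "(14.45, 31.00]"
    return "31.00+"           # strictly above 31.00


# Fares taken from rows shown earlier in the session, plus a missing value.
for value in (None, 0, 7.25, 8.05, 53.1, 512.3292):
    print(value, "->", fare_bin(value))
```

The order of the checks matters: each branch only needs an upper bound because the earlier branches have already ruled out smaller values.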
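Fare summary stats: min 0, approximate 25th percentile 7.91, approximate median 14.45, approximate 75th percentile 30.94, max 512.33, mean 32.20; no fares are missing.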
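Outlier notes: 15 passengers (1.7%) have a fare of 0; the three highest fares are all 512.3292 on ticket PC 17755, followed by four Fortune family members at 263 on ticket 19950; ages range from 0.42 to 80, and the ten oldest passengers are all male.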
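
One common way to put a number on "extreme" fares is Tukey's 1.5 × IQR rule. The session did not apply this rule; the sketch below is an illustration that takes the approximate quartiles from the fare summary query as given and checks a few of the highest distinct fares against the resulting fences.

```python
# Approximate quartiles reported by the fare summary query.
p25, p75 = 7.910897727272727, 30.9377125

# Tukey fences: values beyond 1.5 * IQR from the quartiles are flagged.
# This rule is an assumption for illustration, not the session's method.
iqr = p75 - p25
lower_fence = p25 - 1.5 * iqr
upper_fence = p75 + 1.5 * iqr

# Highest distinct fares from the top-10 listing.
top_fares = [512.3292, 263.0, 262.375, 247.5208]
flagged = [fare for fare in top_fares if fare > upper_fence]

print(f"IQR = {iqr:.2f}, fences = ({lower_fence:.2f}, {upper_fence:.2f})")
print("flagged:", flagged)
```

Because the lower fence is negative and fares cannot be, this rule can only flag high fares; zero fares would need a separate check.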