DigiKat Map 3: Emotional Structure

Mapping Sentiment and Controversy in Croatian Catholic Digital Media

Author

DigiKat Project

Published

January 27, 2026

1 Introduction

This document presents Map 3 of the DigiKat project, analyzing the emotional structure of Croatian Catholic digital media. The core question driving this analysis is What emotional register characterizes Croatian Catholic digital communication, and where does controversy emerge?

1.1 Methodological Overview

Sentiment analysis classifies text into positive, neutral, or negative categories based on the emotional tone of the language used. This automated classification (AUTO_SENTIMENT in the data) provides a high level view of emotional valence across the corpus.

Facebook reactions offer a more granular view of audience emotional response. Unlike simple likes, reactions (LOVE, WOW, HAHA, SAD, ANGRY) capture distinct emotional states, enabling us to construct emotional fingerprints for different actors and content types.

Key metrics in this analysis:

Metric Definition Interpretation
Sentiment ratio Positive posts / Negative posts Higher values indicate more positive tone
LOVE share LOVE reactions / Total emotional reactions Measures affection/appreciation response
ANGRY share ANGRY reactions / Total emotional reactions Measures anger/outrage response
Controversy Index (ANGRY / Interactions) x log(Interactions) Combines anger intensity with reach

Interpretation guidance:

  • High LOVE share with low ANGRY share indicates devotional, well received content
  • High ANGRY share may indicate controversial topics or provocative framing
  • Controversy Index accounts for both intensity (ANGRY ratio) and visibility (interaction volume)
  • Differences between automated sentiment and audience reactions reveal gaps between intent and reception
Show code
dta <- readRDS("C:/Users/lsikic/Luka C/HKS/Projekti/Digitalni Kat/SHKM/DigiKat/data/merged_comprehensive.rds") %>%
  filter(SOURCE_TYPE != "tiktok", !is.na(SOURCE_TYPE)) %>%
  filter(DATE >= as.Date("2021-01-01") & DATE <= as.Date("2025-12-31")) %>%
  filter(year >= 2021 & year <= 2025)

setDT(dta)

# Check data loaded
cat("Data loaded:", nrow(dta), "rows\n")
Data loaded: 608879 rows
Show code
if (nrow(dta) == 0) stop("No data after filtering! Check your date filters.")

n_posts <- nrow(dta)
n_sources <- uniqueN(dta$FROM)
date_range <- paste(min(dta$DATE), "to", max(dta$DATE))
Show code
# =============================================================================
# ACTOR CLASSIFICATION V4 - SYNCHRONIZED WITH MAP 1
# =============================================================================
# Uses hierarchical priority: manual overrides -> secular exclusions -> 
# domain matching -> pattern matching -> lay influencer detection -> default
# =============================================================================

# MANUAL OVERRIDES: Highest priority explicit classifications
manual_overrides <- list(
  "Institutional Official" = c(
    "hrvatska katolička mreža", "hrvatska katolicka mreza",
    "informativna katolička agencija", "informativna katolicka agencija",
    "hrvatski katolički radio", "hrvatski katolicki radio",
    "hrvatska biskupska konferencija", "ika", "hkr", "hkm", "hbk",
    "tiskovni ured hbk", "radio marija"
  ),
  "Independent Media" = c(
    "laudato tv", "laudatotv", "laudato.tv", "laudato.hr", "laudato",
    "bitno.net", "bitno net", "glas koncila", "glaskoncila",
    "nova eva", "nova-eva", "verbum", "totus tuus", "totus-tuus",
    "katolički tjednik", "katolicki tjednik", "kršćanska sadašnjost",
    "krscanska sadasnjost", "mir i dobro", "svjetlo riječi",
    "novizivot.net", "novi zivot", "novi život"
  ),
  "Charismatic Communities" = c(
    "božja pobjeda", "bozja pobjeda", "bozjapobjeda",
    "muževni budite", "muzevni budite", "muzevnibudite",
    "srce isusovo", "srceisuovo", "cenacolo", "comunità cenacolo",
    "duhovna obnova", "molitvena snaga",
    "dom molitve slavonski brod", "dom molitve",
    "molitvena zajednica sv. josipa"
  ),
  "Lay Influencers" = c(
    "katolička obitelj", "katolicka obitelj",
    "marija majka isusova", "božanske molitve", "bozanske molitve",
    "moćne molitve tv", "mocne molitve tv", "moćne molitve", "mocne molitve",
    "katoličke molitve", "katolicke molitve",
    "pulherissimus", "pod smokvom",
    "hrana za dušu", "hrana za dusu",
    "добровољци", "miletić marin", "miletic marin",
    "dijete vjere", "vjera",
    "kapljice ljubavi božje", "kapljice ljubavi bozje",
    "kršćanstvo", "krscanstvo",
    "jutarnja molitva duhu svetom", "blago molitve",
    "biblija krunice molitve",
    "molitve bogu",
    "vojnik sreće", "vojnik srece", "duhovne poruke i inspiracija",
    "kes duhovni kutak", "duhovni kutak",
    "molitve.hr", "duhovniportal.com", "duhovniportal"
  ),
  "Diocesan" = c(
    "zagrebačka nadbiskupija", "zagrebacka nadbiskupija",
    "sisačka biskupija", "sisacka biskupija",
    "župa šurkovac", "zupa surkovac", "sveta mati slobode",
    "župa sv. ilije proroka metković", "zupa sv. ilije proroka metkovic",
    "župa uznesenja bdm", "zupa uznesenja bdm", "župa uznesenja bdm - stenjevec",
    "šibenska biskupija", "sibenska biskupija",
    "požeška biskupija", "pozeskа biskupija",
    "dubrovacka biskupija", "dubrovačka biskupija",
    "dubrovacka-biskupija.hr",
    "zupa tramosnica", "župa tramošnica",
    "župa sv. vida", "zupa sv. vida", "župa sv. vida - petruševec"
  ),
  "Youth Organizations" = c(
    "susret hrvatske katoličke mladeži", "susret hrvatske katolicke mladezi",
    "shkm požega", "shkm pozega"
  ),
  "Academic" = c(
    "hrvatsko katoličko sveučilište", "hrvatsko katolicko sveuciliste",
    "universitas studiorum catholica croatica"
  )
)

# SECULAR MEDIA EXCLUSIONS: Always classify as Other
secular_exclusions <- c(
  # National portals (with and without .hr)
  "slobodnadalmacija", "slobodnadalmacija.hr", 
  "vecernji", "vecernji.hr", 
  "index.hr", "index",
  "jutarnji", "jutarnji.hr", 
  "novilist", "novilist.hr",
  "24sata", "24sata.hr",
  "direktno", "direktno.hr",
  "nacional", "nacional.hr", 
  "tportal", "tportal.hr",
  "dnevnik.hr", "dnevnik", 
  "hrt.hr", "hrt", 
  "n1info", "n1info.hr", "n1",
  "rtl.hr", "rtl", 
  "net.hr", 
  "telegram.hr", "telegram", 
  "story.hr", "express.hr", "express", "advance.hr",
  # Regional portals
  "glasistre", "glasistre.hr",
  "dnevno.hr", "dnevno", 
  "prigorski", "glas-slavonije", "glas slavonije",
  "croativ", "oluja.info", 
  "maxportal", "maxportal.hr",
  "hkv.hr", "icv.hr", "novosti.hr", "7dnevno", "mnovine", 
  "sjever.hr", "dulist.hr", "pozega.eu", "sibenik.in",
  "ferata.hr", "epodravina", "glasgacke", "radio-zlatar", 
  "medjimurski.hr", "sbperiskop", "zagorje-international",
  "pozeski", "novine.hr", "dubrovnikinsider", "regionalni", 
  "leportale", "varazdinske-vijesti", "radionasice", 
  "brodportal", "ljportal", "dubrovnikportal", "01portal",
  "tomislavnews", "hia.com.hr", "portalnovosti", "antenazadar",
  "dalmacijanews", "zadarskilist", "medjimurjepress", 
  "zagreb.info", "034portal", "057info", "cityportal",
  "klikaj.hr", "lika-online", "ploce.com", "sbonline", 
  "narod.hr", "infokiosk", "hrsvijet", "tomislavcity", 
  "vrisak.info", "dalmacijadanas", "dalmacijadanas.hr",
  "morski.hr", "zagreb.hr", "osijek031", "rijeka.hr", "zadar.hr",
  "zupanjac.net", "zupanjac", 
  "dalmatinskiportal.hr", "dalmatinskiportal",
  "campaign-archive.com",
  # Forums and aggregators
  "forum.hr", "reddit", "anonymous_user", "komentari", "bug.hr",
  # Entertainment and other
  "inmemoriam", "magicus.info", "book.hr", "mojzagreb.info", 
  "skole.hr", "tvprofil", "priznajem.hr", "dragovoljac.com", 
  "croatia", "wikipedia",
  "facebook.com", "youtube.com", "instagram.com", "twitter.com",
  # Government and administrative
  "županija", "zupanija", "grad ", "opcina", "općina",
  # Non-Catholic religious groups
  "kršćanska proročka crkva", "krscanska prorocka crkva",
  "crkva svemogućeg boga", "crkva svemoguceg boga",
  "jehovini svjedoci", "adventisti", "baptisti", "pentekostalna",
  # Political parties
  "domovinski pokret", "hdz", "sdp", "most", "možemo", "mozemo"
)

classify_actor_v4 <- function(from_value, url_value = NA, platform_value = NA) {
  from_lower <- tolower(from_value)
  url_lower <- tolower(ifelse(is.na(url_value), "", url_value))
  platform_lower <- tolower(ifelse(is.na(platform_value), "", platform_value))
  combined <- paste(from_lower, url_lower)
  
  # Helper function for matching

  match_any <- function(patterns, text) {
    any(sapply(patterns, function(p) grepl(p, text, fixed = TRUE)))
  }
  
  # PRIORITY 1: Manual overrides (highest priority)
  for (actor_type in names(manual_overrides)) {
    if (match_any(manual_overrides[[actor_type]], from_lower)) {
      return(actor_type)
    }
  }
  
  # PRIORITY 2: Secular exclusions
  if (match_any(secular_exclusions, combined)) {
    return("Other")
  }
  
  # PRIORITY 3: Domain-based classification
  institutional_domains <- c("hkm.hr", "ika.hkm.hr", "hkr.hkm.hr", "hbk.hr")
  if (any(sapply(institutional_domains, function(x) grepl(x, url_lower, fixed = TRUE)))) {
    return("Institutional Official")
  }
  
  diocesan_domains <- c(
    "zg-nadbiskupija.hr", "biskupija-varazdinska.hr", "djos.hr", 
    "biskupija-sj.hr", "rzs.hr", "rkc-sisak.hr", "zadarskanadbiskupija.hr",
    "gospicko-senjska-biskupija.hr", "nadbiskupija-split.com",
    "dubrovacka-biskupija.hr", "porec-biskupija.hr", "biskupija-kk.hr"
  )
  if (any(sapply(diocesan_domains, function(x) grepl(x, url_lower, fixed = TRUE)))) {
    return("Diocesan")
  }
  
  independent_media_domains <- c(
    "laudato.hr", "laudato.tv", "bitno.net", "glaskoncila.hr", 
    "nova-eva.com", "verbum.hr", "ks.hr", "novizivot.net"
  )
  if (any(sapply(independent_media_domains, function(x) grepl(x, url_lower, fixed = TRUE)))) {
    return("Independent Media")
  }
  
  # PRIORITY 4: Pattern-based classification
  
  # Check for parish patterns (covers missing diacritics)
  is_parish <- grepl("^župa|^zupa|župi|zupi|- župa|- zupa", from_lower, ignore.case = TRUE) ||
               grepl("parish", from_lower, ignore.case = TRUE)
  
  diocesan_names <- c("nadbiskupija", "biskupija")
  if (is_parish || any(sapply(diocesan_names, function(x) grepl(x, from_lower, fixed = TRUE)))) {
    return("Diocesan")
  }
  
  # Religious orders
  religious_orders <- c(
    "franjevci", "franjevački", "franjevacki", "ofm",
    "isusovci", "družba isusova", "druzba isusova",
    "dominikanci", "dominikanski",
    "salezijanci", "salezijanski", "sdb",
    "karmelićani", "karmelicani", "karmel",
    "benediktinci", "benediktinski", "osb",
    "kapucini", "kapucinski", "ofmcap",
    "pavlini", "pavlinski", "trapisti", "cisterciti",
    "sestre milosrdnice", "uršulinke", "klarise",
    "službenice milosrđa", "kćeri božje ljubavi", "kceri bozje ljubavi"
  )
  if (any(sapply(religious_orders, function(x) grepl(x, from_lower, fixed = TRUE)))) {
    return("Religious Orders")
  }
  
  # Charismatic communities
  charismatic <- c(
    "molitvena zajednica", "karizmatska", "cenacolo",
    "emmanuel community", "emmanuel zajednica", "taize",
    "neokatekumenski", "neokatekumenska", "kursiljo",
    "fokolari", "fokolarini", "komunija i oslobođenje",
    "dom molitve", "kuća molitve", "kuca molitve"
  )
  if (any(sapply(charismatic, function(x) grepl(x, from_lower, fixed = TRUE)))) {
    return("Charismatic Communities")
  }
  
  # Individual priests (check prefix at start of name)
  priest_prefixes <- c("fra ", "don ", "vlč.", "vlc.", "msgr.", "mons.",
                       "o. ", "pater ", "preč.", "prec.")
  for (prefix in priest_prefixes) {
    if (startsWith(from_lower, prefix)) return("Individual Priests")
  }
  if (grepl("svećenik|svecenik|župnik|zupnik", from_lower)) {
    return("Individual Priests")
  }
  
  # Youth organizations
  youth <- c(
    "frama", "shkm", "katolička mladež", "katolicka mladez",
    "ministranti", "mladifra", "mladi fra", "kaem",
    "sveučilišna kapelanija", "studentska kapelanija"
  )
  if (any(sapply(youth, function(x) grepl(x, from_lower, fixed = TRUE)))) {
    return("Youth Organizations")
  }
  
  # Academic
  academic <- c(
    "unicath", "katolički bogoslovni fakultet", "kbf",
    "teologija", "filozofski fakultet družbe isusove", "hks.hr",
    "hrvatsko katoličko sveučilište", "hku"
  )
  if (any(sapply(academic, function(x) grepl(x, from_lower, fixed = TRUE)))) {
    return("Academic")
  }
  
  # PRIORITY 5: Lay influencer detection (platform-aware)
  lay_keywords <- c(
    "vjera", "molitva", "molitve", "isus", "krist", "gospa", "marija",
    "duhovnost", "duhovna", "duhovni", "biblija", "psalm", "blagoslov",
    "krunica", "rozarij", "katolička obitelj", "evanđelje", "evandelje"
  )
  
  lay_exclude <- c(".hr", ".net", ".com", "portal", "vijesti", "news", "radio")
  
  is_likely_social <- platform_lower %in% c("facebook", "instagram", "youtube", "twitter") ||
                      !grepl("\\.[a-z]{2,4}$", from_lower)
  
  has_devotional <- any(sapply(lay_keywords, function(x) grepl(x, from_lower, fixed = TRUE)))
  has_media_marker <- any(sapply(lay_exclude, function(x) grepl(x, from_lower, fixed = TRUE)))
  
  if (is_likely_social && has_devotional && !has_media_marker) {
    return("Lay Influencers")
  }
  
  return("Other")
}

# Apply classification
dta[, ACTOR_TYPE := mapply(classify_actor_v4, FROM, URL, SOURCE_TYPE)]

# Verify classification
cat("ACTOR_TYPE class:", class(dta$ACTOR_TYPE), "\n")
ACTOR_TYPE class: character 
Show code
cat("ACTOR_TYPE distribution:\n")
ACTOR_TYPE distribution:
Show code
print(table(dta$ACTOR_TYPE))

               Academic Charismatic Communities                Diocesan 
                    671                    1282                   13841 
      Independent Media  Institutional Official         Lay Influencers 
                  31401                   62744                   28451 
                  Other        Religious Orders     Youth Organizations 
                 468760                    1677                      52 

1.2 Corpus Overview

Show code
tibble(
  Metric = c("Total posts", "Unique sources", "Date range", "Platforms",
             "Posts with sentiment", "Posts with Facebook reactions"),
  Value = c(
    format(n_posts, big.mark = ","),
    format(n_sources, big.mark = ","),
    date_range,
    paste(unique(dta$SOURCE_TYPE), collapse = ", "),
    format(sum(!is.na(dta$AUTO_SENTIMENT)), big.mark = ","),
    format(sum(dta$SOURCE_TYPE == "facebook" & !is.na(dta$LOVE_COUNT)), big.mark = ",")
  )
) %>%
  kable(col.names = c("Metric", "Value")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Metric Value
Total posts 608,879
Unique sources 16,426
Date range 2021-01-01 to 2025-12-31
Platforms web, facebook, twitter, comment, youtube, forum, reddit, instagram
Posts with sentiment 608,879
Posts with Facebook reactions 68,771

2 Analysis 3.1: Sentiment Distribution

Sentiment analysis classifies each post as positive, neutral, or negative based on linguistic features. This section examines how sentiment varies across the corpus, platforms, and actor types.

Reading the charts:

  • Overall distribution shows the baseline emotional tone of the corpus
  • Platform comparison reveals how different media afford different emotional registers
  • Actor comparison identifies which communicators tend toward more positive or negative framing
  • Positivity ratio (positive/negative) summarizes relative emotional valence
Show code
sentiment_dist <- dta[!is.na(AUTO_SENTIMENT), .(
  Count = .N
), by = AUTO_SENTIMENT][order(-Count)]

sentiment_dist[, Percentage := Count / sum(Count) * 100]

ggplot(sentiment_dist, aes(x = reorder(AUTO_SENTIMENT, -Count), y = Count, fill = AUTO_SENTIMENT)) +
  geom_col(width = 0.7) +
  geom_text(aes(label = sprintf("%s\n(%.1f%%)", format(Count, big.mark = ","), Percentage)), 
            vjust = -0.3, size = 4) +
  scale_fill_manual(values = sentiment_colors) +
  scale_y_continuous(labels = comma, expand = expansion(mult = c(0, 0.15))) +
  labs(
    title = "Overall Sentiment Distribution",
    subtitle = "Distribution of automated sentiment classification across corpus",
    x = NULL,
    y = "Number of Posts"
  ) +
  theme(legend.position = "none")

2.1 Sentiment by Platform

Different platforms may encourage different emotional registers due to their affordances, audience expectations, and content formats.

Show code
sentiment_platform <- dta[!is.na(AUTO_SENTIMENT), .(
  Count = .N
), by = .(SOURCE_TYPE, AUTO_SENTIMENT)]

sentiment_platform[, Total := sum(Count), by = SOURCE_TYPE]
sentiment_platform[, Percentage := Count / Total * 100]

ggplot(sentiment_platform, aes(x = reorder(SOURCE_TYPE, Total), y = Percentage, fill = AUTO_SENTIMENT)) +
  geom_col(position = "stack", width = 0.7) +
  geom_text(aes(label = sprintf("%.1f%%", Percentage)), 
            position = position_stack(vjust = 0.5), 
            size = 3, color = "white") +
  coord_flip() +
  scale_fill_manual(values = sentiment_colors) +
  labs(
    title = "Sentiment Distribution by Platform",
    subtitle = "Share of positive, neutral, and negative content per platform",
    x = NULL,
    y = "Percentage of Posts",
    fill = "Sentiment"
  ) +
  theme(legend.position = "top")

2.2 Sentiment by Actor Type

Actor types may have characteristic emotional registers reflecting their institutional roles, communication strategies, and target audiences.

Show code
sentiment_actor <- dta[!is.na(AUTO_SENTIMENT), .(
  Count = .N
), by = .(ACTOR_TYPE, AUTO_SENTIMENT)]

sentiment_actor[, Total := sum(Count), by = ACTOR_TYPE]
sentiment_actor[, Percentage := Count / Total * 100]

actor_order <- sentiment_actor[AUTO_SENTIMENT == "positive"][order(-Percentage)]$ACTOR_TYPE

ggplot(sentiment_actor, aes(x = factor(ACTOR_TYPE, levels = rev(actor_order)), 
                            y = Percentage, fill = AUTO_SENTIMENT)) +
  geom_col(position = "stack", width = 0.7) +
  coord_flip() +
  scale_fill_manual(values = sentiment_colors) +
  labs(
    title = "Sentiment Distribution by Actor Type",
    subtitle = "Ordered by share of positive content",
    x = NULL,
    y = "Percentage of Posts",
    fill = "Sentiment"
  ) +
  theme(legend.position = "top")

2.3 Sentiment Summary Table

The Positivity Ratio divides positive posts by negative posts. Values above 1 indicate more positive than negative content; higher values indicate a more positive overall tone.

Show code
sentiment_summary <- dta[!is.na(AUTO_SENTIMENT), .(
  Total_Posts = .N,
  Positive = sum(AUTO_SENTIMENT == "positive"),
  Neutral = sum(AUTO_SENTIMENT == "neutral"),
  Negative = sum(AUTO_SENTIMENT == "negative")
), by = ACTOR_TYPE]

sentiment_summary[, `:=`(
  Positive_Pct = Positive / Total_Posts * 100,
  Neutral_Pct = Neutral / Total_Posts * 100,
  Negative_Pct = Negative / Total_Posts * 100,
  Positivity_Ratio = Positive / (Negative + 0.1)
)]

sentiment_summary[order(-Positive_Pct)] %>%
  mutate(
    Total_Posts = format(Total_Posts, big.mark = ","),
    Positive_Pct = sprintf("%.1f%%", Positive_Pct),
    Neutral_Pct = sprintf("%.1f%%", Neutral_Pct),
    Negative_Pct = sprintf("%.1f%%", Negative_Pct),
    Positivity_Ratio = sprintf("%.2f", Positivity_Ratio)
  ) %>%
  select(ACTOR_TYPE, Total_Posts, Positive_Pct, Neutral_Pct, Negative_Pct, Positivity_Ratio) %>%
  kable(col.names = c("Actor Type", "Total Posts", "Positive %", "Neutral %", 
                      "Negative %", "Positivity Ratio")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Actor Type Total Posts Positive % Neutral % Negative % Positivity Ratio
Youth Organizations 52 94.2% 5.8% 0.0% 490.00
Lay Influencers 28,451 64.2% 18.7% 17.1% 3.75
Academic 671 54.8% 27.6% 17.6% 3.12
Diocesan 13,841 53.9% 29.7% 16.4% 3.28
Religious Orders 1,677 52.4% 30.9% 16.7% 3.13
Independent Media 31,401 51.4% 26.0% 22.6% 2.28
Institutional Official 62,744 49.0% 27.8% 23.2% 2.11
Other 468,760 45.5% 23.4% 31.0% 1.47
Charismatic Communities 1,282 36.5% 35.3% 28.2% 1.30

2.4 Sentiment and Engagement Relationship

Does emotional tone affect audience engagement? This analysis examines whether positive, neutral, or negative content generates different levels of interaction.

Show code
sentiment_engagement <- dta[!is.na(AUTO_SENTIMENT) & !is.na(INTERACTIONS), .(
  Mean_Interactions = mean(INTERACTIONS, na.rm = TRUE),
  Median_Interactions = median(INTERACTIONS, na.rm = TRUE),
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Count = .N
), by = AUTO_SENTIMENT]

p1 <- ggplot(sentiment_engagement, aes(x = AUTO_SENTIMENT, y = Mean_Interactions, fill = AUTO_SENTIMENT)) +
  geom_col(width = 0.6) +
  geom_text(aes(label = sprintf("%.1f", Mean_Interactions)), vjust = -0.3) +
  scale_fill_manual(values = sentiment_colors) +
  labs(
    title = "Mean Engagement by Sentiment",
    x = NULL,
    y = "Mean Interactions"
  ) +
  theme(legend.position = "none")

p2 <- ggplot(dta[!is.na(AUTO_SENTIMENT) & INTERACTIONS > 0 & INTERACTIONS < quantile(INTERACTIONS, 0.99, na.rm = TRUE)], 
             aes(x = AUTO_SENTIMENT, y = INTERACTIONS, fill = AUTO_SENTIMENT)) +
  geom_violin(alpha = 0.7) +
  geom_boxplot(width = 0.2, fill = "white", outlier.shape = NA) +
  scale_fill_manual(values = sentiment_colors) +
  scale_y_log10(labels = comma) +
  labs(
    title = "Engagement Distribution by Sentiment",
    subtitle = "Log scale, top 1% outliers excluded",
    x = NULL,
    y = "Interactions (log scale)"
  ) +
  theme(legend.position = "none")

p1 + p2

3 Analysis 3.2: Emotional Fingerprinting

Facebook reactions provide granular emotional data beyond simple sentiment. Each reaction type captures a distinct emotional response:

Reaction Emotional Meaning Content Association
LOVE Deep affection, strong appreciation Devotional, inspirational
WOW Surprise, amazement News, revelations
HAHA Amusement, humor Light content, satire
SAD Sorrow, empathy Death notices, suffering
ANGRY Outrage, disagreement Controversy, injustice

By analyzing the mix of reactions each actor type receives, we can construct emotional fingerprints that characterize their audience relationships.

Show code
fb_data <- dta[SOURCE_TYPE == "facebook" & 
               !is.na(LOVE_COUNT) & 
               !is.na(INTERACTIONS) & 
               INTERACTIONS > 0]

cat("Facebook posts with reaction data:", nrow(fb_data), "\n")
Facebook posts with reaction data: 63843 
Show code
if (nrow(fb_data) > 0) {
  fb_data[, `:=`(
    LOVE_RATIO = LOVE_COUNT / INTERACTIONS,
    WOW_RATIO = fifelse(is.na(WOW_COUNT), 0, WOW_COUNT) / INTERACTIONS,
    HAHA_RATIO = fifelse(is.na(HAHA_COUNT), 0, HAHA_COUNT) / INTERACTIONS,
    SAD_RATIO = fifelse(is.na(SAD_COUNT), 0, SAD_COUNT) / INTERACTIONS,
    ANGRY_RATIO = fifelse(is.na(ANGRY_COUNT), 0, ANGRY_COUNT) / INTERACTIONS,
    LIKE_RATIO = LIKE_COUNT / INTERACTIONS
  )]
  
  fb_data[, Total_Reactions := fifelse(is.na(LOVE_COUNT), 0, LOVE_COUNT) + 
                               fifelse(is.na(WOW_COUNT), 0, WOW_COUNT) + 
                               fifelse(is.na(HAHA_COUNT), 0, HAHA_COUNT) + 
                               fifelse(is.na(SAD_COUNT), 0, SAD_COUNT) + 
                               fifelse(is.na(ANGRY_COUNT), 0, ANGRY_COUNT)]
  
  fb_data[Total_Reactions > 0, `:=`(
    LOVE_SHARE = LOVE_COUNT / Total_Reactions,
    WOW_SHARE = fifelse(is.na(WOW_COUNT), 0, WOW_COUNT) / Total_Reactions,
    HAHA_SHARE = fifelse(is.na(HAHA_COUNT), 0, HAHA_COUNT) / Total_Reactions,
    SAD_SHARE = fifelse(is.na(SAD_COUNT), 0, SAD_COUNT) / Total_Reactions,
    ANGRY_SHARE = fifelse(is.na(ANGRY_COUNT), 0, ANGRY_COUNT) / Total_Reactions
  )]
}

3.1 Overall Emotional Profile

The overall distribution of emotional reactions reveals the dominant emotional register of Croatian Catholic Facebook content.

Show code
if (nrow(fb_data) > 0) {
  emotion_totals <- data.table(
    Emotion = c("LOVE", "WOW", "HAHA", "SAD", "ANGRY"),
    Total = c(
      sum(fb_data$LOVE_COUNT, na.rm = TRUE),
      sum(fb_data$WOW_COUNT, na.rm = TRUE),
      sum(fb_data$HAHA_COUNT, na.rm = TRUE),
      sum(fb_data$SAD_COUNT, na.rm = TRUE),
      sum(fb_data$ANGRY_COUNT, na.rm = TRUE)
    )
  )
  
  emotion_totals[, Percentage := Total / sum(Total) * 100]
  
  ggplot(emotion_totals, aes(x = reorder(Emotion, -Total), y = Total, fill = Emotion)) +
    geom_col(width = 0.7) +
    geom_text(aes(label = sprintf("%s\n(%.1f%%)", format(Total, big.mark = ","), Percentage)), 
              vjust = -0.2, size = 3.5) +
    scale_fill_manual(values = emotion_colors) +
    scale_y_continuous(labels = comma, expand = expansion(mult = c(0, 0.15))) +
    labs(
      title = "Overall Emotional Reaction Distribution",
      subtitle = "Total Facebook reactions by type across all Catholic content",
      x = NULL,
      y = "Total Reactions"
    ) +
    theme(legend.position = "none")
} else {
  cat("No Facebook reaction data available.\n")
}

3.2 Emotional Fingerprints by Actor Type

Each actor type has a characteristic mix of emotional reactions. These fingerprints reveal how audiences emotionally engage with different types of Catholic communicators.

Show code
if (nrow(fb_data) > 0 && sum(fb_data$Total_Reactions > 0, na.rm = TRUE) > 0) {
  actor_emotions <- fb_data[Total_Reactions > 0, .(
    LOVE = mean(LOVE_SHARE, na.rm = TRUE) * 100,
    WOW = mean(WOW_SHARE, na.rm = TRUE) * 100,
    HAHA = mean(HAHA_SHARE, na.rm = TRUE) * 100,
    SAD = mean(SAD_SHARE, na.rm = TRUE) * 100,
    ANGRY = mean(ANGRY_SHARE, na.rm = TRUE) * 100,
    Posts = .N
  ), by = ACTOR_TYPE]
  
  actor_emotions_long <- actor_emotions %>%
    select(-Posts) %>%
    pivot_longer(cols = -ACTOR_TYPE, names_to = "Emotion", values_to = "Share")
  
  ggplot(actor_emotions_long, aes(x = Emotion, y = Share, fill = Emotion)) +
    geom_col(width = 0.7) +
    facet_wrap(~ACTOR_TYPE, ncol = 3, scales = "free_y") +
    scale_fill_manual(values = emotion_colors) +
    labs(
      title = "Emotional Fingerprints by Actor Type",
      subtitle = "Mean share of each reaction type among emotional reactions",
      x = NULL,
      y = "Share of Emotional Reactions (%)"
    ) +
    theme(
      legend.position = "none",
      axis.text.x = element_text(angle = 45, hjust = 1)
    )
} else {
  cat("Insufficient Facebook reaction data for actor analysis.\n")
}

3.3 Comparative Emotional Profiles

The heatmap provides a comparative view of emotional profiles across all actor types, making it easy to identify which actors receive similar or different emotional responses.

Show code
if (exists("actor_emotions") && nrow(actor_emotions) > 0) {
  actor_emotions_matrix <- actor_emotions %>%
    select(ACTOR_TYPE, LOVE, WOW, HAHA, SAD, ANGRY) %>%
    pivot_longer(cols = -ACTOR_TYPE, names_to = "Emotion", values_to = "Share")
  
  ggplot(actor_emotions_matrix, aes(x = Emotion, y = ACTOR_TYPE, fill = Share)) +
    geom_tile(color = "white", linewidth = 0.5) +
    geom_text(aes(label = sprintf("%.1f%%", Share)), size = 3) +
    scale_fill_viridis(option = "plasma", direction = -1) +
    labs(
      title = "Emotional Profile Heatmap by Actor Type",
      subtitle = "Mean share of each reaction type",
      x = NULL,
      y = NULL,
      fill = "Share %"
    ) +
    theme(
      axis.text.x = element_text(angle = 0, hjust = 0.5),
      panel.grid = element_blank()
    )
}

3.4 LOVE Ratio Analysis

LOVE reactions indicate deep appreciation and emotional connection. High LOVE share suggests content that resonates strongly with audience values and emotions.

Show code
if (nrow(fb_data) > 0 && sum(fb_data$Total_Reactions > 0, na.rm = TRUE) > 0) {
  love_by_actor <- fb_data[Total_Reactions > 0, .(
    Mean_Love_Share = mean(LOVE_SHARE, na.rm = TRUE) * 100,
    Median_Love_Share = median(LOVE_SHARE, na.rm = TRUE) * 100,
    SD_Love_Share = sd(LOVE_SHARE, na.rm = TRUE) * 100,
    Posts = .N
  ), by = ACTOR_TYPE][order(-Mean_Love_Share)]
  
  ggplot(love_by_actor, aes(x = reorder(ACTOR_TYPE, Mean_Love_Share), 
                            y = Mean_Love_Share, fill = ACTOR_TYPE)) +
    geom_col(width = 0.7) +
    geom_errorbar(aes(ymin = Mean_Love_Share - SD_Love_Share/sqrt(Posts),
                      ymax = Mean_Love_Share + SD_Love_Share/sqrt(Posts)),
                  width = 0.2) +
    geom_text(aes(label = sprintf("%.1f%%", Mean_Love_Share)), hjust = -0.1, size = 3.5) +
    coord_flip() +
    scale_fill_manual(values = actor_colors) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.15))) +
    labs(
      title = "LOVE Reaction Share by Actor Type",
      subtitle = "Mean percentage of LOVE among all emotional reactions (with standard error)",
      x = NULL,
      y = "LOVE Share (%)"
    ) +
    theme(legend.position = "none")
}

3.5 ANGRY Ratio Analysis

ANGRY reactions signal disagreement, outrage, or frustration. High ANGRY share may indicate controversial content, provocative framing, or topics that elicit strong negative responses.

Show code
if (nrow(fb_data) > 0 && sum(fb_data$Total_Reactions > 0, na.rm = TRUE) > 0) {
  angry_by_actor <- fb_data[Total_Reactions > 0, .(
    Mean_Angry_Share = mean(ANGRY_SHARE, na.rm = TRUE) * 100,
    Median_Angry_Share = median(ANGRY_SHARE, na.rm = TRUE) * 100,
    SD_Angry_Share = sd(ANGRY_SHARE, na.rm = TRUE) * 100,
    Posts = .N
  ), by = ACTOR_TYPE][order(-Mean_Angry_Share)]
  
  ggplot(angry_by_actor, aes(x = reorder(ACTOR_TYPE, Mean_Angry_Share), 
                             y = Mean_Angry_Share, fill = ACTOR_TYPE)) +
    geom_col(width = 0.7) +
    geom_errorbar(aes(ymin = pmax(0, Mean_Angry_Share - SD_Angry_Share/sqrt(Posts)),
                      ymax = Mean_Angry_Share + SD_Angry_Share/sqrt(Posts)),
                  width = 0.2) +
    geom_text(aes(label = sprintf("%.2f%%", Mean_Angry_Share)), hjust = -0.1, size = 3.5) +
    coord_flip() +
    scale_fill_manual(values = actor_colors) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.2))) +
    labs(
      title = "ANGRY Reaction Share by Actor Type",
      subtitle = "Mean percentage of ANGRY among all emotional reactions (with standard error)",
      x = NULL,
      y = "ANGRY Share (%)"
    ) +
    theme(legend.position = "none")
}

3.6 Emotional Profile Table

Show code
if (exists("actor_emotions") && nrow(actor_emotions) > 0) {
  actor_emotions %>%
    mutate(
      Posts = format(Posts, big.mark = ","),
      LOVE = sprintf("%.1f%%", LOVE),
      WOW = sprintf("%.1f%%", WOW),
      HAHA = sprintf("%.1f%%", HAHA),
      SAD = sprintf("%.1f%%", SAD),
      ANGRY = sprintf("%.1f%%", ANGRY)
    ) %>%
    arrange(desc(as.numeric(gsub("%", "", LOVE)))) %>%
    kable(col.names = c("Actor Type", "LOVE", "WOW", "HAHA", "SAD", "ANGRY", "Posts")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
}
Actor Type LOVE WOW HAHA SAD ANGRY Posts
Academic 100.0% 0.0% 0.0% 0.0% 0.0% 1
Lay Influencers 99.8% 0.0% 0.2% 0.0% 0.0% 21
Diocesan 75.1% 0.0% 0.0% 24.9% 0.0% 8
Independent Media 62.8% 2.0% 9.2% 20.5% 5.5% 32
Other 42.9% 2.7% 25.8% 12.9% 15.8% 137
Institutional Official 0.0% 50.0% 16.7% 0.0% 33.3% 2

4 Analysis 3.3: Controversy Detection

Controversy is operationalized through the Controversy Index, which combines the intensity of angry reactions with the visibility of the content.

Formula: Controversy Index = (ANGRY_COUNT / INTERACTIONS) × log(INTERACTIONS + 1)

This formula ensures that: 1. Content with higher ANGRY ratios scores higher 2. Content with more total interactions (greater reach) scores higher 3. The logarithmic transformation prevents extremely viral posts from dominating

Posts above the 95th percentile threshold are flagged as controversial.

Show code
if (nrow(fb_data) > 0) {
  fb_data[, Controversy_Index := (fifelse(is.na(ANGRY_COUNT), 0, ANGRY_COUNT) / (INTERACTIONS + 1)) * log(INTERACTIONS + 1)]
  
  controversy_threshold <- quantile(fb_data$Controversy_Index, 0.95, na.rm = TRUE)
  fb_data[, Is_Controversial := Controversy_Index > controversy_threshold]
  
  cat("Controversy threshold (95th percentile):", round(controversy_threshold, 4), "\n")
  cat("Controversial posts:", sum(fb_data$Is_Controversial, na.rm = TRUE), "\n")
}
Controversy threshold (95th percentile): 0 
Controversial posts: 71 

4.1 Controversy Index Distribution

Show code
if (nrow(fb_data) > 0 && sum(fb_data$Controversy_Index > 0, na.rm = TRUE) > 0) {
  ggplot(fb_data[Controversy_Index > 0], aes(x = Controversy_Index)) +
    geom_histogram(bins = 50, fill = "#ef4444", alpha = 0.7, color = "white") +
    geom_vline(xintercept = controversy_threshold, linetype = "dashed", color = "darkred", linewidth = 1) +
    annotate("text", x = controversy_threshold * 1.5, y = Inf, vjust = 2,
             label = sprintf("95th percentile threshold\n(%.3f)", controversy_threshold),
             size = 3.5) +
    scale_x_log10() +
    scale_y_continuous(labels = comma) +
    labs(
      title = "Controversy Index Distribution",
      subtitle = "Controversy Index = (ANGRY / INTERACTIONS) x log(INTERACTIONS)",
      x = "Controversy Index (log scale)",
      y = "Number of Posts"
    )
}

4.2 Most Controversial Posts

Click to expand (Top 20 most controversial posts)
Show code
if (nrow(fb_data) > 0) {
  top_controversial <- fb_data[order(-Controversy_Index)][1:min(50, nrow(fb_data)), .(
    Rank = .I,
    FROM,
    ACTOR_TYPE,
    DATE,
    TITLE_SHORT = substr(TITLE, 1, 60),
    INTERACTIONS,
    ANGRY_COUNT,
    Controversy_Index
  )]
  
  top_controversial[1:min(20, nrow(top_controversial))] %>%
    mutate(
      INTERACTIONS = format(INTERACTIONS, big.mark = ","),
      Controversy_Index = sprintf("%.3f", Controversy_Index)
    ) %>%
    kable(col.names = c("Rank", "Source", "Actor Type", "Date", "Title", 
                        "Interactions", "ANGRY", "Index")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
    scroll_box(height = "500px")
}
Rank Source Actor Type Date Title Interactions ANGRY Index
1 srednja.hr Other 2021-04-22 Fuchs se susreo s Bozanićem: Razgovarali o suradnji Ministar 118 42 1.687
2 Index Vijesti Other 2021-04-21 Nadbiskup Barišić: Nezgodno je sad otkazati krizme, kasnije 108 26 1.119
3 DNEVNIK.hr Other 2021-04-15 Za Provjereno progovorili bivši djelatnici ogulinskog Crveno 245 47 1.052
4 Narod.hr Other 2021-04-16 Nadbiskup Uzinić: Dozvoljena su misna slavlja na kojima ima 11 5 1.035
5 Index.hr Other 2021-04-22 Splitski nadbiskup: Preporučujemo odgodu krizmi. Župa: Ipak 117 25 1.011
6 Net.hr Other 2021-04-22 Stožer želi odgodu krizmi i pričesti, nadbiskup se neda: 'Bi 324 56 0.997
7 Nisam Vjernik Other 2021-09-30 "Neće ništa loše čuti na vjeronauku" (Zvuči suludo, naravno, 74 17 0.979
8 Jutarnji Other 2021-04-22 Liječnici: ‘Zaustavite pričesti i krizme‘: Nadbiskup Barišić 624 92 0.948
9 Index.hr Other 2021-05-02 Svećenik u Salima na župu objesio zastavu s ustaškim slogano 889 122 0.931
10 Pokret za sekularnu Hrvatsku Other 2021-04-25 Svećenik na misi koju je prenosila HTV spominjao "bradate sv 5 3 0.896
11 Index.hr Other 2021-04-21 Nadbiskup Barišić: Nezgodno je sad otkazati krizme, kasnije 790 105 0.886
12 Slobodna Dalmacija Other 2021-04-25 Sramota na ekranima: svećenik na misi koju je prenosila tele 902 111 0.837
13 Dalmatinski portal Other 2021-04-22 KARIN: Zbog korone treba odgoditi krizme i pričesti 335 43 0.744
14 Jutarnji Other 2021-04-30 U sutrašnjem izdanju Jutarnjeg lista čitajte : 👉 Doći ću ti: | | 3| .732 |
15 Index Vijesti Other 2021-05-02 Svećenik u Salima na župu objesio zastavu s ustaškim slogano 153 22 0.720
16 24sata Other 2021-05-02 Pater je zazivao novi Bljesak; Branitelj: Ne dao Bog, ratove 264 33 0.695
17 Dnevno.hr Other 2021-05-04 SKANDAL KOJI JE 'ZAPALIO' SUSJEDE! Poznati sportaš urinirao 83 13 0.686
18 Bitno.net Independent Media 2021-04-16 Nadbiskup Uzinić u pismu svećenicima: Dozvoljena su sva misn 165 22 0.677
19 Jutarnji Other 2021-04-16 Svećenik s Dugog otoka u novom skandalu, sukobio se s vjerni 1,169 99 0.598
20 PolitikaPlus Institutional Official 2021-10-05 Na kraju, samo jedna opaska. Člankom 10. ovog Ugovora zajamč 5 2 0.597

4.3 Controversy by Actor Type

Which actor types produce the most controversial content on average?

Show code
if (nrow(fb_data) > 0) {
  controversy_actor <- fb_data[, .(
    Mean_Controversy = mean(Controversy_Index, na.rm = TRUE),
    Median_Controversy = median(Controversy_Index, na.rm = TRUE),
    Max_Controversy = max(Controversy_Index, na.rm = TRUE),
    Controversial_Posts = sum(Is_Controversial, na.rm = TRUE),
    Total_Posts = .N,
    Controversy_Rate = sum(Is_Controversial, na.rm = TRUE) / .N * 100
  ), by = ACTOR_TYPE][order(-Mean_Controversy)]
  
  ggplot(controversy_actor, aes(x = reorder(ACTOR_TYPE, Mean_Controversy), 
                                 y = Mean_Controversy, fill = ACTOR_TYPE)) +
    geom_col(width = 0.7) +
    geom_text(aes(label = sprintf("%.4f", Mean_Controversy)), hjust = -0.1, size = 3.5) +
    coord_flip() +
    scale_fill_manual(values = actor_colors) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.2))) +
    labs(
      title = "Mean Controversy Index by Actor Type",
      subtitle = "Higher values indicate more controversial content on average",
      x = NULL,
      y = "Mean Controversy Index"
    ) +
    theme(legend.position = "none")
}

4.4 Controversy Rate by Actor Type

The controversy rate shows the percentage of posts from each actor type that exceed the 95th percentile controversy threshold.

Show code
if (exists("controversy_actor") && nrow(controversy_actor) > 0) {
  ggplot(controversy_actor, aes(x = reorder(ACTOR_TYPE, Controversy_Rate), 
                                 y = Controversy_Rate, fill = ACTOR_TYPE)) +
    geom_col(width = 0.7) +
    geom_text(aes(label = sprintf("%.1f%%", Controversy_Rate)), hjust = -0.1, size = 3.5) +
    coord_flip() +
    scale_fill_manual(values = actor_colors) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.2))) +
    labs(
      title = "Controversy Rate by Actor Type",
      subtitle = "Percentage of posts exceeding 95th percentile controversy threshold",
      x = NULL,
      y = "Controversy Rate (%)"
    ) +
    theme(legend.position = "none")
}

4.5 Controversy Summary Table

Show code
if (exists("controversy_actor") && nrow(controversy_actor) > 0) {
  controversy_actor %>%
    mutate(
      Total_Posts = format(Total_Posts, big.mark = ","),
      Controversial_Posts = format(Controversial_Posts, big.mark = ","),
      Mean_Controversy = sprintf("%.4f", Mean_Controversy),
      Median_Controversy = sprintf("%.4f", Median_Controversy),
      Max_Controversy = sprintf("%.3f", Max_Controversy),
      Controversy_Rate = sprintf("%.2f%%", Controversy_Rate)
    ) %>%
    select(ACTOR_TYPE, Total_Posts, Controversial_Posts, Controversy_Rate, 
           Mean_Controversy, Median_Controversy, Max_Controversy) %>%
    kable(col.names = c("Actor Type", "Total Posts", "Controversial Posts", 
                        "Controversy Rate", "Mean Index", "Median Index", "Max Index")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
}
Actor Type Total Posts Controversial Posts Controversy Rate Mean Index Median Index Max Index
Other 42,121 63 0.15% 0.0006 0.0000 1.687
Independent Media 5,336 7 0.13% 0.0003 0.0000 0.677
Institutional Official 3,525 1 0.03% 0.0002 0.0000 0.597
Diocesan 2,820 0 0.00% 0.0000 0.0000 0.000
Lay Influencers 9,800 0 0.00% 0.0000 0.0000 0.000
Academic 190 0 0.00% 0.0000 0.0000 0.000
Youth Organizations 51 0 0.00% 0.0000 0.0000 0.000

4.6 Controversy Over Time

Tracking controversy over time can reveal periods of heightened conflict or tension in the Catholic digital space.

Show code
if (nrow(fb_data) > 0) {
  controversy_time <- fb_data[, .(
    Mean_Controversy = mean(Controversy_Index, na.rm = TRUE),
    Controversial_Posts = sum(Is_Controversial, na.rm = TRUE),
    Total_Posts = .N
  ), by = .(Year = year(DATE), Month = month(DATE))]
  
  controversy_time[, Date := as.Date(paste(Year, Month, "01", sep = "-"))]
  controversy_time[, Controversy_Rate := Controversial_Posts / Total_Posts * 100]
  
  ggplot(controversy_time[!is.na(Date)], aes(x = Date, y = Mean_Controversy)) +
    geom_line(color = "#ef4444", linewidth = 1) +
    geom_smooth(method = "loess", se = TRUE, color = "darkred", fill = "pink", alpha = 0.3) +
    scale_x_date(date_labels = "%Y-%m", date_breaks = "6 months") +
    labs(
      title = "Controversy Index Over Time",
      subtitle = "Monthly mean controversy index with LOESS trend",
      x = NULL,
      y = "Mean Controversy Index"
    ) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
}

4.7 Top Controversial Sources

Click to expand (Sources with highest mean controversy, minimum 20 posts)
Show code
if (nrow(fb_data) > 0) {
  source_controversy <- fb_data[, .(
    Mean_Controversy = mean(Controversy_Index, na.rm = TRUE),
    Max_Controversy = max(Controversy_Index, na.rm = TRUE),
    Controversial_Posts = sum(Is_Controversial, na.rm = TRUE),
    Total_Posts = .N,
    Total_ANGRY = sum(ANGRY_COUNT, na.rm = TRUE),
    ACTOR_TYPE = first(ACTOR_TYPE)
  ), by = FROM][Total_Posts >= 20][order(-Mean_Controversy)]
  
  if (nrow(source_controversy) > 0) {
    source_controversy[1:min(20, nrow(source_controversy))] %>%
      mutate(
        Total_Posts = format(Total_Posts, big.mark = ","),
        Total_ANGRY = format(Total_ANGRY, big.mark = ","),
        Mean_Controversy = sprintf("%.4f", Mean_Controversy),
        Max_Controversy = sprintf("%.3f", Max_Controversy)
      ) %>%
      select(FROM, ACTOR_TYPE, Total_Posts, Controversial_Posts, 
             Mean_Controversy, Max_Controversy, Total_ANGRY) %>%
      kable(col.names = c("Source", "Actor Type", "Posts", "Controversial", 
                          "Mean Index", "Max Index", "Total ANGRY")) %>%
      kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
  }
}
Source Actor Type Posts Controversial Mean Index Max Index Total ANGRY
Nisam Vjernik Other 22 1 0.0445 0.979 17
Makarska Danas Other 31 4 0.0187 0.409 10
Index.hr Other 293 9 0.0136 1.011 301
Jutarnji Other 262 5 0.0104 0.948 207
Index Vijesti Other 240 2 0.0077 1.119 48
Zoran Šprajc Other 22 1 0.0035 0.077 6
Lupiga.com Other 47 1 0.0034 0.160 6
Dalmatinski portal Other 239 1 0.0031 0.744 43
24sata Other 475 3 0.0026 0.695 74
Narod.hr Other 465 2 0.0024 1.035 9
Slobodna Dalmacija Other 764 5 0.0023 0.837 145
Tportal.hr Other 408 3 0.0022 0.545 8
N1 Hrvatska Other 326 1 0.0018 0.575 11
Net.hr Other 655 2 0.0017 0.997 59
DNEVNIK.hr Other 643 1 0.0016 1.052 47
Dubrovački vjesnik Other 207 1 0.0016 0.322 1
Dnevno.hr Other 620 4 0.0015 0.686 66
SNAGA Šibenski portal Other 54 1 0.0013 0.072 1
ŠibenikIN Other 459 1 0.0010 0.456 8
Croative.net Other 350 1 0.0010 0.347 1

5 Analysis 3.4: Emotion by Sentiment

This section cross-tabulates automated sentiment classification with actual emotional reactions to validate whether sentiment labels align with audience responses.

Expected patterns: - Positive sentiment posts should receive more LOVE reactions - Negative sentiment posts should receive more ANGRY and SAD reactions - Neutral sentiment posts should show balanced emotional profiles

Deviations from these patterns may indicate limitations in the sentiment classifier or interesting disconnects between content tone and audience reception.

Show code
if (nrow(fb_data) > 0 && sum(fb_data$Total_Reactions > 0, na.rm = TRUE) > 0) {
  emotion_sentiment <- fb_data[!is.na(AUTO_SENTIMENT) & Total_Reactions > 0, .(
    LOVE = mean(LOVE_SHARE, na.rm = TRUE) * 100,
    WOW = mean(WOW_SHARE, na.rm = TRUE) * 100,
    HAHA = mean(HAHA_SHARE, na.rm = TRUE) * 100,
    SAD = mean(SAD_SHARE, na.rm = TRUE) * 100,
    ANGRY = mean(ANGRY_SHARE, na.rm = TRUE) * 100,
    Posts = .N
  ), by = AUTO_SENTIMENT]
  
  emotion_sentiment_long <- emotion_sentiment %>%
    select(-Posts) %>%
    pivot_longer(cols = -AUTO_SENTIMENT, names_to = "Emotion", values_to = "Share")
  
  ggplot(emotion_sentiment_long, aes(x = Emotion, y = Share, fill = Emotion)) +
    geom_col(width = 0.7) +
    facet_wrap(~AUTO_SENTIMENT, ncol = 3) +
    scale_fill_manual(values = emotion_colors) +
    labs(
      title = "Emotional Reactions by Automated Sentiment",
      subtitle = "Mean reaction share within each sentiment category",
      x = NULL,
      y = "Share of Emotional Reactions (%)"
    ) +
    theme(
      legend.position = "none",
      axis.text.x = element_text(angle = 45, hjust = 1)
    )
}

5.1 Sentiment Validation Through Reactions

The Positive Ratio divides LOVE by (ANGRY + SAD) to measure whether sentiment classifications align with actual emotional responses. Higher ratios for positive sentiment validate the classifier.

Show code
if (nrow(fb_data) > 0 && sum(fb_data$Total_Reactions > 0, na.rm = TRUE) > 0) {
  validation_metrics <- fb_data[!is.na(AUTO_SENTIMENT) & Total_Reactions > 0, .(
    Mean_Love = mean(LOVE_SHARE, na.rm = TRUE),
    Mean_Angry = mean(ANGRY_SHARE, na.rm = TRUE),
    Mean_Sad = mean(SAD_SHARE, na.rm = TRUE),
    Positive_Ratio = mean(LOVE_SHARE, na.rm = TRUE) / (mean(ANGRY_SHARE, na.rm = TRUE) + mean(SAD_SHARE, na.rm = TRUE) + 0.001)
  ), by = AUTO_SENTIMENT]
  
  validation_metrics %>%
    mutate(
      Mean_Love = sprintf("%.1f%%", Mean_Love * 100),
      Mean_Angry = sprintf("%.1f%%", Mean_Angry * 100),
      Mean_Sad = sprintf("%.1f%%", Mean_Sad * 100),
      Positive_Ratio = sprintf("%.2f", Positive_Ratio)
    ) %>%
    kable(col.names = c("Auto Sentiment", "Mean LOVE", "Mean ANGRY", "Mean SAD", 
                        "Positive Ratio (LOVE / (ANGRY+SAD))")) %>%
    kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
}
Auto Sentiment Mean LOVE Mean ANGRY Mean SAD Positive Ratio (LOVE / (ANGRY+SAD))
negative 35.2% 17.9% 17.7% 0.99
positive 90.7% 1.4% 1.3% 32.34
neutral 50.0% 10.9% 16.1% 1.85

6 Analysis 3.5: Emotional Temporal Patterns

Emotional patterns may vary over time due to seasonal factors (liturgical calendar), events (elections, scandals), or longer term trends in the Catholic digital space.

Show code
if (nrow(fb_data) > 0 && sum(fb_data$Total_Reactions > 0, na.rm = TRUE) > 0) {
  emotion_monthly <- fb_data[Total_Reactions > 0, .(
    LOVE = mean(LOVE_SHARE, na.rm = TRUE) * 100,
    ANGRY = mean(ANGRY_SHARE, na.rm = TRUE) * 100,
    SAD = mean(SAD_SHARE, na.rm = TRUE) * 100,
    Posts = .N
  ), by = .(Year = year(DATE), Month = month(DATE))]
  
  emotion_monthly[, Date := as.Date(paste(Year, Month, "01", sep = "-"))]
  
  emotion_monthly_long <- emotion_monthly %>%
    select(Date, LOVE, ANGRY, SAD) %>%
    pivot_longer(cols = -Date, names_to = "Emotion", values_to = "Share")
  
  ggplot(emotion_monthly_long[!is.na(Date)], aes(x = Date, y = Share, color = Emotion)) +
    geom_line(linewidth = 1) +
    geom_smooth(method = "loess", se = FALSE, linetype = "dashed", linewidth = 0.7) +
    scale_color_manual(values = c("LOVE" = "#ec4899", "ANGRY" = "#ef4444", "SAD" = "#3b82f6")) +
    scale_x_date(date_labels = "%Y-%m", date_breaks = "6 months") +
    labs(
      title = "Key Emotional Reactions Over Time",
      subtitle = "Monthly mean share of LOVE, ANGRY, and SAD reactions",
      x = NULL,
      y = "Share of Emotional Reactions (%)",
      color = "Reaction"
    ) +
    theme(
      axis.text.x = element_text(angle = 45, hjust = 1),
      legend.position = "top"
    )
}

6.1 Day of Week Emotional Patterns

Do posting patterns by day of week affect emotional responses? Sunday posts (liturgical content) may receive different reactions than weekday posts.

Show code
if (nrow(fb_data) > 0 && sum(fb_data$Total_Reactions > 0, na.rm = TRUE) > 0) {
  fb_data[, DOW := lubridate::wday(DATE, label = TRUE, abbr = FALSE)]
  
  emotion_dow <- fb_data[Total_Reactions > 0 & !is.na(DOW), .(
    LOVE = mean(LOVE_SHARE, na.rm = TRUE) * 100,
    ANGRY = mean(ANGRY_SHARE, na.rm = TRUE) * 100,
    Posts = .N
  ), by = DOW]
  
  emotion_dow_long <- emotion_dow %>%
    select(DOW, LOVE, ANGRY) %>%
    pivot_longer(cols = -DOW, names_to = "Emotion", values_to = "Share")
  
  ggplot(emotion_dow_long, aes(x = DOW, y = Share, fill = Emotion)) +
    geom_col(position = "dodge", width = 0.7) +
    scale_fill_manual(values = c("LOVE" = "#ec4899", "ANGRY" = "#ef4444")) +
    labs(
      title = "LOVE and ANGRY Reactions by Day of Week",
      subtitle = "Mean share of reactions by posting day",
      x = NULL,
      y = "Share of Emotional Reactions (%)",
      fill = "Reaction"
    ) +
    theme(legend.position = "top")
}

7 Summary and Key Findings

Show code
total_with_sentiment <- sum(!is.na(dta$AUTO_SENTIMENT))
positive_pct <- sum(dta$AUTO_SENTIMENT == "positive", na.rm = TRUE) / total_with_sentiment * 100
negative_pct <- sum(dta$AUTO_SENTIMENT == "negative", na.rm = TRUE) / total_with_sentiment * 100

fb_total <- nrow(fb_data)
mean_love_share <- if(fb_total > 0) mean(fb_data$LOVE_SHARE, na.rm = TRUE) * 100 else 0
mean_angry_share <- if(fb_total > 0) mean(fb_data$ANGRY_SHARE, na.rm = TRUE) * 100 else 0
controversial_count <- if(fb_total > 0) sum(fb_data$Is_Controversial, na.rm = TRUE) else 0

7.1 Summary Table

Show code
tibble(
  Finding = c(
    "Total posts with sentiment data",
    "Positive sentiment share",
    "Negative sentiment share",
    "Facebook posts with reaction data",
    "Mean LOVE share of reactions",
    "Mean ANGRY share of reactions",
    "Posts exceeding controversy threshold"
  ),
  Value = c(
    format(total_with_sentiment, big.mark = ","),
    sprintf("%.1f%%", positive_pct),
    sprintf("%.1f%%", negative_pct),
    format(fb_total, big.mark = ","),
    sprintf("%.1f%%", mean_love_share),
    sprintf("%.2f%%", mean_angry_share),
    format(controversial_count, big.mark = ",")
  )
) %>%
  kable(col.names = c("Finding", "Value")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Finding Value
Total posts with sentiment data 608,879
Positive sentiment share 47.3%
Negative sentiment share 28.7%
Facebook posts with reaction data 63,843
Mean LOVE share of reactions 53.1%
Mean ANGRY share of reactions 11.95%
Posts exceeding controversy threshold 71

7.2 Hypotheses Testing Summary

Hypothesis Metric to Examine Finding
H10: Catholic digital content is predominantly positive Sentiment distribution, positivity ratio See overall sentiment distribution
H11: Lay influencers have highest LOVE ratios LOVE share by actor type Compare actor emotional fingerprints
H12: Independent media generates most controversy Controversy rate by actor type Compare controversy rates
H13: Negative sentiment correlates with higher engagement Sentiment-engagement relationship See sentiment-engagement analysis

8 Appendix: Classification Quality Diagnostics

Click to expand diagnostics (Quality assurance for actor classification)
Show code
# Actor type distribution check
actor_dist <- dta[, .(
  Posts = .N,
  Sources = uniqueN(FROM),
  Mean_Interactions = mean(INTERACTIONS, na.rm = TRUE)
), by = ACTOR_TYPE][order(-Posts)]

actor_dist %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    Sources = format(Sources, big.mark = ","),
    Mean_Interactions = round(Mean_Interactions, 1)
  ) %>%
  kable(col.names = c("Actor Type", "Posts", "Sources", "Mean Interactions"),
        caption = "Actor Type Distribution in Emotional Analysis Sample") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Actor Type Distribution in Emotional Analysis Sample
Actor Type Posts Sources Mean Interactions
Other 468,760 16,076 99.1
Institutional Official 62,744 202 56.6
Independent Media 31,401 18 210.2
Lay Influencers 28,451 62 224.7
Diocesan 13,841 50 46.0
Religious Orders 1,677 11 22.8
Charismatic Communities 1,282 12 153.7
Academic 671 5 17.3
Youth Organizations 52 4 128.8
Show code
# Check high-engagement sources in Other category
other_high_engagement <- dta[ACTOR_TYPE == "Other" & SOURCE_TYPE == "facebook", .(
  Posts = .N,
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE)
), by = FROM][order(-Total_Interactions)][1:15]

other_high_engagement %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    Total_Interactions = format(Total_Interactions, big.mark = ",")
  ) %>%
  kable(col.names = c("Source", "Posts", "Total Interactions"),
        caption = "Top Facebook Sources in 'Other' Category (review for misclassification)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Top Facebook Sources in 'Other' Category (review for misclassification)
Source Posts Total Interactions
Index.hr 294 197,104
Gospa 624 179,812
24sata 482 174,369
DNEVNIK.hr 648 164,778
RTL Danas 663 158,676
Kraljica Mira 1,224 155,621
Velimir Bujanec 114 144,769
Net.hr 664 135,958
RTL Direkt 157 135,282
Marko Perković Thompson 28 124,277
Slobodna Dalmacija 774 118,415
Večernji list 560 115,060
Telegram.hr 260 111,267
Ivan Pokupec 120 108,174
HRT Vijesti 180 106,376

9 Appendix: Model Diagnostics

Show code
cat("Emotional Analysis Summary:\n")
Emotional Analysis Summary:
Show code
cat("Total posts:", nrow(dta), "\n")
Total posts: 608879 
Show code
cat("Posts with sentiment:", sum(!is.na(dta$AUTO_SENTIMENT)), "\n")
Posts with sentiment: 608879 
Show code
cat("Facebook posts with reactions:", nrow(fb_data), "\n")
Facebook posts with reactions: 63843 
Show code
cat("Date range:", as.character(min(dta$DATE)), "to", as.character(max(dta$DATE)), "\n")
Date range: 2021-01-01 to 2025-12-31 
Show code
sessionInfo()
R version 4.5.2 (2025-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=Croatian_Croatia.utf8  LC_CTYPE=Croatian_Croatia.utf8   
[3] LC_MONETARY=Croatian_Croatia.utf8 LC_NUMERIC=C                     
[5] LC_TIME=Croatian_Croatia.utf8    

time zone: Europe/Zagreb
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] patchwork_1.3.2   ggridges_0.5.7    viridis_0.6.5     viridisLite_0.4.2
 [5] fmsb_0.7.6        kableExtra_1.4.0  knitr_1.50        scales_1.4.0     
 [9] data.table_1.17.8 lubridate_1.9.4   forcats_1.0.1     stringr_1.6.0    
[13] dplyr_1.1.4       purrr_1.2.0       readr_2.1.6       tidyr_1.3.1      
[17] tibble_3.3.0      ggplot2_4.0.1     tidyverse_2.0.0  

loaded via a namespace (and not attached):
 [1] generics_0.1.4     xml2_1.5.1         lattice_0.22-7     stringi_1.8.7     
 [5] hms_1.1.4          digest_0.6.39      magrittr_2.0.4     evaluate_1.0.5    
 [9] grid_4.5.2         timechange_0.3.0   RColorBrewer_1.1-3 fastmap_1.2.0     
[13] Matrix_1.7-4       jsonlite_2.0.0     gridExtra_2.3      mgcv_1.9-3        
[17] textshaping_1.0.4  cli_3.6.5          rlang_1.1.6        splines_4.5.2     
[21] withr_3.0.2        yaml_2.3.11        tools_4.5.2        tzdb_0.5.0        
[25] vctrs_0.6.5        R6_2.6.1           lifecycle_1.0.4    htmlwidgets_1.6.4 
[29] pkgconfig_2.0.3    pillar_1.11.1      gtable_0.3.6       glue_1.8.0        
[33] systemfonts_1.3.1  xfun_0.54          tidyselect_1.2.1   rstudioapi_0.17.1 
[37] dichromat_2.0-0.1  farver_2.1.2       nlme_3.1-168       htmltools_0.5.8.1 
[41] labeling_0.4.3     rmarkdown_2.30     svglite_2.2.2      compiler_4.5.2    
[45] S7_0.2.1