DigiKat Map 1: Platform and Actor Structure

Mapping the Croatian Catholic Digital Media Space

Author

DigiKat Project

Published

January 27, 2026

1 Introduction

This document presents Map 1: Platform and Actor Structure of the DigiKat project, analyzing the Croatian Catholic digital media space. The core question driving this analysis is: Who communicates where, and who dominates visibility?

We analyze a corpus of over 600,000 posts from 2021-2025 across multiple digital platforms to understand:

  1. How content is distributed across platforms
  2. How visibility and engagement are stratified among actors
  3. Whether institutional actors underperform compared to grassroots voices
  4. Which actors maintain cross-platform presence

This map draws on theoretical frameworks from digital media studies, particularly the concepts of platform affordances (van Dijck, 2013), attention economics (Webster, 2014), and the long-tail distribution of online visibility (Anderson, 2006). Understanding who speaks and who is heard in digital religious spaces is essential for assessing the health and diversity of Catholic public discourse in Croatia.

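# Packages used throughout this document
library(dplyr)       # pipes, filter(), mutate(), case_when(), tibble()
library(tidyr)       # pivot_longer()
library(data.table)  # setDT(), uniqueN()
library(ggplot2)     # charts
library(knitr)       # kable()
library(kableExtra)  # kable_styling(), row_spec(), collapse_rows()

# Load data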
dta <- readRDS("C:/Users/lsikic/Luka C/HKS/Projekti/Digitalni Kat/SHKM/DigiKat/data/merged_comprehensive.rds") %>%
  filter(SOURCE_TYPE != "tiktok", !is.na(SOURCE_TYPE)) %>%
  filter(DATE >= as.Date("2021-01-01") & DATE <= as.Date("2025-12-31")) %>%
  filter(year >= 2021 & year <= 2025)

# Convert to data.table for efficiency
setDT(dta)

# Basic corpus info
n_posts <- nrow(dta)
n_sources <- uniqueN(dta$FROM)
date_range <- paste(min(dta$DATE), "to", max(dta$DATE))

1.1 Corpus Overview

The corpus represents a comprehensive collection of publicly available digital content from Croatian Catholic media sources. This includes official institutional communications, independent Catholic media, parish and diocesan channels, religious order publications, and individual clergy voices across web, social media, and discussion platforms.

tibble(
  Metric = c("Total posts", "Unique sources", "Date range", "Platforms"),
  Value = c(
    format(n_posts, big.mark = ","),
    format(n_sources, big.mark = ","),
    date_range,
    paste(unique(dta$SOURCE_TYPE), collapse = ", ")
  )
) %>%
  kable(col.names = c("Metric", "Value")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
| Metric | Value |
|---|---|
| Total posts | 608,879 |
| Unique sources | 16,426 |
| Date range | 2021-01-01 to 2025-12-31 |
| Platforms | web, facebook, twitter, comment, youtube, forum, reddit, instagram |

2 Actor Type Classification

Before analyzing platform dynamics, we classify sources into meaningful actor types based on the Croatian Catholic media landscape. This classification is essential for understanding the structural composition of the digital space and enables comparative analysis across different types of communicators.

2.1 Classification Methodology

The classification employs a hierarchical priority system that processes each source through multiple identification layers. This approach balances precision (avoiding false positives) with recall (capturing relevant actors). The hierarchy works as follows:

  1. Manual overrides handle known important sources that require explicit classification
  2. Secular media exclusions prevent misclassification of mainstream news outlets covering religious topics
  3. Domain-based matching provides reliable identification for web sources with known URLs
  4. Pattern-based matching captures sources through name recognition
  5. Platform-aware detection identifies lay devotional content on social media

The resulting categories reflect the organizational structure of Croatian Catholic communications, distinguishing between official Church hierarchy, independent media voices, religious communities, and individual actors.

# =============================================================================
# IMPROVED ACTOR CLASSIFICATION v4 (ACTIVE)
# =============================================================================
# This is the current classification system with the following improvements:
# 1. Hierarchical priority system with explicit ordering
# 2. Expanded secular media exclusions (100+ outlets)
# 3. Domain-based classification for reliable web source identification
# 4. Word boundary matching for order and academic abbreviations (prevents
#    false matches); manual overrides and exclusions still match substrings
# 5. Start-of-string matching for priest titles
# 6. Platform-aware lay influencer detection
# 7. Expanded patterns for religious orders including female congregations
# =============================================================================

# -----------------------------------------------------------------------------
# MANUAL OVERRIDES: Highest priority explicit classifications
# -----------------------------------------------------------------------------
manual_overrides <- list(
  "Institutional Official" = c(
    "hrvatska katolička mreža", "hrvatska katolicka mreza",
    "informativna katolička agencija", "informativna katolicka agencija",
    "hrvatski katolički radio", "hrvatski katolicki radio",
    "hrvatska biskupska konferencija", "ika", "hkr", "hkm", "hbk",
    "tiskovni ured hbk", "radio marija"
  ),
  "Independent Media" = c(
    "laudato tv", "laudatotv", "laudato.tv", "laudato.hr", "laudato",
    "bitno.net", "bitno net", "glas koncila", "glaskoncila",
    "nova eva", "nova-eva", "verbum", "totus tuus", "totus-tuus",
    "katolički tjednik", "katolicki tjednik", "kršćanska sadašnjost",
    "krscanska sadasnjost", "mir i dobro", "svjetlo riječi",
    "novizivot.net", "novi zivot", "novi život"
  ),
  "Charismatic Communities" = c(
    "božja pobjeda", "bozja pobjeda", "bozjapobjeda",
    "muževni budite", "muzevni budite", "muzevnibudite",
    "srce isusovo", "srceisuovo", "cenacolo", "comunità cenacolo",
    "duhovna obnova", "molitvena snaga",
    "dom molitve slavonski brod", "dom molitve",
    "molitvena zajednica sv. josipa"
  ),
  "Lay Influencers" = c(
    "katolička obitelj", "katolicka obitelj",
    "marija majka isusova", "božanske molitve", "bozanske molitve",
    "moćne molitve tv", "mocne molitve tv", "moćne molitve", "mocne molitve",
    "katoličke molitve", "katolicke molitve",
    "pulherissimus",
    "pod smokvom",
    "hrana za dušu", "hrana za dusu",
    "добровољци",
    "miletić marin", "miletic marin",
    "dijete vjere",
    "vjera",
    "kapljice ljubavi božje", "kapljice ljubavi bozje",
    "kršćanstvo", "krscanstvo",
    "jutarnja molitva duhu svetom",
    "blago molitve",
    "biblija krunice molitve",
    "molitve bogu",
    "vojnik sreće", "vojnik srece", "duhovne poruke i inspiracija",
    "kes duhovni kutak", "duhovni kutak",
    "molitve.hr",
    "duhovniportal.com", "duhovniportal"
  ),
  "Diocesan" = c(
    "zagrebačka nadbiskupija", "zagrebacka nadbiskupija",
    "sisačka biskupija", "sisacka biskupija",
    "župa šurkovac", "zupa surkovac",
    "sveta mati slobode",
    "župa sv. ilije proroka metković", "zupa sv. ilije proroka metkovic",
    "župa uznesenja bdm", "zupa uznesenja bdm", "župa uznesenja bdm - stenjevec",
    "šibenska biskupija", "sibenska biskupija",
    "požeška biskupija", "pozeskа biskupija",
    "dubrovacka biskupija", "dubrovačka biskupija",
    "dubrovacka-biskupija.hr",
    "zupa tramosnica", "župa tramošnica",
    "župa sv. vida", "zupa sv. vida", "župa sv. vida - petruševec"
  ),
  "Youth Organizations" = c(
    "susret hrvatske katoličke mladeži", "susret hrvatske katolicke mladezi",
    "shkm požega", "shkm pozega"
  ),
  "Academic" = c(
    "hrvatsko katoličko sveučilište", "hrvatsko katolicko sveuciliste",
    "universitas studiorum catholica croatica"
  )
)

# -----------------------------------------------------------------------------
# SECULAR MEDIA EXCLUSIONS: Always classify as Other
# -----------------------------------------------------------------------------
secular_exclusions <- c(
  # National portals (with and without .hr)
  "slobodnadalmacija", "slobodnadalmacija.hr", 
  "vecernji", "vecernji.hr", 
  "index.hr", "index",
  "jutarnji", "jutarnji.hr", 
  "novilist", "novilist.hr",
  "24sata", "24sata.hr",
  "direktno", "direktno.hr",
  "nacional", "nacional.hr", 
  "tportal", "tportal.hr",
  "dnevnik.hr", "dnevnik", 
  "hrt.hr", "hrt", 
  "n1info", "n1info.hr", "n1",
  "rtl.hr", "rtl", 
  "net.hr", 
  "telegram.hr", "telegram", 
  "story.hr", "express.hr", "express", "advance.hr",
  # Regional portals
  "glasistre", "glasistre.hr",
  "dnevno.hr", "dnevno", 
  "prigorski", "glas-slavonije", "glas slavonije",
  "croativ", "oluja.info", 
  "maxportal", "maxportal.hr",
  "hkv.hr", "icv.hr", "novosti.hr", "7dnevno", "mnovine", 
  "sjever.hr", "dulist.hr", "pozega.eu", "sibenik.in",
  "ferata.hr", "epodravina", "glasgacke", "radio-zlatar", 
  "medjimurski.hr", "sbperiskop", "zagorje-international",
  "pozeski", "novine.hr", "dubrovnikinsider", "regionalni", 
  "leportale", "varazdinske-vijesti", "radionasice", 
  "brodportal", "ljportal", "dubrovnikportal", "01portal",
  "tomislavnews", "hia.com.hr", "portalnovosti", "antenazadar",
  "dalmacijanews", "zadarskilist", "medjimurjepress", 
  "zagreb.info", "034portal", "057info", "cityportal",
  "klikaj.hr", "lika-online", "ploce.com", "sbonline", 
  "narod.hr", "infokiosk", "hrsvijet", "tomislavcity", 
  "vrisak.info", "dalmacijadanas", "dalmacijadanas.hr",
  "morski.hr", "zagreb.hr", "osijek031", "rijeka.hr", "zadar.hr",
  "zupanjac.net", "zupanjac", 
  "dalmatinskiportal.hr", "dalmatinskiportal",
  "campaign-archive.com",
  # Forums and aggregators
  "forum.hr", "reddit", "anonymous_user", "komentari", "bug.hr",
  # Entertainment and other
  "inmemoriam", "magicus.info", "book.hr", "mojzagreb.info", 
  "skole.hr", "tvprofil", "priznajem.hr", "dragovoljac.com", 
  "croatia", "wikipedia",
  "facebook.com", "youtube.com", "instagram.com", "twitter.com",
  # Government and administrative
  "županija", "zupanija", "grad ", "opcina", "općina",
  # Non-Catholic religious groups
  "kršćanska proročka crkva", "krscanska prorocka crkva",
  "crkva svemogućeg boga", "crkva svemoguceg boga",
  "jehovini svjedoci", "adventisti", "baptisti", "pentekostalna",
  # Political parties
  "domovinski pokret", "hdz", "sdp", "most", "možemo", "mozemo"
)

# -----------------------------------------------------------------------------
# DOMAIN PATTERNS: URL-based classification
# -----------------------------------------------------------------------------
domain_patterns <- list(
  "Institutional Official" = c(
    "hkm.hr", "ika.hkm.hr", "hkr.hkm.hr", "hbk.hr", "radiomarija.hr"
  ),
  "Diocesan" = c(
    "zg-nadbiskupija.hr", "biskupija-varazdinska.hr", "djos.hr",
    "biskupija-sj.hr", "rzs.hr", "rkc-sisak.hr", "zadarskanadbiskupija.hr",
    "gospicko-senjska-biskupija.hr", "nadbiskupija-split.com",
    "nadbiskupija-split.hr", "dubrovacka-biskupija.hr", "porec-biskupija.hr",
    "biskupija-kk.hr", "rkc-pula.hr", "krizcevacka-eparhija.hr",
    "sibenska-biskupija.hr", "krizevci.hbk.hr"
  ),
  "Independent Media" = c(
    "laudato.hr", "laudato.tv", "bitno.net", "glaskoncila.hr",
    "nova-eva.com", "verbum.hr", "ks.hr", "totus-tuus.hr",
    "svjetlorijeci.hr", "zivot.com.hr", "novizivot.net"
  ),
  "Academic" = c(
    "unicath.hr", "hks.hr", "kbf.unizg.hr", "ffrz.hr", "hku.hr"
  ),
  "Religious Orders" = c(
    "franjevci.hr", "franjevci-split.hr", "isusovci.hr", "dominikanci.hr",
    "kapucini.hr", "salezijanci.hr", "karmelicani.hr"
  )
)

# -----------------------------------------------------------------------------
# NAME PATTERNS: Text-based classification
# -----------------------------------------------------------------------------

# Diocesan
diocesan_exact <- c(
  "zagrebačka nadbiskupija", "zagrebacka nadbiskupija",
  "splitsko-makarska nadbiskupija", "splitsko makarska",
  "đakovačko-osječka nadbiskupija", "djakovacko-osjecka",
  "riječka nadbiskupija", "rijecka nadbiskupija",
  "zadarska nadbiskupija", "sisačka biskupija", "sisacka biskupija",
  "varaždinska biskupija", "varazdinska biskupija",
  "križevačka eparhija", "krizevacka eparhija",
  "šibenska biskupija", "sibenska biskupija",
  "dubrovačka biskupija", "dubrovacka biskupija",
  "porečka i pulska biskupija", "porecka biskupija",
  "gospićko-senjska biskupija", "gospicko-senjska",
  "bjelovarsko-križevačka biskupija", "kotorska biskupija"
)
diocesan_contains <- c("nadbiskupija", "biskupija", "eparhija", "ordinarijat")
# Parish patterns - match at start OR anywhere with space before
diocesan_parish <- c("župa ", "zupa ", "župna ", "zupna ", "župni ", "zupni ",
                     "župa", "zupa")  # Also match at start without space

# Religious Orders
orders_exact <- c(
  "franjevci", "franjevci konventualci", "franjevci kapucini",
  "mala braća", "isusovci", "družba isusova", "druzba isusova",
  "dominikanci", "red propovjednika", "salezijanci", "don bosco",
  "karmelićani", "karmelicani", "karmel", "benediktinci", "benediktin",
  "kapucini", "pavlini", "trapisti", "cisterciti", "augustinci",
  "sestre milosrdnice", "uršulinke", "ursulinke", "klarise",
  "službenice milosrđa", "kćeri božje ljubavi", "školske sestre",
  "karmelićanke", "benediktinke", "dominikanke"
)
orders_contains <- c(
  "franjevački", "franjevacki", "isusovački", "isusovacki",
  "dominikanski", "salezijanski", "karmelski", "benediktinski",
  "kapucinski", "pavlinski", "redovnici", "redovnice",
  "samostan", "provincija"
)
orders_abbrev <- c("ofm", "ofmcap", "ofmconv", "sj", "op", "sdb",
                   "ocd", "osb", "osbm", "cssr", "svd", "omc")

# Charismatic Communities
charismatic_exact <- c(
  "emmanuel", "taize", "taizé", "fokolari", "fokolarini",
  "kursiljo", "cursillo", "neokatekumenski put", "shalom",
  "zajednica beatitudes", "zajednica blaženstava",
  "dom molitve", "kuća molitve", "kuca molitve",
  "duhovna obnova", "molitvena snaga"
)
charismatic_contains <- c(
  "molitvena zajednica", "karizmatska", "karizmatski",
  "neokatekumenski", "neokatekumenska", "obnova u duhu",
  "komunija i oslobođenje", "comunione e liberazione",
  "dom molitve", "house of prayer"
)

# Youth Organizations
youth_exact <- c(
  "frama", "shkm", "katolička mladež", "katolicka mladez",
  "mladi franjevci", "salezijanska mladež"
)
youth_contains <- c(
  "ministranti", "mladifra", "kaem", "studentska kapelanija",
  "sveučilišna kapelanija", "sveuclisna kapelanija",
  "pastoral mladih", "mladi katolici"
)

# Academic
academic_exact <- c(
  "hrvatsko katoličko sveučilište", "hrvatsko katolicko sveuciliste",
  "katolički bogoslovni fakultet", "katolicki bogoslovni fakultet",
  "filozofski fakultet družbe isusove", "teologija u rijeci"
)
academic_contains <- c("teologija", "bogoslovija", "katehetski")
academic_abbrev <- c("kbf", "hku", "ffrz")

# Individual Priests (title prefixes at START)
priest_prefixes <- c(
  "fra ", "don ", "vlč. ", "vlč.", "vlc. ", "vlc.",
  "msgr. ", "msgr.", "mons. ", "mons.",
  "o. ", "pater ", "p. ", "pr. ",
  "s. ", "sestra ", "m. ", "majka "
)
priest_hierarchy <- c(
  "biskup ", "nadbiskup ", "kardinal ",
  "mons. ", "preč. ", "prečasni "
)
priest_contains <- c("svećenik", "svecenik", "župnik", "zupnik")

# Lay Influencers
lay_devotional <- c(
  "vjera", "molitva", "molitve", "isus", "krist", "gospa", "marija",
  "hrana za dušu", "hrana za dusu", "dijete vjere",
  "riječ dana", "rijec dana", "riječ božja", "rijec bozja",
  "sveti", "svetac", "svetica", "evanđelje", "evandelje",
  "duhovnost", "duhovna", "duhovni", "biblija", "biblijski",
  "psalm", "blagoslov", "krunica", "rozarij", "lectio divina",
  "katolička obitelj", "katolicka obitelj", "katolicki",
  "božanske", "bozanske", "moćne", "mocne"
)
# Only exclude actual media domains, not devotional pages with "tv" in name
lay_exclude <- c(".hr", ".net", ".com", "portal", "vijesti",
                 "news", "radio", "agencija", "tjednik")

# -----------------------------------------------------------------------------
# CLASSIFICATION FUNCTION
# -----------------------------------------------------------------------------
classify_actor_v4 <- function(from_val, url_val = NA, platform_val = NA) {
  
  from_lower <- tolower(trimws(as.character(from_val)))
  url_lower <- tolower(ifelse(is.na(url_val), "", as.character(url_val)))
  platform_lower <- tolower(ifelse(is.na(platform_val), "", as.character(platform_val)))
  combined <- paste(from_lower, url_lower)
  
  # Helper function for pattern matching
  match_any <- function(patterns, text, fixed = TRUE) {
    any(sapply(patterns, function(p) grepl(p, text, fixed = fixed)))
  }
  
  # PRIORITY 1: Manual Overrides
  for (actor_type in names(manual_overrides)) {
    if (match_any(manual_overrides[[actor_type]], from_lower)) {
      return(actor_type)
    }
  }
  
  # PRIORITY 2: Secular Media Exclusions
  if (match_any(secular_exclusions, combined)) {
    return("Other")
  }
  
  # PRIORITY 3: Domain-based Classification
  if (nchar(url_lower) > 0) {
    for (actor_type in names(domain_patterns)) {
      if (match_any(domain_patterns[[actor_type]], url_lower)) {
        return(actor_type)
      }
    }
  }
  
  # PRIORITY 4a: Diocesan
  # Check for parish patterns - handle both with and without special characters
  is_parish <- grepl("^župa|^zupa|župi|zupi|- župa|- zupa", from_lower, ignore.case = TRUE) ||
               grepl("parish", from_lower, ignore.case = TRUE)
  
  if (match_any(diocesan_exact, from_lower) ||
      match_any(diocesan_contains, from_lower) ||
      is_parish) {
    return("Diocesan")
  }
  
  # PRIORITY 4b: Religious Orders
  if (match_any(orders_exact, from_lower) ||
      match_any(orders_contains, from_lower)) {
    return("Religious Orders")
  }
  for (abbr in orders_abbrev) {
    if (grepl(paste0("\\b", abbr, "\\b"), from_lower, ignore.case = TRUE)) {
      return("Religious Orders")
    }
  }
  
  # PRIORITY 4c: Charismatic Communities
  if (match_any(charismatic_exact, from_lower) ||
      match_any(charismatic_contains, from_lower)) {
    return("Charismatic Communities")
  }
  
  # PRIORITY 4d: Youth Organizations
  if (match_any(youth_exact, from_lower) ||
      match_any(youth_contains, from_lower)) {
    return("Youth Organizations")
  }
  
  # PRIORITY 4e: Academic
  if (match_any(academic_exact, from_lower) ||
      match_any(academic_contains, from_lower)) {
    return("Academic")
  }
  for (abbr in academic_abbrev) {
    if (grepl(paste0("\\b", abbr, "\\b"), from_lower, ignore.case = TRUE)) {
      return("Academic")
    }
  }
  
  # PRIORITY 4f: Individual Priests (prefix at START)
  for (prefix in priest_prefixes) {
    if (startsWith(from_lower, prefix)) return("Individual Priests")
  }
  for (title in priest_hierarchy) {
    if (startsWith(from_lower, title)) return("Individual Priests")
  }
  if (match_any(priest_contains, from_lower)) {
    return("Individual Priests")
  }
  
  # PRIORITY 5: Lay Influencers
  # Detect devotional pages - relaxed platform check since many FB pages don't have URL
  has_devotional <- match_any(lay_devotional, from_lower)
  has_media_indicator <- match_any(lay_exclude, from_lower)
  
  # Classify as Lay Influencer if has devotional keywords and no media indicators
  # Platform check relaxed: social media OR no dots in name (likely social handle)
  is_likely_social <- platform_lower %in% c("facebook", "instagram", "youtube", "twitter") ||
                      !grepl("\\.[a-z]{2,4}$", from_lower)  # No domain extension
  
  if (has_devotional && !has_media_indicator && is_likely_social) {
    return("Lay Influencers")
  }
  
  # PRIORITY 6: Default
  return("Other")
}

# -----------------------------------------------------------------------------
# APPLY CLASSIFICATION
# -----------------------------------------------------------------------------
dta[, ACTOR_TYPE := mapply(
  classify_actor_v4, 
  FROM, 
  if ("URL" %in% names(dta)) URL else NA,
  SOURCE_TYPE
)]
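
One caveat about the matching helper is worth illustrating. `match_any()` with `fixed = TRUE` performs plain substring matching, so a short override entry such as "ika" also matches inside longer names. The sources geopolitika.news, klikaj.hr and likaclub.eu, which are listed under Institutional Official in the top-sources table of Section 2.5, all contain that substring, which is probably why they land there (klikaj.hr is also on the secular exclusion list, but manual overrides are checked first). The sketch below reproduces the effect in isolation and contrasts it with the `\\b` word-boundary matching that the function already applies to order and academic abbreviations. `match_substring()` and `match_word()` are hypothetical helpers written for this illustration, not part of the original pipeline.

```r
# Substring matching, equivalent to what match_any() does with fixed = TRUE
match_substring <- function(pattern, text) {
  grepl(pattern, text, fixed = TRUE)
}

# Hypothetical alternative: require word boundaries around the abbreviation
match_word <- function(pattern, text) {
  grepl(paste0("\\b", pattern, "\\b"), text, ignore.case = TRUE)
}

sources <- c("ika", "ika.hkm.hr", "geopolitika.news", "klikaj.hr", "likaclub.eu")

comparison <- data.frame(
  source = sources,
  substring_match = match_substring("ika", sources),
  word_match = match_word("ika", sources)
)
print(comparison)
```

If this diagnosis holds, switching the short override entries ("ika", "hkr", "hkm", "hbk") to word-boundary matching would be one possible refinement. It would also change the counts reported for Institutional Official, so the tables would need to be regenerated.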

2.2 Classification Results

The table below shows the distribution of posts and engagement across actor types. This provides a first overview of who populates the Croatian Catholic digital space.

How to interpret this table:

  • Posts and % Posts indicate the volume of content produced by each actor type
  • Sources shows how many distinct accounts or websites belong to each category
  • Interactions and % Engage reveal the total audience engagement (likes, comments, shares) generated
  • Comparing volume share to engagement share reveals which actor types punch above or below their weight
# Calculate comprehensive summary
actor_summary <- dta[, .(
  Posts = .N,
  Sources = uniqueN(FROM),
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Mean_Interactions = mean(INTERACTIONS, na.rm = TRUE),
  Median_Interactions = median(INTERACTIONS, na.rm = TRUE)
), by = ACTOR_TYPE][order(-Posts)]

# Calculate percentages
total_posts <- sum(actor_summary$Posts)
total_interactions <- sum(actor_summary$Total_Interactions)
total_sources <- sum(actor_summary$Sources)

actor_summary[, `:=`(
  Posts_Pct = Posts / total_posts * 100,
  Sources_Pct = Sources / total_sources * 100,
  Interactions_Pct = Total_Interactions / total_interactions * 100
)]

# Display formatted table
actor_summary %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    `% Posts` = sprintf("%.1f%%", Posts_Pct),
    Sources = format(Sources, big.mark = ","),
    `% Sources` = sprintf("%.1f%%", Sources_Pct),
    Interactions = format(Total_Interactions, big.mark = ","),
    `% Engage` = sprintf("%.1f%%", Interactions_Pct),
    `Mean Int.` = round(Mean_Interactions, 1)
  ) %>%
  select(ACTOR_TYPE, Posts, `% Posts`, Sources, `% Sources`, 
         Interactions, `% Engage`, `Mean Int.`) %>%
  kable(col.names = c("Actor Type", "Posts", "% Posts", "Sources", 
                      "% Sources", "Interactions", "% Engage", "Mean Int."),
        caption = "Actor Type Classification Summary") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE) %>%
  row_spec(which(actor_summary$ACTOR_TYPE == "Other"), 
           background = "#f5f5f5", italic = TRUE)
Actor Type Classification Summary

| Actor Type | Posts | % Posts | Sources | % Sources | Interactions | % Engage | Mean Int. |
|---|---|---|---|---|---|---|---|
| Other | 468,712 | 77.0% | 16,072 | 97.8% | 44,600,737 | 72.0% | 99.1 |
| Institutional Official | 62,744 | 10.3% | 202 | 1.2% | 3,550,057 | 5.7% | 56.6 |
| Independent Media | 31,403 | 5.2% | 19 | 0.1% | 6,600,818 | 10.7% | 210.2 |
| Lay Influencers | 28,451 | 4.7% | 62 | 0.4% | 6,333,433 | 10.2% | 224.7 |
| Diocesan | 13,841 | 2.3% | 50 | 0.3% | 636,001 | 1.0% | 46.0 |
| Religious Orders | 1,720 | 0.3% | 13 | 0.1% | 37,390 | 0.1% | 21.7 |
| Charismatic Communities | 1,282 | 0.2% | 12 | 0.1% | 193,550 | 0.3% | 153.7 |
| Academic | 674 | 0.1% | 7 | 0.0% | 11,661 | 0.0% | 17.3 |
| Youth Organizations | 52 | 0.0% | 4 | 0.0% | 6,697 | 0.0% | 128.8 |

2.3 Actor Type Distribution Visualization

This chart compares each actor type's share of total content volume against its share of total engagement. When an actor type's engagement bar exceeds its volume bar, its content resonates more strongly with audiences on a per-post basis.

# Prepare data for visualization (exclude Other for cleaner view)
actor_viz <- actor_summary[ACTOR_TYPE != "Other"]

# Create comparison chart
actor_long <- actor_viz %>%
  select(ACTOR_TYPE, Posts_Pct, Interactions_Pct) %>%
  pivot_longer(cols = c(Posts_Pct, Interactions_Pct),
               names_to = "Metric", values_to = "Percentage") %>%
  mutate(Metric = ifelse(Metric == "Posts_Pct", "Volume (Posts)", "Engagement"))

ggplot(actor_long, aes(x = reorder(ACTOR_TYPE, Percentage), 
                        y = Percentage, fill = Metric)) +
  geom_col(position = "dodge", width = 0.7) +
  geom_text(aes(label = sprintf("%.1f%%", Percentage)),
            position = position_dodge(width = 0.7),
            hjust = -0.1, size = 3) +
  coord_flip() +
  scale_fill_manual(values = c("Volume (Posts)" = "#2c5f7c", 
                               "Engagement" = "#e07b39")) +
  scale_y_continuous(limits = c(0, max(actor_long$Percentage) * 1.2),
                     labels = function(x) paste0(x, "%")) +
  labs(
    title = "Actor Type Distribution: Volume vs Engagement",
    subtitle = "Excluding 'Other' category for clarity",
    x = NULL,
    y = "Share (%)",
    fill = NULL
  ) +
  theme(legend.position = "top")

2.4 Engagement Efficiency by Actor Type

The efficiency index quantifies how effectively each actor type converts content production into audience engagement. It is calculated as the ratio of engagement share to volume share, with both shares taken over the full corpus. An index above 1.0 means the actor type generates more engagement than its share of posts would predict, while an index below 1.0 indicates underperformance relative to volume.

Interpretation guide:

  • High performers (>1.5): Content resonates strongly; audiences actively engage
  • Above average (1.0-1.5): Solid engagement relative to output
  • Average (0.7-1.0): Engagement roughly proportional to volume
  • Below average (<0.7): High volume but limited audience response
# Calculate efficiency index
actor_efficiency <- actor_summary[ACTOR_TYPE != "Other"] %>%
  mutate(
    Efficiency_Index = Interactions_Pct / Posts_Pct,
    Performance = case_when(
      Efficiency_Index > 1.5 ~ "High performer",
      Efficiency_Index > 1.0 ~ "Above average",
      Efficiency_Index > 0.7 ~ "Average",
      TRUE ~ "Below average"
    )
  ) %>%
  arrange(desc(Efficiency_Index))

ggplot(actor_efficiency, aes(x = reorder(ACTOR_TYPE, Efficiency_Index),
                              y = Efficiency_Index, fill = Performance)) +
  geom_col(width = 0.7) +
  geom_hline(yintercept = 1, linetype = "dashed", color = "red", linewidth = 1) +
  geom_text(aes(label = sprintf("%.2f", Efficiency_Index)), 
            hjust = -0.1, size = 3.5) +
  coord_flip() +
  scale_fill_manual(values = c(
    "High performer" = "#2d6a4f",
    "Above average" = "#52b788",
    "Average" = "#95d5b2",
    "Below average" = "#d8f3dc"
  )) +
  scale_y_continuous(limits = c(0, max(actor_efficiency$Efficiency_Index) * 1.15)) +
  labs(
    title = "Engagement Efficiency by Actor Type",
    subtitle = "Efficiency Index = Engagement Share / Volume Share (>1 = overperforming)",
    x = NULL,
    y = "Efficiency Index",
    fill = "Performance",
    caption = "Red dashed line indicates neutral efficiency (1.0)"
  ) +
  theme(legend.position = "right")
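
As a worked check of the index definition, the calculation can be repeated by hand from the counts printed in the Actor Type Classification Summary table. The sketch below is plain arithmetic on those published totals rather than a rerun of the pipeline, and the data frame `actor_counts` is introduced here only for illustration. On these totals, Independent Media and Lay Influencers come out at roughly 2, while Institutional Official comes out at roughly 0.56.

```r
# Counts copied from the "Actor Type Classification Summary" table
actor_counts <- data.frame(
  actor_type = c("Other", "Institutional Official", "Independent Media",
                 "Lay Influencers", "Diocesan", "Religious Orders",
                 "Charismatic Communities", "Academic", "Youth Organizations"),
  posts = c(468712, 62744, 31403, 28451, 13841, 1720, 1282, 674, 52),
  interactions = c(44600737, 3550057, 6600818, 6333433, 636001,
                   37390, 193550, 11661, 6697)
)

# Shares are taken over the whole corpus, "Other" included
actor_counts$posts_share <- actor_counts$posts / sum(actor_counts$posts)
actor_counts$engage_share <- actor_counts$interactions / sum(actor_counts$interactions)
actor_counts$efficiency <- actor_counts$engage_share / actor_counts$posts_share

# Same thresholds as the case_when() call: >1.5, >1.0, >0.7, otherwise below
actor_counts$performance <- cut(
  actor_counts$efficiency,
  breaks = c(-Inf, 0.7, 1.0, 1.5, Inf),
  labels = c("Below average", "Average", "Above average", "High performer")
)

print(actor_counts[order(-actor_counts$efficiency),
                   c("actor_type", "efficiency", "performance")], digits = 3)
```

Because both shares use corpus-wide denominators, the index equals an actor type's interactions per post divided by the corpus-wide interactions per post. Dropping "Other" from the chart therefore does not change the values.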

2.5 Top Sources by Actor Type

This table displays the five most engaged sources within each classified actor type, providing concrete examples of who drives engagement in each category.

# Get top 5 sources per actor type
top_sources_by_type <- dta[ACTOR_TYPE != "Other", .(
  Posts = .N,
  Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Platforms = uniqueN(SOURCE_TYPE)
), by = .(ACTOR_TYPE, FROM)][order(ACTOR_TYPE, -Interactions)]

top_5_each <- top_sources_by_type[, head(.SD, 5), by = ACTOR_TYPE]

top_5_each %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    Interactions = format(Interactions, big.mark = ",")
  ) %>%
  kable(col.names = c("Actor Type", "Source", "Posts", "Interactions", "Platforms"),
        caption = "Top 5 Sources by Actor Type (ranked by engagement)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  collapse_rows(columns = 1, valign = "top") %>%
  scroll_box(height = "500px")
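Top 5 Sources by Actor Type (ranked by engagement)

| Actor Type | Source | Posts | Interactions | Platforms |
|---|---|---|---|---|
| Academic | Hrvatsko katoličko sveučilište (Universitas Studiorum Catholica Croatica) | 190 | 7,392 | 1 |
| Academic | unicath.hr | 461 | 3,589 | 1 |
| Academic | adekvatnateologija.com | 7 | 617 | 1 |
| Academic | ffrz.hr | 2 | 32 | 1 |
| Academic | Hrvatsko katoličko sveučilište | 11 | 30 | 1 |
| Charismatic Communities | Duhovna Obnova | 325 | 58,960 | 1 |
| Charismatic Communities | Muževni budite | 269 | 36,576 | 1 |
| Charismatic Communities | Molitvena Snaga | 61 | 31,552 | 1 |
| Charismatic Communities | Dom Molitve Slavonski Brod | 257 | 27,316 | 1 |
| Charismatic Communities | Božja pobjeda | 20 | 18,387 | 1 |
| Diocesan | Zagrebačka nadbiskupija | 2,023 | 241,657 | 3 |
| Diocesan | Sisačka biskupija | 482 | 87,832 | 3 |
| Diocesan | zg-nadbiskupija.hr | 888 | 82,621 | 1 |
| Diocesan | Župa Šurkovac | 442 | 70,885 | 1 |
| Diocesan | Sveta Mati Slobode - Župa Duha Svetoga | 344 | 36,457 | 1 |
| Independent Media | novizivot.net | 6,083 | 3,351,089 | 1 |
| Independent Media | Laudato | 2,164 | 926,523 | 2 |
| Independent Media | bitno.net | 3,445 | 763,737 | 1 |
| Independent Media | laudato.hr | 10,727 | 703,203 | 1 |
| Independent Media | Bitno.net | 3,290 | 512,818 | 2 |
| Institutional Official | hkm.hr | 52,474 | 3,077,108 | 1 |
| Institutional Official | Radio Marija Hrvatska | 2,704 | 165,224 | 1 |
| Institutional Official | geopolitika.news | 377 | 83,616 | 2 |
| Institutional Official | klikaj.hr | 1,155 | 45,491 | 1 |
| Institutional Official | likaclub.eu | 637 | 25,943 | 1 |
| Lay Influencers | pulherissimus | 6,007 | 1,608,836 | 1 |
| Lay Influencers | Добровољци | 348 | 790,860 | 1 |
| Lay Influencers | Miletić Marin | 231 | 645,342 | 2 |
| Lay Influencers | Pod Smokvom | 1,517 | 588,094 | 1 |
| Lay Influencers | Hrana za dušu | 3,447 | 461,396 | 1 |
| Religious Orders | isusovci.hr | 196 | 15,754 | 1 |
| Religious Orders | ofm.hr | 673 | 15,178 | 1 |
| Religious Orders | karmel.hr | 457 | 4,578 | 1 |
| Religious Orders | franjevcitrecoredci.hr | 2 | 866 | 1 |
| Religious Orders | ofmconv.hr | 327 | 809 | 1 |
| Youth Organizations | Susret hrvatske katoličke mladeži Požega 2026. | 37 | 5,082 | 1 |
| Youth Organizations | Susret hrvatske katoličke mladeži Bjelovar 2022. | 12 | 1,500 | 1 |
| Youth Organizations | Susret hrvatske katoličke mladeži Zagreb 2020. | 2 | 115 | 1 |
| Youth Organizations | frama-portal.net | 1 | 0 | 1 |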

2.6 Actor Type by Platform

This heatmap reveals platform preferences across actor types. Each cell shows what percentage of an actor type's content appears on each platform, so the cells for each actor type sum to 100%. This helps identify which platforms different types of Catholic communicators favor.

# Cross-tabulation of actor types and platforms
actor_platform <- dta[ACTOR_TYPE != "Other", .(Posts = .N), 
                       by = .(ACTOR_TYPE, SOURCE_TYPE)]

# Calculate percentages within each actor type
actor_platform[, Pct := Posts / sum(Posts) * 100, by = ACTOR_TYPE]

# Create heatmap
ggplot(actor_platform, aes(x = SOURCE_TYPE, y = ACTOR_TYPE, fill = Pct)) +
  geom_tile(color = "white", linewidth = 0.5) +
  geom_text(aes(label = sprintf("%.0f%%", Pct)), size = 3, color = "white") +
  scale_fill_gradient(low = "#deebf7", high = "#08519c",
                      name = "% of Posts") +
  labs(
    title = "Platform Distribution by Actor Type",
    subtitle = "Percentage of each actor type's posts by platform",
    x = "Platform",
    y = "Actor Type"
  ) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid = element_blank()
  )
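
The within-actor normalisation behind the heatmap can be shown on a small example. The sketch below uses base R's `prop.table()` on a matrix of made-up counts; the actor labels and numbers are hypothetical and are not corpus figures.

```r
# Hypothetical post counts, for illustration only (not corpus figures)
toy_counts <- matrix(
  c(80, 15,  5,
    10, 70, 20),
  nrow = 2, byrow = TRUE,
  dimnames = list(actor_type = c("Type A", "Type B"),
                  platform = c("web", "facebook", "youtube"))
)

# Normalise within each actor type (row), mirroring the by = ACTOR_TYPE step
row_pct <- prop.table(toy_counts, margin = 1) * 100
print(round(row_pct, 1))
```

Because each row sums to 100, a dark cell means that an actor type concentrates its output on that platform. It says nothing about how large the platform is in the corpus overall.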

2.7 Classification Quality Diagnostics

Quality assurance is essential for any classification system. The diagnostics below help identify potential misclassifications that may require manual review or pattern refinement.

# High-engagement sources in Other category
other_review <- dta[ACTOR_TYPE == "Other", .(
  Posts = .N,
  Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Platforms = paste(unique(SOURCE_TYPE), collapse = ", ")
), by = FROM][order(-Interactions)][1:20]

other_review %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    Interactions = format(Interactions, big.mark = ",")
  ) %>%
  kable(col.names = c("Source", "Posts", "Interactions", "Platforms"),
        caption = "Top 20 'Other' Sources by Engagement (review for potential misclassification)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)
Top 20 'Other' Sources by Engagement (review for potential misclassification)

| Source | Posts | Interactions | Platforms |
|---|---|---|---|
| index.hr | 8,095 | 3,312,573 | web, comment, forum |
| dnevno.hr | 4,505 | 3,278,480 | web, comment |
| 24sata.hr | 6,459 | 1,695,211 | web |
| vecernji.hr | 9,242 | 1,598,685 | web, comment |
| jutarnji.hr | 7,703 | 1,595,040 | web, comment, instagram |
| slobodnadalmacija.hr | 10,823 | 1,458,487 | web, comment |
| telegram.hr | 2,902 | 1,164,087 | web, comment, instagram |
| dnevnik.hr | 4,383 | 929,309 | web |
| net.hr | 6,917 | 758,857 | web, comment |
| anonymous_user | 2,931 | 682,962 | instagram |
| narod.hr | 5,434 | 614,885 | web |
| campaign-archive.com | 3 | 499,027 | web |
| dalmatinskiportal.hr | 3,378 | 497,593 | web, comment |
| novilist.hr | 7,501 | 440,049 | web, comment |
| direktno.hr | 5,339 | 418,717 | web, comment |
| maxportal.hr | 1,608 | 395,869 | web, comment |
| dalmacijadanas.hr | 3,295 | 385,808 | web |
| zagreb.info | 2,222 | 363,521 | web, comment |
| caritas.hr | 153 | 332,285 | web |
| mnovine.hr | 1,574 | 309,527 | web |
# Check for religious keywords in Other category
religious_keywords <- c(
  "biskupij", "nadbiskupij", "župa", "zupa", "crkv",
  "franjev", "isusov", "dominikan", "salezijan",
  "kršćan", "krscan", "katolič", "katolic",
  "molitv", "duhovn", "svećen", "svecen"
)

other_sources <- unique(dta[ACTOR_TYPE == "Other"]$FROM)

flagged_sources <- other_sources[sapply(other_sources, function(x) {
  any(sapply(religious_keywords, function(k) grepl(k, tolower(x))))
})]

if (length(flagged_sources) > 0) {
  flagged_stats <- dta[FROM %in% flagged_sources & ACTOR_TYPE == "Other", .(
    Posts = .N,
    Interactions = sum(INTERACTIONS, na.rm = TRUE)
  ), by = FROM][order(-Interactions)]
  
  flagged_stats %>%
    mutate(
      Posts = format(Posts, big.mark = ","),
      Interactions = format(Interactions, big.mark = ",")
    ) %>%
    head(15) %>%
    kable(col.names = c("Source", "Posts", "Interactions"),
          caption = "Sources with Religious Keywords Classified as 'Other' (review recommended)") %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                  full_width = FALSE)
}
Sources with Religious Keywords Classified as 'Other' (review recommended)
Source Posts Interactions
Virovitičko-podravska županija 121 24,420
zupanjac.net 144 20,137
Sisačko-moslavačka županija 38 4,070
Kršćanska proročka crkva Isus je Kralj! 875 3,679
Crkva Svemogućeg Boga 67 2,897
zg-nadbiskupija.hr 22 2,565
Domovinski pokret Splitsko-dalmatinska županija 6 2,002
Molitvena zajednica Eho 10 1,952
Zadarska županija 45 1,928
OMNIA DEO - Molitve 25 1,665
biskupija-varazdinska.hr 39 1,660
Molitve za svaki dan 48 1,607
🧿Tarot duhovnog buđenja🧿 4 1,456
Župa svetog Petra apostola - Zadar (Ploča) 63 1,435
Kršćanska zajednica Šibenik 36 1,337
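
Several flagged rows above are county pages ("županija") or the portal zupanjac.net rather than parishes, because the stem "župa"/"zupa" is also a prefix of those words. A minimal sketch of one possible refinement follows; the pattern and the `candidates` vector are illustrative assumptions, not part of the project's classifier. It uses a negative lookahead so the parish stem only matches when it is not followed by "n".

```r
# Hypothetical refinement: match "župa"/"zupa" only when NOT followed by "n",
# which excludes "županija" (county) and "zupanjac".
parish_pattern <- "(ž|Ž|z|Z)upa(?!n)"

# Three source names copied from the flagged table above
candidates <- c(
  "Virovitičko-podravska županija",
  "zupanjac.net",
  "Župa svetog Petra apostola - Zadar (Ploča)"
)

# perl = TRUE is required for the lookahead syntax
is_parish <- grepl(parish_pattern, candidates, perl = TRUE)
names(is_parish) <- candidates
is_parish
```

Only the third name is kept as a parish candidate. The lookahead is a heuristic: it would need checking against the full source list before replacing the existing keywords.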

The classification uses a hierarchical priority system ensuring consistent results:

  1. Manual overrides guarantee correct classification of known important sources
  2. Secular exclusions prevent false positives from mainstream news coverage
  3. Domain matching provides reliable identification for web sources
  4. Pattern matching handles social media and ambiguous sources
  5. Lay Influencers captures devotional social media pages

The Other category contains secular media, unidentified sources, and general public discourse. High-engagement sources in Other warrant periodic review, since they may point to patterns that need refinement.
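
The five-step priority order described above can be sketched as a function that returns on the first rule that matches. This is a simplified illustration in base R: the rule tables, the function name `classify_source`, and every entry in them are hypothetical stand-ins, not the project's actual lists.

```r
# Hypothetical rule tables; the real project lists are not shown in this document.
manual_overrides <- c("hkm.hr" = "Institutional Official")
secular_domains  <- c("index.hr", "24sata.hr")
domain_rules     <- c("bitno.net" = "Independent Media")
pattern_rules    <- c("biskupij" = "Diocesan")
devotional_hints <- c("molitv")

classify_source <- function(src) {
  key <- tolower(src)
  # 1. Manual overrides take precedence over everything else
  if (key %in% names(manual_overrides)) return(manual_overrides[[key]])
  # 2. Secular exclusions stop later keyword rules from firing
  if (key %in% secular_domains) return("Other")
  # 3. Exact domain matches for web sources
  if (key %in% names(domain_rules)) return(domain_rules[[key]])
  # 4. Substring patterns for social media and ambiguous names
  for (p in names(pattern_rules)) {
    if (grepl(p, key, fixed = TRUE)) return(pattern_rules[[p]])
  }
  # 5. Devotional pages fall back to Lay Influencers
  hits <- vapply(devotional_hints, grepl, logical(1), x = key, fixed = TRUE)
  if (any(hits)) return("Lay Influencers")
  # Nothing matched
  "Other"
}

vapply(c("hkm.hr", "index.hr", "biskupija-varazdinska.hr", "Molitve za svaki dan"),
       classify_source, character(1))
```

Because each step returns immediately, the order of the `if` blocks is the priority order; moving the secular exclusion below the pattern rules would change the result for secular domains that happen to contain a religious keyword.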

3 Analysis 1.1: Platform Distribution

Core question: Where does Croatian Catholic digital content live?

Understanding platform distribution is fundamental to grasping the structure of any digital media ecosystem. Different platforms have distinct affordances: web allows long-form content and search discoverability, Facebook enables community building and sharing, Instagram favors visual storytelling, YouTube supports video content, and Twitter facilitates rapid information exchange. How content distributes across these platforms reveals strategic choices and audience preferences within the Croatian Catholic digital space.

Show code
# Calculate platform statistics
platform_stats <- dta[, .(
  Posts = .N,
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Total_Reach = sum(REACH, na.rm = TRUE),
  Unique_Sources = uniqueN(FROM),
  Mean_Interactions = mean(INTERACTIONS, na.rm = TRUE),
  Median_Interactions = median(INTERACTIONS, na.rm = TRUE)
), by = SOURCE_TYPE][order(-Posts)]

# Calculate shares
platform_stats[, `:=`(
  Volume_Share = Posts / sum(Posts) * 100,
  Engagement_Share = Total_Interactions / sum(Total_Interactions) * 100
)]

3.1 Volume vs Engagement by Platform

This comparison reveals the gap between where content is produced and where engagement occurs. Platforms with higher engagement share than volume share deliver better returns per post.

Show code
# Prepare data for visualization
platform_long <- platform_stats %>%
  select(SOURCE_TYPE, Volume_Share, Engagement_Share) %>%
  pivot_longer(cols = c(Volume_Share, Engagement_Share),
               names_to = "Metric", values_to = "Share") %>%
  mutate(Metric = ifelse(Metric == "Volume_Share", "Volume (Posts)", "Engagement (Interactions)"))

# Create grouped bar chart
ggplot(platform_long, aes(x = reorder(SOURCE_TYPE, Share), y = Share, fill = Metric)) +
  geom_col(position = "dodge", width = 0.7) +
  geom_text(aes(label = sprintf("%.1f%%", Share)), 
            position = position_dodge(width = 0.7), 
            hjust = -0.1, size = 3) +
  coord_flip() +
  scale_fill_manual(values = c("Volume (Posts)" = "#2c5f7c", "Engagement (Interactions)" = "#e07b39")) +
  scale_y_continuous(limits = c(0, max(platform_long$Share) * 1.15), 
                     labels = function(x) paste0(x, "%")) +
  labs(
    title = "Platform Distribution: Volume vs Engagement",
    subtitle = "Share of total posts and interactions by platform",
    x = NULL,
    y = "Share (%)",
    fill = NULL
  ) +
  theme(legend.position = "top")

3.2 Platform Statistics Table

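This table provides detailed metrics for each platform, including post counts, engagement totals, source diversity, and average interactions per post. For reddit, forum, and comment the mean is NaN because no interaction counts are recorded for any post on those platforms, so their totals of 0 reflect missing data rather than an absence of engagement.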

Show code
platform_stats %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    Total_Interactions = format(Total_Interactions, big.mark = ","),
    Unique_Sources = format(Unique_Sources, big.mark = ","),
    Mean_Interactions = round(Mean_Interactions, 1),
    Volume_Share = sprintf("%.1f%%", Volume_Share),
    Engagement_Share = sprintf("%.1f%%", Engagement_Share)
  ) %>%
  select(SOURCE_TYPE, Posts, Volume_Share, Total_Interactions, Engagement_Share, 
         Unique_Sources, Mean_Interactions) %>%
  kable(col.names = c("Platform", "Posts", "Volume %", "Interactions", 
                      "Engagement %", "Sources", "Mean Interactions")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Platform Posts Volume % Interactions Engagement % Sources Mean Interactions
web 447,908 73.6% 41,622,547 67.2% 3,159 92.9
facebook 68,771 11.3% 9,817,934 15.8% 2,336 142.8
youtube 65,158 10.7% 9,505,576 15.3% 4,843 148.4
reddit 8,085 1.3% 0 0.0% 3,659 NaN
forum 6,140 1.0% 0 0.0% 9 NaN
twitter 5,935 1.0% 69,690 0.1% 2,498 11.7
comment 3,630 0.6% 0 0.0% 23 NaN
instagram 3,252 0.5% 954,597 1.5% 100 293.5

3.3 Engagement Efficiency by Platform

The efficiency index measures how well each platform converts content into engagement. This metric helps identify which platforms offer the best return on content investment for Catholic communicators.

Interpretation:

  • Index > 1: Platform generates disproportionately high engagement relative to content volume
  • Index = 1: Engagement is proportional to volume (neutral efficiency)
  • Index < 1: Platform underperforms in converting content to engagement
Show code
# Calculate engagement per post
platform_efficiency <- platform_stats %>%
  mutate(
    Engagement_per_Post = Total_Interactions / Posts,
    Efficiency_Index = (Engagement_Share / Volume_Share)
  ) %>%
  arrange(desc(Efficiency_Index))

ggplot(platform_efficiency, aes(x = reorder(SOURCE_TYPE, Efficiency_Index), 
                                 y = Efficiency_Index, fill = SOURCE_TYPE)) +
  geom_col(width = 0.7) +
  geom_hline(yintercept = 1, linetype = "dashed", color = "red", linewidth = 1) +
  geom_text(aes(label = sprintf("%.2f", Efficiency_Index)), hjust = -0.1, size = 3.5) +
  coord_flip() +
  scale_fill_manual(values = platform_colors) +
  scale_y_continuous(limits = c(0, max(platform_efficiency$Efficiency_Index) * 1.15)) +
  labs(
    title = "Platform Engagement Efficiency",
    subtitle = "Ratio of engagement share to volume share (>1 means overperforming)",
    x = NULL,
    y = "Efficiency Index (Engagement Share / Volume Share)",
    caption = "Red dashed line = neutral efficiency (1.0)"
  ) +
  theme(legend.position = "none")
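
As a check on the index, it can be recomputed directly from the post and interaction counts in the platform statistics table. The sketch below copies those counts by hand into two named vectors (the vector names are illustrative); it does not touch the original data.

```r
# Post and interaction counts copied from the platform statistics table
posts <- c(web = 447908, facebook = 68771, youtube = 65158, reddit = 8085,
           forum = 6140, twitter = 5935, comment = 3630, instagram = 3252)
interactions <- c(web = 41622547, facebook = 9817934, youtube = 9505576,
                  reddit = 0, forum = 0, twitter = 69690, comment = 0,
                  instagram = 954597)

volume_share     <- posts / sum(posts) * 100
engagement_share <- interactions / sum(interactions) * 100

# Efficiency index = engagement share / volume share
efficiency_index <- engagement_share / volume_share
round(sort(efficiency_index, decreasing = TRUE), 2)
```

Algebraically the index equals a platform's interactions per post divided by the corpus-wide interactions per post, so it is a rescaled average rather than an independent measure. The zeros for reddit, forum, and comment should be read as missing interaction data, in line with the NaN means in the table.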

Key Finding

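Platforms with an efficiency index above 1 generate disproportionately high engagement relative to their volume. From the platform statistics table, Instagram (about 2.9), YouTube (about 1.4), and Facebook (about 1.4) exceed 1, while web (about 0.9) and Twitter (about 0.1) fall below it. An index above 1 indicates higher audience resonance per post and suggests strategic value for Catholic communicators seeking to maximize impact.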

4 Analysis 1.2: Visibility Stratification

Core question: How is visibility distributed among actors? Do a few sources dominate?

Digital media ecosystems typically exhibit strong winner-take-all dynamics, where a small number of actors capture the majority of audience attention. This pattern, often described as a power-law distribution, has significant implications for the diversity and pluralism of public discourse. If visibility is highly concentrated, a few dominant voices may shape the entire conversation, while thousands of smaller actors remain effectively invisible.

We examine this question using three complementary approaches: concentration ratios (CR), the Gini coefficient, and Lorenz curve visualization. Together, these metrics quantify the degree of inequality in the distribution of visibility within the Croatian Catholic digital space.

Show code
# Aggregate by source
source_stats <- dta[, .(
  Posts = .N,
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Total_Reach = sum(REACH, na.rm = TRUE),
  Mean_Followers = mean(FOLLOWERS_COUNT, na.rm = TRUE),
  Platforms = uniqueN(SOURCE_TYPE),
  Actor_Type = first(ACTOR_TYPE)
), by = FROM][order(-Total_Interactions)]

# Calculate cumulative shares for Lorenz curve
source_stats[, `:=`(
  Rank = .I,
  Cumulative_Sources = .I / .N * 100,
  Cumulative_Interactions = cumsum(Total_Interactions) / sum(Total_Interactions) * 100
)]

4.1 Top 20 Sources by Engagement

This table identifies the most visible actors in the Croatian Catholic digital space, ranked by total engagement. The Share column indicates what proportion of all engagement each source captures.

Show code
source_stats[1:20] %>%
  mutate(
    Share = sprintf("%.2f%%", Total_Interactions / sum(source_stats$Total_Interactions) * 100),
    Posts = format(Posts, big.mark = ","),
    Total_Interactions = format(Total_Interactions, big.mark = ",")
  ) %>%
  select(Rank, FROM, Actor_Type, Posts, Total_Interactions, Share, Platforms) %>%
  kable(col.names = c("Rank", "Source", "Actor Type", "Posts", "Total Interactions", "Share %", "Platforms")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  scroll_box(height = "500px")
Rank Source Actor Type Posts Total Interactions Share % Platforms
1 novizivot.net Independent Media 6,083 3,351,089 5.41% 1
2 index.hr Other 8,095 3,312,573 5.35% 3
3 dnevno.hr Other 4,505 3,278,480 5.29% 2
4 hkm.hr Institutional Official 52,474 3,077,108 4.97% 1
5 24sata.hr Other 6,459 1,695,211 2.74% 1
6 pulherissimus Lay Influencers 6,007 1,608,836 2.60% 1
7 vecernji.hr Other 9,242 1,598,685 2.58% 2
8 jutarnji.hr Other 7,703 1,595,040 2.57% 3
9 slobodnadalmacija.hr Other 10,823 1,458,487 2.35% 2
10 telegram.hr Other 2,902 1,164,087 1.88% 3
11 dnevnik.hr Other 4,383 929,309 1.50% 1
12 Laudato Independent Media 2,164 926,523 1.50% 2
13 Добровољци Lay Influencers 348 790,860 1.28% 1
14 bitno.net Independent Media 3,445 763,737 1.23% 1
15 net.hr Other 6,917 758,857 1.22% 2
16 laudato.hr Independent Media 10,727 703,203 1.13% 1
17 anonymous_user Other 2,931 682,962 1.10% 1
18 Miletić Marin Lay Influencers 231 645,342 1.04% 2
19 narod.hr Other 5,434 614,885 0.99% 1
20 Pod Smokvom Lay Influencers 1,517 588,094 0.95% 1

4.2 Concentration Ratios

Concentration ratios (CR) measure what share of total engagement is captured by the top N sources. These metrics are commonly used in industrial organization economics to assess market concentration, and apply equally well to attention markets.

Interpretation guide:

  • CR5 > 50%: High concentration; top 5 sources dominate the space
  • CR10 > 70%: Very high concentration; limited diversity of visible voices
  • CR20 > 80%: Extreme concentration; long tail of nearly invisible actors
Show code
total_interactions <- sum(source_stats$Total_Interactions)
n_sources_total <- nrow(source_stats)

# Calculate concentration ratios
cr1 <- sum(source_stats[1:1]$Total_Interactions) / total_interactions * 100
cr5 <- sum(source_stats[1:5]$Total_Interactions) / total_interactions * 100
cr10 <- sum(source_stats[1:10]$Total_Interactions) / total_interactions * 100
cr20 <- sum(source_stats[1:20]$Total_Interactions) / total_interactions * 100
cr50 <- sum(source_stats[1:50]$Total_Interactions) / total_interactions * 100

# Gini coefficient
gini_coef <- ineq(source_stats$Total_Interactions, type = "Gini")

tibble(
  Metric = c("CR1 (Top 1 source)", "CR5 (Top 5 sources)", "CR10 (Top 10 sources)", 
             "CR20 (Top 20 sources)", "CR50 (Top 50 sources)", "Gini Coefficient",
             "Total Sources"),
  Value = c(
    sprintf("%.1f%%", cr1),
    sprintf("%.1f%%", cr5),
    sprintf("%.1f%%", cr10),
    sprintf("%.1f%%", cr20),
    sprintf("%.1f%%", cr50),
    sprintf("%.3f", gini_coef),
    format(n_sources_total, big.mark = ",")
  )
) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Metric Value
CR1 (Top 1 source) 5.4%
CR5 (Top 5 sources) 23.7%
CR10 (Top 10 sources) 35.7%
CR20 (Top 20 sources) 47.7%
CR50 (Top 50 sources) 62.7%
Gini Coefficient 0.982
Total Sources 16,426
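
The repeated CR calculations above follow one pattern, which can be sketched as a small helper. The function name `concentration_ratio` and the toy vector are illustrative assumptions; the guard on `k` matters because indexing `1:k` past the end of a vector yields NA values.

```r
# Share (in %) of the total held by the k largest values
concentration_ratio <- function(x, k) {
  x <- sort(x[!is.na(x)], decreasing = TRUE)
  k <- min(k, length(x))   # guard: k may exceed the number of sources
  sum(x[seq_len(k)]) / sum(x) * 100
}

toy <- c(50, 20, 10, 10, 5, 5)   # made-up engagement totals summing to 100
concentration_ratio(toy, 1)      # 50
concentration_ratio(toy, 3)      # 80
concentration_ratio(toy, 10)     # 100, since k is capped at 6 sources
```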

4.3 Lorenz Curve of Engagement Inequality

The Lorenz curve provides a visual representation of inequality. The x-axis shows the cumulative percentage of sources and the y-axis shows the cumulative percentage of total engagement. Perfect equality would follow the diagonal line (each source contributes equally). A conventional Lorenz curve ranks sources from lowest to highest engagement and bows below the diagonal; the plot here ranks sources from highest to lowest engagement, so the curve bows above the diagonal instead. In both cases, greater deviation from the diagonal represents more concentration.

The Gini coefficient summarizes this in a single number: it equals the area between the Lorenz curve and the equality line, divided by the total area under the equality line. Values range from 0 (perfect equality) to 1 (one source captures everything).

Show code
ggplot(source_stats, aes(x = Cumulative_Sources, y = Cumulative_Interactions)) +
  geom_line(color = "#2c5f7c", linewidth = 1.2) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed", color = "red") +
  geom_ribbon(aes(ymin = Cumulative_Sources, ymax = Cumulative_Interactions), 
              fill = "#2c5f7c", alpha = 0.3) +
  annotate("text", x = 70, y = 30, 
           label = paste0("Gini = ", round(gini_coef, 3)), 
           size = 5, fontface = "bold") +
  annotate("text", x = 20, y = 80,
           label = paste0("Top 10% of sources\ncapture ", 
                          round(source_stats[Rank == round(n_sources_total * 0.1)]$Cumulative_Interactions, 1),
                          "% of engagement"),
           size = 4) +
  scale_x_continuous(labels = function(x) paste0(x, "%")) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Lorenz Curve: Engagement Inequality Among Sources",
    subtitle = "Deviation from diagonal indicates concentration of visibility",
    x = "Cumulative % of Sources (ranked by engagement)",
    y = "Cumulative % of Total Engagement",
    caption = "Red dashed line = perfect equality"
  )
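
To make the Gini definition concrete, the coefficient can be computed by hand on a toy vector. The sketch below uses the standard rank-weighted formula without a small-sample correction; the function name `gini_manual` is an illustrative assumption, and the result is expected, though not verified here, to agree with the default of `ineq(x, type = "Gini")`.

```r
# Gini coefficient via the rank-weighted sum over ascending-sorted values
gini_manual <- function(x) {
  x <- sort(x[!is.na(x)])   # ascending order
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

gini_manual(c(5, 5, 5, 5))    # 0: every source has the same engagement
gini_manual(c(0, 0, 0, 10))   # 0.75: one of four sources holds everything
```

With n sources the uncorrected maximum is (n - 1) / n, which is why a corpus with thousands of sources, many of them with little or no engagement, can produce values very close to 1.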

4.4 Log-Log Rank Plot (Power Law Test)

Many natural and social phenomena follow power-law distributions, where a quantity decreases as a power of rank rather than exponentially. If engagement in the Croatian Catholic digital space follows this pattern, a log-log plot of rank versus engagement should approximate a straight line.

Interpretation:

  • A linear relationship on the log-log scale is consistent with power-law behavior, although it does not prove it
  • The slope indicates how steeply engagement drops off with rank (steeper = more concentrated)
  • High R-squared values (>0.9) indicate that the straight-line fit describes the data well
Show code
# Filter to sources with positive interactions
source_positive <- source_stats[Total_Interactions > 0]

# Fit linear model on log-log scale
log_model <- lm(log10(Total_Interactions) ~ log10(Rank), data = source_positive)
slope <- coef(log_model)[2]
r_squared <- summary(log_model)$r.squared

ggplot(source_positive, aes(x = Rank, y = Total_Interactions)) +
  geom_point(alpha = 0.5, color = "#2c5f7c") +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  scale_x_log10(labels = comma) +
  scale_y_log10(labels = comma) +
  annotate("text", x = 10, y = min(source_positive$Total_Interactions) * 10,
           label = paste0("Slope = ", round(slope, 2), "\nR² = ", round(r_squared, 3)),
           hjust = 0, size = 4, fontface = "bold") +
  labs(
    title = "Rank-Engagement Distribution (Log-Log Scale)",
    subtitle = "Linear relationship suggests power law distribution",
    x = "Rank (log scale)",
    y = "Total Interactions (log scale)",
    caption = "Red line = linear fit on log-log scale"
  )
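
The link between a power law and a straight line on the log-log scale can be checked on synthetic data. In this sketch the exponent `alpha` and the scale constant are arbitrary values chosen for illustration; with noise-free data the fitted slope recovers the negative exponent.

```r
rank       <- 1:1000
alpha      <- 1.2                      # assumed exponent, for illustration only
engagement <- 1e6 * rank^(-alpha)      # exact power law, no noise

# log10(engagement) = 6 - alpha * log10(rank), so the slope is -alpha
fit <- lm(log10(engagement) ~ log10(rank))
slope_hat <- coef(fit)[["log10(rank)"]]
slope_hat
```

Real engagement data are noisy and often deviate from the line at the head and tail of the ranking, so the slope and R² from the corpus fit are descriptive summaries rather than a formal test.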

Key Finding on Stratification

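The Gini coefficient of 0.982 indicates extreme inequality across the 16,426 sources: most sources attract little or no engagement. The concentration ratios are more moderate, with CR10 at 35.7% and CR20 at 47.7%, both below the thresholds in the interpretation guide, so no handful of sources controls the space outright. The Croatian Catholic digital media space is nonetheless highly stratified, with the top 50 sources capturing 62.7% of engagement and a very long tail of nearly invisible actors. This pattern is consistent with winner-take-all dynamics observed in other digital media ecosystems.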

5 Analysis 1.3: The Institutional Gap

Core question: Do institutional actors underperform compared to individual voices and grassroots communities?

A recurring theme in digital media research is the institutional performance gap: official organizations often struggle to match the engagement levels achieved by individual personalities and grassroots movements. This may reflect audience preferences for authentic, personal communication over formal institutional messaging, or differences in content strategy and platform adaptation.

We test this hypothesis by comparing engagement rates across actor types. The engagement rate normalizes for audience size by dividing total interactions by follower count, allowing fair comparison between large institutional accounts and smaller individual voices.

Show code
# Calculate engagement rate per source
source_engagement <- dta[!is.na(FOLLOWERS_COUNT) & FOLLOWERS_COUNT > 0, .(
  Posts = .N,
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Mean_Followers = mean(FOLLOWERS_COUNT, na.rm = TRUE),
  Engagement_Rate = sum(INTERACTIONS, na.rm = TRUE) / mean(FOLLOWERS_COUNT, na.rm = TRUE) * 100
), by = .(FROM, ACTOR_TYPE)]

# Aggregate by actor type
actor_engagement <- source_engagement[, .(
  Sources = .N,
  Total_Posts = sum(Posts),
  Total_Interactions = sum(Total_Interactions),
  Mean_Engagement_Rate = mean(Engagement_Rate, na.rm = TRUE),
  Median_Engagement_Rate = median(Engagement_Rate, na.rm = TRUE),
  SD_Engagement_Rate = sd(Engagement_Rate, na.rm = TRUE)
), by = ACTOR_TYPE][order(-Mean_Engagement_Rate)]
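
One property of this engagement rate is worth spelling out: the numerator sums interactions over every post by a source, while the denominator is a single average follower count, so the rate grows with posting volume and can exceed 100%. The numbers below are made up purely to illustrate the arithmetic.

```r
followers             <- 1000          # hypothetical follower count
interactions_per_post <- rep(20, 50)   # 50 posts, 20 interactions each

# Rate as defined in the code above: total interactions over followers
cumulative_rate <- sum(interactions_per_post) / followers * 100   # 100

# Alternative per-post rate, shown only for comparison
per_post_rate <- mean(interactions_per_post) / followers * 100    # 2
```

The same source therefore scores 100% on the cumulative definition and 2% per post. Comparisons across actor types are sensitive to how many posts each source contributes, which is one reason to read the medians alongside the means.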

5.1 Engagement Rate by Actor Type

This boxplot shows the distribution of engagement rates within each actor type. The box represents the interquartile range (middle 50% of sources), the line inside is the median, and points beyond the whiskers are outliers.

Interpretation:

  • Higher median engagement rates indicate actor types whose content typically resonates better with audiences
  • Wider boxes indicate more variation within the category
  • Outliers may represent exceptionally successful (or unsuccessful) individual sources
Show code
# Filter out extreme outliers for visualization
engagement_plot_data <- source_engagement[Engagement_Rate < quantile(Engagement_Rate, 0.99, na.rm = TRUE)]

ggplot(engagement_plot_data, aes(x = reorder(ACTOR_TYPE, Engagement_Rate, FUN = median), 
                                  y = Engagement_Rate, fill = ACTOR_TYPE)) +
  geom_boxplot(outlier.alpha = 0.3) +
  coord_flip() +
  scale_fill_manual(values = actor_colors) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Engagement Rate Distribution by Actor Type",
    subtitle = "Engagement Rate = (Total Interactions / Followers) × 100",
    x = NULL,
    y = "Engagement Rate (%)",
    caption = "Outliers above 99th percentile excluded for visualization"
  ) +
  theme(legend.position = "none")

5.2 Actor Type Performance Summary

This table summarizes engagement metrics for each actor type, including both mean and median engagement rates. The median is often more informative as it is less sensitive to outliers.

Show code
actor_engagement %>%
  mutate(
    Sources = format(Sources, big.mark = ","),
    Total_Posts = format(Total_Posts, big.mark = ","),
    Total_Interactions = format(Total_Interactions, big.mark = ","),
    Mean_Engagement_Rate = sprintf("%.2f%%", Mean_Engagement_Rate),
    Median_Engagement_Rate = sprintf("%.2f%%", Median_Engagement_Rate)
  ) %>%
  select(ACTOR_TYPE, Sources, Total_Posts, Total_Interactions, 
         Mean_Engagement_Rate, Median_Engagement_Rate) %>%
  kable(col.names = c("Actor Type", "Sources", "Posts", "Interactions", 
                      "Mean Eng. Rate", "Median Eng. Rate")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Actor Type Sources Posts Interactions Mean Eng. Rate Median Eng. Rate
Diocesan 5 2,840 372,538 603.76% 249.53%
Lay Influencers 12 13,144 3,437,969 472.49% 501.59%
Independent Media 4 5,742 1,507,611 312.10% 216.44%
Academic 1 190 7,392 97.45% 97.45%
Institutional Official 77 3,959 217,611 16.99% 0.48%
Youth Organizations 3 51 6,697 15.97% 11.02%
Other 4,134 53,823 6,748,754 9.95% 0.67%

5.3 Statistical Comparison: Institutional vs Non-Institutional

To formally test whether institutional actors underperform, we group sources into institutional (Institutional Official, Diocesan, Academic) and non-institutional categories, then apply the Wilcoxon rank-sum test. This non-parametric test is appropriate because engagement rates are typically not normally distributed.

Show code
# Group into institutional vs non-institutional
source_engagement[, Institution_Group := ifelse(
  ACTOR_TYPE %in% c("Institutional Official", "Diocesan", "Academic"),
  "Institutional",
  "Non-Institutional"
)]

# Wilcoxon test (non-parametric)
wilcox_result <- wilcox.test(
  Engagement_Rate ~ Institution_Group, 
  data = source_engagement,
  alternative = "two.sided"
)

# Summary statistics by group
group_stats <- source_engagement[, .(
  N = .N,
  Mean = mean(Engagement_Rate, na.rm = TRUE),
  Median = median(Engagement_Rate, na.rm = TRUE),
  SD = sd(Engagement_Rate, na.rm = TRUE)
), by = Institution_Group]

group_stats %>%
  mutate(
    Mean = sprintf("%.2f%%", Mean),
    Median = sprintf("%.2f%%", Median),
    SD = sprintf("%.2f", SD)
  ) %>%
  kable(col.names = c("Group", "N Sources", "Mean Eng. Rate", "Median Eng. Rate", "Std. Dev.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Group N Sources Mean Eng. Rate Median Eng. Rate Std. Dev.
Institutional 83 53.31% 0.59% 210.17
Non-Institutional 4153 11.59% 0.68% 65.50
Show code
cat("Wilcoxon Rank Sum Test Results:\n")
Wilcoxon Rank Sum Test Results:
Show code
cat("W =", wilcox_result$statistic, "\n")
W = 182808 
Show code
cat("p-value =", format.pval(wilcox_result$p.value, digits = 4), "\n")
p-value = 0.3422 
Show code
if (wilcox_result$p.value < 0.05) {
  cat("Result: Significant difference between institutional and non-institutional actors\n")
} else {
  cat("Result: No significant difference detected\n")
}
Result: No significant difference detected

5.4 Engagement Rate Comparison Visualization

This violin plot combines a density estimate (the violin shape shows where values cluster) with a boxplot (showing median and interquartile range). The red diamond marks the mean.

Show code
ggplot(source_engagement[Engagement_Rate < quantile(Engagement_Rate, 0.95, na.rm = TRUE)], 
       aes(x = Institution_Group, y = Engagement_Rate, fill = Institution_Group)) +
  geom_violin(alpha = 0.7) +
  geom_boxplot(width = 0.2, fill = "white", outlier.shape = NA) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 4, color = "red") +
  scale_fill_manual(values = c("Institutional" = "#1a3c5a", "Non-Institutional" = "#e07b39")) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Institutional vs Non-Institutional Engagement Rates",
    subtitle = "Red diamond = mean; white box = median and IQR",
    x = NULL,
    y = "Engagement Rate (%)",
    caption = "Top 5% outliers excluded for visualization"
  ) +
  theme(legend.position = "none")

6 Analysis 1.4: Cross-Platform Presence

Core question: Which actors maintain presence across multiple platforms?

In contemporary digital media strategy, cross-platform presence is often considered essential for maximizing reach and resilience. Organizations that operate on multiple platforms can reach different audience segments and reduce dependence on any single platform's algorithmic decisions. However, maintaining a quality presence across platforms requires significant resources, potentially creating advantages for larger, better-resourced actors.

We examine how platform presence varies across actor types and whether multi-platform presence correlates with total engagement.

Show code
# Calculate cross-platform presence per source
cross_platform <- dta[, .(
  Platforms = uniqueN(SOURCE_TYPE),
  Platform_List = paste(unique(SOURCE_TYPE), collapse = ", "),
  Total_Posts = .N,
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Actor_Type = first(ACTOR_TYPE)
), by = FROM][order(-Platforms, -Total_Interactions)]

6.1 Platform Presence Distribution

This bar chart shows how many sources operate on each number of platforms. Most digital actors are single-platform operators, with progressively fewer maintaining presence across multiple platforms.

Show code
presence_summary <- cross_platform[, .(Sources = .N), by = Platforms][order(Platforms)]

ggplot(presence_summary, aes(x = factor(Platforms), y = Sources)) +
  geom_col(fill = "#2c5f7c", width = 0.7) +
  geom_text(aes(label = format(Sources, big.mark = ",")), vjust = -0.5, size = 3.5) +
  scale_y_continuous(labels = comma, expand = expansion(mult = c(0, 0.1))) +
  labs(
    title = "Distribution of Cross-Platform Presence",
    subtitle = "Number of sources by platform count",
    x = "Number of Platforms",
    y = "Number of Sources"
  )

6.2 Top Multi-Platform Actors

Sources present on three or more platforms represent the most diversified digital strategies. This table identifies these actors and their engagement levels.

Show code
# head() returns at most 20 rows; indexing with [1:20] pads a shorter result with NA rows
head(cross_platform[Platforms >= 3], 20) %>%
  mutate(
    Total_Posts = format(Total_Posts, big.mark = ","),
    Total_Interactions = format(Total_Interactions, big.mark = ",")
  ) %>%
  select(FROM, Actor_Type, Platforms, Platform_List, Total_Posts, Total_Interactions) %>%
  kable(col.names = c("Source", "Actor Type", "# Platforms", "Platforms", "Posts", "Interactions")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  scroll_box(height = "400px")
Source Actor Type # Platforms Platforms Posts Interactions
index.hr Other 3 web, comment, forum 8,095 3,312,573
jutarnji.hr Other 3 web, comment, instagram 7,703 1,595,040
telegram.hr Other 3 web, comment, instagram 2,902 1,164,087
Zagrebačka nadbiskupija Diocesan 3 facebook, youtube, twitter 2,023 241,657
24sata Other 3 facebook, youtube, twitter 512 177,044
Večernji list Other 3 facebook, youtube, twitter 760 130,601
Sisačka biskupija Diocesan 3 facebook, twitter, youtube 482 87,832
N1 Hrvatska Other 3 facebook, twitter, youtube 331 39,538
023.hr Other 3 web, facebook, twitter 774 31,242
Stephen Nikola Bartulica Other 3 twitter, youtube, facebook 21 10,969
bljesak.info Other 3 web, comment, instagram 108 7,769
Novosti Other 3 twitter, facebook, youtube 81 7,299
Željana Zovko Other 3 facebook, twitter, youtube 56 6,118
Women in Adria Other 3 facebook, twitter, youtube 19 341
Caritas Dubrovačke biskupije Other 3 twitter, youtube, facebook 5 22

6.3 Cross-Platform Presence by Actor Type

This chart compares average platform presence across actor types, revealing which categories tend toward a single-platform focus versus multi-platform strategies.

Show code
actor_platform_summary <- cross_platform[, .(
  Sources = .N,
  Mean_Platforms = mean(Platforms),
  Multi_Platform_Share = sum(Platforms >= 2) / .N * 100,
  Total_Interactions = sum(Total_Interactions)
), by = Actor_Type][order(-Mean_Platforms)]

ggplot(actor_platform_summary, aes(x = reorder(Actor_Type, Mean_Platforms), 
                                    y = Mean_Platforms, fill = Actor_Type)) +
  geom_col(width = 0.7) +
  geom_text(aes(label = sprintf("%.2f", Mean_Platforms)), hjust = -0.1, size = 3.5) +
  coord_flip() +
  scale_fill_manual(values = actor_colors) +
  scale_y_continuous(limits = c(0, max(actor_platform_summary$Mean_Platforms) * 1.15)) +
  labs(
    title = "Average Platform Presence by Actor Type",
    subtitle = "Mean number of platforms per source",
    x = NULL,
    y = "Average Number of Platforms"
  ) +
  theme(legend.position = "none")

6.4 Correlation: Multi-Platform Presence and Engagement

Is operating on more platforms associated with greater total engagement? This analysis tests the relationship using Spearman correlation, which is appropriate for monotonic but non-linear relationships and for non-normal distributions.

Interpretation:

  • A positive correlation suggests that multi-platform strategies are associated with higher engagement
  • However, correlation does not imply causation: successful actors may simply have the resources for both high engagement and multi-platform presence
Show code
# Calculate correlation
cor_test <- cor.test(cross_platform$Platforms, cross_platform$Total_Interactions, 
                     method = "spearman")

ggplot(cross_platform[Total_Interactions > 0], 
       aes(x = factor(Platforms), y = Total_Interactions)) +
  geom_boxplot(fill = "#2c5f7c", alpha = 0.7, outlier.alpha = 0.3) +
  scale_y_log10(labels = comma) +
  labs(
    title = "Engagement by Number of Platforms",
    subtitle = paste0("Spearman correlation: ρ = ", round(cor_test$estimate, 3),
                      ", p ", ifelse(cor_test$p.value < 0.001, "< 0.001", 
                                     paste0("= ", round(cor_test$p.value, 4)))),
    x = "Number of Platforms",
    y = "Total Interactions (log scale)"
  )
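
The reason for preferring Spearman over Pearson here can be shown on a toy example in which engagement rises monotonically but very non-linearly with platform count. The values are synthetic and chosen only to make the contrast visible.

```r
platforms  <- 1:8
engagement <- 10^platforms   # monotonic but strongly non-linear

# Spearman works on ranks, so any strictly increasing relationship gives 1
rho_spearman <- cor(platforms, engagement, method = "spearman")

# Pearson measures linear association and is pulled well below 1 here
r_pearson <- cor(platforms, engagement, method = "pearson")

c(spearman = rho_spearman, pearson = r_pearson)
```

In the real data, platform count takes only a few distinct values, so there are many ties and the Spearman p-value is computed approximately rather than exactly.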

7 Summary and Key Findings

7.1 Platform Distribution Findings

  1. Volume distribution: Web content dominates in volume (73.6% of posts) but accounts for a smaller share of engagement (67.2%); Facebook, YouTube, and Instagram each capture a larger share of engagement than of volume
  2. Engagement efficiency: Instagram, YouTube, and Facebook have efficiency indices above 1.0, indicating higher resonance per post, while web and Twitter fall below 1.0
  3. Strategic implications: Catholic communicators seeking engagement should prioritize social platforms, while web remains important for searchability and archival purposes

7.2 Visibility Stratification Findings

  1. Concentration: The top 10 sources capture 35.7% of total engagement and the top 50 capture 62.7%, out of 16,426 sources
  2. Gini coefficient: Value of 0.982 indicates very high inequality in visibility distribution
  3. Power law: The log-log rank plot tests whether the rank-engagement relationship follows the power-law pattern typical of networked media systems
  4. Implications: The Croatian Catholic digital space exhibits strong winner-take-all dynamics, raising questions about diversity of voices

7.3 Institutional Gap Findings

  1. Engagement rate differential: Median engagement rates are nearly identical for institutional (0.59%) and non-institutional (0.68%) sources; the higher institutional mean (53.31% versus 11.59%) reflects a small number of sources with very high rates
  2. Statistical significance: The Wilcoxon rank-sum test finds no significant difference between the two groups (W = 182808, p = 0.342)
  3. Implications: The data do not support a general institutional gap in engagement rate; within the institutional group, however, Institutional Official sources (median 0.48%) sit far below Diocesan sources (median 249.53%)

7.4 Cross-Platform Presence Findings

  1. Single-platform dominance: Most sources operate on only one platform
  2. Multi-platform advantage: Sources present on multiple platforms tend to accumulate more total engagement
  3. Actor type patterns: Independent media shows the strongest multi-platform integration, suggesting greater strategic sophistication in digital presence

8 Appendix: Technical Notes

8.1 Classification System Details

The actor classification employs the following category definitions:

Actor Type: Definition
Institutional Official: Central Church institutions: Bishops Conference, Catholic Network (HKM), Information Agency (IKA), Catholic Radio (HKR)
Diocesan: Diocese and parish level communications including archdioceses, dioceses, eparchies, and individual parishes
Independent Media: Catholic media outlets operating independently of Church hierarchy: Laudato TV, Bitno.net, Glas Koncila, etc.
Religious Orders: Communications from religious orders and congregations including Franciscans, Jesuits, Dominicans, and female religious
Charismatic Communities: Renewal movements and charismatic communities: Bozja Pobjeda, Cenacolo, Neokatekumenat, etc.
Individual Priests: Named clergy identified by clerical titles (Fra, Don, Msgr, etc.)
Youth Organizations: Youth ministry organizations: FRAMA, SHKM, university chaplaincies
Academic: Catholic educational institutions: Croatian Catholic University, theological faculties
Lay Influencers: Devotional and faith focused social media pages run by laity
Other: Secular media covering religious topics, unidentified sources, general public discourse
sessionInfo()
R version 4.5.2 (2025-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=Croatian_Croatia.utf8  LC_CTYPE=Croatian_Croatia.utf8   
[3] LC_MONETARY=Croatian_Croatia.utf8 LC_NUMERIC=C                     
[5] LC_TIME=Croatian_Croatia.utf8    

time zone: Europe/Zagreb
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ineq_0.2-13       kableExtra_1.4.0  knitr_1.50        scales_1.4.0     
 [5] data.table_1.17.8 lubridate_1.9.4   forcats_1.0.1     stringr_1.6.0    
 [9] dplyr_1.1.4       purrr_1.2.0       readr_2.1.6       tidyr_1.3.1      
[13] tibble_3.3.0      ggplot2_4.0.1     tidyverse_2.0.0  

loaded via a namespace (and not attached):
 [1] generics_0.1.4     xml2_1.5.1         lattice_0.22-7     stringi_1.8.7     
 [5] hms_1.1.4          digest_0.6.39      magrittr_2.0.4     evaluate_1.0.5    
 [9] grid_4.5.2         timechange_0.3.0   RColorBrewer_1.1-3 fastmap_1.2.0     
[13] Matrix_1.7-4       jsonlite_2.0.0     mgcv_1.9-3         viridisLite_0.4.2 
[17] textshaping_1.0.4  cli_3.6.5          rlang_1.1.6        splines_4.5.2     
[21] withr_3.0.2        yaml_2.3.11        tools_4.5.2        tzdb_0.5.0        
[25] vctrs_0.6.5        R6_2.6.1           lifecycle_1.0.4    htmlwidgets_1.6.4 
[29] pkgconfig_2.0.3    pillar_1.11.1      gtable_0.3.6       glue_1.8.0        
[33] systemfonts_1.3.1  xfun_0.54          tidyselect_1.2.1   rstudioapi_0.17.1 
[37] dichromat_2.0-0.1  farver_2.1.2       nlme_3.1-168       htmltools_0.5.8.1 
[41] rmarkdown_2.30     svglite_2.2.2      labeling_0.4.3     compiler_4.5.2    
[45] S7_0.2.1