DigiKat Map 1: Platform and Actor Structure

Mapping the Croatian Catholic Digital Media Space

Author

DigiKat Project

Published

January 27, 2026

1 Introduction

This document presents Map 1: Platform and Actor Structure of the DigiKat project, analyzing the Croatian Catholic digital media space. The core question driving this analysis is: Who communicates where, and who dominates visibility?

We analyze a corpus of over 600,000 posts from 2021-2025 across multiple digital platforms to understand:

How content is distributed across platforms
How visibility and engagement are stratified among actors
Whether institutional actors underperform compared to grassroots voices
Which actors maintain cross-platform presence

This map draws on theoretical frameworks from digital media studies, particularly the concepts of platform affordances (van Dijck, 2013), attention economics (Webster, 2014), and the long-tail distribution of online visibility (Anderson, 2006). Understanding who speaks and who is heard in digital religious spaces is essential for assessing the health and diversity of Catholic public discourse in Croatia.


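# Load packages used throughout this document
library(data.table)
library(dplyr)
library(tidyr)
library(tibble)
library(ggplot2)
library(knitr)
library(kableExtra)

# Load data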
dta <- readRDS("C:/Users/lsikic/Luka C/HKS/Projekti/Digitalni Kat/SHKM/DigiKat/data/merged_comprehensive.rds") %>%
  filter(SOURCE_TYPE != "tiktok", !is.na(SOURCE_TYPE)) %>%
  filter(DATE >= as.Date("2021-01-01") & DATE <= as.Date("2025-12-31")) %>%
  filter(year >= 2021 & year <= 2025)

# Convert to data.table for efficiency
setDT(dta)

# Basic corpus info
n_posts <- nrow(dta)
n_sources <- uniqueN(dta$FROM)
date_range <- paste(min(dta$DATE), "to", max(dta$DATE))

1.1 Corpus Overview

The corpus represents a comprehensive collection of publicly available digital content from Croatian Catholic media sources. This includes official institutional communications, independent Catholic media, parish and diocesan channels, religious order publications, and individual clergy voices across web, social media, and discussion platforms.


tibble(
  Metric = c("Total posts", "Unique sources", "Date range", "Platforms"),
  Value = c(
    format(n_posts, big.mark = ","),
    format(n_sources, big.mark = ","),
    date_range,
    paste(unique(dta$SOURCE_TYPE), collapse = ", ")
  )
) %>%
  kable(col.names = c("Metric", "Value")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Metric	Value
Total posts	608,879
Unique sources	16,426
Date range	2021-01-01 to 2025-12-31
Platforms	web, facebook, twitter, comment, youtube, forum, reddit, instagram

2 Actor Type Classification

Before analyzing platform dynamics, we classify sources into meaningful actor types based on the Croatian Catholic media landscape. This classification is essential for understanding the structural composition of the digital space and enables comparative analysis across different types of communicators.

2.1 Classification Methodology

The classification employs a hierarchical priority system that processes each source through multiple identification layers. This approach balances precision (avoiding false positives) with recall (capturing relevant actors). The hierarchy works as follows:

Manual overrides handle known important sources that require explicit classification
Secular media exclusions prevent misclassification of mainstream news outlets covering religious topics
Domain-based matching provides reliable identification for web sources with known URLs
Pattern-based matching captures sources through name recognition
Platform-aware detection identifies lay devotional content on social media
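As a minimal sketch of the first-match logic described above, the toy function below walks through rule layers in priority order and returns the label of the first layer that matches. The layer contents and the name classify_first_match are illustrative assumptions, not the project's actual rules, which appear in the full classification code.

```r
# Toy rule layers in priority order; all patterns are illustrative only.
layers <- list(
  list(label = "Independent Media", patterns = c("glas koncila")),  # manual override
  list(label = "Other",             patterns = c("vecernji")),      # secular exclusion
  list(label = "Diocesan",          patterns = c("biskupija"))      # name pattern
)

# Return the label of the first layer with a matching pattern, else the default.
classify_first_match <- function(name, layers, default = "Other") {
  name <- tolower(trimws(name))
  for (layer in layers) {
    hit <- any(vapply(layer$patterns, grepl, logical(1), x = name, fixed = TRUE))
    if (hit) return(layer$label)
  }
  default
}

examples <- c("Glas Koncila", "vecernji.hr", "Sisacka biskupija", "Unknown page")
labels <- vapply(examples, classify_first_match, character(1),
                 layers = layers, USE.NAMES = FALSE)
data.frame(source = examples, actor_type = labels)
```

Because earlier layers return immediately, a name that matches several layers always receives the label of the highest-priority one.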

The resulting categories reflect the organizational structure of Croatian Catholic communications, distinguishing between official Church hierarchy, independent media voices, religious communities, and individual actors.


# =============================================================================
# IMPROVED ACTOR CLASSIFICATION v4 (ACTIVE)
# =============================================================================
# This is the current classification system with the following improvements:
# 1. Hierarchical priority system with explicit ordering
# 2. Expanded secular media exclusions (100+ outlets)
# 3. Domain-based classification for reliable web source identification
# 4. Word boundary matching for order and academic abbreviations (prevents false matches)
# 5. Start-of-string matching for priest titles
# 6. Platform-aware lay influencer detection
# 7. Expanded patterns for religious orders including female congregations
# =============================================================================

# -----------------------------------------------------------------------------
# MANUAL OVERRIDES: Highest priority explicit classifications
# NOTE: entries are matched as fixed substrings of the source name, so short
# abbreviations such as "ika" also match inside longer names.
# -----------------------------------------------------------------------------
manual_overrides <- list(
  "Institutional Official" = c(
    "hrvatska katolička mreža", "hrvatska katolicka mreza",
    "informativna katolička agencija", "informativna katolicka agencija",
    "hrvatski katolički radio", "hrvatski katolicki radio",
    "hrvatska biskupska konferencija", "ika", "hkr", "hkm", "hbk",
    "tiskovni ured hbk", "radio marija"
  ),
  "Independent Media" = c(
    "laudato tv", "laudatotv", "laudato.tv", "laudato.hr", "laudato",
    "bitno.net", "bitno net", "glas koncila", "glaskoncila",
    "nova eva", "nova-eva", "verbum", "totus tuus", "totus-tuus",
    "katolički tjednik", "katolicki tjednik", "kršćanska sadašnjost",
    "krscanska sadasnjost", "mir i dobro", "svjetlo riječi",
    "novizivot.net", "novi zivot", "novi život"
  ),
  "Charismatic Communities" = c(
    "božja pobjeda", "bozja pobjeda", "bozjapobjeda",
    "muževni budite", "muzevni budite", "muzevnibudite",
    "srce isusovo", "srceisusovo", "cenacolo", "comunità cenacolo",
    "duhovna obnova", "molitvena snaga",
    "dom molitve slavonski brod", "dom molitve",
    "molitvena zajednica sv. josipa"
  ),
  "Lay Influencers" = c(
    "katolička obitelj", "katolicka obitelj",
    "marija majka isusova", "božanske molitve", "bozanske molitve",
    "moćne molitve tv", "mocne molitve tv", "moćne molitve", "mocne molitve",
    "katoličke molitve", "katolicke molitve",
    "pulherissimus",
    "pod smokvom",
    "hrana za dušu", "hrana za dusu",
    "добровољци",
    "miletić marin", "miletic marin",
    "dijete vjere",
    "vjera",
    "kapljice ljubavi božje", "kapljice ljubavi bozje",
    "kršćanstvo", "krscanstvo",
    "jutarnja molitva duhu svetom",
    "blago molitve",
    "biblija krunice molitve",
    "molitve bogu",
    "vojnik sreće", "vojnik srece", "duhovne poruke i inspiracija",
    "kes duhovni kutak", "duhovni kutak",
    "molitve.hr",
    "duhovniportal.com", "duhovniportal"
  ),
  "Diocesan" = c(
    "zagrebačka nadbiskupija", "zagrebacka nadbiskupija",
    "sisačka biskupija", "sisacka biskupija",
    "župa šurkovac", "zupa surkovac",
    "sveta mati slobode",
    "župa sv. ilije proroka metković", "zupa sv. ilije proroka metkovic",
    "župa uznesenja bdm", "zupa uznesenja bdm", "župa uznesenja bdm - stenjevec",
    "šibenska biskupija", "sibenska biskupija",
    "požeška biskupija", "pozeskа biskupija",
    "dubrovacka biskupija", "dubrovačka biskupija",
    "dubrovacka-biskupija.hr",
    "zupa tramosnica", "župa tramošnica",
    "župa sv. vida", "zupa sv. vida", "župa sv. vida - petruševec"
  ),
  "Youth Organizations" = c(
    "susret hrvatske katoličke mladeži", "susret hrvatske katolicke mladezi",
    "shkm požega", "shkm pozega"
  ),
  "Academic" = c(
    "hrvatsko katoličko sveučilište", "hrvatsko katolicko sveuciliste",
    "universitas studiorum catholica croatica"
  )
)

# -----------------------------------------------------------------------------
# SECULAR MEDIA EXCLUSIONS: Always classify as Other
# -----------------------------------------------------------------------------
secular_exclusions <- c(
  # National portals (with and without .hr)
  "slobodnadalmacija", "slobodnadalmacija.hr", 
  "vecernji", "vecernji.hr", 
  "index.hr", "index",
  "jutarnji", "jutarnji.hr", 
  "novilist", "novilist.hr",
  "24sata", "24sata.hr",
  "direktno", "direktno.hr",
  "nacional", "nacional.hr", 
  "tportal", "tportal.hr",
  "dnevnik.hr", "dnevnik", 
  "hrt.hr", "hrt", 
  "n1info", "n1info.hr", "n1",
  "rtl.hr", "rtl", 
  "net.hr", 
  "telegram.hr", "telegram", 
  "story.hr", "express.hr", "express", "advance.hr",
  # Regional portals
  "glasistre", "glasistre.hr",
  "dnevno.hr", "dnevno", 
  "prigorski", "glas-slavonije", "glas slavonije",
  "croativ", "oluja.info", 
  "maxportal", "maxportal.hr",
  "hkv.hr", "icv.hr", "novosti.hr", "7dnevno", "mnovine", 
  "sjever.hr", "dulist.hr", "pozega.eu", "sibenik.in",
  "ferata.hr", "epodravina", "glasgacke", "radio-zlatar", 
  "medjimurski.hr", "sbperiskop", "zagorje-international",
  "pozeski", "novine.hr", "dubrovnikinsider", "regionalni", 
  "leportale", "varazdinske-vijesti", "radionasice", 
  "brodportal", "ljportal", "dubrovnikportal", "01portal",
  "tomislavnews", "hia.com.hr", "portalnovosti", "antenazadar",
  "dalmacijanews", "zadarskilist", "medjimurjepress", 
  "zagreb.info", "034portal", "057info", "cityportal",
  "klikaj.hr", "lika-online", "ploce.com", "sbonline", 
  "narod.hr", "infokiosk", "hrsvijet", "tomislavcity", 
  "vrisak.info", "dalmacijadanas", "dalmacijadanas.hr",
  "morski.hr", "zagreb.hr", "osijek031", "rijeka.hr", "zadar.hr",
  "zupanjac.net", "zupanjac", 
  "dalmatinskiportal.hr", "dalmatinskiportal",
  "campaign-archive.com",
  # Forums and aggregators
  "forum.hr", "reddit", "anonymous_user", "komentari", "bug.hr",
  # Entertainment and other
  "inmemoriam", "magicus.info", "book.hr", "mojzagreb.info", 
  "skole.hr", "tvprofil", "priznajem.hr", "dragovoljac.com", 
  "croatia", "wikipedia",
  "facebook.com", "youtube.com", "instagram.com", "twitter.com",
  # Government and administrative
  "županija", "zupanija", "grad ", "opcina", "općina",
  # Non-Catholic religious groups
  "kršćanska proročka crkva", "krscanska prorocka crkva",
  "crkva svemogućeg boga", "crkva svemoguceg boga",
  "jehovini svjedoci", "adventisti", "baptisti", "pentekostalna",
  # Political parties
  "domovinski pokret", "hdz", "sdp", "most", "možemo", "mozemo"
)

# -----------------------------------------------------------------------------
# DOMAIN PATTERNS: URL-based classification
# -----------------------------------------------------------------------------
domain_patterns <- list(
  "Institutional Official" = c(
    "hkm.hr", "ika.hkm.hr", "hkr.hkm.hr", "hbk.hr", "radiomarija.hr"
  ),
  "Diocesan" = c(
    "zg-nadbiskupija.hr", "biskupija-varazdinska.hr", "djos.hr",
    "biskupija-sj.hr", "rzs.hr", "rkc-sisak.hr", "zadarskanadbiskupija.hr",
    "gospicko-senjska-biskupija.hr", "nadbiskupija-split.com",
    "nadbiskupija-split.hr", "dubrovacka-biskupija.hr", "porec-biskupija.hr",
    "biskupija-kk.hr", "rkc-pula.hr", "krizcevacka-eparhija.hr",
    "sibenska-biskupija.hr", "krizevci.hbk.hr"
  ),
  "Independent Media" = c(
    "laudato.hr", "laudato.tv", "bitno.net", "glaskoncila.hr",
    "nova-eva.com", "verbum.hr", "ks.hr", "totus-tuus.hr",
    "svjetlorijeci.hr", "zivot.com.hr", "novizivot.net"
  ),
  "Academic" = c(
    "unicath.hr", "hks.hr", "kbf.unizg.hr", "ffrz.hr", "hku.hr"
  ),
  "Religious Orders" = c(
    "franjevci.hr", "franjevci-split.hr", "isusovci.hr", "dominikanci.hr",
    "kapucini.hr", "salezijanci.hr", "karmelicani.hr"
  )
)

# -----------------------------------------------------------------------------
# NAME PATTERNS: Text-based classification
# -----------------------------------------------------------------------------

# Diocesan
diocesan_exact <- c(
  "zagrebačka nadbiskupija", "zagrebacka nadbiskupija",
  "splitsko-makarska nadbiskupija", "splitsko makarska",
  "đakovačko-osječka nadbiskupija", "djakovacko-osjecka",
  "riječka nadbiskupija", "rijecka nadbiskupija",
  "zadarska nadbiskupija", "sisačka biskupija", "sisacka biskupija",
  "varaždinska biskupija", "varazdinska biskupija",
  "križevačka eparhija", "krizevacka eparhija",
  "šibenska biskupija", "sibenska biskupija",
  "dubrovačka biskupija", "dubrovacka biskupija",
  "porečka i pulska biskupija", "porecka biskupija",
  "gospićko-senjska biskupija", "gospicko-senjska",
  "bjelovarsko-križevačka biskupija", "kotorska biskupija"
)
diocesan_contains <- c("nadbiskupija", "biskupija", "eparhija", "ordinarijat")
# Parish patterns (kept for reference only: classify_actor_v4 does not use this
# vector and detects parishes with the is_parish regex instead)
diocesan_parish <- c("župa ", "zupa ", "župna ", "zupna ", "župni ", "zupni ",
                     "župa", "zupa")

# Religious Orders
orders_exact <- c(
  "franjevci", "franjevci konventualci", "franjevci kapucini",
  "mala braća", "isusovci", "družba isusova", "druzba isusova",
  "dominikanci", "red propovjednika", "salezijanci", "don bosco",
  "karmelićani", "karmelicani", "karmel", "benediktinci", "benediktin",
  "kapucini", "pavlini", "trapisti", "cisterciti", "augustinci",
  "sestre milosrdnice", "uršulinke", "ursulinke", "klarise",
  "službenice milosrđa", "kćeri božje ljubavi", "školske sestre",
  "karmelićanke", "benediktinke", "dominikanke"
)
orders_contains <- c(
  "franjevački", "franjevacki", "isusovački", "isusovacki",
  "dominikanski", "salezijanski", "karmelski", "benediktinski",
  "kapucinski", "pavlinski", "redovnici", "redovnice",
  "samostan", "provincija"
)
orders_abbrev <- c("ofm", "ofmcap", "ofmconv", "sj", "op", "sdb",
                   "ocd", "osb", "osbm", "cssr", "svd", "omc")

# Charismatic Communities
charismatic_exact <- c(
  "emmanuel", "taize", "taizé", "fokolari", "fokolarini",
  "kursiljo", "cursillo", "neokatekumenski put", "shalom",
  "zajednica beatitudes", "zajednica blaženstava",
  "dom molitve", "kuća molitve", "kuca molitve",
  "duhovna obnova", "molitvena snaga"
)
charismatic_contains <- c(
  "molitvena zajednica", "karizmatska", "karizmatski",
  "neokatekumenski", "neokatekumenska", "obnova u duhu",
  "komunija i oslobođenje", "comunione e liberazione",
  "dom molitve", "house of prayer"
)

# Youth Organizations
youth_exact <- c(
  "frama", "shkm", "katolička mladež", "katolicka mladez",
  "mladi franjevci", "salezijanska mladež"
)
youth_contains <- c(
  "ministranti", "mladifra", "kaem", "studentska kapelanija",
  "sveučilišna kapelanija", "sveucilisna kapelanija",
  "pastoral mladih", "mladi katolici"
)

# Academic
academic_exact <- c(
  "hrvatsko katoličko sveučilište", "hrvatsko katolicko sveuciliste",
  "katolički bogoslovni fakultet", "katolicki bogoslovni fakultet",
  "filozofski fakultet družbe isusove", "teologija u rijeci"
)
academic_contains <- c("teologija", "bogoslovija", "katehetski")
academic_abbrev <- c("kbf", "hku", "ffrz")

# Individual Priests (title prefixes at START)
priest_prefixes <- c(
  "fra ", "don ", "vlč. ", "vlč.", "vlc. ", "vlc.",
  "msgr. ", "msgr.", "mons. ", "mons.",
  "o. ", "pater ", "p. ", "pr. ",
  "s. ", "sestra ", "m. ", "majka "
)
priest_hierarchy <- c(
  "biskup ", "nadbiskup ", "kardinal ",
  "mons. ", "preč. ", "prečasni "
)
priest_contains <- c("svećenik", "svecenik", "župnik", "zupnik")

# Lay Influencers
lay_devotional <- c(
  "vjera", "molitva", "molitve", "isus", "krist", "gospa", "marija",
  "hrana za dušu", "hrana za dusu", "dijete vjere",
  "riječ dana", "rijec dana", "riječ božja", "rijec bozja",
  "sveti", "svetac", "svetica", "evanđelje", "evandelje",
  "duhovnost", "duhovna", "duhovni", "biblija", "biblijski",
  "psalm", "blagoslov", "krunica", "rozarij", "lectio divina",
  "katolička obitelj", "katolicka obitelj", "katolicki",
  "božanske", "bozanske", "moćne", "mocne"
)
# Only exclude actual media domains, not devotional pages with "tv" in name
lay_exclude <- c(".hr", ".net", ".com", "portal", "vijesti",
                 "news", "radio", "agencija", "tjednik")

# -----------------------------------------------------------------------------
# CLASSIFICATION FUNCTION
# -----------------------------------------------------------------------------
classify_actor_v4 <- function(from_val, url_val = NA, platform_val = NA) {
  
  from_lower <- tolower(trimws(as.character(from_val)))
  url_lower <- tolower(ifelse(is.na(url_val), "", as.character(url_val)))
  platform_lower <- tolower(ifelse(is.na(platform_val), "", as.character(platform_val)))
  combined <- paste(from_lower, url_lower)
  
  # Helper: TRUE if any pattern occurs in text (fixed substring match by default)
  match_any <- function(patterns, text, fixed = TRUE) {
    any(sapply(patterns, function(p) grepl(p, text, fixed = fixed)))
  }
  
  # PRIORITY 1: Manual Overrides
  for (actor_type in names(manual_overrides)) {
    if (match_any(manual_overrides[[actor_type]], from_lower)) {
      return(actor_type)
    }
  }
  
  # PRIORITY 2: Secular Media Exclusions (matched against source name plus URL)
  if (match_any(secular_exclusions, combined)) {
    return("Other")
  }
  
  # PRIORITY 3: Domain-based Classification
  if (nchar(url_lower) > 0) {
    for (actor_type in names(domain_patterns)) {
      if (match_any(domain_patterns[[actor_type]], url_lower)) {
        return(actor_type)
      }
    }
  }
  
  # PRIORITY 4a: Diocesan
  # Check for parish patterns - handle both with and without special characters
  is_parish <- grepl("^župa|^zupa|župi|zupi|- župa|- zupa", from_lower, ignore.case = TRUE) ||
               grepl("parish", from_lower, ignore.case = TRUE)
  
  if (match_any(diocesan_exact, from_lower) ||
      match_any(diocesan_contains, from_lower) ||
      is_parish) {
    return("Diocesan")
  }
  
  # PRIORITY 4b: Religious Orders
  if (match_any(orders_exact, from_lower) ||
      match_any(orders_contains, from_lower)) {
    return("Religious Orders")
  }
  for (abbr in orders_abbrev) {
    if (grepl(paste0("\\b", abbr, "\\b"), from_lower, ignore.case = TRUE)) {
      return("Religious Orders")
    }
  }
  
  # PRIORITY 4c: Charismatic Communities
  if (match_any(charismatic_exact, from_lower) ||
      match_any(charismatic_contains, from_lower)) {
    return("Charismatic Communities")
  }
  
  # PRIORITY 4d: Youth Organizations
  if (match_any(youth_exact, from_lower) ||
      match_any(youth_contains, from_lower)) {
    return("Youth Organizations")
  }
  
  # PRIORITY 4e: Academic
  if (match_any(academic_exact, from_lower) ||
      match_any(academic_contains, from_lower)) {
    return("Academic")
  }
  for (abbr in academic_abbrev) {
    if (grepl(paste0("\\b", abbr, "\\b"), from_lower, ignore.case = TRUE)) {
      return("Academic")
    }
  }
  
  # PRIORITY 4f: Individual Priests (prefix at START)
  for (prefix in priest_prefixes) {
    if (startsWith(from_lower, prefix)) return("Individual Priests")
  }
  for (title in priest_hierarchy) {
    if (startsWith(from_lower, title)) return("Individual Priests")
  }
  if (match_any(priest_contains, from_lower)) {
    return("Individual Priests")
  }
  
  # PRIORITY 5: Lay Influencers
  # Detect devotional pages - relaxed platform check since many FB pages don't have URL
  has_devotional <- match_any(lay_devotional, from_lower)
  has_media_indicator <- match_any(lay_exclude, from_lower)
  
  # Classify as Lay Influencer if has devotional keywords and no media indicators
  # Platform check relaxed: social media OR no dots in name (likely social handle)
  is_likely_social <- platform_lower %in% c("facebook", "instagram", "youtube", "twitter") ||
                      !grepl("\\.[a-z]{2,4}$", from_lower)  # No domain extension
  
  if (has_devotional && !has_media_indicator && is_likely_social) {
    return("Lay Influencers")
  }
  
  # PRIORITY 6: Default
  return("Other")
}

# -----------------------------------------------------------------------------
# APPLY CLASSIFICATION
# -----------------------------------------------------------------------------
# USE.NAMES = FALSE avoids carrying the FROM strings along as vector names
dta[, ACTOR_TYPE := mapply(
  classify_actor_v4, 
  FROM, 
  if ("URL" %in% names(dta)) URL else NA,
  SOURCE_TYPE,
  USE.NAMES = FALSE
)]
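The manual overrides pass through match_any() with fixed substring matching, so a short abbreviation such as "ika" also matches inside longer names. The sketch below contrasts that behaviour with whole-token matching. The helper matches_token is a hypothetical illustration and not part of the classification above; the first three source names come from the top-sources table in Section 2.5, and the last two are invented test inputs.

```r
# First three names appear in the top-sources table; the last two are invented.
sources <- c("geopolitika.news", "klikaj.hr", "likaclub.eu", "ika", "IKA Zagreb")

# Fixed substring matching, as match_any() does by default.
substring_hit <- grepl("ika", tolower(sources), fixed = TRUE)

# Hypothetical helper: split each name into alphanumeric tokens and require
# the abbreviation to equal one whole token.
matches_token <- function(abbr, text) {
  tokens <- strsplit(tolower(text), "[^[:alnum:]]+")
  vapply(tokens, function(tk) abbr %in% tk, logical(1))
}
token_hit <- matches_token("ika", sources)

data.frame(source = sources, substring_hit, token_hit)
```

Under substring matching all five names hit, while token matching keeps only the two names in which "ika" stands alone. How non-ASCII letters are tokenised depends on the locale, so names with diacritics would need a separate check.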

2.2 Classification Results

The table below shows the distribution of posts and engagement across actor types. This provides a first overview of who populates the Croatian Catholic digital space.

How to interpret this table:

Posts and % Posts indicate the volume of content produced by each actor type
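Sources shows how many distinct accounts or websites belong to each category. Because classification runs per post on the source name plus URL, a source can fall into more than one category, so this column sums to 16,441 rather than the 16,426 unique sources in the corpus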
Interactions and % Engage reveal the total audience engagement (likes, comments, shares) generated
Comparing volume share to engagement share reveals which actor types punch above or below their weight
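One way to check how many sources are split across actor types is sketched below in base R. It assumes post-level data with the FROM and ACTOR_TYPE columns used in this document; the toy rows are invented for illustration and do not reproduce corpus values.

```r
# Toy post-level data; rows are invented for illustration.
posts <- data.frame(
  FROM       = c("zg-nadbiskupija.hr", "zg-nadbiskupija.hr", "bitno.net", "bitno.net"),
  ACTOR_TYPE = c("Diocesan", "Other", "Independent Media", "Independent Media"),
  stringsAsFactors = FALSE
)

# Number of distinct actor types assigned to each source.
types_per_source <- tapply(posts$ACTOR_TYPE, posts$FROM,
                           function(x) length(unique(x)))

# Sources whose posts were assigned to more than one actor type.
split_sources <- names(types_per_source)[types_per_source > 1]

# Summing per-type source counts counts split sources once per type.
sources_summed <- sum(tapply(posts$FROM, posts$ACTOR_TYPE,
                             function(x) length(unique(x))))
sources_unique <- length(unique(posts$FROM))

split_sources
c(summed = sources_summed, unique = sources_unique)
```

In this toy case the summed count is 3 against 2 unique sources, the same kind of gap seen in the Sources column.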


# Calculate comprehensive summary
actor_summary <- dta[, .(
  Posts = .N,
  Sources = uniqueN(FROM),
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Mean_Interactions = mean(INTERACTIONS, na.rm = TRUE),
  Median_Interactions = median(INTERACTIONS, na.rm = TRUE)
), by = ACTOR_TYPE][order(-Posts)]

# Calculate percentages
total_posts <- sum(actor_summary$Posts)
total_interactions <- sum(actor_summary$Total_Interactions)
total_sources <- sum(actor_summary$Sources)

actor_summary[, `:=`(
  Posts_Pct = Posts / total_posts * 100,
  Sources_Pct = Sources / total_sources * 100,
  Interactions_Pct = Total_Interactions / total_interactions * 100
)]

# Display formatted table
actor_summary %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    `% Posts` = sprintf("%.1f%%", Posts_Pct),
    Sources = format(Sources, big.mark = ","),
    `% Sources` = sprintf("%.1f%%", Sources_Pct),
    Interactions = format(Total_Interactions, big.mark = ","),
    `% Engage` = sprintf("%.1f%%", Interactions_Pct),
    `Mean Int.` = round(Mean_Interactions, 1)
  ) %>%
  select(ACTOR_TYPE, Posts, `% Posts`, Sources, `% Sources`, 
         Interactions, `% Engage`, `Mean Int.`) %>%
  kable(col.names = c("Actor Type", "Posts", "% Posts", "Sources", 
                      "% Sources", "Interactions", "% Engage", "Mean Int."),
        caption = "Actor Type Classification Summary") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = FALSE) %>%
  row_spec(which(actor_summary$ACTOR_TYPE == "Other"), 
           background = "#f5f5f5", italic = TRUE)

Actor Type Classification Summary
Actor Type	Posts	% Posts	Sources	% Sources	Interactions	% Engage	Mean Int.
Other	468,712	77.0%	16,072	97.8%	44,600,737	72.0%	99.1
Institutional Official	62,744	10.3%	202	1.2%	3,550,057	5.7%	56.6
Independent Media	31,403	5.2%	19	0.1%	6,600,818	10.7%	210.2
Lay Influencers	28,451	4.7%	62	0.4%	6,333,433	10.2%	224.7
Diocesan	13,841	2.3%	50	0.3%	636,001	1.0%	46.0
Religious Orders	1,720	0.3%	13	0.1%	37,390	0.1%	21.7
Charismatic Communities	1,282	0.2%	12	0.1%	193,550	0.3%	153.7
Academic	674	0.1%	7	0.0%	11,661	0.0%	17.3
Youth Organizations	52	0.0%	4	0.0%	6,697	0.0%	128.8

2.3 Actor Type Distribution Visualization

This chart compares each actor type's share of total content volume against its share of total engagement. When an actor type's engagement bar exceeds its volume bar, its content resonates more strongly with audiences on a per-post basis.


# Prepare data for visualization (exclude Other for cleaner view)
actor_viz <- actor_summary[ACTOR_TYPE != "Other"]

# Create comparison chart
actor_long <- actor_viz %>%
  select(ACTOR_TYPE, Posts_Pct, Interactions_Pct) %>%
  pivot_longer(cols = c(Posts_Pct, Interactions_Pct),
               names_to = "Metric", values_to = "Percentage") %>%
  mutate(Metric = ifelse(Metric == "Posts_Pct", "Volume (Posts)", "Engagement"))

ggplot(actor_long, aes(x = reorder(ACTOR_TYPE, Percentage), 
                        y = Percentage, fill = Metric)) +
  geom_col(position = "dodge", width = 0.7) +
  geom_text(aes(label = sprintf("%.1f%%", Percentage)),
            position = position_dodge(width = 0.7),
            hjust = -0.1, size = 3) +
  coord_flip() +
  scale_fill_manual(values = c("Volume (Posts)" = "#2c5f7c", 
                               "Engagement" = "#e07b39")) +
  scale_y_continuous(limits = c(0, max(actor_long$Percentage) * 1.2),
                     labels = function(x) paste0(x, "%")) +
  labs(
    title = "Actor Type Distribution: Volume vs Engagement",
    subtitle = "Excluding 'Other' category for clarity",
    x = NULL,
    y = "Share (%)",
    fill = NULL
  ) +
  theme(legend.position = "top")

2.4 Engagement Efficiency by Actor Type

The efficiency index quantifies how effectively each actor type converts their content production into audience engagement. It is calculated as the ratio of engagement share to volume share. An index above 1.0 means the actor type generates more engagement than their share of posts would predict, while below 1.0 indicates underperformance relative to volume.

Interpretation guide:

High performers (>1.5): Content resonates strongly; audiences actively engage
Above average (1.0-1.5): Solid engagement relative to output
Average (0.7-1.0): Engagement roughly proportional to volume
Below average (<0.7): High volume but limited audience response
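As a worked example of the index and the bands above, the sketch below recomputes the ratio for four actor types from the rounded shares reported in the Actor Type Classification Summary table. Because the inputs are rounded to one decimal, the resulting values are approximate; the data frame and column names are illustrative.

```r
# Shares (in %) copied from the Actor Type Classification Summary table.
shares <- data.frame(
  actor_type = c("Institutional Official", "Independent Media",
                 "Lay Influencers", "Diocesan"),
  posts_pct  = c(10.3, 5.2, 4.7, 2.3),
  engage_pct = c(5.7, 10.7, 10.2, 1.0)
)

# Efficiency index = engagement share / volume share.
shares$efficiency <- shares$engage_pct / shares$posts_pct

# Assign the interpretation bands; intervals are closed on the right,
# so exactly 0.7 is "Below average" and exactly 1.0 is "Average".
shares$performance <- cut(
  shares$efficiency,
  breaks = c(-Inf, 0.7, 1.0, 1.5, Inf),
  labels = c("Below average", "Average", "Above average", "High performer")
)

shares
```

With these inputs, Independent Media and Lay Influencers land at roughly 2.1 and 2.2 (high performers), while Institutional Official and Diocesan fall at roughly 0.55 and 0.43 (below average).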


# Calculate efficiency index
actor_efficiency <- actor_summary[ACTOR_TYPE != "Other"] %>%
  mutate(
    Efficiency_Index = Interactions_Pct / Posts_Pct,
    Performance = case_when(
      Efficiency_Index > 1.5 ~ "High performer",
      Efficiency_Index > 1.0 ~ "Above average",
      Efficiency_Index > 0.7 ~ "Average",
      TRUE ~ "Below average"
    )
  ) %>%
  arrange(desc(Efficiency_Index))

ggplot(actor_efficiency, aes(x = reorder(ACTOR_TYPE, Efficiency_Index),
                              y = Efficiency_Index, fill = Performance)) +
  geom_col(width = 0.7) +
  geom_hline(yintercept = 1, linetype = "dashed", color = "red", linewidth = 1) +
  geom_text(aes(label = sprintf("%.2f", Efficiency_Index)), 
            hjust = -0.1, size = 3.5) +
  coord_flip() +
  scale_fill_manual(values = c(
    "High performer" = "#2d6a4f",
    "Above average" = "#52b788",
    "Average" = "#95d5b2",
    "Below average" = "#d8f3dc"
  )) +
  scale_y_continuous(limits = c(0, max(actor_efficiency$Efficiency_Index) * 1.15)) +
  labs(
    title = "Engagement Efficiency by Actor Type",
    subtitle = "Efficiency Index = Engagement Share / Volume Share (>1 = overperforming)",
    x = NULL,
    y = "Efficiency Index",
    fill = "Performance",
    caption = "Red dashed line indicates neutral efficiency (1.0)"
  ) +
  theme(legend.position = "right")

2.5 Top Sources by Actor Type

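This table displays the five most engaged sources within each classified actor type, providing concrete examples of who drives engagement in each category. Sources are grouped by their raw name, so one outlet can appear under several variants (for example, bitno.net and Bitno.net). The entries geopolitika.news, klikaj.hr, and likaclub.eu under Institutional Official each contain the override abbreviation "ika" and should be reviewed as likely misclassifications.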


# Get top 5 sources per actor type
top_sources_by_type <- dta[ACTOR_TYPE != "Other", .(
  Posts = .N,
  Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Platforms = uniqueN(SOURCE_TYPE)
), by = .(ACTOR_TYPE, FROM)][order(ACTOR_TYPE, -Interactions)]

top_5_each <- top_sources_by_type[, head(.SD, 5), by = ACTOR_TYPE]

top_5_each %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    Interactions = format(Interactions, big.mark = ",")
  ) %>%
  kable(col.names = c("Actor Type", "Source", "Posts", "Interactions", "Platforms"),
        caption = "Top 5 Sources by Actor Type (ranked by engagement)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  collapse_rows(columns = 1, valign = "top") %>%
  scroll_box(height = "500px")

Top 5 Sources by Actor Type (ranked by engagement)
Actor Type	Source	Posts	Interactions	Platforms
Academic	Hrvatsko katoličko sveučilište (Universitas Studiorum Catholica Croatica)	190	7,392	1
	unicath.hr	461	3,589	1
	adekvatnateologija.com	7	617	1
	ffrz.hr	2	32	1
	Hrvatsko katoličko sveučilište	11	30	1
Charismatic Communities	Duhovna Obnova	325	58,960	1
	Muževni budite	269	36,576	1
	Molitvena Snaga	61	31,552	1
	Dom Molitve Slavonski Brod	257	27,316	1
	Božja pobjeda	20	18,387	1
Diocesan	Zagrebačka nadbiskupija	2,023	241,657	3
	Sisačka biskupija	482	87,832	3
	zg-nadbiskupija.hr	888	82,621	1
	Župa Šurkovac	442	70,885	1
	Sveta Mati Slobode - Župa Duha Svetoga	344	36,457	1
Independent Media	novizivot.net	6,083	3,351,089	1
	Laudato	2,164	926,523	2
	bitno.net	3,445	763,737	1
	laudato.hr	10,727	703,203	1
	Bitno.net	3,290	512,818	2
Institutional Official	hkm.hr	52,474	3,077,108	1
	Radio Marija Hrvatska	2,704	165,224	1
	geopolitika.news	377	83,616	2
	klikaj.hr	1,155	45,491	1
	likaclub.eu	637	25,943	1
Lay Influencers	pulherissimus	6,007	1,608,836	1
	Добровољци	348	790,860	1
	Miletić Marin	231	645,342	2
	Pod Smokvom	1,517	588,094	1
	Hrana za dušu	3,447	461,396	1
Religious Orders	isusovci.hr	196	15,754	1
	ofm.hr	673	15,178	1
	karmel.hr	457	4,578	1
	franjevcitrecoredci.hr	2	866	1
	ofmconv.hr	327	809	1
Youth Organizations	Susret hrvatske katoličke mladeži Požega 2026.	37	5,082	1
	Susret hrvatske katoličke mladeži Bjelovar 2022.	12	1,500	1
	Susret hrvatske katoličke mladeži Zagreb 2020.	2	115	1
	frama-portal.net	1	0	1

2.6 Actor Type by Platform

This heatmap reveals platform preferences across actor types. Each cell shows what percentage of an actor type's content appears on each platform. This helps identify which platforms different types of Catholic communicators favor.


# Cross-tabulation of actor types and platforms
actor_platform <- dta[ACTOR_TYPE != "Other", .(Posts = .N), 
                       by = .(ACTOR_TYPE, SOURCE_TYPE)]

# Calculate percentages within each actor type
actor_platform[, Pct := Posts / sum(Posts) * 100, by = ACTOR_TYPE]

# Create heatmap
ggplot(actor_platform, aes(x = SOURCE_TYPE, y = ACTOR_TYPE, fill = Pct)) +
  geom_tile(color = "white", linewidth = 0.5) +
  geom_text(aes(label = sprintf("%.0f%%", Pct)), size = 3, color = "white") +
  scale_fill_gradient(low = "#deebf7", high = "#08519c",
                      name = "% of Posts") +
  labs(
    title = "Platform Distribution by Actor Type",
    subtitle = "Percentage of each actor type's posts by platform",
    x = "Platform",
    y = "Actor Type"
  ) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid = element_blank()
  )

2.7 Classification Quality Diagnostics


Quality assurance is essential for any classification system. The diagnostics below help identify potential misclassifications that may require manual review or pattern refinement.


# High-engagement sources in Other category
other_review <- dta[ACTOR_TYPE == "Other", .(
  Posts = .N,
  Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Platforms = paste(unique(SOURCE_TYPE), collapse = ", ")
), by = FROM][order(-Interactions)][1:20]

other_review %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    Interactions = format(Interactions, big.mark = ",")
  ) %>%
  kable(col.names = c("Source", "Posts", "Interactions", "Platforms"),
        caption = "Top 20 'Other' Sources by Engagement (review for potential misclassification)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE)

Top 20 'Other' Sources by Engagement (review for potential misclassification)
Source	Posts	Interactions	Platforms
index.hr	8,095	3,312,573	web, comment, forum
dnevno.hr	4,505	3,278,480	web, comment
24sata.hr	6,459	1,695,211	web
vecernji.hr	9,242	1,598,685	web, comment
jutarnji.hr	7,703	1,595,040	web, comment, instagram
slobodnadalmacija.hr	10,823	1,458,487	web, comment
telegram.hr	2,902	1,164,087	web, comment, instagram
dnevnik.hr	4,383	929,309	web
net.hr	6,917	758,857	web, comment
anonymous_user	2,931	682,962	instagram
narod.hr	5,434	614,885	web
campaign-archive.com	3	499,027	web
dalmatinskiportal.hr	3,378	497,593	web, comment
novilist.hr	7,501	440,049	web, comment
direktno.hr	5,339	418,717	web, comment
maxportal.hr	1,608	395,869	web, comment
dalmacijadanas.hr	3,295	385,808	web
zagreb.info	2,222	363,521	web, comment
caritas.hr	153	332,285	web
mnovine.hr	1,574	309,527	web


# Check for religious keywords in Other category
religious_keywords <- c(
  "biskupij", "nadbiskupij", "župa", "zupa", "crkv",
  "franjev", "isusov", "dominikan", "salezijan",
  "kršćan", "krscan", "katolič", "katolic",
  "molitv", "duhovn", "svećen", "svecen"
)

other_sources <- unique(dta[ACTOR_TYPE == "Other"]$FROM)

flagged_sources <- other_sources[sapply(other_sources, function(x) {
  any(sapply(religious_keywords, function(k) grepl(k, tolower(x))))
})]

if (length(flagged_sources) > 0) {
  flagged_stats <- dta[FROM %in% flagged_sources & ACTOR_TYPE == "Other", .(
    Posts = .N,
    Interactions = sum(INTERACTIONS, na.rm = TRUE)
  ), by = FROM][order(-Interactions)]
  
  flagged_stats %>%
    mutate(
      Posts = format(Posts, big.mark = ","),
      Interactions = format(Interactions, big.mark = ",")
    ) %>%
    head(15) %>%
    kable(col.names = c("Source", "Posts", "Interactions"),
          caption = "Sources with Religious Keywords Classified as 'Other' (review recommended)") %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                  full_width = FALSE)
}

Sources with Religious Keywords Classified as 'Other' (review recommended)
Source	Posts	Interactions
Virovitičko-podravska županija	121	24,420
zupanjac.net	144	20,137
Sisačko-moslavačka županija	38	4,070
Kršćanska proročka crkva Isus je Kralj!	875	3,679
Crkva Svemogućeg Boga	67	2,897
zg-nadbiskupija.hr	22	2,565
Domovinski pokret Splitsko-dalmatinska županija	6	2,002
Molitvena zajednica Eho	10	1,952
Zadarska županija	45	1,928
OMNIA DEO - Molitve	25	1,665
biskupija-varazdinska.hr	39	1,660
Molitve za svaki dan	48	1,607
🧿Tarot duhovnog buđenja🧿	4	1,456
Župa svetog Petra apostola - Zadar (Ploča)	63	1,435
Kršćanska zajednica Šibenik	36	1,337
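
Several flagged rows are county administrations (županija) or the portal zupanjac.net rather than parishes, which suggests that the substring keyword župa/zupa also matches inside longer words. The sketch below shows one possible refinement. It is an illustration only: `flag_parish()` is a hypothetical helper, and the example names are ASCII transliterations chosen to keep the regular expression simple.

```r
# Hypothetical helper: flag "zupa" (parish) but skip "zupanija" (county) and "zupanjac"
# The negative lookahead (?!n) rejects matches that are immediately followed by "n"
flag_parish <- function(x) {
  grepl("zupa(?!n)", tolower(x), perl = TRUE)
}

examples <- c("Zupa svetog Petra apostola - Zadar", "Zadarska zupanija", "zupanjac.net")
flag_parish(examples)
# TRUE FALSE FALSE
```

A lookahead is only one option; an explicit exclusion list of county names would work as well and may be easier to audit.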

Classification Notes

The classification uses a hierarchical priority system ensuring consistent results:

Manual overrides guarantee correct classification of known important sources
Secular exclusions prevent false positives from mainstream news coverage
Domain matching provides reliable identification for web sources
Pattern matching handles social media and ambiguous sources
Lay Influencers captures devotional social media pages

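The Other category contains secular media, unidentified sources, and general public discourse. High-engagement sources in Other warrant periodic review for potential pattern improvements. In the diagnostic tables above, for example, caritas.hr, zg-nadbiskupija.hr, and biskupija-varazdinska.hr are classified as Other even though their names point to Church organizations.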
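
The priority order listed above can be sketched in a few lines of base R. This is a minimal illustration, not the project's classifier: every lookup table and pattern below is a hypothetical placeholder, and `classify_source()` is an assumed name. Rules are applied from lowest to highest priority so that a higher-priority rule overwrites any earlier assignment.

```r
# Hypothetical placeholder rules; the project's real lists are not shown in this document
manual_overrides   <- c("hkm.hr" = "Institutional Official")
secular_domains    <- c("index.hr", "24sata.hr")
domain_map         <- c("zg-nadbiskupija.hr" = "Diocesan")
diocesan_pattern   <- "biskupij"
devotional_pattern <- "molitv"

classify_source <- function(src) {
  s <- tolower(src)
  out <- rep("Other", length(s))                          # default category
  out[grepl(devotional_pattern, s)] <- "Lay Influencers"  # lowest priority
  out[grepl(diocesan_pattern, s)]   <- "Diocesan"         # pattern matching
  in_domain <- s %in% names(domain_map)
  out[in_domain] <- unname(domain_map[s[in_domain]])      # domain matching
  out[s %in% secular_domains] <- "Other"                  # secular exclusions
  in_manual <- s %in% names(manual_overrides)
  out[in_manual] <- unname(manual_overrides[s[in_manual]])  # manual overrides win
  out
}

classify_source(c("hkm.hr", "index.hr", "zg-nadbiskupija.hr",
                  "Sisacka biskupija", "Molitve za svaki dan", "unknown page"))
```

Writing the rules as successive overwrites keeps the priority order visible in the code itself: moving a line up or down changes which rule wins.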

3 Analysis 1.1: Platform Distribution

Core question: Where does Croatian Catholic digital content live?

Understanding platform distribution is fundamental to grasping the structure of any digital media ecosystem. Different platforms have distinct affordances: web allows long-form content and search discoverability, Facebook enables community building and sharing, Instagram favors visual storytelling, YouTube supports video content, and Twitter facilitates rapid information exchange. How content is distributed across these platforms reveals strategic choices and audience preferences within the Croatian Catholic digital space.

Show code

# Calculate platform statistics
platform_stats <- dta[, .(
  Posts = .N,
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Total_Reach = sum(REACH, na.rm = TRUE),
  Unique_Sources = uniqueN(FROM),
  Mean_Interactions = mean(INTERACTIONS, na.rm = TRUE),
  Median_Interactions = median(INTERACTIONS, na.rm = TRUE)
), by = SOURCE_TYPE][order(-Posts)]

# Calculate shares
platform_stats[, `:=`(
  Volume_Share = Posts / sum(Posts) * 100,
  Engagement_Share = Total_Interactions / sum(Total_Interactions) * 100
)]

3.1 Volume vs Engagement by Platform

This comparison reveals the gap between where content is produced and where engagement occurs. Platforms with higher engagement share than volume share deliver better returns per post.

Show code

# Prepare data for visualization
platform_long <- platform_stats %>%
  select(SOURCE_TYPE, Volume_Share, Engagement_Share) %>%
  pivot_longer(cols = c(Volume_Share, Engagement_Share),
               names_to = "Metric", values_to = "Share") %>%
  mutate(Metric = ifelse(Metric == "Volume_Share", "Volume (Posts)", "Engagement (Interactions)"))

# Create grouped bar chart
ggplot(platform_long, aes(x = reorder(SOURCE_TYPE, Share), y = Share, fill = Metric)) +
  geom_col(position = "dodge", width = 0.7) +
  geom_text(aes(label = sprintf("%.1f%%", Share)), 
            position = position_dodge(width = 0.7), 
            hjust = -0.1, size = 3) +
  coord_flip() +
  scale_fill_manual(values = c("Volume (Posts)" = "#2c5f7c", "Engagement (Interactions)" = "#e07b39")) +
  scale_y_continuous(limits = c(0, max(platform_long$Share) * 1.15), 
                     labels = function(x) paste0(x, "%")) +
  labs(
    title = "Platform Distribution: Volume vs Engagement",
    subtitle = "Share of total posts and interactions by platform",
    x = NULL,
    y = "Share (%)",
    fill = NULL
  ) +
  theme(legend.position = "top")

3.2 Platform Statistics Table

This table provides detailed metrics for each platform including post counts, engagement totals, source diversity, and average interactions per post.

Show code

platform_stats %>%
  mutate(
    Posts = format(Posts, big.mark = ","),
    Total_Interactions = format(Total_Interactions, big.mark = ","),
    Unique_Sources = format(Unique_Sources, big.mark = ","),
    Mean_Interactions = round(Mean_Interactions, 1),
    Volume_Share = sprintf("%.1f%%", Volume_Share),
    Engagement_Share = sprintf("%.1f%%", Engagement_Share)
  ) %>%
  select(SOURCE_TYPE, Posts, Volume_Share, Total_Interactions, Engagement_Share, 
         Unique_Sources, Mean_Interactions) %>%
  kable(col.names = c("Platform", "Posts", "Volume %", "Interactions", 
                      "Engagement %", "Sources", "Mean Interactions")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Platform	Posts	Volume %	Interactions	Engagement %	Sources	Mean Interactions
web	447,908	73.6%	41,622,547	67.2%	3,159	92.9
facebook	68,771	11.3%	9,817,934	15.8%	2,336	142.8
youtube	65,158	10.7%	9,505,576	15.3%	4,843	148.4
reddit	8,085	1.3%	0	0.0%	3,659	NaN
forum	6,140	1.0%	0	0.0%	9	NaN
twitter	5,935	1.0%	69,690	0.1%	2,498	11.7
comment	3,630	0.6%	0	0.0%	23	NaN
instagram	3,252	0.5%	954,597	1.5%	100	293.5
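
The `NaN` entries for reddit, forum, and comment are what `mean(..., na.rm = TRUE)` returns when every value is missing, while `sum(..., na.rm = TRUE)` returns 0 in the same situation. This suggests, although the raw data are not shown here, that interaction counts were not collected for these platforms rather than being genuinely zero. A small illustration with made-up values:

```r
# Hypothetical platform whose posts have no recorded interaction counts
x <- c(NA_real_, NA_real_, NA_real_)

sum(x, na.rm = TRUE)    # 0: the sum of an empty set
mean(x, na.rm = TRUE)   # NaN: 0 divided by 0 observations
```

If this assumption holds, the 0.0% engagement shares for these three platforms are best read as "not measured".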

3.3 Engagement Efficiency by Platform

The efficiency index measures how well each platform converts content into engagement. This metric helps identify which platforms offer the best return on content investment for Catholic communicators.

Interpretation:

Index > 1: Platform generates disproportionately high engagement relative to content volume
Index = 1: Engagement is proportional to volume (neutral efficiency)
Index < 1: Platform underperforms in converting content to engagement
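
As a worked example, the index can be recomputed from the post and interaction totals published in the platform statistics table (Section 3.2). The sketch uses base R and a small hand-typed data frame; the object name `platforms` is illustrative and not part of the analysis pipeline.

```r
# Totals copied from the platform statistics table
platforms <- data.frame(
  platform     = c("web", "facebook", "youtube", "reddit",
                   "forum", "twitter", "comment", "instagram"),
  posts        = c(447908, 68771, 65158, 8085, 6140, 5935, 3630, 3252),
  interactions = c(41622547, 9817934, 9505576, 0, 0, 69690, 0, 954597)
)

# Shares of total volume and total engagement, in percent
platforms$volume_share     <- platforms$posts / sum(platforms$posts) * 100
platforms$engagement_share <- platforms$interactions / sum(platforms$interactions) * 100

# Efficiency index: engagement share divided by volume share
platforms$efficiency_index <- platforms$engagement_share / platforms$volume_share

platforms[order(-platforms$efficiency_index), c("platform", "efficiency_index")]
```

Because both shares use corpus-wide totals as denominators, the index is equivalent to a platform's interactions per post divided by the corpus-wide interactions per post.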

Show code

# Calculate engagement per post
platform_efficiency <- platform_stats %>%
  mutate(
    Engagement_per_Post = Total_Interactions / Posts,
    Efficiency_Index = (Engagement_Share / Volume_Share)
  ) %>%
  arrange(desc(Efficiency_Index))

ggplot(platform_efficiency, aes(x = reorder(SOURCE_TYPE, Efficiency_Index), 
                                 y = Efficiency_Index, fill = SOURCE_TYPE)) +
  geom_col(width = 0.7) +
  geom_hline(yintercept = 1, linetype = "dashed", color = "red", linewidth = 1) +
  geom_text(aes(label = sprintf("%.2f", Efficiency_Index)), hjust = -0.1, size = 3.5) +
  coord_flip() +
  scale_fill_manual(values = platform_colors) +
  scale_y_continuous(limits = c(0, max(platform_efficiency$Efficiency_Index) * 1.15)) +
  labs(
    title = "Platform Engagement Efficiency",
    subtitle = "Ratio of engagement share to volume share (>1 means overperforming)",
    x = NULL,
    y = "Efficiency Index (Engagement Share / Volume Share)",
    caption = "Red dashed line = neutral efficiency (1.0)"
  ) +
  theme(legend.position = "none")

Key Finding

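Based on the totals in Section 3.2, Instagram (index of about 2.9), YouTube (about 1.4), and Facebook (about 1.4) generate disproportionately high engagement relative to their volume, while web sits slightly below parity (about 0.9) and Twitter far below (about 0.1). Reddit, forum, and comment sources show no recorded interactions, so their index of 0 may reflect missing data rather than audience indifference. An index above 1 indicates higher audience resonance per post and suggests strategic value for Catholic communicators seeking to maximize impact.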

4 Analysis 1.2: Visibility Stratification

Core question: How is visibility distributed among actors? Do a few sources dominate?

Digital media ecosystems typically exhibit strong winner-take-all dynamics, where a small number of actors capture the majority of audience attention. This pattern, often described as a power law distribution, has significant implications for the diversity and pluralism of public discourse. If visibility is highly concentrated, a few dominant voices may shape the entire conversation, while thousands of smaller actors remain effectively invisible.

We examine this question using three complementary approaches: concentration ratios (CR), the Gini coefficient, and Lorenz curve visualization. Together, these metrics quantify the degree of inequality in the distribution of visibility within the Croatian Catholic digital space.

Show code

# Aggregate by source
source_stats <- dta[, .(
  Posts = .N,
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Total_Reach = sum(REACH, na.rm = TRUE),
  Mean_Followers = mean(FOLLOWERS_COUNT, na.rm = TRUE),
  Platforms = uniqueN(SOURCE_TYPE),
  Actor_Type = first(ACTOR_TYPE)
), by = FROM][order(-Total_Interactions)]

# Calculate cumulative shares for Lorenz curve
source_stats[, `:=`(
  Rank = .I,
  Cumulative_Sources = .I / .N * 100,
  Cumulative_Interactions = cumsum(Total_Interactions) / sum(Total_Interactions) * 100
)]

4.1 Top 20 Sources by Engagement

This table identifies the most visible actors in the Croatian Catholic digital space, ranked by total engagement. The Share column indicates what proportion of all engagement each source captures.

Show code

source_stats[1:20] %>%
  mutate(
    Share = sprintf("%.2f%%", Total_Interactions / sum(source_stats$Total_Interactions) * 100),
    Posts = format(Posts, big.mark = ","),
    Total_Interactions = format(Total_Interactions, big.mark = ",")
  ) %>%
  select(Rank, FROM, Actor_Type, Posts, Total_Interactions, Share, Platforms) %>%
  kable(col.names = c("Rank", "Source", "Actor Type", "Posts", "Total Interactions", "Share %", "Platforms")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  scroll_box(height = "500px")

Rank	Source	Actor Type	Posts	Total Interactions	Share %	Platforms
1	novizivot.net	Independent Media	6,083	3,351,089	5.41%	1
2	index.hr	Other	8,095	3,312,573	5.35%	3
3	dnevno.hr	Other	4,505	3,278,480	5.29%	2
4	hkm.hr	Institutional Official	52,474	3,077,108	4.97%	1
5	24sata.hr	Other	6,459	1,695,211	2.74%	1
6	pulherissimus	Lay Influencers	6,007	1,608,836	2.60%	1
7	vecernji.hr	Other	9,242	1,598,685	2.58%	2
8	jutarnji.hr	Other	7,703	1,595,040	2.57%	3
9	slobodnadalmacija.hr	Other	10,823	1,458,487	2.35%	2
10	telegram.hr	Other	2,902	1,164,087	1.88%	3
11	dnevnik.hr	Other	4,383	929,309	1.50%	1
12	Laudato	Independent Media	2,164	926,523	1.50%	2
13	Добровољци	Lay Influencers	348	790,860	1.28%	1
14	bitno.net	Independent Media	3,445	763,737	1.23%	1
15	net.hr	Other	6,917	758,857	1.22%	2
16	laudato.hr	Independent Media	10,727	703,203	1.13%	1
17	anonymous_user	Other	2,931	682,962	1.10%	1
18	Miletić Marin	Lay Influencers	231	645,342	1.04%	2
19	narod.hr	Other	5,434	614,885	0.99%	1
20	Pod Smokvom	Lay Influencers	1,517	588,094	0.95%	1

4.2 Concentration Ratios

Concentration ratios (CR) measure what share of total engagement is captured by the top N sources. These metrics are commonly used in industrial organization economics to assess market concentration, and apply equally well to attention markets.

Interpretation guide:

CR5 > 50%: High concentration; top 5 sources dominate the space
CR10 > 70%: Very high concentration; limited diversity of visible voices
CR20 > 80%: Extreme concentration; long tail of nearly invisible actors
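
The ratio for any N can be computed with a small helper. The sketch below is illustrative only: `concentration_ratio()` and the toy vector are hypothetical and stand in for the per-source interaction totals.

```r
# Hypothetical helper: percentage of the total held by the k largest values
concentration_ratio <- function(x, k) {
  x <- sort(x[!is.na(x)], decreasing = TRUE)
  k <- min(k, length(x))              # guard against k exceeding the number of sources
  sum(x[seq_len(k)]) / sum(x) * 100
}

toy <- c(50, 20, 10, 10, 5, 5)        # made-up interaction totals for six sources
concentration_ratio(toy, 1)           # 50
concentration_ratio(toy, 3)           # 80
```

Capping `k` at the number of sources avoids the `NA` values that plain index ranges such as `x[1:k]` produce when fewer than `k` sources exist.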

Show code

total_interactions <- sum(source_stats$Total_Interactions)
n_sources_total <- nrow(source_stats)

# Calculate concentration ratios
cr1 <- sum(source_stats[1:1]$Total_Interactions) / total_interactions * 100
cr5 <- sum(source_stats[1:5]$Total_Interactions) / total_interactions * 100
cr10 <- sum(source_stats[1:10]$Total_Interactions) / total_interactions * 100
cr20 <- sum(source_stats[1:20]$Total_Interactions) / total_interactions * 100
cr50 <- sum(source_stats[1:50]$Total_Interactions) / total_interactions * 100

# Gini coefficient
gini_coef <- ineq(source_stats$Total_Interactions, type = "Gini")

tibble(
  Metric = c("CR1 (Top 1 source)", "CR5 (Top 5 sources)", "CR10 (Top 10 sources)", 
             "CR20 (Top 20 sources)", "CR50 (Top 50 sources)", "Gini Coefficient",
             "Total Sources"),
  Value = c(
    sprintf("%.1f%%", cr1),
    sprintf("%.1f%%", cr5),
    sprintf("%.1f%%", cr10),
    sprintf("%.1f%%", cr20),
    sprintf("%.1f%%", cr50),
    sprintf("%.3f", gini_coef),
    format(n_sources_total, big.mark = ",")
  )
) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Metric	Value
CR1 (Top 1 source)	5.4%
CR5 (Top 5 sources)	23.7%
CR10 (Top 10 sources)	35.7%
CR20 (Top 20 sources)	47.7%
CR50 (Top 50 sources)	62.7%
Gini Coefficient	0.982
Total Sources	16,426

4.3 Lorenz Curve of Engagement Inequality

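The Lorenz curve provides a visual representation of inequality. In this plot the x-axis shows the cumulative percentage of sources, ranked from highest to lowest engagement (the order in which source_stats is sorted), while the y-axis shows the cumulative percentage of total engagement. Perfect equality would follow the diagonal line (each source contributes equally). Because the most engaged sources come first, the curve bows above the diagonal; it is the mirror image of the textbook Lorenz curve, which ranks sources from lowest to highest and bows below it. In either orientation, greater deviation from the diagonal represents more concentration.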

The Gini coefficient summarizes this visually: it equals the area between the Lorenz curve and the equality line, divided by the total area under the equality line. Values range from 0 (perfect equality) to 1 (one source captures everything).
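
The area-based definition can be checked by hand. The sketch below is a minimal base R illustration and not the `ineq` implementation used in the analysis, although the two should agree for the uncorrected Gini; `gini_from_lorenz()` is a hypothetical helper. It sorts sources in ascending order, the textbook convention; sorting in descending order mirrors the curve across the diagonal without changing the coefficient.

```r
# Hypothetical helper: Gini coefficient as 1 minus twice the area under the Lorenz curve
gini_from_lorenz <- function(x) {
  x <- sort(x[!is.na(x)])                    # ascending order
  n <- length(x)
  lorenz <- c(0, cumsum(x) / sum(x))         # cumulative share of engagement, starting at 0
  # Trapezoid rule over n equal-width steps on the x-axis
  area_under <- sum((lorenz[-1] + lorenz[-(n + 1)]) / 2) / n
  1 - 2 * area_under
}

gini_from_lorenz(c(10, 10, 10, 10))   # 0: perfect equality
gini_from_lorenz(c(0, 0, 0, 100))     # 0.75: one of four sources holds everything
```

With a finite number of sources n, the maximum of this uncorrected coefficient is (n - 1) / n, which is why the second example gives 0.75 rather than 1.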

Show code

ggplot(source_stats, aes(x = Cumulative_Sources, y = Cumulative_Interactions)) +
  geom_line(color = "#2c5f7c", linewidth = 1.2) +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed", color = "red") +
  geom_ribbon(aes(ymin = Cumulative_Sources, ymax = Cumulative_Interactions), 
              fill = "#2c5f7c", alpha = 0.3) +
  annotate("text", x = 70, y = 30, 
           label = paste0("Gini = ", round(gini_coef, 3)), 
           size = 5, fontface = "bold") +
  annotate("text", x = 20, y = 80,
           label = paste0("Top 10% of sources\ncapture ", 
                          round(source_stats[Rank == round(n_sources_total * 0.1)]$Cumulative_Interactions, 1),
                          "% of engagement"),
           size = 4) +
  scale_x_continuous(labels = function(x) paste0(x, "%")) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Lorenz Curve: Engagement Inequality Among Sources",
    subtitle = "Deviation from diagonal indicates concentration of visibility",
    x = "Cumulative % of Sources (ranked by engagement)",
    y = "Cumulative % of Total Engagement",
    caption = "Red dashed line = perfect equality"
  )

4.4 Log-Log Rank Plot (Power Law Test)

Many natural and social phenomena follow power law distributions, where the quantity of interest falls off as a power of rank (engagement proportional to rank raised to a negative exponent) rather than exponentially. If the Croatian Catholic digital space follows this pattern, a log-log plot of rank versus engagement should approximate a straight line.

Interpretation:

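An approximately linear relationship on the log-log scale is consistent with power law behavior, although it does not by itself rule out other heavy-tailed distributions
The slope indicates how steeply engagement drops off with rank (steeper = more concentrated)
High R-squared values (>0.9) indicate that the straight-line fit describes the data well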
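
To see what the diagnostic looks like on data with a known shape, the sketch below fits the same kind of log-log regression to synthetic series. Everything here is assumed for illustration: the exponent `alpha`, the noise level, and the exponential comparison series are made up and unrelated to the corpus.

```r
set.seed(1)
rk    <- 1:1000
alpha <- 1.2                                       # assumed exponent for the synthetic data

# Noisy power law: engagement = C * rank^(-alpha) * multiplicative noise
engagement <- 1e6 * rk^(-alpha) * exp(rnorm(length(rk), sd = 0.1))
power_fit  <- lm(log10(engagement) ~ log10(rk))
slope      <- unname(coef(power_fit)[2])           # should be close to -alpha
r2_power   <- summary(power_fit)$r.squared

# Exponential decay for contrast: a straight line on semi-log axes, curved on log-log axes
expo    <- 1e6 * exp(-0.01 * rk)
r2_expo <- summary(lm(log10(expo) ~ log10(rk)))$r.squared

c(slope = slope, r2_power = r2_power, r2_expo = r2_expo)
```

The power law series should recover a slope near the assumed exponent with a high R-squared, while the exponential series fits the straight line noticeably worse. A rank-based regression of this kind is a rough diagnostic rather than a formal test.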

Show code

# Filter to sources with positive interactions
source_positive <- source_stats[Total_Interactions > 0]

# Fit linear model on log-log scale
log_model <- lm(log10(Total_Interactions) ~ log10(Rank), data = source_positive)
slope <- coef(log_model)[2]
r_squared <- summary(log_model)$r.squared

ggplot(source_positive, aes(x = Rank, y = Total_Interactions)) +
  geom_point(alpha = 0.5, color = "#2c5f7c") +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  scale_x_log10(labels = comma) +
  scale_y_log10(labels = comma) +
  annotate("text", x = 10, y = min(source_positive$Total_Interactions) * 10,
           label = paste0("Slope = ", round(slope, 2), "\nR² = ", round(r_squared, 3)),
           hjust = 0, size = 4, fontface = "bold") +
  labs(
    title = "Rank-Engagement Distribution (Log-Log Scale)",
    subtitle = "Linear relationship suggests power law distribution",
    x = "Rank (log scale)",
    y = "Total Interactions (log scale)",
    caption = "Red line = linear fit on log-log scale"
  )

Key Finding on Stratification

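The Gini coefficient of 0.982 indicates extreme inequality in engagement across the 16,426 sources. The concentration ratios are more moderate: the top 10 sources capture 35.7% of engagement and the top 50 capture 62.7%, below the CR thresholds listed in Section 4.2. The Croatian Catholic digital media space is therefore highly stratified less because a handful of sources monopolize attention than because a very long tail of sources receives almost none, including sources on platforms with no recorded interactions. This pattern is consistent with winner-take-all dynamics observed in other digital media ecosystems.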

5 Analysis 1.3: The Institutional Gap

Core question: Do institutional actors underperform compared to individual voices and grassroots communities?

A recurring theme in digital media research is the institutional performance gap: official organizations often struggle to match the engagement levels achieved by individual personalities and grassroots movements. This may reflect audience preferences for authentic, personal communication over formal institutional messaging, or differences in content strategy and platform adaptation.

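We test this hypothesis by comparing engagement rates across actor types. The engagement rate normalizes for audience size by dividing each source's total interactions over the study period by its mean follower count, allowing fair comparison between large institutional accounts and smaller individual voices. Because interactions are summed over all of a source's posts, rates can exceed 100%. Only sources with a recorded, non-zero follower count enter this comparison.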
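
A small worked example may make the normalization concrete. The two accounts and all numbers below are made up; the calculation mirrors the definition above (total interactions divided by mean follower count, times 100) using base R only.

```r
# Made-up posts for two hypothetical accounts
posts <- data.frame(
  FROM            = c("big_account", "big_account", "small_page", "small_page", "small_page"),
  INTERACTIONS    = c(500, 300, 120, 90, 40),
  FOLLOWERS_COUNT = c(100000, 100000, 1000, 1000, 1000)
)

total_interactions <- tapply(posts$INTERACTIONS, posts$FROM, sum)
mean_followers     <- tapply(posts$FOLLOWERS_COUNT, posts$FROM, mean)

engagement_rate <- total_interactions / mean_followers * 100
engagement_rate
# big_account: 0.8, small_page: 25
```

The large account collects more interactions in absolute terms (800 versus 250), but the small page reaches a far larger share of its own audience.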

Show code

# Calculate engagement rate per source
source_engagement <- dta[!is.na(FOLLOWERS_COUNT) & FOLLOWERS_COUNT > 0, .(
  Posts = .N,
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Mean_Followers = mean(FOLLOWERS_COUNT, na.rm = TRUE),
  Engagement_Rate = sum(INTERACTIONS, na.rm = TRUE) / mean(FOLLOWERS_COUNT, na.rm = TRUE) * 100
), by = .(FROM, ACTOR_TYPE)]

# Aggregate by actor type
actor_engagement <- source_engagement[, .(
  Sources = .N,
  Total_Posts = sum(Posts),
  Total_Interactions = sum(Total_Interactions),
  Mean_Engagement_Rate = mean(Engagement_Rate, na.rm = TRUE),
  Median_Engagement_Rate = median(Engagement_Rate, na.rm = TRUE),
  SD_Engagement_Rate = sd(Engagement_Rate, na.rm = TRUE)
), by = ACTOR_TYPE][order(-Mean_Engagement_Rate)]

5.1 Engagement Rate by Actor Type

This boxplot shows the distribution of engagement rates within each actor type. The box represents the interquartile range (middle 50% of sources), the line inside is the median, and points beyond the whiskers are outliers.

Interpretation:

Higher median engagement rates indicate actor types whose content typically resonates better with audiences
Wider boxes indicate more variation within the category
Outliers may represent exceptionally successful (or unsuccessful) individual sources

Show code

# Filter out extreme outliers for visualization
engagement_plot_data <- source_engagement[Engagement_Rate < quantile(Engagement_Rate, 0.99, na.rm = TRUE)]

ggplot(engagement_plot_data, aes(x = reorder(ACTOR_TYPE, Engagement_Rate, FUN = median), 
                                  y = Engagement_Rate, fill = ACTOR_TYPE)) +
  geom_boxplot(outlier.alpha = 0.3) +
  coord_flip() +
  scale_fill_manual(values = actor_colors) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Engagement Rate Distribution by Actor Type",
    subtitle = "Engagement Rate = (Total Interactions / Followers) × 100",
    x = NULL,
    y = "Engagement Rate (%)",
    caption = "Outliers above 99th percentile excluded for visualization"
  ) +
  theme(legend.position = "none")

5.2 Actor Type Performance Summary

This table summarizes engagement metrics for each actor type, including both mean and median engagement rates. The median is often more informative as it is less sensitive to outliers.

Show code

actor_engagement %>%
  mutate(
    Sources = format(Sources, big.mark = ","),
    Total_Posts = format(Total_Posts, big.mark = ","),
    Total_Interactions = format(Total_Interactions, big.mark = ","),
    Mean_Engagement_Rate = sprintf("%.2f%%", Mean_Engagement_Rate),
    Median_Engagement_Rate = sprintf("%.2f%%", Median_Engagement_Rate)
  ) %>%
  select(ACTOR_TYPE, Sources, Total_Posts, Total_Interactions, 
         Mean_Engagement_Rate, Median_Engagement_Rate) %>%
  kable(col.names = c("Actor Type", "Sources", "Posts", "Interactions", 
                      "Mean Eng. Rate", "Median Eng. Rate")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Actor Type	Sources	Posts	Interactions	Mean Eng. Rate	Median Eng. Rate
Diocesan	5	2,840	372,538	603.76%	249.53%
Lay Influencers	12	13,144	3,437,969	472.49%	501.59%
Independent Media	4	5,742	1,507,611	312.10%	216.44%
Academic	1	190	7,392	97.45%	97.45%
Institutional Official	77	3,959	217,611	16.99%	0.48%
Youth Organizations	3	51	6,697	15.97%	11.02%
Other	4,134	53,823	6,748,754	9.95%	0.67%

5.3 Statistical Comparison: Institutional vs Non-Institutional

To formally test whether institutional actors underperform, we group sources into institutional (Institutional Official, Diocesan, Academic) and non-institutional categories, then apply the Wilcoxon rank sum test. This non-parametric test is appropriate because engagement rates are typically not normally distributed.
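
The reason a rank-based test suits this comparison can be shown with a toy example. The values below are invented: one group contains two extreme rates that inflate its mean, while the bulk of both groups looks alike.

```r
# Made-up engagement rates (%); two extreme values in the first group
institutional     <- c(0.2, 0.4, 0.55, 0.65, 0.75, 250, 600)
non_institutional <- c(0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 5, 12)

c(mean = mean(institutional), median = median(institutional))
c(mean = mean(non_institutional), median = median(non_institutional))

# The test compares ranks, so the size of the two extreme values does not matter
res <- wilcox.test(institutional, non_institutional, alternative = "two.sided")
res$statistic   # W = 28
res$p.value
```

Here the first group has a much higher mean but a lower median, and the test statistic stays close to its expected value under no difference (31.5 for groups of 7 and 9), so the test does not flag a difference.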

Show code

# Group into institutional vs non-institutional
source_engagement[, Institution_Group := ifelse(
  ACTOR_TYPE %in% c("Institutional Official", "Diocesan", "Academic"),
  "Institutional",
  "Non-Institutional"
)]

# Wilcoxon test (non-parametric)
wilcox_result <- wilcox.test(
  Engagement_Rate ~ Institution_Group, 
  data = source_engagement,
  alternative = "two.sided"
)

# Summary statistics by group
group_stats <- source_engagement[, .(
  N = .N,
  Mean = mean(Engagement_Rate, na.rm = TRUE),
  Median = median(Engagement_Rate, na.rm = TRUE),
  SD = sd(Engagement_Rate, na.rm = TRUE)
), by = Institution_Group]

group_stats %>%
  mutate(
    Mean = sprintf("%.2f%%", Mean),
    Median = sprintf("%.2f%%", Median),
    SD = sprintf("%.2f", SD)
  ) %>%
  kable(col.names = c("Group", "N Sources", "Mean Eng. Rate", "Median Eng. Rate", "Std. Dev.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Group	N Sources	Mean Eng. Rate	Median Eng. Rate	Std. Dev.
Institutional	83	53.31%	0.59%	210.17
Non-Institutional	4153	11.59%	0.68%	65.50

Show code

cat("Wilcoxon Rank Sum Test Results:\n")
cat("W =", wilcox_result$statistic, "\n")
cat("p-value =", format.pval(wilcox_result$p.value, digits = 4), "\n")

if (wilcox_result$p.value < 0.05) {
  cat("Result: Significant difference between institutional and non-institutional actors\n")
} else {
  cat("Result: No significant difference detected\n")
}

Wilcoxon Rank Sum Test Results:
W = 182808
p-value = 0.3422
Result: No significant difference detected

5.4 Engagement Rate Comparison Visualization

This violin plot combines a density estimate (the violin shape shows where values cluster) with a boxplot (showing median and interquartile range). The red diamond marks the mean.

Show code

ggplot(source_engagement[Engagement_Rate < quantile(Engagement_Rate, 0.95, na.rm = TRUE)], 
       aes(x = Institution_Group, y = Engagement_Rate, fill = Institution_Group)) +
  geom_violin(alpha = 0.7) +
  geom_boxplot(width = 0.2, fill = "white", outlier.shape = NA) +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 4, color = "red") +
  scale_fill_manual(values = c("Institutional" = "#1a3c5a", "Non-Institutional" = "#e07b39")) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  labs(
    title = "Institutional vs Non-Institutional Engagement Rates",
    subtitle = "Red diamond = mean; white box = median and IQR",
    x = NULL,
    y = "Engagement Rate (%)",
    caption = "Top 5% outliers excluded for visualization"
  ) +
  theme(legend.position = "none")

6 Analysis 1.4: Cross-Platform Presence

Core question: Which actors maintain presence across multiple platforms?

In contemporary digital media strategy, cross-platform presence is often considered essential for maximizing reach and resilience. Organizations that operate on multiple platforms can reach different audience segments and reduce dependence on any single platform's algorithmic decisions. However, maintaining quality presence across platforms requires significant resources, potentially creating advantages for larger, better-resourced actors.

We examine how platform presence varies across actor types and whether multi-platform presence correlates with total engagement.

Show code

# Calculate cross-platform presence per source
cross_platform <- dta[, .(
  Platforms = uniqueN(SOURCE_TYPE),
  Platform_List = paste(unique(SOURCE_TYPE), collapse = ", "),
  Total_Posts = .N,
  Total_Interactions = sum(INTERACTIONS, na.rm = TRUE),
  Actor_Type = first(ACTOR_TYPE)
), by = FROM][order(-Platforms, -Total_Interactions)]

6.1 Platform Presence Distribution

This bar chart shows how many sources operate on a given number of platforms. Most digital actors are single-platform operators, with progressively fewer maintaining presence across multiple platforms.

Show code

presence_summary <- cross_platform[, .(Sources = .N), by = Platforms][order(Platforms)]

ggplot(presence_summary, aes(x = factor(Platforms), y = Sources)) +
  geom_col(fill = "#2c5f7c", width = 0.7) +
  geom_text(aes(label = format(Sources, big.mark = ",")), vjust = -0.5, size = 3.5) +
  scale_y_continuous(labels = comma, expand = expansion(mult = c(0, 0.1))) +
  labs(
    title = "Distribution of Cross-Platform Presence",
    subtitle = "Number of sources by platform count",
    x = "Number of Platforms",
    y = "Number of Sources"
  )

6.2 Top Multi-Platform Actors

Sources present on three or more platforms represent the most diversified digital strategies. This table identifies these actors and their engagement levels.

Show code

cross_platform[Platforms >= 3] %>%
  head(20) %>%  # head() avoids NA padding when fewer than 20 sources qualify
  mutate(
    Total_Posts = format(Total_Posts, big.mark = ","),
    Total_Interactions = format(Total_Interactions, big.mark = ",")
  ) %>%
  select(FROM, Actor_Type, Platforms, Platform_List, Total_Posts, Total_Interactions) %>%
  kable(col.names = c("Source", "Actor Type", "# Platforms", "Platforms", "Posts", "Interactions")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  scroll_box(height = "400px")

Source	Actor Type	# Platforms	Platforms	Posts	Interactions
index.hr	Other	3	web, comment, forum	8,095	3,312,573
jutarnji.hr	Other	3	web, comment, instagram	7,703	1,595,040
telegram.hr	Other	3	web, comment, instagram	2,902	1,164,087
Zagrebačka nadbiskupija	Diocesan	3	facebook, youtube, twitter	2,023	241,657
24sata	Other	3	facebook, youtube, twitter	512	177,044
Večernji list	Other	3	facebook, youtube, twitter	760	130,601
Sisačka biskupija	Diocesan	3	facebook, twitter, youtube	482	87,832
N1 Hrvatska	Other	3	facebook, twitter, youtube	331	39,538
023.hr	Other	3	web, facebook, twitter	774	31,242
Stephen Nikola Bartulica	Other	3	twitter, youtube, facebook	21	10,969
bljesak.info	Other	3	web, comment, instagram	108	7,769
Novosti	Other	3	twitter, facebook, youtube	81	7,299
Željana Zovko	Other	3	facebook, twitter, youtube	56	6,118
Women in Adria	Other	3	facebook, twitter, youtube	19	341
Caritas Dubrovačke biskupije	Other	3	twitter, youtube, facebook	5	22

6.3 Cross-Platform Presence by Actor Type

This chart compares average platform presence across actor types, revealing which categories tend toward single-platform focus versus multi-platform strategies.

Show code

actor_platform_summary <- cross_platform[, .(
  Sources = .N,
  Mean_Platforms = mean(Platforms),
  Multi_Platform_Share = sum(Platforms >= 2) / .N * 100,
  Total_Interactions = sum(Total_Interactions)
), by = Actor_Type][order(-Mean_Platforms)]

ggplot(actor_platform_summary, aes(x = reorder(Actor_Type, Mean_Platforms), 
                                    y = Mean_Platforms, fill = Actor_Type)) +
  geom_col(width = 0.7) +
  geom_text(aes(label = sprintf("%.2f", Mean_Platforms)), hjust = -0.1, size = 3.5) +
  coord_flip() +
  scale_fill_manual(values = actor_colors) +
  scale_y_continuous(limits = c(0, max(actor_platform_summary$Mean_Platforms) * 1.15)) +
  labs(
    title = "Average Platform Presence by Actor Type",
    subtitle = "Mean number of platforms per source",
    x = NULL,
    y = "Average Number of Platforms"
  ) +
  theme(legend.position = "none")

6.4 Correlation: Multi-Platform Presence and Engagement

Does operating on more platforms go together with greater total engagement? This analysis tests the relationship using Spearman's rank correlation, which is appropriate for monotonic but possibly non-linear relationships and for non-normal distributions.

Interpretation:

A positive correlation suggests that multi-platform strategies are associated with higher engagement
However, correlation does not imply causation: successful actors may simply have resources for both high engagement and multi-platform presence
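
Because Spearman's coefficient works on ranks, it is unaffected by monotone transformations such as the log scale used in the plot. The toy data below are made up to show this; `exact = FALSE` is set because platform counts contain many ties, for which an exact p-value cannot be computed.

```r
# Made-up sources: number of platforms and total interactions
n_platforms  <- c(1, 1, 1, 2, 2, 3)
interactions <- c(10, 40, 5, 300, 90, 20000)

res <- cor.test(n_platforms, interactions, method = "spearman", exact = FALSE)
rho <- unname(res$estimate)

# Same coefficient from Pearson correlation of the ranks, and after a log transform
rho_ranks <- cor(rank(n_platforms), rank(interactions))
rho_log   <- cor(n_platforms, log10(interactions), method = "spearman")

c(rho = rho, rho_ranks = rho_ranks, rho_log = rho_log)
```

All three values coincide, which is why the single extreme value of 20,000 does not dominate the result as it would with a Pearson correlation on the raw totals.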

Show code

# Calculate correlation
cor_test <- cor.test(cross_platform$Platforms, cross_platform$Total_Interactions, 
                     method = "spearman")

ggplot(cross_platform[Total_Interactions > 0], 
       aes(x = factor(Platforms), y = Total_Interactions)) +
  geom_boxplot(fill = "#2c5f7c", alpha = 0.7, outlier.alpha = 0.3) +
  scale_y_log10(labels = comma) +
  labs(
    title = "Engagement by Number of Platforms",
    subtitle = paste0("Spearman correlation: ρ = ", round(cor_test$estimate, 3),
                      ", p ", ifelse(cor_test$p.value < 0.001, "< 0.001", 
                                     paste0("= ", round(cor_test$p.value, 4)))),
    x = "Number of Platforms",
    y = "Total Interactions (log scale)"
  )

7 Summary and Key Findings

7.1 Platform Distribution Findings

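Volume distribution: Web content dominates in volume (73.6% of posts) and still accounts for most engagement (67.2%), but social platforms earn more interactions per post
Engagement efficiency: Instagram, YouTube, and Facebook have efficiency indices above 1.0 (roughly 2.9, 1.4, and 1.4, computed from the Section 3.2 totals), indicating higher resonance per post; web is slightly below 1.0 and Twitter far below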
Strategic implications: Catholic communicators seeking engagement should prioritize social platforms, while web remains important for searchability and archival purposes

7.2 Visibility Stratification Findings

Concentration: The top 10 sources capture 35.7% of total engagement and the top 50 capture 62.7%, out of 16,426 sources
Gini coefficient: Value of 0.982 indicates very high inequality in visibility distribution
Power law: The log-log rank plot checks whether the rank-engagement relationship approximates a power law, a pattern typical of networked media systems
Implications: The Croatian Catholic digital space exhibits strong winner-take-all dynamics with a very long tail of barely visible sources, raising questions about diversity of voices

7.3 Institutional Gap Findings

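Engagement rate differential: Institutional sources show a higher mean engagement rate than non-institutional sources (53.31% vs 11.59%) but a slightly lower median (0.59% vs 0.68%), so the gap in means is driven by a small number of high-rate accounts
Statistical significance: The Wilcoxon rank sum test finds no significant difference between the two groups (W = 182,808, p = 0.3422)
Implications: The data do not support a general institutional gap in engagement rates; the sharper contrast lies within the institutional group, where the five Diocesan sources (median 249.53%) far outperform the 77 Institutional Official sources (median 0.48%)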

7.4 Cross-Platform Presence Findings

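Single-platform dominance: Most sources operate on only one platform; only 15 sources appear on three or more
Multi-platform advantage: The Spearman correlation in Section 6.4 tests whether sources present on more platforms accumulate more total engagement; any association should not be read as causal
Actor type patterns: The sources on three or more platforms are mostly secular outlets and public figures classified as Other, plus two diocesan accounts (Zagrebačka nadbiskupija and Sisačka biskupija); no Independent Media source reaches three platforms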

8 Appendix: Technical Notes

8.1 Classification System Details

The actor classification employs the following category definitions:

Actor Type	Definition
Institutional Official	Central Church institutions: Bishops' Conference, Catholic Network (HKM), Information Agency (IKA), Catholic Radio (HKR)
Diocesan	Diocese- and parish-level communications including archdioceses, dioceses, eparchies, and individual parishes
Independent Media	Catholic media outlets operating independently of Church hierarchy: Laudato TV, Bitno.net, Glas Koncila, etc.
Religious Orders	Communications from religious orders and congregations including Franciscans, Jesuits, Dominicans, and female religious
Charismatic Communities	Renewal movements and charismatic communities: Bozja Pobjeda, Cenacolo, Neokatekumenat, etc.
Individual Priests	Named clergy identified by clerical titles (Fra, Don, Msgr, etc.)
Youth Organizations	Youth ministry organizations: FRAMA, SHKM, university chaplaincies
Academic	Catholic educational institutions: Croatian Catholic University, theological faculties
Lay Influencers	Devotional and faith-focused social media pages run by laity
Other	Secular media covering religious topics, unidentified sources, general public discourse
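To illustrate how a title-based rule such as the one for Individual Priests might work, the sketch below matches source names that begin with one of the clerical titles named in the table. The function `classify_actor()`, the regular expression, and the example names are hypothetical; the project's actual classification rules are not shown in this section and are likely more extensive.

```r
# Hypothetical rule: a name starting with a clerical title (optionally
# followed by a period) and then a space is treated as an individual priest
classify_actor <- function(name) {
  is_priest <- grepl("^(fra|don|msgr)\\.?\\s+\\S", name, ignore.case = TRUE)
  ifelse(is_priest, "Individual Priests", "Other")
}

# Invented example names (not taken from the corpus)
examples <- c("Fra Ivan Horvat", "Don Marko Peric", "Msgr. Ante Kovac",
              "Donacije za sve", "Bitno.net")
labels <- classify_actor(examples)
data.frame(source = examples, actor_type = labels)
```

Requiring whitespace after the title keeps names that merely start with the same letters, such as "Donacije", from being misclassified. In a full system this rule would be one branch among several, with the remaining categories matched against curated lists of institution and outlet names.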

sessionInfo()

R version 4.5.2 (2025-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=Croatian_Croatia.utf8  LC_CTYPE=Croatian_Croatia.utf8   
[3] LC_MONETARY=Croatian_Croatia.utf8 LC_NUMERIC=C                     
[5] LC_TIME=Croatian_Croatia.utf8    

time zone: Europe/Zagreb
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ineq_0.2-13       kableExtra_1.4.0  knitr_1.50        scales_1.4.0     
 [5] data.table_1.17.8 lubridate_1.9.4   forcats_1.0.1     stringr_1.6.0    
 [9] dplyr_1.1.4       purrr_1.2.0       readr_2.1.6       tidyr_1.3.1      
[13] tibble_3.3.0      ggplot2_4.0.1     tidyverse_2.0.0  

loaded via a namespace (and not attached):
 [1] generics_0.1.4     xml2_1.5.1         lattice_0.22-7     stringi_1.8.7     
 [5] hms_1.1.4          digest_0.6.39      magrittr_2.0.4     evaluate_1.0.5    
 [9] grid_4.5.2         timechange_0.3.0   RColorBrewer_1.1-3 fastmap_1.2.0     
[13] Matrix_1.7-4       jsonlite_2.0.0     mgcv_1.9-3         viridisLite_0.4.2 
[17] textshaping_1.0.4  cli_3.6.5          rlang_1.1.6        splines_4.5.2     
[21] withr_3.0.2        yaml_2.3.11        tools_4.5.2        tzdb_0.5.0        
[25] vctrs_0.6.5        R6_2.6.1           lifecycle_1.0.4    htmlwidgets_1.6.4 
[29] pkgconfig_2.0.3    pillar_1.11.1      gtable_0.3.6       glue_1.8.0        
[33] systemfonts_1.3.1  xfun_0.54          tidyselect_1.2.1   rstudioapi_0.17.1 
[37] dichromat_2.0-0.1  farver_2.1.2       nlme_3.1-168       htmltools_0.5.8.1 
[41] rmarkdown_2.30     svglite_2.2.2      labeling_0.4.3     compiler_4.5.2    
[45] S7_0.2.1