MEData is a dataset of underwater fish census surveys conducted across the Mediterranean Sea. It includes environmental variables as well as conservation information on marine protected areas (MPAs).
library(tidyverse) # provides read_rds(), %>%, pivot_wider() and ggplot()
medata <- read_rds("data/medata.Rds")
| variable | type | comments |
|---|---|---|
| data.origin | fct | azz_asi = Azzuro; azz_malta = Azzuro; Belmaker = Belmaker; Claudet = Claudet; Sala - PEW = Enric Sala; Periklis_MER = Periklis Kleitou, MER Lab |
| country | fct | country where the site is located |
| season | fct | season in which the survey was conducted; levels: Autumn, Spring, Summer |
| lon | num | longitude (decimal degrees); approximate for Linosa, Kornati |
| lat | num | latitude (decimal degrees); approximate for Linosa, Kornati |
| site | fct | site name |
| trans | int | transect number |
| unique_trans_id | int | unique identifier for each transect (unique site + transect + coordinates + depth) |
| protection | lgl | inside MPA/unfished (TRUE) or outside MPA/fished (FALSE); linked to enforcement level |
| enforcement | int | type of enforcement. 3 levels: 1 = Minimal enforcement; 2 = Medium enforcement; 3 = Fully protected (strongest enforcement). Complements protection |
| total.mpa.ha | num | total MPA area in hectares; Source for Ustica; Source for Marettimo |
| size.notake | num | area of the no-take zone (in ha); Source for Ustica |
| yr.creation | int | MPA establishment year |
| age.reserve.yr | int | age of the MPA in years (corresponds to yr.creation) at the time of survey |
| depth | num | depth of survey in meters |
| tmean | num | annual mean sea surface temperature (Source: Bio-ORACLE) |
| sal_mean | num | annual mean salinity (Source: Bio-ORACLE) |
| pp_mean | num | annual mean primary productivity (Source: Bio-ORACLE) |
| species | fct | species scientific name (format: Genus.species); some taxa are identified only to genus level (includes ‘spp’ suffix) or family level (includes -dae suffix) |
| sp.n | int | fish count: how many individuals of this species were observed |
| sp.length | num | length of fish in cm |
| a | num | species length-weight relationship constant (Source: FishBase; Type = TL; using method ‘Type I linear regression’) |
| b | num | species length-weight relationship constant (Source: FishBase; Type = TL; using method ‘Type I linear regression’) |
| family | fct | taxonomic family |
| exotic | lgl | whether this species is local (indigenous, FALSE) or introduced (lessepsian migrant, TRUE) |
| FoodTroph | num | trophic level of the species. Extracted from FishBase |
| FoodSeTroph | num | standard error of the trophic level calculation (FoodTroph). Extracted from FishBase |
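The `a` and `b` constants can be combined with `sp.length` and `sp.n` to estimate biomass through the standard length-weight relationship W = a * L^b. The block below is a minimal sketch on a toy data frame, not the authors' own code. It assumes the usual FishBase convention of length in cm and weight in g; the values and the column name `biomass_g` are made up for illustration.

```r
# Toy rows mimicking the medata columns used here (values are illustrative only)
toy <- data.frame(
  species   = c("Coris.julis", "Diplodus.sargus", "Labridae"),
  sp.n      = c(3L, 2L, 1L),
  sp.length = c(10, 20, 8),
  a         = c(0.01, 0.02, NA),  # NA: no length-weight constants available
  b         = c(3, 3, NA)
)

# Weight of one individual: W = a * L^b (assumed: L in cm, W in g).
# Multiplying by sp.n gives the biomass represented by the whole row.
toy$biomass_g <- toy$sp.n * toy$a * toy$sp.length^toy$b

toy[, c("species", "biomass_g")]
```

Rows without `a` and `b` (for example, taxa identified only to family level) yield `NA`, so they need to be dropped or handled explicitly before summing biomass.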
summary(medata)
## data.origin country season lon
## azz_asi : 3875 Italy :13380 autumn:18275 Min. : 1.159
## azz_malt : 4179 Israel :10324 spring: 5464 1st Qu.: 8.349
## Belmaker :25679 Greece : 6424 summer:21015 Median :15.751
## Sala - PEW : 9111 France : 3617 Mean :19.375
## Periklis_MER: 1910 Croatia: 3508 3rd Qu.:34.073
## Spain : 2959 Max. :35.076
## (Other): 4542
## lat site trans unique_trans_id
## Min. :32.42 asinara_add: 3033 Min. : 1 Length:44754
## 1st Qu.:34.97 gdor : 2941 1st Qu.: 239 Class :character
## Median :36.74 achziv : 2900 Median :1432 Mode :character
## Mean :37.63 shikmona : 2373 Mean :1232
## 3rd Qu.:41.05 habonim : 2110 3rd Qu.:2000
## Max. :44.94 malta : 1582 Max. :2392
## (Other) :29815 NA's :35
## protection enforcement total.mpa.ha size.notake
## Mode :logical 0 :18648 Min. : 15.95 Min. : 0.0
## FALSE:18648 1 : 7780 1st Qu.: 191.00 1st Qu.: 167.7
## TRUE :25784 2 : 9745 Median : 785.00 Median : 519.2
## NA's :322 3 : 8259 Mean : 5487.56 Mean : 2950.9
## NA's: 322 3rd Qu.: 2375.00 3rd Qu.: 2651.0
## Max. :207000.00 Max. :15000.0
## NA's :16093 NA's :19618
## yr.creation age.reserve.yr depth tmean
## 2002 : 8879 1 : 4030 Min. : 1.000 Min. :16.78
## 1962 : 3617 55 : 3617 1st Qu.: 5.200 1st Qu.:18.66
## 1960 : 3508 57 : 3508 Median :10.000 Median :20.09
## 1986 : 3126 9 : 3193 Mean : 9.365 Mean :20.08
## 2004 : 2941 32 : 3126 3rd Qu.:11.000 3rd Qu.:22.11
## (Other):11316 (Other):15913 Max. :29.000 Max. :23.00
## NA's :11367 NA's :11367 NA's :16376 NA's :1366
## sal_mean pp_mean species sp.n
## Min. :33.99 Min. :0.0004 Length:44754 Min. : 0.00
## 1st Qu.:37.67 1st Qu.:0.0005 Class :character 1st Qu.: 1.00
## Median :37.89 Median :0.0008 Mode :character Median : 1.00
## Mean :38.08 Mean :0.0029 Mean : 16.64
## 3rd Qu.:38.79 3rd Qu.:0.0062 3rd Qu.: 4.00
## Max. :39.25 Max. :0.0096 Max. :10000.00
## NA's :1366 NA's :1366
## sp.length family exotic FoodTroph
## Min. : 0.00 Labridae :19351 Mode :logical Min. :2.000
## 1st Qu.: 8.00 Sparidae :10028 FALSE:41157 1st Qu.:3.240
## Median : 10.00 Serranidae : 4182 TRUE :3549 Median :3.340
## Mean : 11.66 Pomacentridae: 2956 NA's :48 Mean :3.271
## 3rd Qu.: 15.00 Siganidae : 2403 3rd Qu.:3.500
## Max. :150.00 Scaridae : 1341 Max. :4.500
## NA's :842 (Other) : 4493 NA's :827
## FoodSeTroph a b
## Min. :0.0000 Min. :0.001 Min. :2.429
## 1st Qu.:0.4100 1st Qu.:0.011 1st Qu.:2.892
## Median :0.4300 Median :0.015 Median :3.042
## Mean :0.4228 Mean :0.017 Mean :3.017
## 3rd Qu.:0.4700 3rd Qu.:0.020 3rd Qu.:3.122
## Max. :0.9100 Max. :0.062 Max. :3.482
## NA's :827 NA's :8444 NA's :8444
skimr::skim(medata)
| Name | medata |
| Number of rows | 44754 |
| Number of columns | 27 |
| _______________________ | |
| Column type frequency: | |
| character | 2 |
| factor | 8 |
| logical | 2 |
| numeric | 15 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| unique_trans_id | 0 | 1 | 11 | 55 | 0 | 2269 | 0 |
| species | 0 | 1 | 8 | 28 | 0 | 135 | 0 |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| data.origin | 0 | 1.00 | FALSE | 5 | Bel: 25679, Sal: 9111, azz: 4179, azz: 3875 |
| country | 0 | 1.00 | FALSE | 9 | Ita: 13380, Isr: 10324, Gre: 6424, Fra: 3617 |
| season | 0 | 1.00 | FALSE | 3 | sum: 21015, aut: 18275, spr: 5464 |
| site | 0 | 1.00 | FALSE | 84 | asi: 3033, gdo: 2941, ach: 2900, shi: 2373 |
| enforcement | 322 | 0.99 | FALSE | 4 | 0: 18648, 2: 9745, 3: 8259, 1: 7780 |
| yr.creation | 11367 | 0.75 | FALSE | 16 | 200: 8879, 196: 3617, 196: 3508, 198: 3126 |
| age.reserve.yr | 11367 | 0.75 | FALSE | 20 | 1: 4030, 55: 3617, 57: 3508, 9: 3193 |
| family | 0 | 1.00 | FALSE | 41 | Lab: 19351, Spa: 10028, Ser: 4182, Pom: 2956 |
Variable type: logical
| skim_variable | n_missing | complete_rate | mean | count |
|---|---|---|---|---|
| protection | 322 | 0.99 | 0.58 | TRU: 25784, FAL: 18648 |
| exotic | 48 | 1.00 | 0.08 | FAL: 41157, TRU: 3549 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| lon | 0 | 1.00 | 19.38 | 11.59 | 1.16 | 8.35 | 15.75 | 34.07 | 35.08 | ▅▇▂▅▇ |
| lat | 0 | 1.00 | 37.63 | 3.85 | 32.42 | 34.97 | 36.74 | 41.05 | 44.94 | ▇▇▃▅▅ |
| trans | 35 | 1.00 | 1231.51 | 866.81 | 1.00 | 239.00 | 1432.00 | 2000.00 | 2392.00 | ▇▁▅▅▇ |
| total.mpa.ha | 16093 | 0.64 | 5487.56 | 20976.18 | 15.95 | 191.00 | 785.00 | 2375.00 | 207000.00 | ▇▁▁▁▁ |
| size.notake | 19618 | 0.56 | 2950.91 | 5209.93 | 0.00 | 167.70 | 519.20 | 2651.00 | 15000.00 | ▇▁▁▁▂ |
| depth | 16376 | 0.63 | 9.37 | 5.36 | 1.00 | 5.20 | 10.00 | 11.00 | 29.00 | ▆▇▂▁▁ |
| tmean | 1366 | 0.97 | 20.08 | 2.02 | 16.78 | 18.66 | 20.09 | 22.11 | 23.00 | ▅▆▇▂▇ |
| sal_mean | 1366 | 0.97 | 38.08 | 0.86 | 33.99 | 37.67 | 37.89 | 38.79 | 39.25 | ▁▁▁▇▇ |
| pp_mean | 1366 | 0.97 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | ▇▂▁▃▁ |
| sp.n | 0 | 1.00 | 16.64 | 152.29 | 0.00 | 1.00 | 1.00 | 4.00 | 10000.00 | ▇▁▁▁▁ |
| sp.length | 842 | 0.98 | 11.66 | 7.69 | 0.00 | 8.00 | 10.00 | 15.00 | 150.00 | ▇▁▁▁▁ |
| FoodTroph | 827 | 0.98 | 3.27 | 0.40 | 2.00 | 3.24 | 3.34 | 3.50 | 4.50 | ▁▁▇▁▁ |
| FoodSeTroph | 827 | 0.98 | 0.42 | 0.13 | 0.00 | 0.41 | 0.43 | 0.47 | 0.91 | ▁▁▇▁▁ |
| a | 8444 | 0.81 | 0.02 | 0.01 | 0.00 | 0.01 | 0.01 | 0.02 | 0.06 | ▅▇▁▁▁ |
| b | 8444 | 0.81 | 3.02 | 0.14 | 2.43 | 2.89 | 3.04 | 3.12 | 3.48 | ▁▁▇▆▁ |
Data was collected from 9 countries:
medata %>% distinct(country) %>% arrange(.$country) %>% print(n = Inf)
## # A tibble: 9 × 1
## country
## <fct>
## 1 Croatia
## 2 France
## 3 Greece
## 4 Israel
## 5 Italy
## 6 Malta
## 7 Spain
## 8 Turkey
## 9 Cyprus
Data was collected from 72 sites:
The dataset contains 135 taxa, of which 124 are identified to species level:
medata %>% distinct(species) %>% arrange(.$species) %>% count()
## # A tibble: 1 × 1
## n
## <int>
## 1 135
# proper species
medata %>% distinct(species) %>% arrange(.$species) %>%
filter(!grepl("dae$|\\.spp", species)) %>% # drop family-level (-dae) and genus-level (.spp) taxa
print(n = Inf)
## # A tibble: 124 × 1
## species
## <chr>
## 1 Abudefduf.saxatilis
## 2 Anthias.anthias
## 3 Apogon.imberbis
## 4 Atherina.boyeri
## 5 Balistes.capriscus
## 6 Belone.belone
## 7 Boops.boops
## 8 Bothus.podas
## 9 Caranx.crysos
## 10 Cheilodipterus.novemstriatus
## 11 Chelon.labrosus
## 12 Chromis.chromis
## 13 Conger.conger
## 14 Coris.julis
## 15 Ctenolabrus.rupestris
## 16 Dactylopterus.volitans
## 17 Dasyatis.pastinaca
## 18 Dentex.dentex
## 19 Dicentrarchus.labrax
## 20 Dicentrarchus.punctatus
## 21 Diplodus.annularis
## 22 Diplodus.cervinus
## 23 Diplodus.puntazzo
## 24 Diplodus.sargus
## 25 Diplodus.vulgaris
## 26 Epinephelus.aeneus
## 27 Epinephelus.caninus
## 28 Epinephelus.costae
## 29 Epinephelus.marginatus
## 30 Euthynnus.alletteratus
## 31 Fistularia.commersonii
## 32 Gobius.auratus
## 33 Gobius.bucchichi
## 34 Gobius.cobitis
## 35 Gobius.cruentatus
## 36 Gobius.geniporus
## 37 Gobius.incognitus
## 38 Gobius.paganellus
## 39 Gobius.vittatus
## 40 Gobius.xanthocephalus
## 41 Herklotsichthys.punctatus
## 42 Labrus.merula
## 43 Labrus.mixtus
## 44 Labrus.viridis
## 45 Lagocephalus.sceleratus
## 46 Lichia.amia
## 47 Lithognathus.mormyrus
## 48 Liza.aurata
## 49 Mugil.cephalus
## 50 Mullus.barbatus
## 51 Mullus.surmuletus
## 52 Muraena.helena
## 53 Mycteroperca.rubra
## 54 Myliobatis.aquila
## 55 Oblada.melanura
## 56 Oedalechilus.labeo
## 57 Pagellus.acarne
## 58 Pagrus.caeruleostictus
## 59 Pagrus.pagrus
## 60 Parablennius.gattorugine
## 61 Parablennius.incognitus
## 62 Parablennius.pilicornis
## 63 Parablennius.rouxi
## 64 Parablennius.sanguinolentus
## 65 Parablennius.tentacularis
## 66 Parablennius.zvonimiri
## 67 Parupeneus.forsskali
## 68 Pempheris.rhomboidea
## 69 Pempheris.vanicolensis
## 70 Phycis.phycis
## 71 Plotosus.lineatus
## 72 Pomadasys.incisus
## 73 Pomatomus.saltatrix
## 74 Pomatoschistus.quagga
## 75 Pseudocaranx.dentex
## 76 Pteragogus.pelycus
## 77 Pteragogus.trispilus
## 78 Pterois.miles
## 79 Sardina.pilchardus
## 80 Sargocentron.rubrum
## 81 Sarpa.salpa
## 82 Scartella.cristata
## 83 Scarus.ghobban
## 84 Sciaena.umbra
## 85 Scorpaena.maderensis
## 86 Scorpaena.notata
## 87 Scorpaena.porcus
## 88 Scorpaena.scrofa
## 89 Seriola.dumerili
## 90 Serranus.cabrilla
## 91 Serranus.hepatus
## 92 Serranus.scriba
## 93 Siganus.luridus
## 94 Siganus.rivulatus
## 95 Sparisoma.cretense
## 96 Sparus.aurata
## 97 Sphyraena.chrysotaenia
## 98 Sphyraena.sphyraena
## 99 Sphyraena.viridensis
## 100 Spicara.maena
## 101 Spicara.smaris
## 102 Spondyliosoma.cantharus
## 103 Stephanolepis.diaspros
## 104 Symphodus.cinereus
## 105 Symphodus.doderleini
## 106 Symphodus.mediterraneus
## 107 Symphodus.melanocercus
## 108 Symphodus.ocellatus
## 109 Symphodus.roissali
## 110 Symphodus.rostratus
## 111 Symphodus.tinca
## 112 Synodus.saurus
## 113 Taeniura.grabata
## 114 Thalassoma.pavo
## 115 Torquigener.flavimaculosus
## 116 Trachinotus.ovatus
## 117 Trachinus.draco
## 118 Trachurus.mediterraneus
## 119 Tripterygion.delaisi
## 120 Tripterygion.melanurus
## 121 Tripterygion.tripteronotus
## 122 Trisopterus.minutus
## 123 Upeneus.pori
## 124 Xyrichtys.novacula
NOTE: The dataset contains 135 taxa, but some are identified only to genus (Atherina.spp, Symphodus.spp, etc.) or family (Labridae, Clupeidae, Belonidae, etc.) rather than to species.
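When the taxonomic resolution matters, each taxon can be labelled by rank using the naming rules described in the variable table. This is a small sketch on example labels; it assumes that genus-level records end in `.spp` and family-level records end in `dae`, and the name `taxon_rank` is introduced here only for illustration.

```r
# Example taxon labels in the dataset's Genus.species format
taxa <- c("Coris.julis", "Symphodus.spp", "Labridae", "Atherina.boyeri")

# Assumed naming rules, taken from the variable table:
#   ".spp" suffix -> genus level; "dae" suffix -> family level
taxon_rank <- ifelse(grepl("\\.spp$", taxa), "genus",
              ifelse(grepl("dae$", taxa), "family", "species"))

table(taxon_rank)
```

The dot is escaped (`\\.`) so that it matches a literal period rather than any character.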
Visual fish census surveys were conducted by teams of skilled SCUBA divers between 2009 and 2019 at multiple locations across the Mediterranean Sea (Figure 1). Locations comprised sites within MPAs of varying size, age and enforcement level, and adjacent unprotected sites. To date, the database consists of 44,754 observations in 2,270 transects, covering 135 fish taxa, and includes mostly abundance data.
Figure 1. Survey sites and temperatures (in degrees Celsius).
The sampling protocol is described in Frid et al., 2022.
Environmental data (temperature, salinity and primary production) were acquired from Bio-ORACLE using the sdmpredictors package.
Layers:
(See “R/bio-oracle extraction code.R” for full extraction code).
Linosa and Kornati did not have specific coordinates, so an approximate location (lat-lon) was assigned to each. If your analysis requires fine-scale location data, you might want to omit these sites.
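Omitting these sites is a simple row filter. The sketch below uses a toy data frame with made-up coordinates; the site labels `"linosa"` and `"kornati"` are hypothetical spellings, so check `levels(medata$site)` for the names actually used in the dataset.

```r
# Toy site table (coordinates are illustrative, not taken from the dataset)
toy <- data.frame(
  site = c("linosa", "kornati", "achziv", "gdor"),
  lon  = c(12.87, 15.30, 35.10, 34.90),
  lat  = c(35.86, 43.80, 33.05, 32.40)
)

# Hypothetical labels for the sites with approximate coordinates
approx_sites <- c("linosa", "kornati")

# Keep only rows whose site is not in the approximate-coordinates list
precise <- toy[!tolower(toy$site) %in% approx_sites, ]
precise
```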
Data from the asinara_add site are presence-absence only.
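There are two common ways to deal with a presence-absence site in an otherwise abundance-based dataset, sketched below on toy data. This is an illustration under assumptions rather than a recommendation from the authors; the counts and the column name `present` are invented for the example.

```r
# Toy observations: one presence-absence site and one abundance site
toy <- data.frame(
  site    = c("asinara_add", "asinara_add", "gdor", "gdor"),
  species = c("Coris.julis", "Sarpa.salpa", "Coris.julis", "Sarpa.salpa"),
  sp.n    = c(1, 1, 12, 0)
)

# Option 1: drop the presence-absence site before abundance analyses
abund <- toy[toy$site != "asinara_add", ]

# Option 2: keep every site but reduce all counts to presence (1) / absence (0)
pa <- transform(toy, present = as.integer(sp.n > 0))
```

Option 1 keeps counts comparable across sites; option 2 retains all sites at the cost of discarding abundance information.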
Many analyses require a species matrix in which each row is a sampling unit (here, a transect) and each column is a species.
The following code creates such a matrix from this dataset:
# Create a species matrix in which rows are transects and columns are species
# (id_cols could be shortened, but all metadata columns are listed for explicitness)
med_mat <- medata %>%
  pivot_wider(id_cols = c(data.origin, country, season, lon, lat, site, trans, unique_trans_id, protection, enforcement, total.mpa.ha, size.notake, yr.creation, age.reserve.yr, depth, tmean, sal_mean, pp_mean),
              names_from = species, values_from = sp.n,
              # sum counts when a species appears more than once in a transect
              values_fn = function(x) sum(x, na.rm = TRUE), values_fill = 0)
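# Convert the abundance data to pseudo presence-absence
pres_abs_mat <- med_mat
# Species columns start at the first species and run to the last column
first_species <- which(colnames(med_mat) == "Atherina.boyeri")
species_cols <- first_species:ncol(pres_abs_mat)
# Replace each count with 1 (present) or 0 (absent), column by column
pres_abs_mat[species_cols] <- lapply(pres_abs_mat[species_cols], function(x) as.integer(x > 0))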
Note that the columns to the left of the first species still hold metadata. Some ecological analyses require a true matrix containing only one type of data (the abundance of each species); in that case, separate the metadata from the species matrix first.
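Separating the two parts can be sketched as follows, using a toy wide table in place of `med_mat`. The names `meta_cols`, `sp_mat` and `meta` are introduced here for illustration, and the choice of metadata columns is an assumption that should be adapted to the full list of `id_cols`.

```r
# Toy wide table: two metadata columns followed by species columns
wide <- data.frame(
  unique_trans_id = c("t1", "t2", "t3"),
  country         = c("Israel", "Italy", "Spain"),
  Coris.julis     = c(4, 0, 2),
  Sarpa.salpa     = c(0, 10, 1)
)

# Assumed metadata columns; everything else is treated as a species column
meta_cols <- c("unique_trans_id", "country")

# Numeric species matrix, with transect ids as row names
sp_mat <- as.matrix(wide[, setdiff(names(wide), meta_cols)])
rownames(sp_mat) <- wide$unique_trans_id

# Metadata kept separately, in the same row order as sp_mat
meta <- wide[, meta_cols]
```

Keeping `meta` in the same row order as `sp_mat` makes it straightforward to use the metadata as grouping or explanatory variables later.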
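# Map species richness per sampling location, coloured by country
# (med_map is assumed to be an sf object of the Mediterranean loaded beforehand)
col_scale <- fishualize::fish(n = 9, option = "Thalassoma_pavo")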
medata %>%
group_by(country, lon, lat) %>%
distinct(species) %>% summarise(richness = n()) %>%
ggplot() +
geom_sf(data = med_map$geometry) +
geom_point(aes(x = lon, y = lat, size = richness, col = country), pch = 21, alpha = 0.4) +
scale_colour_manual(values = col_scale, name = "Country") +
labs(x = "", y = "", size = "Richness")
## `summarise()` has grouped output by 'country', 'lon'. You can override using
## the `.groups` argument.
medata %>% filter(!is.na(exotic)) %>%
ggplot() + aes(x = exotic) + geom_bar(fill = c("#62a1c7", "#d53748")) +
labs(x = "Exotic (Lessepsian migrant)", y = "Total observations count")
medata %>% filter(!is.na(exotic)) %>% distinct(species, exotic) %>%
ggplot() + aes(x = exotic) + geom_bar(fill = c("#62a1c7", "#d53748")) +
labs(x = "Exotic (Lessepsian migrant)", y = "Total species count")
Froese, R. and D. Pauly. Editors. 2021. FishBase. World Wide Web electronic publication. www.fishbase.org, (06/2021).
Boettiger C, Temple Lang D, Wainwright P (2012). “rfishbase: exploring, manipulating and visualizing FishBase data from R.” Journal of Fish Biology.
Assis J, Tyberghein L, Bosch S, Heroen V, Serrão E, De Clerck O, Tittensor D (2018). “Bio-ORACLE v2.0: Extending marine data layers for bioclimatic modelling.” Global Ecology and Biogeography, 27(3), 277-284. doi: 10.1111/geb.12693 (https://doi.org/10.1111/geb.12693).
Tyberghein L, Heroen V, Pauly K, Troupin C, Mineur F, De Clerck O (2012). “Bio-ORACLE: a global environmental dataset for marine species distribution modelling.” Global Ecology and Biogeography, 21(2), 272-281. doi: 10.1111/j.1466-8238.2011.00656.x (https://doi.org/10.1111/j.1466-8238.2011.00656.x).