class: center, middle, inverse, title-slide .title[ # Session 6 - Spatial Data ] .subtitle[ ##
R for Data Analysis
] .author[ ### DIME Analytics ] .date[ ### The World Bank – DIME |
WB Github
April 2025 ] --- # Table of contents .large[ 1. Overview of GIS concepts 2. Load and explore polygons, polylines, and points 3. Static maps 4. Interactive maps 5. Spatial operations applied on one dataset 6. Spatial operations applied on multiple datasets ] --- # Overview of GIS conceps __Spatial data:__ The two main types of spatial data are __vector data__ and __raster data__ .pull-left[ __Vector data__ * Points, lines, or polygons * Common file formats include shapefiles (.shp) and geojsons (.geojson) * Examples: polygons on countries, polylines of roads, points of schools ] .pull-right[ __Raster data__ * Spatially referenced grid * Common file format is a geotif (.tif) * Example: Satellite imagery of nighttime lights ] .center[ <img src="img/Simple_vector_map.svg.png" width = 300> <img src="img/raster.png" width = 300> ] --- # Coordinate Reference Systems (CRS) * Coordinate reference systems use pairs of numbers to define a location on the earth * For example, the World Bank is at a latitude of 38.89 and a longitude of -77.04 .center[ <img src="img/googlemaps_worldbank.png" width = 700> ] --- # Coordinate Reference Systems (CRS) There are many different coordinate reference systems, which can be grouped into __geographic__ and __projected__ coordinate reference systems. Geographic systems live on a sphere, while projected systems are “projected” onto a flat surface. .center[ <img src="img/geo_proj_crs.png" width = 800> ] --- # Geographic Coordinate Systems .pull-left[ __Units:__ Defined by latitude and longitude, which measure angles and units are typically in decimal degrees. (Eg, angle is latitude from the equator). __Latitude & Longitude:__ * On a grid X = longitude, Y = latitude; sometimes represented as (longitude, latitude). * Also has become convention to report them in alphabetical order: (latitude, longitude) — such as in Google Maps. * Valid range of latitude: -90 to 90 * Valid range of longitude: -180 to 180 * __{Tip}__ Latitude sounds (and looks!) like latter. ] .center[ <img src="img/longlat.png" width = 500> ] --- # Geographic Coordinate Systems .pull-left[ __Distance on a sphere__ * At the equator (latitude = 0), a 1 decimal degree longitude distance is about 111km; towards the poles (latitude = -90 or 90), a 1 decimal degree longitude distance converges to 0 km. * We must be careful (ie, use algorithms that account for a spherical earth) to calculate distances! The distance along a sphere is referred to as a [great circle distance](https://en.wikipedia.org/wiki/Great-circle_distance). * Multiple options for spherical distance calculations, with trade-off between accuracy & complexity. (See distance section for details). ] .pull-right[ .center[ <img src="img/longitude_distance.png" width = 300> ] .center[ <img src="img/greatcircle.png" width = 400> ] ] --- # Geographic Coordinate Systems .pull-left[ __Datums__ * __Is the earth flat?__ No! * __Is the earth a sphere?__ No! * __Is the earth a lumpy ellipsoid?__ [Yes!](https://oceanservice.noaa.gov/facts/earth-round.html#:~:text=The%20Earth%20is%20an%20irregularly%20shaped%20ellipsoid.&text=While%20the%20Earth%20appears%20to,unique%20and%20ever%2Dchanging%20shape.) The earth is a lumpy ellipsoid, a bit flattened at the poles. * A [datum](https://www.maptoaster.com/maptoaster-topo-nz/articles/projection/datum-projection.html) is a model of the earth that is used in mapping. One of the most common datums is [WGS 84](https://en.wikipedia.org/wiki/World_Geodetic_System), which is used by the Global Positional System (GPS). * A datum is a reference ellipsoid that approximates the shape of the earth. * Other datums exist, and the latitude and longitude values for a specific location will be different depending on the datum. ] .pull-right[ .center[ <img src="img/datum1.png" width = 300> ] .center[ <img src="img/datum2.png" width = 300> ] ] --- # Projected Coordinate Systems .pull-left[ Projected coordinate systems project spatial data from a 3D to 2D surface. __Distortions:__ Projections will distort some combination of distance, area, shape or direction. Different projections can minimize distorting some aspect at the expense of others. __Units:__ When projected, points are represented as “northings” and “eastings.” Values are often represented in meters, where northings/eastings are the meter distance from some reference point. Consequently, values can be very large! __Datums still relevant:__ Projections start from some representation of the earth. Many projections (eg, [UTM](https://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system)) use the WGS84 datum as a starting point (ie, reference datum), then project it onto a flat surface. ] .pull-right[ Click [here](https://www.youtube.com/watch?v=eLqC3FNNOaI) to see why Toby & CJ are confused (hint: projections!) .center[ <img src="img/westwing.png" width = 400> ] .center[ <img src="img/mercator_galls.png" width = 400> ] ] --- # Projected Coordinate Systems .center[ <img src="img/map_xkcd.png" width = 1000> ] --- # Referencing coordinate reference systems .large[ * There are many ways to reference coordinate systems, some of which are verbose. * __PROJ__ (Library for projections) way of referencing WGS84 `+proj=longlat +datum=WGS84 +no_defs +type=crs` * __[EPSG](https://epsg.io/)__ Assigns numeric code to CRSs to make it easier to reference. Here, WGS84 is `4326`. ] --- # Coordinate Reference Systems Whenever have spatial data, need to know which coordinate reference system (CRS) the data is in. * You wouldn’t say __“I am 5 away”__ * You would say __“I am 5 [miles / kilometers / minutes / hours] away”__ (units!) * Similarly, a “complete” way to describe location would be: I am at __6.51 latitude, 3.52 longitude using the WGS 84 CRS__ --- # Introduction - This session could be a whole course on its own, but we only have an hour and half. - To narrow our subject, we will focus on only one type of spatial data, vector data. - This is the most common type of spatial data that non-GIS experts will encounter in their work. - We will use the `sf` package, which is the tidyverse-compatible package for geospatial data in R. - For visualizing, we'll rely on `ggplot2` for static maps and `leaflet` for interactive maps --- # Setup .panelset[ .panel[.panel-name[If You Attended Session 2] 1. Go to the `dime-r-training` folder that you created, and open the file `dime-r-training.Rproj` R project that you created there. ] .panel[.panel-name[If You Did Not Attend Session 2] 1. Copy/paste the following code into a new RStudio script: ```r install.packages("usethis") library(usethis) usethis::use_zip( "https://github.com/worldbank/dime-r-training/archive/main.zip", cleanup = TRUE ) ``` 2\. A new RStudio environment will open. Use this for the session today. ] ] --- # Setup Install new packages ```r install.packages(c("sf", "leaflet", "geosphere"), dependencies = TRUE) ``` And load them ```r library(here) library(tidyverse) library(sf) # Simple features library(leaflet) # Interactive map library(geosphere) # Great circle distances ``` --- # Load and explore polylines, polylines, and points The main package we'll rely on is the `sf` (simple features) package. With `sf`, spatial data is structured similarly to a __dataframe__; however, each row is associated with a __geometry__. Geometries can be one of the below types. .center[ <img src="img/geom_types.png" width = 400> ] --- # Load and explore polygon The first thing we will do in this session is to recreate this data set: ```r country_sf <- st_read(here("DataWork", "DataSets", "Final", "country.geojson")) ``` ``` ## Reading layer `country' from data source `C:\WBG\repos\dime-r-training\DataWork\DataSets\Final\country.geojson' using driver `GeoJSON' ## Simple feature collection with 300 features and 13 fields ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: 33.90959 ymin: -4.720417 xmax: 41.92622 ymax: 5.061166 ## Geodetic CRS: WGS 84 ``` --- # Exploring the data Look at first few observations ```r head(country_sf) ``` ``` ## Simple feature collection with 6 features and 13 fields ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: 35.52292 ymin: -0.198901 xmax: 36.29659 ymax: 0.990413 ## Geodetic CRS: WGS 84 ## GID_2 GID_0 COUNTRY GID_1 NAME_1 NL_NAME_1 NAME_2 VARNAME_2 NL_NAME_2 TYPE_2 ENGTYPE_2 ## 1 KEN.1.1_1 KEN Kenya KEN.1_1 Baringo <NA> 805 <NA> <NA> Constituency Constituency ## 2 KEN.1.2_1 KEN Kenya KEN.1_1 Baringo <NA> Baringo Central <NA> <NA> Constituency Constituency ## 3 KEN.1.3_1 KEN Kenya KEN.1_1 Baringo <NA> Baringo North <NA> <NA> Constituency Constituency ## 4 KEN.1.4_1 KEN Kenya KEN.1_1 Baringo <NA> Baringo South <NA> <NA> Constituency Constituency ## 5 KEN.1.5_1 KEN Kenya KEN.1_1 Baringo <NA> Eldama Ravine <NA> <NA> Constituency Constituency ## 6 KEN.1.6_1 KEN Kenya KEN.1_1 Baringo <NA> Mogotio <NA> <NA> Constituency Constituency ## CC_2 HASC_2 geometry ## 1 162 <NA> MULTIPOLYGON (((35.87727 -0... ## 2 159 <NA> MULTIPOLYGON (((35.7977 0.3... ## 3 158 <NA> MULTIPOLYGON (((35.81346 0.... ## 4 160 <NA> MULTIPOLYGON (((36.22934 0.... ## 5 162 <NA> MULTIPOLYGON (((35.82341 0.... ## 6 161 <NA> MULTIPOLYGON (((36.10947 0.... ``` --- # Exploring the data Number of rows ```r nrow(country_sf) ``` ``` ## [1] 300 ``` --- # Exploring the data Check coordinate reference system ```r st_crs(country_sf) ``` ``` ## Coordinate Reference System: ## User input: WGS 84 ## wkt: ## GEOGCRS["WGS 84", ## DATUM["World Geodetic System 1984", ## ELLIPSOID["WGS 84",6378137,298.257223563, ## LENGTHUNIT["metre",1]]], ## PRIMEM["Greenwich",0, ## ANGLEUNIT["degree",0.0174532925199433]], ## CS[ellipsoidal,2], ## AXIS["geodetic latitude (Lat)",north, ## ORDER[1], ## ANGLEUNIT["degree",0.0174532925199433]], ## AXIS["geodetic longitude (Lon)",east, ## ORDER[2], ## ANGLEUNIT["degree",0.0174532925199433]], ## ID["EPSG",4326]] ``` --- # Exploring the data Plot the data. To plot using `ggplot2`, we use the `geom_sf` geometry. ```r ggplot() + geom_sf(data = country_sf) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-8-1.png" width="60%" style="display: block; margin: auto;" /> --- # Attributes of data We want the area of each location, but we don't have a variable for area ```r names(country_sf) ``` ``` ## [1] "GID_2" "GID_0" "COUNTRY" "GID_1" "NAME_1" "NL_NAME_1" "NAME_2" "VARNAME_2" "NL_NAME_2" ## [10] "TYPE_2" "ENGTYPE_2" "CC_2" "HASC_2" "geometry" ``` --- # Attributes of data Determine area. Note the CRS is spherical (WGS84), but `st_area` gives area in meters squared. R uses s2 geomety for this. ```r st_area(country_sf) ``` ``` ## Units: [m^2] ## [1] 174612548 664171032 1640235288 1893541200 909726196 1162391322 4446442673 244343146 319588761 ## [10] 546475821 791703635 487034695 347077424 234870080 320531683 178295354 302805234 267396640 ## [19] 950722974 208724111 377069605 235040170 166654802 311166355 244842707 270856300 195816268 ## [28] 234613620 250070745 300226047 544675636 890441628 819095221 562423505 473818437 774008353 ## [37] 1329273270 340468166 4248579388 7557295525 16712310204 603028994 8442761951 6483073596 270237144 ## [46] 247974418 626536513 259329880 1032562720 699140857 292011673 1147576864 124531035 15430646699 ## [55] 10122778443 4333824029 3283326674 112849413 6385975697 7975956018 210204075 144513355 145118575 ## [64] 102257259 162397218 424754069 278291541 175812288 138974871 258559234 412385494 246642948 ## [73] 447240937 317315791 755656427 354432319 466560197 244100323 214221442 184422278 336670907 ## [82] 59209638 79588059 106276706 172477093 468135485 283853034 204126728 232219564 2950195157 ## [91] 703903982 832269572 436237078 6970056180 614090663 201415848 544469250 238375984 550503547 ## [100] 230810950 241653749 114996036 102453209 127269592 131050389 98022999 136597611 160116397 ## [109] 208693101 72147288 162435042 313730881 667098008 404328975 605968535 451077270 524062466 ## [118] 4747798919 1728598186 12776321776 648117269 4706652226 4718665995 743726578 4967043891 2033955496 ## [127] 953096770 462893878 1403369302 5320660807 2842965457 2531222550 4218803604 171732132 206956593 ## [136] 521665804 1403310114 583520262 846378092 1023316734 238773253 1065648936 406103613 2322255189 ## [145] 2138129053 827106302 1532912840 984833268 2982478928 3204243797 2911286617 6067583799 4127141044 ## [154] 5213257915 1380645852 24170451659 9324325947 40563862935 2054020059 1586747787 506859062 816128223 ## [163] 1080124761 832098507 244156864 659527817 837760564 390486757 62125073 256749205 266515607 ## [172] 352119305 1229529845 221194951 207480837 295539506 385633939 21139328 42980834 107890825 ## [181] 55415180 15845364 25253613 604369524 239467145 293229726 289060069 409758026 464802431 ## [190] 236747687 26850519 28881788 8249195 86236564 5451808 17635838 9785023 8870678 ## [199] 136860370 13346547 209761656 12060332 2913663 48224035 7227424 16949384 76175203 ## [208] 372529234 1265891267 614397732 522111537 504309687 1729120049 169208563 129569919 762443067 ## [217] 1029316683 410105633 469565652 477456361 374509844 587145423 390983403 543513338 317978589 ## [226] 2564838662 1597337437 2735609105 5405516974 5474649580 294529545 254833758 171146919 180698278 ## [235] 836850073 649976193 783296703 471122840 540020247 1822166987 460043516 179153483 170044299 ## [244] 358075802 360099927 10086081038 8486932972 2480753161 609977106 1294428804 404900705 682585279 ## [253] 309704683 212174869 2831552868 4055299785 9502547335 854404452 13666458111 9735533913 15991415045 ## [262] 511815699 266679475 441351892 1216812569 629446065 656652292 370243327 310664395 356646225 ## [271] 153279778 9812620733 6635872349 8055607462 14674269045 7443786454 14171528648 3210439823 504411830 ## [280] 359649459 705495271 778864989 660748817 434790306 90172006 185242030 83401459 112023107 ## [289] 89915250 4765892026 7580651248 4276077995 10436807892 21670846083 8159865665 4131343258 1475529162 ## [298] 1481647512 2161012742 303805970 ``` --- # Operations similar to dataframes Create new dataset that captures locations for one administrative region ```r city_sf <- country_sf %>% filter(NAME_1 == "Nairobi") ``` --- # Operations similar to dataframes Plot the dataframe ```r ggplot() + geom_sf(data = city_sf) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-12-1.png" width="65%" style="display: block; margin: auto;" /> --- # Load and explore polyline .exercise[ **Exercise:** * Load the roads data `roads.geojson` and name the object `roads_sf` * Look at the first few observations * Check the coordinate reference system * Map the polyline ] -- .solution[ **Solution**: ```r roads_sf <- st_read(here("DataWork", "DataSets", "Final", "roads.geojson")) head(roads_sf) st_crs(roads_sf) ggplot() + geom_sf(data = roads_sf) ``` ] --- # Load and explore polyline ```r roads_sf <- st_read(here("DataWork", "DataSets", "Final", "roads.geojson")) ``` ``` ## Reading layer `roads' from data source `C:\WBG\repos\dime-r-training\DataWork\DataSets\Final\roads.geojson' using driver `GeoJSON' ## Simple feature collection with 3326 features and 3 fields ## Geometry type: MULTILINESTRING ## Dimension: XY ## Bounding box: xmin: 36.68034 ymin: -1.430759 xmax: 37.07664 ymax: -1.162558 ## Geodetic CRS: WGS 84 ``` ```r ggplot() + geom_sf(data = roads_sf) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" /> --- # Load and explore polyline .exercise[ **Exercise:** Determine length of each line (hint: use `st_length`) ]
−
+
01
:
00
-- .solution[ **Solution**: ```r st_length(roads_sf) ``` ``` ## Units: [m] ## [1] 901.575669 137.859146 166.089698 24.263358 174.355741 482.550326 486.250661 64.104259 615.457421 ## [10] 16.713533 19.698716 3.848732 4.123779 555.445488 551.580925 7.064828 229.565489 588.930452 ## [19] 136.944584 579.132204 58.331756 21.793644 90.157091 41.250716 81.708553 68.324169 496.357727 ## [28] 14.151642 44.035735 45.810017 41.676861 35.126049 40.577592 43.183338 1068.724720 254.986245 ## [37] 554.188242 280.010314 573.652511 632.206308 903.432072 15.215667 84.991640 89.585364 39.362830 ## [46] 656.439901 562.994183 53.046523 55.049084 22.151381 157.276691 742.136179 255.684125 188.665895 ## [55] 196.867732 322.574154 136.879449 145.136229 1123.879549 449.762113 76.812840 425.625306 83.281411 ## [64] 67.480452 68.962106 537.993328 19.378576 103.228193 59.214009 26.159881 384.184839 349.081276 ## [73] 23.398508 34.330123 28.640061 22.090503 126.496308 638.049903 1026.044009 19.196929 259.895072 ## [82] 372.237351 42.319444 456.900234 87.612555 1531.397856 271.171558 325.272260 236.299712 10.440369 ## [91] 92.495221 379.530644 7.949999 622.409572 15.109525 142.737712 107.081783 33.467081 62.377606 ## [100] 724.000829 695.581295 65.014153 179.260849 195.899725 100.995000 316.149423 86.934045 70.416390 ## [109] 17.353440 66.185233 214.304444 36.941121 83.792852 78.306011 69.496853 33.720629 353.459417 ## [118] 82.605637 88.792621 38.454499 76.896236 393.368598 75.564943 102.835468 354.353621 474.770509 ## [127] 15.211329 103.329785 89.752114 329.270942 77.746930 40.631355 895.617257 20.211772 76.128943 ## [136] 24.106065 18.346965 16.095807 81.844707 65.840603 7.523154 15.737415 18.346647 19.344688 ## [145] 44.468440 2365.850017 446.368350 10.685069 29.717513 242.125998 26.214424 31.161768 28.882580 ## [154] 209.744166 15.913572 45.347436 42.561872 91.728443 86.592919 76.516267 1266.191271 190.996655 ## [163] 624.157384 144.697297 9.835670 196.192194 192.187170 461.451603 75.011291 79.253269 143.695820 ## [172] 145.774860 21.897846 28.102546 112.692246 152.612069 112.549324 144.933764 132.036903 209.248099 ## [181] 79.379658 64.948667 387.080317 215.517544 12.647845 168.772763 157.631311 294.623376 18.185409 ## [190] 18.862560 13.736858 151.950779 192.606507 171.537885 170.051512 69.462351 125.180979 58.067487 ## [199] 27.244712 32.574114 35.884240 64.874557 85.942428 83.772389 64.405341 62.853065 65.017111 ## [208] 64.566466 178.974297 217.986645 489.566613 13.276265 104.782846 156.247810 621.902979 43.360073 ## [217] 302.510646 41.527962 24.679771 21.734725 23.810547 21.526451 70.239762 9.437760 10.093868 ## [226] 66.024990 244.133132 28.683986 28.787315 33.110880 33.335116 24.885260 23.623087 8.859547 ## [235] 11.474758 36.846291 34.046383 9.758055 9.582397 59.695875 61.884606 23.830371 28.617250 ## [244] 91.631524 35.845332 26.721412 94.275960 23.916933 28.801656 14.712146 160.046196 24.981400 ## [253] 234.147315 233.927103 1087.403354 30.702904 107.790191 11.018641 294.018236 78.976905 326.376813 ## [262] 22.270210 13.236484 1383.521572 1898.060518 72.618651 7.580401 3.102657 430.711322 9.208662 ## [271] 3.910851 170.763099 9.095408 48.772317 1481.739722 41.179702 11.229904 1150.718529 11.746534 ## [280] 5.710076 11.444318 23.996222 257.613786 424.029954 504.219606 290.728608 819.761936 180.474023 ## [289] 299.543825 327.896390 192.190366 1405.190682 266.197044 586.347189 249.622435 386.087360 294.467105 ## [298] 262.760188 312.509582 347.522410 159.924738 156.421648 97.608590 532.253826 1333.310710 39.763901 ## [307] 38.790521 303.513811 105.457492 107.013255 154.319578 1944.903558 145.037843 120.053487 119.594793 ## [316] 385.365800 389.801206 237.639660 392.831927 584.315248 345.230486 258.529732 348.501723 49.083442 ## [325] 252.363036 46.605372 53.797116 16.503081 8.516212 2.857909 20.918478 4.434933 17.778275 ## [334] 7.239399 25.190817 122.379208 12.730844 19.064091 20.711767 22.367547 9.607125 6.157311 ## [343] 17.575439 11.357351 9.772192 5.155870 7.693682 94.748482 65.229459 17.251260 35.322175 ## [352] 82.546897 24.747596 15.850848 3.192443 16.585177 13.812033 8.047443 41.937817 19.357601 ## [361] 17.758083 22.294960 191.566174 634.507520 816.881154 655.283982 103.147967 180.438357 97.857393 ## [370] 160.981061 30.257708 39.390289 38.849579 19.997969 33.701511 79.972952 954.089221 282.034326 ## [379] 464.715270 6.478024 5.355263 31.775248 77.925524 9.669689 142.615866 36.947310 8.819145 ## [388] 6.887780 70.166668 47.008932 34.926286 258.308529 25.484432 321.238706 19.127166 118.516793 ## [397] 548.550038 486.652961 15.197515 11.514794 15.314509 34.134704 133.594610 1.705542 4.752641 ## [406] 10.305055 25.097578 10.849514 36.996831 22.528198 355.095362 170.295579 148.700340 69.464356 ## [415] 409.321609 26.015631 21.062506 46.030159 31.305178 21.777118 109.770706 151.378231 99.596592 ## [424] 13.491546 18.402267 33.317624 139.755248 48.151289 33.629483 22.186083 20.940602 57.966413 ## [433] 47.197739 46.309170 60.813993 30.871748 27.806914 783.300024 72.530022 387.536264 97.846177 ## [442] 95.251302 3457.182344 595.540962 1619.593711 5584.887280 1499.195219 2028.456514 680.713398 1302.336331 ## [451] 678.651302 36.682478 42.016227 823.744865 180.023064 1181.018404 2737.428560 94.395118 13.246191 ## [460] 783.582352 368.183564 584.449051 2722.626668 202.179400 63.212322 55.449400 226.718124 67.955995 ## [469] 132.927041 41.871533 182.282617 40.803027 45.028038 44.329832 120.649598 13.471208 1145.994309 ## [478] 5.092131 24.713436 439.270868 24.335356 115.601846 81.180742 53.453252 196.133675 599.917699 ## [487] 10.119957 11.465427 14.978696 79.639141 69.204445 100.457397 27.755842 34.587177 43.976046 ## [496] 97.929384 11.596006 138.805883 42.892942 146.102303 139.675863 415.269187 38.924270 21.709900 ## [505] 3.569198 1497.893854 167.680758 20.318642 32.950918 54.266014 11.196699 137.922197 631.275641 ## [514] 16.885141 7.723404 350.500363 530.159592 28.938137 590.205825 24.136490 139.554173 131.974399 ## [523] 109.885175 419.173004 170.641099 35.081361 68.336162 55.889700 120.620536 63.602975 215.632901 ## [532] 37.459126 40.193705 40.819656 16.959511 15.547402 22.250300 315.157339 958.832395 191.186395 ## [541] 1099.375260 1317.301239 778.026848 1181.229521 133.879778 25.875107 225.943568 554.439191 993.214750 ## [550] 308.355702 985.563770 903.317682 358.859756 207.675584 55.231076 973.610662 519.206497 103.667080 ## [559] 312.063561 159.967959 181.086436 22.945795 366.270554 151.890173 528.356015 25.546237 210.425392 ## [568] 1338.626861 146.227323 1175.918841 218.013021 429.033390 1784.378588 551.037078 916.028966 172.938742 ## [577] 675.317506 40.351019 44.951287 327.012549 5.639886 61.878296 764.548437 58.148131 17.558132 ## [586] 216.861019 285.681313 819.221351 333.490469 33.069205 235.848426 65.430070 472.353873 1652.334256 ## [595] 563.509270 1833.514056 36.302144 29.269194 28.685523 302.127816 184.128590 428.943600 110.430419 ## [604] 109.597608 2.495818 409.318394 45.720415 38.359710 50.749066 53.439894 743.414333 38.291323 ## [613] 56.799464 444.980344 465.952418 34.502790 291.574032 389.824964 375.037643 409.852087 75.833778 ## [622] 525.541741 1312.085455 84.762778 84.882143 369.517435 575.931407 388.207124 511.166297 92.764009 ## [631] 91.532854 78.830159 49.039298 33.998093 35.562234 27.770001 27.900540 29.020983 30.923509 ## [640] 33.985089 32.227818 584.253632 300.182519 884.478556 32.726031 82.066746 87.593411 35.056546 ## [649] 66.621103 37.643724 184.735993 244.402174 438.691677 12.944088 159.932314 1077.389421 146.432128 ## [658] 46.604884 830.541626 8.614502 236.770583 324.740132 941.258982 65.139394 21.099869 156.677518 ## [667] 1.135547 8.280350 20.280793 149.493497 395.497131 1246.849919 150.699559 344.462607 924.784199 ## [676] 1560.082552 144.362009 22.239390 599.458121 61.806214 477.719356 228.696950 1647.256931 171.402594 ## [685] 26.444610 81.618315 29.035738 30.120069 528.431434 180.863466 1015.842827 67.574649 14.770247 ## [694] 5279.189324 84.206008 709.780316 27.740411 23.687744 124.820654 45.102393 1096.405518 439.413044 ## [703] 285.164315 85.851777 163.581114 56.770778 494.312474 496.955338 620.513267 271.743673 395.438276 ## [712] 60.172427 823.807637 819.352514 306.166764 395.603040 529.440875 124.538942 528.053968 122.816997 ## [721] 373.649422 383.289253 85.151176 138.007283 33.353933 334.861285 293.364100 118.185924 158.145486 ## [730] 74.394625 73.737110 465.318854 1832.076721 1833.400956 218.230203 62.254404 3.495613 87.168480 ## [739] 124.820866 86.818948 5488.305775 123.950950 14.288332 1666.913973 748.090305 49.963428 190.277939 ## [748] 7.526102 45.815879 7.538655 57.716532 517.063572 408.045695 82.135850 100.096916 872.272422 ## [757] 1279.286960 221.347035 245.294790 35.349870 35.014254 172.323068 89.782275 404.760241 48.561694 ## [766] 260.619918 186.967159 147.518483 152.714064 674.290531 100.474045 179.923164 238.204810 614.079791 ## [775] 488.905394 57.861295 566.385271 1117.868634 44.498638 69.913363 55.301045 250.770776 36.456913 ## [784] 247.154607 86.452059 13.048195 5.165935 304.141038 27.441091 22.585334 37.688845 19.859954 ## [793] 272.541180 20.197728 196.685158 54.496024 28.162708 21.695582 402.428011 4.824428 9.112998 ## [802] 15.534455 63.448048 5.881296 123.086270 29.118329 125.534212 1.505540 59.415886 157.235333 ## [811] 466.343861 963.961941 386.196080 452.825354 244.765131 20.148296 78.127647 49.440750 137.263356 ## [820] 96.880307 1207.105403 743.189877 205.829173 59.823235 294.287088 101.897931 119.549014 47.341643 ## [829] 107.814399 66.580161 97.137509 43.922472 973.607593 39.292800 83.718019 146.589150 87.117430 ## [838] 974.346345 141.893675 43.286365 14.416137 49.684304 52.182268 29.893349 41.118610 68.997068 ## [847] 6.882314 4.477679 324.544579 233.761724 568.567468 749.398078 228.167830 98.656360 40.582724 ## [856] 216.382783 70.047325 217.284204 90.985390 21.298196 24.088573 1040.125521 105.202876 8.370355 ## [865] 72.515864 73.290252 1797.484440 165.327195 80.593902 1061.513293 80.580053 117.853554 188.192875 ## [874] 116.105883 81.982675 119.737351 54.032351 54.431890 35.389955 182.306593 19.624130 36.340769 ## [883] 41.292701 1277.959594 57.180221 1695.787206 67.569483 970.455976 474.382315 59.273003 379.103942 ## [892] 26.904851 59.571964 253.430917 1230.007268 531.897923 332.087215 133.299829 224.830316 29.113686 ## [901] 29.690689 978.832452 162.571715 66.465596 272.590530 10.128424 28.399718 39.791193 40.891166 ## [910] 48.992763 15.747949 19.680054 6.400654 19.514683 117.462539 185.282361 1566.316981 530.129801 ## [919] 1469.462840 394.168179 1391.048969 636.012528 614.827689 189.744710 194.259772 194.688837 412.981128 ## [928] 1229.355220 89.232068 1206.832194 95.839910 426.701563 95.976446 431.914961 93.723878 31.479096 ## [937] 1991.287151 1843.168738 303.035806 11.709436 352.798394 54.944918 54.883716 43.649199 169.890241 ## [946] 182.973397 1163.126877 68.301029 3.574213 609.430991 952.824609 1025.592484 76.070467 79.911880 ## [955] 106.257350 47.987613 200.565801 4.255454 15.430039 23.550074 19.621493 19.629685 99.558836 ## [964] 1862.118983 573.015659 491.580757 59.215729 1054.204629 107.478347 431.755644 261.863845 1046.517549 ## [973] 380.534631 366.592485 36.104694 81.331235 53.673263 27.897598 401.420993 303.208444 614.268928 ## [982] 164.112814 41.807887 43.858441 32.024435 32.316586 97.271824 102.837031 321.668421 32.564468 ## [991] 50.254060 30.676455 56.169137 17.741403 175.940658 394.349626 285.837259 43.223170 33.323193 ## [1000] 146.903727 ## [ reached getOption("max.print") -- omitted 2326 entries ] ``` ] --- # Load and explore point data We'll load a dataset of the location of schools ```r schools_df <- read_csv(here("DataWork", "DataSets", "Final", "schools.csv")) ``` ``` ## Rows: 3546 Columns: 5 ## ── Column specification ────────────────────────────────────────────────────────────────────────────────────────── ## Delimiter: "," ## chr (2): name, amenity ## dbl (3): osm_id, longitude, latitude ## ## ℹ Use `spec()` to retrieve the full column specification for this data. ## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. ``` --- # Explore data ```r head(schools_df) ``` ``` ## # A tibble: 6 × 5 ## osm_id name amenity longitude latitude ## <dbl> <chr> <chr> <dbl> <dbl> ## 1 30312225 Consolata School school 36.8 -1.27 ## 2 674552830 <NA> <NA> 36.8 -1.26 ## 3 1399125354 Galitos restaurant school 36.8 -1.29 ## 4 1764153756 Makini Schools school 36.8 -1.30 ## 5 1867185524 Bohra Primary School school 36.8 -1.26 ## 6 2061462027 <NA> <NA> 36.8 -1.26 ``` --- # Explore data ```r names(schools_df) ``` ``` ## [1] "osm_id" "name" "amenity" "longitude" "latitude" ``` --- # Convert to spatial object We define the (1) coordinates (longitude and latitude) and (2) CRS. __Note:__ We must determine the CRS from the data metadata. This dataset comes from OpenStreetMaps, which uses EPSG:4326. __Assigning the incorrect CRS is one of the most common sources of issues I see with geospatial work. If something looks weird, check the CRS!__ ```r schools_sf <- st_as_sf(schools_df, coords = c("longitude", "latitude"), crs = 4326) ``` --- # Convert to spatial object ```r head(schools_sf$geometry) ``` ``` ## Geometry set for 6 features ## Geometry type: POINT ## Dimension: XY ## Bounding box: xmin: 36.76877 ymin: -1.296051 xmax: 36.80406 ymax: -1.258515 ## Geodetic CRS: WGS 84 ## First 5 geometries: ``` ``` ## POINT (36.80406 -1.267486) ``` ``` ## POINT (36.79734 -1.25969) ``` ``` ## POINT (36.77077 -1.290325) ``` ``` ## POINT (36.76877 -1.296051) ``` ``` ## POINT (36.79066 -1.258515) ``` --- # Map points object: Using sf ```r ggplot() + geom_sf(data = schools_sf) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-21-1.png" width="50%" style="display: block; margin: auto;" /> --- # Map points object: Using dataframe ```r ggplot() + geom_point(data = schools_df, aes(x = longitude, y = latitude)) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-22-1.png" width="50%" style="display: block; margin: auto;" /> --- # Make better static map Lets make a better static map. ```r # Adding a variable with squared km city_sf <- city_sf %>% mutate(area_m = city_sf %>% st_area() %>% as.numeric(), area_km = area_m / 1000^2) # Plotting ggplot() + geom_sf(data = city_sf, aes(fill = area_km)) + labs(fill = "Area") + scale_fill_distiller(palette = "Blues") + theme_void() ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-23-1.png" width="50%" style="display: block; margin: auto;" /> --- # Make better static map Lets add another spatial layer ```r ggplot() + geom_sf(data = city_sf, aes(fill = area_km)) + geom_sf(data = schools_sf, aes(color = "Schools")) + labs(fill = "Area", color = NULL) + scale_fill_distiller(palette = "Blues") + scale_color_manual(values = "black") + theme_void() ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-24-1.png" width="40%" style="display: block; margin: auto;" /> --- # Another static map .exercise[ **Exercise:** Make a static map of roads, coloring each road by its type. (__Hint:__ The `highway` variable indicates the type). ]
−
+
01
:
00
-- .solution[ **Solution**: ```r ggplot() + geom_sf(data = roads_sf, aes(color = highway)) + theme_void() + labs(color = "Road Type") ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-25-1.png" style="display: block; margin: auto;" /> ] --- # Another static map ```r ggplot() + geom_sf(data = roads_sf, aes(color = highway)) + theme_void() + labs(color = "Road Type") ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-26-1.png" style="display: block; margin: auto;" /> --- # Interactive map We use the `leaflet` package to make interactive maps. Leaflet is a JavaScript library, but the `leaflet` R package allows making interactive maps using R. Use of leaflet somewhat mimics how we use ggplot. * Start with `leaflet()` (instead of `ggplot()`) * Add spatial layers, defining type of layer (similar to geometries) ```r leaflet() %>% addTiles() # Basemap ```
--- # Interactive map We use the `leaflet` package to make interactive maps. Leaflet is a JavaScript library, but the `leaflet` R package allows making interactive maps using R. Use of leaflet somewhat mimics how we use ggplot. * Start with `leaflet()` (instead of `ggplot()`) * Add spatial layers, defining type of layer (similar to geometries) ```r leaflet() %>% addTiles() %>% addPolygons(data = city_sf) ```
--- # Interactive map Add a pop-up ```r leaflet() %>% addTiles() %>% addPolygons(data = city_sf, popup = ~NAME_2) ```
--- # Interactive map Add more than one layer ```r leaflet() %>% addTiles() %>% addPolygons(data = city_sf, popup = ~NAME_2) %>% addCircles(data = schools_sf, popup = ~name, color = "black") ```
--- # Interactive map of roads .exercise[ **Exercise:** Create a leaflet map with roads, using the `roads_sf` dataset. (__Hint:__ Use `addPolylines()`) ]
−
+
02
:
00
-- .solution[ **Solution**: ```r leaflet() %>% addTiles() %>% addPolylines(data = roads_sf) ``` ] --- # Interactive map of roads ```r leaflet() %>% addTiles() %>% addPolylines(data = roads_sf) ```
--- # Interactive maps We can spent lots of time going over what we can done with leaflet - but that would take up too much time. [This resource](https://rstudio.github.io/leaflet/articles/colors.html) provides helpful tutorials for things like: * Changing the basemap * Adding colors * Adding a legend * And much more! --- # Spatial operations applied on single dataset * `st_transform`: Transform CRS * `st_buffer`: Buffer point/line/polygon * `st_combine`: Dissolve by attribute * `st_convex_hull `: Create convex hull * `st_centroid`: Create new sf object that uses the centroid * `st_drop_geometry`: Drop geometry; convert from sf to dataframe * `st_coordinates`: Get matrix of coordinates * `st_bbox`: Get bounding box --- # Transform CRS The schools dataset is currently in a geographic CRS (WGS84), where the units are in decimal degrees. We'll tranform the CRS to a projected CRS ([EPSG:32632](https://epsg.io/32632)), and where the units will be in meters. __Note that coordinate values are large!__ Values are large because units are in meters. Large coordinate values suggest projected CRS; latitude is between -90 and 90 and longitude is between -180 and 180. ```r schools_utm_sf <- st_transform(schools_sf, 32632) schools_utm_sf$geometry %>% head(2) %>% print() ``` ``` ## Geometry set for 2 features ## Geometry type: POINT ## Dimension: XY ## Bounding box: xmin: 3722217 ymin: -158522.3 xmax: 3723051 ymax: -157537.6 ## Projected CRS: WGS 84 / UTM zone 32N ``` ``` ## POINT (3723051 -158522.3) ``` ``` ## POINT (3722217 -157537.6) ``` --- # Buffer We have the points of schools. Now we create a 1km buffer around schools. ```r schools_1km_sf <- schools_sf %>% st_buffer(dist = 1000) # Units are in meters. Thanks s2! ggplot() + geom_sf(data = schools_1km_sf) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-29-1.png" width="50%" style="display: block; margin: auto;" /> --- # Dissolve by an attribute Below we have the second administrative regions. Using this dataset, let's create a new object at the first administrative region level. ```r country_1_sf <- country_sf %>% group_by(NAME_1) %>% summarise(geometry = st_combine(geometry)) %>% ungroup() ggplot() + geom_sf(data = country_1_sf) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-30-1.png" width="30%" style="display: block; margin: auto;" /> --- # Exercise .exercise[ **Exercise:** Create a polyline of all trunk roads (dissolve it using `st_combine`), and buffer the polyline by 10 meters. In `roads_sf`, the `highway` variable notes road types. ]
−
+
02
:
00
-- .solution[ **Solution**: ```r roads_sf %>% filter(highway == "trunk") %>% summarise(geometry = st_combine(geometry)) %>% st_buffer(dist = 10) ``` ``` ## Simple feature collection with 1 feature and 0 fields ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: 36.68359 ymin: -1.430867 xmax: 37.07692 ymax: -1.204784 ## Geodetic CRS: WGS 84 ## geometry ## 1 MULTIPOLYGON (((36.76646 -1... ``` ] --- # Convex Hull __Simple definition:__ Get the outer-most coordinates of a shape and connect-the-dots. __Formal definition:__ A convex hull of a shape the smallest "convex set" that contains it. (A [convex set](https://en.wikipedia.org/wiki/Convex_set) is where a straight line can be drawn anywhere in the space and the space fully contains the line). .pull-left[ __Convex__  ] .pull-right[ __Not convex__  ] __Source:__ [Wikipedia](https://en.wikipedia.org/wiki/Convex_set) --- # Convex hull In the below example, we create a conex hull around schools; creating a polygon that includes all schools. __Incorrect attempt__ ```r schools_chull1_sf <- schools_sf %>% st_convex_hull() nrow(schools_chull1_sf) ``` ``` ## [1] 3546 ``` --- # Convex hull __Correct__ ```r schools_chull2_sf <- schools_sf %>% summarise(geometry = st_combine(geometry)) %>% st_convex_hull() ggplot() + geom_sf(data = schools_chull2_sf) + geom_sf(data = schools_sf, color = "red") ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-33-1.png" width="40%" style="display: block; margin: auto;" /> --- # Determine centroid Sometimes we want to represent a polygon or polyline as a single point. For this, we can compute the centroid (ie, geographic center) of a polygon/polyline. .center[  ] __Source:__ [Wikipedia](https://en.wikipedia.org/wiki/Centroid) --- # Determine centroid Determine centroid of second administrative regions ```r country_c_sf <- st_centroid(country_sf) ``` ``` ## Warning: st_centroid assumes attributes are constant over geometries ``` ```r ggplot() + geom_sf(data = country_c_sf) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-34-1.png" width="40%" style="display: block; margin: auto;" /> --- # Remove geometry __Incorrect approach__ ```r city_sf %>% select(-geometry) %>% head() ``` ``` ## Simple feature collection with 6 features and 15 fields ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: 36.67803 ymin: -1.370704 xmax: 36.99025 ymax: -1.234921 ## Geodetic CRS: WGS 84 ## GID_2 GID_0 COUNTRY GID_1 NAME_1 NL_NAME_1 NAME_2 VARNAME_2 NL_NAME_2 TYPE_2 ## 1 KEN.30.1_1 KEN Kenya KEN.30_1 Nairobi <NA> Dagoretti North <NA> <NA> Constituency ## 2 KEN.30.2_1 KEN Kenya KEN.30_1 Nairobi <NA> Dagoretti South <NA> <NA> Constituency ## 3 KEN.30.3_1 KEN Kenya KEN.30_1 Nairobi <NA> Embakasi Central <NA> <NA> Constituency ## 4 KEN.30.4_1 KEN Kenya KEN.30_1 Nairobi <NA> Embakasi East <NA> <NA> Constituency ## 5 KEN.30.5_1 KEN Kenya KEN.30_1 Nairobi <NA> Embakasi North <NA> <NA> Constituency ## 6 KEN.30.6_1 KEN Kenya KEN.30_1 Nairobi <NA> Embakasi South <NA> <NA> Constituency ## ENGTYPE_2 CC_2 HASC_2 area_m area_km geometry ## 1 Constituency 275 <NA> 26850519 26.850519 MULTIPOLYGON (((36.76082 -1... ## 2 Constituency 276 <NA> 28881788 28.881788 MULTIPOLYGON (((36.75381 -1... ## 3 Constituency 284 <NA> 8249195 8.249195 MULTIPOLYGON (((36.9181 -1.... ## 4 Constituency 285 <NA> 86236564 86.236564 MULTIPOLYGON (((36.97557 -1... ## 5 Constituency 283 <NA> 5451808 5.451808 MULTIPOLYGON (((36.89368 -1... ## 6 Constituency 282 <NA> 17635838 17.635838 MULTIPOLYGON (((36.9065 -1.... ``` --- # Remove geometry __Correct__ ```r city_sf %>% st_drop_geometry() %>% head() ``` ``` ## GID_2 GID_0 COUNTRY GID_1 NAME_1 NL_NAME_1 NAME_2 VARNAME_2 NL_NAME_2 TYPE_2 ## 1 KEN.30.1_1 KEN Kenya KEN.30_1 Nairobi <NA> Dagoretti North <NA> <NA> Constituency ## 2 KEN.30.2_1 KEN Kenya KEN.30_1 Nairobi <NA> Dagoretti South <NA> <NA> Constituency ## 3 KEN.30.3_1 KEN Kenya KEN.30_1 Nairobi <NA> Embakasi Central <NA> <NA> Constituency ## 4 KEN.30.4_1 KEN Kenya KEN.30_1 Nairobi <NA> Embakasi East <NA> <NA> Constituency ## 5 KEN.30.5_1 KEN Kenya KEN.30_1 Nairobi <NA> Embakasi North <NA> <NA> Constituency ## 6 KEN.30.6_1 KEN Kenya KEN.30_1 Nairobi <NA> Embakasi South <NA> <NA> Constituency ## ENGTYPE_2 CC_2 HASC_2 area_m area_km ## 1 Constituency 275 <NA> 26850519 26.850519 ## 2 Constituency 276 <NA> 28881788 28.881788 ## 3 Constituency 284 <NA> 8249195 8.249195 ## 4 Constituency 285 <NA> 86236564 86.236564 ## 5 Constituency 283 <NA> 5451808 5.451808 ## 6 Constituency 282 <NA> 17635838 17.635838 ``` --- # Grab coordinates Create a matrix of coordinates ```r schools_sf %>% st_coordinates() %>% head() ``` ``` ## X Y ## [1,] 36.80406 -1.267486 ## [2,] 36.79734 -1.259690 ## [3,] 36.77077 -1.290325 ## [4,] 36.76877 -1.296051 ## [5,] 36.79066 -1.258515 ## [6,] 36.77899 -1.264575 ``` --- # Get bounding box ```r schools_sf %>% st_bbox() ``` ``` ## xmin ymin xmax ymax ## 36.691965 -1.374473 37.065336 -1.177316 ``` --- # Spatial operations using multiple datasets * `st_distance`: Calculate distances. * `st_intersects`: Indicates whether simple features intersect. * `st_intersection`: Cut one spatial object based on another. * `st_difference`: Remove part of spatial object based on another. * `st_join`: Spatial join (ie, add attributes of one dataframe to another based on location). --- # Distances For this example, we'll compute the distance between each school to a motorway. ```r motor_sf <- roads_sf %>% filter(highway == "motorway") # Matrix: distance of each school to each motorway dist_mat <- st_distance(schools_sf, motor_sf) # Take minimun distance for each school dist_mat %>% apply(1, min) %>% head() ``` ``` ## [1] 33.78464 155.32799 4006.16459 4662.68796 176.10524 1382.28513 ``` --- # Exercise .exercise[ **Exercise:** Calculate the distance from the centroid of each second administrtaive division to the nearest trunk road. ]
−
+
02
:
00
-- .solution[ **Solution**: ```r city_cent_sf <- city_sf %>% st_centroid() ``` ``` ## Warning: st_centroid assumes attributes are constant over geometries ``` ```r trunk_sf <- roads_sf %>% filter(highway == "trunk") # Matrix: distance of each school to each motorway dist_mat <- st_distance(city_cent_sf, trunk_sf) # Take minimun distance for each school dist_mat %>% apply(1, min) %>% head() ``` ``` ## [1] 2127.34582 2215.07338 929.39785 5642.60850 2015.55906 19.97698 ``` ] # Distances There are multiple ways to calculate distances! * __Great circle:__ sf, by default, uses s2 to computer distance (in meters) when data has a geographic CRS * __Great circle:__ Other formulas beyond s2, such as Haversine, Vincenty, and Karney’s method. See the [geosphere](https://cran.r-project.org/web/packages/geosphere/geosphere.pdf) and [geodist](https://cran.r-project.org/web/packages/geodist/geodist.pdf) packages. Vincenty is more precise than Haversine, and Karney's method is more precise than Vincenty's method. Greater precision comes with heavy computation. For more information, see [here](https://rspatial.org/raster/sphere/2-distance.html). * __Projected:__ We can use a projected CRS, where units are in meters already. --- # Distances .pull-left[ ```r # s2 st_distance(schools_sf[1,], schools_sf[2,]) %>% as.numeric() ``` ``` ## [1] 1144.271 ``` ```r # Nigeria-specific CRS schools_utm_sf <- st_transform(schools_sf, 32632) st_distance(schools_utm_sf[1,], schools_utm_sf[2,]) %>% as.numeric() ``` ``` ## [1] 1290.671 ``` ```r # World mercator schools_merc_sf <- st_transform(schools_sf, 3395) st_distance(schools_merc_sf[1,], schools_merc_sf[2,]) %>% as.numeric() ``` ``` ## [1] 1141.436 ``` ] .pull-right[ ```r # Haversine distHaversine( p1 = schools_sf[1,] %>% st_coordinates, p2 = schools_sf[2,] %>% st_coordinates) ``` ``` ## [1] 1145.551 ``` ```r # Vincenty's method distVincentySphere( p1 = schools_sf[1,] %>% st_coordinates, p2 = schools_sf[2,] %>% st_coordinates) ``` ``` ## [1] 1145.551 ``` ```r # Karney’s method distGeo(p1 = schools_sf[1,] %>% st_coordinates, p2 = schools_sf[2,] %>% st_coordinates) ``` ``` ## [1] 1141.16 ``` ] --- # Intersects For this example we'll determine which second administrative divisions intersects with a motorway. ```r # Sparse matrix st_intersects(city_sf, motor_sf) %>% print() ``` ``` ## Sparse geometry binary predicate list of length 17, where the predicate was `intersects' ## first 10 elements: ## 1: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ... ## 2: (empty) ## 3: (empty) ## 4: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, ... ## 5: (empty) ## 6: 30, 32, 33, 34, 35, 36, 37, 38, 39, 40, ... ## 7: (empty) ## 8: (empty) ## 9: (empty) ## 10: 10, 42, 43, 45, 64 ``` --- # Intersects Take `max` (`FALSE` corresponds to 0 and `TRUE` corresponds to 1). So taking max will yeild if unit intersects with _any_ motorway ```r # Matrix st_intersects(city_sf, motor_sf, sparse = F) %>% apply(1, max) %>% head() ``` ``` ## [1] 1 0 0 1 0 1 ``` --- # Exercise .exercise[ **Exercise:** Determine which motorways intersect with a trunk road ]
−
+
02
:
00
-- .solution[ **Solution**: ```r trunk_sf <- roads_sf %>% filter(highway == "trunk") motor_sf <- roads_sf %>% filter(highway == "motorway") st_intersects(motor_sf, trunk_sf, sparse = F) %>% apply(1, max) %>% head() ``` ] --- # Intersection We have roads for the full city. Here, we want to create new roads object that __only includes__ roads in one unit. ```r loc_sf <- city_sf %>% head(1) roads_loc_sf <- st_intersection(roads_sf, loc_sf) ``` ``` ## Warning: attribute variables are assumed to be spatially constant throughout all geometries ``` ```r ggplot() + geom_sf(data = roads_loc_sf) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-46-1.png" width="40%" style="display: block; margin: auto;" /> --- # Difference We have roads for all of the city. Here, we want to create new roads object that __excludes__ roads in one unit. ```r roads_notloc_sf <- st_difference(roads_sf, loc_sf) ``` ``` ## Warning: attribute variables are assumed to be spatially constant throughout all geometries ``` ```r ggplot() + geom_sf(data = loc_sf, fill = NA, color = "red") + geom_sf(data = roads_notloc_sf) ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-47-1.png" width="40%" style="display: block; margin: auto;" /> --- # Overlay Intersections and differencing are __overlay__ functions .center[  ] --- # Exercise .exercise[ **Exercise:** Create a map of schools that are within 1km of a motorway. ]
−
+
02
:
00
-- .solution[ **Solution**: ```r motor_1km_sf <- roads_sf %>% filter(highway == "motorway") %>% st_buffer(dist = 1000) schools_nr_motor_sf <- schools_sf %>% st_intersection(motor_1km_sf) leaflet() %>% addTiles() %>% addCircles(data = schools_nr_motor_sf) ``` ] --- # Exercise Note that there are multiple approaches we could have used for creating a map of schools that are within 1km of a trunk road. 1. Buffer trunk roads by 1km and do a spatial intersection with schools 2. Calculate the distance of each school to the nearest trunk road, then filter schools that are within 1km of a trunk road --- # Spatial join We have a dataset of schools. The school dataframe contains information such as the school name, but not on the administrative region it's in. To add data on the administrative region that the school is in, we'll perform a spatial join. Check the variable names. No names of second administrative divison :( ```r names(schools_sf) ``` ``` ## [1] "osm_id" "name" "amenity" "geometry" ``` --- # Spatial join Use `st_join` to add attributes from `city_sf` to `schools_sf`. `st_join` is similar to other join methods (eg, `left_join`); instead of joining on a varible, we join based on location. ```r schools_city_sf <- st_join(schools_sf, city_sf) schools_city_sf %>% names() %>% print() %>% tail(10) ``` ``` ## [1] "osm_id" "name" "amenity" "geometry" "GID_2" "GID_0" "COUNTRY" "GID_1" "NAME_1" ## [10] "NL_NAME_1" "NAME_2" "VARNAME_2" "NL_NAME_2" "TYPE_2" "ENGTYPE_2" "CC_2" "HASC_2" "area_m" ## [19] "area_km" ``` ``` ## [1] "NL_NAME_1" "NAME_2" "VARNAME_2" "NL_NAME_2" "TYPE_2" "ENGTYPE_2" "CC_2" "HASC_2" "area_m" ## [10] "area_km" ``` --- # Spatial join .exercise[ **Exercise:** Make a static map using of administrative areas, where each administrative area polygon displays the number of schools within the administrative area. ] -- .solution[ **Solution**: ```r ## Dataframe of number of schools per NAME_2 n_school_df <- schools_city_sf %>% st_drop_geometry() %>% group_by(NAME_2) %>% summarise(n_school = n()) %>% ungroup() ## Merge info with city_sf city_sch_sf <- city_sf %>% left_join(n_school_df, by = "NAME_2") ## Map p <- ggplot() + geom_sf(data = city_sch_sf, aes(fill = n_school)) ``` ] --- # Spatial join ```r ggplot() + geom_sf(data = city_sch_sf, aes(fill = n_school)) + labs(fill = "N\nSchools") + scale_fill_distiller(palette = "YlOrRd") + theme_void() ``` <img src="06-spatial-data_files/figure-html/unnamed-chunk-52-1.png" width="65%" style="display: block; margin: auto;" /> --- # Spatial join Let's outsource to [chatGPT](https://chatgpt.com/) (or [gemini](https://gemini.google.com/app) or your other favorite AI). Try entering the below prompt into chatGPT to see how it does. Does chatGPT give a correct answer? Do you need to modify chatGPT's output to make it work? _In R, I have an sf points object of schools called schools_sf. I also have the second administrative divisions of a city as an sf polygon called city_sf and where each location is uniquely defined by the variable NAME_2. Make a static map using of administrative areas, where each administrative area polygon displays the number of schools within the administrative area. Provide R code for this._ --- # Resources .large[ * [sf package cheatsheet](https://github.com/rstudio/cheatsheets/blob/main/sf.pdf) * [Spatial Data Science with Applications in R](https://r-spatial.org/book/) * [Geocomputation with R](https://r.geocompx.org/) ] --- # Thank you! .center[ <img src="img/werner_projection.jpg" width = 500> ]