Take-Home Exercise 1: Application of Spatial Point Pattern Analysis on Osun State, Nigeria

Author

Michelle Leong Hwee-Ling

Published

January 30, 2023

Modified

March 27, 2023

Background & Objective of the Exercise

In this exercise, we investigate the distribution and availability of Water Points and access to clean water in the state of Osun, Nigeria. This will be done through the application of various Spatial Point Pattern Analyses in order to discover the geographical distribution of functional and non-functional Water Points in Osun.

In this exercise, we will be using data from the global Water Point Data Exchange (WPdx) project.

Installing Packages & Importing the Data

Install Relevant Packages

For this exercise, we will be using the following packages:

  • sf for importing, managing, and processing geospatial data

  • tidyverse for performing data science tasks such as importing, wrangling and visualising data

  • tmap which provides functions for plotting cartographic quality static point patterns maps or interactive maps

  • maptools which provides a set of tools for manipulating geographic data

  • raster which reads, writes, manipulates, analyses and model of gridded spatial data

  • spatstat, for its wide range of useful functions for point pattern analysis, and

  • sfdep for performing geospatial data wrangling and local colocation quotient analysis

pacman::p_load(sf, tidyverse, tmap, maptools, raster, spatstat, sfdep)

Import Geospatial Data

GeoBoundaries data set

geoNGA = st_read(dsn = "data/geospatial",
                  layer = "geoBoundaries-NGA-ADM2") %>%
  st_transform(crs = 26392)
Reading layer `geoBoundaries-NGA-ADM2' from data source 
  `/Users/michelle/Desktop/IS415/shelle-mim/IS415-GAA/Take-home_Exercise/TH1/data/geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 774 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2.668534 ymin: 4.273007 xmax: 14.67882 ymax: 13.89442
Geodetic CRS:  WGS 84
head(geoNGA, n=3)
Simple feature collection with 3 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 538408.1 ymin: 116360.7 xmax: 1248985 ymax: 1079710
Projected CRS: Minna / Nigeria Mid Belt
  shapeName Level                    shapeID shapeGroup shapeType
1 Aba North  ADM2 NGA-ADM2-72505758B79815894        NGA      ADM2
2 Aba South  ADM2 NGA-ADM2-72505758B67905963        NGA      ADM2
3    Abadam  ADM2 NGA-ADM2-72505758B57073987        NGA      ADM2
                        geometry
1 MULTIPOLYGON (((548795.5 11...
2 MULTIPOLYGON (((541412.3 12...
3 MULTIPOLYGON (((1248985 104...

NGA data set

NGA <- st_read("data/geospatial/",
               layer = "nga_admbnda_adm2") %>%
  st_transform(crs = 26392) # transform to PCS of Nigeria
Reading layer `nga_admbnda_adm2' from data source 
  `/Users/michelle/Desktop/IS415/shelle-mim/IS415-GAA/Take-home_Exercise/TH1/data/geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 774 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2.668534 ymin: 4.273007 xmax: 14.67882 ymax: 13.89442
Geodetic CRS:  WGS 84
head(NGA, n=3)
Simple feature collection with 3 features and 16 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 538408.1 ymin: 116360.7 xmax: 1248985 ymax: 1079710
Projected CRS: Minna / Nigeria Mid Belt
  Shape_Leng  Shape_Area   ADM2_EN ADM2_PCODE  ADM2_REF ADM2ALT1EN ADM2ALT2EN
1  0.2370744 0.001523921 Aba North   NG001001 Aba North       <NA>       <NA>
2  0.2624772 0.003531104 Aba South   NG001002 Aba South       <NA>       <NA>
3  3.0753158 0.326867840    Abadam   NG008001    Abadam       <NA>       <NA>
  ADM1_EN ADM1_PCODE ADM0_EN ADM0_PCODE       date    validOn validTo
1    Abia      NG001 Nigeria         NG 2016-11-29 2019-04-17    <NA>
2    Abia      NG001 Nigeria         NG 2016-11-29 2019-04-17    <NA>
3   Borno      NG008 Nigeria         NG 2016-11-29 2019-04-17    <NA>
        SD_EN SD_PCODE                       geometry
1  Abia South  NG00103 MULTIPOLYGON (((548795.5 11...
2  Abia South  NG00103 MULTIPOLYGON (((547286.1 11...
3 Borno North  NG00802 MULTIPOLYGON (((1248985 104...

Import Aspatial Data

wp_nga <- read_csv("data/aspatial/WPDEx.csv") %>%
  filter(`#clean_country_name` == "Nigeria") #remove irrelavent data, keep the data small

Data Processing & Cleaning

Convert Aspatial to Geospatial

# use log and lat to make georeference col
wp_nga$Geometry = st_as_sfc(wp_nga$`New Georeferenced Column`)
head(wp_nga)
# A tibble: 6 × 71
  row_id #sour…¹ #lat_…² #lon_…³ #repo…⁴ #stat…⁵ #wate…⁶ #wate…⁷ #wate…⁸ #wate…⁹
   <dbl> <chr>     <dbl>   <dbl> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
1 429068 GRID3      7.98    5.12 08/29/… Unknown <NA>    <NA>    Tapsta… Tapsta…
2 222071 Federa…    6.96    3.60 08/16/… Yes     Boreho… Well    Mechan… Mechan…
3 160612 WaterA…    6.49    7.93 12/04/… Yes     Boreho… Well    Hand P… Hand P…
4 160669 WaterA…    6.73    7.65 12/04/… Yes     Boreho… Well    <NA>    <NA>   
5 160642 WaterA…    6.78    7.66 12/04/… Yes     Boreho… Well    Hand P… Hand P…
6 160628 WaterA…    6.96    7.78 12/04/… Yes     Boreho… Well    Hand P… Hand P…
# … with 61 more variables: `#facility_type` <chr>,
#   `#clean_country_name` <chr>, `#clean_adm1` <chr>, `#clean_adm2` <chr>,
#   `#clean_adm3` <chr>, `#clean_adm4` <chr>, `#install_year` <dbl>,
#   `#installer` <chr>, `#rehab_year` <lgl>, `#rehabilitator` <lgl>,
#   `#management_clean` <chr>, `#status_clean` <chr>, `#pay` <chr>,
#   `#fecal_coliform_presence` <chr>, `#fecal_coliform_value` <dbl>,
#   `#subjective_quality` <chr>, `#activity_id` <chr>, `#scheme_id` <chr>, …
wp_sf <- st_sf(wp_nga, crs=4326)
wp_sf
Simple feature collection with 95008 features and 70 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 2.707441 ymin: 4.301812 xmax: 14.21828 ymax: 13.86568
Geodetic CRS:  WGS 84
# A tibble: 95,008 × 71
   row_id `#source`      #lat_…¹ #lon_…² #repo…³ #stat…⁴ #wate…⁵ #wate…⁶ #wate…⁷
 *  <dbl> <chr>            <dbl>   <dbl> <chr>   <chr>   <chr>   <chr>   <chr>  
 1 429068 GRID3             7.98    5.12 08/29/… Unknown <NA>    <NA>    Tapsta…
 2 222071 Federal Minis…    6.96    3.60 08/16/… Yes     Boreho… Well    Mechan…
 3 160612 WaterAid          6.49    7.93 12/04/… Yes     Boreho… Well    Hand P…
 4 160669 WaterAid          6.73    7.65 12/04/… Yes     Boreho… Well    <NA>   
 5 160642 WaterAid          6.78    7.66 12/04/… Yes     Boreho… Well    Hand P…
 6 160628 WaterAid          6.96    7.78 12/04/… Yes     Boreho… Well    Hand P…
 7 160632 WaterAid          7.02    7.84 12/04/… Yes     Boreho… Well    Hand P…
 8 642747 Living Water …    7.33    8.98 10/03/… Yes     Boreho… Well    Mechan…
 9 642456 Living Water …    7.17    9.11 10/03/… Yes     Boreho… Well    Hand P…
10 641347 Living Water …    7.20    9.22 03/28/… Yes     Boreho… Well    Hand P…
# … with 94,998 more rows, 62 more variables: `#water_tech_category` <chr>,
#   `#facility_type` <chr>, `#clean_country_name` <chr>, `#clean_adm1` <chr>,
#   `#clean_adm2` <chr>, `#clean_adm3` <chr>, `#clean_adm4` <chr>,
#   `#install_year` <dbl>, `#installer` <chr>, `#rehab_year` <lgl>,
#   `#rehabilitator` <lgl>, `#management_clean` <chr>, `#status_clean` <chr>,
#   `#pay` <chr>, `#fecal_coliform_presence` <chr>,
#   `#fecal_coliform_value` <dbl>, `#subjective_quality` <chr>, …

Projection Transformation

# Transform to appropriate projected coordinate system of Nigeria
wp_sf <- wp_sf %>%
  st_transform(crs = 26392)

Data Cleaning

Select relevant cols

# Select relevant adm1 and adm2 cols (cols 3-4 and 8-9)
NGA <- NGA %>%
  dplyr::select(c(3:4, 8:9))

Check for and Remove Duplicate Names

NGA$ADM2_EN[duplicated(NGA$ADM2_EN)==TRUE]
[1] "Bassa"    "Ifelodun" "Irepodun" "Nasarawa" "Obi"      "Surulere"
# Save duplicated LGA names
duplicated_LGA <- NGA$ADM2_EN[duplicated(NGA$ADM2_EN)==TRUE]

# Find the indices of the duplicated LGAs
duplicated_indices <- which(NGA$ADM2_EN %in% duplicated_LGA)

# Edit names in at indices with duplicated names
for (ind in duplicated_indices) {
  NGA$ADM2_EN[ind] <- paste(NGA$ADM2_EN[ind], NGA$ADM1_EN[ind], sep=", ")
}

Data Wrangling

Organise Water Point status tags into broader “Functional”, “Non-Functional” and “Unknown” categories

# Look at different tags
unique(wp_sf$'#status_clean')
[1] NA                                 "Functional"                      
[3] "Abandoned/Decommissioned"         "Non-Functional"                  
[5] "Functional but not in use"        "Functional but needs repair"     
[7] "Abandoned"                        "Non functional due to dry season"
[9] "Non-Functional due to dry season"
# Rename #status_clean and remove NA values
wp_sf_nga <- wp_sf %>%
  rename(status_clean = '#status_clean') %>%
  dplyr::select(status_clean) %>%
  mutate(status_clean = replace_na(
    status_clean, "unknown"
  ))
wp_functional_sf <- wp_sf_nga %>%
  filter(status_clean %in%
           c("Functional",
             "Functional but not in use",
             "Functional but needs repair"))
wp_nonfunctional_sf <- wp_sf_nga %>%
  filter(status_clean %in%
           c("Abandoned/Decommissioned",
             "Abandoned",
             "Non-Functional due to dry season",
             "Non-Functional",
             "Non functional due to dry season"))
wp_unknown_sf <- wp_sf_nga %>% filter(`status_clean` %in% 
                                     c("unknown"))
glimpse(NGA)
Rows: 774
Columns: 5
$ ADM2_EN    <chr> "Aba North", "Aba South", "Abadam", "Abaji", "Abak", "Abaka…
$ ADM2_PCODE <chr> "NG001001", "NG001002", "NG008001", "NG015001", "NG003001",…
$ ADM1_EN    <chr> "Abia", "Abia", "Borno", "Federal Capital Territory", "Akwa…
$ ADM1_PCODE <chr> "NG001", "NG001", "NG008", "NG015", "NG003", "NG011", "NG02…
$ geometry   <MULTIPOLYGON [m]> MULTIPOLYGON (((548795.5 11..., MULTIPOLYGON (…

Extract Osun data

osun <- NGA %>%
  filter(ADM1_EN %in%
           c("Osun"))
qtm(osun)

Exploratory Spatial Data Analysis

Derive Kernel Density Maps

Conversion to ppp data type

Before plotting our Kernel Density Maps, we first need to convert our data into the appropriate ppp data type.

# Convert sf to sp's spatial class
wp_functional_spatial <- as_Spatial(wp_functional_sf)
wp_nonfunctional_spatial <- as_Spatial(wp_nonfunctional_sf)
osun_spatial <- as_Spatial(osun)
# Check spatial data type
wp_functional_spatial
class       : SpatialPointsDataFrame 
features    : 52148 
extent      : 29322.63, 1218553, 33758.37, 1092629  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs 
variables   : 1
names       :              status_clean 
min values  :                Functional 
max values  : Functional but not in use 
wp_nonfunctional_spatial
class       : SpatialPointsDataFrame 
features    : 32204 
extent      : 28907.91, 1209690, 33736.93, 1092883  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs 
variables   : 1
names       :                     status_clean 
min values  :                        Abandoned 
max values  : Non-Functional due to dry season 
osun_spatial
class       : SpatialPolygonsDataFrame 
features    : 30 
extent      : 176503.2, 291043.8, 331434.7, 454520.1  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs 
variables   : 4
names       :  ADM2_EN, ADM2_PCODE, ADM1_EN, ADM1_PCODE 
min values  : Aiyedade,   NG030001,    Osun,      NG030 
max values  :   Osogbo,   NG030030,    Osun,      NG030 
# Convert to sp
wp_functional_sp <- as(wp_functional_spatial, "SpatialPoints")
wp_nonfunctional_sp <- as(wp_nonfunctional_spatial, "SpatialPoints")
osun_sp <- as(osun_spatial, "SpatialPolygons")
wp_functional_sp
class       : SpatialPoints 
features    : 52148 
extent      : 29322.63, 1218553, 33758.37, 1092629  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=4 +lon_0=8.5 +k=0.99975 +x_0=670553.98 +y_0=0 +a=6378249.145 +rf=293.465 +towgs84=-92,-93,122,0,0,0,0 +units=m +no_defs 
wp_functional_ppp <- as(wp_functional_sp, "ppp")
wp_nonfunctional_ppp <- as(wp_nonfunctional_sp, "ppp")
plot(wp_functional_ppp)

plot(wp_nonfunctional_ppp)

Check for/Deal with duplicated points

any(duplicated(wp_functional_ppp))
[1] FALSE
any(duplicated(wp_nonfunctional_ppp))
[1] FALSE

Since there are no duplicated points, no further processing is needed.

Create Owin of Osun

Next, we create an owin object of Osun in order to bound our data within when representing our data on the map.

osun_owin <- as(osun_sp, "owin")
plot(osun_owin)

We can now use this owin of Osun to plot our Functional and Non-functional water point event data within.

wp_functional_osun_ppp = wp_functional_ppp[osun_owin]
wp_nonfunctional_osun_ppp = wp_nonfunctional_ppp[osun_owin]
plot(wp_functional_osun_ppp)

plot(wp_nonfunctional_osun_ppp)

Kernel Density Estimation

With our data of Osun functional and non-functional water points assembled, we can now proceed to Kernel Density Estimation.

# Convert data to km as our unit of measurement
wp_func_osun_ppp_km <- rescale(wp_functional_osun_ppp, 1000, "km")
wp_nonfunc_osun_ppp_km <- rescale(wp_nonfunctional_osun_ppp, 1000, "km")

For Kernel Density Estimation, there are multiple method in determining the bandwidth with which the calculation will be performed with. Below, we experiment with various automatic bandwith methods:

# Choosing automatic bandwith method
par(mfrow=c(2,2))
plot(density(wp_func_osun_ppp_km,
             sigma=bw.CvL,
             edge=TRUE,
             kernel="gaussian"))
plot(density(wp_func_osun_ppp_km,
             sigma=bw.scott,
             edge=TRUE,
             kernel="gaussian"))
plot(density(wp_func_osun_ppp_km,
             sigma=bw.diggle,
             edge=TRUE,
             kernel="gaussian"))
plot(density(wp_func_osun_ppp_km,
             sigma=bw.ppl,
             edge=TRUE,
             kernel="gaussian"))

From these maps, I have choosen the bw.ppl method to be used. This is as from the visualisations we can see we appear to be working with predominately tight clusters, and bw.ppl and bw.diggle give more appropriate values when such is the case. I chose bw.ppl over bw.diggle as it shows up better than bw.diggle, which is slightly difficult to see in the visualisation above.

Now, we will choose our kernel method. The different kernel methods are demonstrated below:

# Choosing kernel method
par(mfrow=c(2,2))
plot(density(wp_func_osun_ppp_km, 
             sigma=bw.ppl, 
             edge=TRUE, 
             kernel="gaussian"), 
     main="Gaussian")
plot(density(wp_func_osun_ppp_km, 
             sigma=bw.ppl, 
             edge=TRUE, 
             kernel="epanechnikov"), 
     main="Epanechnikov")
plot(density(wp_func_osun_ppp_km, 
             sigma=bw.ppl, 
             edge=TRUE, 
             kernel="quartic"), 
     main="Quartic")
plot(density(wp_func_osun_ppp_km, 
             sigma=bw.ppl, 
             edge=TRUE, 
             kernel="disc"), 
     main="Disc")

Since there are no significant differences observed across different kernels, we will use the Gaussian kernel, the default kernel method.

Lastly, to cover all bases, we shall try some adaptive bandwidth methods. Such methods are said to be better adapted to deal with skewed distributions of data.

par(mfrow=c(1,2))
# Try adaptive bandwith
plot( adaptive.density(wp_func_osun_ppp_km, method="voronoi"))
plot( adaptive.density(wp_func_osun_ppp_km, method="kernel"))

As we can see, adaptive bandwidth also does not give the best visualisations, so will stick to using the automatic bandwidth method, with bw.ppl and Gaussian kernel.

Hence, there are our final parameters of our Kernel Density Estimates will be as shown below:

kde_wp_func_bw_ppl <- density(wp_func_osun_ppp_km, 
                               sigma=bw.ppl, 
                               edge=TRUE,
                               kernel="gaussian")
kde_wp_nonfunc_bw_ppl <- density(wp_nonfunc_osun_ppp_km, 
                               sigma=bw.ppl, 
                               edge=TRUE,
                               kernel="gaussian")

Final Kernel Density Estimation Maps

And finally, we shall plot our Kernel Density Maps.

par(mfrow=c(1,2))
plot(kde_wp_func_bw_ppl)
plot(kde_wp_nonfunc_bw_ppl)

Display KDE Maps on OpenStreetMap

Convert to raster for tmap display

In order to display our KDE Maps on tmap, we would need to first convert out KDE Map to raster format.

# Convert to Gridded, then to raster
gridded_kde_wp_func_bw_ppl <- as.SpatialGridDataFrame.im(kde_wp_func_bw_ppl)
gridded_kde_wp_nonfunc_bw_ppl <- as.SpatialGridDataFrame.im(kde_wp_nonfunc_bw_ppl)

raster_kde_wp_func_bw_ppl <- raster(gridded_kde_wp_func_bw_ppl)
raster_kde_wp_nonfunc_bw_ppl <- raster(gridded_kde_wp_nonfunc_bw_ppl)

# Assign CRS info
projection(raster_kde_wp_func_bw_ppl) <- CRS("+init=EPSG:26392 +units=km")
projection(raster_kde_wp_nonfunc_bw_ppl) <- CRS("+init=EPSG:26392 +units=km")

Display on tmap OpenStreetMap

Now that we have completed the transformation, we can display our maps on OpenStreetMap using tmap.

tmap_mode('view')

kde_func_wp <- tm_basemap("OpenStreetMap")+
  tm_view(set.zoom.limits=c(9, 16)) +
  tm_shape(raster_kde_wp_func_bw_ppl) + 
  tm_raster("v", palette="YlOrRd") +
  tm_layout(legend.position = c("right", "bottom"), frame = FALSE)

kde_func_wp
kde_nonfunc_wp <- tm_basemap("OpenStreetMap")+
  tm_view(set.zoom.limits=c(9, 16)) +
  tm_shape(raster_kde_wp_nonfunc_bw_ppl) + 
  tm_raster("v", palette="YlOrRd") +
  tm_layout(legend.position = c("right", "bottom"), frame = FALSE)

kde_nonfunc_wp

Describe Spatial Patterns

Adding the LGA borders back on top of our KDE maps, we get:

tmap_mode('plot')
kde_func_wp +
  tm_shape(osun) +
  tm_borders() +
  tm_text("ADM2_EN", size = 0.6) 

kde_nonfunc_wp + 
  tm_shape(osun) +
  tm_borders() +
  tm_text("ADM2_EN", size = 0.6) 

Benefit of Kernel Density Maps as a representation tool

As we can see from the two maps, it is very easy to pick out the areas with high density of water points in comparison to to point maps or choropleth maps. This is as the Kernel Density Estimate provides quantitative values of the concentration of points at a specific point, allowing for clean, uncluttered, and specific.

Kernel Density estimates also calculates the density of each area, allowing for a smoother estimate, and hence a better representation of the distribution of events on a map.

Spatial Pattern Analysis

From the plotted Kernel Density Maps, we can see there is a relatively high density of functional water points in the following LGAs, as denoted by the bright/dark red colors observed in the map:

  • Ejigbo

  • Ede North

  • Osogbo, and

  • Iwo

On the other hand, for non-functional water points in Osun, the high density lies primarily between the LGAs:

  • Ife Central, and

  • Ife East

We will do further analysis on some of these LGAs of interest in the next section.

Second-order Spatial Point Pattern Analysis

Formulate Null Hypothesis

Before carrying out our Second-order analysis on selected LGAs, we must first formulate our hypotheses to test our data against.

As such, below are the null and alternate hypotheses for the analysis on functional / non-functional water points respectively:

  • H0: The distribution of functional / nonfunctional water points in the given study area are randomly distributed.

  • H1: The distribution of functional / nonfunctional water points in the given study area are not randomly distributed.

  • Confidence level : 95%

  • Significance level (alpha) : 0.05

  • The null hypothesis will be rejected if p-value is smaller than alpha value of 0.05.

Perform test using G Function

Extracting Study Areas

Will be using the 2 most notable LGAs for Functional and Nonfunctional water points, which are:

  • For Functional: Ejigbo and Ede North

  • For Non-Functional: Ife Central and Ife East

We must now extract these 4 LGAs as owins, then populate them with the relavent Water Point data.

# Study Areas for Functional Water Points
ejigbo_owin <- NGA[NGA$ADM2_EN == "Ejigbo",] %>%
    as('Spatial') %>%
    as('SpatialPolygons') %>%
    as('owin')

ede_north_owin <- NGA[NGA$ADM2_EN == "Ede North",] %>%
    as('Spatial') %>%
    as('SpatialPolygons') %>%
    as('owin')

# Study Areas for NonFunctional Water Points
ife_central_owin <- NGA[NGA$ADM2_EN == "Ife Central",] %>%
    as('Spatial') %>%
    as('SpatialPolygons') %>%
    as('owin')

ife_east_owin <- NGA[NGA$ADM2_EN == "Ife East",] %>%
    as('Spatial') %>%
    as('SpatialPolygons') %>%
    as('owin')
ejigbo_wp_func_ppp = rescale(wp_functional_ppp[ejigbo_owin], 1000, "km")
ede_north_wp_func_ppp = rescale(wp_functional_ppp[ede_north_owin], 1000, "km")
ife_central_wp_nonfunc_ppp = rescale(wp_nonfunctional_ppp[ife_central_owin], 1000, "km")
ife_east_wp_nonfunc_ppp = rescale(wp_nonfunctional_ppp[ife_east_owin], 1000, "km")
par(mfrow=c(2,2))
plot(ejigbo_wp_func_ppp)
plot(ede_north_wp_func_ppp)
plot(ife_central_wp_nonfunc_ppp)
plot(ife_east_wp_nonfunc_ppp)

Calculating G Function

In this analysis, we will be using G Function in order to analyse the spatial distribution of water points in these 4 LGAs. The G function was chosen for this case as we are performing analysis on segments of our study area rather than the study area as a cumulative.

G_ejigbo = Gest(ejigbo_wp_func_ppp, correction = "border")
plot(G_ejigbo)

G_ejigbo.csr <- envelope(ejigbo_wp_func_ppp, Gest, nsim = 39)
Generating 39 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,  39.

Done.
plot(G_ejigbo.csr)

G_ede_north = Gest(ede_north_wp_func_ppp, correction = "border")
plot(G_ede_north)

G_ede_north.csr <- envelope(ede_north_wp_func_ppp, Gest, nsim = 39)
Generating 39 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,  39.

Done.
plot(G_ede_north.csr)

G_ife_central = Gest(ife_central_wp_nonfunc_ppp, correction = "border")
plot(G_ife_central)

G_ife_central.csr <- envelope(ife_central_wp_nonfunc_ppp, Gest, nsim = 39)
Generating 39 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,  39.

Done.
plot(G_ife_central.csr)

G_ife_east = Gest(ife_east_wp_nonfunc_ppp, correction = "border")
plot(G_ife_east)

G_ife_east.csr <- envelope(ife_east_wp_nonfunc_ppp, Gest, nsim = 39)
Generating 39 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,  39.

Done.
plot(G_ife_east.csr)

Second-Order Analysis Conclusion

From the graphs drawn, we can draw the following conclusions:

Ejigbo: The observed G(r) lies above the envelope at all plotted values of r, indicating that the Functional Water Points in Ejigbo are clustered. We can therefore reject the null hypothesis, and conclude that the Functional Water Points in Ejigbo are clustered with 95% confidence.

Ede North: Similarly, the observed G(r) lies above the envelope at all plotted values of r, indicating that the Functional Water Points in Ede North are clustered. We can therefore reject the null hypothesis, and conclude that the Functional Water Points in Ede North are clustered with 95% confidence.

Ife Central: Likewise, the observed G(r) lies above the envelope at all plotted values of r, indicating that the Non-Functional Water Points in Ife Central are clustered. We can therefore reject the null hypothesis, and conclude that the Non-Functional Water Points in Ife Central are clustered with 95% confidence.

Ife East: Lastly, we have Ife East, where the observed G(r) lies above the envelope, except at r > 0.925km (approximately). This indicates that the Non-Functional Water Points in Ife Central are clustered at values of r below 0.925km, and therefore for these values we can reject the null hypothesis, and conclude that the Non-Functional Water Points in Ife East at r < 0.925km are clustered with 95% confidence.

Spatial Correlation Analysis

Formulate Null Hypothesis and Alternative Hypothesis

Moving on to Spatial Correlation Analysis of Functional and Non-Functional Water Points in Osun, we will be testing the hypotheses as shown below:

  • H0: The Functional Water Points in Osun are not co-located with the Non-Functional Water Points in Osun.

  • H1: The Functional Water Points in Osun are co-located with the Non-Functional Water Points in Osun.

  • Confidence level : 95%

  • Significance level (alpha) : 0.05

  • The null hypothesis will be rejected if p-value is smaller than alpha value of 0.05.

Perform Local Colocation Quotient Analysis

Proprocessing to Osun data

In order to carry out our co-location analysis, we first need to assemble the relevant Osun Functional and Non-Functional Water point data.

# Re-select wp sf data
wp_sf_osun <- wp_sf %>%
  st_intersection(osun) %>%
  rename(status_clean = 'X.status_clean') %>%
  rename(country_name = 'X.clean_country_name') %>%
  rename(adm1 = 'X.clean_adm1') %>%
  dplyr::select(status_clean, country_name, adm1) %>%
  mutate(status_clean = replace_na(
    status_clean, "unknown"
  ))

# Filter out non-osun data
wp_sf_osun <- wp_sf_osun %>% filter(adm1 %in%
           c("Osun"))

# Create col to catagorise functional and non functional water points
wp_sf_osun <- wp_sf_osun %>%
  mutate(`func_status` = case_when(
    `status_clean` %in% c("Functional", 
                          "Functional but not in use",
                          "Functional but needs repair") ~ 
      "Functional",
    `status_clean` %in% c("Abandoned/Decommissioned",
                               "Non-Functional due to dry season",
                               "Non-Functional",
                               "Abandoned",
                               "Non functional due to dry season") ~ 
      "Non-Functional",
    `status_clean` == "Unknown" ~ "Unknown"))

glimpse(wp_sf_osun)
Rows: 5,322
Columns: 5
$ status_clean <chr> "Functional but needs repair", "Functional", "Functional"…
$ country_name <chr> "Nigeria", "Nigeria", "Nigeria", "Nigeria", "Nigeria", "N…
$ adm1         <chr> "Osun", "Osun", "Osun", "Osun", "Osun", "Osun", "Osun", "…
$ Geometry     <POINT [m]> POINT (212810 386707.6), POINT (212202 349210.1), P…
$ func_status  <chr> "Functional", "Functional", "Functional", "Functional", "…

We can now plot the processed data to visualise the sf data we will be working with:

tmap_mode("view")
tm_shape(osun) +
  tm_polygons() +
tm_shape(wp_sf_osun)+ 
  tm_dots(col = "func_status",
             size = 0.01,
             border.col = "black",
             border.lwd = 0.5) +
  tm_view(set.zoom.limits = c(9, 16))

Calculate Local Colocation Quotients

In order to compute the LCLQ, we first need to prepare the nearest neighbour list, kernel weights, and vector list, as done below:

# Prepare nearest neighbours list
nb <- include_self(
  st_knn(st_geometry(wp_sf_osun), 6))

# Compute kernel weights
wt <- st_kernel_weights(nb, 
                        wp_sf_osun, 
                        "gaussian", 
                        adaptive = TRUE)
# Prepare vectors
Fuctional_vector <- wp_sf_osun %>%
  filter(func_status == "Functional")
A <- Fuctional_vector$func_status

Non_Functional_vector <- wp_sf_osun %>%
  filter(func_status == "Non-Functional")
B <- Non_Functional_vector$func_status

With the relevant components prepared, we can now calculate the LCLQ values for each Functional Water Point, complete with the Complete Spatial Randomness Test.

LCLQ <- local_colocation(A, B, nb, wt, 49)

In order to plot these values, we then join the output of the LCLQ through cbind().

LCLQ_osun <- cbind(wp_sf_osun, LCLQ)

We can now plot the LCLQ values for analysis, with statistically significant LCLQ values that have a p-value of less than 0.05 highlighted in red:

tmap_mode("view")
osun_LCLQ_map <- tm_shape(osun) +
  tm_polygons() +
tm_shape(LCLQ_osun)+ 
  tm_dots(col = "Non.Functional",
             size = 0.02,
             border.col = "black",
             border.lwd = 0.5,
             alpha=0.5,
             palette=c("red", "grey")) +
  tm_view(set.zoom.limits = c(9, 16))

osun_LCLQ_map

Conclusion

tmap_mode("plot")
osun_LCLQ_map + 
  tm_shape(osun) +
  tm_borders() +
  tm_text("ADM2_EN", size = 0.6) 

From this analysis, we can see that there indeed there is indeed some correlation between the location of Functional and Non-Functional water points in certain LGAs in Osun, most notably Boluwaduro, Boripe, Egbedore, Ede North, Ede South, Alyedire, Alyedade, Atakumosa West, and Atakumosa East, where the highlighted points are concentrated.

Upon examining the data more closely, it is observed that all the statistically significant points from the Complete Spatial Randomness Test, as highlighted in red, all have a Local Colocation Quotient of <1. Thus, for these points, we can reject the null hypothesis, and conclude that these Functional Water Points are indeed co-located with their surrounding Non-Functional Water Points with a 95% confidence. As for the nature of this correlation, from the LCLQ of <1, we can say that these Functional Water Points are less likely to have Non-Functional Water points in their neighbourhood. We can therefore infer that the LGA listed above, which have the majority of highlighted points, have a less than proportional number of Non-Functional Water Points for each Functional Water Point.