spsample.prob {GSIF}

R Documentation

Estimate occurrence probabilities of a sampling plan (points)

Description

Estimates occurrence probabilities as the average of a kernel density estimate (spreading of points in geographical space) and a MaxLike analysis (spreading of points in feature space). The output 'iprob' indicates whether the sampling plan has systematically missed some important locations / features, and can be used as an input for geostatistical modelling (e.g. as weights for regression modelling).

Usage

 
## S4 method for signature 'SpatialPoints,SpatialPixelsDataFrame'
spsample.prob(observations, covariates, 
  quant.nndist=.95, n.sigma, ...)

Arguments

`observations`	object of class `SpatialPoints`; sampling locations
`covariates`	object of class `SpatialPixelsDataFrame`; list of covariates of interest
`quant.nndist`	numeric; threshold probability to determine the search radius (sigma)
`n.sigma`	numeric; size of sigma used for kernel density estimation (optional)
`...`	other optional arguments that can be passed to function `spatstat::density`
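
The role of `quant.nndist` can be illustrated with a minimal base R sketch. It assumes that the search radius (sigma) is taken as the `quant.nndist` quantile of the nearest-neighbour distances between sampling locations; the internals of `spsample.prob` may differ, and the coordinates below are made up for illustration.

```r
## Hypothetical sketch: derive a search radius (sigma) from the
## nearest-neighbour distances of the sampling locations.
## The coordinates are simulated and are not part of any GSIF data set.
set.seed(1)
xy <- cbind(x = runif(50, 0, 1000), y = runif(50, 0, 1000))

d <- as.matrix(dist(xy))   # pairwise Euclidean distances
diag(d) <- Inf             # ignore the zero distance of a point to itself
nn <- apply(d, 1, min)     # distance to the nearest neighbour, per point

## assumption: sigma is the quant.nndist quantile of these distances
quant.nndist <- 0.95
sigma <- unname(quantile(nn, probs = quant.nndist))
sigma
```

Under this assumption, a higher `quant.nndist` gives a larger sigma and therefore a smoother density surface; supplying `n.sigma` directly would bypass this step.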

Value

Returns a list of objects where 'iprob' ("SpatialPixelsDataFrame") is the map showing the estimated occurrence probabilities.

Note

Occurrence probabilities for geographical space are derived using a kernel density estimator. The sampling intensities are converted to probabilities by dividing the sampling intensity by the maximum sampling intensity for the study area (Baddeley, 2008). The occurrence probabilities for feature space are determined using the MaxLike algorithm (Royle et al., 2012). The lower the average occurrence probability for the whole study area, the lower the representation efficiency of a sampling plan.
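
The arithmetic described above can be sketched on a handful of pixels. The values below are invented for illustration and are not output of `spsample.prob`; the variable names `intensity`, `p_geo` and `p_feature` are hypothetical.

```r
## Illustrative values only: five pixels of a study area
intensity <- c(0.0, 0.8, 2.4, 4.0, 1.6)       # hypothetical kernel density per pixel
p_feature <- c(0.10, 0.35, 0.60, 0.90, 0.20)  # hypothetical MaxLike probabilities

## geographical space: divide by the maximum intensity in the study area
p_geo <- intensity / max(intensity)

## combined occurrence probability: average of the two components
iprob <- (p_geo + p_feature) / 2
iprob

## study-area mean as a simple summary of representation efficiency
mean(iprob)
```

Pixels with no nearby points and unusual covariate values end up with the lowest combined probability, which is what flags them as under-represented.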
The MaxLike function might fail to produce predictions (e.g. if no continuous covariate is provided or if the optim function cannot find the global optimum), in which case an error message is generated. Running principal component analysis on the covariates (i.e. converting them to standardized, independent components) prior to running spsample.prob is therefore highly recommended.
This function can be time-consuming for large grids.
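
As a rough illustration of the recommended principal component step, the sketch below uses `stats::prcomp` on a plain data frame. The covariate names and values are simulated; for gridded covariates, the Examples section uses the GSIF function `spc` instead.

```r
## Hypothetical covariates on very different scales (simulated values)
set.seed(42)
covs_df <- data.frame(
  elev = rnorm(200, mean = 300, sd = 50),
  twi  = rnorm(200, mean = 10,  sd = 2)
)
## a third covariate that is strongly correlated with elevation
covs_df$temp <- 25 - 0.006 * covs_df$elev + rnorm(200, sd = 0.3)

## centre and scale each covariate, then rotate to principal components
pca <- prcomp(covs_df, center = TRUE, scale. = TRUE)
pcs <- as.data.frame(pca$x)

round(cor(pcs), 3)        # off-diagonal entries are zero: components are uncorrelated
summary(pca)$importance   # proportion of variance explained per component
```

The resulting components are continuous, centred and mutually uncorrelated, which is the kind of input the note recommends supplying to the MaxLike step.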

Author(s)

Tomislav Hengl

References

Baddeley, A. (2008) Analysing spatial point patterns in R. Technical report, CSIRO Australia. Version 4.
Royle, J.A., Chandler, R.B., Yackulic, C. and Nichols, J.D. (2012) Likelihood analysis of species occurrence probability from presence-only data for modelling species distributions. Methods in Ecology and Evolution.

Examples

library(GSIF)
library(sp)
library(raster)
library(plotKML)
library(maxlike)
library(spatstat)
library(maptools)

data(eberg)
data(eberg_grid)
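## existing sampling plan (random 20% subset of the points):
set.seed(1)  # arbitrary seed, only to make the subset reproducible
sel <- runif(nrow(eberg)) < .2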
eberg.xy <- eberg[sel,c("X","Y")]
coordinates(eberg.xy) <- ~X+Y
proj4string(eberg.xy) <- CRS("+init=epsg:31467")
## covariates:
gridded(eberg_grid) <- ~x+y
proj4string(eberg_grid) <- CRS("+init=epsg:31467")
## convert to continuous independent covariates:
formulaString <- ~ PRMGEO6+DEMSRT6+TWISRT6+TIRAST6
eberg_spc <- spc(eberg_grid, formulaString)

## derive occurrence probability:
covs <- eberg_spc@predicted[1:8]
iprob <- spsample.prob(eberg.xy, covs)
## Note: obvious omission areas:
hist(iprob[[1]]@data[,1])
## compare with random sampling:
rnd <- spsample(eberg_grid, type="random",
     n=length(iprob[["observations"]]))
iprob2 <- spsample.prob(rnd, covs)
## compare the two:
par(mfrow=c(1,2))
plot(raster(iprob[[1]]), zlim=c(0,1), col=SAGA_pal[[1]])
points(iprob[["observations"]])
plot(raster(iprob2[[1]]), zlim=c(0,1), col=SAGA_pal[[1]])
points(iprob2[["observations"]])

## fit a weighted lm:
eberg.xy <- eberg[sel,c("SNDMHT_A","X","Y")]
coordinates(eberg.xy) <- ~X+Y
proj4string(eberg.xy) <- CRS("+init=epsg:31467")
eberg.xy$iprob <- over(eberg.xy, iprob[[1]])$iprob
eberg.xy@data <- cbind(eberg.xy@data, over(eberg.xy, covs))
fs <- as.formula(paste("SNDMHT_A ~ ", 
    paste(names(covs), collapse="+")))
## the lower the occurrence probability, the higher the weight:
w <- 1/eberg.xy$iprob
m <- lm(fs, eberg.xy, weights=w)
summary(m)
## compare to standard lm:
m0 <- lm(fs, eberg.xy)
summary(m)$adj.r.squared
summary(m0)$adj.r.squared

## all at once:
gm <- fit.gstatModel(eberg.xy, fs, covs, weights=w)
plot(gm)

[Package GSIF version 0.5-5 Index]