5 min read

SSN package tryout

Load SSN package.

library(SSN)
## Warning: package 'SSN' was built under R version 3.4.4
## Warning: package 'RSQLite' was built under R version 3.4.4

Simulate stream network

First we simulate several stream networks in SSN object form. Sampling sites depicted as black hollow circles and prediction sites as green filled circles.

# Delete any old simulations in the working directory
unlink("simulated_network", recursive = TRUE)

# Generate and save a simulated network
my.ssn <- createSSN(n = c(200, 200, 100, 100),
          obsDesign = hardCoreDesign(c(100, 100, 100, 100), c(0.1, 0.4, 0.6, 0.9)),
          predDesign = systematicDesign(c(0.3, 0.5, 0.7, 0.9)),
          importToR = TRUE,
          path = paste0(getwd(), "/simulated_network"),
          treeFunction = iterativeTreeLayout)
#str(my.ssn)
my.ssn
## Object of class Spatial Stream Network
## 
## Object includes observations on 8 variables across 277 sites within the bounding box
##         min      max
## x -4.271398 76.99664
## y  0.000000 24.57128
## 
## Object also includes 1 set of prediction points with 1323 locations
## 
## Variables recorded are (found using names(object)):
## $Obs
## [1] "locID"      "upDist"     "pid"        "netID"      "rid"       
## [6] "ratio"      "shreve"     "addfunccol"
## 
## $preds
## [1] "locID"      "upDist"     "pid"        "netID"      "rid"       
## [6] "ratio"      "shreve"     "addfunccol"
## 
## Generic functions that work with this object include names, plot, print, summary, hist, boxplot and qqnorm

The obsDesign and predDesign arguments are both functions which specify the way in which sampling and prediction sites are spread out in the network. These are:
1. binomialDesign(n) - uniformally and randomly distributes n points on the network 2. poissonDesign(lambda) - generates points on the network at a rate lambda 3. systematicDesign(spacing) - generates points that are at a constant distance away from each other 4. hardCoreDesign(n, inhibition_region) - generates points randomly and then eliminates them until they are are at least inhibition region away from each other There are 2 ways in which the stream network can be simulated: 1. igraphKamadaKawai - works faster but creates more unrealistic networks 2. iterativeTreeLayout - takes longer but makes more realistic networks Plot the strean network with sampling sites as black circles and prediction locations as green dots ### Generate stream network data First, we need to create the distance matrix calculating hydrological distance between sites

createDistMat(my.ssn, "preds", o.write = TRUE, amongpreds = TRUE)

Second, we need to extract the observation and prediction dataframes and fill them with some dummy data.

# Exract dataframes
raw.obs <- getSSNdata.frame(my.ssn, "Obs")
raw.pred <- getSSNdata.frame(my.ssn, "preds")
# Fill dataframes
raw.obs$cont1 <- rnorm(length(raw.obs[,1]))
raw.pred$cont1 <- rnorm(length(raw.pred[,1]))
raw.obs$cont2 <- rnorm(length(raw.obs[,1]))
raw.pred$cont2 <- rnorm(length(raw.pred[,1]))
raw.obs$factor1 <- as.factor(sample.int(4, length(raw.obs[,1]), replace = TRUE))
raw.pred$factor1 <- as.factor(sample.int(4, length(raw.pred[,1]), replace = TRUE))
raw.obs$random1 <- as.factor(sample(1:3, length(raw.obs[,1]), replace = TRUE))
raw.obs$random2 <- as.factor(sample(1:4, length(raw.obs[,1]), replace = TRUE))
raw.pred$random1 <- as.factor(sample(1:3, length(raw.pred[,1]), replace = TRUE))
raw.pred$random2 <- as.factor(sample(1:4, length(raw.pred[,1]), replace = TRUE))

Third, the modified dataframes with the stream network data needs to be redistributed on the network with the spatial autocorrelated errors from the Ver Hoef and Peterson tail-up/tail-down models.

sim.out <- SimulateOnSSN(my.ssn, ObsSimDF = raw.obs, PredSimDF = raw.pred, "preds", formula = ~ cont1 + cont2 + factor1, coefficients = c(10, 1, 0, -2, 0, 2), CorModels = c("LinearSill.tailup", "Mariah.taildown", "Exponential.Euclid", "random1", "random2"), use.nugget = TRUE, CorParms = c(3, 10, 2, 10, 1, 5, 1, 0.5, 0.1), addfunccol = "addfunccol")

We can check to see that the coefficients and correlation models we supplied were implemented correctly

sim.out$FixedEffects
##        Xnames Coefficient
## 1 (Intercept)          10
## 2       cont1           1
## 3       cont2           0
## 4    factor12          -2
## 5    factor13           0
## 6    factor14           2
sim.out$CorParms
##             CorModel    type Parameter
## 1  LinearSill.tailup parsill       3.0
## 2  LinearSill.tailup   range      10.0
## 3    Mariah.taildown parsill       2.0
## 4    Mariah.taildown   range      10.0
## 5 Exponential.Euclid parsill       1.0
## 6 Exponential.Euclid   range       5.0
## 7            random1 parsill       1.0
## 8            random2 parsill       0.5
## 9             nugget parsill       0.1

Finally, we can extract and visualize the stream network with the data represented by coloured circles.

sim.ssn <- sim.out$ssn.object
plot(sim.ssn, "Sim_Values", xlab = "x-coord", ylab = "y-coord", cex = 1.5)

Modelling stream data

To model the stream, we will use the prediction sites and attempt to predict the values at those locations which we generated in the previous section. We have to extract the dataframes again so that we can store the correct values from the prediction sites and compare them against the ones generated by the model we use later on.

sim.obs <- getSSNdata.frame(sim.ssn, "Obs")
sim.pred <- getSSNdata.frame(sim.ssn, "preds")
stored.preds <- sim.pred$Sim_Values
sim.pred$Sim_Values <- NA
sim.ssn <- putSSNdata.frame(sim.pred, sim.ssn, "preds")

Next we will fit a model to our data and check out the output.

glmssn.out <- glmssn(Sim_Values ~ cont1 + cont2 + factor1, sim.ssn, CorModels = c("LinearSill.tailup", "Mariah.taildown", "Exponential.Euclid", "random1", "random2"), addfunccol = "addfunccol")
summary(glmssn.out)
## 
## Call:
## glmssn(formula = Sim_Values ~ cont1 + cont2 + factor1, ssn.object = sim.ssn, 
##     CorModels = c("LinearSill.tailup", "Mariah.taildown", "Exponential.Euclid", 
##         "random1", "random2"), addfunccol = "addfunccol")
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.4320 -1.8643 -0.1954  1.4541  8.0972 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 10.429934   0.634460  16.439   <2e-16 ***
## cont1        1.074122   0.120125   8.942   <2e-16 ***
## cont2        0.008221   0.127342   0.065    0.949    
## factor11     0.000000         NA      NA       NA    
## factor12    -2.131031   0.344957  -6.178   <2e-16 ***
## factor13    -0.309970   0.330605  -0.938    0.349    
## factor14     1.683188   0.340898   4.938   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Covariance Parameters:
##    Covariance.Model Parameter Estimate
##   LinearSill.tailup   parsill    2.545
##   LinearSill.tailup     range  138.280
##     Mariah.taildown   parsill    2.578
##     Mariah.taildown     range    6.391
##  Exponential.Euclid   parsill    0.708
##  Exponential.Euclid     range   22.929
##             random1   parsill    0.308
##             random2   parsill    0.434
##              Nugget   parsill    0.324
## 
## Residual standard error: 2.626106
## Generalized R-squared: 0.487867

Finally, to assess how well our model performed we will plot the real values against the values predicted by the model.

glmssn.pred <- predict(glmssn.out, "preds")
predictions <- getSSNdata.frame(glmssn.pred, "preds")
plot(stored.preds, predictions$Sim_Values, xlab = "True", ylab = "Predicted", pch = 19)