Commit af777b5c authored by Alison Beamish

Ali edited markdown based on Romulo's updates

parent a8cb6b36
Pipeline #22509 failed with stage
in 16 seconds
......@@ -27,7 +27,12 @@ The tool is implemented in R and uses Leaflet [Cheng et al., 2019](https://rdrr.
## 1.2 Sample datasets
The examples in this documentation use an L2A Sentinel-2 timeseries stack from 2018 (6 days, 9 bands each) and reference points that are included in the HaSa package. These Sentinel-2 data are from the Kyritz-Ruppiner Heide, a former military training area north east of Berlin, Germany. The open heathlands within the former military training area are designated protected areas under the European Natura 2000 network and are subject to management activities including tree removal, controlled burning and machine mowing. The reference data include 7 habitat classes identified with a priori expert knowledge.
The Sentinel-2 data are downloaded and processed using the German Centre for Geosciences (GFZ) Time Series System for Sentinel-2 (GTS2). Data are made available and atmospherically corrected via a simple web application programming interface (API). Detailed information on the GTS2 system can be found [here](https://www.gfz-potsdam.de/gts2/).
The Sentinel-2 data are downloaded and processed using the German Research Centre for Geosciences (GFZ) Time Series System for Sentinel-2 (GTS2). Data are made available and atmospherically corrected via a simple web application programming interface (API). Detailed information on the GTS2 system can be found [here](https://www.gfz-potsdam.de/gts2/). The metadata, including the Sentinel-2 bands used and their band IDs in the timeseries stack, are provided below.
```{r S2 metadata, eval = TRUE, echo=FALSE}
metadat <- read.csv(paste(wd, "Data/S2_stack_metadata.csv", sep = ""), header = T)
colnames(metadat) <- c("Sentinel-2 bands", "Band 2 - Blue", "Band 3 - Green", "Band 4 - Red", "Band 5 - Vegetation Red Edge 1", "Band 6 - Vegetation Red Edge 2", "Band 7 - Vegetation Red Edge 3", "Band 8 - NIR", "Band 11 - Short wave Infra-Red 1", "Band 12 - Short wave Infra-Red 2")
kable(metadat, caption = "Table 1. Sentinel-2 bands included in the timeseries stack and the corresponding band ID in the HaSa tool")
```
Sample reference data are provided in two formats: a data table and a point shapefile. The table includes spectral information for each class desired for the classification and can be used directly to train the models (rows = class, columns = spectral wavebands). The first row must contain the spectral waveband names, and these must match the wavebands of the input satellite data.
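The name-matching requirement can be checked before training. The following is a minimal sketch in base R, not part of `HaSa`: the helper `check_reference_bands()` and the band names `band_1` to `band_3` are hypothetical and only illustrate the comparison.

```r
# Hypothetical helper (not a HaSa function): report mismatches between the
# column names of a reference table and the band names of the satellite stack.
check_reference_bands <- function(ref, band_names) {
  list(
    missing_in_ref   = setdiff(band_names, colnames(ref)),    # bands without a reference column
    missing_in_stack = setdiff(colnames(ref), band_names),    # reference columns without a band
    same_order       = identical(colnames(ref), band_names)   # TRUE only if names and order agree
  )
}

# Toy reference table: 2 classes (rows) x 3 wavebands (columns); values are invented.
band_names <- c("band_1", "band_2", "band_3")
ref <- data.frame(band_1 = c(0.03, 0.05),
                  band_2 = c(0.06, 0.08),
                  band_3 = c(0.04, 0.10))

res <- check_reference_bands(ref, band_names)
print(res)
```

With a real raster stack, the band names would presumably come from `names()` applied to the loaded stack rather than from a hand-written vector.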
......@@ -75,7 +80,7 @@ lapply(libraries, library, character.only = TRUE)
```
# 3 Load demo data
An important step preceding habitat classification is to load the satellite timeseries stack, reference data, and class names. `HaSa` provides a set of functions to guide the user through the data loading process.
An important step preceding habitat classification is to load the Sentinel-2 satellite timeseries stack, reference data, and class names. `HaSa` provides a set of functions to guide the user through the data loading process.
## 3.1 Data directories
Before loading the input data and using `HaSa`, the user needs to define a series of directory paths. These are the locations from which `HaSa` reads input data and where it stores intermediate and final results. The directory paths are relative to the working directory path, i.e., `wd`. The following code sets all the paths assuming that the root path is the current directory, i.e., the `demo` directory.
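As an illustration of paths defined relative to `wd`, here is a minimal base R sketch. It is not the package's own setup code: the helper `make_paths()`, the name `outPath`, and the folder layout are assumptions made for this example, and a temporary directory stands in for the `demo` directory.

```r
# Hypothetical helper (not a HaSa function): derive input and output paths
# from a root directory and create the output folder if it does not exist.
make_paths <- function(wd) {
  # Ensure the root ends with a separator so that paste0() concatenation works.
  if (!grepl("/$", wd)) wd <- paste0(wd, "/")
  paths <- list(
    wd       = wd,
    dataPath = paste0(wd, "Data/"),          # assumed location of the input data
    outPath  = paste0(wd, "Data/Results/")   # assumed location of the results
  )
  dir.create(paths$outPath, recursive = TRUE, showWarnings = FALSE)
  paths
}

# A temporary directory is used here in place of the demo directory.
paths <- make_paths(tempdir())

# File paths are then built by concatenation, as elsewhere in this documentation.
shp_path <- paste0(paths$dataPath, "Example_Reference_Points.shp")
print(shp_path)
```

Keeping every path relative to a single root means that only `wd` has to change when the demo is moved to another location.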
......@@ -93,7 +98,7 @@ raster::rasterOptions(tmpdir = "./RasterTmp/")
```
## 3.2 Satellite timeseries stack
The Satellite time series is either passed as a **3.2.1** stack of images already clipped or **3.2.2** a stack of image to be clipped. In both cases, the input Satellite images needs to either have a valid projection or the projection be passed as parameter, i.e., `sat_crs_str = '+proj=utm +zone=32 +datum=WGS84 +units=m +no_defs'`, otherwise, the function will report error. Satellite time series data are available in `dataPath`.
The satellite time series is passed either as a stack of images that is already clipped (**3.2.1**) or as a stack of images to be clipped (**3.2.2**). In both cases, the input satellite images must either have a valid projection or the projection must be passed as a parameter, e.g., `sat_crs_str = '+proj=utm +zone=32 +datum=WGS84 +units=m +no_defs'`; otherwise, the function will report an error. Satellite time series data are available in `dataPath`.
### 3.2.1 - Clipped
The following example loads a Sentinel-2 timeseries stack clipped to the study area.
......@@ -103,6 +108,7 @@ timeseries_stack <- HaSa::load_timeseries_stack(satellite_series_path)
```
```{r raster preview clipped, eval = TRUE}
# See Table 1 for band IDs and corresponding Sentinel-2 bands
r = 19; g = 20; b = 21;
raster::plotRGB(timeseries_stack,r = r,g = g,b = b,stretch = "lin", axes = T)
```
......@@ -120,7 +126,7 @@ raster::plotRGB(timeseries_stack,r = r,g = g,b = b,stretch = "lin", axes = T)
```
## 3.3 Selecting reference samples
Reference samples include spectral information of the defined classes as chosen by the user. Reference data can be either passed as a table **3.3.1** or a shapefile **3.3.2**. Reference samples passed as a table can either be pre-extracted spectral information from an image or spectral information imported from a spectral library. Reference samples passed as a shapefile are point locations defined by the user on the satellite timeseries stack from where the spectral information should be extracted for each class. Sample reference data are available in `dataPath`.
Reference samples include spectral information of the defined classes as chosen by the user. Reference data can be passed either as a table (**3.3.1**) or as a shapefile (**3.3.2**). Reference samples passed as a table can be either pre-extracted spectral information from an image or spectral information imported from a spectral library. Reference samples passed as a shapefile are point locations defined by the user on the satellite timeseries stack, from which the spectral information is automatically extracted for each class. Sample reference data are available in `dataPath` in table and shapefile format.
### 3.3.1 Reference table
The table includes spectral information from each class type (rows = classes, columns = spectral reflectance). The first row must contain the spectral waveband names, which must be the same as those used in the input satellite time series stack.
......@@ -136,7 +142,7 @@ ref[,c(1:3)]
```
### 3.3.2 Reference points
The point shapefile contains a point location per class and is used to extract the reference data. The wavelengths for each point are extracted from the Sentinel-2 timeseries stack using the R routine `raster::extract`. If the shapefile does not have the same projection as the input Sentinel-2 stack, `HaSa` will automatically reproject it to match the projection of the Sentinel-2 stack. The resulting table has the following format (rows = classes, columns = spectral wavebands).
The point shapefile contains a point location per class and is used to automatically extract the reference data. The wavelengths for each point are extracted from the Sentinel-2 timeseries stack using the R routine `raster::extract`. If the shapefile does not have the same projection as the input Sentinel-2 stack, `HaSa` will automatically reproject it to match the projection of the Sentinel-2 stack. The resulting table has the following format (rows = classes, columns = spectral wavebands).
```{r reference points, echo = TRUE, results= "hide", message = FALSE, tidy = FALSE}
shp_path <- paste(dataPath,"Example_Reference_Points.shp", sep = "")
......@@ -152,7 +158,7 @@ ref[1:5,1:3]
```
### 3.3.3 Define class names
The class names should be passed as a vector in the same order as the reference spectra (rows = habitats).
The class names should be passed as a vector in the same order as the reference spectra (rows = class).
```{r eval = TRUE}
#create vector with class names. The order of classes must follow the same order of reference spectra (row = class)
......@@ -166,7 +172,7 @@ classNames <- c("deciduous","coniferous","heather_young","heather_old",
### 4.1.1 Plot configuration
The interactive mode of `HaSa` requires the user's expertise to define a threshold for habitat extraction. The user selects a threshold with the help of an interactive map. The interactive map includes an RGB composite of one of the Sentinel-2 scenes to assist in habitat extraction.
The satellite timeseries stack (`SentinelStack_2018.tif`) loaded in **3.2.1** has 6 scenes and each scene includes the following bands:`Band 2 - Blue`, `Band 3 - Green`, `Band 4 - Red`, `Band 5 - Vegetation Red Edge`, `Band 6 - Vegetation Red Edge`, `Band 7 - Vegetation Red Edge`, `Band 8 - NIR`, `Band 11 - Short wave Infra-Red`, and `Band 12 - Short wave Infra-Red`. Using the clipped Sentinel-2 timeseries stack provided as input, the user can test which bands should be used in the plot using `HaSa::plot_configuration()`. The variable `plot_rgb` will later be used for the interactive procedure of habitat sampling.
The satellite timeseries stack (`SentinelStack_2018.tif`) loaded in **3.2.1** has 6 scenes, and each scene includes the following bands: `Band 2 - Blue`, `Band 3 - Green`, `Band 4 - Red`, `Band 5 - Vegetation Red Edge 1`, `Band 6 - Vegetation Red Edge 2`, `Band 7 - Vegetation Red Edge 3`, `Band 8 - NIR`, `Band 11 - Short wave Infra-Red 1`, and `Band 12 - Short wave Infra-Red 2` (see Table 1). Using the clipped Sentinel-2 timeseries stack provided as input, the user can test which bands should be used in the plot using `HaSa::plot_configuration()`. The variable `plot_rgb` will later be used for the interactive procedure of habitat sampling.
```{r plot configuration, eval = TRUE, results='hide', message = FALSE, warning = FALSE, tidy = FALSE}
shp_path <- paste(dataPath,"Example_Reference_Points.shp", sep = "")
......@@ -198,16 +204,16 @@ HaSa::multi_Class_Sampling(
nb_models = 200, # number of models to collect (recommended value: 200)
nb_it = 10, # number of iterations for model accuracy
# (recommended value:10)
buffer = 15, # distance (in m) for new sample collection around initial
# samples (depends on pixel size)
buffer = 10, # distance (in m) for new sample collection around initial
# samples (depends on pixel size and image resolution)
reference = ref, # table of reference spectra [data.frame]
model = "rf", # which machine learning algorithm to use ("rf" random
# forest or "svm" support vector machine;
# recommended input: rf)
mtry = 10, # number of predictors used at random forest splitting nodes
# (recommended input: mtry << n predictors)
last = F, # only FALSE for one class classifier ("TRUE" or "FALSE";
# recommended input: "F") *See note 2
last = F, # only FALSE for one class classifier (TRUE or FALSE;
# recommended input: FALSE) *See note 2
seed = 3, # set seed for reproducible results (recommended value: 3)
init.seed = "sample", # "sample" for new or use Run@seeds to reproduce previous
# steps *See note 3
......@@ -220,8 +226,8 @@ HaSa::multi_Class_Sampling(
# output *See note 4
RGB = c(19,20,21), # palette colors for the interactive plots
overwrite = TRUE, # overwrite the KML and raster files from previous runs
save_runs = FALSE, # an Habitat object is saved into disk for each run (default TRUE)
parallel_mode = TRUE, # run loops using all available cores
save_runs = TRUE, # a Habitat object is saved to disk for each run (default TRUE)
parallel_mode = TRUE, # run loops using all available cores
max_num_cores = 4, # maximum number of cores for parallelism (default 5)
plot_on_browser = FALSE # plot on the browser or inline in a notebook (default TRUE)
)
......@@ -253,10 +259,10 @@ From this interactive map, the user has two choices:
If the user chooses to extract the habitat, a user defined threshold is entered into the R console and the following files are saved:
* HabitatSampler object (Run) - R Binary
* probability map - *.kml, *.png, geocoded *.tif (is this 3 separate files??)
* HabitatSampler object (Run) - R Binary: The R object is used when the user wants to restart the computation at a specific step or reuse the seeds for sampling.
* probability map - *.kml, *.png, geocoded *.tif: The TIFF contains all classes plotted, with one color per class. See the example in demo/Data/Results/HabitatMap_final.pdf
* threshold list - R Binary
* leaflet interactive web interface - *.html
* leaflet interactive web interface - *.html: Leaflet map with the 3 RGB channels and the raster containing the probabilities. The file is rewritten for each run.
After habitat extraction is done, the user can proceed automatically to the next habitat by entering 0 into the R console.
......