# Database File of Manually Classified Sentinel-2A Data

This repository contains a database of manually labeled [Sentinel-2A](http://www.esa.int/Our_Activities/Observing_the_Earth/Copernicus/Sentinel-2) spectra that were used in the paper: [Hollstein, A.; Segl, K.; Guanter, L.; Brell, M.; Enesco, M. Ready-to-Use Methods for the Detection of Clouds, Cirrus, Snow, Shadow, Water and Clear Sky Pixels in Sentinel-2 MSI Images. Remote Sens. 2016, 8, 666.](http://www.mdpi.com/2072-4292/8/8/666).

The data itself and some associated metadata are stored in an [HDF5](https://www.hdfgroup.org/HDF5/) file which can be downloaded here: 

<https://gitext.gfz-potsdam.de/hollstei/sentinel2_manual_classification_clouds/raw/master/20160321_s2_manual_classification_data.h5>

The first dimensions of **dates**, **spectra**, and **classes** are aligned, so the class assigned to each spectrum can be retrieved by index. The association between **class_ids** and **class_names** is given in additional attributes.

The figure below shows the layout of the file and some sample data:


![hdf5 file](fig/screenshot_hdfview.png)
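
As a minimal sketch of how the aligned arrays could be used from Python, the helpers below map class ids to class names and count the labeled spectra per class. Reading the file relies on the third-party `h5py` package. The dataset names are the ones used in this README, but whether **class_ids** and **class_names** are stored as datasets or as attributes is an assumption, so `read_item` tries both. All function names are illustrative.

```python
from collections import Counter


def build_class_lookup(class_ids, class_names):
    """Map each numeric class id to its class name."""
    lookup = {}
    for class_id, name in zip(class_ids, class_names):
        if isinstance(name, bytes):  # h5py may return strings as bytes
            name = name.decode("utf-8")
        lookup[int(class_id)] = str(name)
    return lookup


def count_per_class(classes, lookup):
    """Count the labeled spectra per class name."""
    counts = Counter(int(c) for c in classes)
    return {lookup.get(cid, f"unknown({cid})"): n for cid, n in counts.items()}


def read_item(h5file, name):
    """Return `name` from an open file, trying datasets first, then attributes.

    ASSUMPTION: the storage location is not pinned down here, so the root
    group, the root attributes and the attributes of `classes` are tried.
    """
    if name in h5file:
        return h5file[name][...]
    for attrs in (h5file.attrs, h5file["classes"].attrs):
        if name in attrs:
            return attrs[name]
    raise KeyError(name)


def summarize(path):
    """Open the HDF5 file and return the number of spectra per class name."""
    # Third-party imports are kept local so the helpers above work without them.
    import h5py
    import numpy as np

    with h5py.File(path, "r") as f:
        classes = np.ravel(read_item(f, "classes"))
        lookup = build_class_lookup(
            np.ravel(read_item(f, "class_ids")),
            np.ravel(read_item(f, "class_names")),
        )
    return count_per_class(classes, lookup)
```

Calling `summarize("20160321_s2_manual_classification_data.h5")` on the downloaded file should then return a dictionary of class names and pixel counts, provided the layout assumptions hold.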

A technical note on how the data was produced is currently under preparation.

## How the data was produced

### 1. Data Collection

Open-access Sentinel-2 data is available for download on the [Scientific Data Hub](https://scihub.copernicus.eu/dhus). Each product consists of a 290 km wide image divided into 100 km × 100 km granules in UTM/WGS84 projection. The product name includes the sensing and creation dates as well as the relative orbit number of the image.

The following image shows the division into granules of the product **S2A_OPER_PRD_MSIL1C_PDMC_20151211T153317_R021_V20151211T084342_20151211T084342.SAFE**:

![granules](fig/screenshot_granules.jpg)
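
The dates and the orbit number encoded in such a product name can be extracted with a regular expression. The sketch below is only an illustration: the field layout (creation date after `PDMC_`, relative orbit after `_R`, sensing start and stop after `_V`) is inferred from the example name above, and `parse_product_name` is a hypothetical helper.

```python
import re
from datetime import datetime

# ASSUMPTION: field layout inferred from the example product name above.
PRODUCT_RE = re.compile(
    r"_PDMC_(?P<created>\d{8}T\d{6})"
    r"_R(?P<orbit>\d{3})"
    r"_V(?P<start>\d{8}T\d{6})_(?P<stop>\d{8}T\d{6})"
)
TIME_FORMAT = "%Y%m%dT%H%M%S"


def parse_product_name(name):
    """Extract creation date, relative orbit and sensing interval."""
    match = PRODUCT_RE.search(name)
    if match is None:
        raise ValueError(f"unrecognized product name: {name!r}")
    return {
        "created": datetime.strptime(match["created"], TIME_FORMAT),
        "relative_orbit": int(match["orbit"]),
        "sensing_start": datetime.strptime(match["start"], TIME_FORMAT),
        "sensing_stop": datetime.strptime(match["stop"], TIME_FORMAT),
    }


info = parse_product_name(
    "S2A_OPER_PRD_MSIL1C_PDMC_20151211T153317_R021_"
    "V20151211T084342_20151211T084342.SAFE"
)
print(info["relative_orbit"], info["sensing_start"].isoformat())
# prints: 21 2015-12-11T08:43:42
```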

To create a varied and spatially representative dataset, images were downloaded for a large variety of regions all over the world.

### 2. Data Classification

By means of different spectral tools, granule pixels are selected and classified into one of the following six classes: 

| **Class** | **Coverage** |
| :-------: | ------------ |
| cloud | opaque clouds |
| cirrus | cirrus and vapor trails |
| snow | snow and ice |
| shadow | shadows from clouds, cirrus, mountains, buildings, etc. |
| water | lakes, rivers, seas |
| clear-sky | remaining: crops, mountains, urban, etc. |

Spectral tools include *false-color composites*, *image enhancements* and *graphical visualization of spectra*. Our aim is to create highly heterogeneous classes with a balanced number of pixels.

The following figure illustrates the use of false-color composites to distinguish snow.

![marokko](fig/screenshot_marokko.png)

For this RGB display of the Atlas Mountains in Morocco, bands 12/7/3 are selected. Snow pixels appear in blue, whereas cloud pixels appear in pink-orange.
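
A false-color triple of this kind can be sketched for a single spectrum as follows. This is an illustration only: the band order in `BAND_LABELS` is the nominal order of the 13 Sentinel-2 bands and is assumed to match the file (the **band** table should be checked), and the linear stretch controlled by `max_value` is a hypothetical choice. The sample spectrum is made up.

```python
# ASSUMPTION: spectra follow the nominal Sentinel-2 band order listed here;
# the actual order in the file is given by its `band` table.
BAND_LABELS = ["1", "2", "3", "4", "5", "6", "7", "8", "8a", "9", "10", "11", "12"]


def false_color(spectrum, rgb_bands=("12", "7", "3"), max_value=1.0):
    """Return an 8-bit (R, G, B) triple built from three bands of one spectrum.

    `max_value` is the value mapped to 255 (hypothetical linear stretch).
    """
    rgb = []
    for label in rgb_bands:
        value = spectrum[BAND_LABELS.index(label)]
        clipped = min(max(value / max_value, 0.0), 1.0)
        rgb.append(round(clipped * 255))
    return tuple(rgb)


# Made-up spectrum, bright in the visible and dark in the shortwave infrared:
# the blue channel (band 3) is the strongest, the red channel (band 12) the weakest.
snow_like = [0.95, 0.95, 0.95, 0.9, 0.85, 0.85, 0.8, 0.8, 0.8, 0.5, 0.01, 0.2, 0.2]
print(false_color(snow_like))
```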


The next figure illustrates how some of the classes were generated.

![fiji](fig/screenshot_fiji.jpg)

This image of the Fiji coastline is displayed in two different false-color composites: (a) bands 4/3/2 and (b) bands 8a/3/2. Colored polygons represent four different classes: cyan, yellow, dark blue and green correspond to water, shadow, cloud and clear-sky pixels, respectively.


| ![marokko](fig/screenshot_marokko.png) | ![fiji](fig/screenshot_fiji.jpg) |
| :---: | :---: |
| Morocco | Fiji |

## Dataset

Our dataset consists of a total of N=5647725 pixels. Pixel information is stored in different tables of the HDF5 file.

*Related to the Sentinel-2 spatial and spectral resolutions:*

- **band** associates a band position with its label
- further band descriptions can be found in **bandwidth_nm**, **central_wavelength_nm** and **spatial_sampling_m**

*Related to the classes:*

- **classes** (1xN table) contains the id of the class assigned to each pixel in the dataset
- **class_ids** lists the id associated with each class named in **class_names**

*Related to the spectra:*

- **spectra** (13xN table) holds the spectral values of each pixel; the Sentinel-2 instrument samples 13 spectral bands

*Related to the image metadata:*

- **latitude** and **longitude** hold the pixel coordinates
- each pixel is located in a granule identified by **granule_id**; several granules belong to one image, identified by **product_id**
- pixels of the same product share the sensing date (**date**), the four sun and viewing angles (**sun_azimuth_angle**, **sun_zenith_angle**, **viewing_azimuth_angle**, **viewing_zenith_angle**) and the geographical location (**continent** and **country**)
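
As a rough illustration of how the aligned tables can be combined, the sketch below averages the spectra of each class. It uses plain Python and a tiny made-up sample with two bands instead of 13; for the full dataset an array library such as NumPy would be the more practical choice. `mean_spectrum_per_class` is a hypothetical helper and expects one spectrum per pixel.

```python
def mean_spectrum_per_class(spectra, classes):
    """Average the spectra of each class.

    `spectra` holds one spectrum per pixel and `classes` the aligned class ids.
    """
    sums, counts = {}, {}
    for spectrum, class_id in zip(spectra, classes):
        total = sums.setdefault(class_id, [0.0] * len(spectrum))
        for i, value in enumerate(spectrum):
            total[i] += value
        counts[class_id] = counts.get(class_id, 0) + 1
    return {cid: [s / counts[cid] for s in total] for cid, total in sums.items()}


# Made-up sample: three pixels, two bands, hypothetical class ids 1 and 2.
# If the table is stored band-major (13xN), transpose it first:
#     spectra = list(zip(*spectra))
sample_spectra = [[0.25, 0.5], [0.75, 1.0], [0.125, 0.25]]
sample_classes = [1, 1, 2]
means = mean_spectrum_per_class(sample_spectra, sample_classes)
print(means)
# prints: {1: [0.5, 0.75], 2: [0.125, 0.25]}
```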