# Stick Slip Learning

A suite of scripts to analyze annular shear experiments with a machine learning
approach. From a series of experiments run under different conditions, specific
segments are extracted, features are generated from them, and these features
are then used as input for a machine learning algorithm. For the terms used and
a short explanation of what to expect from the data, see
[Terminology](https://gitext.gfz-potsdam.de/analab-code/shear-madness/blob/master/Terminology.md).


## Quick Guide for ML-Workshop participants
The majority of scripts in this repository are concerned with the data
preparation _before_ the actual machine learning part. If you want to run your
own feature generation pipeline, please ask @mrudolf directly to provide you
with the raw or pre-processed data files. If needed, these are also available
in other formats, as long as a Python module exists for the conversion.

To see a sample implementation of `feature_generation.py`, have a look at `src/set-shear-madness.py`.

## Requirements
The scripts require [Python 3 (external link!)](https://python.org) to run. The
required external libraries are listed in
[requirements.txt](https://gitext.gfz-potsdam.de/analab-code/shear-madness/blob/master/requirements.txt).
Unless noted otherwise, the most recent version of each module and of Python at
the time of the commit is used. Older versions might work but remain untested.

## Overview
All relevant scripts are located in the
[src](https://gitext.gfz-potsdam.de/analab-code/shear-madness/blob/master/src)
directory. It contains two 'master' scripts that show a full processing
pipeline, either for a single set or for multiple sets within a single folder.
The scripts use several
[modules](https://gitext.gfz-potsdam.de/analab-code/shear-madness/blob/master/src/modules)
representing different stages of processing:

1. Data Preparation > `preparation.py`

    _Splits the raw data into smaller sets of equal loading rate. These sets
    are then split into even smaller subsets, and subsets that do not meet
    certain limitations are omitted. The remaining subsets each contain an
    equal number of samples and a certain number of events._

2. Feature Generation > `feature_functions.py` and `feature_generation.py`

    _Generates features with the functions implemented in
    `feature_functions.py`. The output is a 2D feature array `X` with one
    column per feature and one row per sample, plus a one-dimensional array
    containing the data labels. You can add new feature functions to
    `feature_functions.py` by following the instructions therein._

3. Learning > `learning.py`

    ___To be implemented!__ Uses the chosen machine learning model (from
    scikit-learn) to fit the labeled data._

The other modules contain helper functions, for example for file handling and
for saving the current stage of processing.
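
The output format described for the feature generation stage (a 2D array `X`
with one row per sample and one column per feature, plus a 1D label array) can
be illustrated with a minimal sketch. Everything named here (`seg_mean`,
`seg_std`, `seg_range`, `build_feature_array`, and the random example data) is
hypothetical and only assumes NumPy; it is not the interface of
`feature_functions.py` or `feature_generation.py`.

```python
import numpy as np


# Hypothetical feature functions: each maps one 1D segment to a single number.
def seg_mean(segment):
    return float(np.mean(segment))


def seg_std(segment):
    return float(np.std(segment))


def seg_range(segment):
    return float(np.max(segment) - np.min(segment))


# Assumed registry of feature functions; one column of X per entry.
FEATURES = [seg_mean, seg_std, seg_range]


def build_feature_array(segments, labels):
    """Return (X, y): X has one row per segment and one column per feature."""
    X = np.array([[func(seg) for func in FEATURES] for seg in segments])
    y = np.asarray(labels)
    if X.shape[0] != y.shape[0]:
        raise ValueError("number of segments and labels must match")
    return X, y


# Made-up example data: five segments of 100 samples each, with made-up labels.
rng = np.random.default_rng(0)
segments = [rng.normal(size=100) for _ in range(5)]
labels = [0, 1, 0, 1, 1]

X, y = build_feature_array(segments, labels)
print(X.shape, y.shape)  # (5, 3) (5,)
```

Arrays of this shape are what scikit-learn estimators generally expect as
input, which is why the layout matters for the later learning stage.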

## Documentation
Because most of the work is outsourced into modules, the two 'master' scripts
and the comments inside them should provide enough documentation to assess the
project pipeline. More in-depth documentation of what the functions do,
including more comments on the source code, is given in the form of Jupyter
notebooks in
[Notebooks](https://gitext.gfz-potsdam.de/analab-code/shear-madness/tree/master/notebooks).
Because GitLab renders Jupyter notebooks very well, they can be viewed online
like normal documentation. If you want to follow the individual steps and
actually run the notebooks, you have to install
[jupyter (external link!)](https://jupyter.org) on your machine and run a
Jupyter notebook server. In some cases you need to place the notebooks into the
src folder so that they properly pick up the modules. The notebooks are not
required to run the main scripts.
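
As a possible alternative to moving notebooks into the src folder, the import
path can be extended from within the notebook. The following is a minimal
sketch, assuming the notebook server is started from the `notebooks` directory
and that `src` sits next to it; `add_src_to_path` is a hypothetical helper and
not part of this repository.

```python
import sys
from pathlib import Path


def add_src_to_path(src_dir):
    """Put src_dir at the front of sys.path if it is not already listed."""
    src = str(Path(src_dir).resolve())
    if src not in sys.path:
        sys.path.insert(0, src)
    return src


# Assumption: the current working directory is `notebooks/`, so `src/` is a
# sibling directory one level up.
added = add_src_to_path(Path.cwd().parent / "src")
print(added)
```

After this cell has run, import statements in the notebook also search the
`src` directory.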

## Acknowledgements
The software in this repository has benefited from contributions by:
 - J. Bedford (@jbed)

This research has been partially funded by the Deutsche Forschungsgemeinschaft
(DFG) through grant [CRC 1114 "Scaling Cascades in Complex Systems", Project B01
"Fault networks and scaling properties of deformation accumulation" (external
link!)](http://www.sfb1114.de).