---
title: Analyzing Patient Data
teaching: 30
exercises: 0
questions:
- "How can I process tabular data files in Python?"
objectives:
- "Explain what a library is and what libraries are used for."
- "Import a Python library and use the functions it contains."
- "Read tabular data from a file into a program."
- "Assign values to variables."
- "Select individual values and subsections from data."
- "Perform operations on arrays of data."
- "Plot simple graphs from data."
keypoints:
- "Import a library into a program using `import libraryname`."
- "Use the `numpy` library to work with arrays in Python."
- "Use `variable = value` to assign a value to a variable in order to record it in memory."
- "Variables are created on demand whenever a value is assigned to them."
- "Use `print(something)` to display the value of `something`."
- "The expression `array.shape` gives the shape of an array."
- "Use `array[x, y]` to select a single element from a 2D array."
- "Array indices start at 0, not 1."
- "Use `low:high` to specify a `slice` that includes the indices from `low` to `high-1`."
- "All the indexing and slicing that works on arrays also works on strings."
- "Use `# some kind of explanation` to add comments to programs."
- "Use `numpy.mean(array)`, `numpy.max(array)`, and `numpy.min(array)` to calculate simple statistics."
- "Use `numpy.mean(array, axis=0)` or `numpy.mean(array, axis=1)` to calculate statistics across the specified axis."
- "Use the `pyplot` library from `matplotlib` for creating simple visualizations."
---

In this lesson we will learn how to manipulate the inflammation dataset with Python. Before we discuss how to deal with many data points, we will show how to store a single value on the computer.

You can get output from Python by typing math into the console:

~~~
3+5
12/7
~~~
{: .language-python}

However, to do anything useful or interesting, we need to assign values to _variables_ (or link _objects_ to names/variables).
The line below [assigns]({{ page.root }}/reference/#assign) the value `60` to a [variable]({{ page.root }}/reference/#variable) `weight_kg`:

~~~
weight_kg = 60
~~~
{: .language-python}

A variable is a name for a value,
such as `x_val`, `current_temperature`, or `subject_id`.
Python variable names must begin with a letter or an underscore and are [case sensitive]({{ page.root }}/reference/#case-sensitive).
We can create a new variable by assigning a value to it using `=`.
When we are finished typing and press Shift+Enter,
the notebook runs our command.

Once a variable has a value, we can print it to the screen:

~~~
print(weight_kg)
~~~
{: .language-python}

~~~
60
~~~
{: .output}

and do arithmetic with it (remember, there are 2.2 pounds per kilogram):

~~~
print('weight in pounds:', 2.2 * weight_kg)
~~~
{: .language-python}

~~~
weight in pounds: 132.0
~~~
{: .output}

As the example above shows,
we can print several things at once by separating them with commas.

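
To make the comma rule concrete, here is a minimal sketch. It assumes the same starting value of `60` for `weight_kg`, and the helper name `line` is hypothetical, introduced only for illustration. It shows that `print` separates its comma-separated arguments with a single space, which we can reproduce by hand:

```python
# Assumed starting values, mirroring the lesson's variable names.
weight_kg = 60
weight_lb = 2.2 * weight_kg

# print() converts each comma-separated argument to text
# and joins the pieces with a single space.
print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)

# The same text assembled by hand (`line` is a hypothetical helper name).
pieces = ['weight in kilograms:', weight_kg, 'and in pounds:', weight_lb]
line = ' '.join(str(piece) for piece in pieces)
print(line)
```

Both `print` calls should display the same text, which is why we rarely need to build such strings ourselves.
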
We can also change a variable's value by assigning it a new one:

~~~
weight_kg = 57.5
print('weight in kilograms is now:', weight_kg)
~~~
{: .language-python}

~~~
weight in kilograms is now: 57.5
~~~
{: .output}

If we imagine the variable as a sticky note with a name written on it,
assignment is like putting the sticky note on a particular value:

![Variables as Sticky Notes](../fig/python-sticky-note-variables-01.svg)

This means that assigning a value to one variable does *not* change the values of other variables.
For example,
let's store the subject's weight in pounds in a variable:

~~~
# There are 2.2 pounds per kilogram.
weight_lb = 2.2 * weight_kg
print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)
~~~
{: .language-python}

~~~
weight in kilograms: 57.5 and in pounds: 126.50000000000001
~~~
{: .output}

![Creating Another Variable](../fig/python-sticky-note-variables-02.svg)

and then change `weight_kg`:

~~~
weight_kg = 100.0
print('weight in kilograms is now:', weight_kg, 'and weight in pounds is still:', weight_lb)
~~~
{: .language-python}

~~~
weight in kilograms is now: 100.0 and weight in pounds is still: 126.50000000000001
~~~
{: .output}

![Updating a Variable](../fig/python-sticky-note-variables-03.svg)

Since `weight_lb` doesn't remember where its value came from,
it isn't automatically updated when `weight_kg` changes.
This is different from the way spreadsheets work.

> ## Who's Who in Memory
>
> You can use the `%whos` command at any time to see what
> variables you have created and what modules you have loaded into the computer's memory.
> As this is an IPython command, it will only work if you are in an IPython terminal or the Jupyter Notebook.
>
> ~~~
> %whos
> ~~~
> {: .language-python}
>
> ~~~
> Variable    Type     Data/Info
> --------------------------------
> weight_kg   float    100.0
> weight_lb   float    126.50000000000001
> ~~~
> {: .output}
{: .callout}

Words are useful,
but what's more useful are the sentences and stories we build with them.
Similarly,
while a lot of powerful, general tools are built into languages like Python,
specialized tools built up from these basic units live in [libraries]({{ page.root }}/reference/#library)
that can be called upon when needed.

In order to load our inflammation data,
we need to access ([import]({{ page.root }}/reference/#import) in Python terminology)
a library called [NumPy](http://docs.scipy.org/doc/numpy/ "NumPy Documentation").
In general, you should use this library if you want to work with numerical data,
especially if you have matrices or arrays.
We can import NumPy using:

~~~
import numpy
~~~
{: .language-python}

Importing a library is like getting a piece of lab equipment out of a storage locker
and setting it up on the bench. Libraries provide additional functionality to the basic Python package,
much like a new piece of equipment adds functionality to a lab space. Just like in the lab, importing too many libraries
can sometimes complicate and slow down your programs, so we only import what we need for each program.

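
As a small illustration of the same idea, the sketch below imports `math` from Python's standard library instead of NumPy. `math` is used here only as a stand-in (an assumption made so the example runs without installing anything). Once a library is imported, its tools are reached through the library's name:

```python
# `math` ships with Python; it stands in for any library we might import.
import math

# After the import, the library's functions are available under its name.
root = math.sqrt(16)
print('square root of 16:', root)
```

Without the `import math` line, Python would not recognize the name `math` and would report a `NameError`.
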
Once we've imported the library,
we can ask the library to read our data file for us:

~~~
numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
~~~
{: .language-python}

~~~
array([[ 0.,  0.,  1., ...,  3.,  0.,  0.],
       [ 0.,  1.,  2., ...,  1.,  0.,  1.],
       [ 0.,  1.,  1., ...,  2.,  1.,  1.],
       ...,
       [ 0.,  1.,  1., ...,  1.,  1.,  1.],
       [ 0.,  0.,  0., ...,  0.,  2.,  0.],
       [ 0.,  0.,  1., ...,  1.,  1.,  0.]])
~~~
{: .output}

The expression `numpy.loadtxt(...)` is a [function call]({{ page.root }}/reference/#function-call)
that asks Python to run the [function]({{ page.root }}/reference/#function) `loadtxt`, which belongs to the `numpy` library.
This [dotted notation]({{ page.root }}/reference/#dotted-notation) is used everywhere in Python:
the thing that appears before the dot contains the thing that appears after.

As an example, John Smith is the John that belongs to the Smith family.
We could use the dot notation to write his name `smith.john`,
just as `loadtxt` is a function that belongs to the `numpy` library.

Here we give `numpy.loadtxt` two [parameters]({{ page.root }}/reference/#parameter):
the name of the file we want to read
and the [delimiter]({{ page.root }}/reference/#delimiter) that separates values on a line.
These both need to be character strings (or [strings]({{ page.root }}/reference/#string) for short),
so we put them in quotes.

Since we haven't told it to do anything else with the function's output,
the notebook displays it.
In this case,
that output is the data we just loaded.
By default,
only a few rows and columns are shown
(with `...` to omit elements when displaying big arrays).
To save space,
NumPy displays numbers as `1.` instead of `1.0`
when there's nothing interesting after the decimal point.

Our call to `numpy.loadtxt` read our file
but didn't save the data in memory.
To do that,
we need to assign the array to a variable. Just as we can assign a single value to a variable, we can also assign an array of values
to a variable using the same syntax. Let's re-run `numpy.loadtxt` and save the returned data:

~~~
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
~~~
{: .language-python}

This statement doesn't produce any output because we've assigned the output to the variable `data`.
If we want to check that the data have been loaded,
we can print the variable's value:

~~~
print(data)
~~~
{: .language-python}

~~~
[[ 0.  0.  1. ...,  3.  0.  0.]
 [ 0.  1.  2. ...,  1.  0.  1.]
 [ 0.  1.  1. ...,  2.  1.  1.]
 ...,
 [ 0.  1.  1. ...,  1.  1.  1.]
 [ 0.  0.  0. ...,  0.  2.  0.]
 [ 0.  0.  1. ...,  1.  1.  0.]]
~~~
{: .output}

Now that the data are in memory,
we can manipulate them.
First,
let's ask what [type]({{ page.root }}/reference/#type) of thing `data` refers to:

~~~
print(type(data))
~~~
{: .language-python}

~~~
<class 'numpy.ndarray'>
~~~
{: .output}

The output tells us that `data` currently refers to
an N-dimensional array, the functionality for which is provided by the NumPy library.
These data correspond to arthritis patients' inflammation.
The rows are the individual patients, and the columns
are their daily inflammation measurements.

> ## Data Type
>
> A NumPy array contains elements
> of the same type. The `type` function will only tell you that
> a variable is a NumPy array but won't tell you the type of
> thing inside the array.
> We can find out the type
> of the data contained in the NumPy array.
>
> ~~~
> print(data.dtype)
> ~~~
> {: .language-python}
>
> ~~~
> float64
> ~~~
> {: .output}
>
> This tells us that the NumPy array's elements are
> [floating-point numbers]({{ page.root }}/reference/#floating-point number).
{: .callout}

With the following command, we can see the array's [shape]({{ page.root }}/reference/#shape):

~~~
print(data.shape)
~~~
{: .language-python}

~~~
(60, 40)
~~~
{: .output}

The output tells us that the `data` array variable contains 60 rows and 40 columns. When we
created the variable `data` to store our arthritis data, we didn't just create the array; we also
created information about the array, called [members]({{ page.root }}/reference/#member) or
attributes. This extra information describes `data` in the same way an adjective describes a noun.
`data.shape` is an attribute of `data` which describes the dimensions of `data`.
We use the same dotted notation for the attributes of variables
that we use for the functions in libraries
because they have the same part-and-whole relationship.

If we want to get a single number from the array,
we must provide an [index]({{ page.root }}/reference/#index) in square brackets after the variable name, just as we do in math when referring to an element of a matrix. Our inflammation data has two dimensions, so we will need to use two indices to refer to one specific value:

~~~
print('first value in data:', data[0, 0])
~~~
{: .language-python}

~~~
first value in data: 0.0
~~~
{: .output}

~~~
print('middle value in data:', data[30, 20])
~~~
{: .language-python}

~~~
middle value in data: 13.0
~~~
{: .output}

The expression `data[30, 20]` accesses the element at row 30, column 20. While this expression may not surprise you,
`data[0, 0]` might.
Programming languages like Fortran, MATLAB and R start counting at 1
because that's what human beings have done for thousands of years.
Languages in the C family (including C++, Java, Perl, and Python) count from 0
because the index represents an offset from the first value in the array (the second
value is offset by one index from the first value). This is closer to the way
that computers represent arrays (if you are interested in the historical
reasons behind counting indices from zero, you can read
[Mike Hoye's blog post](http://exple.tive.org/blarg/2013/10/22/citation-needed/)).
As a result,
if we have an M×N array in Python,
its indices go from 0 to M-1 on the first axis
and 0 to N-1 on the second.
It takes a bit of getting used to,
but one way to remember the rule is that
the index is how many steps we have to take from the start to get the item we want.

![Zero Index](../fig/python-zero-index.png)

> ## In the Corner
>
> What may also surprise you is that when Python displays an array,
> it shows the element with index `[0, 0]` in the upper left corner
> rather than the lower left.
> This is consistent with the way mathematicians draw matrices
> but different from Cartesian coordinates.
> The indices are (row, column) instead of (column, row) for the same reason,
> which can be confusing when plotting data.
{: .callout}

An index like `[30, 20]` selects a single element of an array,
but we can select whole sections as well.
For example,
we can select the first ten days (columns) of values
for the first four patients (rows) like this:

~~~
print(data[0:4, 0:10])
~~~
{: .language-python}

~~~
[[ 0.  0.  1.  3.  1.  2.  4.  7.  8.  3.]
 [ 0.  1.  2.  1.  2.  1.  3.  2.  2.  6.]
 [ 0.  1.  1.  3.  3.  2.  6.  2.  5.  9.]
 [ 0.  0.  2.  0.  4.  2.  2.  1.  6.  7.]]
~~~
{: .output}

The [slice]({{ page.root }}/reference/#slice) `0:4` means,
"Start at index 0 and go up to, but not including, index 4."
Again, the up-to-but-not-including takes a bit of getting used to,
but the rule is that the difference between the upper and lower bounds is the number of values in the slice.

We don't have to start slices at 0:

~~~
print(data[5:10, 0:10])
~~~
{: .language-python}

~~~
[[ 0.  0.  1.  2.  2.  4.  2.  1.  6.  4.]
 [ 0.  0.  2.  2.  4.  2.  2.  5.  5.  8.]
 [ 0.  0.  1.  2.  3.  1.  2.  3.  5.  3.]
 [ 0.  0.  0.  3.  1.  5.  6.  5.  5.  8.]
 [ 0.  1.  1.  2.  1.  3.  5.  3.  5.  8.]]
~~~
{: .output}

We also don't have to include the upper and lower bound on the slice.
If we don't include the lower bound,
Python uses 0 by default;
if we don't include the upper,
the slice runs to the end of the axis,
and if we don't include either
(i.e., if we just use `:` on its own),
the slice includes everything:

~~~
small = data[:3, 36:]
print('small is:')
print(small)
~~~
{: .language-python}

The above example selects rows 0 through 2 and columns 36 through to the end of the array.

~~~
small is:
[[ 2.  3.  0.  0.]
 [ 1.  1.  0.  1.]
 [ 2.  2.  1.  1.]]
~~~
{: .output}

Arrays also know how to perform common mathematical operations on their values.
The simplest operations with data are arithmetic:
addition, subtraction, multiplication, and division.
When you do such operations on arrays,
the operation is done element-by-element.
Thus:

~~~
doubledata = data * 2.0
~~~
{: .language-python}

will create a new array `doubledata`
each element of which is twice the value of the corresponding element in `data`:

~~~
print('original:')
print(data[:3, 36:])
print('doubledata:')
print(doubledata[:3, 36:])
~~~
{: .language-python}

~~~
original:
[[ 2.  3.  0.  0.]
 [ 1.  1.  0.  1.]
 [ 2.  2.  1.  1.]]
doubledata:
[[ 4.  6.  0.  0.]
 [ 2.  2.  0.  2.]
 [ 4.  4.  2.  2.]]
~~~
{: .output}

If,
instead of taking an array and doing arithmetic with a single value (as above),
you do the arithmetic operation with another array of the same shape,
the operation will be done on corresponding elements of the two arrays.
Thus:

~~~
tripledata = doubledata + data
~~~
{: .language-python}

will give you an array where `tripledata[0,0]` will equal `doubledata[0,0]` plus `data[0,0]`,
and so on for all other elements of the arrays.

~~~
print('tripledata:')
print(tripledata[:3, 36:])
~~~
{: .language-python}

~~~
tripledata:
[[ 6.  9.  0.  0.]
 [ 3.  3.  0.  3.]
 [ 6.  6.  3.  3.]]
~~~
{: .output}

Often, we want to do more than add, subtract, multiply, and divide array elements.
NumPy knows how to do more complex operations, too.
If we want to find the average inflammation for all patients on all days,
for example,
we can ask NumPy to compute `data`'s mean value:

~~~
print(numpy.mean(data))
~~~
{: .language-python}

~~~
6.14875
~~~
{: .output}

`mean` is a [function]({{ page.root }}/reference/#function) that takes
an array as an [argument]({{ page.root }}/reference/#argument).

> ## Not All Functions Have Input
>
> Generally, a function uses inputs to produce outputs.
> However, some functions produce outputs without
> needing any input. For example, checking the current time
> doesn't require any input.
>
> ~~~
> import time
> print(time.ctime())
> ~~~
> {: .language-python}
>
> ~~~
> Sat Mar 26 13:07:33 2016
> ~~~
> {: .output}
>
> For functions that don't take in any arguments,
> we still need parentheses (`()`)
> to tell Python to go and do something for us.
{: .callout}

NumPy has lots of useful functions that take an array as input.
Let's use three of those functions to get some descriptive values about the dataset.
We'll also use multiple assignment,
a convenient Python feature that will enable us to do this all in one line.
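
Multiple assignment can be tried on its own first.
The sketch below is only an illustration:
the names `low`, `high`, and `span` and their values are made up
and are not part of the inflammation data.

```python
# Made-up values for illustration; these names are not used elsewhere in the lesson.
low, high = 0.0, 20.0   # two assignments in a single statement
span = high - low
print('low:', low, 'high:', high, 'span:', span)

# The right-hand side is evaluated in full before any name on the left is bound,
# so this swaps the two values without a temporary variable.
low, high = high, low
print('after swap:', low, high)
```

Each name on the left receives the value in the matching position on the right.
The same pattern handles the three NumPy calls: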

~~~
maxval, minval, stdval = numpy.max(data), numpy.min(data), numpy.std(data)

print('maximum inflammation:', maxval)
print('minimum inflammation:', minval)
print('standard deviation:', stdval)
~~~
{: .language-python}

Here we've assigned the return value from `numpy.max(data)` to the variable `maxval`,
the value from `numpy.min(data)` to `minval`, and so on.

~~~
maximum inflammation: 20.0
minimum inflammation: 0.0
standard deviation: 4.61383319712
~~~
{: .output}

> ## Mystery Functions in IPython
>
> How did we know what functions NumPy has and how to use them?
> If you are working in the IPython/Jupyter Notebook, there is an easy way to find out.
> If you type the name of something followed by a dot, then you can use tab completion
> (e.g. type `numpy.` and then press tab)
> to see a list of all functions and attributes that you can use. After selecting one, you
> can also add a question mark (e.g. `numpy.cumprod?`), and IPython will return an
> explanation of the function! This is the same as doing `help(numpy.cumprod)`.
{: .callout}

When analyzing data, though,
we often want to look at variations in statistical values,
such as the maximum inflammation per patient
or the average inflammation per day.
One way to do this is to create a new temporary array of the data we want,
then ask it to do the calculation:

~~~
patient_0 = data[0, :] # 0 on the first axis (rows), everything on the second (columns)
print('maximum inflammation for patient 0:', patient_0.max())
~~~
{: .language-python}

~~~
maximum inflammation for patient 0: 18.0
~~~
{: .output}

Everything in a line of code following the '#' symbol is a
[comment]({{ page.root }}/reference/#comment) that is ignored by Python.
Comments allow programmers to leave explanatory notes for other
programmers or their future selves.

We don't actually need to store the row in a variable of its own.
Instead, we can combine the selection and the function call:

~~~
print('maximum inflammation for patient 2:', numpy.max(data[2, :]))
~~~
{: .language-python}

~~~
maximum inflammation for patient 2: 19.0
~~~
{: .output}

What if we need the maximum inflammation for each patient over all days (as in the
next diagram on the left) or the average for each day (as in the
diagram on the right)? As the diagram below shows, we want to perform the
operation across an axis:

![Operations Across Axes](../fig/python-operations-across-axes.png)

To support this functionality,
most array functions allow us to specify the axis we want to work on.
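
The effect of `axis` is easiest to verify on an array small enough to check by hand.
The sketch below uses a made-up 2×3 array named `tiny` (it is not part of the inflammation data):
`axis=0` collapses the rows and leaves one value per column,
while `axis=1` collapses the columns and leaves one value per row.

```python
import numpy

# A made-up 2x3 array for illustration: 2 rows ("patients"), 3 columns ("days").
tiny = numpy.array([[1.0, 2.0, 3.0],
                    [5.0, 6.0, 7.0]])

col_means = numpy.mean(tiny, axis=0)  # average down each column: one value per column
row_means = numpy.mean(tiny, axis=1)  # average along each row: one value per row

print(col_means.shape, col_means)
print(row_means.shape, row_means)
```

Here `col_means` holds the three column averages 3, 4, and 5,
and `row_means` holds the two row averages 2 and 6.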
If we ask for the average across axis 0 (rows in our 2D example),
we get:

~~~
print(numpy.mean(data, axis=0))
~~~
{: .language-python}

~~~
[  0.           0.45         1.11666667   1.75         2.43333333   3.15
   3.8          3.88333333   5.23333333   5.51666667   5.95         5.9
   8.35         7.73333333   8.36666667   9.5          9.58333333
  10.63333333  11.56666667  12.35        13.25        11.96666667
  11.03333333  10.16666667  10.           8.66666667   9.15         7.25
   7.33333333   6.58333333   6.06666667   5.95         5.11666667   3.6
   3.3          3.56666667   2.48333333   1.5          1.13333333
   0.56666667]
~~~
{: .output}

As a quick check,
we can ask this array what its shape is:

~~~
print(numpy.mean(data, axis=0).shape)
~~~
{: .language-python}

~~~
(40,)
~~~
{: .output}

The expression `(40,)` tells us we have a one-dimensional array of 40 values,
so this is the average inflammation per day for all patients.
If we average across axis 1 (columns in our 2D example), we get:

~~~
print(numpy.mean(data, axis=1))
~~~
{: .language-python}

~~~
[ 5.45   5.425  6.1    5.9    5.55   6.225  5.975  6.65   6.625  6.525
  6.775  5.8    6.225  5.75   5.225  6.3    6.55   5.7    5.85   6.55
  5.775  5.825  6.175  6.1    5.8    6.425  6.05   6.025  6.175  6.55
  6.175  6.35   6.725  6.125  7.075  5.725  5.925  6.15   6.075  5.75
  5.975  5.725  6.3    5.9    6.75   5.925  7.225  6.15   5.95   6.275  5.7
  6.1    6.825  5.975  6.725  5.7    6.25   6.4    7.05   5.9  ]
~~~
{: .output}

which is the average inflammation per patient across all days.

The mathematician Richard Hamming once said,
"The purpose of computing is insight, not numbers,"
and the best way to develop insight is often to visualize data.
Visualization deserves an entire lecture of its own,
but we can explore a few features of Python's `matplotlib` library here.
While there is no official plotting library,
`matplotlib` is the de facto standard.
First, we will import the `pyplot` module from `matplotlib`
and use two of its functions to create and display a heat map of our data:

~~~
import matplotlib.pyplot
image = matplotlib.pyplot.imshow(data)
matplotlib.pyplot.show()
~~~
{: .language-python}

![Heatmap of the Data](../fig/01-numpy_71_0.png)

Blue pixels in this heat map represent low values, while yellow pixels represent high values.
As we can see,
inflammation rises and falls over a 40-day period.

> ## Some IPython Magic
>
> If you're using an IPython / Jupyter notebook,
> you'll need to execute the following command
> in order for your matplotlib images to appear
> in the notebook when `show()` is called:
>
> ~~~
> %matplotlib inline
> ~~~
> {: .language-python}
>
> The `%` indicates an IPython magic function -
> a function that is only valid within the notebook environment.
> Note that you only have to execute this function once per notebook.
{: .callout}

Let's take a look at the average inflammation over time:

~~~
ave_inflammation = numpy.mean(data, axis=0)
ave_plot = matplotlib.pyplot.plot(ave_inflammation)
matplotlib.pyplot.show()
~~~
{: .language-python}

![Average Inflammation Over Time](../fig/01-numpy_73_0.png)

Here,
we have put the average per day across all patients in the variable `ave_inflammation`,
then asked `matplotlib.pyplot` to create and display a line graph of those values.
The result is a roughly linear rise and fall,
which is suspicious:
we might instead expect a sharper rise and slower fall.
Let's have a look at two other statistics:

~~~
max_plot = matplotlib.pyplot.plot(numpy.max(data, axis=0))
matplotlib.pyplot.show()
~~~
{: .language-python}

![Maximum Value Along The First Axis](../fig/01-numpy_75_1.png)

~~~
min_plot = matplotlib.pyplot.plot(numpy.min(data, axis=0))
matplotlib.pyplot.show()
~~~
{: .language-python}

![Minimum Value Along The First Axis](../fig/01-numpy_75_3.png)

The maximum value rises and falls smoothly,
while the minimum seems to be a step function.
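
A visual impression such as "rises and falls smoothly" can also be checked numerically.
As a sketch, `numpy.diff` returns the difference between each pair of neighbouring values.
The array `daily_max` below is invented for illustration
and is not taken from the inflammation data.

```python
import numpy

# Invented per-day maxima for illustration only (not the lesson's data).
daily_max = numpy.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])

# Difference between each day and the day before it: one fewer value than the input.
steps = numpy.diff(daily_max)
print(steps)
```

For this invented array every step is exactly 1 or -1,
which is what a perfectly linear rise and fall looks like.
Applying the same call to `numpy.max(data, axis=0)` would show
whether the real data behaves that way.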