---
title: Analyzing Patient Data
teaching: 30
exercises: 0
questions:
- "How can I process tabular data files in Python?"
objectives:
- "Explain what a library is and what libraries are used for."
- "Import a Python library and use the functions it contains."
- "Read tabular data from a file into a program."
- "Assign values to variables."
- "Select individual values and subsections from data."
- "Perform operations on arrays of data."
- "Plot simple graphs from data."
keypoints:
- "Import a library into a program using `import libraryname`."
- "Use the `numpy` library to work with arrays in Python."
- "Use `variable = value` to assign a value to a variable in order to record it in memory."
- "Variables are created on demand whenever a value is assigned to them."
- "Use `print(something)` to display the value of `something`."
- "The expression `array.shape` gives the shape of an array."
- "Use `array[x, y]` to select a single element from a 2D array."
- "Array indices start at 0, not 1."
- "Use `low:high` to specify a `slice` that includes the indices from `low` to `high-1`."
- "All the indexing and slicing that works on arrays also works on strings."
- "Use `# some kind of explanation` to add comments to programs."
- "Use `numpy.mean(array)`, `numpy.max(array)`, and `numpy.min(array)` to calculate simple statistics."
- "Use `numpy.mean(array, axis=0)` or `numpy.mean(array, axis=1)` to calculate statistics across the specified axis."
- "Use the `pyplot` library from `matplotlib` for creating simple visualizations."
---

In this lesson we will learn how to work with arthritis inflammation datasets in Python. However,
before we discuss how to deal with many data points, let's learn how to work with
single data values.

## Variables

Any Python interpreter can be used as a calculator:

~~~
3 + 5 * 4
~~~
{: .language-python}

~~~
23
~~~
{: .output}

This is great but not very interesting.
To do anything useful with data, we need to assign its value to a _variable_.
In Python, we can [assign]({{ page.root }}/reference/#assign) a value to a
[variable]({{ page.root }}/reference/#variable), using the equals sign `=`.
For example, to assign the value `60` to the variable `weight_kg`, we would execute:

~~~
weight_kg = 60
~~~
{: .language-python}

From now on, whenever we use `weight_kg`, Python will substitute the value we assigned to
it. In essence, **a variable is just a name for a value**.

In Python, variable names:

 - must begin with a letter or an underscore, and
 - are [case sensitive]({{ page.root }}/reference/#case-sensitive).

This means that, for example:

 - `weight0` is a valid variable name, whereas `0weight` is not
 - `weight` and `Weight` are different variables

## Types of data

Python knows various types of data. The most common ones are:

* integer numbers
* floating point numbers, and
* strings.

In the example above, the variable `weight_kg` has an integer value of `60`.
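
Python can report which of these types a value has through its built-in `type` function. The sketch below is only an illustration: the variable names `count`, `reading`, and `label` are hypothetical and are not used elsewhere in this lesson.

```python
# Hypothetical example values, one for each type listed above
count = 60                         # an integer number
reading = 60.0                     # a floating point number
label = 'weight in kilograms:'     # a string

# type() reports the type of the value a variable refers to
print(type(count))
print(type(reading))
print(type(label))
```

The decimal point is what makes `60.0` a floating point number rather than an integer, and the quotes are what make `label` a string.
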
To create a variable with a floating point value, we can execute:

~~~
weight_kg = 60.0
~~~
{: .language-python}

And to create a string, we simply have to add single or double quotes around some text, for example:

~~~
weight_kg_text = 'weight in kilograms:'
~~~
{: .language-python}

## Using Variables in Python

To display the value of a variable on the screen in Python, we can use the `print` function:

~~~
print(weight_kg)
~~~
{: .language-python}

~~~
60
~~~
{: .output}

We can display multiple things at once using only one `print` command:

~~~
print(weight_kg_text, weight_kg)
~~~
{: .language-python}

~~~
weight in kilograms: 60
~~~
{: .output}

Moreover, we can do arithmetic with variables right inside the `print` function:

~~~
print('weight in pounds:', 2.2 * weight_kg)
~~~
{: .language-python}

~~~
weight in pounds: 132.0
~~~
{: .output}

The above command, however, did not change the value of `weight_kg`:

~~~
print(weight_kg)
~~~
{: .language-python}

~~~
60
~~~
{: .output}

To change a variable's value, we have to assign it a new one:

~~~
weight_kg = 65.0
print('weight in kilograms is now:', weight_kg)
~~~
{: .language-python}

~~~
weight in kilograms is now: 65.0
~~~
{: .output}

A variable is analogous to a sticky note with a name written on it:
assigning a value to a variable is like putting that sticky note on a particular value.

![Variables as Sticky Notes](../fig/python-sticky-note-variables-01.svg)

This means that assigning a value to one variable does **not** change the values of other variables.
For example, let's store the subject's weight in pounds in its own variable:

~~~
# There are 2.2 pounds per kilogram
weight_lb = 2.2 * weight_kg
print(weight_kg_text, weight_kg, 'and in pounds:', weight_lb)
~~~
{: .language-python}

~~~
weight in kilograms: 65.0 and in pounds: 143.0
~~~
{: .output}

![Creating Another Variable](../fig/python-sticky-note-variables-02.svg)

Let's now change `weight_kg`:

~~~
weight_kg = 100.0
print('weight in kilograms is now:', weight_kg, 'and weight in pounds is still:', weight_lb)
~~~
{: .language-python}

~~~
weight in kilograms is now: 100.0 and weight in pounds is still: 143.0
~~~
{: .output}

![Updating a Variable](../fig/python-sticky-note-variables-03.svg)

Since `weight_lb` doesn't remember where its value came from,
it isn't automatically updated when `weight_kg` changes.

Words are useful, but what's more useful are the sentences and stories we build with them.
Similarly, while a lot of powerful, general tools are built into languages like Python,
specialized tools built up from these basic units live in
[libraries]({{ page.root }}/reference/#library)
that can be called upon when needed.

## Loading data into Python

In order to load our inflammation data, we need to access
([import]({{ page.root }}/reference/#import) in Python terminology) a library called
[NumPy](http://docs.scipy.org/doc/numpy/ "NumPy Documentation"). In general you should use this
library if you want to do fancy things with numbers, especially if you have matrices or arrays. We
can import NumPy using:

~~~
import numpy
~~~
{: .language-python}

Importing a library is like getting a piece of lab equipment out of a storage locker and setting it
up on the bench.
Libraries provide additional functionality to the basic Python package, much like a new piece of
equipment adds functionality to a lab space. Just like in the lab, importing too many libraries
can sometimes complicate and slow down your programs - so we only import what we need for each
program. Once we've imported the library, we can ask the library to read our data file for us:

~~~
numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
~~~
{: .language-python}

~~~
array([[ 0.,  0.,  1., ...,  3.,  0.,  0.],
       [ 0.,  1.,  2., ...,  1.,  0.,  1.],
       [ 0.,  1.,  1., ...,  2.,  1.,  1.],
       ...,
       [ 0.,  1.,  1., ...,  1.,  1.,  1.],
       [ 0.,  0.,  0., ...,  0.,  2.,  0.],
       [ 0.,  0.,  1., ...,  1.,  1.,  0.]])
~~~
{: .output}

The expression `numpy.loadtxt(...)` is a [function call]({{ page.root }}/reference/#function-call)
that asks Python to run the [function]({{ page.root }}/reference/#function) `loadtxt` which
belongs to the `numpy` library. This [dotted notation]({{ page.root }}/reference/#dotted-notation)
is used everywhere in Python: the thing that appears before the dot contains the thing that
appears after.

As an example, John Smith is the John that belongs to the Smith family.
We could use the dot notation to write his name `smith.john`,
just as `loadtxt` is a function that belongs to the `numpy` library.
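
The dotted notation is not specific to NumPy. As a small sketch, the example below applies it to Python's standard `math` library, which is chosen here only for illustration and is not otherwise part of this lesson; the variable name `root` is also just an example.

```python
import math

# 'sqrt' is a function that belongs to the 'math' library
root = math.sqrt(16.0)
print('square root of 16:', root)

# 'pi' is a value (not a function) that also belongs to 'math'
print('pi is approximately:', math.pi)
```

In both cases the name before the dot (`math`) contains the thing named after it, which is the same relationship as between `numpy` and `loadtxt`.
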

Here we give `numpy.loadtxt` two [parameters]({{ page.root }}/reference/#parameter): the name of the file
we want to read and the [delimiter]({{ page.root }}/reference/#delimiter) that separates values on
a line. These both need to be character strings (or [strings]({{ page.root }}/reference/#string)
for short), so we put them in quotes.

Since we haven't told it to do anything else with the function's output,
the notebook displays it.
In this case,
that output is the data we just loaded.
By default,
only a few rows and columns are shown
(with `...` to omit elements when displaying big arrays).
To save space,
Python displays numbers as `1.` instead of `1.0`
when there's nothing interesting after the decimal point.

Our call to `numpy.loadtxt` read our file
but didn't save the data in memory.
To do that,
we need to assign the array to a variable. Just as we can assign a single value to a variable, we
can also assign an array of values to a variable using the same syntax. Let's re-run
`numpy.loadtxt` and save the returned data:

~~~
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
~~~
{: .language-python}

This statement doesn't produce any output because we've assigned the output to the variable `data`.
If we want to check that the data have been loaded,
we can print the variable's value:

~~~
print(data)
~~~
{: .language-python}

~~~
[[ 0.  0.  1. ...,  3.  0.  0.]
 [ 0.  1.  2. ...,  1.  0.  1.]
 [ 0.  1.  1. ...,  2.  1.  1.]
 ...,
 [ 0.  1.  1. ...,  1.  1.  1.]
 [ 0.  0.  0. ...,  0.  2.  0.]
 [ 0.  0.  1. ...,  1.  1.  0.]]
~~~
{: .output}

Now that the data are in memory,
we can manipulate them.
First,
let's ask what [type]({{ page.root }}/reference/#type) of thing `data` refers to:

~~~
print(type(data))
~~~
{: .language-python}

~~~
<class 'numpy.ndarray'>
~~~
{: .output}

The output tells us that `data` currently refers to
an N-dimensional array, the functionality for which is provided by the NumPy library.
These data correspond to arthritis patients' inflammation.
The rows are the individual patients, and the columns
are their daily inflammation measurements.

> ## Data Type
>
> A NumPy array contains one or more elements
> of the same type. The `type` function will only tell you that
> a variable is a NumPy array but won't tell you the type of
> thing inside the array.
> We can find out the type
> of the data contained in the NumPy array.
>
> ~~~
> print(data.dtype)
> ~~~
> {: .language-python}
>
> ~~~
> float64
> ~~~
> {: .output}
>
> This tells us that the NumPy array's elements are
> [floating-point numbers]({{ page.root }}/reference/#floating-point number).
{: .callout}

With the following command, we can see the array's [shape]({{ page.root }}/reference/#shape):

~~~
print(data.shape)
~~~
{: .language-python}

~~~
(60, 40)
~~~
{: .output}

The output tells us that the `data` array variable contains 60 rows and 40 columns. When we
created the variable `data` to store our arthritis data, we didn't just create the array; we also
created information about the array, called [members]({{ page.root }}/reference/#member) or
attributes. This extra information describes `data` in the same way an adjective describes a noun.
`data.shape` is an attribute of `data` which describes the dimensions of `data`. We use the same
dotted notation for the attributes of variables that we use for the functions in libraries because
they have the same part-and-whole relationship.

If we want to get a single number from the array, we must provide an
[index]({{ page.root }}/reference/#index) in square brackets after the variable name, just as we
do in math when referring to an element of a matrix. Our inflammation data has two dimensions, so
we will need to use two indices to refer to one specific value:

~~~
print('first value in data:', data[0, 0])
~~~
{: .language-python}

~~~
first value in data: 0.0
~~~
{: .output}

~~~
print('middle value in data:', data[30, 20])
~~~
{: .language-python}

~~~
middle value in data: 13.0
~~~
{: .output}

The expression `data[30, 20]` accesses the element at row 30, column 20. While this expression may
not surprise you,
 `data[0, 0]` might.
Programming languages like Fortran, MATLAB and R start counting at 1
because that's what human beings have done for thousands of years.
Languages in the C family (including C++, Java, Perl, and Python) count from 0
because it represents an offset from the first value in the array (the second
value is offset by one index from the first value). This is closer to the way
that computers represent arrays (if you are interested in the historical
reasons behind counting indices from zero, you can read
[Mike Hoye's blog post](http://exple.tive.org/blarg/2013/10/22/citation-needed/)).
As a result,
if we have an M×N array in Python,
its indices go from 0 to M-1 on the first axis
and 0 to N-1 on the second.
It takes a bit of getting used to,
but one way to remember the rule is that
the index is how many steps we have to take from the start to get the item we want.

![Zero Index](../fig/python-zero-index.png)

> ## In the Corner
>
> What may also surprise you is that when Python displays an array,
> it shows the element with index `[0, 0]` in the upper left corner
> rather than the lower left.
> This is consistent with the way mathematicians draw matrices
> but different from Cartesian coordinates.
> The indices are (row, column) instead of (column, row) for the same reason,
> which can be confusing when plotting data.
{: .callout}

## Slicing data

An index like `[30, 20]` selects a single element of an array,
but we can select whole sections as well.
For example,
we can select the first ten days (columns) of values
for the first four patients (rows) like this:

~~~
print(data[0:4, 0:10])
~~~
{: .language-python}

~~~
[[ 0.  0.  1.  3.  1.  2.  4.  7.  8.  3.]
 [ 0.  1.  2.  1.  2.  1.  3.  2.  2.  6.]
 [ 0.  1.  1.  3.  3.  2.  6.  2.  5.  9.]
 [ 0.  0.  2.  0.  4.  2.  2.  1.  6.  7.]]
~~~
{: .output}

The [slice]({{ page.root }}/reference/#slice) `0:4` means, "Start at index 0 and go up to, but not
including, index 4." Again, the up-to-but-not-including takes a bit of getting used to, but the
rule is that the difference between the upper and lower bounds is the number of values in the slice.
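
The rule that a slice holds `high - low` values can be checked directly. The sketch below uses a plain Python list rather than the inflammation array so that it runs without the data file; the names `values`, `low`, `high`, and `section` are hypothetical.

```python
# A hypothetical list of eight values, with indices 0 through 7
values = [10, 20, 30, 40, 50, 60, 70, 80]

low, high = 2, 6
section = values[low:high]   # takes indices 2, 3, 4 and 5, but not 6

print(section)
print('number of values in the slice:', len(section))
print('upper bound minus lower bound:', high - low)
```

The last two lines print the same number, which is the rule stated above. The same `low:high` notation is what selects rows and columns in `data[0:4, 0:10]`.
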

We don't have to start slices at 0:

~~~
print(data[5:10, 0:10])
~~~
{: .language-python}

~~~
[[ 0.  0.  1.  2.  2.  4.  2.  1.  6.  4.]
 [ 0.  0.  2.  2.  4.  2.  2.  5.  5.  8.]
 [ 0.  0.  1.  2.  3.  1.  2.  3.  5.  3.]
 [ 0.  0.  0.  3.  1.  5.  6.  5.  5.  8.]
 [ 0.  1.  1.  2.  1.  3.  5.  3.  5.  8.]]
~~~
{: .output}

We also don't have to include the upper and lower bound on the slice. If we don't include the lower
bound, Python uses 0 by default; if we don't include the upper, the slice runs to the end of the
axis, and if we don't include either (i.e., if we just use ':' on its own), the slice includes
everything:

~~~
small = data[:3, 36:]
print('small is:')
print(small)
~~~
{: .language-python}

The above example selects rows 0 through 2 and columns 36 through to the end of the array.

~~~
small is:
[[ 2.  3.  0.  0.]
 [ 1.  1.  0.  1.]
 [ 2.  2.  1.  1.]]
~~~
{: .output}

Arrays also know how to perform common mathematical operations on their values. The simplest
operations with data are arithmetic: addition, subtraction, multiplication, and division. When you
do such operations on arrays, the operation is done element-by-element. Thus:

~~~
doubledata = data * 2.0
~~~
{: .language-python}

will create a new array `doubledata`,
each element of which is twice the value of the corresponding element in `data`:

~~~
print('original:')
print(data[:3, 36:])
print('doubledata:')
print(doubledata[:3, 36:])
~~~
{: .language-python}

~~~
original:
[[ 2.  3.  0.  0.]
 [ 1.  1.  0.  1.]
 [ 2.  2.  1.  1.]]
doubledata:
[[ 4.  6.  0.  0.]
 [ 2.  2.  0.  2.]
 [ 4.  4.  2.  2.]]
~~~
{: .output}

If, instead of taking an array and doing arithmetic with a single value (as above), you do the
arithmetic operation with another array of the same shape, the operation is done on
corresponding elements of the two arrays. Thus:

~~~
tripledata = doubledata + data
~~~
{: .language-python}

will give you an array where `tripledata[0,0]` will equal `doubledata[0,0]` plus `data[0,0]`,
and so on for all other elements of the arrays.

~~~
print('tripledata:')
print(tripledata[:3, 36:])
~~~
{: .language-python}

~~~
tripledata:
[[ 6.  9.  0.  0.]
 [ 3.  3.  0.  3.]
 [ 6.  6.  3.  3.]]
~~~
{: .output}

Often, we want to do more than add, subtract, multiply, and divide array elements. NumPy knows how
to do more complex operations, too.
If we want to find the average inflammation for all patients on all days, for example, we can ask NumPy to compute `data`'s mean value:

~~~
print(numpy.mean(data))
~~~
{: .language-python}

~~~
6.14875
~~~
{: .output}

`mean` is a [function]({{ page.root }}/reference/#function) that takes
an array as an [argument]({{ page.root }}/reference/#argument).

> ## Not All Functions Have Input
>
> Generally, a function uses inputs to produce outputs.
> However, some functions produce outputs without
> needing any input. For example, checking the current time
> doesn't require any input.
>
> ~~~
> import time
> print(time.ctime())
> ~~~
> {: .language-python}
>
> ~~~
> Sat Mar 26 13:07:33 2016
> ~~~
> {: .output}
>
> For functions that don't take in any arguments,
> we still need parentheses (`()`)
> to tell Python to go and do something for us.
{: .callout}

NumPy has lots of useful functions that take an array as input.
Let's use three of those functions to get some descriptive values about the dataset.
We'll also use multiple assignment,
a convenient Python feature that will enable us to do this all in one line.
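
As a minimal sketch of multiple assignment on its own, the example below uses a hypothetical list called `readings` (its values are invented for illustration) and Python's built-in `max` and `min` rather than NumPy:

```python
# Hypothetical readings, invented purely for illustration.
readings = [3.0, 7.5, 1.5, 9.0]

# Each name on the left receives the value in the same position on the right.
highest, lowest = max(readings), min(readings)

print('highest:', highest)
print('lowest:', lowest)
```

The same pattern applies to the NumPy functions and our inflammation data: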

~~~
maxval, minval, stdval = numpy.max(data), numpy.min(data), numpy.std(data)

print('maximum inflammation:', maxval)
print('minimum inflammation:', minval)
print('standard deviation:', stdval)
~~~
{: .language-python}

Here we've assigned the return value from `numpy.max(data)` to the variable `maxval`, the value
from `numpy.min(data)` to `minval`, and so on.

~~~
maximum inflammation: 20.0
minimum inflammation: 0.0
standard deviation: 4.61383319712
~~~
{: .output}

> ## Mystery Functions in IPython
>
> How did we know what functions NumPy has and how to use them?
> If you are working in the IPython/Jupyter Notebook, there is an easy way to find out.
> If you type the name of something followed by a dot, then you can use tab completion
> (e.g. type `numpy.` and then press tab)
> to see a list of all functions and attributes that you can use. After selecting one, you
> can also add a question mark (e.g. `numpy.cumprod?`), and IPython will return an
> explanation of the function! This is similar to calling `help(numpy.cumprod)`.
{: .callout}

When analyzing data, though,
we often want to look at variations in statistical values,
such as the maximum inflammation per patient
or the average inflammation per day.
One way to do this is to create a new temporary array of the data we want,
then ask it to do the calculation:

~~~
patient_0 = data[0, :] # 0 on the first axis (rows), everything on the second (columns)
print('maximum inflammation for patient 0:', patient_0.max())
~~~
{: .language-python}

~~~
maximum inflammation for patient 0: 18.0
~~~
{: .output}

Everything in a line of code following the '#' symbol is a
[comment]({{ page.root }}/reference/#comment) that is ignored by Python.
Comments allow programmers to leave explanatory notes for other
programmers or their future selves.

We don't actually need to store the row in a variable of its own.
Instead, we can combine the selection and the function call:

~~~
print('maximum inflammation for patient 2:', numpy.max(data[2, :]))
~~~
{: .language-python}

~~~
maximum inflammation for patient 2: 19.0
~~~
{: .output}

What if we need the maximum inflammation for each patient over all days (as in the
next diagram on the left) or the average for each day (as in the
diagram on the right)? As the diagram below shows, we want to perform the
operation across an axis:

![Operations Across Axes](../fig/python-operations-across-axes.png)

To support this functionality,
most array functions allow us to specify the axis we want to work on.
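
To make the idea of an axis concrete before applying it to the inflammation data, here is a small sketch on a hypothetical 2×3 array (the name `tiny` and its values are invented for illustration):

```python
import numpy

# Hypothetical 2x3 array: 2 rows (think patients), 3 columns (think days).
tiny = numpy.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])

# axis=0 collapses the rows, leaving one value per column.
per_column = numpy.mean(tiny, axis=0)

# axis=1 collapses the columns, leaving one value per row.
per_row = numpy.max(tiny, axis=1)

print(per_column)                        # one mean per column
print(per_row)                           # one maximum per row
print(per_column.shape, per_row.shape)   # lengths match columns and rows
```

The result of an operation along one axis has as many entries as the other axis, which is a handy way to check that the right axis was chosen.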
If we ask for the average across axis 0 (rows in our 2D example),
we get:

~~~
print(numpy.mean(data, axis=0))
~~~
{: .language-python}

~~~
[  0.           0.45         1.11666667   1.75         2.43333333   3.15
   3.8          3.88333333   5.23333333   5.51666667   5.95         5.9
   8.35         7.73333333   8.36666667   9.5          9.58333333
  10.63333333  11.56666667  12.35        13.25        11.96666667
  11.03333333  10.16666667  10.           8.66666667   9.15         7.25
   7.33333333   6.58333333   6.06666667   5.95         5.11666667   3.6
   3.3          3.56666667   2.48333333   1.5          1.13333333
   0.56666667]
~~~
{: .output}

As a quick check,
we can ask this array what its shape is:

~~~
print(numpy.mean(data, axis=0).shape)
~~~
{: .language-python}

~~~
(40,)
~~~
{: .output}

The expression `(40,)` tells us we have a one-dimensional array of 40 values,
so this is the average inflammation per day for all patients.
If we average across axis 1 (columns in our 2D example), we get:

~~~
print(numpy.mean(data, axis=1))
~~~
{: .language-python}

~~~
[ 5.45   5.425  6.1    5.9    5.55   6.225  5.975  6.65   6.625  6.525
  6.775  5.8    6.225  5.75   5.225  6.3    6.55   5.7    5.85   6.55
  5.775  5.825  6.175  6.1    5.8    6.425  6.05   6.025  6.175  6.55
  6.175  6.35   6.725  6.125  7.075  5.725  5.925  6.15   6.075  5.75
  5.975  5.725  6.3    5.9    6.75   5.925  7.225  6.15   5.95   6.275
  5.7    6.1    6.825  5.975  6.725  5.7    6.25   6.4    7.05   5.9  ]
~~~
{: .output}

which is the average inflammation per patient across all days.

## Visualizing data

The mathematician Richard Hamming once said, "The purpose of computing is insight, not numbers,"
and the best way to develop insight is often to visualize data. Visualization deserves an entire
lecture of its own, but we can explore a few features of Python's `matplotlib` library here.
While there is no official plotting library, `matplotlib` is the _de facto_ standard.
First, we will import the `pyplot` module from `matplotlib` and use two of its functions to create and display a heat map of our data:

~~~
import matplotlib.pyplot
image = matplotlib.pyplot.imshow(data)
matplotlib.pyplot.show()
~~~
{: .language-python}

![Heatmap of the Data](../fig/01-numpy_71_0.png)

Blue pixels in this heat map represent low values, while yellow pixels represent high values.
As we can see, inflammation rises and falls over a 40-day period.
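
A rise and fall like the one in the heat map can also be checked numerically. The sketch below does not use the lesson's dataset: it assumes a small made-up array called `toy` (3 patients, 5 days) and uses `numpy.argmax`, which returns the index of the largest value, to find the day on which the average peaks.

```python
import numpy

# Made-up values for illustration only: 3 patients (rows), 5 days (columns).
toy = numpy.array([[0.0, 1.0, 3.0, 2.0, 0.0],
                   [0.0, 2.0, 4.0, 1.0, 1.0],
                   [0.0, 1.0, 5.0, 3.0, 0.0]])

# Averaging over patients (axis 0) gives one value per day.
daily_mean = numpy.mean(toy, axis=0)

# argmax returns the index of the largest entry, i.e. the peak day.
peak_day = numpy.argmax(daily_mean)

print('daily means:', daily_mean)
print('peak day index:', peak_day)
```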

> ## Some IPython Magic
>
> If you're using an IPython / Jupyter notebook,
> you'll need to execute the following command
> in order for your matplotlib images to appear
> in the notebook when `show()` is called:
>
> ~~~
> %matplotlib inline
> ~~~
> {: .language-python}
>
> The `%` indicates an IPython magic function:
> a function that is only valid within the notebook environment.
> Note that you only have to execute this function once per notebook.
{: .callout}

Let's take a look at the average inflammation over time:

~~~
ave_inflammation = numpy.mean(data, axis=0)
ave_plot = matplotlib.pyplot.plot(ave_inflammation)
matplotlib.pyplot.show()
~~~
{: .language-python}

![Average Inflammation Over Time](../fig/01-numpy_73_0.png)

Here, we have put the average per day across all patients in the variable `ave_inflammation`, then
asked `matplotlib.pyplot` to create and display a line graph of those values. The result is a
roughly linear rise and fall, which is suspicious: we might instead expect a sharper rise and slower
fall.
Let's have a look at two other statistics:

~~~
max_plot = matplotlib.pyplot.plot(numpy.max(data, axis=0))
matplotlib.pyplot.show()
~~~
{: .language-python}

![Maximum Value Along The First Axis](../fig/01-numpy_75_1.png)

~~~
min_plot = matplotlib.pyplot.plot(numpy.min(data, axis=0))
matplotlib.pyplot.show()
~~~
{: .language-python}

![Minimum Value Along The First Axis](../fig/01-numpy_75_3.png)

The maximum value rises and falls smoothly, while the minimum seems to be a step function.
Neither trend seems particularly likely, so either there's a mistake in our calculations or
something is wrong with our data. This insight would have been difficult to reach by examining
the numbers themselves without visualization tools.

### Grouping plots
You can group similar plots in a single figure using subplots.
The script below uses a number of new commands. The function `matplotlib.pyplot.figure()`
creates a space into which we will place all of our plots. The parameter `figsize`
tells Python how big to make this space. Each subplot is placed into the figure using
its `add_subplot` [method]({{ page.root }}/reference/#method). The `add_subplot` method takes 3
parameters. The first denotes how many total rows of subplots there are, the second parameter
refers to the total number of subplot columns, and the final parameter denotes which subplot
your variable is referencing (left-to-right, top-to-bottom). Each subplot is stored in a
different variable (`axes1`, `axes2`, `axes3`). Once a subplot is created, its axes can
be labeled using the `set_xlabel()` command (or `set_ylabel()`).
Here are our three plots side by side:

~~~
import numpy
import matplotlib.pyplot

data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')

fig = matplotlib.pyplot.figure(figsize=(10.0, 3.0))

axes1 = fig.add_subplot(1, 3, 1)
axes2 = fig.add_subplot(1, 3, 2)
axes3 = fig.add_subplot(1, 3, 3)