---
title: Analyzing Patient Data
teaching: 30
exercises: 0
questions:
- "How can I process tabular data files in Python?"
objectives:
- "Explain what a library is and what libraries are used for."
- "Import a Python library and use the functions it contains."
- "Read tabular data from a file into a program."
- "Assign values to variables."
- "Select individual values and subsections from data."
- "Perform operations on arrays of data."
- "Plot simple graphs from data."
keypoints:
- "Import a library into a program using `import libraryname`."
- "Use the `numpy` library to work with arrays in Python."
- "Use `variable = value` to assign a value to a variable in order to record it in memory."
- "Variables are created on demand whenever a value is assigned to them."
- "Use `print(something)` to display the value of `something`."
- "The expression `array.shape` gives the shape of an array."
- "Use `array[x, y]` to select a single element from a 2D array."
- "Array indices start at 0, not 1."
- "Use `low:high` to specify a `slice` that includes the indices from `low` to `high-1`."
- "All the indexing and slicing that works on arrays also works on strings."
- "Use `# some kind of explanation` to add comments to programs."
- "Use `numpy.mean(array)`, `numpy.max(array)`, and `numpy.min(array)` to calculate simple statistics."
- "Use `numpy.mean(array, axis=0)` or `numpy.mean(array, axis=1)` to calculate statistics across the specified axis."
- "Use the `pyplot` library from `matplotlib` for creating simple visualizations."
---

In this lesson we will learn how to manipulate the inflammation dataset with Python.
Before we discuss how to deal with many data points,
we will show how to store a single value on the computer.

You can get output from Python by typing math into the console:

~~~
3+5
12/7
~~~
{: .language-python}

However, to do anything useful or interesting, we need to assign values to _variables_
(that is, to link _objects_ to names).
The line below [assigns]({{ page.root }}/reference/#assign) the value `60` to a
[variable]({{ page.root }}/reference/#variable) `weight_kg`:

~~~
weight_kg = 60
~~~
{: .language-python}

A variable is a name for a value,
such as `x_val`, `current_temperature`, or `subject_id`.
Python's variable names must begin with a letter or an underscore (not a digit)
and are [case sensitive]({{ page.root }}/reference/#case-sensitive).
We can create a new variable by assigning a value to it using `=`.
When we are finished typing and press Shift+Return,
the notebook runs our command.

Once a variable has a value, we can print it to the screen:

~~~
print(weight_kg)
~~~
{: .language-python}

~~~
60
~~~
{: .output}

and do arithmetic with it (remember, there are 2.2 pounds per kilogram):

~~~
print('weight in pounds:', 2.2 * weight_kg)
~~~
{: .language-python}

~~~
weight in pounds: 132.0
~~~
{: .output}

As the example above shows,
we can print several things at once by separating them with commas.

We can also change a variable's value by assigning it a new one:

~~~
weight_kg = 65.0
print('weight in kilograms is now:', weight_kg)
~~~
{: .language-python}

~~~
weight in kilograms is now: 65.0
~~~
{: .output}

If we imagine the variable as a sticky note with a name written on it,
assignment is like putting the sticky note on a particular value:

![Variables as Sticky Notes](../fig/python-sticky-note-variables-01.svg)

This means that assigning a value to one variable does *not* change the values of other variables.
For example, let's store the subject's weight in pounds in a variable:

~~~
# There are 2.2 pounds per kilogram
weight_lb = 2.2 * weight_kg
print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)
~~~
{: .language-python}

~~~
weight in kilograms: 65.0 and in pounds: 143.0
~~~
{: .output}

![Creating Another Variable](../fig/python-sticky-note-variables-02.svg)

and then change `weight_kg`:

~~~
weight_kg = 100.0
print('weight in kilograms is now:', weight_kg, 'and weight in pounds is still:', weight_lb)
~~~
{: .language-python}

~~~
weight in kilograms is now: 100.0 and weight in pounds is still: 143.0
~~~
{: .output}

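The output above shows the pounds value staying at 143.0 after `weight_kg` was reassigned. The same behavior can be seen in a minimal standalone sketch. The names `length_m` and `length_cm` are hypothetical and kept separate from the lesson's variables so that running it does not disturb `weight_kg` or `weight_lb`:

```python
# Hypothetical names, separate from the lesson's weight_kg and weight_lb.
length_m = 2.0
length_cm = 100 * length_m    # computed once, from the current value of length_m

length_m = 3.0                # reassigning length_m leaves length_cm untouched
print('after reassignment:', length_m, length_cm)    # after reassignment: 3.0 200.0

length_cm = 100 * length_m    # repeat the calculation to bring length_cm up to date
print('after recomputing:', length_m, length_cm)     # after recomputing: 3.0 300.0
```

A dependent value only changes when the assignment that computes it is run again.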

![Updating a Variable](../fig/python-sticky-note-variables-03.svg)

Since `weight_lb` doesn't remember where its value came from,
it isn't automatically updated when `weight_kg` changes.
This is different from the way spreadsheets work.

> ## Who's Who in Memory
>
> You can use the `%whos` command at any time to see what
> variables you have created and what modules you have loaded into the computer's memory.
> As this is an IPython command, it will only work if you are in an IPython terminal or the
> Jupyter Notebook.
>
> ~~~
> %whos
> ~~~
> {: .language-python}
>
> ~~~
> Variable    Type     Data/Info
> --------------------------------
> weight_kg   float    100.0
> weight_lb   float    143.0
> ~~~
> {: .output}
{: .callout}

Words are useful, but what's more useful are the sentences and stories we build with them.
Similarly, while a lot of powerful, general tools are built into languages like Python,
specialized tools built up from these basic units live in
[libraries]({{ page.root }}/reference/#library)
that can be called upon when needed.

In order to load our inflammation data,
we need to access ([import]({{ page.root }}/reference/#import) in Python terminology)
a library called [NumPy](http://docs.scipy.org/doc/numpy/ "NumPy Documentation").
In general you should use this library if you want to do fancy things with numbers,
especially if you have matrices or arrays.
We can import NumPy using:

~~~
import numpy
~~~
{: .language-python}

Importing a library is like getting a piece of lab equipment out of a storage locker and setting it
up on the bench. Libraries provide additional functionality to the basic Python package, much like
a new piece of equipment adds functionality to a lab space. Just like in the lab, importing too
many libraries can sometimes complicate and slow down your programs, so we only import what we
need for each program.
Once we've imported the library, we can ask the library to read our data file for us:

~~~
numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
~~~
{: .language-python}

~~~
array([[ 0.,  0.,  1., ...,  3.,  0.,  0.],
       [ 0.,  1.,  2., ...,  1.,  0.,  1.],
       [ 0.,  1.,  1., ...,  2.,  1.,  1.],
       ...,
       [ 0.,  1.,  1., ...,  1.,  1.,  1.],
       [ 0.,  0.,  0., ...,  0.,  2.,  0.],
       [ 0.,  0.,  1., ...,  1.,  1.,  0.]])
~~~
{: .output}

The expression `numpy.loadtxt(...)` is a [function call]({{ page.root }}/reference/#function-call)
that asks Python to run the [function]({{ page.root }}/reference/#function) `loadtxt` which
belongs to the `numpy` library. This [dotted notation]({{ page.root }}/reference/#dotted-notation)
is used everywhere in Python: the thing that appears before the dot contains the thing that
appears after.

As an example, John Smith is the John that belongs to the Smith family.
We could use the dot notation to write his name `smith.john`,
just as `loadtxt` is a function that belongs to the `numpy` library.

Our call to `numpy.loadtxt` supplies two [parameters]({{ page.root }}/reference/#parameter): the name of the file
we want to read and the [delimiter]({{ page.root }}/reference/#delimiter) that separates values on
a line.
These both need to be character strings (or [strings]({{ page.root }}/reference/#string)
for short), so we put them in quotes.

Since we haven't told the notebook to do anything else with the function's output,
it displays that output.
In this case,
that output is the data we just loaded.
By default,
only a few rows and columns are shown
(with `...` to omit elements when displaying big arrays).
To save space,
NumPy displays numbers as `1.` instead of `1.0`
when there's nothing interesting after the decimal point.

Our call to `numpy.loadtxt` read our file
but didn't save the data in memory.
To do that,
we need to assign the array to a variable. Just as we can assign a single value to a variable, we
can also assign an array of values to a variable using the same syntax. Let's re-run
`numpy.loadtxt` and save the returned data:

~~~
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
~~~
{: .language-python}

This statement doesn't produce any output because we've assigned the output to the variable `data`.

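To experiment with the `fname` and `delimiter` parameters without the lesson's data file, here is a minimal sketch that reads comma-separated text held in memory. It assumes that `numpy.loadtxt` is given a file-like object (`io.StringIO`) in place of a file name. The text `csv_text` and the variable `sample` are made up for illustration and are not part of the inflammation data:

```python
import io
import numpy

# Made-up stand-in for a small CSV file: 2 rows of 3 comma-separated values.
csv_text = "0,1,2\n3,4,5\n"

# loadtxt accepts a file-like object as well as a file name.
sample = numpy.loadtxt(fname=io.StringIO(csv_text), delimiter=',')

# The result is assigned to a variable, so nothing is shown until we print it.
print(sample)
```

As with the real file, each line of text becomes one row of the array and the values are read as floating-point numbers.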

If we want to check that the data have been loaded,
we can print the variable's value:

~~~
print(data)
~~~
{: .language-python}

~~~
[[ 0.  0.  1. ...,  3.  0.  0.]
 [ 0.  1.  2. ...,  1.  0.  1.]
 [ 0.  1.  1. ...,  2.  1.  1.]
 ...,
 [ 0.  1.  1. ...,  1.  1.  1.]
 [ 0.  0.  0. ...,  0.  2.  0.]
 [ 0.  0.  1. ...,  1.  1.  0.]]
~~~
{: .output}

Now that the data are in memory,
we can manipulate them.
First,
let's ask what [type]({{ page.root }}/reference/#type) of thing `data` refers to:

~~~
print(type(data))
~~~
{: .language-python}

~~~
<class 'numpy.ndarray'>
~~~
{: .output}

The output tells us that `data` currently refers to
an N-dimensional array, the functionality for which is provided by the NumPy library.
These data correspond to arthritis patients' inflammation.
The rows are the individual patients, and the columns
are their daily inflammation measurements.

> ## Data Type
>
> A NumPy array contains one or more elements
> of the same type. The `type` function will only tell you that
> a variable is a NumPy array but won't tell you the type of
> thing inside the array.
> We can find out the type
> of the data contained in the NumPy array.
>
> ~~~
> print(data.dtype)
> ~~~
> {: .language-python}
>
> ~~~
> float64
> ~~~
> {: .output}
>
> This tells us that the NumPy array's elements are
> [floating-point numbers]({{ page.root }}/reference/#floating-point number).
{: .callout}

With the following command, we can see the array's [shape]({{ page.root }}/reference/#shape):

~~~
print(data.shape)
~~~
{: .language-python}

~~~
(60, 40)
~~~
{: .output}

The output tells us that the `data` array variable contains 60 rows and 40 columns. When we
created the variable `data` to store our arthritis data, we didn't just create the array; we also
created information about the array, called [members]({{ page.root }}/reference/#member) or
attributes. This extra information describes `data` in the same way an adjective describes a noun.
`data.shape` is an attribute of `data` which describes the dimensions of `data`. We use the same
dotted notation for the attributes of variables that we use for the functions in libraries because
they have the same part-and-whole relationship.

If we want to get a single number from the array, we must provide an
[index]({{ page.root }}/reference/#index) in square brackets after the variable name, just as we
do in math when referring to an element of a matrix. Our inflammation data has two dimensions, so
we will need to use two indices to refer to one specific value:

~~~
print('first value in data:', data[0, 0])
~~~
{: .language-python}

~~~
first value in data: 0.0
~~~
{: .output}

~~~
print('middle value in data:', data[30, 20])
~~~
{: .language-python}

~~~
middle value in data: 13.0
~~~
{: .output}

The expression `data[30, 20]` accesses the element at row 30, column 20. While this expression may
not surprise you,
`data[0, 0]` might.
Programming languages like Fortran, MATLAB and R start counting at 1
because that's what human beings have done for thousands of years.
Languages in the C family (including C++, Java, Perl, and Python) count from 0
because it represents an offset from the first value in the array (the second
value is offset by one index from the first value). This is closer to the way
that computers represent arrays (if you are interested in the historical
reasons behind counting indices from zero, you can read
[Mike Hoye's blog post](http://exple.tive.org/blarg/2013/10/22/citation-needed/)).
As a result,
if we have an M×N array in Python,
its indices go from 0 to M-1 on the first axis
and 0 to N-1 on the second.
It takes a bit of getting used to,
but one way to remember the rule is that
the index is how many steps we have to take from the start to get the item we want.

![Zero Index](../fig/python-zero-index.png)

> ## In the Corner
>
> What may also surprise you is that when Python displays an array,
> it shows the element with index `[0, 0]` in the upper left corner
> rather than the lower left.
> This is consistent with the way mathematicians draw matrices
> but different from the Cartesian coordinates.
`````` 378 ``````> The indices are (row, column) instead of (column, row) for the same reason, `````` Greg Wilson committed Sep 08, 2014 379 ``````> which can be confusing when plotting data. `````` Greg Wilson committed Jun 22, 2016 380 ``````{: .callout} `````` Raniere Silva committed Sep 02, 2014 381 382 383 384 385 `````` An index like `[30, 20]` selects a single element of an array, but we can select whole sections as well. For example, we can select the first ten days (columns) of values `````` shiffer1 committed Jun 25, 2015 386 ``````for the first four patients (rows) like this: `````` Greg Wilson committed Mar 03, 2014 387 `````` `````` Greg Wilson committed Jun 22, 2016 388 ``````~~~ `````` Raniere Silva committed Aug 20, 2015 389 ``````print(data[0:4, 0:10]) `````` Greg Wilson committed Dec 03, 2014 390 ``````~~~ `````` Anne Fouilloux committed Feb 14, 2018 391 ``````{: .language-python} `````` Greg Wilson committed Jun 22, 2016 392 393 `````` ~~~ `````` Greg Wilson committed Dec 03, 2014 394 ``````[[ 0. 0. 1. 3. 1. 2. 4. 7. 8. 3.] `````` Greg Wilson committed Mar 03, 2014 395 396 397 `````` [ 0. 1. 2. 1. 2. 1. 3. 2. 2. 6.] [ 0. 1. 1. 3. 3. 2. 6. 2. 5. 9.] [ 0. 0. 2. 0. 4. 2. 2. 1. 6. 7.]] `````` Greg Wilson committed Dec 03, 2014 398 ``````~~~ `````` Greg Wilson committed Jun 22, 2016 399 ``````{: .output} `````` Greg Wilson committed Mar 03, 2014 400 `````` `````` Nicholas Cifuentes-Goodbody committed May 02, 2018 401 402 403 ``````The [slice]({{ page.root }}/reference/#slice) `0:4` means, "Start at index 0 and go up to, but not including, index 4."Again, the up-to-but-not-including takes a bit of getting used to, but the rule is that the difference between the upper and lower bounds is the number of values in the slice. 

We don't have to start slices at 0:

~~~
print(data[5:10, 0:10])
~~~
{: .language-python}

~~~
[[ 0.  0.  1.  2.  2.  4.  2.  1.  6.  4.]
 [ 0.  0.  2.  2.  4.  2.  2.  5.  5.  8.]
 [ 0.  0.  1.  2.  3.  1.  2.  3.  5.  3.]
 [ 0.  0.  0.  3.  1.  5.  6.  5.  5.  8.]
 [ 0.  1.  1.  2.  1.  3.  5.  3.  5.  8.]]
~~~
{: .output}

We also don't have to include the upper and lower bound on the slice.
If we don't include the lower bound,
Python uses 0 by default;
if we don't include the upper,
the slice runs to the end of the axis,
and if we don't include either
(i.e., if we just use ':' on its own),
the slice includes everything:

~~~
small = data[:3, 36:]
print('small is:')
print(small)
~~~
{: .language-python}

The above example selects rows 0 through 2 and columns 36 through to the end of the array.

~~~
small is:
[[ 2.  3.  0.  0.]
 [ 1.  1.  0.  1.]
 [ 2.  2.  1.  1.]]
~~~
{: .output}

Arrays also know how to perform common mathematical operations on their values.
The simplest operations with data are arithmetic:
addition, subtraction, multiplication, and division.
When you do such operations on arrays,
the operation is done element-by-element.
Thus:

~~~
doubledata = data * 2.0
~~~
{: .language-python}

will create a new array `doubledata`,
each element of which is twice the value of the corresponding element in `data`:

~~~
print('original:')
print(data[:3, 36:])
print('doubledata:')
print(doubledata[:3, 36:])
~~~
{: .language-python}

~~~
original:
[[ 2.  3.  0.  0.]
 [ 1.  1.  0.  1.]
 [ 2.  2.  1.  1.]]
doubledata:
[[ 4.  6.  0.  0.]
 [ 2.  2.  0.  2.]
 [ 4.  4.  2.  2.]]
~~~
{: .output}

If,
instead of taking an array and doing arithmetic with a single value (as above),
you do the arithmetic operation with another array of the same shape,
the operation is done on corresponding elements of the two arrays.
Thus:

~~~
tripledata = doubledata + data
~~~
{: .language-python}

will give you an array where `tripledata[0,0]` will equal `doubledata[0,0]` plus `data[0,0]`,
and so on for all other elements of the arrays.

~~~
print('tripledata:')
print(tripledata[:3, 36:])
~~~
{: .language-python}

~~~
tripledata:
[[ 6.  9.  0.  0.]
 [ 3.  3.  0.  3.]
 [ 6.  6.  3.  3.]]
~~~
{: .output}

Often, we want to do more than add, subtract, multiply, and divide array elements.
NumPy knows how to do more complex operations, too.
If we want to find the average inflammation for all patients on all days,
for example,
we can ask NumPy to compute `data`'s mean value:

~~~
print(numpy.mean(data))
~~~
{: .language-python}

~~~
6.14875
~~~
{: .output}

`mean` is a [function]({{ page.root }}/reference/#function) that takes
an array as an [argument]({{ page.root }}/reference/#argument).

> ## Not All Functions Have Input
>
> Generally, a function uses inputs to produce outputs.
> However, some functions produce outputs without
> needing any input. For example, checking the current time
> doesn't require any input.
>
> ~~~
> import time
> print(time.ctime())
> ~~~
> {: .language-python}
>
> ~~~
> Sat Mar 26 13:07:33 2016
> ~~~
> {: .output}
>
> For functions that don't take in any arguments,
> we still need parentheses (`()`)
> to tell Python to go and do something for us.
{: .callout}

NumPy has lots of useful functions that take an array as input.
Let's use three of those functions to get some descriptive values about the dataset.
We'll also use multiple assignment,
a convenient Python feature that will enable us to do this all in one line.

~~~
maxval, minval, stdval = numpy.max(data), numpy.min(data), numpy.std(data)

print('maximum inflammation:', maxval)
print('minimum inflammation:', minval)
print('standard deviation:', stdval)
~~~
{: .language-python}

Here we've assigned the return value from `numpy.max(data)` to the variable `maxval`, the value
from `numpy.min(data)` to `minval`, and so on.

~~~
maximum inflammation: 20.0
minimum inflammation: 0.0
standard deviation: 4.61383319712
~~~
{: .output}

> ## Mystery Functions in IPython
>
> How did we know what functions NumPy has and how to use them?
> If you are working in the IPython/Jupyter Notebook, there is an easy way to find out.
> If you type the name of something followed by a dot, then you can use tab completion
> (e.g.
> type `numpy.` and then press tab)
> to see a list of all functions and attributes that you can use. After selecting one, you
> can also add a question mark (e.g. `numpy.cumprod?`), and IPython will return an
> explanation of the function! This is much the same as doing `help(numpy.cumprod)`.
{: .callout}

When analyzing data, though,
we often want to look at variations in statistical values,
such as the maximum inflammation per patient
or the average inflammation per day.
One way to do this is to create a new temporary array of the data we want,
then ask it to do the calculation:

~~~
patient_0 = data[0, :] # 0 on the first axis (rows), everything on the second (columns)
print('maximum inflammation for patient 0:', patient_0.max())
~~~
{: .language-python}

~~~
maximum inflammation for patient 0: 18.0
~~~
{: .output}

Everything in a line of code following the '#' symbol is a
[comment]({{ page.root }}/reference/#comment) that is ignored by Python.
Comments allow programmers to leave explanatory notes for other
programmers or their future selves.

We don't actually need to store the row in a variable of its own.
Instead, we can combine the selection and the function call:

~~~
print('maximum inflammation for patient 2:', numpy.max(data[2, :]))
~~~
{: .language-python}

~~~
maximum inflammation for patient 2: 19.0
~~~
{: .output}

What if we need the maximum inflammation for each patient over all days (as in the
next diagram on the left) or the average for each day (as in the
diagram on the right)? As the diagram below shows, we want to perform the
operation across an axis:

![Operations Across Axes](../fig/python-operations-across-axes.png)

To support this functionality,
most array functions allow us to specify the axis we want to work on.
If we ask for the average across axis 0 (rows in our 2D example),
we get:

~~~
print(numpy.mean(data, axis=0))
~~~
{: .language-python}

~~~
[  0.           0.45         1.11666667   1.75         2.43333333   3.15
   3.8          3.88333333   5.23333333   5.51666667   5.95         5.9
   8.35         7.73333333   8.36666667   9.5          9.58333333  10.63333333
  11.56666667  12.35        13.25        11.96666667  11.03333333  10.16666667
  10.           8.66666667   9.15         7.25         7.33333333   6.58333333
   6.06666667   5.95         5.11666667   3.6          3.3          3.56666667
   2.48333333   1.5          1.13333333   0.56666667]
~~~
{: .output}

As a quick check,
we can ask this array what its shape is:

~~~
print(numpy.mean(data, axis=0).shape)
~~~
{: .language-python}

~~~
(40,)
~~~
{: .output}

The expression `(40,)` tells us we have a one-dimensional array of 40 values,
so this is the average inflammation per day for all patients.
If we average across axis 1 (columns in our 2D example), we get:

~~~
print(numpy.mean(data, axis=1))
~~~
{: .language-python}

~~~
[ 5.45   5.425  6.1    5.9    5.55   6.225  5.975  6.65   6.625  6.525
  6.775  5.8    6.225  5.75   5.225  6.3    6.55   5.7    5.85   6.55
  5.775  5.825  6.175  6.1    5.8    6.425  6.05   6.025  6.175  6.55
  6.175  6.35   6.725  6.125  7.075  5.725  5.925  6.15   6.075  5.75
  5.975  5.725  6.3    5.9    6.75   5.925  7.225  6.15   5.95   6.275
  5.7    6.1    6.825  5.975  6.725  5.7    6.25   6.4    7.05   5.9  ]
~~~
{: .output}

which is the average inflammation per patient across all days.

The mathematician Richard Hamming once said,
"The purpose of computing is insight, not numbers,"
and the best way to develop insight is often to visualize data.
Visualization deserves an entire lecture of its own,
but we can explore a few features of Python's `matplotlib` library here.
While there is no official plotting library,
`matplotlib` is the de facto standard.
First, we will import the `pyplot` module from `matplotlib`
and use two of its functions to create and display a heat map of our data:

~~~
import matplotlib.pyplot
image = matplotlib.pyplot.imshow(data)
matplotlib.pyplot.show()
~~~
{: .language-python}

![Heatmap of the Data](../fig/01-numpy_71_0.png)

Blue pixels in this heat map represent low values, while yellow pixels represent high values.
As we can see,
inflammation rises and falls over a 40-day period.
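
The rise and fall visible in the heat map can also be checked numerically.
The following is a minimal sketch, not part of the lesson's dataset:
`toy` is a small made-up array (3 patients by 5 days) standing in for `data`,
and `numpy.argmax` is used to find the day on which the average peaks.

```python
import numpy

# Hypothetical stand-in for the inflammation data:
# 3 patients (rows) by 5 days (columns).
toy = numpy.array([[0.0, 1.0, 3.0, 2.0, 0.0],
                   [0.0, 2.0, 4.0, 1.0, 1.0],
                   [0.0, 1.0, 5.0, 3.0, 0.0]])

# Averaging over axis 0 (the patients) leaves one value per day.
daily_mean = numpy.mean(toy, axis=0)

# argmax returns the index of the largest value,
# i.e. the peak day, counting from 0.
peak_day = numpy.argmax(daily_mean)

print('average per day:', daily_mean)
print('peak day:', peak_day)
```

With the real data, the same two calls on `data` would report the day on which the average inflammation is highest.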

> ## Some IPython Magic
>
> If you're using an IPython / Jupyter notebook,
> you'll need to execute the following command
> in order for your matplotlib images to appear
> in the notebook when `show()` is called:
>
> ~~~
> %matplotlib inline
> ~~~
> {: .language-python}
>
> The `%` indicates an IPython magic function -
> a function that is only valid within the notebook environment.
> Note that you only have to execute this function once per notebook.
{: .callout}

Let's take a look at the average inflammation over time:

~~~
ave_inflammation = numpy.mean(data, axis=0)
ave_plot = matplotlib.pyplot.plot(ave_inflammation)
matplotlib.pyplot.show()
~~~
{: .language-python}

![Average Inflammation Over Time](../fig/01-numpy_73_0.png)

Here, we have put the average per day across all patients in the variable `ave_inflammation`,
then asked `matplotlib.pyplot` to create and display a line graph of those values.
The result is a roughly linear rise and fall,
which is suspicious:
we might instead expect a sharper rise and slower fall.
Let's have a look at two other statistics:

~~~
max_plot = matplotlib.pyplot.plot(numpy.max(data, axis=0))
matplotlib.pyplot.show()
~~~
{: .language-python}

![Maximum Value Along The First Axis](../fig/01-numpy_75_1.png)

~~~
min_plot = matplotlib.pyplot.plot(numpy.min(data, axis=0))
matplotlib.pyplot.show()
~~~
{: .language-python}

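
Each of the line plots above is drawn from a one-dimensional array that holds one statistic per day.
As a sketch of what `axis=0` and `axis=1` produce, here is the same kind of calculation
on a small made-up array (`toy` is a hypothetical stand-in for `data`, not the inflammation file):

```python
import numpy

# Hypothetical stand-in for the inflammation data:
# 2 patients (rows) by 3 days (columns).
toy = numpy.array([[1.0, 4.0, 2.0],
                   [3.0, 0.0, 6.0]])

# axis=0 collapses the rows: one value per day (column).
max_per_day = numpy.max(toy, axis=0)
min_per_day = numpy.min(toy, axis=0)

# axis=1 collapses the columns: one value per patient (row).
mean_per_patient = numpy.mean(toy, axis=1)

print('max per day:', max_per_day)
print('min per day:', min_per_day)
print('mean per patient:', mean_per_patient)
```

The results along `axis=0` have one entry per column, which is why the plots of the real data have 40 points, one for each day.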