---
title: Analyzing Patient Data
teaching: 30
exercises: 0
questions:
- "How can I process tabular data files in Python?"
objectives:
- "Explain what a library is, and what libraries are used for."
- "Import a Python library and use the things it contains."
- "Read tabular data from a file into a program."
- "Assign values to variables."
- "Select individual values and subsections from data."
- "Perform operations on arrays of data."
- "Display simple graphs."
keypoints:
- "Import a library into a program using `import libraryname`."
- "Use the `numpy` library to work with arrays in Python."
- "Use `variable = value` to assign a value to a variable in order to record it in memory."
- "Variables are created on demand whenever a value is assigned to them."
- "Use `print(something)` to display the value of `something`."
- "The expression `array.shape` gives the shape of an array."
- "Use `array[x, y]` to select a single element from an array."
- "Array indices start at 0, not 1."
- "Use `low:high` to specify a slice that includes the indices from `low` to `high-1`."
- "All the indexing and slicing that works on arrays also works on strings."
- "Use `# some kind of explanation` to add comments to programs."
- "Use `numpy.mean(array)`, `numpy.max(array)`, and `numpy.min(array)` to calculate simple statistics."
- "Use `numpy.mean(array, axis=0)` or `numpy.mean(array, axis=1)` to calculate statistics across the specified axis."
- "Use the `pyplot` library from `matplotlib` for creating simple visualizations."
---

Words are useful,
but what's more useful are the sentences and stories we build with them.
Similarly,
while a lot of powerful, general tools are built into languages like Python,
specialized tools built up from these basic units live in [libraries]({{ site.github.url }}/reference/#library)
that can be called upon when needed.

In order to load our inflammation data,
we need to access ([import]({{ site.github.url }}/reference/#import) in Python terminology)
a library called [NumPy](http://docs.scipy.org/doc/numpy/ "NumPy Documentation").
In general you should use this library if you want to do fancy things with numbers,
especially if you have matrices or arrays.
We can import NumPy using:

~~~
import numpy
~~~
{: .python}

Importing a library is like getting a piece of lab equipment out of a storage locker
and setting it up on the bench. Libraries provide additional functionality to the basic Python package,
much like a new piece of equipment adds functionality to a lab space.
Once we've imported the library,
we can ask the library to read our data file for us:

~~~
numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
~~~
{: .python}

~~~
array([[ 0.,  0.,  1., ...,  3.,  0.,  0.],
       [ 0.,  1.,  2., ...,  1.,  0.,  1.],
       [ 0.,  1.,  1., ...,  2.,  1.,  1.],
       ...,
       [ 0.,  1.,  1., ...,  1.,  1.,  1.],
       [ 0.,  0.,  0., ...,  0.,  2.,  0.],
       [ 0.,  0.,  1., ...,  1.,  1.,  0.]])
~~~
{: .output}

The expression `numpy.loadtxt(...)` is a [function call]({{ site.github.url }}/reference/#function-call)
that asks Python to run the [function]({{ site.github.url }}/reference/#function) `loadtxt` which belongs to the `numpy` library.
This [dotted notation]({{ site.github.url }}/reference/#dotted-notation) is used everywhere in Python
to refer to the parts of things as `thing.component`.
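
To see the same call on data small enough to read in full, here is a minimal sketch. It assumes a made-up three-line CSV held in memory with `io.StringIO` instead of the lesson's `inflammation-01.csv`; the names `csv_text` and `sample` are ours, not part of the lesson.

```python
import io

import numpy

# Hypothetical stand-in for a small CSV file: three patients, four days.
csv_text = io.StringIO("0,0,1,3\n0,1,2,1\n0,1,1,3\n")

# Dotted notation: `loadtxt` is a function that belongs to the `numpy` library.
# It accepts an open file-like object as well as a file name.
sample = numpy.loadtxt(fname=csv_text, delimiter=',')

print(sample)
print(sample.shape)
```

Because the text is defined in the program itself, the example runs without any data file on disk.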

`numpy.loadtxt` has two [parameters]({{ site.github.url }}/reference/#parameter):
the name of the file we want to read,
and the [delimiter]({{ site.github.url }}/reference/#delimiter) that separates values on a line.
These both need to be character strings (or [strings]({{ site.github.url }}/reference/#string) for short),
so we put them in quotes.

When we are finished typing and press Shift+Enter,
the notebook runs our command.
Since we haven't told it to do anything else with the function's output,
the notebook displays it.
In this case,
that output is the data we just loaded.
By default,
only a few rows and columns are shown
(with `...` to omit elements when displaying big arrays).
To save space,
NumPy displays numbers as `1.` instead of `1.0`
when there's nothing interesting after the decimal point.

Our call to `numpy.loadtxt` read our file,
but didn't save the data in memory.
To do that,
we need to [assign]({{ site.github.url }}/reference/#assignment) the array to a [variable]({{ site.github.url }}/reference/#variable).
A variable is just a name for a value,
such as `x`, `current_temperature`, or `subject_id`.
Python's variable names must begin with a letter or an underscore and are [case sensitive]({{ site.github.url }}/reference/#case-sensitive).
We can create a new variable by assigning a value to it using `=`.
As an illustration,
let's step back and instead of considering a table of data,
consider the simplest "collection" of data,
a single value.
The line below assigns the value `55` to a variable `weight_kg`:

~~~
weight_kg = 55
~~~
{: .python}

Once a variable has a value, we can print it to the screen:

~~~
print(weight_kg)
~~~
{: .python}

~~~
55
~~~
{: .output}

and do arithmetic with it:

~~~
print('weight in pounds:', 2.2 * weight_kg)
~~~
{: .python}

~~~
weight in pounds: 121.0
~~~
{: .output}

As the example above shows,
we can print several things at once by separating them
with commas.
We can also change a variable's value by assigning it a new one:

~~~
weight_kg = 57.5
print('weight in kilograms is now:', weight_kg)
~~~
{: .python}

~~~
weight in kilograms is now: 57.5
~~~
{: .output}

If we imagine the variable as a sticky note with a name written on it,
assignment is like putting the sticky note on a particular value:

![Variables as Sticky Notes](../fig/python-sticky-note-variables-01.svg)

This means that assigning a value to one variable
does *not* change the values of other variables.
For example,
let's store the subject's weight in pounds in a variable:

~~~
weight_lb = 2.2 * weight_kg
print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)
~~~
{: .python}

~~~
weight in kilograms: 57.5 and in pounds: 126.5
~~~
{: .output}

![Creating Another Variable](../fig/python-sticky-note-variables-02.svg)

and then change `weight_kg`:

~~~
weight_kg = 100.0
print('weight in kilograms is now:', weight_kg, 'and weight in pounds is still:', weight_lb)
~~~
{: .python}

~~~
weight in kilograms is now: 100.0 and weight in pounds is still: 126.5
~~~
{: .output}

![Updating a Variable](../fig/python-sticky-note-variables-03.svg)

Since `weight_lb` doesn't "remember" where its value
came from, it isn't automatically updated when `weight_kg` changes.
This is different from the way spreadsheets work.

> ## Who's Who in Memory
>
> You can use the `%whos` command at any time to see what
> variables you have created and what modules you have loaded into the computer's memory.
> As this is an IPython command, it will only work if you are in an IPython terminal or the Jupyter Notebook.
>
> ~~~
> %whos
> ~~~
> {: .python}
>
> ~~~
> Variable    Type      Data/Info
> --------------------------------
> numpy       module    kages/numpy/__init__.py'>
> weight_kg   float     100.0
> weight_lb   float     126.5
> ~~~
> {: .output}
{: .callout}

Just as we can assign a single value to a variable, we can also assign an array of values
to a variable using the same syntax. Let's re-run `numpy.loadtxt` and save its result:

~~~
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
~~~
{: .python}

This statement doesn't produce any output because assignment doesn't display anything.
If we want to check that our data has been loaded,
we can print the variable's value:

~~~
print(data)
~~~
{: .python}

~~~
[[ 0.  0.  1. ...,  3.  0.  0.]
 [ 0.  1.  2. ...,  1.  0.  1.]
 [ 0.  1.  1. ...,  2.  1.  1.]
 ...,
 [ 0.  1.  1. ...,  1.  1.  1.]
 [ 0.  0.  0. ...,  0.  2.  0.]
 [ 0.  0.  1. ...,  1.  1.  0.]]
~~~
{: .output}

Now that our data is in memory,
we can start doing things with it.
First,
let's ask what [type]({{ site.github.url }}/reference/#type) of thing `data` refers to:

~~~
print(type(data))
~~~
{: .python}

~~~
<class 'numpy.ndarray'>
~~~
{: .output}

The output tells us that `data` currently refers to
an N-dimensional array created by the NumPy library.
These data correspond to arthritis patients' inflammation.
The rows are the individual patients and the columns
are their daily inflammation measurements.

> ## Data Type
>
> A NumPy array contains one or more elements
> of the same type. `type` will only tell you that
> a variable is a NumPy array.
> We can also find out the type
> of the data contained in the NumPy array.
>
> ~~~
> print(data.dtype)
> ~~~
> {: .python}
>
> ~~~
> float64
> ~~~
> {: .output}
>
> This tells us that the NumPy array's elements are
> [floating-point numbers]({{ site.github.url }}/reference/#floating-point number).
{: .callout}

We can see what the array's [shape]({{ site.github.url }}/reference/#shape) is like this:

~~~
print(data.shape)
~~~
{: .python}

~~~
(60, 40)
~~~
{: .output}

This tells us that `data` has 60 rows and 40 columns.
When we created the
variable `data` to store our arthritis data, we didn't just create the array, we also
created information about the array, called [members]({{ site.github.url }}/reference/#member) or
attributes. This extra information describes `data` in
the same way an adjective describes a noun.
`data.shape` is an attribute of `data` which describes the dimensions of `data`.
We use the same dotted notation for the attributes of variables
that we use for the functions in libraries
because they have the same part-and-whole relationship.

If we want to get a single number from the array,
we must provide an [index]({{ site.github.url }}/reference/#index) in square brackets,
just as we do in math:

~~~
print('first value in data:', data[0, 0])
~~~
{: .python}

~~~
first value in data: 0.0
~~~
{: .output}

~~~
print('middle value in data:', data[30, 20])
~~~
{: .python}

~~~
middle value in data: 13.0
~~~
{: .output}

The expression `data[30, 20]` may not surprise you,
but `data[0, 0]` might.
Programming languages like Fortran and MATLAB start counting at 1,
because that's what human beings have done for thousands of years.
Languages in the C family (including C++, Java, Perl, and Python) count from 0
because that's more convenient when indices are computed rather than constant
(see [Mike Hoye's blog post](http://exple.tive.org/blarg/2013/10/22/citation-needed/)
for historical details).
As a result,
if we have an M×N array in Python,
its indices go from 0 to M-1 on the first axis
and 0 to N-1 on the second.
It takes a bit of getting used to,
but one way to remember the rule is that
the index is how many steps we have to take from the start to get the item we want.

> ## In the Corner
>
> What may also surprise you is that when Python displays an array,
> it shows the element with index `[0, 0]` in the upper left corner
> rather than the lower left.
> This is consistent with the way mathematicians draw matrices,
> but different from Cartesian coordinates.
> The indices are (row, column) instead of (column, row) for the same reason,
> which can be confusing when plotting data.
{: .callout}

An index like `[30, 20]` selects a single element of an array,
but we can select whole sections as well.
For example,
we can select the first ten days (columns) of values
for the first four patients (rows) like this:

~~~
print(data[0:4, 0:10])
~~~
{: .python}

~~~
[[ 0.  0.  1.  3.  1.  2.  4.  7.  8.  3.]
 [ 0.  1.  2.  1.  2.  1.  3.  2.  2.  6.]
 [ 0.  1.  1.  3.  3.  2.  6.  2.  5.  9.]
 [ 0.  0.  2.  0.  4.  2.  2.  1.  6.  7.]]
~~~
{: .output}

The [slice]({{ site.github.url }}/reference/#slice) `0:4` means,
"Start at index 0 and go up to, but not including, index 4."
Again,
the up-to-but-not-including takes a bit of getting used to,
but the rule is that the difference between the upper and lower bounds is the number of values in the slice.

We don't have to start slices at 0:

~~~
print(data[5:10, 0:10])
~~~
{: .python}

~~~
[[ 0.  0.  1.  2.  2.  4.  2.  1.  6.  4.]
 [ 0.  0.  2.  2.  4.  2.  2.  5.  5.  8.]
 [ 0.  0.  1.  2.  3.  1.  2.  3.  5.  3.]
 [ 0.  0.  0.  3.  1.  5.  6.  5.  5.  8.]
 [ 0.  1.  1.  2.  1.  3.  5.  3.  5.  8.]]
~~~
{: .output}

We also don't have to include the upper and lower bound on the slice.
If we don't include the lower bound,
Python uses 0 by default;
if we don't include the upper,
the slice runs to the end of the axis,
and if we don't include either
(i.e., if we just use ':' on its own),
the slice includes everything:

~~~
small = data[:3, 36:]
print('small is:')
print(small)
~~~
{: .python}

~~~
small is:
[[ 2.  3.  0.  0.]
 [ 1.  1.  0.  1.]
 [ 2.  2.  1.  1.]]
~~~
{: .output}

Arrays also know how to perform common mathematical operations on their values.
The simplest operations with data are arithmetic:
add, subtract, multiply, and divide.
When you do such operations on arrays,
the operation is done on each individual element of the array.
Thus:

~~~
doubledata = data * 2.0
~~~
{: .python}

will create a new array `doubledata`
whose elements are twice the value of the corresponding elements in `data`:

~~~
print('original:')
print(data[:3, 36:])
print('doubledata:')
print(doubledata[:3, 36:])
~~~
{: .python}

~~~
original:
[[ 2.  3.  0.  0.]
 [ 1.  1.  0.  1.]
 [ 2.  2.  1.  1.]]
doubledata:
[[ 4.  6.  0.  0.]
 [ 2.  2.  0.  2.]
 [ 4.  4.  2.  2.]]
~~~
{: .output}

If,
instead of taking an array and doing arithmetic with a single value (as above),
you do the arithmetic operation with another array of the same shape,
the operation will be done on corresponding elements of the two arrays.
Thus:

~~~
tripledata = doubledata + data
~~~
{: .python}

will give you an array where `tripledata[0,0]` will equal `doubledata[0,0]` plus `data[0,0]`,
and so on for all other elements of the arrays.

~~~
print('tripledata:')
print(tripledata[:3, 36:])
~~~
{: .python}

~~~
tripledata:
[[ 6.  9.  0.  0.]
 [ 3.  3.  0.  3.]
 [ 6.  6.  3.  3.]]
~~~
{: .output}

Often, we want to do more than add, subtract, multiply, and divide values of data.
NumPy knows how to do more complex operations on arrays.
If we want to find the average inflammation for all patients on all days,
for example,
we can ask NumPy to compute `data`'s mean value:

~~~
print(numpy.mean(data))
~~~
{: .python}

~~~
6.14875
~~~
{: .output}

`mean` is a [function]({{ site.github.url }}/reference/#function) that takes
an array as an [argument]({{ site.github.url }}/reference/#argument).
If variables are nouns, functions are verbs:
they do things with variables.

> ## Not All Functions Have Input
>
> Generally, a function uses inputs to produce outputs.
> However, some functions produce outputs without
> needing any input. For example, checking the current time
> doesn't require any input.
>
> ~~~
> import time
> print(time.ctime())
> ~~~
> {: .python}
>
> ~~~
> Sat Mar 26 13:07:33 2016
> ~~~
> {: .output}
>
> For functions that don't take in any arguments,
> we still need parentheses (`()`)
> to tell Python to go and do something for us.
{: .callout}

NumPy has lots of useful functions that take an array as input.
Let's use three of those functions to get some descriptive values about the dataset.
We'll also use multiple assignment,
a convenient Python feature that will enable us to do this all in one line.

~~~
maxval, minval, stdval = numpy.max(data), numpy.min(data), numpy.std(data)

print('maximum inflammation:', maxval)
print('minimum inflammation:', minval)
print('standard deviation:', stdval)
~~~
{: .python}

~~~
maximum inflammation: 20.0
minimum inflammation: 0.0
standard deviation: 4.61383319712
~~~
{: .output}

> ## Mystery Functions in IPython
>
> How did we know what functions NumPy has and how to use them?
> If you are working in the IPython/Jupyter Notebook there is an easy way to find out.
> If you type the name of something with a full-stop then you can use tab completion
> (e.g. type `numpy.` and then press tab)
> to see a list of all functions and attributes that you can use. After selecting one you
> can also add a question mark (e.g. `numpy.cumprod?`) and IPython will return an
> explanation of the method! This is the same as doing `help(numpy.cumprod)`.
{: .callout}

When analyzing data, though,
we often want to look at partial statistics,
such as the maximum value per patient
or the average value per day.
One way to do this is to create a new temporary array of the data we want,
then ask it to do the calculation:

~~~
patient_0 = data[0, :] # 0 on the first axis, everything on the second
print('maximum inflammation for patient 0:', patient_0.max())
~~~
{: .python}

~~~
maximum inflammation for patient 0: 18.0
~~~
{: .output}

Everything in a line of code following the '#' symbol is a
[comment]({{ site.github.url }}/reference/#comment) that is ignored by the computer.
Comments allow programmers to leave explanatory notes for other
programmers or their future selves.

We don't actually need to store the row in a variable of its own.
Instead, we can combine the selection and the function call:

~~~
print('maximum inflammation for patient 2:', numpy.max(data[2, :]))
~~~
{: .python}

~~~
maximum inflammation for patient 2: 19.0
~~~
{: .output}

What if we need the maximum inflammation for each patient over all days (as in the
next diagram on the left), or the average for each day (as in the
diagram on the right)? As the diagram below shows, we want to perform the
operation across an axis:

![Operations Across Axes](../fig/python-operations-across-axes.png)

To support this,
most array functions allow us to specify the axis we want to work on.
If we ask for the average across axis 0 (rows in our 2D example),
we get:

~~~
print(numpy.mean(data, axis=0))
~~~
{: .python}

~~~
[  0.           0.45         1.11666667   1.75         2.43333333   3.15
   3.8          3.88333333   5.23333333   5.51666667   5.95         5.9
   8.35         7.73333333   8.36666667   9.5          9.58333333
  10.63333333  11.56666667  12.35        13.25        11.96666667
  11.03333333  10.16666667  10.           8.66666667   9.15         7.25
   7.33333333   6.58333333   6.06666667   5.95         5.11666667   3.6
   3.3          3.56666667   2.48333333   1.5          1.13333333
   0.56666667]
~~~
{: .output}

As a quick check,
we can ask this array what its shape is:

~~~
print(numpy.mean(data, axis=0).shape)
~~~
{: .python}

~~~
(40,)
~~~
{: .output}

The expression `(40,)` tells us we have a one-dimensional array of 40 values,
so this is the average inflammation per day for all patients.
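
To see why averaging across axis 0 leaves one value per day, here is a minimal sketch on a made-up array. The name `toy` and its values are hypothetical and are not part of the inflammation data; each row stands in for a patient and each column for a day.

```python
import numpy

# Hypothetical toy data: 2 "patients" (rows) by 3 "days" (columns).
toy = numpy.array([[1.0, 2.0, 3.0],
                   [3.0, 4.0, 5.0]])

# Averaging across axis 0 collapses the rows,
# so one value is left for each column (day).
per_day = numpy.mean(toy, axis=0)

print('shape of toy:', toy.shape)
print('average per day:', per_day)
print('shape of result:', per_day.shape)
```

The result holds the values 2.0, 3.0 and 4.0, and its shape is `(3,)`: the axis we averaged across has disappeared, just as the 60 patients disappear from the `(40,)` result for the real data.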
If we average across axis 1 (columns in our 2D example), we get:

~~~
print(numpy.mean(data, axis=1))
~~~
{: .python}

~~~
[ 5.45   5.425  6.1    5.9    5.55   6.225  5.975  6.65   6.625  6.525
  6.775  5.8    6.225  5.75   5.225  6.3    6.55   5.7    5.85   6.55
  5.775  5.825  6.175  6.1    5.8    6.425  6.05   6.025  6.175  6.55
  6.175  6.35   6.725  6.125  7.075  5.725  5.925  6.15   6.075  5.75
  5.975  5.725  6.3    5.9    6.75   5.925  7.225  6.15   5.95   6.275
  5.7    6.1    6.825  5.975  6.725  5.7    6.25   6.4    7.05   5.9  ]
~~~
{: .output}

which is the average inflammation per patient across all days.

The mathematician Richard Hamming once said,
"The purpose of computing is insight, not numbers,"
and the best way to develop insight is often to visualize data.
Visualization deserves an entire lecture (of course) of its own,
but we can explore a few features of Python's `matplotlib` library here.
While there is no "official" plotting library,
this package is the de facto standard.
First,
we will import the `pyplot` module from `matplotlib`
and use two of its functions to create and display a heat map of our data:

~~~
import matplotlib.pyplot
image = matplotlib.pyplot.imshow(data)
matplotlib.pyplot.show()
~~~
{: .python}

![Heatmap of the Data](../fig/01-numpy_71_0.png)

Blue regions in this heat map are low values, while red shows high values.
As we can see,
inflammation rises and falls over a 40-day period.

> ## Some IPython Magic
>
> If you're using an IPython / Jupyter notebook,
> you'll need to execute the following command
> in order for your matplotlib images to appear
> in the notebook when `show()` is called:
>
> ~~~
> %matplotlib inline
> ~~~
> {: .python}
>
> The `%` indicates an IPython magic function -
> a function that is only valid within the notebook environment.
> Note that you only have to execute this function once per notebook.
{: .callout}

Let's take a look at the average inflammation over time:

~~~
ave_inflammation = numpy.mean(data, axis=0)
ave_plot = matplotlib.pyplot.plot(ave_inflammation)
matplotlib.pyplot.show()
~~~
{: .python}

![Average Inflammation Over Time](../fig/01-numpy_73_0.png)

Here,
we have put the average per day across all patients in the variable `ave_inflammation`,
then asked `matplotlib.pyplot` to create and display a line graph of those values.
The result is roughly a linear rise and fall,
which is suspicious:
based on other studies,
we expect a sharper rise and slower fall.
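
One way to put a number on "roughly linear" is to look at the change from one day to the next with `numpy.diff`. The sketch below uses made-up values rather than the real `ave_inflammation`; the name `toy_average` and its contents are hypothetical.

```python
import numpy

# Hypothetical per-day averages that rise and fall in a perfectly straight line.
toy_average = numpy.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])

# numpy.diff returns the difference between each value and the one before it,
# so the result has one element fewer than the input.
daily_change = numpy.diff(toy_average)

print('change from day to day:', daily_change)
```

For a straight-line rise and fall the step is constant on the way up (here `1.0`) and constant on the way down (here `-1.0`). Applying the same call to `ave_inflammation` would show how close the real averages come to that pattern.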
Let's have a look at two other statistics:

~~~
max_plot = matplotlib.pyplot.plot(numpy.max(data, axis=0))
matplotlib.pyplot.show()
~~~
{: .python}

![Maximum Value Along The First Axis](../fig/01-numpy_75_1.png)

~~~
min_plot = matplotlib.pyplot.plot(numpy.min(data, axis=0))
matplotlib.pyplot.show()
~~~
{: .python}

![Minimum Value Along The First Axis](../fig/01-numpy_75_3.png)

The maximum value rises and falls perfectly smoothly,
while the minimum seems to be a step function.
Neither result seems particularly likely,
so either there's a mistake in our calculations
or something is wrong with our data.
This insight would have been difficult to reach by
examining the data without visualization tools.

You can group similar plots in a single figure using subplots.
The script below uses a number of new commands. The function `matplotlib.pyplot.figure()`
creates a space into which we will place all of our plots. The parameter `figsize`
tells Python how big to make this space. Each subplot is placed into the figure using
its `add_subplot` [method]({{ site.github.url }}/reference/#method). The `add_subplot` method takes 3 parameters. The first denotes
how many total rows of subplots there are, the second parameter refers to the
total number of subplot columns, and the final parameter denotes which subplot
your variable is referencing (left-to-right, top-to-bottom). Each subplot is stored in a
different variable (`axes1`, `axes2`, `axes3`). Once a subplot is created, the axes can
be labeled using the `set_xlabel()` command (or `set_ylabel()`).
Here are our three plots side by side:

~~~
import numpy
import matplotlib.pyplot

data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')

fig = matplotlib.pyplot.figure(figsize=(10.0, 3.0))

axes1 = fig.add_subplot(1, 3, 1)
axes2 = fig.add_subplot(1, 3, 2)
axes3 = fig.add_subplot(1, 3, 3)

axes1.set_ylabel('average')
axes1.plot(numpy.mean(data, axis=0))

axes2.set_ylabel('max')
axes2.plot(numpy.max(data, axis=0))

axes3.set_ylabel('min')
axes3.plot(numpy.min(data, axis=0))

fig.tight_layout()

matplotlib.pyplot.show()
~~~
{: .python}

![The Previous Plots as Subplots](../fig/01-numpy_80_0.png)

The [call]({{ site.github.url }}/reference/#function-call) to `loadtxt` reads our data,
and the rest of the program tells the plotting library
how large we want the figure to be,
that we're creating three subplots,
what to draw for each one,
and that we want a tight layout.
(Perversely,
if we leave out that call to `fig.tight_layout()`,
the graphs will actually be squeezed together more closely.)

> ## Scientists Dislike Typing
>
> We will always use the syntax `import numpy` to import NumPy.
> However, in order to save typing, it is
> [often suggested](http://www.scipy.org/getting-started.html#an-example-script)
> to make a shortcut like so: `import numpy as np`.
> If you ever see Python code online using a NumPy function with `np`
> (for example, `np.loadtxt(...)`), it's because they've used this shortcut.
{: .callout}
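
As a small illustration of that shortcut, the sketch below imports NumPy under the alias `np` and calls two of the functions used earlier. The array `values` is made up for the example and is not the inflammation data.

```python
import numpy as np   # `np` is only a shorter name for the same library

# Hypothetical toy values, not the inflammation data.
values = np.array([1.0, 2.0, 3.0, 4.0])

# These are the same functions as numpy.mean and numpy.max,
# reached through the shorter name.
print('mean:', np.mean(values))
print('max:', np.max(values))
```

The alias changes only how the library is spelled in the code that follows the import; the functions themselves behave exactly as before.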