Commit 6c3687a2 by Bill Mills

### removed aliases; better to show patterns than give confusing alternatives to novices.

parent d79d0440
 ... ... @@ -47,7 +47,7 @@
array([[ 0.,  0.,  1., ...,  3.,  0.,  0.],
[ 0.,  1.,  2., ...,  1.,  0.,  1.],
[ 0.,  1.,  1., ...,  2.,  1.,  1.],
...,
[ 0.,  1.,  1., ...,  1.,  1.,  1.],
[ 0.,  0.,  0., ...,  0.,  2.,  0.],
[ 0.,  0.,  1., ...,  1.,  1.,  0.]])
... ... @@ -68,18 +68,15 @@
weight in kilograms is now: 57.5

As the example above shows, we can print several things at once by separating them with commas.

If we imagine the variable as a sticky note with a name written on it, assignment is like putting the sticky note on a particular value:

This means that assigning a value to one variable does not change the values of other variables. For example, let's store the subject's weight in pounds in a variable:

weight_lb = 2.2 * weight_kg
print 'weight in kilograms:', weight_kg, 'and in pounds:', weight_lb
weight in kilograms: 57.5 and in pounds: 126.5

and then change weight_kg:

weight_kg = 100.0
print 'weight in kilograms is now:', weight_kg, 'and weight in pounds is still:', weight_lb
weight in kilograms is now: 100.0 and weight in pounds is still: 126.5

Since weight_lb doesn't "remember" where its value came from, it isn't automatically updated when weight_kg changes. This is different from the way spreadsheets work.
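
The behaviour described above can be reproduced in a few lines. This is a minimal sketch rather than part of the lesson: it uses one formatted string per `print` call so that it should run under both Python 2 and Python 3, and `weight_lb_updated` is a hypothetical name introduced only for illustration.

```python
# Assign a value, derive a second value from it, then reassign the first.
weight_kg = 57.5
weight_lb = 2.2 * weight_kg          # computed once, from the current weight_kg

weight_kg = 100.0                    # rebinding weight_kg leaves weight_lb untouched

print('weight in kilograms is now: {0}'.format(weight_kg))
print('weight in pounds is still: {0}'.format(weight_lb))

# Nothing links the two variables, so the conversion must be repeated
# explicitly to get a value that reflects the new weight_kg.
weight_lb_updated = 2.2 * weight_kg  # hypothetical name, for illustration only
print('weight in pounds after recalculating: {0}'.format(weight_lb_updated))
```

The exact digits printed for the floating-point results may differ slightly between Python versions.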

Just as we can assign a single value to a variable, we can also assign an array of values to a variable using the same syntax. Let's re-run numpy.loadtxt and save its result:

data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')

This statement doesn't produce any output because assignment doesn't display anything. If we want to check that our data has been loaded, we can print the variable's value:

... ... @@ -87,7 +84,7 @@
[[ 0.  0.  1. ...,  3.  0.  0.]
[ 0.  1.  2. ...,  1.  0.  1.]
[ 0.  1.  1. ...,  2.  1.  1.]
...,
[ 0.  1.  1. ...,  1.  1.  1.]
[ 0.  0.  0. ...,  0.  2.  0.]
[ 0.  0.  1. ...,  1.  1.  0.]]
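
The load-then-print pattern can be tried without the lesson's data file. The sketch below is an illustration only: `csv_text` is a made-up two-by-three stand-in for `inflammation-01.csv`, wrapped in `io.StringIO` so that `numpy.loadtxt` can read it like a file. It uses the `print()` function form, which is an assumption about the Python version in use.

```python
import io
import numpy

# Hypothetical stand-in for a small CSV file:
# 2 patients (rows) x 3 days (columns).
csv_text = u"0,1,2\n3,4,5\n"

# numpy.loadtxt accepts an open file-like object as well as a file name.
# The assignment itself displays nothing.
data = numpy.loadtxt(fname=io.StringIO(csv_text), delimiter=',')

# Printing the variable is what shows the loaded values.
print(data)
print(data.shape)
```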
... ... @@ -173,8 +170,7 @@ standard deviation: 4.61383319712
print 'maximum inflammation for patient 2:', data[2, :].max()
maximum inflammation for patient 2: 19.0

What if we need the maximum inflammation for all patients, or the average for each day? As the diagram below shows, we want to perform the operation across an axis:

To support this, most array methods allow us to specify the axis we want to work on. If we ask for the average across axis 0, we get:

print data.mean(axis=0)
[  0.           0.45         1.11666667   1.75         2.43333333   3.15
3.8          3.88333333   5.23333333   5.51666667   5.95         5.9
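
The effect of the `axis` argument is easier to see on an array small enough to check by hand. The sketch below uses made-up numbers, not the inflammation data, and the names `mean_per_day` and `mean_per_patient` are hypothetical.

```python
import numpy

# Made-up data: 2 patients (rows) x 3 days (columns).
data = numpy.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])

# axis=0 collapses the rows: one average per day (per column).
mean_per_day = data.mean(axis=0)

# axis=1 collapses the columns: one average per patient (per row).
mean_per_patient = data.mean(axis=1)

print(mean_per_day)
print(mean_per_day.shape)
print(mean_per_patient)
print(mean_per_patient.shape)
```

The result along axis 0 has one entry per column and the result along axis 1 has one entry per row, which is a quick way to confirm that the intended axis was used.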
...  ...  @@ -200,44 +196,38 @@ standard deviation: 4.61383319712
from matplotlib import pyplot
pyplot.imshow(data)
pyplot.show()

Blue regions in this heat map are low values, while red shows high values. As we can see, inflammation rises and falls over a 40-day period. Let's take a look at the average inflammation over time:

ave_inflammation = data.mean(axis=0)
pyplot.plot(ave_inflammation)
pyplot.show()

Here, we have put the average per day across all patients in the variable ave_inflammation, then asked pyplot to create and display a line graph of those values. The result is roughly a linear rise and fall, which is suspicious: based on other studies, we expect a sharper rise and slower fall. Let's have a look at two other statistics:

pyplot.plot(data.max(axis=0))
pyplot.show()

pyplot.plot(data.min(axis=0))
pyplot.show()

The maximum value rises and falls perfectly smoothly, while the minimum seems to be a step function. Neither result seems particularly likely, so either there's a mistake in our calculations or something is wrong with our data.

-It's very common to create an alias for a library when importing it in order to reduce the amount of typing we have to do. Here are our three plots side by side using aliases for numpy and pyplot:
+Here are our three plots side by side:

-import numpy as np
-from matplotlib import pyplot as plt
+import numpy
+from matplotlib import pyplot

-data = np.loadtxt(fname='inflammation-01.csv', delimiter=',')
+data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')

-plt.figure(figsize=(10.0, 3.0))
+pyplot.figure(figsize=(10.0, 3.0))

-plt.subplot(1, 3, 1)
-plt.ylabel('average')
-plt.plot(data.mean(0))
+pyplot.subplot(1, 3, 1)
+pyplot.ylabel('average')
+pyplot.plot(data.mean(axis=0))

-plt.subplot(1, 3, 2)
-plt.ylabel('max')
-plt.plot(data.max(0))
+pyplot.subplot(1, 3, 2)
+pyplot.ylabel('max')
+pyplot.plot(data.max(axis=0))

-plt.subplot(1, 3, 3)
-plt.ylabel('min')
-plt.plot(data.min(0))
+pyplot.subplot(1, 3, 3)
+pyplot.ylabel('min')
+pyplot.plot(data.min(axis=0))

-plt.tight_layout()
-plt.show()
+pyplot.tight_layout()
+pyplot.show()

-The first two lines re-load our libraries as np and plt, which are the aliases most Python programmers use. The call to loadtxt reads our data, and the rest of the program tells the plotting library how large we want the figure to be, that we're creating three sub-plots, what to draw for each one, and that we want a tight layout. (Perversely, if we leave out that call to plt.tight_layout(), the graphs will actually be squeezed together more closely.)
+The call to loadtxt reads our data, and the rest of the program tells the plotting library how large we want the figure to be, that we're creating three sub-plots, what to draw for each one, and that we want a tight layout. (Perversely, if we leave out that call to pyplot.tight_layout(), the graphs will actually be squeezed together more closely.)
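
To see how the three arguments to `pyplot.subplot` place each panel, the same layout can be built from synthetic data and inspected afterwards. This is a sketch under stated assumptions: the array is made up, the `Agg` backend is selected only so the code runs without a display, and the names `fig`, `left`, `middle` and `right` are hypothetical.

```python
import matplotlib
matplotlib.use('Agg')   # assumption: non-interactive backend, no window needed
from matplotlib import pyplot
import numpy

# Made-up data: 4 patients (rows) x 6 days (columns).
data = numpy.arange(24.0).reshape(4, 6)

fig = pyplot.figure(figsize=(10.0, 3.0))

# subplot(rows, columns, index): index counts from 1, left to right.
left = pyplot.subplot(1, 3, 1)
pyplot.ylabel('average')
pyplot.plot(data.mean(axis=0))

middle = pyplot.subplot(1, 3, 2)
pyplot.ylabel('max')
pyplot.plot(data.max(axis=0))

right = pyplot.subplot(1, 3, 3)
pyplot.ylabel('min')
pyplot.plot(data.min(axis=0))

pyplot.tight_layout()

# Each call to subplot added one set of axes to the same figure.
print(len(fig.axes))
```

Each `pyplot.ylabel` and `pyplot.plot` call applies to the most recently created sub-plot, which is why the order of the statements matters.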

FIXME

Modify the program to display the three plots on top of one another instead of side by side.

... ...
... ... @@ -42,7 +42,7 @@ numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
     array([[ 0., 0., 1., ..., 3., 0., 0.],
     [ 0., 1., 2., ..., 1., 0., 1.],
     [ 0., 1., 1., ..., 2., 1., 1.],
    -...,
    +...,
     [ 0., 1., 1., ..., 1., 1., 1.],
     [ 0., 0., 0., ..., 0., 2., 0.],
     [ 0., 0., 1., ..., 1., 1., 0.]])
... ... @@ -124,7 +124,7 @@ we can print several things at once by separating them with commas.
     If we imagine the variable as a sticky note with a name written on it,
     assignment is like putting the sticky note on a particular value:

    -![Variables as Sticky Notes](fig/python-sticky-note-variables-01.svg)\
    +![Variables as Sticky Notes](fig/python-sticky-note-variables-01.svg)\

     This means that assigning a value to one variable does *not* change the values of other variables.
     For example,
... ... @@ -138,7 +138,7 @@ print 'weight in kilograms:', weight_kg, 'and in pounds:', weight_lb
     weight in kilograms: 57.5 and in pounds: 126.5
     ~~~

    -![Creating Another Variable](fig/python-sticky-note-variables-02.svg)\
    +![Creating Another Variable](fig/python-sticky-note-variables-02.svg)\

     and then change `weight_kg`:
... ... @@ -150,7 +150,7 @@ print 'weight in kilograms is now:', weight_kg, 'and weight in pounds is still:'
     weight in kilograms is now: 100.0 and weight in pounds is still: 126.5
     ~~~

    -![Updating a Variable](fig/python-sticky-note-variables-03.svg)\
    +![Updating a Variable](fig/python-sticky-note-variables-03.svg)\

     Since `weight_lb` doesn't "remember" where its value came from,
     it isn't automatically updated when `weight_kg` changes.
... ... @@ -174,7 +174,7 @@ print data
     [[ 0. 0. 1. ..., 3. 0. 0.]
     [ 0. 1. 2. ..., 1. 0. 1.]
     [ 0. 1. 1. ..., 2. 1. 1.]
    -...,
    +...,
     [ 0. 1. 1. ..., 1. 1. 1.]
     [ 0. 0. 0. ..., 0. 2. 0.]
     [ 0. 0. 1. ..., 1. 1. 0.]]
... ... @@ -249,7 +249,7 @@ the index is how many steps we have to take from the start to get the item we wa
     > rather than the lower left.
     > This is consistent with the way mathematicians draw matrices,
     > but different from the Cartesian coordinates.
    -> The indices are (row, column) instead of (column, row) for the same reason,
    +> The indices are (row, column) instead of (column, row) for the same reason,
     > which can be confusing when plotting data.

     An index like `[30, 20]` selects a single element of an array,
... ... @@ -434,7 +434,7 @@ or the average for each day?
     As the diagram below shows,
     we want to perform the operation across an axis:

    -![Operations Across Axes](fig/python-operations-across-axes.svg)\
    +![Operations Across Axes](fig/python-operations-across-axes.svg)\

     To support this,
     most array methods allow us to specify the axis we want to work on.
... ... @@ -500,7 +500,7 @@ pyplot.imshow(data)
     pyplot.show()
     ~~~

    -![Heatmap of the Data](fig/01-numpy_71_0.png)\
    +![Heatmap of the Data](fig/01-numpy_71_0.png)\

     Blue regions in this heat map are low values, while red shows high values.
     As we can see,
... ... @@ -513,7 +513,7 @@ pyplot.plot(ave_inflammation)
     pyplot.show()
     ~~~

    -![Average Inflammation Over Time](fig/01-numpy_73_0.png)\
    +![Average Inflammation Over Time](fig/01-numpy_73_0.png)\

     Here, we have put the average per day across all patients in the variable `ave_inflammation`,
... ... @@ -529,14 +529,14 @@ pyplot.plot(data.max(axis=0))
     pyplot.show()
     ~~~

    -![Maximum Value Along The First Axis](fig/01-numpy_75_1.png)\
    +![Maximum Value Along The First Axis](fig/01-numpy_75_1.png)\

     ~~~ {.python}
     pyplot.plot(data.min(axis=0))
     pyplot.show()
     ~~~

    -![Minimum Value Along The First Axis](fig/01-numpy_75_3.png)\
    +![Minimum Value Along The First Axis](fig/01-numpy_75_3.png)\

     The maximum value rises and falls perfectly smoothly,
     while the minimum seems to be a step function.
... ... @@ -544,38 +544,34 @@ Neither result seems particularly likely,
     so either there's a mistake in our calculations
     or something is wrong with our data.

    -It's very common to create an **alias** for a library when importing it
    -in order to reduce the amount of typing we have to do.
    -Here are our three plots side by side using aliases for `numpy` and `pyplot`:
    +Here are our three plots side by side:

     ~~~ {.python}
    -import numpy as np
    -from matplotlib import pyplot as plt
    +import numpy
    +from matplotlib import pyplot

    -data = np.loadtxt(fname='inflammation-01.csv', delimiter=',')
    +data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')

    -plt.figure(figsize=(10.0, 3.0))
    +pyplot.figure(figsize=(10.0, 3.0))

    -plt.subplot(1, 3, 1)
    -plt.ylabel('average')
    -plt.plot(data.mean(0))
    +pyplot.subplot(1, 3, 1)
    +pyplot.ylabel('average')
    +pyplot.plot(data.mean(axis=0))

    -plt.subplot(1, 3, 2)
    -plt.ylabel('max')
    -plt.plot(data.max(0))
    +pyplot.subplot(1, 3, 2)
    +pyplot.ylabel('max')
    +pyplot.plot(data.max(axis=0))

    -plt.subplot(1, 3, 3)
    -plt.ylabel('min')
    -plt.plot(data.min(0))
    +pyplot.subplot(1, 3, 3)
    +pyplot.ylabel('min')
    +pyplot.plot(data.min(axis=0))

    -plt.tight_layout()
    -plt.show()
    +pyplot.tight_layout()
    +pyplot.show()
     ~~~

    -![The Previous Plots as Subplots](fig/01-numpy_80_0.png)\
    +![The Previous Plots as Subplots](fig/01-numpy_80_0.png)\

    -The first two lines re-load our libraries as `np` and `plt`,
    -which are the aliases most Python programmers use.
     The call to `loadtxt` reads our data,
     and the rest of the program tells the plotting library
     how large we want the figure to be,
... ... @@ -583,7 +579,7 @@ that we're creating three sub-plots,
     what to draw for each one,
     and that we want a tight layout.
     (Perversely,
    -if we leave out that call to `plt.tight_layout()`,
    +if we leave out that call to `pyplot.tight_layout()`,
     the graphs will actually be squeezed together more closely.)

     > ## FIXME {.challenge}
... ... @@ -593,7 +589,7 @@ the graphs will actually be squeezed together more closely.)
     > ## FIXME {.challenge}
     >
     > Draw diagrams showing what variables refer to what values after each statement in the following program:
    ->
    +>
     > ~~~ {.python}
     > mass = 47.5
     > age = 122
... ... @@ -604,7 +600,7 @@ the graphs will actually be squeezed together more closely.)
     > ## FIXME {.challenge}
     >
     > What does the following program print out?
    ->
    +>
     > ~~~ {.python}
     > first, second = 'Grace', 'Hopper'
     > third, fourth = second, first
... ... @@ -615,22 +611,22 @@ the graphs will actually be squeezed together more closely.)
     >
     > A section of an array is called a **slice**.
     > We can take slices of character strings as well:
    ->
    +>
     > ~~~ {.python}
     > element = 'oxygen'
     > print 'first three characters:', element[0:3]
     > print 'last three characters:', element[3:6]
     > ~~~
    ->
    +>
     > ~~~ {.output}
     > first three characters: oxy
     > last three characters: gen
     > ~~~
    ->
    +>
     > What is the value of `element[:4]`?
     > What about `element[4:]`?
     > Or `element[:]`?
    ->
    +>
     > What is `element[-1]`?
     > What is `element[-2]`?
     > Given those answers,
... ...