| $x$ | $y$ | State |
|---|---|---|
| 30.34 | 3.46 | AK |
| 18.20 | 2.90 | AL |
| 18.24 | 2.99 | AR |
| 25.82 | 3.52 | AZ |
| 28.60 | 4.46 | CA |
| 31.10 | 5.11 | CT |
| 40.46 | 5.60 | DC |
| 33.60 | 4.78 | DE |
| 28.27 | 4.46 | FL |
| 22.12 | 4.23 | IA |
| 20.10 | 3.08 | ID |
| 27.91 | 4.75 | IL |
| 26.18 | 4.09 | IN |
| 21.84 | 2.91 | KS |
| 23.44 | 2.86 | KY |
| 21.58 | 4.65 | LA |
| 26.92 | 4.69 | MA |
| 25.91 | 5.21 | MD |
| 28.92 | 4.79 | ME |
| 24.96 | 5.27 | MI |
| 22.06 | 3.72 | MN |
| 27.56 | 4.04 | MO |
| 16.08 | 3.06 | MS |
| 23.75 | 3.95 | MT |
| 19.96 | 2.89 | ND |
| 23.32 | 3.72 | NE |
| 28.64 | 5.98 | NJ |
| 21.16 | 2.90 | NM |
| 42.40 | 6.54 | NV |
| 29.14 | 5.30 | NY |
| 26.38 | 4.47 | OH |
| 23.44 | 2.93 | OK |
| 23.78 | 4.89 | PA |
| 29.18 | 4.99 | RI |
| 18.06 | 3.25 | SC |
| 20.94 | 3.64 | SD |
| 20.08 | 2.94 | TN |
| 22.57 | 3.21 | TX |
| 14.00 | 3.31 | UT |
| 25.89 | 4.63 | VT |
| 21.17 | 4.04 | WA |
| 21.25 | 5.14 | WI |
| 22.86 | 4.78 | WV |
| 28.04 | 3.20 | WY |

Question 1. The numbers in the $x$ column of this table tell us how
many hundreds of cigarettes were sold per person in each state during 1960. For example,
in Alabama (AL), the number in the $x$ column is 18.2.
This means that for each person living in Alabama in 1960, 1,820 cigarettes
were sold (that is, 18.2 hundreds).

Question 2. The numbers in the $y$ column tell us how
many people died of bladder cancer for each 100,000 people living in the state.
For example, in Alabama (AL), for each 100,000 people, 2.9 people died of bladder cancer.
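
The unit conversions in the two explanations above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names and the helper `expected_deaths` are made up for this example, and the population of 1,000,000 is a hypothetical round number, not a figure from the table.

```python
# Alabama's row from the table: x = 18.20, y = 2.90.
x_al = 18.20  # hundreds of cigarettes sold per person in 1960
y_al = 2.90   # bladder cancer deaths per 100,000 people

# x counts hundreds, so multiply by 100 to get individual cigarettes.
cigarettes_per_person = x_al * 100


def expected_deaths(rate_per_100k, population):
    """Scale a per-100,000 rate to a population of the given size (hypothetical helper)."""
    return rate_per_100k * population / 100_000


print(round(cigarettes_per_person))                # 1820
print(round(expected_deaths(y_al, 1_000_000), 1))  # 29.0
```
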
Finding the "best fit" line
Just by looking at the table, it's hard to tell whether there is a relationship between the
number of cigarettes sold in each state and the number of deaths due to bladder cancer.
Another way to look at the data in the table is to graph it. The graph to the left shows the
points that represent the data.
When the points on a grid do not all lie on a straight line but
still follow a roughly linear pattern, you can look for the line that
is the "best fit" for (closest to) the pattern.
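
One common way to make "closest" precise is ordinary least squares, which picks the line with the smallest total squared vertical distance from the points. The lesson does not say which method it uses, so treat that choice as an assumption. The sketch below applies the standard least-squares formulas to the table's data; the names `DATA`, `least_squares_line`, and `sum_squared_error` are illustrative.

```python
# (state, x, y) rows copied from the table above.
# x = hundreds of cigarettes sold per person in 1960
# y = bladder cancer deaths per 100,000 people
DATA = [
    ("AK", 30.34, 3.46), ("AL", 18.20, 2.90), ("AR", 18.24, 2.99), ("AZ", 25.82, 3.52),
    ("CA", 28.60, 4.46), ("CT", 31.10, 5.11), ("DC", 40.46, 5.60), ("DE", 33.60, 4.78),
    ("FL", 28.27, 4.46), ("IA", 22.12, 4.23), ("ID", 20.10, 3.08), ("IL", 27.91, 4.75),
    ("IN", 26.18, 4.09), ("KS", 21.84, 2.91), ("KY", 23.44, 2.86), ("LA", 21.58, 4.65),
    ("MA", 26.92, 4.69), ("MD", 25.91, 5.21), ("ME", 28.92, 4.79), ("MI", 24.96, 5.27),
    ("MN", 22.06, 3.72), ("MO", 27.56, 4.04), ("MS", 16.08, 3.06), ("MT", 23.75, 3.95),
    ("ND", 19.96, 2.89), ("NE", 23.32, 3.72), ("NJ", 28.64, 5.98), ("NM", 21.16, 2.90),
    ("NV", 42.40, 6.54), ("NY", 29.14, 5.30), ("OH", 26.38, 4.47), ("OK", 23.44, 2.93),
    ("PA", 23.78, 4.89), ("RI", 29.18, 4.99), ("SC", 18.06, 3.25), ("SD", 20.94, 3.64),
    ("TN", 20.08, 2.94), ("TX", 22.57, 3.21), ("UT", 14.00, 3.31), ("VT", 25.89, 4.63),
    ("WA", 21.17, 4.04), ("WI", 21.25, 5.14), ("WV", 22.86, 4.78), ("WY", 28.04, 3.20),
]


def least_squares_line(points):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(points)
    mean_x = sum(x for _, x, _ in points) / n
    mean_y = sum(y for _, _, y in points) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for _, x, y in points)
    sxx = sum((x - mean_x) ** 2 for _, x, _ in points)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_error(points, slope, intercept):
    """Total squared vertical distance from the points to y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for _, x, y in points)


slope, intercept = least_squares_line(DATA)
print(f"least-squares line: y = {slope:.3f}x + {intercept:.3f}")
```

Because least squares minimizes the total squared distance, the line it returns is at least as close to the points, by that measure, as any other line you could draw.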
The number 2.981, shown to the right of the graph, is called the "root-mean-square error"
for the graphed line ($y=0.05x$).
This number tells you how far away the line is from the
points: square each point's vertical distance from the line, average the squares,
and take the square root. A smaller number means the line is a
better "fit" to the points.
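
To make the calculation concrete, here is a minimal sketch that computes the root-mean-square error of $y=0.05x$ from the table's data. It assumes distances are measured vertically and the squares are averaged over all 44 points, which is the usual definition; the names `DATA` and `rmse` are illustrative.

```python
import math

# (state, x, y) rows copied from the table above.
# x = hundreds of cigarettes sold per person in 1960
# y = bladder cancer deaths per 100,000 people
DATA = [
    ("AK", 30.34, 3.46), ("AL", 18.20, 2.90), ("AR", 18.24, 2.99), ("AZ", 25.82, 3.52),
    ("CA", 28.60, 4.46), ("CT", 31.10, 5.11), ("DC", 40.46, 5.60), ("DE", 33.60, 4.78),
    ("FL", 28.27, 4.46), ("IA", 22.12, 4.23), ("ID", 20.10, 3.08), ("IL", 27.91, 4.75),
    ("IN", 26.18, 4.09), ("KS", 21.84, 2.91), ("KY", 23.44, 2.86), ("LA", 21.58, 4.65),
    ("MA", 26.92, 4.69), ("MD", 25.91, 5.21), ("ME", 28.92, 4.79), ("MI", 24.96, 5.27),
    ("MN", 22.06, 3.72), ("MO", 27.56, 4.04), ("MS", 16.08, 3.06), ("MT", 23.75, 3.95),
    ("ND", 19.96, 2.89), ("NE", 23.32, 3.72), ("NJ", 28.64, 5.98), ("NM", 21.16, 2.90),
    ("NV", 42.40, 6.54), ("NY", 29.14, 5.30), ("OH", 26.38, 4.47), ("OK", 23.44, 2.93),
    ("PA", 23.78, 4.89), ("RI", 29.18, 4.99), ("SC", 18.06, 3.25), ("SD", 20.94, 3.64),
    ("TN", 20.08, 2.94), ("TX", 22.57, 3.21), ("UT", 14.00, 3.31), ("VT", 25.89, 4.63),
    ("WA", 21.17, 4.04), ("WI", 21.25, 5.14), ("WV", 22.86, 4.78), ("WY", 28.04, 3.20),
]


def rmse(points, slope, intercept=0.0):
    """Root-mean-square vertical distance from the points to y = slope*x + intercept."""
    squared_errors = [(y - (slope * x + intercept)) ** 2 for _, x, y in points]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


error = rmse(DATA, slope=0.05)
# Compare the printed value with the 2.981 quoted for y = 0.05x.
print(f"RMSE for y = 0.05x: {error:.3f}")
```

Trying other slopes, for example `rmse(DATA, slope=0.1)`, shows how the number changes as the line moves toward or away from the points.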