Part A: Latitude vs. Daylight
In this part, you will examine data collected in the Global Sun 
Temperature project to see if there is any relationship between proximity
to the equator (latitude) and amount of daylight at that location. The
data that you will be using was collected during a specific week in
November, 2000. During that week, students recorded the sunrise and sunset
times and determined the number of minutes of daylight (or sunlight) each
day. They found the average minutes of daylight for the week and submitted
it to the project along with their latitude and longitude.
1. Review the data that was submitted to the project:
Fall 2000 Data. Note that it is sorted by
latitude. What do negative latitudes represent? Positive latitudes?
2. Examine a scatter plot of the data. Describe, in your own words, the
trend you see in the data. Does it make sense? Explain why or why not.
3. What are the appropriate domain and range for this data? Explain your
reasoning.
4. Below are three functions that have been fit to the data: a linear
function, a 2nd degree polynomial (quadratic), and a 3rd degree polynomial (cubic).
The equations for each of the functions are given along with the
coefficient of determination (r^{2}). In each case, the functions
have been extrapolated to ±90° latitude.
Linear Model
2nd Degree Polynomial Model
3rd Degree Polynomial Model
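For readers who want to reproduce fits like the three above, the following is a minimal sketch using NumPy's least-squares polynomial fitting. The latitude and daylight values are hypothetical, chosen only for illustration; they are not the project's Fall 2000 data, and the variable names are assumptions.

```python
import numpy as np

# Hypothetical (latitude in degrees, average daylight in minutes) pairs.
# Illustrative values only, NOT the project's Fall 2000 data.
latitude = np.array([-35.0, -20.0, -5.0, 10.0, 25.0, 35.0, 42.0, 50.0, 60.0])
daylight = np.array([845.0, 790.0, 740.0, 700.0, 650.0, 610.0, 580.0, 540.0, 460.0])

# Fit polynomials of degree 1 (linear), 2 (quadratic) and 3 (cubic)
# by least squares and wrap each one as a callable model.
models = {deg: np.poly1d(np.polyfit(latitude, daylight, deg)) for deg in (1, 2, 3)}

# Extrapolate each fitted model to the ends of the latitude scale.
for deg, model in models.items():
    print(f"degree {deg}: -90 -> {model(-90.0):.0f} min, +90 -> {model(90.0):.0f} min")
```

Comparing the extrapolated values at ±90° is one quick way to see how differently the models behave outside the range of the observed latitudes.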
QUESTION: Which one is the most
appropriate model for this data?
Things to consider: 
Explanation for above models: 
Coefficient of Determination (r^{2})
The coefficient of determination indicates the proportion of the variation
in the data that is explained by the model. In other words, it tells how
well the model fits the data. The closer r^{2} is to 1, the better the fitted
model explains the data. However, it is not the only measure of a good
model.

The coefficients of determination for all three of the
models above are very high and do not vary much among the
models. In this case, the coefficient of determination may not be the
best way to select the most appropriate model.
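As a rough illustration of how the coefficient of determination can be computed, here is a minimal sketch based on the standard definition r^{2} = 1 - SS_res / SS_tot. The helper name r_squared and the sample values are assumptions made for this example, not part of the project.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)        # variation left unexplained
    ss_tot = np.sum((observed - observed.mean()) ** 2)  # total variation in the data
    return 1.0 - ss_res / ss_tot

# Hypothetical values chosen so the results are easy to check by hand.
observed = [2.0, 4.0, 6.0, 8.0]
perfect = [2.0, 4.0, 6.0, 8.0]      # predictions that match every observation
mean_only = [5.0, 5.0, 5.0, 5.0]    # predictions that are just the mean

print(r_squared(observed, perfect))    # 1.0
print(r_squared(observed, mean_only))  # 0.0
```

A model that predicts every point exactly scores 1, while a model that only predicts the mean of the data scores 0.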
Residuals
Residuals (the model's prediction errors) help to determine whether the
model is a good fit to the data. A scatter plot of residuals vs. the
independent variable that shows residuals scattered uniformly close to the
x-axis indicates a good fit to the data. However, if the residuals
form any type of increasing or decreasing pattern, then it may indicate
that the model is not a good fit to the data.
Residual = observed value - predicted value
The residuals for each of the models are shown below.
Residuals for Linear Model
Residuals for 2nd Degree Polynomial
Residuals for 3rd Degree Polynomial

The residuals for the linear model look slightly more
uniform about the x-axis than those of the other two models.
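The residual calculation described in this section can be sketched as follows for a linear model. The data values are hypothetical and are used only to show the steps; they are not the project's data.

```python
import numpy as np

# Hypothetical (latitude, average daylight in minutes) pairs.
# Illustrative values only, NOT the project's Fall 2000 data.
latitude = np.array([-35.0, -20.0, -5.0, 10.0, 25.0, 35.0, 42.0, 50.0, 60.0])
daylight = np.array([845.0, 790.0, 740.0, 700.0, 650.0, 610.0, 580.0, 540.0, 460.0])

# Fit a straight line by least squares and compute its predictions.
slope, intercept = np.polyfit(latitude, daylight, 1)
predicted = slope * latitude + intercept

# Residual = observed value - predicted value
residuals = daylight - predicted

for lat, res in zip(latitude, residuals):
    print(f"latitude {lat:6.1f}: residual {res:+7.1f} min")
```

Plotting these residuals against latitude, or simply scanning their signs from south to north, shows whether they scatter evenly about zero or follow a pattern.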
Which Model Supports the Theory the Best?
Examine each of the models. Do any of the models seem inappropriate
for the domain and range of data? Do any seem inappropriate when the data is
extrapolated? Do any data seem to be outliers? Might there be a
reasonable explanation for the outliers? Which model provides a
reasonable indication of what you would expect to see? 
The 2nd degree polynomial model is a parabola.
Although the curve fits the data over the domain and range of the data
given, it also indicates that the amount of daylight reaches a maximum
at southern latitudes and then decreases at latitudes farther south.
This model does not appear to fit what we know to be true. Both the
linear and 3rd degree polynomial models indicate that the amount of
daylight continues to increase toward more southern latitudes, which is
what is expected in November.
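To see why a parabola can behave this way when extrapolated, the sketch below locates the turning point (vertex) of a quadratic model. The coefficients are hypothetical, picked only to illustrate the idea; they are not the coefficients of the fitted model shown above.

```python
import numpy as np

# Hypothetical quadratic model: daylight = a*lat^2 + b*lat + c (minutes).
# These coefficients are assumptions for illustration, not the fitted values.
a, b, c = -0.02, -3.0, 720.0
quadratic = np.poly1d([a, b, c])

# A parabola turns around at its vertex, lat = -b / (2a).
vertex_latitude = -b / (2.0 * a)

print(f"vertex at latitude {vertex_latitude:.1f}")
print(f"daylight at vertex: {quadratic(vertex_latitude):.1f} min")
print(f"daylight at -90:    {quadratic(-90.0):.1f} min")
```

With these assumed coefficients the vertex falls inside the ±90° range, so the model predicts less daylight at the South Pole than at the vertex latitude, which is the kind of behavior the text above identifies as unrealistic.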
All things being equal, which is the simplest model that explains the trend
in the data? Ockham's Razor is a maxim that suggests choosing a simpler
model over a more sophisticated one if all else appears equal. You can find
out more about Ockham's Razor in the Supplementary Resources section.
The linear model in this example is the simplest of
the models given. 
5. Which model do YOU think is best for the given data? Justify your
decision.
6. What would you expect the model to look like in May of any given
year? Explain. Explore
archived data for this project and pick 5 to 10 data points that
either support or refute your predicted model. Pick points at a variety of
latitudes to ensure that the points represent the model in its entirety.
