Statistics project
This project will involve the gathering of samples and the use of both descriptive and inferential statistics. It will be due by December 17th. All work must be shown and the write-up must be typed (mathematical formulations may be handwritten). You may use any software with statistical applications (Excel, StatCrunch, Minitab, etc.), a calculator, or any other statistical tool for your analysis work (there are some online tools that can calculate confidence intervals and hypothesis tests). You may also use the textbook and any other written resource to aid you in your work (websites, etc.). The work on this project must be your own. Seeking assistance from anyone else (tutor, classmate, etc.) is strictly forbidden (you are on your honor). I will be available to answer any questions you may have about this project for clarity purposes.
Project Concept
This project will focus on analyzing weather data for two cities in upstate New York (Binghamton and Syracuse). The data is located on the NOAA website. The project will be in three parts:
Part 1: Perform a descriptive and numerical analysis of a sample of average daily temperatures for the city of Syracuse, NY.
Part 2: Create confidence intervals to estimate the average monthly precipitation for both Syracuse and Binghamton, NY.
Part 3: Using a two-sample dependent (matched pairs) design, determine if there exists a difference in average winter monthly snowfall between the two cities.
The NOAA website contains the data needed to perform the analysis. Instructions are provided on how to get to the website and navigate to where the data can be found. For all three parts, please restrict yourself to the period from January 1965 to December 2012.
Part 1: Descriptive and Numerical Analysis for Syracuse, NY (1965 to 2012)
For this part of the project you need to gather the daily average temperature for 50 different days from January 1965 to December 2012 for Syracuse, NY. At the end of this project document I will provide some details of what can be used to gather this sample properly.
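As a rough illustration of what drawing 50 different days from this period can look like, here is a minimal Python sketch. It is only one possible approach and uses Python's pseudorandom generator rather than any particular website. The names `START`, `END`, and `sample_days` are made up for this example.

```python
import random
from datetime import date, timedelta

# Date range stated in the project description.
START = date(1965, 1, 1)
END = date(2012, 12, 31)

def sample_days(k=50, seed=None):
    """Draw k distinct calendar days between START and END (inclusive)."""
    rng = random.Random(seed)  # seed is optional; fix it to reproduce a sample
    total_days = (END - START).days + 1
    # rng.sample draws without replacement, so no day can repeat.
    offsets = rng.sample(range(total_days), k)
    return sorted(START + timedelta(days=offset) for offset in offsets)

if __name__ == "__main__":
    for day in sample_days():
        print(day.isoformat())
```

Because the draw is made without replacement over every day in the range, all days of the week are included and no repeats need to be discarded.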
Part 2: Confidence Interval Estimates for Monthly Precipitation for Binghamton and Syracuse, NY (1965 to 2012)
For this part of the project you need to gather the total monthly precipitation for 30 different months from January 1965 to December 2012 for Binghamton and Syracuse, NY. You must gather a different simple random sample (SRS) of 30 months for each city (one set of 30 months for Binghamton and a separate set of 30 months for Syracuse). At the end of this project document I will provide some details of what can be used to gather this sample properly. I have created a chart (called Part 2 Data Sheet) that will help you organize this data.
Part 3: Two-Sample Dependent (Matched Pairs) Design for Winter Month Snowfall (1965 to 2012)
For the last part of the project you need to gather a matched-pairs sample of total monthly snowfall for 30 different “winter” months from January 1965 to December 2012 for Binghamton and Syracuse. Winter months will be defined as December, January, February and March. This sample differs from the other parts of the project in that you only need to obtain one SRS of 30 different winter months. Once that sample is created, you need to get the monthly snowfall totals for both Binghamton and Syracuse for each sampled month (e.g., if one of your months is February 1992, then you need to gather the snowfall totals for both Binghamton and Syracuse for February 1992). I have created a chart (called Part 3 Data Sheet) that will help you organize this data.
Statistical Analysis
Part 1: Descriptive and Numerical Analysis
Using the sample of average daily temperatures gathered for Syracuse, NY, please conduct the following descriptive and numerical analysis.
 Stem-and-Leaf Plot
 Frequency Table using between 6 and 10 classes (your choice)
 Frequency Histogram based on the frequency table you created
 Five Number Summary and Boxplot (including fences)
 Mean, Median, and Mode
 Standard Deviation
*Note: you do not need to calculate the mean and standard deviation by hand (you may use the statistical features of your calculator or computer instead).
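For anyone using software for the numerical items above, the following Python sketch shows one way they might be computed. It is an illustration, not a required method. The function names `describe` and `frequency_table` are invented here, the fences use the common 1.5 × IQR rule, and the quartile convention (`method="inclusive"`) is an assumption that may differ from your textbook or calculator.

```python
import statistics

def describe(temps):
    """Five-number summary, fences, mean, mode(s), and sample standard deviation."""
    data = sorted(temps)
    # quantiles(n=4) returns the three quartile cut points Q1, Q2 (median), Q3.
    # The "inclusive" method is one of several conventions (assumption).
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "min": data[0],
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": data[-1],
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
        "mean": statistics.mean(data),
        "modes": statistics.multimode(data),  # list, since ties are possible
        "stdev": statistics.stdev(data),      # sample standard deviation (n - 1)
    }

def frequency_table(temps, num_classes=8):
    """Count values in num_classes equal-width classes spanning min to max."""
    low, high = min(temps), max(temps)
    if low == high:
        raise ValueError("all values are identical; classes would have zero width")
    width = (high - low) / num_classes
    counts = [0] * num_classes
    for t in temps:
        # The maximum value would fall just past the last class, so clamp it.
        index = min(int((t - low) / width), num_classes - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, counts[i])
            for i in range(num_classes)]
```

The class boundaries produced by `frequency_table` are raw equal-width cut points. In a write-up you would likely round them to convenient values and choose between 6 and 10 classes.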
Part 2: Confidence Interval Estimator
Using the Part 2 data gathered please create a 95% confidence interval estimator for the population average monthly precipitation for Binghamton and Syracuse, NY. In other words, you will create two different 95% confidence intervals, one for Binghamton and another for Syracuse.
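The sketch below shows how one such interval might be computed in Python, assuming a t-based interval (sample mean ± t × s/√n) is the intended method. The critical value 2.045 is the standard table value for 29 degrees of freedom, so it applies only when the sample has exactly 30 months. The names `T_CRIT_DF29` and `t_confidence_interval` are invented for this illustration.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 29 degrees of freedom,
# taken from a standard t table. Valid only for a sample of size 30.
T_CRIT_DF29 = 2.045

def t_confidence_interval(sample, t_crit=T_CRIT_DF29):
    """Return (lower, upper) bounds of a t-based confidence interval for the mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error uses the sample standard deviation (n - 1 in the denominator).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = t_crit * standard_error
    return mean - margin, mean + margin
```

The function would be called once with the 30 Binghamton precipitation totals and once with the 30 Syracuse totals, giving the two separate intervals.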
Part 3: Two-Sample Dependent (Matched Pairs) Design Hypothesis Test
Using the Part 3 data gathered, please run a dependent (matched pairs) hypothesis test for the population mean difference in monthly snowfall at the 5% level of significance. Please be sure to show all four steps of the hypothesis test (you may use the p-value approach for this if you wish). Note: you do not need to show me the work for the mean and standard deviation calculations since you can use the statistical functions of your calculator for them.
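As an illustration of the arithmetic behind a matched pairs test, the Python sketch below computes the differences, their mean and standard deviation, and the test statistic t = d̄ / (s_d / √n). It assumes a two-sided test of "mean difference = 0" and uses the critical-value approach with the table value 2.045 (29 degrees of freedom, so exactly 30 pairs). The function names are invented for this example, and it does not replace showing the steps of the test.

```python
import math
import statistics

# Two-sided 5% critical value of Student's t with 29 degrees of freedom,
# from a standard t table. Valid only for 30 matched pairs.
T_CRIT_DF29 = 2.045

def paired_t_statistic(syracuse, binghamton):
    """Return (mean difference, sd of differences, t statistic) for matched pairs."""
    if len(syracuse) != len(binghamton):
        raise ValueError("matched pairs require two lists of equal length")
    # Difference is taken as Syracuse minus Binghamton for each sampled month.
    diffs = [s - b for s, b in zip(syracuse, binghamton)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation of the differences
    t_stat = d_bar / (s_d / math.sqrt(n))
    return d_bar, s_d, t_stat

def reject_null(t_stat, t_crit=T_CRIT_DF29):
    """Two-sided decision: reject 'mean difference = 0' when |t| exceeds t_crit."""
    return abs(t_stat) > t_crit
```

The order of subtraction only affects the sign of the statistic. What matters is that the same order is used for every pair and stated in the write-up.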
Project Summary
For the final part of the project, please answer the following questions.
a) Using the Descriptive Analysis from part 1, does it appear that the population of average daily temperatures for Syracuse, NY is normally distributed? Why or why not?
b) Using the confidence intervals from part 2, is it possible that the population means of monthly precipitation for Binghamton and Syracuse, NY are the same? Why or why not?
c) Using the dependent hypothesis test from part 3, what were you able to conclude about the difference of monthly snowfall between Binghamton and Syracuse (i.e. did one of the cities appear to have a higher average monthly snowfall total?)
d) The sample sizes for parts 2 and 3 of the analysis were set at 30 (hint: it had to be at least 30). Why is this important?
e) In the past students have tried to use the random date generator from random.org to gather their part 2 data by ignoring the day and using just the month and year (i.e. if they got a date of 3/20/1977, they would throw out the 20 and use just 3/1977 as their random month). Does this process actually result in a true simple random sample? Why or why not?
Write Up Details
When putting the project together for submission please make sure the materials are in the following order:
 Cover Page (with your name and date)
 The raw data you gathered (for parts 1, 2, and 3 of the project)
 Part 1 Analysis
 Part 2 Analysis
 Part 3 Analysis
 Project Summary Questions
Please be sure to type all that you can for this project. If hand calculations need to be shown please be sure they are clear and legible.
Random Sample Resources
 For part 1 of the project I highly recommend using the calendar date generator found on the random.org website. The address is
http://www.random.org/calendardates/
Be sure to include all days of the week when using this feature (it defaults to just weekdays).
 For parts 2 and 3 you can use random.org to generate lists of integers. The address is
http://www.random.org/integers/
For the part 2 months, generate two separate lists of at least 30 values: one that uses the integers 1 to 12 (for the months) and another that uses the integers 1965 to 2011 (for the years). You can then merge them together for your month/year combo.
For the part 3 months, use the same process as for part 2, except use the integers 0 to 3 for your winter months:
0 = December, 1 = January, 2 = February, 3 = March
I highly recommend gathering more than 30 random months for parts 2 and 3. So if you get a repeat month/year you can ignore it and move to the next month/year for your sample.
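For comparison, here is a minimal Python sketch of another way to pick months: list every month/year pair in the period and draw 30 of them without replacement, so repeats cannot occur. This is an alternative shown for illustration, not the procedure described above. It uses Python's pseudorandom generator, and the names `all_months`, `sample_months`, and `WINTER_MONTHS` are made up for this example.

```python
import random

# Winter months as defined in the project: December, January, February, March.
WINTER_MONTHS = (12, 1, 2, 3)

def all_months(first_year=1965, last_year=2012, months=range(1, 13)):
    """List every (year, month) pair in the period for the given months."""
    return [(year, month)
            for year in range(first_year, last_year + 1)
            for month in months]

def sample_months(population, k=30, seed=None):
    """Draw k distinct (year, month) pairs without replacement."""
    rng = random.Random(seed)  # seed is optional; fix it to reproduce a sample
    return rng.sample(population, k)

if __name__ == "__main__":
    # Part 2: a separate sample of 30 months for each city.
    binghamton_months = sample_months(all_months())
    syracuse_months = sample_months(all_months())
    # Part 3: one sample of 30 winter months shared by both cities.
    winter_sample = sample_months(all_months(months=WINTER_MONTHS))
    print(sorted(winter_sample))
```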
Part 1 Data Sheet
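| Number | Date | Average Temp | Number | Date | Average Temp |
| 1 | | | 31 | | |
| 2 | | | 32 | | |
| 3 | | | 33 | | |
| 4 | | | 34 | | |
| 5 | | | 35 | | |
| 6 | | | 36 | | |
| 7 | | | 37 | | |
| 8 | | | 38 | | |
| 9 | | | 39 | | |
| 10 | | | 40 | | |
| 11 | | | 41 | | |
| 12 | | | 42 | | |
| 13 | | | 43 | | |
| 14 | | | 44 | | |
| 15 | | | 45 | | |
| 16 | | | 46 | | |
| 17 | | | 47 | | |
| 18 | | | 48 | | |
| 19 | | | 49 | | |
| 20 | | | 50 | | |
| 21 | | | | | |
| 22 | | | | | |
| 23 | | | | | |
| 24 | | | | | |
| 25 | | | | | |
| 26 | | | | | |
| 27 | | | | | |
| 28 | | | | | |
| 29 | | | | | |
| 30 | | | | | |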
Part 2 Data Sheet
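| Binghamton | | | Syracuse | | |
| Number | Month | Total Precipitation | Number | Month | Total Precipitation |
| 1 | | | 1 | | |
| 2 | | | 2 | | |
| 3 | | | 3 | | |
| 4 | | | 4 | | |
| 5 | | | 5 | | |
| 6 | | | 6 | | |
| 7 | | | 7 | | |
| 8 | | | 8 | | |
| 9 | | | 9 | | |
| 10 | | | 10 | | |
| 11 | | | 11 | | |
| 12 | | | 12 | | |
| 13 | | | 13 | | |
| 14 | | | 14 | | |
| 15 | | | 15 | | |
| 16 | | | 16 | | |
| 17 | | | 17 | | |
| 18 | | | 18 | | |
| 19 | | | 19 | | |
| 20 | | | 20 | | |
| 21 | | | 21 | | |
| 22 | | | 22 | | |
| 23 | | | 23 | | |
| 24 | | | 24 | | |
| 25 | | | 25 | | |
| 26 | | | 26 | | |
| 27 | | | 27 | | |
| 28 | | | 28 | | |
| 29 | | | 29 | | |
| 30 | | | 30 | | |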
Part 3 Data Sheet
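| Number | Month | Total Monthly Snowfall for Syracuse | Total Monthly Snowfall for Binghamton | Difference |
| 1 | | | | |
| 2 | | | | |
| 3 | | | | |
| 4 | | | | |
| 5 | | | | |
| 6 | | | | |
| 7 | | | | |
| 8 | | | | |
| 9 | | | | |
| 10 | | | | |
| 11 | | | | |
| 12 | | | | |
| 13 | | | | |
| 14 | | | | |
| 15 | | | | |
| 16 | | | | |
| 17 | | | | |
| 18 | | | | |
| 19 | | | | |
| 20 | | | | |
| 21 | | | | |
| 22 | | | | |
| 23 | | | | |
| 24 | | | | |
| 25 | | | | |
| 26 | | | | |
| 27 | | | | |
| 28 | | | | |
| 29 | | | | |
| 30 | | | | |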
Finding Data on the NOAA Website (Part 1)
To find the data required for the project:
 Go to website location: www.erh.noaa.gov/bgm/
 In the left panel under Climate choose Local.
 Above Observed Weather Reports choose Local Data/Records.
 Under Climate Data choose Local Records and Averages.
 Above Local Climate Information choose your city (Binghamton, Scranton or Syracuse)
 Use Past Preliminary Climatology Data (CF6). There will be two drop-down menus (one for month and one for year). Select your month/year and then choose Get Data.
This should get you to the web page with weather information. Please read Part 2 to learn how to find the data needed on the page.
There appear to be two different formats in which the weather data is displayed when you look up a month:
Example of FORMAT 1.
This is the standard format used for most months. The example below is of March 1994 for Binghamton, NY (BGM).
DAY = Day of the month
For Part 1 of the project you will be using the AVG Column (for the daily Average temperature).
For Part 2 of the project (monthly total precipitation) you will look at the bottom of the PCPN column to get the monthly precipitation (cross reference with the Total row). In this example the total monthly precipitation is 5.06 inches.
For Part 3 of the project (monthly total snowfall) you will look at the bottom of the SNOW column to get the monthly snowfall (cross reference with the Total row). In this example the total monthly snowfall is 27.8 inches.
Example of FORMAT 2
Format 2 looks much like Format 1. The information needed for Part 1 is in the same column as in Format 1. Finding the total monthly precipitation and snowfall (for Parts 2 and 3) is a little more involved in Format 2. The best way to find them is to move down the page a little and look for the following (this is an example from March 1999 for Binghamton):
Note that I circled where you can find the Total Monthly Precipitation and Snowfall (2.57 inches of precipitation and 33.8 inches of snow for this month).
From looking over the website, it appears that much of the older information is in Format 1 and the newer information is in Format 2. If you have any difficulty finding information, please do not hesitate to ask.
