Regression analysis (weeks 2, 5, and 8)

The Major League Baseball Data Set (collected from 2005) on the next tab describes Week 2
As preparation for the final research paper, formulate a theory about the correlation between measurable independent variables (causes) and one measurable dependent variable (the effect). Be sure to have at least two independent variables for the proposed research paper. The topic proposal should include the following four items, which serve as the foundation for the final research paper after instructor feedback is given.
1) Purpose Statement
In one paragraph, state the correlation and identify the primary independent variables. State the correlation as in the following: "The dependent variable _______ is determined by independent variables ________, _________, ________, and ________." Identify and defend the "primary" independent variable, or the variable believed to have the strongest impact on the dependent variable: "The most important independent variable in this relationship is ________ because _________."
2) Definition of Variables
For each variable, write a single paragraph defining the variable. Paragraphs should be in this order: dependent variable, primary independent variable, and three independent variables. In addition to defining the independent variables, defend why each determines the dependent variable. For the primary independent variable, at least two research sources that discuss the variable must also be cited. These sources need not be technical documents but should contain evidence to justify the relationship between the primary independent variable and the dependent variable. List these sources in the Works Cited (reference) page.
**Note: Citations from encyclopedias, Wikipedia, blogs, abstracts, or non-governmental websites are not acceptable research sources.
3) Data Description
For each of the variables, at least 30 observations of cross-sectional or time-series data must be obtained. Thus for the final research paper, a data matrix that is at least 30 rows by the number of variables must be presented. In one paragraph, identify the data sources and describe the data (i.e., which government agencies supply the data, which methods are used to compile them, when they were collected, etc.).
4) Works Cited Page
The final page of the proposal should be a Works Cited page listing the two research sources for the primary independent variable and the data sources, with a separate citation for each table of data, including specific table numbers for each of the sources.
Upload this Word file to the dropbox labeled "Project Topic and Feasibility Paper".

Week 5

In week two you submitted the following:

As preparation for the final research paper, formulate a theory about the correlation between measurable independent variables (causes) and one measurable dependent variable (the effect). Be sure to have at least two independent variables for the proposed research paper. The topic proposal should include the following four items, which serve as the foundation for the final research paper after instructor feedback is given.

For week five [50 points]: Create a draft of your report including the data. This copy of the data may be in an Excel spreadsheet.
________________________________________
**Below is a copy of the requirements for your proposal from week 2. Points are assigned for week 5 work as listed on each item.

1) Purpose Statement [10pts]
In one paragraph, state the correlation and identify the primary independent variables. State the correlation as in the following: "The dependent variable _______ is determined by independent variables ________, _________, ________, and ________." Identify and defend the "primary" independent variable, or the variable believed to have the strongest impact on the dependent variable: "The most important independent variable in this relationship is ________ because _________."
________________________________________
2) Definition of Variables [10pts]
For each variable, write a single paragraph defining the variable. Paragraphs should be in this order: dependent variable, primary independent variable, and three independent variables. In addition to defining the independent variables, defend why each determines the dependent variable. For the primary independent variable, at least two research sources that discuss the variable must also be cited. These sources need not be technical documents but should contain evidence to justify the relationship between the primary independent variable and the dependent variable. List these sources in the Works Cited (reference) page.
**Note: Citations from encyclopedias, Wikipedia, blogs, abstracts, or non-governmental websites are not acceptable research sources.
________________________________________
3) Data Description [10pts]
For each of the variables, at least 30 observations of cross-sectional data must be obtained. Thus for the final research paper, a data matrix that is at least 30 rows by the number of variables must be presented. In one paragraph, identify the data sources and describe the data (i.e., which government agencies supply the data, which methods are used to compile them, when they were collected, etc.).
________________________________________
4) Works Cited Page [10pts]
The final page of the proposal should be a Works Cited page listing the two research sources for the primary independent variable and the data sources, with a separate citation for each table of data, including specific table numbers for each of the sources.

5) Data [10pts]
________________________________________
Upload this Word file to the dropbox labeled "Term Project Proposal".

Final Research Paper: The assessment rubric for this paper is available under Course Home, Assessment Rubric link.

Week 8
________________________________________
Purpose Statement and Model
1) In the introductory paragraph, state why the dependent variable has been chosen for analysis. Then make a general statement about the model: "The dependent variable _______ is determined by variables ________, ________, ________, and ________."
2) In the second paragraph, identify the primary independent variable and defend why it is important. "The most important variable in this analysis is ________ because _________." In this paragraph, cite and discuss the two research sources that support the thesis, i.e., the model.
3) Write the general form of the regression model (less intercept and coefficients), with the variables named appropriately so the reader can identify each variable at a glance:
Dep_Var = Ind_Var_1 + Ind_Var_2 + Ind_Var_3
For instance, a typical model would be written:
Price_of_Home = Square_Footage + Number_Bedrooms + Lot_Size
where
Price_of_Home: brief definition of dependent variable
Square_Footage: brief definition of first independent variable
Number_Bedrooms: brief definition of second independent variable
Lot_Size: brief definition of third independent variable
[Note: the student of course replaces these variable names with his/her own variable names.]

Definition of Variables
4) Define and defend all variables, including the dependent variable, in a single paragraph for each variable. Also, state the expectations for each independent variable. These paragraphs should be in numerical order, i.e., dependent variable, X1, then X2, etc. In each paragraph, the following should be addressed:
· How is the variable defined in the data source?
· Which unit of measurement is used?
· For the independent variables: why does the variable determine Y?
· What sign is expected for the independent variable's coefficient, positive or negative? Why?

Data Description
5) In one paragraph, describe the data and identify the data sources.
· From which general sources and from which specific tables are the data taken? (Citing a website is not acceptable.)
· Which year or years were the data collected?
· Are there any data limitations?

Presentation and Interpretation of Results
6) Write the regression (prediction) equation:
Dep_Var = Intercept + c1 * Ind_Var_1 + c2 * Ind_Var_2 + c3 * Ind_Var_3
7) Identify and interpret the adjusted R² (one paragraph):
· Define "adjusted R²."
· What does the value of the adjusted R² reveal about the model?
· If the adjusted R² is low, how has the choice of independent variables created this result?
8) Identify and interpret the F test (one paragraph):
· Using the p-value approach, is the null hypothesis for the F test rejected or not rejected? Why or why not?
· Interpret the implications of these findings for the model.
9) Identify and interpret the t tests for each of the coefficients (one separate paragraph for each variable, in numerical order):
· Are the signs of the coefficients as expected? If not, why not?
· For each of the coefficients, interpret the numerical value.
· Using the p-value approach, is the null hypothesis for the t test rejected or not rejected for each coefficient? Why or why not?
· Interpret the implications of these findings for the variable.
· Identify the variable with the greatest significance.
10) Analyze multicollinearity of the independent variables (one paragraph):
· Generate the correlation matrix.
· Define multicollinearity.
· Are any of the independent variables highly correlated with each other? If so, identify the variables and explain why they are correlated.
· State the implications of multicollinearity (if found) for the model.
11) Other (not required):
· If any additional techniques for improving results are employed, discuss these at the end of the paper.

Works Cited Page
12) Use the proper format to list the works cited under two headings:
Research: two sources
Data: a separate citation for each of the variables used in the paper.
________________________________________
Upload this Word file to the dropbox labeled "Term Project Report".

*******************************************************
Example_TopicProposal_Week2

Women in the Workforce: A Wonderful Addition or a Woeful Mistake?
EC 315
Former Student Example
Fall I 2007

Background
Many human resources professionals, scholars, feminists, and economists tout the addition of women to the U.S. workforce.
Wendell French (2005) speculates in Human Resources Management that the continuous stream of women entering the workforce will explain a 55% increase in total U.S. labor force expansion between the years of 2002 and 2012 (p. 57). In addition, the percentage of working women continues to increase (French, 2005, p. 57). As women comprise an increasingly larger share of the labor market, their contributions, education, and effect on the economy warrant discussion.

The aim of this project is to determine the effects of the entrance of larger proportions of increasingly educated women over the age of 25 with 4 years of college on the productivity of non-farm business in the United States, while holding the rate of population growth for this specific class (females over age 25), average number of hours worked, and average salary constant. This study employs a time-series analysis with observations from 1960 to 2006 included. Demographic data on education was taken from the U.S. Census Bureau and productivity information from the Bureau of Labor Statistics. The model (less constants and coefficients) is:

OUTPUT = %COLLEGE_FEM + AVG_SAL + AVG_HOURS + POP_GROWTH

The result or dependent variable, OUTPUT, is non-farm, seasonally adjusted output per hour. This variable is calculated as the ratio of the output of goods and services to the labor hours required to produce them.

%COLLEGE_FEM, the first independent variable, is the percentage of the female population age 25 and over who have completed 4 years of college and is compiled by the U.S. Census Bureau. This measure is used because of the established relationship between productivity and higher learning. If other independent variables are held constant, increases in education should result in a positive change in productivity (Sweetman, 2002; Saxton, 2000).

AVG_SAL is the real hourly compensation received by employees in non-farming business sectors. It is seasonally adjusted and indexed to 1992. This figure is used in the formula because wages have become a progressively significant incentive for workers to remain in or become more productive in the workforce. This being said, a positive relationship between average wages and productivity should exist (Fazzari, 2007).

The average weekly hours spent on the job is another possible predictor of the dependent variable, OUTPUT. The Census Bureau collects this data from American workers for the Bureau of Labor Statistics. Many employers adjust the hours which employees work in order to effect changes in productivity (International Labour Office Geneva, 2007). The additional time spent on the job increases output; essentially this should mean that a positive relationship exists between average hours and productivity, all other independent variables held equal (Skoczylas & Tissot 2004).

Finally, as the population grows so does potential, equilibrium, and per capita output; this should also affect hourly productivity in a positive fashion (Fazzari, 2007).

References
International Labour Office, Geneva. (2007). Working time around the world: Main findings and policy implications. Retrieved August 29, 2007, from International Labour Office Web site: http://www.ilo.org/wcmsp5/groups/public/---dgreports/---dcomm/documents/publication/wcms_082838.pdf
Fazzari. (2007, April 17). Retrieved September 2, 2007, from Washington University in St. Louis Web site: artsci.wustl.edu/~ec104sf/Lec%20Notes%20104-8.doc
French, W. L. (2005). Human Resources Management. New York: Houghton Mifflin Company.
Saxton, J. (2000, January). Joint Economic Committee Study. Retrieved September 1, 2007, from The United States House of Representatives Web site: http://www.house.gov/jec/educ.htm
Skoczylas, L., & Tissot, B. (2005). Revisiting Recent Productivity Developments Across OECD Countries. Bank for International Settlements. Retrieved September 2, 2007, from http://www.ifcommittee.org/tissot.pdf
Sweetman, A. (2002, November 27). Working smarter: Education and productivity. The Review of Economic Performance and Social Progress. Retrieved September 1, 2007, from http://www.irpp.org/miscpubs/archive/repsp1202/sweetman.pdf

Regression (example)

Purpose Statement: The intent of this project is to measure how the 1999 winning percentage of baseball teams relates to pitching saves, which is expected to show the highest correlation, and to the other variables: total payroll, runs batted in, batting average, home runs, runs, and earned run average. To be precise, winning percentage of the professional teams will be the dependent variable and the remaining variables will be the independent variables.

Abstract: In this paper, I hypothesize that the winning percentage is directly related to the total runs batted in (RBI), total payroll, batting average, home runs (HR), runs (R), earned run average (ERA), and pitching saves. I will demonstrate the relationship between these independent variables and the dependent variable.
Definition of Variables: The dependent variable, winning percentage (Winning), is determined by the independent variables total runs batted in (RBI), total payroll, batting average, home runs (HR), runs (R), earned run average (ERA), and pitching saves.

The primary independent variable, pitching saves, is defined as the number of saves, or the percentage of save opportunities successfully converted. This variable is the most significant independent variable, followed by earned run average.

The independent variable runs batted in (RBI) is defined as the total runs batted in by each baseball team in 1999. This variable is not statistically significant.

The independent variable earned run average (ERA) is defined as the mean of earned runs given up by a pitcher per nine innings pitched. It is determined by dividing the number of earned runs allowed by the number of innings pitched and multiplying by nine. This variable is highly significant.

The independent variable runs (R) is defined as the total runs made by each team during the tournament. This variable is selected to see the impact of the overall runs made by the different baseball teams.

The independent variable payroll is defined as the salary paid to the players of each baseball team. This variable shows which teams carried the highest payroll and so serves, in a way, as an indicator of player performance, on the assumption that high performers are paid more.

The independent variable home runs (HR) counts plays in which the ball is hit in such a way that the batter is able to reach home safely in one play without any errors being committed by the defensive team in the process. Home runs are among the most popular aspects of baseball and, as a result, prolific home run hitters are usually the most popular among fans and consequently the highest paid by teams.

The independent variable batting average is a measure of a batter's performance obtained by dividing the total of base hits by the number of times at bat, not including walks. This variable measures the degree of batting achievement.

Relationship of Variables: The relationship between winning percentage and all the independent variables is positive except for earned run average. Of all the independent variables, only payroll, pitching saves, and earned run average are significant, and payroll only marginally so. To be precise, we will show the summary statistics of all the variables in the table below:
We will also see the regression outputs and conclude which variables are the most significant and how well the model fits. About 95.2% of the variation in the winning percentage is accounted for by the independent variables taken together. This number is the value of R-square, or the coefficient of determination. It indicates that the overall model has a good fit.

Works Cited
· File 1999 Baseball Data.xls

SampleFinalPaperTermProject

Women in the Workforce: A Wonderful Addition or a Woeful Mistake?
EC 315
Sample Report
Term Project Report
Fall I 2007

TABLE OF CONTENTS
BACKGROUND
DISCUSSION OF RESULTS
SUMMARY
REFERENCES
APPENDIX

Background
Many human resources professionals, scholars, feminists, and economists tout the addition of women to the U.S. workforce. Wendell French (2005) speculates in Human Resources Management that the continuous stream of women entering the workforce will explain a 55% increase in total U.S. labor force expansion between the years of 2002 and 2012 (p. 57). In addition, the percentage of working women continues to increase (French, 2005, p. 57). As women comprise an increasingly larger share of the labor market, their contributions, education, and effect on the economy warrant discussion.

The aim of this project is to determine the effects of the entrance of larger proportions of increasingly educated women over the age of 25 with 4 years of college on the productivity of non-farm business in the United States, while holding the rate of population growth for this specific class (females over age 25), average number of hours worked, and average salary constant. This study employs a time-series analysis with observations from 1966 to 2006 included. Demographic data on education was taken from the U.S. Census Bureau and productivity information from the Bureau of Labor Statistics. The model (less constants and coefficients) is:

OUTPUT = %COLLEGE_FEM + AVG_SAL + AVG_HOURS + POP_GROWTH

The result or dependent variable, OUTPUT, is non-farm, seasonally adjusted output per hour. This variable is calculated as the ratio of the output of goods and services to the labor hours required to produce them.

%COLLEGE_FEM, the first independent variable, is the percentage of the female population age 25 and over who have completed 4 years of college and is compiled by the U.S. Census Bureau. This measure is used because of the established relationship between productivity and higher learning. If other independent variables are held constant, increases in education should result in a positive change in productivity (Sweetman, 2002; Saxton, 2000).

AVG_SAL is the real hourly compensation received by employees in non-farming business sectors. It is seasonally adjusted and indexed to 1992. This figure is used in the formula because wages have become a progressively significant incentive for workers to remain in or become more productive in the workforce. This being said, a positive relationship between average wages and productivity should exist (Fazzari, 2007).

The average weekly hours spent on the job is another possible predictor of the dependent variable, OUTPUT. The Census Bureau collects this data from American workers for the Bureau of Labor Statistics. Many employers adjust the hours which employees work in order to effect changes in productivity (International Labour Office Geneva, 2007). The additional time spent on the job increases output; essentially this should mean that a positive relationship exists between average hours and productivity, all other independent variables held equal (Skoczylas & Tissot 2004).

Finally, as the population grows so does potential, equilibrium, and per capita output; this should also affect hourly productivity in a positive fashion (Fazzari, 2007).

Discussion of Results
The model was regressed and yielded the following results. The regression equation is:

OUPUT = - 100 + 1.50 %COLLEGE_FEM + 1.27 AVG_SAL + 0.497 AVG_HOURS + 4.46 POP_GROWTH
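To make the fitted equation concrete, here is a minimal Python sketch that plugs values into it. The coefficients are the rounded ones reported in the equation above. The function name and the input values are hypothetical placeholders chosen only for illustration; they are not observations from the paper's data set.

```python
# Rounded coefficients as reported in the regression equation above.
COEFFS = {
    "intercept": -100.0,
    "%COLLEGE_FEM": 1.50,
    "AVG_SAL": 1.27,
    "AVG_HOURS": 0.497,
    "POP_GROWTH": 4.46,
}


def predict_output(college_fem, avg_sal, avg_hours, pop_growth):
    """Return the fitted value of the dependent variable for one observation."""
    return (
        COEFFS["intercept"]
        + COEFFS["%COLLEGE_FEM"] * college_fem
        + COEFFS["AVG_SAL"] * avg_sal
        + COEFFS["AVG_HOURS"] * avg_hours
        + COEFFS["POP_GROWTH"] * pop_growth
    )


# Hypothetical inputs (illustration only, not taken from the data).
base = predict_output(college_fem=20.0, avg_sal=100.0, avg_hours=34.0, pop_growth=1.0)
# Raising %COLLEGE_FEM by one unit while holding the rest fixed
# changes the prediction by exactly that variable's coefficient.
bumped = predict_output(college_fem=21.0, avg_sal=100.0, avg_hours=34.0, pop_growth=1.0)

print(round(base, 3))
print(round(bumped - base, 3))
```

The difference between the two predictions illustrates the "all other variables held constant" reading of a coefficient that the discussion relies on.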
The focus of this analysis is on the impact of %COLLEGE_FEM on OUTPUT. As the Minitab output in the Appendix shows, the Durbin-Watson statistic of .802773 fell into the rejection region, indicating positive autocorrelation. Autocorrelation occurs when a pattern exists among the error terms, often because a variable is missing from the analysis. In this regression, the coefficients of the independent variables are biased to an unknown extent and are not reliable or reportable in a scholarly paper, publication, or report.

If the Durbin-Watson test for this regression had passed, the independent variables %COLLEGE_FEM and AVG_SAL would have been significant at α = .05, .025, .01, .005, and .001. The variable AVG_HOURS would have shown significance at α = .05, .025, .01, and .005. However, the independent variable POP_GROWTH shows inadequate significance with a p-value of .2426; this does not meet the criterion for significance at the α = .05, or even the .10, level, because a variable is significant only when its p-value is less than or equal to α.

The R² value of 99.2% suggests that the independent variables account for 99.2% of the variation of the outcome; this is often the case in time-series regressions, in which one observation builds upon another. R² cannot be relied upon since the Durbin-Watson statistic indicates positive autocorrelation. It is desirable for R² to be 50% or greater. The Adj. R² value of 99.1% would also be considered "good" if the Durbin-Watson test had passed. In the case of adjusted R², this regression indicates that the independent variables explain 99.1% of the variance of the dependent variable. Both statistics must be less than or equal to one, but greater than or equal to 0. Often, people rely heavily on R² and Adj. R², disregarding the Durbin-Watson test. Since this analysis is a time-series regression, the Durbin-Watson test is more valuable to the analyst than the R² and Adj. R². Without a passing Durbin-Watson test for a time-series analysis, both of the preceding are useless, as is the case with this regression.

Another important consideration when regressing an equation is the presence of multicollinearity. In this model, there was a complete absence of it. All pairs of independent variables were regressed, and the resulting R² values from the bivariate regressions were compared to the R² of the entire model. The results are detailed below:
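As a rough illustration of the pairwise check just described (this is not the paper's table, and the data are synthetic), the sketch below regresses each pair of predictors on one another and reports the bivariate R². It relies on the fact that, for a regression with a single predictor, R² equals the squared Pearson correlation. The function names and the generated series X1, X2, X3 are assumptions made for this example.

```python
import itertools

import numpy as np


def bivariate_r_squared(x, y):
    """R-squared of a one-predictor least-squares regression of y on x.

    With a single predictor this equals the squared Pearson correlation.
    """
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2


def pairwise_r_squared(predictors):
    """Map each pair of predictor names to its bivariate R-squared."""
    return {
        (a, b): bivariate_r_squared(predictors[a], predictors[b])
        for a, b in itertools.combinations(predictors, 2)
    }


# Synthetic series for illustration only (NOT the paper's data).
rng = np.random.default_rng(0)
n = 41
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + rng.normal(scale=0.1, size=n)  # built to be nearly collinear with x1
x3 = rng.normal(size=n)                        # generated independently of x1 and x2

pairs = pairwise_r_squared({"X1": x1, "X2": x2, "X3": x3})
for pair, r2 in pairs.items():
    print(pair, round(r2, 3))
```

Under the rule used in this paper, a pair would be flagged when its bivariate R² approaches or exceeds the R² of the full model; in the synthetic example only the deliberately constructed X1/X2 pair comes close to 1.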
As is evident above, the bivariate regressions yield R² values less than the value for the entire regression. Multicollinearity happens when two or more of the predictors have a linear relationship. When this occurs, the statistical software package cannot tell which variable to assign the effect to. Often, one coefficient will be near zero while the coefficient of the other collinear variable will appear to be the source of all the effect on the outcome, making the coefficients unreliable. The absence of multicollinearity indicates that the coefficients are not distorted by its presence. Another useful way to analyze the effect of independent variables on the outcome is through coefficients. The coefficients for the model are detailed below:
The 1.5009 coefficient value for %COLLEGE_FEM indicates that for every one-percentage-point increase in the share of females who have completed four years of college, output increases by 1.5009. The coefficient for the predictor AVG_SAL is 1.2682, indicating that for every single-unit increase in salary, output increases by 1.2682. The .4973 coefficient value for AVG_HOURS suggests that for each 1-hour increase, output increases by .4973. POP_GROWTH brings with it a coefficient of 4.4615, indicating that for every 1,000 people added to the population, output increases by 4.4615. While these coefficients indicate a fairly significant relationship, there are two things that must first be considered:
Had the Durbin-Watson test passed, and had the p-value of POP_GROWTH been less than or equal to alpha, the coefficients of the variables would yield the results detailed above. According to the coefficients, population growth has the largest effect on the outcome, followed by the percentage of women with 4 years of college, average salary, and finally average hours.

Summary
Unfortunately, the failure of the Durbin-Watson test makes the model biased to an unknown extent. Unless the missing variable can be found, the results are essentially useless. If the Durbin-Watson test had passed, all variables except POP_GROWTH would be significant predictors of the outcome. This would indicate that as the number of women who have 4 years of college increases, so does output per hour. In addition, the predictions made in the Background section would prove to be true.

References
International Labour Office, Geneva. (2007). Working time around the world: Main findings and policy implications. Retrieved August 29, 2007, from International Labour Office Web site: http://www.ilo.org/wcmsp5/groups/public/---dgreports/---dcomm/documents/publication/wcms_082838.pdf
Fazzari. (2007, April 17). Retrieved September 2, 2007, from Washington University in St. Louis Web site: artsci.wustl.edu/~ec104sf/Lec%20Notes%20104-8.doc
French, W. L. (2005). Human Resources Management. New York: Houghton Mifflin Company.
Saxton, J. (2000, January). Joint Economic Committee Study. Retrieved September 1, 2007, from The United States House of Representatives Web site: http://www.house.gov/jec/educ.htm
Skoczylas, L., & Tissot, B. (2005). Revisiting Recent Productivity Developments Across OECD Countries. Bank for International Settlements. Retrieved September 2, 2007, from http://www.ifcommittee.org/tissot.pdf
Sweetman, A. (2002, November 27). Working smarter: Education and productivity. The Review of Economic Performance and Social Progress. Retrieved September 1, 2007, from http://www.irpp.org/miscpubs/archive/repsp1202/sweetman.pdf

Appendix
Minitab Regression Calculation

Worksheet size: 10000 cells.
Welcome to Minitab, press F1 for help.

Regression Analysis: OUPUT versus %COLLEGE_FEM, AVG_SAL, ...

The regression equation is
OUPUT = - 100 + 1.50 %COLLEGE_FEM + 1.27 AVG_SAL + 0.497 AVG_HOURS + 4.46 POP_GROWTH

Predictor         Coef  SE Coef      T      P
Constant       -100.40    21.17  -4.74  0.000
%COLLEGE_FEM    1.5009   0.2579   5.82  0.000
AVG_SAL         1.2683   0.1013  12.52  0.000
AVG_HOURS       0.4974   0.1629   3.05  0.004
POP_GROWTH       4.462    3.756   1.19  0.243

S = 1.87253   R-Sq = 99.2%   R-Sq(adj) = 99.1%

Analysis of Variance
Source          DF       SS      MS        F      P
Regression       4  16156.6  4039.2  1151.94  0.000
Residual Error  36    126.2     3.5
Total           40  16282.8

Source        DF   Seq SS
%COLLEGE_FEM   1  15517.5
AVG_SAL        1    587.1
AVG_HOURS      1     47.1
POP_GROWTH     1      4.9

Unusual Observations
Obs  %COLLEGE_FEM    OUPUT      Fit  SE Fit  Residual  St Resid
 22          12.8   90.608   94.368   0.617    -3.760    -2.13R
 35          16.3  115.689  120.049   0.587    -4.360    -2.45R

R denotes an observation with a large standardized residual.

Durbin-Watson statistic = 0.802773
Dl = 1.29, which is greater than .802773, signaling the failure of the DW test.
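The summary statistics in this output can be re-derived from the Analysis of Variance table, which may help when checking a report by hand. The sketch below is an illustration, not part of the Minitab session: the sums of squares and degrees of freedom are copied from the table above (so results agree with Minitab only up to rounding), while the `durbin_watson` helper and the two short residual series are hypothetical examples added to show how the statistic behaves.

```python
import numpy as np

# Values copied from the Analysis of Variance table above (rounded by Minitab).
ss_regression, df_regression = 16156.6, 4
ss_error, df_error = 126.2, 36
ss_total, df_total = 16282.8, 40

r_sq = ss_regression / ss_total                                # R-Sq
adj_r_sq = 1 - (ss_error / df_error) / (ss_total / df_total)   # R-Sq(adj)
f_stat = (ss_regression / df_regression) / (ss_error / df_error)
s = (ss_error / df_error) ** 0.5                               # standard error of regression


def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


# Hypothetical residual patterns (not the paper's residuals).
dw_runs = durbin_watson([1, 1, 1, -1, -1, -1])    # long runs of one sign: value below 2
dw_alternating = durbin_watson([1, -1, 1, -1])    # sign flips every step: value above 2

# Decision rule quoted in the output: a statistic below dL signals
# positive autocorrelation.
reported_dw, d_lower = 0.802773, 1.29
fails_dw_test = reported_dw < d_lower

print(round(r_sq, 3), round(adj_r_sq, 3), round(f_stat, 1), round(s, 3))
print(round(dw_runs, 3), round(dw_alternating, 3), fails_dw_test)
```

Small differences from the printed F and S values are expected because the table entries are themselves rounded.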