Friday, May 1, 2020
EViews Illustrated
Windows, Word and Excel are trademarks of Microsoft Corporation. PostScript is a trademark of Adobe Corporation. Professional Organization of English Majors is a trademark of Garrison Keillor. All other product names mentioned in this manual may be trademarks or registered trademarks of their respective companies.

Quantitative Micro Software, LLC
4521 Campus Drive, #336, Irvine CA, 92612-2699
Telephone: (949) 856-3368
Fax: (949) 856-2044
web: www.eviews.com

First edition: 2007
Second edition: 2009
Editor: Meredith Startz
Index: Palmer Publishing Services

Chapter 3. Getting the Most from Least Squares

Regression is the king of econometric tools. Regression's job is to find numerical values for theoretical parameters. In the simplest case this means telling us the slope and intercept of a line drawn through two-dimensional data. But EViews tells us lots more than just slope and intercept. In this chapter you'll see how easy it is to get parameter estimates plus a large variety of auxiliary statistics.

We begin our exploration of EViews' regression tool with a quick look back at the NYSE volume data that we first saw in the opening chapter. Then we'll talk about how to instruct EViews to estimate a regression and how to read the information about each estimated coefficient from the EViews output. In addition to regression coefficients, EViews provides a great deal of summary information about each estimated equation. We'll walk through these items as well. We take a look at EViews' features for testing hypotheses about regression coefficients and conclude with a quick look at some of EViews' most important views of regression results.

Regression is a big subject. This chapter focuses on EViews' most important regression features. We postpone until later chapters various issues, including forecasting (Chapter 8, "Forecasting"), serial correlation (Chapter 13, "Serial Correlation - Friend or Foe?"), and heteroskedasticity and nonlinear regression (Chapter 14, "A Taste of Advanced Estimation").

A First Regression

Returning to our earlier examination of trend growth in the volume of stock trades, we start with a scatter diagram of the logarithm of volume plotted against time. EViews has drawn a straight line (a regression line) through the cloud of points plotted with log(volume) on the vertical axis and time on the horizontal. The regression line can be written as an algebraic expression:

$$\log(volume_t) = \alpha + \beta t$$

Using EViews to estimate a regression lets us replace $\alpha$ and $\beta$ with numbers based on the data in the workfile. In a bit we'll see that EViews estimates the regression line to be:

$$\log(volume_t) = -2.629649 + 0.017278\,t$$

In other words, the intercept $\alpha$ is estimated to be -2.6 and the slope $\beta$ is estimated to be 0.017.

Most data points in the scatter plot fall either above or below the regression line. For example, for observation 231 (which happens to be the first quarter of 1938) the actual trading volume was far below the predicted regression line. In other words, the regression line contains errors which aren't accounted for in the estimated equation. It's standard to write a regression model to include a term $u_t$ to account for these errors. (Econometrics texts sometimes use the Greek letter epsilon, $\varepsilon$, rather than $u$ for the error term.) A complete equation can be written as:

$$\log(volume_t) = \alpha + \beta t + u_t$$

Regression is a statistical procedure. As such, regression analysis takes uncertainty into account. Along with an estimated value for each parameter (e.g., $\hat{\beta} = 0.017$) we get:

• Measures of the accuracy of each of the estimated parameters and related information for computing hypothesis tests.
• Measures of how well the equation fits the data: how much is explained by the estimated values of $\alpha$ and $\beta$ and how much remains unexplained.
• Diagnostics to check up on whether assumptions underlying the regression model seem satisfied by the data.

We're re-using the data from Chapter 1, "A Quick Walk Through," to illustrate the features of EViews' regression procedure. If you want to follow along on the computer, use the workfile "NYSEVOLUME" as shown.

EViews allows you to run a regression either by creating an equation object or by typing commands in the command pane. We'll start with the former approach. Choose the menu command Object/New Object…. Pick Equation in the New Object dialog. The empty equation window pops open with space to fill in the variables you want in the regression.

Regression equations are easily specified in EViews by a list in which the first variable is the dependent variable (the variable the regression is to explain), followed by a list of explanatory (or independent) variables. Because EViews allows an expression pretty much anywhere a variable is allowed, we can use either variable names or expressions in our regression specification. We want log(volume) for our dependent variable and a time trend for our independent variable. Fill out the equation dialog by entering "log(volume) c @trend".

Hint: EViews tells one item in a list from another by looking for spaces between items. For this reason, spaces generally aren't allowed inside a single item. If you type:

    log (volume) c @trend

you'll get an error message.

Exception to the previous hint: When a text string is called for in a command, spaces are allowed inside paired quotes.

Reminder: The letter "C" in a regression specification notifies EViews to estimate an intercept, the parameter we called $\alpha$ above.

Hint: Another reminder: @trend is an EViews function to generate a time trend, 0, 1, 2, ….
Our regression results appear below:

The Really Important Regression Results

There are 25 pieces of information displayed for this very simple regression. To sort out all the different goodies, we'll start by showing a couple of ways that the main results might be presented in a scientific paper. Then we'll discuss the remaining items one number at a time.

A favorite scientific convention for reporting the results of a single regression is to display the estimated equation inline with standard errors placed below estimated coefficients, looking something like:

$$\log(volume_t) = \underset{(0.089576)}{-2.629649} + \underset{(0.000334)}{0.017278}\,t, \qquad ser = 0.967362,\quad R^2 = 0.852357$$

Hint: The dependent variable is also called the left-hand side variable and the independent variables are called the right-hand side variables. That's because when you write out the regression equation algebraically, as above, convention puts the dependent variable to the left of the equals sign and the independent variables to the right.

The convention for inline reporting works well for a single equation, but becomes unwieldy when you have more than one equation to report. Results from several related regressions might be displayed in a table, looking something like Table 2.

Table 2

                        (1)            (2)
Intercept           -2.629649      -0.106396
                    (0.089576)     (0.045666)
t                    0.017278      -0.000736
                    (0.000334)     (0.000417)
t^2                     -           6.63E-06
                                   (1.37E-06)
log(volume(-1))         -           0.868273
                                   (0.022910)
ser                  0.967362       0.289391
R^2                  0.852357       0.986826

Column (2)? Don't worry, we'll come back to it later.

Hint: Good scientific practice is to report only digits that are meaningful when displaying a number. We've printed far too many digits in both the inline display and in Table 2 so as to make it easy for you to match up the displayed numbers with the EViews output. From now on we'll be better behaved.

EViews regression output is divided into three panels. The top panel summarizes the input to the regression, the middle panel gives information about each regression coefficient, and the bottom panel provides summary statistics about the whole regression equation.

The most important elements of EViews regression output are the estimated regression coefficients and the statistics associated with each coefficient. We begin by linking up the numbers in the inline display (or, equivalently, column (1) of Table 2) with the EViews output shown earlier.

The names of the independent variables in the regression appear in the first column (labeled "Variable") in the EViews output, with the estimated regression coefficients appearing one column over to the right (labeled "Coefficient"). In econometrics texts, regression coefficients are commonly denoted with a Greek letter such as $\alpha$ or $\beta$ or, occasionally, with a Roman $b$. In contrast, EViews presents you with the variable names; for example, "@TREND" rather than "$\beta$".

The third EViews column, labeled "Std. Error," gives the standard error associated with each regression coefficient. In the scientific reporting displays above, we've reported the standard error in parentheses directly below the associated coefficient. The standard error is a measure of uncertainty about the true value of the regression coefficient.

The standard error of the regression, abbreviated "ser," is the estimated standard deviation of the error terms, $u_t$. In the inline display, "ser = 0.967362" appears to the right of the regression equation proper. EViews labels the ser as "S.E. of regression," reporting its value in the left column in the lower summary block.

Note that the third column of EViews regression output reports the standard error of the estimated coefficients while the summary block below reports the standard error of the regression. Don't confuse the two.
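To make the distinction between the two kinds of standard error concrete, here is a minimal sketch in Python (not EViews code) of a one-variable least squares fit. It uses the standard textbook formulas for a regression with an intercept and a single regressor; the function name `simple_ols` and the eight data points are made up for illustration and have nothing to do with the NYSE workfile.

```python
import math

def simple_ols(x, y):
    """Least squares fit of y = a + b*x, with the usual textbook
    standard errors for a one-regressor model with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b = sxy / sxx                # slope
    a = y_bar - b * x_bar        # intercept

    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    ssr = sum(e ** 2 for e in residuals)

    # Standard error of the regression: one number for the whole equation.
    ser = math.sqrt(ssr / (n - 2))

    # Standard errors of the coefficients: one number per coefficient.
    se_b = ser / math.sqrt(sxx)
    se_a = ser * math.sqrt(1.0 / n + x_bar ** 2 / sxx)

    return {"a": a, "b": b, "ser": ser, "se_a": se_a, "se_b": se_b}

# Hypothetical data: a time trend 0, 1, 2, ... and a noisy upward-sloping series.
trend = list(range(8))
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.8, 8.1]

fit = simple_ols(trend, y)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

The coefficient standard errors (`se_a`, `se_b`) are both built from `ser`, which is why the two are easy to mix up: `ser` describes the scatter of the data around the line, while `se_a` and `se_b` describe how precisely each coefficient is pinned down.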
The final statistic in our scientific display is $R^2$. $R^2$ measures the overall fit of the regression line, in the sense of measuring how close the points are to the estimated regression line in the scatter plot. EViews computes $R^2$ as the fraction of the variance of the dependent variable explained by the regression. (See the User's Guide for the precise definition.) Loosely, $R^2 = 1$ means the regression fit the data perfectly and $R^2 = 0$ means the regression is no better than guessing the sample mean.

Hint: EViews will report a negative $R^2$ for a model which fits worse than a model consisting only of the sample mean.

The Pretty Important (But Not So Important As the Last Section's) Regression Results

We're usually most interested in the regression coefficients and the statistical information provided for each one, so let's continue along with the middle panel.

t-Tests and Stuff

All the stuff about individual coefficients is reported in the middle panel, a copy of which we've yanked out to examine on its own.

The column headed "t-Statistic" reports, not surprisingly, the t-statistic. Specifically, this is the t-statistic for the hypothesis that the coefficient in the same row equals zero. (It's computed as the ratio of the estimated coefficient to its standard error: e.g., 51.7 ≈ 0.017278/0.000334.)

Given that there are many potentially interesting hypotheses, why does EViews devote an entire column to testing that specific coefficients equal zero? The hypothesis that a coefficient equals zero is special, because if the coefficient does equal zero then the attached variable drops out of the equation. In other words, $\log(volume_t) = \alpha + 0 \cdot t + u_t$ is really the same as $\log(volume_t) = \alpha + u_t$, with the time trend not mattering at all.
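The ratio described above is easy to check by hand. The following short Python sketch (not EViews code) divides each reported coefficient by its reported standard error. Because the inputs are the rounded values printed in the chapter, the results agree with the EViews t-statistics only to about three significant figures.

```python
# Coefficients and standard errors as printed in the chapter's regression output.
coef = {"C": -2.629649, "@TREND": 0.017278}
std_err = {"C": 0.089576, "@TREND": 0.000334}

# t-statistic for the hypothesis that each coefficient equals zero.
t_stat = {name: coef[name] / std_err[name] for name in coef}

for name, t in t_stat.items():
    print(f"{name}: t = {t:.2f}")
```

The values come out close to -29.4 for the intercept and 51.7 for the trend, in line with the t-statistics quoted in the text.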
Foreshadowing hint: EViews automatically computes the test statistic against the hypothesis that a coefficient equals zero. We'll get to testing other coefficients in a minute, but if you want to leap ahead, look at the equation window menu View/Coefficient Tests….

If the t-statistic reported in column four is larger in absolute value than the critical value you choose for the test, the estimated coefficient is said to be "statistically significant." The critical value you pick depends primarily on the risk you're willing to take of mistakenly rejecting the null hypothesis (the technical term is the "size" of the test), and secondarily on the degrees of freedom for the test. The larger the risk you're willing to take, the smaller the critical value, and the more likely you are to find the coefficient "significant."

Hint: EViews doesn't compute the degrees of freedom for you. That's probably because the computation is so easy it's not worth using scarce screen real estate. Degrees of freedom equals the number of observations (reported in the top panel on the output screen) less the number of parameters estimated (the number of rows in the middle panel). In our example, df = 465 - 2 = 463.

The textbook approach to hypothesis testing proceeds thusly:

1. Pick a size (the probability of mistakenly rejecting), say five percent.
2. Look up the critical value in a t-table for the specified size and degrees of freedom.
3. Compare the critical value to the t-statistic reported in column four. Find the variable to be "significant" if the t-statistic is greater than the critical value.

EViews lets you turn the process inside out by using the "p-value" reported in the right-most column, under the heading "Prob." EViews has worked the problem backwards and figured out what size would give you a critical value that would just match the t-statistic reported in column four.
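The two routes to the same decision (compare t to a critical value, or compare the p-value to the size) can be sketched in Python using only the standard library. One loud assumption: the sketch treats the t-statistic as standard normal rather than using the exact t-distribution. With 463 degrees of freedom the difference is negligible, but for small samples you would want a proper t-distribution from a statistics package. The function names are made up for this illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: large-sample normal approximation to the t-distribution

def critical_value(size):
    """Approximate two-tailed critical value for a test of the given size."""
    return STD_NORMAL.inv_cdf(1.0 - size / 2.0)

def two_sided_p_value(t_stat):
    """Approximate two-tailed p-value for a given t-statistic."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(t_stat)))

size = 0.05                  # step 1: pick a size
crit = critical_value(size)  # step 2: "look up" the critical value
t = 51.70045                 # t-statistic on @TREND, as reported in the chapter

reject_by_t = abs(t) > crit                # step 3: textbook comparison
reject_by_p = two_sided_p_value(t) < size  # the inside-out version

print(f"critical value: {crit:.3f}")
print(f"reject by t-statistic: {reject_by_t}, reject by p-value: {reject_by_p}")
```

Both comparisons always give the same answer, since the p-value is just the size at which the critical value would equal the observed t-statistic.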
So if you are interested in a five percent test, you can reject if and only if the reported p-value is less than 0.05. Since the p-value is zero in our example, we'd reject the hypothesis of no trend at any size you'd like.

Obviously, that last sentence can't be literally true. EViews only reports p-values to four decimal places because no one ever cares about smaller probabilities. The p-value isn't literally 0.0000, but it's close enough for all practical purposes.

Hint: t-statistics and p-values are different ways of looking at the same issue. A t-statistic of 2 corresponds (approximately) to a p-value of 0.05. In the old days you'd make the translation by looking at a "t-table" in the back of a statistics book. EViews just saves you some trouble by giving both t- and p-.

Not-really-about-EViews-digression: Saying a coefficient is "significant" means there is statistical evidence that the coefficient differs from zero. That's not the same as saying the coefficient is "large" or that the variable is "important." "Large" and "important" depend on the substantive issue you're working on, not on statistics. For example, our estimate is that NYSE volume rises about one and one-half percent each quarter. We're very sure that the increase differs from zero, which is a statement about statistical significance, not importance.

Consider two different views about what's "large." If you were planning a quarter ahead, it's hard to imagine that you need to worry about a change as small as one and one-half percent. On the other hand, one and one-half percent per quarter starts to add up over time. The estimated coefficient predicts volume will double each decade, so the estimated increase is certainly large enough to be important for long-run planning.

More Practical Advice On Reporting Results

Now you know the principles of how to read EViews' output in order to test whether a coefficient equals zero. Let's be less coy about common practice. When the p-value is under 0.05, econometricians say the variable is "significant" and when it's above 0.05 they say it's "insignificant." (Sometimes a variable with a p-value between 0.10 and 0.05 is said to be "weakly significant" and one with a p-value less than 0.01 is "strongly significant.") This practice may or may not be wise, but wise or not it's what most people do.

We talked above about scientific conventions for reporting results and showed how to report results both inline and in a display table. In both cases standard errors appear in parentheses below the associated coefficient estimates. "Standard errors in parentheses" is really the first of two-and-a-half reporting conventions used in the statistical literature. The second convention places the t-statistics in the parentheses instead of standard errors. For example, we could have reported the results from EViews inline as:

$$\log(volume_t) = \underset{(-29.35656)}{-2.629649} + \underset{(51.70045)}{0.017278}\,t, \qquad ser = 0.967362,\quad R^2 = 0.852357$$

Both conventions are in wide use. There's no way for the reader to know which one you're using, so you have to tell them. Include a comment or footnote: "Standard errors in parentheses" or "t-statistics in parentheses."

Fifty percent of economists report standard errors and fifty percent report t-statistics. The remainder report p-values, which is the final convention you'll want to know about.

Where Did This Output Come From Again?

The top panel of regression output, shown on the right, summarizes the setting for the regression.

The last line, "Included observations," is obviously useful. It tells you how much data you have! And the next to last line identifies the sample to remind you which observations you're using.
Hint: EViews automatically excludes all observations in which any variable in the specification is NA (not available). The technical term for this exclusion rule is "listwise deletion."

Big (Digression) Hint: Automatic exclusion of NA observations can sometimes have surprising side effects. We'll use the data abstract at the right as an example.

Data are missing from observation 2 for X1 and from observation 3 for X2. A regression of Y on X1 would use observations 1, 3, 4, and 5. A regression of Y on X2 would use observations 1, 2, 4, and 5. A regression of Y on both X1 and X2 would use observations 1, 4, and 5.

Notice that the fifth observation on Y is zero, which is perfectly valid, but that the fifth observation on log(Y) is NA. Since the logarithm of zero is undefined, EViews inserts NA whenever it's asked to take the log of zero. A regression of log(Y) on both X1 and X2 would use only observations 1 and 4.

The variable X1(-1), giving the previous period's values of X1, is missing both the first and third observation. The first value of X1(-1) is NA because the data from the observation before observation 1 doesn't exist. (There is no observation before the first one, eh?) The third observation is NA because it's the second observation for X1, and that one is NA. So while a regression of Y on X1 would use observations 1, 3, 4, and 5, a regression of Y on X1(-1) would use observations 2, 4, and 5.

Moral: When there's missing data, changing the variables specified in a regression can also inadvertently change the sample.

What's the use of the top three lines? It's nice to know the date and time, but EViews is rather ungainly to use as a wristwatch. More seriously, the top three lines are there so that when you look at the output you can remember what you were doing.

"Dependent Variable" just reminds you what the regression was explaining: LOG(VOLUME) in this case.
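The exclusion rule from the digression above can be mimicked in a few lines of Python (not EViews code). The numeric values below are hypothetical, since the data abstract itself is not reproduced here; only the pattern of NAs and the zero in the fifth observation of Y follow the text. The helper names `log_series`, `lag`, and `usable_observations` are invented for this sketch.

```python
import math

NA = None  # stand-in for EViews' NA

# Hypothetical values; only the NA pattern and the zero in Y come from the text.
y  = [2.0, 3.0, 5.0, 4.0, 0.0]
x1 = [1.0, NA, 3.0, 4.0, 5.0]
x2 = [7.0, 8.0, NA, 6.0, 9.0]

def log_series(series):
    """Elementwise log; NA where the input is NA or the log is undefined."""
    return [math.log(v) if v is not None and v > 0 else NA for v in series]

def lag(series, k=1):
    """Shift a series back k periods; the first k values have no source data."""
    return [NA] * k + series[:-k]

def usable_observations(*series):
    """Listwise deletion: 1-based observation numbers where no series is NA."""
    return [i + 1
            for i, row in enumerate(zip(*series))
            if all(v is not None for v in row)]

print("Y on X1:            ", usable_observations(y, x1))
print("Y on X2:            ", usable_observations(y, x2))
print("Y on X1, X2:        ", usable_observations(y, x1, x2))
print("log(Y) on X1, X2:   ", usable_observations(log_series(y), x1, x2))
print("Y on X1(-1):        ", usable_observations(y, lag(x1)))
```

Each printed list is the estimation sample that listwise deletion would leave for that specification, which makes it easy to see the sample shifting as the variable list changes.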
"Method" reminds us which statistical procedure produced the output. EViews has dozens of statistical procedures built-in. The default procedure for estimating the parameters of an equation is "least squares."

The third line just reports the date and time EViews estimated the regression. It's surprising how handy that information can be a couple of months into a project, when you've forgotten in what order you were doing things.

Since we're talking about looking at output at a later date, this is a good time to digress on ways to save output for later. You can:

• Hit the Name button to save the equation in the workfile. The equation will appear in the workfile window marked with the equation icon. Then save the workfile.

Hint: Before saving the file, switch to the equation's label view and write a note to remind yourself why you're using this equation.

• Hit the Print button.
• Send output to a Rich Text Format (RTF) file, which can then be read directly by most word processors. Select Redirect: in the Print dialog and enter a file name in the Filename: field. As shown, you'll end up with results stored in the file "some results.rtf".
• Right-click and choose Select non-empty cells, or hit Ctrl-A; it's the same thing. Copy and then paste into a word processor.

Freeze it

If you have output that you want to make sure won't ever change, even if you change the equation specification, hit Freeze. Freezing the equation makes a copy of the current view in the form of a table which is detached from the equation object. (The original equation is unaffected.) You can then name this frozen table so that it will be saved in the workfile. See Chapter 17, "Odds and Ends."

Summary Regression Statistics

The bottom panel of the regression provides 12 summary statistics about the regression. We'll go over these statistics briefly, but leave technical details to your favorite econometrics text or the User's Guide.

We've already talked about the two most important numbers, "R-squared" and "S.E. of regression." Our regression accounts for 85 percent of the variance in the dependent variable and the estimated standard deviation of the error term is 0.97. Five other elements, "Sum squared residuals," "Log likelihood," "Akaike info criterion," "Schwarz criterion," and "Hannan-Quinn criter." are used for making statistical comparisons between two different regressions. This means that they don't really help us learn anything about the regression we're working on; rather, these statistics are useful for deciding if one model is better than another. For the record, the sum of squared residuals is used in computing F-tests, the log likelihood is used for computing likelihood ratio tests, and the Akaike and Schwarz criteria are used in Bayesian model comparison.

The next two numbers, "Mean dependent var" and "S.D. dependent var," report the sample mean and standard deviation of the left-hand side variable. These are the same numbers you'd get by asking for descriptive statistics on the left-hand side variable, so long as you were using the sample used in the regression. (Remember: EViews will drop observations from the estimation sample if any of the left-hand side or right-hand side variables are NA, i.e., missing.) The standard deviation of the dependent variable is much larger than the standard error of the regression, so our regression has explained most of the variance in log(volume), which is exactly the story we got from looking at the R-squared.

Why use valuable screen space on numbers you could get elsewhere? Primarily as a safety check.
A quick glance at the mean of the dependent variable guards against forgetting that you changed the units of measurement or that the sample used is somehow different from what you were expecting.

"Adjusted R-squared" makes an adjustment to the plain-old $R^2$ to take account of the number of right-hand side variables in the regression. $R^2$ measures what fraction of the variation in the left-hand side variable is explained by the regression. When you add another right-hand side variable to a regression, $R^2$ never falls and almost always rises. (This is a numerical property of least squares.) The adjusted $R^2$, sometimes written $\bar{R}^2$, subtracts a small penalty for each additional variable added.

"F-statistic" and "Prob(F-statistic)" come as a pair and are used to test the hypothesis that none of the explanatory variables actually explain anything. Put more formally, the "F-statistic" computes the standard F-test of the joint hypothesis that all the coefficients, except the intercept, equal zero. "Prob(F-statistic)" displays the p-value corresponding to the reported F-statistic. In this example, there is essentially no chance at all that the coefficients of the right-hand side variables all equal zero.

Parallel construction notice: The fourth and fifth columns in EViews regression output report the t-statistic and corresponding p-value for the hypothesis that the individual coefficient in the row equals zero. The F-statistic in the summary area is doing exactly the same test for all the coefficients (except the intercept) together. This example has only one such coefficient, so the t-statistic and the F-statistic test exactly the same hypothesis. Not coincidentally, the reported p-values are identical and the F- is exactly the square of the t-, $2672 \approx 51.7^2$.

Our final summary statistic is the "Durbin-Watson," the classic test statistic for serial correlation. A Durbin-Watson close to 2.0 is consistent with no serial correlation, while a number closer to 0 means there probably is serial correlation. The "DW," as the statistic is known, of 0.095 in this example is a very strong indicator of serial correlation.

EViews has extensive facilities both for testing for the presence of serial correlation and for correcting regressions when serial correlation exists. We'll look at the Durbin-Watson, as well as other tests for serial correlation and correction methods, later in the book. (See Chapter 13, "Serial Correlation - Friend or Foe?")

A Multiple Regression Is Simple Too

Traditionally, when teaching about regression, the simple regression is introduced first and then "multiple regression" is presented as a more advanced and more complicated technique. A simple regression uses an intercept and one explanatory variable on the right to explain the dependent variable. A multiple regression uses one or more explanatory variables. So a simple regression is just a special case of a multiple regression. In learning about a simple regression in this chapter you've learned all there is to know about multiple regression too.

Well, almost. The main addition with a multiple regression is that there are added right-hand side variables and therefore added rows of coefficients, standard errors, etc. The model we've used so far explains the log of NYSE volume as a linear function of time. Let's add two more variables, time-squared and lagged log(volume), hoping that time and time-squared will improve our ability to match the long-run trend and that lagged values of the dependent variable will help out with the short run.

In the last example, we entered the specification in the Equation Estimation dialog. I find it much easier to type the regression command directly into the command pane, although the method you use is strictly a matter of taste.
The regression command is ls followed by the dependent variable, followed by a list of independent variables (using the special symbol "C" to signal EViews to include an intercept). In this case, type:

    ls log(volume) c @trend @trend^2 log(volume(-1))

and EViews brings up the multiple regression output shown to the right. You already knew some of the numbers in this regression because they appeared in the second column of Table 2.

When you specify a multiple regression, EViews gives one row in the output for each independent variable.

Hint: Most regression specifications include an intercept. Be sure to include "C" in the list of independent variables unless you're sure you don't want an intercept.

Hint: Did you notice that EViews reports one fewer observation in this regression than in the last, and that EViews changed the first date in the sample from the first to the second quarter of 1888? This is because the first data we can use for lagged volume, from second quarter 1888, is the (non-lagged) volume value from the first quarter. We can't compute lagged volume in the first quarter because that would require data from the last quarter of 1887, which is before the beginning of our workfile range.

Hypothesis Testing

We've already seen how to test that a single coefficient equals zero. Just use the reported t-statistic. For example, the t-statistic for lagged log(volume) is 37.89 with 460 degrees of freedom (464 observations minus 4 estimated coefficients). With EViews it's nearly as easy to test much more complex hypotheses.

Click the View button and choose Coefficient Diagnostics/Wald - Coefficient Restrictions… to bring up the dialog shown to the right.

In order to whip the Wald Test dialog into shape you need to know three things:

• EViews names coefficients C(1), C(2), C(3), etc., numbering them in the order they appear in the regression. As an example, the coefficient on LOG(VOLUME(-1)) is C(4).
• You specify a hypothesis as an equation restricting the values of the coefficients in the regression. To test that the coefficient on LOG(VOLUME(-1)) equals zero, specify "C(4)=0".
• If a hypothesis involves multiple restrictions, you enter multiple coefficient equations separated by commas.

Let's work through some examples, starting with the one we already know the answer to: Is the coefficient on LOG(VOLUME(-1)) significantly different from zero?

Hint: We know the results of this test already, because EViews computed the appropriate test statistic for us in its standard regression output.

Complete the Wald Test dialog with C(4)=0. EViews gives the test results as shown to the right. EViews always reports an F-statistic since the F- applies for both single and multiple restrictions. In cases with a single restriction, EViews will also show the t-statistic.

Hint: The p-value reported by EViews is computed for a two-tailed test. If you're interested in a one-tailed test, you'll have to look up the critical value for yourself.

Suppose we wanted to test whether the coefficient on LOG(VOLUME(-1)) equaled one rather than zero. Enter "c(4)=1" to find the new test statistic. So this hypothesis is also easily rejected.

Econometric theory warning: If you've studied the advanced topic in econometric theory called the "unit root problem" you know that standard theory doesn't apply in this test (although the issue is harmless for this particular set of data). Take this as a reminder that you and EViews are a team, but you're the brains of the outfit. EViews will obediently do as it's told. It's up to you to choose the proper procedure.

EViews is happy to test a hypothesis involving multiple coefficients and nonlinear restrictions.
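For the two single-restriction tests on C(4), the arithmetic can be reproduced by hand. The Python sketch below (not EViews code, and not a reproduction of the Wald Test view) uses the coefficient and standard error on LOG(VOLUME(-1)) printed in column (2) of the chapter's table. It assumes the usual result that, with one linear restriction, the t-statistic is the estimate minus the hypothesized value divided by the standard error, and the F-statistic is its square. The function name `wald_single` is made up.

```python
# Reported estimate and standard error for the coefficient on LOG(VOLUME(-1)).
coef_lag = 0.868273
se_lag = 0.022910

def wald_single(estimate, std_error, hypothesized):
    """t- and F-statistics for one linear restriction: coefficient = hypothesized."""
    t = (estimate - hypothesized) / std_error
    return t, t ** 2

t0, f0 = wald_single(coef_lag, se_lag, 0.0)  # restriction c(4)=0
t1, f1 = wald_single(coef_lag, se_lag, 1.0)  # restriction c(4)=1

print(f"c(4)=0: t = {t0:.2f}, F = {f0:.1f}")
print(f"c(4)=1: t = {t1:.2f}, F = {f1:.1f}")
```

The first t-statistic lands on the value already shown in the regression output (about 37.9), and the second is far enough from zero that the hypothesis of a unit coefficient is rejected at conventional sizes, subject to the unit root caveat in the warning above.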
To test that the sum of the first two coefficients equals the sum of the sines of the second two coefficients (and to emphasize that EViews is perfectly happy to test a hypothesis that is complete nonsense) enter "c(1)+c(2)=sin(c(3))+sin(c(4))".

Not only is the hypothesis nonsense, apparently it's not true.

A good example of a hypothesis involving multiple restrictions is the hypothesis that there is no time trend, so the coefficients on both $t$ and $t^2$ equal zero. Here's the Wald Test view after entering "c(2)=0, c(3)=0". The hypothesis is rejected. Note that EViews correctly reports 2 degrees of freedom for the test statistic.

Representing

The Representations view, shown at the right, doesn't tell you anything you don't already know, but it provides useful reminders of the command used to generate the regression, the interpretation of the coefficient labels C(1), C(2), etc., and the form of the equation written out with the estimated coefficients.

Hint: Okay, okay. Maybe you didn't really need the representations view as a reminder. The real value of this view is that you can copy the equation from this view and then paste it into your word processor, or into an EViews batch program, or even into Excel, where with a little judicious editing you can turn the equation into an Excel formula.

What's Left After You've Gotten the Most Out of Least Squares

Our regression equation does a pretty good job of explaining log(volume), but the explanation isn't perfect. What remains (the difference between the left-hand side variable and the value predicted by the right-hand side) is called the residual. EViews provides several tools to examine and use the residuals.

Peeking at the Residuals

The View Actual, Fitted, Residual provides several different ways to look at the residuals.
Usually the best view to look at first is Actual, Fitted, Residual/Actual, Fitted, Residual Graph. Three series are displayed. The residuals are plotted against the left vertical axis, and both the actual (log(volume)) and fitted (predicted log(volume)) series are plotted against the vertical axis on the right. As it happens, because our fit is quite good and because we have so many observations, the fitted values nearly cover up the actual values on the graph. But from the residuals it's easy to see two facts: our model fits better in the later part of the sample than in the earlier years (the residuals become smaller in absolute value), and there are a very small number of data points for which the fit is really terrible.

Points with really big positive or negative residuals are called outliers. In the residual plot we see a small number of spikes which are much, much larger than the typical residual. We can get a close-up of the residuals by choosing Actual, Fitted, Residual/Residual Graph.

It might be interesting to look more carefully at specific numbers. Choose Actual, Fitted, Residual/Actual, Fitted, Residual Table for a look that includes numerical values. You can see enormous residuals in the second quarter of 1933. The actual value looks out of line with the surrounding values. Perhaps this was a really unusual quarter on the NYSE, or maybe someone even wrote down the wrong numbers when putting the data together!

Grabbing the Residuals

Since there is one residual for each observation, you might want to put the residuals in a series for later analysis. Fine. All done. Without you doing anything, EViews stuffs the residuals into the special series RESID after each estimation. You can use RESID just like any other series.

Resid Hint 1: That was a very slight fib.
EViews won't let you include RESID as a series in an estimation command because the act of estimation changes the values stored in RESID.

Resid Hint 2: EViews replaces the values in RESID with new residuals after each estimation. If you want to keep a set, copy them into a new series, as in:

    series rememberresids = resid

before estimating anything else.

Resid Hint 3: You can store the residuals from an equation in a series with any name you like by using Proc/Make Residual Series... from the equation window.

Quick Review

To estimate a multiple regression, use the ls command followed first by the dependent variable and then by a list of independent variables. An equation window opens with estimated coefficients, information about the uncertainty attached to each estimate, and a set of summary statistics for the regression as a whole. Various other views make it easy to work with the residuals and to test hypotheses about the estimated coefficients.

In later chapters we turn to more advanced uses of least squares. Nonlinear estimation is covered, as are methods of dealing with serial correlation. And, predictably, we'll spend some time talking about forecasting.
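
For readers who like to see the arithmetic behind the dialog boxes, here is a minimal sketch in plain Python (not EViews code) of two ideas from this chapter: residuals are simply actual minus fitted values, and for a single restriction the F-statistic is the square of the t-statistic. Everything in it is an assumption made for illustration: the data are made up (they are not the NYSE volume series), the model is a one-regressor line rather than the chapter's regression, and the function names ols_simple and slope_test are hypothetical.

```python
# Illustrative sketch only: made-up data, hypothetical function names.

def ols_simple(x, y):
    """Least squares fit of y = a + b*x; returns (a, b, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual = actual value minus fitted value, one per observation.
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, residuals


def slope_test(x, y, b0):
    """t- and F-statistics for the single restriction slope = b0."""
    n = len(x)
    a, b, resid = ols_simple(x, y)
    ssr_u = sum(e ** 2 for e in resid)      # unrestricted sum of squared residuals
    s2 = ssr_u / (n - 2)                    # residual variance (2 estimated coefficients)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    t_stat = (b - b0) / (s2 / sxx) ** 0.5   # estimate minus hypothesized value, over std. error
    # Restricted model: slope fixed at b0, so only the intercept is estimated.
    z = [yi - b0 * xi for xi, yi in zip(x, y)]
    a_r = sum(z) / n
    ssr_r = sum((zi - a_r) ** 2 for zi in z)
    f_stat = ((ssr_r - ssr_u) / 1) / s2     # one restriction in the numerator
    return t_stat, f_stat


# Made-up data for the demonstration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.8, 9.1]

a, b, resid = ols_simple(x, y)
saved_resid = list(resid)   # keep a copy before fitting anything else

t_zero, f_zero = slope_test(x, y, 0.0)   # analogous to testing "slope = 0"
t_one, f_one = slope_test(x, y, 1.0)     # analogous to testing "slope = 1"

print("slope:", b)
print("H0 slope=0: t =", t_zero, " F =", f_zero)
print("H0 slope=1: t =", t_one, " F =", f_one)
```

The sketch stops at the test statistics. Turning them into p-values needs the t or F distribution, which is the part EViews handles for you. Copying the residuals into saved_resid mirrors the advice in Resid Hint 2: take a copy before another estimation overwrites them.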