Monday, 22 January 2018

Data Handling: Interpretation and Discussion of Results in Scientific Economic Research

Philip O. Alege, Ph.D
Professor of Economics
Department of Economics and Development Studies
Covenant University, Ota, Ogun State

The tools available to modern economists in the discharge of their functions as analysts are very many, simply because knowledge from other sciences such as physics, biology, mechanical engineering and, particularly, mathematics and statistics has infiltrated the discipline of economics. Today, modern economies would be difficult to analyse, understand and predict without the tools of mathematics, statistics and, in particular, econometrics. This can be explained by the growing number of economic activities and interactions among the different agents within a given country and between countries. There is a school of thought that believes in more economics and little mathematics. There is another school of thought that believes a substantial application of the tools of mathematics in economics is necessary to get “useful” results from our analysis. Though I belong to the latter school, I also contend that things must be done properly.

Basics of Econometric Modeling
Basically, econometrics provides empirical support for economic theory. Its main purpose is to estimate the parameter(s) of a model that captures the behaviour of economic agent(s) as described by the theory. Since the estimated parameters may be useful in understanding the economic theory, and for policy analysis and forecasting, it is incumbent on the econometrician to obtain parameter estimates that are efficient. To achieve this, we must adhere to some principles of model building that generate results whose interpretation and discussion will be useful for policy analysis and decision making. These are listed as follows:
·      Economic theory applicable to the specific area of the research
·      Design of the mathematical model and the hypotheses of the study
·      The quest to obtain the right economic statistics, i.e. the collection, collation and analysis of the requisite data for the research, and
·      Interpretation/discussion of the findings/results

The researcher should keep in mind that econometric models are tools and therefore means to some desired ends. That is, our professional calling is to provide plausible parameter estimates that should be useful for policy analysis and decision making. Therefore, any mathematical and/or statistical model must be able to deliver these objectives of the researcher in an efficient manner.

Model Specification and Estimation Techniques
Consequently, model specification is the nucleus/DNA of any scientific economic research. I usually call it the economics of the study. It shows the depth of the researcher's knowledge of theoretical economics as well as the ability to state clearly the contribution(s) to knowledge envisaged in the study. The latter may come as:
·      An additional variable in an existing theoretical model
·      A single equation now specified as a system of equations in order to capture a phenomenon hitherto not considered, or
·      Application of a technique not commonly used in our own environment.

Once the model is correctly specified, the next step is to choose the estimation technique that will produce the most efficient estimates of its parameters and deliver the objective(s) of the study. It is apposite to mention some estimation techniques at this stage, though the list is not exhaustive: ordinary least squares (OLS); indirect least squares (ILS); instrumental variables (IV); two-stage least squares (2SLS) and three-stage least squares (3SLS) for systems of simultaneous equations; the error correction model (ECM), which examines short-run dynamics; cointegration regression; generalised method of moments (GMM); the vector autoregressive method (VAR), which examines the effect of shocks on a system; the structural vector autoregressive method (SVAR); the panel data method; the panel vector autoregressive method (PVAR); the panel structural vector autoregressive method (PSVAR); vector error correction (VECM); panel cointegration; panel vector error correction (PVECM); and so on.
Some learning resource materials include, but are not limited to:
1.    Gujarati D. N. (2013). Basic Econometrics, Eight Edition, McGraw-Hill International Editions Economic Series, Glasgow
2.    Maddala, G. S. and Lahiri, K. (2009). Introduction to Econometrics. Fourth Edition. John Wiley
3.    Wooldridge J. M. (2009). Introductory Econometrics, Fourth Edition, South-Western Cengage Learning, Mason, U.S.A
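
As a minimal sketch of the first technique in the list above, the following Python snippet estimates a two-variable model by OLS using only the standard library. The data and the names `ols_simple`, `income` and `consumption` are hypothetical and chosen purely for illustration; serious work would rely on an econometrics package and the tests appropriate to the chosen technique.

```python
from statistics import mean


def ols_simple(x, y):
    """Estimate y = a + b*x by ordinary least squares; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope: sum of cross-deviations divided by sum of squared deviations of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data, constructed so that consumption = 5 + 0.8 * income
income = [100.0, 120.0, 140.0, 160.0, 180.0, 200.0]
consumption = [85.0, 101.0, 117.0, 133.0, 149.0, 165.0]

intercept, slope = ols_simple(income, consumption)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

Because the illustrative data lie exactly on a line, the estimates simply recover the intercept and slope used to build them; with real data there would also be residuals to examine.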

Dynamic General Equilibrium (DGE) Models
Many other estimation techniques should be of interest to the younger generation of economists. Their basic framework is dynamic general equilibrium (DGE) theory. Models built around this framework are solved using Dynare codes in the MATLAB environment, or directly using MATLAB codes written for such models. Part of the estimation is the need to calibrate the model. This consists of finding values for some parameters of the model through theoretical knowledge, the calculation of long-run averages and micro-econometric studies. The statistics often used are derived from Bayesian inference, as against the classical statistics referred to in the preceding paragraphs. Some of these models are: real business cycle (RBC), New Keynesian models (NKM), dynamic stochastic general equilibrium (DSGE), overlapping generations (OLG), computable general equilibrium (CGE), dynamic computable general equilibrium (DCGE), Bayesian vector autoregression (BVAR), Bayesian structural vector autoregression (BSVAR), dynamic macro panels (DMP), augmented gravity models (AGM) and multicountry New Keynesian (MCNK) models.
Some learning resource materials are:
1.    Wickens, M. (2008). Macroeconomic Theory: A Dynamic General Equilibrium Approach. Princeton University Press, Princeton
2.    Canova, F. (undated). Methods for applied Macroeconomic Research
3.    DeJong, D. N. and Dave, C. (2007). Structural Macroeconometrics, Princeton University Press, Princeton.
4.    Cooley, T. F. (ed.) (1995). Frontiers of Business Cycle Research. Princeton University Press, Princeton.
5.    McCandless, G. (2008). The ABCs of RBCs: An Introduction to Dynamic Macroeconomic Models. Harvard University Press; and
6.    Lucas, R. E. (1991). Models of Business Cycles.

It is apposite to state that researchers must have a working understanding of the tests that must be carried out under each technique of estimation. I also need to caution researchers interested in dynamic general equilibrium modelling that it requires adequate knowledge of computational economics. Specifically, you need a sound working knowledge of the following: dynamic optimization, the method of Lagrange multipliers, continuous-time optimization, dynamic programming, stochastic dynamic optimization, time consistency and time inconsistency, and linear rational-expectations models.
Some learning resource materials:
1.    Dadkhah, K. (undated). Foundations of Mathematical and Computational Economics, Thomson South-Western.

Interpretation of Results
In interpreting the results of an econometric model, you must choose the approach most appropriate for your work, either classical or Bayesian statistics, as mentioned above. This aspect of the work constitutes the scientific content emanating from economic statistics and mathematical economics. Under the classical approach, we should be addressing statistics such as:
·      R-squared
·      Adjusted R-squared (“goodness of fit” test)
·      F-statistic
·      Durbin-Watson statistic
These, in addition to the test of heteroscedasticity, constitute the “diagnostic tests”. If they fail to fall within the zones of acceptance, we cannot go ahead to test for the significance of each variable. There may be a need for model re-specification, detection and correction of autocorrelation, and/or detection and correction of multicollinearity.
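
To make the four statistics listed above concrete, here is a minimal sketch of how they can be computed from the actual and fitted values of a regression. It assumes a model estimated by OLS with an intercept, takes `k` to be the number of estimated parameters (intercept included), and uses hypothetical data; the function names `ols_fit` and `fit_diagnostics` are illustrative only.

```python
def ols_fit(x, y):
    """Fit y = a + b*x by OLS and return the fitted values."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return [a + b * xi for xi in x]


def fit_diagnostics(y, fitted, k):
    """Return R2, adjusted R2, F-statistic and Durbin-Watson statistic.

    k is the number of estimated parameters, intercept included.
    """
    n = len(y)
    y_bar = sum(y) / n
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    tss = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares
    rss = sum(e ** 2 for e in resid)           # residual sum of squares
    r2 = 1 - rss / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)  # degrees-of-freedom correction
    f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))
    # Durbin-Watson: squared changes in successive residuals relative to RSS
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / rss
    return {"r2": r2, "adj_r2": adj_r2, "f": f_stat, "dw": dw}


# Hypothetical data for illustration
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

results = fit_diagnostics(y, ols_fit(x, y), k=2)
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

The Durbin-Watson statistic always lies between 0 and 4, and a value close to 2 suggests no first-order autocorrelation in the residuals.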

We may also need to test for heteroscedasticity. The occurrence of any of these problems is evidence that the assumption(s) of the technique being applied have been violated. This is followed by the statistics that test the significance of the individual variables included in the model. This was the standard during the era of the almighty OLS. Later in the history of applied econometrics, it was observed that certain time series are non-stationary, i.e. their means, variances and covariances are not constant over time. In such a situation, regression results are generally meaningless and are, therefore, termed spurious. To guard against this, the tools often used to examine the stationarity of time series include the following: the Dickey-Fuller test, the “augmented” Dickey-Fuller test when the error term is not white noise, panel data unit root tests, co-integration tests and the error correction model (ECM), to mention a few. The choice among these tests should be guided by the objective of the researcher and the desired contribution(s) to knowledge.
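
For readers who have not met the unit root tests named above, the test regressions can be sketched as follows. The notation is an assumption made for illustration: $y_t$ is the series, $\Delta y_t = y_t - y_{t-1}$, $t$ is a time trend, $p$ is the number of lagged differences and $\varepsilon_t$ is the error term.

```latex
\begin{align}
\text{DF:}\quad \Delta y_t &= \alpha + \delta\, y_{t-1} + \varepsilon_t \\
\text{ADF:}\quad \Delta y_t &= \alpha + \beta t + \delta\, y_{t-1}
    + \sum_{i=1}^{p} \gamma_i\, \Delta y_{t-i} + \varepsilon_t \\
H_0 &: \delta = 0 \ \text{(unit root, non-stationary)}
    \qquad H_1 : \delta < 0 \ \text{(stationary)}
\end{align}
```

The lagged differences in the augmented version are there to absorb serial correlation in the error term. Note that the t-ratio on $\delta$ is compared with Dickey-Fuller critical values, not with the usual t-table.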

Some Pitfalls in Econometrics
·      The wrong way to go in modelling
How one interprets the coefficients in regression models will be a function of how the dependent (y) and independent (x) variables are measured. In general, there are three main types of variables used in econometrics: (1) continuous variables, (2) the natural logarithm of continuous variables, and (3) dummy variables.
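
As a rough guide to the point above, the sketch below shows how the reading of the same coefficient changes with the way y and x are measured. The function `interpret` and its labels ("level", "log", "dummy") are hypothetical names used only for this illustration, and the percentage readings for logged variables are the usual approximations for small changes.

```python
import math


def interpret(beta, y_form, x_form):
    """Describe the effect of x on y for a given functional form.

    y_form: "level" or "log"; x_form: "level", "log" or "dummy".
    """
    if x_form == "dummy":
        if y_form == "log":
            # Exact percentage difference implied by a dummy in a log model
            pct = 100 * (math.exp(beta) - 1)
            return f"y is about {pct:.2f}% different when the dummy equals 1"
        return f"y differs by {beta:.3f} units when the dummy equals 1"
    if y_form == "level" and x_form == "level":
        return f"a one-unit rise in x changes y by {beta:.3f} units"
    if y_form == "level" and x_form == "log":
        return f"a 1% rise in x changes y by about {beta / 100:.4f} units"
    if y_form == "log" and x_form == "level":
        return f"a one-unit rise in x changes y by about {100 * beta:.2f}%"
    if y_form == "log" and x_form == "log":
        return f"a 1% rise in x changes y by about {beta:.3f}% (an elasticity)"
    raise ValueError("unknown combination of variable forms")


print(interpret(0.8, "level", "level"))
print(interpret(0.8, "log", "log"))
print(interpret(0.2, "log", "dummy"))
```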

·      Some Specific Rules of Thumb from Statistics

After performing a regression analysis:
1.    Look at the number of observations:
·      Is it in line with your a priori expectation?
·      If not, you should find out why.
·      Remember, any observations with missing values will be dropped from the regression.
·      Do not take the logarithm of a variable whose value equals zero. The logarithm of zero is undefined, so the model will simply not run.
·      Ensure that the number of observations in your model satisfies the rule of thumb that the sample size should be greater than or equal to 30 (a rule usually associated with the central limit theorem).
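
The checks in this list can be sketched in a few lines of Python. The snippet below is only an illustration under stated assumptions: missing values are represented by `None`, the rows are hypothetical (y, x) pairs, and `prepare_log_sample` is a made-up helper name. It drops incomplete rows and rows with zero or negative values before taking logarithms, then reports how many observations remain.

```python
import math

MIN_OBS = 30  # rule-of-thumb minimum sample size mentioned in the text


def prepare_log_sample(rows):
    """Keep complete rows with strictly positive values and take their logs.

    Returns the logged rows and the number of rows dropped.
    """
    kept, dropped = [], 0
    for y, x in rows:
        # A missing value or a non-positive value makes the log unusable
        if y is None or x is None or y <= 0 or x <= 0:
            dropped += 1
            continue
        kept.append((math.log(y), math.log(x)))
    return kept, dropped


# Hypothetical raw data: one missing value and one zero
rows = [(120.0, 300.0), (95.0, None), (0.0, 210.0), (150.0, 410.0), (80.0, 190.0)]

kept, dropped = prepare_log_sample(rows)
print(f"usable observations: {len(kept)}, dropped: {dropped}")
if len(kept) < MIN_OBS:
    print("warning: sample is below the 30-observation rule of thumb")
```

Comparing the number of usable observations with the number you expected is a quick way to spot rows that were silently lost.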

2.    Observe the value of the R2:
·      The R2 tells you the percentage of the total variation in the dependent variable that the independent variables of your model “explain”.
·      This should lie between 0 and 1. The remainder is the variation left unexplained, which is attributed to the error term.
·      Suppose an estimated model has R2 = 0.46. This means that 46% of the total variation in the dependent variable is explained by the independent variables. For a time series regression, this is not a “good fit”.
·      For a time series regression to have a good fit, as a rule of thumb we should have 0.5<R2<1.
·      However, an R2 of 0.46 is considered good for cross-section data and very good for panel data.

Problems with R2:
·      If you have a ‘very low’ R2, reconsider whether you might have omitted some important variables.
·      However, be careful not to include unnecessary variables only to increase your R2.
·      A ‘very high’ R2 could indicate several problems.
·      Firstly, if a high R2 is combined with few statistically significant variables, your independent variables might be highly correlated amongst themselves (multicollinearity).
·      You might consider dropping some of them in the interest of parsimony.
·      Secondly, a very high R2 might be an indication that you have mis-specified your model.
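
One simple way to check the multicollinearity concern raised above is to look at the correlation between regressors and the resulting variance inflation factor (VIF). The sketch below covers only the two-regressor case, where the VIF reduces to 1 / (1 - r²); the data and function names are hypothetical, and the threshold of 10 is a common rule of thumb, not a formal test.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def vif_two_regressors(x1, x2):
    """VIF when the model has exactly two regressors: 1 / (1 - r^2)."""
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical regressors: x2 is roughly twice x1, so they move together
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]

r = correlation(x1, x2)
vif = vif_two_regressors(x1, x2)
print(f"correlation = {r:.4f}, VIF = {vif:.1f}")
if vif > 10:
    print("VIF above 10: the regressors are highly collinear")
```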

The adjusted R2:
·      The adjusted R2 penalizes the inclusion of more variables, i.e. it corrects for the degrees of freedom.
·      Include as many variables as you need but keep your model as parsimonious as possible. Observe the rules guiding this.

3.    Look at the F-test.
·      The F-test examines the “joint significance” of the model.
·      More formally, it tests the null hypothesis that all your slope coefficients are jointly equal to zero.
·      If they are, your model is effectively not explaining anything. Hint: ideally you want a high F-value and a low corresponding p-value.

4.    Interpret the signs of the coefficients.
·      Which ones should be positive and which should be negative from the theoretical perspective? Interpret this!
·      A positive coefficient means that the variable has a positive impact on your dependent variable, while a negative one means a negative impact or inverse relationship.

5.    Interpret the size of the coefficients where relevant.
·      If you obtain a statistically significant coefficient, wonderful!
·      So maybe you’ve found that consumption increases with disposable income. But by how much? Is the coefficient close to 1, so that the marginal propensity to consume is high and the marginal propensity to save is low? What would be the effect of this on the economy?

6.    Look at the significance of the coefficients (most important?).
·      This should in fact become the first thing that your eyes drift towards when you get regression output.
·      You should feel a little hint of excitement as you wait to find out whether your model works and whether your theory is supported by the data or not.
·      The test of significance is designed to test whether a coefficient is significantly different from zero or not.
·      If it is not, then you cannot conclude that your explanatory variable explains your dependent variable.
·      We use the t-test (just as we learnt in first-year statistics) for this: we compare the calculated t-value with the t-value taken from the table at a given significance level, α, with n - k degrees of freedom, where n = number of observations and k = number of parameters estimated. The coefficient is significant if the absolute value of the calculated t exceeds the table value.
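
The comparison described above can be sketched as follows. The coefficient, its standard error and the sample size are hypothetical; the critical value 2.048 is the two-tailed 5% table value for 28 degrees of freedom, and `t_test` is an illustrative helper, not a library function.

```python
def t_test(coef, std_err, t_critical):
    """Return the calculated t-value and whether it exceeds the table value."""
    t_calc = coef / std_err  # tests the null hypothesis that the coefficient is zero
    return t_calc, abs(t_calc) > t_critical


# Hypothetical regression output
n, k = 30, 2           # observations and estimated parameters
df = n - k             # degrees of freedom = 28
T_TABLE_5PCT = 2.048   # two-tailed 5% critical value for 28 degrees of freedom

t_calc, significant = t_test(coef=0.80, std_err=0.25, t_critical=T_TABLE_5PCT)
print(f"df = {df}, calculated t = {t_calc:.2f}, significant at 5%: {significant}")
```

In practice, regression software reports the p-value directly, which saves looking up the table.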


7.    Others
·      Other tests follow, such as testing for normality of error terms, checking for existence of heteroscedasticity, performing specification and robustness tests.
·      These exciting topics must also be covered if your econometric work is to produce output that is valuable for policy making and decision making.

Discussion of Results
The essence of scientific economic research is to build economic models that enable us to obtain plausible estimates from a given set of data. We should know that the estimated structural parameters encapsulate human behaviour and, therefore, in discussing them, we need to go beyond the confines of economics to locate additional means of buttressing our results from the:
·      historical context
·      socio-political condition
·      psychological state, and
·      international environment

I have tried to raise some important issues in this post. There are so many things to keep in mind when preparing a piece of research. The most important of them all is the need to keep your model simple and avoid frivolities in modelling. It is important to remember that we are first of all economists. The tools of analysis at our disposal should not overshadow that calling.

Quite a lot has been said about how a researcher in the field of economics can handle data, interpret and discuss research findings that will be relevant for policy-making. If you still have further questions or comments in this regard, kindly post them below for the benefit of all.
