Wednesday 7 February 2018

Time Series Analysis (Lecture 1): Sourcing Data, Theoretical Framework and Model Specification

Caution: This tutorial is only a guide and should not be adopted in its entirety. Endeavour to consult your tutor and other resource materials for proper guidance!


Introduction

The dissertation fervor is heating up with the usual twists and turns. In view of this, and in response to readers’ requests, I will be starting a series of lectures on how to run time series and panel data analyses. These will come in parts and will be supported by short video tutorials posted to YouTube (so be sure to subscribe to get the hands-on training). In order not to leave anyone out, these practical lectures will be carried out using three (3) analytical packages that are common among final-year students: Stata, EViews and Excel. Also, real country-level data will be used (but subject to my modifications to prevent unethical conduct by readers). Lastly, only quantitative research will be addressed.

For time series analysis, the lectures will only cover: data sourcing, model specification, lag selection, unit root testing, cointegration tests, the vector autoregressive (VAR) model, the autoregressive distributed lag (ARDL) model, the vector error correction model (VECM), Granger causality tests, the CUSUMSQ test and other post-estimation tests. For panel data analysis, the lectures will only cover: setting up panel data in Stata and EViews, data sourcing, model specification, the Hausman test, the fixed effects (FE) model, the random effects (RE) model and the generalised method of moments (GMM).

So, to get each tutorial the moment I click the “post” button, I encourage you to subscribe to these blog posts. Use the “Follow by Email” menu on my blog https://cruncheconometrix.blogspot.com.ng, activate the link once you receive the notification in your email (check your spam box too) and you are good to go! Likewise, follow that up by subscribing to my YouTube videos for those short hands-on video clips. Click on this link: CrunchEconometrix YouTube videos and subscribe!

Data Sourcing
“I can’t get data!!!”, “what’s a proxy?”, “I have data but not for 30 years”, “how do I go about modelling my theoretical framework?”, “how do I construct my empirical model?”, “in fact, I’m confused!”… so many questions and, believe me, the chattering seems endless. First, I always tell students to relax! Secondly, I tell them that the moment the research area has been identified and the topic streamlined, the next thing to do is to go on a data search. Okay, think about this: of what use is empirical research if there is no data (or you have insufficient data to test your hypothesis)? So, before you proceed to writing chapter 1 (that is, the study background), make certain that you have the data handy.

Primary Data Sources
Regardless of the field of study or research discipline, primary data gathering requires the use of questionnaires, interviews, focus group discussions etc. It may require one of these or a combination of two or three data-gathering methods. So, if you are using primary data, be sure to get these materials out and distribute them to the respondents in order to harvest responses within the shortest time frame. Getting a good number of responses is a prerequisite for quality research and unbiased results. However, these structured tutorials will not be extended to analysing primary data… my sincere apologies!

Secondary Data Sources
Since research is not limited to those in the field of economics, it is important that researchers identify the databases hosting the relevant data required for their work. As an economist, I will indicate some databases/sources where students can source their data. Here are some which can be accessed (for macro and micro datasets):
IEA Coal Information
IEA CO2 Emissions from Fuel Combustion
IEA Electricity Information
IEA Energy Prices and Taxes
IEA Energy Technology Research and Development Database
IEA Natural Gas Information
IEA Oil Information
IEA Renewables Information
IEA World Energy Statistics and Balances
ILO Key Indicators of the Labour Market
IMF Balance of Payment Statistics
IMF Direction of Trade Statistics
IMF Government Finance Statistics
IMF International Financial Statistics
IMF World Economic Outlook
OECD Education Statistics
OECD Globalisation
OECD International Development
OECD International Direct Investment Statistics
OECD International Migration Statistics
OECD International Trade by Commodities Statistics
OECD Main Economic Indicators
OECD Main Science and Technology Indicators
OECD National Accounts
OECD Quarterly Labour Force Statistics
OECD Services Statistics
OECD Social Expenditure Database
OECD Structural Analysis
UNIDO Industrial Demand Supply
UNIDO Industrial Statistics
World Bank Global Development Finance
World Bank World Development Indicators
World Bank Africa Development Indicators

Other sources of international data include, but are not limited to:
International Monetary Fund - http://www.imf.org/external/data.htm#data
United Nations - http://data.un.org/
Data on aid flows compiled by OECD - http://www.oecd.org/dac/stats/
NBER data sets - http://www.nber.org/data/

For information on over 256 countries and regions since 1960, the accessible databases are:
World Development Indicators
Global Development Finance
The African Development Indicators
Doing Business
Education Statistics
Enterprise Surveys
Gender Statistics
Health Nutrition and Population Statistics
Millennium Development Goals
Worldwide Governance Indicators
Endeavour to check out those sites that are relevant to your study.
Note: you are expected to state your data source in your thesis/dissertation, along with the years of coverage, say 1980 to 2016 or 1970 to 2015.
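
Since the years of coverage must be reported, it helps to know the longest unbroken stretch of observations in a downloaded series. The following is only a minimal Python sketch (Python is not one of the packages used in these lectures); the function name and the small `gdp_growth` series are hypothetical and made up for illustration, with `None` standing for a missing year.

```python
def longest_complete_span(series):
    """Return (start_year, end_year, n_obs) for the longest unbroken run
    of non-missing annual observations, or None if there are none."""
    best = None
    run_start = None
    prev_year = None
    for year in sorted(series):
        missing = series[year] is None
        contiguous = prev_year is not None and year == prev_year + 1
        if missing:
            run_start = None          # a gap ends the current run
        else:
            if run_start is None or not contiguous:
                run_start = year      # start a new run at this year
            length = year - run_start + 1
            if best is None or length > best[2]:
                best = (run_start, year, length)
        prev_year = year
    return best


# Hypothetical series: values are invented, None marks missing data.
gdp_growth = {
    1980: None, 1981: 2.1, 1982: 2.4, 1983: None,
    1984: 3.0, 1985: 3.2, 1986: 2.9, 1988: 3.1,
}

span = longest_complete_span(gdp_growth)
print(span)  # (1984, 1986, 3)
```

Here 1987 is absent from the series altogether, so the run that starts in 1984 stops at 1986.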

Model Framework and Specification
This section focusses on the theoretical framework and model specification. I will also touch on the description of variables in a model, the a priori expectations and, finally, the method of analysis (or the estimation technique(s) to be used in testing the research hypothesis).

Theoretical Framework
Before you specify the empirical model, you must first state the theoretical model. That is, let your readers know what your empirical model is linked to. The theoretical model is the model underlying the theory you are using to undertake your research, because no research can be done in isolation without an underlying theory. For instance, if my study is on the impact of exchange rate on output, then I must look for a suitable theory which I can adapt to my research. Hence, I may decide to use the “monetary model of exchange rate”, which is one of the earliest models used to determine the exchange rate. It is used as a benchmark for studying the other approaches to exchange rate determination. The monetary model assumes a simple demand for money curve, purchasing power parity (or the law of one price) and a vertical aggregate supply curve.

The theoretical framework can be built like this: (remember that this is just an example, and not to be copied literally!)

From absolute purchasing power parity ($P = EP^{*}$), the exchange rate is obtained by dividing the domestic price level ($P$) by the foreign price level ($P^{*}$). That is: $E_{ppp} = P/P^{*}$. The demand for money assumption: since real money balances depend on real income, demand for money is given as $M^{d} = kPY$, where $k$ is a constant and $Y$ is the real income level. Hence, in equilibrium, money demand ($M^{d}$) equals money supply ($M^{s}$), at the point of intersection of the aggregate demand and aggregate supply curves:
$kPY = M^{s}$
$P = \dfrac{M^{s}}{kY}$
$EP^{*} = P = \dfrac{M^{s}}{kY}$
and $E = \dfrac{M^{s}}{P^{*}kY}$

From the stated framework, it is theorised that if the money supply within an economy increases, E rises; that is, the domestic currency depreciates (since P = EP* defines E as the domestic-currency price of foreign currency). Likewise, the foreign price level and the output level are inversely related to the exchange rate. If money supply rises in the domestic economy while prices are initially held constant, the excess money supply leads to higher demand for goods and services within the economy, which in turn puts upward pressure on domestic prices and, through purchasing power parity, on E.
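
To see how E responds to each term in the final expression of the framework, a few numbers can be plugged in. This is only a small Python sketch (Python is not one of the packages used in these lectures); the function name and every value of Ms, P*, k and Y are hypothetical, chosen purely so the arithmetic is easy to follow.

```python
def exchange_rate(ms, p_star, k, y):
    """Monetary-model exchange rate: E = Ms / (P* * k * Y)."""
    return ms / (p_star * k * y)


# Hypothetical baseline values (not real data).
base = exchange_rate(ms=1000.0, p_star=2.0, k=0.5, y=100.0)

# Change one term at a time, holding the others at the baseline.
more_money = exchange_rate(ms=2000.0, p_star=2.0, k=0.5, y=100.0)
more_output = exchange_rate(ms=1000.0, p_star=2.0, k=0.5, y=200.0)
higher_foreign_prices = exchange_rate(ms=1000.0, p_star=4.0, k=0.5, y=100.0)

print(base)                   # 10.0
print(more_money)             # 20.0 (doubling Ms doubles E)
print(more_output)            # 5.0  (doubling Y halves E)
print(higher_foreign_prices)  # 5.0  (doubling P* halves E)
```

Money supply moves E in the same direction, while the foreign price level and output move it in the opposite direction.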

Model Specification
So, having stated the theoretical framework, I can now go ahead to modify it to suit my research and from there formulate my empirical model. For instance, in using a Cobb-Douglas production function from the neo-classical growth model, I will attempt to explain output growth in the context of capital accumulation, labour and productivity, usually referred to as technological progress. Focusing on a closed economy, I can implicitly express the Cobb-Douglas production model as:

$Y = f(A L^{\beta} K^{\alpha})$   [1]
where Y is output, K is capital stock, L is labour and A is the productivity of labour, which grows at an exogenous rate. Under constant returns to scale, if all inputs are increased by the same proportion, output increases by that same proportion. The production function,

$Y = K^{\alpha} L^{1-\alpha}$   [2]
where β = 1 - α, is widely used by economists and researchers for the following reasons: first, it exhibits constant returns to scale and second, the two exponents, α and (1 - α), sum to one.
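
The constant-returns property of equation [2] can be checked numerically. This is a minimal Python sketch for illustration only; the value of α and the input levels are hypothetical.

```python
import math


def cobb_douglas(k, l, alpha):
    """Output from equation [2]: Y = K^alpha * L^(1 - alpha)."""
    return k ** alpha * l ** (1 - alpha)


alpha = 0.3                            # hypothetical capital share
y0 = cobb_douglas(50.0, 200.0, alpha)  # baseline inputs
y1 = cobb_douglas(100.0, 400.0, alpha) # both inputs doubled

# With exponents summing to one, doubling K and L doubles Y.
print(math.isclose(y1, 2 * y0))  # True
```

If the exponents summed to more (or less) than one, doubling both inputs would more than (or less than) double output.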

The next step is to tie my empirical model to the theoretical framework. That is, given the relationship between exchange rate and output, I can specify the model implicitly as:

$Y_t = f(\text{Exchrate}_t, X_{1t}, X_{2t}, \ldots, X_{nt})$   [3]
where Y = output (the dependent variable, state the measurement either gross output, or % of GDP, or growth rate etc.)
Exchrate = real exchange rate (main explanatory variable)
X1, X2, …, Xn = control variables (state their individual measurements either gross output, or % of GDP, or growth rate etc.)

On the basis of the theoretical framework and using the Cobb-Douglas production function, the explicit model is stated as:
$Y_t = \beta_0 + \beta_1 \text{Exchrate}_t + \beta_2 X_{1t} + \beta_3 X_{2t} + \ldots + \beta_{n+1} X_{nt} + u_t$   [4]
where u = error term
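
To make equation [4] concrete, here is a minimal Python sketch that estimates the coefficients by ordinary least squares through the normal equations. It is only an illustration under loud assumptions: the data are synthetic (generated from known coefficients with no error term), there is just one control variable, the helper names are hypothetical, and it ignores the time series issues (unit roots, lag selection) that the later lectures address in Stata and EViews.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(y, regressors):
    """Return [b0, b1, ...] from regressing y on a constant and the
    given regressor columns, using the normal equations X'X b = X'y."""
    n = len(y)
    x = [[1.0] + [col[t] for col in regressors] for t in range(n)]
    k = len(x[0])
    xtx = [[sum(x[t][i] * x[t][j] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(x[t][i] * y[t] for t in range(n)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data (not real): Y = 5 - 2*Exchrate + 0.5*X1, with u = 0.
exchrate = [1.0, 1.2, 0.9, 1.5, 1.1, 1.3, 0.8, 1.4]
x1 = [2.0, 2.5, 3.0, 2.2, 3.5, 2.8, 3.1, 2.6]
output = [5.0 - 2.0 * e + 0.5 * c for e, c in zip(exchrate, x1)]

b0, b1, b2 = ols(output, [exchrate, x1])
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

Because the synthetic series has no error term, the estimates should come out at approximately 5.0, -2.0 and 0.5, the coefficients used to generate it.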

A Priori Expectations
Always remember that the a priori expectations are directly related to what theory says. It is from theory that you know what signs of the coefficients are expected for the main regressor and other covariates. For instance, from the theory, it is expected that currency depreciation will have a positive impact on domestic output. Hence, if the real exchange rate is measured such that an increase denotes appreciation, a negative sign of the coefficient is expected (if it is measured the other way round, the expected sign reverses). That is:

β1 < 0

Therefore, the expected signs of the control variables must be in line with their respective theories which must be related to your study.
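
Once a model has been estimated, the a priori expectations can be compared with the estimated signs. The snippet below is a minimal Python sketch; the function name is hypothetical and the coefficient values are invented for illustration, not estimation results.

```python
def check_signs(expected, estimated):
    """Compare expected signs ('+' or '-') with estimated coefficients.
    Returns a dict mapping each variable to True if the sign matches."""
    result = {}
    for name, sign in expected.items():
        coef = estimated[name]
        result[name] = (coef > 0) if sign == "+" else (coef < 0)
    return result


# Expected signs from theory, and hypothetical (invented) estimates.
expected = {"Exchrate": "-", "X1": "+"}
estimated = {"Exchrate": -0.42, "X1": -0.10}

print(check_signs(expected, estimated))
# {'Exchrate': True, 'X1': False}
```

A mismatch, such as X1 here, is something to discuss in the write-up rather than a reason to discard the result.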

Estimation Technique
At this point, the researcher may not know the exact technique to adopt between the vector autoregressive (VAR) model and the autoregressive distributed lag (ARDL) model. The choice between these two depends on the outcome of the unit root test (URT) on each of the variables used in the model. This implies that, until the URT is carried out, one cannot know whether to use the VAR or the ARDL model. This is because if the variables are integrated of the same order, the VAR model is applicable, whereas the ARDL model suffices otherwise, typically when the variables are a mix of I(0) and I(1) series. I will discuss these two in detail in subsequent tutorials using both Stata and EViews, in addition to video clips on how to perform them.
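
The rule of thumb above can be written down as a small decision function. This is a minimal Python sketch with a hypothetical function name; the orders of integration passed in are invented examples, not test results. One branch is an added assumption that is not stated in the text: when the orders differ and any series is I(2) or higher, the function suggests neither model, since the ARDL bounds approach is generally taken to require I(0)/I(1) series only.

```python
def suggest_model(orders):
    """Suggest a technique from unit root test outcomes.

    orders maps each variable name to its order of integration
    (0 for I(0), 1 for I(1), and so on).
    """
    distinct = set(orders.values())
    if len(distinct) == 1:
        return "VAR"      # all variables integrated of the same order
    if max(distinct) >= 2:
        # Added assumption (not from the text): mixed orders that
        # include I(2) or higher are flagged rather than sent to ARDL.
        return "neither"
    return "ARDL"         # a mix of I(0) and I(1) variables


# Hypothetical unit root outcomes.
print(suggest_model({"Y": 1, "Exchrate": 1, "X1": 1}))  # VAR
print(suggest_model({"Y": 1, "Exchrate": 0, "X1": 1}))  # ARDL
print(suggest_model({"Y": 2, "Exchrate": 1}))           # neither
```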

Variables, Measurement and Description
Lastly, tabulate your variables, detailing their names, short descriptions, measurements and sources.

Here’s an example:
Table xxx: Variables Description and Measurement
Variables          | Short Definition | Measurement | Source
-------------------|------------------|-------------|------------------
Output             |                  |             | World Bank (2016)
Real exchange rate |                  |             | World Bank (2016)
Source: Researcher’s compilation (always put this at the bottom of the Table)

Conclusion
I have taken you through the steps required for sourcing your data, formulating your theoretical framework, adapting the framework to align with your research, constructing your empirical model, stating the a priori expectations, having an idea about the estimation technique and tabulating your variables with their brief descriptions, measurements and data sources.

From the next lecture, I will begin analysing the data using both Stata and EViews. So, endeavour to follow these tutorials and get the most out of them to ease the dissertation pressure. Make sure you follow me on the next lecture in the series, which is: Time Series Analysis (Lecture 2): Optimal lag selection.

If you have any comments or questions in relation to what has been discussed in this post, do not hesitate to post them in the comment section below.
