Caution: This tutorial is only a guide and should not be adopted in its entirety. Endeavour to consult your tutor and other resource materials for proper guidance!
Introduction
The dissertation fervor is heating up with the usual twists and turns. In view of this, and in response to readers’ requests, I will be starting a series of lectures on how to run time series and panel data analyses. These will come in parts and be supported with short video tutorials posted to YouTube (so be sure to hook up and get the hands-on training). In order not to leave anyone out, these practical lectures will be carried out using three (3) analytical packages that are common among final-year students – Stata, EViews and Excel. Also, real country-level data will be used (but subject to my modifications to discourage unethical use by readers). Lastly, only quantitative research will be addressed.
For time series analysis, the lectures will cover: data sourcing, model specification, lag selection, unit root testing, cointegration tests, the vector autoregressive (VAR) model, the autoregressive distributed lag (ARDL) model, the vector error correction model (VECM), Granger causality tests, the CUSUMSQ test and other post-estimation tests. For panel data analysis, the lectures will cover: setting up panel data in Stata and EViews, data sourcing, model specification, the Hausman test, the fixed effects (FE) model, the random effects (RE) model and the generalised method of moments (GMM).
So, to receive these tutorials the moment I click the “post” button, I encourage you to subscribe to the blog posts. Use the “Follow by Email” menu on my blog https://cruncheconometrix.blogspot.com.ng, activate the link once you receive the notification in your email (check your spam box too) and you are good to go! Likewise, follow that up by subscribing to my YouTube channel for the short hands-on video clips. Click on this link: CrunchEconometrix YouTube videos and subscribe!
Data Sourcing
“I can’t get data!!!”, “What’s a proxy?”, “I have data but not for 30 years”, “How do I go about modelling my theoretical framework?”, “How do I construct my empirical model?”, “In fact, I’m confused!”… so many questions, and believe me, the chattering seems endless. First, I always tell students to relax! Secondly, I tell them that the moment the research area has been identified and the topic streamlined, the next thing to do is to go on a data search. Okay, think about this: of what use is empirical research if there is no data (or you have insufficient data to test your hypothesis)? So before you proceed to writing Chapter 1 (that is, the study background), make certain that you have the data handy.
Primary Data Sources
Regardless of the field of study or research discipline, primary data gathering requires the use of questionnaires, interviews, focus group discussions, etc. It may require one of these or a combination of two or three data-gathering methods. So, if you are using primary data, ensure you send out these materials and distribute them to the respondents in order to harvest responses within the shortest time frame. Getting a good number of responses is a precursor to quality research and unbiased results. However, these structured tutorials will not be extended to analysing primary data… my sincere apologies!
Secondary Data Sources
Since research is not limited to those in the field of economics, it is important that researchers identify the databases hosting the relevant data required for their work. As an economist, I will indicate some databases/sources where students can source their data. Here are some which can be accessed (for macro and micro datasets):
IEA Coal Information
IEA CO2 Emissions from Fuel Combustion
IEA Electricity information
IEA Energy Prices and Taxes
IEA Energy Technology Research and Development Database
IEA Natural Gas Information
IEA Oil Information
IEA Renewables Information
IEA World Energy Statistics and Balances
ILO Key Indicators of the Labour Market
IMF Balance of Payment Statistics
IMF Direction of Trade Statistics
IMF Government Finance Statistics
IMF International Financial Statistics
IMF World Economic Outlook
OECD Education Statistics
OECD Globalisation
OECD International Development
OECD International Direct Investment Statistics
OECD International Migration Statistics
OECD International Trade by Commodities Statistics
OECD Main Economic Indicators
OECD Main Science and Technology Indicators
OECD National Accounts
OECD Quarterly Labour Force Statistics
OECD Services Statistics
OECD Social Expenditure Database
OECD Structural Analysis
UNIDO Industrial Demand Supply
UNIDO Industrial Statistics
World Bank Global Development Finance
World Bank World Development Indicators
World Bank Africa Development Indicators
Other sources of international data include, but are not limited to, the following databases, which provide information on over 256 countries and regions since 1960:
World Development Indicators
Global Development Finance
The African Development Indicators
Doing Business
Education Statistics
Enterprise Surveys
Gender Statistics
Health Nutrition and Population Statistics
Millennium Development Goals
Worldwide Governance Indicators
Endeavour to
check out those sites that are relevant to your study.
Note: you are expected to state your data source in your thesis/dissertation, along with the years of coverage, say 1980 to 2016, or 1970 to 2015, etc.
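Once the data are downloaded, load them into your analytical package before anything else. As a minimal sketch (assuming, purely for illustration, a CSV file named wdi_extract.csv exported from the World Bank DataBank with columns year, gdp and exchrate), the Stata commands could look like this:

* Illustrative sketch only -- the file name and variable names below are hypothetical
import delimited "wdi_extract.csv", clear   // read the downloaded CSV into Stata
tsset year                                  // declare the dataset as a yearly time series
summarize gdp exchrate                      // quick check of coverage and missing values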
Model Framework and Specification
This section focusses on the theoretical framework and model specification. I will also touch on the description of the variables in a model, the a priori expectations and, finally, the method of analysis (or the estimation technique(s) to be used in testing the research hypothesis).
Theoretical Framework
Before you specify the empirical model, you must first state the theoretical model. That is, let your readers know what your empirical model is linked to. The theoretical model is the model supporting the theory on which your research rests, because no research can be done in isolation, without an underlying theory. For instance, if my study is on the impact of the exchange rate on output, then I must look for a suitable theory which I can adapt to my research. Hence, I may decide to use the “monetary model of exchange rate determination”, which is one of the earliest models used to explain the exchange rate and which serves as a benchmark for studying the other approaches to exchange rate determination. The monetary model assumes a simple demand for money curve, purchasing power parity (the law of one price) and a vertical aggregate supply curve.
The theoretical framework can be built like this (remember that this is just an example, and not to be copied literally!). From absolute purchasing power parity (P = EP*), the exchange rate is obtained by dividing the domestic price level by the foreign price level, that is, Eppp = P/P*. The demand-for-money assumption: since real money balances depend on real income, the demand for money is given as Md = kPY, where k is a constant and Y is the real income level. In equilibrium, money demand (Md) equals money supply (Ms), and at the point of intersection of the aggregate demand and aggregate supply curves:

kPY = Ms
P = Ms/(kY)
EP* = P = Ms/(kY)
and E = Ms/(kYP*)
From the stated framework, it is theorised that if the money supply within an economy increases, the domestic currency will depreciate (E rises). Likewise, the foreign price level and the output level are inversely related to the exchange rate. If the money supply rises in the domestic economy while prices are held constant, the excess money supply leads to higher demand for goods and services within the economy.
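To make this concrete, here is a small worked illustration using purely hypothetical values, with E expressed as units of domestic currency per unit of foreign currency. Take Ms = 500, k = 0.25, Y = 100 and P* = 2. Then:

E = Ms/(kYP*) = 500/(0.25 × 100 × 2) = 10

If the money supply doubles to Ms = 1,000 while k, Y and P* are unchanged, E rises to 20: more domestic currency is needed per unit of foreign currency, i.e. the domestic currency depreciates. Conversely, a rise in Y or P* lowers E.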
Model Specification
So, having stated the theoretical framework, I can now go ahead to modify it to suit my research and from there formulate my empirical model. For instance, in using a Cobb-Douglas production function from the neo-classical growth model, I will attempt to explain output growth in the context of capital accumulation, labour and productivity, usually referred to as technological progress. Focusing on a closed economy, I can implicitly express the Cobb-Douglas production model as:
Y = f(AL^β K^α)        [1]

where Y is output, K is the capital stock, L is labour and A is the productivity of labour, which grows at an exogenous rate. As a result of constant returns to scale, if all inputs are increased by the same proportion, output increases by that same proportion. The production function

Y = K^α L^(1-α)        [2]

where (1 - α = β), is mainly used by economists and researchers for two reasons: firstly, it exhibits constant returns to scale and, secondly, the two exponents, α and (1 - α), sum to one.
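As a quick check of the constant-returns-to-scale claim (a worked step added here for illustration, not part of the original specification), scale both inputs in equation [2] by any factor λ > 0:

(λK)^α (λL)^(1-α) = λ^α λ^(1-α) K^α L^(1-α) = λ K^α L^(1-α) = λY

so output scales by exactly the same factor as the inputs, and the exponents α and (1 - α) clearly sum to one.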
Next is to tie my empirical model to the theoretical framework. That is, given the relationship between the exchange rate and output, I can specify the model implicitly as:

Y_t = f(Exchrate_t, X1_t, X2_t, …, Xn_t)        [3]
where Y = output (the dependent variable; state the measurement, e.g. gross output, % of GDP, or growth rate)
Exchrate = real exchange rate (the main explanatory variable)
X1, X2, …, Xn = control variables (state their individual measurements, e.g. gross output, % of GDP, or growth rate)
On the basis of the theoretical framework and using the Cobb-Douglas production function, the explicit model is stated as:

Y_t = β0 + β1Exchrate_t + β2X1_t + β3X2_t + … + βnXn_t + u_t        [4]
where u_t = the error term
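Purely as an illustration of how equation [4] maps onto an estimation command (the variable names are hypothetical, and simple OLS is shown only as a placeholder before the proper time-series techniques are discussed below), the Stata syntax could look like this:

* Hypothetical variable names; OLS shown only as a placeholder estimate of equation [4]
tsset year                        // declare the yearly time series
regress lngdp exchrate x1 x2      // Y_t on Exchrate_t and two control variables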
A Priori Expectations
Always know that the a priori expectations are directly related to what theory says. It is from theory that you know the expected signs of the coefficients on the main regressor and the other covariates. For instance, theory suggests that currency depreciation will have a positive impact on domestic output; hence, if the exchange rate is measured such that a fall corresponds to a depreciation, a negative sign is expected on its coefficient. That is:

β1 < 0
Therefore, the expected signs of the control variables must be in line with their respective theories, which should be relevant to your study.
Estimation Technique
At this point, the researcher may not know the exact technique to adopt between the vector autoregressive (VAR) model and the autoregressive distributed lag (ARDL) model. The choice between these two is subject to the outcome of the unit root test (URT) on each of the variables in the model. This implies that, until the URT is carried out, one cannot know whether to use the VAR or the ARDL model: if the variables are integrated of the same order, the VAR model is applicable, while the ARDL model suffices if they are integrated of different orders. I will discuss these two in detail in subsequent tutorials using both Stata and EViews, in addition to video clips on how to perform them.
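As a foretaste, here is a minimal Stata sketch (with hypothetical variable names) of the kind of unit root check that drives this choice; the detailed treatment follows in later lectures:

* Hypothetical variable names; ADF unit root tests in levels and first differences
tsset year
dfuller lngdp, lags(2) trend        // augmented Dickey-Fuller test in levels
dfuller d.lngdp, lags(2) trend      // ... and in first differences
dfuller exchrate, lags(2) trend
dfuller d.exchrate, lags(2) trend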
Variables, Measurement and Description
Lastly, tabulate
your variables detailing their names, short description, measurement and
sources.
Here’s an
example:
Table xxx: Variables Description and Measurement

| Variables          | Short Definition | Measurement | Source            |
| Output             |                  |             | World Bank (2016) |
| Real exchange rate |                  |             | World Bank (2016) |

Source: Researcher’s compilation (always put this at the bottom of the Table)
Conclusion
I have taken you through the steps required for sourcing your data, formulating your theoretical framework, adapting the framework to align with your research, constructing your empirical model, stating the a priori expectations, having an idea about the estimation technique, and tabulating your variables with a brief description, their measurements and data sources.
From the next lecture, I will begin analysing the data using both Stata and EViews. So, endeavour to follow these tutorials and get the most out of them to ease the dissertation pressure. Make sure you join me for the next lecture in the series: Time Series Analysis (Lecture 2): Optimal lag selection.