ESSSSDA 2022 3K: Dynamics and Heterogeneity

Slides

`panelr`

The panelr package vignette on between-within

Starting with panel data, or the generalization to multiple time series, perhaps the most famous question in the literature concerns fixed and random effects. More precisely, do we estimate the specific unobserved constants, or do we seek only the distribution of these constants? The implications of this basic issue are substantial.

Some Simulated Data

Random effects and pooled regressions can be terribly wrong when the moment condition they share, that the unit effects are uncorrelated with X, fails. Let’s show some data here to illustrate the point. The true model here is \[ y_{it} = \alpha_{i} + X_{it}\beta + \epsilon_{it} \] where \(\beta=1\), \(\alpha_{i} \in \{6,0,-6\}\), and \(\epsilon_{it} \sim \mathcal{N}(0,1)\). Here is the plot.

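# Three units, 41 periods each; the range of X shifts right by 0.5 from one unit to the next
X.FE <- c(seq(-2.5,-0.5,by=0.05),seq(-2,0,by=0.05),seq(-1.5,0.5,by=0.05))
# Unit intercepts are -3*c(-2,0,2) = c(6,0,-6); the slope on X is 1; noise is N(0,1)
# No seed is set, so the draws (and the estimates below) vary from run to run
y.FE <- -3*c(rep(-2,41),rep(0,41),rep(2,41))+X.FE + rnorm(123,0,1)
FE.data <- data.frame(y.FE,X.FE,unit=c(rep(1,41),rep(2,41),rep(3,41)), time=rep(seq(1,41,1),3))
# Save a Stata copy of the data
library(foreign)
write.dta(FE.data, "FEData-2.dta")
# Left panel: pooled scatter with the pooled OLS fit
par(mfrow=c(1,2))
with(FE.data, plot(X.FE,y.FE, bty="n", main="Pooled"))
with(FE.data, abline(lm(y.FE~X.FE), lty=2, col="brown"))
# Right panel: points colored by unit, with the three true unit-specific lines
with(FE.data, plot(X.FE,y.FE, bty="n", col=unit, main="Fixed Effects"))
abline(a=-6,b=1, col="blue")
abline(a=0,b=1, col="blue")
abline(a=6,b=1, col="blue")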

Three Models

library(plm)
# Declare the panel structure: unit is the individual index, time is the time index
FE.pdata <- pdata.frame(FE.data, c("unit","time"))
# Random effects under four variance-component estimators (the plm default is Swamy-Arora, "swar")
mod.RE <- plm(y.FE~X.FE, data=FE.pdata, model="random")
mod.RE2 <- plm(y.FE~X.FE, data=FE.pdata, model="random", random.method = "amemiya")
mod.RE3 <- plm(y.FE~X.FE, data=FE.pdata, model="random", random.method = "walhus")
mod.RE4 <- plm(y.FE~X.FE, data=FE.pdata, model="random", random.method = "nerlove")
# Within (fixed effects) and pooled OLS for comparison
mod.FE <- plm(y.FE~X.FE, data=FE.pdata, model="within")
mod.pool <- plm(y.FE~X.FE, data=FE.pdata, model="pooling")
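What the within model does can be sketched in base R alone. The block below is a minimal illustration, not the author's code: it redraws data from the same design and shows that demeaning by unit gives the same slope as a regression with unit dummies, while pooled OLS gives something very different. The seed and all object names (`alpha`, `y_dm`, `x_dm`, `b_within`, `b_lsdv`, `b_pooled`) are assumptions made for this sketch.

```r
# Minimal base-R sketch of the within (fixed effects) estimator.
# The seed and all object names are illustrative assumptions.
set.seed(42)
n_units <- 3
n_t <- 41
unit <- rep(1:n_units, each = n_t)

# Same design as the simulation above: X shifts right as the intercept falls
x <- c(seq(-2.5, -0.5, by = 0.05), seq(-2, 0, by = 0.05), seq(-1.5, 0.5, by = 0.05))
alpha <- c(6, 0, -6)[unit]
y <- alpha + 1 * x + rnorm(n_units * n_t)

# Within transformation: subtract each unit's own mean from y and x
y_dm <- y - ave(y, unit)
x_dm <- x - ave(x, unit)
b_within <- unname(coef(lm(y_dm ~ x_dm - 1)))

# Least-squares dummy variable (LSDV) regression: one intercept per unit
b_lsdv <- unname(coef(lm(y ~ x + factor(unit)))["x"])

# Pooled OLS ignores the unit intercepts entirely
b_pooled <- unname(coef(lm(y ~ x))["x"])

print(c(within = b_within, lsdv = b_lsdv, pooled = b_pooled))
```

The demeaned and dummy-variable slopes agree up to numerical precision, which is the sense in which the within model "estimates the constants" without reporting them.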

Omitted Fixed Effects can be Very Bad

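As the table shows, the default random effects estimator in R (random.method = "swar" in plm) [and Stata] is actually pretty horrible: with a true slope of 1, it returns the same estimate as the pooled regression.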

library(stargazer)
# Column labels follow the order of the models: mod.RE2 is Amemiya, mod.RE3 is Wallace-Hussain
stargazer(mod.RE,mod.RE2,mod.RE3,mod.RE4,mod.pool,mod.FE, type="html", column.labels=c("RE","RE-Amemiya","RE-WalHus","RE-Nerlove","Pooled","FE"))

Dependent variable: y.FE

| | RE (1) | RE-Amemiya (2) | RE-WalHus (3) | RE-Nerlove (4) | Pooled (5) | FE (6) |
|---|---|---|---|---|---|---|
| X.FE | -3.043*** (0.524) | 0.837*** (0.140) | 0.764*** (0.164) | 0.839*** (0.139) | -3.043*** (0.524) | 0.842*** (0.140) |
| Constant | -4.084*** (0.646) | -0.203 (2.866) | -0.277 (0.845) | -0.202 (3.538) | -4.084*** (0.646) | |
| Observations | 123 | 123 | 123 | 123 | 123 | 123 |
| R² | 0.218 | 0.228 | 0.153 | 0.230 | 0.218 | 0.234 |
| Adjusted R² | 0.211 | 0.222 | 0.146 | 0.224 | 0.211 | 0.215 |
| F Statistic | 33.666*** | 35.758*** | 21.821*** | 36.205*** | 33.666*** (df = 1; 121) | 36.446*** (df = 1; 119) |

Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors in parentheses.

Discussion

The random method matters quite a bit, though: the three alternatives to the default give slopes between 0.76 and 0.84, far closer to the true value of 1. Models containing much or all of the between information are wrong.
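One way to see why the random method matters is the standard quasi-demeaning form of the random effects estimator for a balanced panel. This is a sketch in notation introduced here rather than taken from the text: \(\theta\) is the quasi-demeaning weight, \(\mu\) the overall intercept, \(\nu_{it}\) the transformed error, \(\sigma^2_{\alpha}\) and \(\sigma^2_{\epsilon}\) the variances of the unit effects and the idiosyncratic errors, and \(T\) the number of periods per unit.

\[ y_{it} - \theta\bar{y}_{i} = (1-\theta)\mu + (X_{it} - \theta\bar{X}_{i})\beta + \nu_{it}, \qquad \theta = 1 - \sqrt{\frac{\sigma^2_{\epsilon}}{\sigma^2_{\epsilon} + T\sigma^2_{\alpha}}} \]

With \(\theta = 0\) this is the pooled regression, and as \(\theta \rightarrow 1\) it approaches the within regression. Each random method estimates the two variance components differently and so implies a different \(\hat{\theta}\). A random effects column identical to the pooled column is what \(\hat{\theta} = 0\) would produce, and estimates near the within column suggest a \(\hat{\theta}\) near one.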

If X and the unit effects are dependent, then there are serious threats to proper inference.

`plm` things

Beck and Katz (1995) panel-corrected standard errors are provided by vcovBK(). The key argument is cluster, which determines whether the error covariance is averaged over groups or over time. The Beck and Katz paper corresponds to cluster="time".
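As a hedged usage sketch, the block below applies vcovBK() to a within model on the Grunfeld data that ships with plm. Neither the within specification nor the lmtest package for printing the coefficient table comes from the text above, and `fe` and `pcse` are illustrative names.

```r
library(plm)
library(lmtest)  # provides coeftest()

data("Grunfeld", package = "plm")

# Within model on the Grunfeld investment data (an illustrative choice)
fe <- plm(inv ~ value + capital, data = Grunfeld,
          index = c("firm", "year"), model = "within")

# Conventional standard errors
print(coeftest(fe))

# Beck-Katz panel-corrected covariance matrix, clustering by time
pcse <- vcovBK(fe, cluster = "time")
print(coeftest(fe, vcov. = pcse))
```

Switching to cluster = "group" gives the other averaging; comparing the two tables is a quick check on how much the choice matters for a given data set.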

Almost all panel unit root testing goes through purtest. The test= argument selects among the IPS, Levin-Lin-Chu, Maddala-Wu, and Hadri tests and various tests proposed by Choi (2001). A few others are specified individually below.
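A minimal sketch of a purtest call, again on the Grunfeld data from plm. The choice of variable, of an intercept-only specification, and of pmax = 4 are illustrative assumptions, not recommendations.

```r
library(plm)

data("Grunfeld", package = "plm")
pGrunfeld <- pdata.frame(Grunfeld, index = c("firm", "year"))

# Maddala-Wu (Fisher-type) test on investment:
# intercept only, lag length chosen per series up to a maximum of 4
mw <- purtest(pGrunfeld$inv, pmax = 4, exo = "intercept", test = "madwu")
print(mw)

# summary() adds the unit-by-unit results behind the combined statistic
print(summary(mw))
```

Changing test= to one of the other values listed in ?purtest switches the procedure while keeping the rest of the call the same.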

The Breusch-Godfrey test of serial correlation for panel models is given by pbgtest(model).
The Baltagi and Li test of serial correlation in panel models with random effects is given by pbltest(model). The alternative hypothesis is set with the alternative argument.
The Baltagi-Wu LBI statistic for AR(1) disturbances is given by pbnftest(model, test="lbi"), while the modified Durbin-Watson statistic of Bhargava, Franzini, and Narendranathan (BNF, 1982) for fixed effects models is the default for this function.

# replicate Baltagi (2013), p. 101, table 5.1:
data("Grunfeld", package = "plm")
re <- plm(inv ~ value + capital, data = Grunfeld, model = "random")
pbnftest(re, test = "lbi")

pbsytest(model) gives the joint test of Baltagi and Li and the variants owing to Bera et al. (2001) and Sosa-Escudero and Bera (2008); the latter is a paper in the Stata Journal with companion software to be installed.
pcdtest(formula, data) gives the Pesaran test for cross-sectional dependence.
pdwtest(model) gives a panel Durbin-Watson statistic.
pFtest gives the F-test of fixed effects.
pggls gives GLS estimators for panel data; specify the effect and a model of within, pooling, or fd.
phansitest(purtest object) combines unit root tests in the method proposed by Hanck (2013).
phtest(model1, model2) is the Hausman test for panel data models. This one has robust options detailed in the last section of ?phtest.
piest(formula, data) performs Chamberlain’s tests on the within regression.
Lagrange multiplier tests of unit and/or time effects are given by plmtest().
Chow tests of poolability are given by pooltest() applied to a pooled or within regression.
pvar checks for variation along the individual and time dimensions.
pvcm will estimate variable coefficients models à la Swamy (1970).
Joint tests of coefficients are constructed using pwaldtest.
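Wooldridge’s test for serial correlation in within models is pwartest(model).
Wooldridge’s first-difference test for AR(1) errors in level or differenced panel models is given by pwfdtest(model). The underlying idea is clever: if the errors in levels are serially uncorrelated with variance \(\sigma^2\), then the first-differenced errors have variance \(2\sigma^2\) and first-order autocovariance \(-\sigma^2\), so they are correlated at -0.5. The h0 argument sets the null hypothesis to either "fe" (no serial correlation in the levels errors) or "fd" (no serial correlation in the first-differenced errors).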
pwtest(pooling model) gives a semi-parametric test for the presence of (individual or time) unobserved effects in panel models that owes to Wooldridge.
ranef and fixef extract the random and fixed effects, respectively.
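Several of the functions listed above can be strung together in a short specification workflow. The block below is a sketch on the Grunfeld data from plm. The model formula, the order of the tests, and the object names are assumptions made for illustration, and no particular result is implied.

```r
library(plm)

data("Grunfeld", package = "plm")
form <- inv ~ value + capital
idx <- c("firm", "year")

# The three workhorse models on the same formula
pool <- plm(form, data = Grunfeld, index = idx, model = "pooling")
fe   <- plm(form, data = Grunfeld, index = idx, model = "within")
re   <- plm(form, data = Grunfeld, index = idx, model = "random")

f_test  <- pFtest(fe, pool)                                   # F test of the fixed effects
lm_test <- plmtest(pool, effect = "individual", type = "bp")  # Breusch-Pagan LM test
hausman <- phtest(fe, re)                                     # Hausman test, FE against RE
bg_test <- pbgtest(fe)                                        # Breusch-Godfrey serial correlation
cd_test <- pcdtest(fe, test = "cd")                           # Pesaran CD cross-sectional dependence

print(f_test)
print(lm_test)
print(hausman)
print(bg_test)
print(cd_test)

# Estimated unit effects from the within and random effects fits
print(head(fixef(fe)))
print(head(ranef(re)))
```

Each test object is a standard htest, so the statistic and p-value can be pulled out with `$statistic` and `$p.value` for tabulation.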