| Title: | Panel Data Pre-Testing and Diagnostic Suite |
|---|---|
| Description: | Pre-testing and diagnostic tools for panel data analysis. Researchers should run these tests before any panel regression to verify modelling assumptions. The package implements: (1) the Hsiao (2014, <ISBN:978-1-107-65763-2>) homogeneity F-tests (F1/F2/F3), Swamy (1970) <doi:10.2307/1913012> parameter heterogeneity test, and Pesaran (2004) <doi:10.2139/ssrn.572504> cross-sectional dependence test via xtpretest(); (2) missing-data detection, mechanism testing, and imputation for unbalanced panels via xtmispanel(); (3) quantile-regression cross-sectional dependence tests (T_tau and T-tilde_tau statistics) of Demetrescu, Hosseinkouchack and Rodrigues (2023) <doi:10.1016/j.jeconom.2022.09.001> via xtcsdq(); and (4) the panel quantile-regression slope homogeneity S-hat and D-hat statistics of Galvao, Juhl, Montes-Rojas and Olmo (2017) <doi:10.1080/07350015.2015.1054493> via xtqsh(). Together these tests address three fundamental pre-testing questions: (i) are slopes homogeneous? (ii) is there cross-sectional dependence? and (iii) is the panel balanced and is missingness ignorable? |
| Authors: | Muhammad Abdullah Alkhalaf [aut, cre, cph] (ORCID: <https://orcid.org/0009-0002-2677-9246>) |
| Maintainer: | Muhammad Abdullah Alkhalaf <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0.5 |
| Built: | 2026-06-04 07:58:38 UTC |
| Source: | https://github.com/cran/paneltests |
Print method for xtcsdq objects
## S3 method for class 'xtcsdq'
print(x, ...)
| Argument | Description |
|---|---|
| x | An object of class "xtcsdq". |
| ... | Additional arguments (ignored). |
Invisibly returns x.
Prints a formatted summary of an "xtqsh" test result.
## S3 method for class 'xtqsh'
print(x, ...)
| Argument | Description |
|---|---|
| x | An object of class "xtqsh". |
| ... | Additional arguments (ignored). |
Invisibly returns x.
A simulated balanced panel dataset for demonstrating the quantile slope
homogeneity test (xtqsh).
data(qsh_sample)
A data frame with columns:
Cross-sectional unit identifier.
Time period identifier.
Dependent variable.
First explanatory variable.
Second explanatory variable.
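A minimal base R sketch of what "balanced" means for a dataset like this one. The column names (`id`, `time`, `y`, `x1`, `x2`) and the data-generating process are assumptions made for illustration; they are not taken from `qsh_sample` itself.

```r
# Hypothetical column names and coefficients, for illustration only.
set.seed(123)
n_units <- 6
n_periods <- 12
panel <- data.frame(
  id   = rep(seq_len(n_units), each = n_periods),
  time = rep(seq_len(n_periods), times = n_units),
  x1   = rnorm(n_units * n_periods),
  x2   = rnorm(n_units * n_periods)
)
panel$y <- 1 + 0.5 * panel$x1 - 0.3 * panel$x2 + rnorm(nrow(panel))

# A panel is balanced when every unit is observed exactly once in every period.
obs_per_unit <- as.vector(table(panel$id))
is_balanced <- length(unique(obs_per_unit)) == 1L &&
  !any(duplicated(panel[, c("id", "time")])) &&
  nrow(panel) == length(unique(panel$id)) * length(unique(panel$time))
is_balanced
```

The same three conditions can be checked on any long-format panel before running tests that assume a balanced design.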
Summary method for xtcsdq objects
## S3 method for class 'xtcsdq'
summary(object, ...)
| Argument | Description |
|---|---|
| object | An object of class "xtcsdq". |
| ... | Additional arguments (ignored). |
Invisibly returns object.
Prints a summary of an "xtqsh" test result.
## S3 method for class 'xtqsh'
summary(object, ...)
| Argument | Description |
|---|---|
| object | An object of class "xtqsh". |
| ... | Additional arguments (ignored). |
Invisibly returns object.
Tests the null hypothesis of no cross-sectional error dependence (CSD) in panel quantile regressions. Implements the T_tau and T-tilde_tau statistics of Demetrescu, Hosseinkouchack and Rodrigues (2023).
xtcsdq(
  formula = NULL,
  data = NULL,
  index = NULL,
  quantiles,
  mode = c("pooled", "individual", "residuals"),
  residuals = NULL,
  bandwidth = NULL,
  correction = TRUE
)
| Argument | Description |
|---|---|
| formula | A formula of the form y ~ x1. |
| data | A data frame containing the panel data in long format. Required unless |
| index | A character vector of length 2, e.g. c("id", "time"). |
| quantiles | A numeric vector of quantile levels, each strictly between 0 and 1. |
| mode | Estimation mode: one of "pooled", "individual", or "residuals". |
| residuals | A list (or named list) of numeric vectors or a matrix with one column per quantile, containing pre-computed QR residuals. Only used when |
| bandwidth | Numeric. KDE bandwidth for sparsity estimation. If |
| correction | Logical. If |
The T_tau statistic (Equation 3 in Demetrescu et al., 2023) tests for CSD by examining pairwise correlations of demeaned QR residuals across units. Under the null of no CSD, T_tau is asymptotically standard normal.
The bias-corrected version T-tilde_tau (Equation 5) subtracts two correction terms that account for the estimation uncertainty in the QR slope and the sparsity at the quantile. Reject H0 for large positive values.
The portmanteau statistic M_K aggregates across the K quantile levels by averaging the T_tau statistics; Mtilde_K is its bias-corrected counterpart.
The default KDE bandwidth follows the original paper.
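Because the statistics are asymptotically standard normal under the null and H0 is rejected for large positive values, a p-value can be read off the upper tail of the normal distribution. The sketch below assumes a one-sided rejection rule and uses made-up statistic values; it is not the package's internal code.

```r
# Hypothetical test statistics at three quantile levels (illustration only).
t_stat <- c("0.25" = 0.40, "0.50" = 1.80, "0.75" = 2.60)

# Upper-tail p-values under a standard normal null
# (assumption: one-sided rejection for large positive values).
p_val <- pnorm(t_stat, lower.tail = FALSE)

# Decision at the 5% level.
reject <- p_val < 0.05
data.frame(quantile = names(t_stat), statistic = t_stat,
           p_value = round(p_val, 4), reject = reject, row.names = NULL)
```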
An object of class "xtcsdq" with components:
Numeric vector of T_tau statistics (one per quantile).
Numeric vector of bias-corrected T-tilde_tau statistics.
p-values for T_tau.
p-values for T-tilde_tau.
KDE density estimates at zero (one per quantile).
Portmanteau statistic (average of T_tau over quantiles).
Bias-corrected portmanteau statistic.
p-value for M_K.
p-value for Mtilde_K.
Quantile levels used.
Number of cross-sectional units.
Number of time periods.
KDE bandwidth used.
Demetrescu, M., Hosseinkouchack, M. and Rodrigues, P.M.M. (2023). Testing for No Cross-Sectional Error Dependence in Panel Quantile Regressions. Ruhr Economic Papers, No. 1041. doi:10.4419/96973002
set.seed(42)
n <- 8; tt <- 20
dat <- data.frame(
  id = rep(1:n, each = tt),
  time = rep(1:tt, times = n),
  y = rnorm(n * tt),
  x1 = rnorm(n * tt)
)
res <- xtcsdq(y ~ x1, data = dat, index = c("id", "time"),
              quantiles = c(0.25, 0.5, 0.75))
print(res)
summary(res)
Detects, diagnoses, and imputes missing values in panel (longitudinal) data sets. The function can produce summary tables (Module 1), test the missingness mechanism (Module 2), impute a target variable (Module 3), and run a cross-method sensitivity analysis (Module 4).
xtmispanel(
  data,
  vars = NULL,
  index,
  detect = TRUE,
  test = FALSE,
  impute = NULL,
  target = NULL,
  new_var = NULL,
  sensitivity = FALSE,
  knn_k = 5L
)
| Argument | Description |
|---|---|
| data | A |
| vars | Character vector of variable names to analyse. If |
| index | Character vector of length 2, e.g. c("id", "time"). |
| detect | Logical. Run Module 1 (detection tables, default TRUE). |
| test | Logical. Run Module 2 (MCAR/MAR mechanism tests, default FALSE). |
| impute | Character or |
| target | Character. Name of the variable to impute (required when |
| new_var | Character. Name of the output imputed variable (default |
| sensitivity | Logical. Run Module 4 (sensitivity analysis across all imputation methods, default FALSE). |
| knn_k | Integer. Number of neighbours for KNN imputation (default 5). |
A list (invisibly) with components:
| Component | Description |
|---|---|
| detect | Summary statistics per variable/panel/period. |
| test | MCAR and MAR test results. |
| imputed | The data frame augmented with the imputed column (when imputation is requested). |
| impute_stats | Summary comparing original vs imputed. |
| sensitivity | Sensitivity analysis results. |
Little, R. J. A. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83(404), 1198-1202. doi:10.1080/01621459.1988.10478714
set.seed(1)
df <- data.frame(
  id = rep(1:4, each = 8),
  time = rep(1:8, times = 4),
  y = c(rnorm(32))
)
# introduce some NAs
df$y[c(3, 11, 20)] <- NA
res <- xtmispanel(df, vars = "y", index = c("id", "time"), detect = TRUE)
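For intuition about what per-variable, per-panel, and per-period missingness summaries look like, the base R sketch below tabulates the missing values in the same toy data by hand. It is an independent illustration and makes no claim about how xtmispanel() computes or lays out its detection tables.

```r
# Rebuild the toy panel from the example above.
set.seed(1)
df <- data.frame(
  id   = rep(1:4, each = 8),
  time = rep(1:8, times = 4),
  y    = rnorm(32)
)
df$y[c(3, 11, 20)] <- NA

# Indicator of missingness for the variable of interest.
miss <- is.na(df$y)

# Overall share of missing observations.
overall_share <- mean(miss)

# Number of missing values per cross-sectional unit and per time period.
by_unit   <- tapply(miss, df$id, sum)
by_period <- tapply(miss, df$time, sum)

overall_share
by_unit
by_period
```

Units or periods with concentrated missingness in tables like these are a natural prompt to run the mechanism tests before choosing an imputation method.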
Performs a full battery of panel data pre-tests: Hsiao (2014) homogeneity F-tests, robust (HC1) versions, Swamy (1970) parameter heterogeneity test, cross-sectional dependence (Pesaran 2004), and panel summary statistics.
xtpretest(data, formula, index, tests = "ALL", level = 0.05)
| Argument | Description |
|---|---|
| data | A |
| formula | A two-sided formula of the form y ~ x1. |
| index | Character vector of length 2, e.g. c("id", "time"). |
| tests | Character vector. Which modules to run. Possible values: |
| level | Numeric. Significance level for decisions (default 0.05). |
A list (invisibly) with components:
| Component | Description |
|---|---|
| summary | Panel summary statistics. |
| hsiao | Hsiao homogeneity F-test results. |
| robust | Robust HC1 F-test results. |
| swamy | Swamy heterogeneity test results. |
| csd | Cross-sectional dependence test results. |
| recommendation | Character. Suggested estimator. |
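As background for the cross-sectional dependence component, the sketch below computes the Pesaran (2004) CD statistic by hand for a balanced panel: CD = sqrt(2T / (N(N - 1))) times the sum of pairwise residual correlations over all unit pairs i < j. Using unit-by-unit OLS residuals is an assumption made for this illustration; the residuals xtpretest() actually uses may differ.

```r
set.seed(10)
n <- 5      # cross-sectional units
n_t <- 10   # time periods
df <- data.frame(
  id   = rep(1:n, each = n_t),
  time = rep(1:n_t, times = n),
  y    = rnorm(n * n_t),
  x1   = rnorm(n * n_t)
)

# Unit-by-unit OLS residuals, arranged as a T x N matrix (rows = periods).
res_mat <- sapply(split(df, df$id), function(d) {
  d <- d[order(d$time), ]
  unname(resid(lm(y ~ x1, data = d)))
})

# Pairwise correlations of residuals across units.
rho <- cor(res_mat)
rho_sum <- sum(rho[upper.tri(rho)])

# CD statistic, asymptotically N(0, 1) under no cross-sectional dependence.
cd <- sqrt(2 * n_t / (n * (n - 1))) * rho_sum
p_value <- 2 * pnorm(abs(cd), lower.tail = FALSE)
c(CD = cd, p_value = p_value)
```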
Hsiao, C. (2014). Analysis of Panel Data (3rd ed.). Cambridge University Press. doi:10.1017/CBO9781139839327
Swamy, P. A. V. B. (1970). Efficient inference in a random coefficient regression model. Econometrica, 38(2), 311-323. doi:10.2307/1913012
Pesaran, M. H. (2004). General diagnostic tests for cross section dependence in panels. Cambridge Working Paper in Economics, No. 0435. doi:10.2139/ssrn.572504
set.seed(10)
n <- 5; t <- 10
df <- data.frame(
  id = rep(1:n, each = t),
  time = rep(1:t, times = n),
  y = rnorm(n * t),
  x1 = rnorm(n * t)
)
res <- xtpretest(df, y ~ x1, index = c("id", "time"),
                 tests = c("hsiao", "csd"))
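The idea behind a homogeneity F-test can be sketched in base R by comparing a pooled regression against one with unit-specific intercepts and slopes. This is an overall homogeneity test in the spirit of the Hsiao tests, written as an assumption-laden illustration; it is not claimed to reproduce the F1/F2/F3 statistics reported by xtpretest().

```r
set.seed(10)
n <- 5      # cross-sectional units
n_t <- 10   # time periods
df <- data.frame(
  id   = factor(rep(1:n, each = n_t)),
  time = rep(1:n_t, times = n),
  y    = rnorm(n * n_t),
  x1   = rnorm(n * n_t)
)

# Restricted model: common intercept and common slope.
pooled <- lm(y ~ x1, data = df)

# Unrestricted model: a separate intercept and slope for every unit.
unitwise <- lm(y ~ id * x1, data = df)

# Classical F-test of the 2 * (n - 1) homogeneity restrictions.
f_test <- anova(pooled, unitwise)
f_stat <- f_test$F[2]
p_value <- f_test[["Pr(>F)"]][2]
c(F = f_stat, df1 = f_test$Df[2], df2 = f_test$Res.Df[2], p_value = p_value)
```

A small p-value would suggest that a pooled specification is too restrictive for the data at hand.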
Tests the null hypothesis of slope homogeneity in panel quantile regressions. Implements the S-hat and D-hat statistics of Galvao et al. (2017).
xtqsh(formula, data, index, tau, bw = "hallsheather", marginal = FALSE)
| Argument | Description |
|---|---|
| formula | A formula of the form y ~ x1. |
| data | A data frame containing the panel data in long format. |
| index | Character vector of length 2, e.g. c("id", "time"). |
| tau | Numeric vector of quantile levels, each strictly between 0 and 1. |
| bw | Bandwidth method: |
| marginal | Logical. If |
An object of class "xtqsh" containing test statistics and p-values.
Galvao, A.F., Juhl, T., Montes-Rojas, G. and Olmo, J. (2017). Testing Slope Homogeneity in Quantile Regression Panel Data. Journal of Financial Econometrics, 16(2), 211-243.
set.seed(42)
n <- 10; tt <- 20
dat <- data.frame(
  id = rep(1:n, each = tt),
  time = rep(1:tt, times = n),
  y = rnorm(n * tt),
  x1 = rnorm(n * tt)
)
res <- xtqsh(y ~ x1, data = dat, index = c("id", "time"), tau = 0.5)
print(res)