| Title: | High Dimensional Time Series Analysis Tools |
|---|---|
| Description: | An implementation for high-dimensional time series analysis methods, including factor model for vector time series proposed by Lam and Yao (2012) <doi:10.1214/12-AOS970> and Chang, Guo and Yao (2015) <doi:10.1016/j.jeconom.2015.03.024>, martingale difference test proposed by Chang, Jiang and Shao (2023) <doi:10.1016/j.jeconom.2022.09.001>, principal component analysis for vector time series proposed by Chang, Guo and Yao (2018) <doi:10.1214/17-AOS1613>, cointegration analysis proposed by Zhang, Robinson and Yao (2019) <doi:10.1080/01621459.2018.1458620>, unit root test proposed by Chang, Cheng and Yao (2022) <doi:10.1093/biomet/asab034>, white noise tests proposed by Chang, Yao and Zhou (2017) <doi:10.1093/biomet/asw066> and Chang et al. (2026+), CP-decomposition for matrix time series proposed by Chang et al. (2023) <doi:10.1093/jrsssb/qkac011> and Chang et al. (2026+) <doi:10.48550/arXiv.2410.05634>, and statistical inference for spectral density matrix proposed by Chang et al. (2025) <doi:10.1080/01621459.2025.2468013>. |
| Authors: | Jinyuan Chang [aut], Jing He [aut], Chen Lin [aut, cre], Qiwei Yao [aut] |
| Maintainer: | Chen Lin <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0.7-1 |
| Built: | 2026-05-31 07:13:01 UTC |
| Source: | https://github.com/linc2021/hdtsa |
The dataset is a tensor time series constructed from
the Beijing multi-site air-quality data. The raw data contain six hourly
air-pollution measurements, namely ,
, , ,
, and , collected from 12 nationally
controlled air-quality monitoring stations in Beijing from March 1, 2013
to February 28, 2017.
This dataset is not the raw data. It is a processed version constructed for studying the spatio-temporal dependence structure of air pollution in Beijing. Specifically, missing values and extreme values in the hourly observations were handled by interpolation. Then, for each monitoring station and each pollutant type, the hourly series were differenced to focus on concentration changes rather than levels. Finally, all series were standardized to remove the effects of different measurement scales across pollutants and stations.
After these preprocessing steps, the data form a tensor time series
for , where records the
standardized concentration change of pollutant at station
during the -th hour of day .
data(BeijingAir)data(BeijingAir)
A 4-way array with dimension :
The day index, with 1461 daily observations from March 1, 2013 to February 28, 2017.
The 12 air-quality monitoring stations in Beijing.
The 6 pollutants: , , , , , and .
The 24 hours within each day.
The raw data can be downloaded from https://archive.ics.uci.edu/dataset/501/beijing+multi+site+air+quality+data
Coint() deals with cointegration analysis for high-dimensional
vector time series proposed in Zhang, Robinson and Yao (2019). Consider the model:
where is a unknown and invertible constant matrix,
is a latent
process, is an process,
is a process with nonstationary components, and no linear
combination of is . This function aims to estimate the
cointegration rank and the invertible constant matrix .
Coint( Y, lag.k = 5, type = c("acf", "urtest", "both"), c0 = 0.3, m = 20, alpha = 0.01 )Coint( Y, lag.k = 5, type = c("acf", "urtest", "both"), c0 = 0.3, m = 20, alpha = 0.01 )
Y |
An |
lag.k |
The time lag
where |
type |
The method used to identify the cointegration rank. Available
options include: |
c0 |
The prescribed constant |
m |
The prescribed constant |
alpha |
The significance level |
Write .
When type = "acf", Coint() estimates by
for some
constant and some large constant , where
is the sum of the sample autocorrelations of
over lags 1 to ,
which is specified in Section 2.3 of Zhang, Robinson and Yao (2019).
When type = "urtest", Coint() estimates by unit root
tests. For , consider the null hypothesis
The estimation procedure for
can be implemented as follows:
Step 1. Start with . Perform the unit root test proposed
in Chang, Cheng and Yao (2021) for .
Step 2. If the null hypothesis is not rejected at the significance
level , increment by 1 and repeat Step 1. Otherwise, stop
the procedure and denote the value of at termination as .
The cointegration rank is then estimated as .
An object of class "coint", which contains the following
components:
A |
The estimated |
coint_rank |
The estimated cointegration rank |
lag.k |
The time lag used in function. |
method |
A string indicating which method is used to identify the cointegration rank. |
Chang, J., Cheng, G., & Yao, Q. (2022). Testing for unit roots based on sample autocovariances. Biometrika, 109, 543–550. doi:10.1093/biomet/asab034.
Zhang, R., Robinson, P., & Yao, Q. (2019). Identifying cointegration by eigenanalysis. Journal of the American Statistical Association, 114, 916–927. doi:10.1080/01621459.2018.1458620.
# Example 1 (Example 1 in Zhang, Robinson and Yao (2019)) ## Generate yt p <- 10 n <- 1000 r <- 3 d <- 1 X <- mat.or.vec(p, n) X[1,] <- arima.sim(n-d, model = list(order=c(0, d, 0))) for(i in 2:3)X[i,] <- rnorm(n) for(i in 4:(r+1)) X[i, ] <- arima.sim(model = list(ar = 0.5), n) for(i in (r+2):p) X[i, ] <- arima.sim(n = (n-d), model = list(order=c(1, d, 1), ar=0.6, ma=0.8)) M1 <- matrix(c(1, 1, 0, 1/2, 0, 1, 0, 1, 0), ncol = 3, byrow = TRUE) A <- matrix(runif(p*p, -3, 3), ncol = p) A[1:3,1:3] <- M1 Y <- t(A%*%X) Coint(Y, type = "both")# Example 1 (Example 1 in Zhang, Robinson and Yao (2019)) ## Generate yt p <- 10 n <- 1000 r <- 3 d <- 1 X <- mat.or.vec(p, n) X[1,] <- arima.sim(n-d, model = list(order=c(0, d, 0))) for(i in 2:3)X[i,] <- rnorm(n) for(i in 4:(r+1)) X[i, ] <- arima.sim(model = list(ar = 0.5), n) for(i in (r+2):p) X[i, ] <- arima.sim(n = (n-d), model = list(order=c(1, d, 1), ar=0.6, ma=0.8)) M1 <- matrix(c(1, 1, 0, 1/2, 0, 1, 0, 1, 0), ncol = 3, byrow = TRUE) A <- matrix(runif(p*p, -3, 3), ncol = p) A[1:3,1:3] <- M1 Y <- t(A%*%X) Coint(Y, type = "both")
This function performs inference for the debiased
iterative estimator of a factor loading vector in the tensor CP-factor model based on the DPI estimator (Chang et al., 2026+).
Given a direction vector h, a factor index i, a mode index
j, the observed tensor time series Y, and the output object
returned by CP_TTS, the function returns the debiased estimate,
its estimated standard error, the original iterative loading estimator, and
the estimated bias-correction term.
CP_Inference(h, i, j, Y, res.CP.DPI)CP_Inference(h, i, j, Y, res.CP.DPI)
h |
A numeric vector of length |
i |
A positive integer. The factor index. |
j |
A positive integer. The tensor mode index. |
Y |
An array containing the observed tensor time series with dimension
|
res.CP.DPI |
An output object returned by |
Let be the iterative estimator of the loading
vector for factor and mode . The function computes the debiased
estimator
and returns the linear transformation
The reported standard error is
where is the sample size.
A list with the following components:
aij.h.deThe debiased linear transformation
.
se.h.ijThe estimated standard error of aij.h.de.
aij.iterThe original iterative estimator
.
vartheta.ijThe estimated bias-correction term
.
Chang, J., Huang, G., Yao, Q., & Yu, L. (2026+). CP-Factorization for High Dimensional Tensor Time Series and Double Projection Iterations. Journal of the Royal Statistical Society Series B: Statistical Methodology, major revision.
## Not run: fit <- CP_TTS(Y) out <- CP_Inference( h = h, i = 1, j = 2, Y = Y, res.CP.DPI = fit ) out$aij.h.de out$se.h.ij out$aij.iter out$vartheta.ij ## End(Not run)## Not run: fit <- CP_TTS(Y) out <- CP_Inference( h = h, i = 1, j = 2, Y = Y, res.CP.DPI = fit ) out$aij.h.de out$se.h.ij out$aij.iter out$vartheta.ij ## End(Not run)
CP_MTS() deals with the estimation of the CP-factor model for matrix time series:
where is a
unobservable diagonal matrix,
is a matrix white noise, and are, respectively, and unknown constant matrices with their columns being
unit vectors, and is an unknown integer.
Let
and with some unknown .
This function aims to estimate and the loading
matrices and using the methods proposed in Chang
et al. (2023) and Chang et al. (2024).
CP_MTS( Y, xi = NULL, Rank = NULL, lag.k = 20, lag.ktilde = 10, method = c("CP.Direct", "CP.Refined", "CP.Unified"), thresh1 = FALSE, thresh2 = FALSE, thresh3 = FALSE, delta1 = 2 * sqrt(log(dim(Y)[2] * dim(Y)[3])/dim(Y)[1]), delta2 = delta1, delta3 = delta1 )CP_MTS( Y, xi = NULL, Rank = NULL, lag.k = 20, lag.ktilde = 10, method = c("CP.Direct", "CP.Refined", "CP.Unified"), thresh1 = FALSE, thresh2 = FALSE, thresh3 = FALSE, delta1 = 2 * sqrt(log(dim(Y)[2] * dim(Y)[3])/dim(Y)[1]), delta2 = delta1, delta3 = delta1 )
Y |
An |
xi |
An |
Rank |
A list containing the following components: |
lag.k |
The time lag
where |
lag.ktilde |
The time lag |
method |
A string indicating which CP-decomposition method is used. Available options include:
|
thresh1 |
Logical. If |
thresh2 |
Logical. If |
thresh3 |
Logical. If |
delta1 |
The value of the threshold level |
delta2 |
The value of the threshold level |
delta3 |
The value of the threshold level |
All three CP-decomposition methods involve the estimation of the autocovariance of
and at lag , which is defined as follows:
where ,
and is a threshold operator defined as
for any matrix
, with the threshold level and
representing the indicator function. Chang et al. (2023) and Chang et al. (2024) suggest to choose
when are fixed and when .
The refined estimation method involves
where is a threshold operator with the threshold level
, and is a vecterization operator
with being the vector obtained by stacking
the columns of the matrix . See Section 3.2.2 of Chang
et al. (2023) for details.
The unified estimation method involves
where is a threshold operator with the threshold level
. See Section 4.2 of Chang et al. (2024) for details.
An object of class "mtscp", which contains the following
components:
A |
The estimated |
B |
The estimated |
f |
The estimated latent process |
Rank |
The estimated |
method |
A string indicating which CP-decomposition method is used. |
Chang, J., He, J., Yang, L., & Yao, Q. (2023). Modelling matrix time series via a tensor CP-decomposition. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85, 127–148. doi:10.1093/jrsssb/qkac011.
Chang, J., Du, Y., Huang, G., & Yao, Q. (2026+). Identification and estimation for matrix time series CP-factor models. The Annals of Statistics, in press. doi:10.48550/arXiv.2410.05634.
# Example 1. p <- 10 q <- 10 n <- 400 d = d1 = d2 <- 3 ## DGP.CP() generates simulated data for the example in Chang et al. (2024). data <- DGP.CP(n, p, q, d, d1, d2) Y <- data$Y ## d is unknown res1 <- CP_MTS(Y, method = "CP.Direct") res2 <- CP_MTS(Y, method = "CP.Refined") res3 <- CP_MTS(Y, method = "CP.Unified") ## d is known res4 <- CP_MTS(Y, Rank = list(d = 3), method = "CP.Direct") res5 <- CP_MTS(Y, Rank = list(d = 3), method = "CP.Refined") # Example 2. p <- 10 q <- 10 n <- 400 d1 = d2 <- 2 d <-3 data <- DGP.CP(n, p, q, d, d1, d2) Y1 <- data$Y ## d, d1 and d2 are unknown res6 <- CP_MTS(Y1, method = "CP.Unified") ## d, d1 and d2 are known res7 <- CP_MTS(Y1, Rank = list(d = 3, d1 = 2, d2 = 2), method = "CP.Unified")# Example 1. p <- 10 q <- 10 n <- 400 d = d1 = d2 <- 3 ## DGP.CP() generates simulated data for the example in Chang et al. (2024). data <- DGP.CP(n, p, q, d, d1, d2) Y <- data$Y ## d is unknown res1 <- CP_MTS(Y, method = "CP.Direct") res2 <- CP_MTS(Y, method = "CP.Refined") res3 <- CP_MTS(Y, method = "CP.Unified") ## d is known res4 <- CP_MTS(Y, Rank = list(d = 3), method = "CP.Direct") res5 <- CP_MTS(Y, Rank = list(d = 3), method = "CP.Refined") # Example 2. p <- 10 q <- 10 n <- 400 d1 = d2 <- 2 d <-3 data <- DGP.CP(n, p, q, d, d1, d2) Y1 <- data$Y ## d, d1 and d2 are unknown res6 <- CP_MTS(Y1, method = "CP.Unified") ## d, d1 and d2 are known res7 <- CP_MTS(Y1, Rank = list(d = 3, d1 = 2, d2 = 2), method = "CP.Unified")
CP_TTS() deals with the estimation of the CP-factor model for tensor time series. Let be a tensor in
. The tensor CP-factor model is given by
where is a fixed but unknown constant,
is the idiosyncratic error tensor,
is the -dimensional factor vector,
and is a -dimensional loading vector
corresponding to the -th factor and the -th mode. Without loss of generality, we assume
for and .
This function aims to estimate and the loading
vectors using the method proposed in Chang et al. (2026+).
CP_TTS(Y, xi = NULL, r = NULL, A.init = NULL, control.DPI = list())CP_TTS(Y, xi = NULL, r = NULL, A.init = NULL, control.DPI = list())
Y |
An array representing a tensor-valued time series with dimension
|
xi |
An auxiliary scalar series |
r |
The prescribed number of factors. If set to |
A.init |
Optional initial loading matrices. It should be a list of length
|
control.DPI |
A named list of control parameters used in the double projection iteration (DPI) algorithm. The supported components are:
|
The initial method
involve the estimation of the autocovariance between and
at lag , which is defined as follows:
where ,
, and
is a threshold operator defined as
for any matrix , with threshold level
and denoting the indicator function.
Chang et al. (2026+) suggests choosing
by a grid search method. See Section 3.3 of Chang et al. (2026+) for details.
The function returns a list containing the following components:
A.hatThe final iterative loading matrices.
A.initThe initial loading matrices used to start the iteration.
Sigma.yij.xii.1The thresholded moment vectors used in the iterative updates and inference.
rThe number of factors used in the iterative procedure.
iter.stepThe number of iterations performed.
fnorm.residThe relative Frobenius norm of the residuals recorded during the iterations.
f.hatThe estimated factor series based on the final iterative loading matrices.
f.hat.inlThe estimated factor series based on the initial loading matrices.
delta.selThe selected threshold level from the initial one-pass estimation. If A.init is supplied by the user, this value is NULL.
control.DPIThe control list actually used in the function after merging user-supplied values with the defaults.
Chang, J., He, J., Yang, L., & Yao, Q. (2023). Modelling matrix time series via a tensor CP-decomposition. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85, 127–148. doi:10.1093/jrsssb/qkac011.
Chang, J., Huang, G., Yao, Q., & Yu, L. (2026+). CP-Factorization for High Dimensional Tensor Time Series and Double Projection Iterations. Journal of the Royal Statistical Society Series B: Statistical Methodology, major revision.
n <- 200 D <- c(10, 10) r <- 2 data <- HDTSA:::DGP.TCP( n = n, m = 2, D = D, r = r, w = c(10, 10), ar.coef = list(0.5, 0.3), factor.loading = "sparse-random", alpha = 0.3 ) Y <- data$Y fit <- CP_TTS(Y) fit$r fit$A.hat fit$f.hat fit.known <- CP_TTS(Y, r = 2)n <- 200 D <- c(10, 10) r <- 2 data <- HDTSA:::DGP.TCP( n = n, m = 2, D = D, r = r, w = c(10, 10), ar.coef = list(0.5, 0.3), factor.loading = "sparse-random", alpha = 0.3 ) Y <- data$Y fit <- CP_TTS(Y) fit$r fit$A.hat fit$f.hat fit.known <- CP_TTS(Y, r = 2)
DGP.CP() function generates simulated data following the
data generating process described in Section 7.1 of Chang et al. (2024).
DGP.CP(n, p, q, d, d1, d2)DGP.CP(n, p, q, d, d1, d2)
n |
Integer. The number of observations of the |
p |
Integer. The number of rows of |
q |
Integer. The number of columns of |
d |
Integer. The number of columns of the factor loading matrices |
d1 |
Integer. The rank of the |
d2 |
Integer. The rank of the |
We generate
for any , where
with being a time series,
is a matrix white noise,
and and are, respectively, and
factor loading matrices. , , and
are generated based on the data generating process described in Section 7.1 of
Chang et al. (2024) and satisfy and
, .
A list containing the following components:
Y |
An |
A |
The |
B |
The |
X |
An |
Chang, J., Du, Y., Huang, G., & Yao, Q. (2026+). Identification and estimation for matrix time series CP-factor models. The Annals of Statistics, in press. doi:10.48550/arXiv.2410.05634.
p <- 10 q <- 10 n <- 400 d = d1 = d2 <- 3 data <- DGP.CP(n,p,q,d1,d2,d) Y <- data$Y ## The first observation: Y_1 Y[1, , ]p <- 10 q <- 10 n <- 400 d = d1 = d2 <- 3 data <- DGP.CP(n,p,q,d1,d2,d) Y <- data$Y ## The first observation: Y_1 Y[1, , ]
Factors() deals with factor modeling for high-dimensional
time series proposed in Lam and Yao (2012):
where is an
latent process with (unknown) , is a unknown constant matrix, and is a
vector white noise process. The number of factors and the factor
loadings can be estimated in terms of an eigenanalysis for a
nonnegative definite matrix, and is therefore applicable when the dimension
of is on the order of a few thousands. This function aims to
estimate the number of factors and the factor loading matrix
.
Factors( Y, lag.k = 5, thresh = FALSE, delta = 2 * sqrt(log(ncol(Y))/nrow(Y)), twostep = FALSE )Factors( Y, lag.k = 5, thresh = FALSE, delta = 2 * sqrt(log(ncol(Y))/nrow(Y)), twostep = FALSE )
Y |
An |
lag.k |
The time lag
where |
thresh |
Logical. If |
delta |
The value of the threshold level |
twostep |
Logical. If |
The threshold operator is defined as
for any matrix
, with the threshold level and
representing the indicator function. We recommend to choose
when is fixed and when .
An object of class "factors", which contains the following
components:
factor_num |
The estimated number of factors
|
loading.mat |
The estimated |
X |
The |
lag.k |
The time lag used in function. |
Lam, C., & Yao, Q. (2012). Factor modelling for high-dimensional time series: Inference for the number of factors. The Annals of Statistics, 40, 694–726. doi:10.1214/12-AOS970.
# Example 1 (Example in Section 3.3 of lam and Yao 2012) ## Generate y_t p <- 200 n <- 400 r <- 3 X <- mat.or.vec(n, r) A <- matrix(runif(p*r, -1, 1), ncol=r) x1 <- arima.sim(model=list(ar=c(0.6)), n=n) x2 <- arima.sim(model=list(ar=c(-0.5)), n=n) x3 <- arima.sim(model=list(ar=c(0.3)), n=n) eps <- matrix(rnorm(n*p), p, n) X <- t(cbind(x1, x2, x3)) Y <- A %*% X + eps Y <- t(Y) fac <- Factors(Y,lag.k=2) r_hat <- fac$factor_num loading_Mat <- fac$loading.mat# Example 1 (Example in Section 3.3 of lam and Yao 2012) ## Generate y_t p <- 200 n <- 400 r <- 3 X <- mat.or.vec(n, r) A <- matrix(runif(p*r, -1, 1), ncol=r) x1 <- arima.sim(model=list(ar=c(0.6)), n=n) x2 <- arima.sim(model=list(ar=c(-0.5)), n=n) x3 <- arima.sim(model=list(ar=c(0.3)), n=n) eps <- matrix(rnorm(n*p), p, n) X <- t(cbind(x1, x2, x3)) Y <- A %*% X + eps Y <- t(Y) fac <- Factors(Y,lag.k=2) r_hat <- fac$factor_num loading_Mat <- fac$loading.mat
The portfolios are constructed by the intersections of 10 levels
of size, denoted by , and 10 levels of the book
equity to market equity ratio (BE), denoted by .
The dataset consists of monthly returns from January 1964 to
December 2021, which contains 69600 observations for 696 total months.
data(FamaFrench)data(FamaFrench)
A data frame with 696 rows and 102 columns. The first column
represents the month, and the second column named
MKT.RF represents the monthly market returns. The rest of the columns
represent the return series for different sizes and BE-ratios.
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
HDSReg() considers a multivariate time series model which
represents a high-dimensional vector process as a sum of three terms: a
linear regression of some observed regressors, a linear combination of some
latent and serially correlated factors, and a vector white noise:
where and are, respectively, observable and
time series, is an latent
factor process, is a vector white noise process,
is an unknown regression coefficient matrix, and
is an unknown factor loading matrix. This procedure proposed in
Chang, Guo and Yao (2015) aims to estimate the regression coefficient
matrix , the number of factors and the factor loading
matrix .
HDSReg( Y, Z, D = NULL, lag.k = 5, thresh = FALSE, delta = 2 * sqrt(log(ncol(Y))/nrow(Y)), twostep = FALSE )HDSReg( Y, Z, D = NULL, lag.k = 5, thresh = FALSE, delta = 2 * sqrt(log(ncol(Y))/nrow(Y)), twostep = FALSE )
Y |
An |
Z |
An |
D |
A |
lag.k |
The time lag
where |
thresh |
Logical. If |
delta |
The value of the threshold level |
twostep |
Logical. The same as the argument |
The threshold operator is defined as
for any matrix
, with the threshold level and
representing the indicator function. We recommend to choose
when is fixed and when .
An object of class "factors", which contains the following
components:
factor_num |
The estimated number of factors |
reg.coff.mat |
The estimated |
loading.mat |
The estimated |
X |
The |
lag.k |
The time lag used in function. |
Chang, J., Guo, B., & Yao, Q. (2015). High dimensional stochastic regression with latent factors, endogeneity and nonlinearity. Journal of Econometrics, 189, 297–312. doi:10.1016/j.jeconom.2015.03.024.
# Example 1 (Example 1 in Chang, Guo and Yao (2015)). ## Generate xt n <- 400 p <- 200 m <- 2 r <- 3 X <- mat.or.vec(n,r) x1 <- arima.sim(model = list(ar = c(0.6)), n = n) x2 <- arima.sim(model = list(ar = c(-0.5)), n = n) x3 <- arima.sim(model = list(ar = c(0.3)), n = n) X <- cbind(x1, x2, x3) X <- t(X) ## Generate yt Z <- mat.or.vec(m,n) S1 <- matrix(c(5/8, 1/8, 1/8, 5/8), 2, 2) Z[,1] <- c(rnorm(m)) for(i in c(2:n)){ Z[,i] <- S1%*%Z[, i-1] + c(rnorm(m)) } D <- matrix(runif(p*m, -2, 2), ncol = m) A <- matrix(runif(p*r, -2, 2), ncol = r) eps <- mat.or.vec(n, p) eps <- matrix(rnorm(n*p), p, n) Y <- D %*% Z + A %*% X + eps Y <- t(Y) Z <- t(Z) ## D is known res1 <- HDSReg(Y, Z, D, lag.k = 2) ## D is unknown res2 <- HDSReg(Y, Z, lag.k = 2)# Example 1 (Example 1 in Chang, Guo and Yao (2015)). ## Generate xt n <- 400 p <- 200 m <- 2 r <- 3 X <- mat.or.vec(n,r) x1 <- arima.sim(model = list(ar = c(0.6)), n = n) x2 <- arima.sim(model = list(ar = c(-0.5)), n = n) x3 <- arima.sim(model = list(ar = c(0.3)), n = n) X <- cbind(x1, x2, x3) X <- t(X) ## Generate yt Z <- mat.or.vec(m,n) S1 <- matrix(c(5/8, 1/8, 1/8, 5/8), 2, 2) Z[,1] <- c(rnorm(m)) for(i in c(2:n)){ Z[,i] <- S1%*%Z[, i-1] + c(rnorm(m)) } D <- matrix(runif(p*m, -2, 2), ncol = m) A <- matrix(runif(p*r, -2, 2), ncol = r) eps <- mat.or.vec(n, p) eps <- matrix(rnorm(n*p), p, n) Y <- D %*% Z + A %*% X + eps Y <- t(Y) Z <- t(Z) ## D is known res1 <- HDSReg(Y, Z, D, lag.k = 2) ## D is unknown res2 <- HDSReg(Y, Z, lag.k = 2)
The dataset consists of 7 monthly U.S. Industrial Production indices, namely the total index, nonindustrial supplies, final products, manufacturing, materials, mining, and utilities, from January 1947 to December 2023 published by the U.S. Federal Reserve.
data(IPindices)data(IPindices)
A data frame with 924 rows and 8 variables:
The observation date
The total index
Nonindustrial supplies
Final products
Manufacturing
Materials
Mining
Utilities
https://fred.stlouisfed.org/release/tables?rid=13&eid=49670
MartG_test() implements a new test proposed in
Chang, Jiang and Shao (2023) for the following hypothesis testing problem:
where MDS is the abbreviation of "martingale difference sequence".
MartG_test( Y, lag.k = 2, B = 1000, type = c("Linear", "Quad"), alpha = 0.05, kernel.type = c("QS", "Par", "Bart") )MartG_test( Y, lag.k = 2, B = 1000, type = c("Linear", "Quad"), alpha = 0.05, kernel.type = c("QS", "Par", "Bart") )
Y |
An |
lag.k |
The time lag |
B |
The number of bootstrap replications for generating multivariate normally distributed random vectors when calculating the critical value. The default is 1000. |
type |
The map used for constructing the test statistic.
Available options include: |
alpha |
The significance level of the test. The default is 0.05. |
kernel.type |
The option for choosing the symmetric kernel
used in the estimation of long-run covariance matrix.
Available options include: |
Write .
When type = "Linear", the linear identity map is defined
as .
When type = "Quad",
includes both linear and quadratic terms, where
.
We can also choose to capture
certain type of nonlinear dependence, where
.
See 'Examples'.
An object of class "hdtstest", which contains the following
components:
statistic |
The test statistic of the test. |
p.value |
The p-value of the test. |
lag.k |
The time lag used in function. |
type |
The map used in function. |
kernel.type |
The kernel used in function. |
Chang, J., Jiang, Q., & Shao, X. (2023). Testing the martingale difference hypothesis in high dimension. Journal of Econometrics, 235, 972–1000. doi:10.1016/j.jeconom.2022.09.001.
# Example 1 n <- 200 p <- 10 X <- matrix(rnorm(n*p),n,p) res <- MartG_test(X, type="Linear") res <- MartG_test(X, type=cbind(X, X^2)) #the same as type = "Quad" ## map can also be defined as an expression in R. res <- MartG_test(X, type=quote(cbind(X, X^2))) # expr using quote() res <- MartG_test(X, type=substitute(cbind(X, X^2))) # expr using substitute() res <- MartG_test(X, type=expression(cbind(X, X^2))) # expr using expression() res <- MartG_test(X, type=parse(text="cbind(X, X^2)")) # expr using parse() ## map can also be defined as a function in R. map_fun <- function(X) {X <- cbind(X, X^2); X} res <- MartG_test(X, type=map_fun) Pvalue <- res$p.value rej <- res$reject# Example 1 n <- 200 p <- 10 X <- matrix(rnorm(n*p),n,p) res <- MartG_test(X, type="Linear") res <- MartG_test(X, type=cbind(X, X^2)) #the same as type = "Quad" ## map can also be defined as an expression in R. res <- MartG_test(X, type=quote(cbind(X, X^2))) # expr using quote() res <- MartG_test(X, type=substitute(cbind(X, X^2))) # expr using substitute() res <- MartG_test(X, type=expression(cbind(X, X^2))) # expr using expression() res <- MartG_test(X, type=parse(text="cbind(X, X^2)")) # expr using parse() ## map can also be defined as a function in R. map_fun <- function(X) {X <- cbind(X, X^2); X} res <- MartG_test(X, type=map_fun) Pvalue <- res$p.value rej <- res$reject
PCA_TS() seeks for a contemporaneous linear
transformation for a multivariate time series such that the transformed
series is segmented into several lower-dimensional subseries:
where is an unobservable
weakly stationary time series consisting of both
contemporaneously and serially uncorrelated subseries. See Chang, Guo and
Yao (2018).
PCA_TS( Y, lag.k = 5, opt = 1, permutation = c("max", "fdr"), thresh = FALSE, delta = 2 * sqrt(log(ncol(Y))/nrow(Y)), prewhiten = TRUE, m = NULL, beta, control = list() )PCA_TS( Y, lag.k = 5, opt = 1, permutation = c("max", "fdr"), thresh = FALSE, delta = 2 * sqrt(log(ncol(Y))/nrow(Y)), prewhiten = TRUE, m = NULL, beta, control = list() )
Y |
An |
lag.k |
The time lag
where |
opt |
An option used to choose which method will be implemented to get a
consistent estimate |
permutation |
The method of permutation procedure to assign the
components of |
thresh |
Logical. If |
delta |
The value of the threshold level |
prewhiten |
Logical. If |
m |
A positive integer used in the permutation procedure [See (2.10) in Chang, Guo and Yao (2018)]. The default is 10. |
beta |
The error rate used in the permutation procedure[See (2.16) in
Chang, Guo and Yao (2018)] when |
control |
A list of control arguments. See ‘Details’. |
The threshold operator is defined as
for any matrix
, with the threshold level and
representing the indicator function. We recommend to choose
when is fixed and when .
For large , since the sample covariance matrix may not be consistent,
we recommend to use the method proposed
in Cai, Liu and Luo (2011) to estimate the precision matrix
(opt = 2).
control is a list of arguments passed to the function clime(),
which contains the following components:
nlambda: Number of values for program generated lambda. The default is 100.
lambda.max: Maximum value of program generated lambda. The default is 0.8.
lambda.min: Minimum value of program generated lambda.
The default is () or ().
standardize: Logical. If standardize = TRUE, the
variables will be standardized to have mean zero and unit standard
deviation. The default is FALSE.
linsolver: An option used to choose which method should be employed.
Available options include "primaldual" (the default) and "simplex".
Rule of thumb: "primaldual" for large , "simplex" for small .
An object of class "tspca", which contains the following
components:
B |
The |
X |
The |
NoGroups |
The number of groups. |
No_of_Members |
The number of members in each group. |
Groups |
The indices of the components of |
method |
A string indicating which permutation procedure is performed. |
Cai, T., Liu, W., & Luo, X. (2011). A constrained L1 minimization approach for sparse precision matrix estimation. Journal of the American Statistical Association, 106, 594–607. doi:10.1198/jasa.2011.tm10155.
Chang, J., Guo, B., & Yao, Q. (2018). Principal component analysis for second-order stationary vector time series. The Annals of Statistics, 46, 2094–2124. doi:10.1214/17-AOS1613.
# Example 1 (Example 1 in the supplementary material of Chang, Guo and Yao (2018)). # p=6, x_t consists of 3 independent subseries with 3, 2 and 1 components. ## Generate x_t p <- 6;n <- 1500 X <- mat.or.vec(p,n) x <- arima.sim(model = list(ar = c(0.5, 0.3), ma = c(-0.9, 0.3, 1.2,1.3)), n = n+2, sd = 1) for(i in 1:3) X[i,] <- x[i:(n+i-1)] x <- arima.sim(model = list(ar = c(0.8,-0.5),ma = c(1,0.8,1.8) ), n = n+1, sd = 1) for(i in 4:5) X[i,] <- x[(i-3):(n+i-4)] x <- arima.sim(model = list(ar = c(-0.7, -0.5), ma = c(-1, -0.8)), n = n, sd = 1) X[6,] <- x ## Generate y_t A <- matrix(runif(p*p, -3, 3), ncol = p) Y <- A%*%X Y <- t(Y) ## permutation = "max" or permutation = "fdr" res <- PCA_TS(Y, lag.k = 5,permutation = "max") res1 <- PCA_TS(Y, lag.k = 5,permutation = "fdr", beta = 10^(-10)) Z <- res$X # Example 2 (Example 2 in the supplementary material of Chang, Guo and Yao (2018)). # p=20, x_t consists of 5 independent subseries with 6, 5, 4, 3 and 2 components. ## Generate x_t p <- 20;n <- 3000 X <- mat.or.vec(p,n) x <- arima.sim(model = list(ar = c(0.5, 0.3), ma = c(-0.9, 0.3, 1.2, 1.3)), n.start = 500, n = n+5, sd = 1) for(i in 1:6) X[i,] <- x[i:(n+i-1)] x <- arima.sim(model = list(ar = c(-0.4, 0.5), ma = c(1, 0.8, 1.5, 1.8)), n.start = 500, n = n+4, sd = 1) for(i in 7:11) X[i,] <- x[(i-6):(n+i-7)] x <- arima.sim(model = list(ar = c(0.85,-0.3), ma=c(1, 0.5, 1.2)), n.start = 500, n = n+3,sd = 1) for(i in 12:15) X[i,] <- x[(i-11):(n+i-12)] x <- arima.sim(model = list(ar = c(0.8, -0.5),ma = c(1, 0.8, 1.8)), n.start = 500, n = n+2,sd = 1) for(i in 16:18) X[i,] <- x[(i-15):(n+i-16)] x <- arima.sim(model = list(ar = c(-0.7, -0.5), ma = c(-1, -0.8)), n.start = 500,n = n+1,sd = 1) for(i in 19:20) X[i,] <- x[(i-18):(n+i-19)] ## Generate y_t A <- matrix(runif(p*p, -3, 3), ncol =p) Y <- A%*%X Y <- t(Y) ## permutation = "max" or permutation = "fdr" res <- PCA_TS(Y, lag.k = 5,permutation = "max") res1 <- PCA_TS(Y, lag.k = 5,permutation = "fdr",beta = 10^(-200)) Z <- res$X# Example 1 (Example 1 in the supplementary material of Chang, Guo and Yao (2018)). # p=6, x_t consists of 3 independent subseries with 3, 2 and 1 components. ## Generate x_t p <- 6;n <- 1500 X <- mat.or.vec(p,n) x <- arima.sim(model = list(ar = c(0.5, 0.3), ma = c(-0.9, 0.3, 1.2,1.3)), n = n+2, sd = 1) for(i in 1:3) X[i,] <- x[i:(n+i-1)] x <- arima.sim(model = list(ar = c(0.8,-0.5),ma = c(1,0.8,1.8) ), n = n+1, sd = 1) for(i in 4:5) X[i,] <- x[(i-3):(n+i-4)] x <- arima.sim(model = list(ar = c(-0.7, -0.5), ma = c(-1, -0.8)), n = n, sd = 1) X[6,] <- x ## Generate y_t A <- matrix(runif(p*p, -3, 3), ncol = p) Y <- A%*%X Y <- t(Y) ## permutation = "max" or permutation = "fdr" res <- PCA_TS(Y, lag.k = 5,permutation = "max") res1 <- PCA_TS(Y, lag.k = 5,permutation = "fdr", beta = 10^(-10)) Z <- res$X # Example 2 (Example 2 in the supplementary material of Chang, Guo and Yao (2018)). # p=20, x_t consists of 5 independent subseries with 6, 5, 4, 3 and 2 components. ## Generate x_t p <- 20;n <- 3000 X <- mat.or.vec(p,n) x <- arima.sim(model = list(ar = c(0.5, 0.3), ma = c(-0.9, 0.3, 1.2, 1.3)), n.start = 500, n = n+5, sd = 1) for(i in 1:6) X[i,] <- x[i:(n+i-1)] x <- arima.sim(model = list(ar = c(-0.4, 0.5), ma = c(1, 0.8, 1.5, 1.8)), n.start = 500, n = n+4, sd = 1) for(i in 7:11) X[i,] <- x[(i-6):(n+i-7)] x <- arima.sim(model = list(ar = c(0.85,-0.3), ma=c(1, 0.5, 1.2)), n.start = 500, n = n+3,sd = 1) for(i in 12:15) X[i,] <- x[(i-11):(n+i-12)] x <- arima.sim(model = list(ar = c(0.8, -0.5),ma = c(1, 0.8, 1.8)), n.start = 500, n = n+2,sd = 1) for(i in 16:18) X[i,] <- x[(i-15):(n+i-16)] x <- arima.sim(model = list(ar = c(-0.7, -0.5), ma = c(-1, -0.8)), n.start = 500,n = n+1,sd = 1) for(i in 19:20) X[i,] <- x[(i-18):(n+i-19)] ## Generate y_t A <- matrix(runif(p*p, -3, 3), ncol =p) Y <- A%*%X Y <- t(Y) ## permutation = "max" or permutation = "fdr" res <- PCA_TS(Y, lag.k = 5,permutation = "max") res1 <- PCA_TS(Y, lag.k = 5,permutation = "fdr",beta = 10^(-200)) Z <- res$X
"factors" objectThis function makes predictions from a "factors" object.
## S3 method for class 'factors' predict( object, newdata = NULL, n.ahead = 10, control_ARIMA = list(), control_VAR = list(), ... )## S3 method for class 'factors' predict( object, newdata = NULL, n.ahead = 10, control_ARIMA = list(), control_VAR = list(), ... )
object |
An object of class |
newdata |
Optional. A new data matrix to predict from. |
n.ahead |
An integer specifying the number of steps ahead for prediction. |
control_ARIMA |
A list of arguments passed to the function
|
control_VAR |
A list of arguments passed to the function
|
... |
Currently not used. |
Forecasting for can be implemented in two steps:
Step 1. Get the -step ahead forecast of the
time series [See Factors], denoted by
, using a VAR model
(if ) or an ARIMA model (if ). The orders
of VAR and ARIMA models are determined by AIC by default. Otherwise, they
can also be specified by users through the arguments control_VAR
and control_ARIMA, respectively.
Step 2. The forecasted value for is obtained by
.
ts_pred |
A matrix of predicted values. |
library(HDTSA) data(FamaFrench, package = "HDTSA") ## Remove the market effects reg <- lm(as.matrix(FamaFrench[, -c(1:2)]) ~ as.matrix(FamaFrench$MKT.RF)) Y_2d = reg$residuals res_factors <- Factors(Y_2d, lag.k = 5) pred_fac_Y <- predict(res_factors, n.ahead = 1)library(HDTSA) data(FamaFrench, package = "HDTSA") ## Remove the market effects reg <- lm(as.matrix(FamaFrench[, -c(1:2)]) ~ as.matrix(FamaFrench$MKT.RF)) Y_2d = reg$residuals res_factors <- Factors(Y_2d, lag.k = 5) pred_fac_Y <- predict(res_factors, n.ahead = 1)
"mtscp" objectThis function makes predictions from a "mtscp" object.
## S3 method for class 'mtscp' predict( object, newdata = NULL, n.ahead = 10, control_ARIMA = list(), control_VAR = list(), ... )## S3 method for class 'mtscp' predict( object, newdata = NULL, n.ahead = 10, control_ARIMA = list(), control_VAR = list(), ... )
object |
An object of class |
newdata |
Optional. A new data matrix to predict from. |
n.ahead |
An integer specifying the number of steps ahead for prediction. |
control_ARIMA |
A list of arguments passed to the function
|
control_VAR |
A list of arguments passed to the function
|
... |
Currently not used. |
Forecasting for can be implemented in two steps:
Step 1. Get the -step ahead forecast of the
time series
[See CP_MTS], denoted by , using a VAR model
(if ) or an ARIMA model
(if ). The orders of VAR and ARIMA models are determined by
AIC by default. Otherwise, they can also be specified by users through the
arguments control_VAR and control_ARIMA, respectively.
Step 2. The forecasted value for is obtained by
with
.
Y_pred |
A list of length |
library(HDTSA) data(FamaFrench, package = "HDTSA") ## Remove the market effects reg <- lm(as.matrix(FamaFrench[, -c(1:2)]) ~ as.matrix(FamaFrench$MKT.RF)) Y_2d = reg$residuals ## Rearrange Y_2d into a 3-dimensional array Y Y = array(NA, dim = c(NROW(Y_2d), 10, 10)) for (tt in 1:NROW(Y_2d)) { for (ii in 1:10) { Y[tt, ii, ] <- Y_2d[tt, (1 + 10*(ii - 1)):(10 * ii)] } } res_cp <- CP_MTS(Y, lag.k = 20, method = "CP.Refined") pred_cp_Y <- predict(res_cp, n.ahead = 1)[[1]]library(HDTSA) data(FamaFrench, package = "HDTSA") ## Remove the market effects reg <- lm(as.matrix(FamaFrench[, -c(1:2)]) ~ as.matrix(FamaFrench$MKT.RF)) Y_2d = reg$residuals ## Rearrange Y_2d into a 3-dimensional array Y Y = array(NA, dim = c(NROW(Y_2d), 10, 10)) for (tt in 1:NROW(Y_2d)) { for (ii in 1:10) { Y[tt, ii, ] <- Y_2d[tt, (1 + 10*(ii - 1)):(10 * ii)] } } res_cp <- CP_MTS(Y, lag.k = 20, method = "CP.Refined") pred_cp_Y <- predict(res_cp, n.ahead = 1)[[1]]
"tspca" objectThis function makes predictions from a "tspca" object.
## S3 method for class 'tspca' predict( object, newdata = NULL, n.ahead = 10, control_ARIMA = list(), control_VAR = list(), ... )## S3 method for class 'tspca' predict( object, newdata = NULL, n.ahead = 10, control_ARIMA = list(), control_VAR = list(), ... )
object |
An object of class |
newdata |
Optional. A new data matrix to predict from. |
n.ahead |
An integer specifying the number of steps ahead for prediction. |
control_ARIMA |
A list of arguments passed to the function
|
control_VAR |
A list of arguments passed to the function
|
... |
Currently not used. |
Having obtained using the PCA_TS function, which is
segmented into uncorrelated subseries
, the forecasting for
can be performed in two steps:
Step 1. Get the -step ahead forecast
by using a VAR model (if the dimension of is larger than 1)
or an ARIMA model (if the dimension of is 1). The orders
of VAR and ARIMA models are determined by AIC by default. Otherwise, they
can also be specified by users through the arguments control_VAR and control_ARIMA, respectively.
Step 2. Let .
The forecasted value for is obtained by
.
Y_pred |
A matrix of predicted values. |
library(HDTSA) data(FamaFrench, package = "HDTSA") ## Remove the market effects reg <- lm(as.matrix(FamaFrench[, -c(1:2)]) ~ as.matrix(FamaFrench$MKT.RF)) Y_2d = reg$residuals res_pca <- PCA_TS(Y_2d, lag.k = 5, thresh = TRUE) pred_pca_Y <- predict(res_pca, n.ahead = 1)library(HDTSA) data(FamaFrench, package = "HDTSA") ## Remove the market effects reg <- lm(as.matrix(FamaFrench[, -c(1:2)]) ~ as.matrix(FamaFrench$MKT.RF)) Y_2d = reg$residuals res_pca <- PCA_TS(Y_2d, lag.k = 5, thresh = TRUE) pred_pca_Y <- predict(res_pca, n.ahead = 1)
The data on new hires at a national level are obtained from the Quarterly Workforce Indicators (QWI) of the Longitudinal Employer-Household Dynamics program at the U.S. Census Bureau (Abowd et al., 2009). The national QWI hires data covers a variable number of years, with some states providing time series going back to 1990 (e.g., Washington), and others (e.g., Massachusetts) only commencing at 2010. For each of 51 states (excluding D.C. but including Puerto Rico) there is a new hires time series for each county. Additional description of the data, along with its relevancy to labor economics, can be found in Hyatt and McElroy (2019).
data(QWIdata)data(QWIdata)
A list with 51 elements. Every element contains a multivariate time series.
https://qwiexplorer.ces.census.gov/
https://ledextract.ces.census.gov/qwi/all
Abowd, J. M., Stephens, B. E., Vilhuber, L., Andersson, F., McKinney, K. L., Roemer, M., and Woodcock, S. (2009). The LEHD infrastructure files and the creation of the quarterly workforce indicators. In Producer dynamics: New evidence from micro data, pages 149–230. University of Chicago Press. doi:10.7208/chicago/9780226172576.003.0006.
Hyatt, H. R. and McElroy, T. S. (2019). Labor reallocation, employment, and earnings: Vector autoregression evidence. Labour, 33, 463–487. doi:10.1111/labr.12153
SpecMulTest() implements a new multiple testing procedure proposed in
Chang et al. (2022) for the following hypothesis testing problems:
for .
Here, represents the cross-spectral density between
and at frequency with being
the -th component of the times series ,
and and refer to
the set of index pairs and the set of frequencies associated with the
-th test, respectively.
SpecMulTest(Q, PVal, alpha = 0.05, seq_len = 0.01)SpecMulTest(Q, PVal, alpha = 0.05, seq_len = 0.01)
Q |
The number of the hypothesis tests. |
PVal |
A vector of length |
alpha |
The prescribed level for the FDR control. The default is 0.05. |
seq_len |
The step size for generating a sequence from 0 to
|
An object of class "hdtstest", which contains the following
component:
MultiTest |
A logical vector of length |
Chang, J., Jiang, Q., McElroy, T. S., & Shao, X. (2025). Statistical inference for high-dimensional spectral density matrix. Journal of the American Statistical Association, 120, 1960–1974. doi:10.1080/01621459.2025.2468013.
# Example 1 ## Generate xt n <- 200 p <- 10 flag_c <- 0.8 B <- 1000 burn <- 1000 z.sim <- matrix(rnorm((n+burn)*p),p,n+burn) phi.mat <- 0.4*diag(p) x.sim <- phi.mat %*% z.sim[,(burn+1):(burn+n)] x <- x.sim - rowMeans(x.sim) Q <- 4 ## Generate the sets Iq and Jq ISET <- list() ISET[[1]] <- matrix(c(1,2),ncol=2) ISET[[2]] <- matrix(c(1,3),ncol=2) ISET[[3]] <- matrix(c(1,4),ncol=2) ISET[[4]] <- matrix(c(1,5),ncol=2) JSET <- as.list(2*pi*seq(0,3)/4 - pi) ## Calculate Q p-values PVal <- rep(NA,Q) for (q in 1:Q) { cross.indices <- ISET[[q]] J.set <- JSET[[q]] temp.q <- SpecTest(t(x), J.set, cross.indices, B, flag_c) PVal[q] <- temp.q$p.value } res <- SpecMulTest(Q, PVal) res# Example 1 ## Generate xt n <- 200 p <- 10 flag_c <- 0.8 B <- 1000 burn <- 1000 z.sim <- matrix(rnorm((n+burn)*p),p,n+burn) phi.mat <- 0.4*diag(p) x.sim <- phi.mat %*% z.sim[,(burn+1):(burn+n)] x <- x.sim - rowMeans(x.sim) Q <- 4 ## Generate the sets Iq and Jq ISET <- list() ISET[[1]] <- matrix(c(1,2),ncol=2) ISET[[2]] <- matrix(c(1,3),ncol=2) ISET[[3]] <- matrix(c(1,4),ncol=2) ISET[[4]] <- matrix(c(1,5),ncol=2) JSET <- as.list(2*pi*seq(0,3)/4 - pi) ## Calculate Q p-values PVal <- rep(NA,Q) for (q in 1:Q) { cross.indices <- ISET[[q]] J.set <- JSET[[q]] temp.q <- SpecTest(t(x), J.set, cross.indices, B, flag_c) PVal[q] <- temp.q$p.value } res <- SpecMulTest(Q, PVal) res
SpecTest() implements a new global test proposed in
Chang et al. (2022) for the following hypothesis testing problem:
where represents the cross-spectral density between
and at frequency with being
the -th component of the times series .
Here, is the set of index pairs of interest, and
is the set of frequencies.
SpecTest(X, J.set, cross.indices, B = 1000, flag_c = 0.8)SpecTest(X, J.set, cross.indices, B = 1000, flag_c = 0.8)
X |
An |
J.set |
A vector representing the set |
cross.indices |
An |
B |
The number of bootstrap replications for generating multivariate normally distributed random vectors when calculating the critical value. The default is 2000. |
flag_c |
The bandwidth |
An object of class "hdtstest", which contains the following
components:
Stat |
The test statistic of the test. |
pval |
The p-value of the test. |
cri95 |
The critical value of the test at the significance level 0.05. |
Chang, J., Jiang, Q., McElroy, T. S., & Shao, X. (2025). Statistical inference for high-dimensional spectral density matrix. Journal of the American Statistical Association, 120, 1960–1974. doi:10.1080/01621459.2025.2468013.
# Example 1 ## Generate xt n <- 200 p <- 10 flag_c <- 0.8 B <- 1000 burn <- 1000 z.sim <- matrix(rnorm((n+burn)*p),p,n+burn) phi.mat <- 0.4*diag(p) x.sim <- phi.mat %*% z.sim[,(burn+1):(burn+n)] x <- x.sim - rowMeans(x.sim) ## Generate the sets I and J cross.indices <- matrix(c(1,2), ncol=2) J.set <- 2*pi*seq(0,3)/4 - pi res <- SpecTest(t(x), J.set, cross.indices, B, flag_c) Stat <- res$statistic Pvalue <- res$p.value CriVal <- res$cri95# Example 1 ## Generate xt n <- 200 p <- 10 flag_c <- 0.8 B <- 1000 burn <- 1000 z.sim <- matrix(rnorm((n+burn)*p),p,n+burn) phi.mat <- 0.4*diag(p) x.sim <- phi.mat %*% z.sim[,(burn+1):(burn+n)] x <- x.sim - rowMeans(x.sim) ## Generate the sets I and J cross.indices <- matrix(c(1,2), ncol=2) J.set <- 2*pi*seq(0,3)/4 - pi res <- SpecTest(t(x), J.set, cross.indices, B, flag_c) Stat <- res$statistic Pvalue <- res$p.value CriVal <- res$cri95
This function implements the test proposed in Chang, Cheng and Yao (2022) for the following hypothesis testing problem:
where is
a univariate time series.
UR_test(Y, lagk.vec = NULL, con_vec = NULL, alpha = 0.05)UR_test(Y, lagk.vec = NULL, con_vec = NULL, alpha = 0.05)
Y |
A vector |
lagk.vec |
The time lag |
con_vec |
The constant |
alpha |
The significance level of the test. The default is 0.05. |
An object of class "urtest", which contains the following
components:
statistic |
A |
reject |
An |
lag.vec |
The time lags used in function. |
Chang, J., Cheng, G., & Yao, Q. (2022). Testing for unit roots based on sample autocovariances. Biometrika, 109, 543–550. doi:10.1093/biomet/asab034.
# Example 1 ## Generate yt N <- 100 Y <-arima.sim(list(ar = c(0.9)), n = 2*N, sd = sqrt(1)) con_vec <- c(0.45, 0.55, 0.65) lagk.vec <- c(0, 1, 2) UR_test(Y, lagk.vec = lagk.vec, con_vec = con_vec, alpha = 0.05) UR_test(Y, alpha = 0.05)# Example 1 ## Generate yt N <- 100 Y <-arima.sim(list(ar = c(0.9)), n = 2*N, sd = sqrt(1)) con_vec <- c(0.45, 0.55, 0.65) lagk.vec <- c(0, 1, 2) UR_test(Y, lagk.vec = lagk.vec, con_vec = con_vec, alpha = 0.05) UR_test(Y, alpha = 0.05)
WN_test() implements the white noise tests proposed in Chang, Yao and Zhou
(2017) and Chang et al. (2026+) for the following hypothesis testing problem:
WN_test( Y, lag.k = 2, B = 1000, method = c("L_inf", "L_2"), kernel.type = c("QS", "Par", "Bart"), pre = FALSE, alpha = 0.05, control.PCA = list() )WN_test( Y, lag.k = 2, B = 1000, method = c("L_inf", "L_2"), kernel.type = c("QS", "Par", "Bart"), pre = FALSE, alpha = 0.05, control.PCA = list() )
Y |
An |
lag.k |
The time lag |
B |
The number of bootstrap replications for calculating the critical value. The default is 1000. |
method |
The option for method used in the white noise test. Available options
include: |
kernel.type |
The option for choosing the symmetric kernel used
in the estimation of long-run covariance matrix, which is used when
|
pre |
Logical. This parameter is used when |
alpha |
The significance level of the test. The default is 0.05. |
control.PCA |
A list of control arguments passed to the function
|
An object of class "hdtstest", which contains the following
components:
statistic |
The test statistic of the test. |
p.value |
The p-value of the test. |
lag.k |
The time lag used in function. |
kernel.type |
The kernel used in function. |
Chang, J., He, J., Li, W., & Lin, C. (2026+). An adaptive -type test
for high-dimensional white noise. Preprint.
Chang, J., Guo, B., & Yao, Q. (2018). Principal component analysis for second-order stationary vector time series. The Annals of Statistics, 46, 2094–2124. doi:10.1214/17-AOS1613.
Chang, J., Yao, Q., & Zhou, W. (2017). Testing for high-dimensional white noise using maximum cross-correlations. Biometrika, 104, 111–127. doi:10.1093/biomet/asw066.
#Example 1 ## Generate data n <- 200 p <- 10 Y <- matrix(rnorm(n * p), n, p) ## L_inf res1 <- WN_test(Y, method ="L_inf") Pvalue1 <- res1$p.value statistic1 <- res1$statistic ## L_2 res2 <- WN_test(Y, method = "L_2") Pvalue2 <- res2$p.value statistic2 <- res2$statistic#Example 1 ## Generate data n <- 200 p <- 10 Y <- matrix(rnorm(n * p), n, p) ## L_inf res1 <- WN_test(Y, method ="L_inf") Pvalue1 <- res1$p.value statistic1 <- res1$statistic ## L_2 res2 <- WN_test(Y, method = "L_2") Pvalue2 <- res2$p.value statistic2 <- res2$statistic