Install the current release from CRAN:
Or install the latest development version from GitHub:
For n
samples, we simulate p
inputs
(features, covariates) and q
outputs (outcomes, responses).
We assume high-dimensional inputs (p
\(\gg\) n
) and low-dimensional
outputs (q
\(\ll\)
n
).
We simulate the p
inputs from a multivariate normal
distribution. For the mean, we use the p
-dimensional vector
mu
, where all elements equal zero. For the covariance, we
use the p
\(\times\)
p
matrix Sigma
, where the entry in row \(i\) and column \(j\) equals rho
\(^{|i-j|}\). The parameter rho
determines the strength of the correlation among the inputs, with small
rho
leading weak correlations, and large rho
leading to strong correlations (0 < rho
< 1). The
input matrix X
has n
rows and p
columns.
mu <- rep(0,times=p)
rho <- 0.90
Sigma <- rho^abs(col(diag(p))-row(diag(p)))
X <- MASS::mvrnorm(n=n,mu=mu,Sigma=Sigma)
We simulate the input-output effects from independent Bernoulli
distributions. The parameter pi
determines the number of
effects, with small pi
leading to few effects, and large
pi
leading to many effects (0 < pi
< 1).
The scalar alpha
represents the intercept, and the
p
-dimensional vector beta
represents the
slopes.
From the intercept alpha
, the slopes beta
and the inputs X
, we calculate the linear predictor, the
n
-dimensional vector eta
. Rescale the linear
predictor to make the effects weaker or stronger. Set the argument
family
to "gaussian"
, "binomial"
,
or "poisson"
to define the distribution. The n
times p
matrix Y
represents the outputs. We
assume the outcomes are positively correlated.
eta <- alpha + X %*% beta
eta <- 1.5*scale(eta)
family <- "gaussian"
if(family=="gaussian"){
mean <- eta
Y <- replicate(n=q,expr=rnorm(n=n,mean=mean))
}
if(family=="binomial"){
prob <- 1/(1+exp(-eta))
Y <- replicate(n=q,expr=rbinom(n=n,size=1,prob=prob))
}
if(family=="poisson"){
lambda <- exp(eta)
Y <- replicate(n=q,expr=rpois(n=n,lambda=lambda))
}
cor(Y)
The function joinet
fits univariate and multivariate
regression. Set the argument alpha.base
to 0 (ridge) or 1
(lasso).
Standard methods are available. The function predict
returns the predicted values from the univariate (base
) and
multivariate (meta
) models. The function coef
returns the estimated intercepts (alpha
) and slopes
(beta
) from the multivariate model (input-output effects).
And the function weights
returns the weights from stacking
(output-output effects).
The function cv.joinet
compares the predictive
performance of univariate (base
) and multivariate
(meta
) regression by nested cross-validation. The argument
type.measure
determines the loss function.
## [,1] [,2]
## base 1.259044 1.309195
## meta 1.240890 1.220423
## none 3.104446 3.558173
Armin Rauschenberger and Enrico Glaab (2021). “Predicting correlated outcomes from molecular data”. Bioinformatics 37(21):3889–3895. doi: 10.1093/bioinformatics/btab576. (Click here to access PDF.)