Seminar on Wednesday, December 28, 2011
December 28, 2011 @ 12:00 pm - 1:00 pm
Bayesian Penalized Methods for High-Dimensional Data
Full Title: Bayesian Penalized Methods for High-Dimensional Data and Network Analysis
Speaker: Zakaria S Khondker, University of North Carolina at Chapel Hill, USA
Date/Time: Wednesday, December 28, 2011, 12:00 noon
Venue: ISRT Seminar Room
The curse of dimensionality boils down to dealing with more parameters than the sample size reasonably permits. When the dimension exceeds the sample size, the model is unidentifiable and not all parameters are estimable. Even when the dimension is smaller than the sample size, the estimators are unstable if the dimension-to-sample-size ratio is not small enough or if there is collinearity among the predictors. Penalized methods for shrinkage of parameters are therefore becoming increasingly popular. The advent of high-dimensional data, where the number of covariates (p) or responses (d) exceeds the sample size (n), has made traditional estimation techniques infeasible. Even when the sample size is larger than the number of parameters, shrinkage can improve performance for both mean and covariance parameters.
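As a small illustration of the instability described above, the sketch below (not code from the talk; the data, penalty value, and near-collinear design are invented for demonstration) compares ordinary least squares with a ridge (L2-penalized) estimate when two predictors are nearly collinear:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 25                                   # sample size barely above dimension
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)   # near-collinear pair of predictors
beta = np.zeros(p)
beta[0] = 1.0                                   # only the first predictor matters
y = X @ beta + rng.normal(size=n)

# Ordinary least squares: unstable under collinearity, coefficients blow up
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Ridge estimate: the L2 penalty shrinks coefficients and stabilizes the fit
lam = 1.0                                       # illustrative penalty, not tuned
b_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

print(np.linalg.norm(b_ols), np.linalg.norm(b_ridge))
```

The shrunken estimate has a much smaller norm: the penalty trades a little bias for a large reduction in variance, which is the basic motivation for all of the penalized methods discussed in the talk.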
Our first paper focuses on estimation of sparse covariance matrices and their inverses subject to positive definiteness constraints. The abundance of high-dimensional data, where the sample size (n) is less than the dimension (d), requires shrinkage estimation methods since the maximum likelihood estimator is not positive definite in this case. Furthermore, when n is larger than d but not sufficiently larger, shrinkage estimation is more stable than maximum likelihood as it reduces the condition number of the precision matrix. Frequentist methods have utilized penalized likelihoods, whereas Bayesian approaches rely on matrix decompositions, Wishart priors, or graph theory for shrinkage. In this paper we propose a new Bayesian method, called the Bayesian Covariance Lasso (BCLASSO), for the shrinkage estimation of a precision (covariance) matrix. We consider a class of priors for the precision matrix that leads to the popular frequentist penalties as special cases, develop a Bayes estimator for the precision matrix, and propose an efficient sampling scheme that does not precalculate boundaries for positive definiteness. The proposed method is permutation invariant and performs shrinkage and estimation simultaneously for non-full-rank data. Simulations show that the proposed BCLASSO performs similarly to frequentist methods for non-full-rank data.
Our second paper focuses on estimation of the matrix of regression coefficients for a high-dimensional multivariate response. Common approaches for dimension reduction in high-dimensional data include variable selection and penalized regression. Penalized approaches like lasso, adaptive lasso, SCAD, and the Bayesian lasso have been used to estimate mean parameters for a multivariate response. A less explored approach for multivariate response involves dimension reduction via a reduced-rank decomposition of the regression coefficient matrix, which takes advantage of correlations among the regression coefficients that arise from correlation among both responses and predictors. The approach may be advantageous when genes work in unison, affecting each other, and the small effects of many genes add up to a larger phenotypic impact. We first derive the framework for L1 priors on the multivariate coefficient matrix under the traditional approach. Then we develop the generalized low rank regression (GLRR) model under L2 priors and derive the framework for L1 priors. Simulations and an application to ADNI data suggest that GLRR has a substantial advantage over traditional approaches: it greatly reduces the number of parameters while performing much better, and its comparative performance improves further as the dimension grows.
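The idea of a reduced-rank coefficient matrix can be sketched as follows. This is not the GLRR model from the talk (which is Bayesian and penalized); it is a minimal frequentist stand-in, with invented dimensions and noise level, that truncates the singular value decomposition of the OLS coefficient matrix to an assumed rank r, cutting the parameter count from p*d to r*(p + d):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, d, r = 200, 10, 8, 2           # samples, predictors, responses, assumed rank
A = rng.normal(size=(p, r))
B = rng.normal(size=(r, d))
C_true = A @ B                       # true coefficient matrix has rank r
X = rng.normal(size=(n, p))
Y = X @ C_true + 0.1 * rng.normal(size=(n, d))

# Full-rank OLS fit: p*d free parameters
C_ols = np.linalg.lstsq(X, Y, rcond=None)[0]

# Reduced-rank estimate: keep only the top-r singular directions of the
# OLS fit, so the estimate has r*(p + d) effective parameters instead
U, s, Vt = np.linalg.svd(C_ols, full_matrices=False)
C_rr = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

err_ols = np.linalg.norm(C_ols - C_true)
err_rr = np.linalg.norm(C_rr - C_true)
print(err_ols, err_rr)
```

Discarding the trailing singular directions removes estimation noise that a full-rank fit spends parameters on, which is the intuition behind the parameter reduction and improved performance the abstract reports for GLRR.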
Details
- Date:
- December 28, 2011
- Time:
- 12:00 pm - 1:00 pm
- Event Category:
- seminar