Seminar on Classification and Clustering for RNA-seq data with variable selection 

June 12, 2023 @ 12:00 pm - 1:00 pm

Speaker: Tanbin Rahman, PhD, FDA, USA
Title: Classification and Clustering for RNA-seq data with variable selection
Abstract: Clustering and classification play an important role in medicine, both in identifying subtypes of complex diseases and in building predictive models. In recent years, falling costs and high accuracy have made RNA-seq widely popular, and its use is expected to keep growing over the next few years. An important feature of RNA-seq data is its count structure. Although there is a large literature on both clustering and classification methods, most of these methods are either heuristic or designed for continuous data, and they do not directly generalize to count data.

In the first part of the presentation, we develop a negative binomial mixture model with lasso or fused lasso gene regularization to cluster samples (small n) with high-dimensional gene features (large p). A modified EM algorithm is used for inference, and the Bayesian information criterion is used to determine the tuning parameters. The method is compared with existing methods using extensive simulations and two real transcriptomic applications, in rat brain and breast cancer studies. The results show the superior performance of the proposed count data model in clustering accuracy, feature selection, and biological interpretation in pathways.
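
To make the ingredients of such a model concrete, the following is a minimal sketch, not the speaker's algorithm. It assumes a mean-dispersion parametrization of the negative binomial (variance = mu + phi * mu^2), a single dispersion value shared by all genes, and known cluster means; the function names `nb_logpmf`, `e_step`, and `soft_threshold` are hypothetical. It shows the E-step of a negative binomial mixture (posterior cluster probabilities for each sample) and the soft-thresholding operator that commonly appears in lasso-penalized updates.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log-probability of count y under a negative binomial with mean mu > 0
    and dispersion phi > 0, so that Var(y) = mu + phi * mu**2 (assumed form)."""
    r = 1.0 / phi  # "size" parameter
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def e_step(counts, means, weights, phi):
    """Posterior cluster probabilities (responsibilities) for each sample.

    counts:  list of samples, each a list of gene counts
    means:   one list of per-gene means for each cluster
    weights: mixing proportions, one per cluster
    """
    resp = []
    for sample in counts:
        log_post = [
            math.log(w) + sum(nb_logpmf(y, m, phi) for y, m in zip(sample, mu_k))
            for w, mu_k in zip(weights, means)
        ]
        top = max(log_post)  # subtract the maximum for numerical stability
        unnorm = [math.exp(lp - top) for lp in log_post]
        total = sum(unnorm)
        resp.append([u / total for u in unnorm])
    return resp


def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within lam of zero become zero."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


if __name__ == "__main__":
    # Toy data: two samples, three genes, two clusters (values are made up).
    counts = [[0, 1, 0], [48, 55, 51]]
    means = [[1.0, 1.0, 1.0], [50.0, 50.0, 50.0]]
    print(e_step(counts, means, weights=[0.5, 0.5], phi=0.5))
    print(soft_threshold(3.0, 1.0), soft_threshold(-0.5, 1.0))
```

In a full method the cluster means and dispersions would be re-estimated in an M-step, where a lasso penalty shrinks cluster-specific gene effects to zero and so selects genes. That step is omitted here because the abstract does not specify it.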

In the second part of the presentation, we will discuss a classification model based on the negative binomial distribution within a generalized linear model framework. The model uses double regularization, for gene and covariate sparsity, to accommodate three key elements: adequate modeling of count data with overdispersion, gene selection, and adjustment for covariate effects. The proposed method is evaluated in simulations and in two real applications, using cervical tumor miRNA-seq data and schizophrenia post-mortem brain tissue RNA-seq data, to demonstrate its superior performance in prediction accuracy and feature selection.
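
As an illustration of what "double regularization" can look like, here is a minimal sketch under stated assumptions; it is one plausible reading of the abstract, not the speaker's formulation. It assumes a log link for a single gene, log(mu_i) = alpha + beta * class_i + sum_j gamma_j * x_ij, with one L1 penalty on the class (gene) effect and a separate L1 penalty on the covariate effects. The names `penalized_nll`, `lam_gene`, and `lam_cov` are hypothetical.

```python
import math


def nb_logpmf(y, mu, phi):
    """Negative binomial log-probability with mean mu and dispersion phi,
    assuming Var(y) = mu + phi * mu**2."""
    r = 1.0 / phi
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def penalized_nll(counts, labels, covariates, alpha, beta, gamma, phi,
                  lam_gene, lam_cov):
    """Penalized negative log-likelihood for one gene.

    counts:     observed counts, one per sample
    labels:     class indicator (0 or 1) per sample
    covariates: list of covariate vectors, one per sample
    beta:       class effect for this gene (penalized by lam_gene)
    gamma:      covariate effects (penalized by lam_cov)
    """
    nll = 0.0
    for y, c, x in zip(counts, labels, covariates):
        eta = alpha + beta * c + sum(g * xj for g, xj in zip(gamma, x))
        nll -= nb_logpmf(y, math.exp(eta), phi)  # log link: mu = exp(eta)
    penalty = lam_gene * abs(beta) + lam_cov * sum(abs(g) for g in gamma)
    return nll + penalty


if __name__ == "__main__":
    # Toy inputs (made up): four samples, two covariates.
    counts = [3, 7, 20, 15]
    labels = [0, 0, 1, 1]
    covariates = [[0.1, 1.0], [0.4, 0.0], [0.2, 1.0], [0.3, 0.0]]
    print(penalized_nll(counts, labels, covariates, alpha=1.5, beta=1.0,
                        gamma=[0.2, -0.1], phi=0.3, lam_gene=0.5, lam_cov=0.5))
```

Minimizing an objective of this kind over many genes drives some class effects and some covariate effects exactly to zero, which is how the two penalties would deliver gene selection and covariate sparsity at the same time.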
