Seminar of the Multidisciplinary Node of Applied Mathematics (Seminario del Nodo Multidisciplinario de Matemáticas Aplicadas)

Data Reduction Prior to Inference: Is it Sensible to Use Principal Component Scores to Make Group Comparisons in a Student's t-test or ANOVA?

Speaker: Edward J. Bedrick
Institution: Department of Epidemiology and Biostatistics, University of Arizona, Tucson
Event type: Research
When: 11/11/2019, from 16:00 to 17:00
Where: Computer lab (Aula de cómputo), IMATE Juriquilla

Abstract: There has been significant recent development of statistical methods for inference with high-dimensional data. Despite these developments, biomedical researchers and computational scientists often use a simple two-step process to analyze multivariate data: first, the dimensionality is reduced using a standard technique such as principal component analysis; then the groups are compared using a t-test or analysis of variance. In this talk I will try to untangle a number of issues associated with this approach, starting with the simplest but most vexing question: what hypothesis is being tested? I will use a combination of approaches, including asymptotics, analytical construction of worst-case scenarios, and simulation based on actual data, to address whether this approach is sensible. Although the asymptotics will consider a non-sparse setting, some discussion of sparse problems will be given. A short discussion of the use of PC scores for classification will also be provided.
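
For readers unfamiliar with the two-step process the abstract refers to, the following is a minimal sketch of it on simulated data. It is not taken from the talk. The data-generating setup (five variables, a group difference placed only in the lowest-variance variable), the helper names `first_pc_scores` and `welch_t`, and the use of power iteration in place of a numerical library are all assumptions made for illustration.

```python
import math
import random
import statistics


def first_pc_scores(rows, iters=500):
    """Scores on the first principal component of the pooled data.

    Hypothetical helper: uses power iteration on the sample covariance
    matrix so that the sketch needs only the standard library.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p  # starting direction
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered observation onto the leading direction.
    return [sum(c[j] * v[j] for j in range(p)) for c in centered]


def welch_t(a, b):
    """Welch two-sample t statistic (statistic only, no p-value)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        va / len(a) + vb / len(b))


# Assumed toy setup: the groups differ only in the last variable,
# which also has the smallest variance.
random.seed(0)
sds = [3.0, 1.0, 1.0, 1.0, 0.5]


def draw(shift):
    row = [random.gauss(0.0, s) for s in sds]
    row[-1] += shift
    return row


group_a = [draw(0.0) for _ in range(30)]
group_b = [draw(1.0) for _ in range(30)]
pooled = group_a + group_b

# Step 1: reduce the pooled data to first-PC scores.
scores = first_pc_scores(pooled)
# Step 2: compare the groups on those scores.
t_pc1 = welch_t(scores[:30], scores[30:])
# For contrast: the same test on the variable that actually differs.
t_last = welch_t([r[-1] for r in group_a], [r[-1] for r in group_b])

print("t on PC1 scores:", t_pc1)
print("t on last variable:", t_last)
```

The sign of a principal component is arbitrary, so the sign of `t_pc1` carries no meaning. Comparing `t_pc1` with `t_last` in a setup like this is one way to probe the abstract's question of which hypothesis the test on PC scores actually addresses.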