IWSM 2020: Invited Speakers
- Maria Durban (Professor of Statistics. University Carlos III of Madrid, Spain).
A general framework for prediction in generalized additive mixed models
TBA
- Montserrat Fuentes (Provost, Professor of Statistics, Actuarial Science and Biostatistics. University of Iowa, USA).
Spatial Statistical Modelling of Neuroimaging Data
Imaging data with thousands of spatially correlated data points are common in many fields. In neuroscience, magnetic resonance imaging (MRI) is a primary modality for studying brain structure and activity. Modeling the spatial dependence of MRI data at different scales is one of the main challenges of contemporary neuroimaging, and doing so would allow accurate tests of significance in neural activity. The high dimensionality of this type of data (millions of voxels) presents modeling challenges and serious computational constraints. Methods that account for spatial correlation often require cumbersome matrix evaluations that are prohibitive for data of this size, so current methods typically reduce dimensionality by modeling covariance among regions of interest (coarser, larger spatial units) rather than among voxels. However, ignoring spatial dependence at different scales could drastically reduce our ability to detect important activation patterns in the brain and hence produce misleading results. To overcome these problems, we introduce a novel Bayesian tensor approach that treats the brain image as the response and a vector of covariates as predictors. Our method provides estimates of the parameters of interest using a generalized sparsity principle, and it is implemented in a fully Bayesian framework to characterize the different sources of uncertainty. We demonstrate posterior consistency and develop a computationally efficient algorithm. The effectiveness of our approach is illustrated through simulation studies and an analysis of the effects of drug addiction on brain structure. We apply the method to identify the effects of demographic information and cocaine addiction on brain function.
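The abstract describes the model only at a high level. As a concrete illustration (ours, not the speakers' implementation), the sketch below fits a toy 2-D tensor-response regression, Y_i = x_i (u ∘ v) + noise, with a sparse rank-1 coefficient image estimated by alternating least squares with soft-thresholding, a simple point-estimate analogue of the Bayesian sparsity prior described above; all dimensions and tuning values are assumptions.

```r
## Minimal illustrative sketch (not the speakers' implementation): a toy
## tensor-response regression with a sparse rank-1 coefficient image.
set.seed(1)
p1 <- 20; p2 <- 20; n <- 200
u_true <- c(rep(1, 5), rep(0, p1 - 5))   # sparse row loadings
v_true <- c(rep(0, p2 - 5), rep(1, 5))   # sparse column loadings
x <- rnorm(n)                            # one scalar predictor per subject
Y <- array(0, dim = c(n, p1, p2))        # n "images" of size p1 x p2
for (i in 1:n)
  Y[i, , ] <- x[i] * (u_true %o% v_true) + matrix(rnorm(p1 * p2, sd = 0.5), p1, p2)

soft <- function(z, lam) sign(z) * pmax(abs(z) - lam, 0)  # soft-threshold operator

## Spectral initialization: leading singular vectors of the x-weighted mean image.
S <- matrix(0, p1, p2)
for (i in 1:n) S <- S + x[i] * Y[i, , ]
sv <- svd(S); u <- sv$u[, 1]; v <- sv$v[, 1]

lam <- 0.05
for (iter in 1:50) {
  ## Given v, the least-squares update for u is sum_i x_i Y_i v / (sum_i x_i^2 ||v||^2);
  ## soft-thresholding then enforces elementwise sparsity. Symmetrically for v.
  num_u <- rowSums(sapply(1:n, function(i) x[i] * drop(Y[i, , ] %*% v)))
  u <- soft(num_u / (sum(x^2) * sum(v^2)), lam)
  num_v <- rowSums(sapply(1:n, function(i) x[i] * drop(t(Y[i, , ]) %*% u)))
  v <- soft(num_v / (sum(x^2) * sum(u^2)), lam)
}
B_hat <- u %o% v   # estimated (slightly shrunk) sparse coefficient image
```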
- Yudi Pawitan (Professor of Biostatistics. Karolinska Institutet, Sweden).
- Virginie Rondeau (Directeur de Recherche, INSERM Bordeaux, France).
The use of joint modelling to validate surrogate failure-time endpoints
In many biomedical areas, the identification and validation of surrogate endpoints is of prime interest for reducing the duration and/or size of clinical trials. Numerous validation methods have been proposed, the most popular of which is based on a two-step analysis strategy in a meta-analytic context. For two failure-time endpoints, two association measures are usually considered, one at the individual level and one at the trial level. However, this approach is not always applicable, mainly due to convergence or estimation problems in clinical trials. We present here approaches based on joint frailty models and a one-step validation method, offering new, attractive, and well-developed tools for the validation of failure-time surrogate endpoints. Both individual- and trial-level surrogacy are evaluated using a new definition of Kendall's tau and the coefficient of determination.
In this work we aim to popularize these new surrogate-endpoint validation approaches by making the methods available in a user-friendly R package. Thus, we provide in the frailtypack R package numerous tools, including more flexible functions, for the validation of candidate surrogate endpoints using data from multiple randomized clinical trials. In particular, the package provides the surrogate threshold effect, which is used in combination with R^2_{trial} to decide on the validity of a surrogate endpoint. frailtypack also makes it possible to predict the treatment effect on the true endpoint in a new trial from the treatment effect observed on the surrogate endpoint, and leave-one-out cross-validation is available for assessing the accuracy of this prediction under the joint surrogate model. Other tools cover data generation, simulation studies, and graphical representations. We illustrate the use of the new functions with both real and simulated data.
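As a hedged sketch of that workflow: the function names below (jointSurroPenal, ste, loocv) follow our reading of the frailtypack documentation and may differ across package versions; check the package help pages before relying on them.

```r
## Assumed frailtypack interface; see ?jointSurroPenal in your installed version.
library(frailtypack)
data(dataOvarian)   # meta-analytic example dataset shipped with the package

## One-step joint surrogate (frailty) model; summary() reports the
## individual-level Kendall's tau and the trial-level R^2_trial.
fit <- jointSurroPenal(data = dataOvarian)
summary(fit)

## Surrogate threshold effect, used together with R^2_trial to decide
## on the validity of the candidate surrogate endpoint.
ste(fit)

## Predict the treatment effect on the true endpoint in a new trial
## from the effect observed on the surrogate endpoint.
predict(fit)

## Leave-one-out cross-validation of the prediction accuracy.
loocv(fit)
```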
- Stijn Vansteelandt (Professor of Statistics. Ghent University, Belgium).
Machine learning for the evaluation of treatment effects: challenges, solutions and improvements
The evaluation of treatment effects from observational studies typically requires adjustment for high-dimensional confounding, the result of a lack of comparability between treated and untreated subjects in possibly many (pre-treatment) factors that are also related to the outcome. While such adjustment is routinely achieved via parametric modelling, this is not entirely satisfactory: model misspecification is likely, and even relatively minor misspecifications over the observed data range may induce large bias in the treatment effect estimate. Over the past two decades there has therefore been growing interest in the use of machine learning methods to assist this task. This is not surprising given the enormous contributions the machine learning literature has made to predicting outcomes from possibly high-dimensional predictors or features. In this talk, I will therefore focus on the use of machine learning for the evaluation of (causal) treatment effects. This turns out to be a challenging task: while the prediction performance of a given machine learning algorithm can be measured by contrasting observed and predicted outcomes, such evaluation becomes impossible when machine learning is used for treatment effect estimation, since the true treatment effect is always unknown. I will demonstrate that naive use of existing machine learning algorithms is problematic for treatment evaluation and explain why that is the case. I will next give a gentle introduction to pioneering work on Targeted Learning and on Double Machine Learning, and discuss improvements that we have made to these techniques. Throughout the talk, machine learning will be considered in the broad sense as any algorithm that uses data to learn a model for the data, thus including (though not being limited to) routine variable selection procedures. The talk is based on joint work with Oliver Dukes (Ghent University) and will be accessible to attendees without a detailed understanding of machine learning algorithms.
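To ground the Double Machine Learning idea, here is a minimal sketch (ours, not the speaker's code) of a cross-fitted, doubly robust (AIPW) estimator of the average treatment effect. The stepwise-glm nuisance fits stand in for arbitrary machine learning learners, consistent with the broad definition above; all variable names and simulation settings are illustrative assumptions.

```r
## Minimal sketch of double machine learning for the average treatment effect:
## cross-fitting plus the doubly robust (AIPW) score. Stepwise selection is
## used as the "machine learning" nuisance learner purely for illustration.
set.seed(42)
n <- 2000; p <- 10
X <- matrix(rnorm(n * p), n, p)
ps <- plogis(0.5 * X[, 1] - 0.5 * X[, 2])      # true propensity score
A <- rbinom(n, 1, ps)                           # treatment indicator
Y <- A + X[, 1] + 0.5 * X[, 3] + rnorm(n)       # outcome; true ATE = 1
dat <- data.frame(Y, A, X)

K <- 2                                          # cross-fitting folds
fold <- sample(rep(1:K, length.out = n))
psi <- numeric(n)                               # per-subject AIPW score
for (k in 1:K) {
  train <- dat[fold != k, ]; test <- dat[fold == k, ]
  ## Nuisance 1: propensity model, with variable selection.
  ps.fit <- step(glm(A ~ . - Y, data = train, family = binomial), trace = 0)
  ## Nuisance 2: outcome models fitted separately in each treatment arm.
  out1 <- step(lm(Y ~ . - A, data = subset(train, A == 1)), trace = 0)
  out0 <- step(lm(Y ~ . - A, data = subset(train, A == 0)), trace = 0)
  e  <- predict(ps.fit, newdata = test, type = "response")
  m1 <- predict(out1, newdata = test)
  m0 <- predict(out0, newdata = test)
  ## Doubly robust score, evaluated only on the held-out fold.
  psi[fold == k] <- m1 - m0 +
    test$A * (test$Y - m1) / e - (1 - test$A) * (test$Y - m0) / (1 - e)
}
ate <- mean(psi)
se  <- sd(psi) / sqrt(n)                        # influence-function standard error
c(estimate = ate, lower = ate - 1.96 * se, upper = ate + 1.96 * se)
```

Cross-fitting (estimating nuisances on one fold and evaluating the score on the other) is what protects the estimator from the overfitting bias that plagues naive plug-in use of machine learning.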