Date of Award
5-2022
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Mathematical Sciences
Committee Chair/Advisor
Christopher McMahan
Committee Member
Xiaoqian Sun
Committee Member
Andrew Brown
Committee Member
Yu-Bo Wang
Abstract
In this dissertation, we develop novel techniques for the regression analysis of data arising from group testing processes and lay the groundwork for graphics processing unit (GPU)-enabled implementations. Group testing is used primarily in clinical laboratories, where it allows patients to be diagnosed quickly and cheaply. Typically, a pooled specimen (several specimens combined into one sample) is tested instead of testing individual specimens one by one. This method reduces costs by using fewer tests when the disease prevalence is low. Owing to recent advances in diagnostic technology, group testing protocols have been extended to incorporate multiplex assays, which are diagnostic tests that, unlike their predecessors, test for multiple infectious agents simultaneously. The diseases that a multiplex assay screens for typically share co-infection risks. The positive correlation stemming from these co-infection risks creates a more challenging modeling framework. In this work, we develop a Bayesian regression methodology that can analyze multiplex testing outcomes collected as part of any group testing protocol. The model maintains marginal interpretability of the regression parameters and, when the assay accuracies are unknown, allows the regression parameters to be estimated simultaneously with the assay's sensitivity (true positive rate) and specificity (true negative rate). Based on a carefully constructed data augmentation strategy, we derive a Markov chain Monte Carlo (MCMC) posterior sampling algorithm that can be used to complete model fitting. We demonstrate our methodology via numerical simulations and by using it to analyze data on chlamydia and gonorrhea, two sexually transmitted infections, collected as part of the testing efforts of Iowa's public health laboratory. The drawback of this regression framework is the computational intensity of the steps in the proposed MCMC algorithm.
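To make the cost argument concrete, the following is a minimal sketch of the expected number of tests per person. It assumes a simple two-stage protocol (test the pool, then retest every member of a positive pool), a perfectly accurate assay, and independent disease statuses. These assumptions, and the function name `expected_tests_per_person`, are illustrative only and are not taken from the dissertation, whose methodology covers general protocols and imperfect assays.

```python
def expected_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected tests per individual under two-stage pooling.

    Illustrative assumptions (not from the dissertation): a perfect assay,
    independent disease statuses, and retesting of every member of a
    positive pool.
    """
    if pool_size < 2:
        raise ValueError("pool_size must be at least 2")
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    # A pool tests positive when at least one member is positive.
    prob_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    # One pooled test shared by pool_size people, plus one retest per
    # person whenever the pool is positive.
    return 1.0 / pool_size + prob_pool_positive


low = expected_tests_per_person(prevalence=0.01, pool_size=10)
high = expected_tests_per_person(prevalence=0.50, pool_size=10)
print(f"prevalence 1%:  {low:.3f} tests per person")
print(f"prevalence 50%: {high:.3f} tests per person")
```

Under these assumptions, pooling needs fewer than one test per person at a prevalence of 1% but more than one at a prevalence of 50%, which is why the savings depend on low prevalence.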
Because of these computational costs, the algorithm does not scale well to the high-volume clinical laboratory settings where group testing is commonly employed. The proposed algorithm must therefore be accelerated to provide faster model fitting and prediction. Several steps of the devised MCMC algorithm are either independent of one another (e.g., sampling the latent disease status of each patient) or consist of matrix operations. These parts of the algorithm are well suited to acceleration through parallel processing. Parallel processing was historically carried out on large-scale computer clusters, but GPUs can perform similar computations. To address the scaling issue with our MCMC algorithm, we explore the GPU as a remedy. We discuss the GPU techniques needed to take the first steps toward fitting group testing regression models on GPUs. Specifically, we explore the implementation of stochastic gradient and coordinate descent algorithms that involve independent steps and matrix operations. Finally, we provide an example MCMC implementation on a GPU to demonstrate the potential acceleration.
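As a rough illustration of why an independent sampling step parallelizes well, the sketch below draws every latent status in one vectorized operation. It uses NumPy on the CPU as a stand-in for a GPU kernel and assumes that each individual's conditional probability of being positive has already been computed and that the draws are conditionally independent. The function name `sample_latent_statuses` and the example probabilities are hypothetical; this is not the dissertation's sampler.

```python
import numpy as np


def sample_latent_statuses(cond_probs, rng):
    """Draw one Bernoulli status per individual in a single vectorized step.

    cond_probs[i] is a hypothetical, precomputed conditional probability
    that individual i is truly positive. Each draw uses only its own
    probability and its own uniform variate, so no draw waits on another.
    """
    cond_probs = np.asarray(cond_probs, dtype=float)
    # One uniform variate in [0, 1) per individual.
    uniforms = rng.random(cond_probs.shape)
    # Elementwise comparison: status is 1 with probability cond_probs[i].
    return (uniforms < cond_probs).astype(np.int8)


rng = np.random.default_rng(2022)
probs = np.array([0.0, 0.02, 0.5, 1.0])  # made-up values for illustration
draws = sample_latent_statuses(probs, rng)
print(draws)
```

Because the comparison is elementwise, the same pattern maps directly onto one GPU thread per individual, which is the kind of structure the abstract identifies as suitable for acceleration.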
Recommended Citation
Cubre, Paul, "Groundwork for the Development of GPU Enabled Group Testing Regression Models" (2022). All Dissertations. 3026.
https://open.clemson.edu/all_dissertations/3026