Date of Award
8-2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
School of Computing
Committee Chair/Advisor
Feng Luo
Committee Member
Long Cheng
Committee Member
Abolfazl Razi
Committee Member
Linke Guo
Committee Member
Yuyuan Ouyang
Abstract
Optimization is a fundamental process in machine learning: it refines model parameters to improve performance, and it underpins a wide range of machine learning techniques, encompassing diverse algorithms and methodologies tailored to specific tasks and objectives.
In machine learning, datasets are commonly structured as matrices or tensors, making matrix and tensor factorization indispensable for extracting meaningful representations from intricate data. Datasets also often comprise multiple sets of features, which motivates our exploration of effective strategies for leveraging information from diverse sources during optimization. Additionally, the interconnected nature of multiple tasks presents opportunities to improve learning performance by addressing these tasks jointly. Starting from deterministic algorithms as our foundation, we investigate ways to accelerate updates with randomized algorithms while maintaining solution quality. The specific studies conducted in this work are as follows:
In matrix/tensor factorization, we address the symmetric matrix factorization problem by introducing a regularization term. We devise an efficient column-wise update rule and establish a versatile framework capable of solving symmetric matrix factorization problems under various constraints. We also propose a robust nonnegative tensor factorization formulation inspired by the matrix $L_{2,p}$-norm ($0 < p < 2$), and incorporate a graph regularization term into the formulation to further improve performance. We then derive efficient update rules for the proposed optimization problems and give rigorous proofs of the convergence and correctness of the algorithms.
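To make the setting concrete, a minimal sketch of a regularized symmetric matrix factorization objective (the specific regularizer $\mathcal{R}$ and weight $\lambda$ are illustrative assumptions, not the dissertation's exact formulation) is
$$\min_{U \geq 0} \; \|X - UU^\top\|_F^2 + \lambda\, \mathcal{R}(U),$$
where $X$ is the symmetric data matrix and the factor $U$ is updated column by column. For the robust tensor formulation, the matrix $L_{2,p}$-norm referenced above is commonly defined as $\|E\|_{2,p} = \big(\sum_{i} \|e^i\|_2^p\big)^{1/p}$ over rows $e^i$, with $0 < p < 2$; smaller $p$ downweights large row-wise residuals, which is the source of robustness to outliers.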
In exploring how to effectively leverage diverse information sources during optimization, we first present a robust multi-view kernel subspace clustering method. This approach enhances the consensus affinity matrix by integrating information from both the learning and clustering stages, and it extends the formulation from linear space to kernel space to capture the nonlinear structures inherent in multi-view data. We further propose a multi-task learning framework that incorporates prior knowledge of feature relations: it penalizes coefficient changes for specific features while capturing a common set of features through group sparsity.
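As one concrete instance of capturing a common feature set through group sparsity, a standard group-sparse multi-task objective (a sketch under generic assumptions; the per-task loss $\mathcal{L}$ and weight $\lambda$ are illustrative) is
$$\min_{W \in \mathbb{R}^{d \times T}} \; \sum_{t=1}^{T} \mathcal{L}\big(w_t; X_t, y_t\big) + \lambda \sum_{j=1}^{d} \|W^{j}\|_2,$$
where $w_t$ is the coefficient vector of task $t$ and $W^{j}$ is the $j$-th row of the coefficient matrix $W$. The $\ell_{2,1}$-type penalty drives entire rows of $W$ to zero, so the features that survive are selected jointly across all $T$ tasks.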
Finally, we recognize that incorporating randomness can be effective in certain scenarios. We partition parameter learning in multi-task learning into separate processes for the task-shared component and the task-specific variation. We then extend the multi-task learning problem with a randomized optimization approach, highlighting the trade-off between weaker performance guarantees and reduced computational cost.
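A minimal sketch of this trade-off, assuming a generic block-separable objective $f$ over blocks $x_1, \dots, x_n$ (e.g., one block per task, with a common instantiation of the shared/specific split writing each task's parameters as $w_t = w_0 + v_t$): at iteration $k$, sample a block index $i_k$ uniformly at random and update only that block,
$$x_{i_k}^{k+1} = x_{i_k}^{k} - \eta\, \nabla_{i_k} f(x^k), \qquad x_j^{k+1} = x_j^{k} \quad (j \neq i_k).$$
Each iteration touches a single block instead of all $n$, cutting per-iteration cost by roughly a factor of $n$, while convergence guarantees hold only in expectation rather than deterministically. This is a generic randomized block-coordinate scheme, not the dissertation's specific algorithm.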
Recommended Citation
Zhang, Mengyuan, "Optimization Strategies to Enhance Performance in Matrix/Tensor Factorization and Multi-Source Data Integration" (2024). All Dissertations. 3733.
https://open.clemson.edu/all_dissertations/3733