Date of Award
8-2022
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Industrial Engineering
Committee Chair/Advisor
Thomas Sharkey
Committee Member
Yongja Song
Committee Member
Emily Tucker
Abstract
Qualitative coding is a long and strenuous process that requires a highly skilled investigator. Natural language processing techniques have advanced by leaps and bounds in usability and range of application domains, although they do not work for every task. In this work, we have created a natural language processing framework to help qualitative coders automatically obtain the nodes and arcs of a sex trafficking network from federal case files, dockets, and indictments. The produced nodes and arcs allow us to perform network modeling by providing the information needed to create network structures that can then be used for interdiction simulation. The network models can also be analyzed for patterns, trends, and contrasts. Another goal for these networks is to apply Operations Research (OR) methods to better understand the operations of sex trafficking networks. Results were better for the node extraction task than for arc extraction, raising the question: does automation belong in the process of coding sex trafficking networks? If so, future implementations should avoid rule-based matching, despite the highly structured nature of court documents. Additionally, more data would help improve model accuracy; however, obtaining ground truth data requires human coders. This thesis helps to address the question of how automated techniques, such as natural language processing and machine learning, can play a role in qualitative coding and thematic analysis. Further, by focusing on obtaining networks from text documents, it provides a basis for inputs into operations research models.
Recommended Citation
Diaz, Maria, "Generating Sex Trafficking Networks From Text Documents" (2022). All Theses. 3869.
https://open.clemson.edu/all_theses/3869