Date of Award
5-2022
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Civil Engineering
Committee Chair/Advisor
Dr. Tuyen (Robert) Le
Committee Member
Dr. Kalyan R. Piratla
Committee Member
Dr. Kapil Chalil Madathil
Committee Member
Dr. Da Li
Abstract
Contract documents are a critical legal component of a construction project: they specify the owner's wishes and expectations for the design, construction, and handover of the project. A single contract package, especially for a design-build (DB) project, comprises hundreds of documents containing thousands of requirements. Precise comprehension and management of these requirements are critical to ensure that all important explicit and implicit requirements of the project scope are captured, managed, and completed. Because requirements are mainly written in natural language, current manual methods impose a significant burden on practitioners, who must process and restructure them into a manageable format at different construction stages. Manual methods are also prone to human error, which can result in costly delays and legal disputes. With the advancement of natural language processing (NLP) techniques, there have been several efforts to automate requirement processing and management. However, the automated models developed by previous researchers are highly domain-specific, application-oriented, and applicable to quantitative requirements only. Because those models were trained on specific datasets, categories, and rules, their applicability is limited to particular applications. To address these gaps, this study proposes a novel requirement digitalization framework that uses NLP techniques to process and restructure requirements in contracts.
The proposed framework comprises four main models: (1) an NLP-based binary text classification model that leverages rules and machine learning algorithms to extract all requirements from construction contracts, (2) an NLP-based multiclass text classification model that classifies the requirements into different categories (such as design, construction, and operation and maintenance), (3) a syntactic rule-based requirement tagging model that employs NLP to extract project-activity-related information (such as actor, action, and object) from the requirements, and (4) a semantic NLP-based requirement prioritization model that ranks requirements by severity level. The models were evaluated using several metrics, including accuracy, precision, recall, and F-score. The evaluations were performed on datasets of unseen requirements extracted from contracts of real DB projects. The effectiveness of the proposed models was further investigated through experimental studies comparing their performance with that of humans. The proposed models yielded performance scores ranging from 80% to 96%.
Recommended Citation
Hassan, Fahad Ul, "Digitalization of Construction Project Requirements Using Natural Language Processing (NLP) Techniques" (2022). All Dissertations. 3024.
https://open.clemson.edu/all_dissertations/3024
Author ORCID Identifier
0000-0002-2308-2606
Included in
Civil Engineering Commons, Construction Engineering and Management Commons, Digital Communications and Networking Commons, Other Computer Engineering Commons, Transportation Engineering Commons