Date of Award

5-2026

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Electrical and Computer Engineering (Holcomb Dept. of)

Committee Chair/Advisor

Tao Wei

Committee Member

Xiaoyong Yuan

Committee Member

Fatemeh Afghah

Abstract

This thesis presents a system that allows large AI models to run directly on personal devices instead of relying on cloud servers. Recent advances in artificial intelligence, especially large language models (LLMs), have made it possible to build powerful applications such as chatbots, coding assistants, and intelligent agents. However, most of these systems run in the cloud, which raises concerns about privacy, latency, and cost.

To address these issues, this work develops a local AI serving system that runs efficiently on a specialized hardware component called a Neural Processing Unit (NPU). The system provides a unified interface that supports multiple types of tasks, including text generation, image understanding, speech recognition, and embedding-based search. It also supports advanced features such as streaming responses and tool calling, which are essential for building modern AI agents.

The system is designed to be compatible with widely used APIs, allowing existing applications to use it without modification. Experimental results show that the system works correctly across different tasks and improves efficiency through techniques such as prompt caching.

Overall, this work demonstrates that it is possible to run advanced AI systems locally in a practical and efficient way, enabling faster, more private, and more flexible AI applications.

Recommended Citation

Ni, Zhiheng, "Implementation of a Local LLM Serving System for Agentic AI" (2026). All Theses. 4796.
https://open.clemson.edu/all_theses/4796

Included in

Electrical and Computer Engineering Commons

Implementation of a Local LLM Serving System for Agentic AI
