Date of Award
August 2020
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Engineering
Committee Member
Richard R Brooks
Committee Member
James Martin
Committee Member
Harlan Russell
Committee Member
Kuang-Ching Wang
Abstract
Network traffic analysis is using metadata to infer information from traffic flows. Network traffic flows are the tuple of source IP, source port, destination IP, and destination port. Additional information is derived from packet length, flow size, interpacket delay, Ja3 signature, and IP header options. Even connections using TLS leak site name and cipher suite to observers. This metadata can profile groups of users or individual behaviors.
Statistical properties yield even more information. The hidden Markov model can track the state of protocols where each state transition results in an observation. Format Transforming Encryption (FTE) encodes data as the payload of another protocol. The emulated protocol is called the host protocol. Observation-based FTE is a particular case of FTE that uses real observations from the host protocol for the transformation. By communicating using a shared dictionary according to the predefined protocol, it can difficult to detect anomalous traffic.
Combining observation-based FTEs with hidden Markov models (HMMs) emulates every aspect of a host protocol. Ideal host protocols would cause significant collateral damage if blocked (protected) and do not contain dynamic handshakes or states (static). We use protected static protocols with the Protocol Proxy--a proxy that defines the syntax of a protocol using an observation-based FTE and transforms data to payloads with actual field values. The Protocol Proxy massages the outgoing packet's interpacket delay to match the host protocol using an HMM. The HMM ensure the outgoing traffic is statistically equivalent to the host protocol. The Protocol Proxy is a covert channel, a method of communication with a low probability of detection (LPD). These covert channels trade-off throughput for LPD.
The multipath TCP (mpTCP) Linux kernel module splits a TCP streams across multiple interfaces. Two potential architectures involve splitting a covert channel across several interfaces (multipath) or splitting a single TCP stream across multiple covert channels (multisession). Splitting a covert channel across multiple interfaces leads to higher throughput but is classified as mpTCP traffic. Splitting a TCP flow across multiple covert channels is not as performant as the previous case, but it provides added obfuscation and resiliency. Each covert channel is independent of the others, and a channel failure is recoverable.
The multipath and multisession frameworks provide independently address the issues associated with covert channels. Each tool addresses a challenge. The Protocol Proxy provides anonymity in a setting were detection could have critical consequences. The mpTCP kernel module offers an architecture that increases throughput despite the channel's low-bandwidth restrictions. Fusing these architectures improves the goodput of the Protocol Proxy without sacrificing the low probability of detection.
Recommended Citation
Oakley, Jonathan, "Traffic Analysis Resistant Infrastructure" (2020). All Dissertations. 2673.
https://open.clemson.edu/all_dissertations/2673