TLS Mutated Message Generation

Friday, October 18, 2024 | 9 min read | Updated Sunday, December 8, 2024

YuanFeng Xie

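This post describes the RFC rule-extraction process in detail: manually segmenting the natural-language description, manually marking state-transition nodes, engineering the rule extraction, and constructing rule violations. RFC 8446 serves as the running example; the same workflow can be applied to other open network protocols that involve state transitions, such as RSTP. The LLM components are built on GPT-4o-mini with system prompts.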

Prior Work

AFLnet
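AFLnet collects historical message traffic, builds a state machine from it, and uses that state machine to construct message sequences that conform to the protocol specification. A state selector picks specific message sequences, which are sent to the server under test to observe its feedback. The state machine is used here to keep the input message sequences valid.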

HDiff
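HDiff takes RFC rules as input and extracts message structure and message constraints separately. Message structure is obtained by parsing the text with manually defined ABNF grammars and is used to generate baseline valid messages. Message constraints are obtained with NLP methods that capture strongly worded requirement statements in the RFC, and they guide message mutation. Rule extraction yields specification requirements (SRs) that contain both field constraints and processing constraints.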

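Syntactic constraints
Syntactic constraints: the length, type, order, and dependencies of message fields, along with how some fields are computed, affect message parsing. Violating them causes parse errors.

Semantic constraints
Semantic constraints: the order, dependencies, and computation of some message fields affect the type of response but not message parsing. Violating them causes an error response rather than a parse failure.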

CHATAFL
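Building on HDiff, CHATAFL uses LLM prompt engineering to generalize beyond HTTP to other protocols. It incorporates the AFLnet idea of a state machine and uses the LLM to generate message skeletons and to mutate messages.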

RFCdiff
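Compared with CHATAFL, RFCdiff focuses on TLS, a considerably more complex protocol. The difficulty is that the fields inside a TLS handshake message carry many syntactic constraints, so mutating a message to violate a semantic constraint tends to break syntactic constraints as well and cause parse errors.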

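Workflow Overview

Goal
Take the rules described in the RFC text as input and use an LLM as the tool to build an agent chain that extracts the construction constraints on messages the client sends and the processing feedback for messages the server receives. These guide fuzzing and help check for inconsistencies between the RFC standard and network protocol implementations.

Generating abnormal TLS handshake messages can be divided into four phases: rule processing, seed collection, mutation operator generation, and test verification. Here the state machine is used in the mutation strategy for input messages.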

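Rule extraction

Structure: an LLM performs rule extraction in four steps: document splitting, rule extraction, rule pair construction, and rule filtering.

Results per stage:

Document splitting: 216 semantic segments -> 97 segments that contain strongly worded requirement rules
Rule extraction + rule pair construction: 174 constraint rules -> 55 rules that explicitly contain both a message-sending constraint and a message-processing constraint

Rule filtering: 20 rules that concern the ClientHello message and do not depend on the message sequence.

Expected: <CLI-MSG-CONST> <1> (Clients MUST place the "pre_shared_key" extension as the last extension in the ClientHello) + <SRV-MSG-PROC> <1> (Servers MUST verify that the "pre_shared_key" extension is the last extension in ClientHello and fail the handshake with an "illegal_parameter" alert if it is not)
Actual: <CLI-MSG-CONST> <1> (Clients MUST update the "pre_shared_key" extension by recalculating the "obfuscated_ticket_age" and binder values, and MAY remove incompatible PSKs) + <SRV-MSG-PROC> <0> (Servers MUST verify the updated "pre_shared_key" values in the new ClientHello)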

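Seed collection turns the collected historical TLS handshake messages into mutation templates.
Mutation operator generation comprises rule violation generation, mutation operator sequence generation, and state transition generation.
Test verification fills in the mutated message content from the message template and the mutation operator sequence derived from rule extraction, sends the result to the server under test, and compares the feedback with the expected state transition. If they match, the case is skipped; if not, the message is recorded as a problem message for analysis.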

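Rule Extraction

Rule extraction workflow overview
Rule extraction has three parts:
Natural Semantics-based Splitting: take the handshake protocol chapter of RFC 8446 as input and use the LLM's summarization ability to split the text by topic and by the completeness of each passage, outputting each segment's content and its topic.
NL Rule Extraction: take the split segments and use the LLM's text classification ability to classify their content into message construction constraints (MSG-CONST) and message processing constraints (MSG-PROC). Because TLS has two kinds of endpoints, client and server, these divide further into client message construction constraints (CLI-MSG-CONST), client message processing constraints (CLI-MSG-PROC), server message construction constraints (SRV-MSG-CONST), and server message processing constraints (SRV-MSG-PROC). Each rule also carries two attributes, constraint strength and original text. Constraint strength reflects how mandatory the rule is (MUST, MAY, and so on) and how well supported it is (stated directly in the text or inferred from it). The original text supports manual checking of the LLM output, which exposes flaws in the prompt design and guides later improvement.
Rule Pair And Filter: the classified rules are scattered and cannot guide testing of a TLS implementation on their own, so they are paired. Each rule pair contains two constraints. Several combinations are possible, but two dominate: client message construction constraint (CLI-MSG-CONST) + server message processing constraint (SRV-MSG-PROC), and server message construction constraint (SRV-MSG-CONST) + client message processing constraint (CLI-MSG-PROC). In terms of the TLS handshake, these describe how the server handles a client handshake message that satisfies a given constraint, or how the client handles a server handshake message that satisfies a given constraint. The current experiment focuses on the first case, CLI-MSG-CONST + SRV-MSG-PROC: the message under test is the ClientHello, and the feedback falls into three broad classes: ServerHello, HelloRetryRequest, and Alert.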
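The rule categories and the pairing filter described above can be pictured with a small data model. This is only a sketch under stated assumptions: the type and field names (`RuleKind`, `Rule`, `RulePair`, `explicit`) are hypothetical, and the `<1>`/`<0>` flag from the examples is interpreted here as "stated directly" versus "inferred", which the post does not define precisely.

```rust
/// The four rule categories named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuleKind {
    CliMsgConst, // client message construction constraint
    CliMsgProc,  // client message processing constraint
    SrvMsgConst, // server message construction constraint
    SrvMsgProc,  // server message processing constraint
}

/// One extracted rule (hypothetical layout).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Rule {
    pub kind: RuleKind,
    /// Assumed meaning of the <1>/<0> flag: true = stated directly in the RFC text.
    pub explicit: bool,
    /// The normalized rule sentence produced by the LLM.
    pub text: String,
    /// The original RFC passage, kept for manual checking.
    pub source: String,
}

/// A sender-side constraint paired with the receiver-side handling.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RulePair {
    pub sender: Rule,
    pub receiver: Rule,
}

impl RulePair {
    /// True for the CLI-MSG-CONST + SRV-MSG-PROC combination.
    pub fn is_client_to_server(&self) -> bool {
        self.sender.kind == RuleKind::CliMsgConst && self.receiver.kind == RuleKind::SrvMsgProc
    }
}

/// Keep only client-to-server pairs in which both halves are explicit.
pub fn keep_explicit_client_pairs(pairs: Vec<RulePair>) -> Vec<RulePair> {
    pairs
        .into_iter()
        .filter(|p| p.is_client_to_server() && p.sender.explicit && p.receiver.explicit)
        .collect()
}
```

Keeping `source` next to `text` mirrors the stated purpose of the original-text attribute: a reviewer can compare the LLM's sentence against the RFC passage it came from.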

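Note
Because the LLM extrapolates from existing content, its output may conflict with that content. Adding a role description, task description, chain-of-thought design, output constraints, and worked examples helps the LLM complete the task more reliably.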

Document splitting: extract segments that are semantically complete in natural language.

When multiple extensions of different types are present, the extensions MAY appear in any order, with the exception of "pre_shared_key" which MUST be the last extension in the ClientHello (but can appear anywhere in the ServerHello extensions block). There MUST NOT be more than one extension of the same type in a given extension block.
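The reduction from 216 segments to 97 "strongly worded" segments is done by the LLM in this workflow. As a rough illustration of what such a filter looks for, the sketch below scans a segment for the strong RFC 2119 keywords (MUST, MUST NOT, SHALL, SHALL NOT, REQUIRED). It is a keyword heuristic offered as an assumption, not the author's method, and the function names are hypothetical.

```rust
/// Collect the strong RFC 2119 keywords found in a segment, in order.
/// Only upper-case whole words count, so "must" in running prose is ignored.
pub fn strong_keywords(segment: &str) -> Vec<String> {
    // Split on anything that is not an ASCII letter to get whole words.
    let words: Vec<&str> = segment
        .split(|c: char| !c.is_ascii_alphabetic())
        .filter(|w| !w.is_empty())
        .collect();

    let mut found = Vec::new();
    let mut i = 0;
    while i < words.len() {
        match words[i] {
            "MUST" | "SHALL" => {
                // Fold a following "NOT" into the same keyword.
                if words.get(i + 1) == Some(&"NOT") {
                    found.push(format!("{} NOT", words[i]));
                    i += 1;
                } else {
                    found.push(words[i].to_string());
                }
            }
            "REQUIRED" => found.push("REQUIRED".to_string()),
            _ => {}
        }
        i += 1;
    }
    found
}

/// A segment qualifies when it contains at least one strong keyword.
pub fn has_strong_requirement(segment: &str) -> bool {
    !strong_keywords(segment).is_empty()
}
```

On the passage above this reports one MUST and one MUST NOT, while a sentence that only uses MAY is not selected.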

Rule extraction and rule pairing: keep only the CMC+SMP pattern (CLI-MSG-CONST + SRV-MSG-PROC)

Example input
'''
When multiple extensions of different types are present, the extensions MAY appear in any order, with the exception of "pre_shared_key" which MUST be the last extension in the ClientHello (but can appear anywhere in the ServerHello extensions block). There MUST NOT be more than one extension of the same type in a given extension block.
'''
Example output
'''
<CLI-MSG-CONST> <1> (Clients MUST place the "pre_shared_key" extension last in ClientHello, while other extensions MAY appear in any order) + <SRV-MSG-PROC> <1> (Servers MUST verify that the "pre_shared_key" is the last extension in ClientHello)
<CLI-MSG-CONST> <1> (Clients MUST NOT include multiple extensions of the same type in any extension block) + <SRV-MSG-PROC> <1> (Servers MUST reject ClientHello messages containing duplicate extension types)
'''
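Downstream steps need the LLM's textual output in structured form. The sketch below parses one output line of the shape `<TAG> <flag> (text) + <TAG> <flag> (text)` into two records. The names `ParsedRule` and `parse_rule_pair` are hypothetical, and the parser assumes the literal separator `) + <` does not occur inside a rule's text.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct ParsedRule {
    pub tag: String, // e.g. "CLI-MSG-CONST"
    pub flag: u8,    // the <1>/<0> marker
    pub text: String,
}

/// Parse one half: `<TAG> <flag> (text)`.
fn parse_rule(s: &str) -> Option<ParsedRule> {
    let rest = s.trim().strip_prefix('<')?;
    let (tag, rest) = rest.split_once('>')?;
    let rest = rest.trim_start().strip_prefix('<')?;
    let (flag, rest) = rest.split_once('>')?;
    let text = rest.trim().strip_prefix('(')?.strip_suffix(')')?;
    Some(ParsedRule {
        tag: tag.to_string(),
        flag: flag.trim().parse().ok()?,
        text: text.to_string(),
    })
}

/// Parse a full line into (client rule, server rule).
pub fn parse_rule_pair(line: &str) -> Option<(ParsedRule, ParsedRule)> {
    let (left, right) = line.split_once(") + <")?;
    // split_once drops the separator, so restore the ')' and '<' of each half.
    let left = format!("{})", left);
    let right = format!("<{}", right);
    Some((parse_rule(&left)?, parse_rule(&right)?))
}
```

Returning `Option` lets a caller drop malformed lines without aborting the batch, which is useful when the LLM occasionally deviates from the requested format.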

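Template Construction

Message template construction workflow overview
Message template construction has two parts:
TLS Handshake Packet Template Generation: build the handshake template message with a TLS library (rustls). The current experiment focuses on the ClientHello message.
TLS Handshake Packet Field Parsing: take the handshake template message, walk through its contents, and build a field-parsing framework that classifies each field as a fixed value, a fixed range, a random value, and so on. The current experiment focuses on parsing the ClientHello fields.

Handshake message template generation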

use std::sync::Arc;

// The helper functions and the `terminal` / `network_connect` modules used below
// are defined elsewhere in the project.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Check the local network environment
    perform_local_network_test()?;
    // Read the command-line arguments
    let matches = get_command_matches();
    // Print the usage guide and exit if requested
    if matches.get_flag("use_guide") {
        terminal::print_help();
        return Ok(());
    }
    // Validate the arguments
    let server_name = get_server_name(&matches);
    let server_ip = get_server_ip(&matches);
    let port = get_port(&matches);
    let easy_read = matches.get_flag("easy_read");
    print_configuration_info(&server_name, &server_ip, port, easy_read);
    // Check the server environment
    perform_server_environment_test(&server_ip, port)?;
    // Prepare the TLS handshake: build the config and the client connection
    let config = Arc::new(network_connect::create_tls_config());
    let mut conn = network_connect::create_client_connection(config, server_name.clone())?;
    // Buffer that will hold the ClientHello template
    let mut client_hello = Vec::new();
    // Let rustls write the template message into the buffer
    conn.write_tls(&mut client_hello)?;
    // Parse the ClientHello message
    parse_client_hello_if_enabled(&matches, &client_hello, easy_read);
    send_client_hello_if_test_env(&matches, &server_ip, port, &mut conn, &client_hello, easy_read)?;
    terminal::print_help();
    Ok(())
}

Message template parsing:

// Clone is derived because mutate_client_hello() clones the parsed template.
#[derive(Clone, Debug)]
pub struct ClientHello {
    // TLS Record Layer
    pub content_type: u8,           // fixed value 0x16
    pub version: [u8; 2],           // fixed value [0x03, 0x01]
    pub record_length: u16,         // dynamic, must be computed

    // Handshake Layer
    pub handshake_type: u8,         // fixed value 0x01
    pub handshake_length: [u8; 3],  // dynamic, must be computed

    // ClientHello specific fields
    pub client_version: [u8; 2],    // fixed value [0x03, 0x03]
    pub random: [u8; 32],           // 32 random bytes
    pub session_id_length: u8,      // dynamic
    pub session_id: Vec<u8>,        // dynamic
    pub cipher_suites_length: u16,  // dynamic
    pub cipher_suites: Vec<u8>,     // dynamic
    pub compression_methods_length: u8,  // dynamic
    pub compression_methods: Vec<u8>,    // dynamic
    pub extensions_length: u16,     // dynamic
    pub extensions: Vec<Extension>, // dynamic
}

#[derive(Clone, Debug)]
pub struct Extension {
    pub extension_type: [u8; 2],
    pub extension_length: [u8; 2],
    pub extension_content: Vec<u8>,
}
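To make the field walk concrete, here is a minimal sketch of a parser that reads a ClientHello from a single TLS record, following the field order of the struct above. It assumes the ClientHello fits in one record and uses its own simplified types (`ParsedHello`, `RawExtension`, `Reader` are hypothetical names); length fields are consumed during parsing instead of being stored.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RawExtension {
    pub ext_type: u16,
    pub data: Vec<u8>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParsedHello {
    pub legacy_version: [u8; 2],
    pub random: [u8; 32],
    pub session_id: Vec<u8>,
    pub cipher_suites: Vec<u8>,
    pub compression_methods: Vec<u8>,
    pub extensions: Vec<RawExtension>,
}

/// Bounds-checked cursor over a byte slice.
struct Reader<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let end = self.pos.checked_add(n)?;
        let out = self.buf.get(self.pos..end)?;
        self.pos = end;
        Some(out)
    }
    fn read_u8(&mut self) -> Option<u8> {
        Some(self.take(1)?[0])
    }
    fn read_u16(&mut self) -> Option<u16> {
        let b = self.take(2)?;
        Some(u16::from_be_bytes([b[0], b[1]]))
    }
    fn read_u24(&mut self) -> Option<usize> {
        let b = self.take(3)?;
        Some(((b[0] as usize) << 16) | ((b[1] as usize) << 8) | b[2] as usize)
    }
}

/// Parse one record holding a complete ClientHello; None on any inconsistency.
pub fn parse_client_hello(record: &[u8]) -> Option<ParsedHello> {
    let mut r = Reader { buf: record, pos: 0 };
    if r.read_u8()? != 0x16 { return None; } // content type: handshake
    r.take(2)?;                              // record-layer version
    let record_len = r.read_u16()? as usize;
    if record_len + 5 != record.len() { return None; }
    if r.read_u8()? != 0x01 { return None; } // handshake type: client_hello
    if r.read_u24()? + 4 != record_len { return None; }

    let v = r.take(2)?;
    let legacy_version = [v[0], v[1]];
    let mut random = [0u8; 32];
    random.copy_from_slice(r.take(32)?);
    let n = r.read_u8()? as usize;
    let session_id = r.take(n)?.to_vec();
    let n = r.read_u16()? as usize;
    let cipher_suites = r.take(n)?.to_vec();
    let n = r.read_u8()? as usize;
    let compression_methods = r.take(n)?.to_vec();

    // The extensions block is itself length-prefixed; walk it with a sub-reader.
    let total = r.read_u16()? as usize;
    let mut ext = Reader { buf: r.take(total)?, pos: 0 };
    let mut extensions = Vec::new();
    while ext.pos < ext.buf.len() {
        let ext_type = ext.read_u16()?;
        let len = ext.read_u16()? as usize;
        extensions.push(RawExtension { ext_type, data: ext.take(len)?.to_vec() });
    }
    Some(ParsedHello { legacy_version, random, session_id, cipher_suites, compression_methods, extensions })
}
```

Checking each declared length against the bytes actually present is what separates a syntactically valid template from a malformed one, which matters later when mutated messages must stay parseable.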

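Mutation Operator Generation

Mutation operator generation workflow overview
Mutation operator generation has three parts:
Structured Rule Violation Pair: take the client message construction constraint from each rule pair and rewrite it as a violation of that constraint, leaving the server message processing constraint unchanged. The current experiment focuses on generating violations of ClientHello construction constraints.
NL Rule Violation Pair Expected Response: take the rule pairs whose client constraint has been rewritten as a violation and replace the server message processing constraint with the expected server feedback. The current experiment focuses on the server feedback that a ClientHello can trigger.
Mutation Operator Sequence Generation: combine atomic mutators into a sequence and replace the client constraint violation in each rule pair with that mutation operator sequence. The current experiment focuses on mutation operator sequences for the ClientHello.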

Structured Rule Violation Pair

- Example input
'''
{CMC} <1> (Clients MAY include multiple extensions in any order, except for "pre_shared_key" which MUST be the last extension.) + {SMP} <1> (Servers MAY process multiple extensions in any order, except for "pre_shared_key" which MUST be the last extension.)
'''
- Example output
'''
{CMC} (Send a "pre_shared_key" extension with an incorrect "obfuscated_ticket_age" value that does not match the expected computation.)  + {SMP} <1> (Servers MAY process multiple extensions in any order, except for "pre_shared_key" which MUST be the last extension.)
{CMC} (Include an additional "pre_shared_key" extension with conflicting binder values, causing ambiguity in the handshake.) + {SMP} <1> (Servers MAY process multiple extensions in any order, except for "pre_shared_key" which MUST be the last extension.)
'''

NL Rule Violation Pair Expected Response

- Example input
'''
{CMC} (Send a "pre_shared_key" extension with an incorrect "obfuscated_ticket_age" value that does not match the expected computation.)  + {SMP} <1> (Servers MAY process multiple extensions in any order, except for "pre_shared_key" which MUST be the last extension.)
'''
- Example output
'''
{CMC} (Send a "pre_shared_key" extension with an incorrect "obfuscated_ticket_age" value that does not match the expected computation.)  + {SMP} (The server MUST abort the handshake with an "illegal_parameter" alert)
'''
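For automated comparison, the expected-feedback sentence has to be reduced to something machine-checkable. One possible approach, sketched below, pulls the quoted alert name out of the SMP text and maps it to its numeric AlertDescription from RFC 8446. The function names are hypothetical, and only a subset of alerts relevant to ClientHello handling is listed.

```rust
/// Numeric AlertDescription values from RFC 8446 (subset).
pub fn alert_code(name: &str) -> Option<u8> {
    match name {
        "unexpected_message" => Some(10),
        "handshake_failure" => Some(40),
        "illegal_parameter" => Some(47),
        "decode_error" => Some(50),
        "protocol_version" => Some(70),
        "missing_extension" => Some(109),
        "unsupported_extension" => Some(110),
        _ => None,
    }
}

/// Return the code of the first quoted alert name in the expected-feedback text.
pub fn expected_alert(smp_text: &str) -> Option<u8> {
    // After splitting on '"', quoted names sit at the odd positions.
    smp_text.split('"').skip(1).step_by(2).find_map(alert_code)
}
```

Quoted strings that are not alert names, such as "pre_shared_key", are skipped because they have no entry in the table.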

Atomic Mutator

use std::collections::HashMap;

impl ClientHelloMutator {
    // Select the mutate function by HashMap key; the value is the field content after mutation.
    // The per-field mutate_* methods are defined elsewhere in the project.
    pub fn mutate(&mut self, mutation_config: &HashMap<u8, Vec<u8>>) {
        for (&key, value) in mutation_config.iter() {
            match key {
                1 => self.mutate_random(value),
                2 => self.mutate_session_id(value),
                3 => self.mutate_cipher_suites(value),
                // 4 => self.mutate_compression_methods(value),
                // 5 => self.mutate_extensions(value),
                _ => println!("Unknown mutation key: {}", key),
            }
        }
    }
}

// Overall mutation process:
// 1. initialize the mutator from the parsed ClientHello,
// 2. apply the mutations with mutate(),
// 3. restore syntactic correctness by updating every length field
pub fn mutate_client_hello(client_hello: &ClientHello, mutation_config: &HashMap<u8, Vec<u8>>) -> ClientHello {
    let mut mutator = ClientHelloMutator::new(client_hello.clone());
    mutator.mutate(mutation_config);
    mutator.update_lengths();
    mutator.get_mutated_client_hello().clone()
}
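The post does not show `update_lengths()`. The underlying idea, recomputing every length prefix from the inside out after a mutation, can be sketched independently as below. The function names are hypothetical, and the sketch assumes payloads small enough to fit the 2-byte and 3-byte length fields.

```rust
/// Prefix a payload with its 2-byte big-endian length.
/// Assumes payload.len() fits in u16.
fn with_u16_len(payload: &[u8]) -> Vec<u8> {
    let mut out = (payload.len() as u16).to_be_bytes().to_vec();
    out.extend_from_slice(payload);
    out
}

/// Encode extensions as (type, data) pairs; each length is derived from the data.
pub fn encode_extensions(exts: &[(u16, Vec<u8>)]) -> Vec<u8> {
    let mut block = Vec::new();
    for (ty, data) in exts {
        block.extend_from_slice(&ty.to_be_bytes());
        block.extend_from_slice(&with_u16_len(data));
    }
    // The whole block carries its own 2-byte length prefix.
    with_u16_len(&block)
}

/// Wrap a ClientHello body in a handshake header and a record header.
pub fn wrap_client_hello(body: &[u8]) -> Vec<u8> {
    // Handshake header: type 0x01 plus a 3-byte length (low 3 bytes of a u32).
    let len = (body.len() as u32).to_be_bytes();
    let mut hs = vec![0x01, len[1], len[2], len[3]];
    hs.extend_from_slice(body);
    // Record header: type 0x16, version 0x0301, then the 2-byte record length.
    let mut rec = vec![0x16, 0x03, 0x01];
    rec.extend_from_slice(&with_u16_len(&hs));
    rec
}
```

Deriving lengths at serialization time means a semantic mutation, such as reordering extensions, cannot accidentally leave a stale length behind and turn into a parse error.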

Mutation operator generation: to be added.
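As a placeholder illustration only, one possible shape for a mutation operator sequence over the extension list is sketched here. The operator set (`MoveToFront`, `Duplicate`, `Remove`) and all names are assumptions and not the author's design. The extension type numbers are the IANA values: pre_shared_key is 41, supported_versions is 43, key_share is 51.

```rust
pub const PRE_SHARED_KEY: u16 = 41;

/// Hypothetical atomic operators acting on the extension list.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ExtOp {
    MoveToFront(u16), // move the first extension of this type to position 0
    Duplicate(u16),   // insert a copy right after the first extension of this type
    Remove(u16),      // drop every extension of this type
}

pub type Ext = (u16, Vec<u8>);

/// Apply the operators in order and return the mutated extension list.
pub fn apply_ops(mut exts: Vec<Ext>, ops: &[ExtOp]) -> Vec<Ext> {
    for op in ops {
        match *op {
            ExtOp::MoveToFront(ty) => {
                if let Some(i) = exts.iter().position(|e| e.0 == ty) {
                    let e = exts.remove(i);
                    exts.insert(0, e);
                }
            }
            ExtOp::Duplicate(ty) => {
                if let Some(i) = exts.iter().position(|e| e.0 == ty) {
                    let copy = exts[i].clone();
                    exts.insert(i + 1, copy);
                }
            }
            ExtOp::Remove(ty) => exts.retain(|e| e.0 != ty),
        }
    }
    exts
}
```

For example, `MoveToFront(PRE_SHARED_KEY)` violates the rule that "pre_shared_key" MUST be the last ClientHello extension, and `Duplicate` violates the rule against two extensions of the same type, while both leave every length field recomputable.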

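Environment Verification

Environment verification workflow overview
Environment verification takes three inputs: the handshake message template from the template construction step, the mutation operator sequence from the mutation operator generation step, and the expected server feedback. It applies the mutation operator sequence to the template, sends the resulting mutated handshake message to the server in the test environment, and compares the server's actual feedback with the expected feedback. If they match, the case is ignored; if not, the abnormal message and its mutation operator sequence are stored to guide the next round of mutation operator generation.

Note
The test environment uses Windows Server 2022 as the guest system under test (older versions support only TLS 1.2, not TLS 1.3), installed in VMware. A TLS server is set up on IIS with a self-signed certificate.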
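The comparison step can be sketched as classifying the first bytes of the server's reply into the three feedback classes and checking the result against the expectation. This assumes the reply to a ClientHello arrives as a plaintext record, which holds for a ServerHello, a HelloRetryRequest, and alerts sent before handshake keys exist. The names `Feedback`, `classify`, and `is_expected` are hypothetical; the special random value that marks a HelloRetryRequest is taken from RFC 8446, section 4.1.3.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Feedback {
    ServerHello,
    HelloRetryRequest,
    Alert(u8), // AlertDescription code
    Other,
}

/// ServerHello.random value that identifies a HelloRetryRequest (RFC 8446, 4.1.3).
pub const HRR_RANDOM: [u8; 32] = [
    0xCF, 0x21, 0xAD, 0x74, 0xE5, 0x9A, 0x61, 0x11, 0xBE, 0x1D, 0x8C, 0x02, 0x1E, 0x65, 0xB8, 0x91,
    0xC2, 0xA2, 0x11, 0x16, 0x7A, 0xBB, 0x8C, 0x5E, 0x07, 0x9E, 0x09, 0xE2, 0xC8, 0xA8, 0x33, 0x9C,
];

/// Classify the first record of the server's reply.
pub fn classify(resp: &[u8]) -> Feedback {
    match resp {
        // Alert record: type 0x15, 2-byte version, 2-byte length, level, description.
        [0x15, _, _, _, _, _level, desc, ..] => Feedback::Alert(*desc),
        // Handshake record whose first message has handshake type 0x02 (server_hello).
        [0x16, _, _, _, _, 0x02, rest @ ..] => {
            // rest = 3-byte length, 2-byte legacy_version, then the 32-byte random.
            match rest.get(5..37) {
                Some(r) if r == &HRR_RANDOM[..] => Feedback::HelloRetryRequest,
                Some(_) => Feedback::ServerHello,
                None => Feedback::Other,
            }
        }
        _ => Feedback::Other,
    }
}

/// True when the observed reply matches the expected feedback.
pub fn is_expected(expected: Feedback, resp: &[u8]) -> bool {
    classify(resp) == expected
}
```

A case where `is_expected` returns false is the kind of mismatch the text says should be stored together with its mutation operator sequence.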

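Agent Deployment

Agents are deployed as Poe server bots through the fastapi_poe library. The output is tuned by designing the system prompt and by adjusting the LLM being called and its temperature.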

from __future__ import annotations
from typing import AsyncIterable
import fastapi_poe as fp
from modal import App, Image, asgi_app

SYSTEM_PROMPT = """
prompt mentioned
""".strip()

class PromptBot(fp.PoeBot):
    async def get_response(
        self, request: fp.QueryRequest
    ) -> AsyncIterable[fp.PartialResponse]:
        request.temperature = 0.7
        request.query = [
            fp.ProtocolMessage(role="system", content=SYSTEM_PROMPT, content_type="text/plain")
        ] + request.query
        async for msg in fp.stream_request(
            request, "GPT-4o-Mini", request.access_key
        ):
            yield msg

    async def get_settings(self, setting: fp.SettingsRequest) -> fp.SettingsResponse:
        return fp.SettingsResponse(server_bot_dependencies={"GPT-4o-Mini": 1})

REQUIREMENTS = ["fastapi-poe==0.0.48"]
image = Image.debian_slim().pip_install(*REQUIREMENTS)
app = App("prompt-bot-poe")

@app.function(image=image)
@asgi_app()
def fastapi_app():
    bot = PromptBot()
    # see https://creator.poe.com/docs/quick-start#configuring-the-access-credentials
    # app = fp.make_app(bot, access_key=<YOUR_ACCESS_KEY>, bot_name=<YOUR_BOT_NAME>)
    app = fp.make_app(bot, access_key="**************************", bot_name="********************")
    return app

Note
Set bot_name and access_key to the values given for the server bot. After testing with modal serve, deploy with modal deploy; the API settings inside the server bot must then be updated promptly.

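Open Issues

Confirm the scope: the client speaks TLS 1.3, and the mutation is applied to the content of the message the client sends (ClientHello).
The agents' system prompts need improvement.
Differential testing based on RFC rules: link client message rule violations to the expected server feedback.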
