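Hugging Face Chat

Hugging Face Text Generation Inference (TGI) is a specialized deployment solution for serving Large Language Models (LLMs) in the cloud and making them accessible through an API. TGI optimizes text generation performance through features such as continuous batching, token streaming, and efficient memory management.

Text Generation Inference requires models to be compatible with its architecture-specific optimizations. Many popular LLMs are supported, but not every model on the Hugging Face Hub can be deployed with TGI. If you need to deploy another type of model, consider using standard Hugging Face Inference Endpoints instead.

For a complete and up-to-date list of supported models and architectures, see the Text Generation Inference supported models documentation.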
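To make "accessible through an API" concrete, the following is a minimal sketch of a raw HTTP request to TGI's own `/generate` route, which accepts a JSON body with an `inputs` string and an optional `parameters` object. It only builds the request and does not send it. The class name, the `https://example.invalid` URL, the `hf_xxx` token, and the `max_new_tokens` value of 20 are placeholders chosen for illustration, and this is not necessarily the route that Spring AI itself calls.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper, not part of Spring AI or TGI.
final class TgiRequestSketch {

    // Escapes only backslashes, double quotes and newlines.
    // Enough for this sketch; a real client should use a JSON library.
    static String jsonString(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\"";
    }

    // Builds the JSON body expected by TGI's /generate route.
    static String body(String prompt, int maxNewTokens) {
        return "{\"inputs\":" + jsonString(prompt)
                + ",\"parameters\":{\"max_new_tokens\":" + maxNewTokens + "}}";
    }

    // Builds (but does not send) a POST request with a Bearer token.
    static HttpRequest generateRequest(String endpointUrl, String apiToken, String prompt) {
        String base = endpointUrl.endsWith("/")
                ? endpointUrl.substring(0, endpointUrl.length() - 1)
                : endpointUrl;
        return HttpRequest.newBuilder(URI.create(base + "/generate"))
                .timeout(Duration.ofSeconds(30))
                .header("Authorization", "Bearer " + apiToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(prompt, 20)))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder URL and token; replace with your own endpoint values.
        HttpRequest request = generateRequest("https://example.invalid", "hf_xxx", "Tell me a joke");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body("Tell me a joke", 20));
    }
}
```

The resulting request can be passed to `java.net.http.HttpClient.send` once real endpoint values are supplied. In a Spring AI application this plumbing is normally handled by the chat model described in the rest of this page.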

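Prerequisites

You need to create an Inference Endpoint on Hugging Face and an API token to access that endpoint. Further details can be found in the Hugging Face documentation. The Spring AI project defines a configuration property named spring.ai.huggingface.chat.api-key, which you should set to the value of the API token obtained from Hugging Face. A second configuration property, spring.ai.huggingface.chat.url, should be set to the Inference Endpoint URL obtained when provisioning your model on Hugging Face. You can find this URL in the Inference Endpoint's UI. Exporting environment variables is one way to set these configuration properties: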

export SPRING_AI_HUGGINGFACE_CHAT_API_KEY=<INSERT KEY HERE>
export SPRING_AI_HUGGINGFACE_CHAT_URL=<INSERT INFERENCE ENDPOINT URL HERE>
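
As a quick sanity check before starting the application, the sketch below reports which of the two environment variables above are missing or blank. The class and method names are hypothetical and are not part of Spring AI; only the two variable names come from this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spring AI.
final class HuggingfaceEnvCheck {

    // The two environment variables named in the Prerequisites section.
    static final List<String> REQUIRED = List.of(
            "SPRING_AI_HUGGINGFACE_CHAT_API_KEY",
            "SPRING_AI_HUGGINGFACE_CHAT_URL");

    // Returns the names of required variables that are absent or blank in env.
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> absent = missing(System.getenv());
        if (!absent.isEmpty()) {
            System.err.println("Missing environment variables: " + absent);
            System.exit(1);
        }
        System.out.println("Hugging Face chat configuration found.");
    }
}
```

Taking the environment as a `Map` argument keeps the check easy to exercise in a unit test without touching the real process environment.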

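Add Repositories and BOM

Spring AI artifacts are published in the Spring Milestone and Snapshot repositories. Refer to the Repositories section to add these repositories to your build system.

To help with dependency management, Spring AI provides a BOM (bill of materials) that ensures a consistent version of Spring AI is used throughout the project. Refer to the Dependency Management section to add the Spring AI BOM to your build system.

Auto-configuration

Spring AI provides Spring Boot auto-configuration for the Hugging Face chat client. To enable it, add the following dependency to your project's Maven pom.xml file: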

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-huggingface-spring-boot-starter</artifactId>
</dependency>

or to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-huggingface-spring-boot-starter'
}

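Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Chat Properties

The prefix spring.ai.huggingface is the property prefix that lets you configure the chat model implementation for Hugging Face.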

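Property | Description | Default
spring.ai.huggingface.chat.api-key | API key used to authenticate with the Inference Endpoint. | -
spring.ai.huggingface.chat.url | URL of the Inference Endpoint to connect to. | -
spring.ai.huggingface.chat.enabled | Enables the Hugging Face chat model. | true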

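Sample Controller (Auto-configuration)

Create a new Spring Boot project and add spring-ai-huggingface-spring-boot-starter to your pom (or gradle) dependencies.

Add an application.properties file under the src/main/resources directory to enable and configure the Hugging Face chat model: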

spring.ai.huggingface.chat.api-key=YOUR_API_KEY
spring.ai.huggingface.chat.url=YOUR_INFERENCE_ENDPOINT_URL

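Replace the api-key and url values with your own Hugging Face values.

This creates a HuggingfaceChatModel implementation that you can inject into your classes. Here is an example of a simple @RestController class that uses the chat model for text generation: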

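import java.util.Map;

import org.springframework.ai.huggingface.HuggingfaceChatModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    // Auto-configured by the Spring Boot starter
    private final HuggingfaceChatModel chatModel;

    @Autowired
    public ChatController(HuggingfaceChatModel chatModel) {
        this.chatModel = chatModel;
    }

    // Returns the generated text as a single-entry JSON object
    @GetMapping("/ai/generate")
    public Map<String, String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", this.chatModel.call(message));
    }
}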
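
To try the controller from a client, the `message` query parameter has to be URL-encoded. The sketch below builds the request URI for the `/ai/generate` mapping shown above. The class name is hypothetical, and `http://localhost:8080` assumes the application runs locally on Spring Boot's default port.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper, not part of Spring AI.
final class GenerateEndpointUri {

    // Builds the URI for the controller's /ai/generate mapping,
    // form-encoding the message so spaces and '&' are transmitted safely.
    static URI build(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/ai/generate?message=" + encoded);
    }

    public static void main(String[] args) {
        // Assumes the application listens on localhost:8080.
        System.out.println(build("http://localhost:8080", "Tell me a joke"));
    }
}
```

The resulting URI can be passed to `java.net.http.HttpClient` or pasted into a browser. The response is a JSON object with a single `generation` field.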

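Manual Configuration

HuggingfaceChatModel implements the ChatModel interface and uses the low-level API to connect to the Hugging Face Inference Endpoint.

Add the spring-ai-huggingface dependency to your project's Maven pom.xml file: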

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-huggingface</artifactId>
</dependency>

or to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-huggingface'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Next, create a HuggingfaceChatModel and use it for text generation:

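// apiKey and url hold the API token and Inference Endpoint URL described under Prerequisites
HuggingfaceChatModel chatModel = new HuggingfaceChatModel(apiKey, url);

ChatResponse response = chatModel.call(
    new Prompt("Generate the names of 5 famous pirates."));

// getResult() returns the first generation of the response
System.out.println(response.getResult().getOutput().getContent());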