LangChain Source Code Walkthrough: Structured Output in Detail
Structured output lets an agent return data in a specific, predictable format. Instead of parsing natural-language responses, you get structured data as JSON objects, Pydantic models, or dataclasses that your application can use directly.

LangChain's create_agent handles structured output automatically. You set the output schema you want, and when the model generates structured data it is captured, validated, and returned under the "structured_response" key of the agent state. The relevant part of the signature:

    def create_agent(
        ...
        response_format: Union[
            ToolStrategy[StructuredResponseT],
            ProviderStrategy[StructuredResponseT],
            type[StructuredResponseT],
        ]
        ...
    )

Response Format

response_format controls how the agent returns structured data. When a schema type is passed directly, LangChain chooses the strategy automatically: ProviderStrategy for models that support native structured output, and ToolStrategy for all other models.

Provider strategy (ProviderStrategy[StructuredResponseT])

This strategy uses the provider's native structured output. Some model providers support structured output natively through their APIs (for example OpenAI, Grok, Gemini). Where it is available, this is the most reliable approach.

    class ProviderStrategy(Generic[SchemaT]):
        schema: type[SchemaT]

ProviderStrategy: the schema parameter

The schema can be given in any of the four forms below.

Pydantic model: a BaseModel subclass with field validation

    pip install pydantic

    from pydantic import BaseModel, Field
    from langchain.agents import create_agent

    class ContactInfo(BaseModel):
        """Contact information for a person."""
        name: str = Field(description="The name of the person")
        email: str = Field(description="The email address of the person")
        phone: str = Field(description="The phone number of the person")

    agent = create_agent(
        model="gpt-5",
        response_format=ContactInfo  # Auto-selects ProviderStrategy
    )

    result = agent.invoke({
        "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
    })

    print(result["structured_response"])
    # ContactInfo(name='John Doe', email='john@example.com', phone='(555) 123-4567')

Dataclass: a Python dataclass with type annotations

    from dataclasses import dataclass
    from langchain.agents import create_agent

    @dataclass
    class ContactInfo:
        """Contact information for a person."""
        name: str   # The name of the person
        email: str  # The email address of the person
        phone: str  # The phone number of the person

    tools = []  # placeholder: replace with your own tools

    agent = create_agent(
        model="gpt-5",
        tools=tools,
        response_format=ContactInfo  # Auto-selects ProviderStrategy
    )

    result = agent.invoke({
        "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
    })

    result["structured_response"]
    # ContactInfo(name='John Doe', email='john@example.com', phone='(555) 123-4567')

TypedDict: a typed dictionary class

    from typing_extensions import TypedDict
    from langchain.agents import create_agent

    class ContactInfo(TypedDict):
        """Contact information for a person."""
        name: str   # The name of the person
        email: str  # The email address of the person
        phone: str  # The phone number of the person

    tools = []  # placeholder: replace with your own tools

    agent = create_agent(
        model="gpt-5",
        tools=tools,
        response_format=ContactInfo  # Auto-selects ProviderStrategy
    )

    result = agent.invoke({
        "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
    })

    result["structured_response"]
    # {'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}

JSON Schema: a dictionary containing a JSON Schema specification

    from langchain.agents import create_agent
    # ProviderStrategy was used without an import in the original listing
    from langchain.agents.structured_output import ProviderStrategy

    contact_info_schema = {
        "type": "object",
        "description": "Contact information for a person.",
        "properties": {
            "name": {"type": "string", "description": "The name of the person"},
            "email": {"type": "string", "description": "The email address of the person"},
            "phone": {"type": "string", "description": "The phone number of the person"}
        },
        "required": ["name", "email", "phone"]
    }

    tools = []  # placeholder: replace with your own tools

    agent = create_agent(
        model="gpt-5",
        tools=tools,
        response_format=ProviderStrategy(contact_info_schema)
    )

    result = agent.invoke({
        "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
    })

    result["structured_response"]
    # {'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}

Tool strategy (ToolStrategy[StructuredResponseT])

This strategy uses tool calling to produce structured output. For models that do not support native structured output, LangChain uses tool calling to achieve the same result. It works with every model that supports tool calling, which covers most modern models.

    class ToolStrategy(Generic[SchemaT]):
        schema: type[SchemaT]
        tool_message_content: str | None
        handle_errors: Union[
            bool,
            str,
            type[Exception],
            tuple[type[Exception], ...],
            Callable[[Exception], str],
        ]

ToolStrategy: the schema parameter

Pydantic model: a BaseModel subclass with field validation

    from pydantic import BaseModel, Field
    from typing import Literal
    from langchain.agents import create_agent
    from langchain.agents.structured_output import ToolStrategy

    class ProductReview(BaseModel)