jedanny

Agent and Prompt Practice

What is an AI Agent System

  • An Agent system is one that can complete tasks independently: unlike traditional software, it can execute entire workflows with a high degree of autonomy under user authorization.
  • The core of an Agent system lies in using an LLM for workflow management and decision-making; it can dynamically select tools to interact with external systems as needed.

When to Build an AI Agent System

  • Agent systems suit complex, multi-step tasks that traditional automation struggles with, such as scenarios involving nuanced decisions, hard-to-maintain rule sets, or heavy reliance on unstructured data.
  • For example, in payment fraud analysis, an Agent system can evaluate context and identify suspicious activities like an experienced investigator.

Fundamentals of AI Agent Design

  • An Agent system consists of three core components: a model (the LLM), tools (external APIs and functions), and instructions (clear behavioral guidelines); see the sketch after this list.
  • Model selection should be based on a trade-off between task complexity, latency, and cost.
  • Tools should be standardized to allow flexible use across agent systems.
  • Instructions need to be clear and unambiguous to reduce ambiguity and improve decision quality.
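To make the three components concrete, here is a minimal Python sketch. The Tool and Agent classes are hypothetical illustrations for this post, not the API of any particular SDK.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Tool:
    name: str                  # clear, descriptive name
    description: str           # states the tool's purpose for the model
    run: Callable[..., str]    # the function actually executed

@dataclass
class Agent:
    model: str                 # chosen by trading capability against latency and cost
    instructions: str          # clear, unambiguous behavioral guidelines
    tools: list[Tool] = field(default_factory=list)

weather_agent = Agent(
    model="gpt-4.1",
    instructions="Help users with weather questions. Use tools; never guess.",
    tools=[Tool("get_weather", "Current weather for a city.", lambda city: "sunny")],
)
```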

Orchestration Patterns

  • Single Agent System: a single model, equipped with appropriate tools and instructions, executes the workflow in a loop (sketched below).
  • Multi-Agent System: the workflow is distributed across multiple coordinated Agents, either "manager" style (a central Agent coordinates several specialized Agents) or "decentralized" (peer Agents hand tasks off to one another).
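A single-Agent system is, at its core, a loop: call the model, execute any tool calls it makes, feed the results back, and stop once it answers directly. Below is a minimal sketch reusing the hypothetical Agent class above; call_llm and execute_tool are stand-ins, not a real SDK.

```python
from dataclasses import dataclass

@dataclass
class Reply:
    content: str
    tool_calls: list  # empty when the model answers directly

def call_llm(model, messages, tools):
    # Hypothetical stand-in for a real model API call.
    return Reply(content="All done.", tool_calls=[])

def execute_tool(tools, call):
    # Hypothetical stand-in: look up and run the requested tool.
    return "tool result"

def run_agent(agent, user_message, max_turns=10):
    messages = [{"role": "system", "content": agent.instructions},
                {"role": "user", "content": user_message}]
    for _ in range(max_turns):                # hard cap guards against endless loops
        reply = call_llm(agent.model, messages, agent.tools)
        if not reply.tool_calls:              # plain answer: the workflow is done
            return reply.content
        for call in reply.tool_calls:         # otherwise run each requested tool
            messages.append({"role": "tool",
                             "content": execute_tool(agent.tools, call)})
    return "Turn limit reached; escalate to a human."
```

A manager-style multi-Agent system then wraps specialized Agents like this one behind tools of a coordinating Agent.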

Guardrails

  • Guardrails are key to ensuring the safe operation of Agent systems, managing data privacy risks and reputational risks.
  • Guardrails can include relevance classifiers, safety classifiers, PII filters, content moderation, and tool risk assessments; one example is sketched after this list.
  • The setup of guardrails should be based on actual risks and continuously adjusted as new vulnerabilities are discovered.
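As one concrete example of the guardrail types above, here is a minimal PII filter. A real deployment would typically use a trained classifier rather than two regexes; this sketch only shows where such a check sits.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def pii_guardrail(text: str) -> str:
    # Block output that appears to contain an email address or phone number.
    if EMAIL.search(text) or PHONE.search(text):
        raise ValueError("Guardrail tripped: possible PII in output")
    return text

print(pii_guardrail("The forecast for Berlin is sunny."))  # passes the check
```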

Implementation Recommendations

  • Start with a single Agent system and gradually expand to a multi-Agent system.
  • Use flexible prompt templates to simplify maintenance and evaluation (see the sketch after this list).
  • Build human-in-the-loop intervention into the Agent system to handle high-risk actions and repeated failures.
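Here is a minimal sketch of a flexible prompt template using Python's standard string.Template; the slot names are illustrative. Keeping policy in named slots means a rule change edits data in one place instead of prose scattered across prompts.

```python
from string import Template

AGENT_PROMPT = Template(
    "You are a $role.\n"
    "Follow these rules:\n$rules\n"
    "Escalate to a human when: $escalation_policy"
)

prompt = AGENT_PROMPT.substitute(
    role="refund-processing agent",
    rules="- Verify the order exists.\n- Never promise amounts over $100.",
    escalation_policy="the user disputes a charge older than 90 days",
)
print(prompt)
```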

Agent Prompting Practices

OpenAI Agent Prompting Best Practices

Persistence

Tell the model that this is a multi-turn interaction and that it should keep working until the issue is resolved, not just answer once.

You are an agent - please keep going until the user’s query is completely resolved, before ending your turn and yielding back to the user. Only terminate your turn when you are sure that the problem is solved.


Tool-calling

Encourage the model to proactively use tools for queries when uncertain, rather than guessing.

If you are not sure about file content or codebase structure pertaining to the user’s request, use your tools to read files and gather the relevant information: do NOT guess or make up an answer.


Planning

Guide the model to plan before each tool call and reflect on the results afterward. Making the model "think out loud" like this improves its problem-solving ability, even without a dedicated reasoning model.

You MUST plan extensively before each function call, and reflect extensively on the outcomes of the previous function calls. DO NOT do this entire process by making function calls only, as this can impair your ability to solve the problem and think insightfully.


Correct Approach to Tool Calling

  • Strongly prefer the API's tools field: stop manually inserting tool descriptions into prompts. OpenAI's tests show that passing tool definitions through the API reduces errors and improves performance (a 2% gain on SWE-bench); see the sketch after this list.
  • Name tools well and state their purpose clearly: tool names, descriptions, and parameter descriptions should all be unambiguous so the model can use them correctly.
  • Complex tool usage should be placed in examples: if the tool is complex, it is best to have a dedicated "# Example" section in the system prompt to keep descriptions concise.
  • Start with overall requirements: list basic requirements under the title "Instructions" or "Response Rules."
  • Detail specifics point by point: use subheadings to elaborate on specific behaviors.
  • Clarify the order of steps: if specific processes are required, clearly mark them with an ordered list.
  • Debugging and optimization:
    • Check for contradictions between instructions.
    • Provide clear examples demonstrating the desired outcome.
    • Use caution with all caps, exclamation marks, and other emphasis methods, as they may cause the model to over-focus on these points.
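For illustration, here is what passing a tool definition through the tools field looks like with the OpenAI Python SDK's Chat Completions API; the get_weather tool itself is a made-up example.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # good name: says exactly what it does
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Paris'"},
            },
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,  # tool definitions go here, not pasted into the prompt
)
print(response.choices[0].message.tool_calls)
```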

Common Pitfalls and Solutions

  • The "must always X" trap: for example, forcing "you must call a tool before every response" can trigger random tool calls. Solution: add a clarification such as "if you lack enough information, ask the user first."
  • Example-copying issue: the model may replicate the examples you provide verbatim. Solution: state clearly that the examples are for reference only and should be adapted flexibly to the situation.
  • Verbosity issue: the model sometimes outputs too much explanation or unnecessary formatting. Solution: explicitly request concise output and the specific format you want in the instructions.

Context Handling

  • Capability boundaries: basic long-context performance is strong, but it can degrade when the model must retrieve many items from a huge context or reason over information spread across the whole context.
  • Best practice: repeat the instructions at both the beginning and the end of the context (sketched after this list).
  • Tell the model explicitly whether it may only use the information you provide or may also combine it with its own knowledge.
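A small sketch of the repeat-at-both-ends practice; the instruction text is illustrative:

```python
INSTRUCTIONS = "Answer using ONLY the documents below, and cite document IDs."

def build_long_context_prompt(documents: list[str], question: str) -> str:
    doc_block = "\n\n".join(documents)
    # The same instructions appear before AND after the long document dump,
    # so they stay salient no matter where the model's attention lands.
    return f"{INSTRUCTIONS}\n\n{doc_block}\n\n{INSTRUCTIONS}\n\nQuestion: {question}"
```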

Chain of Thought

Guide the model to "think" like a human, breaking a complex problem into smaller steps, with a simple chain-of-thought instruction (no reasoning model required):

...First, think carefully step by step about what documents are needed to answer the query. Then, print out the TITLE and ID of each document. Then, format the IDs into a list.


Advanced CoT: if you find the model's thought process drifting, you can standardize its thinking strategy with more specific instructions. For example, the guide shows a prompt that requires the model to perform query analysis first, then context analysis, and finally synthesis. For overall prompt organization, the guide recommends a structure like:

# Role and Objective
# Instructions
## Sub-categories for more detailed instructions
# Reasoning Steps (e.g., Chain of Thought instructions)
# Output Format
# Examples
## Example 1
# Context (if any)
# Final instructions and prompt to think step by step (e.g., the CoT starter)


Separator Selection

  • Markdown first: headings, lists, and code blocks are clear and intuitive.
  • XML is also good: it wraps content precisely and nests easily.
  • JSON is comparatively cumbersome: strongly structured, but it may need escaping inside prompts.
  • Long-document scenarios: XML (<doc id=1 title="...">...</doc>) and tabular formats (ID: 1 | TITLE: ... | CONTENT: ...) work well, while JSON performs poorly; both are sketched below.
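A quick sketch of the two long-document formats that work well; the field names and documents are made up:

```python
docs = [
    {"id": 1, "title": "Refund policy", "content": "Refunds are issued within..."},
    {"id": 2, "title": "Shipping FAQ", "content": "Orders ship within..."},
]

def as_xml(d):
    return f'<doc id={d["id"]} title="{d["title"]}">{d["content"]}</doc>'

def as_table_row(d):
    return f'ID: {d["id"]} | TITLE: {d["title"]} | CONTENT: {d["content"]}'

context = "\n".join(as_xml(d) for d in docs)          # XML variant
# context = "\n".join(as_table_row(d) for d in docs)  # tabular variant
print(context)
```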