ALBERTO CASTELO

Notes on Agents

Papers

  • Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    • Make the LLM think step by step.
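The technique is just a prompt change: append a trigger phrase that elicits intermediate reasoning before the final answer. A minimal sketch, where `call_llm` is a stub standing in for any real chat-completion API (my assumption, not part of the paper):

```python
def build_cot_prompt(question: str) -> str:
    # "Let's think step by step." is the zero-shot CoT trigger phrase.
    return f"Q: {question}\nA: Let's think step by step."

def call_llm(prompt: str) -> str:
    # Stub: a real implementation would send the prompt to an LLM and
    # get back reasoning steps followed by a final answer.
    return "6 * 7 means six groups of seven. Therefore, the answer is 42."

answer = call_llm(build_cot_prompt("What is 6 * 7?"))
print(answer)
```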
  • ReAct: Synergizing Reasoning and Acting in Language Models
    • Make the LLM search for the answer leveraging a tool (search engine).
    • Go through the result and evaluate whether the answer is correct.
    • Repeat the process (e.g., updating the query) until the model is satisfied with the answer.
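The loop above can be sketched as alternating Thought/Action/Observation steps until the model emits a final answer. Here `fake_llm` and `fake_search` are hypothetical stubs; a real agent would call an LLM and a search tool:

```python
def fake_search(query: str) -> str:
    # Stub tool: a real agent would hit a search engine here.
    kb = {"capital of France": "Paris is the capital of France."}
    return kb.get(query, "No results found.")

def fake_llm(history: list[str]) -> str:
    # Stub policy: search first, then answer from the observation.
    if not any(line.startswith("Observation:") for line in history):
        return "Action: search[capital of France]"
    return "Finish: Paris"

def react(question: str, max_steps: int = 5) -> str:
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = fake_llm(history)
        if step.startswith("Finish:"):
            # The model is satisfied with its answer.
            return step.removeprefix("Finish:").strip()
        # Parse the search action, run the tool, record the observation.
        query = step[len("Action: search["):-1]
        history.append(step)
        history.append(f"Observation: {fake_search(query)}")
    return "No answer found"

print(react("What is the capital of France?"))  # → Paris
```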
  • Reflexion: Language Agents with Verbal Reinforcement Learning
    • 3 Models:
      • Actor: given context, selects next action
      • Evaluator: given context, action and world observation, grades the actor's action selection
      • Self-reflection: takes all the previous information and provides nuanced feedback that will be used to update the Actor's action selection.
    • Requires feedback from the environment, the Evaluator, or both to improve its final output.
    • 2 memory types:
      • Short-term memory for the current task.
      • Long-term memory that stores reflection feedback for future runs.
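The three components and two memories can be sketched as follows. All three models are stubs (real Reflexion makes LLM calls for each); the task and feedback strings are illustrative assumptions:

```python
def actor(task: str, short_term: list[str], long_term: list[str]) -> str:
    # A real Actor would condition an LLM on both memories.
    if any("avoid guessing" in fb for fb in long_term):
        return "2 + 2 = 4"
    return "2 + 2 = 5"  # first attempt: a wrong guess

def evaluator(task: str, action: str) -> bool:
    # Grades the Actor's action (here: a trivial correctness check).
    return action.endswith("= 4")

def self_reflect(task: str, action: str) -> str:
    # Turns a failure into verbal feedback for the next trial.
    return f"'{action}' was wrong; avoid guessing, compute the sum."

def run(task: str, max_trials: int = 3) -> str:
    long_term: list[str] = []           # persists across trials
    action = ""
    for _ in range(max_trials):
        short_term: list[str] = []      # scratchpad for this trial only
        action = actor(task, short_term, long_term)
        if evaluator(task, action):
            return action
        long_term.append(self_reflect(task, action))
    return action

print(run("What is 2 + 2?"))  # → 2 + 2 = 4
```

The Actor fails on the first trial, the Self-reflection feedback lands in long-term memory, and the second trial succeeds.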
  • Mixture-of-Agents Enhances Large Language Model Capabilities
    • Proposer and Aggregator LLMs.
    • Iteratively and in parallel, use multiple LLMs to complete a task. At each iteration:
      • Each Proposer LLM will try to complete the task and will generate CoT traces.
      • Each Proposer LLM then sees the original prompt and all the CoT traces generated by the other LLMs and solves the task again, generating new traces.
    • A final Aggregator LLM collects all the traces and outputs the final answer.
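The layered scheme above can be sketched like this: each layer, every proposer re-answers given the prompt plus the previous layer's drafts, and an aggregator merges the last layer. Proposers and aggregator are stubs for real LLM calls; the names and draft format are my own:

```python
def make_proposer(name: str):
    def propose(prompt: str, prior_drafts: list[str]) -> str:
        # A real proposer would call an LLM on the prompt + prior drafts.
        return f"{name} draft (saw {len(prior_drafts)} prior drafts)"
    return propose

def aggregate(prompt: str, drafts: list[str]) -> str:
    # A real Aggregator LLM would synthesize one final answer.
    return " | ".join(drafts)

def mixture_of_agents(prompt: str, proposers, n_layers: int = 2) -> str:
    drafts: list[str] = []
    for _ in range(n_layers):
        # Every proposer re-answers, conditioned on all prior drafts.
        drafts = [p(prompt, drafts) for p in proposers]
    return aggregate(prompt, drafts)

proposers = [make_proposer("A"), make_proposer("B")]
print(mixture_of_agents("Summarize X", proposers))
```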