Notes on Agents
Alberto Castelo (@acaste10)
Papers
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- Prompt the LLM to reason step by step before answering; spelling out intermediate steps improves accuracy on multi-step problems (see the sketch below).
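A minimal sketch of the idea, assuming a hypothetical `llm` callable that maps a prompt string to a completion (wire it to any model client):

```python
from typing import Callable

def chain_of_thought(llm: Callable[[str], str], question: str) -> str:
    # Appending an explicit "think step by step" cue elicits intermediate
    # reasoning before the final answer.
    prompt = f"Q: {question}\nA: Let's think step by step."
    return llm(prompt)
```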
- ReAct: Synergizing Reasoning and Acting in Language Models
- Make the LLM search for the answer using a tool (e.g., a search engine).
- Have it go through the result and evaluate whether the answer is correct.
- Repeat the process (e.g., updating the query) until the model is satisfied with the answer (see the sketch below).
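A rough sketch of the ReAct loop, using the paper's Thought / Action / Observation transcript format; `llm` and `search` are hypothetical callables, and the parsing is deliberately simplistic:

```python
from typing import Callable

def react(llm: Callable[[str], str], search: Callable[[str], str],
          task: str, max_steps: int = 5) -> str:
    # Interleave reasoning and acting: the model extends the transcript
    # with a Thought and an Action; Search[...] results are fed back as
    # Observations, and Finish[...] ends the loop with the answer.
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + "\n"
        if "Finish[" in step:
            # The model is satisfied with the answer it found.
            return step.split("Finish[", 1)[1].split("]", 1)[0]
        if "Search[" in step:
            # Run the query the model chose and append the result, so the
            # next step can evaluate it and, if needed, update the query.
            query = step.split("Search[", 1)[1].split("]", 1)[0]
            transcript += f"Observation: {search(query)}\n"
    return transcript  # step budget exhausted without a final answer
```

The `max_steps` cap is just a safety budget; in practice you would also bound tool latency and transcript length.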
- Reflexion: Language Agents with Verbal Reinforcement Learning
- 3 Models:
- Actor: given context, selects next action
- Evaluator: given context, action and world observation, grades the actor's action selection
- Self-reflection: takes all of the preceding information and produces nuanced verbal feedback that is used to steer the Actor's next action selection.
- Requires feedback from the environment, the Evaluator, or both to improve its final output.
- 2 memory types:
- Short-term memory holds the context for the current task.
- Long-term memory stores quality feedback for future runs (see the sketch below).
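A compact sketch of the trial loop, with hypothetical `actor`, `evaluator`, and `self_reflect` callables; the `pass_score` threshold is an illustrative choice, not from the paper:

```python
from typing import Callable

def reflexion(actor: Callable[[str], str], evaluator: Callable[[str], float],
              self_reflect: Callable[[str], str], task: str,
              max_trials: int = 3, pass_score: float = 0.9) -> str:
    lessons: list[str] = []  # long-term memory: reflections kept across trials
    answer = ""
    for _ in range(max_trials):
        # Short-term memory: the current task plus accumulated reflections.
        context = task + "".join(f"\nLesson: {l}" for l in lessons)
        answer = actor(context)                     # Actor selects an attempt
        score = evaluator(context + "\n" + answer)  # Evaluator grades it
        if score >= pass_score:
            break
        # Self-reflection turns the failure into nuanced verbal feedback
        # that steers the Actor on the next trial.
        lessons.append(self_reflect(context + "\n" + answer))
    return answer
```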
- Mixture-of-Agents Enhances Large Language Model Capabilities
- Proposer and Aggregator LLMs.
- Iteratively and in parallel, use multiple LLMs to complete a task. At each iteration:
- Each Proposer LLM will try to complete the task and will generate CoT traces.
- Each Proposer LLM then sees the original prompt plus all the traces generated by the other LLMs and solves the task again, generating new traces.
- A final Aggregator LLM collects all the traces and outputs the final answer (see the sketch below).
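A sketch of the layered scheme, with hypothetical `proposers` and `aggregator` callables; for simplicity each proposer here sees every trace from the previous layer, including its own:

```python
from typing import Callable, Sequence

def mixture_of_agents(proposers: Sequence[Callable[[str], str]],
                      aggregator: Callable[[str], str],
                      task: str, n_layers: int = 2) -> str:
    traces: list[str] = []
    for _ in range(n_layers):
        # Each proposer sees the original task plus all traces from the
        # previous layer, then produces a fresh attempt.
        context = task + "".join(f"\nPrevious response: {t}" for t in traces)
        traces = [p(context) for p in proposers]
    # The Aggregator reads all final-layer traces and writes one answer.
    final = task + "".join(f"\nCandidate: {t}" for t in traces)
    return aggregator(final)
```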