ALBERTO CASTELO

Notes on Agents

Papers

  • Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    • Make the LLM think step by step.
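The technique is just a prompt change: append a trigger phrase that elicits intermediate reasoning before the final answer. A minimal sketch, where `call_llm` is a stub standing in for any real chat-completion API (my assumption, not part of the paper):

```python
def build_cot_prompt(question: str) -> str:
    # "Let's think step by step." is the zero-shot CoT trigger phrase.
    return f"Q: {question}\nA: Let's think step by step."

def call_llm(prompt: str) -> str:
    # Stub: a real implementation would send the prompt to an LLM and
    # get back reasoning steps followed by a final answer.
    return "6 * 7 means six groups of seven. Therefore, the answer is 42."

answer = call_llm(build_cot_prompt("What is 6 * 7?"))
print(answer)
```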
  • ReAct: Synergizing Reasoning and Acting in Language Models
    • Make the LLM search for the answer leveraging a tool (search engine).
    • Go through the result and evaluate whether the answer is correct.
    • Repeat the process (e.g., updating the query) until the model is satisfied with the answer.
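The loop above can be sketched as alternating Thought/Action/Observation steps until the model emits a final answer. Here `fake_llm` and `fake_search` are hypothetical stubs; a real agent would call an LLM and a search tool:

```python
def fake_search(query: str) -> str:
    # Stub tool: a real agent would hit a search engine here.
    kb = {"capital of France": "Paris is the capital of France."}
    return kb.get(query, "No results found.")

def fake_llm(history: list[str]) -> str:
    # Stub policy: search first, then answer from the observation.
    if not any(line.startswith("Observation:") for line in history):
        return "Action: search[capital of France]"
    return "Finish: Paris"

def react(question: str, max_steps: int = 5) -> str:
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = fake_llm(history)
        if step.startswith("Finish:"):
            # The model is satisfied with its answer.
            return step.removeprefix("Finish:").strip()
        # Parse the search action, run the tool, record the observation.
        query = step[len("Action: search["):-1]
        history.append(step)
        history.append(f"Observation: {fake_search(query)}")
    return "No answer found"

print(react("What is the capital of France?"))  # → Paris
```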
  • Reflexion: Language Agents with Verbal Reinforcement Learning
    • 3 Models:
      • Actor: given context, selects next action
      • Evaluator: given context, action and world observation, grades the actor's action selection
      • Self-reflection: takes all the previous information and provides nuanced feedback that will be used to update the Actor's action selection.
    • Requires feedback from the environment, the Evaluator, or both to improve its final output.
    • 2 memory types:
      • Short-term memory for the current task.
      • Long-term memory that stores reflection feedback for future runs.
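The three components and two memories can be sketched as follows. All three models are stubs (real Reflexion makes LLM calls for each); the task and feedback strings are illustrative assumptions:

```python
def actor(task: str, short_term: list[str], long_term: list[str]) -> str:
    # A real Actor would condition an LLM on both memories.
    if any("avoid guessing" in fb for fb in long_term):
        return "2 + 2 = 4"
    return "2 + 2 = 5"  # first attempt: a wrong guess

def evaluator(task: str, action: str) -> bool:
    # Grades the Actor's action (here: a trivial correctness check).
    return action.endswith("= 4")

def self_reflect(task: str, action: str) -> str:
    # Turns a failure into verbal feedback for the next trial.
    return f"'{action}' was wrong; avoid guessing, compute the sum."

def run(task: str, max_trials: int = 3) -> str:
    long_term: list[str] = []           # persists across trials
    action = ""
    for _ in range(max_trials):
        short_term: list[str] = []      # scratchpad for this trial only
        action = actor(task, short_term, long_term)
        if evaluator(task, action):
            return action
        long_term.append(self_reflect(task, action))
    return action

print(run("What is 2 + 2?"))  # → 2 + 2 = 4
```

The Actor fails on the first trial, the Self-reflection feedback lands in long-term memory, and the second trial succeeds.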
  • Mixture-of-Agents Enhances Large Language Model Capabilities
    • Proposer and Aggregator LLMs.
    • Iteratively and in parallel, use multiple LLMs to complete a task. At each iteration:
      • Each Proposer LLM will try to complete the task and will generate CoT traces.
      • Each Proposer LLM then sees the original prompt and all the CoT traces generated by the other LLMs and solves the task again, generating new traces.
    • A final Aggregator LLM collects all the traces and outputs the final answer.
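The layered scheme above can be sketched like this: each layer, every proposer re-answers given the prompt plus the previous layer's drafts, and an aggregator merges the last layer. Proposers and aggregator are stubs for real LLM calls; the names and draft format are my own:

```python
def make_proposer(name: str):
    def propose(prompt: str, prior_drafts: list[str]) -> str:
        # A real proposer would call an LLM on the prompt + prior drafts.
        return f"{name} draft (saw {len(prior_drafts)} prior drafts)"
    return propose

def aggregate(prompt: str, drafts: list[str]) -> str:
    # A real Aggregator LLM would synthesize one final answer.
    return " | ".join(drafts)

def mixture_of_agents(prompt: str, proposers, n_layers: int = 2) -> str:
    drafts: list[str] = []
    for _ in range(n_layers):
        # Every proposer re-answers, conditioned on all prior drafts.
        drafts = [p(prompt, drafts) for p in proposers]
    return aggregate(prompt, drafts)

proposers = [make_proposer("A"), make_proposer("B")]
print(mixture_of_agents("Summarize X", proposers))
```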