§. Bibliography
Last verified April 2026 - 47 sources
Further Reading: A Working Bibliography
The papers, framework docs, books, and reports cited across this site. Annotated, grouped, and maintained as a single reference asset.
Foundational papers
- Yao, Zhao, Yu, Du, Shafran, Narasimhan, Cao. The foundational ReAct paper. Introduces the Thought-Action-Observation loop.
- Shinn, Cassano, Gopinath, Narasimhan, Yao. Verbal reinforcement: an explicit memory of past failures across episodes.
- Madaan et al. Single-model iterative improvement loop.
- Schick, Dwivedi-Yu, Dessi et al. Early systematic treatment of LLM tool use.
- Wang, Ma, Feng, Zhang et al. The most-cited LLM-agent survey.
- Xi, Chen, Guo et al. Companion survey covering capabilities, applications, and risks.
- Wu, Bansal, Zhang et al. The AutoGen architecture paper. Conversational multi-agent foundations.
- Yao, Yu, Zhao, Shafran, Griffiths, Cao, Narasimhan. Generalisation of chain-of-thought to a tree search over reasoning steps.
- Lewis, Perez, Piktus et al. The original RAG paper.
- Liu, Lin, Hewitt et al. Documents the attention drop in the middle of long contexts.
Framework documentation (references, not rankings)
- LangChain. Graph-based state-machine orchestration.
- Microsoft Research. Conversational multi-agent framework.
- CrewAI. Role-based crew orchestration.
- OpenAI. Handoff-based coordination, successor to Swarm.
- LlamaIndex. Retrieval-first framework with agent support.
- Anthropic. Anthropic’s production agent SDK and patterns.
- Microsoft. Planner-and-skills model with .NET, Java, and Python support.
- Pydantic. Type-driven agent framework with Pydantic validation.
- deepset. Retrieval-centric pipeline model with agent support.
Industry reports and indices
- Stanford HAI. Technical performance, cost, and adoption data sourced from public datasets.
- MIT Technology Review. Editorial coverage with strong sourcing discipline.
- Anthropic Research. Published research from Anthropic; agent and safety papers.
- Google DeepMind. Published research from Google DeepMind.
- OpenAI Research. Published research and technical reports from OpenAI.
- Anthropic. The clearest public treatment of the workflow-vs-agent distinction.
Books
- S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach (4th ed.). Pearson, 2020. The canonical AI textbook. Chapter 2 on intelligent agents remains the reference for the classical taxonomy.
- F. Chollet. Deep Learning with Python (2nd ed.). Manning, 2021. The deep-learning foundation underneath language models.
- S. Bubeck et al. Published as a preprint; widely read as a chapter-length essay on early-2023 GPT-4 capabilities.
Security and safety
- OWASP. The reference list of LLM-application security risks.
- NIST. Government-grade framework for AI risk.
- Perez and Ribeiro. Foundational direct prompt-injection paper.
- Greshake, Abdelnabi, Mishra et al. The indirect prompt-injection threat model.
- Anthropic. Published policy for scaling frontier model deployments.
- Bai, Kadavath, Kundu et al. The Constitutional AI methodology paper.
Benchmarks (cross-reference)
- Liu, Yao, Zhang et al. Broad agent capability evaluation across eight environments.
- Jimenez, Yang, Wettig et al. The standard coding-agent benchmark.
- Zhou, Xu, Zhu et al. Web-navigation benchmark.
- Sierra Research. Multi-turn customer-service tool-use benchmark.
- Berkeley. Standard tool-use reliability leaderboard.
- AgentCogito sister site. The dedicated reference for agent and LLM benchmarks.
A note on citations
Where a claim on this site cites a number, a date, or an empirical result, the citation links to the source above. Where a claim is editorial synthesis (a generalisation across multiple sources, or an opinion about practice), it is marked as such inline rather than dressed up as a statistic. If a citation appears broken or a claim seems wrong, please send corrections via the contact route.