Large Language Models (LLMs) have become a central focus in AI research, demonstrating strong capabilities in processing and generating complex language. However, their practical use often requires significant human input to convert outputs into actionable decisions. To address this, LLM agents have been developed to increase autonomy, reducing the need for human intervention while broadening their utility across diverse tasks. This blog explains current practices in building LLM agents and the challenges of adopting them.
LLM agents can integrate modules to enhance their autonomy and perform tasks beyond the capability of standard LLMs. For example, in a customer service context, a simple LLM might respond to a query such as, “My laptop screen is flickering, and it’s still under warranty. What should I do?” with generic troubleshooting advice, such as restarting the device. If the issue persists, the LLM might suggest further steps. However, complex tasks such as verifying warranty status, processing refunds, or arranging repairs still require human intervention. LLM agents address this by incorporating additional modules, namely multimodality, tool use, memory, reflection, and community interaction, to handle such scenarios autonomously.
Moreover, LLM agents can be applied in various situations, such as employee empowerment, code creation, data analysis, cybersecurity, and creative ideation and production. Check out 185 proposed applications of LLM agents here.
Some academics argue that the agent paradigm is a plausible pathway to achieving Artificial General Intelligence (AGI). Proponents of this view suggest that these systems, which leverage multi-modal understanding and reality-agnostic training through generative AI and independent data sources, embody key characteristics of AGI. Indeed, a recent Stanford survey illustrates that when foundation models for agent tasks are trained on cross-reality data, they exhibit adaptability to both physical and virtual contexts. This adaptability, the authors argue, underscores the viability of the agent paradigm as a step toward AGI.
This section provides a deeper dive into the current technical practices of agentic design briefly covered above, namely Multimodality, Tool Use, Memory, Reflection, and Community Interaction.
Multimodal augmentation enhances LLM autonomy by enabling the processing of text, images, audio, and video. A typical Multimodal Large Language Model (MLLM) includes two key components: a pre-trained modality encoder, which converts non-text data into processable tokens or features, and a modality connector, which integrates these inputs with the LLM. The model is then fine-tuned using specialized datasets to ensure effective multimodal integration.
The connector plays a critical role in this process and can be implemented in different ways. Token-level fusion converts encoded features into tokens, which are merged with text tokens before processing. For instance, Q-Former in BLIP-2 uses learnable queries to compress visual data into an LLM-compatible format. MLP-based methods, such as those in LLaVA, align visual tokens with text embeddings. Feature-level fusion enables deeper integration by combining vision and language features, as seen in Flamingo, which uses cross-attention layers for continuous interaction between modalities. Find out more here.
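To make the connector concrete, below is a minimal PyTorch sketch of token-level fusion with an MLP-based projector in the spirit of LLaVA. The dimensions, module names, and random placeholder tensors are illustrative assumptions, not the actual LLaVA implementation.

```python
# Minimal sketch of an MLP-based, token-level fusion connector (LLaVA-style).
# Dimensions and module names are illustrative assumptions.
import torch
import torch.nn as nn

class VisionToTextConnector(nn.Module):
    """Projects frozen vision-encoder features into the LLM's token embedding space."""
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, vision_feats: torch.Tensor) -> torch.Tensor:
        # vision_feats: (batch, num_patches, vision_dim) from a pre-trained encoder
        return self.proj(vision_feats)  # (batch, num_patches, llm_dim)

# Token-level fusion: prepend projected "visual tokens" to the text embeddings
connector = VisionToTextConnector()
vision_feats = torch.randn(1, 256, 1024)   # placeholder encoder output
text_embeds = torch.randn(1, 32, 4096)     # placeholder text token embeddings
visual_tokens = connector(vision_feats)
fused = torch.cat([visual_tokens, text_embeds], dim=1)  # fed to the LLM as one sequence
```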
Tool-use enhances LLMs by enabling interactions with external tools like APIs, databases, and interpreters, addressing their limitations in accessing real-time data and performing specialized tasks. This capability expands problem-solving, expertise, and environment interaction.
The tool-use process includes four stages:
Frameworks such as ReAct and Toolformer are commonly used. Find out more here.
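As a rough illustration of how such frameworks structure tool-use, the sketch below implements a single ReAct-style step: the model emits an `Action: tool[argument]` string, the agent parses it, executes the tool, and appends the observation. The `call_llm` stub, the `check_warranty` tool, and the prompt format are hypothetical placeholders rather than the original ReAct code.

```python
# A minimal ReAct-style tool-use step (illustrative sketch).
import re

def call_llm(prompt: str) -> str:
    # Placeholder: a real system would send the prompt to an LLM API.
    return "Action: check_warranty[SN-12345]"

TOOLS = {
    "check_warranty": lambda serial: f"Warranty for {serial}: active until 2026-03-01",
}

def react_step(history: str) -> str:
    reply = call_llm(history)                       # model emits Thought / Action text
    match = re.search(r"Action: (\w+)\[(.*?)\]", reply)
    if not match:
        return reply                                # no tool call: treat as final answer
    tool, arg = match.groups()
    observation = TOOLS[tool](arg)                  # execute the selected tool
    return history + reply + f"\nObservation: {observation}\n"

print(react_step("Question: Is laptop SN-12345 still under warranty?\n"))
```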
Tool-use expands LLMs’ ability to interact with their environment by providing a framework for understanding and manipulating physical objects. This is achieved by integrating affordances—the possible actions an object allows based on its properties (e.g., grasping a handle or pushing an edge). By recognizing affordances, LLM agents can conceptualize the physical world as a set of actionable tools. For instance, understanding the affordances of a block enables the agent to identify the optimal side to push for a desired outcome. This affordance-driven approach bridges abstract reasoning with practical interaction in real-world contexts.
Memory is essential for LLM agents, enabling them to recall experiences, adapt to feedback, and maintain context for real-world interactions. It supports complex tasks, personalization, and autonomous evolution.
The memory mechanism consists of three steps:
A notable framework is MemoryBank, which is explained in greater detail here.
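As a toy illustration of the write-and-retrieve pattern that such memory modules rely on, here is a minimal sketch using keyword-overlap retrieval. A production system such as MemoryBank would instead use embeddings, forgetting curves, and summarization; all names below are illustrative.

```python
# A minimal agent memory store with write and similarity-based retrieval (toy sketch).
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    records: list[str] = field(default_factory=list)

    def write(self, event: str) -> None:
        self.records.append(event)                     # memory writing step

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Score each record by word overlap with the query (stand-in for embeddings).
        q = set(query.lower().split())
        scored = [(len(q & set(r.lower().split())), r) for r in self.records]
        return [r for score, r in sorted(scored, reverse=True)[:k] if score > 0]

memory = MemoryStore()
memory.write("User reported a flickering laptop screen on 2024-05-01.")
memory.write("User prefers email follow-ups over phone calls.")
print(memory.retrieve("What did the user say about the laptop screen?"))
```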
LLM reflection enhances decision-making during inference without retraining, avoiding the need for extensive datasets and fine-tuning. It provides flexible feedback (scalar values or free-form) and improves tasks like programming, decision-making, and reasoning. Studies on Chain of Thought and test-time computation demonstrate that intermediate reasoning and adaptive computation enhance performance.
The Reflexion framework includes three models: the Actor, which performs actions (e.g., tool use, response generation); the Evaluator, which scores the outcomes of actions; and the Self-Reflection model, which provides feedback stored in long-term memory for future improvement. This iterative process allows the agent to refine its approach with each cycle.
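The following sketch shows the shape of that iterative cycle. The `actor`, `evaluator`, and `reflect` functions are hypothetical stand-ins for LLM calls and scoring logic; the actual Reflexion prompts and evaluators are not reproduced here.

```python
# A minimal Reflexion-style Actor / Evaluator / Self-Reflection loop (illustrative sketch).

def actor(task: str, reflections: list[str]) -> str:
    # In practice: an LLM generates an action or response, conditioned on past reflections.
    return f"attempt at '{task}' using {len(reflections)} prior reflections"

def evaluator(attempt: str) -> float:
    # In practice: a scoring model, unit tests, or a heuristic reward.
    return 0.4 if "0 prior" in attempt else 0.9

def reflect(attempt: str, score: float) -> str:
    # In practice: an LLM produces free-form feedback on what to change next time.
    return f"Score {score:.1f}: be more specific about the failure mode."

def reflexion_loop(task: str, max_trials: int = 3, threshold: float = 0.8) -> str:
    reflections: list[str] = []                 # long-term memory of verbal feedback
    for _ in range(max_trials):
        attempt = actor(task, reflections)
        score = evaluator(attempt)
        if score >= threshold:
            return attempt
        reflections.append(reflect(attempt, score))
    return attempt

print(reflexion_loop("fix the flickering-screen ticket"))
```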
Large Language Model-based Multi-Agent (LLM-MA) systems employ multiple specialized LLMs to collaboratively solve complex problems, enabling advanced applications in software development, multi-robot systems, policymaking, and game simulation. These systems, with specialized profiles and environments, outperform single-agent models in handling intricate problems and simulating social dynamics.
Key components include:
Notable frameworks include Autogen, Swarm, and MetaGPT, which are outlined here.
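To show the basic communication pattern these frameworks build on, below is a minimal round-robin sketch in which two profiled agents take turns adding to a shared transcript. `call_llm`, the agent profiles, and the message format are illustrative assumptions; framework-specific features such as Autogen's conversable agents or MetaGPT's standard operating procedures are not reproduced.

```python
# A minimal round-robin exchange between two specialized agents (illustrative sketch).

def call_llm(system_prompt: str, transcript: list[str]) -> str:
    # Placeholder: a real system would send the profile and transcript to an LLM.
    return f"[{system_prompt.split(':')[0]}] reply #{len(transcript)}"

AGENT_PROFILES = {
    "Planner": "Planner: break the user request into concrete steps.",
    "Coder": "Coder: implement the steps proposed by the Planner.",
}

def run_round_robin(task: str, rounds: int = 2) -> list[str]:
    transcript = [f"Task: {task}"]              # shared environment / message pool
    for _ in range(rounds):
        for name, profile in AGENT_PROFILES.items():
            message = call_llm(profile, transcript)
            transcript.append(message)          # each agent sees all prior messages
    return transcript

for line in run_round_robin("build a warranty-lookup tool"):
    print(line)
```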
LLM agents offer advanced capabilities across domains but also present vulnerabilities that affect reliability, safety, and ethics. These risks stem from design insufficiencies, including issues with privacy, bias, sustainability, efficacy, and transparency, as well as operational challenges, such as adversarial attacks, misalignment, and malicious use. Addressing these challenges is essential for ensuring their safe and effective development.
Design insufficiencies in LLM agents stem mainly from technical shortcomings in how these systems are built. Unlike risks that depend on the social context in which they are deployed, these challenges reflect flaws in the system’s design and structure. Problems such as privacy issues, bias, high energy use, poor performance, and lack of transparency point to weaknesses in how these agents are developed. They are less about where or how the systems are applied and more about the need to improve their basic design so that they are safer, more reliable, and more ethical in any context.
Privacy-related issues in LLM agents arise from handling sensitive data. Multimodal inputs like images, audio, and video often contain identifiable information, requiring robust anonymization techniques that are challenging to apply consistently across modalities. Tool-use increases risks by sharing user data with external tools and third-party services, which may not adhere to consistent privacy standards.
Under the GDPR, data controllers must ensure lawful, transparent, and secure processing, including accountability for third-party compliance. Memory management heightens privacy risks, as extensive data storage demands encryption, access controls, and mechanisms for data deletion to meet GDPR requirements for data minimization and erasure rights. Failure to comply with these provisions can lead to breaches. In LLM-MA systems, inter-agent communication amplifies risks, as weak protocols can expose sensitive data and hinder compliance with privacy regulations. Additionally, the GDPR’s provisions on automated decision-making give individuals a right to human intervention, requiring LLM agents to enable human oversight and allow users to challenge decisions made solely through automated processing. Without such mechanisms, LLM agents risk non-compliance and a loss of trust. Addressing these challenges requires robust governance, data flow auditing, and oversight protocols to ensure privacy and regulatory adherence.
Bias in LLM agents can amplify harmful patterns, particularly through multimodal augmentation, which may reinforce skewed outputs from text, images, and cultural contexts. Tool-use can inherit biases from specific tools, memory may repeatedly draw on biased data, and reflection processes risk entrenching biases through skewed feedback loops. In LLM-MA systems, domain-specific biases and inter-agent interactions can further reduce fairness and transparency.
Regulations like the EU AI Act and NYC Local Law 144 require high-risk AI systems to prevent discriminatory outcomes and promote accountability, but compliance remains difficult. Diverse datasets are hard to secure, biases vary across modalities, and fairness metrics lack universal standards, leading to inconsistent evaluations. Tool operations and large-scale audits add further complexity, and existing guidelines often fail to address the unique challenges of agentic AI systems, making sustained bias mitigation a persistent challenge.
LLM agents face sustainability challenges due to high computational demands, leading to increased energy use and environmental impact. Multimodal models, particularly those processing images and audio, are resource-intensive, as are tool-use, memory storage, and reflection-based learning, which add to energy consumption. Systems with multiple specialized agents exacerbate this through extended computation cycles. Efficiency measures, such as model quantization, task-specific models, adaptive activation, and agent pruning, can help reduce resource usage.
Regulatory frameworks, including the EU AI Act (a 10^25 FLOP training-compute threshold) and the U.S. Executive Order on AI (monitoring models trained on more than 10^26 FLOP or computing clusters capable of 10^20 FLOP/s), link high computational intensity to environmental risks but fail to capture the full scope of energy use. Lifecycle-based assessments, accounting for total emissions during development, training, and deployment, are crucial for addressing these sustainability challenges.
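For a sense of how these thresholds map onto model scale, the sketch below uses the common back-of-the-envelope estimate of roughly 6 x parameters x training tokens FLOPs for training compute. The model size and token count are hypothetical, and the approximation ignores inference, retraining, and fine-tuning costs.

```python
# Back-of-the-envelope training-compute estimate using the common ~6 * N * D FLOP
# approximation (N parameters, D training tokens); numbers below are illustrative.
EU_AI_ACT_THRESHOLD = 1e25        # FLOPs (cumulative training compute)
US_EO_MODEL_THRESHOLD = 1e26      # FLOPs

def training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

flops = training_flops(n_params=70e9, n_tokens=2e12)   # hypothetical 70B model, 2T tokens
print(f"~{flops:.2e} FLOPs "
      f"(EU threshold crossed: {flops > EU_AI_ACT_THRESHOLD}, "
      f"US threshold crossed: {flops > US_EO_MODEL_THRESHOLD})")
```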
Efficacy-related design issues arise from the complexity of integrating diverse data types and coordinating multiple agents. Multimodal augmentation faces alignment challenges, as token- and feature-level fusion methods may fail to fully utilize non-textual data. Issues like cross-modal hallucination can lead to inaccurate outputs. Tool-use depends on accurate task planning and tool selection, where errors can result in poor responses. Inefficient memory management may retrieve irrelevant or outdated information, and scaling memory for large datasets can affect consistency. Reflection-based feedback poses risks of overfitting, reducing adaptability. Finally, LLM-MA systems can suffer from miscommunication between agents, compounding errors.
Transparency-related design issues in LLM agents stem from opaque decision-making processes, complicating accountability. Multimodal augmentation reduces interpretability by making it difficult to trace how data types like text, images, and audio contribute to outputs, especially in sensitive areas like healthcare. Tool-use lacks clarity on tool selection and decision-making, requiring thorough documentation. Memory management is opaque in terms of data retention and use, hindering debugging. Reflection-based decisions are hard to trace, with consistency in feedback interpretation posing challenges. In LLM-MA systems, transparency decreases due to complex inter-agent interactions, necessitating tools to track information flow and improve accountability.
Operational challenges in LLM agents refer to the difficulties that arise during their actual deployment and interaction within real-world environments. While these agents may perform well in controlled conditions, operational issues such as adversarial attacks, misalignment with user intent, and the potential for malicious use can undermine their effectiveness and safety. These challenges emerge from the complexity of adapting to dynamic, unpredictable environments, where the agent must handle diverse inputs and potential manipulation. Addressing these issues is vital for ensuring the ethical use of LLM agents in real-world applications.
Misalignment in LLM agents, particularly as autonomy increases, can lead to harmful impacts on individuals and society:
In general, adversarial attacks threaten the safety and reliability of LLM agents by exploiting vulnerabilities in their inputs, observations, planning, and memory, causing harmful behaviors. Key attack types include:
There are also attacks targeting specific agentic system designs. Multimodal systems are susceptible to adversarial attacks, where manipulated inputs can cause models to misinterpret data, leading to erroneous or harmful outputs. For instance, exploiting alpha transparency in images can deceive vision-based AI systems. In LLM-MA systems, the spread of manipulated knowledge can compromise the integrity of agent interactions, resulting in the dissemination of false information. Moreover, AI agents are susceptible to attacks that exploit system-level weaknesses, such as state perturbations, which can significantly impair overall performance and stability.
LLM agents, while highly capable, can be misused for malicious purposes, many of which could breach the EU AI Act:
Understanding the risks of LLM agents is essential for their safe and responsible use. Rigorous evaluations, testing, monitoring, and audits can mitigate vulnerabilities and ensure compliance. Get in touch to find out how Holistic AI’s Safeguard can help you govern your LLMs.
DISCLAIMER: This blog article is for informational purposes only. This blog article is not intended to, and does not, provide legal advice or a legal opinion. It is not a do-it-yourself guide to resolving legal issues or handling litigation. This blog article is not a substitute for experienced legal counsel and does not provide legal advice regarding any situation or employer.
Schedule a call with one of our experts