Developers of agentic AI have been making some big claims. The promise has been of autonomous systems that can do everything, from booking our flights and keeping an eye on competitors in real time to handling entire procurement cycles, all without needing an actual human to hit “confirm.” And while the technology needed to achieve most of these marvels already largely exists, the infrastructure necessary to make it work reliably at scale still leaves much to be desired.
Gartner recently projected that over 40% of agentic AI projects will be canceled before the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. That’s pretty striking, especially in view of the expectation that autonomous agents would finally herald AI’s coming-of-age. And yet, this should not really surprise anyone who has seen the limitations these agents exhibit in the real world. Most people assume the underlying issue is the quality of the models themselves. That seems plausible, but it is a little off the mark: the bigger obstacle is the data and infrastructure the agents depend on.
Why the Web Resists Agents
Consider what a capable agent actually needs. Accessing a website and getting a response is just the start; it then has to translate that response into something usable. Not only that, it has to do it consistently, in real time, and at a scale that makes the whole exercise worthwhile to begin with.
Given the web’s current shape, this is a daunting task. Just take online platforms as an example. There is no technical reason why an independent agent could not compare different platforms and make the choice that best suits users’ preferences. However, those same platforms currently depend on that information not being readily available. To maintain their advantage, they work on increasingly personalized results, sponsored placements, and urgency cues to shape user behavior and tip the scales in their favor. Without access to pertinent data, no AI agent will ever be able to complete tasks on the web or automate selecting the best option for its users.
The result of this is a web that works reasonably well for general browsing but systematically discourages automated access. Recent findings from research that scored over 120 countries based on various aspects of web accessibility provide a clear illustration. The global average score for practical reachability (how well a site responds to standard automated HTTP requests) stands at 83.4 out of 100. The score for anti-automation friction (such as CAPTCHAs, rate limiting, fingerprinting, and bot detection) is, on average, 62.8. And structured data interoperability (whether sites return data in formats that machines can actually work with) drops even further to 60.3. Those 20-plus-point differences reflect a structural gap. Sites generally respond to requests for automated access, but at the same time, restrictions abound, and data is often returned in machine-unfriendly ways. Agents that depend on reliable, timely, structured information will often fall into that gap.
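To make the gap concrete, the short sketch below subtracts the two lower scores from the reachability score. The three figures are the ones quoted above; the dictionary keys are just labels chosen for this illustration.

```python
# Global average scores quoted in the text (each out of 100).
scores = {
    "practical_reachability": 83.4,
    "anti_automation_friction": 62.8,
    "structured_data_interoperability": 60.3,
}

baseline = scores["practical_reachability"]

# Gap between reachability and each of the other two dimensions,
# rounded to one decimal place to hide floating-point noise.
gaps = {
    name: round(baseline - value, 1)
    for name, value in scores.items()
    if name != "practical_reachability"
}

for name, gap in gaps.items():
    print(f"{name}: {gap} points below reachability")
```

Both differences come out above 20 points, which is the structural gap described here.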
Data-Starved AI
Within organizations, agents face a different but related problem: a lack of usable data. In other words, the relevant data exists but has not been cleaned, tagged, or structured in a way that an AI system can understand. Many enterprises have vast repositories of customer interactions, internal reports, and market intelligence, but these are often siloed in legacy systems or stored in unstructured formats. Without proper data pipelines and governance, agentic systems cannot access or leverage this information effectively.
The same applies to customer-facing applications built on agentic systems. Without real-time web data (current prices, live inventory, policy updates, market movements), they have no choice but to reason based on a frozen version of the world. For example, a travel-booking agent that cannot check live flight prices or hotel availability will make outdated recommendations, eroding user trust. Similarly, a financial agent that cannot access real-time stock data might execute trades based on stale information, leading to significant losses.
Latency is another problem. Put simply, an agent that eventually returns the right answer is far less useful than one that returns it fast enough to act on. When dealing with autonomous systems, the tolerance for delay is even lower. In automated trading, milliseconds matter. In customer support, a slow response can lead to frustration and abandonment. In supply chain management, delays in data retrieval can cause inventory mismatches or missed opportunities. In each case, the constraint is the same: agents need context they can trust, and they are not getting it, either from their own organizational data or from the web.
Solving a Problem That’s Been Solved Before
It is easy to forget, but this is not the first time the sheer volume of information has outstripped our capacity to process it. The early web is particularly instructive here. It already held a vast amount of knowledge, but that knowledge was of little use in its raw state. What made the difference back then was infrastructure built for scale: web crawlers were deployed to index pages, scrapers were used to compare prices online, and monitoring systems were put in place to track fraudulent ads and brand impersonation across thousands of domains. All of these innovations required the ability to collect public web data reliably and at scale.
A more recent example comes from an investigation into online disinformation and fraud conducted by a nonprofit organization. The investigation uncovered a large-scale, multilingual scam operation targeting former fraud victims. It identified over 50,000 ads, 459 domains, and more than 1,100 related web pages, with an estimated reach of 52 million people across Europe. That kind of coverage requires systematic, automated data collection at scale. Without the right infrastructure, such investigations would be impossible, and the same principle applies to agentic AI systems that need to gather and act on information from the web.
Agentic AI needs an infrastructure of the same kind, except with even higher demands, because agents do more with data than any previous application. They need information that is structured, current, complete, and returned fast enough to support real-time action. While traditional web scraping and crawling laid the groundwork, agentic systems require a more sophisticated approach that can handle dynamic content, session management, and context-aware reasoning.
The Three Cs of Reliable Agent Infrastructure
As noted above, none of this is likely to happen organically. For platforms, opening up to frictionless automated access means ceding control over discovery, ranking, and customer relationships. While that would benefit consumers and could prompt platforms to reshape their business models, it also threatens short-term revenue. Therefore, the infrastructure that makes agentic systems work reliably has to be built independently. Three requirements, or three Cs, stand out.
Consistency: agents that encounter unreliable data sources produce unreliable behavior, and unreliable behavior is the fastest route to project cancellations. An agent that sometimes works and sometimes fails is worse than no agent at all, because it creates unpredictable outcomes and erodes user confidence. Consistency means ensuring that data sources are available and return the same quality of data every time, regardless of load or changes on the source side. This requires robust monitoring, redundancy, and fallback mechanisms.
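One way to picture the redundancy and fallback idea is the minimal sketch below. It tries several sources in order and accepts the first record that passes a completeness check. The function names, the three stand-in sources, and the field names are all hypothetical, chosen only for illustration; a production system would add monitoring, retries, and narrower error handling.

```python
from typing import Callable, List, Tuple

Record = dict
Source = Tuple[str, Callable[[], Record]]


def is_complete(record: Record, required_fields: List[str]) -> bool:
    """Return True when every required field is present and not None."""
    return all(record.get(field) is not None for field in required_fields)


def fetch_with_fallback(
    sources: List[Source], required_fields: List[str]
) -> Tuple[str, Record]:
    """Try each source in order; return the first complete record and its source name."""
    failures = []
    for name, fetch in sources:
        try:
            record = fetch()
        except Exception as exc:  # a real system would catch narrower error types
            failures.append(f"{name}: {exc!r}")
            continue
        if is_complete(record, required_fields):
            return name, record
        failures.append(f"{name}: incomplete record")
    raise RuntimeError("all sources failed: " + "; ".join(failures))


# Hypothetical stand-ins for real data sources.
def primary() -> Record:
    raise TimeoutError("primary source timed out")


def mirror() -> Record:
    return {"sku": "A1", "price": None}  # reachable, but the price is missing


def cache() -> Record:
    return {"sku": "A1", "price": 19.99}


source_name, record = fetch_with_fallback(
    [("primary", primary), ("mirror", mirror), ("cache", cache)],
    required_fields=["sku", "price"],
)
print(source_name, record)
```

The design choice worth noting is that an incomplete answer is treated the same as a failed request: the agent only ever sees a record that meets the quality bar, or an explicit error.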
Currency: real-time access to prices, inventory, availability, and policy is what separates an agent reasoning based on current facts from one reasoning by reference to stale assumptions. In most commercial contexts, the latter creates more problems than it solves. An agent that books a flight at an old price may end up with a cancellation or overcharge. An agent that recommends products based on yesterday’s inventory may suggest items that are no longer in stock. Currency demands low-latency data pipelines and the ability to push updates as soon as they occur, rather than relying on periodic refreshes.
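A simple guard against stale data can be sketched as follows. Each value carries the time it was fetched, and the agent refuses to act on anything older than a chosen maximum age. The `Observation` class, the `is_fresh` helper, the prices, and the five-minute threshold are assumptions made for this example, not part of any particular system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Observation:
    value: float
    fetched_at: datetime  # timezone-aware timestamp in UTC


def is_fresh(
    obs: Observation, max_age: timedelta, now: Optional[datetime] = None
) -> bool:
    """Return True when the observation is no older than max_age."""
    current = now if now is not None else datetime.now(timezone.utc)
    return current - obs.fetched_at <= max_age


# A fixed "now" keeps the example deterministic.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
live_price = Observation(199.0, now - timedelta(seconds=30))
old_price = Observation(179.0, now - timedelta(hours=24))
max_age = timedelta(minutes=5)  # assumed tolerance; the right value depends on the domain

for label, obs in [("live", live_price), ("old", old_price)]:
    action = "act on it" if is_fresh(obs, max_age, now) else "refetch before acting"
    print(label, obs.value, action)
```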
Compliance: access built outside fair standards tends to provoke countermeasures that raise barriers for all automated systems. Any infrastructure worth building has to be sustainable, not just technically but in practice. This means adhering to the terms of service of websites, respecting robots.txt files, and implementing rate limiting and other ethical scraping practices. It also means staying within legal boundaries regarding data privacy, such as GDPR and CCPA, and ensuring that agents do not inadvertently collect or process personal data without consent. Compliance is not just a legal necessity; it is also a trust mechanism that allows agents to operate without triggering widespread blocks or lawsuits.
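The robots.txt and rate-limiting practices mentioned here can be sketched with Python's standard library. The example parses an inline robots.txt so that it runs without network access. The file contents, the agent name `example-agent`, the URLs, and the `MinIntervalLimiter` class are all hypothetical. Terms of service and privacy obligations still have to be reviewed separately; code like this covers only the mechanical part.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; a real agent would fetch the site's own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

AGENT = "example-agent"  # assumed user-agent name
allowed_public = parser.can_fetch(AGENT, "https://example.com/products")
allowed_private = parser.can_fetch(AGENT, "https://example.com/private/data")
delay = parser.crawl_delay(AGENT)  # None when the file sets no Crawl-delay


class MinIntervalLimiter:
    """Block so that successive calls to wait() are at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous permitted request

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Fall back to a conservative one-second interval if no Crawl-delay is given.
limiter = MinIntervalLimiter(delay if delay is not None else 1.0)

print("public allowed:", allowed_public)
print("private allowed:", allowed_private)
print("crawl delay:", delay)
```

In use, the agent would call `limiter.wait()` before every request to the same host and skip any URL for which `can_fetch` returns False.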
The web was not designed for agents. Within organizations, the context agents need is often not easily accessible to them or even readily available. These are data quality problems that can be solved and infrastructure problems that we are actively solving. The three Cs provide a framework for building the robust, reliable foundation that agentic AI needs to fulfill its promise. Whether across e-commerce, finance, healthcare, or logistics, the ability to autonomously gather, interpret, and act on data will unlock efficiencies and innovations that were previously unimaginable. But that future depends on our collective willingness to invest in the underlying infrastructure rather than simply expecting the models to work miracles.
Finally, what we as a society truly need is to decide if we are ready to welcome AI agents or if we want to keep holding them back. The technology is ready. The data is waiting. The only missing piece is the infrastructure that bridges the gap between capability and reality.