Anthropic’s new funding program for advanced artificial intelligence (AI) evaluations could accelerate the adoption of AI across various commercial sectors, industry experts say.
The AI company announced Tuesday it will fund third-party organizations to develop new methods for assessing AI capabilities and risks, addressing a critical gap in the rapidly evolving field.
The initiative seeks to create more robust benchmarks for complex AI applications, potentially unlocking billions in commercial value. As businesses look to deploy AI solutions, the lack of comprehensive evaluation tools has been a barrier to widespread adoption.
“We’re seeking evaluations that help us measure the AI Safety Levels (ASLs) defined in our Responsible Scaling Policy,” Anthropic stated in its announcement. These levels determine safety and security requirements for models with specific capabilities.
Key focus areas include assessments of AI models’ potential cybersecurity capabilities, such as vulnerability discovery and exploit development. The company also seeks “evaluations that assess two critical capabilities: a) the potential for models to significantly enhance the abilities of non-experts or experts in creating CBRN [chemical, biological, radiological and nuclear] threats, and b) the capacity to design novel, more harmful CBRN threats.”
The impact of this funding program is expected to be particularly significant for complex AI applications. “Straightforward applications like speech recognition already have decent benchmarks, but quantifying a model’s capability in assisting a crime is much more difficult,” Julija Bainiaksina, founder of the AI company MiniMe, told PYMNTS.
Improved benchmarks could address critical challenges in AI adoption for businesses. “The main problems of adapting generative AI at the moment are cost, hallucinations and safety,” Ilia Badeev, head of data science at Trevolution Group, told PYMNTS. “While the first is relatively predictable and controllable, the latter two are a pain and a breaking point for many projects and integrations.”
The initiative comes as major tech companies race to develop increasingly powerful AI models, raising concerns about potential misuse. Anthropic, founded by former OpenAI researchers, has positioned itself as a leader in “responsible” AI development.
“A robust, third-party evaluation ecosystem is essential for assessing AI capabilities and risks,” Anthropic emphasized. The company added that “developing high-quality, safety-relevant evaluations remains challenging, and the demand is outpacing the supply.”
Anthropic outlined several principles for good evaluations, including that they should be “sufficiently difficult” and “not in the training data.” The company stressed the importance of domain expertise: “If the evaluation is about expert performance on a particular subject matter (e.g., science), make sure to use subject matter experts to develop or review the evaluation.”
The company is accepting proposals through an online application form on a rolling basis. Its internal experts will work closely with selected teams to refine evaluation methods, noting that “refining an evaluation typically requires several iterations.”
Anthropic’s initiative could have far-reaching implications for the commercial AI landscape. More reliable and comprehensive evaluation methods may give businesses the confidence to deploy AI solutions in critical areas such as healthcare, finance and customer service. That could unlock productivity gains and new revenue streams across industries.
However, the success of this program will largely depend on the quality and relevance of the evaluations developed. If the new benchmarks fail to capture real-world scenarios adequately or are too narrowly focused, they may provide a false sense of security.
The challenge lies in creating evaluations that are rigorous enough to ensure safety yet flexible enough to keep pace with rapidly evolving AI capabilities. As the initiative unfolds, it will be crucial to monitor how well the resulting evaluations translate to practical commercial applications.
Mobile wallets are well-designed for instant gratification — as that ubiquitous instrument, the mobile phone, makes it easier than ever to pay merchants or to complete a P2P transaction with speed, especially as real-time networks go live across the globe.
But a smooth path between senders and receivers (despite both wielding mobile wallets) across borders is lacking, as there’s no widespread interoperability between networks. Recent research in collaboration between TerraPay and PYMNTS Intelligence indicates that 42% of consumers prefer to send cross-border payments via digital wallets — leaving a staggering greenfield opportunity of 58% of individuals. The opportunity stretches across 5.2 billion mobile wallet users and trillions of transactions, as people travel, conduct cross-border commerce and send remittances.
For TerraPay, which started a decade ago (with the working name “interoperable exchange”), the starting point in simplifying global money movement took its cue from the telecoms: SMS messages could cross competing carriers’ networks, one of the earliest forms of interoperability.
But as Ani Sane, co-founder and chief business officer at TerraPay, told Karen Webster in a recent interview, moving money is about more than just the transaction: “It’s about compliance, regulations and reconciliation, and settlements and scheme rules.” Building a network to handle those complexities is no easy task because, as Sane said, digital wallets operate in silos on both the regulatory and technological sides, as each is designed to work in a particular country.
TerraPay has been building a network that allows banks to leverage their existing Swift relationships and send payments into TerraPay’s platform, which enables payments between digital wallets. The banks, he said, do not have to conduct any technical heavy lifting for that connectivity or to bring digital wallet options to end customers. Additionally, TerraPay connects to merchants, enabling them to accept digital wallet payments, and thus the thousands of wallets operating across the globe mimic the almost universal acceptance of physical cards at physical and digital points of sale.
“On our platform,” in 2024, “more than 50% of our transactions globally were delivered to mobile wallets … they were small-value ticket sizes,” he said, “but sending money to a wallet instantly is a great opportunity for banks to serve those small-value customers and businesses.”
There’s already broad familiarity with global fund flows, as the data shows 70% of consumers use cross-border transactions to pay and receive remittances and 77% of businesses generally engage in business-to-business cross-border transactions with suppliers.
That global reach, Sane said, will broaden financial inclusion. As he told Webster, “When you look at the underbanked and underserved segments” of the world, “and you look at mobile wallets [held] by that segment, it matches up almost 100%.”
COVID, especially, has made us all global citizens, and as such, we want to be able to transact globally. Sane recounted how TerraPay’s initial tests with merchants at duty-free shops at the Dubai airport revealed that offering M-Pesa, Airtel or other payment options through the interoperability network made African travelers landing there enthusiastic consumers.
“The point is to build that trust between merchants,” he said, “without having to think about which [payment] schemes have done the best. It’s a long journey … and we’ll need more efforts from the merchant side of this.”
Looking ahead, Sane said, “What we are trying to do is create the infrastructure to create the ‘rule books’ of reconciliation and settlement mechanisms … for both the wallets and the merchants and to do cross-border what they do domestically. … It’s an amazing tool to be able to use the mobile wallets as financial instruments.”