New research from Sonatype shows that "frontier" AI models often make incorrect or fabricated suggestions for software dependencies. The research team examined 36,870 dependency upgrade suggestions spanning Maven Central, npm, PyPI, and NuGet between June and August 2025.
In total, the DevSecOps company analysed 258,000 recommendations made by seven AI models from Anthropic, OpenAI, and Google. Beyond outright hallucinations, the models sometimes actively introduced security holes by suggesting versions with known vulnerabilities, occasionally leaving the AI-assisted stack more exposed than before. The study found that "grounding" the models with real-time intelligence and context produced markedly better results.
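One cheap defence against the fabricated suggestions the study describes is to verify a model's proposed version against the live registry before acting on it. The sketch below does this for PyPI via its public JSON API; the function name and the example suggestion are illustrative assumptions, not Sonatype's tooling.

```python
# Minimal sketch: confirm a model-suggested release actually exists on the
# registry before trusting it. Uses PyPI's public JSON API.
import urllib.error
import urllib.request


def version_exists_on_pypi(package: str, version: str) -> bool:
    """Return True if `package==version` is a real release on PyPI."""
    url = f"https://pypi.org/pypi/{package}/{version}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:  # release not found -> likely hallucinated
            return False
        raise


# Example: reject a hallucinated suggestion before it reaches a manifest.
suggestion = {"package": "requests", "version": "99.0.0"}  # made-up version
if not version_exists_on_pypi(suggestion["package"], suggestion["version"]):
    print(f"Rejecting {suggestion['package']}=={suggestion['version']}: not on PyPI")
```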
Sonatype says the frontier models can reason well enough; what they lack is real-time intelligence about the dependencies themselves. The company compared the frontier models against its own hybrid approach, which injects real-time intelligence at inference time. Supplying the models with additional context, such as ranked upgrade candidates, vulnerability counts, and the platform's Developer Trust Scores, sharply reduced the number of vulnerable recommendations compared with the ungrounded runs.
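To make the grounding idea concrete, here is a minimal sketch of injecting real-time intelligence at inference time. It counts known vulnerabilities per candidate version via the public OSV.dev API and folds the ranked list into the prompt; the prompt shape and candidate list are illustrative assumptions, and Sonatype's proprietary Developer Trust Scores are not reproduced here.

```python
import json
import urllib.request


def osv_vuln_count(package: str, version: str, ecosystem: str = "PyPI") -> int:
    """Count known vulnerabilities for one release via the OSV.dev API."""
    body = json.dumps({
        "package": {"name": package, "ecosystem": ecosystem},
        "version": version,
    }).encode()
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return len(json.load(resp).get("vulns", []))


def grounded_prompt(package: str, candidates: list[str]) -> str:
    """Rank candidate versions by live vulnerability data and embed the
    result in the prompt, rather than letting the model guess from memory."""
    counts = {v: osv_vuln_count(package, v) for v in candidates}
    ranked = sorted(candidates, key=counts.__getitem__)
    lines = [f"- {v}: {counts[v]} known vulnerabilities" for v in ranked]
    return (
        f"Pick an upgrade target for '{package}'. "
        "Prefer versions with fewer known vulnerabilities:\n" + "\n".join(lines)
    )


print(grounded_prompt("pillow", ["9.0.0", "10.3.0"]))
```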
The report says: "Grounding doesn't just stop hallucinations; it also guides the model toward versions with fewer known vulnerabilities when a perfect option isn't available." Humans should set the rules and policies, Sonatype CTO Brian Fox argues, but "the system still needs to be based on real-time software intelligence."
"You are asking people to fix decisions that the system never had enough truth to make well in the first place," he says. The company says that AI models will make mistakes that are expensive to fix if they don't have live registry data, vulnerability intelligence, or compatibility context.