Two Technologies That May Be Game Changers for AI
The next wave isn’t scale; it’s smarter memory and provable knowledge.
We are obsessed with staying on top of research innovations in the fast-evolving world of AI. In the last few weeks, two quiet breakthroughs landed that are changing how AI “remembers” and how it “knows.”
The first is visual context compression (DeepSeek-OCR). Instead of feeding endless text tokens to a model, you can now store pages as compact “vision tokens” and only expand to text when needed. The result is much longer usable histories at a fraction of the cost. DeepSeek reports up to ~10× compression while retaining ~97% of information.
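To make the mechanics concrete, here is a minimal sketch of the store-compressed, expand-on-demand pattern. The encoder and decoder below are hypothetical stand-ins for a vision encoder/decoder pair, not DeepSeek-OCR’s actual API:

```python
# A minimal sketch of the store-compressed, expand-on-demand pattern.
# `encoder` and `decoder` are hypothetical stand-ins; this is not
# DeepSeek-OCR's actual API.

from dataclasses import dataclass

@dataclass
class StoredPage:
    page_id: int
    vision_tokens: list  # ~60 compact tokens instead of ~600 text tokens

class CompressedArchive:
    def __init__(self, encoder, decoder):
        self.encoder = encoder  # page image -> vision tokens (cheap to keep)
        self.decoder = decoder  # vision tokens -> text (pay only on demand)
        self.pages = {}

    def ingest(self, page_id, page_image):
        # Keep only the compact vision tokens; no text tokens are stored.
        self.pages[page_id] = StoredPage(page_id, self.encoder(page_image))

    def hydrate(self, page_id):
        # Expand back to text only for the page the question actually needs.
        return self.decoder(self.pages[page_id].vision_tokens)
```

The point is architectural: ingestion pays the cheap vision-token price for every page, while the expensive expansion back to text happens only for the handful of passages a question actually touches.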
You’ll remember that DeepSeek captured a lot of this “do more with less” mindshare when they first launched to great acclaim. Now they are back with another meaningful innovation.
The second is verifiable knowledge (GraphMERT). This innovation speaks to how knowledge graphs (which underpin many of today’s AI and data systems) can be constructed far more efficiently and far more verifiably. Rather than tossing facts into opaque embeddings, you can build a knowledge graph with provenance: not just a web of facts, but a record of where each fact came from and how it is represented. In a medical-domain test, a tiny ~80M-parameter GraphMERT produced graphs with ~69.8% FActScore and ~68.8% ValidityScore, outperforming a much larger 32B LLM baseline that only hit ~40.2% and ~43.0% on the same metrics. That’s 1.6–1.7× the accuracy at 1/400th the size!
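What does a “fact with provenance” actually look like? Here is an illustrative sketch; the schema and field names are our own, not GraphMERT’s actual output format:

```python
# An illustrative "fact with provenance" record. The schema is our own
# sketch, not GraphMERT's actual format.

from dataclasses import dataclass

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    source: str        # where the fact came from (document, page, span)
    extracted_by: str  # which model/version produced it
    confidence: float  # the extractor's confidence in the fact

graph = [
    Fact("metformin", "treats", "type 2 diabetes",
         source="clinical_guidelines.pdf, p. 12",
         extracted_by="graphmert-80m", confidence=0.97),
]

# Any answer built on the graph can now cite its supporting facts:
for f in graph:
    print(f"{f.subject} -{f.predicate}-> {f.obj}  [{f.source}]")
```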
Why this matters
Today’s AI spend is dominated by context (tokens): every time your AI reads a policy, a contract, or a slide deck, you pay to bring all that text in. Visual compression flips that math. A document can sit in relatively few compact tokens and only “inflate” the exact passage you ask about. Your agent can “remember” weeks of chats and thousands of pages without paying to keep every word in active context.
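Some back-of-the-envelope math shows why this flips the economics. The per-page token counts and the price below are illustrative assumptions, not measured numbers:

```python
# Back-of-the-envelope math on what ~10x compression buys. The per-page
# token counts and the price are illustrative assumptions, not measurements.

pages = 1_000
text_tokens_per_page = 600        # a dense page stored as plain text
compression_ratio = 10            # DeepSeek-OCR's reported ~10x
vision_tokens_per_page = text_tokens_per_page // compression_ratio  # 60

price_per_million = 3.00          # hypothetical input price in USD

text_cost = pages * text_tokens_per_page / 1e6 * price_per_million
vision_cost = pages * vision_tokens_per_page / 1e6 * price_per_million

print(f"As text:   {pages * text_tokens_per_page:,} tokens -> ${text_cost:.2f} per full read")
print(f"As vision: {pages * vision_tokens_per_page:,} tokens -> ${vision_cost:.2f} per full read")
# As text:   600,000 tokens -> $1.80 per full read
# As vision: 60,000 tokens -> $0.18 per full read
```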
What’s interesting is that this mirrors how we store memories ourselves. We have a very finite and expensive amount of locally accessible memory, but we still retain a sense of many other memories. When you work to recall something you knew with perfect fidelity a year ago, you are doing a version of what this tech is making available to LLMs. Pretty amazing if you ask us.

Knowledge graphs address a fundamental problem with LLMs that is bigger than hallucinations: providing provenance for LLM analyses. GraphMERT takes this one step further by not just storing knowledge but providing reliability and traceability in an efficient way. That means responses your team can defend in audits, post-mortems, or regulatory reviews without hand-wringing over what the model meant.
What this unlocks
Here are a few real-world products and use cases that illustrate the power of these two innovations:
Keep months of chat and ticket history available to customer support agents at low cost and “hydrate” only the details that matter at answer time. Even better, attach provenance information to that history so that if things escalate, your sales and legal teams have provable data.
For contracts, store exact contract clauses, effective dates, and responsible owners in an auditable graph. As they cross-reference other agreements, bring those details into working memory only as needed. Then rapidly verify that contracts for a given counterparty all agree with the master agreement and with each other (sketched below).
Power offline-tolerant assistants with compressed but broad knowledge, and enable pulling of rich detail only when connectivity and cost allow, as we outlined in our “AI in Your Pocket” article.
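Here is a minimal sketch of that contract-consistency check, building on the provenance idea above. Every name and the clause schema are hypothetical:

```python
# A sketch of the contract use case: clauses live in a provenance-tagged
# graph, and a consistency check compares each agreement against the master.
# All identifiers and the clause schema here are hypothetical.

from dataclasses import dataclass

@dataclass
class Clause:
    contract_id: str
    clause_type: str  # e.g. "payment_terms", "governing_law"
    value: str
    source: str       # page/section the clause was extracted from

clauses = [
    Clause("MSA-001", "payment_terms", "net 30", "MSA-001 §4.2"),
    Clause("SOW-017", "payment_terms", "net 30", "SOW-017 §2.1"),
    Clause("SOW-022", "payment_terms", "net 45", "SOW-022 §2.1"),
]

def check_against_master(master_id, clauses):
    master = {c.clause_type: c for c in clauses if c.contract_id == master_id}
    for c in clauses:
        if c.contract_id == master_id or c.clause_type not in master:
            continue
        m = master[c.clause_type]
        if c.value != m.value:
            # Provenance makes the discrepancy defensible: cite both sources.
            print(f"MISMATCH {c.clause_type}: {c.contract_id} says "
                  f"'{c.value}' ({c.source}) vs master '{m.value}' ({m.source})")

check_against_master("MSA-001", clauses)
# -> MISMATCH payment_terms: SOW-022 says 'net 45' (SOW-022 §2.1)
#    vs master 'net 30' (MSA-001 §4.2)
```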
The bottom line is that innovations are coming fast and furious in the realm of generative AI. Visual compression lowers what you pay for memory. Verifiable knowledge graphs lower what you pay for mistakes. Together they reset the north star to the only metric that matters: your cost per correct, defensible answer.
GPT-5 made the point loud and clear: the next wave isn’t “bigger model, more data.” It’s smarter memory and governed knowledge. Given the vast amount of junk data out there, winners will engineer how their AI remembers and prove what it knows.
Both of these innovations also do something the industry simply isn’t focused on: reducing the need for more datacenters. Is this because everyone is complicit in inflating the AI bubble? Stay tuned.
Want to dive deeper?
Check out our book BUILDING ROCKETSHIPS 🚀 and continue this and other conversations in our 💬 ProductMind Slack community and our LinkedIn community.
🎥 In case you missed it… Check out our latest podcast on the ProductMind YouTube channel. In this episode, Oji Udezue and Ted Yang break down two major AI research studies from MIT and Wharton with opposing results on AI ROI and enterprise productivity.
They’re talking about:
🚀 What these findings mean for business leaders and middle management
🚀 Insights on Amazon layoffs
🚀 Meta’s AI strategy
🎧 Tune in now.
🎥 YouTube → https://www.youtube.com/watch?v=VRGubAHjbdI&t=1725s
🎵 Spotify → https://open.spotify.com/episode/2Znmw12kBe9HvwZv0vI3r9
🎵 We are excited to announce we have expanded our podcast 🎙️ to Spotify. Please give us a listen, and if you like what you hear, share with a friend, follow us, and/or rate us 5 stars. ⭐️⭐️⭐️⭐️⭐️