
AiThority Interview with Jon Bratseth, CEO and co-founder of Vespa.ai

Jon Bratseth, CEO and co-founder of Vespa.ai, chats about why data privacy and security are paramount concerns, the future of retrieval-augmented generation (RAG), AI in operational decision-making, and more in this catch-up.

————

Jon, as a co-founder of Vespa.ai, can you share the inspiration and vision that led to the creation of Vespa?

Vespa grew out of my team's work on web search – we were competing with Google and the others before we were acquired by Yahoo twenty years ago. Through this work we realized two things. Firstly, when you want to scale to large amounts of data, the traditional approach of pulling data out of a database to do something with it just takes too long and completely overwhelms your switches; you need to send what you want to do to the data instead. And secondly, to achieve good quality you need to apply a lot of intelligence to that data.

Putting that together led us to realize we needed to build a platform that allowed us to both distribute and index large amounts of data and perform distributed computations over that data in real time. And once we had built and generalized that platform – which took us a decade – we could apply it to lots of other problems besides search that would benefit from these capabilities, such as recommendation, personalization, ad serving and RAG.


Data privacy and security are paramount concerns for organizations leveraging AI technologies. How does Vespa.ai approach these challenges, especially in industries with strict regulatory requirements?

There are some requirements that are a must for these organizations: different data and workloads must run on separate hardware that is never mixed, and these separate data planes must be under the customer's control, i.e. in their own accounts, VPCs, and so on. The challenge comes from combining that with providing a managed system that makes it easy for these companies to run their workloads reliably at scale in production, change them, and deploy new ones. Vespa ensures this through what we call Vespa Cloud Enclave, where we combine a shared control plane for management with private, customer-controlled data planes.

In addition there are of course the general basics of security – encrypted data and communications, minimal privileges, fleet endpoint monitoring, software supply chain security, continuous upgrades and OS patches, red teaming, a bounty program, and so on – which we also manage on the systems that we run.

With over 20 years of experience working on large distributed systems, how has your approach to system architecture evolved with the rise of AI and Big Data demands?

To be honest it hasn't changed all that much, since as developers of a deep platform we rely on anticipating developments far in advance from first principles. For example, we started developing support for tensors more than a decade ago, before TensorFlow, because we saw that computing over large structured spaces of numbers would become increasingly valuable and economical.

One thing we did not anticipate was the rapid rise of LLMs, so over the last couple of years we have spent a lot of effort on understanding how these can be integrated productively into serious enterprise systems and on building out support for those use cases.

Accurate and relevant large language models (LLMs) are essential for the success of generative AI initiatives. What key factors should organizations prioritize to ensure the accuracy and relevance of LLMs in generative AI?

You need two things to succeed: an LLM that is sufficiently intelligent to solve the use cases you have in mind, and a way to provide it with the information it needs to do so – retrieval-augmented generation (RAG). For many (not all) business use cases, applying sufficient intelligence requires using very large LLMs, and those are run most economically by companies that specialize in doing that at scale, so you should use those.
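To make this two-part recipe concrete, here is a minimal RAG sketch in Python. The toy keyword retriever and the `complete` stub are hypothetical stand-ins for a real search system and a hosted LLM API – not Vespa's or any vendor's actual interfaces:

```python
# Minimal RAG sketch: retrieve relevant passages, then hand them to an
# LLM as context. The corpus, retriever, and `complete` stub are all
# invented stand-ins for illustration only.

TOY_CORPUS = [
    "Vespa distributes data and computation across nodes.",
    "BM25 is a classic text ranking function.",
    "Embeddings map text into a vector space.",
]

def search(query: str, k: int = 2) -> list[str]:
    # Toy relevance: count overlapping words. A real system would use
    # hybrid text + vector retrieval, as discussed later in this interview.
    q = set(query.lower().split())
    scored = sorted(TOY_CORPUS, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def complete(prompt: str) -> str:
    # Stand-in for a call to a large hosted LLM.
    return f"[LLM answer grounded in a prompt of {len(prompt)} chars]"

def answer(question: str) -> str:
    context = "\n".join(search(question))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return complete(prompt)

print(answer("What does BM25 do?"))
```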


The second part, providing the LLM with the information it needs, is where most organizations succeed or fail. When we provide search for a human employee, the relevance of the returned information is important, but not critical. Humans constantly absorb information by going to meetings, reading their mail and so on, and if they can't find the information they need to solve a problem they will usually be able to use their already absorbed information, or at least know that there are things they don't know. LLMs are not like that. After their training is done they absorb nothing, and so rely completely on the information we are able to surface at the time when they are solving a problem. In other words, relevance becomes even more important. Search relevance is a field with established best practices, all of which apply when providing information to LLMs rather than humans, and organizations that are successful adapt a variety of these to their needs, guided by evaluations.
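As one hedged illustration of being "guided by evaluations": with a small hand-labeled set of queries and their relevant documents, even a simple recall@k metric lets you compare retrieval setups. The document ids and labels below are invented for the example:

```python
# Sketch of relevance evaluation with recall@k over a tiny hand-labeled
# evaluation set. All ids and labels here are invented example data.

def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    return len(set(retrieved[:k]) & relevant) / len(relevant)

# query -> (the system's ranked result ids, the labeled relevant ids)
eval_set = {
    "reset password": (["d3", "d7", "d1"], {"d3", "d9"}),
    "expense policy": (["d2", "d5", "d8"], {"d2"}),
}

scores = [recall_at_k(ranked, relevant, k=3)
          for ranked, relevant in eval_set.values()]
print(f"mean recall@3 = {sum(scores) / len(scores):.2f}")  # 0.75
```

Tracking a metric like this across retrieval configurations is what lets an organization adapt best practices to its own data rather than guessing.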

Given the rapid developments in AI and machine learning technologies, where do you see the biggest opportunities for businesses to leverage AI in operational decision-making?

We see organizations working on this across many areas, including e-commerce, finance, health care, and others. What all of the first wave of successful applications seem to me to have in common is that there is a human in the loop. These systems aren't yet at a stage where we can trust them to make high-stakes decisions on their own without a competent human effectively checking their work, and this works best when conceptualized as collaborative problem-solving between the human and the system.

Could you share five thoughts on the future of retrieval-augmented generation (RAG)?

1. In the short term, the knowledge will spread that when dealing with text, just using simple text search (with BM25) gives better results than simple vector embeddings, and that combining both vectors and text is superior to either (see the fusion sketch after this list).

2. As people move from proofs of concept to serious enterprise applications, scalability, reliability, and security will come to the forefront, which will require more comprehensive platforms than the initial experiments.

3. Retrieval methods such as ColBERT, which apply tensor computations in more advanced ways to achieve superior results, will continue to proliferate as the work we and others are doing to make these economical at scale becomes better known (see the MaxSim sketch after this list).

4. Visual retrieval approaches such as ColPali will continue to increase in popularity and gradually take over document search.

5. As the world transitions to long reasoning, as pioneered by OpenAI, in RAG applications using private data, low query latency and high request rates will become important even for internal RAG applications, since all the reasoning steps taken by these models translate into thousands of queries for each problem to be solved.
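On the first thought: one common way to combine text and vector results is reciprocal rank fusion (RRF). This is a standard fusion technique, not a claim about how Vespa itself combines signals, and the document ids are invented for the example:

```python
# Reciprocal rank fusion (RRF): fuse ranked lists from a BM25 text
# retriever and a vector retriever into one ranking. Example ids are made up.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Score each doc by sum of 1/(k + rank) over the lists it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["d1", "d2", "d3"]        # from text search
vector_hits = ["d3", "d4", "d1"]      # from embedding search
print(rrf([bm25_hits, vector_hits]))  # ['d1', 'd3', 'd2', 'd4']
```

On the third thought: ColBERT scores documents with late interaction ("MaxSim"): for each query-token embedding, take its maximum similarity over all document-token embeddings, then sum. A minimal sketch with toy two-dimensional vectors, assuming dot product as the similarity:

```python
# ColBERT-style MaxSim scoring with toy 2-d token embeddings. Real systems
# store such tensors per document and run this computation at scale.

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def maxsim(query_emb: list[list[float]], doc_emb: list[list[float]]) -> float:
    # For each query token, keep its best-matching document token, then sum.
    return sum(max(dot(q, d) for d in doc_emb) for q in query_emb)

query = [[1.0, 0.0], [0.0, 1.0]]            # two query-token embeddings
doc = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # three doc-token embeddings
print(maxsim(query, doc))  # 0.9 + 0.8 = 1.7
```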

[To share your insights with us as part of editorial or sponsored content, please write to psen@itechseries.com]

Jon Bratseth is the CEO and a co-founder of Vespa.ai, and the architect and one of the main contributors to Vespa, the platform for applications combining AI and data online. Jon has 25 years' experience as an architect and programmer of large distributed systems, and is a frequent public speaker.

Vespa.ai is a platform for building and running real-time AI-driven applications for search, recommendation, personalization, and retrieval-augmented generation (RAG). It enables enterprise-wide AI deployment by efficiently managing data, inference, and logic, handling large data volumes and over 100K queries per second. Vespa supports precise hybrid search across vectors, text, and structured metadata. Available as a managed service and open source, it’s trusted by organizations like Spotify, Wix, and Yahoo. The platform offers robust APIs, SDKs for integration, comprehensive monitoring metrics, and customizable features for optimized performance.
