First-of-its-kind Pinecone Knowledge Platform to Power Best-in-class Retrieval for Customers
Industry-leading vector database capabilities combined with proprietary AI models to help developers build up to 48% more accurate AI applications, faster and more easily
Pinecone recognized as AWS GenAI Innovator Partner of the Year
With its vector database at the core, Pinecone, the leading knowledge platform for building accurate, secure, and scalable artificial intelligence (AI) applications, has announced industry-first integrated inference capabilities. These include fully managed embedding and reranking models, along with a novel approach to sparse embedding retrieval. By combining these innovations with Pinecone’s proven dense retrieval capabilities, the platform delivers an approach to cascading retrieval that sets a new standard for AI-powered solutions.
New proprietary reranking and embedding models, as well as the addition of third-party models like Cohere’s Rerank 3.5 model, further provide customers quick, easy access to high-quality retrieval and significantly streamline the development of grounded AI applications.
“Our goal at Pinecone has always been to make it as easy as possible for developers to build production-ready knowledgeable AI applications quickly and at scale,” said Edo Liberty, founder and CEO of Pinecone. “By adding built-in and fully-managed inference capabilities directly into our vector database, as well as new retrieval functionality, we’re not only simplifying the development process but also dramatically improving the performance and accuracy of AI-powered solutions.”
Pinecone’s composable platform now includes the following updates:
- pinecone-rerank-v0 proprietary reranking model
- pinecone-sparse-english-v0 proprietary sparse embedding model
- New sparse vector index type
- Integration of Cohere’s Rerank 3.5 model
- New security features, including role-based access controls (RBAC), audit logs, customer-managed encryption keys (CMEK), and the general availability (GA) of Private Endpoints for AWS PrivateLink
Advancing the state of the art for retrieval
High-quality retrieval is key to delivering the best user experience in AI search and retrieval-augmented generation (RAG) applications. Pinecone’s research shows that state-of-the-art performance requires combining three key components:
- Dense vector retrieval to capture deep semantic similarities
- Fast and precise sparse retrieval for keyword and entity search using a proprietary sparse indexing algorithm
- Best-in-class reranking models to combine dense and sparse results and maximize relevance
By combining the sparse retrieval, dense retrieval, and reranking capabilities within Pinecone, developers will be able to create end-to-end retrieval systems that deliver up to 48% and on average 24% better performance than dense or sparse retrieval alone.
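The cascading pattern described above can be sketched in miniature. The scoring functions below are toy stand-ins (keyword overlap for sparse retrieval, character-trigram similarity for dense retrieval, a weighted blend for reranking), not Pinecone's actual models or algorithms; the point is the two-stage shape: independent dense and sparse candidate generation, followed by reranking of the merged pool.

```python
# Illustrative sketch of cascading retrieval. All scoring functions are
# hypothetical stand-ins for real embedding, sparse, and reranking models.
from collections import Counter

DOCS = {
    "d1": "pinecone vector database for semantic search",
    "d2": "keyword search with sparse embeddings",
    "d3": "cooking recipes for pasta",
}

def sparse_score(query: str, doc: str) -> float:
    # Toy keyword-overlap score, standing in for a learned sparse model.
    q, d = Counter(query.split()), Counter(doc.split())
    return float(sum((q & d).values()))

def dense_score(query: str, doc: str) -> float:
    # Toy proxy for embedding similarity: Jaccard overlap of character trigrams.
    grams = lambda s: {s[i:i + 3] for i in range(len(s) - 2)}
    qg, dg = grams(query), grams(doc)
    return len(qg & dg) / max(len(qg | dg), 1)

def cascade(query: str, k: int = 2) -> list:
    # Stage 1: each retriever independently proposes its top-k candidates.
    dense = sorted(DOCS, key=lambda d: dense_score(query, DOCS[d]), reverse=True)[:k]
    sparse = sorted(DOCS, key=lambda d: sparse_score(query, DOCS[d]), reverse=True)[:k]
    pool = set(dense) | set(sparse)
    # Stage 2: rerank the merged pool. A simple weighted blend stands in
    # for a dedicated reranking model such as pinecone-rerank-v0.
    blend = lambda d: 0.5 * dense_score(query, DOCS[d]) + 0.5 * sparse_score(query, DOCS[d])
    return sorted(pool, key=blend, reverse=True)

ranked = cascade("sparse keyword search")
```

Because the sparse and dense stages each catch matches the other misses (exact keywords and entities versus semantic paraphrases), the reranked union can outperform either retriever alone.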
“With the advent of GenAI, we knew we could challenge the status quo in talent acquisition by building an experience focused on the job seeker rather than the hiring company,” said Alex Bowcut, CTO of Hyperleap. “With Pinecone, we’ve seen 40% better click-through rates for the job matches we deliver with search results using their semantic retrieval as opposed to traditional full-text search. Now, with the addition of sparse vector retrieval to Pinecone’s proven natural language search capabilities, we’re excited to explore how we can bring deeper personalization to people looking for work.”
Pinecone proprietary models
With the introduction of its first proprietary models, Pinecone is making it easier for developers to build knowledgeable AI.
- pinecone-rerank-v0 improves search accuracy by up to 60% and on average 9% over industry-leading models on the Benchmarking-IR (BEIR) benchmark
- pinecone-sparse-english-v0 boosts performance for keyword-based queries, delivering up to 44% and on average 23% better normalized discounted cumulative gain (NDCG)@10 than BM25 on Text REtrieval Conference (TREC) Deep Learning Tracks
Natively integrated into Pinecone’s platform, these models simplify the development of production-ready AI applications.
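NDCG@10, the metric cited in the benchmark figures above, rewards rankings that place highly relevant documents near the top by discounting graded relevance logarithmically with rank, then normalizing against the ideal ordering. A minimal implementation of the standard formula (not tied to Pinecone's benchmark code) looks like this:

```python
import math

def dcg_at_k(relevances, k=10):
    # Discounted cumulative gain: graded relevance discounted by log2 of rank.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    # Normalize by the DCG of the ideal (descending-relevance) ordering.
    idcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / idcg if idcg > 0 else 0.0

# A ranking that surfaces the most relevant document first scores 1.0;
# burying it at rank three lowers the score.
ideal = ndcg_at_k([3, 2, 1, 0])
buried = ndcg_at_k([1, 2, 3, 0])
```

The "44% better NDCG@10 than BM25" claim means the sparse model's rankings sit that much closer to the ideal ordering on the TREC Deep Learning query sets.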
AI search simplified with integrated inference
With the release of Pinecone’s integrated inference capability, engineers can now develop state-of-the-art applications without the burden of managing model hosting, integration, or infrastructure. Because these capabilities sit behind a single API, developers can seamlessly access top embedding and reranking models hosted on Pinecone’s infrastructure, without vectors or data being routed through multiple providers. This consolidation not only simplifies development but also enhances security and efficiency.
“Pinecone’s new integrated inference capabilities are a game-changer for us,” said Isaac Pohl-Zaretsky, CTO & Co-Founder at Pocus. “The ability to have embedding, reranking, and retrieval all within the same environment not only streamlines our workflows but also powers our AI solutions with minimal latency, less technical debt, and improved performance. Pinecone was already helping us deliver tremendous value with precise signals to power our customers’ go-to-market efforts, and now with their unique platform we’re thrilled to be able to deliver even more.”
Greater choice with Cohere Rerank
As part of Pinecone’s expanding inference capabilities, we’ve collaborated with Cohere to host cohere-rerank-v3.5 natively within the Pinecone platform. This allows customers to easily select and use cohere-rerank-v3.5 directly from the Pinecone API to enhance the relevance of their search results. Rerank 3.5 excels at understanding complex business information across languages, making it optimal for global organizations in sectors like finance, healthcare, the public sector, and more. By incorporating Cohere’s latest industry-leading reranking model, developers can further refine search outputs, ensuring more accurate and contextually relevant responses for their applications.
Enhanced security for mission-critical workloads
Pinecone’s database is built for production, which means the security of customer workloads is paramount. The following advancements further strengthen Pinecone’s commitment to enterprise-grade security and compliance:
- More granular role-based access controls (RBAC) let users set API key roles for control and data plane operations
- Customer-managed encryption keys (CMEK) enable users to control their own data encryption and enhance tenant isolation
- Audit logs for control plane activities (e.g. index creation or deletion) via Amazon Simple Storage Service (Amazon S3) endpoints
- Support for AWS PrivateLink is now generally available (GA) for serverless indexes
Unlocking more with AWS
Pinecone is the recipient of the 2024 AWS GenAI Innovator Partner of the Year award. The award recognizes Pinecone’s unique role in advancing the services, tools, and infrastructure pivotal to implementing generative AI technologies.
Pinecone’s AWS Generative AI Competency acknowledges the company as an expert generative AI solution provider that creates value and drives business growth for customers. Customers can leverage Amazon Bedrock Knowledge Bases with Pinecone to build more effectively with AI and reduce operational complexity and costs. Specifically, Knowledge Bases for Amazon Bedrock provides “one click” integration with Pinecone, fully automating the ingestion, embedding, and querying of customer data as part of the LLM generation process. This seamless flow provides a scalable foundation for AI innovation, enabling faster time-to-value and more grounded, production-grade AI applications. Furthermore, customers using Amazon Bedrock Knowledge Bases with Pinecone can now run RAG evaluations natively in Amazon Bedrock instead of having to connect third-party tools.
Creating new possibilities with knowledgeable AI
As the first AI infrastructure company to provide a single platform for inference, retrieval, and knowledge base management, Pinecone is setting a new standard in the industry. This integrated approach is expected to lead to significant performance improvements and open up new possibilities for AI application development.
Customers can access Pinecone through the AWS Marketplace to fast-track procurement, accelerate deployment, and optimize costs to quickly and easily drive better outcomes with knowledgeable AI. Developers can also get started for free on the Pinecone console.