Git-for-data Pioneer lakeFS Secures $20M in Growth Capital, Fills a Critical Gap in Enterprise AI Tech Stack
- Enterprise AI initiatives are often derailed by the complexity and scale of their own data
- Just as Git transformed software development, lakeFS is transforming enterprise AI by versioning the data that powers it
- lakeFS has seen triple-digit growth in user adoption across thousands of organizations, including Arm, Bosch, Lockheed Martin, NASA, Volvo, and the U.S. Department of Energy, for highly scalable, safe, and efficient data versioning
lakeFS, the leading “git-for-data” version control system for enterprise data and AI initiatives, has raised $20 million in a growth funding round. With thousands of organizations including Arm, Bosch, Lockheed Martin, NASA, Volvo, and the U.S. Department of Energy already using lakeFS as part of their data management infrastructure, this new investment will accelerate its growth supporting data engineering, AI and ML projects in the enterprise and public sector markets. The funding round brings the company’s total raised capital to $43 million, and was led by Maor Investments along with existing investors Dell Technologies Capital, Norwest and Zeev Ventures.
“We’re still at the very beginning of the AI revolution and organizations struggle to unlock value and business efficiencies using AI,” said Dr. Einat Orr, co-founder and CEO of lakeFS. “Enterprises are adopting lakeFS as an infrastructure layer in their data and AI operations to reduce time-to-market on their AI initiatives while increasing data and model quality. This is more important than ever because the organizations that innovate fastest will be the ones that win. This funding will allow us to double down on innovation – particularly around features critical to enterprise-scale AI operations.”
Closing the AI data infrastructure gap
Organizations are racing to obtain value and a competitive edge through artificial intelligence (AI) and machine learning (ML) initiatives. Yet, according to the most recent EY Survey on AI Adoption, the vast majority (83%) of surveyed executives said AI adoption would be faster if they had a stronger data infrastructure, and 67% said they could move faster on AI adoption but the lack of data infrastructure is holding them back.
There is a huge and growing gap between data’s increasing importance and organizations’ ability to manage it with confidence and control. As AI, MLOps and data teams build their organization’s AI infrastructure in real time (training proprietary LLMs, building agents and agentic workflows), they waste precious time and resources wrangling the data that powers the AI transformation.
This wrangling often means manually juggling copies of petabyte-sized datasets, building sub-datasets for experiments or specific projects, or reproducing the exact data used to train a specific LLM for research, compliance or training purposes. As a result, AI projects are delayed, cost much more, expose organizations to compliance risks and deliver disappointing results overall.
Enterprise data version control: A foundational layer for AI infrastructure
Just as Git revolutionized software development, lakeFS is reshaping enterprise AI by versioning the data that powers it. Designed for massive volumes of unstructured, semi-structured and structured data in data lakes—text, images, audio, video—lakeFS gives organizations control, safety, and reproducibility at scale.
Using lakeFS, enterprise data, AI, and ML teams can:
- Efficiently experiment and iterate on massive datasets without duplicating storage
- Reproduce AI/ML models and training pipelines for compliance, auditing, and traceability
- Collaborate at scale with full control over changes to data, models, and environments
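To make the first two points concrete, here is a toy in-memory sketch of the general Git-like idea: a branch is a pointer to a commit, and a commit is a mapping from paths to content hashes, so creating a branch copies no data and any past commit can be read back exactly. This is an illustration only, not the lakeFS API or its implementation; the `ToyRepo` class, its method names and the sample paths are all hypothetical.

```python
import hashlib


class ToyRepo:
    """Toy model of Git-like data versioning (hypothetical, not the lakeFS API)."""

    def __init__(self):
        self.objects = {}               # content hash -> bytes, each stored once
        self.commits = {}               # commit id -> {path: content hash}
        self.branches = {"main": None}  # branch name -> commit id
        self.staged = {"main": {}}      # branch name -> uncommitted {path: hash}

    def create_branch(self, name, source):
        # A branch is only a pointer to a commit; no object data is copied.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def put(self, branch, path, data):
        digest = hashlib.sha256(data).hexdigest()
        self.objects.setdefault(digest, data)  # identical content is stored once
        self.staged[branch][path] = digest

    def commit(self, branch):
        parent = self.branches[branch]
        # Start from the parent's snapshot and apply the staged changes.
        snapshot = dict(self.commits.get(parent, {}))
        snapshot.update(self.staged[branch])
        raw = repr((parent, sorted(snapshot.items()))).encode()
        commit_id = hashlib.sha256(raw).hexdigest()[:12]
        self.commits[commit_id] = snapshot
        self.branches[branch] = commit_id
        self.staged[branch] = {}
        return commit_id

    def read(self, commit_id, path):
        # Reading by commit id reproduces the data exactly as it was committed.
        return self.objects[self.commits[commit_id][path]]


repo = ToyRepo()
repo.put("main", "train/a.txt", b"cat")
repo.put("main", "train/b.txt", b"dog")
base = repo.commit("main")

# Experiment on a branch: only the changed file adds a new stored object.
repo.create_branch("experiment", "main")
repo.put("experiment", "train/b.txt", b"wolf")
exp = repo.commit("experiment")

print(len(repo.objects))              # number of distinct stored objects
print(repo.read(base, "train/b.txt")) # original data, unchanged by the experiment
print(repo.read(exp, "train/b.txt"))  # data as seen on the experiment branch
```

In this sketch the experiment branch shares the unchanged file with `main`, and the original commit id keeps resolving to the original data, which is the property that makes a training run reproducible for auditing.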
Experts from leading organizations frequently cite lakeFS at global technology conferences as a fundamental part of their data and AI infrastructure. As data volumes scale rapidly, lakeFS provides the versioning, reproducibility, and control needed to manage complex pipelines with confidence. It lets teams experiment safely and collaborate more efficiently, making it an essential component in delivering reliable and scalable data, AI and ML operations. Recent examples include:
- Volvo: George Markhulia, engineering manager ML operations at Volvo, outlined at KubeCon how they have “natively integrated lakeFS” into their ML platform.
- Lockheed Martin: Greg Forrest, director of AI foundations at Lockheed Martin, presented their “AI Factory” at NVIDIA’s AI Summit, highlighting lakeFS as part of the “infrastructure and tools that enable engineers to build trustworthy AI.”
- Microsoft: Vara Ghanta, principal software engineering manager at Microsoft, noted in a blog post about their Overture Data Platform—managing datasets containing billions of buildings, roads, land, and water—that “version management of datasets was a crucial feature required for building that data platform.”
This investment supports the rapidly growing demand for lakeFS, fueling the expansion of its engineering and go-to-market teams, accelerating product development, and deepening global enterprise partnerships. The funding builds on a period of strong momentum for lakeFS, including Fortune 100 customer wins, triple-digit community growth and recent product releases such as distributed data management, lakeFS Mount and Iceberg REST catalog support.
“AI progress is bottlenecked not only by algorithms, but also by data. lakeFS is solving one of the most critical and often overlooked challenges in modern data infrastructure for enterprises: enabling Git-like version control for massive, fast-evolving datasets,” explained Ido Hart, Partner at Maor Investments. “As AI data becomes larger, messier and more mission-critical, lakeFS delivers the control layer needed to build, iterate and ship with confidence. Built for the scale and complexity of modern enterprises, lakeFS is not just a smart solution, it’s a foundational layer for reproducibility, collaboration and trust in the AI era. We believe lakeFS will become indispensable to the modern AI stack, and we’re proud to back their bold vision.”