Tabnine Introduces Ability to Flag Unlicensed Code in AI-generated Software
Feature enables engineering teams to reduce risk of IP infringement when using popular LLMs
Tabnine, the originators of the AI code assistant category, today introduced Code Provenance and Attribution, a feature that enables enterprises to benefit from the use of large-scale, popular LLMs for software development tasks while minimizing the likelihood of restrictively licensed code being injected into their codebase.
Large language models from Anthropic, OpenAI, and others have been trained on vast catalogs of content and code captured from publicly visible sources, many of which are not freely licensed. Because LLMs tend to generate content that matches what they have seen in training, use of these vendors’ models may introduce IP or copyright liabilities. With Provenance and Attribution, Tabnine checks code generated using AI chat or AI agents against code that is publicly visible on GitHub, flags any matches, and references the source repository as well as its license type. This information makes it easier for engineering teams to review code generated with the assistance of AI and decide whether the license of that code meets their specific standards and requirements.
The new Provenance and Attribution capability lets Tabnine more easily support development teams, along with their legal and compliance teams, that want to leverage a wide variety of powerful models.
“Models trained on larger pools of data outside of permissively licensed open source code can offer superior performance, but enterprises who use them run the risk of running afoul of IP and copyright violations,” said Peter Guagenti, President at Tabnine. “Our Code Provenance and Attribution capability addresses this tradeoff, increasing productivity without sacrificing compliance. Experienced engineering teams expect to know the source and license of generative AI output and this feature ensures they do.”
Given that copyright law on the use of AI-generated content is still unsettled, Tabnine’s proactive stance aims to drastically reduce the risk of IP infringement when enterprises use models like Anthropic’s Claude, OpenAI’s GPT-4o, and Cohere’s Command R+ for software development.
Tabnine’s license-compliant model, Tabnine Protected 2, which is trained exclusively on permissively licensed code, remains a critical offering. Many companies believe that the very use of an LLM trained on unlicensed software may introduce risk, so Tabnine will continue to support and develop this unique model. The new Provenance and Attribution capability adds support for legal and compliance teams who are comfortable using a wider variety of models as long as those models do not inject unlicensed code.
The Code Provenance and Attribution capability supports the full breadth of software development activities inside Tabnine, including code generation, code fixing, generating test cases, implementing Jira issues, and more. Because Tabnine reads code like a human, it flags not only output that exactly matches open source code on GitHub but also output that matches functionally or in its implementation.
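To make the idea of exact versus functional matching concrete, here is a minimal sketch of one way such a check could work. It is not Tabnine's implementation, which the announcement does not describe. The sketch assumes a tiny in-memory index of reference snippets, token normalization that renames identifiers, and a Jaccard similarity threshold; the repository names, the `PERMISSIVE` license list, and every function name are hypothetical.

```python
import io
import keyword
import tokenize
from dataclasses import dataclass

# Hypothetical policy: which licenses a team treats as permissive is an assumption.
PERMISSIVE = {"MIT", "BSD-3-Clause", "Apache-2.0"}

# Token types that carry layout or comments rather than program logic.
SKIPPED = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
           tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


@dataclass
class Match:
    repo: str
    license: str
    similarity: float
    needs_review: bool  # True when the license is not on the permissive list


def normalize(source: str) -> list[str]:
    """Tokenize Python source, dropping comments and layout and renaming
    every non-keyword identifier to "ID" so renamed copies still match."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")
        else:
            tokens.append(tok.string)
    return tokens


def shingles(tokens: list[str], k: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping k-token windows."""
    return {tuple(tokens[i:i + k]) for i in range(max(len(tokens) - k + 1, 1))}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared windows divided by all windows."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_matches(generated: str, index: dict, threshold: float = 0.8) -> list[Match]:
    """Compare generated code with each indexed snippet and report the
    repository and license of every snippet at or above the threshold."""
    gen = shingles(normalize(generated))
    matches = []
    for repo, (license_id, source) in index.items():
        score = jaccard(gen, shingles(normalize(source)))
        if score >= threshold:
            matches.append(Match(repo, license_id, score,
                                 license_id not in PERMISSIVE))
    return sorted(matches, key=lambda m: m.similarity, reverse=True)


if __name__ == "__main__":
    # Hypothetical index: repository name -> (license, source snippet).
    index = {
        "example/gpl-utils": ("GPL-3.0",
            "def clamp(value, low, high):\n"
            "    # keep value in range\n"
            "    return max(low, min(value, high))\n"),
        "example/mit-utils": ("MIT", "def add(a, b):\n    return a + b\n"),
    }
    generated = "def bound(x, lo, hi):\n    return max(lo, min(x, hi))\n"
    for match in find_matches(generated, index):
        print(match)
```

In this toy version, the generated `bound` function is reported against the hypothetical GPL-licensed `clamp` snippet even though every identifier differs, which loosely illustrates a match that is not character-for-character. A production system would need a far larger index, language-aware parsing, and a more robust notion of functional equivalence.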
Tabnine expects to soon add the ability for users to identify specific repos, such as those maintained by competitors, and have Tabnine check generated code against them as well. Tabnine also plans to add a censorship capability that lets Tabnine administrators remove matching code before it is displayed to the developer.
Code Provenance and Attribution is in Private Preview, open to any Tabnine enterprise customer, and works on all available models, including Anthropic, OpenAI, Cohere, Llama, Mistral, and Tabnine models.