Enterprise Investments in Natural Language Processing Surge in 2021 and Accuracy Remains the Top Concern, New Research Reveals
Gradient Flow’s Annual NLP Industry Survey Sheds Light on the Practices, Technologies, and Challenges Defining Natural Language Processing This Year
John Snow Labs, the Healthcare AI and NLP company and developer of the Spark NLP library, announced the results of the 2021 Natural Language Processing (NLP) Industry Survey, exploring how companies are using NLP. The results include a detailed analysis of NLP technologies being implemented by businesses, budgets, trends, widely used tools and cloud platforms, and use cases. The survey was conducted by Gradient Flow, an independent data science analysis and insights provider.
Although respondents spanned a variety of industries, company sizes, stages of NLP adoption, and geographic locations, the global survey showed NLP budgets increasing across the board. In fact, 60% of Tech Leaders indicated that their NLP budgets grew by at least 10%, while 33% reported a 30% increase, and 15% said their budget more than doubled. This is a steady increase compared to 2020, which suggests pandemic-related financial constraints may be stabilizing.
Although investments in NLP have been healthy, practitioners face some significant barriers to progress. Similar to last year’s results, accuracy was the most important requirement when evaluating an NLP solution. But when asked about key challenges faced when using cloud NLP services, Tech Leaders specifically cited difficulty in tuning (39%) and cost (36%) as the top two challenges. This is important, as models often need to be tuned and customized for their specific domains and applications. As more difficult use cases, such as Q&A and natural language generation, proliferate, accuracy will remain paramount for success.
Other key findings include:
- For the second consecutive year, Spark NLP was named the most popular NLP library, with 31% of respondents indicating they use it.
- Most practitioners use multiple libraries. In fact, 53% of respondents stated they used at least one of the following NLP libraries popular within the Python ecosystem: Hugging Face, spaCy, Natural Language Toolkit (NLTK), Gensim, or Flair.
- Among Tech Leaders, accuracy (40%) was the most important requirement when evaluating an NLP solution, followed by production readiness (24%) and scalability (16%).
- 54% of Tech Leaders singled out named entity recognition (NER) and 46% cited document classification as the primary use cases for NLP.
- For Healthcare industry respondents, entity linking / knowledge graphs (41%) and deidentification (39%) were among the top use cases.
- 83% of all survey respondents indicated that they use at least one of the four NLP cloud services listed (Google, AWS, Azure, IBM), in addition to NLP libraries.
- The top three data sources for NLP projects are text fields in databases, files (PDFs, docx, etc.), and online content.
- The top four industries represented by survey respondents are healthcare (17%), technology (16%), education (15%), and financial services (7%), which reflects overall industry adoption.
- The Spark NLP library is particularly dominant in the healthcare industry, in which 60% of respondents reported having adopted it.
“As we move into the next phase of NLP growth, it’s encouraging to see investments and use cases expanding, with mature organizations leading the way,” said Dr. Ben Lorica, Survey Co-Author and External Program Chair, NLP Summit. “Coming off of the political and pandemic-driven uncertainty of last year, it’s exciting to see such progress and potential in a field that is still very much in its infancy.”