New Study Finds Over 96% of Computer Vision (CV) Teams Already Using Synthetic Data for Training and Testing of Visual Machine Learning Models
Datagen, the leader in synthetic data generation on a mission to bring data simulation to every computer vision engineer, announced the release of a new research study, “Synthetic Data: Key to Production-Ready AI in 2022,” exploring training data in the field of Computer Vision (CV). The study reveals a once-fragmented field beginning to coalesce around the promise of synthetic data to help mitigate frequent project delays and cancellations.
The study emphasizes that training data has become a significant stumbling block for computer vision professionals, who cited a number of data-related complications hindering their organizations’ progress in CV. Among the data-related issues experienced, the most prevalent were:
- Wasted time and/or resources caused by a need to retrain the system often (52%)
- Poor annotation resulting in quality issues (48%)
- Poor data coverage of the intended application’s domain (47%)
- Lack of sufficient amount of data (44%)
All four of these problems can seriously jeopardize a project’s progress, so their prevalence is a significant concern for CV teams. As a result, the overwhelming majority of computer vision teams struggle with frequent, lengthy project delays and even outright cancellations. Inadequate training data has led to an environment in which:
- 99% of respondents have experienced project cancellations
- 80% have experienced project delays lasting at least 3 months
- 33% have experienced project delays lasting 7 months or more
The frequency, length, and ubiquity of data-related project disruptions in the field of computer vision are immense. However, the study also revealed several trends that indicate a growing appetite for synthetic data. Most notably, a staggering 96% of computer vision teams reported already using synthetic data in the training and testing of their computer vision models.
Based on the survey findings, this surge in synthetic data adoption can be attributed to the fact that its many benefits are both broadly understood and broadly experienced by the computer vision community. For example, when asked about the primary motivation behind their organization’s use of synthetic data, CV teams cited testing, training, and addressing edge cases in nearly equal measure. Similarly, when asked about their first-hand experience, respondents reported the following benefits of synthetic data:
- Reduced time-to-production (40%)
- Elimination of privacy concerns (46%)
- Reduced bias (46%)
- Fewer annotation and labeling errors (53%)
- Improvements in predictive modeling (56%)
“Synthetic data is the future of data. This is the new way to control and consume the data our AI systems need,” said Ofir Chakon, founder and CEO of Datagen. “As simulation gets better over time, with all its benefits, it will take over the place of labor-intensive manual data collection that is no longer scalable at the speed the world is evolving.”
The survey, which was commissioned by Datagen and conducted by Wakefield Research, polled 300 computer vision professionals from 300 unique organizations across a variety of industries. The survey set out to better understand how computer vision teams obtain and use AI/ML training data for computer vision systems and applications, and how these choices affect their work. The accompanying report also features commentary and insights from leading industry experts and innovators.