MobiDev Shares Case Study on Small Dataset-Based Object Detection
Any machine learning project often starts with the question: “How much data is enough?” Contrary to the popular belief that machines learn only from large amounts of data, MobiDev’s AI team shares its experience of applying ML with a small dataset.
Our goal was to create a system capable of detecting objects for logistics. In our experiments, we used the following logic:
1. Take a Faster R-CNN pre-trained on the COCO 2017 dataset with 80 object classes.
2. Replace the 320 units in the bounding box regression head and the 80 units in the classification head with 4 and 1 units respectively, in order to train the model for 1 novel class. The bounding box regression head has 4 units per class to regress the X, Y, W, H values of the bounding box, where X, Y are the coordinates of the box center and W, H are its width and height.
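The head sizes in step 2 follow from simple arithmetic, which the minimal sketch below reproduces. The function names are hypothetical and the counting follows the article. Some implementations also reserve an extra classification output for background, which is not modeled here. The second helper shows how a center-format box (X, Y, W, H) maps to corner coordinates.

```python
def head_sizes(num_classes):
    """Output units of the two heads, counted the way the article counts them."""
    return {
        "classification": num_classes,      # one unit per class
        "box_regression": 4 * num_classes,  # X, Y, W, H for each class
    }


def center_to_corners(x, y, w, h):
    """Convert a center-format box (X, Y, W, H) to (x_min, y_min, x_max, y_max)."""
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)


coco_heads = head_sizes(80)   # heads of the COCO pre-trained model
novel_heads = head_sizes(1)   # heads after replacement for 1 novel class

print(coco_heads)   # {'classification': 80, 'box_regression': 320}
print(novel_heads)  # {'classification': 1, 'box_regression': 4}
```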
After some preliminary runs we picked the following training parameters:
* Model config: R50-FPN
* Learning rate: 0.000125
* Batch size: 2
* Batch size for RoI heads: 128
* Max iterations: 200
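For reference, the parameters above can be collected in a plain dictionary. This is only a sketch: the key names are illustrative and do not come from any particular framework's configuration schema. It also shows how short the run is in terms of images processed.

```python
# Illustrative key names; the values are the ones reported above.
training_params = {
    "model_config": "R50-FPN",
    "learning_rate": 0.000125,
    "batch_size": 2,
    "roi_head_batch_size": 128,
    "max_iterations": 200,
}

# Each iteration processes one batch, so the whole run sees this many images
# (with repeats whenever the training set contains fewer images than this).
images_processed = training_params["batch_size"] * training_params["max_iterations"]
print(images_processed)  # 400
```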
With the parameters set, we started looking into the most interesting aspect of training: how many training instances were needed to obtain decent results on the validation set. Since even 1 image contained up to 90 instances, we had to randomly remove part of the annotations to test a smaller number of instances.
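One way to reduce the annotations to a target number of instances is sketched below. It assumes annotations are stored as a mapping from image ID to a list of boxes. The function name, data layout, and fixed seed are hypothetical choices for illustration, not the team's actual procedure.

```python
import random


def subsample_annotations(annotations_per_image, target_instances, seed=0):
    """Randomly keep `target_instances` annotations across all images."""
    # Flatten to (image_id, annotation) pairs so every instance is equally
    # likely to be kept, regardless of which image it belongs to.
    pool = [
        (image_id, ann)
        for image_id, anns in annotations_per_image.items()
        for ann in anns
    ]
    if target_instances > len(pool):
        raise ValueError("not enough annotated instances to sample from")

    kept = random.Random(seed).sample(pool, target_instances)

    # Regroup the surviving annotations by image.
    subset = {}
    for image_id, ann in kept:
        subset.setdefault(image_id, []).append(ann)
    return subset


# Toy data: one image with 90 boxes and one with 30 (box values are dummies).
toy = {
    "img_a": [(i, i, i + 5, i + 5) for i in range(90)],
    "img_b": [(i, i, i + 5, i + 5) for i in range(30)],
}
subset_25 = subsample_annotations(toy, 25)
print(sum(len(anns) for anns in subset_25.values()))  # 25
```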
Maksym Tatariants, AI Solution Architect at MobiDev, explains: “What we discovered was that for our validation set with 98 instances, at 10 training instances we could pick up only 1-2 instances, at 25 we already got ~40, and at 75 and higher we were able to predict all the instances.”
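Counting how many validation instances a model "picks up" requires matching predicted boxes to ground-truth boxes. The article does not describe its matching rule, so the sketch below assumes a common one: a ground-truth box counts as detected if an unused prediction overlaps it with an intersection over union (IoU) of at least 0.5. The threshold, the greedy matching, and the function names are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_detected(ground_truth, predictions, iou_threshold=0.5):
    """Count ground-truth boxes matched by a distinct prediction (greedy)."""
    unused = list(predictions)
    detected = 0
    for gt in ground_truth:
        # Pick the unused prediction that overlaps this ground-truth box most.
        best = max(unused, key=lambda p: iou(gt, p), default=None)
        if best is not None and iou(gt, best) >= iou_threshold:
            unused.remove(best)  # each prediction can match only one box
            detected += 1
    return detected


gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
pred_boxes = [(1, 1, 10, 10), (50, 50, 60, 60)]
found = count_detected(gt_boxes, pred_boxes)
print(found, found / len(gt_boxes))  # 1 0.5
```

Dividing the detected count by the number of validation instances gives the share of instances picked up, for example roughly 40 out of 98 at 25 training instances.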
Increasing the number of training instances from 75 to 100 and 200 led to the same final training results. However, the model converged faster due to the higher diversity of the training examples.
To sum up, the results showed that the model was able to pick up ~95% of the instances in the validation dataset after fine-tuning with 75-200 object instances, provided the validation data resembled the training data. This shows that selecting proper training examples makes quality object detection possible in a limited-data scenario.