WiMi to Work on Multi-Channel CNN-based 3D Object Detection Algorithm
WiMi Hologram Cloud, a leading global Hologram Augmented Reality (“AR”) technology provider, announced that its R&D team is working on a 3D object detection algorithm based on multi-channel convolutional neural networks (CNNs). The multi-channel network takes RGB, depth, and bird's-eye-view (BEV) images as inputs and regresses each object’s category, 3D size, and spatial location.
BEV images give a top-down view, perpendicular to the camera's viewpoint, and represent how objects are distributed in space. They are generated by projecting the point cloud and are used as a neural network input to improve 3D object detection accuracy. Processing the input point cloud data directly with a CNN addresses the problem of encoding and extracting features from unordered point clouds, allowing end-to-end regression of 3D bounding boxes. The algorithm extracts 3D proposal boxes from monocular images alone and estimates 3D bounding boxes, then combines laser point clouds with the visual information by projecting the point clouds into BEV images. This information is fed into a CNN, which fuses it to estimate the 3D bounding box. Fusing multiple sources of information improves the detection of objects in 3D space.
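The point cloud projection described above can be illustrated with a minimal sketch. It is not WiMi's implementation: the function name `point_cloud_to_bev`, the grid ranges, the 0.5 m cell size, and the choice of a max-height channel plus an occupancy channel are all assumptions made for illustration.

```python
import numpy as np

def point_cloud_to_bev(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), cell=0.5):
    """Project an (N, 3) array of x, y, z points onto a top-down grid.

    Returns a max-height map and an occupancy map (both hypothetical channel
    choices). x is assumed to point forward and y to the left, in metres.
    """
    points = np.asarray(points, dtype=np.float64)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]

    # Discard points that fall outside the chosen ground-plane window.
    inside = (x >= x_range[0]) & (x < x_range[1]) & (y >= y_range[0]) & (y < y_range[1])
    x, y, z = x[inside], y[inside], z[inside]

    n_rows = int(round((x_range[1] - x_range[0]) / cell))
    n_cols = int(round((y_range[1] - y_range[0]) / cell))

    # Map metric coordinates to integer cell indices, guarding the upper edge.
    rows = np.minimum(((x - x_range[0]) / cell).astype(np.int64), n_rows - 1)
    cols = np.minimum(((y - y_range[0]) / cell).astype(np.int64), n_cols - 1)

    # Keep the highest point per cell; unbuffered so repeated cells accumulate.
    height = np.full((n_rows, n_cols), -np.inf)
    np.maximum.at(height, (rows, cols), z)

    occupancy = np.isfinite(height)
    height[~occupancy] = 0.0  # empty cells get height 0; occupancy disambiguates
    return height, occupancy.astype(np.float64)
```

The two returned grids could be stacked as channels of a BEV image before being passed to a CNN; real systems often add further channels such as point density or intensity.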
WiMi’s 3D object detection algorithm, which can simultaneously identify the category, spatial location, and 3D size of objects, dramatically improves the accuracy and efficiency of object detection. The multi-channel object detection network extends the input to RGB, depth, and BEV images. First, the RGB, depth, and BEV images are fed into the network, and a CNN produces a feature map. A spatial pyramid pooling layer then generates a feature vector for each proposed region in the feature map, and a classifier and a regressor perform classification and position regression of the object. The classifier determines which class the features extracted from a proposal belong to. Finally, two fully connected layers perform multi-task regression to predict object classes and 3D bounding boxes.
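The pooling and prediction steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the company's network: the function names, the pyramid levels (1, 2, 4), max pooling, and the six-value box layout (3D location plus 3D size) are hypothetical, and the head weights are supplied by the caller rather than trained.

```python
import numpy as np

def spatial_pyramid_pool(region, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) proposal region into a fixed-length vector.

    The output length is C * sum(n * n for n in levels), independent of H and W.
    """
    _, h, w = region.shape
    pooled = []
    for n in levels:
        # Fractional bin edges so the n x n bins cover the whole region even
        # when H or W is not divisible by n (neighbouring bins may overlap).
        row_edges = np.linspace(0, h, n + 1)
        col_edges = np.linspace(0, w, n + 1)
        for i in range(n):
            r0, r1 = int(np.floor(row_edges[i])), int(np.ceil(row_edges[i + 1]))
            for j in range(n):
                c0, c1 = int(np.floor(col_edges[j])), int(np.ceil(col_edges[j + 1]))
                pooled.append(region[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)

def predict_heads(feature, w_cls, b_cls, w_box, b_box):
    """Apply two linear heads to one pooled feature vector.

    Returns class probabilities (softmax) and a 6-value box, assumed here to be
    (x, y, z, length, width, height).
    """
    logits = w_cls @ feature + b_cls
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    class_probs = exp / exp.sum()
    box = w_box @ feature + b_box
    return class_probs, box
```

Because the pooled vector has the same length for every proposal, regions of different sizes can share the same fully connected heads, which is the usual reason for placing a spatial pyramid pooling layer before them.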
3D object detection and recognition have always been crucial technologies in computer vision. They are the basis on which machines understand and interact with the outside world. 3D object detection technology can be widely applied in navigation, intelligent robotics, unmanned vehicles, and security monitoring.
With advances in 3D data acquisition technology, computing power, and deep learning, along with growing application demand, the research and application of 3D vision technology have received increasing attention. WiMi’s algorithm has broad application prospects in autonomous driving, intelligent robotics, AR/VR, remote sensing, biomedicine, and other fields.