WIMI Hologram Academy: Research Progress in VR Panoramic Video Transmission
WIMI Hologram Academy, working in partnership with the Holographic Science Innovation Center, has written a new technical article describing its research progress in VR panoramic video transmission. The article follows below:
In recent years, with the rapid development of computing, communication, and other technologies and the high-speed deployment of 5G networks, VR applications have flourished. Among them, panoramic video (also known as 360-degree video or immersive video) is one of the important components of virtual reality applications, both in academic research and in industrial applications. Scientists from WIMI Hologram Academy of WIMI Hologram Cloud Inc studied the progress in VR panoramic video transmission. According to survey data, the market share of panoramic video will continue to grow at an average annual rate of 34% from 2018 to 2024. Internationally renowned investment banks have also reported that the VR business based on panoramic video is growing rapidly, accounting for 40% of the expected total number of users in the VR application field (130 million). It is estimated that by 2025, the number of VR panoramic video users will reach nearly 200 million.
Unlike traditional video, which is presented only on a two-dimensional plane, panoramic video is a new immersive media application that lets users view the full scene of a 360-degree spherical video and freely switch the field of view (FOV) during playback. Today, users can play panoramic video on computers, smartphones, head-mounted displays (HMDs), and other devices. To give users a better quality of experience (QoE), the increased field of view range also means higher resolution and bandwidth requirements. For entry-level panoramic video, the full-screen resolution is 8K, the monocular resolution is 1920*1920, and the network bandwidth requirement is about 100 Mbps. Transmitting such a huge amount of data is a daunting challenge.
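To give a feel for the scale of the problem, the following minimal sketch estimates the uncompressed data rate of an 8K panoramic stream and compares it with the roughly 100 Mbps figure quoted above. The frame size (7680x3840), frame rate (30 fps), and pixel format (8-bit 4:2:0, 12 bits per pixel) are illustrative assumptions and do not come from the article.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel):
    """Uncompressed video data rate in bits per second."""
    return width * height * fps * bits_per_pixel


# Assumed parameters (not from the article): 8K ERP frame, 30 fps, 8-bit 4:2:0.
WIDTH, HEIGHT = 7680, 3840
FPS = 30
BITS_PER_PIXEL = 12

raw_bps = raw_bitrate_bps(WIDTH, HEIGHT, FPS, BITS_PER_PIXEL)
target_bps = 100e6  # the ~100 Mbps requirement quoted in the article
compression_ratio = raw_bps / target_bps

print(f"raw rate: {raw_bps / 1e9:.2f} Gbit/s")
print(f"compression needed for 100 Mbps: about {compression_ratio:.0f}:1")
```

Under these assumptions the raw stream is on the order of 10 Gbit/s, so fitting it into about 100 Mbps requires compression of roughly two orders of magnitude, which is why mapping, coding, and transmission all receive so much research attention.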
Back in 2018, the Moving Picture Experts Group (MPEG) had already standardized panoramic video (MPEG-I), with the Joint Video Exploration Team (JVET) and High Efficiency Video Coding (HEVC) providing relevant support for panoramic video transmission.
1. The difference between panoramic video and traditional video
Panoramic video differs from traditional video in its overall presentation. Because the video covers the full 360-degree range while the user's field of view is limited, users can only watch a small part of the picture at any moment. Providing fast, smooth, high-resolution video while saving bandwidth has therefore made panoramic video transmission a research hotspot. Panoramic video has the following advantages over traditional video:
(1) Strong interaction
In a VR panoramic video scene, interactive content such as text, charts, photos, and web pages can be inserted, so that viewers can interact deeply with display content set in advance, effectively improving the appeal of the video content.
(2) Stronger sense of presence
In traditional video shooting, various shots can be switched according to the video content. VR panoramic video, however, is presented in the first person: the viewer is placed in a “real” environment, control is very convenient, and the upward and downward viewing angles are not limited, creating a strong sense of being present in the scene.
(3) New digital marketing model
The era of VR panoramic technology has begun, providing a new marketing service model for major enterprises. More enterprises will move from the flat, two-dimensional era into the three-dimensional virtual digital era. Digital marketing is expected to change the traditional business model and create brand-new industrial value for the development of enterprises.
VR panoramic video combined with 3D effects brings immersion and represents a qualitative leap over traditional video. Its application field is also very broad, so it is worth trying and exploring.
2. Research and outlook of VR panoramic video transmission
(1) The choice of panoramic video mapping format has a significant impact on encoding and other stages. Research in this area has developed from the initial single equirectangular projection (ERP) mapping to today's multiple trade-off formats, but an important remaining challenge is the oversampling or undersampling that occurs in the mapping process. To address this, elements such as content features, object movement, viewport prediction, and user behavior features are taken into consideration in the mapping format. By assigning more pixels to the viewport and salient elements, the mapping is enhanced while bandwidth utilization is maintained. For example, CHEC mapping further improves on the mapping efficiency of HEC mapping by combining content features.
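The oversampling problem of ERP can be illustrated with a short calculation. In an ERP frame every pixel row has the same width, but the circle of latitude it represents shrinks with the cosine of the latitude, so rows near the poles are stretched. The sketch below is an illustration under that standard geometric property; the function names are hypothetical and the row convention (row 0 at the top) is an assumption.

```python
import math


def erp_row_latitude(row, height):
    """Latitude in radians of the center of a pixel row (row 0 is the top)."""
    return (0.5 - (row + 0.5) / height) * math.pi


def erp_oversampling(row, height):
    """Horizontal stretch of a row relative to the equator: 1 / cos(latitude)."""
    return 1.0 / math.cos(erp_row_latitude(row, height))


def erp_effective_fraction(height):
    """Share of ERP pixels that a uniform sampling of the sphere would need."""
    total = sum(math.cos(erp_row_latitude(r, height)) for r in range(height))
    return total / height


height = 1920
for row in (0, 320, 960):
    lat = math.degrees(erp_row_latitude(row, height))
    print(f"row {row}: latitude {lat:.1f} deg, stretch {erp_oversampling(row, height):.2f}x")
print(f"effective pixel fraction: {erp_effective_fraction(height):.3f}")
```

The effective fraction approaches 2/pi (about 0.64), meaning that more than a third of the pixels in an ERP frame carry redundant samples, concentrated near the poles. Mapping formats that redistribute pixels toward the viewport or salient content aim to spend that budget more usefully.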
(2) Because of the high resolution of panoramic video, both ends of the codec face a huge data compression load and computational complexity, which causes many problems for panoramic video codec technology. New video codec technology therefore needs to be developed to achieve more efficient compression, lower latency, and seamless picture switching, and so provide a higher quality of user experience. In improving codecs, methods such as motion estimation adaptation, sampling density correction, reprojection, and intra-frame prediction are taken into account. Moreover, the tile-based HEVC design enables a high level of parallelism in the encoder and decoder, providing a new idea for the development of codec technology.
(3) Among existing panoramic video quality assessment methods, subjective quality assessment provides subjective quality scores for data sets, while objective quality assessment aims to predict those subjective scores; each has its own characteristics and advantages. A standardized definition of the testing protocol required for subjective evaluation, statistical analysis of the effectiveness of objective evaluation on large-scale data sets, and quality assessment statistics for different users on different scales are all urgent problems to be solved. At present, most quality assessment considers mainly the influence of the camera motion trajectory and video content characteristics. Other factors are also relevant, including user factors such as screen sickness, physiological symptoms, gender, and age; equipment factors such as display devices and virtual reality audio and video equipment; network factors such as network delay and picture jitter; video content factors such as camera motion, frame rate, and mapping and coding; and video transmission factors such as viewport prediction error and playback buffering. Existing sampling-based methods that simulate the distribution of user attention achieve good results, while perception-based methods generally perform better but risk overfitting, so combining sampling and perception can be regarded as a future research trend. With the introduction of viewports, jointly considering the content of the viewport and of the whole sphere is also one of the directions.
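As one concrete illustration of a sampling-aware objective metric, the sketch below computes a latitude-weighted PSNR on ERP frames in the style of WS-PSNR. The article does not name this metric; it is used here only as an example, and the plain list-of-rows frame representation is an assumption chosen to keep the code self-contained.

```python
import math


def ws_psnr(ref, dist, max_value=255.0):
    """Latitude-weighted PSNR for two ERP luma frames given as lists of rows.

    Each row is weighted by the cosine of its latitude, so errors near the
    poles (which are oversampled in ERP) count less than errors at the equator.
    """
    height = len(ref)
    weighted_error = 0.0
    weight_sum = 0.0
    for j in range(height):
        weight = math.cos((j + 0.5 - height / 2) * math.pi / height)
        for r, d in zip(ref[j], dist[j]):
            weighted_error += weight * (r - d) ** 2
            weight_sum += weight
    wmse = weighted_error / weight_sum
    if wmse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / wmse)


# Tiny synthetic example: the same error placed at the pole or at the equator.
reference = [[100] * 8 for _ in range(4)]
pole_error = [[110] * 8] + [[100] * 8 for _ in range(3)]
equator_error = [[100] * 8, [110] * 8, [100] * 8, [100] * 8]
print(ws_psnr(reference, pole_error) > ws_psnr(reference, equator_error))
```

The comparison shows the intended behavior: an identical pixel error scores as less harmful when it sits in the stretched polar rows than when it sits near the equator. A metric like this still ignores where the user is actually looking, which is why viewport-aware and perception-based approaches are discussed as complementary directions.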
(4) Tile-based transmission is the mainstream of current research. It transmits tiles at different quality levels according to the user's viewport, ensuring high resolution where it matters while reducing bandwidth consumption. Because the user's head motion varies greatly during viewing, existing tile-based methods are insufficient to deal with complex viewport changes. Interactive tile selection should therefore be considered in prioritized panoramic video distribution, for example by dynamically selecting the number of tiles in the transmission or dynamically adjusting tiles for different network conditions; deep reinforcement learning can also be used for tile prefetch scheduling. Furthermore, for multipath transmission of high-resolution tiles, high-priority tiles can be delivered in hierarchical order over the best available paths to prevent out-of-order delivery. However, displaying tiles at different quality levels can create artifacts in the picture, especially at tile boundaries. Smoothing the quality transitions, by increasing the number of tiles or the number of quality levels, is therefore important for improving the user's viewing experience.
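The basic idea of viewport-dependent tile selection can be sketched as follows. This is a minimal illustration, not the method of any particular system: the 4x8 tile grid, the 100-degree field of view, the 30-degree margin, and the three quality labels are all assumptions chosen for the example.

```python
import math


def angular_distance(yaw1, pitch1, yaw2, pitch2):
    """Great-circle angle in degrees between two view directions."""
    y1, p1, y2, p2 = map(math.radians, (yaw1, pitch1, yaw2, pitch2))
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(y1 - y2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def select_tile_quality(view_yaw, view_pitch, rows=4, cols=8,
                        fov=100.0, margin=30.0):
    """Assign a quality label to each ERP tile from its distance to the viewport center.

    Yaw spans [-180, 180) left to right, pitch spans [90, -90] top to bottom.
    """
    quality = {}
    for r in range(rows):
        for c in range(cols):
            tile_yaw = -180.0 + (c + 0.5) * 360.0 / cols
            tile_pitch = 90.0 - (r + 0.5) * 180.0 / rows
            d = angular_distance(view_yaw, view_pitch, tile_yaw, tile_pitch)
            if d <= fov / 2:
                quality[(r, c)] = "high"      # inside the viewport
            elif d <= fov / 2 + margin:
                quality[(r, c)] = "medium"    # guard band for head motion
            else:
                quality[(r, c)] = "low"       # background
    return quality


tiles = select_tile_quality(view_yaw=0.0, view_pitch=0.0)
for r in range(4):
    print(" ".join(tiles[(r, c)][0].upper() for c in range(8)))
```

The medium-quality guard band is one simple way to soften both viewport prediction errors and the visible quality step at tile boundaries; more tiles or more quality levels would make the transition smoother at the cost of coding efficiency and scheduling complexity.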
(5) Advances in viewport prediction can greatly optimize key steps related to mapping and transmission. Current trajectory-based viewport prediction schemes can predict viewports with reasonable accuracy for up to 10 seconds, while content-based viewport prediction schemes improve accuracy, but neither reaches a high-quality level. To reduce long-term prediction error, the temporal and spatial features of the video image are therefore extracted using an appropriate codec (encoder-decoder) and convolutional LSTM architecture. Adding saliency features, user region-of-interest detection, and the user's head movement trajectory to the reference factors for viewport prediction is a further research direction.
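As a point of reference for trajectory-based prediction, the sketch below extrapolates the head orientation linearly from the last two samples. It is a deliberately simple baseline rather than the convolutional LSTM approach mentioned above, and the sample format (time in seconds, yaw and pitch in degrees) is an assumption.

```python
def wrap_degrees(angle):
    """Wrap an angle to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def predict_viewport(samples, horizon):
    """Linearly extrapolate the viewport center `horizon` seconds ahead.

    `samples` is a time-ordered list of (t, yaw, pitch) tuples with at least
    two entries; only the last two are used to estimate angular velocity.
    """
    (t0, yaw0, pitch0), (t1, yaw1, pitch1) = samples[-2], samples[-1]
    dt = t1 - t0
    # Use the shortest signed yaw difference so crossing +/-180 is handled.
    yaw_rate = wrap_degrees(yaw1 - yaw0) / dt
    pitch_rate = (pitch1 - pitch0) / dt
    yaw = wrap_degrees(yaw1 + yaw_rate * horizon)
    pitch = max(-90.0, min(90.0, pitch1 + pitch_rate * horizon))
    return yaw, pitch


history = [(0.0, 170.0, 0.0), (1.0, 175.0, 1.0)]
print(predict_viewport(history, horizon=2.0))
```

A baseline like this tends to be adequate only over short horizons, because head motion rarely stays linear for long. That limitation is the motivation for combining trajectory history with content cues such as saliency and regions of interest.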
Founded in August 2020, WIMI Hologram Academy is dedicated to holographic AI vision exploration and researches basic science and innovative technologies, driven by human vision. The Holographic Science Innovation Center, in partnership with WIMI Hologram Academy, is committed to exploring the unknown technology of holographic AI vision, attracting, gathering, and integrating relevant global resources and superior forces, promoting comprehensive innovation with scientific and technological innovation as the core, and carrying out basic science and innovative technology research.