Project 41: Optimal Coding for 360-degree Video Streaming

Contact Information:

Assoc. Prof. Chenglin Li

Email: LCL1985@sjtu.edu.cn

Project Description and Objectives:

Methods for directly encoding 360-degree spherical images and videos into a bitstream are not yet mature. State-of-the-art 360-degree video streaming systems therefore rely on two-dimensional video coding technology (e.g., H.264/AVC, HEVC): they first project the 360-degree spherical surface onto a rectangular image/frame and then apply a two-dimensional video encoder to encode that rectangular frame into a bitstream. To support a displayed viewport with 4K pixel resolution (3840 × 2160), the resolution of the rectangular projection should be at least 12K (11520 × 6480), which means that the viewport covers only about 11% (one ninth) of the pixels in the rectangular projection. This has motivated recent research on an efficient streaming technique for 360-degree videos, the tile-based approach, in which the two-dimensional projection is divided spatially into small rectangular tiles, and the temporal sequence of tiles at each spatial location is treated as an individual source for video encoding. In this way, only the tiles that cover the viewport need to be transmitted to the user.
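As a rough illustration of tile selection, the following Python sketch lists the tiles of a uniform grid that overlap a viewport. It rests on simplifying assumptions that are not part of the project description: the viewport is treated as an axis-aligned rectangle inside the projected frame (projection distortion and wrap-around at the frame edges are ignored), and the function name, tile size, and viewport position are hypothetical.

```python
def tiles_covering_viewport(tile_w, tile_h, vx, vy, vw, vh):
    """Return (row, col) indices of the grid tiles that overlap the viewport.

    Assumption: the viewport is an axis-aligned rectangle with top-left
    corner (vx, vy) and size vw x vh, lying fully inside the frame.
    """
    first_col = vx // tile_w
    last_col = (vx + vw - 1) // tile_w
    first_row = vy // tile_h
    last_row = (vy + vh - 1) // tile_h
    return [(r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


frame_w, frame_h = 11520, 6480        # 12K projection from the text
tile_w, tile_h = 960, 720             # hypothetical 12 x 9 uniform tiling
viewport = (4000, 2000, 3840, 2160)   # hypothetical x, y, width, height

tiles = tiles_covering_viewport(tile_w, tile_h, *viewport)
total_tiles = (frame_w // tile_w) * (frame_h // tile_h)
sent_pixels = len(tiles) * tile_w * tile_h

print(len(tiles), "of", total_tiles, "tiles are transmitted")
print("fraction of frame pixels transmitted:", sent_pixels / (frame_w * frame_h))
```

With these illustrative numbers, 20 of the 108 tiles overlap the viewport, so about 18.5% of the frame's pixels are sent. That is more than the viewport itself occupies, because tiles on the viewport border are only partly visible.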

The tile size affects coding efficiency and transmission efficiency in opposite directions. If we decrease the tile size, transmission efficiency improves, since the area of the transmitted tiles that falls outside the viewport becomes smaller. On the other hand, a smaller tile size results in a larger number of tiles per frame, which increases the number of tile headers and reduces the efficiency of intra (spatial) and inter (temporal) prediction. Therefore, we need to study how to determine the optimal tile size by considering both the video content statistics and the users' viewports (including viewport prediction). Another promising research direction is to replace fixed tiling with a cover of the whole rectangular frame by rectangular tiles of different sizes, chosen so that, when encoded together, they achieve the optimal trade-off between coding and transmission efficiency.
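The trade-off described above can be made concrete with a toy cost model. The Python sketch below is only an illustration under invented assumptions, not the formulation the project aims to derive: square tiles, a viewport at a uniformly random offset, a bits-per-pixel penalty proportional to 1 / tile size (standing in for the loss of prediction across tile boundaries), and a fixed header cost per tile. The function name and all parameter values are hypothetical.

```python
def expected_bits(tile, vw=3840, vh=2160, bpp=0.1,
                  boundary_penalty=64.0, header_bits=2000.0):
    """Expected bits sent per frame for square tiles of side `tile` (toy model)."""
    # A viewport at a uniformly random offset overlaps on average
    # (vw / tile + 1) tile columns and (vh / tile + 1) tile rows.
    n_tiles = (vw / tile + 1) * (vh / tile + 1)
    pixels = n_tiles * tile * tile
    # Hypothetical coding-loss model: bits per pixel grow with the
    # boundary-to-area ratio of a tile, which scales as 1 / tile.
    coded_bpp = bpp * (1 + boundary_penalty / tile)
    # Total cost = coded pixel data + one header per transmitted tile.
    return pixels * coded_bpp + n_tiles * header_bits


candidates = [120, 240, 360, 480, 720, 960, 1440, 2160]
best = min(candidates, key=expected_bits)

for size in candidates:
    print(size, round(expected_bits(size)))
print("best tile size under this toy model:", best)
```

Under these made-up parameters the minimum falls at an intermediate tile size (480 among the candidates listed): very small tiles pay for headers and prediction loss, and very large tiles pay for pixels outside the viewport. A realistic study would replace both cost terms with models fitted to actual encoder output and viewport traces.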

Eligibility Requirements:

Basic knowledge of video codecs (e.g., H.264 and HEVC), signals and systems, digital signal processing, digital image processing, matrix analysis, and optimization theory.

Proficiency in more than one programming language; C/C++ and MATLAB preferred.

Main Tasks:

Develop an overall adaptive video streaming system framework for tile-based 360-degree videos.

Formulate the relationship between the coding efficiency and the tile size, and the relationship between the transmission efficiency and the tile size.

Solve for the optimal trade-off between coding efficiency and transmission efficiency by determining the optimal tile size.

Website:

Lab: http://min.sjtu.edu.cn/

School: http://english.seiee.sjtu.edu.cn/