IEEE ISM 23 - Temporal Layer Injection

Our paper titled “Temporal Layer Injection for Fast Bitrate Ladder Creation in Live Video Streaming” was accepted at the IEEE International Symposium on Multimedia (ISM) 2023. It will be presented at the conference, which takes place on 11-13 December 2023 in Laguna Hills, California, USA.

Video streaming systems aim to provide high-quality video adapted to clients’ device and network conditions. For this purpose, adaptive streaming architectures encode video content at a variety of quality levels, organized in a bitrate ladder. However, compressing a video into the multiple streams of a bitrate ladder is resource-intensive, which can become especially problematic in live streaming applications with real-time demands.

Therefore, this paper proposes a novel solution for fast bitrate ladder creation, and provides the requirements for implementation in the H.266/VVC standard. More specifically, the proposed method creates new intermediate Combined Streams by injecting the lowest temporal layers of a higher-quality Augmentation Stream into a lower-quality Base Stream. Since the frames in the lowest layers are used as references by the remaining layers, this procedure indirectly increases the quality of the frames in those untouched remaining layers as well.
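The layer selection described above can be illustrated with a minimal frame-level sketch. It is not the paper's implementation: it assumes a dyadic hierarchical GOP of 32 frames (so temporal IDs 0 to 5), indexes frames by display order, and represents each coded frame by a placeholder string. The helper names temporal_id and combine are hypothetical. A real Combined Stream is assembled from the coded units of the two VVC bitstreams and must satisfy the requirements given in the paper.

```python
def temporal_id(poc: int, gop_size: int = 32) -> int:
    """Temporal layer of a frame, ASSUMING a dyadic hierarchical GOP.

    gop_size must be a power of two. Frames at multiples of gop_size are
    in layer 0; the frames in between are assigned to layers 1, 2, ...
    """
    pos = poc % gop_size
    if pos == 0:
        return 0
    max_tid = gop_size.bit_length() - 1             # log2(gop_size)
    trailing_zeros = (pos & -pos).bit_length() - 1  # largest power of two dividing pos
    return max_tid - trailing_zeros


def combine(base, augmentation, max_injected_tid, gop_size=32):
    """Toy Combined Stream: frames with tid <= max_injected_tid come from the
    Augmentation Stream, all other frames come from the Base Stream."""
    if len(base) != len(augmentation):
        raise ValueError("both streams must contain the same number of frames")
    return [
        aug if temporal_id(poc, gop_size) <= max_injected_tid else b
        for poc, (b, aug) in enumerate(zip(base, augmentation))
    ]


# Placeholder "frames": B<n> is from the Base Stream, A<n> from the Augmentation Stream.
num_frames = 64
base = [f"B{poc}" for poc in range(num_frames)]
augmentation = [f"A{poc}" for poc in range(num_frames)]

# Injecting tid <= 0 .. tid <= 4 gives five intermediate streams under this assumption.
for max_tid in range(5):
    combined = combine(base, augmentation, max_tid)
    injected = sum(frame.startswith("A") for frame in combined)
    print(f"tid <= {max_tid}: {injected}/{num_frames} frames taken from the Augmentation Stream")
```

Under these assumptions, injecting only layer 0 replaces one frame in every 32, and each further layer doubles the number of replaced frames.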

We demonstrate that injecting more layers brings both the quality and the bitrate closer to those of the Augmentation Stream. The disadvantages of the Combined Streams are that their quality fluctuates more than that of the source streams, and that they are compressed less efficiently, comparable to going from the slower preset to the fast or faster preset in the VVenC encoder. Their main advantage is that they are generated at no significant additional computational cost.

In this way, the proposed method is of great benefit when generating a bitrate ladder of video streams under constrained computational resources.

Some examples are shown below. We use the BQSquare sequence, with a Base Stream encoded at QPB=32 and an Augmentation Stream encoded at QPA=22.

Example crops. Crops of the same frame from the Base Stream, the five intermediate Combined Streams, and the Augmentation Stream. With each additional injected layer, the quality of the Combined Stream increases.

Same example as the crops above, but showing the full frames instead of crops, sequentially in time. We advise watching it in full screen, so that minor differences are easier to perceive. The quality difference is most visible in the water, tile borders, fence and palm tree.

Graph with PSNR values. Example graph showing the streams’ corresponding PSNR values over time. Again, we observe that the quality gradually increases with each additional injected layer, from the Base Stream up to the Augmentation Stream.

Example Base Stream. The quality is relatively low. Injecting a temporal layer of the Augmentation Stream increases the quality (see video below).

Example Combined Stream (tid=0). Although only the lowest layer of the Base Stream is replaced (i.e., every 32nd frame), the quality increases in all other frames as well (compared to the video of the Base Stream above).

All experimental result files (.csv and .xlsx) can be found here, in addition to some full example videos (.266 and .mp4) and more graphs.

The source code is available on GitHub.