Donald is a PhD and Algorithmic Engineer for Euclid. In an earlier article, he wrote about how video compression is measured. In this post, Donald looks at one objective measure of video compression performance: Peak Signal-to-Noise Ratio (PSNR).
The Joint Collaborative Team on Video Coding (JCT-VC) is set to finalize the ITU-T Rec. H.265 Video Coding Standard in January 2013. How should one compare the encoding performance of ITU-T Rec. H.265 (MPEG-H Part 2) to that of ITU-T Rec. H.264 (MPEG-4 Part 10)? Encoded rate and distortion (PSNR) are the important measurements when comparing moving picture quality. In a previous article, “Compression Measurement Explained”, we discussed Bandwidth Reduction and Compression Ratio Gain, both of which concern the encoded rate comparison. Moving on to distortion, we will discuss an objective measure, PSNR, which is often used as a surrogate metric for perceptual moving picture quality (PMPQ). Armed with metrics for encoded rate and distortion, we will present guidelines on how to conduct a fair video codec comparison.
2. How Do We Objectively Measure PMPQ?
The best way to assess perceptual moving picture quality is to ask someone (the viewer) to watch the decoded/reconstructed video and the original source video side-by-side on a good quality display, and to rate the picture quality of the encoded sequence relative to the original. However, subjective viewer quality assessment can be time-consuming and expensive. As a result, the video compression research community has used the Peak Signal-to-Noise Ratio (PSNR) as an objective measure of picture quality. Researchers acknowledge that PMPQ does not correlate well with PSNR or mean square error (MSE). Nonetheless, digital video codec algorithm developers have for many years reported video encoded bit rates and PSNRs as a way to measure PMPQ. Now that we have an objective way to measure PMPQ, let’s discuss how to conduct a fair video codec comparison.
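For readers who want to see how PSNR follows from MSE, here is a minimal Python sketch. It assumes 8-bit samples (peak value 255) and treats a frame as a flat list of luma values; the function names `mse` and `psnr` are illustrative and not taken from any codec's reference software. The formula used is PSNR = 10 · log10(peak² / MSE), in decibels.

```python
import math


def mse(original, decoded):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(original) != len(decoded) or not original:
        raise ValueError("frames must be non-empty and the same size")
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)


def psnr(original, decoded, peak=255):
    """PSNR in dB; peak=255 assumes 8-bit samples (an assumption of this sketch)."""
    err = mse(original, decoded)
    if err == 0:
        return math.inf  # identical frames: no distortion at all
    return 10.0 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    source = [100, 100, 100, 100]   # tiny synthetic "frame"
    decoded = [101, 99, 101, 99]    # every sample is off by one, so MSE = 1
    print(round(psnr(source, decoded), 2))
```

Because PSNR is a logarithm of the inverse MSE, halving the MSE raises PSNR by about 3 dB. A sequence-level figure is usually obtained by averaging the per-frame values.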
3. Guidelines For A Fair Video Codec Comparison
The following are guidelines for conducting a fair video codec comparison:
1. Select Different Types of Video Content and Formats:
Examples include sports, head and shoulders, video conferencing, motion picture, and computer-generated imagery (CGI). Action-packed motion pictures and sports content typically contain fast and complex motion, which provides a good test of the prediction engine within a modern-day video codec. This type of content is often difficult to compress because fast and complex motion is hard to model with conventional block-based motion compensation.
2. Configure each video codec in a similar manner:
a. Similar Group of Pictures (GOP) Coding Structures
i. Same number of frames in GOP
ii. Same Frame Coding Types
b. Similar Picture Prediction Structures
i. past reference frame(s)
ii. future reference frame(s)
iii. both past and future reference frames
c. Similar Rate Control Parameter Settings
i. The video quality range should be typical of the video compression/transmission application
ii. Compare the video codecs each with the same Quantization Parameter (QP).
• The same fixed QP should produce compressed videos with approximately the same visual quality
• JCT-VC uses QP = 22, 27, 32, and 37.
3. Plot rate-distortion point for each fixed QP.
For each video sequence encoding run where H.264 and H.265 are configured similarly, plot a point in a scatter plot of Video Sequence Average PSNR vs Average Encoded Rate.
a. By comparing the rate distortion plots of H.264 and H.265, we can determine which video codec has a higher coding efficiency. Figure 1 shows an example of such a plot.
Figure 1. PSNR-Y vs Video Enc Bit Rate for H.265/HEVC and H.264/AVC. (Traffic 2560×1920 video sequence). Observe that the red (square) plot is above the blue (diamond) plot especially at lower bit rates. H.265 has a higher PSNR at all video encoded bit rates and will probably have better PMPQ.
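The rate-distortion points described above can be assembled roughly as follows. This is a sketch under stated assumptions: the per-frame compressed sizes and per-frame PSNR values are taken as already measured for each fixed-QP run, and the helper names (`average_bitrate_kbps`, `rd_point`, `psnr_at_rate`) are hypothetical. The linear interpolation is only a simple way to read two curves at a common bit rate; it is not a method prescribed by JCT-VC.

```python
def average_bitrate_kbps(frame_sizes_bytes, frame_rate):
    """Average encoded rate in kbit/s for one encoding run."""
    duration_s = len(frame_sizes_bytes) / frame_rate
    return sum(frame_sizes_bytes) * 8 / duration_s / 1000.0


def rd_point(frame_sizes_bytes, frame_psnrs_db, frame_rate):
    """One (rate, distortion) point: average rate and sequence-average PSNR."""
    rate = average_bitrate_kbps(frame_sizes_bytes, frame_rate)
    avg_psnr = sum(frame_psnrs_db) / len(frame_psnrs_db)
    return (rate, avg_psnr)


def psnr_at_rate(curve, rate_kbps):
    """Linearly interpolate PSNR at rate_kbps on a list of (rate, psnr) points."""
    pts = sorted(curve)
    if len(pts) < 2:
        raise ValueError("need at least two rate-distortion points")
    if not pts[0][0] <= rate_kbps <= pts[-1][0]:
        raise ValueError("rate outside the measured range")
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        if r0 <= rate_kbps <= r1 and r1 > r0:
            return p0 + (p1 - p0) * (rate_kbps - r0) / (r1 - r0)
    raise ValueError("could not interpolate: duplicate rates only")


if __name__ == "__main__":
    # Placeholder values purely to exercise the functions, not measurements.
    point = rd_point([1000] * 30, [40.0] * 30, frame_rate=30.0)
    print(point)
    print(psnr_at_rate([(100.0, 30.0), (200.0, 40.0)], 150.0))
```

With one curve per codec (one point per fixed QP), the codec whose interpolated PSNR is higher at the same bit rate has the higher coding efficiency at that rate.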
By similarly configuring video codecs, choosing a wide range of video content types, and measuring encoded bit rates and corresponding PSNRs, one can fairly compare the video compression performance of competing video codecs.
 This author believes that there is no good substitute for comparing video sequences side-by-side. Many will argue that for relatively small encoding errors, PSNR correlates well as a PMPQ metric.
 GOP (Group of Pictures) is a pattern of coded picture types that recurs. GOP lengths can typically be up to 1 second. (An H.264 GOP example would be (I,B,B,P,B,B,P,B,B), which has a GOP length of 9.)
 Reference frames are previously encoded or decoded video frames that conventional motion-compensated prediction video encoders use to predict the current macroblock. Improved motion-compensated prediction is how H.264/AVC achieved superior compression performance to MPEG-2 and H.263.
 Rate control is the method used to limit the encoded video rate so that the compressed video can be transmitted over the allocated channel. Typically, video encoders adjust QP to control the encoded bit rate. As QP increases (decreases), the video encoded bit rate decreases (increases).
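To make the QP relationship concrete, the sketch below maps QP to the quantizer step size used in H.264/AVC, where the step size doubles for every increase of 6 in QP. The function name `h264_qstep` is illustrative, and the sketch assumes the 8-bit QP range of 0 to 51. A larger step size discards more detail, which is why a higher QP gives a lower bit rate (and a lower PSNR).

```python
# Step sizes for QP 0..5 in H.264/AVC; the pattern doubles every 6 QP values.
_BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def h264_qstep(qp):
    """Quantizer step size for an H.264/AVC QP in the 8-bit range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return _BASE_QSTEP[qp % 6] * (2 ** (qp // 6))


if __name__ == "__main__":
    # The four QP values used in the JCT-VC test conditions mentioned above.
    for qp in (22, 27, 32, 37):
        print(qp, h264_qstep(qp))
```

The exact rate reduction for a given QP change depends on the content and the encoder, so this shows only the direction and rough scale of the effect.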