Block transform coding divides an image into blocks, applies a mathematical transform that maps each block into another domain, and then encodes the result. It is a form of transform coding. Because it is usually combined with quantization, it is a lossy data compression technique.
Commonly used transforms include the discrete Fourier transform (DFT), the discrete cosine transform (DCT), the Walsh-Hadamard transform, and the Karhunen-Loève (K-L) transform.
Image transform coding converts an image described by pixels in the spatial domain into a transform domain, where it is represented by transform coefficients.
Most natural images consist largely of flat regions and regions whose content changes slowly, so most of their energy lies in the DC and low-frequency components and little in the high frequencies. A suitable transform therefore converts the scattered distribution of image energy in the spatial domain into a relatively concentrated distribution in the transform domain, which removes redundancy. Combined with quantization, zigzag ("Z") scanning, entropy coding, and other coding techniques, this yields effective compression of the image.
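As a rough illustration of the zigzag scan mentioned above, the following MATLAB sketch builds a scan order for an 8 × 8 block and applies it to a small hand-made block of quantized coefficients. The variable names (order, Q, scanned) and the sample values are assumptions made for this example, and the ordering follows the JPEG-style convention, which may differ in detail from other codecs.

```matlab
% Sketch: JPEG-style zigzag scan order for an N-by-N block (names are illustrative).
N = 8;
[col, row] = meshgrid(1:N, 1:N);      % row(i,j) = i, col(i,j) = j
d = row + col;                        % index of the anti-diagonal, 2..2N

% Walk odd anti-diagonals with the row increasing and even ones with the row decreasing.
key = row;
key(mod(d, 2) == 0) = -row(mod(d, 2) == 0);

% Sorting by (anti-diagonal, key) gives the linear indices in scan order.
[~, order] = sortrows([d(:), key(:)]);

% Hypothetical quantized block: only a few low-frequency coefficients survive.
Q = zeros(N);
Q(1, 1) = 30;
Q(1, 2) = -3;
Q(2, 1) = 2;

scanned = Q(order);                            % 64-by-1 vector in zigzag order
lastNonzero = find(scanned ~= 0, 1, 'last');   % everything after this index is zero
```

Because the scan visits low frequencies first, the nonzero coefficients cluster at the start of the vector and the long tail of zeros can be coded compactly by the entropy coder.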
The basic idea is to decompose the image into sub-blocks of 8 × 8 or 16 × 16 pixels, apply a separate DCT to each sub-block, and then quantize and encode the resulting coefficients. The complexity of the algorithm rises sharply as the sub-block size increases, so 8 × 8 is usually used in practice, although a larger sub-block can significantly reduce blocking artifacts.
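The per-block transform described above can be sketched without any toolbox by building the orthonormal DCT-II matrix directly. This is a minimal sketch under stated assumptions: magic(16) stands in for real pixel data, the names T, coef and rec are chosen for the example, and quantization and encoding are left out.

```matlab
% Sketch: 8x8 block DCT using an explicit orthonormal DCT-II matrix.
N = 8;
[k, n] = ndgrid(0:N-1, 0:N-1);                  % k = frequency index, n = sample index
T = sqrt(2/N) * cos(pi * (2*n + 1) .* k / (2*N));
T(1, :) = sqrt(1/N);                            % DC row uses a different scale factor

img = double(magic(16));                        % stand-in image, size a multiple of 8
coef = zeros(size(img));
rec = zeros(size(img));

for r = 1:N:size(img, 1)
    for c = 1:N:size(img, 2)
        blk = img(r:r+N-1, c:c+N-1);
        C = T * blk * T';                       % forward 2-D DCT of one block
        coef(r:r+N-1, c:c+N-1) = C;
        rec(r:r+N-1, c:c+N-1) = T' * C * T;     % inverse 2-D DCT of the same block
    end
end

maxErr = max(abs(rec(:) - img(:)));             % round-off only, since nothing was quantized
```

Since T is orthogonal, the inverse is simply its transpose, and without quantization the reconstruction differs from the input only by floating-point round-off.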
The DCT is now a basic building block of mainstream video coding frameworks. Because the form of the DCT does not depend on the input signal and fast algorithms exist for it, HEVC retains the integer DCT of H.264 and extends it to transforms of several sizes. In addition, to adapt to the distribution of residuals under different prediction modes, HEVC also introduces the discrete sine transform (DST).
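To give a feel for what an integer DCT looks like, the sketch below uses the 4 × 4 forward core transform matrix commonly given for H.264. It is an illustration only: the residual block X is made up, and the scaling that a real codec folds into quantization is applied here as an explicit diagonal matrix D so that the round trip can be checked.

```matlab
% Sketch: H.264-style 4x4 integer core transform (scaling shown explicitly).
Cf = [1  1  1  1;
      2  1 -1 -2;
      1 -1 -1  1;
      1 -2  2 -1];

X = [ 5 11  8 10;
      9  8  4 12;
      1 10 11  4;
     19  6 15  7];          % hypothetical residual block

Y = Cf * X * Cf';           % forward transform: integer arithmetic only

G = Cf * Cf';               % rows are orthogonal but not unit length
D = diag(1 ./ diag(G));     % per-row scaling that a codec would merge into quantization

Xrec = Cf' * D * Y * D * Cf;   % exact inverse once the scaling is applied
```

The rows of Cf are mutually orthogonal, so G is diagonal, which is why a simple per-coefficient scaling is enough to invert the transform.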
Fourier analysis shows that a signal can be expressed as a superposition of sine and cosine waves with different amplitudes and frequencies. A transform that uses only the cosine terms is a cosine transform, and when the input signal is discrete it is the discrete cosine transform (DCT).
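For reference, one common way to write the discrete cosine transform is the orthonormal DCT-II shown below. The symbols x_n, X_k and α(k) are notation introduced here, and other texts use different scale factors.

```latex
X_k = \alpha(k) \sum_{n=0}^{N-1} x_n \cos\!\left[\frac{\pi (2n+1) k}{2N}\right],
\qquad k = 0, 1, \dots, N-1,
\qquad
\alpha(k) =
\begin{cases}
\sqrt{1/N}, & k = 0 \\
\sqrt{2/N}, & k \ge 1
\end{cases}

x_n = \sum_{k=0}^{N-1} \alpha(k)\, X_k \cos\!\left[\frac{\pi (2n+1) k}{2N}\right],
\qquad n = 0, 1, \dots, N-1
```

The two-dimensional transform of an N × N block applies this one-dimensional transform first along the rows and then along the columns.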
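DFT transform:
clc                 % Clear the command window
close all           % Close all figure windows
clear all           % Clear all variables in the workspace

% File name assumed to match the image used in the DCT script
I = imread('Fig0831(a).tif');
figure(), imshow(I), title('original image');

% Crop leading rows/columns so both dimensions are multiples of 8
len = size(I, 1);
wid = size(I, 2);
lenLeft = mod(len, 8);
widLeft = mod(wid, 8);
I(1:lenLeft, :, :) = [];
I(:, 1:widLeft, :) = [];
A = I;              % Cropped original, kept as the RMSE reference

% Forward DFT of each 8x8 block
% (blkproc is deprecated in newer toolbox releases in favor of blockproc)
fun = @(x) fft2(double(x));
I = blkproc(I, [8 8], fun);

% Inverse DFT of each block, scaled to [0, 1] for display
fun = @(x) real(ifft2(x))/255;
I = blkproc(I, [8 8], fun);
figure(), imshow(I), title('image after DFT and inverse DFT');
imwrite(I, 'DFT.jpg');   % Note: JPEG storage adds its own loss

% RMSE between the reconstructed image and the original
B = imread('DFT.jpg');
a = double(B) - double(A);
rmse1 = sqrt(sum(a(:).^2) / numel(a))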
DCT transform:
clc                 % Clear the command window
close all           % Close all figure windows
clear all           % Clear all variables in the workspace

I = imread('Fig0831(a).tif');
figure(), imshow(I), title('original image');
%%
% Crop leading rows/columns so both dimensions are multiples of 8
len = size(I, 1);
wid = size(I, 2);
lenLeft = mod(len, 8);
widLeft = mod(wid, 8);
I(1:lenLeft, :, :) = [];
I(:, 1:widLeft, :) = [];
A = I;              % Cropped original, kept as the RMSE reference
%%
% DCT of each 8x8 block
fun = @(x) dct2(double(x));
I = blkproc(I, [8 8], fun);
%%
% Quantization with the 8x8 quantization table S
S = [ 16, 11, 10, 16,  24,  40,  51,  61;
      12, 12, 14, 19,  26,  58,  60,  55;
      14, 13, 16, 24,  40,  57,  69,  56;
      14, 17, 22, 29,  51,  87,  80,  62;
      18, 22, 37, 56,  68, 109, 103,  77;
      24, 35, 55, 64,  81, 104, 113,  92;
      49, 64, 78, 87, 103, 121, 120, 101;
      72, 92, 95, 98, 112, 100, 103,  99 ];
fun1 = @(x) fix(x./S);
I = blkproc(I, [8 8], fun1);
% Adding 1 before the log avoids log(0) for coefficients quantized to zero
figure(2), imshow(log(abs(I) + 1), []), title('DCT and quantization'), colormap(gray(4)), colorbar;

% Dequantization and inverse DCT, scaled to [0, 1] for display
fun2 = @(x) x.*S;
I = blkproc(I, [8 8], fun2);
fun = @(x) idct2(x)/255;
I = blkproc(I, [8 8], fun);
figure(), imshow(I), title('image after DCT, quantization and inverse DCT');
imwrite(I, 'DCT.jpg');   % Note: JPEG storage adds its own loss

% RMSE between the reconstructed image and the original
C = imread('DCT.jpg');
b = double(A) - double(C);
rmse2 = sqrt(sum(b(:).^2) / numel(b))
Comparing block transform coding with the DFT and with the DCT shows that the DCT packs more information into a given number of coefficients than the DFT, although the optimal transform in this respect is the K-L transform. The K-L transform gives very high compression efficiency but is difficult to implement; the DFT is simple to implement, but its compression efficiency is less satisfactory. Although the DCT is not the optimal transform, it is widely used because it is simple to compute.
The DCT has the following advantages: it can be implemented on a single integrated circuit, it packs the most information into the fewest coefficients, and it minimizes the blocking artifact, the block-like appearance that results when the boundaries between sub-images become visible. The DCT thus offers a good compromise between information packing ability and computational complexity.
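The difference in information packing can be seen on a single block. The sketch below is a toy experiment, not the scripts above: it takes a synthetic 8 × 8 ramp (an assumption standing in for a smooth image region), keeps only the K largest-magnitude coefficients of its DCT and of its DFT, and compares the reconstruction errors. The block, the value of K and all variable names are chosen for illustration.

```matlab
% Sketch: keep the K largest coefficients of one block under DCT and DFT.
N = 8;
K = 8;                                          % coefficients kept out of 64

% Orthonormal DCT-II matrix
[k, n] = ndgrid(0:N-1, 0:N-1);
T = sqrt(2/N) * cos(pi * (2*n + 1) .* k / (2*N));
T(1, :) = sqrt(1/N);

% Synthetic smooth block: a diagonal ramp from 0 to 224
[c, r] = meshgrid(0:N-1, 0:N-1);
blk = 16 * (r + c);

% DCT: zero all but the K largest-magnitude coefficients, then invert
Cd = T * blk * T';
[~, idx] = sort(abs(Cd(:)), 'descend');
Cd(idx(K+1:end)) = 0;
recDCT = T' * Cd * T;

% DFT: same procedure; real() discards the imaginary part left by truncation
Fd = fft2(blk);
[~, idx] = sort(abs(Fd(:)), 'descend');
Fd(idx(K+1:end)) = 0;
recDFT = real(ifft2(Fd));

rmseDCT = sqrt(mean((recDCT(:) - blk(:)).^2));
rmseDFT = sqrt(mean((recDFT(:) - blk(:)).^2));
fprintf('RMSE with %d coefficients: DCT %.4f, DFT %.4f\n', K, rmseDCT, rmseDFT);
```

On this ramp the DCT error comes out well below the DFT error, because the DFT implicitly treats the block as periodic and must spend coefficients on the jump at the block edges. Real image blocks will show smaller or larger gaps depending on their content.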