AARS, ACRS 1999, Poster Session 5

PCA Colour Image Compression Using Vector Quantization

S. Chitwong(1), A. Somboonkaew(1,2), F. Cheevasuvit(1), K. Dejhan(1) and S. Mitatha(1)
(1)Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand
(2)Electro-Optics Lab, Faculty of Science, King Mongkut's Institute of Technology
Ladkrabang, National Electronics and Computer Technology
Bangkok 10520, Thailand

Abstract
Principal components analysis (PCA) is a powerful and efficient method for dimensionality reduction. A colour image generated from the first three principal components can therefore contain almost all of the variance of the original images, and many applications can be built on this PCA colour image. In general, the PCA colour image needs 24 bits to represent the colour of each pixel, which causes problems of memory storage and transmission channel bandwidth. To overcome these problems, this paper proposes a vector quantization technique to compress the PCA image. Only 8 bits, instead of 24, are used to encode each pixel of the PCA colour image. The initial 256 code vectors in the codebook are selected from the 3-dimensional cluster diagram of the first three components.

1. Introduction
Principal components analysis (PCA) was first presented by H. Hotelling [1] and is therefore sometimes called the Hotelling transform. PCA plays an important role in remote sensing applications such as change detection, classification, enhancement and data reduction. The PCA technique is based on the assumption that the variance of image data may be used as a measure of that image's information content [2]. A satellite data acquisition system is composed of many spectral sensors, and each object reflects differently in different spectral bands. To discriminate an object of interest, many spectral images therefore have to be considered at the same time. The PCA algorithm serves this requirement because it concentrates almost all of the information of the original images in the first three principal components, which have the highest variance. The PCA colour image obtained from the first three principal components is then used in image interpretation and classification. However, a colour image requires 24 bits to encode each pixel, which causes problems of bandwidth capacity and data storage. To solve these problems, data compression must be applied. Since vector quantization is an efficient method for data compression, it is selected to compress the PCA colour image. Only 256 code vectors are kept in the codebook, so the index of a code vector is encoded with 8 bits. The number of bits per pixel is therefore reduced by a factor of three.

2. PCA algorithm
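PCA is a linear transform that can be written as

Y = CX + B             (1)

where X is a p-dimensional vector of the original variables, Y is the vector of transformed variables, C is the transform matrix and B is a constant vector.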
The expected value of Y is obtained by

E(Y) = E(CX +B)
=CE(X) +B             (2)
=Cmx + B = my

Normally my is chosen to be 0, so that

Cmx +B = 0
-Cmx = B             (3)

By substituting equation (3) into equation (1),

Y = CX - Cmx
Y = C(X - mx)

Since my = 0, the covariance of Y is

E(YY^T) = Λ = E{[C(X-mx)][C(X-mx)]^T}
Λ = CE[(X-mx)(X-mx)^T]C^T             (4)
Λ = CΣC^T

where Σ is the (p x p) covariance matrix of the x-variables and Λ is the covariance matrix of the uncorrelated y-variables.

Λ is a diagonal matrix whose elements are the eigenvalues of Σ, and the rows of C are the corresponding eigenvectors. The three eigenvectors with the highest variance (the largest eigenvalues) are selected to create the principal component images. These three images are assigned to the colours R, G and B to create a colour composite PCA image.
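
The transform described in this section can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pca_transform` and `colour_composite` are hypothetical, the input is assumed to be an array of shape (p, rows, cols), and the linear stretch of each component to 0..255 is an assumption, since the paper does not say how the components are scaled to 8 bits per channel.

```python
import numpy as np

def pca_transform(bands):
    """Return (components, eigenvalues) for bands of shape (p, rows, cols)."""
    p, rows, cols = bands.shape
    X = bands.reshape(p, -1).astype(np.float64)  # each column is one pixel vector X
    mx = X.mean(axis=1, keepdims=True)           # mean vector mx
    Xc = X - mx                                  # X - mx, so that my = 0
    cov = Xc @ Xc.T / Xc.shape[1]                # (p x p) covariance of the x-variables
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # reorder to descending variance
    eigvals = eigvals[order]
    C = eigvecs[:, order].T                      # rows of C are the eigenvectors
    Y = C @ Xc                                   # Y = C(X - mx)
    return Y.reshape(p, rows, cols), eigvals

def colour_composite(components):
    """Stack PC1, PC2, PC3 as R, G, B (assumed linear stretch to 0..255)."""
    channels = []
    for pc in components[:3]:
        lo, hi = pc.min(), pc.max()
        scale = 255.0 / (hi - lo) if hi > lo else 0.0
        channels.append(np.round((pc - lo) * scale).astype(np.uint8))
    return np.stack(channels, axis=-1)           # shape (rows, cols, 3)
```

Because the rows of C are orthonormal eigenvectors, the covariance of the returned components is diagonal, which is the property expressed by equation (4).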

Here, six images from the Thematic Mapper system of Landsat, shown in Fig. 1, have been used in the PCA process. The corresponding six components from PCA are shown in Fig. 2. The colour image obtained from the first three components is presented in Fig. 3. This colour image contains 95 percent of the total variance.
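
The share of total variance carried by the first three components is the sum of the three largest eigenvalues divided by the sum of all eigenvalues. The short sketch below shows that arithmetic; the function name `variance_retained` and the eigenvalues are hypothetical, chosen only so that the example reproduces a 95 percent share, and are not the paper's data.

```python
def variance_retained(eigenvalues, k=3):
    """Fraction of the total variance carried by the k largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    return sum(ordered[:k]) / total

# Hypothetical eigenvalues for a six-band scene (illustration only).
example = [60.0, 25.0, 10.0, 3.0, 1.5, 0.5]
print(f"{variance_retained(example):.2%}")  # prints 95.00%
```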


Fig. 1 Original images



Fig. 2 Six component images from PCA



Fig. 3 PCA colour image containing 95 percent of the total variance
(PC1 = red, PC2 = green, PC3 = blue)

3. Vector Quantization
Image compression using vector quantization (VQ) is a powerful lossy compression technique. The VQ algorithm creates a codebook, which is a finite set of code vectors used to represent the input vectors. These code vectors are generated under the constraint of minimum distortion. The best-known codebook design technique is that of Linde, Buzo and Gray [3]. Here the 2^24 colour shades of the PCA colour image are compressed into 256 colours. The initial 256 code vectors are therefore selected as the 256 most frequently occurring colours of the PCA colour image, found via the 3-dimensional cluster diagram. Consequently, the computation time of the code vector training process is reduced. The 256 trained code vectors are kept in the codebook. The VQ encoder compares each colour of the PCA image with all 256 code vectors in order to generate an index under the minimum distortion criterion. On the decoder side, the index is used to point to the location of the corresponding colour in the codebook. The output vector from the decoder is an approximation of the colour of the corresponding pixel. The encoder and decoder of VQ are shown in Fig. 4.
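
The procedure above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, counting exact colour occurrences with `np.unique` stands in for reading the most frequent colours off the 3-dimensional cluster diagram, distortion is taken to be squared Euclidean distance, and training runs a fixed number of generalised Lloyd iterations rather than the stopping rule of [3].

```python
import numpy as np

def initial_codebook(pixels, size=256):
    """Use the `size` most frequent colours as initial code vectors.

    pixels: uint8 array of shape (N, 3), one row per pixel.
    """
    colours, counts = np.unique(pixels, axis=0, return_counts=True)
    order = np.argsort(counts)[::-1][:size]      # most frequent first
    return colours[order].astype(np.float64)

def encode(pixels, codebook):
    """Index of the minimum-distortion code vector for every pixel.

    Builds an (N, K) distance table, so large images should be
    processed in chunks. uint8 indices assume at most 256 code vectors.
    """
    diff = pixels[:, None, :].astype(np.float64) - codebook[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1).astype(np.uint8)

def train(pixels, codebook, iterations=10):
    """Refine the codebook by alternating assignment and centroid update."""
    codebook = codebook.copy()
    for _ in range(iterations):
        idx = encode(pixels, codebook)
        for k in range(len(codebook)):
            members = pixels[idx == k]
            if len(members):                     # leave empty cells unchanged
                codebook[k] = members.mean(axis=0)
    return codebook

def decode(indices, codebook):
    """Look up each index in the codebook to approximate the original colour."""
    return np.clip(np.round(codebook[indices]), 0, 255).astype(np.uint8)
```

For an image `rgb` of shape (rows, cols, 3), a typical call sequence would be `pixels = rgb.reshape(-1, 3)`, then `cb = train(pixels, initial_codebook(pixels))`, `idx = encode(pixels, cb)` on the transmitter side and `decode(idx, cb).reshape(rgb.shape)` on the receiver side.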


(a) Transmitter



(b) Receiver
Fig. 4 Block diagram of VQ process

Applying the VQ method with the defined initial code vectors to the image in Fig. 3 gives the compressed image shown in Fig. 5(a), with a mean square error of 6.52. Fig. 5(b) is the result obtained with random initial code vectors; its mean square error is 129.64.


(a) Defined initial code vectors



(b) Random initial code vectors
Fig. 5 Compressed PCA image with the compression ratio 3:1.
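
The two figures of merit used here, mean square error and compression ratio, can be computed as in the sketch below. The function names are hypothetical, and the paper does not state how its mean square error is averaged; this sketch assumes an average over all pixels and all three colour channels. The optional codebook term is also an assumption added for illustration: the 3:1 ratio counts only the 8-bit indices against the 24-bit pixels.

```python
import numpy as np

def mean_square_error(original, reconstructed):
    """Mean squared difference over all pixels and channels (assumed averaging)."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return float(np.mean(diff ** 2))

def compression_ratio(n_pixels, codebook_size=256, with_codebook=False):
    """Ratio of 24-bit raw size to 8-bit index size.

    The 8-bit index assumes a codebook of at most 256 entries. With
    with_codebook=True the codebook itself (24 bits per entry) is
    counted as part of the compressed data.
    """
    raw_bits = n_pixels * 24
    coded_bits = n_pixels * 8
    if with_codebook:
        coded_bits += codebook_size * 24
    return raw_bits / coded_bits
```

For a 512 x 512 image, `compression_ratio(512 * 512)` gives 3.0, and counting the codebook lowers the ratio only slightly because 256 code vectors occupy 768 bytes.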

4. Conclusion
PCA has been shown to be a powerful method for the interpretation of multispectral images, especially in colour composite form. However, each colour image needs a large amount of memory storage and channel bandwidth. This paper therefore presents a vector quantization method to compress the PCA colour image. The initial code vectors in the vector quantization codebook are defined by the most frequently occurring colours in the PCA image, in order to reduce both the computation time and the mean square error. The compressed colour image is suitable for transmission to remote organizations via the public communication system, so end users can make wide use of the compressed PCA colour image for their own objectives.

References
  [1] H. Hotelling, "Analysis of a complex of statistical variables into principal components," J. Educ. Psych., vol. 24, pp. 417-441, 1933.
  [2] S.K. Jenson and F.A. Waltz, "Principal components analysis and canonical analysis in remote sensing," Proc. Am. Soc. of Photogrammetry, pp. 79-143, 1979.
  [3] Y. Linde, A. Buzo and R.M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Commun., vol. COM-28, pp. 84-95, Jan. 1980.
  [4] F. Cheevasuvit, K. Dejhan, S. Mitatha and S. Wongkharn, "Data compression using vector quantization and Huffman coding for satellite imagery," Proc. of the 16th CRS, pp. E1-1 to E1-5, 1995.