
Introduction

      In the search for efficient image coding with optimal distortion at very low bit rates, recent research efforts have turned to hybrid coding techniques that adaptively encode the wavelet coefficients of multiresolution subimages by either scalar or vector quantization (VQ). The EZW algorithm ([6]) and a number of recently developed VQ algorithms ([9][14][15][17][18][19]) use the 2-D multiresolution discrete wavelet transform to decompose images into statistically similar subimages for efficient quantization and encoding at different scales. The advantage of wavelet-based subband coding lies in the fact that the probability distribution functions (pdf's) of the coefficients can be modeled as generalized Gaussian distributions with a proper choice of filter taps and other parameters. Such modeling allows prediction of the achievable maximum lossless compression and better design of lossy compression at optimal distortion.
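As context for the multiresolution decomposition discussed above, the following sketch (an illustration, not the authors' implementation) applies one level of a 2-D Haar wavelet transform; re-applying it to the LL subimage builds the multiresolution pyramid whose subband statistics are then modeled:

```python
import numpy as np

def haar_dwt2(x):
    """One level of a 2-D Haar wavelet transform.

    Returns the approximation subimage LL and the detail subimages
    LH, HL, HH, each half the size of the input in each dimension.
    """
    # Row transform: averages (low-pass) and differences (high-pass).
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    # Column transform applied to both row outputs.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, lh, hl, hh

# A multiresolution pyramid is built by re-applying the transform to LL.
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
ll, lh, hl, hh = haar_dwt2(image)
```

Because the normalized Haar transform is orthonormal, the energy of the input image is preserved across the four subimages, which is what makes subband-by-subband statistical modeling meaningful.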

      In the wavelet-based image coding algorithm using a new vector quantization technique, namely IAFC-VQ/AFLC-VQ [14][17][18], which utilizes adaptive neural network-based clustering algorithms [11][19], the target bit rate is controlled by the number of clusters created. The generation of clusters depends on a parameter related to the cluster radius set in the initialization process. This new technique is capable of coding and decoding the Visible Human color images at the original resolution of 2048 x 1216 pixels, unlike other current techniques, which use images subsampled to 512 x 512 pixels. The unique feature of the new technique is that it has been theoretically modeled to ensure minimum distortion even at very high compression ratios.
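The role of the cluster-radius parameter can be illustrated with a simplified leader-clustering sketch (hypothetical; the published IAFC-VQ/AFLC-VQ algorithms are more elaborate): shrinking the radius increases the number of clusters created, and hence the achievable bit rate:

```python
import numpy as np

def leader_cluster(vectors, radius):
    """Simplified leader clustering: each input vector joins the first
    sufficiently close centroid (updating it as a running mean) or, if
    no centroid lies within `radius`, seeds a new cluster.
    Smaller radius -> more clusters -> higher bit rate."""
    centroids, counts, labels = [], [], []
    for v in vectors:
        if centroids:
            d = [np.linalg.norm(v - c) for c in centroids]
            k = int(np.argmin(d))
            if d[k] <= radius:
                counts[k] += 1
                centroids[k] += (v - centroids[k]) / counts[k]  # running mean
                labels.append(k)
                continue
        centroids.append(v.astype(float).copy())  # seed a new cluster
        counts.append(1)
        labels.append(len(centroids) - 1)
    return np.array(centroids), np.array(labels)

rng = np.random.default_rng(1)
data = rng.standard_normal((200, 4))
coarse, _ = leader_cluster(data, radius=3.0)  # few clusters, low bit rate
fine, _ = leader_cluster(data, radius=1.0)    # many clusters, high bit rate
```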

      The original design and implementation of this new adaptive vector quantization, despite its optimality in minimizing distortion, was computationally intensive for large color images. Modifications of the current software design, as well as hardware implementation for faster execution, are currently in progress. A modified version of EZW, and a newly proposed combination of lossy wavelet decomposition using AFLC-VQ ([18]) for selected wavelet levels, indicate significant improvement in processing time. The downloading and decoding time for the lowest-resolution image is around 6 seconds, while the full-resolution image takes around 48 seconds (using a 400 MHz Pentium). We present here the results of high-fidelity compression of the Visible Human color images and demonstrate the feasibility of fast transmission of such images for interactive use over the Internet.

Main features in EZW and AFLC-VQ

EZW
      The main features of EZW include compact multiresolution representation of images by the discrete wavelet transform, zerotree coding of the significant wavelet coefficients providing compact binary maps, successive approximation quantization of the wavelet coefficients, adaptive multilevel arithmetic coding, and the capability of meeting an exact target bit rate with a corresponding rate distortion function (RDF). The details of this algorithm can be found in ([6]). The algorithm may not yield optimal distortion, but it does provide a practical and general high-compression algorithm for a variety of image classes.

      The core of EZW compression is the exploitation of self-similarity across different scales of an image's wavelet transform. In other words, EZW approximates the higher-frequency coefficients of a wavelet-transformed image. Because the wavelet transform coefficients carry information about both the spatial and the frequency content of an image, discarding a high-frequency coefficient leads to some image degradation in a particular location of the restored image rather than across the whole image. As with other wavelet-based techniques, EZW does not introduce the blocking artifacts inherent in windowed frequency-plane compression methods such as JPEG ([21]). Another property of wavelet transforms is that for a wide class of images the wavelet coefficients tend to decrease in magnitude at finer scales, while the number of coefficients at the finer levels grows as 2^(2j), where j is the number of decomposition levels. The larger coefficients require more bits to represent than the smaller ones. Thus the more frequent coefficients (the finer-scale coefficients) require fewer bits for their representation, which is the underlying principle of entropy coding. The original EZW algorithm introduced by Shapiro did not use this property efficiently. If it is properly exploited, as suggested by Amir Said and William Pearlman ([1]), the EZW algorithm not only efficiently approximates the details of an image but also allocates more bits to the less frequent components, implicitly implementing entropy coding while executing the main EZW algorithm. This can lead to a further reduction in execution time if the follow-up arithmetic coding used in the original EZW algorithm, which does not provide a significant increase in the compression ratio, is eliminated.
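The successive approximation idea at EZW's core can be sketched as follows (a simplified bit-plane refinement; the zerotree significance coding of the actual algorithm is omitted here):

```python
import numpy as np

def successive_approximation(coeffs, passes):
    """Bit-plane refinement in the spirit of EZW's successive
    approximation quantization: each pass halves the threshold and
    refines the reconstruction of the significant coefficients."""
    # Initial threshold: largest power of two not exceeding max |coefficient|.
    t = 2.0 ** np.floor(np.log2(np.max(np.abs(coeffs))))
    recon = np.zeros_like(coeffs, dtype=float)
    for _ in range(passes):
        residual = coeffs - recon
        significant = np.abs(residual) >= t
        # A newly significant coefficient is placed at the midpoint of
        # its uncertainty interval [t, 2t), with the residual's sign.
        recon[significant] += np.sign(residual[significant]) * 1.5 * t
        t /= 2.0
    return recon

rng = np.random.default_rng(2)
c = rng.standard_normal((16, 16)) * 10
# Maximum reconstruction error shrinks as more refinement passes are run.
err = [np.max(np.abs(c - successive_approximation(c, p))) for p in (1, 2, 4, 8)]
```

Stopping the passes early yields a coarse but usable reconstruction, which is exactly what makes the scheme attractive for progressive transmission.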

      This modified EZW algorithm can be very valuable for image transmission over the Internet, because the compression/decompression execution time decreases as the compression ratio gets higher, allowing rough-scale image previews before the full image is downloaded. However, AFLC-VQ is also capable of fast decoding in a progressive transmission scheme with a 400 MHz Pentium or equivalent as the client computer.

AFLC-VQ
      The AFLC-VQ technique also uses a multiresolution wavelet representation. However, the coding process uses an adaptive clustering technique, namely AFLC (Adaptive Fuzzy Leader Clustering) [11], as briefly described below. The main contribution of AFLC-VQ is the optimization of distortion at the clustering stage by integrating the concept of the fuzzy membership value ([7]) of the input pattern samples with the adaptive learning rate of an ART (Adaptive Resonance Theory)-type neural network ([2]). Such integration allows on-line generation of an optimal number of clusters instead of assuming a chosen number of clusters based on training sets of images. Incorporating fuzzy distortion measures into self-organizing neural network architectures provides a powerful tool for dynamic clustering, with significant application in vector quantization for speech or image data compression. By generating multiresolution codebooks from wavelet-decomposed subimages, the new clustering algorithm (AFLC) eliminates some of the problems encountered by many of the existing VQ algorithms that use k-means clustering, for example codebook initialization, getting trapped in local minima, and extensive search procedures. The details of AFLC and of a modified version known as IAFC (Integrated Adaptive Fuzzy Clustering) can be found in references ([11]) and ([20]), respectively. The advantage of AFLC-VQ over IAFC-VQ is that the former requires less computation time, although at the expense of more memory space. The use of AFLC or IAFC eliminates the need for codebook initialization, since the algorithm can generate an on-line codebook from the input vectors, following the characteristics of ART.
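The two ingredients named above, fuzzy membership values and an ART-style vigilance test that spawns clusters on-line, can be sketched roughly as follows (a hypothetical simplification, not the published AFLC algorithm; function names are illustrative):

```python
import numpy as np

def fuzzy_memberships(x, centroids, m=2.0, eps=1e-12):
    """Fuzzy c-means style membership of sample x in each cluster:
    u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)); memberships sum to 1."""
    d = np.linalg.norm(centroids - x, axis=1) + eps
    return 1.0 / ((d[:, None] / d[None, :]) ** (2.0 / (m - 1.0))).sum(axis=1)

def aflc_like_quantize(vectors, vigilance):
    """AFLC-flavored on-line clustering sketch: a vigilance (distance)
    test decides whether to create a new cluster, and the winner's
    update is weighted by its fuzzy membership value."""
    centroids = [vectors[0].astype(float).copy()]
    counts = [1]
    for x in vectors[1:]:
        C = np.array(centroids)
        d = np.linalg.norm(C - x, axis=1)
        if d.min() > vigilance:               # fails vigilance: new cluster
            centroids.append(x.astype(float).copy())
            counts.append(1)
            continue
        u = fuzzy_memberships(x, C)
        k = int(np.argmax(u))                  # winner = highest membership
        counts[k] += 1
        # Learning rate decays with cluster size, scaled by membership.
        centroids[k] += u[k] * (x - centroids[k]) / counts[k]
    return np.array(centroids)

rng = np.random.default_rng(3)
data = rng.standard_normal((100, 2))
```

Note how the codebook needs no initialization: the first input vector itself seeds the first codeword, mirroring the ART behavior described above.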

      A combination of this new approach to vector quantization with run-length and entropy coding will yield further compression while maintaining the same perceptual quality.
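The potential gain from such follow-up entropy coding can be estimated from the statistics of the quantizer indices; the sketch below (illustrative names and synthetic data) compares the zeroth-order entropy of a skewed index stream with fixed-length coding:

```python
import numpy as np

def entropy_bits(indices):
    """Zeroth-order Shannon entropy (bits/symbol) of quantizer indices:
    the lower bound an ideal entropy coder can approach."""
    _, counts = np.unique(indices, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# A skewed index stream, as produced when a few codewords dominate:
# entropy coding then beats fixed-length index coding.
rng = np.random.default_rng(4)
idx = rng.choice(8, size=10_000,
                 p=[0.5, 0.2, 0.1, 0.08, 0.05, 0.04, 0.02, 0.01])
fixed = np.ceil(np.log2(8))   # 3 bits/index with fixed-length codes
h = entropy_bits(idx)         # well below 3 bits/index here
```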

