this article comes from hugi issue 12 ъ released in september 1998
      ДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДД


ДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДД
IMAGE COMPRESSION: BTC (BLOCK TRUNCATION CODE)                            DAVE
ДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДД



                          BTC (Block Truncation Code)
                          ДДДДДДДДДДДДДДДДДДДДДДДДДДД


                      Doc Written by David Palomar (DAVE)
                            email:  dave@bglinks.com



 1. INTRODUCTION
 ДДДДДДДДДДДДДДД

 This  doc  explains the BTC scheme for  compressing image data. Recently I was
 speaking with Biki and Aslan in #coders channel. We were talking about digital
 image  compression. Biki and Aslan (as other coders) told me they were looking
 for  docs  explaining  algorithms  to compress  image  data...  and they found
 NOTHING...  I'm  a lover of image compression  algorithms  and I know a lot of
 them...  so I promised these coders to help with these algos. I think this DOC
 will be helpful for coders... I hope :)


 2. BTC
 ДДДДДД

 First  of  all, BTC is a (new?)  image compression scheme that was proposed by
 Delp  and Mitchell, whose perfomance is comparable to the DTC, but which has a
 much  simpler  hardware implmentation. In Block  Truncation Code, an image is,
 firstly, segmented into NxN blocks of pixels. Well, in the most cases the size
 is  4x4,  but we can choose another  size,  like 8x8... Sencondly, a two level
 output  is choosen for every block and bitmap  is also encoded using 1 bit for
 every  pixel. The main idea is that  the two levels are selected independently
 for  every NxN block. So, encoding the  block is a simple binarization process
 because  every pixel of the block is truncated to 1 or 0 binary representation
 choosing the reconstruction level(the two level output).

 Example:  encoding 16 bytes (4x4 pixels block)  of an image, we should compute
 an  'statistics'  of  this block (the two  level  output)  and then, we should
 binarize this block with our two level output assigning a 0 or 1 binary... the
 results (if we're working with 256 grays levels) compressed:

 2 bytes indicating the two levels output
 2 bytes indicating the encoded bitmap (2bytes = 8+8bits = 16bits)

 Decoding  is  very simple because we  only place the appropiate reconstruction
 level for every bit indicating 0 or 1.

 Consider the following 4x4 block of pixels:

     і 79 80 75 77 і
 X = і 77 79 79 79 і      <ДДД This is our original image
     і 72 72 79 80 і
     і 77 79 79 87 і

 Think  that our "quantizer" computes the mean  of the block and the two levels
 output  ('a' and 'b'). The mean is 78 (in the block). So we can choose the two
 levels  output  by  pure  inspection (looking  at  the  original image). To be
 realistic,  the two levels are also computed using an algorithm. Look the next
 block and tell me if it is representative... Note that 1=a and 0=b.

     і 1 1 0 0 і
 Y = і 0 1 1 1 і          <ДДД This is our 'encoded' image
     і 0 0 1 1 і
     і 0 1 1 1 і

 ...  where  1  is  indicating that the pixel  is  greater  than the mean and 0
 indicates that it's below the mean.

 Finally, take a look the reconstructed block:

     і 80 80 73 73 і
 X'= і 73 80 80 80 і      <ДДД This is our 'decoded' image
     і 73 73 80 80 і
     і 73 80 80 80 і

 These are the statistics of the block (this was really computed):

 Mean       = 78
 Variance   = 3.351772
 Q          = 10
 Level 1(a) = 73
 Level 2(b) = 80

 As  I said before, we have 256 possibles values (grays), 256=2^8... this means
 we  have  8 bits to represent every pixel...  this tells us that the total bit
 rate is: (8+8+16)/16 = 2.0 bits / pixel ==> 2.00 bpp

 And  it  also tells us that a 64k bitmap  can be encoded to 16k only... yup :)
 ...  but  you  should  note that this  basic  BTC  will  introduce what I call
 artifacts (something similar to noise, but different), but don't worry because
 the  high frequencies of the image (SHARP) will be encoded with high fidelity.
 Other  algorithms I know typically blur sharp  edges. Well, it's time to speak
 about Quantizer Design... let's go!


 2.1 QUANTIZER DESIGN
 ДДДДДДДДДДДДДДДДДДДД

 In our example we made the thresholding process (to choose 2 levels output) by
 pure  inspection. The quantizer we want to  made MUST preserve the 1st and 2nd
 moments  of  a  particular block  minimizing  the  mean-squared reconstruction
 error. In fact, there are A LOT of ways to make a quantizer.

 Most  BTC  algos I know use  'moment  preserving' quantizers. This means these
 quantizers  preserve  a number of moments in  the block (i.e: the mean and the
 variance,  desviation...). These quantizers are  designed for finding the most
 important parameters in the image, the most important visual information... ya
 know:  the MEAN. There is no problem to  use the MEDIAN instead of the MEAN...
 this  would  be  another quantizer that,  probably,  will work better than the
 algo. that uses the MEAN.

 Well,  at  this  point a basic MATH will  be  used.  I hope you dont worry for
 this...  in  fact,  REAL PROGRAMMERS AREN'T AFRAID  OF  MATH.  So here are the
 equations  to  preserve the MEAN and VARIANCE  of  a particular nxn block of a
 image:

 m = nэ
 X1, X2, X3, ..., Xm       are the pixel values

        Д          m
        X  = (1/m) д Xi
                   i=1


        ДД         m
        Xэ = (1/m) д Xiэ
                   i=1


            Д      /Д\ э
       еэ = Xэ  - ( X )
            і      \ /    <ДДДДД This is the MEAN, and THEN SQUARE
            і
            АДДДДДДДДДДДДДДДДДДД First SQUARE, then MEAN

 Now, i'll define the threshold value (Xt) to binarize later:

                              і  a, if Xi < Xth
                        Xi =  і
                              і  b, if Xi >= Xth

                        where i= 1, 2, ..., m

 To  preserve the firsts moments, where q is the number of Xi's greater than or
 equal (>=) to the threshold (the mean)... in other words:

                         Д
                        mX  =  (m-q)a + qb

                         ДД
                        mXэ =  (m-q)aэ + qbэ

 ... and solving this system for 'a' and 'b' we have:
 (note that sqrt = square root)

                               Д
                          a  = X - еsqrt(q/(m-q))

                               Д
                          b  = X + еsqrt((m-q)/q)

 Now,  we must send the nxn bitmap  saying which pixels to reconstruct with the
 'a'  outlevel and which with the 'b' outlevel and also we have to include some
 info saying the two levels output: 'a' and 'b'.

 At  this  point,  we send 'a' and 'b'  directly  using 8 bits for everyone (if
 we're  using  256 grays levels). As I  said before, you'll obtain a 4:1 factor
 compression  (2.00  bpp) because 8 bits are used  for  'a' and 'b' and also 16
 bits for the bitmap = 8+8+16 = 32 bits = 4 bytes.

 A 64K image will be compressed (ALWAYS) to 16K only.

 Ok,  this  is ALL you need to make  a simple BTC encoder... it will work well,
 but... continue reading and you'll learn a lot MUCH MORE about BTC fascinating
 scheme for digital compression image. :)


 2.2. DESIGNING A FAST QUANTIZER
 ДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДД

 I  think  that  currently  you  have  noticed  that  squaring  and square root
 operations  are  required to implement the  quantizer...  and I think that you
 know  that these kind of operations are  slow (...and you would use the FPU...
 and this is NOT what one desires).

 An  approach  that avoids these operations is  what  is called as AMBTC, which
 stands for Absolute Moment Block Truncation Code. The quantizer is designed to
 preserve ABSOULUTE moments. I'll give you the equs:

                        m = nxn = nэ

                        Д          m
                        X =  (1/m) д  Xi
                                   i=1

                                  m        Д
                        а = (1/m) д  іXi - X і
                                  i=1


                            Д
                        a = X - (mа)/2(m-q)


                            Д
                        b = X + (mа)/2q

 Remember  that  q  is  the number of Xi's  greater  than  or equal (>=) to the
 threshold (the mean).


 3. INCREASING THE PERFOMANCE OF THE BTC ALGO.
 ДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДД

 We'll  see  some  non standard BTC  scheme  variations... these variations, of
 course,  will  increase the perfomance of our  encoder (in bpp), but also will
 increases  the  costs and the complexity  of  our algorithm, naturally. A very
 simple  technique to reduce the bit rate is to use fewer than 8 bits for the 2
 level  output (or the mean and the  variance)... for example using only 6 bits
 for each. This will result lower quality in the reconstructed image.


 3.1. BITMAP OMISSION
 ДДДДДДДДДДДДДДДДДДДД

 This is a very simple way to reduce bit rate... this is based on the fact that
 in  some  images  certain regions (blocks)  exist,  where the variance is very
 little  (of no meaning for our eyes)...  these regions are zones of the images
 where  is  not a considerable color change.  In other words, in some blocks of
 the  image,  'a' and 'b' are very  approximated  (variance is little) and, for
 this  reason,  there is NO need to send  'a'  and 'b' together, we should only
 send 'a' or 'b' with the block bitmap encoded. These techniques will work very
 well with images with large uniform zones.


 3.2. VQ FOR THE BITMAP
 ДДДДДДДДДДДДДДДДДДДДДД

 VQ  stands  for  Vector  Quantization,  and  this  is  another  digital  image
 compression  scheme. I assume you know this scheme, if you dont know email me,
 maybe i'll write another DOC explaining this scheme.

 This  method will consider the bitmap as  a m-dimensional vector. I remind you
 that  m=NxN.  For example for 4x4 blocks,  we have a 16-dimensional vector. It
 will  contain  16  values  (0  or 1): b0,  b1,  b2,  b3,  ..., b15. Well, this
 technique  is  based  in considering that,  in  fact,  when we're encoding the
 bitmap  (the  16 bit vector encoded) we  should  note that in the whole image,
 these  bitmaps  will  be repeated (or  very  approximated). This means you CAN
 GENERATE  a CODEBOOK with a training set  of bitmap vectors. Using this method
 you'll  achieve 1.5 bpp, which I think is a good bit rate. I think this method
 (combined with my series calculus algorithm) is better than the others schemes
 i'll comment here.

 J&I proposed me to send the 'a' and 'b' of each block of the whole image first
 and  then send the bitmaps of the whole image because doing it we would make a
 simple  Huffman  encoding for the bitmaps and  the bit rate could be reduced a
 lot. Think that bitmaps could be repeated A LOT in the encoding process:

 52,64, 23,54, 52,64,  23,67,  23,54.................
               ^               ^
               repeated        repeated


 Now,  I'll propose a simple BTC combined with VQ to make the text more easy to
 understand it.

 Example:  we're compressing a 64K image. We  want to use BTC combined with VQ,
 so  we  start  encoding the bitmap and also  finding  'a' 'b' and when we have
 encoded  the  bitmap, then we'll compare this  encoded  bitmap with all of our
 bitmaps  that  are in our particular  codebook  (containing a 256 m-vectors, 8
 bits).  Finally, we'll select the BEST aproximation  (usually this will be a 8
 bits  index  indicating the bitmap in codebook).  We'll send 'a' 'b' and the 8
 bit index.

 Got the idea? Very easy, I think.


 3.3 JOINT QUANTIZATION
 ДДДДДДДДДДДДДДДДДДДДДД

 I'll NOT explain extensively this method... but, take a look: this is based on
 the  fact that exact value of the mean isn't important in a high variance area
 (when  variance varies a lot), but it's important when variance does NOT vary.
 In  other  words, when variance is LITTLE  we have a couple of possible values
 for  the mean, but when variance is GREATER  we have only a few values for the
 mean.  Using  this technique, the mean  and  the variance is jointly quantized
 using only 10 bits for both. Got the idea?

 I dislike this method.... using it you'll obtain a bit rate of 1.63 bpp.


 4. ABS-BTC
 ДДДДДДДДДД

 ABS-BTC  stands  for Adaptive Block Size -  Block Truncation Code and, how the
 name  says,  this  scheme consists on  considering  the  block size when we're
 encoding  the  bitmap... coz fixing the block  size  to 4x4, for example, will
 work  well...  but it's also certain  that some specific images could compress
 much  more than using a fixed nxn for the entire bitmap. As I said before, the
 High  Freq.  zones  in  the  image  (sharp)  will  require  more info encoding
 (precission)  and the Low freq. zones in the images can be encoded which lower
 precission.

 This means we can vary the size of the block depending on what we're encoding.

 These are the steps to encode using ABS-BTC:

 ю Segment the image in 4x4 blocks.
 ю Compute 'a' and 'b'.
 ю Now,  every  block will be classified according  with the 'a' and 'b' levels
   output. We'll use 3 mode operations:

 Mode 1і
 ДДДДДДЩ
 Th1, where Th1 is a threshold if

                        іa-bі <= Th1

 We'll skip the bitmap and we send only the mean of the 4x4 block. Mode used in
 Low Freq. zones in the image.

 Mode 2і
 ДДДДДДЩ
 Th2, Th3, where Th2 and Th3 are both thresholds but different if:

                        Th1 < іa-bі < Th2

 ... AND...
                   ЪДД                           ДДї
                   і                               і
                   і    д іXi-aі +   діXi-bі       і
                   і         Д                Д    і
                   і  zz Xi< X       zz Xi >= X    і  <= Th3
                   і                               і
                   АДД                           ДДЩ

 where zz = for all the possible values.

 Then we'll send 'a', 'b' and the bitmap.

 Mode 3і
 ДДДДДДЩ

 We'll  select  this  mode  when Mode 1  and  Mode  3 are NOT satisfying. We'll
 segment  the  4x4 block to 2x2 block, then  we'll send 'a', 'b' and the bitmap
 for  the two segmented blocks. (Used in the  High Freq. zones in the image, we
 need precission.)

 ... and don't forget to send a header for every block indicating the mode that
 was  selected...  I  suggest you to use  2  bits  field into the bitmap header
 (you'll  NOT  introduce considerable artifacts  in the resulting reconstructed
 image... :).

 If your encoded bitmap is:    1101000100101110    = 16 bits = 2 bytes

 and you selected mode 1 then: 0101000100101110    = 16 bits = 2 bytes
                                 Дї
                                  і
                                  АД> 01 = 1 (field) 

                                 ... and NO considerable artifacts were
                                 introduced (we changed only a bit).

 5. SOME BASIC CODE?
 ДДДДДДДДДДДДДДДДДДД

 I'll  write  some  basic code for help you  to  get the basic idea... so don't
 expect so much. I'll write the standard BTC. I'm assuming we have read a file,
 and  have stored it in a bitmap pointer with reserved mem. I also assuming our
 bitmap  is 320x200 (64k). I put C language here coz many people know it, but I
 prefer Assembler, of coz ;)


 unsigned char *bitmap;
 unsigned int x,y,mean,a,b,q;
 float temp,a,b,sig;

 btc()
 {
     mean=0;
     es1=0;
     es2=0;

     for(y=0;y<4;y++){
         for(x=0;x<4;x++){
             mean+=bitmap[(y*320)+x];
             es1=es1+bitmap[(y*320)+x];
             es2=es2+pow(bitmap[(y*320)+x],2);
         }
     }

     mean=mean/16;
     es1=es1/16;
     es2=es2/16;
     sig=sqrt(es2-pow(es1,2));

     /* Compute q */
     q=0;
     for(y=0;y<4;y++){
         for(x=0;x<4;x++){
         if(bitmap[(y*320)+x]>=es1){ q++; }
         }
     }

     /* compute a and b output levels */
     if(q!=0){
         if(q!=16){ temp=sqrt(q/(float)(16-q)); }
         else { temp=4; }
         temp=temp*sig;
         a=(int)(es1-temp);
         temp=sqrt((float)(16-q)/q);
         temp=temp*sig;
         b=(int)(es1+temp);
     }

         /* Now, you have the 'a' and the 'b' output levels.
            So I leave for you to send the 'a' and the 'b' and also
            the encoded bitmap (2 bytes) to another buffer (or the
            same buffer, bitmap, if you want).
         */
 }


 6. LAST NOTES
 ДДДДДДДДДДДДД

 Well, at this moment I'm very tired...
 If you have a doubt or want to ask something, then email me:

 email: dave@bglinks.com or dave_es@yahoo.com

 If  more  than  10 coders email me answering  for  VQ or other known / unknown
 digital  image compression schemes, I promise  to release more DOCS... ...but,
 please,  dont ask me about my series calculus algo. for compressing images coz
 for  this moments it's a bit secret  (and, of course, currently I'm working in
 it).  I'm  a regular on #coders, so maybe  you'll  find me in that channel for
 talking about it.


 7. GREETS
 ДДДДДДДДД

 Special Greetings go to: Gandalf, Shock, Gyr, Kalms, Ravian, Altair
                          Yoghurt, Legend, Biki, Aslan, DjBlaze
                          Maza, Funktion, HarriH, ...

 ...  and,  of  course,  greets for all  regular  people  in  #coders and #trax
 channels.

 P.S.: Hey J&I! When We start a REAL project (a game, for example :).
       X-Static:  I  hope  you'll make some  cool  tunes  and graphics for next
       Euskal Party. ;)

                                                                - DAVE/PhyMoSys
                                                               dave@bglinks.com