$S@g·,                               g@S$$$
                 $$$$$$+-ù ú. `@g,          .ú-  -+g@ ½üýø^_
          +-Ä- - $$$$$$-- -Ä-$b,`$-- --- y '-- Ä- -$$ g@S$$$-Ä- ---+
           «¬¬«¬ $$$$$$ ¬¬«« $$l $ ¬¬«¬« $ «¬«¬¬«¬ $$ $$$$$$ «¬«¬¬«
           ¬¬«¬« $$$$$$ ¬«¬¬ $$$ l «¬«¬« $ ¬«¬¬«¬« $$ $(dt)$ ¬«¬¬«¬
          +-- Ä -l$$ü$$-Ä-- -$$$ $Ä- --- $`+,._  _,$l $$$$$$-- Ä- -+
                 $$$y$$      Sü',$       ü         $$,øý½SQ$
                 ½üýø^~      ,g@$$     ,g@S@,      $$$l
                             `ýÓ*S,_  _`*S$$b._   ,SSü'

         This article comes from Hugi issue 15 ú Released in May 1999
         ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ


ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ
RLE: A BASIC DATA COMPRESSION SCHEME                               DARIO PHONG
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ



                      RLE: A BASIC DATA COMPRESSION SCHEME

                        ÄÄÄ( DARiO PhONG / PhyMosys )ÄÄÄ



 Introduction_____

 Hi!  Welcome  to this little article about  RLE.  This is a compression scheme
 used  in PCX, but we'll not learn the  RLE scheme of PCX, 'coz it's byte-based
 output.  I'll  present you with an RLE  encoder  with bit-based output. If you
 know how it works, then you are probably saying: "RLE is old and almost no one
 uses it, why the hell should I learn it?" Well, I can't answer you! Unless you
 use  it  for  (simple)  image  compression or  for  the  output  of BWT, a new
 compression  algorithm that makes a transformation in the text. If you want to
 learn it, read the related articles in Hugi #13 and #14 or check out some good
 references  available  on the net. I recommend  you  [2], 'coz [1] is a little
 difficult.


 How it works_____

 RLE  scheme  is  based on the assumption  that  a byte will appear again, just
 after  it  has  occurred. In 'aaaab', we  have  four consecutive 'a's. How RLE
 compresses is easy:

 1- Write an 'a' and say that there are 3 more 'a's.
 2- Write a 'b'.

 So  for letting the decoder know when there's a literal byte with a repetition
 we'll  use  a  flag, a prefix, only one bit.  If  it's  '0' then the next is a
 literal  byte,  if it was '1' then we have  a byte and the lenght, of how many
 more  bytes with the same value follow. Of  course you now need a routine that
 lets  you  write  bits,  but I will not  discuss  that  'coz I have written an
 article about this topic. Just read Hugi #14 or go to my homepage and download
 it.


 Encoder implementation_____

 First  read  the whole input file to a  buffer. Then the loop starts. Read the
 current  byte.  Then compare it with the  following, till you find one that is
 different, and keep track of the bytes read. In Asm:

        mov     edi,offset buffer       ;where the file is
 @@rle_inner:
        xor     ebx,ebx                 ;it keeps track of the repeated bytes
        mov     al,[edi]                ;the current byte, note that I read
 @@rle_inner_comp:                      ;only ONCE
        inc     ebx                     ;next byte (at the start, the next)
        cmp     al,[edi+bx]             ;compare it with the following bytes
        jz short @@rle_inner_comp       ;while is equal jump

 In C:

        next=0;
        byte=*(buffer);
        while(byte==*(buffer+next))
                ++next;

 If  there was no byte repeated at all  then (in the Asm version) we'll set EBX
 to '1' just decrement it, and we'll know that there was no repetition.

        dec     ebx
        jz short @@rle_no_rep

 I'll not support C anymore, 'coz it's so easy that you can easily implement it
 yourself. If there was no byte at all, then write an uncompressed byte.

 @@rle_no_rep:
        push    ax                      ;in AL we have the current byte
        xor     eax,eax                 ;flag of uncompressed byte
        mov     ecx,1                   ;write one bit
        call    put_bits                ;write the flag
        pop     ax                      ;restore it
        mov     ecx,8                   ;write the whole byte
        call    put_bits
        inc     edi                     ;next byte
        cmp     edi,buffer+imput_lenght ;precalculate this after the loop
        jne short @@rle_inner           ;while there's a byte to scan

 If we have some byte repeated, then write the byte and the length:

        push    ax                     ;save the current byte
        mov     eax,1                  ;flag of repetition
        mov     ecx,1                  ;write one bit
        call    put_bits               ;write the flag
        pop     ax                     ;restore it
        mov     ecx,8                  ;write the whole byte
        call    put_bits               ;so the decoder knows what to repeat
        mov     eax,ebx                ;in EBX we have the bytes repeated
        mov     ecx,5                  ;length from 0-31
        call    put_bits               ;write the length
        inc     edi                    ;the uncompressed byte
        add     edi,ebx                ;the repeated bytes
        cmp     edi,buffer+imput_lenght;precalculate this after the loop
        jne short @@rle_inner

 Now you have a simple RLE encoder.

 Note  that in order to avoid bugs, you  MUST change the next byte to the last.
 Imagine  you  have  03h as the last byte,  then  the next byte (it will not be
 compressed) MUST be different from 03h, for example 04h. Why? This is 'coz our
 parser  only stops when are no more repetitions.  In case the last byte of the
 file and the next in memory are equal, our parser will continue parsing beyond
 the END of the file, and then the program will hang.


 The length_____

 Of  course you must care that the  length isn't greater than the maximum, with
 something like that:

        cmp     ebx,31                  ;31 is our maximum
        jbe short @@rle__
        mov     ebx31
 @@rle__:

 But  if we want to make this better, we can only improve the way the length is
 saved, so let's have a look at it.

 We  use  5  bits, so the range is from  0-31,  but we never use the 0, 'coz we
 never  have  a repetition of 0, so let's  do the range from 1-32, much better.
 The encoder only has to decrement EBX, and the decoder increments it.

 Note  that  if  the  minimum was 2 the  encoder  would  add 2, and the decoder
 subtract  2.  Another  way of saving the  length  could be making it variable,
 depending of a flag, so:

        1<flag>
          if 0:  3<length>
          if 1:  6<length>

 Keep  the  first  trick that we used in  your  mind: For the second length our
 range  will  be  1-64 but in fact we never  use  the 1-8 'coz it's used in the
 first  length (3 bits), so its range can be: (1-64)+8 = 9-72. The enconder has
 to  test whether the length is lower than 9. If  it is, it has to emit a 1 and
 the length; if it isn't, then a 0 and the length.


 Decoder_____

 The  decoder is even easier than the encoder, as always. We just have to get a
 bit and check it:

        mov     edi,offset buffer       ;the output buffer
 @@rled_inner:
        mov     edx,1                   ;how many bits we want
        call    get_bits                ;bits in AL
        dec     al                      ;check the bit-flag (prefix)
        jz short @@rled_decode          ;if 1 a match (1-1=0) then zf on

 Now  in case the conditional jump has not  been taken, we just read a byte and
 put it in the buffer:

        mov     edx,8                   ;get a whole byte
        call    get_bits
        mov     [edi],al                ;put the byte
        inc     edi                     ;next byte
        cmp     edi,buffer+output_lenght;do I have to explain this again?
        jne short @@rled_inner          ;next one
        jmp short @@rled_end            ;end, don't execute any more code

 Now in case it was a 'match':

 @@rled_decode:
        mov     edx,8
        call    get_bits                ;get the byte to copy
        mov     [edi],al                ;put it in the buffer
        inc     edi                     ;next byte
        push    ax                      ;save it
        mov     edx,1                   ;check the flag for the length
        call    get_bits
        dec     al                      ;see what of the lengths is
        jz short @@rled_big             ;we have the 'big' length
        mov     edx,3                   ;the bits needed for the length
        call    get_bits                ;get the 'little one'
        mov     ecx,eax                 ;save the length
        and     ecx,0111b               ;erase the other bits
        inc     ecx                     ;the range 1-8
        pop     ax                      ;restore the byte to write
        jmp short @@rled_finally_decode ;let's decode it
 @@rled_big:
        mov     edx,6                   ;the length
        call    get_bits                ;get the 'big one'
        mov     ecx,eax                 ;save the length
        and     ecx,0111111b            ;mask it, so we have no problems
        add     ecx,9                   ;the range 9-72
        pop     ax                      ;restore the byte to write
 @@rled_finally_decode:
        mov     [edi],al                ;put byte in the buffer
        inc     edi                     ;next byte
        dec     ecx                     ;more bytes to copy?
        jnz short @@rled_finally_decode ;keep copying
        cmp     edi,buffer+output_lenght
        jne short @@rled_inner
 @@rled_end:
        ret                             ;end of the proc

 Here  is  almost all the code you need  for  a fast RLE encoder and decoder. I
 said  almost  all, 'coz you still have to  encode the length, well and get the
 params,  get the mem... About getting the  params, there's a little article at
 my h-page.


 References_____

 [1] http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/
            src-rr-124.html
     The original BWT paper.

 [2] http://www.dogma.net/markn
     Look for the article section.

 If you want more compression related links, visit my homepage at:
 http://www.geocities.com/SiliconValley/Byte/6789/


 Closing words_____

 Now  you  have already implemented it, but  you  see that it doesn't have good
 ratios.   What  else  can  you  do?  You  can  improve  it  with  Huffman,  or
 Shannon-Fano,  or even arithmetic coding... But this scheme (RLE) is only good
 for  images,  with loooong repetitions, and for  the output of a BWT-block. If
 you  want another scheme, you can try  lz77. From my homepage you can download
 an  article about it. If you still have  any doubt, or just want to talk about
 compression, send me an email: dario_phong@mixmail.com

 Now, just good luck with your coding!

                                            - DARiO PhONG, Barcelona 14/02/1999