Vdc decompress reimagined

VDC 8x2
Administrator

Living life

Posts: 348

Post by VDC 8x2 on Jul 25, 2014 4:23:06 GMT

I think once this is rock solid, the hard part begins: how to write the code that compresses the pictures for this decompress code to use.

hydrophilic
Global Moderator

Posts: 794

Post by hydrophilic on Jul 25, 2014 12:16:13 GMT

Well, for encoding I would recommend writing it in a high-level language like C or VisualBASIC (or Java or PHP, but I don't like debugging them). The good thing is that it should be fast, because the compression is not complex and the search is constrained to a maximum of 2K if you use the "big range" long copy... and it only has to search backwards for a copy.
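As a rough illustration of that backwards-only search, here is a minimal Python sketch. The function name, the 2047-byte window (the largest value an 11-bit distance can hold, roughly the "2K" above) and the 3 to 10 byte length limits are assumptions for illustration, not taken from any existing encoder. It also refuses matches whose source overlaps the bytes being written, which is a conservative choice rather than a known VDC restriction.

```python
def find_longest_match(data, pos, window=2047, min_len=3, max_len=10):
    """Return (distance, length) of the longest earlier match, or None.

    Only bytes before `pos` are searched, at most `window` bytes back.
    """
    best_dist, best_len = 0, 0
    limit = min(max_len, len(data) - pos)   # cannot match past end of data
    start = max(0, pos - window)
    for cand in range(pos - 1, start - 1, -1):
        # Conservative assumption: the source must not overlap the output.
        cand_limit = min(limit, pos - cand)
        length = 0
        while length < cand_limit and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:
            best_dist, best_len = pos - cand, length
            if length == limit:             # cannot do better than this
                break
    if best_len >= min_len:
        return best_dist, best_len
    return None
```

Because the nearest candidates are tried first, ties go to the shortest distance, which leaves room to use the cheaper short copy when one applies.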

If you are brave (or maybe just insane) you could try writing/testing it in CBM BASIC

For my video compressor, I have a build switch that adds "debug" info. It keeps track of which code types were most useful (raw/rle/skip/copy), the count of bytes used (like 6 for an RLE instance or 12 for a copy instance), and, for (long) copy, the range (how far it had to search to find a match). It tracks the min, max, count, and total (the average is obviously total/count) for each type. Based on that info, I could decide which codes were more useful and how to allocate the available bits to each code. Of course the ultimate result, the compression ratio, is easy to get by comparing file sizes and doesn't need "debug" info.
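The bookkeeping described above could look something like the following Python sketch. The class name, the `DEBUG` flag standing in for the build switch, and the `note` helper are all hypothetical; the real video compressor's code is not shown in this thread.

```python
class CodeStats:
    """Running min / max / count / total for one code type."""

    def __init__(self):
        self.count = 0
        self.total = 0
        self.min = None
        self.max = None

    def record(self, value):
        self.count += 1
        self.total += value
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)

    @property
    def average(self):
        # Average is simply total / count, as noted in the post.
        return self.total / self.count if self.count else 0.0


DEBUG = True  # hypothetical stand-in for the "debug" build switch

# One tracker per code type, plus one for the long-copy search range.
stats = {name: CodeStats() for name in ("raw", "rle", "skip", "copy", "copy_range")}


def note(kind, value):
    """Record one use of a code type; does nothing when DEBUG is off."""
    if DEBUG:
        stats[kind].record(value)
```

An encoder would call something like `note("rle", 6)` each time it emits a code and dump the table at the end of a run.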

Once it works in a high-level language, it is trivial to port to ML if you need it.

If you're confident with the decoder so far, here are a few optimizations you could make... change this

Parse ldy #$00
      lda (source),y
      lsr
      bcs SecondTest ;is it long or short copy
      lsr
      bcs Fill ;got fill
      bcc Skip ;skip x bytes
SecondTest:
      lsr
      bcc Copy
      jmp LongCopy

Fill ...

with something like

Parse ldy #$00
      lda (source),y
      lsr
      bcs SecondTest ;is it long or short copy
      lsr
      bcc Skip ;skip x bytes

Fill  ...
Skip  ...

SecondTest:
      lsr
      bcs LongCopy
Copy  ...

Assuming the branches would reach, of course. Fill and Skip are pretty simple, so they should.

You can also remove the LDA in this sequence...

   sbc WorkTemp+1
   sta WorkTemp+1 ;saved the results
   ldy #$20 ;source for copy
   sty vdcadr
   lda WorkTemp+1
   waitvdc
   sta vdcdat ;store high byte

Were you going to use this in your GoldBox game translation, or is this for something else?

Last Edit: Jul 25, 2014 12:20:54 GMT by hydrophilic: Typos!

I'm kupo for Kupo Nuts!
∇ • hydrophilic ≠ 0

VDC 8x2
Administrator

Living life

Posts: 348

Post by VDC 8x2 on Jul 25, 2014 13:15:21 GMT

I am going to use it to compress the pictures the game will be loading.

Aside from the compression of the title page graphic, there is no compression at all in the original game, and that is just simple RLE compression.

It is also going to be for all VDC images. So it was started for GoldBox, but I wanted it to stand alone too.

Last Edit: Jul 25, 2014 13:26:27 GMT by VDC 8x2

VDC 8x2
Administrator

Living life

Posts: 348

Post by VDC 8x2 on Jul 25, 2014 14:55:00 GMT

Would skip over be a better option than long copy? Or should I fork the code: one version for skip over, the other for long copy?

hydrophilic
Global Moderator

Posts: 794

Post by hydrophilic on Jul 25, 2014 15:19:23 GMT

Cool, glad to hear some progress with the game too, despite this diversion. Thanks for sharing. Helps me think about ways to improve MediaPlayer which I've been thinking about for awhile but haven't actually started.

Umm, actually in the last code snippet that I quoted, I think you could remove both the LDA and the STA to WorkTemp+1, because the value is never used again by the CPU, just handed to the VDC.

I was thinking long copy isn't really a good name. I think a better name would be random copy, because it can read from anywhere in the past 1K or 2K (depending on bit allocation), as opposed to standard copy, which is like a sequential copy (always from the immediately preceding bytes).

Anyway, one way that might work ...

LongCopy:
    sta WorkTemp+1 ;offset high
    wordinc Source
    lda (source),y
    tax            ;save length
    lsr WorkTemp+1
    ror
    lsr WorkTemp+1
    ror
    lsr WorkTemp+1
    ror
    sta WorkTemp   ;offset low
    txa
    and #%111      ;isolate length bits
    clc
    adc #3         ;minimum copy length
    tax            ;bytes to copy 
 ;to save code bytes, you could just jump to end of normal copy...
    ldy #$13
    sty vdcadr
    lda vdcdat
    sec
    sbc WorkTemp   ;low byte - value
    sta WorkTemp
    dey
    sty vdcadr
    waitvdc ;wait because reg 12
    lda vdcdat
    sbc WorkTemp+1  ;calc. high adrs

    ldy #$20 ;source for copy
    sty vdcadr
    waitvdc
    sta vdcdat      ;store high adrs
    iny
    sty vdcadr
    lda WorkTemp
    sta vdcdat      ;store low adrs

    lda #$18
    sta vdcadr
    lda vdcdat ;going to set to copy
    ora #$80
    sta vdcdat ;set bit 7
    lda #$1e
    sta vdcadr
    stx vdcdat ;copy x bytes
    rts

That uses a 3-bit copy size and an 11-bit distance:
%hhhLLL11, first byte has the 3 high bits of the distance and the top 3 bits of the low distance byte, plus the 2-bit code (%11)
%LLLLLsss, second byte has the remaining 5 bits of the low distance byte, plus 3 bits of size

The size is actually 3 + the encoded value, so with the above method (3-bit size), 3 to 10 bytes can be copied.

You might find a 4-bit size more useful (copy 3 to 18 bytes), but it would reduce the distance from 11 bits (2K) to 10 bits (1K). To do this, all you need is to add another LSR WorkTemp+1 / ROR pair near the top (and change the AND #%111 mask to AND #%1111).
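For the encoder side, the 3-bit size / 11-bit distance layout above can be modeled in a few lines of Python. This is only a sketch of the bit layout as described in this post; the function names are made up for illustration.

```python
def pack_long_copy(distance, length):
    """Pack a long copy as two bytes: %hhhLLL11 then %LLLLLsss."""
    if not 1 <= distance <= 0x7FF:
        raise ValueError("distance must fit in 11 bits")
    if not 3 <= length <= 10:
        raise ValueError("length must be 3 to 10")
    first = ((distance >> 5) << 2) | 0b11           # top 6 distance bits + code %11
    second = ((distance & 0x1F) << 3) | (length - 3)  # low 5 distance bits + size
    return bytes([first, second])


def unpack_long_copy(first, second):
    """Inverse of pack_long_copy; returns (distance, length)."""
    if first & 0b11 != 0b11:
        raise ValueError("not a long copy code")
    distance = ((first >> 2) << 5) | (second >> 3)
    length = (second & 0b111) + 3
    return distance, length
```

Switching to the 4-bit size variant would mean shifting by 4 instead of 5 (and 4 instead of 3) and widening the size mask, mirroring the extra LSR/ROR pair in the assembly.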

VDC 8x2 asked: "would skip over be better option than long copy?"

Nope, not unless you plan on animated images. Well, I guess you *could* clear the screen first (all memory zero), and then skip might be useful. Yeah, you might need to fork the code and try both options. I know it works well for video, but except for the example I just gave, I don't think it would be handy for static images.

Last Edit: Jul 25, 2014 15:23:41 GMT by hydrophilic

VDC 8x2
Administrator

Living life

Posts: 348

Post by VDC 8x2 on Jul 25, 2014 15:56:03 GMT

I decided to go with the 3 to 10 random copy.

Here is the revised code:

Source          = $fd
destination     = $fb
WorkTemp        = $fb
vdcadr          = $d600
vdcdat          = $d601
defm    WAITVDC
        bit $d600
        bpl *-3
        endm

defm    WordInc
        inc /1
        bne *+4
        inc /1+1
        endm
    
*=$1300
Start           lda destination+1 ;high byte
                ldy #$12
                sty vdcadr
                WAITVDC         ;vdc ready? 
                sta vdcdat      ;high byte dest 
                iny
                sty vdcadr
                lda destination ;low byte
                sta vdcdat      ;low byte set. destination is now set
                jmp First

MAINLOOP        wordinc Source
First           jsr Parse
                waitvdc         ;wait for copy or fill to finish 
                bmi Mainloop    ;we know it is always mi after vdcwait so use 
                                ;it. 
Parse           ldy #$00
                lda (source),y
                lsr 
                bcs SecondTest  ;is it long or short copy
                lsr
                bcc Skip         ;skip x bytes  

Fill            cmp #$01        ;is it 1                    
                bne @not1
                ror             ;turn it into 128
@not1           tax
                dex             ;number of bytes to fill. 2 to 63 or 128 or 256
                lda #$18        
                sta vdcadr
                lda vdcdat      ;going to set to fill 
                and #$7f
                sta vdcdat       ;clear bit 7
                wordinc source
                lda #$1f
                sta vdcadr
                WaitVdc         ;wait for ready                
                lda (source),y  ;byte to repeat
                sta VDCDAT
                lda #$1e
                sta vdcadr
                stx vdcdat      ;start fill                 
                rts
                
Skip            tax             ;transfer to x for counter
                bne @notzero
                pla             ;pull return off stack
                pla
                rts             ;done so peace out.
@notzero        lda #$1f
                sta vdcadr
@loop           wordinc Source   ;get next byte 
                lda (source),y
                WAITVDC         ;vdc ready?
                sta vdcdat      ;store value
                dex
                bne @loop 
                rts
 
SecondTest      lsr                 
                bcs LongCopy                                      

Copy            cmp #$01        ;is it 1                    
                bne @not1
                ror             ;turn it into 128
@not1           tax             ;copy x bytes from x bytes back
                bne @not2       ;is it 0
                ldy #$01        ;set high byte to 1 if zero
@not2           sta WorkTemp     ;low byte
                sty WorkTemp+1 ;high byte 
Rdy2Subt        ldy #$13
                sty vdcadr        
                lda vdcdat
                sec    
                sbc WorkTemp        ;low byte - value.
                sta WorkTemp
                dey 
                sty vdcadr
                waitvdc         ;wait because reg 12
                lda vdcdat
                sbc WorkTemp+1
                ldy #$20        ;source for copy
                sty vdcadr
                waitvdc
                sta vdcdat      ;store high byte 
                iny
                sty vdcadr
                lda WorkTemp 
                sta vdcdat      ;store low byte
                lda #$18        
                sta vdcadr
                lda vdcdat      ;going set to copy 
                ora #$80
                sta vdcdat      ;set bit 7
                lda #$1e
                sta vdcadr
                stx vdcdat      ;copy x bytes    
                rts

LongCopy        sta WorkTemp+1 ;offset high
                wordinc Source
                lda (source),y
                tax            ;save length
                lsr WorkTemp+1
                ror
                lsr WorkTemp+1
                ror
                lsr WorkTemp+1
                ror
                sta WorkTemp   ;offset low
                txa
                and #%111      ;isolate length bits
                clc
                adc #3         ;minimum copy length
                tax            ;bytes to copy 
                jmp Rdy2Subt

Now to Test it some more.
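For testing, it can help to have a plain software model of the format to compare against. The Python sketch below is one reading of the listing above, not an official specification: it treats the output as a simple growing buffer instead of VDC RAM, and the function names are invented. The code types follow the low two bits of each code byte (%00 raw bytes, the "Skip" label in the listing; %10 fill; %01 copy; %11 long copy), with the count in the upper six bits.

```python
def expand_count(n):
    """Map a 6-bit count field to a byte count (1 -> 128, 0 -> 256)."""
    if n == 0:
        return 256
    if n == 1:
        return 128
    return n


def copy_back(out, distance, count):
    """Append `count` bytes read `distance` bytes behind the write position."""
    if distance < 1 or distance > len(out):
        raise ValueError("copy source lies outside the decoded data")
    for _ in range(count):
        out.append(out[-distance])


def decode(stream):
    """Decode a compressed byte stream into a bytearray."""
    out = bytearray()
    i = 0
    while True:
        code = stream[i]
        kind, n = code & 0b11, code >> 2
        if kind == 0b00:                 # raw bytes ("Skip" in the listing)
            if n == 0:                   # zero count marks end of data
                return out
            out += stream[i + 1:i + 1 + n]
            i += 1 + n
        elif kind == 0b10:               # Fill: repeat the next byte
            out += bytes([stream[i + 1]]) * expand_count(n)
            i += 2
        elif kind == 0b01:               # Copy: repeat the previous bytes
            count = expand_count(n)
            copy_back(out, count, count)
            i += 1
        else:                            # LongCopy: 11-bit distance, 3-bit size
            second = stream[i + 1]
            distance = (n << 5) | (second >> 3)
            copy_back(out, distance, (second & 0b111) + 3)
            i += 2
```

Feeding the same test stream to this model and to the real routine, then comparing the buffer with a dump of VDC memory, is one way to catch decoder and encoder bugs separately.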

Attachments:

nuvdcpress2.prg (256 B)

Last Edit: Jul 25, 2014 15:57:43 GMT by VDC 8x2: forgot to add the file.

hydrophilic
Global Moderator

Posts: 794

Post by hydrophilic on Jul 25, 2014 17:36:16 GMT

Nice start. I guess you can't really tell what would be best until you have a working encoder. In my encoder, when debug/analyze mode is turned on, it records not only all the codes it does use, but also all the codes it wanted to use but couldn't (for example an RLE length over 63, or a copy distance over 2K); that way I know if I should change things.

Of course I imagine you would want to optimize it for the GoldBox game, but to optimize it for general images you need to test lots of different images. If you optimize too tightly against a small sample set, say only 6 to 12 images, you can get bad results with many other images.

Actually I wouldn't worry too much; this is supposed to be a simple/fast scheme... you can get much better compression using variable codes, but it is much slower to decompress.

Speaking of speed, if your tests work out, another thing that would save 9 cycles for every code (JSR plus RTS cost 12 cycles, a JMP costs 3) is to replace the JSR Parse / RTS with a loop... remove the JSR Parse and replace every RTS with JMP MAINLOOP. Something like

         jmp Parse ; was jmp First

MAINLOOP wordinc Source
         waitvdc ;wait for copy or fill to finish

Parse    ldy #$00 
         ...
Skip     ...
         jmp MainLoop
Fill     ...
         jmp MainLoop
etc.

Somewhere in there (end of data), you would need an RTS but without the PLA PLA.

I would only use JSR if implementing more complex compression, for example where the decompressor can call itself to decompress a code (for pattern fill or other advanced things)

VDC 8x2
Administrator

Living life

Posts: 348

Post by VDC 8x2 on Jul 25, 2014 20:32:57 GMT

Compiled with the changes you suggested. Now to test it again.

VDC 8x2
Administrator

Living life

Posts: 348

Post by VDC 8x2 on Jul 25, 2014 23:19:33 GMT

I was looking at my code and thinking that 2 was a break-even point in compression. We don't want to break even, we want to gain ground!

I reworked the RLE and copy values to this: 3 to 63 normal, 2=64, 1=128 and 0=256. The gains outweigh the loss of the value 2.

I reworked the LongCopy size values to this: 3 to 7 normal, 2=64, 1=128 and 0=256. The gains are the biggest here, I think.
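To make the reworked count fields easy to check from an encoder, the mapping just described can be written as a pair of small Python helpers. The names are hypothetical, and this is only a model of the mapping as stated here (6-bit field for RLE and copy, 3-bit field for LongCopy).

```python
SPECIAL = {256: 0, 128: 1, 64: 2}   # byte counts that use the reserved values


def field_to_count(field):
    """Decode a count field: 0, 1, 2 mean 256, 128, 64; others are literal."""
    return {0: 256, 1: 128, 2: 64}.get(field, field)


def count_to_field(count, field_bits=6):
    """Encode a byte count, or return None if this field cannot express it."""
    if count in SPECIAL:
        return SPECIAL[count]
    if 3 <= count < (1 << field_bits):
        return count
    return None
```

With `field_bits=3` this gives the LongCopy sizes (3 to 7, plus 64, 128 and 256); with the default 6 bits it gives the RLE and copy counts (3 to 63, plus the same three).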

Source          = $fd
destination     = $fb
WorkTemp        = $fb
vdcadr          = $d600
vdcdat          = $d601
defm    WAITVDC
        bit $d600
        bpl *-3
        endm

defm    WordInc
        inc /1
        bne *+4
        inc /1+1
        endm
    
*=$1300
Start           lda destination+1 ;high byte
                ldy #$12
                sty vdcadr
                WAITVDC         ;vdc ready? 
                sta vdcdat      ;high byte dest 
                iny
                sty vdcadr
                lda destination ;low byte
                sta vdcdat      ;low byte set. destination is now set
                jmp Parse

MAINLOOP        wordinc Source
                waitvdc         ;wait for copy or fill to finish 
                               
Parse           ldy #$00
                lda (source),y
                lsr 
                bcs SecondTest  ;is it long or short copy
                lsr
                bcc Skip         ;skip x bytes  

Fill            cmp #$01        ;is it 1                    
                bne @not1
                lda #$80        ;1 becomes 128
                bne mainfill    ;we know it is nonzero so branch with it
@not1           cmp #$02
                bne mainfill
                lda #$40        ;2 becomes 64
mainfill        tax             ;0 will become 256
                dex             ;number of bytes to fill. 3to63 or 64,128,256
                lda #$18        
                sta vdcadr
                lda vdcdat      ;going to set to fill 
                and #$7f
                sta vdcdat       ;clear bit 7
                wordinc source
                lda #$1f
                sta vdcadr
                WaitVdc         ;wait for ready                
                lda (source),y  ;byte to repeat
                sta VDCDAT
                lda #$1e
                sta vdcadr
                stx vdcdat      ;start fill                 
                jmp mainloop
                
Skip            tax             ;transfer to x for counter
                bne @notzero
                rts             ;done so peace out.
@notzero        lda #$1f
                sta vdcadr
@loop           wordinc Source   ;get next byte 
                lda (source),y
                WAITVDC         ;vdc ready?
                sta vdcdat      ;store value
                dex
                bne @loop 
                jmp mainloop
 
SecondTest      lsr                 
                bcs LongCopy                                      

Copy            cmp #$01        ;is it 1                    
                bne @not1
                lda #$80        ;set to 128
                bne PutInX      ;we know it not zero so can use it to branch
@not1           cmp #$02
                bne PutInx
                lda #$40        ;set to 64
PutInx          tax             ;copy x bytes from x bytes back
                bne @not2       ;is it 0
                ldy #$01        ;set high byte to 1 if zero
@not2           sta WorkTemp     ;low byte
                sty WorkTemp+1 ;high byte 
Rdy2Subt        ldy #$13
                sty vdcadr        
                lda vdcdat
                sec    
                sbc WorkTemp        ;low byte - value.
                sta WorkTemp
                dey 
                sty vdcadr
                waitvdc         ;wait because reg 12
                lda vdcdat
                sbc WorkTemp+1
                ldy #$20        ;source for copy
                sty vdcadr
                waitvdc
                sta vdcdat      ;store high byte 
                iny
                sty vdcadr
                lda WorkTemp 
                sta vdcdat      ;store low byte
                lda #$18        
                sta vdcadr
                lda vdcdat      ;going set to copy 
                ora #$80
                sta vdcdat      ;set bit 7
                lda #$1e
                sta vdcadr
                stx vdcdat      ;copy x bytes    
                jmp mainloop

LongCopy        sta WorkTemp+1 ;offset high
                wordinc Source
                lda (source),y
                tax            ;save length
                lsr WorkTemp+1
                ror
                lsr WorkTemp+1
                ror
                lsr WorkTemp+1
                ror
                sta WorkTemp   ;offset low
                txa
                and #%111      ;isolate length bits
                cmp #$01
                bne @skip1
                lda #$80        ;1 becomes 128 bytes
                bne @copybyte   ;we know its not zero so use to branch
@skip1          cmp #$02       
                bne @copybyte
                lda #$40       ;2 becomes 64 bytes  
@copybyte       tax            ;if zero will become 256
                jmp Rdy2Subt

Now more stuff to test!

Attachments:

nuvdcpress2.prg (284 B)

hydrophilic
Global Moderator

Posts: 794

Post by hydrophilic on Jul 26, 2014 5:25:26 GMT

I worry you're doing too much customization without testing it against an encoder.

Like before it was doing RLE 2~63 or 128 or 256... but now it is doing 3~64 or 128 or 256. I doubt that one extra byte will make any significant gain in most files, and no difference in many files... but the more complex code makes all RLE operations slower.

Also an RLE of 2 is sometimes useful. Now you may find that hard to believe, I know that I did when I saw the statistics. It took some work to figure out why... I can't explain it in words, so here is an example:
63 uncompressible bytes, run length of 2(some byte), run length of 6(some other byte)

Now *without* RLE 2 it encodes as
1 byte(code) + 63 bytes(raw data) + 1 byte(code) + 2 bytes(raw data) + 1 byte(code) + 1 byte(RLE data)
= 69 bytes

With RLE 2, it encodes as
1 byte(code) + 63 bytes(raw data) + 1 byte(code) + 1 byte(RLE data) + 1 byte(code) + 1 byte(RLE data)
= 68 bytes
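The same arithmetic can be reproduced with a tiny cost model, which is handy for trying other sequences. This Python sketch assumes raw packets hold at most 63 bytes with one code byte each, an RLE packet costs 2 bytes, and no run is longer than one packet can express; the function names are made up.

```python
def raw_cost(n, max_raw=63):
    """Bytes needed to store n raw bytes: the data plus one code byte per packet."""
    packets = -(-n // max_raw)        # ceiling division
    return n + packets


def encoded_size(segments, min_rle, max_raw=63):
    """Total encoded size for a list of ("raw", n) / ("run", n) segments."""
    total = 0
    pending = 0                       # raw bytes not yet flushed into packets
    for kind, n in segments:
        if kind == "run" and n >= min_rle:
            total += raw_cost(pending, max_raw) + 2   # code byte + repeated byte
            pending = 0
        else:
            pending += n              # too short for RLE: stored as raw bytes
    return total + raw_cost(pending, max_raw)


example = [("raw", 63), ("run", 2), ("run", 6)]
```

With `min_rle=3` the 2-byte run spills into a second raw packet and the total is 69 bytes; with `min_rle=2` it is 68, matching the hand calculation.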

I remember reading about some compressor (BASIC 8 or iPaint?) that made a similar "mistake" with Copy. They never allowed 3 bytes for Copy, because normally it would not save any bytes (only break even)... but a similar example could be applied. Sometimes breaking even is good! If you have to add a raw (skip) packet instead, then you would lose a byte!

Well, now hopefully you believe that it *can* happen, but if you're like I was, you're probably thinking it isn't very important or is unlikely. All I can say is, after testing thousands of images, having a break-even code is better (in general) than extending a code by +1. I guess it is because the break-even case can happen on a semi-regular basis, but a Big Block that can be RLE'd/Copied will usually be either well inside the limits (for example 50 out of 63) or way outside the limits (like 100 against 63)... the chance that it would be exactly the alternate +1 code (like 64 against 63+1) is very, very slim.

Don't get me wrong, your ideas aren't all bad. Having different sizes for LongCopy would be good; I like your idea of 64 copy bytes! I wouldn't allocate code values for 128 and 256, because they are rare. When they do occur, you just need 2 or 4 of the CopyX64s... you still get amazing compression... MUCH more common, however, are the shorter lengths, so you should prioritize them.

Anyway, I wouldn't spend much time trying to tweak it until there is real data available from an encoder.
