|
Post by nikoniko on Oct 13, 2018 14:00:09 GMT
I've done some preliminary testing to see if it can be done and am using the VDC lightpen register triggered by the CIA to assist with removing as much jitter as I can as an attempt to get a stable raster.
First of all, brilliant work, willymanilly! Raster splits are something I always thought the VDC would be capable of with the right timing, and it only took the right person to show the way. Thank you! And it's just as exciting to see how your experiments are helping you create an even more accurate VDC emulation for Z128K. Second, could you help me understand how the lightpen registers tie in? I've never used them or thought about them really and don't know how they work. When you perform a read on them, do they return the current horizontal and vertical position of the beam as it's drawing, regardless of whether a light pen is active or not? And then you use that to refine your timing?
|
|
|
Post by willymanilly on Oct 13, 2018 21:30:16 GMT
Hi Nikoniko,
Welcome to the group.
The VDC light pen can be triggered manually through CIA 1 by taking bit 4 of port B from high to low, exactly the same way the VICII light pen can be triggered. The VDC light pen can be triggered multiple times in one frame. The reported values are the current horizontal and vertical positions of the internal character counters. The resolution of this is determined by the character total height (VDC register 9) and character total width (bits 4-7 of register 22). I only use the LPX value to stabilise timing because LPY will always be 1 after I trigger the light pen immediately after VBLANK is set. The light pen value is latched, so you always need to trigger it before reading to get valid values. From there you can calculate the CPU cycles that need to be wasted to get within 2 VDC cycles of the desired VDC character position where the raster is being drawn. The jitter is caused because the VDC and VICII use different clocks. I plan on including extra logic to improve the timing depending on VICIIe speed. See my discussion in 8 x 1 vdc mode trick.
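As a rough illustration of the "calculate the CPU cycles that need to be wasted" step, here is a minimal Python sketch. It is not willymanilly's routine. It assumes LPX counts character positions along a 128-CCLK line, a 2 MHz VDC character clock and a PAL CPU clock of 985248 Hz (the figures used elsewhere in this thread); the function name and parameters are made up for the example.

```python
def cpu_cycles_to_waste(lpx, target, line_cclks=128,
                        cpu_hz=985_248, cclk_hz=2_000_000):
    """Whole CPU cycles to burn so the beam moves from lpx to target.

    lpx and target are VDC character positions on the current line
    (assumed to be in the same units as the latched LPX value).
    """
    # Distance still to travel along the line, wrapping at the line end.
    delta_cclk = (target - lpx) % line_cclks
    # Convert VDC character clocks to CPU cycles and round down.
    # The fraction that is dropped is the part that cannot be removed
    # with whole CPU cycles: one CPU cycle is about 2.03 CCLKs here.
    return int(delta_cclk * cpu_hz / cclk_hz)
```

Because one CPU cycle spans roughly two character clocks under these assumptions, rounding down leaves an error of up to about two VDC cycles, which is consistent with the "within 2 VDC cycles" figure above.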
|
|
|
Post by willymanilly on Oct 20, 2018 7:27:39 GMT
I've just finished doing some research on the display and attribute memory latching behaviour and believe I've discovered exactly when it gets triggered for all graphics modes (including interlaced modes) and Y smooth scrolling positions. I want to clean up my test program before sharing the information. Basically, for text modes, attribute latching is done at the end of the last line and display memory latching is done at the start of the last line. Bitmap mode latches both at the start of the bottom line, except when Y scrolling is set to 0. In this instance the display memory latching is done on the first line. There's a little more to it than that, but that gives you the general idea of what happens. I will tabulate some results when I get some free time. I've uploaded a new version of Z64K with this cycle exact latching now. (Actually it's out by one cycle but it's close enough for now. I will make it cycle exact next release.)
|
|
|
Post by misoman on Oct 25, 2018 4:02:12 GMT
"I will tabulate some results when I get some free time."
I realize this is a lot to ask, but once you're finished updating your VDC model (for the time being), would you consider a write-up on what you've learned about the VDC's inner workings? I'd even contribute financially for an article that fills in some of the knowledge gaps of the available documentation, particularly timing and ready flag behavior so that I can use the VDC more effectively.
|
|
|
Post by willymanilly on Oct 26, 2018 3:59:03 GMT
There's no need for financial incentive. There are quite a few updates I need to do to my VDC model, especially the glitches that occur when the VDC runs out of cycles to get correct character and attribute data. I eventually plan on formally documenting everything and sharing it, but that will probably be a long way off still. I still need to write a user manual for Z64K before anything. In the meantime I will continue to share and discuss what I've observed in this forum. I'm always happy to try and answer specific questions as well to the best of my knowledge.
|
|
|
Post by jmpff3d on Oct 28, 2018 3:46:06 GMT
AHHH .. the ol' VDC .. some lingering *and fading* memories over the years ... www.oxyron.de/html/registers_vdc.html
Such as, for example, the venerable $1F VDC register. This thing has auto-increment, so if you read/write a byte there, the address in $12/$13 will be +1. This way you don't need to set $12/$13 for every byte, and you also don't need to select different registers all the time, just keep it at $1F. The only thing you have to do is wait for the READY flag in $D600. Basically it's like...
        STA $D601       ; write one byte through the selected register ($1F)
        BIT $D600       ; READY flag is bit 7 of $D600
        BPL *-3         ; loop back to the BIT until READY is set
or more like...
        LDX #$00
loop:   LDA stuff,X     ; next source byte
        STA $D601       ; write it, VDC address auto-increments
        BIT $D600
        BPL *-3         ; wait for READY before the next byte
        INX
        BNE loop
... something like that will copy 256 bytes to VDC memory and, of course, the address has to be set before. So basically linear memory access is still quite fast, but whenever you have to reset the address all the time, it gets slow. In other words, writing 256 linear bytes is fast, but something like
        LDA $1000,X
        EOR #$FF
        STA $1000,X
in VDC RAM is slow, because you'd have to set the address for every byte again! Cheers !
PS: Nice Read -- c128.com/phase-locking-vdc-vic-chip-short-version
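To put a rough number on "linear is fast, read-modify-write is slow", here is a toy Python model of the register interface. It is only a sketch: it counts accesses to $D600/$D601 and ignores READY polling and real VDC timing, and the class and function names are invented for the example. The only behaviour taken from the post is that registers $12/$13 hold the address and that a read or write of $1F bumps it by one.

```python
class VdcModel:
    """Toy VDC: select() stands for a write to $D600, *_data() for $D601."""

    def __init__(self):
        self.ram = bytearray(0x10000)  # assumes 64K of VDC RAM
        self.addr = 0                  # update address (R18 high, R19 low)
        self.selected = 0              # register last selected via $D600
        self.io_accesses = 0           # $D600/$D601 accesses, no READY polls

    def select(self, reg):
        self.io_accesses += 1
        self.selected = reg

    def write_data(self, value):
        self.io_accesses += 1
        if self.selected == 18:
            self.addr = (self.addr & 0x00FF) | (value << 8)
        elif self.selected == 19:
            self.addr = (self.addr & 0xFF00) | value
        elif self.selected == 31:
            self.ram[self.addr] = value
            self.addr = (self.addr + 1) & 0xFFFF  # auto-increment

    def read_data(self):
        self.io_accesses += 1
        if self.selected != 31:        # only R31 reads are modelled
            return 0
        value = self.ram[self.addr]
        self.addr = (self.addr + 1) & 0xFFFF      # auto-increment
        return value

    def set_address(self, addr):
        self.select(18)
        self.write_data(addr >> 8)
        self.select(19)
        self.write_data(addr & 0xFF)


def linear_copy(vdc, dest, data):
    """Set the address once, then stream bytes through R31."""
    vdc.set_address(dest)
    vdc.select(31)
    for byte in data:
        vdc.write_data(byte)


def invert_in_place(vdc, start, count):
    """EOR #$FF on VDC RAM: the address must be set again before each write."""
    for addr in range(start, start + count):
        vdc.set_address(addr)
        vdc.select(31)
        value = vdc.read_data()
        vdc.set_address(addr)          # the read moved the address on by one
        vdc.select(31)
        vdc.write_data(value ^ 0xFF)
```

In this model a 256-byte linear copy costs 261 register accesses, while inverting the same 256 bytes in place costs 3072, about twelve per byte, before any READY waits are counted.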
|
|
|
Post by remark on Apr 26, 2020 15:33:17 GMT
What I conclude from this so far is: - Each line requires 85 VDC cycles to fetch character graphics data. ($44-$1A=$2A, $2A = 42 cpu cycles, 2000000 X 42 / 985248 =~85 VDC cycles)
I'm guessing this is 80 character fetches + 5 DRAM refresh cycles. Note: I've tested increasing/decreasing register 36 and can confirm this affects the timing. I have not included those results this time.
- Extra fetches are required on the first line to get character pointers. Attributes require even more VDC cycles as expected.
- PAL has an additional cycle per line so has the extra free cycle to action the block fill of 1 one line earlier than NTSC. When using block fill of 2 the result is the same as NTSC because it cannot complete the blockfill before the next line of character graphics data fetches.
i.e. ($C2-$98=$2A) $2A is the same value as used in the first dot point implying an extra line was read before completing the blockfill
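The arithmetic in the first bullet can be reproduced with a few lines of Python. This is only a restatement of the numbers quoted above (42 CPU cycles, a 985248 Hz PAL CPU clock, a 2 MHz VDC character clock); the function name is made up for the example.

```python
def cpu_to_vdc_cycles(cpu_cycles, cpu_hz=985_248, vdc_hz=2_000_000):
    """Convert a span measured in CPU cycles to VDC character cycles."""
    return cpu_cycles * vdc_hz / cpu_hz


# Both measured spans in the quoted bullets are $2A = 42 CPU cycles.
span_first_case = 0x44 - 0x1A
span_second_case = 0xC2 - 0x98
vdc_cycles = cpu_to_vdc_cycles(span_first_case)
```

The result is a little over 85 VDC cycles, which is what makes the "80 character fetches + 5 refresh cycles" reading plausible.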
With my logic analyzer I can confirm your findings. In standard 80x25 text mode on PAL, the pattern is as follows:
1: 80 C | 5 R | 40 S | 3 I    S: 40 screen bytes next character row
2: 80 C | 5 R | 40 S | 3 I    S: 40 screen bytes next character row
3: 80 C | 5 R | 40 A | 3 I    A: 40 attribute bytes next character row
4: 80 C | 5 R | 40 A | 3 I    A: 40 attribute bytes next character row
5: 80 C | 5 R | 43 I
6: 80 C | 5 R | 43 I
7: 80 C | 5 R | 43 I
8: 80 C | 5 R | 43 I
C: Character data bytes, current line
S: Screen memory bytes for next character row (character pointers)
A: Attribute bytes for next character row
R: DRAM refresh bytes (R36)
I: Internal/idle cycle ($3FFF/$FFFF on address bus)
After the last line of the last character row the screen bytes and attribute bytes for the first character row are read:
2 I | 78 S | 5 R | 2 S | 38 A | 3 I    S:$0000-$004f A:$0800-$0825
2 I | 42 A | 36 I | 5 R | 43 I         A:$0826-$084f
80 I | 5 R | 43 I
The last line shown is repeated until you get to the first line of the first character row (I didn't check all lines).
If you set R36 to zero, the VDC starts reading extra screen bytes early:
80 C | 45 S | 3 I
80 C | 35 S | 10 A | 3 I
80 C | 45 A | 3 I
80 C | 25 A | 23 I
80 C | 48 I
80 C | 48 I
80 C | 48 I
80 C | 48 I
=====
2 I | 80 S | 43 A | 3 I
2 I | 37 A | 89 I
128 I
With R36=15 (maximum value):
80 C | 15 R | 30 S | 3 I
80 C | 15 R | 30 S | 3 I
80 C | 15 R | 20 S | 10 A | 3 I
80 C | 15 R | 30 A | 3 I
80 C | 15 R | 30 A | 3 I
80 C | 15 R | 10 A | 23 I
80 C | 15 R | 33 I
80 C | 15 R | 33 I
=====
2 I | 78 S | 15 R | 2 S | 28 A | 3 I
2 I | 52 A | 26 I | 15 R | 33 I
80 I | 15 R | 33 I
With R36=5, Attributes disabled:
80 C | 5 R | 40 S | 3 I
80 C | 5 R | 40 S | 3 I
80 C | 5 R | 43 I
80 C | 5 R | 43 I
80 C | 5 R | 43 I
80 C | 5 R | 43 I
80 C | 5 R | 43 I
80 C | 5 R | 43 I
=====
2 I | 78 S | 5 R | 2 S | 41 I
80 I | 5 R | 43 I
I haven't looked at bitmap mode yet.
I start the tables with character data fetches, but I don't know if the VDC starts its rasterline there. Probably parts at the right side of the table will be executed on the next rasterline. Also this is PAL with 128 CCLKs per line (R0). Also note that in the last character row of the screen (row 25), the screen and attribute bytes are still fetched for the next (nonexistent) row. This is, I assume, to make scrolling possible.
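The in-row tables above (the parts before "=====") all fit one simple rule, sketched here in Python. This is a model inferred from the listed measurements, not a description of the VDC's internals: each raster line spends 80 cycles on character data, R36 cycles on refresh, keeps the last 3 cycles idle, and uses whatever is left to fetch the 80 screen bytes and then the 80 attribute bytes for the next row. The function name, its parameters and the fixed 3-cycle tail are assumptions read off the tables; the lines between frames are not modelled.

```python
def fetch_pattern(r36=5, attributes=True, line_cclks=128,
                  columns=80, char_lines=8, tail_idle=3):
    """Return one character row as strings like '80 C | 5 R | 40 S | 3 I'."""
    # Bytes still to be fetched for the next character row, in order.
    pending = [["S", columns]]
    if attributes:
        pending.append(["A", columns])

    rows = []
    for _ in range(char_lines):
        slots = [f"{columns} C"]
        if r36:
            slots.append(f"{r36} R")
        # Cycles left on this line for screen/attribute fetches.
        budget = line_cclks - columns - r36 - tail_idle
        while budget and pending:
            kind, left = pending[0]
            n = min(budget, left)
            slots.append(f"{n} {kind}")
            budget -= n
            pending[0][1] -= n
            if pending[0][1] == 0:
                pending.pop(0)
        # Unused budget plus the fixed tail shows up as idle cycles.
        slots.append(f"{budget + tail_idle} I")
        rows.append(" | ".join(slots))
    return rows
```

Calling `fetch_pattern()` with `r36` set to 0, 5 or 15, or with `attributes=False`, gives the same rows as the corresponding measured table, which suggests refresh cycles simply push the screen and attribute fetches later rather than changing how many are needed.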
|
|
|
Post by tokra on Apr 26, 2020 15:54:11 GMT
Oh, very nice findings! Please try things like NTSC, Bitmap-Modes and Interlace as well, I've been waiting for some solid data on these for YEARS! Up until now we could only see some things fail and only assume explanations for this. With your tools we can FINALLY confirm some things about the VDC that have been lurking for 35 years :-)
|
|
|
Post by tokra on Apr 26, 2020 16:49:55 GMT
For example: My colour-bitmap-mode for 480x252 (VDC-FLI) should look like this (reg 0 = 126, so 127 CCLKs; reg 36 = 0):
60 C | 60 A | 7 I
60 C | 60 A | 7 I
60 C | 60 A | 7 I
60 C | 60 A | 7 I
60 C | 60 A | 7 I
60 C | 60 A | 7 I
60 C | 60 A | 7 I
60 C | 60 A | 7 I
I noticed I could make the mode go up to 488 width which should look like this:
61 C | 61 A | 5 I
61 C | 61 A | 5 I
61 C | 61 A | 5 I
61 C | 61 A | 5 I
61 C | 61 A | 5 I
61 C | 61 A | 5 I
61 C | 61 A | 5 I
61 C | 61 A | 5 I
However if I go up to 496 (62 chars) I start getting corruptions, though theoretically this should work if it looks like this (you can just change the register-settings in the BASIC-program in line 3020):
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
I also get corruptions if I use reg 0 = 125 and 61 chars, which should look like this:
61 C | 61 A | 4 I
61 C | 61 A | 4 I
61 C | 61 A | 4 I
61 C | 61 A | 4 I
61 C | 61 A | 4 I
61 C | 61 A | 4 I
61 C | 61 A | 4 I
61 C | 61 A | 4 I
It looks like the VDC needs at least 5 internal cycles here (for whatever reason) and anything below will lead to corruption of the display-data.
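The four cases above follow one piece of arithmetic, sketched here in Python. It assumes, as in the tables, that a line has reg 0 + 1 CCLKs, that each displayed character costs one C and one A fetch, and that reg 36 is 0. The threshold of 5 is only the limit observed in this post, not a documented VDC property, and the names are invented for the example.

```python
MIN_IDLE = 5  # smallest idle count seen to work in the post (an observation)


def idle_cclks(reg0, chars):
    """Idle cycles left per line after character and attribute fetches."""
    return (reg0 + 1) - 2 * chars


def looks_safe(reg0, chars):
    """True if the line keeps at least the observed minimum of idle cycles."""
    return idle_cclks(reg0, chars) >= MIN_IDLE


cases = [(126, 60), (126, 61), (126, 62), (125, 61)]
results = {case: (idle_cclks(*case), looks_safe(*case)) for case in cases}
```

With these inputs the two working modes keep 7 and 5 idle cycles, and the two corrupting ones keep 3 and 4.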
|
|
|
Post by remark on Apr 27, 2020 19:36:08 GMT
"However if I go up to 496 (62 chars) I start getting corruptions, though theoretically this should work if it looks like this (you can just change the register-settings in the BASIC-program in line 3020):
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I
62 C | 62 A | 3 I"
This is exactly what happens! I also changed R12/13 to accommodate the extra bytes and it nicely starts at $0000 on the first screenline just like 60 chars (didn't change attribute start). Unfortunately I don't have the 64K upgrade installed, so I won't/can't see any corruptions. The part between the last screen line and the first is quite interesting and I need to study this further. At first glance there are only 2 attribute line reads; the rest are screen byte reads (the 2040/2108 bytes before $FFFF) interspersed with 60/67 or 120/127 idle cycles.
|
|