|
Post by mirkosoft on Oct 14, 2014 20:19:36 GMT
Hi!
First: reading the next lines, you may think "this idiot is programming The Ace OS!"
Second: The Ace now has finished relocatable code, so the first address bytes of a file are not important and are used as file IDs. The HI byte is the file type and the LO byte is the subtype of that type. This means no extensions are needed; if used, they only help users orient themselves in the directory. Software commonly uses them too, but even when an extension is wrong or mismatched, these 2 bytes identify the file correctly. Of course the IDs are chosen to avoid conflicts with standard Commodore files, though I know a mismatch is still possible.
Main Q: Is there an easier method to read the first 2 bytes (address bytes) than opening the file and using a read routine? This is for use while reading the directory: I want to identify files as their filenames are displayed, in as little time as possible...
I know it looks like I'm stupid, but I think the better method is to ask more people, 'cause more heads can give a better result...
Thank you for all advice and help.
Miro
|
|
|
Post by VDC 8x2 on Oct 15, 2014 5:07:22 GMT
Opening the file and reading the first 2 bytes would, I think, be the safest way.
|
|
|
Post by hydrophilic on Oct 16, 2014 6:13:53 GMT
I really know of no other way. If you are using an emulator (VICE etc.) then it might be feasible to read the first two bytes of EVERY SECTOR on disk into an array (or REU)... but this is extreme and would only be practical for emulators (stupid slow with real hardware).
|
|
|
Post by mirkosoft on Oct 16, 2014 16:19:01 GMT
So, OK, the main Q has an answer: no faster method exists. Thank you, boys! Miro.
|
|
|
Post by gsteemso on Oct 18, 2014 3:39:30 GMT
In thinking about this, I figure there are two aspects to consider: How much data needs to be sent over the serial bus, and how much processing needs to be done by the drive / by the computer. Assuming you want to do all processing in the 128, as not all drives support things like the BLOCK-EXECUTE command (I'm looking at you, µIEC and Flyer), there are two sets of CBM DOS commands I can think of that would achieve the desired result:
First, the sequence you already know about, assuming you have extracted the file name from the directory entry into the variable FN$:
100 OPEN {FILE #},{UNIT #},{CHANNEL #},"{DRIVE #}:"+FN$+",R"
105 GET#{FILE #},{string variable to hold file subtype}
110 GET#{FILE #},{string variable to hold file type}
115 CLOSE{FILE #}
…some of the details (e.g. ',R') could be omitted without much risk, but this is how the manual says to do it.
Assuming a file named TEST FILE on drive (or partition) 1 of unit 9, accessed through disk channel 4, this would (assuming I recall the details correctly) require the following to be sent over the serial bus:
C128: MTA9 CSO4 '1' ':' 'T' 'E' 'S' 'T' ' ' 'F' 'I' 'L' 'E' ',' 'R'
Drive: [1-character subtype]
C128: UNT MTA9 MSA4
Drive: [1-character type]
C128: UNT MLA9 CSC4
That’s 8 bytes sent under /ATN (which is always done using slow serial, even if you have JiffyDOS), plus 15 bytes of data (which can be fastloaded). The minimal amount of data (i.e., if you have a 1-character filename, you default to drive 0, and you don’t bother with ',R') is 3 bytes and the maximum that might be required (you’re using a maxed-out 16-character file name on partition #107 on a CMD hard drive and you specify ',R') is 24.
The other way I could think of to do this is as follows, assuming you have the command channel (15) open on logical file number 50, buffer channel 5 open on logical file number 40, and have extracted the track and sector from the directory entry in the variables TK and SR:
105 PRINT#50,"U1";5;{DRIVE #};TK;SR
110 GET#40,{string variable to hold file subtype}
115 GET#40,{string variable to hold file type}
Note the major differences here: The code always takes close to the same amount of time to execute (ignoring the drive’s seek time, of course, and it’ll take a bit longer if you’re accessing 3-digit track and sector numbers on a CMD hard drive), and the logical file gets opened and closed exactly once instead of every time through the loop… not that that last point really matters much, but who knows. The question, of course, is "IS IT FASTER?"
To answer that, of course, I must once again hope I am recalling the details of the serial bus correctly and step through it. Once again assuming drive (or partition) 1 on unit 9, and the “average case” where the file’s first track and sector are two digits each (let’s say track 22/sector 11):
C128: MLA9 MSA15 'U' '1' ' ' '5' ' ' '2' '2' ' ' '1' '1' UNL MTA9 MSA5
Drive: [1-character subtype]
C128: UNT MTA9 MSA5
Drive: [1-character type]
C128: UNT
That’s 9 bytes sent under /ATN (i.e., slow), and 12 (at least 10 / at most 14) data bytes (which are fastloadable).
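To make the data-byte arithmetic for the two approaches easy to re-check, here is a minimal sketch in Python (not something you would run on the 128). The function names and parameters are made up for illustration, and the second function mirrors the U1 trace exactly as posted above, where the command is just "U1", the channel, the track and the sector. It counts data bytes only, not the bytes sent under /ATN.

```python
def method1_data_bytes(filename, drive="", mode_suffix=",R"):
    """Data bytes for OPEN-by-name: the name string sent to the drive,
    plus the two bytes (subtype, type) read back."""
    prefix = f"{drive}:" if drive else ""
    return len(prefix + filename + mode_suffix) + 2


def method2_data_bytes(track, sector, channel=5):
    """Data bytes for the U1 block read as traced in the post:
    'U1 <channel> <track> <sector>' plus the two bytes read back."""
    command = f"U1 {channel} {track} {sector}"
    return len(command) + 2


# The worked examples from the post
print(method1_data_bytes("TEST FILE", drive="1"))          # 15
print(method1_data_bytes("A", mode_suffix=""))             # 3  (minimum)
print(method1_data_bytes("A" * 16, drive="107"))           # 24 (maximum)
print(method2_data_bytes(22, 11))                          # 12
print(method2_data_bytes(2, 1), method2_data_bytes(222, 111))  # 10 14
```

With these assumptions the open-by-name approach grows with the filename length, while the block-read approach stays within a 10 to 14 byte window.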
Hmm. I observe that both programs can be shortened by 3 bytes sent under /ATN (i.e., with slow serial) if you control the serial bus via machine language and read both data bytes in one go, but I don’t know how tangible the speed boost would be, if at all.
As is so often the case, your real-world results will depend on your data (i.e., how long your filenames are), but my gut feeling is that the second program will have a slight edge. Of course, “slight” can add up fast when you have a large number of files on the device. Only thing to do is try it both ways and see what you get.
EDIT: I did it again! I didn’t notice that this topic was in the “assembly language” subforum when I was composing my reply. I know all of this is easier in machine language but I do not recall off the top of my head what the routines to call are named.
|
|
|
Post by gsteemso on Oct 19, 2014 4:38:06 GMT
Okay, I just had to do this properly now that I’ve had some time to consider it. I believe the assembly code to implement my two representative algorithms would look vaguely like this…
Algorithm the first:
INIT1:
; Here we set up things like what bank we're using (to keep it simple,
; we'll say RAM 0 with Kernal ROM in context, so bank 15), and what
; serial bus unit and what drive/partition we are accessing. We will
; keep the latter in locations THE_UNIT and THE_DRIVE, respectively.
; THE_DRIVE might more usefully be kept as a one- to three-character
; string, since it only ever gets used in the command string sent to
; THE_UNIT. Do as you like with that one.
; We will need a working storage location in RAM for intermediate
; results, such as the lengths of filename fragments. Call it TEMP.
; We will set two constants WORK_LFN and WORK_CHANNEL to make the code
; more readable.
; We will also set aside two RAM locations called ACE_FILE_TYPE and
; ACE_FILE_SUBTYPE for the results that we're doing all this for.

LOOP1:
; Assume this part does the initial work of loading and parsing a
; directory entry. Treat the directory entry as loaded into a 30-byte
; buffer at DIR_BUFFER. Then the filename is at DIR_BUFFER+3 through
; DIR_BUFFER+18. The following bit of code determines how long the
; filename actually is within that buffer:
    LDA #$A0            ; Shifted space char (bit 7 set): filename padding
    LDX #16             ; Filename length counter (local loop variable)
-   CMP DIR_BUFFER+2,X  ; Is current character just padding?
    BNE +               ; If we didn't find padding, break out of loop
    DEX                 ; We found padding, index backwards within name
    BNE -               ; If we haven't run out of filename, loop
    BEQ [error handler -- we ran out of name]
+   STX TEMP

; Now assume we have copied THE_DRIVE as a string to the start of the
; 22-byte FILENAME_BUFFER (followed by a colon), with the offset to
; the 1st unused byte in that buffer left in index register Y (it
; will be 2, 3 or 4). Then we only need to do this:
    LDX #0
-   INX
    LDA DIR_BUFFER+2,X
    STA FILENAME_BUFFER,Y
    INY
    CPX TEMP
    BNE -               ; if we haven't run out of filename, loop
    LDA #','
    STA FILENAME_BUFFER,Y
    INY
    LDA #'R'
    STA FILENAME_BUFFER,Y
    INY

; At this point Y contains the length of the filename we'll actually be
; sending to THE_UNIT:
    STY TEMP

; Set up our logical file:
    LDA #WORK_LFN
    LDX THE_UNIT
    LDY #WORK_CHANNEL
    JSR JSETLFS         ; $FFBA

; We'll be using Bank 15 exclusively, so we will have already called
; SETBNK with that value once in the initialization code... and so we
; will not need to keep calling it here in the loop.

; Point to the file name:
    LDA TEMP            ; Length of filename goes in A
    LDX #<FILENAME_BUFFER ; LSB of address goes in X (immediate, hence #)
    LDY #>FILENAME_BUFFER ; MSB of address goes in Y (immediate, hence #)
    JSR JSETNAM         ; $FFBD

; Tell the Kernal to open a logical file:
    JSR JOPEN           ; $FFC0
    BCS [file error handling routine]
    LDA $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]

    LDX #WORK_LFN
    JSR JCHKIN          ; $FFC6 -- makes the file our input source: sends
                        ; TALK and the secondary address on the serial bus.
                        ; (JOPEN above has already sent the filename.)

    LDA $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]

; Here's where we actually get our target data:
    JSR JACPTR          ; $FFA5 -- Get the 1st data byte (the Ace file subtype (type?))
    LDX $90             ; Check the serial status flag is clear
                        ; (use X so the data byte in A is preserved)
    BNE [serial error handling routine]
    STA ACE_FILE_SUBTYPE
    JSR JACPTR          ; Get the 2nd data byte (the Ace file type (subtype?))
    LDX $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]
    STA ACE_FILE_TYPE

; OK, got what we wanted, abort the read:
    JSR JUNTLK          ; $FFAB
    LDA #WORK_LFN
    JSR JCLOSE          ; $FFC3

; We've finished with this directory entry. Go do whatever it is we're
; doing with the filetype data and then loop:
    JSR [whatever we're doing with the file type and subtype]
    JMP LOOP1
Algorithm the second (note that my BASIC version above forgot to set the buffer pointer, though it's faster to just read and discard the two-byte track/sector link at the beginning):
BY2STR:
; This is a utility function. If I had a real assembler it would be a
; macro. It takes the value in A, converts it to decimal, and outputs
; the string equivalent to COMMAND_BUFFER,Y. Because an unsigned
; value is never negative, the first character output is always a
; space.
; The routine both expects and leaves Y pointing at the next unused
; byte in COMMAND_BUFFER.
; Uses one byte on the stack beyond the return address.
    LDX #' '
    STX COMMAND_BUFFER,Y
    INY
    CMP #100            ; How many digits?
    BCS +               ; If 3, skip ahead to hundreds handling
    CMP #10
    BCS ++              ; If 2, skip further ahead to tens handling
    BCC +++             ; If 1, skip to ones handling
+   LDX #0              ; This will be our hundreds digit
-   SBC #100            ; Carry is already set at this point
    INX
    CMP #100            ; Are we done counting hundreds yet?
    BCS -               ; If not, loop
    PHA
    TXA
    ORA #'0'            ; If so, convert to a digit character
    STA COMMAND_BUFFER,Y
    INY                 ; And save the digit
    PLA
++  LDX #0              ; This will be our tens digit
    CMP #10             ; Are there any tens?
    BCC +               ; If not, skip ahead
-   SBC #10             ; Carry is already set at this point
    INX
    CMP #10             ; Are we done counting tens yet?
    BCS -               ; If not, loop
+   PHA
    TXA
    ORA #'0'            ; If so, convert to a digit character
    STA COMMAND_BUFFER,Y
    INY                 ; And save the digit
    PLA
+++ LDX #0              ; This will be our ones digit
    CMP #1              ; Are there any ones?
    BCC +               ; If not, skip ahead
-   SBC #1              ; Carry is already set at this point
    INX
    CMP #1              ; Are we done counting ones yet?
    BCS -               ; If not, loop
+   TXA
    ORA #'0'            ; If so, convert to a digit character
    STA COMMAND_BUFFER,Y
    INY                 ; And save the digit
    RTS

INIT2:
; Once again we will assume and set bank 15, and set up what serial bus
; unit and what drive/partition we are accessing. We will again keep
; the latter in locations THE_UNIT and THE_DRIVE, respectively.
; Again, THE_DRIVE might more usefully be kept as a one- to three-
; character string, since it only ever gets used in the command
; string sent to THE_UNIT.
; We will set four constants COMMAND_LFN, COMMAND_CHANNEL=15, WORK_LFN
; and WORK_CHANNEL to make the code more readable.
; We will construct our disk commands in COMMAND_BUFFER and use memory
; location INDEX for storage between loops.
; We will also set aside two locations called ACE_FILE_TYPE and
; ACE_FILE_SUBTYPE for the results that we're doing all this for.

; Set up the command channel logical file:
    LDA #COMMAND_LFN
    LDX THE_UNIT
    LDY #COMMAND_CHANNEL
    JSR JSETLFS         ; $FFBA

; We already SETBNK'd with both values=15

; set a null filename:
    LDA #0
    TAX
    TAY
    JSR JSETNAM         ; $FFBD

; Tell the Kernal to open a logical file:
    JSR JOPEN           ; $FFC0
    BCS [file error handling routine]
    LDA $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]

; Set up the data buffer logical file:
    LDA #WORK_LFN
    LDX THE_UNIT
    LDY #WORK_CHANNEL
    JSR JSETLFS         ; $FFBA

; We already SETBNK'd with both values=15

; set a one-character filename '#':
    LDA #'#'
    STA COMMAND_BUFFER
    LDA #1              ; Length of string goes in A
    LDX #<COMMAND_BUFFER ; LSB of address goes in X (immediate, hence #)
    LDY #>COMMAND_BUFFER ; MSB of address goes in Y (immediate, hence #)
    JSR JSETNAM         ; $FFBD

; Tell the Kernal to open a logical file:
    JSR JOPEN           ; $FFC0
    BCS [file error handling routine]
    LDA $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]

; Construct the invariant part of the command string:
    LDA #'U'
    STA COMMAND_BUFFER
    LDA #'1'
    STA COMMAND_BUFFER+1
    LDA #WORK_CHANNEL
    LDY #2
    JSR BY2STR
    LDA THE_DRIVE
    JSR BY2STR
    STY INDEX           ; Stash the current string position for next time
                        ; through the loop

LOOP2:
; Assume this part does the initial work of loading and parsing a
; directory entry. (If we've finished them all then JMP to CLEANUP2.)
; Treat the directory entry as loaded into a 30-byte buffer at
; DIR_BUFFER. Then the first block of the file is at track
; DIR_BUFFER+1 and sector DIR_BUFFER+2.

; Now we finish building the DOS command string in COMMAND_BUFFER:
    LDY INDEX           ; Resume just after the invariant part (the
                        ; directory-parsing code may have changed Y)
    LDA DIR_BUFFER+1
    JSR BY2STR
    LDA DIR_BUFFER+2
    JSR BY2STR
    LDA #13             ; end the command string with a carriage return
                        ; -- can't recall if this is important or not
    STA COMMAND_BUFFER,Y
    INY                 ; At this point Y = length of the command string

; Set the command-channel file as our output destination:
    LDX #COMMAND_LFN
    JSR JCKOUT          ; $FFC9
    BCS [file error handling routine]
    LDA $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]

; Actually write the command string. Y = length = number of bytes left
; to send, so the loop below sends exactly that many:
    LDX #0              ; loop counter variable
-   LDA COMMAND_BUFFER,X
    JSR JCIOUT          ; $FFA8
    LDA $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]
    INX                 ; Bump the
    DEY                 ; counters
    BNE -               ; When Y=0, we have run out of command string

; OK, we said our piece, we can unlisten the bus now:
    JSR JUNLSN          ; $FFAE

; Set the buffer file as our input source:
    LDX #WORK_LFN
    JSR JCHKIN          ; $FFC6
    BCS [file error handling routine]
    LDA $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]

; Read and discard the two bytes at the start of the buffer (it's the
; track/sector link to the next block, not part of the file data):
    JSR JACPTR          ; $FFA5 -- Get, ignore the track byte
    LDA $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]
    JSR JACPTR          ; Get, ignore the sector byte
    LDA $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]

; Here's where we actually get our target data:
    JSR JACPTR          ; Get the 1st data byte (the Ace file subtype (type?))
    LDX $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]
    STA ACE_FILE_SUBTYPE
    JSR JACPTR          ; Get the 2nd data byte (the Ace file type (subtype?))
    LDX $90             ; Check the serial status flag is clear
    BNE [serial error handling routine]
    STA ACE_FILE_TYPE

; OK, got what we wanted, abort the read:
    JSR JUNTLK          ; $FFAB

; We've finished with this directory entry. Go do whatever it is we're
; doing with the filetype data and then loop:
    JSR [whatever we're doing with the file type and subtype]
    LDY INDEX           ; restore command-string position
    JMP LOOP2

CLEANUP2:
; At this point we would close the WORK_LFN, and possibly the
; COMMAND_LFN depending on what else we have left to do.
As I don’t have any idea what all of this would be hooked up to / embedded in, I can’t really test the two versions of the code for relative speed or efficiency. It does look like the one with the buffer channel is clunkier though. Darn.
|
|
|
Post by hydrophilic on Oct 20, 2014 4:16:00 GMT
Thanks for your detailed response. I didn't pose the question, but really appreciate your thorough investigation.
First thing, in your original post, you cannot use FN$ as a variable name because FN is a reserved keyword (for user functions).... blah, just a minor technicality, but computers are extremely fussy!
Yes, your second method should be noticeably faster, but it will not work on SD2IEC/MMC (or CD-ROM?) because it relies on block-access commands. So if I were Mirkosoft, I would use method #2 as the default (for speed) and fall back to method #1 if #2 fails.
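The "try the block read first, fall back to opening the file" idea can be sketched roughly like this. It is only an illustration in Python, not Ace code: `FileIdReader`, `read_via_block` and `read_via_open` are hypothetical names, and signalling a failed block read with `OSError` is an assumption (on a real drive you would check the error channel instead).

```python
class FileIdReader:
    """Prefer the block-read method; fall back to opening the file by name.

    Both readers are caller-supplied callables taking a directory entry and
    returning (file_type, file_subtype). One instance is meant per device.
    """

    def __init__(self, read_via_block, read_via_open):
        self._read_via_block = read_via_block
        self._read_via_open = read_via_open
        self.block_access_ok = True  # cleared after the first failure

    def read(self, entry):
        if self.block_access_ok:
            try:
                return self._read_via_block(entry)
            except OSError:
                # Device does not support block access: stop trying it.
                self.block_access_ok = False
        return self._read_via_open(entry)


# Example with stand-in readers: the block reader always fails here.
def failing_block_reader(entry):
    raise OSError("block access not supported")

def open_reader(entry):
    return (entry["type"], entry["subtype"])

reader = FileIdReader(failing_block_reader, open_reader)
print(reader.read({"type": 0x41, "subtype": 0x01}))  # (65, 1)
print(reader.block_access_ok)                        # False
```

Remembering the failure means a device without block access pays for the failed attempt only once, not once per directory entry.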
Another huge factor is how the data is laid out on the physical medium (assuming NOT sd2iec because they don't have a variable seek time). So if you scan the directory in the order it is written, it could be slow. Usually (1541/1571/1581) it will allocate a file below the directory track first... then the second file will be above the directory track... then third file will be below the directory track... I said USUALLY. This assumes all files are about the same size... the result would be different if the files varied wildly in size.
Edit
After reading what I wrote, it does not seem clear, so let me give a practical example. Imagine 4 Koala files were written to disk; each is almost 10K and consumes 40 blocks. Further, let's assume a 1541/1571 disk. The standard layout would be this:
Directory: Track 18
File 1 Start: Track 17 (sector 0)
File 2 Start: Track 19 (sector 0)
File 3 Start: Track 16 (sector 11)
File 4 Start: Track 21 (sector 0)
So if you read the directory, and then the first two bytes from each starting sector, the head movement would be:
Track 18 (0... assume head starts on track 18)
Track 17 (-1)
Track 19 (+2)
Track 16 (-3)
Track 21 (+5)
In other words, the head would need to move |-1| + |+2| + |-3| + |+5| tracks, or 11 tracks.
Now, if you read it in "track-sorted order" it would be:
Track 18 (0... assume head starts on track 18)
Track 16 (-2)
Track 17 (+1)
Track 19 (+2)
Track 21 (+2)
In other words, the head would need to move |-2| + |+1| + |+2| + |+2| tracks, or 7 tracks. That is about 64% of the original head movement (7 of 11 tracks), a reduction of roughly 36%. Also note this is a not-very-full disk (only 160 of 664 blocks). The speed enhancement would be even greater when there are more files on disk!
/Edit
So a major speed improvement would be to sort the directory by starting track of each file. This should minimize head movement and thus dramatically improve speed. But it would still fail with non-CBM file systems (like FAT/CDFS).
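For anyone who wants to check the head-movement numbers from the example, here is a small sketch in Python. The function name `head_travel` is made up for illustration, and it only models track-to-track distance; it ignores rotational delay and everything else that affects real drive timing.

```python
def head_travel(start_track, visit_order):
    """Total number of tracks the head crosses when visiting the given
    tracks in order, starting from start_track."""
    total = 0
    position = start_track
    for track in visit_order:
        total += abs(track - position)
        position = track
    return total


# Starting tracks of the four files, in directory order
directory_order = [17, 19, 16, 21]

print(head_travel(18, directory_order))          # 11
print(head_travel(18, sorted(directory_order)))  # 7
```

Sorting ascending is just the simplest choice for this example; with the head parked on the directory track, other visiting orders could do as well or better on other disks.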
Also, if you are using an emulator (with True Drive Emulation turned off) then the order would not matter.
|
|
|
Post by mirkosoft on Oct 20, 2014 18:43:53 GMT
Hi Robert!
A really important note. Thank you. Of course I'm using many devices, and few of them have a real CBM filesystem...
Miro
|
|