|
Post by gsteemso on Jul 8, 2020 0:50:02 GMT
I'm not following your reasoning, here. The original question is "how do I find the exact size of an arbitrary file?"
In the absence of a programming technique to do it more directly, the only apparent method is to load the entire file into memory and then check to see where it ended. If you needed that RAM for something more important (like whatever your program actually _does_), well, sucks to be you. If you need the length of a file that wouldn't actually _fit_ in RAM, you're even further out of luck.
Your idea is to... I'm not clear, actually. What is it supposed to achieve if you put zeroes all over a disk? Bear in mind that any data written to an unused sector will have been staged in one of the drive's internal RAM buffers, and the whole works written out in one go. There's no reason whatsoever to assume that every Commodore-compatible drive anyone ever sold was configured to zero out the buffer first. So, no matter what you put on the disk ahead of time, it's not going to have any effect on the endings of files.
In any case, the actual problem being asked about involves examining files that already exist. By definition, any changes or improvements anyone could make are out of scope for this thread.
|
|
|
Post by gsteemso on Jul 8, 2020 19:50:20 GMT
I just re-read the original post, and noticed a detail we all overlooked.
There is, by definition, no way to know how many bytes a USR file actually holds. It is as the name says -- "USeR defined". Examples:
- If someone decided to disguise a totally new disk format by labelling it as a single USR file on an otherwise normal disk, it may have allocated every sector on that disk that Commodore DOS hadn't already claimed, making a file many times larger than could ever fit into the computer's RAM... but if the program responsible for doing it hadn't yet stored any real files in that filesystem image, there would be no actual data involved.
- Equally, if someone decided that dealing with Commodore DOS' very awkwardly-sized 254-byte file blocks was an intolerable pain in the neck, they might define a USR format that didn't bother with track-&-sector pointers in each sector, using some sort of index file to keep track of 256-byte file blocks instead. Depending on how carried away someone got with a scheme like that, you could end up with data blocks that _never_ contain unused space!
In any case like these, you are at the mercy of whoever implemented the USR file in question; if they didn't bother to keep the CBM DOS directory entry up to date regarding how many blocks it occupies, you have no way whatsoever to even _guess_ how much is in it.
Just to, you know, keep things cheerful and positive. *grin*
|
|
|
Post by bjonte on Jul 9, 2020 10:44:38 GMT
The idea is simple and you're only checking the last track and sector. I still don’t understand your reasoning. How would you find the last track and sector without traversing the linked list of sectors that the file consists of?
|
|
|
Post by bjonte on Jul 9, 2020 11:27:36 GMT
Ah, I see. I was confused because I assumed that the question was aiming at not loading the file contents.
|
|
|
Post by wsoft on Nov 2, 2020 23:58:52 GMT
Yes, it is all about reading (and counting) the first track and sector of a file; then you just go up from there until you reach EOF. After that all bets are off, unless you know a way to get past the last <256 bytes. And it is not necessary to read each block's entire contents, as long as you're pointing your Memory-Reads at the same place and only reading two bytes at a time, which in turn gives you the next track and sector, etc.
[edit] Your version of how to skip reading the entire file's contents is on point. All I was saying is that it gets problematic after EOF, so one way might be to trap EOF so you know the last track and sector that was read, then use a marker somehow, but man, what a hassle. I just go by the block-count in the directory, and subtract two bytes for every block which was read; then I get a ballpark figure that can be plus or minus 256 bytes. There HAS to be a way to DO this, though - just saying.
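For what it's worth, the ballpark from the directory block count can be pinned down as a range. This is only a rough sketch in Python, assuming the directory's block count is accurate and that every block carries 254 data bytes after its two link bytes; `size_bounds` is a made-up name for illustration, not anything from CBM DOS.

```python
from typing import Tuple

# Assumption: a 256-byte sector minus the 2-byte track/sector link.
DATA_BYTES_PER_BLOCK = 254


def size_bounds(block_count: int) -> Tuple[int, int]:
    """Return (smallest, largest) possible byte length for a file of block_count blocks."""
    if block_count < 0:
        raise ValueError("block count cannot be negative")
    if block_count == 0:
        return (0, 0)
    # Every block except the last is full; the last holds 1 to 254 bytes.
    smallest = (block_count - 1) * DATA_BYTES_PER_BLOCK + 1
    largest = block_count * DATA_BYTES_PER_BLOCK
    return (smallest, largest)
```

So a file listed as 3 blocks would be somewhere between 509 and 762 bytes, which is where the "off by up to a block" uncertainty comes from.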
|
|
|
Post by gsteemso on Nov 29, 2020 17:26:16 GMT
...."EOF"????
Commodore DOS doesn't really _use_ any kind of explicit EOF, unless you count the "end of data" handshake on the IEEE-488 or Commodore serial busses.
The on-disk format of a SEQ or PRG file, as discussed, divides each 256-byte sector into 254 bytes of data preceded by a two-byte next-sector pointer. The first byte of the pointer records the track, and the second records the sector on that track. The track number must be in the range [1 - 255]; when this byte instead contains a zero, that means the 254-byte data field holds the last portion of the file, and is most likely not completely used. The index (i.e., byte number) of the last data byte within the _sector_ -- not within the _data block_ -- is stored in what would otherwise be the next-sector number, in the second byte of the raw sector. Because bytes 0 and 1 of the sector are not within the data block, the valid range of this datum lies within [2 - 255].
So, I'm honestly having trouble finding the trouble. If your algorithm already knows to load each sector, read the two leading bytes, and loop, then that's it. What else do you need?
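Load a sector. If sector-byte #0 is nonzero, add 254 bytes to your file length, and loop. After the loop, take (the contents of sector-byte #1) minus one, and add that to your file length. (Sector-byte #1 is the index of the last data byte, and the data starts at index 2, so the final sector holds that index minus one bytes.) Done!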
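Here is a rough Python sketch of that loop, for anyone who prefers to see it spelled out. It assumes a caller-supplied `read_sector(track, sector)` function returning the raw 256 bytes of a sector; that function and the name `file_length` are hypothetical stand-ins, not part of any real drive API. Since sector-byte #1 of the final sector is the index of the last data byte and the data begins at index 2, the number of bytes used there works out to that index minus one.

```python
from typing import Callable

DATA_BYTES_PER_SECTOR = 254  # 256 bytes minus the 2-byte link


def file_length(read_sector: Callable[[int, int], bytes],
                track: int, sector: int) -> int:
    """Follow the sector chain from (track, sector) and return the file's byte length."""
    length = 0
    visited = set()  # guards against a corrupted chain that loops back on itself
    while True:
        if (track, sector) in visited:
            raise ValueError("sector chain loops back on itself")
        visited.add((track, sector))

        raw = read_sector(track, sector)
        next_track, second_byte = raw[0], raw[1]

        if next_track != 0:
            # Not the last sector: all 254 data bytes belong to the file.
            length += DATA_BYTES_PER_SECTOR
            track, sector = next_track, second_byte
        else:
            # Last sector: second_byte is the index of the last data byte,
            # and data starts at index 2, so (index - 1) bytes are in use.
            return length + second_byte - 1
```

A two-sector file whose last sector has $00 followed by 101 in its first two bytes would come out as 254 + 100 = 354 bytes.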
|
|
|
Post by wsoft on Dec 1, 2020 21:59:48 GMT
✌
|
|
|
Post by bjonte on Dec 2, 2020 5:00:25 GMT
gsteemso is right. Files are made up of linked sectors. Each sector begins by pointing to the next track and sector followed by 254 bytes of data. The last points to track 0 (which doesn’t exist) and sector N (which tells where the data ends in this last sector, and so how many bytes of it are used). So there’s the end marker, the last track/sector pointer.
|
|
|
Post by wsoft on Dec 2, 2020 12:26:25 GMT
I don't mean to argue with anyone here, but no EOF? Come on! What do you people think bit 6 in ST is for? And wasn't I the one who said you could read those two bytes, use those as the next memory-read and so on? What's the problem? On a different note I warned that it would be problematic, because under normal circumstances (normal file-reads), you can't get an end-address without the EOF signal and YES Caroline bit 6 in ST does just that; for every byte of data, you're checking the EOF bit in $90 but that only pertains to reading file data... so, in contrast to reading a file, you're doing individual memory-reads at two bytes per read and this means you are not going to use ST anyway. I better not have said you should... hmnn I admit maybe it came across that way. Maybe I didn't get enough sleep or something? Above, when referring to EOF I meant the actual end of the file, not the signal. You can still get the ending track and sector when you get an "00" and an "FF" back from your memory-read.
Wasn't this about getting the end-address of a large file? What's the problem, open the file and read every byte... wait for the EOF bit in $90 and wait all day for the thing, for all I care.
That being said, I sure didn't mean to read every block of data into the computer. First you need to know what track and sector your file starts on; you could get that info easily enough with a disk monitor because the first T/S of every file is noted on the directory track. Use those two bytes as a starting-point for memory-reads, then you know where to get the next two bytes, and the next two, etc. That was the idea but without an EOF signal you only get to the last track and sector, but you still don't know how many bytes are in there. You only know to check for a ballpark figure of where the file ends by looking for an "00" and an "FF", which tells your program it can stop because it's on the last one. How many bytes within that last T/S actually pertain to the file is another question; there's no way to find that out by reading in this manner (unless, as I suggested, you prep the disk with some arbitrary value before saving to it).
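To make the "start from the directory, then hop two bytes at a time" idea concrete, here is a rough Python sketch. It works on a 35-track .d64 disk image held in memory rather than on a real drive with memory-reads, which is purely an assumption for illustration; the helper names (`sector_offset`, `first_track_sector`, `walk_links`) are made up. It also assumes the standard 1541 layout: directory chain starting at track 18, sector 1, and 32-byte entries with the file type at offset 2, the first T/S at offsets 3 and 4, and the name at offsets 5 to 20 padded with $A0. The second byte of the final link is handed back as-is.

```python
# Sectors per track for tracks 1..35 of a standard 1541 disk.
SECTORS_PER_TRACK = [21] * 17 + [19] * 7 + [18] * 6 + [17] * 5


def sector_offset(track, sector):
    """Byte offset of (track, sector) inside a 35-track .d64 image."""
    if not 1 <= track <= 35 or not 0 <= sector < SECTORS_PER_TRACK[track - 1]:
        raise ValueError(f"illegal track/sector {track}/{sector}")
    return (sum(SECTORS_PER_TRACK[:track - 1]) + sector) * 256


def first_track_sector(image, name):
    """Scan the directory chain for `name` (raw bytes) and return its first T/S."""
    track, sector = 18, 1
    visited = set()
    while track != 0:
        if (track, sector) in visited:
            raise ValueError("directory chain loops back on itself")
        visited.add((track, sector))
        base = sector_offset(track, sector)
        block = image[base:base + 256]
        for start in range(0, 256, 32):       # eight 32-byte entries per sector
            entry = block[start:start + 32]
            if entry[2] == 0:                 # file type 0: empty or scratched slot
                continue
            if bytes(entry[5:21]).rstrip(b"\xa0") == name:
                return entry[3], entry[4]
        track, sector = block[0], block[1]    # link to the next directory sector
    raise FileNotFoundError(name)


def walk_links(image, track, sector):
    """Follow a file's chain reading only the two link bytes of each sector.

    Returns (number of sectors visited, second link byte of the last sector).
    """
    blocks = 0
    visited = set()
    while True:
        if (track, sector) in visited:
            raise ValueError("file chain loops back on itself")
        visited.add((track, sector))
        base = sector_offset(track, sector)
        next_track, next_sector = image[base], image[base + 1]
        blocks += 1
        if next_track == 0:
            return blocks, next_sector
        track, sector = next_track, next_sector
```

Only two bytes per sector are ever looked at in `walk_links`, so nothing close to the whole file has to sit in memory at once.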
What was the original idea behind knowing how long a file is, anyway? Most people will never need to know how big a file actually is, so what is practical about knowing at which exact point it ends? What is there to be gained? Do it an easier way, like, what is its filesize in the directory? Okay for every block in the file there are two bytes you can subtract for the track and sector, so essentially you're doubling the amount of blocks you can see there, and dividing the result by 256 and subtracting that amount from however many blocks that the directory says it is. Of course it won't be exact, but let me return to my point where I ask again, what purpose does it really serve?
I re-read my older posts and I can see how I flubbed up what I was writing. I'm neither the best, nor the most prolific at writing, but what we got here is mostly a failure to communicate (clarify) and that's my problem not yours. I seem to have some kind of issue getting my point across. Short answer is maybe I never paid nuff tension in skool. Alles Klar?
Bjonte, as for the last track and sector revealing the ending location, well then, now you have your answers and I hope that this will serve to verify the legitimacy of the method I have described above. I just checked, and it appears you are quite correct about the ending location being contained within the last track and sector pointer. Guess I was wrong about that. Otherwise I'm quite done here, and my brain farts have been exposed quite nicely 😁 Thanks. Other thank-yous to gsteemso as well, who definitely knows how much a block contains. 😁 'nuff sed. I'm at beer #15 so I will quit while I'm still ahead.
|
|
|
Post by wsoft on Dec 11, 2020 20:57:35 GMT
If I were the "Tin Man" I'd say I need a little oil... lol
|
|