Shrinking Code Inside ProDOS...............Bob Sander-Cederlof

David Johnson challenged me a few days ago.  We were talking about ProDOS: the need for a ProDOS version of the S-C Macro Assembler, the merits vs. enhanced DOS 3.3, and the rash of recent articles on shrinking various routines inside DOS to make room for more features.

I've been avoiding ProDOS as much as possible, trying not to notice its ever-increasing market-share.  Dave's comment, "ProDOS is a fertile field for your shrinking talent," may have finally pushed me into action.

I am trying to make the ProDOS version of the S-C Macro Assembler, but it is hard.  I have Apple's manuals, Beneath Apple ProDOS, and the supplement to the latter book which explains almost every line of ProDOS code.  Nevertheless, version 1.1.1 of ProDOS doesn't seem to conform to all these descriptions in every particular.  I spent four hours last night chasing one little discrepancy.  (Turned out to be my own bug, though.)

In the process, I ran across the subroutine ProDOS uses to convert binary numbers to decimal for printing.  In version 1.1.1 it starts at $A62F, and with comments looks like this.

       <<<< prodos listing >>>>

The conversion routine is designed to handle values between 0 and $FFFFFF.  The highest byte must already have been stored at ACCUM+2 before calling CONVERT.TO.DECIMAL.  The middle byte must be in the X-register, and the low byte in the A-register.  The decimal digits will be stored in ASCII in the $200 buffer, starting at $201+Y and working backwards.

One way of converting from binary to decimal is to perform a series of divide-by-ten operations.  After each division, the remainder will be the next digit of the decimal value, working from right to left.  That is the technique ProDOS uses, and the division is done by the subroutine in lines 1280-1420.
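The repeated divide-by-ten idea can be sketched in a few lines of Python.  This is only a model of the technique, not the ProDOS code: the function name is hypothetical, and the rule of stopping when the quotient reaches zero is an assumption, since the text does not say how the ProDOS routine decides it is finished.

```python
def decimal_digits(value):
    """Return the ASCII decimal digits of value, produced right to left."""
    digits = []
    while True:
        # One divide-by-ten step: the remainder is the next digit.
        value, remainder = divmod(value, 10)
        digits.append(chr(ord("0") + remainder))
        if value == 0:          # assumed stopping rule (not from the listing)
            break
    return digits


# $FFFFFF is 16777215, so the digits come out in the order 5,1,2,7,7,7,6,1.
print(decimal_digits(0xFFFFFF))
```

Because the digits appear rightmost first, storing them into a buffer while stepping the index backwards leaves them in normal reading order.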

The dividend is in ACCUM, a 3-byte variable.  The low byte is first, then the middle, and finally the high byte.  One more byte is set aside for the remainder.  A 24-step loop is set up to process all 24 bits of ACCUM.  In the loop ACCUM and REMAINDER are shifted left. If REMAINDER is 10 or more, it is reduced by ten and the next quotient bit set to 1; otherwise the next quotient bit is 0.
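For readers who want to trace the loop by hand, here is a rough Python model of the 24-step shift-and-subtract division described above.  The variable names follow the text, but the function itself is hypothetical and treats ACCUM as a single 24-bit integer rather than three separate bytes.

```python
def divide_by_ten(accum):
    """Divide a 24-bit value by 10 using 24 shift-and-subtract steps.

    Returns (quotient, remainder).  The quotient is built up in the
    same variable that held the dividend, as in the description.
    """
    remainder = 0
    for _ in range(24):
        top_bit = (accum >> 23) & 1              # bit shifted out of ACCUM
        accum = (accum << 1) & 0xFFFFFF          # shift ACCUM left
        remainder = (remainder << 1) | top_bit   # shift it into REMAINDER
        if remainder >= 10:
            remainder -= 10                      # reduce by ten
            accum |= 1                           # next quotient bit is 1
    return accum, remainder


print(divide_by_ten(12345))
```

The quotient bits fill ACCUM from the low end at the same rate the dividend bits leave from the high end, which is why no separate quotient variable is needed.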

The first possible improvement I noted was in the area of lines 1330-1360.  The ROL REMAINDER will always leave carry status clear, because we never let REMAINDER get larger than 9.  If we delete the SEC instruction, and change SBC #10 to SBC #9 (because with carry clear SBC borrows, subtracting one extra), we can save one byte.  But that's not really worth the effort.
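The carry trick is easy to check with a small model of the 6502 SBC instruction in binary mode, which computes A minus the operand minus the inverse of carry.  The helper below is a hypothetical Python sketch, not part of any listing; it only demonstrates that SBC #9 with carry clear gives the same result and the same carry as SEC followed by SBC #10.

```python
def sbc(a, operand, carry):
    """Model 6502 binary-mode SBC.

    Computes a - operand - (1 - carry) and returns (result, carry_out),
    where carry_out is 1 when no borrow occurred.
    """
    diff = a - operand - (1 - carry)
    return diff & 0xFF, 1 if diff >= 0 else 0


# After the shift, REMAINDER can be anything from 0 through 19.
for r in range(20):
    assert sbc(r, 9, 0) == sbc(r, 10, 1)
print("SBC #9 with carry clear matches SEC / SBC #10")
```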

Next I realized that REMAINDER could be carried in the A-register within the 24-step loop, and not stored until the end of the loop.  Here is that version, which saves seven bytes (original = 31 bytes, this one = 24 bytes):


       <<<< listing of my lines 1260-1380 >>>>


To make sure my version really worked, I re-assembled the conversion program with an origin of $800, and appended a little test program.  Here is my test program, which converts the value at $0000...0002 and prints it out.


       <<<<listing of my lines 1510-1620 >>>>

My best version is yet to come.  I considered the fact that we could SHIFT the next quotient bit into the low end of ACCUM rather than using INC ACCUM to set a one-bit.  I rearranged the loop so that the remainder reduction was done first, followed by the shift-left operation.  I had to change the remainder reduction to work modulo 5 rather than 10, because the shifting operation came afterwards.  I also had to include my own three lines of code to ROL ACCUM, since the little subroutine in ProDOS started with ASL ACCUM.  The result is still shorter than 31 bytes, but only four bytes shorter.  Nevertheless, it is faster and neater, in my opinion.


       <<<<lines 1640-1770>>>>
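The reason modulo 5 works may not be obvious: a remainder of 5 or more before the shift is exactly a remainder of 10 or more after it, so the test can be made early and its carry shifted straight into ACCUM as the quotient bit.  The Python sketch below models that rearranged loop as I read the description.  The function name is hypothetical, and the mnemonics in the comments are assumptions about how the steps map onto instructions, not a copy of the listing.

```python
def divide_by_ten_mod5(accum):
    """Divide a 24-bit value by 10, reducing modulo 5 before each shift.

    Returns (quotient, remainder).  'a' plays the role of the A-register.
    """
    a = 0
    for _ in range(24):
        carry = 1 if a >= 5 else 0       # compare with 5 (assumed CMP #5)
        if carry:
            a -= 5                       # reduce; carry stays set
        top_bit = (accum >> 23) & 1      # bit leaving the top of ACCUM
        # Rotate left through all three bytes: carry enters at bit 0.
        accum = ((accum << 1) | carry) & 0xFFFFFF
        a = (a << 1) | top_bit           # rotate the top bit into A
    return accum, a


print(divide_by_ten_mod5(0xFFFFFF))
```

After the final shift the remainder is already in the range 0 through 9, so no correction step is needed at the end of the loop.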
