Integer BASIC Pretty Lister

About 2 1/2 years ago, Mike Laumer, of Carrollton, Texas, wrote a program to make pretty listings of Integer BASIC programs.  He gave me a copy to look at, and then we both forgot about it.  A few days ago I found it again, dusted it off, typed it in, and tried it out.  After a little debugging, here is the result.

Which is neater?

     100 FOR I=1 TO 40: A(I)=I: A(I+41)=I*I: NEXT I

 or? 100 FOR I=1 TO 40
         : A(I)=I
         : A(I+41)=I*I
         : NEXT I

Mike and I happen to like the latter format, especially for printing in newsletters.  It is a lot easier to read.  And why print it if no one is going to read it?

If you are in Integer BASIC, and you have a program in memory ready to list, here are the steps to get a "pretty listing".

     1.  BLOAD B.PRETTY.LISTER
     2.  POKE 0,40    (or whatever number of characters per line you wish it to use)
     3.  CALL 2048

If you want it to print on your printer, be sure to turn it on in the way you usually do before the CALL 2048.  For example, if you have a standard Apple interface in slot 1, type "PR#1" just before the CALL 2048.

If you check it out, you will find a lot of similarity between the code in this program and what is stored in the Integer BASIC ROMs around locations $E00C through $E0F9.  The routines are not in the same order, and there are a few significant changes to make the listing "pretty" and to control the line length.  As I was typing in Mike's program, I took the liberty of "modularizing" it a little more, so that I could understand it.  The PRINT.DECIMAL routine in lines 2500-2810 is almost identical to the one at $E51B in the BASIC ROMs.  The changes are for the purpose of counting the number of digits actually printed; this allows closer control over line length.

Since one of the promised features of the Apple Assembly Line was commented disassemblies of some of the Apple's ROM code, I will try to explain how PRETTY.LIST works in some detail, module by module.  You can then apply my explanation to the code which resides in ROM at $E00C-$E0F9.

PRETTY.LIST:  This module is the overall control for the listing process.  Since PP points to the beginning of the BASIC source program, lines 1270-1300 transfer this pointer into SRCP.  Then SRCP is compared with HIMEM, to see if we are finished listing.  The check is made before even listing one line, because it is possible that there is no source program to list!  If the value in SRCP is greater than or equal to the value in HIMEM, then the listing is finished, and PRETTY.LIST returns to BASIC by JMP to DOS.REENTRY ($3D0).  If the listing is not finished, I call PRINT.ONE.LINE to format and print out one line of the source program.  "One line" may be several statements separated by colons.  Then I jump back to the test to see if we are through yet, and so on and on and on.
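
For readers who think more easily in a high-level language, the control loop can be sketched in Python.  This is only an illustration, not a translation of the assembly code:  memory is modeled as a bytes object, and skip_one_line is a hypothetical stand-in for PRINT.ONE.LINE which prints nothing.  It steps to the next line by adding the length byte, on the assumption that the length byte counts the whole line, itself included.

```python
def pretty_list(memory, pp, himem, print_one_line):
    """Walk the program from pp up to himem, one stored line at a time.

    print_one_line(memory, srcp) must return the address of the next line.
    Returns the number of lines processed.
    """
    srcp = pp                  # copy the program pointer into SRCP
    count = 0
    while srcp < himem:        # test before listing: the program may be empty
        srcp = print_one_line(memory, srcp)
        count += 1
    return count


def skip_one_line(memory, srcp):
    """Hypothetical stand-in for PRINT.ONE.LINE: just step over one line.

    Assumes the length byte counts every byte of the line, itself included.
    """
    return srcp + memory[srcp]
```

Note that the test comes at the top of the loop, so an empty program (pp equal to himem) lists nothing at all.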

PRINT.ONE.LINE:  A source line in Integer BASIC is encoded in token form, and this routine has to convert it back to the original form to list it.  First, let's look at how a coded line is laid out.

     #     line
    bytes number     body of source line      01

The first byte of a line is the line length; we will ignore it in this program, because we do not need it.  The last byte of each line is the hex value $01, which is the token for end-of-line.  That is all we need to signal the end of a line, and the start of another one.  The second and third bytes of each line are the line number, in binary, with the low byte first.  The body of the line is made up of a combination of tokens and ASCII characters.
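
The layout just described can be taken apart in a few lines of Python.  This is a sketch for illustration only; the function name split_line is mine, and the sample bytes in the usage note are made up (the token value $51 is only a placeholder).

```python
def split_line(line):
    """Split one encoded line into (length_byte, line_number, body).

    `line` holds exactly one stored line, from its length byte through
    the $01 end-of-line token.
    """
    if len(line) < 4 or line[-1] != 0x01:
        raise ValueError("line must end with the $01 end-of-line token")
    length_byte = line[0]                    # ignored by the lister
    line_number = line[1] | (line[2] << 8)   # binary, low byte first
    body = line[3:-1]                        # tokens and ASCII characters
    return length_byte, line_number, body
```

For example, split_line(bytes([0x05, 0x64, 0x00, 0x51, 0x01])) gives a line number of 100 and a one-byte body.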

For the most part, tokens have a hex value less than $80, while the ASCII characters have a hex value of $80 or greater.  One important exception is the token for a decimal constant.  These are flagged by a pseudo-token consisting of the first digit of the constant in ASCII (hex $B0 through $B9); after the token, two bytes follow which contain the binary form of the constant with the low byte first.  For example, the decimal constant 1234 would be stored in three bytes as:  $B1 D2 04.
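
The three-byte form of a constant is easy to check by hand, or with a short Python sketch such as the following.  The function names are mine, chosen for the illustration; nothing like them appears in the program.

```python
def encode_constant(value):
    """Build the three-byte form: first digit in high-bit ASCII, then low, high."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("constant must fit in two bytes")
    first_digit = str(value)[0]
    return bytes([ord(first_digit) | 0x80, value & 0xFF, value >> 8])


def decode_constant(three_bytes):
    """Return (first_digit, value) from a three-byte constant."""
    flag, low, high = three_bytes
    if not 0xB0 <= flag <= 0xB9:
        raise ValueError("not a constant pseudo-token")
    return chr(flag & 0x7F), low | (high << 8)
```

Since 1234 is $04D2, encode_constant(1234) produces $B1 D2 04, matching the example above.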

The task of PRINT.ONE.LINE is to scan through the coded form of a line, printing each ASCII character, and converting each token to its printing form.  In addition, the routine must count line position as it goes, so that a new line can be started when one fills up.  Furthermore, we want it to start a new line whenever the ":" indicates a new statement has begun within a line.  We have to look out for REM statements and quoted strings, because the ":" might appear in them without signalling a new statement.

Lines 1400-1460 start the ball rolling.  The line position is set to zero, and the fill flag for the PRINT.DECIMAL routine is set to produce a right-justified-blank-filled number.  Then GET.NEXT.BYTE is called to advance SRCP past the byte count in the first byte of the line.  GET.NEXT.BYTE returns the value of the byte in A, and with Y=0.  This time we ignore the value in A, and use the fact that Y=0 to clear A.

Lines 1470-1510 pick up the two bytes of the line number and call PRINT.DECIMAL to print it out.  These same lines will be used later to print out any constants which are in the line.  These lines are entered this time with A=0 and with IB.FILL set for the RJBF mode (right-justified-blank-filled).  Later for constants they will be entered with IB.FILL set for printing with no leading blanks, and with A <> 0.  The value in A is used to set IB.FLAG, which determines whether a trailing blank will be printed.  One will be printed after the line number, but not after a constant inside a line.  (For a character that uses so little ink, blanks can sure eat up a lot of code!)

At line 1520 the main body of the PRINT.ONE.LINE routine begins.  CHECK.EOL.GET.NEXT.BYTE decides whether we are getting too close to the end of the line.  This prevents splitting token-words in half, with a few characters dangling off the end of one line, and the rest starting a new one.  (At least, on the screen it would look like that; on a printer it might just print out into a margin.)  The routine will start a new line before returning if the end is too near.  When it finally does return, the next byte will be in A, and Y will be zero.  If the next byte is a token (less than $80), control branches to line 1720.  If the first bit of the byte is 1, and the second bit is 0, the code at lines 1550-1580 assumes the pseudo-token for a constant has appeared.  If the second bit is also 1, the byte is an ASCII character.  Before printing the character, lines 1590-1630 may print a blank.  This would be a trailing blank after printing a token or a line number.  The character is then printed at lines 1640-1650, and another end-of-line check is made.  This time "too near the end" is defined as within 3 spaces.  The next byte must either be a token or yet another ASCII character, so a determination is made in lines 1660-1700.

Tokens are harder to handle, because we have to test for several special cases, and if not a special case the token table must be searched to find the token's name.  Lines 1720-1740 test for the end-of-line token; if this is it, a carriage return is printed and PRINT.ONE.LINE returns back to its caller.

If the token is the new-statement-token, used for ":", a new line is started.  Then the fun begins:  we have to search the token table.  This table is the most recondite portion of the whole Apple computer!  I have only scratched its surface.  The table is located between $EC00 and $EDFF, but it is not in that order.  It goes like this: first $ED00, then $EDFF-$ED01 (yes, backwards!), then $EC00, then $ECFF-$EC01.  The names for all the tokens are stored in the table, along with various bits of information about precedence and syntax.  If you print out the table, you will not see any names...  Steve Wozniak subtracted $20 from each byte before putting it into the table.  Well, there is a lot more to it than that, but I am getting lost, side-tracked.
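
The odd ordering is easier to see written out.  The Python sketch below merely generates the addresses in the order given above; it makes no attempt to decode what is stored there.

```python
def table_addresses():
    """Yield the token table addresses in the order the table is laid out."""
    for page in (0xED00, 0xEC00):
        yield page                           # offset 0 of the page comes first
        for offset in range(0xFF, 0x00, -1):
            yield page + offset              # then $FF down to $01, backwards
```

The generator produces all 512 addresses exactly once, beginning $ED00, $EDFF, $EDFE and ending at $EC01.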

After finding the token's name string inside the token table, we have to print it out.  This is done in lines 1840-1940.  The name is terminated either by the last character having a value greater than $BF, or by the next character in the table having a value less than $80.  The routine at $E00C decides whether or not to print a trailing blank, I think.

After printing the token's name, lines 1960-2010 test for REM or a quoted string.  Either of these would be followed by a bunch of ASCII characters terminated by a token, so control branches to line 1660 to print them out.  If neither, we go back to line 1520, to get the next token, or whatever.

Somehow I skipped over line 1830.  I believe the JSR $EFF8 determines whether or not to print a space in front of the token name.
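
To pull the pieces of PRINT.ONE.LINE together, here is a sketch of the scanning logic in Python.  It is not a translation of the assembly code.  It ignores line length and blank handling entirely, and simply breaks the line into statements.  The token names are passed in as a dictionary rather than dug out of the ROM table, and the caller says which token is the ":" and which tokens (REM, the opening quote) are followed by a run of ASCII characters.  All of these parameters are my own invention for the illustration.

```python
EOL = 0x01   # end-of-line token


def decode_line(line, names, colon, literal_starters):
    """Return (line_number, statements) for one encoded line.

    names:            dict mapping token value -> printable name
    colon:            token value used for the statement separator
    literal_starters: tokens followed by raw ASCII (REM, opening quote)
    statements is a list of lists of printable pieces, one list per statement.
    """
    number = line[1] | (line[2] << 8)
    pos = 3
    statements = [[]]
    while True:
        b = line[pos]
        pos += 1
        if b < 0x80:                          # a token
            if b == EOL:
                return number, statements
            if b == colon:
                statements.append([])         # new statement, new output line
            statements[-1].append(names[b])
            if b not in literal_starters:
                continue
        elif b & 0x40 == 0:                   # bits 10: constant pseudo-token
            value = line[pos] | (line[pos + 1] << 8)
            pos += 2
            statements[-1].append(str(value))
            continue
        else:                                 # bits 11: an ASCII character
            pos -= 1                          # let the run below collect it
        text = ""
        while line[pos] >= 0x80:              # ASCII run ends at the next token
            text += chr(line[pos] & 0x7F)
            pos += 1
        if text:
            statements[-1].append(text)
```

Notice that a ":" inside a quoted string never reaches the colon test, because it is swallowed by the ASCII run; only the new-statement token starts a new output line.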

FIND.TOKEN:  Lines 2040-2110 set up a pointer to the half of the token table which contains the name string for the token we want.  Tokens $00 through $50 are in the first half, and $51 through $7F are in the second half.

Lines 2120-2250 scan through the table, counting token names as they are passed.  When the nth one is found, where n is the token value, the routine returns.  It returns with A=0, and Y = offset in the half of the token table we have been scanning.

CHECK.EOL.GET.NEXT.BYTE:  Enter this routine with A containing the number of bytes short of the end of the line you want to test for, as a negative number.  If too near the end, CR.7.BLANKS will be called to start a new line.  In any case the routine exits by transferring to GET.NEXT.BYTE to get the next byte from the source line.

CR.7.BLANKS:  Prints a carriage return and 7 blanks to start a new line.

CHAR.OUT:  Simply counts characters and then calls on the Apple monitor to print out a character.  We need to count columns for CHECK.EOL.GET.NEXT.BYTE.
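
These three small routines work together, and their interplay can be sketched in Python.  The class below is an illustration under my own assumptions:  that the column counter goes back to zero on a carriage return, and that "too near" means the column has reached the line width plus the (negative) margin.  The exact comparison in the assembly code may differ.

```python
class ColumnCounter:
    """Toy model of CHAR.OUT, CR.7.BLANKS and the end-of-line check."""

    def __init__(self, width=40):
        self.width = width        # characters per line (the value POKEd into 0)
        self.column = 0
        self.pieces = []

    def char_out(self, ch):
        """Emit one character and keep the column count up to date."""
        self.pieces.append(ch)
        self.column = 0 if ch == "\r" else self.column + 1

    def cr_7_blanks(self):
        """Start a new output line, indented 7 columns."""
        self.char_out("\r")
        for _ in range(7):
            self.char_out(" ")

    def check_eol(self, margin):
        """margin is negative: how far short of the end counts as too near."""
        if self.column >= self.width + margin:
            self.cr_7_blanks()

    def text(self):
        return "".join(self.pieces)
```

With a width of 20 and a margin of -3, the check starts a new line once 17 characters have been printed, leaving the column at 7.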

PRINT.DECIMAL:  Lifted out of Integer BASIC from $E51B, and modified to eliminate the ability to store the converted number in the input buffer, and to add the ability to count output characters.
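
The two printing modes and the character count can be illustrated with a Python sketch.  This does not follow the ROM algorithm; it leans on Python's own number conversion.  The field width of 5 for the right-justified mode is my assumption, not something taken from the listing.

```python
def print_decimal(value, blank_fill=True, width=5):
    """Return (text, count) for a number in one of the two printing modes.

    blank_fill=True  : right-justified, blank-filled (used for line numbers)
    blank_fill=False : no leading blanks (used for constants inside a line)
    count is the number of characters produced, for the column counter.
    """
    digits = str(value)
    text = digits.rjust(width) if blank_fill else digits
    return text, len(text)
```

Returning the count along with the text is the point of the exercise:  the caller needs it to keep track of the print position.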

Additions to this program:  You might like to add some more features to this program.  For example, it would be nice to have it request the line length and printer slot number itself, and turn the printer on and off.  Also, it would be helpful to add indentation for FOR...NEXT loops and IF...THEN statements.  The same program could be merged with a cross reference program to build and print a variable and line number cross reference.

If you decide to try any of these, or any other enhancements, why not write them up and send them to me for publication?
