How Many Bytes for each Opcode?............Bob Sander-Cederlof

I have been thinking lately about a semi-automatic object code relocation scheme.  Steve Wozniak wrote one for the 6502 back in 1976, published in various places such as Call APPLE's "Wozpak".  But now we need one for the 65C02, and maybe for the 65816.

Steve's version used his "Sweet-16" interpreter for some of the address arithmetic.  That was okay, because Sweet-16 was in ROM in every Apple in those days.  Not so now, although it is available to DOS 3.3 users as part of the Integer BASIC package.  But we should write one that does not require Sweet-16.

Steve's relocator also used a ROM-based routine (part of the built-in disassembler) to determine how many bytes are used by each opcode.  This routine has been modified in the //c monitor and the new enhanced //e monitor to include the 65C02 opcodes.  That's nice, because that means Woz's program will automatically work with 65C02 programs if you run it with the new monitors.  However, since I want to include all the 65816 opcodes, I need a new version.

The first step seems to be to write a program which will tell me how many bytes each opcode uses.  I know that opcodes which are only one or two bytes do not need any relocation adjustments when a program is moved to a different place in memory.  Most 3-byte and all 4-byte instructions contain absolute addresses; if an absolute address is inside the program being moved, it will have to be adjusted for the new location.
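As a rough illustration of that adjustment, here is a minimal Python sketch.  It is not part of any existing relocator; the function name and parameters are hypothetical, and it assumes the program occupies one contiguous range of memory and uses 16-bit addresses.

```python
def relocate_operand(operand, old_start, old_end, new_start):
    """Return the 16-bit address operand adjusted for a moved program.

    old_start..old_end (inclusive) is the range the program occupied
    before the move; new_start is where it begins afterwards.
    All names here are illustrative assumptions.
    """
    if old_start <= operand <= old_end:
        # The address points inside the program, so it moves with it.
        return (operand + new_start - old_start) & 0xFFFF
    # The address points outside the program (ROM, I/O, etc.): leave it.
    return operand
```

For example, moving a program from $0800-$08FF to $2000 would change an operand of $0823 to $2023, while a call to a monitor routine such as $FDED would be left alone.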

I haven't written the entire relocator yet, but I have written a program which will tell me all I need to know about the length of an opcode.  My program returns the length in bytes and also two flags.  One flag indicates the opcode is a 3-byte instruction which includes an absolute address.  The other flag indicates the opcode is an immediate mode instruction.  Immediate mode in 65816 code is ambiguous in length, except during execution.  My program calls them two-byte instructions, but they may be three bytes each if the status bits so indicate at execution time.  I am not sure how my relocator will handle this ambiguity, but for now I am content just to set a flag.
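To make the ambiguity concrete, here is a small Python sketch of how the length could be resolved if the state of the M and X status bits were somehow known.  How a relocator would learn those bits is left open here, just as it is above; the function name and the two flag parameters are assumptions for illustration only.

```python
# Accumulator immediates (ORA, AND, EOR, ADC, BIT, LDA, CMP, SBC):
# their operand size follows the M status bit.
ACC_IMMEDIATE = {0x09, 0x29, 0x49, 0x69, 0x89, 0xA9, 0xC9, 0xE9}
# Index register immediates (LDY, LDX, CPY, CPX): size follows the X bit.
INDEX_IMMEDIATE = {0xA0, 0xA2, 0xC0, 0xE0}

def immediate_length(opcode, m_flag, x_flag):
    """Length in bytes of an immediate-mode instruction.

    m_flag and x_flag are the M and X status bits (True = 8-bit).
    Hypothetical helper; raises ValueError for any other opcode.
    """
    if opcode in ACC_IMMEDIATE:
        return 2 if m_flag else 3
    if opcode in INDEX_IMMEDIATE:
        return 2 if x_flag else 3
    raise ValueError(f"${opcode:02X} is not a variable-length immediate")
```

With both bits set, as they always are in emulation mode, every one of these comes out as two bytes, which matches the length my program reports.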

The code in the monitor which determines the length of opcodes uses a table lookup method.  I figure that I could do that too, with a 64-byte table, using two bits for each opcode.  I would still need a way to test for immediate mode and the special three-byte opcodes which do not have absolute addresses (MVP, MVN, PER, and BRL).
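The arithmetic behind the 64-byte figure is 256 opcodes times two bits, or 512 bits.  Here is a minimal Python sketch of such a packed table.  The bit layout (four opcodes per byte, lowest opcode in the lowest two bits, storing length minus one) is my own assumption and is not the layout the monitor uses.

```python
def pack_lengths(lengths):
    """Pack 256 instruction lengths (each 1..4) into 64 bytes.

    Each opcode gets two bits holding length-1.  The layout is an
    illustrative assumption, not the monitor's actual table format.
    """
    if len(lengths) != 256:
        raise ValueError("need one length per opcode")
    table = bytearray(64)
    for opcode, length in enumerate(lengths):
        shift = (opcode & 3) * 2            # which pair of bits in the byte
        table[opcode >> 2] |= (length - 1) << shift
    return bytes(table)

def lookup_length(table, opcode):
    """Recover the length of one opcode from the packed table."""
    shift = (opcode & 3) * 2
    return ((table[opcode >> 2] >> shift) & 3) + 1
```

A lookup is one index calculation, a shift, and a mask, but the table says nothing about immediate mode or about which three-byte opcodes carry an absolute address.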

After looking at a chart which showed all the lengths, I decided to do it with bit analysis rather than table lookup.  It is probably a little slower, but also a little smaller.

It turns out that almost all of the opcodes whose second hex digit is less than 8 use two bytes.  There are only nine exceptions.  One interesting case here is BRK, which assembles to only one byte but is considered by the microprocessor to be a two-byte opcode.  I am not sure whether the relocator should consider BRK a single-byte or a two-byte opcode, but I think it should probably be one byte.

All opcodes with the hex values $x8, $xA, and $xB are one byte, without exception.  All opcodes with the hex values $xC, $xD, and $xE are three bytes with absolute addresses, with only one exception:  $5C is a four-byte instruction.  All opcodes with value $xF are four bytes each.

The column of opcodes with values $x9 is divided into two groups.  Those with the first digit even ($09, $29, $49, etc.) are immediate mode opcodes, which may be either two or three bytes each depending on status bits during execution.  The odd ones ($19, $39, $59, etc.) are all three bytes each with absolute addresses.

Here is a table of the various byte counts, which was actually computed by my program.  I printed "2#" for immediate mode opcodes, and "3+" for three-byte opcodes with absolute addresses.

       0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F

   0
   1
   2
   3
   4
   5
   6
   7
   8
   9
   A
   B
   C
   D
   E
   F
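The rules described above can be sketched in Python as follows.  This is not my assembly language program, only an illustration of the same bit analysis.  The function names are hypothetical.  The list of exceptions in the low columns is taken from the 65816 opcode chart.  Flagging PEA as having an absolute address, and flagging the index register immediates (LDY, LDX, CPY, CPX), are assumptions of this sketch.

```python
NO_ADDRESS_3 = {0x44, 0x54, 0x62, 0x82}     # MVP, MVN, PER, BRL
INDEX_IMMEDIATE = {0xA0, 0xA2, 0xC0, 0xE0}  # LDY #, LDX #, CPY #, CPX #

def opcode_info(opcode):
    """Return (length, has_absolute_address, is_immediate).

    has_absolute_address is only set for three-byte opcodes; a length
    of 4 already implies a (long) address.  BRK counts as two bytes.
    """
    low = opcode & 0x0F
    if low >= 0x0C:                          # columns $xC through $xF
        if low == 0x0F or opcode == 0x5C:    # long addressing
            return 4, False, False
        return 3, True, False
    if low in (0x08, 0x0A, 0x0B):            # always one byte
        return 1, False, False
    if low == 0x09:
        if opcode & 0x10:                    # odd first digit: absolute,Y
            return 3, True, False
        return 2, False, True                # even first digit: immediate
    # Second digit less than 8: two bytes, with nine exceptions.
    if opcode in (0x40, 0x60):               # RTI, RTS
        return 1, False, False
    if opcode in (0x20, 0xF4):               # JSR, PEA (assumed flagged)
        return 3, True, False
    if opcode in NO_ADDRESS_3:
        return 3, False, False
    if opcode == 0x22:                       # JSL
        return 4, False, False
    if opcode in INDEX_IMMEDIATE:            # assumed flagged as immediate
        return 2, False, True
    return 2, False, False

def format_cell(opcode):
    """Render one table cell: '2#' for immediate, '3+' for absolute."""
    length, has_abs, is_imm = opcode_info(opcode)
    mark = "+" if has_abs else "#" if is_imm else " "
    return f"{length}{mark}"

def format_table():
    """Build the 16-by-16 chart as text, first hex digit down the side."""
    lines = ["     " + "  ".join(f"{d:X}" for d in range(16))]
    for high in range(16):
        cells = " ".join(format_cell(high << 4 | low) for low in range(16))
        lines.append(f"  {high:X}  {cells}".rstrip())
    return "\n".join(lines)
```

Calling print(format_table()) produces a chart laid out like the one above, with the first hex digit down the side and the second across the top.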

The program which printed the table is in lines 1050-1320 below.  The routine which computes how many bytes an opcode uses follows that.  By inserting a "BEQ .6" between lines 1410 and 1420 I could make BRK a one-byte opcode.

My relocator should probably also be on the lookout for calls to ProDOS MLI.  This is in effect a six-byte instruction.  The first three bytes are $20, $00, $BF (JSR MLI).  The fourth byte is the MLI function code.  The last two bytes are the address of a parameter table, and so should be considered as a relocatable address.
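A check for that six-byte pattern might look like the following Python sketch.  The function name and return format are hypothetical.  It only recognizes the byte pattern described above, and it assumes the bytes being examined really are code at an instruction boundary.

```python
MLI_CALL = bytes([0x20, 0x00, 0xBF])    # JSR $BF00

def mli_call_at(code, offset):
    """Return (function_code, param_address) if an MLI call starts here.

    code is a bytes-like object holding the program; offset indexes it.
    Returns None if the six-byte pattern is not present at offset.
    """
    if offset + 6 > len(code) or bytes(code[offset:offset + 3]) != MLI_CALL:
        return None
    function_code = code[offset + 3]
    # The parameter table address is stored low byte first.
    param_address = code[offset + 4] | (code[offset + 5] << 8)
    return function_code, param_address
```

When a call is found, the relocator would treat the last two bytes like any other absolute address and then skip ahead six bytes instead of three.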

I hope to continue to pursue this idea of a relocator, but I make no promises.  Maybe one of you would like to write one and share it with the rest of us.
