   A Proposed Assembly Language Syntax For 65c816 Assemblers 
       by Randall Hyde 
        
 
 This is a proposed standard for 65c816 assembly language.  The 
proposed standard comes in three levels: subset, full, and extended.  The 
subset standard is intended for simple (or inexpensive) products, 
particularly those aimed at beginning 65c816 assembly language programmers. 
The full standard is the focus of this proposal.  An assembler meeting the 
full level adopts all of the requirements outlined in this paper.  The 
extended level is a mechanism whereby a vendor can claim full compliance 
with the standard and point out that there are extensions as well.  An 
assembler cannot claim extended level compliance unless it also complies with 
the full standard.  An assembler, no matter how many extensions are 
incorporated, will have to claim subset level unless the full standard is 
supported.  This ensures that programmers who do not use any assembler 
extensions can assemble their programs on any assembler meeting the full or 
extended compliance levels.   
 
 In addition to the items required for compliance, this proposal  
suggests several extensions in the interests of compatibility with existing 
65c816 assemblers.  These recommendations are not required for full 
compliance with the standard; they're included in this proposal as suggestions 
to help make conversion of existing programs easier.  The suggestions are 
presented in two levels: recommended and optional.  Recommended items should 
be present in any decent 65c816 package.  Inclusion of the optional items 
is discouraged (since there are other ways to accomplish the same operation 
within the confines of the standard) but may be included in the assembler 
at the vendor's discretion to help alleviate conversion problems. 
 
 
 
 
 
 
 
 
 
   65c816 Instruction Mnemonics  
   ---------------------------- 
 
 
 All of the following mnemonics are required at the subset, full, 
and extended standard levels. 
 
 The following mnemonics handle the basic 65c816 instruction set: 
 
ADC - add with carry 
AND - logical AND 
BCC - branch if carry clear 
BCS - branch if carry set 
BEQ - branch if equal 
BIT - bit test 
BMI - branch if minus 
BNE - branch if not equal 
BPL - branch if plus 
BRA - branch always 
BRK - break point instruction 
BVC - branch if overflow clear 
BVS - branch if overflow set 
CLC - clear the carry flag 
CLD - clear the decimal flag 
CLI - clear the interrupt flag 
CLP - clear bits in P 
CLR - store a zero into memory 
CLV - clear the overflow flag 
CMP - compare accumulator 
CPX - compare x register 
CPY - compare y register 
CSP - call system procedure 
DEC - decrement acc or memory 
DEX - decrement x register 
DEY - decrement y register 
EOR - exclusive-or accumulator 
HLT - halt (stop) the clock 
INC - increment acc or memory 
INX - increment x register 
INY - increment y register 
JMP - jump to new location 
JSR - jump to subroutine 
LDA - load accumulator 
LDX - load x register 
LDY - load y register 
MVN - block move (addresses increment) 
MVP - block move (addresses decrement) 
NOP - no operation 
ORA - logical or accumulator 
PHA - push accumulator 
PHP - push p 
PHX - push x register 
PHY - push y register 
PLA - pop accumulator 
PLP - pop p 
PLX - pop x register 
PLY - pop y register 
PSH - push operand 
PUL - pop operand 
RET - return from subroutine 
ROL - rotate left acc/mem 
ROR - rotate right acc/mem 
RTI - return from interrupt 
RTL - return from long subroutine 
RTS - return from short subroutine 
SBC - subtract with carry 
SEC - set the carry flag 
SED - set decimal flag 
SEI - set interrupt flag 
SEP - set bits in P 
SHL - shift left acc/mem 
SHR - shift right acc/mem 
STA - store accumulator 
STX - store x register 
STY - store y register 
SWA - swap accumulator halves 
TAD - transfer acc to D 
TAS - transfer acc to S 
TAX - transfer acc to x 
TAY - transfer acc to y 
TCB - test and clear bit 
TDA - transfer D to acc 
TSA - transfer S to acc 
TSB - test and set bit 
TSX - transfer S to X 
TXA - transfer x to acc 
TXS - transfer x to S 
TXY - transfer x to y 
TYA - transfer y to acc 
TYX - transfer y to x 
WAI - wait for interrupt 
XCE - exchange carry with emulation bit 
 
Comments: 
 
 CLP replaces REP in the original 65c816 instruction set, since CLP 
is a tad more consistent with the original 6502 instruction set.  See  
"recommended options" for the status of REP.  CLR replaces the STZ 
instruction.  Since STA, STX, and STY are used to store 65c816 registers, 
STZ seems to imply that there is a Z register.  Using CLR (clear) eliminates 
any confusion.  CSP (call system procedure) replaces the COP mnemonic.  COP 
was little more than a software interrupt in both intent and implementation. 
CSP helps make this usage a little clearer.  HLT replaces the STP mnemonic. 
STP, like the STZ mnemonic, implies that the P register is being stored 
somewhere.  HLT (for halt) is just as obvious as "stop the clock" yet it 
doesn't have the same "look and feel" as a store instruction.   JML and JSL 
are not really required by the new standard;  but see recommended options 
concerning these two instructions.  Most of the new 65c816 push and pull 
instructions have been collapsed into two instructions: PSH and PUL. 
 
 PEA label    becomes  PSH #label 
 PEI (label)  becomes  PSH label 
 PER label    becomes  PSH @label 
 PHB          becomes  PSH DBR 
 PHD          becomes  PSH D 
 PHK          becomes  PSH PBR 
 
 PLB          becomes  PUL DBR 
 PLD          becomes  PUL D 
  
These mnemonics are more in line with the original design of the 6502 
instruction set whereby the mnemonic specifies the operation and the operand 
specifies the addressing mode and address.  The RET instruction gets converted 
to RTS or RTL, depending on the type of subroutine being declared.  RTS and  
RTL still exist in order to force a short or long return.  SHL and SHR (shift 
left and shift right) are used instead of ASL and LSR.  The 6500 family has 
NEVER supported an arithmetic shift left instruction.  The operation performed 
by the ASL mnemonic is really a logical shift left. To simplify matters, SHL 
and SHR are used to specify shift left and shift right.  SWA (swap accumulator 
halves) is used instead of XBA.  Since this is the only instruction that 
references the "B" accumulator, there's no valid reason for even treating 
the accumulator as two distinct entities (this is just a carry-over from the 
6800 MPU).  Likewise, since the eight-bit accumulator cannot be distinguished 
from the 16-bit accumulator on an instruction by instruction basis (it depends 
on the setting of the M bit in the P register), the accumulator should always 
be referred to as A, regardless of whether the CPU is in the eight or sixteen 
bit mode.  Therefore, instructions like TCD, TCS, TDC, and TSC should be 
replaced by TAD, TAS, TDA, and TSA.  For more info on these new mnemonics, 
see the section on "recommended options". 
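
The renamings discussed above can be collected into a small lookup table.
The following Python fragment is only a sketch of how a conversion tool might
apply them; the function name "translate" and the two table names are
hypothetical and are not part of this proposal.

```python
# WDC mnemonic -> proposed mnemonic, taken from the comments above.
RENAMED = {
    "REP": "CLP", "STZ": "CLR", "COP": "CSP", "STP": "HLT",
    "ASL": "SHL", "LSR": "SHR", "XBA": "SWA",
    "TCD": "TAD", "TCS": "TAS", "TDC": "TDA", "TSC": "TSA",
}

# Push/pull instructions that collapse into PSH/PUL with a fixed operand.
FIXED_OPERAND = {
    "PHB": "PSH DBR", "PHD": "PSH D", "PHK": "PSH PBR",
    "PLB": "PUL DBR", "PLD": "PUL D",
}


def translate(mnemonic, operand=""):
    """Rewrite one WDC-style instruction in the proposed syntax (hypothetical helper)."""
    m = mnemonic.upper()
    operand = operand.strip()
    if m in FIXED_OPERAND:
        return FIXED_OPERAND[m]
    if m == "PEA":
        return "PSH #" + operand
    if m == "PER":
        return "PSH @" + operand
    if m == "PEI":
        # PEI (label) becomes PSH label: drop the enclosing parentheses.
        if operand.startswith("(") and operand.endswith(")"):
            operand = operand[1:-1]
        return "PSH " + operand
    # Anything not renamed passes through unchanged.
    return (RENAMED.get(m, m) + " " + operand).strip()
```

For example, translate("PEI", "(label)") yields "PSH label", while an
instruction such as "LDA #0" is returned as written.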
 
 
   Built-in Macros 
   --------------- 
 
 The following instructions actually generate one or more instructions. 
They are not required at the subset level, but are required at the full and 
extended levels. 
  
 
ADD - emits CLC then ADC 
BFL - emits BEQ (branch if false) 
BGE - emits BCS 
BLT - emits BCC 
BTR - emits BNE (branch if true) 
BSR - emits PER *+2 then BRA (short) or PER *+3 then BRL (long) 
SUB - emits SEC then SBC 
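
As an illustration only, the simple built-in macros above can be modelled as a
table of instruction templates.  This Python sketch leaves out BSR (whose
expansion depends on the branch length), and the names "MACROS" and "expand"
are assumptions made for the example.

```python
# Each built-in macro maps to the instruction(s) it emits; {op} is the operand.
MACROS = {
    "ADD": ["CLC", "ADC {op}"],
    "SUB": ["SEC", "SBC {op}"],
    "BFL": ["BEQ {op}"],
    "BTR": ["BNE {op}"],
    "BGE": ["BCS {op}"],
    "BLT": ["BCC {op}"],
}


def expand(mnemonic, operand=""):
    """Return the list of real instructions emitted for one source statement."""
    m = mnemonic.upper()
    # Ordinary mnemonics expand to themselves.
    templates = MACROS.get(m, [m + " {op}"])
    return [t.format(op=operand).strip() for t in templates]
```

Here expand("ADD", "#1") produces the two statements "CLC" and "ADC #1".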
 
 
   Recommended Options 
   ------------------- 
 
 The following mnemonics are aliases of existing instructions.  The 
(proposed) standard recommends that the assembler support these mnemonics, 
mainly to provide compatibility with older source code, but does not 
recommend their use in new programs.  Some (or all) of these items may be 
removed from the recommended list in future revisions of the standard.  None 
of these recommended items need be present at the subset level.  If these 
are the only extensions over and above the full syntax, the assembler 
CANNOT claim to be an extended level assembler. 
 
ASL BRL COP JML JSL LSR PEA PEI PER 
PHB PHD PHK PLB PLD REP TCD TCS TDC 
TSC TRB WDM XBA 
 
 
 
 
  Symbols, Constants, and Other Items 
  ----------------------------------- 
 
 Symbols may contain any reasonable number of characters at the full 
level.  At the subset compliance level, at least 16 characters should be 
supported and 32 is recommended.  A "reasonable" number of characters should 
be at least 64 if the implementor needs a maximum value. 
 
 Symbols must begin with an alphabetic character and may contain 
(only) the following symbols:  A-Z, a-z, 0-9, "_", "$", and "!".  The 
assembler must be capable of treating upper and lower case alphabetic 
characters identically.  Note that this does not disallow an assembler from 
allowing the programmer to choose that upper and lower case be distinct, it 
simply requires that in the default case, upper and lower case characters 
are treated identically.  Note that the standard does not require case 
sensitivity in the assembler (and, in fact, recommends against it). 
Therefore, anyone foolish enough (for many, many reasons) to create variables 
that differ only in the case of the letters they contain is risking 
portability problems (as well as maintenance, readability, and other problems). 
 
 The following symbols are reserved and may not be redefined within 
the program: 
 
  A, X, Y, S, DBR, PBR, D, M, P 
 
Nor may these symbols appear as fields to a record or type definition (which 
will be described later). 
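
The symbol rules above are easy to express as a pattern plus a reserved-word
check.  The Python fragment below is a minimal sketch under that reading; the
helper names are hypothetical, and the optional case-sensitive mode is shown
only as a flag.

```python
import re

# First character alphabetic; the rest may be letters, digits, "_", "$" or "!".
SYMBOL_RE = re.compile(r"[A-Za-z][A-Za-z0-9_$!]*\Z")

# Register names that a program may not redefine.
RESERVED = {"A", "X", "Y", "S", "DBR", "PBR", "D", "M", "P"}


def is_definable_symbol(name):
    """True if a program may define 'name' as a symbol of its own."""
    return bool(SYMBOL_RE.match(name)) and name.upper() not in RESERVED


def same_symbol(a, b, case_sensitive=False):
    """By default upper and lower case are treated identically."""
    return a == b if case_sensitive else a.upper() == b.upper()
```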
 
 
 Constants take six different forms: character constants, string 
constants, binary constants, decimal constants, hexadecimal constants and 
set constants. 
 
 Character constants are created by surrounding a single character by  
a pair of apostrophes or quotation marks, e.g., "s", "a", '$', and 'p'.  If  
the character is surrounded by apostrophes, then the ASCII code for that  
character WITH THE H.O. BIT CLEAR will be used.  If the quotation marks are  
used, then the ASCII code for the character WITH THE H.O. BIT SET will be  
used.  If you need to represent the apostrophe with the H.O. bit clear or a 
quotation mark with the H.O. bit set, simply double up the characters, e.g., 
 
  '''' - emits a single apostrophe. 
  """" - emits a single quotation mark. 
 
 String constants are generated by placing a sequence of two or more 
characters within a pair of apostrophes or quotation marks.  The choice of 
apostrophe or quotation mark controls the H.O. bit, as for character 
constants.  Likewise, to place an apostrophe or quote within a string 
delimited by the same character, just double up the apostrophe or quotation 
mark: 
 
 'This isn''t bad!'  - generates  --This isn't bad!-- 
 "He said ""Hello""" - generates  --He said "Hello"-- 
 
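 The delimiter rules for character and string constants can be sketched 
in a few lines of Python.  This is an illustration only: the function name 
"encode_literal" is hypothetical, and the sketch assumes plain 7-bit ASCII 
source text.

```python
def encode_literal(text):
    """Encode a quoted character or string constant as bytes.

    Apostrophes give codes with the H.O. bit clear, quotation marks give
    codes with the H.O. bit set.  A doubled delimiter inside the constant
    stands for one literal delimiter character.
    """
    if len(text) < 3 or text[0] not in "'\"" or text[-1] != text[0]:
        raise ValueError("not a quoted constant")
    quote = text[0]
    body = text[1:-1]
    chars = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == quote:
            if body[i:i + 2] != quote * 2:
                raise ValueError("delimiter inside constant must be doubled")
            i += 1  # skip the second copy of the doubled delimiter
        chars.append(ch)
        i += 1
    high = 0x80 if quote == '"' else 0x00
    return bytes((ord(c) & 0x7F) | high for c in chars)
```

With this sketch, the constant "s" (in quotation marks) encodes as the single 
byte $F3, while 's' (in apostrophes) encodes as $73.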
 
 Binary integer constants consist of a sequence of 1 through 32 zeros 
or ones preceded by a percent sign ("%").  Examples: 
 
   %10110010 
   %001011101 
   %10 
   %1100 
 
 Decimal integer constants consist of strings of decimal digits without 
any preceding characters.  E.g.,  25,  235,  8325, etc.  Decimal constants 
may be (optionally) preceded by a minus sign. 
 
 Hexadecimal constants consist of a dollar sign ("$") followed by 
a string of hexadecimal digits (0..9 and A..F).  Values in the range $0  
through $FFFFFFFF are allowed. 
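
 A reader for the three integer forms described so far might look like 
the following Python sketch.  Several details are assumptions rather than 
requirements of this proposal: lower case hexadecimal digits are accepted, 
decimal values are limited to 32 bits, and a leading minus sign yields the 
32-bit two's complement value.  The name "parse_integer" is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def parse_integer(text):
    """Convert a binary (%), hexadecimal ($) or decimal constant to an int."""
    if text.startswith("%"):
        digits = text[1:]
        if not 1 <= len(digits) <= 32 or set(digits) - set("01"):
            raise ValueError("binary constants need 1 to 32 digits of 0 or 1")
        return int(digits, 2)
    if text.startswith("$"):
        digits = text[1:].upper()
        if not digits or set(digits) - set("0123456789ABCDEF"):
            raise ValueError("bad hexadecimal constant")
        value = int(digits, 16)
        if value > MASK32:
            raise ValueError("hexadecimal constant exceeds $FFFFFFFF")
        return value
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    if not digits or set(digits) - set("0123456789"):
        raise ValueError("bad decimal constant")
    value = int(digits)
    if value > MASK32:
        raise ValueError("decimal constant exceeds 32 bits")
    # Assumption: a negated constant is stored as a 32-bit two's complement.
    return (-value) & MASK32 if negative else value
```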
 
 Set constants are only required at the full and  extended compliance 
levels.  A set constant consists of a list of items surrounded by braces, 
e.g., {0,3,5}.  For more information, see the .SET directive. 
 
 
 
   Address Expressions 
   ------------------- 
    
 Most instructions and many pseudo-opcode/assembler directives require 
operands of some sort.  Often these operands contain some sort of address 
expression (some, ultimately, numeric or string value).  This proposed  
standard defines the operands, precision, accuracy, and available operations  
that constitute an address expression. 
 
Precision: all integer expressions are computed using 32 bits.  All string 
expressions are computed with strings up to 255 characters in length.  All 
floating point operations are performed using IEEE 80-bit extended floating 
point values (i.e., Apple SANE routines).  All set operations are performed 
using 32 bits of precision. 
 
Accuracy: all integer operations (consisting of two 32-bit operands and an 
operator on those operands) must produce the correct result if the actual 
result can fit within 32 bits.  If an overflow occurs, the value is truncated 
and only the low order 32 bits are retained.  If an underflow occurs, zero 
is used as the result.  If an overflow or underflow occurs, a special bit will 
be set (until the next value is computed) that can be tested by the ".IFOVR" 
and ".IFUNDR" directives.  Other than that, such errors are ignored.  All 
arithmetic is performed using unsigned arithmetic operations. All 
floating point operations follow the IEEE (and Apple SANE) suggestions, and 
are otherwise ignored by the assembler.  Any string operation producing a 
string longer than 255 characters produces an assembly time error.  All set 
operations must be exact. 
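
The integer accuracy rule can be sketched as follows in Python.  The sketch 
assumes that "underflow" means an unsigned result below zero (for example, 
subtracting a larger value from a smaller one); the function name "eval32" 
and the returned flag pair are hypothetical stand-ins for the state tested 
by ".IFOVR" and ".IFUNDR".

```python
MASK32 = 0xFFFFFFFF


def eval32(op, a, b):
    """Apply +, - or * to two unsigned 32-bit operands.

    Returns (result, overflow, underflow).
    """
    if op == "+":
        exact = a + b
    elif op == "-":
        exact = a - b
    elif op == "*":
        exact = a * b
    else:
        raise ValueError("operator not covered by this sketch")
    if exact > MASK32:
        # Overflow: keep only the low order 32 bits.
        return exact & MASK32, True, False
    if exact < 0:
        # Underflow (assumed meaning: result below zero): use zero.
        return 0, False, True
    return exact, False, False
```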
 
Integer operations: The following integer operations must be provided at all 
compliance levels: 
 
+ (binary) adds the two operands. 
- (binary) subtracts second operand from the first. 
* multiplies the two operands. 
/ divides the first operand by the second. 
\ divides the first operand by the second and returns the remainder. 
& logically ANDs the two operands. 
| logically ORs the two operands. 
^ logically XORs the two operands. 
 
 
= 
<> These operators compare the two operands (unsigned comparison) and 
<  return 1 if the comparison is true, 0 otherwise. 
> 
<= 
>= 
 
- (unary) negates (2's complement) the operand 
~ (unary) complements (inverts - 1's complement) the operand 
 
 
The following operators must be provided at the full and extended compliance 
levels: 
 
<- shifts the first operand to the left the number of bits specified by the 
   second operand. 
-> shifts the first operand to the right the number of bits specified by the 
   second operand. 
 
@ (unary) subtracts the location counter at the beginning of the current 
          statement from the following address expression. 
 
% (ternary, e.g.: X%Y:Z)  This operator extracts bits Y through Z from X and 
  returns that result right justified. 
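
The shift and bit-extraction operators can be modelled with a few masks. 
The Python sketch below is illustrative only.  The proposal does not say 
whether Y or Z names the low order bit, so the sketch accepts either order; 
that choice, and the function names, are assumptions.

```python
MASK32 = 0xFFFFFFFF


def shift_left(x, n):
    """x <- n : shift left, keeping 32 bits."""
    return (x << n) & MASK32


def shift_right(x, n):
    """x -> n : logical shift right of a 32-bit value."""
    return (x & MASK32) >> n


def extract_bits(x, y, z):
    """x % y:z : bits y through z of x, right justified."""
    lo, hi = min(y, z), max(y, z)
    if lo < 0 or hi > 31:
        raise ValueError("bit numbers must lie in 0..31")
    width = hi - lo + 1
    return (x >> lo) & ((1 << width) - 1)
```

For instance, extract_bits($12345678, 8, 15) yields $56, the same value the 
HB function returns.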
 
 
Floating point operations: floating point numbers and operations are required 
only at the full and extended levels.  The following operations must be 
available as well: 
 
+ adds the two operands. 
- subtracts the second operand from the first. 
* multiplies the two operands. 
/ divides the first operand by the second. 
- (unary) negates the operand. 
 
= 
<> These operators compare the two operands and 
<  return 1 if the comparison is true, 0 otherwise. 
> 
<= 
>= 
 
 
 
String operations: strings and string operations are not required at the 
subset level, but the standard recommends their presence.  The following 
string operations must be provided at the full and extended levels: 
 
+ concatenates two strings 
% (ternary, e.g., X%Y:Z) returns the substring composed of the characters in 
  X starting at position Y of length Z.  Generate an error if X doesn't 
  contain sufficient characters. 
 
= 
<> These operators compare the two operands and 
<  return 1 if the comparison is true, 0 otherwise. 
> 
<= 
>= 
 
 
Set operations: sets and set operations are required only at the full and 
extended levels.  The following set operations must be provided: 
 
+  union of two sets  (logical OR of the bits). 
*  intersection of two sets (logical AND of the bits). 
-  set difference (set one ANDed with the NOT of the second set) 
 
=  returns 1 if the two sets are equal, zero otherwise. 
<> returns 1 if the two sets are not equal, zero otherwise. 
<  returns 1 if the first set is a proper subset of the second. 
<= returns 1 if the first set is a subset of the second. 
>  returns 1 if the first set is a proper superset of the second. 
>= returns 1 if the first set is a superset of the second. 
 
% (ternary, e.g., X % Y:Z) extracts elements Y..Z from X and returns those 
  items. 
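
Since set operations are defined in terms of 32 bits, one natural model is a 
bit mask in which element n corresponds to bit n.  The Python sketch below 
takes that view; the mapping of elements 0..31 onto bit positions and all 
function names are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def make_set(*elements):
    """Build a set constant such as {0,3,5} as a 32-bit mask."""
    mask = 0
    for e in elements:
        if not 0 <= e <= 31:
            raise ValueError("set elements must lie in 0..31")
        mask |= 1 << e
    return mask


def union(a, b):
    return a | b                      # "+" operator


def intersection(a, b):
    return a & b                      # "*" operator


def difference(a, b):
    return a & ~b & MASK32            # "-" operator


def is_subset(a, b):
    return (a & ~b) == 0              # "<=" operator


def is_proper_subset(a, b):
    return is_subset(a, b) and a != b  # "<" operator


def contains(value, s):
    return bool((s >> value) & 1)     # IN(v,s)
```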
 
 
In addition to the above operators,  several pre-defined functions are also 
available.  Note that these functions are not required at the subset 
compliance level, only at the full and extended levels: 
 
float(i) - Converts integer "i" to a floating point value. 
trunc(r) - Converts real "r" to a 32-bit unsigned integer (or generates an  
    error). 
valid(r) - returns "1" if r is a valid floating point value, 0 otherwise 
           (for example, if r is NaN, infinity, etc.) 
length(s)- returns the length of string s. 
lookup(s)- returns "1" if s is a valid symbol in the symbol table. 
value(s) - returns value of symbol specified by string "s" in the symbol 
           table. 
type(s)  - returns type of symbol "s" in symbol table.  Actual values 
           returned are yet to be defined. 
mode(a)  - returns the addressing mode of item "a".  Used mainly in macros. 
STR(s)   - returns string s with a prefixed length byte. 
ZRO(s)   - returns string s with a suffixed zero byte. 
DCI(s)   - returns string s with the H.O. bit of its last char inverted. 
RVS(s)   - returns string s with its characters reversed. 
FLP(s)   - returns string s with its H.O. bits inverted. 
IN(v,s)  - returns one if value v is in set s, zero otherwise. 
 
 
The following integer functions must be present at all compliance levels: 
 
LB(i), 
LBYTE(i), 
BYTE(i)  - returns the L.O. byte of i. 
HB(i), 
HBYTE(i) - returns byte #1 (bits 8-15) of i. 
BB(i), 
BBYTE(i) - returns bank byte (bits 16-23) of i. 
XB(i), 
XBYTE(i) - returns H.O. byte of i. 
LW(i), 
LWORD(i), 
WORD(i)  - returns L.O. word of i. 
HW(i), 
HWORD(i) - returns H.O. word of i. 
 
Pack(i,j)- returns a 16-bit value whose L.O. byte is the L.O. byte of i and 
    whose H.O. byte is the L.O. byte of j. 
     
Pack(i,j,k,l)- returns a 32-bit value consisting of (i,j,k,l) where i is the 
        L.O. byte and l is the H.O. byte.  Note: l is optional.  If 
        it isn't present, substitute zero for l. 
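
The byte and word selection functions reduce to shifts and masks.  The Python 
sketch below shows one way to compute them; it assumes that the four-operand 
form of Pack uses the L.O. byte of each argument, which the text above 
implies but does not state outright.

```python
def lb(i):
    return i & 0xFF            # LB, LBYTE, BYTE


def hb(i):
    return (i >> 8) & 0xFF     # HB, HBYTE


def bb(i):
    return (i >> 16) & 0xFF    # BB, BBYTE


def xb(i):
    return (i >> 24) & 0xFF    # XB, XBYTE


def lw(i):
    return i & 0xFFFF          # LW, LWORD, WORD


def hw(i):
    return (i >> 16) & 0xFFFF  # HW, HWORD


def pack(i, j, k=None, l=0):
    """Pack(i,j) gives 16 bits; Pack(i,j,k[,l]) gives 32 bits, i lowest."""
    if k is None:
        return lb(i) | (lb(j) << 8)
    return lb(i) | (lb(j) << 8) | (lb(k) << 16) | (lb(l) << 24)
```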
 
 
 
 
 The order of evaluation for an expression is strictly left to right 
unless parentheses are used to modify the precedence of a sub-expression. 
Since parentheses are used to specify certain indirect addressing modes, the 
use of parentheses to override the strict left-to-right evaluation order 
introduces some ambiguity.  For example, should the following be treated 
as jump indirect through location $1001 or jump directly to location $1001? 
 
  JMP ($1000+1) 
 
The ambiguity is resolved as follows: if the parenthesis is the first char- 
acter in the operand field, then the indirect addressing mode is assumed. 
Otherwise, the parentheses are used to override the left-to-right precedence. 
The example above would be treated as a jump indirect through location $1001. 
If you wanted to jump directly to location $1001 in this fashion, the state- 
ment could be modified to 
 
  JMP 0+($1000+1) 
 
so that the parenthesis is no longer the first character in the operand 
field. 
 
 The use of parentheses to override the left-to-right precedence is 
only required at the full and extended compliance levels.  It is not 
required at the subset compliance level. 
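
 To make the evaluation order concrete, here is a small Python sketch 
of a strictly left-to-right evaluator for +, -, * and / with parentheses, 
together with the "first character" test for indirect addressing.  It 
handles only hexadecimal and decimal constants, omits range checking, and 
uses hypothetical function names.

```python
import re

TOKEN = re.compile(r"\s*(\$[0-9A-Fa-f]+|\d+|[-+*/()])")


def tokenize(text):
    """Split an expression into numbers, operators and parentheses."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate strictly left to right; only parentheses change the order."""
    tokens = tokenize(text)

    def operand():
        if not tokens:
            raise ValueError("operand expected")
        tok = tokens.pop(0)
        if tok == "(":
            value = expression()
            if not tokens or tokens.pop(0) != ")":
                raise ValueError("missing closing parenthesis")
            return value
        return int(tok[1:], 16) if tok.startswith("$") else int(tok)

    def expression():
        value = operand()
        while tokens and tokens[0] in ("+", "-", "*", "/"):
            op = tokens.pop(0)
            rhs = operand()
            if op == "+":
                value += rhs
            elif op == "-":
                value -= rhs
            elif op == "*":
                value *= rhs
            else:
                value //= rhs
        return value

    result = expression()
    if tokens:
        raise ValueError("unexpected token %r" % tokens[0])
    return result


def is_indirect(operand_field):
    """Indirect addressing is assumed only when '(' is the first character."""
    return operand_field.lstrip().startswith("(")
```

Under these rules "2+3*4" evaluates to 20, not 14, and the operand 
"0+($1000+1)" is a direct reference to $1001.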
 
 
 
 
 
    Expression Types 
    ---------------- 
 
 Expressions, in addition to having a value associated with 
them, also have a specific type.  The three basic types of expressions are 
integer, floating point, and string expressions.  Integer expressions can 
be broken down into subtypes as well.  A hierarchical diagram is the easiest 
way to describe integer expressions: 
 
 
 
integers --+-- constants --+-- user defined (enumerated) types 
           |               | 
           |               +-- simple numeric constants 
           | 
           +-- addresses --+-- direct page addresses 
                           | 
                           +-- absolute addresses --+-- full 16-bit 
                           |                        | 
                           |                        +-- relative 8-bit 
                           | 
                           +-- long addresses 
 
 This diagram points out that there are two types of integer expres- 
sions: constants and addresses.  Further, there are two types of constants 
and four types of addresses.  Before discussing operations on these different 
types of integer values, their purpose should be presented. 
 
 Until now, most 65xxx assemblers did little to differentiate between 
the different types of integer values.  In this proposed standard, however, 
strong type checking is enforced.  Whereas in previous assemblers you could 
use the following code: 
 
 label equ $1000 
  lda #Label 
  sta Label 
 
such operations are illegal within the confines of the new standard.  The 
problem with this short code segment is that the symbol "label" is used as 
both an integer constant (in the LDA instruction) and as an address  
expression (in the STA instruction).  To help prevent logical errors from 
creeping into a program, the assembler doesn't allow the use of addresses 
where constants are expected and vice versa.  To that end, a new assembler 
directive, CON, is used to declare constants while EQU is used to declare 
an (absolute) address.  Symbols declared by CON cannot be (directly) used 
as an address.  Likewise, symbols declared by EQU (and others) cannot be 
used where a constant is expected (such as in an immediate operand). 
 
 Although this type checking can be quite useful for locating bugs 
within the source file, it can also be a source of major annoyance.  Some- 
times (quite often, in fact) you may want to treat an address expression 
as a constant or a constant expression as an address.  Two functions are 
used to coerce these expressions to their desired form: PTR and OFS. 
PTR(expr) converts the supplied constant expression to an address expression. 
OFS(expr) converts the supplied address expression to a constant expression. 
The following is perfectly legal: 
 
 Cons1 CON $5A 
 DataLoc EQU $1000 
  lda #OFS(DataLoc) 
  sta PTR(Cons1) 
 
For more information, see the section on assembler directives.  PTR and OFS 
are required at all compliance levels of this proposed standard. 
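
 The constant/address distinction and the two coercion functions can 
be modelled with a tagged value.  The Python sketch below mirrors the legal 
example above; the class and function names are hypothetical, and the sketch 
ignores the finer address subtypes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    number: int
    kind: str  # "constant" (declared with CON) or "address" (declared with EQU)


def ptr(value):
    """PTR(expr): treat a constant expression as an address."""
    return Value(value.number, "address")


def ofs(value):
    """OFS(expr): treat an address expression as a constant."""
    return Value(value.number, "constant")


def immediate_operand(value):
    """Operand check for an immediate instruction such as LDA #expr."""
    if value.kind != "constant":
        raise TypeError("address used where a constant is expected")
    return "#$%X" % value.number


def memory_operand(value):
    """Operand check for a memory reference such as STA expr."""
    if value.kind != "address":
        raise TypeError("constant used where an address is expected")
    return "$%X" % value.number


cons1 = Value(0x5A, "constant")      # Cons1   CON $5A
data_loc = Value(0x1000, "address")  # DataLoc EQU $1000
```

In this model immediate_operand(ofs(data_loc)) and memory_operand(ptr(cons1)) 
succeed, while immediate_operand(data_loc) raises an error, just as 
"lda #DataLoc" would be rejected by a conforming assembler.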
 
 While any constant value may be used anywhere a constant is allowed, 
the 65c816 microprocessor must often differentiate between the various types 
of address expressions.  This is particularly true when emitting code since 
the length of an instruction depends on the particular address expression. 
If an expression contains only constants, direct page values, absolute 
values, or long values,  there isn't much of a problem.  The assembler uses 
the specified type as the addressing mode.  If the expression contains mixed 
types, the resulting type is as follows: 
 
Expression contains:                      Result is: 
 
 Constants only                           Constant 
 Direct, optionally with Constants        Direct 
 Absolute, optionally with Constants      Absolute 
 Long, optionally with Constants 
   and/or Absolute                        Long 
 
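Allowable forms: 
 
 Result     Expression forms 
 --------   ------------------------------------ 
 constant   constant 
 direct     direct, constant+direct 
 absolute   absolute, constant+absolute 
 long       long, constant+long, absolute+long, 
            constant+absolute+long 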
  
 
This says that if your expression contains only constants, then the 
result is a constant.  If it contains a mixture of constants and direct 
page addresses, the result is a direct page address.  Note that direct page 
addresses cannot be mixed with other types of addresses.  An error must be 
reported in this situation (although you could get around it with an 
expression of the form "abs+OFS(direct)").  Likewise, adding a constant to 
an absolute address produces an absolute address.  Adding an absolute and 
a long address produces a long address, etc. 
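
The mixing rules amount to "widest type wins, except that direct page 
addresses mix only with constants".  A minimal Python sketch of that rule 
follows; the type names are spelled out as strings and the function name 
"result_type" is hypothetical.

```python
# Ordering used when constants, absolute and long values are mixed.
RANK = {"constant": 0, "absolute": 1, "long": 2}


def result_type(kinds):
    """Return the type of an expression given the types of its operands."""
    kinds = set(kinds)
    if not kinds:
        raise ValueError("expression has no operands")
    if "direct" in kinds:
        # Direct page addresses may be combined with constants only.
        if kinds - {"direct", "constant"}:
            raise TypeError("direct page address mixed with another address type")
        return "direct"
    return max(kinds, key=RANK.__getitem__)
```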
 
 Sometimes, you need to force an expression to be a certain type. 
For example, the instruction "LDA $200" normally assembles to a load 
absolute from location $200 in the current data bank.  If you need to force 
this to location $200 in bank zero, regardless of the content of the DBR, 
the address expression must be coerced to a long address.  Coercion of this 
type is accomplished with the ":D", ":A", ":L", and ":S" expression suffixes. 
To force "LDA $200" to be assembled using the long address mode, the in- 
struction is modified to be "LDA $200:L".  The coercion suffix must always 
follow the full address expression.  The ":S" (for short branches) suffix 
is never required, since a short branch (for BRA and BSR) is always assumed, 
but it is included for completeness.  For BRA and BSR, the ":L" suffix is 
used to imply a long branch (+/- 32K) rather than the long addressing mode. 
 
 Caveats: If ":D" or ":A" is used to coerce a large address expression 
to direct or absolute, the high order byte(s) of the expression are truncated 
and ignored.  The assembler must assume that when a programmer uses these 
constructs he knows exactly what he's doing.  Therefore, "LDA $1001:D" will 
happily assemble this instruction into a "LDA $01" instruction despite the 
actual value of the address expression. 
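
The truncation performed by the coercion suffixes can be sketched in Python 
as follows.  The ":S" suffix is omitted because it applies to branches only, 
and masking ":L" values to 24 bits is an assumption; the proposal describes 
truncation only for ":D" and ":A".  The names used are hypothetical.

```python
# Suffix letter -> (resulting type, number of address bits kept).
SUFFIXES = {"D": ("direct", 8), "A": ("absolute", 16), "L": ("long", 24)}


def coerce(value, suffix):
    """Apply :D, :A or :L to an address value, silently truncating it."""
    kind, bits = SUFFIXES[suffix.upper()]
    return value & ((1 << bits) - 1), kind
```

Thus coerce($1001, "D") gives the direct page address $01, matching the 
"LDA $1001:D" example above.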
 
 
 
 
 
Addressing Mode Specification 
----------------------------- 
 
 65c816 addressing modes are specified by certain symbols in the op- 
erand field.  A quick rundown follows: 
 
 Addressing mode          Format(s)                 Example(s) 
 -----------------------  ------------------------  ---------------------- 
 
 Immediate                #<expression>             LDA #0 
                          =<expression>             CMP =LastValue 
 
 Direct Page              <expression>              LDA DPG 
                          <expression>:D            LDA ANY:D 
 
 Absolute                 <expression>              LDA ABS 
                          <expression>:A            LDA ANY:A 
 
 Long                     <expression>              LDA LONG 
                          <expression>:L            LDA ANY:L 
 
 Accumulator              {no operand}              ASL 
                                                    INC 
 
 Implied                  {no operand}              CLC 
                                                    SED 
 
 Direct, Indirect, 
 Indexed by Y             (<direct expr>),Y         LDA (DPG),Y 
                          (<direct expr>).Y         LDA (ANY:D).Y 
 
 Direct, Indirect, 
 Indexed by Y, Long       [<direct expr>],Y         LDA [DPG],Y 
                          [<direct expr>].Y         LDA [DPG].Y 
 
 Direct, Indexed by X, 
 Indirect                 (<direct expr>,X)         LDA (DPG,X) 
                          (<direct expr>.X)         LDA (ANY:D.X) 
 
 Direct, Indexed by X     <direct expr>,X           LDA DPG,X 
                          <direct expr>.X           LDA DPG.X 
 
 Direct, Indexed by Y     <direct expr>,Y           LDX DPG,Y 
                          <direct expr>.Y           LDX DPG.Y 
 
 Absolute, Indexed by X   <abs expr>,X              LDA ABS,X 
                          <abs expr>.X              LDA ANY:A.X 
 
 Long, Indexed by X       <long expr>,X             LDA ANY:L,X 
                          <long expr>.X             LDA LONG.X 
 
 Absolute, Indexed by Y   <abs expr>,Y              LDA ANY:A,Y 
                          <abs expr>.Y              LDA ABS.Y 
 
 Program Counter 
 Relative (branches)      <expression>              BRA ABS 
                          @<expression>             BRA @ABS 
 
 PC Relative (PSH)        @<expression>             PSH @ABS 
 
 Absolute, Indirect       (<abs expr>)              JMP (ABS) 
 
 Absolute, Indexed, 
 Indirect                 (<abs expr>,X)            JMP (ABS,X) 
                          (<abs expr>.X)            JMP (ABS.X) 
 
 Direct, Indirect         (<dpg expr>)              LDA (DPG) 
                                                    STA (ANY:D) 
 
 Stack Relative           <expr8>,S                 LDA 2,S 
                          <expr8>.S                 LDA 2.S 
 
 Stack Relative, 
 Indirect, Indexed        (<expr8>,S),Y             LDA (2,S),Y 
                          (<expr8>.S),Y             LDA (2.S),y 
                          (<expr8>,S).Y             LDA (2,S).y 
                          (<expr8>.S).Y             LDA (2.S).y 
 
 Block Move               <long expr>,<long expr>   MVN LONG,LONG 
                                                    MVP LONG,LONG 
 
 
 <dpg expr>, DPG   - Any direct page expression or symbol. 
 <abs expr>, ABS   - Any absolute expression or symbol. 
 <long expr>, LONG - Any long expression or symbol. 
 <expr8>           - Any expression evaluating to a value less than 256. 
 
 
Note: the only real difference between the existing standard and the proposed  
standard is that the period (".") can be used to form an indexed address ex- 
pression.  This is compatible (in practice, as well as philosophy) with the  
record structure mechanism supported by this proposed standard.  This syntax  
for the various addressing modes is required at all compliance levels. 
 
 Suggestion: (<dpg expr>):L,  (<dpg expr>):L,Y, and (<dpg expr>):L.Y  
should be allowed as substitutes for [<dpg expr>],  [<dpg expr>],Y, and  
[<dpg expr>].Y, respectively.  This, however, is not required by this proposed  
standard. 
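
 Because the reserved register names may not be used as record field 
names, a trailing ".X", ".Y", or ".S" can always be read as an index.  The 
Python sketch below rewrites the period form into the comma form on that 
assumption; the function name "normalize_index" is hypothetical and the 
pattern is an illustration, not a complete operand parser.

```python
import re

# A period followed by X, Y or S, where that register name ends the operand
# or is immediately followed by a closing parenthesis.
INDEX_RE = re.compile(r"\.([XYS])(?=\)|$)", re.IGNORECASE)


def normalize_index(operand):
    """Rewrite 'expr.X' style indexing as 'expr,X'."""
    return INDEX_RE.sub(r",\1", operand.strip())
```

For example, "(2.S).y" becomes "(2,S),y", while a record reference such as 
"Rec.Field" is left alone.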
 
 
 
 
 
 
Assembler Directives and Pseudo-Opcodes 
--------------------------------------- 
 
 An assembler directive is a message to the assembler to change some 
status or otherwise affect the assembly operation.  It does not generate any 
object code.  A pseudo-opcode, on the other hand, is not a standard 65c816 
instruction but does generate object code.  Examples of assembler directives 
include instructions that turn the listing on or off, define procedures, 
equate labels to values, etc.  Examples of pseudo-opcodes include instructions 
like .BYTE which emit bytes of object code based on the instruction's 
parameters. 
 
 
Equates: 
-------- 
 
 Probably the most important assembler directives are the equates. 
The equate directives let you associate a value and a type with a symbol. 
The possible equates use the syntax: 
 
 <label>  .EQU <16-bit value> 
 <label>  .EDP <8-bit value> 
 <label>  .EQL <24-bit value> 
 <label>  .CON <32-bit value> 
 <label>  .FCON <SANE floating point value> 
 
All except .FCON are required at all compliance levels.  .FCON is required 
at the full and extended levels. 
 
 .EQU lets you define an absolute symbol; an address whose value is 
relative to the DBR.  An error should be generated if the value in the 
operand field requires more than 16 bits.  The type of the operand expression 
is ignored.  It may be a constant expression, a direct page expression, or 
even a long address expression.  As long as it's an integer expression and 
can fit into 16 bits, it's quite acceptable. 
 
 .EDP (equate to direct page) is used to define direct page symbols. 
Again, the operand field may be of any integer type as long as the result 
fits into 8 bits.  A recommended synonym for .EDP is .EPZ (equate to page 
zero) in deference to the 6502's zero page addressing mode. 
 
 .EQL (equate long) defines long address expressions.  As usual, the 
operand field may contain any integer expression that fits within 24 bits. 
 
 .CON (constant) is used to define integer numeric constants.  Any 
32 bit numeric value may be specified in the operand field. 
 
 .FCON (floating point constant) is used to declare symbolic floating 
point constants.  Such constants must be stored in the symbol table as 
80-bit SANE extended values. 
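
 The size limits on the integer equates can be checked with a table 
of bit widths.  The Python sketch below models a symbol table as a plain 
dictionary and folds names to upper case; those choices and the function 
name "define_equate" are assumptions made for illustration.  .FCON is left 
out because it takes a floating point operand.

```python
# Directive -> (type given to the symbol, maximum number of bits).
EQUATES = {
    ".EQU": ("absolute", 16),
    ".EDP": ("direct", 8),
    ".EQL": ("long", 24),
    ".CON": ("constant", 32),
}


def define_equate(symbols, label, directive, value):
    """Record 'label' in 'symbols' after checking the value fits the directive."""
    name = directive.upper()
    if name == ".EPZ":
        name = ".EDP"  # recommended synonym
    kind, bits = EQUATES[name]
    if not 0 <= value < (1 << bits):
        raise ValueError("%s operand does not fit in %d bits" % (name, bits))
    symbols[label.upper()] = (value, kind)
```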
 
 In addition to the typed equates, this proposed standard also allows 
an untyped equate, which takes the form: 
 
  <label>  = <operand> 
 
where "<operand>" is any valid operand that may appear in the operand field 
of any instruction.  <operand>'s type may be integer, string, floating point 
and may also include an addressing mode.  The following are all legal: 
 
  lbl = 5 
  lbl = 5.5 
  lbl = "Five" 
  lbl = Array,X 
  lbl = (dp,s),y 
 
Labels defined by "=" may appear anywhere the operand field specified for 
that label is allowed.  In general, a simple string substitution should be 
performed when a label defined by "=" is used.  Note: a label declared by 
"=" can be redefined without error throughout the program.  The "=" directive 
is required only at the full and extended compliance levels. 
 
 
 
Data Definitions: 
----------------- 
 
 While the equates are probably the most important assembler  
directives, the data definition instructions are probably the most important 
pseudo-opcodes around.  These instructions are classed into four groups 
determined by the types of operands they accept.  In the following paragraphs 
all optional items are enclosed within braces. 
 
 The first group of data reservation instructions accept any integer 
type expression as operands.  They are: 
 
 {label}  .BYTE {expr1, expr2, ..., exprn} 
 {label}  .WORD {expr1, expr2, ..., exprn} 
 {label}  .LONG {expr1, expr2, ..., exprn} 
  
If a label is present, it is treated as a statement label within the current 
segment and assigned the value of the location counter before any bytes are 
emitted.  For the .BYTE opcode, one byte of data is emitted for each operand 
in the operand field, that byte being the L.O. byte of each expression.   
Operands are purely optional.  If no operand appears, then an indeterminate 
value is emitted.  The .WORD opcode outputs two bytes for each expression in
the operand field (or two indeterminate bytes if no operand is present).  The 
.LONG instruction outputs four bytes for each operand.  These three pseudo- 
opcodes must be present at all compliance levels. 
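
 The byte counts above can be modelled with a short Python sketch.  It is
not part of the proposal; the function name "emit" is hypothetical, the
low-byte-first ordering is an assumption based on the 65c816's native byte
order, and truncating .WORD and .LONG operands to their low-order bytes is
assumed by analogy with the .BYTE rule.

```python
def emit(directive, values):
    """Return the bytes a .BYTE/.WORD/.LONG statement might emit (sketch)."""
    width = {".BYTE": 1, ".WORD": 2, ".LONG": 4}[directive]
    mask = (1 << (8 * width)) - 1
    out = bytearray()
    for value in values:
        # Keep only the low-order bytes of the expression (assumed for
        # .WORD and .LONG), stored low byte first (assumed byte order).
        out += (value & mask).to_bytes(width, "little")
    return bytes(out)
```

For example, emit(".WORD", [0x1234]) yields the two bytes $34, $12 under
these assumptions.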
 
 The next group of pseudo-opcodes are used to create tables of 
addresses.  As such, they only allow symbols that have been defined by 
.EQU, .EQL, "=" (as applicable), statement labels, procedure labels, and 
segment labels in their operand fields.  They are: 
 
 {label}  .OFFS expr1 {,expr2, ..., exprn} 
 {label}  .ADRS   expr1 {,expr2, ..., exprn} 
 {label}  .PTR expr1 {,expr2, ..., exprn} 
 
.OFFS outputs two bytes for each operand;  .ADRS outputs three bytes for 
each operand; and .PTR outputs four bytes for each operand.  These three 
pseudo-opcodes are only required at the full and extended compliance levels. 
 
 The third group of declarations are used to create constant tables. 
As such, they only allow symbols declared by .CON.  They are: 
 
 {label}  .SHORT  expr1 {,expr2, ..., exprn} 
 {label}  .INTEGER expr1 {,expr2, ..., exprn} 
 {label}  .LONGINT expr1 {,expr2, ..., exprn} 
 
These pseudo-ops output one, two, and four bytes respectively.  They are
not required at the subset compliance level; they are required only at the
full and extended levels.
 
 Note: non-symbolic constants are allowed in any of the above 
pseudo-opcodes.  Only symbols should have their type information checked. 
 
 The last group of data declaration pseudo-opcodes are used to 
initialize floating point values.  These pseudo-ops are: 
 
 {label}  .FLOAT  {item1, item2, ..., itemn} 
 {label}  .DOUBLE  {item1, item2, ..., itemn} 
 {label}  .EXTENDED {item1, item2, ..., itemn} 
 {label}  .COMP  {item1, item2, ..., itemn} 
 
Each instruction generates operands of 4, 8, 10, or 8 bytes in length,
respectively.  If the operand field is left blank, the corresponding bytes 
contain an indeterminate value, but the assembler should initialize them to 
NaN (not a number).  These four pseudo-opcodes are required only at the 
full and extended levels. 
 
 Although not required by the standard, the following data declaration 
directives are recommended and should be supported: 
 
 {label}  .HBYTE  expr1 {,expr2, ..., exprn} 
 {label}  .BBYTE  expr1 {,expr2, ..., exprn} 
 {label}  .XBYTE  expr1 {,expr2, ..., exprn} 
 {label}  .HWORD  expr1 {,expr2, ..., exprn} 
 
The first three reserve one byte of memory for each operand and store the
H.O. byte (bits 8-15), bank byte (bits 16-23), or extra byte (bits 24-31),
respectively.
.HWORD reserves two bytes composed of bits 16-31 for each operand. 
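
 The bit ranges above amount to simple shift-and-mask operations.  The
following Python sketch illustrates them; the function names are
hypothetical and merely mirror the directive names.

```python
def hbyte(value):
    """Bits 8-15 of the operand, as stored by .HBYTE."""
    return (value >> 8) & 0xFF

def bbyte(value):
    """Bits 16-23 (the bank byte), as stored by .BBYTE."""
    return (value >> 16) & 0xFF

def xbyte(value):
    """Bits 24-31 (the extra byte), as stored by .XBYTE."""
    return (value >> 24) & 0xFF

def hword(value):
    """Bits 16-31, as stored by .HWORD."""
    return (value >> 16) & 0xFFFF
```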
 
 
Arrays: 
------- 
 
 Space for arrays and data tables can be reserved using the data 
declaration statement mentioned above in conjunction with the "DUP" operator. 
DUP is a binary operator that takes the form: 
 
  count DUP (list) 
 
where count is some constant value and list is a (possibly empty) list of 
values.  The items in (list) are repeated "count" times.  For example, the 
following .BYTE statement reserves space for an array of 64 bytes and 
initializes each byte to zero: 
 
 MyArray  .BYTE  64 DUP (0) 
 
The following statement reserves 256 bytes consisting of the values 1, 2, 3, 
4, 5, 6, 7, and 8 repeated 32 times: 
 
 MyArray  .BYTE  32 DUP (1,2,3,4,5,6,7,8) 
 
 
 The DUP operator is fully recursive.  That is, one of the items in 
the list may, itself, be a list defined by the DUP operator.  For example, 
 
 Example  .BYTE  16 DUP (0,1,2 DUP (3,4,5)) 
 
reserves 128 bytes consisting of the list "0,1,3,4,5,3,4,5" repeated 16 times. 
 
 If the DUP list is empty,  e.g., "16 dup ()", then exactly one item 
is reserved for each entry, but it is not initialized.  The following example 
reserves space for 128 uninitialized words: 
 
 OffsetTable .WORD  128 DUP () 
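
 The recursive behavior of DUP can be sketched in Python as follows.  This
is an illustration only: the "Dup" class and "expand" function are
hypothetical names, and an uninitialized item is represented here by None.

```python
from dataclasses import dataclass

@dataclass
class Dup:
    count: int   # the repeat count to the left of DUP
    items: list  # the (possibly empty, possibly nested) list in parentheses

def expand(items):
    """Flatten an operand list, expanding nested DUP operators."""
    out = []
    for item in items:
        if isinstance(item, Dup):
            # An empty list reserves exactly one uninitialized item per entry.
            body = expand(item.items) if item.items else [None]
            out.extend(body * item.count)
        else:
            out.append(item)
    return out

# Models "16 DUP (0,1,2 DUP (3,4,5))" from the text above.
example = expand([Dup(16, [0, 1, Dup(2, [3, 4, 5])])])
```

Here "example" holds 128 items, the first eight being 0,1,3,4,5,3,4,5.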
 
 
 
 
Type definitions: 
----------------- 
 
 Enumerated data types can be declared with the ".TYPE" directive. 
This directive takes the form: 
 
 {label}  .TYPE  item1 {,item2, ..., itemn} 
 
The items in the list are assigned consecutive values starting from zero. 
For example, in the following .TYPE statement, the symbols red, green, and 
blue are assigned the values zero, one, and two, respectively: 
 
 colors  .TYPE  red,green,blue 
 
The symbols in the operand field of a .TYPE statement must be unique and 
undefined elsewhere (within the current scope, more on that later).  The 
.TYPE statement above is almost identical to the statements: 
 
  red .con 0 
  green .con 1 
  blue .con 2 
 
However, there is one major difference.  The .TYPE statement also defines a 
symbol specified in the label field.  This symbol can be used as a pseudo- 
opcode to reserve space for values of the specified type.  In the example 
above, "colors" could be used as a pseudo-opcode to reserve space for the 
values red, green, and blue.  To differentiate type declarations from other 
instructions, a special lead-in character is used.  The slash ("/") is 
recommended by this standard, but the user should have the option of choosing 
this character via a setup program for the assembler.  From the example 
above, colors could be used as a pseudo-opcode in the following manner: 
 
 Christmas /colors  red,green 
 Ocean  /colors  blue,green 
 Sky  /colors  blue 
   /colors  red 
 Primaries /colors  red,blue,green 
 
Unlike other data reserving pseudo-opcodes, a "/colors" definition only 
allows symbols that appear in the operand field of the associated .TYPE 
statement or one of those symbols in an expression that contains a single
such symbol plus or minus a numeric constant, as long as the result is still 
within the range of symbols declared for that type.  E.g., 
 
 Okay  /colors  red,green+1,blue 
 NotOkay1 /colors  blue+2  ;Outside allowable range 
 NotOkay2 /colors  red+blue ;can't add two such symbols 
 NotOkay3 /colors  $25  ;Not red, green, or blue 
 
If you need to coerce an expression to the proper form, simply use the type 
name as a pseudo-function.  E.g., 
 
 ThisIsOkay /colors  colors(0),blue ;Same as red, blue 
 
If the operand is not appropriate, the assembler should generate a warning 
and emit the code as though the .BYTE statement were used. 
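
 The operand rule for enumerated types can be sketched in Python.  This is
one possible reading of the rule, not a required implementation; the names
"COLORS" and "operand_ok" are hypothetical, and an operand is modelled as a
list of enumeration symbols plus a numeric offset.

```python
COLORS = {"red": 0, "green": 1, "blue": 2}  # colors .TYPE red,green,blue

def operand_ok(symbols, offset=0, members=COLORS):
    """True if the operand is a single member plus or minus a constant
    and the result still lies within the range declared for the type."""
    if len(symbols) != 1 or symbols[0] not in members:
        return False
    value = members[symbols[0]] + offset
    return min(members.values()) <= value <= max(members.values())
```

With this model, operand_ok(["green"], 1) accepts "green+1", while
operand_ok(["blue"], 2) and operand_ok(["red", "blue"]) reject the NotOkay1
and NotOkay2 cases.  A rejected operand would draw a warning and be emitted
as though .BYTE had been used.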
  
 
 If there isn't a label starting in column one of a .TYPE statement 
then the symbols defined in the operand field are applied to the previous 
.TYPE statement.  This allows you to create .TYPEs where several symbols 
(which couldn't possibly fit on a single line) are declared as constants. 
E.g., 
 
 colors  .TYPE  red, yellow, blue 
   .TYPE  orange, green, violet
   .TYPE  brown, black, white 
 
All of these symbols will be associated with "colors".   A maximum of 256 
symbols can be associated with a symbol via the .TYPE statement.  Whenever 
the data reservation form is used, exactly one byte is reserved for each 
item in the operand field.  If you need to reserve more than a single byte 
for each item, use the record declarations described next. 
 
 The DUP operator can be used to define enumerated data type arrays, 
e.g., 
 
 LotsOfRed /colors  16 DUP (red) 
 
 
 
 Another form of the .TYPE statement allows you to declare byte 
subrange values.  A definition of this type takes the form: 
 
 label  .TYPE start..stop 
 
where start and stop are constant values in the range 0..255 and  
start <= stop.   Examples: 
 
 LessThan10 .TYPE 0..9 
 Nibbles  .TYPE 0..$F 
 PrimaryColors .TYPE red..blue   ;From above, is red, yellow, blue 
 
 
 Implementation of the .TYPE statement is required only at the full 
and extended compliance levels. 
 
 
 
Records: 
-------- 
 
 A record data structure can be defined with the ".RECORD" and ".ENDR" 
directives using the syntax: 
 
 label .RECORD 
       <data declarations> 
  .ENDR 
 
This creates a template, but does not generate any code.  An example might 
be: 
 
 CursorPosn .RECORD 
 ROW  .BYTE 0 
 COLUMN  .BYTE 0 
   .ENDR 
 
This definition creates the type "CursorPosn".  Like the .TYPE definitions, 
the symbol defined by .RECORD can be used as a pseudo-opcode to reserve 
storage for a variable.  For example, to declare a variable of type 
"CursorPosn" the following statement is used: 
 
 MyCursor /CursorPosn 
 
This statement reserves two bytes, initialized to zeros, at the current 
location counter. 
 
 Access to the fields of the record is accomplished by using the 
"." operator, just like Pascal.   E.g., 
 
   lda MyCursor.ROW ;Fetches first byte. 
   lda MyCursor.COLUMN ;Fetches the second byte. 
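
 The "." operator can be thought of as adding a field offset to the
variable's address.  The Python sketch below computes such offsets for a
record template; the names "SIZES" and "field_offsets" are hypothetical, and
only the three integer directives are modelled.

```python
SIZES = {".BYTE": 1, ".WORD": 2, ".LONG": 4}

def field_offsets(fields):
    """fields: list of (name, directive, operand_count) in declaration order.
    Returns (offset of each field, total size of the record)."""
    offsets, position = {}, 0
    for name, directive, count in fields:
        offsets[name] = position
        position += SIZES[directive] * count
    return offsets, position

# Models the CursorPosn record declared above.
cursor_offsets, cursor_size = field_offsets(
    [("ROW", ".BYTE", 1), ("COLUMN", ".BYTE", 1)]
)
```

Under this model MyCursor.ROW is MyCursor+0 and MyCursor.COLUMN is
MyCursor+1, matching the comments in the example.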
 
 
 In the example above, the ROW and COLUMN fields of each variable 
declared with CursorPosn are always initialized to zero.  Any other value 
could have been used by substituting the appropriate value, or an  
indeterminate value could have been specified by the definition: 
 
 CursorPosn .RECORD 
 ROW  .BYTE 
 COLUMN  .BYTE 
   .ENDR 
 
 
 On occasion, you may want each record variable definition to 
specify the initial values.  This can be accomplished by specifying 
parameters in the record definition.  Parameters are specified by the 
symbols:  ?0, ?1, ..., ?9.  ?0 corresponds to the first parameter, ?1 to 
the second, etc.   Consider the following record and variable definitions: 
 
 CursorPosn .RECORD 
 ROW  .BYTE ?0 
 COLUMN  .BYTE ?1 
   .ENDR 
 
 HomePosn /CursorPosn 0,0 
 LowerRight /CursorPosn 23,79 
 MyCursor /CursorPosn 5,10 
 
 
 The only problem with this definition form is that each CursorPosn 
variable must supply exactly two operands.  Sometimes you may want to have 
a default value in the event an operand isn't specified.  This is accomplished 
using a record definition of the form:
 
 CursorPosn .RECORD ?0=0,?1=0 
 ROW  .BYTE ?0 
 COLUMN  .BYTE ?1 
   .ENDR 
 
This definition instructs the assembler to allow zero or more parameters,
defaulting ?0 and ?1 to zero if their respective entries aren't present. 
The .DEFAULT directive can also be used, particularly if you run out of 
room on the .RECORD line: 
 
 OpenRec  .RECORD  ?0=0, ?1=1 
   .DEFAULT ?2=ZRO('Hello there'), ?3=2 
 FirstItem .WORD  ?0 
   .LONG  ?3 
 SecondItem .BYTE  ?1, ?2 
   .ENDR 
 
 
 Record definitions are required at the full and extended compliance
levels; they are not required at the subset compliance level.
 
 
 
Sets: 
----- 
 
 Bit string types can be declared using the .SET directive.  .SET is 
used in a manner quite similar to .TYPE except the items in the operand field 
can be any constant whose value is less than 32.  Up to 32 items may  
appear in the operand field of a .SET definition.  The syntax is 
 
 label  .SET item1 {,item2, ..., itemn}  ;n <= 32. 
 
An alternate form is to specify the name of some type variable in the operand 
field.  The following definition creates a set of integers in the range 
0..9: 
 
 LessThan10 .TYPE 0..9 
 SetOfDigits .SET LessThan10 
 
 
 Declaring a set variable is quite similar to declaring an enumerated 
type variable or a record variable: simply use the set name as a pseudo-opcode 
prefaced by a "/": 
 
 Digits  /SetOfDigits 
 
 
 Set constants are specified by placing the items in the set within 
a pair of braces.  E.g.:
 
 BitValues .TYPE  0..7 
 SetOfBitValues .SET  BitValues 
 Bits  /SetOfBitValues {0,1,2,3} 
 ; 
 ; 
   lda  #{0,2,7} 
   sta  Bits 
 
 
 The assembler, by default, should allow set constants composed of 
the integer values 0..31.  This allows programmers to easily deal with bits 
by bit numbers rather than the integers those bit patterns represent.  For 
example, to strip all but the H.O. two bits in the (8-bit) accumulator, the 
instruction "AND #{6,7}" makes a lot more sense than "AND #$C0".  All other 
entities appearing within "{" and "}" must appear somewhere in the operand 
field of a .SET statement (or must be a member of a .TYPE definition if that 
type appears in the operand field of a .SET). 
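
 The correspondence between a set constant and its bit pattern can be
sketched in Python.  The sketch assumes that element n of the set maps to
bit n, which is what the "AND #{6,7}" example implies; the function name
"set_mask" is hypothetical.

```python
def set_mask(members):
    """Return the integer whose set bits are the listed bit numbers."""
    mask = 0
    for bit in members:
        if not 0 <= bit <= 31:
            raise ValueError("set members must be in the range 0..31")
        mask |= 1 << bit
    return mask
```

Under this assumption set_mask([6, 7]) is $C0, and the "lda #{0,2,7}"
example above loads $85.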
 
 
 
Macros: 
------- 
 
 Macros are created using the .MACRO and .ENDM directives.  The syntax 
for a macro definition is 
 
 label  .MACRO  {default parameter values} 
        <macro body> 
   .ENDM 
 
Macros are invoked by placing an underscore, followed by the macro name (the 
label in the .MACRO statement).  The user should be able to change the macro 
lead-in character from underscore to some other character via an assembler 
set up program. 
 
All labels declared within the macro are local to that definition unless the 
".GLOBAL" directive is used to extend their scope.  In general, global 
macro labels (except, possibly, those defined by "=") are not useful anyway 
since a duplicate label error might occur on the second invocation of the  
macro. 
 
 The macro body consists of a sequence of assembler statements.  Most 
reasonable statements may be included in the macro body.  The standard does 
not require nested macro definitions.  Nor need the macro definitions allow
.RECORD, .TYPE, or .SET definitions (since labels are local to the macro, 
such definitions are dubious anyway). 
  
 Macro parameters are specified using ?0, ?1, ..., ?9, just as for 
.RECORD definitions.  "?#" can be used to determine the actual number of 
parameters present.  "?:expr" can be used to select a parameter using a 
numeric expression.  For example, "?:?#-1" returns the value of the last 
parameter specified.  Default values for the parameters can be specified 
in the .MACRO operand field, or in a .DEFAULT statement, just like specifying 
default values for .RECORD parameters.  E.g., 
 
  MyMacro .MACRO  ?0=0, ?1=2 
   .DEFAULT ?2="Hello there" 
   .BYTE  ?0 
   .WORD  ?1 
   .BYTE  ?2 
   .ENDM 
 
then: 
 
   _MyMacro 10,20 
 
generates the bytes: 
   10, 20, 0, Hello there 
 
 
 Macros, by their very nature, allow a variable number of parameters.
If more parameters are specified than there are references for, the extra 
parameters are ignored.  If fewer parameters are specified than there are 
references for, the additional references will be treated as undefined 
symbols.  If you want to be able to force the user to enter an exact number 
of parameters, then use the ?# in the default field to specify a fixed number 
of parameters.  The following macro definition requires the user to enter 
exactly two parameters whenever TwoParms is invoked: 
 
 TwoParms .MACRO ?#=2 
   lda ?0 
   sta ?1 
   .ENDM 
 
If the number of parameters is fixed at a certain value, default values 
are not allowed in the macro definition. 
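
 The parameter rules above can be summarized with a Python sketch of how an
assembler might bind actual parameters to ?0, ?1, and so on.  It is only an
illustration; "bind_params" and its arguments are hypothetical names.

```python
def bind_params(actuals, defaults=None, fixed_count=None):
    """Map parameter numbers to values for one macro invocation.

    actuals     -- operands supplied on the invocation line
    defaults    -- {number: value} from the .MACRO line or .DEFAULT
    fixed_count -- the value given by ?#=n, if any
    """
    if fixed_count is not None:
        if defaults:
            raise ValueError("defaults are not allowed when ?# is fixed")
        if len(actuals) != fixed_count:
            raise ValueError("wrong number of parameters")
    params = dict(defaults or {})
    # Actual parameters override defaults, position by position.
    params.update(enumerate(actuals))
    return params
```

For the MyMacro example, bind_params([10, 20], {0: 0, 1: 2, 2: "Hello there"})
leaves ?0=10, ?1=20, and ?2 at its default.  A parameter number missing from
the result corresponds to a reference treated as an undefined symbol.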
 
 Since macro parameters, in a macro invocation, are separated by 
commas, you cannot directly create a macro of the form: 
 
 LDAIX  .MACRO ?#=1 
   lda ?0 
   .ENDM 
 
and invoke it by: 
 
   _LDAIX LBL,X 
 
intending the "LDA LBL,X" instruction to be generated.  Instead, the macro 
mechanism will think that LBL and X are two different parameters and generate 
an error since only a single parameter is allowed.  The "<<" and ">>" symbols 
are used as an escape mechanism to parenthesize such operands.  To handle the 
case above, the following statement could be used: 
 
   _LDAIX <<LBL,X>> 
 
and this would generate the instruction "LDA LBL,X". 
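
 A Python sketch of how the parameter list might be split follows.  It
assumes that "<<" and ">>" do not nest, which the text does not address; the
function name "split_args" is hypothetical.

```python
def split_args(text):
    """Split a macro operand field on commas, except inside << ... >>."""
    args, current, escaped, i = [], [], False, 0
    while i < len(text):
        if not escaped and text.startswith("<<", i):
            escaped = True          # start of an escaped operand
            i += 2
        elif escaped and text.startswith(">>", i):
            escaped = False         # end of the escaped operand
            i += 2
        elif text[i] == "," and not escaped:
            args.append("".join(current).strip())
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    args.append("".join(current).strip())
    return args
```

With this sketch "LBL,X" splits into two parameters while "<<LBL,X>>" stays
a single parameter.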
 
 The lookup, value, type, and mode functions are quite useful for 
dealing with macro parameters.  The exact values returned by these functions 
will be described at a later time. 
 
 For additional information on macros and dealing with macro para- 
meters, see the sections on conditional assembly and while loops. 
 
 Macros are required only at the full and extended compliance levels. 
 
 
 
Address Expression Functions: 
----------------------------- 
 
 Format: 
 
  label .FUNC {default parameter values} 
        <function body> 
   .RETURN expr 
   .ENDF 
 
 The .FUNC statement lets programmers define their own address 
expression functions that can be used in operand fields of assembly language 
statements.  The function body typically contains a sequence of equates 
and other value computing statements;  it may not contain any code generating 
statements. 
 
 Like a macro definition, all symbols defined inside an address 
expression function are local to that function.  Likewise,  default parameters 
may be declared in the operand field of the .FUNC statement or via the 
.DEFAULT statement.  Alternately, you can specify that a fixed number of 
parameters are required by using the "?#=expr" item in the operand field 
of the .FUNC statement. 
 
 The expression following the .RETURN statement is the value returned 
by the address expression function.  Note that more than one .RETURN may appear
within the function (perhaps within the confines of a conditional assembly 
sequence).  If more than one .RETURN statement is encountered, all but the 
last are ignored.  The expression returned in the .RETURN operand field may 
contain addressing modes in addition to the actual expression value.  In 
general, anything allowed as a macro parameter can be returned as an address 
expression value. 
 
 An address expression function is invoked by placing the function 
name in some other expression followed by the parameters enclosed within 
parentheses.  The parentheses are required even if the parameter list is 
empty (just like the "C" programming language).  Examples follow: 
 
 StripLONibble .FUNC ?#=1 
 value  = ?0 AND $F0 
   .RETURN value 
   .ENDF 
 ; 
 AppendTXT .FUNC ?#=1 
 string  = ?0 + ".TXT" 
   .RETURN string 
   .ENDF 
 ; 
    . 
    . 
    . 
   LDA #StripLONibble($FF) 
    . 
    . 
    . 
   .BYTE AppendTXT("MyString")
 
The LDA instruction generates

   LDA #$F0

and the .BYTE statement becomes
 
   .BYTE "MyString.TXT" 
 
The latter example demonstrates that address expression functions can 
return any valid type.  This includes strings, records, sets, and any 
other entity allowed in an operand field.  Consider the following: 
 
  LBLX .FUNC ?#=2 
  L = ?0-?1,X 
   .RETURN L 
   .ENDF 
 
   LDA LBLX($100,10) 
 
This generates the code: 
 
   LDA $100-10,X 
 
 
 Address expression functions are required only at the full and 
extended compliance levels. 
 
 
 
 
The Label Type 
-------------- 
 
 The ".LABEL" directive is used to declare a valueless symbol, that is,
one which is defined but is assigned no particular value.  The syntax for the
.LABEL directive is:

   .LABEL symbol1 {, symbol2, ..., symboln}

Each symbol appearing in the operand field is inserted into the symbol table
as a "label" typed symbol.
 
 Label-typed symbols are useful mainly in macros and in the operand fields
of conditional assembly statements.  The only operations you can perform using
label-typed symbols are "=" and "<>".  Most of the reserved symbols in the
assembler (such as A, X, Y, DBR, D, M, S, etc.) are actually label-typed
symbols.
 
 An example of where you might use a label-typed symbol follows: 
 
  CmpReg .MACRO ?#=2 
   .IF ?0=A 
   cmp ?1 
   .ELSE 
   .IF ?0=X 
   cpx ?1 
   .ELSE 
   .IF ?0=Y 
   cpy ?1 
   .ELSE 
   .PAUSE 
   .ENDIF 
   .ENDIF 
   .ENDIF 
   .ENDM 
 
 The "=" equate can also be used to define label-typed symbols by specifying
a label-typed symbol in the operand field, e.g.,
 
  ACC = A 
  XReg = X 
   etc. 
 
Note that the last equate above does not allow you to enter indexed by X 
addressing modes as 
 
   <expression>,XReg 
 
it simply allows you to use a statement of the form: 
 
   .IF XReg=X 
 
and wind up assembling the code after the ".IF".
 
 The ".LABEL" directive is required at the full and extended compliance
levels; it is not required at the subset compliance level.
 
 
 
 
 
Procedures: 
----------- 
 
 At the full and extended compliance levels, the .PROC and .ENDP 
directives can be used to declare 65c816 procedures (subroutines).  Procedure 
declarations take the form: 
 
 procname .PROC {near|far} 
 
      <procedure body> 
 
   .ENDP 
 
If an operand appears after the .PROC statement, it must be either "near" or 
"far".  If no operand appears, "near" is assumed. 
 
 The  procedure name that appears in the label field of the .PROC 
statement is assigned the current value of the location counter at that 
point in the program.  It is also given the type of near procedure or 
far procedure, depending upon the .PROC operand field. 
 
 All labels defined inside a procedure are local to that  procedure 
unless the .GLOBAL directive is used to extend their scope beyond the 
procedure.  Therefore, labels inside one procedure may be reused outside 
that procedure.  If a label inside a procedure is already defined outside
that procedure, no error is generated; instead the new label supersedes
the old one INSIDE THE PROCEDURE (scoping rules are the same as for Pascal).
Procedures may be nested inside one another; the scoping rules used by Pascal
apply in such situations.
 
 Inside the procedure, RET can be used in place of RTS or RTL.  The 
assembler will automatically choose the appropriate version depending upon 
whether the procedure is a near or far procedure.  If RTS is used inside a 
FAR procedure or RTL is used inside a NEAR procedure, the assembler will 
generate a warning. 
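
 The RET rule can be sketched in Python as follows.  The function name
"resolve_return" and the wording of the warning are hypothetical; only the
choice of RTS or RTL and the mismatch warning come from the text above.

```python
def resolve_return(mnemonic, distance):
    """Return (mnemonic to assemble, warning text or None).

    distance is "near" or "far", taken from the enclosing .PROC."""
    expected = "RTL" if distance == "far" else "RTS"
    mnemonic = mnemonic.upper()
    if mnemonic == "RET":
        # RET is replaced by whichever return matches the procedure type.
        return expected, None
    if mnemonic != expected:
        return mnemonic, f"{mnemonic} used inside a {distance.upper()} procedure"
    return mnemonic, None
```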
 
 The assembler automatically assembles JSR using the absolute or 
long addressing mode depending upon the procedure definition.  If the 
assembler supports the JSL mnemonic and a JSL is used to call a NEAR 
procedure, the assembler must generate a warning.  If the address expression
following a JSR was coerced using the ":A" or ":L" suffixes, no warning will 
be generated if the incorrect distance was specified.  I.e., the following 
does NOT generate an error: 
 
   JSR mysub:L 
    . 
    . 
    . 
  mysub .PROC NEAR 
    . 
    . 
    . 
 
If you use a coercion operator, the assembler assumes that you know what 
you are doing. 
 
 Note that the use of the .PROC statement is optional.  You may con- 
tinue to build and call subroutines without the .PROC directive.  However, 
using .PROC allows the assembler to perform additional type checking on 
certain operations.  An external data flow analysis program can also use the 
procedure declarations to help locate logical bugs in your code. 
 
 .PROC and .ENDP are required at all compliance levels of the 
standard. 
 
 
 
 
 
Module Communication Directives: 
-------------------------------- 
 
 Three directives, .GLOBAL, .PUBLIC, and .EXTERNAL, are used to 
communicate symbolic values across procedure, segment, and module boundaries 
(a module is any one source file which is assembled as a whole unit).  The 
.GLOBAL directive is used to make symbols visible outside of procedures, 
macros, functions, and records.  The .PUBLIC directive is used to make 
certain symbols visible outside the current module.  The .EXTERNAL directive 
is used to make symbols defined outside the current module visible within 
the module. 
 
 The syntax for the .PUBLIC and .GLOBAL directives is identical, it 
takes the form: 
 
   .PUBLIC  symbol1 {,symbol2, ..., symboln} 
and,   .GLOBAL  symbol1 {,symbol2, ..., symboln} 
 
A label is not allowed in the label field of either mnemonic.  The symbols 
specified in the operand field of these two instructions are made known 
outside the procedure or module where they currently reside.  If a procedure 
is nested inside another, the .GLOBAL statement makes its symbols known 
only to the procedure encompassing the nested procedure.  In the following 
example, LCL is known only inside procedures X1 and X2, not to the whole
program: 
 
  X1 .PROC 
     . 
     . 
  X2 .PROC 
   .GLOBAL LCL 
     . 
     . 
   .ENDP 
   .ENDP 
 
If you wanted to make LCL visible at the level above X1, then another 
.GLOBAL statement must appear inside the X1 procedure declaring LCL to 
be global to that procedure. 
 
 Another alternative is to use the .PUBLIC statement.  Any symbol 
declared public with .PUBLIC is instantly visible throughout the program 
(within the confines of the scoping rules).  However, keep in mind that 
symbols declared as public are visible outside the current module as well 
and may interfere with other modules.
 
 The .EXTERNAL directive is used to obtain access to symbols declared 
outside the current module.  The syntax for the .EXTERNAL directive is: 
 
  .EXTERNAL  symbol1:type {,symbol2:type, ..., symboln:type} 
 
Again, no label is allowed in the label field of the .EXTERNAL directive. 
The type item is any of NEAR, FAR, CONST, DIRECT, ABS, or LONG.   
 
 Note: symbols declared with "=", .MACRO, .RECORD, .SET, and .TYPE 
may not appear as operands to the .GLOBAL, .PUBLIC, or .EXTERNAL directives. 
 
 These directives are not required at the subset compliance level, 
only at the full and extended levels. 
 
 
 
 
Segments: 
--------- 
 
 Segments are used to group a collection of logically and physically 
related entities within a program.  A segment may contain the program code, 
variables, stack area, direct page area, or other such data.  Typically 
a segment is a load module.  That is, a segment is loaded as a whole into 
memory.  If a program consists of two or more segments, they need not all 
reside in memory at the same time.  The memory manager/loader may load 
segments as needed into memory. 
 
 Segment definitions are required at all compliance levels.  All 
programs must consist of at least one segment (this is a source of minor 
incompatibility with existing assemblers).  The most general form of the 
segment definition is: 
 
 label .SEGMENT TYPE=expr {,ALIGN=expr} {,ORG=expr} {,NOCODE} 
 
       <segment body> 
 
  .ENDS 
 
 
.SEGMENT lets you declare any general type of segment.  The symbol in the 
label field need not be unique, but if it is redefined elsewhere within the 
current scope, it must appear on a .SEGMENT definition whose type is exactly 
the same as the current definition. 
 
 Unlike .PROCs, .MACROs, etc.,  symbols defined inside a segment are 
not local to the segment, but are instantly visible to the rest of the
module.  If you need to declare local variables within a segment, use the 
.LOCAL and .RELEASE directives. 
 
 The type of segment must be specified in the .SEGMENT operand field. 
The actual segment types will be defined at a later date.  For now, assume  
the types used by the Apple //GS loader are specified after the TYPE= item. 
The segment type describes the attributes of the segment, attributes such 
as whether the segment is relocatable or absolute, fixed or movable, etc. 
 
 The optional ALIGN operand is used to determine some number of bytes 
to which this segment (portion) must be aligned.  If ALIGN=1 , the segment 
will be aligned on any byte boundary.  If ALIGN=2 then the segment will be 
aligned on a word boundary, etc.  Any value between 1 and $10000 can be used 
(ALIGN=$10000 will align the segment on a bank boundary). 
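
 One way to read the ALIGN operand is that the segment's starting address
is rounded up to the next multiple of the ALIGN value.  The Python sketch
below shows that interpretation; it is an assumption, since the text does not
spell out the arithmetic, and "align_up" is a hypothetical name.

```python
def align_up(address, align):
    """Round address up to the next multiple of align (1..$10000)."""
    if not 1 <= align <= 0x10000:
        raise ValueError("ALIGN must be between 1 and $10000")
    return ((address + align - 1) // align) * align
```

Under this reading, ALIGN=2 moves a segment that would start at $1235 to
$1236, and ALIGN=$10000 moves one at $012345 to $020000.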
 
 The ORG=expr option can be used to fix the starting address of the 
segment.  This option isn't normally used with code-generating segments. 
It's mainly used to define I/O port addresses and other absolute variables. 
 
 The NOCODE option is used to declare that a segment will not generate 
any code (i.e., it's just used to declare variables).  If any 65c816 instruct- 
ion appears in a NOCODE segment, an error will be generated.  All data 
declaring pseudo-opcodes (e.g., .BYTE) must specify indeterminate values else 
an error will be reported. 
 
 If multiple segments with the same name appear in a module (or 
across modules, for that matter), they will be combined into a single, 
contiguous segment by the assembler and/or linker.  Consider the following:
 
 MyCode  .SEGMENT Type=$1AF 
      . 
      . 
      . 
   .ENDS 
 ; 
 MyData  .SEGMENT Type=$100 
      . 
      . 
      . 
   .ENDS 
 ; 
 MyCode  .SEGMENT Type=$1AF 
      . 
      . 
      . 
   .ENDS 
 
 
Although MyCode appears in two completely disjoint areas, the assembler/linker 
will combine these items into a single segment.  Segments appear in the 
load module in the order they are declared in the source file.  In the 
example above,  segment MyCode appears before segment MyData (even though 
a portion of MyCode appears after MyData, MyCode was still declared before 
MyData). 
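
 The combining and ordering rule can be sketched in Python.  The function
name "combine_segments" is hypothetical, and each segment portion is
modelled as a name paired with a list of its contents.

```python
def combine_segments(pieces):
    """pieces: list of (segment name, list of items) in source order.
    Returns the combined segments in order of first declaration."""
    combined = {}
    for name, body in pieces:
        # A dict keeps keys in insertion order, so a name's position is
        # fixed by the first portion that declares it.
        combined.setdefault(name, []).extend(body)
    return list(combined.items())

# Models the MyCode / MyData / MyCode example above.
layout = combine_segments(
    [("MyCode", ["code1"]), ("MyData", ["data1"]), ("MyCode", ["code2"])]
)
```

Here "layout" places MyCode (both portions) ahead of MyData.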
 
 Segments may be nested, but they don't follow any scoping rules. 
Declaring one segment inside another is no different than declaring those
two segments completely separate. 
 
 If you have two separate segments (different names but the same 
type), you can combine them together using the .GROUP directive.  This 
directive takes the form 
 
 label .GROUP seg1, seg2 {,seg3, ..., segn} 
 
Referring to "label" refers to the segment obtained by combining the 
segments in the .GROUP operand field. 
 
 To simplify segment usage, there are six predeclared segments.  They
may be declared with the directives: 
 
  .CODE .DATA .DIRECT 
  .STACK .VAR .CONST 
 
.CODE is used to declare static, code-generating segments which allow 
65c816 instructions.  .DATA is used to declare static data-generating 
segments.  .CONST is identical to .DATA except data items inside the 
.CONST directive are read-only.  Any attempt to write to items inside a 
.CONST segment should generate an error by the assembler or data flow 
analysis programs.  .DIRECT is used to declare segments containing direct 
page variables.  This is a NOCODE segment, so only definitions are allowed, 
initial values are illegal.  .STACK segments are also NOCODE segments.  They 
are useful for declaring stack space down in bank zero.  The .VAR segment 
is used like the .DATA segment, except .VAR segments are NOCODE segments. 
They are used for declaring uninitialized variables in main RAM.
 
 The syntax for these six directives is 
 
 label .xxxx {ALIGN=expr | ORG=expr} 
 
       <segment body> 
 
  .ENDS 
 
 
 
 
The ASSUME Directive 
-------------------- 
 
 With the addition of the bank registers and the mode bits in the 
65c816 processor, an assembler can no longer determine the proper addressing 
mode to use in all circumstances without help from the programmer.  For 
example, if the assembler encounters an instruction of the form "LDA Label" 
and Label is a statement label inside some segment (i.e., not declared with 
EDP, EQU, EQL, or other type-defining directive), it has no idea whether to 
use the direct, absolute, or long addressing mode.  To do so would require 
that the assembler know the current values of the direct page and data bank 
registers at assembly time.  Frankly, it is not possible for the assembler 
to always know the content of these registers, hence the programmer must 
manually supply this information to the assembler.  This information, as well 
as some other useful information, is supplied to the assembler via the 
.ASSUME directive. 
 
 The .ASSUME directive uses the syntax: 
 
   .ASSUME operand1 {,operand2, ..., operandn} 
 
where operand(i) is one of the following: 
 
 DBR:expression24 
 DBR:NOTHING 
 DP:expression16 
 DP:NOTHING 
 M:expression1 
 M:NOTHING 
 X:expression1 
 X:NOTHING 
 CPU:cpu_type 
 
where expression24 is an expression yielding a 24-bit value, expression16 is 
an expression yielding a 16-bit value, expression1 is an expression yielding 
zero or one, NOTHING is a reserved word, and cpu_type is one of {6502, 65c02, 
65802, 65816} or one of the later versions of the 65c816 microprocessor. 
 
 DP (direct page) is used to let the assembler know where the direct 
page register is pointing.  If a segment name is given as the expression, 
that segment must be one that resides in bank zero and is of type DIRECT. 
If the assembler encounters a symbol declared in a segment that is assumed 
to be a direct page segment via the DP:expression operand, the assembler will 
reference that location using the direct page addressing mode (if possible).
If the "DP:NOTHING" form is used, the assembler will only use the direct page 
addressing mode if a symbol was declared with the EDP equate.  None of the 
segments will be treated as direct page segments, even if they were declared 
as type DIRECT.  If you want to simultaneously refer to several segments as 
direct page segments, group them together using the .GROUP directive and 
specify the group name as the expression value after the DP:, i.e., 
 
 DPGroup  .GROUP DPSeg1, DPSeg2, DPSeg3 
   .ASSUME DP:DPGroup 
 
By default, the assembler should assume DP:NOTHING. 
 
 
 DBR is used to tell the assembler which segment/bank the DBR (data  
bank register) points at.  References to variables within that segment will 
be assembled as absolute references (unless that segment name is also 
specified after DP:expr, in which case the direct page addressing mode will 
be used, if possible).  If DBR:NOTHING is specified, absolute addressing will 
be used only for those symbols declared via EQU; all other references will
be assumed to be long references.  Note that the H.O. eight bits of the 
24-bit expression are used.  Therefore, to set the DBR assumption to an 
absolute bank in memory, an expression of the form: 
 
   .ASSUME DBR:$200000 ;Assume DBR=$20 
 
must be used.  By default, the assembler should assume DBR:NOTHING. 
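
 The DP and DBR rules above amount to a small decision procedure.  The
following Python sketch is an illustration only and is not part of the
proposal.  The names dbr_bank and addressing_mode, the symbol kinds, and the
segment names are hypothetical.  Segments are compared by name only, so
.GROUP membership and the "if possible" range checks are ignored.

```python
NOTHING = None  # stands in for the reserved word NOTHING


def dbr_bank(expression24):
    """Return the bank implied by DBR:expression24 (its H.O. eight bits)."""
    return (expression24 >> 16) & 0xFF


def addressing_mode(declared_by, segment, dp=NOTHING, dbr=NOTHING):
    """Pick 'direct', 'absolute', or 'long' for a reference like LDA Label.

    declared_by -- 'EDP', 'EQU', or 'label' (a statement label in a segment)
    segment     -- name of the segment holding a statement label, else None
    dp, dbr     -- segment names given to .ASSUME, or NOTHING
    """
    if declared_by == "EDP":
        return "direct"      # EDP symbols are direct page even under DP:NOTHING
    if declared_by == "EQU":
        return "absolute"    # EQU symbols are absolute even under DBR:NOTHING
    # Statement label: a DP assumption takes precedence over a DBR assumption.
    if dp is not NOTHING and segment == dp:
        return "direct"
    if dbr is not NOTHING and segment == dbr:
        return "absolute"
    return "long"            # nothing is known about the segment


print(hex(dbr_bank(0x200000)))                      # bank for DBR:$200000
print(addressing_mode("label", "Data", dbr="Data"))
```

With the defaults (DP:NOTHING and DBR:NOTHING) every statement label falls
through to the long addressing mode, which is why a program normally
issues an .ASSUME before referencing its data segments.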
 
 Normally, a programmer should use "#" and "=" to specify eight or 
sixteen bit immediate operand sizes.  To help ensure upwards compatibility 
with existing source code, a mechanism has been added whereby the "#" is 
used and the .ASSUME directive controls the size of immediate operands.  This 
task is achieved using the M:expr and X:expr operands.  Normally the assembler 
defaults to M:NOTHING and X:NOTHING.  In this mode,  "#" specifies 8-bit 
immediate operands and "=" specifies 16-bit operands.  If the expression 
following the M or X is zero or one,  then any immediate operand containing 
an equal sign is flagged as an error and the "#" specifies an eight-bit 
operand if the expression was 1, a sixteen-bit operand if the expression was 
zero.  If the expression evaluates to any other value an error is generated. 
Note that M only affects accumulator and memory operations while X affects 
the index register operations.  It is perfectly permissible to have an 
.ASSUME of the form: 
 
   .ASSUME  M:NOTHING,X:1 
 
The "=" immediate specifier would be allowed for accumulator operations but 
not for X/Y index register operations. 
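
 The interaction between "#", "=", and the M:/X: operands can be stated
as a short function.  This Python sketch is illustrative only; the name
immediate_bits is hypothetical, and the same function is assumed to serve
both M (accumulator and memory operations) and X (index register operations).

```python
NOTHING = None  # stands in for the reserved word NOTHING


def immediate_bits(specifier, assume=NOTHING):
    """Return 8 or 16, the size of an immediate operand.

    specifier -- '#' or '=' as written in the operand field
    assume    -- NOTHING, 0, or 1: the M: (or X:) value from .ASSUME
    """
    if specifier not in ("#", "="):
        raise ValueError("immediate operands start with '#' or '='")
    if assume is NOTHING:
        # Default mode: "#" is eight bits, "=" is sixteen bits.
        return 8 if specifier == "#" else 16
    if assume not in (0, 1):
        raise ValueError("M:/X: expression must evaluate to zero or one")
    if specifier == "=":
        raise ValueError('"=" is an error while M:/X: is zero or one')
    # "#" alone: one means eight bits, zero means sixteen bits.
    return 8 if assume == 1 else 16


print(immediate_bits("#"))      # default mode
print(immediate_bits("#", 0))   # after .ASSUME M:0
```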
 
 To help ensure compatibility with the existing de facto standard,
LONGI, LONGA, SHORTI, and SHORTA should be provided as built-in macros 
generating the appropriate .ASSUME statement. 
 
 The "CPU:cpu_type" operand to the .ASSUME statement lets users 
specify the exact 6500 family CPU they are using.  The effect of this 
operand is to "disconnect" certain instructions.  If a certain CPU is 
specified and a programmer uses an addressing mode or instruction which isn't 
available on that CPU, the assembler will generate an error.  By default, 
the assembler should assume the CPU of the machine on which the assembler 
is intended to run (e.g., 65c816 for Apple //GS machines).  If the assembler 
is running on a processor other than a 6500 family chip, it should
default to 65c816.  The user should be able to choose this default value 
from an assembler set-up program. 
 
 The .ASSUME directive, and all operands available to it, must be 
supported at all compliance levels. 
 
 
 
Local Symbols 
------------- 
 
 In addition to local labels automatically specified inside procedures, 
macros, and expression functions, you can also explicitly declare local sym- 
bols within the source file.  User-defined local symbols come in two 
varieties: numeric and symbolic. 
 
 Up to 10 active numeric local labels can be specified at any given 
time.  The numeric local labels are similar to those used by D. E. Knuth 
in "The Art of Computer Programming, Vol 1", although the syntax is different. 
Numeric local labels are declared by placing a caret (up-arrow) in front of 
a single decimal digit in the label field.  Examples follow: 
 
  ^0 LDX #05 
  ^9 DEX 
  ^4 LDA LBL 
 
Numeric local labels are referenced with the ">n" and "<n" items, where "n" 
represents a single decimal digit.  If the greater than symbol prefaces a 
digit, then the next occurrence of that numeric local label in the source 
file is referenced.  If a less than ("<") symbol is used, then the previous 
numeric local label is used.  Examples: 
 
 
   LDX #5 
  ^0 CLR Array,X 
   DEX 
   BPL <0   ;References 2nd line above. 
  ; 
   LDA Array+2 
   bne >0  ;References ^0 below. 
   TXA 
  ^0 STA Array+1 
 
Note that multiple occurrences of the same numeric local label may appear 
within the program.  They are differentiated by the "<" and ">" symbols.
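
 One way an assembler might resolve these references is sketched below
in Python.  This is an illustration under stated assumptions, not a required
implementation: the function name and the (line, digit) representation are
hypothetical, and a label on the same line as the reference is assumed not
to match in either direction.

```python
def resolve_numeric_label(reference, ref_line, definitions):
    """Resolve '<n' or '>n' to the line number of the matching '^n' label.

    reference   -- a string such as '<0' or '>0'
    ref_line    -- line number on which the reference appears
    definitions -- list of (line_number, digit) pairs, one per '^n' label
    """
    if len(reference) != 2 or reference[0] not in "<>" or not reference[1].isdigit():
        raise ValueError("expected '<' or '>' followed by one decimal digit")
    direction, digit = reference[0], int(reference[1])
    lines = sorted(line for line, d in definitions if d == digit)
    if direction == "<":
        earlier = [line for line in lines if line < ref_line]
        if earlier:
            return earlier[-1]   # nearest previous occurrence
    else:
        later = [line for line in lines if line > ref_line]
        if later:
            return later[0]      # nearest following occurrence
    raise LookupError("no matching numeric local label for " + reference)


# The nine-line example above: "^0" appears on lines 2 and 9.
labels = [(2, 0), (9, 0)]
print(resolve_numeric_label("<0", 4, labels))   # BPL <0 on line 4
print(resolve_numeric_label(">0", 7, labels))   # bne >0 on line 7
```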
 
 Since "<" and ">" may appear both as operators and as the beginning 
of an operand, a minor ambiguity results.  If you see a portion of an ex- 
pression like ">0",  does it mean  'is some value greater than zero' or does 
it refer to the next occurrence of "^0"?  This is easily handled from context. 
If the ">" or "<" appears where an operator is expected, then the appropriate 
operation is performed.  If they appear where an operand is expected and they 
are followed by a single decimal digit, then they are used as lead-ins for 
numeric local labels.  Otherwise an error must be generated. 
 
 Numeric local labels are great for those cases where you need to 
perform a short branch or to set up a small loop and you don't want to use 
meaningless labels like "loop1",  "SkipInstr12", etc.  Other times, you
may want to use a meaningful name like "MainLoop" or "ElseQuit", without 
having to worry about conflicts in other parts of the program.  Such cases 
are easily handled by the symbolic local label facility specified by this 
proposal.  Two assembler directives, .LOCAL and .RELEASE, are used to define
the scope of user-specified local labels.  The syntax for these two directives
is identical:
 
   .LOCAL label1  {,label2, ..., labeln} 
   .RELEASE label1 {,label2, ..., labeln} 
 
A label defined with .LOCAL is confined to the scope of the .LOCAL/.RELEASE 
pair.   .LOCAL/.RELEASE pairs may be nested, allowing you to redefine a symbol
to any reasonable depth (say, a minimum of 8 levels). 
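
 Nested .LOCAL/.RELEASE pairs behave like a stack of bindings for each
symbol name.  The Python sketch below shows one possible model; the class
and method names are hypothetical, forward references and multiple passes
are ignored, and a lookup is assumed to see only the innermost binding.

```python
class SymbolTable:
    """Toy symbol table modelling .LOCAL/.RELEASE scoping."""

    def __init__(self):
        self._bindings = {}   # name -> stack of values, innermost last

    def _stack(self, name):
        # Slot 0 of every stack holds the global binding (None = undefined).
        return self._bindings.setdefault(name, [None])

    def local(self, *names):
        """.LOCAL label1 {,label2, ...}: open a fresh scope for each name."""
        for name in names:
            self._stack(name).append(None)

    def release(self, *names):
        """.RELEASE label1 {,label2, ...}: restore the enclosing binding."""
        for name in names:
            stack = self._stack(name)
            if len(stack) == 1:
                raise ValueError(".RELEASE without .LOCAL: " + name)
            stack.pop()

    def define(self, name, value):
        stack = self._stack(name)
        if stack[-1] is not None:
            raise ValueError("duplicate symbol: " + name)
        stack[-1] = value

    def lookup(self, name):
        value = self._stack(name)[-1]
        if value is None:
            raise KeyError(name)
        return value


table = SymbolTable()
table.define("MainLoop", 0x1000)     # global MainLoop
table.local("MainLoop")              # .LOCAL MainLoop
table.define("MainLoop", 0x2000)     # redefinition inside the pair
inner = table.lookup("MainLoop")
table.release("MainLoop")            # .RELEASE MainLoop
outer = table.lookup("MainLoop")
print(hex(inner), hex(outer))
```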
 
 Numeric local labels are required at all compliance levels.  Symbolic 
local labels are required at the full and extended compliance levels. 
 
 
 
Conditional Assembly 
-------------------- 
 
 Conditional assembly is handled by the  .IF, .ELSE, .IF1, .IF2, 
.IFDEF,  .IFNDEF, and .ENDIF directives.  .IF is followed by a numeric 
address expression that yields a zero (false) or non-zero (true) result. 
The following code (up to the .ELSE or .ENDIF) is assembled if the result is 
true.  Otherwise the code after the .ELSE (if it is present) is assembled in 
its stead.   .IF1 and .IF2 assemble their respective code during passes 
one and two.  .IFDEF and .IFNDEF accept a single symbol as their parameter 
and test whether or not this symbol is currently defined.   The .ELSE 
directive can be used to assemble additional code in the event the tested 
condition is false.  Finally, the .ENDIF directive is used to terminate 
a conditional assembly sequence. 
 
 Conditional assembly blocks can be nested to at least eight levels, 
preferably more.  Since all conditional assembly blocks are terminated with 
.ENDIF, there is no need to worry about matching .ELSEs as you would, say, 
in Pascal.  Every form of the IF statement is terminated with its own .ENDIF. 
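
 Because every block ends with its own .ENDIF, nesting can be tracked
with a simple stack.  The following Python sketch illustrates the idea; it
is not a required implementation.  The function name is hypothetical,
directives are assumed to be the first word on a line, the .IF operand is
assumed to be a single symbol or decimal number, and .IF1/.IF2 are omitted.

```python
def filter_conditionals(lines, symbols):
    """Return the source lines that survive conditional assembly.

    lines   -- source lines as strings
    symbols -- dict mapping defined symbol names to integer values
    """
    output = []
    stack = []      # one [parent_active, condition, seen_else] per open block
    active = True   # True while the current line should be assembled

    def evaluate(directive, operand):
        if directive == ".IFDEF":
            return operand in symbols
        if directive == ".IFNDEF":
            return operand not in symbols
        value = symbols[operand] if operand in symbols else int(operand)
        return value != 0    # zero is false, non-zero is true

    for line in lines:
        words = line.split()
        directive = words[0].upper() if words else ""
        if directive in (".IF", ".IFDEF", ".IFNDEF"):
            # Operands inside a skipped block are not evaluated.
            condition = active and evaluate(directive, words[1])
            stack.append([active, condition, False])
            active = condition
        elif directive == ".ELSE":
            if not stack or stack[-1][2]:
                raise SyntaxError(".ELSE without a matching .IF")
            parent_active, condition, _ = stack[-1]
            stack[-1][2] = True
            active = parent_active and not condition
        elif directive == ".ENDIF":
            if not stack:
                raise SyntaxError(".ENDIF without a matching .IF")
            active = stack.pop()[0]
        elif active:
            output.append(line)
    if stack:
        raise SyntaxError("missing .ENDIF")
    return output


source = [".IF DEBUG", " LDA #1", ".IF 0", " NOP", ".ELSE", " INX",
          ".ENDIF", ".ELSE", " LDA #0", ".ENDIF"]
print(filter_conditionals(source, {"DEBUG": 1}))
```

The sketch places no limit on nesting depth; an assembler meeting this
proposal only has to guarantee the eight levels mentioned above.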
 
 The .IF1 and .IF2 directives are normally used to print messages and 
perform other minor housekeeping chores.  In general, there's absolutely no 
reason why anyone would want to generate code inside one of these conditional 
assembly blocks.  Therefore, the assembler may optionally generate an error 
message if the location counter is modified anywhere inside the .IF1/.IF2 
conditional assembly block. 
 
 .IF, .ELSE, and .ENDIF are required at all compliance levels.  .IF1, 
.IF2, .IFDEF, and .IFNDEF are required only at the full and extended com- 
pliance levels. 
 
 
 
While Loops 
----------- 
 
 Sometimes, especially within macros, you will need some sort of 
looping structure to process parameters or otherwise generate sequences of 
code;  the .WHILE/.ENDW directives are used for this purpose.  The syntax 
for the while section is: 
 
   .WHILE expression 
      <body of loop> 
   .ENDW 
 
The instructions in the loop body are repeated as long as the expression 
yields a non-zero value.  For the loop to terminate, the variable(s) 
controlling the loop must be defined using the "=" assembler directive 
since this is the only directive that allows you to redefine an instance 
of a variable. 
 
 The .WHILE directive is especially useful for  processing a macro 
(or record definition) with a variable number of parameters.  Consider the 
following macro: 
 
 ByteTable .MACRO 
 ParmCnt  = ?# 
   .WHILE ParmCnt 
   .BYTE ?:(?#-ParmCnt) 
 ParmCnt  = ParmCnt-1 
   .ENDW 
   .ENDM 
 
   _ByteTable  0,5,4,2,7 
 
This example emits the five bytes  0, 5, 4, 2, and 7 into the object code 
stream. 
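
 To see why the bytes come out in the order given, it helps to trace
the loop.  The Python sketch below mimics the macro step by step.  It is an
illustration only: the function name is hypothetical, and it assumes, as the
example implies, that "?#" is the parameter count and that "?:(n)" selects
parameter n counting from zero.

```python
def byte_table(*params):
    """Mimic the ByteTable macro: emit one byte per macro parameter."""
    emitted = []
    parm_cnt = len(params)                  # ParmCnt = ?#
    while parm_cnt:                         # .WHILE ParmCnt
        index = len(params) - parm_cnt      # ?#-ParmCnt runs 0, 1, 2, ...
        emitted.append(params[index])       # .BYTE ?:(?#-ParmCnt)
        parm_cnt = parm_cnt - 1             # ParmCnt = ParmCnt-1
    return bytes(emitted)                   # .ENDW / .ENDM


print(list(byte_table(0, 5, 4, 2, 7)))
```

ParmCnt counts down from five to zero while the index ?#-ParmCnt counts
up from zero to four, so the parameters are emitted left to right.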
 
INCLUDE Mechanism 
----------------- 
 
 A source file include mechanism is provided by the .INCLUDE directive. 
Its syntax is 
 
   .INCLUDE "filename" 
 
The specified file will be inserted at the point of the .INCLUDE directive 
in the current assembly, as though the code were actually inserted at that 
point. 
 
 The include mechanism must be capable of nested includes up to four 
levels deep.  The .INCLUDE directive must be supported at all compliance 
levels of the assembler, although assemblers operating at the subset 
compliance level need not support nested include files. 
 
 
 
 
Programs, Modules, and Units 
---------------------------- 
 
 The assembler handles three types of source files: programs, modules,
and units.  Unless otherwise specified, all source files are assumed to be 
programs.  A program is differentiated from a module or unit in that the 
assembler/linker assumes that control is transferred to some point in a 
'program' when it is loaded into memory.  Modules and units are assumed to 
be subservient sections of code that contain data and/or code used by
programs. 
 
 By default, a piece of code is assumed to be a program and control 
is transferred to the first byte of that code when the program is loaded 
into memory.  This helps improve compatibility with existing source files. 
The .PROGRAM directive can be used to explicitly declare a piece of code as 
a main program, as well as provide an entry address other than the first byte 
of code emitted.  The syntax for the .PROGRAM directive is 
 
   .PROGRAM label 
 
where "label" is a program statement label somewhere within the current 
assembly.  The address of this label is passed on to the linker/loader where 
it will be used to provide a starting address for the code.   All of the 
statements in the source file will be assembled into the program from the 
.PROGRAM directive till the .END directive.  If a .PROGRAM directive appears 
in the source file, it must appear before any other statement (other than a 
comment or listing directive) and there may only be one .PROGRAM directive 
encountered per assembly.  No modules or units may appear as part of a 
program assembly (see below). 
 
 The .MODULE directive is used to tell the assembler that it is 
assembling an object code module which is to be linked into a separate
program before execution.   The .PUBLIC statement is used as the means 
to communicate linkage information to other modules, units, and programs. 
Like the .PROGRAM directive,  the .MODULE directive must appear before most 
statements in the source file and the module is terminated with the .END 
directive.  However,  another module may appear in the source file immediately 
after the .END directive.  Such modules are assembled as independent entries 
in a library.  The syntax for the .MODULE directive is: 
 
   .MODULE  ModuleName 
 
The module name operand is stored as part of the object file for use by the
linker, but is not otherwise referenced during the assembly process.  In fact,
this symbol may be redefined later in the source file. 
 
 The .LINK directive can be used to link a module into another module, 
unit, or program at assembly time.  The syntax for this directive is: 
 
   .LINK  "filename",ModuleName 
 
where filename is the operating system name of the object code file or 
library file containing the module, and ModuleName is the actual module 
name specified with the .MODULE directive.  The specified object code is 
inserted into the assembly at the point of the .LINK directive.  Access to 
the symbols declared public within the module is accomplished using the 
.EXTERNAL directive. 
 
 Units are a much more structured form of modules.  With a unit, 
you specify not only the symbols visible to the code using the unit, but 
also how that data is used.  Units also allow you to pass type checking 
information so the assembler can check for possible logical errors during 
assembly.  Finally, as an added bonus, within units you can link in macros, 
records, types, symbols defined by "=", and other entities that cannot be 
handled by modules and the .PUBLIC/.EXTERNAL mechanism. 
 
 A unit takes the form: 
 
   .UNIT UnitName 
 
      <interface section> 
 
   .BEGIN 
 
      <implementation section> 
 
   .END 
 
Like .MODULEs, several units may appear in the source file by simply following 
the .END directive with the next unit definition.  In fact, .MODULEs and  
.UNITs can be intermixed in the same source file.  If more than one module or 
unit appears in the source file, they will be assembled into different slots 
in the object file generated (i.e., a library file will be generated). 
 
 The interface section of a unit contains those items that will be 
public to the unit.  Equates, records, macros, types, sets, and any other 
non-code generating declaration can be used in the interface section (note: 
an exact list of items will be specified later).  Such definitions will be 
made available to the code that uses this unit as well as to the code in the 
implementation section.  In addition to such declarations, the interface 
section may also contain .PROC definitions and .ENTRY definitions.  The 
.PROC definitions simply contain the .PROC statement (which must also appear
in the implementation section); the .ENTRY definition is used in lieu of the
.PUBLIC directive and takes the form: 
 
  label .ENTRY {NEAR or FAR} 
 
An example of a simple unit might be: 
 
   .UNIT SimpleUnit 
  MyMac .macro 
   lda #0 
   sta ?0 
   .endm 
  ; 
  ClrSub .proc near 
  SetTrue .proc far 
  SetIt .entry far 
  ; 
   .BEGIN  ;Start implementation section.
  ; 
  ClrSub .proc near 
   _MyMac $11 
   ret 
   .endp 
  ; 
  SetTrue .proc far 
   lda #1 
  SetIt sta $23 
   ret 
   .endp 
   .end 
 
 
 To use the code defined in a unit, the ".USE" directive is used in 
a fashion not unlike the .LINK directive, namely, 
 
   .USE "filename",UnitName 
 
where filename is the operating system pathname and UnitName is the name 
specified in the operand field of the unit directive.  Whenever the .USE 
directive appears in a source file, the content of the implementation section 
will be listed if the source listing option is turned on. 
 
 Whenever the .USE or .LINK directives are employed, the corresponding  
object code is always inserted into the assembly.  Therefore the assembler 
is performing double duty; it's acting as both the assembler and linker.
With units, the assembler always performs the link operation.  With modules, 
you can defer the link operation to a separate linkage step, although there 
are only a few instances where this would be beneficial (for example, while 
creating libraries). 
 
 All of the program linkage directives are optional at the subset 
compliance level, but required at the full and extended levels. 
 
 
 
Listing Controls 
---------------- 
 
 Several directives are used to control the appearance of the assembled
source listing.  The exact format of the listing will be specified within
this proposal (although at a later date).  The exact listing format must
be adhered to so that symbolic debuggers can take advantage of an assembled 
source listing saved as a text file for use when stepping through a program. 
 
   .ON operands 
   .OFF operands 
 
 These two directives are used to turn certain listing options on
or off.  Valid operands include  LIST, OBJ, MAC, and COND.  LIST controls
whether or not the source file is listed and supersedes all other options.
OBJ (if on) will force the assembler to display all bytes of object code 
emitted by an instruction, even if it takes more than one line to display it 
all;  if off, OBJ will only display the number of emitted object code bytes 
that fit on the current source line.  MAC controls macro expansions during 
the listing.  If off, only the macro name, not the expansion, will be dis- 
played.  COND controls the printing of statements in a false conditional or 
while loop section. 
 
 The .TITLE and .SUBTITLE directives let you assign titles and sub- 
titles to the source file.  The syntax for these directives is 
 
   .TITLE  "Title of source file" 
   .SUBTITLE "Subtitle for this section" 
 
The title is displayed at the top of each page and the subtitle is displayed 
immediately below the title.  .TITLE always forces a page eject, .SUBTITLE 
never does. 
 
 The .PAGE directive forces an immediate page ejection on the listing. 
It requires no operands. 
 
 The .PRINTF directive has the syntax: 
 
   .PRINTF "Control string" {,operands} 
 
It is used in a manner analogous to the PRINTF in the "C" programming 
language.  If expressions follow the control string, "%" modifiers in the 
control string specify their output format.  E.g., 
 
   .PRINTF "Label = $%4h",Label 
 
would print 
 
   Label = $1234 
 
assuming the value associated with Label was $1234. 
 
 
 The .PAUSE directive can be used to force an assembly time error. 
It is useful mainly in macros, records, expression functions, etc. to force 
an error if an illegal condition (like bad number of parameters) occurs. 
 
 The listing control directives are required only at the full and 
extended compliance levels. 
 
 
 
Data Flow Analysis Directives 
----------------------------- 
 
 The following directives are quite useful to add-on debuggers and 
data flow analysis programs.  They are required only at the full and extended 
compliance levels: 
 
  label .table 
        <data table> 
   .endt 
 
For .table, the label is assigned the current value of the location counter 
and label is treated like a statement label.  .TABLE and .ENDT are otherwise 
ignored. 
 
  label .REF label1 {, label2, ..., labeln} 
 
This statement is ignored by the assembler.  The statement label, if present, 
is also ignored. 
 
 
 
Other Optional Goodies 
---------------------- 
 
 The following are not required by this proposal, but should be 
provided nonetheless: 
 
   .system  "DOS command" 
 
.SYSTEM issues the specified command to the operating system.  This command 
is useful for deleting files during assembly, changing directories, etc.
 
 
 
 
 
Operation of the Assembler 
-------------------------- 
 
 Given the structure of the assembler, there's no way it can accomplish 
its job in less than three passes without placing severe burdens on the 
user (I could provide you with a mathematical proof of this, but I don't want 
to bore you to death).  Therefore, the standard specifies that the assembler 
must use three (or more) passes to do its job.  During the first pass the 
assembler associates labels with segments (and groups of segments), determines 
whether or not those symbols are near or far, and performs other housekeeping 
chores fit for pass one.  Pass two of the assembler is equivalent to the 
traditional pass one of an assembler; it computes the values for all of the
symbols in the program.  Pass three generates the actual object code. 
 
 
  
In Addition to the Assembler... 
------------------------------- 
 
 The standard should also include specifications for a run-time 
library to be provided with the assembler as well as a list of tools 
(e.g., debugger, linker, librarian, etc.) which must be provided with the 
product to meet the full compliance level.  I would like to propose the  
following items in the run-time library: 
 
 TTY_IO: A set of routines to communicate with a text-based 
user console.  INIT, GETC, and PUTC are the basic routines.  These three 
routines are easily supported on any system supporting a user console. 
 
 TERMINAL_IO: A set of routines to communicate with a cursor-based 
terminal device.  Routines supported should include INIT, GETC, PUTC, GOTOXY, 
HOME, CLREOLN, and CLREOP. 
 
 CONSOLE_IO: A set of routines to communicate with a DMA-based video 
display device.  See the specifications for ANIX's CHARIO driver for the 
routines to be supplied with this library entry. 
 
 AUX_IO: A driver for a set of one or more serial communication ports. 
Routines should include INITA, SETUPA, GETA, PUTA, and STATUSA.
 
 PRT_IO: A driver for a set of one or more printer ports.  Routines 
should include INITP, SETUPP, PUTP, and STATUSP. 
 
 NET_IO: A driver for a set of one or more network ports.  Routines 
should include INITN, SETUPN, GETPacket, SendPacket, etc. 
 
 CLK_IO: A driver for a real time clock or clock-calendar unit. 
 
 FP: An IEEE floating point package for the 65c816 chip. 
 
 MATH: A set of integer math routines (multiply, divide, extended 
precision, etc.). 
 
 CONV: A set of conversion routines (binary -> decimal, etc.). 
 
 FILE_IO: A set of routines that interface to the host's operating 
system providing a common interface to various operating systems. 
 
 DVC_IO: A hardware independent device I/O package (allowing named
devices which can be connected through a BIOS (like the AUX_IO and PRT_IO
packages) to various hardware devices).
 
 STD_IO: A set of routines to perform various I/O operations such 
as PRINT, PRINTF, SCANF, PUTI (integer), GETI, PUTH (hex), GETH, etc. 
 
 MEM_MGR: A set of memory management routines to efficiently allocate 
and deallocate memory. 
 
  
This is, by no means, an exhaustive list, but a quick sample of the types of 
routines that should be provided. 
 
 Apple //GS users may complain that many of these routines already 
exist within the confines of the Apple toolbox.  The intent, however, is to 
provide a set of useful routines that can be utilized on ANY 65c816 system 
so 65c816 code can be easily ported to systems other than the Apple //GS. 
 
