Chapter 6502-3 – The Assembler – Programming by Design

(Updated December 16, 2024)

Programs

Let’s take a 10,000-foot tour.

Our IDE includes an assembler, which is a form of compiler. It gets its name because it assembles the mnemonics, addresses, names, and such for a given CPU and translates them into machine code. A compiler, by contrast, translates a highly abstracted, higher-level language (C, Java, etc.) into another language. That other language might be machine code, or it might be bytecode or some other intermediate form.

The assembler, like a compiler, translates source code from human-readable form (mnemonics, names, addresses, literals) into machine code. The machine code is the series of bytes, in a specific order, that represents the steps necessary to perform the programmed task.
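To make "a series of bytes in a specific order" concrete, here is a minimal sketch in Python of how three instructions could be turned into bytes. The opcode values are the standard 6502 ones; the `encode` helper and the table layout are hypothetical and are not how the IDE's assembler is actually written.

```python
# Opcode bytes for a few 6502 instructions, keyed by (mnemonic, addressing mode).
OPCODES = {
    ("ldx", "immediate"): 0xA2,
    ("inx", "implied"): 0xE8,
    ("brk", "implied"): 0x00,
}

def encode(mnemonic, mode, operand=None):
    """Return the machine-code bytes for one instruction (hypothetical helper)."""
    opcode = OPCODES[(mnemonic, mode)]
    if mode == "implied":
        # Implied-mode instructions are a single opcode byte.
        return bytes([opcode])
    # Immediate mode: the opcode is followed by a one-byte operand.
    return bytes([opcode, operand & 0xFF])

program = (
    encode("ldx", "immediate", 0)
    + encode("inx", "implied")
    + encode("brk", "implied")
)
print(program.hex(" "))  # a2 00 e8 00
```

The order matters: the CPU reads `a2`, knows from that opcode that one operand byte follows, and only then moves on to `e8`.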

Our assembler is primitive. It does not have many bells and whistles, but it will suffice for the work we must do to understand how the 6502 assembly language works.

We are writing our programs in a simple editor and then assembling them.

Important Note!

If you grab assembler source code from the Internet and drop it into this assembler, expecting it to assemble and run, you may find yourself mildly disappointed. While assemblers share a general format for writing code, there are subtle differences between them, and similar features may have slightly different syntax.

One well-known build system that has a very robust 6502 assembler is cc65 (its assembler is named ca65).

Let’s look at a simple program and identify the components necessary to write effective programs.

Printing.asm

define CHROUT $ffd2

    ldx #0        ; set index to zero
print:
    lda words,x   ; load a letter from words
    beq done      ; if it's the zero, we're done
    jsr CHROUT    ; print the character
    inx           ; increment the index
    bne print     ; keep printing
done:
    brk           ; end!

; data below this line

words:
    txt "The quick brown\nfox jumps over\n"
    txt "the lazy dog.\n"
    dcb 0

So, let’s break down what this program represents. In addition to the actual 6502 instructions, there are some assembler-specific pseudo-operations (pseudo-ops). Here is what each of them does in the code.

The define directive allows us to use the name CHROUT instead of having to remember and type $ffd2.
We define the labels print, done, and words. Labels are a useful way to mark locations in the program without identifying specific memory locations.
Two directives are used to allocate memory for storage. The txt directive indicates the quoted string should be stored as a series of bytes (without the quotes), while dcb (define constant byte) allows us to define a value to be placed in memory at the point of the dcb in the program.
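The three directives can be modeled in a few lines of Python. This is only a sketch of the idea: the function names `txt` and `dcb` mirror the directives, but the implementation is hypothetical, and escape sequences such as \n are deliberately left out because the byte the IDE emits for them is not stated here.

```python
def txt(text):
    """Model of txt: one ASCII byte per character; the quotes are not stored."""
    return text.encode("ascii")

def dcb(*values):
    """Model of dcb: each value becomes one byte at the current position."""
    return bytes(v & 0xFF for v in values)

# Model of define: a name-to-value table. It emits no bytes of its own.
defines = {"CHROUT": 0xFFD2}

data = txt("dog.") + dcb(0)
print(list(data))               # [100, 111, 103, 46, 0]
print(hex(defines["CHROUT"]))   # 0xffd2
```

Note that only txt and dcb contribute bytes to the program; define only tells the assembler what to substitute when it sees the name.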

Historical Note!

Over the years, assemblers have had categories of reserved words unrelated to specific CPU mnemonics. These have had many names like control commands, pseudo-ops, and directives. In general, they all mean the same thing. In some way, they will affect how the assembler produces a finished program.

The Assembler

In the previous section, we quickly looked at a typical 6502 assembly language program. We are using the assembler to create very primitive programs. This means multiple things:

We are writing code at a very low level. Writing programs will initially feel like a struggle because we need to be meticulous with our code.
Programs will be textually longer than in higher-level languages. This is, of course, why the higher-level languages were designed.
In addition to knowing the ins and outs of the CPU language, you need to understand how the assembler works and the additional language used to describe how the code should perform.

By way of example, here are two programs to print characters in a string:

Print a string (Java)	Print a string (6502)

public class Main {
  public static void main(String[] args) {
    String s = "The quick brown\nfox jumps over\nthe lazy dog.";
    for (int x = 0; x < s.length(); x++)



      System.out.print(s.charAt(x));



  }






}

define CHROUT $ffd2


    ldx #0        ; set index to zero
print:
    lda words,x   ; load a letter from words
    beq done      ; if it's the zero, we're done
    jsr CHROUT    ; print the character
    inx           ; increment the index
    bne print     ; keep printing
done:
    brk           ; end!

; data below this line

words:
    txt "The quick brown\nfox jumps over\n"
    txt "the lazy dog.\n"
    dcb 0

The extensive use of whitespace in the Java code attempts to line up the for loop and the print() statement with the code that fetches and prints a character in the assembly language version.

The following Java code

    for (int x = 0; x < s.length(); x++)
      System.out.print(s.charAt(x));

is equivalent to

    ldx #0        ; set index to zero
print:
    lda words,x   ; load a letter from words
    beq done      ; if it's the zero, we're done
    jsr CHROUT    ; print the character
    inx           ; increment the index
    bne print     ; keep printing
done:

You may begin to understand why higher-level languages were developed. Of course, using higher-level languages also means you do not have to know every CPU on the planet to port your program to another platform. You would simply use the compiler on that system, and it would generate the code necessary for that machine.

Breakdown

In the previous chapter, along with the registers, we introduced categories of instructions. Now, we will begin to put those instructions to use to move data around and begin to solve problems. The example from above will be used to explain some finer details of the assembler while getting us acquainted with the language.

Here is the same program with significantly more detail in the comments.

Author's Note

Those with experience in higher-level languages may find these details over-explained. This is fine. Some will take to this language in moments. However, we want to make sure there is a thorough comprehension of these primitive instructions. So, on occasion, we will over-document for the sake of the learner.

; We can use CHROUT instead of $FFD2 throughout the program.
; This provides more readability in the code.
define CHROUT $ffd2

    ; Load the X register with 0 as a starting point.
    ldx #0

; Set a label here, marking the top of our loop.
print:

    ; This is indexed addressing (array). We take the address of words and
    ; add to it the current value of X. Then, fetch data from that address.
    lda words,x

    ; If the value loaded into A is 0, this will set the Z flag
    ; and we take the branch to done.
    beq done

    ; otherwise, print the character
    jsr CHROUT

    ; increment X and, as long as it has not wrapped around to zero, go to the top of the loop.
    inx
    bne print

; Set another label we can branch to when finished.
done:

    ; STOP the program.
    brk

; data below this line

; Another label that represents the memory location of our sentence.
words:

    ; Use the TXT pseudo-op to direct the assembler to turn our string into
    ; a sequence of ASCII characters in memory.
    txt "The quick brown\nfox jumps over\n"
    txt "the lazy dog.\n"

    ; Use the DCB pseudo-op to mark the end of the string with the null character.
    dcb 0

Now let's have a free-form discourse on the goings on...

Our IDE provides a handful of primitive library routines. The routine at address $FFD2 will print the character in the accumulator as ASCII. This is a throwback to the Commodore 64, whose KERNAL provides a similar routine at the same location.

The define directive doesn't allocate any memory. Rather, it defines a name that the assembler replaces with a value, such as a location in memory. This is incredibly useful because we don't need to remember memory locations.

Our use of labels serves a similar purpose, but it goes a bit deeper. The labels are maintained by the assembler and represent the memory location where that label occurred in the source code. This is hugely beneficial. Before assemblers were created, it was up to the programmer to know where in memory they needed to branch. Subsequent changes to the source code meant these locations likely changed and had to be recomputed. The programmer was then responsible for changing every occurrence of this location in the code! You can imagine how miserable that would have been!
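The bookkeeping that labels save us from can be sketched in Python. The sketch below walks the Printing.asm program and records the address at which each label appears. The instruction sizes are the real 6502 sizes; the starting address of $0600 and the `resolve_labels` helper are assumptions made for illustration, not details of the IDE.

```python
# Each entry is ("label", name) or ("code", size_in_bytes).
# The sizes are those of the instructions in Printing.asm.
source = [
    ("code", 2),         # ldx #0
    ("label", "print"),
    ("code", 3),         # lda words,x
    ("code", 2),         # beq done
    ("code", 3),         # jsr CHROUT
    ("code", 1),         # inx
    ("code", 2),         # bne print
    ("label", "done"),
    ("code", 1),         # brk
    ("label", "words"),
]

def resolve_labels(lines, origin):
    """Walk the program and record the address where each label appears."""
    labels = {}
    address = origin
    for line in lines:
        if line[0] == "label":
            labels[line[1]] = address   # a label takes no space itself
        else:
            address += line[1]          # an instruction advances the address
    return labels

labels = resolve_labels(source, origin=0x0600)   # origin is an assumption
for name, addr in labels.items():
    print(f"{name}: ${addr:04x}")
# print: $0602
# done: $060d
# words: $060e

# Adding a single one-byte instruction at the top moves every label.
shifted = resolve_labels([("code", 1)] + source, origin=0x0600)
print(f"words: ${shifted['words']:04x}")   # words: $060f
```

The last two lines show the problem described above: one extra byte at the top shifts every later address, and the assembler recomputes them all for us.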

The TXT and DCB pseudo-ops allocate memory. Used with labels, they provide the makings of some primitive variables, as in higher-level languages.

The JSR instruction calls a subroutine. This is just like calling a function or method in a higher-level language. The subroutine returns when it executes RTS (not shown here).
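For the curious, here is a small Python model of how JSR and RTS cooperate through the stack. The push and pull order and the "return address minus one" detail are standard 6502 behavior; the `Cpu` class itself, the initial stack pointer of $FF, and the address $0607 for the JSR are assumptions chosen for the example.

```python
class Cpu:
    """A toy model holding only what JSR and RTS need."""

    def __init__(self):
        self.memory = bytearray(65536)
        self.sp = 0xFF    # stack pointer (assumed start); the stack is $0100-$01FF
        self.pc = 0       # program counter

    def push(self, value):
        self.memory[0x0100 + self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF

    def pull(self):
        self.sp = (self.sp + 1) & 0xFF
        return self.memory[0x0100 + self.sp]

    def jsr(self, target):
        # pc points at the JSR opcode, and the instruction is 3 bytes long.
        # The 6502 pushes the address of the LAST byte of the JSR, high byte first.
        ret = (self.pc + 2) & 0xFFFF
        self.push(ret >> 8)
        self.push(ret & 0xFF)
        self.pc = target

    def rts(self):
        # Pull low byte, then high byte, and add 1 to land on the next instruction.
        low = self.pull()
        high = self.pull()
        self.pc = (((high << 8) | low) + 1) & 0xFFFF

cpu = Cpu()
cpu.pc = 0x0607        # assumed address of `jsr CHROUT`
cpu.jsr(0xFFD2)
print(hex(cpu.pc))     # 0xffd2
cpu.rts()
print(hex(cpu.pc))     # 0x60a
```

After RTS, execution resumes at $060A, the instruction immediately following the three-byte JSR.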

The next bit is some of the magic of the processor.

    lda words,x

We load the accumulator with a character from the string words. This is done by taking the fixed address of words and adding to it the value of the X register. The resulting address is used to fetch the next character, which is stored in A. This is absolute indexed addressing, one of the forms of absolute addressing.
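The address arithmetic can be modeled in Python as follows. This is a sketch only: the address $060E for words is an assumed value, and `lda_absolute_x` is a hypothetical helper that mimics the fetch rather than a real instruction decoder.

```python
memory = bytearray(65536)        # the 6502 address space: 64 KB
WORDS = 0x060E                   # assumed address of the words label
memory[WORDS:WORDS + 4] = b"The\x00"

def lda_absolute_x(base, x):
    """Model of `lda base,x`: fetch the byte at base + X (X is 8 bits wide)."""
    return memory[(base + (x & 0xFF)) & 0xFFFF]

for x in range(4):
    print(x, lda_absolute_x(WORDS, x))
# 0 84
# 1 104
# 2 101
# 3 0
```

Because X is an 8-bit register, this form of addressing can reach at most 255 bytes past the base address, which is why the loop also ends if INX wraps X back to zero.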

Once we have the character, we must determine when to stop printing. This is where the zero byte in the DCB comes in. The BEQ instruction is shown below.

    beq done

This branch instruction checks the zero flag, which is set when we load the accumulator with the zero value at the end of the string. One limitation of branching on the 6502 is that the target must lie within 127 bytes forward or 128 bytes back, measured from the instruction that follows the branch.
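The branch range comes from the way the target is stored: as a single signed byte. A rough Python sketch of the calculation an assembler has to perform is shown below. The `branch_offset` helper is hypothetical, and the addresses assume the Printing.asm program was assembled starting at $0600.

```python
def branch_offset(branch_address, target_address):
    """Return the offset byte for a branch instruction at branch_address.

    The offset is measured from the address of the next instruction,
    which is branch_address + 2 because a branch is two bytes long.
    """
    offset = target_address - (branch_address + 2)
    if not -128 <= offset <= 127:
        raise ValueError(f"branch out of range: {offset}")
    return offset & 0xFF    # two's-complement encoding in one byte

# beq done: branch at $0605 (assumed), done at $060D, so 6 bytes forward.
print(hex(branch_offset(0x0605, 0x060D)))   # 0x6

# bne print: branch at $060B (assumed), print at $0602, so 11 bytes back.
print(hex(branch_offset(0x060B, 0x0602)))   # 0xf5
```

A target farther away than the signed byte allows raises an error in this sketch; in real code the usual answer is to branch around a JMP instruction, which takes a full 16-bit address.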

Finally, BRK ends the program, and the control returns to the IDE.

Features

There are some very nice features in this assembler within the IDE. The version we are using has a small graphical area based on 16 colors (see Chapter 1), a text output area, and a few subroutines for making some useful programs using the 6502 assembly language.

These are defined in the Notes section of the IDE.

define SCINIT $ff81 ; initialize/clear screen
define CHRIN  $ffcf ; input character from keyboard
define CHROUT $ffd2 ; output character to screen
define SCREEN $ffed ; get screen size
define PLOT   $fff0 ; get/set cursor coordinates

The routine for CHROUT we've already seen. What we'll look at next is CHRIN. An example program is provided below to show some basic I/O.

BasicIO.asm

define CHROUT  $ffd2
define CHRIN   $ffcf
define MAXNAME 32
define CR      $0d

    ldx #0        ; set index to zero

    ; ask them their name
printQuery:
    lda query,x
    beq endp
    jsr CHROUT
    inx
    bne printQuery
endp:

    ; read in the name
    ldx #0
getName:
    jsr CHRIN
    cmp #0        ; no char returned
    beq getName   ; try again

    ; was it the return key?
    cmp #CR
    beq endg

    ; store the char
    sta name, x
    jsr CHROUT
    inx

    ; stop if there are 32 chars!
    cpx #MAXNAME
    bcc getName
endg:
    ; print a CR and terminate the name
    lda #CR
    jsr CHROUT
    lda #0
    sta name, x 

    ldx #0
sayHi:
    lda hello, x
    beq endh
    jsr CHROUT
    inx
    bne sayHi
endh:

    ldx #0
prName:
    lda name, x
    beq endn
    jsr CHROUT
    inx
    bne prName

endn:
    lda end
    jsr CHROUT
    brk

query:
    txt "What is your name? "
    dcb 0
hello:
    txt "Hello, "
    dcb 0
end:
    txt "!"
    dcb 0
name:
    dsb 32
    dcb 0

Now, it's ok that some or all of this might be unclear. These are some examples that utilize the assembler embedded in our IDE. All of what you see in this code will be explained in the upcoming chapters.
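As a preview, the getName loop in BasicIO.asm can be read as the following Python sketch. It is a model, not a translation the IDE produces: `read_name` is a hypothetical function, and the `keys` list stands in for the values successive CHRIN calls would return. The one 6502 detail worth knowing is that CPX sets the carry flag when X is greater than or equal to the operand, so BCC (branch if carry clear) loops only while X is less than MAXNAME.

```python
MAXNAME = 32
CR = 0x0D

def read_name(keys):
    """Model of the getName loop; `keys` stands in for successive CHRIN results."""
    name = bytearray(MAXNAME + 1)   # dsb 32 followed by dcb 0
    x = 0
    for key in keys:
        if key == 0:            # cmp #0 / beq getName: no character yet, try again
            continue
        if key == CR:           # cmp #CR / beq endg: return key ends the input
            break
        name[x] = key           # sta name,x
        x += 1                  # inx
        if not x < MAXNAME:     # cpx #MAXNAME / bcc getName: loop only while X < 32
            break
    name[x] = 0                 # lda #0 / sta name,x: terminate the name
    return bytes(name[:x])

print(read_name([0, 72, 105, CR, 65]))   # b'Hi'
print(len(read_name([65] * 40)))         # 32
```

The buffer is one byte larger than MAXNAME so that the terminating zero still fits when the user types all 32 characters.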

Errors

As you write programs, you'll write code that doesn't make sense to the assembler. Our IDE is not very robust, and the assembler is less so. It's not a bad assembler; it's just primitive in its abilities. Certain enhancements have been made to the original, but one very lacking area is error reporting. You will not get the detailed error messages that more advanced compilers provide.

So, what do you do? Research the instruction or pseudo-op and make sure it's being used correctly.
