Machine Code

(Updated July 22, 2024)

Overview

Programming CPUs and computers has a long history. This document offers a historical introduction to how systems were programmed at the lowest level, and to how complex that work was.

The details contained here concern the 6502 and 6510 processors. The 6510 was used in the Commodore 64. These processors were chosen for their simplicity and because their machine code can be presented meaningfully.

If you are unfamiliar with the number systems used in computing (binary, decimal, and hexadecimal), you may want to read about them first.

CPU Detail

In this document, we mention several registers of the 6502. Think of the registers as a piece of scratch paper the CPU uses to track what’s happening. The CPU doesn’t know anything that’s not recorded on that paper. It can copy values from paper to memory and from memory to paper, but the only details the CPU knows firsthand are on that paper.

Like the word programming, the term machine code may conjure different images for different people. Some may imagine old-time punch cards, paper tape, tape drives, or sequences of binary digits arranged just so. These are all valid mechanisms for representing machine code in some form. More precisely, though, the program defined by the machine code must be in memory before the CPU can do the work the code represents.

Since memory is typically a contiguous space (no breaks) and our programs are likely also to be contiguous, one possible representation of machine code could be just a stream of bits:

101011100011010001000000101011010011010101000000001000000010101101000000

Historical Note

There was once a time when bits were entered at the front of the computer by flipping switches, with each switch position giving the value of one bit. The ENIAC computer (completed in 1945) initially had to be programmed by rewiring it and setting switches. At that point, it could only run that one program. New programs required rewiring and new switch positions.

If we arrange the bits into groups of 8, forming bytes, we can see it more plainly as:

10101110 00110100 01000000 10101101 00110101 01000000 00100000 00101011 01000000

Each byte in our program has a purpose. Multiple bytes could be combined to make more complex instructions. Of course, binary is tricky to work with. Ben Eater has a great video showing how he programs his primitive breadboard computer using a set of switches to represent the bits in a given memory location. (This also gives away the rest of the show!)

Staring at all the ones and zeroes presents little variation, so data entry becomes clumsy and errors creep in. So we could go with decimal:

174 52 64 173 53 64 32 43 64

Indeed, that provides some variation, and we could talk the numbers out to ourselves to put them into the system with potentially fewer errors. However, decimal is generally worse as it doesn’t lend itself to seeing the binary digits. So, hexadecimal (and, in some cases, octal) became one of the preferred choices since we can pack 4 bits into every hexadecimal digit.

     A    E
AE (1010 1110)
34
40
AD
35
40
20
2B
40
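The conversions above can be reproduced in a few lines. The following is a minimal sketch in Python, used here only for illustration; the names `BITS` and `split_into_bytes` are made up for this example. It groups the bit stream shown earlier into bytes and prints each byte in binary, decimal, and hexadecimal.

```python
# The 72-bit stream shown earlier, as a string of '0' and '1' characters.
BITS = "101011100011010001000000101011010011010101000000001000000010101101000000"


def split_into_bytes(bits: str) -> list[int]:
    """Group a bit string into 8-bit chunks and return them as integers."""
    if len(bits) % 8 != 0:
        raise ValueError("bit stream length must be a multiple of 8")
    return [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]


program = split_into_bytes(BITS)

# Print the same byte in three notations: binary, decimal, hexadecimal.
for value in program:
    print(f"{value:08b}  {value:3d}  {value:02X}")
```

The first line printed is `10101110  174  AE`, which matches the first byte in the list above.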

Opcodes and Instructions

Now, we must ask ourselves why we needed all those bits, bytes, and hexadecimal digits. What was the point? Those values represent opcodes for the 6502 CPU. Well, opcodes and data. See, all CPUs only deal with two forms of information – opcodes and data. That’s it. Oh, opcode is short for operation code – the numeric value that represents the action to be performed by the CPU.

Some of those numbers represent opcodes, and the rest are data. So which ones are which? We always start with an opcode, since data wouldn’t make any sense without a frame of reference. For the 6502, all opcodes are precisely one byte.

CPU Detail

The original 6502 CPU had exactly 56 unique operations which it could perform. Because each operation can be combined with different ways of accessing memory (we call them addressing modes), there were a total of 151 opcodes. Since a byte has 256 possible values, all opcodes can be represented in a single byte.

After the one-byte opcode, we can have zero, one, or two bytes of data. The number of bytes of data the opcode needs is baked into it.
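As a rough illustration of how the operand count is "baked into" the opcode, the Python sketch below walks a byte sequence the way the CPU would, using a small lookup table. The table `OPERAND_BYTES` and the function `split_instructions` are hypothetical helpers written for this example. The table covers only the three opcodes that appear in our example bytes (AE, AD, and 20, each followed by two bytes on the 6502); a complete table would list all 151 opcodes.

```python
# Operand byte counts for the three opcodes used in this document's example.
# Hypothetical, deliberately incomplete table: the real 6502 has 151 opcodes.
OPERAND_BYTES = {
    0xAE: 2,  # a two-byte address follows
    0xAD: 2,  # a two-byte address follows
    0x20: 2,  # a two-byte address follows
}


def split_instructions(code: bytes) -> list[bytes]:
    """Split raw machine code into one bytes object per instruction."""
    instructions = []
    position = 0
    while position < len(code):
        opcode = code[position]
        if opcode not in OPERAND_BYTES:
            raise ValueError(f"opcode {opcode:02X} is not in this small table")
        length = 1 + OPERAND_BYTES[opcode]  # opcode byte plus its operands
        if position + length > len(code):
            raise ValueError("instruction is cut off at the end of the code")
        instructions.append(code[position:position + length])
        position += length
    return instructions


example = bytes.fromhex("AE 34 40 AD 35 40 20 2B 40")
for instruction in split_instructions(example):
    print(instruction.hex(" ").upper())
```

The nine bytes come out as three instructions of three bytes each. The opcode is the only thing that tells the reader where one instruction ends and the next begins.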

So, consider:

AE 34 40

These bytes say to load the X register (byte AE) with the value in memory location $4034 (bytes 34, 40). The CPU knows that opcode AE will perform that task and will need two more bytes to see where the data will be fetched from. Let’s break this down:

The CPU reads the AE opcode, which tells it to load the X register with a value from a memory location (absolute addressing).
It then reads 34 and 40 from memory, knowing this represents the 16-bit address $4034.
The 8-bit data value in memory location $4034 is then fetched.
The fetched value is stored in the X register.
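The four steps above can be traced in a short Python sketch. This is only a model of the idea, not an emulator: it understands a single opcode, ignores the status flags a real 6502 would also update, and uses made-up variable names. The instruction is placed at $4000, and the data byte $42 at $4034 is an assumed value chosen for the example.

```python
memory = bytearray(65536)  # the 6502's 64 KB address space, all zeroes

# Place the three instruction bytes at $4000 and an assumed data value
# at $4034 so there is something to fetch.
memory[0x4000:0x4003] = bytes([0xAE, 0x34, 0x40])
memory[0x4034] = 0x42

program_counter = 0x4000
x_register = 0

# Step 1: read the opcode.
opcode = memory[program_counter]
assert opcode == 0xAE  # the only opcode this sketch understands

# Step 2: read the two address bytes that follow, low byte first.
low = memory[program_counter + 1]
high = memory[program_counter + 2]
address = (high << 8) | low

# Step 3: fetch the 8-bit value stored at that address.
value = memory[address]

# Step 4: store the fetched value in X and move past the instruction.
x_register = value
program_counter += 3

print(f"address=${address:04X}  X=${x_register:02X}")
```

After the last step, the program counter points at $4003, where the CPU expects to find the next opcode.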

Endianness

You may have noticed that we store the address $4034 as the bytes 34 and 40, in that order. This ordering is due to the endianness of the CPU. CPUs are generally either little-endian or big-endian. The 6502 is a little-endian CPU, meaning multi-byte values are stored little end (least significant byte) first. So, $4034 is in memory as 34, then 40. If this were a big-endian CPU, it would be stored as 40, then 34.
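To make the two byte orders concrete, here is a small Python sketch (Python is used only for illustration) that converts $4034 to two bytes in each order using the built-in `int.to_bytes`, then rebuilds the address from the little-endian pair by hand.

```python
address = 0x4034

little = address.to_bytes(2, byteorder="little")
big = address.to_bytes(2, byteorder="big")

print(little.hex(" ").upper())  # 34 40  (how the 6502 stores it)
print(big.hex(" ").upper())     # 40 34  (how a big-endian CPU would)

# Reading the little-endian bytes back: low byte plus high byte times 256.
low, high = little
rebuilt = low + high * 256
print(f"${rebuilt:04X}")        # $4034
```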

The opcode we’ve been working with, AE, represents an instruction called LDX. This is a mnemonic shorthand for LoaD X. (There is also a corresponding STX or STore X.) These mnemonic instructions represent the 56 unique operations of the 6502, and we tend to reference them by these names rather than the numeric values.

The three bytes we’ve been looking at are better written as:

LDX $4034

This is where we move into assembly language.

Assembly Language

To make the programmer’s job significantly easier, tools such as monitors and assemblers were created. These gave the programmer a much easier environment to write code for a given CPU. No more entering opcodes as binary or hexadecimal numbers!

The monitor provided a text-based command-line interface. With it, you could type in assembly code you had written out beforehand, and the monitor would assemble each line into the proper byte sequence, store it in memory, and prompt for the next line. The monitor did not provide more advanced features such as memory abstraction (i.e., variables) or forward address calculation. The programmer was responsible for knowing the size of each instruction in order to calculate addresses and have the program run correctly.

Consider the following program.

LDX $4034
LDA $4035
JSR $402B
LDX $4036
LDA $4037
JSR $402B
CLC
LDA $4036
ADC $4034
STA $4038
LDA $4037
ADC $4035
STA $4039
LDX $4038
LDA $4039
JSR $BDCD
LDA #$0D
JMP $FFD2
RTS

While this is a valid assembly language program, it requires the programmer to know a great deal about where things are in memory. For example, the program starts at address $4000. There are also three two-byte memory areas set aside for the calculation:

$4034 - $4035 ==> first number
$4036 - $4037 ==> second number
$4038 - $4039 ==> sum of the first and second number

The program also uses one subroutine written by the programmer and two subroutines provided by the system ROMs.

$402B ==> prints a number followed by a new line (user subroutine)
$BDCD ==> prints a number (BASIC subroutine)
$FFD2 ==> prints a character (KERNAL subroutine)
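The address bookkeeping that falls to the programmer can be sketched in Python. The sizing rule in `instruction_size` is a simplification that is only valid for the program above: a line with no operand takes 1 byte, an immediate operand such as #$0D takes 2, and a four-digit address takes 3. The function names and the `PROGRAM` constant are invented for this example.

```python
PROGRAM = """LDX $4034
LDA $4035
JSR $402B
LDX $4036
LDA $4037
JSR $402B
CLC
LDA $4036
ADC $4034
STA $4038
LDA $4037
ADC $4035
STA $4039
LDX $4038
LDA $4039
JSR $BDCD
LDA #$0D
JMP $FFD2
RTS""".splitlines()


def instruction_size(line: str) -> int:
    """Size in bytes of one line, using a rule that only fits this program."""
    parts = line.split()
    if len(parts) == 1:
        return 1  # opcode only (CLC, RTS)
    if parts[1].startswith("#"):
        return 2  # opcode plus a one-byte immediate value
    return 3      # opcode plus a two-byte address


def list_addresses(lines, start):
    """Return (address, line) pairs and the first address after the program."""
    address = start
    listing = []
    for line in lines:
        listing.append((address, line))
        address += instruction_size(line)
    return listing, address


listing, end = list_addresses(PROGRAM, 0x4000)
for address, line in listing:
    print(f"${address:04X}  {line}")
print(f"first byte after the program: ${end:04X}")
```

Running this shows that `JSR $BDCD` sits at $402B, so the user subroutine is simply the tail end of the program, and that the first byte after the final RTS is $4034, where the first number is stored. Adding or removing a single instruction would shift both addresses, and every reference to them would have to be corrected by hand.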

You can quickly see why tools that provide more abstract views of the program are so beneficial. The assembler program knows how to read and convert assembly language (written text) into machine code. There may be other steps involved with getting the program into memory, but the assembler is the first step in getting a program into proper working order.

As with higher-level languages, an assembly language development environment is designed to hide much of what we’ve discussed thus far and to provide basic abstractions. Where a higher-level language offers variables, assembly language programmers get similar functionality by using names and labels. We can then have a form of the program that is easier to manage, like the following.

         *= $4000    ; start at address $4000

define linprt $bdcd
define chrout $ffd2

         ldx num1
         lda num1+1
         jsr printxa
         ldx num2
         lda num2+1
         jsr printxa

adding:  clc
         lda num2
         adc num1
         sta sum
         lda num2+1
         adc num1+1
         sta sum+1

         ; this is printing the sum
prtsum:  ldx sum
         lda sum+1
printxa: jsr linprt

         lda #13
         jmp chrout  ; rts from call

         rts

num1:    dcw 3650
num2:    dcw 1217
sum:     dcw 0

With this form of assembly language, the programmer is freed from the bonds of knowing where everything lives in memory. We use names for everything, and the assembler is burdened with keeping track of where in memory everything will live.
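How an assembler might keep track of those names can be sketched in Python. This is an illustrative model under simplifying assumptions, not the implementation of any particular assembler: the listing above is reduced by hand to (label, size) pairs, and `SOURCE`, `collect_labels`, and `resolve` are hypothetical names.

```python
# The listing above, reduced to (label or None, size in bytes) per line.
# Sizes: 3 for an instruction with a 16-bit address, 2 for "lda #13",
# 1 for clc and rts, and 2 for each dcw word.
SOURCE = [
    (None, 3), (None, 3), (None, 3),       # ldx num1 / lda num1+1 / jsr printxa
    (None, 3), (None, 3), (None, 3),       # ldx num2 / lda num2+1 / jsr printxa
    ("adding", 1),                         # clc
    (None, 3), (None, 3), (None, 3),       # lda num2 / adc num1 / sta sum
    (None, 3), (None, 3), (None, 3),       # lda num2+1 / adc num1+1 / sta sum+1
    ("prtsum", 3), (None, 3),              # ldx sum / lda sum+1
    ("printxa", 3),                        # jsr linprt
    (None, 2), (None, 3), (None, 1),       # lda #13 / jmp chrout / rts
    ("num1", 2), ("num2", 2), ("sum", 2),  # dcw words
]


def collect_labels(source, origin):
    """Walk the source once, recording the address of every label."""
    symbols = {"linprt": 0xBDCD, "chrout": 0xFFD2}  # the two define lines
    address = origin
    for label, size in source:
        if label is not None:
            symbols[label] = address
        address += size
    return symbols


def resolve(expression, symbols):
    """Turn an operand such as 'num1' or 'num1+1' into an address."""
    name, plus, offset = expression.partition("+")
    return symbols[name] + (int(offset) if plus else 0)


symbols = collect_labels(SOURCE, 0x4000)
for operand in ("num1", "num1+1", "printxa", "sum+1"):
    print(f"{operand:8} = ${resolve(operand, symbols):04X}")
```

With the origin at $4000, the labels resolve to the same addresses the earlier hand-written program used: num1 is $4034 and printxa is $402B. If an instruction is added to the source, every label after it moves automatically.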

Hopefully, the reader now appreciates the advent of higher-level languages and how they’ve allowed us to develop programs more quickly and with fewer errors than writing in assembly language.