(Updated November 20, 2024)
Table of contents
Overview
BASIC History
BASIC Statement Structure
BASIC Language
BASIC Constructs
Afterward
VICE
Overview
The BASIC (Beginner’s All-purpose Symbolic Instruction Code) language was created by John Kemeny and Thomas Kurtz at Dartmouth in 1964. It was a simple, unstructured language that was reasonably easy to learn.
BASIC was prevalent in the early days of personal computing in the late 1970s and 1980s. It was available on many platforms and often licensed from Microsoft. For the Commodore 64 (hereafter C64), the user programmed in what was known as BASIC version 2.
C64 BASIC V2, as it was called, had 71 keywords used to create commands or lines of a BASIC program. Programs were made up of many numbered lines of code and, if you were lucky enough to own one, stored on some form of external storage device for later retrieval.
This document provides examples to help Java, C, and Python programmers adapt simple programs to the BASIC programming language. Because BASIC is rather primitive compared to modern languages, many features you may expect are simply missing. As in C, additional facilities are left to the programmer to provide.
BASIC is a global-scope language, which means there are no boundaries or local-scope concepts. Variables can be clobbered at will, even in subroutines, so great care is needed to write stable programs.
BASIC History
The BASIC language was first available in 1964. Dartmouth had just completed its timesharing system, which allowed multiple programmers access to computing technology simultaneously. Fast-forward a little more than a decade, and that same computing ability began to enter our homes.
When personal computers first appeared in the late 1970s, there were no extravagant interfaces. You were provided a simple prompt and expected to know what to do next. There were some traditional forms of simple documentation, but the user was expected to learn through experimentation. Your interface was the ability to enter a line of text to be interpreted. No interactive text editors existed, at least not without an additional cost. Plus, memory was in short supply. It was expensive, and you had to do your best with what you had. This led to creative interfaces.
If you look at the computers of the era, they all shared a common, primitive look and feel (each link is an online emulator):
- Tandy TRS-80 Model I and III (Radio Shack) (Model I: August 1977, Zilog Z80 @1.77MHz, 4-48KB; Model III: July 1980, Zilog Z80/Z80A @2.03MHz, 4-48KB)
- Apple II (press RESET) (June 1977, MOS 6502 @1MHz, 4-48KB)
- Atari 400/800 (1979, MOS 6502B @1.79MHz, 8-48KB)
- Commodore VIC-20 (1981, MOS 6502 @1MHz, 5-32KB)
- Texas Instruments TI-99/4A (June 1981, Texas Instruments TMS9900 @3MHz, 16KB)
- Commodore 64 (August 1982, MOS 6510 @1MHz, 64KB)
- Tandy TRS-80 MC-10 (Radio Shack) (1983, Motorola MC6803 @890kHz, 4-20KB)
The interface was so simple that the first character of a line decided what happened to it. If the line of text began with a letter, it was assumed that you wanted to execute a command or standalone BASIC statement immediately. If it started with a digit, the line was added to the program that was a work in progress.
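The rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular machine implemented it; the function name classify_entry and the tuples it returns are hypothetical.

```python
def classify_entry(text):
    """Decide whether a typed line is a program line or an immediate command.

    Returns ("program", line_number, statement) when the text starts with a
    digit, otherwise ("direct", statement). Hypothetical helper for illustration.
    """
    stripped = text.strip()
    digits = ""
    for ch in stripped:
        if not ch.isdigit():
            break
        digits += ch
    if digits:
        return ("program", int(digits), stripped[len(digits):].strip())
    return ("direct", stripped)


print(classify_entry('10 print "hello"'))   # would be stored as line 10
print(classify_entry('print "hello"'))      # would be executed right away
```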
A well-formatted BASIC program might look like the following:
10 input "please enter your name:"; n$
20 print "hello, " ; n$ ; ", good to meet you!"
30 end
PRINT use is similar to printf() in C and Java. The INPUT command is like nextLine() in Java's Scanner and fgets() in C.
END as the last line was not always enforced. The original standard indicated it was necessary, and a missing END could produce a warning or error. Most platforms eventually stopped caring whether it was present. We will drop its use from the remainder of the document except where an intentional END is required.
One of the things you’ll discover is that the simplicity of the BASIC language led to easier coding. There was no strong need to ensure everything was secure by checking for array-bound issues or buffer overruns. Everything was simpler, mostly because software security had not yet become a general concern. However, memory was still a concern, so you often found programs written this way:
10 input"please enter your name:"; n$
20 print "hello, ";n$;", good to meet you!"
Spaces took up room in precious little memory, so they were often avoided wherever possible. The reduced number of bytes on slower machines could also lead to faster execution since fewer bytes needed inspection when interpreting the code.
Before we look at the language structure, let’s look at the three languages side by side.
First, we will look at BASIC and Java.
10 input "please enter your name:"; n$
20 print "hello, " ; n$ ; ", good to meet you!"
Scanner kb = new Scanner(System.in);
System.out.println("Please enter your name:");
String n = kb.nextLine();
System.out.println("Hello, " + n +
", good to meet you!");
Now BASIC and C.
10 input "please enter your name:"; n$
20 print "hello, " ; n$ ; ", good to meet you!"
char n[30];
printf("Please enter your name:\n");
fgets(n, sizeof n, stdin);
n[strcspn(n, "\n")] = '\0';  /* drop the newline fgets() keeps (needs <string.h>) */
printf("Hello, %s, good to meet you!", n);
And, finally, BASIC and Python.
10 input "please enter your name:"; n$
20 print "hello, " ; n$ ; ", good to meet you!"
n = input('Please enter your name: ')
print('Hello, '+n+', good to meet you!')
As you can see, on the surface, BASIC programs can be easy to write. Interestingly, Python offers a similar ease of coding.
BASIC Statement Structure
BASIC is an interpreted language. It is not compiled; instead, each line is tokenized and stored in memory in its tokenized form. Again, this was to reduce memory overhead and increase execution speed, since the language keywords had already been identified and stored with the rest of the program.
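To make the idea of tokenizing concrete, here is a minimal Python sketch. It assumes plain ASCII input and a tiny keyword table; the TOKENS table, its byte values, and the tokenize function are placeholders chosen for this illustration and should not be read as a description of the actual ROM routine.

```python
# Illustrative token table: keyword -> single byte. The byte values are
# placeholders for this sketch, not a reference for any real ROM.
TOKENS = {"PRINT": 0x99, "INPUT": 0x85, "GOTO": 0x89, "IF": 0x8B, "THEN": 0xA7}


def tokenize(statement):
    """Replace keywords outside string literals with one-byte tokens."""
    out = bytearray()
    upper = statement.upper()
    in_string = False
    i = 0
    while i < len(statement):
        ch = statement[i]
        if ch == '"':
            in_string = not in_string
        if not in_string:
            for word, token in TOKENS.items():
                if upper.startswith(word, i):
                    out.append(token)      # the whole keyword becomes one byte
                    i += len(word)
                    break
            else:
                out.append(ord(ch))        # anything else is stored as typed
                i += 1
            continue
        out.append(ord(ch))                # text inside quotes is never tokenized
        i += 1
    return bytes(out)


line = 'print "hello"'
print(len(line), len(tokenize(line)))      # the tokenized form is shorter
```

The saving is small per line, but it adds up over a whole program, and the interpreter no longer has to recognize keywords character by character at run time.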
Each line in a BASIC program begins with a line number. Unlike structured programming, where the code is entered via an editor and visually placed in order by the programmer, the line numbers dictate the arrangement of lines. It’s best to leave gaps between line numbers, such as writing 10, 20, 30, etc. This allows you to insert a new line, say, 25, if you need to do something before line 30.
The C64 allowed you to use the cursor keys to move up and edit lines on the screen. By hitting the RETURN key, you could have the line changed and the content re-tokenized. This could be used to renumber lines, change the code, or delete lines. It was also extremely helpful because you didn’t have to retype the entire line.
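The way line numbers, rather than entry order, arrange a program can be modeled with a small Python sketch. The names program, enter_line, and listing are hypothetical, and the sketch assumes the common convention that entering a line number with nothing after it deletes that line.

```python
program = {}  # line number -> statement text


def enter_line(number, statement):
    """Add or replace a line; an empty statement deletes the line."""
    if statement.strip():
        program[number] = statement
    else:
        program.pop(number, None)


def listing():
    """Return the program in line-number order, regardless of entry order."""
    return [f"{n} {program[n]}" for n in sorted(program)]


enter_line(10, 'input "name"; n$')
enter_line(30, 'print "bye"')
enter_line(25, 'print "hello, "; n$')   # slots in between lines 10 and 30
enter_line(30, "")                      # just the number: line 30 is removed
print(listing())
```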
Yes, programming during this period could require a good deal of patience.
After each line number, you would provide a BASIC command or a series of commands separated by colons. Each command was formed from one of the 71 BASIC keywords. Using our original example:
10 input"please enter your name:"; n$
20 print "hello, ";n$;", good to meet you!"
This could also be entered as a single line. Of course, this is not a version that is easily modified:
10 input"please enter your name:"; n$:print "hello, ";n$;", good to meet you!"
BASIC Language
Commands
All BASIC commands begin with one of the 71 keywords that make up BASIC V2. The list is shown below. Selecting any of the links will open a new window that describes each command.
ABS | AND | ASC | ATN | CHR$ | CLOSE |
CLR | CMD | CONT | COS | DATA | DEF |
DIM | END | EXP | FN | FOR | FRE |
GET | GET# | GOSUB | GOTO | IF | INPUT |
INPUT# | INT | LEFT$ | LEN | LET | LIST |
LOAD | LOG | MID$ | NEW | NEXT | NOT |
ON | OPEN | OR | PEEK | POKE | POS |
PRINT | PRINT# | READ | REM | RESTORE | RETURN |
RIGHT$ | RND | RUN | SAVE | SGN | SIN |
SPC | SQR | STATUS (ST) | STEP | STOP | STR$ |
SYS | TAB | TAN | THEN | TIME (TI) | TIME$ (TI$) |
TO | USR | VAL | VERIFY | WAIT |
Variables
Variable names can be any length, but only the first two characters matter when used in the code. Therefore, it is best to stick with two characters. Variable names can consist of letters and digits, but they must begin with a letter. They may also have an additional character that determines their type.
Variable types come in 3 forms:
- Real variables can hold floating point values in the range ±2.93873588×10⁻³⁸ through ±1.70141183×10³⁸. Variables of this kind have no suffix and are the default data type.
- Integer variables can hold 16-bit signed binary integers in the range from −32768 through 32767. The names of integer variables have a % sign as a suffix to indicate that they represent an integer.
- String variables can hold anything from 0 through 255 PETSCII characters. The names of string variables end with a $ sign.
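The naming rule is easy to trip over, so here is a small Python sketch of it. The helpers effective_name and fits_integer are hypothetical names for this illustration; they only model the rules stated above (two significant characters plus an optional type suffix, and the 16-bit signed integer range).

```python
def effective_name(name):
    """Reduce a BASIC variable name to the part that actually matters.

    Only the first two characters are significant, plus an optional
    type suffix (% for integer, $ for string). Assumes a non-empty name.
    """
    suffix = name[-1] if name[-1] in "%$" else ""
    base = name[: len(name) - len(suffix)]
    return base[:2].upper() + suffix


def fits_integer(value):
    """True if value fits in a 16-bit signed integer variable."""
    return -32768 <= value <= 32767


# Two different-looking names that refer to the same variable
print(effective_name("count"), effective_name("color"))
# The suffix keeps these apart even though the letters match
print(effective_name("total"), effective_name("total%"), effective_name("total$"))
```

This is why sticking to two-character names is the safer habit: longer names look distinct on screen but can silently clobber each other.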
Operators and precedence
Operator Precedence (highest to lowest)
exponentiation | ↑ (^) |
unary | +, - |
multiplicative | *, / |
additive | +, - |
relational | <, <=, =, >=, >, <> |
logical/bitwise NOT | NOT |
logical/bitwise AND | AND |
logical/bitwise OR | OR |
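Python happens to rank these operator groups in the same relative order, so a Python programmer can check their intuition with a short sketch. Keep in mind one difference that this example does not model: Python's not, and, and or are purely logical, while the BASIC operators are listed above as logical/bitwise. The variable names are arbitrary.

```python
# Exponentiation binds tighter than unary minus, so this is -(2 ** 2)
negated_square = -2 ** 2

# Multiplication is applied before addition
mixed = 2 + 3 * 4

# Relational tests come first, then NOT, then AND, then OR
logic = not 1 > 2 and 3 > 2 or False

print(negated_square, mixed, logic)
```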
Functions and Predefined Variables
As in C, only a limited number of functions are available in the language. These are part of the set of reserved words, and their categories are noted below.
Mathematical functions: ABS(), ATN(), COS(), EXP(), FN, INT(), LOG(), RND(), SGN(), SIN(), SQR(), and TAN().
System functions: FRE(), POS(), and USR().
String functions: LEFT$(), MID$(), and RIGHT$().
Conversion functions: ASC(), VAL(), CHR$(), STR$(), and LEN().
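For Python programmers, the three string functions map naturally onto slicing. The sketch below uses hypothetical helper names (left_s, right_s, mid_s) and assumes valid arguments: a non-negative count and a start position of 1 or greater, since MID$() counts positions from 1 rather than 0.

```python
def left_s(s, n):
    """LEFT$(s, n): the first n characters."""
    return s[:n]


def right_s(s, n):
    """RIGHT$(s, n): the last n characters."""
    return s[len(s) - n:] if n < len(s) else s


def mid_s(s, start, length=None):
    """MID$(s, start[, length]): substring from a 1-based start position."""
    begin = start - 1
    return s[begin:] if length is None else s[begin:begin + length]


word = "COMMODORE"
print(left_s(word, 3), mid_s(word, 4, 3), right_s(word, 4))
```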
In addition to the functions, there are also some predefined system variables: STATUS (ST), TIME (TI), and TIME$ (TI$). (Remember that only the first two characters matter for variables.)
You can also create your own, rather limited, function using DEF FN. These are typically one-liner arithmetic functions and are expensive to use as they add a fair amount of stack overhead.
Conditionals
Conditional statements are the heart of most BASIC programs. They typically involve the keywords IF, THEN, and GOTO.
10 input "please enter an integer: ";a
20 if a>0 then print a;"is positive."
21 if a<0 then print a;"is negative."
22 if a=0 then print a;"is zero."
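For comparison, the same three tests might look like this in Python. The function name describe is hypothetical, and fixed sample values stand in for the INPUT prompt so the sketch runs without user interaction.

```python
def describe(a):
    """Mirror the three IF-THEN tests from the BASIC listing."""
    if a > 0:
        return f"{a} is positive."
    if a < 0:
        return f"{a} is negative."
    return f"{a} is zero."


for value in (5, -3, 0):
    print(describe(value))
```

Note that the BASIC version has no ELSE to lean on, so each case needs its own complete IF test.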
Loops
The only loop defined by the BASIC V2 language is FOR-NEXT. It is only used to iterate over a specific range; unlike the for loop in C or Java, it cannot take an arbitrary condition.
10 for x = 1 to 10
20 print x
30 next
or
10 for x = 1 to 10:print x:next
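A Python programmer might reach for range(1, 11) here, but the two loops are not quite the same. The generator below is a rough model, under the assumption that the end test is made when NEXT is reached, so the loop body runs at least once and the upper bound is inclusive. The name basic_for is hypothetical, and the sketch assumes a nonzero step.

```python
def basic_for(start, stop, step=1):
    """Yield loop values the way a FOR-NEXT loop visits them.

    Assumes a nonzero step. The end test happens at NEXT, after the body
    has run, so the body always executes at least once.
    """
    x = start
    while True:
        yield x
        x += step
        if (step > 0 and x > stop) or (step < 0 and x < stop):
            break


print(list(basic_for(1, 10)))   # the upper bound is included
print(list(basic_for(1, 0)))    # the body still runs once
```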
Using a GOTO to get out of a loop is not a good idea. Each active loop occupies space on the stack, and jumping out of the loop leaves that entry in place. Since the stack is only 256 bytes, abandoned loops can quickly become a big problem. You should consider rewriting the loop to work with conditional execution. (See Afterward)
Here is an example using a rewritten Bubble Sort (yes, we know Bubble Sort sucks. Move past it.) The goal here is to show the strengths and weaknesses of BASIC. We can write very simple, undecorated, and unsweetened code. However, due to global scope, our array data must already exist in h(), and the length is defined in l.
300 rem bubble sort
305 sw=0
310 for x=2 to l
315 if h(x) >= h(x-1) then 325
320 sw=1:t=h(x):h(x)=h(x-1):h(x-1)=t
325 next
330 if sw=1 then 305
399 return
public static void bubbleSort(int[] arr) {
    boolean swapped;
    do {
        swapped = false;
        for (int x = 1; x < arr.length; x++) {
            if (arr[x] < arr[x-1]) {
                // swap values!
                int t = arr[x];
                arr[x] = arr[x-1];
                arr[x-1] = t;
                swapped = true;
            }
        }
    } while (swapped);
}
Lines 305 and 330 act as the do/while. BASIC has no equivalent construct, so we have to simulate it with a conditional GOTO. Remember that due to a lack of language constructs, we often have to improvise.
330 if sw=1 then 305
is the same as
330 if sw=1 goto 305
The goal is to redirect the program flow to another part of the code.
The if test in the for loop is inverted to skip the swap code and force the next iteration of the loop, effectively making a continue.
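Python has no do/while either, so a Python version ends up simulating it much as the BASIC does. The sketch below mirrors the BASIC control flow, including the inverted test, rather than aiming for the most idiomatic Python. The function name bubble_sort is an assumption of this example, and the list is 0-based, unlike the BASIC array.

```python
def bubble_sort(h):
    """Sort the list h in place, mirroring the BASIC control flow."""
    while True:                      # lines 305 and 330: simulated do/while
        swapped = False
        for x in range(1, len(h)):
            if h[x] >= h[x - 1]:
                continue             # line 315: inverted test skips the swap
            h[x], h[x - 1] = h[x - 1], h[x]
            swapped = True
        if not swapped:
            break
    return h


print(bubble_sort([27, 2, 35, 10, 21]))
```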
Arrays
Arrays are created using the DIM reserved word. Array variables follow the same naming convention as standard variables, and array elements are accessed using parentheses.
10 rem length in l array in a()
20 l=20
30 dim a(l)
40 rem populate the array
50 for x=1tol:read d:a(x)=d:next
55 rem print the array
60 for x=1tol:print a(x):next
70 end
400 data 2, 21, 27, 35, 10, 25, 47, 45, 37, 19
410 data 15, 50, 16, 12, 0, 42, 44, 6, 20, 9
The code uses FOR loops to populate and print the values. The DATA reserved word embeds values in the program itself, and READ retrieves them one at a time at runtime. This is not the same as INPUT, which is used for user interaction.
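One way to picture DATA and READ is as a list with a pointer to the next unread item, with RESTORE (from the keyword list) rewinding that pointer. The Python class below is a loose model under that assumption; the class name DataReader and its methods are hypothetical.

```python
class DataReader:
    """Loose model of DATA items consumed in order by READ."""

    def __init__(self, items):
        self.items = list(items)   # values embedded in the program
        self.pointer = 0           # position of the next unread item

    def read(self):
        """READ: return the next item and advance the pointer."""
        if self.pointer >= len(self.items):
            raise RuntimeError("OUT OF DATA")
        value = self.items[self.pointer]
        self.pointer += 1
        return value

    def restore(self):
        """RESTORE: the next READ starts at the first item again."""
        self.pointer = 0


data = DataReader([2, 21, 27, 35, 10])
a = [data.read() for _ in range(5)]   # like the populate loop on line 50
print(a)
```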
BASIC Constructs
In this section, we'll examine common ways to accomplish what is much simpler today with modern programming languages. Let's examine another sort.
200 rem shell sort. g% is the gap size
205 g% = l / 2
210 if g% <= 0 then 299
215 for x = g%+1 to l:rem data starts at a(1)
220 t = a(x)
225 y = x
230 if y <= g% then 245
231 if a(y-g%) <= t then 245
235 a(y) = a(y-g%)
240 y = y - g%:goto 230
245 a(y) = t
250 next x
255 g%=g%/2:goto 210
299 return
public static void shellSort(int[] arr) {
    int x, y, t, gap;
    for (gap = arr.length/2; gap > 0; gap = gap/2) {
        for (x = gap; x < arr.length; x++) {
            t = arr[x];
            for (y = x; y >= gap && arr[y-gap] > t; y = y-gap)
                arr[y] = arr[y-gap];
            arr[y] = t;
        }
    }
}
There is so much to unpack here.
- Again, the length in l is defined outside the subroutine. Remember that all variables are global-scope.
- An integer variable was required for this solution. If g were used instead of g%, the program would never end as it would keep trying to divide a real number in half.
- More test inversions were needed for lines 210, 230, 231.
- Note the innermost for loop in the Java version is completely rewritten in BASIC as if tests and a goto.
- Again, this is presented as a subroutine to match the Java method. However, due to global scope, the array data to be sorted must be in the a() array.
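The point about the integer gap has a direct parallel in Python 3, where / produces a real number and // drops the fraction. The sketch below is only an illustration of that one point; the function name gap_sequence is hypothetical.

```python
def gap_sequence(length):
    """Gaps visited by the sort when the gap is halved as an integer."""
    gaps = []
    gap = length // 2          # like g% = l / 2: the fraction is dropped
    while gap > 0:
        gaps.append(gap)
        gap //= 2              # like g% = g% / 2
    return gaps


print(gap_sequence(20))

# With real-number division the gap keeps shrinking but stays above zero,
# so a "gap <= 0" exit test is not reached in any reasonable time.
real_gap = 20 / 2
for _ in range(10):
    real_gap /= 2
print(real_gap > 0)
```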
It's been said that working with integers on the C64 was slower than working with reals. Here is a program that generates 2000 random numbers from 0-9999.
10 l=2000:dim a(l),b%(l)
15 ?"randomizing numbers"
20 x=rnd(time):for x=1tol:r=rnd(1)*10000:a(x)=r:b%(x)=int(r):next
30 t=0:t%=0
35 ?"summing reals"
40 b=time:for x=1tol:t=t+a(x):next:e=time:printt,e-b
45 ?"summing ints"
50 t=0:b=time:for x=1tol:t=t+b%(x):next:e=time:printt,e-b
The program shows that summing the integers and summing the reals differ by only 12 jiffies, or 12/60 of a second.
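For readers who want to repeat the experiment in a language they know, here is a rough Python counterpart of the program above. Timings on a modern machine say nothing about the C64, so treat this purely as a structural comparison; the names timed_sum, reals, and ints are assumptions of this sketch.

```python
import random
import time

count = 2000
reals = [random.random() * 10000 for _ in range(count)]   # like a(x)
ints = [int(r) for r in reals]                            # like b%(x)


def timed_sum(values):
    """Return (total, elapsed seconds) for a simple summing loop."""
    begin = time.perf_counter()
    total = 0
    for v in values:
        total += v
    return total, time.perf_counter() - begin


real_total, real_time = timed_sum(reals)
int_total, int_time = timed_sum(ints)
print(real_total, real_time)
print(int_total, int_time)

# A jiffy is 1/60 of a second, so seconds * 60 converts to jiffies
print(real_time * 60, int_time * 60)
```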
Afterward
It seems like there is a lot to this language, but there isn't.
For purposes of clarity and completeness, the C64 handles FOR-NEXT loop termination in the following ways:
- Upon completing a FOR-NEXT loop, the information is removed from the stack.
- If a NEXT statement with an explicit loop variable for an outer loop is executed, all inner loops are canceled, and the information is removed from the stack.
- In the case of subroutines (GOSUB), all active FOR-NEXT loops within a subroutine are terminated, and their information removed from the stack when RETURN is encountered.
- If a FOR statement is begun with a variable name of an existing loop, the existing loop using the same variable name, along with any subsequent unfinished FOR-NEXT loops, are terminated and their information removed from the stack.
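The four rules above can be modeled with a toy stack in Python. This is a simplified sketch for reasoning about the rules, not a description of the real stack layout; the class ControlStack and its method names are hypothetical, and the choice to stop searching at the nearest GOSUB frame is an assumption of the sketch.

```python
class ControlStack:
    """Toy model of the FOR/GOSUB bookkeeping; not the real stack layout."""

    def __init__(self):
        self.frames = []  # each frame is ("FOR", variable) or ("GOSUB", None)

    def gosub(self):
        self.frames.append(("GOSUB", None))

    def for_(self, var):
        # Reusing the variable of an active loop discards that loop and
        # everything stacked after it. Assumption of this sketch: the
        # search stops at the nearest GOSUB frame.
        for i in range(len(self.frames) - 1, -1, -1):
            kind, name = self.frames[i]
            if kind == "GOSUB":
                break
            if name == var:
                del self.frames[i:]
                break
        self.frames.append(("FOR", var))

    def next_(self, var=None, finished=False):
        # NEXT with an explicit variable cancels any inner loops above it.
        while self.frames and self.frames[-1][0] == "FOR":
            if var is None or self.frames[-1][1] == var:
                if finished:
                    self.frames.pop()  # a completed loop leaves the stack
                return
            self.frames.pop()
        raise RuntimeError("NEXT WITHOUT FOR")

    def return_(self):
        # RETURN drops every loop started inside the subroutine.
        while self.frames:
            kind, _ = self.frames.pop()
            if kind == "GOSUB":
                return
        raise RuntimeError("RETURN WITHOUT GOSUB")


stack = ControlStack()
stack.for_("X")
stack.for_("Y")
stack.next_("X")          # naming the outer loop cancels the inner Y loop
print(stack.frames)
```

Seen this way, a GOTO out of a loop is the one exit that triggers none of these clean-up paths, which is why it leaves an entry behind.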
VICE
Another platform for programming BASIC and using the array of technology of the era is the VICE suite of Commodore emulators, which is available on many platforms. The one that closest
VICE is a great way to get the full effect of the C64 without the hardware. Every effort has been made to bring to life the original hardware interaction with files, sound, and graphics. VICE has also been ported to WebAssembly, which is how it can run in a browser, although in a limited form.