(Updated November 19, 2024)
Overview
We often have data in a form that is not immediately useful. If we need to do mathematical calculations and all we have is a string, we must first take another step. Here, we explore some of the finer details of how this can be done by examining the algorithm to convert from String to int.
Existing Tools
In Java, we would use Integer.parseInt(). This method takes a String argument and returns a 32-bit int. An optional second argument specifies the radix (base) of the digits.
int x = Integer.parseInt("4302");
int y = Integer.parseInt("10101101", 2);
In C, we would use something like atoi() or strtol().
int x = atoi("4302");
long y = strtol("10101101", NULL, 2);
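Note that these snippets do not handle invalid input. Integer.parseInt() throws a NumberFormatException when the string is malformed, while atoi() gives the caller no way to detect an error.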
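Since the calls above skip validation, here is a minimal sketch of how the Java call might be guarded. The class name SafeParse and the helper parseOrDefault are made up for this illustration; only Integer.parseInt(String, int) and NumberFormatException come from the standard library.

```java
class SafeParse {
    // Hypothetical helper: returns 'fallback' when 'text' is not a valid
    // integer in the given radix.
    static int parseOrDefault(String text, int radix, int fallback) {
        try {
            return Integer.parseInt(text, radix);
        } catch (NumberFormatException e) {
            // Thrown for null, empty strings, stray characters,
            // and values that do not fit in an int.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("4302", 10, -1));    // prints 4302
        System.out.println(parseOrDefault("10101101", 2, -1)); // prints 173
        System.out.println(parseOrDefault("43x2", 10, -1));    // prints -1
    }
}
```

Returning a fallback value is only one option. Letting the exception propagate is often the better choice when bad input should stop the program.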
The Algorithm
First, we must understand that the string cannot be used in arithmetic calculations. It’s a group of characters.
Converting a string to a number is a straightforward process. We look at each character in the string and “add” it to a running total. Of course, adding that character requires that we convert it from char to int.
Knowing the ASCII table can help us here. For example, ‘4’ has a value of 52 – not very helpful. We need an integer of 4, not the char ‘4’. However, with some simple subtraction, we can get there. Since ‘0’ is 48, we can do the following:
'4' - '0' = 4
52 - 48 = 4
Yes, we can subtract the chars, and we get a legitimate value that can be used in the calculation. Further, if we start at the beginning of a decimal string, we can keep multiplying by 10 to arrive at the final value.
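The character subtraction can be tried in isolation. The following is a small sketch; the class DigitValue and the helper digitValue are names invented here, and the range check is an addition that the conversions in this article do not perform. Character.digit() is the standard-library method that does the same job.

```java
class DigitValue {
    // Hypothetical helper: numeric value of a decimal digit character,
    // or -1 if the character is not in the range '0'..'9'.
    static int digitValue(char c) {
        if (c < '0' || c > '9') {
            return -1;
        }
        return c - '0'; // chars are promoted to int, so '4' - '0' is 52 - 48
    }

    public static void main(String[] args) {
        System.out.println((int) '4');                // prints 52
        System.out.println(digitValue('4'));          // prints 4
        System.out.println(Character.digit('4', 10)); // prints 4
    }
}
```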
Let's convert "4302":
let sum = 0
'4' - '0' = 4
sum = sum * 10 + 4, the result is 0 * 10 + 4 or 4.
'3' - '0' = 3
sum = sum * 10 + 3, the result is 4 * 10 + 3 or 43.
'0' - '0' = 0
sum = sum * 10 + 0, the result is 43 * 10 + 0 or 430.
'2' - '0' = 2
sum = sum * 10 + 2, the result is 430 * 10 + 2 or 4302.
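The walkthrough for "4302" can be reproduced with a short sketch that prints each intermediate total. The class TraceConvert and the method convertWithTrace are illustrative names, and the code assumes the input holds only decimal digits and fits in an int.

```java
class TraceConvert {
    // Hypothetical helper: converts a string of decimal digits and
    // prints the running total after each character.
    static int convertWithTrace(String digits) {
        int sum = 0;
        for (int i = 0; i < digits.length(); i++) {
            int d = digits.charAt(i) - '0'; // char to digit value
            int next = sum * 10 + d;        // shift left one decimal place, add digit
            System.out.println(sum + " * 10 + " + d + " = " + next);
            sum = next;
        }
        return sum;
    }

    public static void main(String[] args) {
        convertWithTrace("4302");
        // prints:
        // 0 * 10 + 4 = 4
        // 4 * 10 + 3 = 43
        // 43 * 10 + 0 = 430
        // 430 * 10 + 2 = 4302
    }
}
```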
So, the mechanical steps need to be codified. In the next section, we see how we can accomplish that in Java, C, and assembly language.
The Code
Java Version
In Java, we are using a for loop with the length() method to determine when to stop. We access each character with charAt().
Throughout the examples, both the decimal and binary numbers are converted and then added together to show the efficacy of the conversion.
public class Convert {
    public static void main(String[] args) {
        String b10 = "4302";
        String b2 = "10101101";
        int x, num1, num2;

        // convert number multiplying by 10
        num1 = 0;
        for (x = 0; x < b10.length(); x++)
            num1 = num1 * 10 + b10.charAt(x) - '0';
        System.out.println("Converted string \"" + b10 + "\" to " + num1);

        // convert using a bit shift
        num2 = 0;
        for (x = 0; x < b2.length(); x++)
            num2 = (num2 << 1) + b2.charAt(x) - '0';
        System.out.println("Converted string \"" + b2 + "\" to " + num2);

        System.out.println(num1 + " + " + num2 + " = " + (num1+num2));
    }
}
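The Java version above trusts its input. As a hedged sketch of how the same loop might be wrapped into a reusable method with basic checks, the following handles any radix from 2 to 10 and rejects bad digits and overflow. The class CheckedConvert, the method toInt, and the choice of exceptions are assumptions made for this illustration. Math.multiplyExact() and Math.addExact() are standard methods that throw ArithmeticException on int overflow.

```java
class CheckedConvert {
    // Hypothetical helper: converts an unsigned digit string in a radix
    // from 2 to 10, validating every character along the way.
    static int toInt(String text, int radix) {
        if (radix < 2 || radix > 10) {
            throw new IllegalArgumentException("radix must be between 2 and 10");
        }
        if (text == null || text.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int total = 0;
        for (int i = 0; i < text.length(); i++) {
            int d = text.charAt(i) - '0';
            if (d < 0 || d >= radix) {
                throw new NumberFormatException("bad digit at index " + i);
            }
            // Same running-total step as before, but overflow raises
            // ArithmeticException instead of silently wrapping.
            total = Math.addExact(Math.multiplyExact(total, radix), d);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(toInt("4302", 10));    // prints 4302
        System.out.println(toInt("10101101", 2)); // prints 173
    }
}
```

Multiplying by the radix is the general form of the two loops above: multiplying by 10 for decimal, and shifting left by one bit (multiplying by 2) for binary.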
C Version
Now, we provide a C version of the algorithm. Note that the while loop relies on C treating any non-zero value as true. The string ends with the null character ('\0'), which is zero in the ASCII table, so as long as the current character is non-zero, we still have characters left in the string.
#include <stdio.h>

int main(void) {
    char *b10 = "4302";
    char *b2 = "10101101";
    int x, num1, num2;

    // convert number to int by multiplying by 10
    num1 = 0;
    x = 0;
    while (b10[x]) {
        num1 = num1 * 10 + (b10[x] - '0');
        x++;
    }
    printf("Converted string \"%s\" to %d\n", b10, num1);

    // convert binary string to int using a bit shift
    num2 = 0;
    x = 0;
    while (b2[x]) {
        num2 = (num2 << 1) + (b2[x] - '0');
        x++;
    }
    printf("Converted string \"%s\" to %d\n", b2, num2);

    printf("%d + %d = %d\n", num1, num2, num1+num2);
}
Assembler Version
Here, we have an assembly language version of the solution that uses the 6502/6510 CPU. This code is specifically written for a Commodore 64 and should run in any Commodore 64 emulator.
There is much to unpack here. The accumulator (A register) is doing all of the math. The X register is our index into the string, and the Y register is used for counting down the repeated addition for the base-10 conversion.
*= $4000
define linprt $bdcd ; print XA (LE) as int
define chrout $ffd2
lda #147
jsr chrout
; convert decimal
one: ldx #0
p1: lda ns1,x
beq print1
; multiply by 10; copy value
lda num1
sta tmp
lda num1+1
sta tmp+1
; add 9 more times
ldy #9
more: clc
lda tmp
adc num1
sta num1
lda tmp+1
adc num1+1
sta num1+1
dey
bne more
; add new digit
lda ns1,x
; subtract the '0' from the digit
sec
sbc #$30
clc
adc num1
sta num1
lda #0
adc num1+1
sta num1+1
inx
jmp p1
print1: ldx num1
lda num1+1
jsr linprt
lda #13 ; CR
jsr chrout
; convert binary
two: ldx #0
p2: lda ns2,x
beq print2
; subtract the '0' from the digit
sec
sbc #$30
; multiply by 2
asl num2
rol num2+1
; add the new digit
clc
adc num2
sta num2
lda #0
adc num2+1
sta num2+1
inx
jmp p2
print2: ldx num2
lda num2+1
jsr linprt
lda #13 ; CR
jsr chrout
add: clc
lda num1
adc num2
sta sum
lda num1+1
adc num2+1
sta sum+1
print3: ldx sum
lda sum+1
jsr linprt
lda #13 ; CR
jsr chrout
; and we're done!
end: rts
; data area
ns1: txt "4302"
dcb 0
ns2: txt "10101101"
dcb 0
num1: dcw 0
num2: dcw 0
tmp: dcw 0
sum: dcw 0
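The 6502 has no multiply instruction, which is why the listing multiplies by 10 through repeated addition. As a rough sketch of that one step, here is the same idea in Java. The class RepeatedAdd and its methods are invented for illustration, and the 0xFFFF mask is an assumption used to imitate the 16-bit num1 value wrapping around.

```java
class RepeatedAdd {
    // Hypothetical helper mirroring the listing: multiply by 10
    // using only addition, in 16-bit arithmetic.
    static int timesTen(int value) {
        int tmp = value;                    // copy of the original (tmp in the listing)
        for (int y = 9; y != 0; y--) {      // Y counts down the nine extra additions
            value = (value + tmp) & 0xFFFF; // wrap like a 16-bit word
        }
        return value;
    }

    // Hypothetical helper: decimal conversion built on timesTen().
    static int convert(String digits) {
        int num = 0;
        for (int x = 0; x < digits.length(); x++) {
            num = (timesTen(num) + (digits.charAt(x) - '0')) & 0xFFFF;
        }
        return num;
    }

    public static void main(String[] args) {
        System.out.println(timesTen(430));  // prints 4300
        System.out.println(convert("4302")); // prints 4302
    }
}
```

Because the total is held in 16 bits, any value above 65535 wraps around, so this approach only suits small numbers unless more bytes are added.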