Overview
We often have data in a form that is not immediately useful. If we need to do mathematical calculations and all we have is a string, we must first take another step. Here, we explore some of the finer details of how this can be done by examining the algorithm to convert from String to int.
Existing Tools
In Java, we would use Integer.parseInt(). This method takes a String argument and returns a 32-bit int.
int x = Integer.parseInt("4302");
int y = Integer.parseInt("10101101", 2);
In C, we would use something like atoi() or strtol().
int x = atoi("4302");
long y = strtol("10101101", NULL, 2);
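Note that these examples do not validate their input. Integer.parseInt() throws a NumberFormatException when the string is malformed, strtol() reports where parsing stopped through its second argument (NULL in the example, so that information is discarded), and atoi() has no way to report an error at all.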
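As a minimal sketch of what validation around the library call could look like in Java, the hypothetical helper parseOrDefault below catches the NumberFormatException and returns a caller-chosen default instead. The helper name and the fallback convention are assumptions made for illustration; only Integer.parseInt() itself comes from the text above.

```java
class LibraryParse {
    // Hypothetical helper (not from the article): returns the parsed value,
    // or the fallback when the text is not a valid integer in the given radix.
    static int parseOrDefault(String text, int radix, int fallback) {
        try {
            return Integer.parseInt(text, radix);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("4302", 10, -1));     // 4302
        System.out.println(parseOrDefault("10101101", 2, -1));  // 173
        System.out.println(parseOrDefault("43x2", 10, -1));     // -1
    }
}
```

A fallback value only works when it cannot be confused with a real result; letting the exception propagate is often the clearer choice.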
The Algorithm
First, we must understand that the string cannot be used in arithmetic calculations. It’s a group of characters.
Converting a string to a number is a straightforward process. We look at each character in the string and “add” it to a running total. Of course, adding that character requires that we convert it from char to int.
Knowing the ASCII table can help us here. For example, ‘4’ has a value of 52 – not very helpful. We need an integer of 4, not the char ‘4’. However, with some simple subtraction, we can get there. Since ‘0’ is 48, we can do the following:
'4' - '0' = 4
52 - 48 = 4
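Yes, we can subtract the chars, and we get a legitimate value that can be used in the calculation. Further, if we start at the beginning of a decimal string, we can multiply the running total by 10 before adding each new digit. Each multiplication shifts the digits already read one place to the left, so we arrive at the final value once the last digit is added.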
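The subtraction can be sketched in Java as follows. The helper digitValue is a hypothetical name introduced here, and returning -1 for a non-digit is an assumed convention rather than anything the algorithm requires.

```java
class DigitValue {
    // Hypothetical helper: maps '0'..'9' to 0..9, or -1 for anything else.
    static int digitValue(char c) {
        if (c < '0' || c > '9') {
            return -1;
        }
        // chars are promoted to int here, so '4' - '0' is 52 - 48
        return c - '0';
    }

    public static void main(String[] args) {
        System.out.println((int) '4');        // 52
        System.out.println((int) '0');        // 48
        System.out.println(digitValue('4'));  // 4
        System.out.println(digitValue('x'));  // -1
    }
}
```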
Let's convert "4302":
let sum = 0
'4' - '0' = 4
sum = sum * 10 + 4, the result is 0 * 10 + 4 or 4.
'3' - '0' = 3
sum = sum * 10 + 3, the result is 4 * 10 + 3 or 43.
'0' - '0' = 0
sum = sum * 10 + 0, the result is 43 * 10 + 0 or 430.
'2' - '0' = 2
sum = sum * 10 + 2, the result is 430 * 10 + 2 or 4302.
So, the mechanical steps need to be codified. In the next section, we see how we can accomplish that in Java, C, and assembly language.
The Code
In Java, we are using a for loop with the length() method to determine when to stop. We access each character with charAt().
Throughout the examples, both the decimal and binary numbers are converted and then added together to show the efficacy of the conversion.
public class Convert {
    public static void main(String[] args) {
        String b10 = "4302";
        String b2 = "10101101";
        int x, num1, num2;

        // convert number multiplying by 10
        num1 = 0;
        for (x = 0; x < b10.length(); x++)
            num1 = num1 * 10 + b10.charAt(x) - '0';
        System.out.println("Converted string \"" + b10 + "\" to " + num1);

        // convert using a bit shift
        num2 = 0;
        for (x = 0; x < b2.length(); x++)
            num2 = (num2 << 1) + b2.charAt(x) - '0';
        System.out.println("Converted string \"" + b2 + "\" to " + num2);

        System.out.println(num1 + " + " + num2 + " = " + (num1+num2));
    }
}
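The program above trusts its input. As a hedged sketch of how the same loop might be wrapped in a reusable method, the hypothetical toInt below rejects empty input and characters that are not digits in the chosen base, and uses Math.multiplyExact() and Math.addExact() so that a value too large for an int raises an ArithmeticException instead of silently wrapping. The method name, the restriction to bases 2 through 10, and the choice of exceptions are all assumptions.

```java
class SafeConvert {
    // Hypothetical helper, not part of the article's program: converts an
    // unsigned digit string in a base from 2 to 10.
    static int toInt(String s, int radix) {
        if (radix < 2 || radix > 10) {
            throw new IllegalArgumentException("radix must be 2..10");
        }
        if (s == null || s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int total = 0;
        for (int i = 0; i < s.length(); i++) {
            int digit = s.charAt(i) - '0';
            if (digit < 0 || digit >= radix) {
                throw new NumberFormatException("bad digit at index " + i);
            }
            // multiplyExact/addExact throw ArithmeticException on overflow
            total = Math.addExact(Math.multiplyExact(total, radix), digit);
        }
        return total;
    }

    public static void main(String[] args) {
        int num1 = toInt("4302", 10);
        int num2 = toInt("10101101", 2);
        System.out.println(num1 + " + " + num2 + " = " + (num1 + num2));
        // prints: 4302 + 173 = 4475
    }
}
```

Multiplying by the radix covers both cases in one loop; for base 2 it computes the same result as the bit shift used above.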
Now, we provide a C version of the algorithm. Note that the while loop relies on C treating any non-zero value as true. A C string ends with the null character ('\0'), which has the value zero, so as long as the current character is non-zero, we still have characters left in the string.
#include <stdio.h>

int main(void) {
    char *b10 = "4302";
    char *b2 = "10101101";
    int x, num1, num2;

    // convert number to int by multiplying by 10
    num1 = 0;
    x = 0;
    while (b10[x]) {
        num1 = num1 * 10 + (b10[x] - '0');
        x++;
    }
    printf("Converted string \"%s\" to %d\n", b10, num1);

    // convert using a bit shift
    num2 = 0;
    x = 0;
    while (b2[x]) {
        num2 = (num2 << 1) + (b2[x] - '0');
        x++;
    }
    printf("Converted string \"%s\" to %d\n", b2, num2);

    printf("%d + %d = %d\n", num1, num2, num1+num2);
}
Here, we have an assembly language version of the solution that uses the 6502/6510 CPU. This code is specifically written for a Commodore 64, because it calls the ROM routines at $ffd2 and $bdcd, and it should run in any Commodore 64 emulator.
There is much to unpack here. The accumulator (A register) is doing all of the math. The X register is our index into the string, and the Y register is used for counting down the repeated addition for the base-10 conversion.
Lines 6-7: Clears the screen.
Line 10: Set index to start of base-10 string.
Lines 11-12: Get the char and check if we've reached the end.
Lines 14-17: Copy the current number.
Lines 19-28: Multiply by 10.
Lines 29-33: Get the char and subtract '0'.
Lines 34-39: Add new digit to running total.
Lines 40-41: Increment X and jump to line 11.
Lines 43-47: Print the converted number followed by return.
Line 50: Set index to start of base-2 string.
Lines 51-52: Get the char and check if we've reached the end.
Lines 53-55: Subtract '0'.
Lines 56-58: Multiply by 2.
Lines 59-65: Add new digit to running total.
Lines 66-67: Increment X and jump to line 51.
Lines 69-73: Print the converted number followed by return.
Lines 75-81: Add the two numbers together.
Lines 83-87: Print the sum followed by return.
Lines 92- : Data area.
        *= $4000

define linprt $bdcd     ; print XA (LE) as int
define chrout $ffd2

        lda #147
        jsr chrout

; convert decimal
one:    ldx #0
p1:     lda ns1,x
        beq print1
; multiply by 10; copy value
        lda num1
        sta tmp
        lda num1+1
        sta tmp+1
; add 9 more times
        ldy #9
more:   clc
        lda tmp
        adc num1
        sta num1
        lda tmp+1
        adc num1+1
        sta num1+1
        dey
        bne more
; add new digit
        lda ns1,x
; subtract the '0' from the digit
        sec
        sbc #$30
        clc
        adc num1
        sta num1
        lda #0
        adc num1+1
        sta num1+1
        inx
        jmp p1

print1: ldx num1
        lda num1+1
        jsr linprt
        lda #13         ; CR
        jsr chrout

; convert binary
two:    ldx #0
p2:     lda ns2,x
        beq print2
; subtract the '0' from the digit
        sec
        sbc #$30
; multiply by 2
        asl num2
        rol num2+1
; add the new digit
        clc
        adc num2
        sta num2
        lda #0
        adc num2+1
        sta num2+1
        inx
        jmp p2

print2: ldx num2
        lda num2+1
        jsr linprt
        lda #13         ; CR
        jsr chrout

add:    clc
        lda num1
        adc num2
        sta sum
        lda num1+1
        adc num2+1
        sta sum+1

print3: ldx sum
        lda sum+1
        jsr linprt
        lda #13         ; CR
        jsr chrout

; and we're done!
end:    rts

; data area
ns1:    txt "4302"
        dcb 0
ns2:    txt "10101101"
        dcb 0
num1:   dcw 0
num2:   dcw 0
tmp:    dcw 0
sum:    dcw 0
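For readers who do not know 6502 assembly, the multiply-by-10 step can be modeled in Java. This is only a sketch of the idea: the hypothetical timesTen below keeps the 16-bit number as two 8-bit halves, copies it, and adds the copy nine more times, carrying from the low byte into the high byte on each pass. The method and variable names are assumptions chosen to echo the listing.

```java
class RepeatedAdd {
    // Hypothetical model of the listing's multiply step: the 16-bit value is
    // held as two 8-bit halves (lo, hi), as it is in num1 and num1+1.
    static int timesTen(int value) {
        int lo = value & 0xFF;
        int hi = (value >> 8) & 0xFF;
        int tmpLo = lo;                        // copy the current number
        int tmpHi = hi;
        for (int y = 9; y != 0; y--) {         // add the copy 9 more times
            int sumLo = tmpLo + lo;            // low bytes first
            int carry = sumLo >> 8;            // carry out of the low byte
            lo = sumLo & 0xFF;
            hi = (tmpHi + hi + carry) & 0xFF;  // high bytes plus the carry
        }
        return (hi << 8) | lo;
    }

    public static void main(String[] args) {
        System.out.println(timesTen(430));   // 4300
        System.out.println(timesTen(7000));  // 4464, since 70000 wraps at 65536
    }
}
```

The second call shows a limit of this model: a result above 65535 wraps around, because there is no check for a carry out of the high byte.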