Updated November 21, 2024
Table of contents
- Overview
- The while loop
- Range-based or Counter-based loop
- Sentinel-based loop
- Flag-based loop
- EOF-based loop
- The for loop
- The do-while loop
- The break and continue statements
- Nested Loops
- File Handling
- Safely Reading Numbers – Part III
- Building a Command Line Interface (CLI)
- Exercises
Overview
There is a clear need for repetition in many things in our daily life. These may include cooking in the kitchen (3 meals, perhaps more), doing laundry (2 or more loads on a Saturday afternoon), broadcast television (show intro to sitcom, then alternate commercials/sitcom until end of the show), or simply mowing the lawn (walk mower to end of the yard, turn, walk mower back to the other end, repeat). It is straightforward to see that people operate iteratively very naturally.
Consider the following C’d-up pseudocode for mowing a lawn where fictitious variables and functions have been liberally created for entertainment purposes:
mower = removeFromStorage();
if ( oldClippings(mower) == high ) {
cleanCuttingChamber(mower);
notifyOwnerOfBadMaintenancePractice();
}
if ( bladeStatus(mower) != ADEQUATE )
replaceBlade(mower);
if ( fuelLevel(mower) != FULL )
refuel(mower);
while ( motorRunning(mower) != true )
startMower(mower);
do {
walkToEndOfLawn();
turnAndAdjustPosition();
mowed = checkLawnStatus();
} while ( mowed != true );
Now for some fun with this code. Several selection statements are involved in preparing the work to be done. This is very typical of programming with iteration or programming with loops.
Once the preliminary checks are performed, there is the while loop to start the mower. The idea is that the startMower() function will attempt to start mower, while the motorRunning() function checks mower and returns true if the mower starts and stays running. That means that while the mower is not running, we should continue to try to start it. Now, in the real world, the average person would try only a few times, especially if it's a pull-start, and then begin to look for things like a flooded engine, a bad float, a fouled spark plug, or simply stop to catch their breath, then try some more. That part of our program has been omitted but could easily be added.
Now we introduce the concept of testing before performing repetition. When starting the engine, we checked to ensure it wasn’t running already by the parenthesized test to the right of the while keyword. This constitutes a top-test loop, meaning that we may perform the iterations of the loop zero or more times based on the result of that test. Another way to say it is: while something is true, perform the work.
There is a second loop example in our pseudocode. It is a do/while loop. The purpose of this loop is to introduce the bottom-test loop. These loops are useful when we know we will perform the work inside the loop at least once. Only after the work is done once will we test to see if additional applications of the same work are necessary. Now we can say: mow a strip of lawn, turn and check lawn status while there is still lawn to be mowed.
The lawn mowing lends itself well to the bottom-test concept since we began by knowing that our virtual lawn needed cutting. Otherwise, we would not be there with our virtual mower.
It is important to note the subtlety in our two loops:
- While the mower is not running, start the mower.
- Mow a strip of lawn and turn while the mowing is incomplete.
Given different circumstances, we could have used do-while for the starting of the mower or a while for the cutting. The circumstances of the situation dictate how we will choose our looping methods.
This is carried over to programming, where we would want to do the same as our mowing example conceptually. Consider the need to read three values from the user to derive a sum and average of those three values. We could simply write:
int value1, value2, value3, sum;
double avg;
printf("Enter value #1: ");
scanf("%d", &value1);
printf("Enter value #2: ");
scanf("%d", &value2);
printf("Enter value #3: ");
scanf("%d", &value3);
sum = value1 + value2 + value3;
avg = sum / 3.0;
This method works. Maybe this is not much typing for the average user, but extend the concept to 5 values. Ten values. 100 values. This suddenly becomes unwieldy the higher we go. We are also adding a lot of additional variables. We need to make this into a loop.
We can start by eliminating the idea of value1, value2 or valuen and just have a single variable called value that we will recycle through each iteration of the loop. We will also need to take into consideration that the calculation of sum is no longer a job to be done after all the values are read in from the user. Instead, we will need to incorporate the idea of a running total while we continue to read values.
One last piece, for the sake of continuity, is incorporating the numeric value in the prompt just as our previous brute-force method did. Consider the following while loop.
int sum = 0;
int x, value;
double avg;
x = 1;
while ( x <= 3 ) {
printf("Enter value #%d: ", x);
scanf("%d", &value);
sum = sum + value;
x++;
}
avg = sum / 3.0;
We start by setting sum to zero since we have not yet seen any values. This was unnecessary in our previous example since the sum was calculated after all the values were read from the user.
Next, the variable x is set to 1. The variable x will be used as our loop control variable. In other words, x is the variable we will examine to determine if more work remains or if we have completed our iterations and can stop the loop.
The loop starts with a test of x <= 3 or, more to the point, 1 <= 3 since x is 1. Since that test is true we enter the loop and begin our work.
Each number is read into the single variable value since we have no reason to keep all the values. As an alternative, Chapter 7 covers how to keep all the values in an array.
The loop will perform work for values of x equal to 1, 2 and 3, each time going back to the top of the loop to reevaluate the condition that allows us to stay.
When the value of x reaches 4, the test at the top will fail ( 4 <= 3 ). At that point, we fall out of the loop to the statement just below the loop itself, and the program continues by determining the average.
Note that each time the loop performs work, we update the value of x. The increment operator bumps the value of x by one before we retest its value at the top.
Now that we have seen a basic example of iteration, we introduce the three loop constructs:
- The while loop
- The for loop
- The do-while loop
Each loop construct will now be presented in detail, starting with the while loop.
The while Loop
As we saw in the overview section, many repetitive things can be formed into iterative constructs. The overview touches on the concept of the top-test loop construct. Now we will look at multiple variations of the loop and how to control the work that is needed to be done.
First, let us discuss the three parts of a loop. These are initialization, condition and update. The initialization is the work necessary to get ready for the iterative process. The condition determines if the iterative process should continue. Finally, the update alters the loop control variable in some way so that the condition will eventually fail and cause the loop to stop, if that is indeed our goal. A loop that never stops is known as an infinite loop. Infinite loops are usually unintentional, but they are suitable for things like printer queues and web servers.
Range-based or Counter-based loop
The range-based loop is used when we know that something will occur a predetermined number of times, and the loop control variable will be the variable keeping track of how many times the work has been done.
The previous loop example is repeated here as a complete program with the details of initialization, condition, and update highlighted for clarity.
#include <stdio.h>
int main(int argc, char *argv[]) {
int sum = 0;
int x, value;
double avg;
x = 1; //initialization
while ( x <= 3 ) { //condition
printf("Enter value #%d: ", x);
scanf("%d", &value);
sum = sum + value;
x++; //update
}
avg = sum / 3.0;
printf("The sum is %d.\n", sum);
printf("The average is %f.\n", avg);
}
Example 1a: Range-based loop using while.
A few details worth mentioning about Example 1a include:
- The loop will end its iterations when x exceeds 3.
- We do the prompting and reading inside the while loop at the top.
- Calculations and displaying results only occur when we know we have the answer. That is after the loop is complete.
Sentinel-based loop
Sentinel-based loops are used when we are expecting a specific value or values that the user must enter to indicate that the loop may cease. If we needed the user to enter -1 as a value to terminate the loop, we could craft something like this:
#include <stdio.h>
int main(int argc, char *argv[]) {
int sum = 0, count = 0;
int value;
double avg;
printf("Enter integers (-1 to end): ");
scanf("%d", &value); //initialization
while ( value != -1 ) { //condition
sum = sum + value;
count ++;
printf("Enter integers (-1 to end): ");
scanf("%d", &value); //update
}
avg = sum / (double)count;
printf("The sum is %d.\n", sum);
printf("The average is %f.\n", avg);
}
Example 1b: Sentinel-based loop using while.
A few things to note about Example 1b include:
- The prompt appears before the loop and again at the bottom of the loop.
- The call to scanf(), just like the prompt, is located both before the loop and again at the end of the loop.
- The loop control variable is value.
- The same statement is used to initialize and to update the loop control variable.
- Because we stop when we see -1, we need to count the number of values read in order to know the average.
We must include the prompt twice if we want the user to continue to be prompted once the loop begins since we cannot break the confines of the loop to get to the original first prompt. The prompting and reading of the next value happen at the bottom of the loop to alter the loop control variable before reapplying the test when we reach the top of the loop.
As we saw in the previous example, it is not at all unusual for the update statements to be the same statements that initialized our loop control variable. In the example above, the only way to alter the loop control variable is to read another value from the user. Since the loop is driven entirely by user input, there is often little or no choice.
Flag-based loop
The flag-based loop uses a boolean variable for loop control. To demonstrate, let's create a complete program that asks the user to guess a random number between 1 and 20 inclusive, as shown in Example 1c.
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>
int main(int argc, char *argv[]) {
int randomNumber, guess;
bool found;
srand(time(NULL));
// random number between 1 and 20
randomNumber = rand() % 20 + 1;
printf("I'm thinking of a number between 1 and 20.\n");
found = false;
while ( ! found ) {
printf("Guess the value between 1 and 20: ");
scanf("%d", &guess);
if ( guess == randomNumber )
found = true;
else
printf("It's not %d.\n", guess);
}
printf("\nYou got it!\n");
}
Example 1c: Flag-based while loop.
A few things to note about this example are:
- The boolean variable, found, needs to be initialized to false before the loop begins.
- We are not concerned with guesses that are out of bounds or with a limit on the number of guesses a user is allowed.
- We read values from the user inside the loop with no special attention paid to how or when we prompt since found is the loop control variable and not guess.
The program in Example 1c is reworked slightly to deal with a limit on guesses and inform the user of the value if they are unsuccessful in guessing the correct answer.
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <time.h>
int main(int argc, char *argv[]) {
int randomNumber, guess, guessLimit=5;
bool found;
srand(time(NULL));
// random number between 1 and 20
randomNumber = rand() % 20 + 1;
printf("I'm thinking of a number between 1 and 20.\n");
found = false;
while ( ! found && guessLimit > 0 ) {
printf("You have %d guesses remaining.\n", guessLimit);
printf("Guess the value between 1 and 20: ");
scanf("%d", &guess);
guessLimit--;
if ( guess == randomNumber )
found = true;
else
printf("It's not %d.\n", guess);
}
if ( found )
printf("\nYou got it!\n");
else
printf("\nThe number was %d.\n", randomNumber);
}
Example 1d: Improved number guessing program.
The details of this modification are:
- The variable guessLimit has been set to 5 and will count down to zero.
- guessLimit is decremented after each guess is read from the user.
- We still lack code to deal with out-of-bounds guesses, but it could easily be added after each guess is read. Should an out-of-bounds guess count against the user?
- We have added the && logical operator to account for two conditions that need to be met to continue prompting for guesses. They must not have discovered the answer, and they must also have guesses remaining.
EOF-based loop
The EOF-based loop works with the predefined value of EOF. The idea is to detect the End Of File (EOF) or, more to the point, the end of input. The user can enter CTRL-D in Unix or CTRL-Z in DOS/Windows to trigger the EOF and thereby terminate the loop. Consider the following program to read lines of text from the user and simply display the length of each string.
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]) {
char line[120];
printf("Enter a string (CTRL-D to end): ");
while ( fgets(line, 120, stdin) != NULL ) {
printf("Input was %zu characters long.\n", strlen(line)); // strlen() returns size_t
printf("Enter a string (CTRL-D to end): ");
}
}
Example 1e: EOF-based while loop using fgets().
Items to note about Example 1e include:
- fgets() helps to determine if the loop should continue by detecting NULL (which technically happens on both EOF and an error).
- We are prompting before the loop and at the bottom of the loop.
- The reading of input happens automatically through the use of fgets().
- If an input line exceeds the 120 char size of the string, you may incorrectly view it as two or more lines of input. Care should be taken to ensure the buffer can hold a single read.
- Newlines will be included in the input and need further processing if you do not want them.
The use of fgets() is the best choice when reading strings. As you know, there are many ways to read input in C, so we now introduce a method to read numeric values using the average examples from above while maintaining the EOF model.
#include <stdio.h>
int main(int argc, char *argv[]) {
int sum = 0, count = 0;
int value;
double avg;
printf("Enter a number (CTRL-D to end): ");
while ( scanf("%d", &value) != EOF ) {
sum = sum + value;
count ++;
printf("Enter a number (CTRL-D to end): ");
}
avg = sum / (double)count;
printf("\nThe sum is %d.\n", sum);
printf("The average is %f.\n", avg);
}
Example 1f: EOF-based while loop using scanf().
Items to note about Example 1f include:
- scanf() helps to determine if the loop should continue by detecting EOF (which technically happens on both EOF and an error).
- We are prompting before the loop and at the bottom of the loop.
- The reading of input happens automatically through the use of scanf().
- There is a cast applied to count to make sure we capture the precision in the answer.
The for Loop
Another top-test loop is the for loop, which is generally used for range-based solutions. This loop can be tricky to grasp at first glance, but once you realize that it is simply a reworked while loop, it is a snap.
Since our loops still have the initialization, condition test, and update components, the for loop chooses to put them all on display on the same line rather than spreading them out.
Let us take Example 1a and rework it as a for loop.
#include <stdio.h>
int main(int argc, char *argv[]) {
int sum = 0;
int x, value;
double avg;
x = 1; //initialization
while ( x <= 3 ) { //condition
printf("Enter value #%d: ", x);
scanf("%d", &value);
sum = sum + value;
x++; //update
}
avg = sum / 3.0;
printf("The sum is %d.\n", sum);
printf("The average is %f.\n", avg);
}
Example 2a: Range-based loop using while.
The for loop version looks like this:
#include <stdio.h>
int main(int argc, char *argv[]) {
int sum = 0;
int x, value;
double avg;
for ( x = 1; x <= 3; x++) {
printf("Enter value #%d: ", x);
scanf("%d", &value);
sum = sum + value;
}
avg = sum / 3.0;
printf("The sum is %d.\n", sum);
printf("The average is %f.\n", avg);
}
Example 2b: Range-based loop using for.
Let us itemize what has changed:
- Used the for reserved word in place of while.
- Moved all three loop components to the parenthesized portion that previously only held the condition test.
- Used semicolons to separate the initialization, condition test and update.
The initialization occurs only once, before the condition test is first applied. Entering the loop is still determined by the successful evaluation of the condition test, just as it is with the while loop.
The update, however, is a little different. It will always happen as the very last step before reapplying the condition test at the top of the loop. With our while loop, the update could have been any statement within the loop's body. We chose to do it at the bottom of the while loop since that made the most sense.
The for loop enforces this placement: an update specified in the parenthesized portion always runs at the bottom of the loop.
You may leave out any or all of the parenthesized components of the for loop as long as you preserve the semicolons. However, if you do, you may want to reconsider why you chose the for loop over the other options.
The do-while Loop
The do-while loop, often called the do loop, is a bit different from the previous loop constructs. This one is a bottom-test loop. Recall from the overview that this loop differs from the other two in that we will always perform the work in the body of the loop once before applying the condition test.
The best example of the necessity of this type of loop is the menu selection program. We know we must display the menu choices once and read the menu selection from the user before deciding anything else.
One thing we could decide is that the user has chosen to quit the program through an appropriate menu choice. Another decision could be to re-display the menu over and over until they make a valid selection. Let us look at the program of Example 3.
#include <stdio.h>
#include <stdbool.h>
#include <ctype.h>
int main(int argc, char *argv[]) {
char choice;
bool valid;
valid = false;
do {
printf("\nPlease select a menu option.\n\n");
printf("Choices:\n");
printf("\tC) County Maps\n");
printf("\tM) Municipalities\n");
printf("\tT) Thermal Images\n");
printf("\tU) Utilities\n");
printf("\n\tQ) Quit\n");
printf("\n\nChoice: ");
scanf(" %c", &choice);
choice = toupper(choice);
switch (choice) {
case 'C': case 'M': case 'T': case 'U': case 'Q':
valid = true;
break;
default:
printf("The choice \"%c\" is not valid.", choice);
break;
}
} while ( ! valid );
}
Example 3: Menu program using a do-while loop.
The first thing you should notice is the use of the valid boolean variable. This is used to turn our do-while loop into a flag-based loop. Since a significant amount of processing is needed to determine the choice made and whether it is a valid choice, a switch statement has been used to simplify the test that determines the validity of the value of choice.
The default clause of the switch statement handles error reporting. All case entries have been lumped together to treat them equally as valid choices. If additional processing were needed beyond simple acceptance, handling it in another piece of code would likely be more appropriate. Remember that the intent of this loop is to identify a correct menu choice selection. Keeping the loop focused is essential.
One detail worth pointing out in the input processing is the following pair of statements:
scanf(" %c", &choice);
choice = toupper(choice);
With this code, we read the next character from the input stream, skipping any leading whitespace (note the space before the %c conversion - see scanf in Chapter 3). Then we convert the result to upper case.
By converting the character to upper case (or lower case if that is preferred), we do two things:
- Set the stage to accept upper- or lower-case choices since they will be converted.
- If accepting both character cases is acceptable, we reduce the list of potential values to be tested by one-half since we will never have to test the lower case possibilities.
The loop will continue to prompt the user with the complete menu and prompt for a choice for as long as they continue to make invalid selections.
The break and continue statements
When dealing with loops, there are times that we wish we could either proceed to the next iteration or stop the loop immediately. We, of course, do not want to stop the program, just the loop we may be stuck in or otherwise no longer desire to be a part of.
The break and continue reserved words are used as complete statements. These statements allow us to do exactly what their names suggest:
- break out of a loop or skip the remainder of a switch statement.
- continue immediately with the next iteration of the loop.
The break statement causes program execution to proceed with the statement that follows the innermost enclosing loop or switch statement.
The continue statement causes the loop to proceed to the next iteration, skipping the rest of the loop body. For a while or do-while loop this means proceeding directly to the condition test. For the for loop, the update portion is done first, then the loop proceeds to the condition test.
Nested Loops
Placing a loop inside another loop allows for the complete set of iterations of the inner loop for each iteration of the outer loop. Nested loops are often used for dealing with two- or three-dimensional data but are not limited to those applications.
#include <stdio.h>
#include <stdbool.h>
int main(int argc, char *argv[]) {
int number, range;
bool prime;
for ( number = 2; number <= 100; number++) {
// 2 is the only even prime; every other even number is not prime
if ( number == 2 || number % 2 != 0 )
prime = true;
else
prime = false;
for ( range = 3; prime && range < number; range = range + 2 ) {
if ( number % range == 0 )
prime = false;
}
if ( prime )
printf("%d\n", number);
}
}
Example 4a: Nested loop to find prime numbers from 2 to 100.
Consider Example 4a, which finds all of the prime numbers between 2 and 100. We will use the outer loop to process the actual numbers 2 through 100, and the inner loop will be used to find out whether any odd value in the range from 3 to number-1 divides evenly into number.
The code in Example 4a shows how the two loops are utilized to determine if the number is prime. Remember that a number is prime if it is divisible only by itself and 1. We start by assuming that the number is prime until we find a value that divides evenly into number. The values of range are simply the set of numbers we will test with.
Although the compound statement is not required for the inner loop, it has been included for clarity. We also eliminate all even numbers other than 2 with the if test before entering the inner loop.
Finally, we are using short-circuit evaluation for the inner for loop. When prime is false, the loop is clearly not entered. Moreover, because the left operand to && is false, the right operand is never evaluated, since it cannot change the result.
A slightly more efficient version is included in Example 4b, whereby we restructure the outer loop to only consider odd numbers, thereby eliminating the even test. Since 2 is the only even prime, it is printed before the loop begins.
#include <stdio.h>
#include <stdbool.h>
int main(int argc, char *argv[]) {
int number, range;
bool prime;
// 2 is the only even prime, so report it directly
printf("2\n");
for ( number = 3; number <= 97; number = number + 2) {
prime = true;
for ( range = 3; prime && range < number; range = range + 2 ) {
if ( number % range == 0 )
prime = false;
}
if ( prime )
printf("%d\n", number);
}
}
Example 4b: Improved nested loop to find prime numbers from 2 to 100.
File Handling
Ultimately, you will need to handle data whose volume simply cannot be typed in. File handling is essential to successful programming practice.
As a programmer, you should already know what files are, and that a file's name and extension say a good deal about the type of data it holds and about the layout and encoding of its content.
Now, a favorite yet short-lived television show (Firefly) yielded a movie (Serenity) and a plethora of great lines, which are presented below to represent our data file for the following few examples.
Now I did a job. I got nothing but trouble since I did it, not to mention more than a few unkind words as regard to my character so let me make this abundantly clear. I do the job. And then I get paid. You have reputation! Malcom Reynolds gets it done, is the talk. You know what is reputation? Is people talking, is gossip. I also have reputation; not so pleasant, I think you know. Crow! Now for you, my reputation is not from gossip. You see this man? Ehh, he does not do the job. I show you what I do with him, and now for you my reputation is fact. Is solid. You do the train job for me, then you are solid. You do not like I kill this man? My wife’s nephew. At dinner I am getting earful. There is no way out of that. Anything goes wrong then your reputation only gossip and things between us not so solid. Yes? Sure, I got a secret. More'n one. Don't seem likely I'd tell 'em to you, do it? Anyone off Dyton Colony know's better'n to talk to strangers. You're talking loud enough for the both of us, though, ain't you? I've known a dozen like you. Skipped off home early, minor graft jobs here and there. Spent some time in the lock-down, I warrant, but less than you claim. Now you're what, petty thief with delusions standing? Sad little king of a sad little hill. Listen, you don’t know me, son, so I’m gonna say this once: if I ever kill you, you’ll be awake, you’ll be facing me, and you’ll be armed. Dear Diary, Today I was pompous and my sister was crazy. Today we were kidnapped by hill folk never to be seen again. It was the best day ever. Y'all see the man hanging out of the spaceship with the really big gun? Now I'm not saying you weren't easy to find. It was kinda out of our way, and he didn't want to come in the first place. Man's lookin' to kill some folk. So really it's his will y'all should worry about thwarting.
This content should be copied and pasted into a file called quotes.txt. As noted previously, this file will be referenced in the following few examples and should be located in the same directory as the example programs.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
char *fname = "quotes.txt";
char buf[255];
FILE *in;
// Attempt to open file. Report error on NULL.
in = fopen(fname, "r");
if (in == NULL) {
fprintf(stderr, "Unable to open file %s\n", fname);
return 1;
}
while ( fgets(buf, sizeof(buf), in) != NULL )
printf("%s", buf);
// close the file
fclose(in);
return 0;
}
Example 5a: Basic example of EOF loop for FILE type.
In Example 5a we are opening a file for reading. If successful, we will read all the text contained therein and write it to the screen in chunks up to the size of buf, representing our input buffer. Each chunk is limited to at most 254 characters, not including the null character.
We are using the name buf because to call it line would be incorrect. A line, from the perspective of fgets, means that we have found a newline character while reading. The newline will be part of the data that fgets returns.
Here are some key details:
- Line 2: Includes stdlib.h. Note that the FILE type itself is declared in stdio.h.
- Line 6: Defines the name of the input file.
- Line 7: Defines the pointer to a FILE object.
- Lines 11-15: Attempt to open the file. A null pointer indicates failure and is handled by the conditional.
- Lines 17-18: Looping on fgets. Remember that fgets stops at newline boundaries.
- Line 21: Closes the file.
Proper file processing requires us to be mindful of what we are opening and how we process the data, and to close the file when we are finished.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
char *infname = "quotes.txt";
char *outfname = "copy.txt";
char buf[255];
FILE *in, *out;
// Attempt to open input file. Report error on NULL.
in = fopen(infname, "r");
if (in == NULL) {
fprintf(stderr, "Unable to open file %s\n", infname);
return 1;
}
// Attempt to open output file. Report error on NULL.
out = fopen(outfname, "w");
if (out == NULL) {
fprintf(stderr, "Unable to open file %s\n", outfname);
fclose(in);
return 1;
}
while ( fgets(buf, sizeof(buf), in) != NULL )
fprintf(out, "%s", buf);
// close the files
fclose(out);
fclose(in);
return 0;
}
Example 5b: Use two FILE objects to perform a copy.
The code in Example 5b uses another FILE object to write the contents of the input file to a newly created output file.
Notable differences compared to Example 5a:
- Lines 19-24: Attempt to open the output file with error handling. This includes closing the input file on failure.
- Lines 26-27: Looping on fgets while writing to the output file with fprintf. In this case fputs could also have been used.
- Lines 30-31: Close both files.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) {
char *fname = "quotes.txt";
char buf[80];
int count = 0;
FILE *in;
in = fopen(fname, "r");
if (in == NULL) {
fprintf(stderr, "Unable to open file %s\n", fname);
return 1;
}
while ( fgets(buf, sizeof(buf), in) != NULL ) {
printf("%s", buf);
if ( strchr(buf, '\n') )
count++;
}
fclose(in);
printf("\nNumber of lines is %d\n", count);
return 0;
}
Example 5c: Count the number of lines in a file.
The code in Example 5c relies on our ability to find newlines in the text as it is read from the input file. We intentionally use a smaller buffer to illustrate that we may not read a complete line on the first try. It may take multiple attempts to get the complete line. We will know the line is fully realized when we see the newline. This is determined with lines 20 and 21.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int main(int argc, char *argv[]) {
char *fname = "quotes.txt";
char buf[80], *line, *p;
int len, inuse = 0, l, state;
int count = 0, words = 0, totalWords = 0;
FILE *in;
// Attempt to open file. Report error on NULL.
in = fopen(fname, "r");
if (in == NULL) {
fprintf(stderr, "Unable to open file %s\n", fname);
return 1;
}
len = sizeof buf;
if ( !(line = malloc(len * sizeof(char)))) {
fprintf(stderr, "Cannot allocate initial buffer of %d chars.\n", len);
return 1;
}
while ( fgets(buf, sizeof(buf), in) != NULL ) {
printf("%s", buf);
// check we have enough room, including the terminating null.
l = strlen(buf);
if ( len - inuse < l + 1 ) {
len += sizeof buf;
if ( !(p = realloc(line, len * sizeof(char)))) {
fprintf(stderr, "Cannot realloc buffer to %d chars.\n", len);
free(line);
return 1;
}
line = p;
}
// add to line; the size passed is the space remaining in line
snprintf(line + inuse, len - inuse, "%s", buf);
inuse += l;
// when we see a newline, we can start looking for words!
if ( strchr(buf, '\n') ) {
count++;
// state: 1 means we are in a word.
state = 0;
p = line;
while (*p) {
if ( isspace(*p) )
state = 0;
else if ( ! state ) {
state = 1;
words++;
}
p++;
}
printf("Words in this line is %d\n", words);
totalWords += words;
// reset to empty line
line[0] = '\0';
inuse = 0;
words = 0;
}
}
// close the file
fclose(in);
free(line);
printf("Total number of lines is %d\n", count);
printf("Total number of words is %d\n", totalWords);
return 0;
}
Example 5d: Count the number of lines and words in a file.
Now we venture into a new space regarding file processing. The code in Example 5d attempts to identify lines and words. This presents a non-trivial situation since fgets
may need multiple calls to read one complete line. It is further complicated by the fact that fgets
may stop in the middle of a word!
This requires a little more effort to get right. We need to recognize a paradox - we cannot know the longest line without first processing the whole file. Of course, doing that is inherently wasteful. So is scanning the file for lines, rewinding (yes, you can rewind!), and then checking for words. Processing twice is just not done unless we have no choice.
But we have an alternative - we can grow the string that holds the line using dynamic memory we manage at runtime. This requires a little more knowledge than we have at the moment, but this exercise is well worth the opportunity to use malloc, realloc and free. This was touched on back in Chapter 3.
Since this will take a bit to explain, we will take it in pieces. Lines 27-68 comprise the bulk of the reading and processing. It is still a while loop using fgets, so we will not repeat those details here.
The core of the program is broken down into the following key components:
- Building a dynamic line terminated by a newline.
- Use a simple finite-state machine to identify words.
- Proper resetting of state in preparation for the next line.
Building a Dynamic Line
We begin with building a dynamic line. Line number references are relative to the chunk provided below.
len = sizeof buf;
if ( !(line = malloc(len * sizeof(char)))) {
fprintf(stderr, "Cannot allocate initial buffer of %d chars.\n", len);
return 1;
}
while ( fgets(buf, sizeof(buf), in) != NULL ) {
printf("%s", buf);
// check we have enough room.
l = strlen(buf);
if ( len - inuse < l + 1 ) {   // l characters plus the null
len += sizeof buf;
if ( !(line = realloc(line, len * sizeof(char)))) {
fprintf(stderr, "Cannot realloc buffer to %d chars.\n", len);
return 1;
}
}
// add to line (size l + 1 leaves room for the null character)
snprintf(line + inuse, l + 1, "%s", buf);
inuse += l;
Portion of Example 5d that dynamically allocates storage for the line.
- Lines 1-5: Make the initial allocation of memory for the line. The size of buf (80 characters) is used as the starting value.
- Line 11: Gets the length of the newly read buffer.
- Lines 12-18: Make a determination based on the length of the line, the number of characters in use and the length of the new buffer. If there is not enough space, increase the size of line by the declared size of buf using realloc with appropriate error processing.
- Lines 21-22: Add the new buffer contents to the end of the line and update the amount of the line that is in use.
Remember from our string discussions that the length of a string is also the index of the null character. We are making use of that characteristic to update line using snprintf. The destination is the address in line plus the current value of inuse, which is the length of the string currently held in line. Since that is the address of the null character, writing buf there as a string (%s) appends it to what is already stored. Note that the size argument of snprintf counts the terminating null character, so copying all l characters of buf requires a size of l + 1 and at least that much free space in line.
The space check runs before the write, and buf is a fixed size bounded at the fgets call. With both bounds in place, the write stays inside the memory allocated for line.
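The bookkeeping described above can be isolated in a small sketch. The helper name append_chunk and its parameters are hypothetical, and it swaps snprintf for memcpy to make the "l characters plus one null" arithmetic explicit. It also uses pointer parameters, which are covered in a later chapter, so treat it as an illustration under those assumptions and not as the code from Example 5d.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append chunk to *line, growing the buffer by `step`
 * bytes at a time when needed (step must be greater than zero).
 * Returns 0 on success, -1 if realloc fails; on failure the old buffer is
 * left intact so the caller can still free it. */
int append_chunk(char **line, size_t *len, size_t *inuse,
                 const char *chunk, size_t step)
{
    size_t l = strlen(chunk);

    /* Need room for l characters plus the terminating null. */
    while (*len - *inuse < l + 1) {
        char *p = realloc(*line, *len + step);
        if (p == NULL)
            return -1;
        *line = p;
        *len += step;
    }

    /* Copy the characters and the null character in one call. */
    memcpy(*line + *inuse, chunk, l + 1);
    *inuse += l;
    return 0;
}
```

Starting with a null pointer and zero sizes is fine here, because realloc behaves like malloc when it is given a null pointer.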
A Simple Finite-State Machine
A finite-state machine is one of the four forms of automata. Truly it is what its name claims: a programmed machine with limited states. We will identify three states:
- Found whitespace.
- Found the start of a word.
- Found the end of a word.
Now the term word is being used rather loosely here. A word is any group of characters bounded by whitespace or the beginning or end of a line. That being said, we will now take to the code.
// when we see a newline, we can start looking for words!
if ( strchr(buf, '\n') ) {
count++;
// state: 1 means we are in a word.
state = 0;
p = line;
while (*p) {
if ( isspace(*p) )
state = 0;
else if ( ! state ) {
state = 1;
words++;
}
p++;
}
printf("Words in this line is %d\n", words);
totalWords += words;
Remember that this comes right after adding buf to the line. Let us get to the details:
- Line 2: We begin the if-test to see if we have indeed read a complete line. Although buf has already been added to line, it is cheaper to scan buf than to scan all of line for the newline.
- Line 3: Increment the line count.
- Lines 6-8: Set the state to not in a word. Why? Because at the very beginning, we cannot say for certain we found a word! Then, set the temporary pointer to the beginning of line and start the inner while loop to inspect the characters of line.
- Lines 9-14: Set up our conditional to examine three states. But wait, we are only testing two conditions!? Yeah about that...
  - The isspace(*p) test keeps us in (or returns us to) state 0.
  - The else if (! state) is checking to see if we are still in state 0, BUT we are not looking at whitespace any longer. Switch to state 1 and bump the word count! (Yes, it is the beginning of a word, but a word nonetheless.)
  - The unwritten test is when we are already in state 1 and still looking at non-whitespace characters. Look closely - as soon as we see whitespace we apply the first test, above. We found the end of the word.
- Line 15: Advance to the next character.
- Lines 17-18: Display details and add words to the running total.
A finite-state machine is the most straightforward approach for processing the string and identifying words. Could we have just looked for whitespace? Yes, but the tests can become more complex with adjacency (multiple consecutive whitespace characters), end-of-line detection, etc.
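The same two-state machine can be sketched on its own as follows. The function name count_words is hypothetical, and one detail is an addition not found in Example 5d: the character is cast to unsigned char before calling isspace, since passing a negative char value to the <ctype.h> functions is undefined.

```c
#include <ctype.h>

/* Count whitespace-delimited words in s with a two-state machine. */
int count_words(const char *s)
{
    int in_word = 0;   /* state: 0 = in whitespace, 1 = inside a word */
    int words = 0;

    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s)) {
            in_word = 0;            /* a word, if any, has just ended */
        } else if (!in_word) {
            in_word = 1;            /* first character of a new word */
            words++;
        }
        /* else: still inside the same word, nothing to do */
    }
    return words;
}
```

Runs of consecutive whitespace need no special handling: they simply keep the machine in state 0.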
Resetting the State
At the very end of the loop, we must take care to reset everything and get ready to read another complete line of unknown length using our limited buffer space.
// reset to empty line
line[0] = '\0';
inuse = 0;
words = 0;
}
Portion of Example 5d at the end of the if-test to reset and prepare for the next line.
This is what is happening:
- Line 2: Place a null character at the beginning of line. This sets the length to zero, effectively removing the previous string. But we are not freeing the memory. After all that trouble to get it, we will keep it and lengthen it, if necessary.
- Line 3: Setting inuse to zero means there are no characters we care about in line. The previous contents will simply be overwritten.
- Line 4: Reset the word count for line. Remember that we are counting words per line in addition to the total number of words in the file.
Ok, now for one more fun thing!
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
char *fname = "quotes.txt";
char buf[80];
int lines = 0, words = 0;
FILE *in;
// Attempt to open file. Report error on NULL.
in = fopen(fname, "r");
if (in == NULL) {
fprintf(stderr, "Unable to open file %s\n", fname);
return 1;
}
// read words - skipping whitespace
while ( fscanf(in, "%79s", buf) == 1) {
printf("%s ", buf);
words++;
// have 5 words been printed?
if ( words > 0 && words % 5 == 0 ) {
printf("\n");
lines++;
// ok, have 5 lines been printed?
if ( lines % 5 == 0 )
printf("\n");
}
}
// close the file
fclose(in);
return 0;
}
Example 5e: Arranging words in 5x5 blocks.
Finally, we close out the topic of basic file handling with Example 5e. This is simpler than the previous example since we do not need to build a complete line. In fact, we are not even going to use fgets. Instead we will use fscanf and the %s conversion. The width in %79s keeps fscanf from storing more characters than buf can hold.
Since it is the nature of the scanf family to skip leading whitespace when using %s, this will be perfect. Our goal now is to transform the file from written sentences to groups of 5 lines of 5 words each, skipping a line after every 5 lines are written.
There are only a few details to mention here:
- Line 21: Increment the number of words read so far.
- Line 24: Determines if a newline should be printed based on the number of words printed. If evenly divisible by 5, we print the newline and increment the number of lines.
- Line 29: Determine if a newline should be printed based on the number of lines printed. If evenly divisible by 5, we print an additional newline to create 5x5 boxes of words.
Sample output is shown below:
Now I did a job.
I got nothing but trouble
since I did it, not
to mention more than a
few unkind words as regard

to my character so let
me make this abundantly clear.
I do the job. And
then I get paid. You
have reputation! Malcom Reynolds gets

...
Safely Reading Numbers - Part III
In this section, we consider how to provide multiple attempts to complete the task once an error occurs. We take the code from Part II and wrap it in a do-while loop.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
int main(void) {
long a;
char buf[300];
char *p;
int success;
do {
printf("enter a number: ");
if (!fgets(buf, sizeof buf, stdin)) {
// reading input failed, give up:
return 1;
}
// have some input, convert it to integer:
errno = 0;
a = strtol(buf, &p, 10);
//printf("p = %p, buf = %p, len = %lu, p-buf = %ld\n", p, buf, strlen(buf), p-buf);
// *p can be '\0' or '\n', but p cannot be buf.
success = ((!*p || *p == '\n') && p != buf && !errno);
if (errno)
perror("strtol");
else if (!success)
printf("You did not enter a valid number.\n");
} while (!success); // repeat until we got a valid number
printf("You entered %ld\n", a);
return 0;
}
Example 6: Adding some error checking to our safe method of reading numbers.
The code in Example 6 is nothing more than what we accomplished in Chapter 4. The only difference is we wrapped the process in a loop and used the success variable to determine if we should try again.
The final version of this code will be presented in Chapter 6 where we turn this process into a function that can be called on-demand.
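As a preview of that idea, the validation test from Example 6 could be pulled out along the following lines. This is only a sketch and not the Chapter 6 version: the name parse_long and its return convention are assumptions made for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out when buf
 * holds exactly one base-10 integer (optionally followed by a newline),
 * otherwise returns 0 and leaves *out untouched. */
int parse_long(const char *buf, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(buf, &end, 10);

    if (end == buf)                      /* no digits were consumed */
        return 0;
    if (errno == ERANGE)                 /* value does not fit in a long */
        return 0;
    if (*end != '\0' && *end != '\n')    /* trailing junk such as "12x" */
        return 0;

    *out = value;
    return 1;
}
```

The three checks correspond to the three parts of the success expression in Example 6: p != buf, !errno, and the test on *p.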
Building a Command Line Interface (CLI)
At this point in the semester, we have more programming prowess than you might think. Consider for a moment that you wanted to build a command-line interface. Something like what is shown below:
Welcome to my really awesome command-line interface!
Type 'help' at any time for assistance. (v. 1.0)
cmd>
The detail above shows a simple welcome message and provides basic assistance and the product version. The final thing displayed is the command prompt itself, cmd>. Since this is truly nothing more than an EOF-based while loop, let us begin with a simple setup like Iteration 1.
#include <stdio.h>
int main(int argc, char *argv[]) {
char line[100];
// Display the prompt, loop until EOF.
printf("cmd> ");
while ( fgets(line, sizeof line, stdin) != NULL ) {
// do something
printf("cmd> ");
}
return 0;
}
Iteration 1: The framework for reading lines.
The setup is textbook. No real revelation in its design. Prompting before the loop and at the end of it, we read a line using fgets into the line variable.
The next logical step is to identify the first word of the line as the command. The other possibility is that the whole line is the command, such as 'help'.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
char line[255], *cmd=NULL, *p, *c;
char *brkset = " \t\v\n\r";
printf("cmd> ");
while ( fgets(line, sizeof line, stdin) != NULL ) {
// too long?
if ( !(strchr(line, '\n')) ) {
fputs("\nCommand line too long.\n\n", stderr);
do {
if ( !fgets(line, sizeof line, stdin) ) break;   // stop at EOF or error
} while ( !(strchr(line, '\n')) );
goto next;
}
// anything here?
c = line + strspn(line, brkset);
if ( !*c )
goto next;
// find command
p = strpbrk(c, brkset);
cmd = strndup(c, p - c);
printf("The command is: %s\n", cmd);
next:
free(cmd);
cmd = NULL;
printf("cmd> ");
}
return 0;
}
Iteration 2a: Added code to find the command.
Several design choices went into this bit of code, many of which are the result of sticking to the C standard.
- String library functions strchr, strpbrk, strspn and strndup are used to parse the line. The first three are part of standard C. strndup comes from POSIX and only joined the C standard in C23, so some compilers may need a feature-test macro or a hand-written replacement.
- We are keeping the newline in the input string to make processing simpler. Having that on the end simplifies the loop processing as we can always rely on it.
- There is a limit to the size of a single command. This is 254 characters, including the newline. Rather arbitrary, but this could be changed if we take the dynamic code from Example 5d.
- Additional code is added to deal with extraneous whitespace - do not rely on the user to do things correctly.
- The first appearance of goto in the entire textbook is introduced to handle potential complexities of user error.
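Where strndup is not available, a stand-in can be written with malloc and memcpy. The sketch below is one possible replacement under that assumption. The name copy_n is hypothetical and is not part of any library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strndup: returns a newly allocated copy of at
 * most n characters of s, or NULL if allocation fails. Caller must free. */
char *copy_n(const char *s, size_t n)
{
    size_t len = 0;
    char *copy;

    while (len < n && s[len] != '\0')   /* never read past the end of s */
        len++;

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```

In Iteration 2a, a call such as copy_n(c, p - c) would take the place of strndup(c, p - c).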
Let us jump right into the details:
- Lines 14-20: If line does not have a newline, then the command continued beyond the available space. In this situation, we continue to read from stdin until the remainder is consumed. Then we use goto and jump to the bottom of the loop to prompt again. This skips all the remaining processing without adding additional complex if-tests.
- Lines 23-25: If we are here, then we have a complete line of something. The first thing we do is use strspn to get past any leading whitespace. Now, if c points to a null character, then the user only entered whitespace and we use the goto again. We need two separate pointers in relation to line to keep track of where we are and where we are going.
  0    1    2    3    4    5    6    7    8    9    10   11   12
------------------------------------------------------------------
|    |    | c  | m  | d  |    | a  | 1  |    | a  | 2  | \n | \0 |
------------------------------------------------------------------
            ^
            |
            c
- Next, we use strpbrk to scan the string for the first occurrence of whitespace. Then we use strndup to copy the command from line.
  0    1    2    3    4    5    6    7    8    9    10   11   12
------------------------------------------------------------------
|    |    | c  | m  | d  |    | a  | 1  |    | a  | 2  | \n | \0 |
------------------------------------------------------------------
            ^              ^
            |              |
            c              p
- Finally, the next label at the bottom of the loop is where we land after a goto. In addition, we call free for the command and set it to the null pointer. This is necessary to avoid a double-free situation. If the user types a command and then just hits enter the next time, the goto ends up invoking free again; if the old pointer were still there, freeing it a second time would be undefined behavior (UB). Passing a null pointer to free, on the other hand, is always safe.
A big takeaway is that line is fixed and always refers to the beginning of the text. We are using c to move past leading whitespace, then using p in relation to c in order to get the command. Using goto can be very powerful and should be limited to situations where error processing can be difficult - as we have in this case and we are not done!
Using strspn and strpbrk is not trivial. This is why we emphasize safe tools: it takes practice to get it right, and even then, things can go wrong. We also have to accept that a lot of what we are doing here is pointer management. This is covered in more detail in Chapter 7.
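To practice the span functions in isolation, here is a small sketch that extracts the first word of a line into a caller-supplied buffer. The name first_word is hypothetical. It uses strcspn in place of strpbrk so that a string without any trailing whitespace does not produce a null pointer, and it copies with memcpy so that it does not depend on strndup.

```c
#include <string.h>

/* Hypothetical helper: copy the first whitespace-delimited word of line
 * into out (capacity outsz, always null-terminated when outsz > 0).
 * Returns the length of the word found, or 0 if the line is blank. */
size_t first_word(const char *line, char *out, size_t outsz)
{
    const char *brkset = " \t\v\n\r";
    const char *c = line + strspn(line, brkset);   /* skip leading blanks */
    size_t n = strcspn(c, brkset);                 /* length of the word  */

    if (outsz > 0) {
        size_t copy = (n < outsz - 1) ? n : outsz - 1;
        memcpy(out, c, copy);
        out[copy] = '\0';
    }
    return n;
}
```

A return value larger than outsz - 1 tells the caller that the word was truncated to fit.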
There is also the subtle art of anticipating the details of what the user typed within the string when parsing. This is a bit more challenging in C, using only standard library tools. Other languages often have vast libraries of tools to assist.
In the next iteration, we will deal with the command's arguments.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
char line[255], *cmd=NULL, *p, *c;
char *brkset = " \t\v\n\r";
printf("cmd> ");
while ( fgets(line, sizeof line, stdin) != NULL ) {
// too long?
if ( !(strchr(line, '\n')) ) {
fputs("\nCommand line too long.\n\n", stderr);
do {
if ( !fgets(line, sizeof line, stdin) ) break;   // stop at EOF or error
} while ( !(strchr(line, '\n')) );
goto next;
}
// anything here?
c = line + strspn(line, brkset);
if ( !*c )
goto next;
// find command
p = strpbrk(c, brkset);
cmd = strndup(c, p - c);
printf("The command is: %s\n", cmd);
// any args?
p = p + strspn(p, brkset);
while ( (c = strpbrk(p, brkset)) && c > p) {
//printf("%p %p\n", p, c);
int n = strspn(c, brkset);
*c = '\0';
printf("arg: %s\n", p);
p = c + n;
}
next:
free(cmd);
cmd = NULL;
printf("cmd> ");
}
return 0;
}
Iteration 2b: Looping to find the arguments.
Finding the arguments is only slightly more challenging than finding the command. However, the hard part is done, and we have several tools to make the job easier. Our focus is on lines 34-41 in Iteration 2b.
Now, p is a valid pointer: the line is known to end in a newline, so strpbrk always found something. There is no way to get to this point without trusting what is in p. So the first thing we do is sweep past any whitespace we may be pointing to with p (line 34):
  0    1    2    3    4    5    6    7    8    9    10   11   12
------------------------------------------------------------------
|    |    | c  | m  | d  |    | a  | 1  |    | a  | 2  | \n | \0 |
------------------------------------------------------------------
            ^                   ^
            |                   |
            c                   p
The idea is to loop, searching for text while skipping whitespace. So, c finds the next whitespace character and n counts the run of whitespace that starts there. Then we make a mini string by placing a null character at c. We will change this in the next iteration.
  0    1    2    3    4    5    6    7    8    9    10   11   12
------------------------------------------------------------------
|    |    | c  | m  | d  |    | a  | 1  | \0 | a  | 2  | \n | \0 |
------------------------------------------------------------------
                                ^         ^
                                |         |
                                p         c
n is 1
Then we can adjust p and loop again.
  0    1    2    3    4    5    6    7    8    9    10   11   12
------------------------------------------------------------------
|    |    | c  | m  | d  |    | a  | 1  | \0 | a  | 2  | \n | \0 |
------------------------------------------------------------------
                                               ^         ^
                                               |         |
                                               p         c
n is 1
And so on...
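The walk shown in the diagrams can be sketched as a stand-alone routine. The name split_in_place and the out array are hypothetical, and strcspn is used in place of strpbrk so that a final token without trailing whitespace is still handled. As in Iteration 2b, the count of following whitespace is taken before the null character is written.

```c
#include <string.h>

/* Hypothetical helper: split line in place on whitespace, storing up to
 * max token pointers in out. Returns the number of tokens stored.
 * The line is modified: a '\0' replaces the first blank after each token. */
int split_in_place(char *line, char *out[], int max)
{
    const char *brkset = " \t\v\n\r";
    char *p = line + strspn(line, brkset);   /* skip leading whitespace */
    int count = 0;

    while (*p != '\0' && count < max) {
        char *c = p + strcspn(p, brkset);    /* end of this token */
        size_t n = strspn(c, brkset);        /* blanks that follow it */

        out[count++] = p;
        if (*c != '\0')
            *c = '\0';                       /* cut the token off */
        p = c + n;                           /* start of the next token */
    }
    return count;
}
```

The pointers stored in out refer to the original line, so they are only valid until the line is overwritten by the next read.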
Now we re-introduce the idea of the array. An array is a collection of values of the same type held in a single variable, just as a string is a collection of characters. We will now introduce the parts variable, which is an array of char *. The idea is to parse the parts of the line and use strndup for each part, adding the pointer to the array.
In Iteration 3, we have also performed some cleanup of the code and combined the identification of the command and the arguments as simply parts. The string at parts[0] will represent the command. All of the rest are arguments.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define ARGMAX 10
#define LINEMAX 255
int main(int argc, char *argv[]) {
char line[LINEMAX], *p, *c;
char *brkset = " \t\v\n\r";
int len = 0;
char *parts[ARGMAX];
printf("cmd> ");
while ( fgets(line, sizeof line, stdin) != NULL ) {
// too long?
if ( !(strchr(line, '\n')) ) {
fputs("\nCommand line too long.\n\n", stderr);
do {
if ( !fgets(line, sizeof line, stdin) ) break;   // stop at EOF or error
} while ( !(strchr(line, '\n')) );
goto next;
}
// anything here?
p = line + strspn(line, brkset);
if ( !*p )
goto next;
while ( (c = strpbrk(p, brkset)) && c > p) {
//printf("%p %p\n", p, c);
int n = strspn(c, brkset);
if ( len == ARGMAX ) {
fprintf(stderr, "Too many parts. Max is %d.\n", ARGMAX);
goto cleanup;
}
parts[len++] = strndup(p, c - p);
p = c + n;
}
printf("There are %d of %d parts\n", len, ARGMAX);
for ( int x = 0; x < len; x++)
printf("%s\n", parts[x]);
cleanup:
for ( int x = 0; x < len; x++)
free(parts[x]);
len = 0;
next:
printf("cmd> ");
}
return 0;
}
Iteration 3: Check the command.
Here are some takeaways for Iteration 3:
- Lines 5-6: Use #define to move the numeric constants out of the code and use names.
- Lines 32-41: Consolidation of the code to identify the parts of the line.
- Line 47: Added a new label to simplify cleanup when an error occurs.
- Line 53: Original label from the first time we started using goto.
We will add code to validate the entered command in our final iteration. The processing of arguments and actual execution of the command will not be addressed in this space.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define ARGMAX 10
#define LINEMAX 255
int main(int argc, char *argv[]) {
char line[LINEMAX], *p, *c;
char *brkset = " \t\v\n\r";
int len = 0;
char *parts[ARGMAX];
printf("cmd> ");
while ( fgets(line, sizeof line, stdin) != NULL ) {
// too long?
if ( !(strchr(line, '\n')) ) {
fputs("\nCommand line too long.\n\n", stderr);
do {
if ( !fgets(line, sizeof line, stdin) ) break;   // stop at EOF or error
} while ( !(strchr(line, '\n')) );
goto next;
}
// anything here?
p = line + strspn(line, brkset);
if ( !*p )
goto next;
while ( (c = strpbrk(p, brkset)) && c > p) {
//printf("%p %p\n", p, c);
int n = strspn(c, brkset);
if ( len == ARGMAX ) {
fprintf(stderr, "Too many parts. Max is %d.\n", ARGMAX);
goto cleanup;
}
parts[len++] = strndup(p, c - p);
p = c + n;
}
// check command
p = parts[0];
if (strcmp(p, "copy") == 0) {
printf("Performing copy.\n");
} else if (strcmp(p, "delete") == 0) {
printf("Performing delete.\n");
} else if (strcmp(p, "storage") == 0) {
printf("Performing storage.\n");
} else if (strcmp(p, "volume") == 0) {
printf("Performing volume.\n");
} else if (strcmp(p, "vserver") == 0) {
printf("Performing vserver.\n");
} else if (strcmp(p, "help") == 0) {
printf("Performing help.\n");
} else {
fprintf(stderr, "Invalid command: %s\n", p);
}
cleanup:
for ( int x = 0; x < len; x++)
free(parts[x]);
len = 0;
next:
printf("cmd> ");
}
return 0;
}
Iteration 4: But wait, there is more...
We have come a long way in this series of examples. It is also challenging to realize that we are just getting started with building a full-featured CLI. There is so much left undone: building out the commands and addressing how many arguments are allowed for each, interacting with the operating system, error processing per command, etc.
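One way the command check might grow as more commands are added is a lookup table in place of the chain of strcmp tests. The sketch below reuses the command names from Iteration 4. The table layout and the name find_command are assumptions made for illustration, not part of the iterations above.

```c
#include <stddef.h>
#include <string.h>

/* Command names taken from Iteration 4. */
static const char *const commands[] = {
    "copy", "delete", "storage", "volume", "vserver", "help"
};

/* Hypothetical helper: returns the index of cmd in the table, or -1. */
int find_command(const char *cmd)
{
    size_t count = sizeof commands / sizeof commands[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(cmd, commands[i]) == 0)
            return (int)i;
    }
    return -1;
}
```

With this arrangement, adding a command means adding one string to the table, and the returned index could later drive a switch statement that performs the work.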
Of course, at this stage of development, we have not yet learned how to build user-defined functions. These would help get a lot of this code out of the main function and into a series of additional functions that could perform better error management and recovery.
However, we should not lose sight of how much we have covered. Parsing, dynamic memory, complex pointer management, and even the use of goto are invaluable assets in the C language. Take the time to absorb what has been provided here. These are idiomatic constructs that will serve you well in the future.
Exercises
- (Beginner) Take the code in Example 1a and convert it to an EOF loop. Since the loop is now bounded by EOF you will need to keep track of how many values are entered.
- (Intermediate) Write a loop to read four letter grades from the user. Using the letter grade switch statement from Chapter 4, write a program to calculate the GPA of a student's grades (for example A, B, C, C) where each class is 4 credits. [ GPA = ( quality_points * credits + quality_points * credits + ... ) / total_credits ]
- (Advanced) Perform the same GPA calculation as the previous exercise, but ask the user for the 4 grades and the number of credits for each grade (they can be different!).
- (Advanced) Write a loop to read strings from the user until EOF. With each string determine how many letters and whitespace characters exist. Report these as separate values. [Hint: Index each string one character at a time and pass each char to the is-functions declared in <ctype.h>, such as isalpha and isspace. This will be a nested loop - like a for loop inside the EOF loop.]