(Updated March 25, 2023)
Table of contents
- Basic User Output
- Basic User Input
- Strings Revisited
- Advanced Output With the printf() Function
- Input With the scanf() and fgets() Functions
- Safely Reading Numbers – Part I
- Random Numbers
- Quiz
- Exercises
Basic User Output
Here we will introduce the idea of very primitive output without special formatting. As shown below, we have seen that printf was used to send output to the user.
printf("Hello World!\n");
There are several ways to get output to the user, and the ones demonstrated here are available in the header file <stdio.h>. Remember to include the header file in your program to use the functions defined therein.
Let us begin with the three standard I/O devices: stdin, stdout, and stderr. These represent the standard input, standard output, and standard error devices (or streams), respectively.
- stdin – The standard input device. This is typically the keyboard, but the data could be redirected from another source.
- stdout – The standard output device. This is typically the screen or terminal window where text is displayed. It could be redirected to another output destination without the program’s knowledge.
- stderr – The standard error device. This is typically sent to the same device as stdout. The separate designation allows the program’s error output to be redirected separately from the stdout data.
These are pointers to objects of the type FILE. Note that stderr could be directed to an alternate stream. For example, you can send errors to a log file or a system-defined console by specifying output redirection from the command line at runtime. An example of this is shown at the end of the chapter.
Table 1 shows some of the functions available in the <stdio.h> header file.
C functions for output |
---|
int fputc(int c, FILE *out) Writes a character to the output file out . Returns the character written or EOF if end of file or error occurred. |
int fputs(const char *restrict s, FILE *restrict out) Writes the characters in s to out . Returns EOF on error otherwise a non-negative value is returned. |
int puts(const char *s) Writes the characters in s plus a newline to stdout . Returns EOF on error otherwise a non-negative value is returned. |
int fprintf(FILE *restrict out, const char *restrict format, ...) Writes converted output to out , leaving the cursor on the end of the line. (The details of how this works are covered later in this chapter.) |
int snprintf(char *restrict dest, size_t n, const char *restrict format, ...) Writes converted output to dest string that is at most n-1 characters long and is null-terminated. (The details of how this works are covered later in this chapter.) |
int printf(const char *restrict format, ...) Same as fprintf using stdout . (The details of how this works are covered later in this chapter.) |
Table 1: Some of the output functions from <stdio.h>. All of these are considered safe-use functions.
restrict
The restrict modifier keyword was added in C99, and many of the standard library functions have it applied to their parameters. It allows the compiler to make certain inferences about the data to improve optimizations. Beginning programming students can ignore this modifier and focus on the parameter types and positions.
The printf function family is much more capable than it appears in the examples seen thus far. Later in this chapter, we will see just how powerful it is. For now, we will recognize that printf can print text, leaving the cursor at the end of the line, and demonstrate some of the remaining functions.
Now, let us look at the group of output functions as an example.
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]) {
char c;
char name[30];
snprintf(name, sizeof name, "%s", "John Smith");
c = 'L';
fputc(c, stdout);
fputc('\n', stdout);
puts(name);
printf("%s", name);
fputs(name, stdout);
return 0;
}
Example 1: Some output functions.
The program in Example 1 shows ways to print characters and strings.
- The snprintf call re-introduces the function we saw in the previous chapter to duplicate a string safely.
- The two fputc calls print a single char, then a newline. Remember that fputc can only move one char at a time.
- The last three calls all produce output. The main difference between puts() and fputs() is that puts() is intended to be used with stdout while fputs() is intended to be used with files. This is not a requirement. After all, stdout is a file, as the fputs(name, stdout) call demonstrates.
In Chapter 2, we saw printf using the %f conversion for a floating point value and %s for the conversion of a string. There are many conversions to be discussed later in this chapter.
The output of Example 1, below, is relatively unremarkable. The one thing to note is that printf and fputs() do not emit a newline, whereas puts() does. This is why the last line of output shows two John Smiths on the same line.
L
John Smith
John SmithJohn Smith
putc() and putchar(). These two functions are omitted from Table 1 since they could be implemented as macros. Long story short, it is the opinion of this author that fputc does the job just fine, and we do not need to worry about safety issues.
Basic User Input
Obtaining user input in C is not a trivial task. We will show a couple of quick examples here, but know that one of the tools for reading information is also for parsing complex text. This means we will need to be careful how we use the tools, so we do not encounter undefined behavior.
Table 2 shows a list of some functions useful for user input.
C functions for input |
---|
int fgetc(FILE *in) Reads a character from the file in . Returns the character as unsigned char converted to int or EOF if end of file or error occurred. |
char *fgets(char *s, int n, FILE *in) Reads at most n-1 characters from in into s . Reading also stops after a newline, which is included in s . A null character ('\0' ) is added at the end of s . Returns s , or NULL if an error or end of file occurred. |
int fscanf(FILE * in, const char *format, ...) Read input based on format from in . (The details of how this works are covered later in this chapter.) |
int scanf(const char *format, ...) Same as fscanf using stdin . (The details of how this works are covered later in this chapter.) |
Table 2: Some of the input functions from <stdio.h>.
Now we will put a couple of these to work.
#include <stdio.h>
int main(int argc, char *argv[]) {
int n;
char name[30];
printf("Please enter your name: ");
fgets(name, sizeof name, stdin);
printf("Please enter a whole number: ");
scanf("%d", &n);
printf("Hello, %s, you entered %d\n", name, n);
return(0);
}
Example 2: Using fgets() and scanf() to read data from user.
In Example 2, we are using both fgets() and scanf() to show two simple ways to read information from the user. We start by reading a string that represents the name using fgets(), and then we use scanf() to read an integer.
Note that scanf() and printf() have much in common in their format specifier strings, but their semantics can differ significantly. We will learn a great deal about these two functions, in addition to how to use fgets(), in the following few sections.
Try switching the order of the fgets() and scanf() calls and see what happens. No, really. (HINT: You will NEVER be able to enter your name without some additional code.)
The output of Example 2 looks like the following:
Hello, Bill
, you entered 5
The reason this is broken across two lines is the newline that is captured by fgets. This is discussed in more detail below in the section Input With the scanf() and fgets() Functions.
Strings Revisited
Strings are very versatile, and many functions can assist with copying, concatenating, splitting, changing cases, etc. However, some of these functions have also been deemed unsafe. They are hazardous because their use has repeatedly led to situations where return values were unchecked, lengths were not confirmed, or there was no path to guarantee safe usage. Let us consider a basic example.
The strcpy() function copies characters into the destination object, listed first, using the second object as the source.
char name[10];
strcpy(name, "Bill");
Example of safe use of strcpy().
The code above is perfectly safe. Why? We can see everything. We know there is enough room for the source string ("Bill") to be copied into the destination (name). There is no possible way to get into trouble.
The issue is this will not always be the case. In later chapters, we will use objects obtained from various locations, and there is no way to determine if there is enough room for the result. Unless we create the object to receive the data, we cannot guarantee the location is safe for the amount to be written.
So in Table 3a, you will notice that there are several functions with strikethrough. These functions are considered unsafe, and we will not provide examples of how to use them (except the one noted above).
Instead, we will provide alternatives to help you write more robust, safer code.
C functions for strings |
---|
These functions must be avoided. They do not have sufficient bounds checking. We will use snprintf() and strdup() as replacements. |
int strcmp(const char *s1, const char *s2) Compares characters from s1 to s2 . Returns <0 if s1 < s2 , >0 if s1 > s2 and 0 if s1 == s2 . |
int strncmp(const char *s1, const char *s2, size_t n) Compares at most n characters from s1 to s2 . Returns <0 if s1 < s2 , >0 if s1 > s2 and 0 if s1 == s2 . |
char *strchr(const char *str, int ch) Returns a pointer to the first occurrence of ch in str or NULL if not found. |
char *strrchr(const char *str, int ch) Returns a pointer to the last occurrence of ch in str or NULL if not found. |
size_t strspn(const char *dst, const char *src) Returns the length of the initial part of dst that consists only of characters in src . |
size_t strcspn(const char *dst, const char *src) Returns the length of the initial part of dst that consists only of characters not in src . |
char *strstr(const char *str, const char *substr) Returns a pointer to the first occurrence of substr in str . If substr is not found, the null pointer is returned. |
char *strdup(const char *s) Returns a null-terminated string pointer that is a duplicate of s . The pointer MUST be sent to free() to avoid memory leaks. If an error occurs, NULL is returned, and errno may be set. (Originally POSIX, it was added to ISO C in 6/2019 and to C2x) |
char *strndup(const char *s, size_t n) Returns a null-terminated string pointer that is at most n characters of s . The pointer MUST be sent to free() to avoid memory leaks. If an error occurs, NULL is returned, and errno may be set. (Originally POSIX, it was added to ISO C in 6/2019 and to C2x) |
int snprintf(char *restrict dest, size_t n, const char *restrict format, ...) Writes converted output to dest string that is at most n-1 characters long and is null-terminated. (The details of how this works are covered later in this chapter.) |
Table 3a: String functions from <string.h>. Note: snprintf() is from <stdio.h>.
You will notice several functions designed to compare, search, duplicate, and format strings. Earlier in the chapter, we noted snprintf() in the section on basic output. Its repeat appearance here is intentional, as it can replace strcpy() and strcat() – with a bit of retooling.
In each subsection, we will give brief details and examples of using these tools in everyday code.
Let us begin with some basics.
Comparing Strings
String comparisons are basic lexicographical comparisons in which the characters in the same position of each string are compared, conceptually by subtracting their values. If the result is zero, the comparison proceeds to the next position. This continues until either there is a non-zero result or there is no more of either string to process.
So, if we compare strings a and b:
char a[] = "AAA";
char b[] = "AAB";
int res;
We will discover that a is less than b because when we subtract ‘B’ from ‘A’, the result is -1. This is because the ASCII value of ‘A’ is 65, and ‘B’ is 66. So, 65 minus 66 is -1. (The C standard only guarantees the sign of the result; many implementations return the difference itself, as shown here.)
res = strcmp(a,b); // res is negative (-1 here)
  A  A  A
- A  A  B
---------
  0  0 -1
If b instead holds "AAa", the lowercase ‘a’ (ASCII 97) makes the difference -32.
char b[] = "AAa";
res = strcmp(a,b); // res is negative (-32 here)
  A  A  A
- A  A  a
---------
  0  0 -32
When we compare b to a, we find that b is greater since the result is 32.
res = strcmp(b,a); // res is positive (32 here)
  A  A  a
- A  A  A
---------
  0  0  32
Now we appreciate that the result is zero when both strings are the same.
Using strncmp allows you to specify a limit on the number of characters to compare from the beginning of the string. This can be useful when testing for prefixes.
Searching Strings
There are several ways to search strings. We will present four methods here, but these are by no means the only methods available.
To search for a specific character within a string, we have two choices: strchr and strrchr.
//          000000000011   (index, tens digit)
//          012345678901   (index, ones digit)
char s[] = "abc123321cba";
char *p;
int pos, len;
p = strchr(s, 'A'); // returns null pointer (not found)
p = strchr(s, 'a'); // returns (s+0) as a pointer
p = strrchr(s,'a'); // returns (s+11) as a pointer
p = strstr(s, "123"); // returns (s+3) as a pointer
len = strspn(s, "abcde"); // returns 3 - stops matching at first digit
len = strcspn(s, "ABC"); // returns 12 - no characters were matched
Code demonstrating string search functions.
When using the strchr functions, the return value will be a pointer to where the character was found or the null pointer if not found. Remember that strrchr starts searching from the end of the string.
To search for a string within a string, we use strstr. It returns a pointer in the same way as strchr and strrchr.
Finally, the span functions strspn and strcspn inspect each character of the string to see if it matches the character set (strspn) or does not match (strcspn). Each returns the index (length) of the first failed comparison.
Duplicating Strings
Duplicating strings assumes you do not yet have space to hold a result. Another way to look at it is if you think you will not have enough space, create a new string that is the correct size.
The strdup() and strndup() functions are used when we need to make an exact copy of an existing string. The functions take an object that refers to a known string and return an object – a new memory location – containing the copy of the original.
Using strdup and strndup comes at a price. Unlike the local variables we declare in main, you are responsible for the memory these functions provide. Failing to free that memory will lead to memory leaks and badly written code.
char *copy;
// this copies the argument value into a new string.
copy = strdup("Some text...");
//...
// We must remember to give back that memory when we are done with it.
free(copy);
The variable copy is only a pointer to characters. It can hold an address of where the characters live in memory. The strdup function provides new storage (returned as a pointer to characters) with a copy of the provided string. As such, we must endeavor never to lose that address since it must be returned at some point – typically, as soon as we know we no longer need it.
As we wind down the discussion on the basics of string handling, there is another talk we need to have about C, memory, and object storage classes.
Introduction to Storage Classes
We need to discuss storage classes. We are going to differentiate automatic versus allocated storage. In C, there are four storage classes, but we are only concerned with two for now.
- automatic – The storage is allocated upon entry of a block and is deallocated when the block is exited.
- allocated – The storage is allocated and deallocated upon request. This is typically done by using allocation functions like malloc(), calloc(), or realloc() and the deallocation function free().
Since printf() is a staple function, it is only logical that a relative – snprintf() – works specifically with producing formatted strings. Remember that we are using snprintf as a replacement for the strcpy and strcat family of functions that have been deemed unsafe.
We used the strdup functions when we did not have space already allocated to receive the data. So now let us consider when there is enough space and we know how much. Consider the following:
char src[] = "Some data...";
char dst[20];
snprintf(dst, sizeof dst,"%s", src);
Using snprintf to perform a safe strcpy.
This is safe because we have provided a boundary for the copy. If the contents (src) were to exceed the available space (dst), the copy would stop at the boundary, putting the null character at the correct location.
Consider the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) {
    // pointers to characters.
    char *concat, *another;
    // an initialized string (compile-time).
    char astring[] = "A string of characters";
    // runtime allocation of memory.
    another = strdup(", plus a few more.");
    if (another == NULL) {
        return 1;
    }
    // calculate total space and allocate enough storage for concatenation.
    size_t total = strlen(astring) + strlen(another) + 1;
    concat = malloc(total * sizeof(char));
    if (concat == NULL) {
        free(another);
        return 1;
    }
    // perform concatenation.
    snprintf(concat, total, "%s%s", astring, another);
    // %p expects a pointer to void, hence the casts.
    printf("astring (%p) = %s\n", (void *)astring, astring);
    printf("another (%p) = %s\n", (void *)another, another);
    printf("concat (%p) = %s\n", (void *)concat, concat);
    // free everything we allocated at runtime.
    free(concat);
    free(another);
    return 0;
}
Example 3a: A rather detailed example of string management.
The reality of the code presented in Example 3a is that string management is memory management. Managing memory in C takes practice and patience. This is not like Java, where there is a garbage collector to clean up all the objects that have been neglected.
The program breaks down like this:
- Automatic storage is declared for the two pointers and for astring, which is initialized to a known literal.
- strdup() is used to allocate space for and assign a value to a string (another).
- The length of both strings plus a null character is calculated (total).
- malloc() is used to allocate the storage.
- snprintf() is used to perform the concatenation.
- The printf() calls display both the pointer and the data.
C functions for single-byte characters |
---|
int isalnum(int ch) Returns true if ch is alpha-numeric. |
int isalpha(int ch) Returns true if ch is alphabetic. |
int islower(int ch) Returns true if ch is lowercase. |
int isupper(int ch) Returns true if ch is uppercase. |
int isdigit(int ch) Returns true if ch is a digit. |
int isspace(int ch) Returns true if ch is a space character. |
int tolower(int ch) Returns ch converted to lowercase. |
int toupper(int ch) Returns ch converted to uppercase. |
Table 3b: A few functions from <ctype.h>.
The functions listed in Table 3b allow us to classify or modify characters. A bit of sample code is shown in Example 3b to illustrate that some return true or false while others return the altered value of the character itself. Since they all return an int, it may not be apparent initially.
#include <stdio.h>
#include <ctype.h>
int main(int argc, char *argv[]) {
printf("isalnum('A') = %d\n", isalnum('A'));
printf("isalpha('A') = %d\n", isalpha('A'));
printf("islower('A') = %d\n", islower('A'));
printf("isupper('A') = %d\n", isupper('A'));
printf("isdigit('A') = %d\n", isdigit('A'));
printf("isspace('A') = %d\n", isspace('A'));
printf("toupper('A') = %c\n", toupper('A'));
printf("tolower('A') = %c\n", tolower('A'));
return 0;
}
Example 3b: Several ctype.h functions used on the letter 'A'.
The output of Example 3b is shown below:
isalnum('A') = 1
isalpha('A') = 1
islower('A') = 0
isupper('A') = 1
isdigit('A') = 0
isspace('A') = 0
toupper('A') = A
tolower('A') = a
The values returned may differ depending on the platform. In every case, zero means false and non-zero means true.
C functions for converting strings to numeric values |
---|
double atof(const char* str) Converts the string pointed to by str into a double . Values out of range are undefined and 0.0 is returned if conversion cannot be performed |
int atoi( const char *str ) Converts the string pointed to by str into an integer. The value 0 is returned if the conversion cannot be performed. |
float strtof( const char *restrict str, char **restrict str_end ) Converts the string pointed to by str into a float . (The related strtod() and strtold() produce a double and a long double .) The value 0 is returned if the conversion cannot be performed. |
long strtol( const char *restrict str, char **restrict str_end, int base ) Converts the string pointed to by str into a long integer. The value 0 is returned if the conversion cannot be performed. |
Table 3c: A few functions from <stdlib.h>
.
Occasionally, you may have to convert a string to a numeric quantity suitable for use in an expression. This is briefly demonstrated in Example 3c.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
int i;
long l;
long long ll;
i = atoi ("37");
printf("%d\n",i);
l = atol("327875885868");
printf("%ld\n",l);
ll = atoll("8923764828768368282");
printf("%lld\n",ll);
i = atoi("0");
printf("%d\n",i);
i = atoi("garbage");
printf("%d\n",i);
return 0;
}
Example 3c: Demonstrating conversion functions.
37
327875885868
8923764828768368282
0
0
While helpful, these functions have some fundamental flaws addressed in the Safely Reading Numbers series that begins in this chapter.
Advanced Output With the printf() Function
We have seen some basic forms of printf() up to this point. It is time to delve a bit deeper into the power of printf().
[NOTE: The details of printf() are extensive. This is intended to be introductory. The tables in this section are not exhaustive and are intended to introduce you to the most common notations. You can find extensive details in the C11 and C17 standard documents, section 7.21.6.1.]
The printf() function is actually a family of functions: fprintf(), printf(), sprintf(). Respectively, these are used to write formatted output to files, to the default output device, and to strings. What is discussed here applies to the entire family of functions.
Consider the following example:
const double PI=3.14159;
char s[] = "My string.";
printf("Here is the output of our program.\n");
printf("The value of Pi is %.5f\n", PI);
printf("The value of s is \"%s\"\n", s);
The output of such looks like the following:
Here is the output of our program.
The value of Pi is 3.14159
The value of s is "My string."
The printf() function takes as its first argument a string. This string is also known as the format string.
Zero or more additional arguments may follow the format string. The format string is essentially a combination of literal text and sequences of placeholders, or format specifiers, that define conversions. The format specifiers are designed to hold a place in the format string, which will be substituted at runtime with a value appropriate to the conversion type.
A format specifier is denoted by the percent (%) sign to indicate that the following sequence of characters describes how the value should be displayed. If a format string contains no format specifiers, then all of the text contained in the format string is processed as if it were plain text. The cursor remains on the current line and does not move to the beginning of the following line unless an explicit newline (\n) is placed at the end of the format string. A format specifier has the following form:
%[flags][width][.precision][length]conversion
All of the bracketed values are optional, and their values depend on the conversion selected.
Conversion | Description | hh | h | (none) | l | ll | j | z | t | L |
---|---|---|---|---|---|---|---|---|---|---|
% | Writes a literal %. | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
c | Writes a single character. | N/A | N/A | int | wint_t | N/A | N/A | N/A | N/A | N/A |
s | Writes a character string. If a precision is specified, at most that many characters are written. | N/A | N/A | char* | wchar_t* | N/A | N/A | N/A | N/A | N/A |
d | Writes a signed integer in decimal. | signed char | short | int | long | long long | intmax_t | signed size_t | ptrdiff_t | N/A |
o | Writes an unsigned integer in octal. | unsigned char | unsigned short | unsigned int | unsigned long | unsigned long long | uintmax_t | size_t | unsigned ptrdiff_t | N/A |
u | Writes an unsigned integer in decimal. | unsigned char | unsigned short | unsigned int | unsigned long | unsigned long long | uintmax_t | size_t | unsigned ptrdiff_t | N/A |
x, X | Writes an unsigned integer in hexadecimal. | unsigned char | unsigned short | unsigned int | unsigned long | unsigned long long | uintmax_t | size_t | unsigned ptrdiff_t | N/A |
a, A | Writes a floating-point number in hexadecimal exponent notation. The f, e, and g conversions used elsewhere in this chapter take the same argument types. | N/A | N/A | double | double | N/A | N/A | N/A | N/A | long double |
p | Writes an implementation-defined representation of a pointer. | N/A | N/A | void* | N/A | N/A | N/A | N/A | N/A | N/A |
Table 4a: A table of data types for the arguments of available conversions. The columns hh through L are the length modifiers.
While Table 4a provides detail on how to construct your conversion, Table 4b provides some common flags you may want to incorporate into your format strings.
Flag | Meaning |
---|---|
- | The result is left justified. |
+ | The result will always have a sign of – or +. The default behavior is that only negative values have a sign. |
0 | The result is padded on the left with leading zeroes. |
Table 4b: Some flag values for printf() format specifiers.
Some typical format specifiers are shown in Table 4c. One of the nice features of the printf() function is the ability to format a line into fields of specific sizes to align columnar data. By default, values are right-justified. If the width is wider than the value to be printed, the value will be padded on the left with spaces. If the width is smaller than the value displayed, the field will be widened to fit the value.
Format Specifier | Resultant Output |
---|---|
%-30s | Left justified, space padded, string data that occupies 30 character columns. |
%c | Single character data. |
%e | Floating point data in scientific “e” notation. |
%.3f | Floating point data as a decimal value printed to 3 decimal places. No specific width is provided. |
%10.3f | Floating point data as a decimal value printed as 10 character columns to 3 decimal places. The decimal point and fractional portion consume 4 columns. The remaining 6 are for the whole number value, padded on the left with spaces as needed. |
%10d | Right justified, space padded, decimal integer that occupies 10 character columns. (No precision) |
%03d | Zero padded, decimal integer that occupies 3 character columns. (No precision) |
%% | Percent sign. |
Table 4c: Examples of some simple printf() format specifiers.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    char s[] = "Some string!";
    int x = 32, l;
    const double pi = 3.14159;
    // column ruler: tens digit on the first line, ones digit on the second.
    printf("         111111111122222222223\n");
    printf("123456789012345678901234567890\n");
    printf("%10d\n", x);
    printf("%-10d\n", x);
    printf("%30s\n", s);
    printf("%-30s\n", s);
    printf("%f\n", pi);
    printf("%10.4f\n", pi);
    printf("%.4f\n", pi);
    l = (int)strlen(s);
    printf("\nThe length of \"%s\" is %d\n", s, l);
    printf("The last char of \"%s\" is '%c'\n", s, s[l-1]);
    return 0;
}
Example 4: Program demonstrating the use of the printf() function.
The program in Example 4 produces the formatted output below.
         111111111122222222223
123456789012345678901234567890
        32
32
                  Some string!
Some string!
3.141590
    3.1416
3.1416

The length of "Some string!" is 12
The last char of "Some string!" is '!'
Input With the scanf() and fgets() Functions
The functions of Table 5 are a list of typically used input tools. These are still considered to be safe to use.
C functions for input |
---|
int fgetc(FILE *stream) Reads a character from the input stream . Returns the character as unsigned char converted to int or EOF if the end of file or error occurred. |
int fputc(int ch, FILE *stream) Writes a character, ch , to stream . Returns the character written or EOF if an error occurs. |
char *fgets( char *restrict str, int count, FILE *restrict stream ) Reads characters from stream into the char array pointed to by str , up to count -1 characters or through the first newline. Returns str , or a null pointer if an error occurred or the end of file was reached before any characters were read. End of file can be detected with feof() . |
int fscanf(FILE *restrict stream, const char *restrict format, ...) Read input based on format from input stream . (The details of how this works are covered later in this chapter.) |
int scanf(const char *restrict format, ...) Same as fscanf using stdin . (The details of how this works are covered later in this chapter.) |
Table 5: Some of the input functions from <stdio.h>.
We are going to focus on the details of scanf() and fgets() in this section.
Avoid mixing fgets() and scanf(). Pick one of them and stick with it. This is mostly to do with the newline character and how scanf() handles whitespace. Whitespace will be captured, consumed, or ignored depending on the circumstances, and you may have unwanted data left in the input stream for the next call.
What is needed? | scanf() | fgets() |
---|---|---|
Read text from user. | YES | YES |
Read text with whitespace preserved. | NO | YES |
Read a single character. | YES | NO |
Read numeric values. | YES | NO * |
Table 6: Choosing between scanf() and fgets().
Using scanf()
We will begin the discussion of scanf() by pointing out the similarities to printf(). The majority of the conversions in scanf() operate along the same lines as printf(). This is not to say they are identical. Instead, if you are familiar with the conversions of printf(), working with scanf() will feel familiar.
The most important thing to mention about scanf() is that it is not intended to be used for user input. Instead, it is a parsing tool.
“I need you to take the garbage out.”
When we hear the above statement, we recognize that the person we are conversing with needs us to do something. Specifically, they are asking us to take out the garbage. We will not go into the subject, verb, object, or preposition discussion, but you get the idea.
Since scanf() is intended to be a parser, it is much more complex than meets the eye. There is no such thing as simply reading an integer, getting someone’s name, or getting their hourly rate. These are all meanings we apply to the data once we have it.
Getting that data can be tricky, and for now, we will abandon the need for validity checking, error recovery, and such. We will first assume a perfect world where everyone follows directions.
You are reading from a stream. Yes, it may be a keyboard or a file, but ultimately it is a stream. That stream contains characters that need to be converted into something more meaningful.
Consider the following:
int x;
scanf("%d", &x);
Sample code to read an int.
Here is a bulleted list of what is happening in this code:
- This code declares an int variable.
- scanf() will parse the stdin input stream looking for an int to satisfy the conversion of "%d".
- This conversion may not be successful, in which case nothing is stored in x and scanf() reports the failure through its return value.
- If successful, it will use the address of x, represented as &x, to store the result as an int.
You may have noticed in previous sections the use of an ampersand, &, with scanf(). The address-of operator (&) is required when using primitive types. We use the name of a primitive variable to represent the contents of the variable, not where it lives in memory. Hence, we must have the means to express where things live in memory.
This is not necessary with strings. Why? Because the name of a string does not represent the value. Instead, it means the start of memory where the value exists. We have said that strings are arrays of characters, but they are also referred to as pointers to characters.
Consider:
char name[30];
scanf("%29s", name);
Sample code to read a string.
This allows us to read a name. Theoretically. (Yeah, this is where things get interesting…)
Now we present a bulleted list of what is happening in this code:
- name is a string (an array of 30 elements of type char).
- Properly formed strings must be terminated by a null character.
- The format specifier "%29s" defines a field width to guarantee we do not exceed the size of name. This helps to make sure we stay within bounds and the null character will be available – no matter what happens.
- The variable name represents the memory location where the characters will be placed. Note there is no & operator.
- The call to scanf() will end at the first occurrence of whitespace. The %s conversion does not capture whitespace in strings.
The scanf()
function uses the format string in the following way:
- Non-whitespace characters, except %, will be matched exactly. Any mismatched characters will cause the function call to fail.
- Any single whitespace character will cause the function to consume all consecutive whitespace characters from the input.
- Conversions take the input text and parse it to the desired form.
An extensive list of all the conversions supported by scanf()
can be found here. For now, we will provide a limited list as this is intended to be introductory.
Similar to printf()
, conversions have the following form
%[*][width][length]conversion
Following the leading %
, you may optionally use the assignment-suppression character, which is the asterisk (*
). This allows you to indicate the data will be in the stream, but we are not interested in storing it for later use.
The width
is optional and defines the maximum number of characters consumed from the input stream to match that conversion.
In the table below, the columns hh through L are the length modifiers. Each cell gives the argument type for that conversion and length modifier.
Conversion | Description | hh | h | (none) | l | ll | j | z | t | L |
---|---|---|---|---|---|---|---|---|---|---|
% | Match literal %. | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
c | Matches a character or sequence of characters. If the width is specified, will match width characters. There is no null character appended in either case. | N/A | N/A | char* | wchar_t* | N/A | N/A | N/A | N/A | N/A |
s | Matches a sequence of non-whitespace characters. If the width is specified, it will match up to the width or the first whitespace character. A null is always provided. | N/A | N/A | char* | wchar_t* | N/A | N/A | N/A | N/A | N/A |
[set] | Matches a non-empty sequence of characters from set. If the first character is ^, characters not in the set are matched. If the width is specified, it will match only up to the width. A null is always provided. | N/A | N/A | char* | wchar_t* | N/A | N/A | N/A | N/A | N/A |
d | Matches a decimal integer, same as strtol() with base of 10. | [un]signed char* | [un]signed short* | [un]signed int* | [un]signed long* | [un]signed long long* | [un]intmax_t* | size_t* | ptrdiff_t* | N/A |
i | Matches an integer, same as strtol(), with a base of 0, meaning the base is determined by the characters in the sequence. | [un]signed char* | [un]signed short* | [un]signed int* | [un]signed long* | [un]signed long long* | [un]intmax_t* | size_t* | ptrdiff_t* | N/A |
u | Matches an unsigned decimal integer, same as strtoul() with base of 10. | [un]signed char* | [un]signed short* | [un]signed int* | [un]signed long* | [un]signed long long* | [un]intmax_t* | size_t* | ptrdiff_t* | N/A |
o | Matches an octal integer, same as strtoul() with base of 8. | [un]signed char* | [un]signed short* | [un]signed int* | [un]signed long* | [un]signed long long* | [un]intmax_t* | size_t* | ptrdiff_t* | N/A |
x, X | Matches an unsigned hexadecimal integer, same as strtoul() with base of 16. | [un]signed char* | [un]signed short* | [un]signed int* | [un]signed long* | [un]signed long long* | [un]intmax_t* | size_t* | ptrdiff_t* | N/A |
n | Returns the number of characters consumed so far. No input is consumed and does not increment the assignment count. | [un]signed char* | [un]signed short* | [un]signed int* | [un]signed long* | [un]signed long long* | [un]intmax_t* | size_t* | ptrdiff_t* | N/A |
a, A | Matches a floating-point number, same as strtof(). | N/A | N/A | float* | double* | N/A | N/A | N/A | N/A | long double* |
p | Matches the implementation-defined representation of a pointer. This format should be the same as what printf() would produce for %p. | N/A | N/A | void** | N/A | N/A | N/A | N/A | N/A | N/A |
Table 7a: A table of data types for the arguments of available conversions.
Now, we will show a bunch of fully formed format specifiers and describe their purpose. Unlike printf(), the scanf() conversions accept no flags (such as - or 0) and no precision (such as .3). The only number allowed is the maximum field width.
Format Specifier | Intended Parsing |
---|---|
%30s | String data of at most 30 non-whitespace characters. The destination array needs at least 31 elements to leave room for the null character. |
%c | Single character data. |
%e | Floating point data in decimal or scientific “e” notation, stored in a float. |
%f | Floating point data, parsed the same way as %e. |
%10d | Decimal integer read from at most 10 input characters. |
%3d | Decimal integer read from at most 3 input characters. |
%% | Percent sign. |
Table 7b: Examples of some simple scanf() format specifiers.
The scanf() function is NOT intended to be used for user input. That said, it is often used in that capacity anyway. The scanf() function can be dangerous. However, with practice, you can use it quite safely as long as you follow some very simple rules found in this chapter.
Using fgets()
The fgets()
function is probably the most straightforward method for reading strings and has only a few rules that need to be followed.
Using fgets() allows us to read characters from the stream, including whitespace. The character array will be populated until a newline is read, which will be included, or until count-1 characters have been stored, where count is the size passed as the second argument. On success, the character array is guaranteed to be null-terminated. The fgets() function is safe when the bounds are accurate.
Here is a simple idiomatic example of fgets()
.
char name[30];
printf("Please enter your name: ");
fgets(name, sizeof name, stdin);
Predefined storage is used, and we use the sizeof
operator to measure that space. This way, fgets()
knows the very limit and can guarantee the null character is written to the character array.
Remember, the newline will be captured and provided in the character array, assuming we have not exceeded the character buffer size. Should you not want this, you will need to remove it with
name[strcspn(name, "\n")] = 0;
We are using strcspn(), declared in <string.h>, to get the length of the leading part of the string that does not contain a newline. This will either be the position of the newline or the position of the null character. Either way, we unconditionally set that position to zero, which will remove the newline from the end if one is present.
Mixing fgets()
With scanf()
Earlier, we said not to mix scanf()
with fgets()
. You can mix them if you understand the side effects of mixing them.
Remember that scanf()
uses whitespace to mark boundaries while fgets()
can capture whitespace as part of the input, including newlines, and it will be in the result string. These are at odds with each other.
You may be able to compensate for extra text in the stream, but it will depend mainly on what the text is and whether we can predictably move on. Consider the code from Example 2.
If we take
int n;
char name[30];
printf("Please enter your name: ");
fgets(name, sizeof name, stdin);
printf("Please enter a whole number: ");
scanf("%d", &n);
and flip it to be
printf("Please enter a whole number: ");
scanf("%d", &n);
printf("Please enter your name: ");
fgets(name, 30, stdin);
We quickly see that we cannot enter our name because the newline, after consuming the number, remains in the input stream, and fgets()
is designed to read until it finds one! So, it found one and said, “I’m done!”
Technically, we can fix this. More to the point, we can compensate.
printf("Please enter a whole number: ");
scanf("%d", &n);
// clear out the input stream up to newline
scanf("%*[^\n]");
// consume newline
scanf("%*c");
printf("Please enter your name: ");
fgets(name, 30, stdin);
Remember that the * in the conversion means we suppress assigning the value to a variable. This solution requires two additional and separate calls to scanf() because a combined format, "%*[^\n]%*c", does not work on good data. When the user types only the number, nothing but the newline remains in the stream. The %*[^\n] conversion must match at least one character that is not a newline, so it fails, and scanf() stops before it ever reaches %*c.
Can we do something else? Well, yes. We could do something like
char number[30]; // buffer for the typed number (size chosen for illustration)
printf("Please enter a whole number: ");
fgets(number, sizeof number, stdin);
sscanf(number, "%d", &n);
printf("Please enter your name: ");
fgets(name, 30, stdin);
This is equally horrible because there is one detail we have neglected while trying to get these reads to work: We are not handling lousy input! And, the scanf()
family of functions does not do a good job of handling error situations.
Safely Reading Numbers – Part I
Well-written, secure, and resilient C code takes time to get right. There are many circumstances surrounding this statement, many of which have to do with limited language options, a primitive set of functions, and a desire to stay within the boundaries of the C standard. We are not venturing into POSIX, GNU, or Microsoft libraries. Yes, they have functions that may get the job done more quickly, but they are not always portable between compilers and platforms. They may also have differences in their implementation between standards, compilers, and runtime libraries.
With this in mind, we will build our first iteration of writing well-developed code with error checking. Each iteration for the next few chapters will build on this base model.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
long a;
char buf[300];
char *p;
printf("enter a number: ");
fgets(buf, sizeof buf, stdin);
// have some input, convert it to integer:
a = strtol(buf, &p, 10);
printf("You entered %ld\n", a);
return 0;
}
Example 5: Preparing to build more resilient code.
The code presented in Example 5 takes into consideration that scanf()
is insufficient for the task of writing sufficiently error-checked code.
The issue with scanf()
is that it will consume all leading whitespace. So, we will wait until the user enters something since pressing the return key does not cause the function to move on. The fgets()
function preserves whitespace and will move on when the user presses return.
With that detail in mind, we have created a sufficiently large buffer, a pointer to characters, and a variable to hold the result. We are not doing anything with p
at the moment. This is some setup for the next iteration. It will confirm we consumed the entire input during the conversion from string to long
.
The last detail is that we are using strtol()
instead of atol()
. This is due to the limitations of atol()
error handling. We only get a zero returned if there is an error with no other indicator. What if the user entered zero? Also, an out-of-range error cannot be detected since this is undefined behavior according to the standard. With strtol()
, we can detect all the possible errors that can occur.
Random Numbers
When creating games or simulations, fixed details alone are rarely enough to lead to whatever discovery is needed. We often have to ask a lot of what-if type questions. In other words, rather than using only fixed, known quantities, we allow the system to operate with a little more entropy or randomness.
From a gaming perspective, the idea of randomness is essential. Imagine a world where random monster encounters are likely to occur. Or perhaps there is only a certain percentage chance that opponents can see each other or a possibility that a magical spell could fail if a monster makes a saving throw.
As a result, we could identify many rules of how encounters can happen but leave it to a dice roll to determine if it will happen. We will begin with a simple 6-sided die roll in Example 6a.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(int argc, char *argv[]) {
int r;
srand(time(NULL));
r = rand() % 6 + 1;
printf("r = %d\n", r);
printf("RAND_MAX = %d\n", RAND_MAX);
}
Example 6a: Generating random numbers.
We must discuss how random numbers are generated to understand the above statements. The numbers that will be generated for our program through the rand()
function are known as pseudo-random. Hence, you will often hear the algorithms that produce them called pseudo-random number generators (PRNG). True randomness comes from entropy. Entropy is a measure of unpredictability. Unpredictability is essential to guaranteeing fairness, distribution, and security. A computer system generally will collect entropy from the work it performs (keyboard, mouse, disk I/O, thread execution, and others) and use that unpredictability of devices to achieve randomness.
We begin by seeding
the PRNG. The call to the srand()
function handles this. The generally accepted method is to pass the current time as time(NULL)
to srand
. The PRNG uses a mathematical series calculation to generate number after number as the program needs. The seed starts the series at a given point. The rand()
function then returns a new random, positive integer value with each call.
In Example 6a, the call to srand() seeds our PRNG. The statement that follows it generates the random number. The call to rand() simply returns a number. We then use that number with the modulus operator to get our random number into a range.
So, why do we add one? We are adding one simply because we must do it to have the proper range of values for this specific example. Since the modulus operator returns the remainder of the division by n, it can only result in the range 0 to n-1. When the divisor is 6, the range of whole numbers for modulus without adding one is 0-5. When we add one, we add one to both sides of the range, yielding 1-6. Now that is a proper 6-sided die roll.
To develop the details further, consider the following piece of code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(int argc, char *argv[]) {
int r;
srand(time(NULL));
r = rand();
printf("r is %d\n\n", r);
printf("d4 = %d\n", r % 4 + 1);
printf("d6 = %d\n", r % 6 + 1);
printf("d8 = %d\n", r % 8 + 1);
printf("d10 = %d\n", r % 10 + 1);
printf("d12 = %d\n", r % 12 + 1);
printf("d20 = %d\n", r % 20 + 1);
}
Example 6b: Generating random dice rolls.
This code simulates different dice rolls (d4, d6, d8, d10, d12, and d20) using the same single random value. Some sample output is shown below:
r is 1742968586

d4 = 3
d6 = 3
d8 = 3
d10 = 7
d12 = 3
d20 = 7
Seeding and Predictability
The srand()
function need only be called once. You should normally use a different seed for each run, which is why we use the time: it constantly moves forward. The same series can be generated repeatedly if you use the same seed. This is demonstrated by Example 6c.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(int argc, char *argv[]) {
int r;
srand(37);
r = rand();
printf("r is %d\n\n", r);
printf("d4 = %d\n",r % 4 + 1);
printf("d6 = %d\n",r % 6 + 1);
printf("d8 = %d\n",r % 8 + 1);
printf("d10 = %d\n",r % 10 + 1);
printf("d12 = %d\n",r % 12 + 1);
printf("d20 = %d\n",r % 20 + 1);
}
Example 6c: Using the same seed.
r is 621859

d4 = 4
d6 = 2
d8 = 4
d10 = 10
d12 = 8
d20 = 20
Additional Range Options
Now, let us say that you want to get a random number in a range that is not 0 to something or 1 to something. How about a number in the range of 45 to 92? This can be easily achieved.
Remember that we already know how to get certain ranges. If we want 0-5, we take the modulus by 6 (see above). If we want 0-9, we take the modulus by 10. So the generic rule is if we take the modulus by n, we get the range 0 to n-1. But in the die roll example, we added 1. Yes. Yes, we did. And the secret lies therein – we can add another value to alter the range.
So, 45 to 92 inclusive? No problem. Take a look at the math below.
 45    92
-45   -45
---   ---
  0    47
Since we know how to do ranges that start with 0, we will take the lower bound value and subtract it from both ends of the range. Remember that when we added 1 to the calculation earlier, we added 1 to both ends.
By subtracting 45 from both sides we get the range 0 to 47. If this represents 0 to n-1, then n is 48. Now we consider:
x = rand() % 48;
However, we need to add back what we took away that got us to that zero-based range.
x = rand() % 48 + 45;
Quiz
Exercises
- (Beginner) In which header files would you find the following functions?
- strchr()
- puts()
- scanf()
- snprintf()
- strtol()
- fgets()
- strlen()
- isupper()
- tolower()
- printf()
- println()
- (Intermediate) Write statements using rand() for the following ranges:
- 2 to 6
- 3 to 9
- 30 to 55
- 59 to 65
- (Intermediate) Using the strchr() string function, write statements to determine the location of ‘d’ in the following strings (you could also look for the ‘+’):
- d6
- 3d6
- 10d4+4