(Updated March 25, 2023)
Table of contents
Programming Environment
The C Language
The C runtime
The C Standard(s)
Final Thoughts About This Book
Quiz
Exercises
Programming Environment
The C programming language makes it easy to develop programs using a simple and modest set of language constructs. C is close to, but not strictly, a subset of C++: C++ began as an extension of C, and C# borrows much of its syntax from the C family. A history of the C language and its many standards can be found at the end of this chapter. A simple editor and a C compiler are all you need to begin writing programs in C. Compared with some other programming languages and their respective environments, this is near the top of the list for simplicity. However, any project can become complex once it has enough moving parts.
C is a small language, but it is not an easy one. Many caveats and pitfalls await the unwary traveler. C is not for the faint of heart. C will get you close to the hardware, which is very powerful. And with great power comes great responsibility. You are responsible for all of your actions. There are few safety nets.
This language was developed at a time before the likes of object-oriented programming, large standard libraries, or garbage collection. The fact that it is still in use today is a testimony to its strength and power.
C also grew up when memory and storage were valuable commodities, so minimalist naming conventions were commonplace. Note the names of the functions in the standard library; rarely do any of them exceed six characters. Memory and storage are abundant now compared to the infancy and teen years of computing, so it seems reasonable to set aside the post-depression model of minimalism. While we should respect the use of resources, we should always aim for the most straightforward solutions, and the naming conventions for variables and functions could stand a bit of an overhaul. As such, camel-cased variable and function names will be prominent throughout. The use of meaningful identifier names is strongly encouraged, and if that means taking a naming convention from the likes of Java and other languages, then so be it.
Language Translation
The compiler converts source code (a human-readable form of the language) into object code. The object code file is mostly housekeeping items and machine code: CPU instructions, references to memory locations, calls to library objects, and such. The calls to library objects are usually calls to functions that perform specific tasks like printing output, reading from files, or manipulating strings. However, the compiler does not know where these objects live, only that they exist because we have referenced them.
After the compiler successfully translates the source code, the object code is typically passed to a linker. The compiler could do this automatically if the compiler were instructed to produce an executable file. The linker is responsible for cleaning up all the loose ends from the compiler by connecting any remaining unknown references. Again, the compiler only knows that external library functions exist. The linker knows how to connect the object code with the external libraries to produce a fully functional program.
The program produced can only be executed on the platform for which it was compiled. A program compiled for Windows cannot be executed on Linux or a Mac; it can only be run on a Windows platform. Some languages like Java are compiled into bytecode and run in a virtual machine with a run-time environment that allows the programs to cross platforms. This means Java programs can be compiled on a Mac and theoretically run on a Windows or Linux machine. However, such platform-independent implementations can sometimes be slower than those that are platform-dependent. So, it is often a tradeoff between speed and portability.
Figure 1: The path of a C source file to the executable program.
A language like C intends to get as close to the hardware as possible. Closer proximity to the hardware means we hope to remove as much latency as we can. The more latency you have, the slower the program will run. It could be said that we are sacrificing portability for speed. This would be a fair assessment.
The desire to write purpose-driven software tied closely to hardware is not difficult to imagine. The Linux operating system is written in C and assembly language. The device driver software that controls components such as video, network adapters, and disk controllers is generally written in C.
As mentioned earlier, you only need an editor and a compiler to begin programming in C. There are also IDEs (Integrated Development Environments) that can help you as you are programming. Tools like Eclipse, GNAT, Visual Studio Code, CodeRunner, CLion, and many others, some free and some commercial, can help you write and manage your code while providing hints, library information, and contextual help.
The C Language
The obligatory “Hello world” example is shown below. If for no other reason, this is an excellent place to begin showing C’s basic structure.
```c
#include <stdio.h>

int main(void) {
    printf("Hello world!\n");
    return 0;
}
```
Example 1a: Program with a simple main function accepting no arguments.
This is also sometimes seen as
```c
#include <stdio.h>

int main(int argc, char *argv[]) {
    printf("Hello world!\n");
    return 0;
}
```
Example 1b: Program with the main function that allows it to accept arguments.
As a quick overview, this program has the following components.
- Line 1: Preprocessing directive to load a header file for standard I/O.
- Line 3: Declaration and beginning of the `main` function.
- Line 4: A `printf` statement. (A statement that invokes the `printf` function.)
- Line 5: A `return` statement to indicate success.
- Line 6: The end of the `main` function.
One of the first things you may notice is the sparseness of the language. It is very lightweight and uncomplicated.
The program begins with `#include`, which is a C preprocessing directive. There are several of these directives in the language. The one noted here indicates the need to load a specific header file. Header files tell the compiler about external libraries. Specifically, the compiler learns about functions or named values you may use in your programs. The compiler can also use this information to detect when you are misusing these functions.
In this example, it is the `printf` function, which is declared in `stdio.h`. It is common to see several `#include` directives at the top of C programs requiring multiple library functions. By convention, `#include` directives appear at the top of the file, although the language permits them elsewhere. Header files are also used to group functions with a related purpose, such as math functions, I/O functions, and string handling functions. Some of the more common header files are noted in Table 1.
Header File | Purpose |
---|---|
`stdio.h` | Basic I/O functions: `printf`, `fprintf`, `fgetc`, `fgets`, `fopen`, `fclose` |
`stdlib.h` | Utility functions: `bsearch`, `qsort`, `atoi`, `srand`, `rand`, `malloc`, `free`, `abort`, `exit` |
`string.h` | String manipulation functions: `strcat`, `strcpy`, `strcmp`, `memmove`, `memcmp`, `memcpy` |
`math.h` | Math functions: `sin`, `cos`, `pow`, `sqrt`, `ceil`, `floor` |
`ctype.h` | Character management functions: `isalpha`, `isdigit`, `isspace`, `isupper`, `tolower`, `toupper` |
Table 1: Common header files.
It is not necessary to memorize these header files and their specific functions. This is intended only to indicate where you can find more information about the standard library.
On line 3, we have the `main` function. This is the entry point of our program. In other words, we need a place to start, and the `main` function provides it. Regardless of the number of functions we use or create, there will be no mistaking where the execution of program code will begin: the `main` function.
The `main` function may or may not accept arguments. This is shown through the variation of line 3 in the two examples.
The `printf` function allows us to send output to the display device (more precisely, to standard output). Our message is simple: say hello! All we will say about `printf` at the moment is that it is a mighty function for displaying output.
The C Runtime
C programs are executed in a runtime environment. It is composed mostly of a series of files that help the program run on the platform for which it was compiled.
Linux
On a Linux platform, you may know of the GNU Compiler Collection (`gcc` for C, `g++` for C++, etc.), the GNU C Library (`glibc` or simply `libc`), the linker (`ld`), and the loader (`ld.so`). Together these tools help to create executable programs from C source code files and get them loaded into memory with the necessary shared libraries to make them complete.
Mac
Apple has a similar environment for C, but the particulars are a bit different. You can read more about Mach-O, the executable format the Mac uses in place of Linux's ELF, in Medium’s article about Mach-O.
You can also learn more about the Clang tools and C compiler for Mac available in Xcode. If you are looking for a quick start, an IDE and an invocation of `xcode-select --install` will get you started.
Windows
The Windows environment is slightly different and often a little more extensive to set up.
You will need a C compiler. You can try Mingw-w64, which is recommended for use with the Code::Blocks IDE and the Visual Studio Code C/C++ Extension.
The Code::Blocks C/C++ and Fortran IDE is relatively simple to set up and will automatically detect your compiler setup after installation. This IDE is relatively easy to navigate and has a smaller learning curve than many IDEs.
You can get the Visual Studio Code software and add in the C/C++ Extension. This can be a complex operating environment and may not be a good fit for beginners.
The C Standard(s)
Dennis Ritchie developed the C programming language at Bell Labs from the late 60s to the early 70s, and he and Brian Kernighan later described it in their book, The C Programming Language. As the language gained popularity, compilers began to disagree about how it should be translated, and it was determined that C should be standardized. In this way, every compiler would interpret the language the same way, producing consistent expectations at runtime.
The first official standard became known as the ANSI C Standard, named after the body responsible for the standard, the American National Standards Institute. This separated it from the original, informal definition of the language, historically known as K&R C.
The ANSI standard became known as the C89 standard for the year it was ratified. At around the same time, ISO (the International Organization for Standardization) adopted the ANSI standard, reformatted it, and ratified it in 1990. This standard is known as C90. The C89 and C90 standards are functionally identical. This release essentially codified existing programming practices, introduced new features, and took `const` and function prototyping from C++.
In 1995, ISO released an amendment to the C90 standard, commonly called C95. This added digraphs and improved the language’s wide/multi-byte character support. It is unclear how widely this particular version was adopted, as far less has been written about it than about C89 and the subsequent C99.
In 1999, ISO ratified the C99 standard. This standard introduced a wealth of new capabilities for the C language and took more ideas from C++, including `inline`, the `//` comment form, declarations mixed with code (previously, these had to be listed before any code statements), and declarations in the initialization clause of the `for` loop. A big piece was the removal of implicit function declarations and of the implicit `int` that was assumed as the return type when none was specified and as the type of undeclared parameters.
The next official standard was not released until 2011 and was aptly called C11. This included removing `gets` from the standard library, as it was a source of poorly written code prone to terrible defects. Also introduced were threads (not the same as POSIX threads, or pthreads, but functionally similar), a host of new reserved words, and type-generic expressions.
In June 2018, ISO ratified the C17 (arguably the C18) standard, which resolved many defects in the C11 standard. These were listed as 54 specific DRs (defect reports). No significant changes to the C language occurred during this standard release. Compilers conforming to C17 have `__STDC_VERSION__` set to the value `201710L`.
The C2X (C23) standard is in draft mode at the time of this writing. It is expected to resolve further defect reports filed since C11 and possibly to add some new reserved words and library functions.
Final Thoughts About This Book
C is an old language. Some would argue it has had its time and should step aside. Yet the Linux kernel, for example, is over 98% C code and is not likely to have a massive overhaul to a new language soon.
C works.
What needs to change is the use of some of the library routines that have historically been proven to contribute to poorly written and error-prone code that leads to runtime errors, buffer overflows, and security issues.
This book attempts to deal with these situations by simply replacing known bad habits with arguably better ones. We say arguably because everything is subject to interpretation, and many may disagree with the choices made in this text. Ultimately, the goal is to advise against using functions and idioms known to produce harmful code while encouraging the use of those described herein.
Your mileage may vary, but you will write better, albeit longer, code.
Quiz
Exercises
- Install a C runtime environment and successfully run the `ex1a.c` and `ex1b.c` programs above.
- Research the history of C and the differences between K&R C, (the original) ANSI C, and the ISO standards like C11 and C17.
- Research the details of the sister language C++ to get a sense of some of the differences and similarities.