Programming by Design

CISS-150 Project 3 – The C environment

Posted on April 8, 2013 (updated February 24, 2025) by William Jojo

CISS-150 Project 3 (10 points)

(Updated February 24, 2025)

Overview

This project will introduce you to the many layers of memory management. It is for the layperson and is not intended to be exhaustive.

Memory is a commodity that needs to be efficiently managed. To understand why, we will investigate how the C programming environment organizes its memory. Ubuntu Linux will be the basis of this project.


Learning outcomes

  • Planning and design.
  • Enhancing existing virtualization/OS knowledge.
  • Operating in a new programming language.
  • Understanding basic Linux memory management principles.

Setting Up the Linux System

Begin by installing the C build environment. Run the following:

student@student-vm:~$ sudo apt update
student@student-vm:~$ sudo apt install build-essential

Once the install is complete, the C compiler is available. You will need it to compile the C program later in this project.

While several additional packages are installed during this step, the compiler is the only one we need.


Memory Layout

The operating system executes a program by allocating an amount of memory and loading the program into that memory. Linux uses the ELF (Executable and Linkable Format) file layout to define the components of a program and how they should be loaded into memory. You can use the following command to read more about ELF.

student@student-vm:~$ man elf

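About 400 lines into the man page, you will see details regarding the sections. (ELF distinguishes sections, which are used when linking, from segments, which are used when loading, but this project uses the two terms loosely.) We are specifically interested in the ones noted in the diagram below.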

        Highest Memory Location

            |-----------|
            |   argv    |  Command line arguments and
            |   env     |  system environment variables.
            |-----------|
            |   Stack   |  Functions and automatic variables
            |...........|
            |     |     |  The stack "grows" toward lower memory
            |     |     |  as space is allocated by pushing values
            |     v     |  and subtracting space from RSP for local
            |           |  variables.
            |           |
            |     ^     |
            |     |     |  The heap "grows" toward higher memory
            |     |     |  as space is allocated by tools like malloc.
            |...........|
            |    Heap   | Memory allocated at runtime (malloc)
   _end --> |-----------| <-- program break
            |   .bss    | Uninitialized data - globals and static. (BSS)
 _edata --> |-----------|
            |   .data   | Initialized data - globals and static. (DS)
 _etext --> |-----------|
            |   .text   | Code (TEXT, Code Segment)
 _init  --> |-----------|

         Lowest Memory Location

As we will discover, a program can be loaded anywhere in memory, so part of the design is to make sections of the program relocatable. When viewing the memory details of an ELF file, it is essential to realize that the addresses shown are offsets from wherever the program ends up being loaded.

Although the executable format allows many possible segments, C generally uses only a few. The compiler can add many others, but the layout above is enough for this project.

Section Name: Purpose
.text (Text): Also called the text segment, text section, or code segment. This is where the bulk of the program code lives. Its boundaries are typically marked by _init and _etext.
.data (Data): Also called the data segment or DS. This is where global and static data that have been given initial values live. Its upper boundary is marked by _edata.
.bss (BSS): Rarely known by any name other than BSS or .bss. This is where uninitialized global and static data live; they are set to zero at runtime. Its upper boundary is marked by _end.
Stack: This is where the runtime stack lives, holding function call return addresses and automatic (local to a function) variables. As stack memory is consumed, the stack grows toward lower memory, in the direction of the heap.
Heap: This is where runtime memory requests are satisfied. As heap memory is consumed, the heap grows toward higher memory, in the direction of the stack. This also moves the program break (see sbrk()), which identifies the current upper bound of the heap. The heap is located after the end of the last data segment, typically .bss. The value of

Now, the use of these sections is system-dependent. Although the documentation describes their intended use, the environment may diverge from it.


The C Code

Using the gedit command, create the following file. (Remember that you can open this page in Ubuntu and then copy and paste it from within your VM.)

cmem.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern void *_etext, *etext, *edata, *end;

int gd;
int igd = 42;

void mapstack(int iter) {
  int msvar;

  if ( iter < 6 ) {
    printf("iter   = %d\n&iter  = %p\n&msvar = %p\n", iter, &iter, &msvar);
    mapstack(iter + 1);
  }
}

int main(int argc, char **argv) {

  int d1;
  int d2 = 35;
  static int sd;
  static int isd = 15;
  char s[5];
  char *p1;
  char *p2;

  printf("The address of break   is %p\n", sbrk(0));
  printf("The address of etext   is %p\n", &etext);
  printf("The address of _etext  is %p\n", &_etext);
  printf("The address of edata   is %p\n", &edata);
  printf("The address of end     is %p\n\n", &end);

  printf("The address of main    is %p\n", main);
  printf("The address of gd      is %p\n", &gd);
  printf("The address of igd     is %p\n", &igd);
  printf("The address of sd      is %p\n", &sd);
  printf("The address of isd     is %p\n", &isd);
  printf("The address of d1      is %p\n", &d1);
  printf("The address of d2      is %p\n", &d2);
  printf("The address of s       is %p\n", &s);

  p1 = malloc(20 * sizeof(char));
  printf("\n\nThe address of p1      is %p\n", &p1);
  printf("The address in p1      is %p\n", p1);
  p2 = malloc(20 * sizeof(char));
  printf("The address of p2      is %p\n", &p2);
  printf("The address in p2      is %p\n", p2);
  printf("The address of break   is %p\n\n\n", sbrk(0));

  printf("The address of argv    is %p\n", argv);
  if ( argc > 0 )
    printf("The address of argv[0] is %p\n", argv[0]);

  printf("\nCalling mapstack()...\n");
  mapstack(1);

  printf("\n\nThis is a detailed memory map.\n");
  FILE *fd = fopen("/proc/self/maps", "r");
  if (fd) {
    char line[256];
    while (fgets(line, sizeof(line), fd)) {
        printf("%s", line);
    }
    fclose(fd);
  }

}

Here is a brief description of the details of the program.

  • Lines 7-8: Create two global variables, one initialized and one not.
  • Lines 10-17: Define a recursive function that maps the stack as it grows toward lower memory.
  • Lines 21-27: Declare local variables in main(), some initialized and some not.
  • Line 29: Show the current break.
  • Lines 44-49: Allocate two 20-byte blocks with malloc() and show the heap growing toward higher memory.
  • Line 50: Show the current break after allocation. It should have moved.

To compile the program, use the following:

student@student-vm:~$ cc -o cmem cmem.c

Then execute the program with:

student@student-vm:~$ ./cmem

Data Collection

You will compile and run the program listed above. When complete, capture the addresses shown in the output. Then, run the following, noting the details of the first three columns.

student@student-vm:~$ size cmem
   text	   data	    bss	    dec	    hex	filename
   2479	    624	     16	   3119	    c2f	cmem

Now run the following to see the details of the ELF file that is your executable. Piping nm through sort orders its output by memory offset.

student@student-vm:~$ readelf -S cmem
student@student-vm:~$ nm cmem | sort

The readelf command lists all the sections and their offsets. The nm command lists the memory offset of each section boundary and named object in the program itself. These values are offsets from a base address and are not necessarily the addresses the running program prints.

[NOTE: you may want to read the details of man nm to understand the output.]

Capture that information as well. Perform an analysis of the variables and functions used in cmem.c. Identify variables in the BSS, DS, Heap, and Stack. Try to map out (as best you can) where you think the boundaries are for the memory sections based on the addresses provided by the tools.

Important Note!
Remember that the nm command shows offsets. To determine the boundaries, you will need to do a little hexadecimal math.

Put all of your output from the program, size, readelf, and nm into a Word document. Include an analysis of your findings and use the layout given earlier.

Submit the Word document to the Learning Management System when complete.


Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Copyright © 2018 – 2025 Programming by Design.