Chapter 9S – Sorting and Searching – Programming by Design

(Updated May 5, 2025)

Overview

Mean, Median and Mode

Linear (Sequential) Search

Binary Search

Exercises

Overview

Once we have data in arrays, we quickly realize the value of having it in one place and easily addressable and accessible. Therefore, we can treat it like our own little database!

But before we do that, let us start by looking at how we can manage some good outcomes simply by how the data is arranged. It may also benefit you to peruse Big-O Notation as some algorithms are discussed along with their efficiency.

Mean, Median and Mode

Yes, we are heading into the Math realm! It will be quick, we promise!

Mean

As we know, the mean is also the average of the data in a set. So, we propose this very simple solution to the mean:

Mean.java

public class Mean {

    public static void main(String[] args) {

        int sum = 0, x, a[] = {7, 6, 2, 5, 0};
        double avg;

        for (x = 0; x < a.length; x++)
            sum += a[x];

        System.out.println("The sum is " + sum);
        avg = (double) sum / a.length;
        System.out.println("The mean is " + avg);
    }
}

Example 1: Calculate the average.

The calculation of the mean is somewhat idiomatic. Set up a loop and select the next value in the array, adding it to the running total. Then, take the total and divide it by the number of values. It's simple. The data need not be in any specific order to solve the problem.

Median

The median value is simply the value in the middle. To find the middle, one must first put the data in an expected order. The desired order is sorted and ascending. This means that the data is arranged in order from lowest to highest. Before discussing how sorting can be done in code, we will use a built-in sorting tool available in the Arrays class.

Median.java

import java.util.Arrays;

public class Median {

    public static void main(String[] args) {

        int a[] = {7, 6, 2, 5, 0, 4};
        double median;

        Arrays.sort(a);
        int l = a.length;

        // odd length means take middle value, otherwise
        // take the average of the two.
        if (l % 2 == 1)
            median = a[l / 2];
        else
            median = (a[l / 2 - 1] + a[l / 2]) / 2.0;

        System.out.println("The median is " + median);
    }
}

Example 2: Calculate the median value.

We can see the use of the Arrays.sort() method. We pass the array name to the method, which sorts in ascending order—pretty easy.

The if test handles the detail of determining what the median value is. If there is an odd number of values, it is the middle value. Otherwise, it is the average of the two middle values. The calculations in the array subscript look a little kooky at first, but then you realize we have to find the midway point of the array and use that as the index value.

Mode

Now, the mode is a bit more tricky. Why? Well, technically, there can be many modes. A set of data values can be single-modal, bi-modal, and even multi-modal. One value could appear six times as the highest frequency. If no others appear that many times, then it is single-modal. Two data values with the same number of appearances are bi-modal, and you get the idea.

There are various ways to track how many times we have seen a number - not all of them are great. We could use another array whose index values represent the range of numbers we are tallying. Or we could have parallel arrays where we add the number to one array (number array), and the corresponding location in the other represents the tally (tally array).

The first solution is ok if we know the range of numbers is relatively small. The operational cost of using such an arrangement is O(1), or constant (see Big-O Notation). Since the index is the number and the value inside is the count, updating and looking up values is as cheap as possible. The amount of memory could be an issue, however. Remember that after tallying, you must make a linear array scan to determine which value(s) occurred most frequently. It is not elegant, but it is potentially good with the right data set and potentially wasteful regarding memory.


Example of the tally array. Indexes represent the data set, and the values are the number of occurrences.

                0    1    2    3    4    5    6    7      100
              ------------------------------------------------
             |  0 | 12 |  3 |  9 |  3 |  6 |  0 |  4 |...| 10 |
              ------------------------------------------------

The number 5 has appeared six times, while the number 1 has appeared 12.

The second solution is potentially suitable for sparse data. The memory footprint could be far less than Solution 1 without a discernible range. The operational cost is O(n) per data value. The linear nature of this solution is not a show-stopper - lots of well-behaved, relatively efficient algorithms are linear. But, you must scan the number array every time you take a value from the data set. (If you have not seen the value previously, add it to the list.) Using the index of where that number is, update the corresponding index in the other tally array. Then, after you finish tallying, you must make one more linear scan of the tally array to determine which value(s) occurred most frequently and use those index(es) to determine which numbers they were. Yuck! Slow, wasteful, clumsy. Yuck!


Here's an example of a dual array: The Numbers array represents the data set, and the Tally array is the number of occurrences. Indexes are only used to line up numbers to tallies.

                0    1    2    3    4    5    6    7      100
              ------------------------------------------------
  Numbers    | 56 | 23 | 13 | 19 | 23 | 36 |  0 | 14 |...| 40 |
              ------------------------------------------------

                0    1    2    3    4    5    6    7      100
              ------------------------------------------------
   Tally     |  0 | 12 |  3 |  9 |  3 |  6 |  0 |  4 |...| 10 |
              ------------------------------------------------

The number 14 has appeared four times while the number 23 has appeared 12.

So, to recap:

Solution 1 is O(n). Why? Ultimately we must make one pass through the data set - that is linear and cannot be avoided. This alone does not discourage the use of the solution. The range of values in the data set determines the efficiency. If the range was 1-100, sure, use this solution; it is a reasonably fair trade-off compared to our final solution. However, when the data set a range is a million values? That is 4 million bytes (32-bit ints) and is too expensive.
Solution 2 is O(n²). This is wasteful operationally and has a higher memory footprint. We are making a linear scan of the number array every time we process a value. That is n times n. With certain data sets, it will use twice the memory of Solution 1 or more.

So, let's consider Solution 3. This solution is O(n log₂ n). We cannot avoid the linear pass through the data set. And the Arrays.sort() method can help us make that one pass through the data as efficient as possible. Since we are using Java's sort method, we know this has O(n log₂ n)

The code in Example 3 only handles the seeking of one mode value.

Mode.java

import java.util.Arrays;

public class Mode {

    public static void main(String[] args) {

        int x, a[] = {4, 5, 7, 8, 4, 3, 2, 8, 7, 4, 5, 7, 9, 0, 1, 2, 6, 7, 8};
        int cur, ccount, mode = 0, mcount = 0;

        Arrays.sort(a);
        int l = a.length;

        // NOTE: This code does NOT handle bi-modal or multi-modal data!

        cur = a[0];
        ccount = 1;

        for (x = 1; x < l; x++) {
            //System.out.println(cur + " " + a[x]);
            if (cur == a[x]) {
                ccount++;
            } else {
                //System.out.println(cur + " appeared " + ccount + " times.");

                // set the new mode?
                if (ccount > mcount) {
                    mode = cur;
                    mcount = ccount;
                }
                // Always!
                ccount = 1;
                cur = a[x];
            }
        }
        System.out.println("The mode is " + mode);
    }
}

Example 3: Calculating the mode.

The algorithm for determining the mode works like this:

Sort the array. Set mcount and mode to zero.
Select the first value in the array (a[0]) and hold on to it (cur) noting we have now seen this value once (ccount).
Using a loop compare the next element to cur. If the same, bump ccount, else if ccount > mcount move ccount to mcount, cur to mode, a[x] to cur.

There are multiple values in flight here. Two values, mode and mcount, have to do with managing the most recent candidate for mode and the number of times it occurred. The other two, cur and ccount, represent the current value being tallied. The a[x] value is the next number to be examined in the data set.

If we uncomment the two println() statements, we can see some of the work in action as shown below:

0 1
0 appeared 1 times.
1 2
1 appeared 1 times.
2 2
2 3
2 appeared 2 times.
3 4
3 appeared 1 times.
4 4
4 4
4 5
4 appeared 3 times.
5 5
5 6
5 appeared 2 times.
6 7
6 appeared 1 times.
7 7
7 7
7 7
7 8
7 appeared 4 times.
8 8
8 8
8 9
8 appeared 3 times.
The mode is 7

The debug statements help us to follow the logic to make sure it is sound. The very first line shows the cur is 0, and the next number in the array (a[x]) is 1. So, immediately, we will update mode and mcount with new values 0 and 1, respectively. This is because we saw 0 precisely 1 time.

The mode value will not be updated again until we see a[x] is 3, then 5, and lastly, when a[x] is 8. Remember, we do not consider updating the mode until we know we are about to see a new value and that we have seen the current value more frequently than the last mode value.

Sorting

Much has been written about sorting over the long history of computing. The reader is left to investigate the finer details of the various sorting algorithms (Shell, Selection, Quick, etc.). Many of these methods have been the topic of Ph.D. theses.

In this section, we will introduce the idea of sorting by presenting an essential traditional Bubble Sort. Critics may claim that it is ludicrous to offer a method we know to be the worst form. However, we believe that you must begin somewhere, and this is it. It may be a modestly implemented and poorly performing sorting method, but it perfectly describes the methodology.

Bubble Sort

Many bubble sorts have been written as nested for loops with two ranges and an additional boolean to assess if further iterations are necessary. In other words, if we swapped two values, then there may still be more values to swap.

This is simply a variation in which the outer loop relies on the knowledge that there has been a swap while the inner loop examines the entire list.

BubbleSort.java

import java.util.Arrays;

public class BubbleSort {
  
  public static void bubbleSort(int arr[]) {
    boolean swapped;
    
    System.out.println(Arrays.toString(arr) + "\n");

    do {
      // no swap yet
      swapped = false;
      
      for ( int x = 1; x < arr.length; x++) {
        if ( arr[x] < arr[x-1] ) {
          // swap values!
          int t = arr[x];
          arr[x] = arr[x-1];
          arr[x-1] = t;
          swapped = true;
        }
      }
      
      System.out.println(Arrays.toString(arr));
    } while (swapped);
  }
  
  public static void main(String[] args) {
    
    int a[] =  {2, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 0, 42, 44, 6, 20, 9};
    
    bubbleSort(a);
  }
}

Example 4a: A Bubble Sort with a do-while loop.

The do-while begins on line 8. We set swapped to false since we have not swapped anything on this iteration of the outer loop. We then begin the inner for loop to swap adjacent values. The swap occurs in lines 13-19 and sets swapped to true, which further feeds the outer loop.

The print statement displays the array at the end of each outer loop iteration. This is a debugging tool for checking the sort's progress.

A simple enhancement to the algorithm would include recognizing that the upper end of the array is already sorted, moving toward the lower end of the array.

The output of Example 4a is shown below:

[2, 21, 27, 10, 25, 35, 45, 37, 19, 15, 47, 16, 12, 0, 42, 44, 6, 20, 9, 50]
[2, 21, 10, 25, 27, 35, 37, 19, 15, 45, 16, 12, 0, 42, 44, 6, 20, 9, 47, 50]
[2, 10, 21, 25, 27, 35, 19, 15, 37, 16, 12, 0, 42, 44, 6, 20, 9, 45, 47, 50]
[2, 10, 21, 25, 27, 19, 15, 35, 16, 12, 0, 37, 42, 6, 20, 9, 44, 45, 47, 50]
[2, 10, 21, 25, 19, 15, 27, 16, 12, 0, 35, 37, 6, 20, 9, 42, 44, 45, 47, 50]
[2, 10, 21, 19, 15, 25, 16, 12, 0, 27, 35, 6, 20, 9, 37, 42, 44, 45, 47, 50]
[2, 10, 19, 15, 21, 16, 12, 0, 25, 27, 6, 20, 9, 35, 37, 42, 44, 45, 47, 50]
[2, 10, 15, 19, 16, 12, 0, 21, 25, 6, 20, 9, 27, 35, 37, 42, 44, 45, 47, 50]
[2, 10, 15, 16, 12, 0, 19, 21, 6, 20, 9, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[2, 10, 15, 12, 0, 16, 19, 6, 20, 9, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[2, 10, 12, 0, 15, 16, 6, 19, 9, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[2, 10, 0, 12, 15, 6, 16, 9, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[2, 0, 10, 12, 6, 15, 9, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[0, 2, 10, 6, 12, 9, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[0, 2, 6, 10, 9, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]

Technical Note

Having trouble visualizing Bubble Sort?

Imagine walking along a row of boxes containing numbers. You are responsible for sorting the contents, but there are some rules:

You are allowed to walk from the current box to the next box.
You are allowed to peer into the box in front of you and ONLY the previous box.
You are only allowed to swap values between the current box and the previous box.
Every time you reach the end of the row, you must walk back to the beginning.
If you know you have not swapped after walking the entire row, you may stop.

The rules are simple, but very time consuming.

Selection Sort

An attempt at improving the sort is the selection sort.

SelectionSort.java

import java.util.Arrays;

class SelectionSort {

    public static void selectionSort(int arr[]) {
        int x, y, t, min;

        System.out.println(Arrays.toString(arr) + "\n");

        for (x = 0; x < arr.length - 1; x++) {

            min = x;
            for (y = x + 1; y < arr.length; y++)
                if (arr[y] < arr[min])
                    min = y;

            t = arr[min];
            arr[min] = arr[x];
            arr[x] = t;

            System.out.println(Arrays.toString(arr));
        }
    }

    public static void main(String[] args) {

        int a[] = {2, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 0, 42, 44, 6, 20, 9};

        selectionSort(a);
    }
}

Example 4b: A Selection sort with nested for loops.

Note that there is no optimization of the algorithm in Example 4b. It is simply a variation of how the Bubble sort runs. There is an enhancement to the algorithm, recognizing that the lower end of the array is already sorted as we move toward the higher end.

The output of Example 4b is shown below:

[0, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 2, 42, 44, 6, 20, 9]
[0, 2, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 21, 42, 44, 6, 20, 9]
[0, 2, 6, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 21, 42, 44, 27, 20, 9]
[0, 2, 6, 9, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 21, 42, 44, 27, 20, 35]
[0, 2, 6, 9, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 21, 42, 44, 27, 20, 35]
[0, 2, 6, 9, 10, 12, 47, 45, 37, 19, 15, 50, 16, 25, 21, 42, 44, 27, 20, 35]
[0, 2, 6, 9, 10, 12, 15, 45, 37, 19, 47, 50, 16, 25, 21, 42, 44, 27, 20, 35]
[0, 2, 6, 9, 10, 12, 15, 16, 37, 19, 47, 50, 45, 25, 21, 42, 44, 27, 20, 35]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 37, 47, 50, 45, 25, 21, 42, 44, 27, 20, 35]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 47, 50, 45, 25, 21, 42, 44, 27, 37, 35]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 50, 45, 25, 47, 42, 44, 27, 37, 35]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 45, 50, 47, 42, 44, 27, 37, 35]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 50, 47, 42, 44, 45, 37, 35]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 47, 42, 44, 45, 37, 50]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
[0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]

Technical Note

Having trouble visualizing Selection Sort?

Imagine walking along a row of boxes containing numbers. You are responsible for sorting the contents, but there are some rules:

You begin at the very first box.
You are allowed to walk from the current box to the next box.
You are allowed to peer into the box in front of you and decide if it’s the smallest number you’ve seen so far.
When you reach the end of the row, you are only allowed to swap values between the smallest box and the starting box.
Every time you reach the end of the row, you must walk back to the last starting position plus one.
When you have exhausted all possible starting positions, you may stop.

The rules are slightly more involved, and the field reduces in size with each new beginning.

Insertion Sort

The Insertion Sort is the first iteration of the Shell sort presented in the next section. This sort plays on array management concepts discussed in Chapter 9. In this sort, we pull a value from the array. We then close that gap by moving all the values to the right (toward higher memory) as long as the pulled value is less than the values being moved. This creates an opening on the left (lower memory). The pulled value is then inserted into the newly created opening. This continues until all the values are processed.

To see the sort in action, the following code provides additional debugging information to show the boundaries of the inner while loop. In addition, you can see the value of x and the stopping value of y.

InsertionSort.java

public class InsertionSort {

    public static void insertionSort(int[] arr) {

        int x, y, t;

        for (x = 0; x < arr.length; x++) {

            t = arr[x];
            y = x;

            while (y > 0 && t < arr[y - 1]) {
                arr[y] = arr[y - 1];
                //print_debug(arr, x, y);
                y--;
            }

            arr[y] = t;

            print_debug(arr, x, y);
        }
    }

    public static void print_debug(int[] a, int x, int y) {
        // Not part of sort routine.
        // Only used to print debug info!

        System.out.printf("x=%2d, y=%2d arr=", x, y);
        for (int i = 0; i < a.length; i++)
            System.out.printf("%3d", a[i]);
        System.out.println();
        // end debug
    }

    public static void main(String[] args) {

        int a[] = {2, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 0, 42, 44, 6, 20, 9};

        insertionSort(a);
    }
}

Example 4c: An insertion sort with debugs showing the visual of the sort.

The output of Example 4c is shown below:

x= 0, y= 0 arr=  2 21 27 35 10 25 47 45 37 19 15 50 16 12  0 42 44  6 20  9
x= 1, y= 1 arr=  2 21 27 35 10 25 47 45 37 19 15 50 16 12  0 42 44  6 20  9
x= 2, y= 2 arr=  2 21 27 35 10 25 47 45 37 19 15 50 16 12  0 42 44  6 20  9
x= 3, y= 3 arr=  2 21 27 35 10 25 47 45 37 19 15 50 16 12  0 42 44  6 20  9
x= 4, y= 1 arr=  2 10 21 27 35 25 47 45 37 19 15 50 16 12  0 42 44  6 20  9
x= 5, y= 3 arr=  2 10 21 25 27 35 47 45 37 19 15 50 16 12  0 42 44  6 20  9
x= 6, y= 6 arr=  2 10 21 25 27 35 47 45 37 19 15 50 16 12  0 42 44  6 20  9
x= 7, y= 6 arr=  2 10 21 25 27 35 45 47 37 19 15 50 16 12  0 42 44  6 20  9
x= 8, y= 6 arr=  2 10 21 25 27 35 37 45 47 19 15 50 16 12  0 42 44  6 20  9
x= 9, y= 2 arr=  2 10 19 21 25 27 35 37 45 47 15 50 16 12  0 42 44  6 20  9
x=10, y= 2 arr=  2 10 15 19 21 25 27 35 37 45 47 50 16 12  0 42 44  6 20  9
x=11, y=11 arr=  2 10 15 19 21 25 27 35 37 45 47 50 16 12  0 42 44  6 20  9
x=12, y= 3 arr=  2 10 15 16 19 21 25 27 35 37 45 47 50 12  0 42 44  6 20  9
x=13, y= 2 arr=  2 10 12 15 16 19 21 25 27 35 37 45 47 50  0 42 44  6 20  9
x=14, y= 0 arr=  0  2 10 12 15 16 19 21 25 27 35 37 45 47 50 42 44  6 20  9
x=15, y=12 arr=  0  2 10 12 15 16 19 21 25 27 35 37 42 45 47 50 44  6 20  9
x=16, y=13 arr=  0  2 10 12 15 16 19 21 25 27 35 37 42 44 45 47 50  6 20  9
x=17, y= 2 arr=  0  2  6 10 12 15 16 19 21 25 27 35 37 42 44 45 47 50 20  9
x=18, y= 8 arr=  0  2  6 10 12 15 16 19 20 21 25 27 35 37 42 44 45 47 50  9
x=19, y= 3 arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 35 37 42 44 45 47 50

Technical Note

Having trouble visualizing Insertion Sort?

Imagine walking along a row of boxes containing numbers. Each time you walk the row, you start at one more position to the right (first 0, then 1, then 2…).

Pull the value from the starting box.
While there are boxes to your left and they contain values greater than the pulled value, move it to the current box. Move left.
Store the pulled value in the box now before you.
When you have exhausted all possible starting positions, you may stop.

The rules are more complicated, and there is a lot of moving data.

Shell Sort

Now, we will introduce the Shell sort, named after Donald Shell. This algorithm decreases the gap between swapped values. In Shell's original design, the gap begins at half the length.

Beginning at roughly the midpoint of the array, if the current value is less than the value gap locations behind it, they are swapped. We then move down the array by one and repeat.

The gap then shrinks by half each time. So, the second complete iteration will begin at roughly the 25% point and continue until we reach a gap of 1, which is effectively an insertion sort.

The critical component here is the swaps continue toward the beginning of the array as long as we still have reachable values gap locations behind us. This moves lower values to the beginning of the array more rapidly.

Important Note!

Consider the efficiency obtained by these simple changes:

The gap starts big and gradually becomes one.
Only gap-distance values are swapped per iteration. Insertion sort moved entire data blocks to accommodate one value.

We offer the Shell sort with debugging information to show the gaps and the number of swaps that occur within each gap iteration.

ShellSort.java

public class ShellSort {

    public static void shellSort(int[] arr) {

        int x, y, t, gap, swaps = 0;

        for (gap = arr.length / 2; gap > 0; gap = gap / 2) {
            swaps = 0;
            for (x = gap; x < arr.length; x++) {
                t = arr[x];
                for (y = x; y >= gap && arr[y - gap] > t; y = y - gap) {
                    arr[y] = arr[y - gap];
                    swaps++;
                    //print_debug(arr, gap, swaps);
                }

                arr[y] = t;
            }
            print_debug(arr, gap, swaps);
        }
    }

    public static void print_debug(int[] a, int g, int s) {

        if (g == 0 && s == 0)
            System.out.print("                arr=");
        else
            System.out.printf("gap=%2d swaps=%2d arr=", g, s);
        for (int i = 0; i < a.length; i++)
            System.out.printf("%3d", a[i]);
        System.out.println();
    }

    public static void main(String[] args) {

        int x, a[] = {2, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 0, 42, 44, 6, 20, 9};

        print_debug(a, 0, 0);
        shellSort(a);
        print_debug(a, 0, 0);
    }
}

Example 4d: Shell sort with debugs showing the visual of the sort.

The output for Example 4d will look like the following:

                arr=  2 21 27 35 10 25 47 45 37 19 15 50 16 12  0 42 44  6 20  9
gap=10 swaps= 7 arr=  2 21 16 12  0 25 44  6 20  9 15 50 27 35 10 42 47 45 37 19
gap= 5 swaps= 3 arr=  2 21  6 12  0 15 44 16 20  9 25 47 27 35 10 42 50 45 37 19
gap= 2 swaps=26 arr=  0  9  2 12  6 15 10 16 20 19 25 21 27 35 37 42 44 45 50 47
gap= 1 swaps= 8 arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 35 37 42 44 45 47 50
                arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 35 37 42 44 45 47 50

Quick Sort

QuickSort.java

public class QuickSort {

    public static void qsort(int[] arr) {

        qsort1(arr, 0, arr.length - 1);
    }

    /**
     * Move values around a pivot point. This uses the lower bound
     * to set the initial pivot.
     *
     * @param arr array of ints
     * @param l   lower bound
     * @param u   upper bound
     */
    public static void qsort1(int[] arr, int l, int u) {
        int p, x, t;

        if (l >= u)
            return;
        p = l; // pivot begins at lower bound
        for (x = l + 1; x <= u; x++) {
            if (arr[x] < arr[l]) {
                p++;
                // swap
                t = arr[p];
                arr[p] = arr[x];
                arr[x] = t;
            }
        }
        // swap
        t = arr[l];
        arr[l] = arr[p];
        arr[p] = t;

        print_debug(arr, l, u, p);
        qsort1(arr, l, p - 1);
        qsort1(arr, p + 1, u);
    }

    public static void print_debug(int[] a, int l, int u, int p) {
        // Not part of sort routine.
        // Only used to print debug info!

        System.out.printf("l=%2d, u=%2d, p=%2d arr=", l, u, p);
        for (int i = 0; i < a.length; i++)
            System.out.printf("%3d", a[i]);
        System.out.println();
        // end debug
    }

    public static void main(String[] args) {

        int a[] = {2, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 0, 42, 44, 6, 20, 9};

        // this prints the indexes of the array
        System.out.print("                     ");
        for (int x = 0; x < a.length; x++)
            System.out.printf("%3d", x);
        System.out.println("\n");

        qsort(a);
    }
}

Example 4e: Quicksort with debugs showing the visual of the sort.

                 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19

l= 0, u=19 arr=  0  2 27 35 10 25 47 45 37 19 15 50 16 12 21 42 44  6 20  9
l= 2, u=19 arr=  0  2  9 10 25 19 15 16 12 21  6 20 27 37 35 42 44 47 50 45
l= 2, u=11 arr=  0  2  6  9 25 19 15 16 12 21 10 20 27 37 35 42 44 47 50 45
l= 4, u=11 arr=  0  2  6  9 20 19 15 16 12 21 10 25 27 37 35 42 44 47 50 45
l= 4, u=10 arr=  0  2  6  9 10 19 15 16 12 20 21 25 27 37 35 42 44 47 50 45
l= 4, u= 8 arr=  0  2  6  9 10 19 15 16 12 20 21 25 27 37 35 42 44 47 50 45
l= 5, u= 8 arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 37 35 42 44 47 50 45
l= 5, u= 7 arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 37 35 42 44 47 50 45
l= 6, u= 7 arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 37 35 42 44 47 50 45
l=13, u=19 arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 35 37 42 44 47 50 45
l=15, u=19 arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 35 37 42 44 47 50 45
l=16, u=19 arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 35 37 42 44 47 50 45
l=17, u=19 arr=  0  2  6  9 10 12 15 16 19 20 21 25 27 35 37 42 44 45 47 50

Sort Stress Test

Finally, we introduce a comparison of many sorts. This is a conglomeration of multiple variations of hand-written from multiple algorithm sources to those that occur naturally in the language runtime. Each sort is measured solely with a clock-on-the-wall time and the number of swaps. The number of swaps is used to measure time complexity, which will vary significantly.

StressSort.java

import java.util.concurrent.ThreadLocalRandom;
import java.util.Arrays;

public class StressSort {

    public static void bubbleSort(int arr[]) {
        boolean swapped;

        do {
            swapped = false;

            for ( int x = 1; x < arr.length; x++) {
                if ( arr[x] < arr[x-1] ) {
                    int t = arr[x];
                    arr[x] = arr[x-1];
                    arr[x-1] = t;
                    swapped = true;
                }
            }
        } while (swapped);
    }

    public static void insertionSort(int[] arr) {

        int x, y, t;

        for (x = 0; x < arr.length; x++) {

            t = arr[x];
            y = x;

            while ( y > 0 && t < arr[y-1]) {
                arr[y] = arr[y-1];
                y--;
            }

            arr[y] = t;
        }
    }

    public static void selectionSort(int[] arr) {

        int x, y, t, min;

        for (x = 0; x < arr.length - 1; x++) {

            min = x;
            for (y = x + 1; y < arr.length; y++)
                if (arr[y] < arr[min])
                    min = y;

            t = arr[min];
            arr[min] = arr[x];
            arr[x] = t;
        }
    }

    public static void shellSortOrig(int[] arr) {

        int x, y, t, gap;

        for ( gap = arr.length/2; gap > 0; gap=gap/2 ) {

            for ( x = gap; x < arr.length; x++ ) {

                t = arr[x];
                for ( y = x; y >= gap && arr[y-gap] > t; y=y-gap ) {
                    arr[y] = arr[y-gap];
                }

                arr[y] = t;
            }
        }
    }

    public static void shellSortInSed(int[] arr) {

        int[] gaps = {1391376, 463792, 198768, 86961, 33936, 13776, 4592, 1968, 861, 336, 112, 48, 21, 7, 3, 1};
        int g, x, y, t, gap;

        for ( g = 0; g < gaps.length; g++ ) {

            gap = gaps[g];
            for ( x = gap; x < arr.length; x++ ) {

                t = arr[x];
                for ( y = x; y >= gap && arr[y-gap] > t; y=y-gap ) {
                    arr[y] = arr[y-gap];
                }

                arr[y] = t;
            }
        }
    }

    public static void shellSortBentlyA(int[] arr) {

        int gap, x, y, t, n;

        n = arr.length;

        // Find largest gap based on 3.
        for ( gap = 1; gap < n; gap=3*gap+1 );
        while (true) {
            gap /= 3;

            if ( gap < 1)
                break;

            for ( x = gap; x < n; x++ ) {
                for ( y = x; y >= gap; y=y-gap ) {
                    if ( arr[y-gap] < arr[y] )
                        break;
                    t = arr[y];
                    arr[y] = arr[y-gap];
                    arr[y-gap] = t;
                }
            }
        }
    }

    public static void shellSortBentlyA2(int[] arr) {

        int g, gap, x, y, t, n;

        n = arr.length;

        // Find largest gap based on 3.
        for ( g = 1; g < n; g=3*g+1 );
        for ( gap = g/3; gap > 0; gap=gap/3 ) {

            for ( x = gap; x < n; x++ ) {
                for ( y = x; y >= gap; y=y-gap ) {
                    if ( arr[y-gap] <= arr[y] )
                        break;
                    t = arr[y];
                    arr[y] = arr[y-gap];
                    arr[y-gap] = t;
                }
            }
        }
    }

    public static void shellSortBentlyB(int[] arr) {

        int[] gaps = {1391376, 463792, 198768, 86961, 33936, 13776, 4592, 1968, 861, 336, 112, 48, 21, 7, 3, 1};
        int g, x, y, t, gap, n;

        n = arr.length;
        for ( g = 0; g < gaps.length; g++ ) {

            gap = gaps[g];

            for ( x = gap; x < n; x++ ) {
                for ( y = x; y >= gap; y=y-gap ) {
                    if ( arr[y-gap] < arr[y] )
                        break;
                    t = arr[y];
                    arr[y] = arr[y-gap];
                    arr[y-gap] = t;
                }
            }
        }
    }

    public static void qsort(int[] arr) {

        qsort1(arr, 0, arr.length-1);
    }

    public static void qsort1(int[] arr, int l, int u) {
        int p, x, t;

        if (l >= u)
            return;
        p = l; // pivot begins at lower bound
        for (x = l + 1; x <= u; x++) {
            if (arr[x] < arr[l]) {
                p++;
                // swap
                t = arr[p];
                arr[p] = arr[x];
                arr[x] = t;
            }
        }
        // swap
        t = arr[l];
        arr[l] = arr[p];
        arr[p] = t;

        qsort1(arr, l, p - 1);
        qsort1(arr, p + 1, u);
    }

    public static void main(String[] args)
    {
        int ASIZE=100000000;
        int[] a1 = new int[ASIZE], a2 = new int[ASIZE], a3 = new int[ASIZE];
        int[] a4 = new int[ASIZE], a5 = new int[ASIZE], a6 = new int[ASIZE];
        int[] a7 = new int[ASIZE], a8 = new int[ASIZE], a9 = new int[ASIZE];
        int x, n;
        long s, e;

        for ( x = 0; x < ASIZE; x++ ) {
            n = ThreadLocalRandom.current().nextInt(1, 10000000);
            a1[x] = n;
            a2[x] = n;
            a3[x] = n;
            a4[x] = n;
            a5[x] = n;
            a6[x] = n;
            a7[x] = n;
            a8[x] = n;
            a9[x] = n;
        }

        System.out.print("Bubble Sort ");
        s = System.nanoTime();
        //bubbleSort(a1);
        e = System.nanoTime();
        reportTime(s, e);

        System.out.print("Selection Sort ");
        s = System.nanoTime();
        //selectionSort(a2);
        e = System.nanoTime();
        reportTime(s, e);

        System.out.print("Insertion Sort ");
        s = System.nanoTime();
        //insertionSort(a3);
        e = System.nanoTime();
        reportTime(s, e);

        System.out.print("Shell Sort Original ");
        s = System.nanoTime();
        shellSortOrig(a4);
        e = System.nanoTime();
        reportTime(s, e);

        System.out.print("Shell Sort InSed ");
        s = System.nanoTime();
        shellSortInSed(a5);
        e = System.nanoTime();
        reportTime(s, e);

        System.out.print("Shell Sort Bently A ");
        s = System.nanoTime();
        shellSortBentlyA2(a6);
        e = System.nanoTime();
        reportTime(s, e);

        System.out.print("Shell Sort Bently B ");
        s = System.nanoTime();
        shellSortBentlyB(a7);
        e = System.nanoTime();
        reportTime(s, e);

        System.out.print("Arrays.sort() ");
        s = System.nanoTime();
        Arrays.sort(a8);
        e = System.nanoTime();
        reportTime(s, e);

        System.out.print("Quicksort ");
        s = System.nanoTime();
        qsort(a9);
        e = System.nanoTime();
        reportTime(s, e);
    }

    public static void reportTime(long s, long e) {
        long t = e - s;
        System.out.println("Execution time in milliseconds : " + t / 1000000);
    }
}

Example 4f: Sort stress test.

Each of the sorts is stress-tested with 100,000,000 random values. The first three sorts (bubble, selection, and insertion) aren't even considered. When you see the times shown in the sample output below, you'll quickly understand why.

Bubble Sort Execution time in milliseconds : 0
Selection Sort Execution time in milliseconds : 0
Insertion Sort Execution time in milliseconds : 0
Shell Sort Original Execution time in milliseconds : 35184
Shell Sort InSed Execution time in milliseconds : 21747
Shell Sort Bently A Execution time in milliseconds : 44954
Shell Sort Bently B Execution time in milliseconds : 25738
Arrays.sort() Execution time in milliseconds : 8473
Quicksort Execution time in milliseconds : 9141

Technical Note

You can reduce the array sizes to 10000, uncomment the three worst sorts, and slowly increase the array sizes to see how quickly they become unusable.

Searching

Searching algorithms allow us to seek out specific values in an array of data. We can improve the search methods when we know something specific about the data. It must be detailed enough to make the search method more refined and efficient.

Linear (Sequential) Search

First, we begin with the most basic form - the Linear Search or Sequential Search. This search works regardless of the data layout. Completely random and ordered data can have a linear search applied. This is because, as the name implies, we scan all values in the space.

  public static int seqSearch(int[] data, int key) {

      for ( int x = 0; x < data.length; x++ )
          if ( data[x] == key )
              return x;
    
      return -1;
  }

Example 5a: A simple linear search.

The linear search begins at the first value in the array and iterates over the entire expanse. We compare each value in the array to the key value we seek. If they match, we return to the location it was found. Otherwise, we move on to the next location.

When the array has been exhausted, we are still in the method. We simply return -1 to indicate it was not found.

This form of search has a time and space complexity of O(n).

Binary Search

The following form assumes we know something specific about the data. In this case, the data will be sorted. By sorting the array before beginning the search, we can get to the time complexity of O(log₂n). The storage complexity remains O(n) since we still represent the same number of values.

  public static int binSearch(int[] data, int key) {
      int midpoint = 0, first = 0, last;
    
      last = data.length - 1;
    
      while ( first <= last ) {
          midpoint = (first + last) / 2;
          if ( data[midpoint] == key )
              return midpoint;
          else if ( data[midpoint] > key )
              last = midpoint - 1;
          else
              first = midpoint + 1;
      }
    
      return -1;
  }

Example 5b: An applied binary search.

After the requirement of sorting the data has been met, we begin by setting up pickets representing the search area's beginning and end. We then find the midpoint of that expanse and examine that location. If the key value has been found, we return the location. Otherwise, we either move the first picket or move the last picket. This reduces the search space to half of what it was.

Get That Dictionary!

To better understand the binary search process, open a dictionary to the middle. Examine the facing pages. If you found the word, you are done. If not, determine if the word you seek is on the left or right side of the book. Tear the book in half, producing two halves of the original book, and toss the half you no longer need. Take the remaining half and repeat.

Each iteration of these steps reduces the field by half. So, after the first comparison, you have eliminated 50%. After the second comparison, the reduction is 75%. After the third, the reduction is 87.5%, and so on.

When the first and last variables cross over such that first is greater than last, we know the array has been exhausted. We return -1 to indicate it was not found.

Exercises

(Beginner) Research the Arrays.toString() overloaded method. Pop it into Median and/or Mode, above, and prove that the sort is happening.
(Intermediate) Modify Example 3 to handle a multi-modal solution. (Hint: You only need to maintain one mcount, but multiple mode values.)
(Intermediate) Modify bubbleSort1() and bubbleSort2() to be more efficient by adjusting the inner loop to the array upper bound the size minus the iteration number.
(Advanced) Modify the Insertion Sort to be head-recursive. (See Chapter 13 - Recursion)

Table of contents

Overview

Mean, Median and Mode

Sorting

Searching

Exercises