(Updated September 1, 2024)
Table of contents
Overview
Once we have data in arrays, we quickly realize the value of that data being in one place and easily addressable and accessible. Therefore, we can treat it like our own little database!
But before we do that, let us start by looking at how we can manage some good outcomes simply by how the data is arranged. It may also benefit you to peruse Big-O Notation as some algorithms are discussed along with their efficiency.
Mean, Median and Mode
Yes, we are heading into the Math realm! It will be quick, we promise!
Mean
As we know, the mean is also the average of the data in a set. So, we propose this very simple solution to the mean:
public class Mean {
public static void main(String[] args) {
int sum=0, x, a[] = {7,6,2,5,0};
double avg;
for (x = 0; x< a.length; x++)
sum += a[x];
System.out.println("The sum is " + sum);
avg = (double)sum / a.length;
System.out.println("The mean is " + avg);
}
}
Example 1: Calculate the average.
The calculation of the mean is somewhat idiomatic. Set up a loop and select the next value in the array adding it to the running total. Take the total and divide by the number of values. Simple. The data need not be in any specific order to solve the problem.
Median
The median value is simply the value in the middle. To find the middle, one must first put the data in an expected order. The desired order is sorted and ascending. This means that the data is arranged in order from lowest to highest. Before discussing how sorting can be done in code, we will use a built-in sorting tool available in the Arrays
class.
import java.util.Arrays;
public class Median {
public static void main(String[] args) {
int a[] = {7,6,2,5,0,4};
double median;
Arrays.sort(a);
int l = a.length;
if ( l % 2 == 1 )
median = a[l / 2];
else
median = (a[l / 2 - 1] + a[l / 2]) / 2.0;
System.out.println("The median is " + median);
}
}
Example 2: Calculate the median value.
On line 10, we see the use of the Arrays.sort()
method. We pass the array name to the method, which sorts in ascending order—pretty easy stuff.
The if
test handles the detail of determining what the median value is. If there is an odd number of values, it is the middle value. Otherwise, it is the average of the two middle values. The calculations in the array subscript look a little kooky at first, but then you realize we have to find the midway point of the array and use that as the index value.
Mode
Now, the mode is a bit more tricky. Why? Well, technically, there can be many modes. A set of data values can be single-modal, bi-modal, and even multi-modal. One value could appear six times as the highest frequency. If no others appear that many times, then it is single-modal. Two data values with the same number of appearances are bi-modal, and you get the idea.
There are various ways to track how many times we have seen a number - not all of them are great. We could use another array whose index values represent the range of numbers we are tallying. Or we could have parallel arrays where we add the number to one array (number array), and the corresponding location in the other represents the tally (tally array).
The first solution is ok if we know the range of numbers is relatively small. The operational cost of using such an arrangement is O(1), or constant (see Big-O Notation). Since the index is the number and the value inside is the count, the cost of updating and looking up values is as cheap as you can get. The amount of memory could be an issue, however. Remember that after you finish tallying, you must now make a linear scan of the array to determine which value(s) occurred most frequently. Not elegant, but potentially good with the right data set and potentially wasteful regarding memory.
Example of the tally array. Indexes represent the data set, and the values are the number of occurrences. 0 1 2 3 4 5 6 7 100 ------------------------------------------------ | 0 | 12 | 3 | 9 | 3 | 6 | 0 | 4 |...| 10 | ------------------------------------------------ The number 5 has appeared six times while the number 1 has appeared 12.
The second solution is potentially good for sparse data. If there is no discernible range, the memory footprint could be far less than Solution 1. The operational cost is O(n) per data value. The linear nature of this solution is not a show-stopper - lots of well-behaved, relatively efficient algorithms are linear. But, you must scan the number array every time you take a value from the data set. (If you have not seen the value previously, add it to the list.) Using the index of where that number is, update the corresponding index in the other tally array. Then, after you finish tallying, you must now make one more linear scan of the tally array to determine which value(s) occurred most frequently and use those index(es) to determine which numbers they were. Yuck! Slow, wasteful, clumsy. Yuck!
Example of dual array. Indexes represent the data set, and the values are the number of occurrences. 0 1 2 3 4 5 6 7 100 ------------------------------------------------ Numbers | 56 | 23 | 13 | 19 | 23 | 36 | 0 | 14 |...| 40 | ------------------------------------------------ 0 1 2 3 4 5 6 7 100 ------------------------------------------------ Tally | 0 | 12 | 3 | 9 | 3 | 6 | 0 | 4 |...| 10 | ------------------------------------------------ The number 14 has appeared four times while the number 23 has appeared 12.
So, to recap:
- Solution 1 is O(n). Why? Ultimately we must make one pass through the data set - that is linear and cannot be avoided. This alone does not discourage the use of the solution. The range of values in the data set determines the efficiency. If the range was 1-100, sure, use this solution; it is a reasonably fair trade-off compared to our final solution. However, when the data set a range is a million values? That is 4 million bytes (32-bit
int
s) and is too expensive. - Solution 2 is O(n2). This is wasteful operationally and has a higher memory footprint. We are making a linear scan of the number array every time we process a value. That is n times n. With certain data sets, it will use twice the memory of Solution 1 or more.
The solution we are going to use is O(n log2 n). We cannot avoid the linear pass through the data set. And the Arrays.sort()
method can help us make that one pass through the data as efficient as possible. Since we are using Java's sort method, we know this has O(n log2 n)
The code in Example 3 only handles the seeking of one mode value.
import java.util.Arrays;
public class Mode {
public static void main(String[] args) {
int x, a[] = {4,5,7,8,4,3,2,8,7,4,5,7,9,0,1,2,6,7,8};
int cur, ccount, mode = 0, mcount = 0;
Arrays.sort(a);
int l = a.length;
// NOTE: This code does NOT handle bi-modal or multi-modal data!
cur = a[0];
ccount = 1;
for ( x = 1; x < l; x++ ) {
//System.out.println(cur + " " + a[x]);
if ( cur == a[x] ) {
ccount++;
} else {
//System.out.println(cur + " appeared " + ccount + " times.");
// set the new mode?
if ( ccount > mcount ) {
mode = cur;
mcount = ccount;
}
// Always!
ccount = 1;
cur = a[x];
}
}
System.out.println("The mode is " + mode);
}
}
Example 3: Calculating the mode.
The algorithm for determining the mode works like this:
- Sort the array. Set
mcount
andmode
to zero. - Select the first value in the array (
a[0]
) and hold on to it (cur
) noting we have now seen this value once (ccount
). - Using a loop compare the next element to cur. If the same, bump
ccount
, else ifccount
>mcount
moveccount
tomcount
,cur
tomode
,a[x]
tocur
.
There are multiple values in flight here. Two values, mode
and mcount
, have to do with managing the most recent candidate for mode and the number of times it occurred. The other two, cur
and ccount
, represent the current value being tallied. The a[x]
value is the next number to be examined in the data set.
If we uncomment lines 19 and 23, we can see some of the work in action as shown below:
0 1 0 appeared 1 times. 1 2 1 appeared 1 times. 2 2 2 3 2 appeared 2 times. 3 4 3 appeared 1 times. 4 4 4 4 4 5 4 appeared 3 times. 5 5 5 6 5 appeared 2 times. 6 7 6 appeared 1 times. 7 7 7 7 7 7 7 8 7 appeared 4 times. 8 8 8 8 8 9 8 appeared 3 times. The mode is 7
The debug statements help us to follow the logic to make sure it is sound. The very first line shows the cur
is 0 and the next number in the array (a[x]
) is 1. So, immediately we will update mode
and mcount
with new values 0 and 1 respectively. This is because we saw 0 exactly 1 time.
The mode
value will not be updated again until we see a[x]
is 3, then 5 and lastly when a[x]
is 8. Remember, we do not consider updating the mode until we know we are about to see a new value and that we have seen the current value more frequently than the last mode
value.
Sorting
Much has been written about sorting over the long history of computing. It is left to the reader to investigate the finer details of the various sort algorithms (Shell, Selection, Quick, etc.). Many of these methods have been the topic of many a Ph.D. thesis.
In this section, we will introduce the idea of sorting by presenting an essential traditional Bubble Sort. Now, critics may claim that this is ludicrous to offer a method we know to be the worst form. However, we believe that you must begin somewhere, and this is it. It may be a modestly implemented and poorly performing sorting method, but it describes the methodology perfectly.
Bubble Sort
Many bubble sorts have been written as nested for
loops with two ranges and an additional boolean to assess if further iterations are necessary. In other words, if we swapped two values, then there may still be more values to swap.
This is simply a variation whereby the outer loop relies on the knowledge of there having been a swap while the inner loop examines the entire list.
import java.util.Arrays;
public class BubbleSort1 {
public static void bubbleSort(int arr[]) {
boolean swapped;
System.out.println(Arrays.toString(arr) + "\n");
do {
// no swap yet
swapped = false;
for ( int x = 1; x < arr.length; x++) {
if ( arr[x] < arr[x-1] ) {
// swap values!
int t = arr[x];
arr[x] = arr[x-1];
arr[x-1] = t;
swapped = true;
}
}
System.out.println(Arrays.toString(arr));
} while (swapped);
}
public static void main(String[] args) {
int a[] = {2, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 0, 42, 44, 6, 20, 9};
bubbleSort(a);
}
}
Example 4a: A Bubble Sort with a do
-while
loop.
The do
-while
begins on line 8. We set swapped
to false
since we have not swapped anything on this iteration of the outer loop. We then begin the inner for
loop to swap adjacent values. The swap occurs in lines 13-19 and sets swapped
to true
, which further feeds the outer loop.
The print statement on line 22 displays the array at the end of each outer loop iteration. This is a debugging tool for checking the sort's progress.
A simple enhancement to the algorithm would include recognizing that the upper end of the array is already sorted, moving toward the lower end of the array.
The output of Example 4a is shown below:
[2, 21, 27, 10, 25, 35, 45, 37, 19, 15, 47, 16, 12, 0, 42, 44, 6, 20, 9, 50] [2, 21, 10, 25, 27, 35, 37, 19, 15, 45, 16, 12, 0, 42, 44, 6, 20, 9, 47, 50] [2, 10, 21, 25, 27, 35, 19, 15, 37, 16, 12, 0, 42, 44, 6, 20, 9, 45, 47, 50] [2, 10, 21, 25, 27, 19, 15, 35, 16, 12, 0, 37, 42, 6, 20, 9, 44, 45, 47, 50] [2, 10, 21, 25, 19, 15, 27, 16, 12, 0, 35, 37, 6, 20, 9, 42, 44, 45, 47, 50] [2, 10, 21, 19, 15, 25, 16, 12, 0, 27, 35, 6, 20, 9, 37, 42, 44, 45, 47, 50] [2, 10, 19, 15, 21, 16, 12, 0, 25, 27, 6, 20, 9, 35, 37, 42, 44, 45, 47, 50] [2, 10, 15, 19, 16, 12, 0, 21, 25, 6, 20, 9, 27, 35, 37, 42, 44, 45, 47, 50] [2, 10, 15, 16, 12, 0, 19, 21, 6, 20, 9, 25, 27, 35, 37, 42, 44, 45, 47, 50] [2, 10, 15, 12, 0, 16, 19, 6, 20, 9, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [2, 10, 12, 0, 15, 16, 6, 19, 9, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [2, 10, 0, 12, 15, 6, 16, 9, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [2, 0, 10, 12, 6, 15, 9, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [0, 2, 10, 6, 12, 9, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [0, 2, 6, 10, 9, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
Selection Sort
Another attempt at improving the sort is the selection sort.
import java.util.Arrays;
class SelectionSort {
public static void selectionSort(int arr[]) {
int x, y, t, min;
System.out.println(Arrays.toString(arr) + "\n");
for (x = 0; x < arr.length - 1; x++) {
min = x;
for (y = x + 1; y < arr.length; y++)
if (arr[y] < arr[min])
min = y;
t = arr[min];
arr[min] = arr[x];
arr[x] = t;
System.out.println(Arrays.toString(arr));
}
}
public static void main(String[] args) {
int a[] = {2, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 0, 42, 44, 6, 20, 9};
selectionSort(a);
}
}
Example 4b: A Selection sort with nested for
loops.
Note that there is no optimization of the algorithm in Example 4b. It is simply a variation of how the Bubble sort runs. There is an enhancement to the algorithm, recognizing that the lower end of the array is already sorted as we move toward the higher end.
The output of Example 4b is shown below:
[0, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 2, 42, 44, 6, 20, 9] [0, 2, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 21, 42, 44, 6, 20, 9] [0, 2, 6, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 21, 42, 44, 27, 20, 9] [0, 2, 6, 9, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 21, 42, 44, 27, 20, 35] [0, 2, 6, 9, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 21, 42, 44, 27, 20, 35] [0, 2, 6, 9, 10, 12, 47, 45, 37, 19, 15, 50, 16, 25, 21, 42, 44, 27, 20, 35] [0, 2, 6, 9, 10, 12, 15, 45, 37, 19, 47, 50, 16, 25, 21, 42, 44, 27, 20, 35] [0, 2, 6, 9, 10, 12, 15, 16, 37, 19, 47, 50, 45, 25, 21, 42, 44, 27, 20, 35] [0, 2, 6, 9, 10, 12, 15, 16, 19, 37, 47, 50, 45, 25, 21, 42, 44, 27, 20, 35] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 47, 50, 45, 25, 21, 42, 44, 27, 37, 35] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 50, 45, 25, 47, 42, 44, 27, 37, 35] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 45, 50, 47, 42, 44, 27, 37, 35] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 50, 47, 42, 44, 45, 37, 35] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 47, 42, 44, 45, 37, 50] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50] [0, 2, 6, 9, 10, 12, 15, 16, 19, 20, 21, 25, 27, 35, 37, 42, 44, 45, 47, 50]
Insertion Sort
The Insertion Sort is the first iteration of the Shell sort presented in the next section. This sort plays on array management concepts discussed in Chapter 9. In this sort, we pull a value from the array. We then close that gap by moving all the values to the right (toward higher memory) as long as the pulled value is less than the values being moved. This creates an opening on the left (lower memory). The pulled value is then inserted into the newly created opening. This continues until all the values are processed.
public static void insertionSort(int[] arr) {
int x, y, t;
for (x = 0; x < arr.length; x++) {
t = arr[x];
y = x;
while ( y > 0 && t < arr[y-1]) {
arr[y] = arr[y-1];
y--;
}
arr[y] = t;
}
}
Example 4c: A simple Insertion Sort.
For those who would like to see the sort in action, the following code provides debugging information to show the boundaries of the inner while
loop. In addition, you can see the value of x
and the stopping value of y
.
public class InsertionSortDebug {
public static void insertionSort(int[] arr) {
int x, y, t;
for (x = 0; x < arr.length; x++) {
t = arr[x];
y = x;
while ( y > 0 && t < arr[y-1]) {
arr[y] = arr[y-1];
//print_debug(arr, x, y);
y--;
}
arr[y] = t;
print_debug(arr, x, y);
}
}
public static void print_debug(int[] a, int x, int y){
// Not part of sort routine.
// Only used to print debug info!
System.out.printf("x=%2d, y=%2d arr=", x, y);
for (int i = 0; i < a.length; i++)
System.out.printf("%3d", a[i]);
System.out.println();
// end debug
}
public static void main(String[] args) {
int x, a[] = {2, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 0, 42, 44, 6, 20, 9};
insertionSort(a);
}
}
Example 4d: An insertion sort with debugs showing the visual of the sort.
The output of Example 4d is shown below:
x= 0, y= 0 arr= 2 21 27 35 10 25 47 45 37 19 15 50 16 12 0 42 44 6 20 9 x= 1, y= 1 arr= 2 21 27 35 10 25 47 45 37 19 15 50 16 12 0 42 44 6 20 9 x= 2, y= 2 arr= 2 21 27 35 10 25 47 45 37 19 15 50 16 12 0 42 44 6 20 9 x= 3, y= 3 arr= 2 21 27 35 10 25 47 45 37 19 15 50 16 12 0 42 44 6 20 9 x= 4, y= 1 arr= 2 10 21 27 35 25 47 45 37 19 15 50 16 12 0 42 44 6 20 9 x= 5, y= 3 arr= 2 10 21 25 27 35 47 45 37 19 15 50 16 12 0 42 44 6 20 9 x= 6, y= 6 arr= 2 10 21 25 27 35 47 45 37 19 15 50 16 12 0 42 44 6 20 9 x= 7, y= 6 arr= 2 10 21 25 27 35 45 47 37 19 15 50 16 12 0 42 44 6 20 9 x= 8, y= 6 arr= 2 10 21 25 27 35 37 45 47 19 15 50 16 12 0 42 44 6 20 9 x= 9, y= 2 arr= 2 10 19 21 25 27 35 37 45 47 15 50 16 12 0 42 44 6 20 9 x=10, y= 2 arr= 2 10 15 19 21 25 27 35 37 45 47 50 16 12 0 42 44 6 20 9 x=11, y=11 arr= 2 10 15 19 21 25 27 35 37 45 47 50 16 12 0 42 44 6 20 9 x=12, y= 3 arr= 2 10 15 16 19 21 25 27 35 37 45 47 50 12 0 42 44 6 20 9 x=13, y= 2 arr= 2 10 12 15 16 19 21 25 27 35 37 45 47 50 0 42 44 6 20 9 x=14, y= 0 arr= 0 2 10 12 15 16 19 21 25 27 35 37 45 47 50 42 44 6 20 9 x=15, y=12 arr= 0 2 10 12 15 16 19 21 25 27 35 37 42 45 47 50 44 6 20 9 x=16, y=13 arr= 0 2 10 12 15 16 19 21 25 27 35 37 42 44 45 47 50 6 20 9 x=17, y= 2 arr= 0 2 6 10 12 15 16 19 21 25 27 35 37 42 44 45 47 50 20 9 x=18, y= 8 arr= 0 2 6 10 12 15 16 19 20 21 25 27 35 37 42 44 45 47 50 9 x=19, y= 3 arr= 0 2 6 9 10 12 15 16 19 20 21 25 27 35 37 42 44 45 47 50
Shell Sort
Now we will introduce the Shell sort named after Donald Shell. This algorithm uses a method of decreasing the gap between swapped values. In the original design by Shell, the gap begins at half the length.
Beginning at roughly the midpoint of the array, if the current value is less than the value gap
locations behind it, they are swapped. We then move down the array by one and repeat.
The gap then shrinks by half each time. So the second complete iteration will begin at roughly the 25% point until we enter a gap of 1, which is effectively an insertion sort.
The critical component here is the swaps continue toward the beginning of the array as long as we still have reachable values gap
locations behind us. This moves lower values to the beginning of the array more rapidly.
- The gap starts big and gradually becomes one.
- Only gap-distance values are swapped per iteration. Insertion sort moved entire data blocks to accommodate one value.
public static void shellSort(int[] arr) {
int x, y, t, gap, swaps;
for ( gap = arr.length/2; gap > 0; gap=gap/2 ) {
for ( x = gap; x < arr.length; x++ ) {
t = arr[x];
for ( y = x; y >= gap && arr[y-gap] > t; y=y-gap )
arr[y] = arr[y-gap];
arr[y] = t;
}
}
}
Example 4e: Original Shell sort algorithm.
In the next piece of code, we offer the Shell sort with debugging information to show the gaps and the number of swaps that occur within each gap iteration.
public class ShellSortDebug {
public static void shellSort(int[] arr) {
int x, y, t, gap, swaps=0;
for ( gap = arr.length/2; gap > 0; gap=gap/2 ) {
swaps = 0;
for ( x = gap; x < arr.length; x++ ) {
t = arr[x];
for ( y = x; y >= gap && arr[y-gap] > t; y=y-gap ) {
arr[y] = arr[y-gap];
swaps++;
//print_debug(arr, gap, swaps);
}
arr[y] = t;
}
print_debug(arr, gap, swaps);
}
}
public static void print_debug(int[] a, int g, int s){
if ( g == 0 && s == 0 )
System.out.print(" arr=");
else
System.out.printf("gap=%2d swaps=%2d arr=", g, s);
for (int i = 0; i < a.length; i++)
System.out.printf("%3d", a[i]);
System.out.println();
}
public static void main(String[] args) {
int x, a[] = {2, 21, 27, 35, 10, 25, 47, 45, 37, 19, 15, 50, 16, 12, 0, 42, 44, 6, 20, 9};
print_debug(a, 0, 0);
shellSort(a);
print_debug(a, 0, 0);
}
}
Example 4f: Shell sort with debugs showing the visual of the sort.
The output for Example 4f will look like the following:
arr= 2 21 27 35 10 25 47 45 37 19 15 50 16 12 0 42 44 6 20 9 gap=10 swaps= 7 arr= 2 21 16 12 0 25 44 6 20 9 15 50 27 35 10 42 47 45 37 19 gap= 5 swaps= 3 arr= 2 21 6 12 0 15 44 16 20 9 25 47 27 35 10 42 50 45 37 19 gap= 2 swaps=26 arr= 0 9 2 12 6 15 10 16 20 19 25 21 27 35 37 42 44 45 50 47 gap= 1 swaps= 8 arr= 0 2 6 9 10 12 15 16 19 20 21 25 27 35 37 42 44 45 47 50 arr= 0 2 6 9 10 12 15 16 19 20 21 25 27 35 37 42 44 45 47 50
Quick Sort
Sort Stress Test
Finally, we introduce a comparison of many sorts. This is a conglomeration of multiple variations of hand-written from multiple algorithm sources to those that occur naturally in the language runtime. Each sort is measured solely with a clock-on-the-wall time and the number of swaps. The number of swaps is used to measure time complexity which will vary significantly.
import java.util.concurrent.ThreadLocalRandom;
import java.util.Arrays;
public class StressSort {
public static void bubbleSort(int arr[]) {
boolean swapped;
do {
// no swap yet
swapped = false;
for ( int x = 1; x < arr.length; x++) {
if ( arr[x] < arr[x-1] ) {
// swap values!
int t = arr[x];
arr[x] = arr[x-1];
arr[x-1] = t;
swapped = true;
}
}
} while (swapped);
}
public static void insertionSort(int[] arr) {
int x, y, t;
for (x = 0; x < arr.length; x++) {
t = arr[x];
y = x;
while ( y > 0 && t < arr[y-1]) {
arr[y] = arr[y-1];
y--;
}
arr[y] = t;
}
}
public static void selectionSort(int[] arr) {
int x, y, t, min;
for (x = 0; x < arr.length - 1; x++) {
min = x;
for (y = x + 1; y < arr.length; y++)
if (arr[y] < arr[min])
min = y;
t = arr[min];
arr[min] = arr[x];
arr[x] = t;
}
}
public static void shellSortOrig(int[] arr) {
int x, y, t, gap;
for ( gap = arr.length/2; gap > 0; gap=gap/2 ) {
for ( x = gap; x < arr.length; x++ ) {
t = arr[x];
for ( y = x; y >= gap && arr[y-gap] > t; y=y-gap ) {
arr[y] = arr[y-gap];
}
arr[y] = t;
}
}
}
public static void shellSortInSed(int[] arr) {
int[] gaps = {1391376, 463792, 198768, 86961, 33936, 13776, 4592, 1968, 861, 336, 112, 48, 21, 7, 3, 1};
int g, x, y, t, gap;
for ( g = 0; g < gaps.length; g++ ) {
gap = gaps[g];
for ( x = gap; x < arr.length; x++ ) {
t = arr[x];
for ( y = x; y >= gap && arr[y-gap] > t; y=y-gap ) {
arr[y] = arr[y-gap];
}
arr[y] = t;
}
}
}
public static void shellSortBentlyA(int[] arr) {
int gap, x, y, t, n;
n = arr.length;
// Find largest gap based on 3.
for ( gap = 1; gap < n; gap=3*gap+1 );
while (true) {
gap /= 3;
if ( gap < 1)
break;
for ( x = gap; x < n; x++ ) {
for ( y = x; y >= gap; y=y-gap ) {
if ( arr[y-gap] < arr[y] )
break;
t = arr[y];
arr[y] = arr[y-gap];
arr[y-gap] = t;
}
}
}
}
public static void shellSortBentlyA2(int[] arr) {
int g, gap, x, y, t, n;
n = arr.length;
// Find largest gap based on 3.
for ( g = 1; g < n; g=3*g+1 );
for ( gap = g/3; gap > 0; gap=gap/3 ) {
for ( x = gap; x < n; x++ ) {
for ( y = x; y >= gap; y=y-gap ) {
if ( arr[y-gap] <= arr[y] )
break;
t = arr[y];
arr[y] = arr[y-gap];
arr[y-gap] = t;
}
}
}
}
public static void shellSortBentlyB(int[] arr) {
int[] gaps = {1391376, 463792, 198768, 86961, 33936, 13776, 4592, 1968, 861, 336, 112, 48, 21, 7, 3, 1};
int g, x, y, t, gap, n;
n = arr.length;
for ( g = 0; g < gaps.length; g++ ) {
gap = gaps[g];
for ( x = gap; x < n; x++ ) {
for ( y = x; y >= gap; y=y-gap ) {
if ( arr[y-gap] < arr[y] )
break;
t = arr[y];
arr[y] = arr[y-gap];
arr[y-gap] = t;
}
}
}
}
public static void qsort(int[] arr) {
qsort1(arr, 0, arr.length-1);
}
public static void qsort1(int[] arr, int l, int u) {
int m, x, t;
if ( l >= u )
return;
m = l;
for (x = l+1; x <= u; x++) {
if ( arr[x] < arr[l] ) {
m++;
t = arr[m];
arr[m] = arr[x];
arr[x] = t;
}
}
t = arr[l];
arr[l] = arr[m];
arr[m] = t;
qsort1(arr, l, m-1);
qsort1(arr, m+1, u);
}
public static void main(String[] args)
{
int ASIZE=100000000;
int[] a1 = new int[ASIZE], a2 = new int[ASIZE], a3 = new int[ASIZE];
int[] a4 = new int[ASIZE], a5 = new int[ASIZE], a6 = new int[ASIZE];
int[] a7 = new int[ASIZE], a8 = new int[ASIZE], a9 = new int[ASIZE];
int x, n;
long s, e;
for ( x = 0; x < ASIZE; x++ ) {
n = ThreadLocalRandom.current().nextInt(1, 10000000);
a1[x] = n;
a2[x] = n;
a3[x] = n;
a4[x] = n;
a5[x] = n;
a6[x] = n;
a7[x] = n;
a8[x] = n;
a9[x] = n;
}
System.out.print("Bubble Sort ");
s = System.nanoTime();
//bubbleSort(a1);
e = System.nanoTime();
reportTime(s, e);
System.out.print("Selection Sort ");
s = System.nanoTime();
//selectionSort(a2);
e = System.nanoTime();
reportTime(s, e);
System.out.print("Insertion Sort ");
s = System.nanoTime();
//insertionSort(a3);
e = System.nanoTime();
reportTime(s, e);
System.out.print("Shell Sort Original ");
s = System.nanoTime();
shellSortOrig(a4);
e = System.nanoTime();
reportTime(s, e);
System.out.print("Shell Sort InSed ");
s = System.nanoTime();
shellSortInSed(a5);
e = System.nanoTime();
reportTime(s, e);
System.out.print("Shell Sort Bently A ");
s = System.nanoTime();
shellSortBentlyA2(a6);
e = System.nanoTime();
reportTime(s, e);
System.out.print("Shell Sort Bently B ");
s = System.nanoTime();
shellSortBentlyB(a7);
e = System.nanoTime();
reportTime(s, e);
System.out.print("Arrays.sort() ");
s = System.nanoTime();
Arrays.sort(a8);
e = System.nanoTime();
reportTime(s, e);
System.out.print("Quicksort ");
s = System.nanoTime();
qsort(a9);
e = System.nanoTime();
reportTime(s, e);
}
public static void reportTime(long s, long e) {
long t = e - s;
System.out.println("Execution time in milliseconds : " + t / 1000000);
}
}
Example 4g: Sort stress test.
Searching
Searching algorithms allow us to seek out specific values in an array of data. We can improve the searching methods when we know something specific about the data. It must be detailed enough to make the search method more refined and efficient.
Linear (Sequential) Search
First we begin with the most basic form - the Linear Search or Sequential Search. This search works regardless of the data layout. Completely random and ordered data can have a linear search applied. This is because, as the name implies, we scan all values in the space.
public static int seq_search(int[] data, int key) {
for ( int x = 0; x < data.length; x++ )
if ( data[x] == key )
return x;
return -1;
}
Example 5a: A simple linear search.
The linear search begins at the first value in the array and iterates over the entire expanse. We compare each value in the array to the key
value we seek. If they match, we return to the location it was found. Otherwise, we move on to the next location.
When the array has been exhausted, we are still in the method. We simply return -1
to indicate it was not found.
This form of search has a time and space complexity of O(n).
Binary Search
The following form assumes we know something specific about the data. In this case, the data will be sorted. By sorting the array before beginning the search, we can get to the time complexity of O(log2n). The storage complexity remains O(n) since we still represent the same number of values.
public static int bin_search(int[] data, int key) {
int midpoint = 0, first = 0, last;
last = data.length - 1;
while ( first <= last ) {
midpoint = (first + last) / 2;
if ( data[midpoint] == key )
return midpoint;
else if ( data[midpoint] > key )
last = midpoint - 1;
else
first = midpoint + 1;
}
return -1;
}
Example 5b: An applied binary search.
After the requirement of sorting the data has been met, we begin by setting up pickets representing the search area's beginning and end. We then find the midpoint of that expanse and examine that location. If the key
value has been found, we return the location. Otherwise, we either move the first
picket or move the last
picket. This reduces the search space to half of what it was.
Each iteration of these steps reduces the field by half. So, after the first comparison, you have eliminated 50%. After the second comparison, the reduction is 75%. After the third, the reduction is 87.5%, and so on.
When the first
and last
variables cross over such that first
is greater than last
, we know the array has been exhausted. We simply return -1
to indicate it was not found.
Hashed
Hash (Table) Search
This particular topic eventually warranted its own chapter. You an find out all about hash tables in Chapter 15 - Hashing.
Exercises
- (Beginner) Research the
Arrays.toString()
overloaded method. Pop it intoMedian
and/orMode
, above, and prove that the sort is happening. - (Intermediate) Modify Example 3 to handle a multi-modal solution. (Hint: You only need to maintain one
mcount
, but multiplemode
values.) - (Intermediate) Modify
bubbleSort1()
andbubbleSort2()
to be more efficient by adjusting the inner loop to the array upper bound the size minus the iteration number. - (Advanced) Modify the Insertion Sort to be head-recursive. (See Chapter 13 - Recursion)