Massive Algorithms: Radix Sort

Radix Sort | GeeksforGeeks

Radix Sort | GeeksforGeeks
What if the elements are in range from 1 to n²?

The lower bound for Comparison based sorting algorithm is $\Omega(nLogn)$ , i.e., they cannot do better than nLogn.

Counting sort is a linear tine sorting algorithm that sort in O(n+k) time when elements are in range from 1 to k.

overall time complexity is O((n+b) * log_b(k)). b is the base for representing numbers

Why Radix Sort Works?

In mathematics, radix means base, where decimal would be base 10.

The idea of Radix Sort is to do digit by digit sort starting from least significant digit to most significant digit. Radix sort uses counting sort as a subroutine to sort.

1) Do following for each digit i where i varies from least significant digit to the most significant digit.
………….a) Sort input array using counting sort (or any stable sort) according to the i’th digit.
Example:

Original, unsorted list:

170, 45, 75, 90, 802, 24, 2, 66

Sorting by least significant digit (1s place) gives: [*Notice that we keep 802 before 2, because 802 occurred before 2 in the original list, and similarly for pairs 170 & 90 and 45 & 75.]

170, 90, 802, 2, 24, 45, 75, 66

Sorting by next digit (10s place) gives: [*Notice that 802 again comes before 2 as 802 comes before 2 in the previous list.]

802, 2, 24, 45, 66, 170, 75, 90

Sorting by most significant digit (100s place) gives:

2, 24, 45, 66, 75, 90, 170, 802

// A function to do counting sort of arr[] according to

// the digit represented by exp.

void countSort(int arr[], int n, int exp)

{

    int output[n]; // output array

    int i, count[10] = {0};

    // Store count of occurrences in count[]

    for (i = 0; i < n; i++)

        count[ (arr[i]/exp)%10 ]++;

    // Change count[i] so that count[i] now contains actual position of

    // this digit in output[]

    for (i = 1; i < 10; i++)

        count[i] += count[i - 1];

    // Build the output array

    for (i = n - 1; i >= 0; i--)

{

        output[count[ (arr[i]/exp)%10 ] - 1] = arr[i];

        count[ (arr[i]/exp)%10 ]--;

}

    // Copy the output array to arr[], so that arr[] now

    // contains sorted numbers according to curent digit

    for (i = 0; i < n; i++)

        arr[i] = output[i];

}

// The main function to that sorts arr[] of size n using Radix Sort

void radixsort(int arr[], int n)

{

    // Find the maximum number to know number of digits

    int m = getMax(arr, n);

    // Do counting sort for every digit. Note that instead of passing digit

    // number, exp is passed. exp is 10^i where i is current digit number

    for (int exp = 1; m/exp > 0; exp *= 10)

        countSort(arr, n, exp);

}

基数排序
基数排序的主要思想是把数字按位进行比较，通过比较每个数特定位的大小来移动数组元素。既然是按位进行比较，那么既可以从左往右，也可以从右往左.

基数排序号称可以在O(n)的时间复杂度完成排序，其实所言非虚，只是真正的时间复杂度应该是O(kN)，k指待排序列最大值的位数，同时不能遗漏的是基数排序需要O(N)的空间充当“桶”的角色。根据基数排序的排序的原理，可知它比较适合于正整数的排序，而对于负数或者浮点数，字符串等排序，需要将待排序列做一定得变换.

根据按位比较的方向，可以将基数排序的实现分为两种，MSD（Most significant digital，从高到低）和LSD（Least significant digital，从低到高）。

首先写一个获取指定位的辅助函数

private static final int radix = 10;

private static int getDigit(int num, int d){

    int[] rd = new int[]{1,1,10,100,1000};      

    return (num / rd[d])%10;

}

先实现MSD算法，也就是从高到低位的比较顺序。

public static void msdRadixSort(int[] nums, int start, int end, int d){

    if(nums == null || nums.length < 2) return;

    int len = end - start + 1;

    int[] bucket = new int[len];

    int[] count = new int[radix+1];

    int i = 0;

    int j = 0;

    for(i=start; i<=end; i++){

        count[getDigit(nums[i], d)]++;

}

    for(i=1; i<=radix; i++){

        count[i] += count[i-1];

}

    for(i = end; i>=start; i--){

        j = getDigit(nums[i], d);

        bucket[count[j] - 1] = nums[i];

        count[j]--;

}

    for(i=start, j=0; i<=end; i++,j++)

        nums[i] = bucket[j];

    for(i=0; i<radix; i++){

        int left = start + count[i];

        int right = start + count[i+1]-1;

        if(left < right && d>1){

            msdRadixSort(nums, left, right, d-1);

}

}

}

再实现LSD算法。

public static void lsdRadixSort(int[] nums, int d){

    if(nums == null || nums.length <2 || d<1) return;

    int[] count = new int[radix + 1];

    int[] bucket = new int[nums.length];

    for(int k=1; k<=d; k++){

        Arrays.fill(count, 0);

        for(int i=0; i<nums.length; i++){

            count[getDigit(nums[i], k)]++;

}

        for(int i=1; i<=radix; i++)

            count[i] += count[i-1];

        // fill the bucket

        for(int i=nums.length-1; i>=0; i--){

            int j = getDigit(nums[i], k);

            bucket[count[j] - 1] = nums[i];

            --count[j];

}

        // copy to the original array

        System.arraycopy(bucket, 0, nums, 0, nums.length);

}

}

http://www.sanfoundry.com/java-program-implement-radix-sort/
public static void sort(int[] a) {
int m = Collections.max(Ints.asList(a)), exp = 1;
while (m / exp > 0) {
countSort(a, exp);
exp *= 10;
}
}

public static void countSort(int[] a, int exp) {
int[] bucket = new int[10];
int i, n = a.length;
int[] b = new int[10];
for (i = 0; i < n; i++)
bucket[(a[i] / exp) % 10]++;
for (i = 1; i < 10; i++)
bucket[i] += bucket[i - 1];
for (i = n - 1; i >= 0; i--)
b[--bucket[(a[i] / exp) % 10]] = a[i];
for (i = 0; i < n; i++)
a[i] = b[i];
}

Correctness Proofs for Radix Sort
First proof of correctness. Consider any two elements in the unsorted array, say A[i] and
A[j] for i < j. Let us denote the digits of these numbers as follows:
A[i] = b1b2 · · · bd
A[j] = c1c2 · · · cd
Suppose that A[i] != A[j], and let t be the first digit where they differ. That is, we have
b1 = c1, . . . , bt−1 = ct−1 but bt != ct
. For concreteness, suppose that bt < ct
. Then A[i]
should be sorted before A[j]. (The case bt > ct
is analogous.)
When j = t in Radix-Sort, we will perform a sort by the t-th digit, which will
ensure that A[i] is before A[j]. All sorts after this are for digits j < t, where A[i] and
A[j] have equal digits. Since Counting-Sort is stable, each of these sorts leaves A[i]
before A[j] in the array. Hence, A[i] is before A[j] when Radix-Sort finishes.

The next proof relies on the following lemma. This proof plus the lemma is a bit
longer than the proof above, but it tells us a bit more about what Radix-Sort is doing.
Lemma 1. After t sorts in Radix-Sort, the elements of A are sorted according to
their last t digits (i.e., digits d − t + 1, . . . , d).
The second proof is a trivial consequence of this lemma.
Second proof of correctness. Apply the above lemma to the case t = d. This says that,
after all d sorts, the numbers are sorted according to all their digits.

Proof of Lemma. We will prove this by induction on t.

For t = 1, the claim is that, after the first sort, the numbers are sorted according to

their last digit. Since j starts at d, the first call to Counting-Sort is for the d-th digit,

so after that first sort is done, they are properly sorted according to the d-th digit.

For t > 1, the t-th sort comes after the t − 1 previous sorts. By induction, after the

(t − 1)-st sort, the numbers are sorted according to their last t − 1 digits (i.e., digits

d − t + 2, . . . , d). The t-th sort now sorts them by digit k := d − t + 1.

Consider any two elements A[i] and A[j] and label their digits as above:

A[i] = b1b2 · · · bd

A[j] = c1c2 · · · cd

If we have bk < ck, then the number represented by the last t digits of A[i], which is

bkbk+1 · · · bd, is smaller than A[j], which is ckck+1 · · · cd. Since bk < ck and the last call

to Counting-Sort was for digit k, this will have ensured that A[i] is before A[j].

It only remains to consider the case that bk = ck. In that case, the number

bkbk+1 · · · bd is smaller than ckck+1 · · · cd if and only if bk+1bk+2 · · · bd is smaller than

ck+1ck+2 · · · cd. By the inductive hypothesis, after the (t−1)-st sort (just before the t-th

sort), the numbers were sorted properly according to these digits. As we just argued,

the proper order of A[i] and A[j] according to digits k, . . . , d is the same as according to

digits k + 1, . . . , d. And since bk = ck and Counting-Sort is stable, the t-th sort will

leave them in the same order, which is properly ordered.

Read full article from Radix Sort | GeeksforGeeks

Radix Sort | GeeksforGeeks

Labels

Popular Posts