Inversions in array

Let A[0…n – 1] be an array of n distinct positive integers. If i < j and A[i] > A[j], then the pair (i, j) is called an inversion of A. Given n and an array A, find the number of inversions in A. For example, the array [2, 5, 1] has two inversions, (2,1) and (5,1), whereas [2, 4, 1, 3] has three inversions: (2,1), (4,1) and (4,3).

How many inversions can there be in a sorted array? None: a sorted array has no inversions, while a completely reversed array has the maximum possible number, nC2, since every pair is an inversion. For example, [4, 3, 2, 1] has 4C2 = 6 inversions.

Inversions in array: thought process

What is the first thing that comes to mind? For each index i, check every j > i and see whether A[j] < A[i].
If A[j] is smaller than the current element A[i], increase the inversion count. The implementation is given below.

package com.company;

/**
 * Created by sangar on 6.4.18.
 */
public class Inversions {
    public static int findInversions(int[] a){
        int count = 0;
        //for every element, count the elements to its right that are smaller than it
        for(int i=0; i<a.length; i++){
            for(int j=i+1;  j<a.length; j++){
                if(a[i] > a[j]) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        int inversions = findInversions(a);
        System.out.print("Inversions : " + inversions);
    }
}

The worst-case complexity of this method to find inversions in an array is O(n²).

Can we use the information that a sorted array has no inversions? Let's ask the question with a tweak: how many inversions are there if only one adjacent pair is out of order in an otherwise sorted array? There is one. What if two adjacent pairs are out of order? Then there are two. Are we getting somewhere? Effectively, the number of inversions is the number of adjacent swaps needed to completely sort the array from its original state. Notice that the brute-force method above is essentially the comparison loop of selection sort, counting inversions instead of swapping elements.

What if we use another sorting algorithm, such as merge sort? The complexity of merge sort is much better than that of selection sort, so it should be possible to identify inversions much more efficiently as well.
Let's use the merge step of merge sort with a small modification; the divide part remains the same. While merging two sorted subarrays, count the number of times an element from the right part is placed into the result array before elements of the left part.
Every time A[i] is appended to the output of the merge, no new inversions are encountered, since A[i] is smaller than everything remaining in array B. If B[j] is appended to the output, then it is smaller than all the items remaining in A, so we increase the inversion count by the number of elements remaining in A.
The overall number of inversions is the inversions in the left part + the inversions in the right part + the inversions counted in the merge step.

Let's see how inversions are counted in the merge step, and then look at the overall algorithm.

(Figure: counting inversions during the merge step; the example array has 6 inversions in total.)

Overall, the algorithm looks like the figure below.

(Figure: counting inversions in an array using merge sort.)

Algorithm to count inversions in array

First, let's write the algorithm to count inversions in the merge step. At the merge step, we have two sorted arrays, A and B:

  1. Initialize i to the start position of A and j to the start position of B. These pointers reference the indices currently being compared in the two arrays.
  2. While there are elements in both arrays, i.e. i < length(A) && j < length(B):
    1. If B[j] < A[i], then all elements from i to length(A) are greater than B[j], so
      count += number of elements remaining in A, and increment j
    2. Else, increment i
  3. Return count

Replace the merge part of merge sort with this algorithm and return the sum of the inversions in the left half, the inversions in the right half, and the inversions counted during the merge.

MergeSortAndCount(L):

  1. If L has one element, return 0
  2. Divide L into A, B
    1. inversionsLeft = MergeSortAndCount(A)
    2. inversionsRight = MergeSortAndCount(B)
    3. inversionsMerge = MergeAndCount(A, B)
  3. Return inversionsLeft + inversionsRight + inversionsMerge

Inversions in array implementation

package com.company;

/**
 * Created by sangar on 6.4.18.
 */
public class Inversions {
    public  static int mergeAndCount(int[] a, int low, int mid, int high){
        int count  =  0;
        int[] temp = new int[high-low+1];

        int i = low;
        int j = mid+1;
        int k = 0;
        /*
            While there are elements in both halves, compare and copy the smaller one
        */
        while(i<=mid && j<=high){

            if(a[i] > a[j]){
                //Number of elements remaining on left side.
                count+= (mid - i + 1);
                temp[k++] = a[j++];
            }
            else{
                temp[k++] = a[i++];
            }
        }
        while(i<=mid){
            temp[k++] = a[i++];
        }
        while(j<=high) {
            temp[k++] = a[j++];
        }

        for(i=low; i<k+low; i++){
            a[i] = temp[i-low];
        }

        return count;
    }

    public static int countInversions(int[] a, int low, int high){
        if(low >= high) return 0;

        int mid = low + (high - low) / 2;

        int inversionsLeft = countInversions(a, low, mid);
        int inversionsRight = countInversions(a, mid+1, high);
        int inversionsMerge = mergeAndCount(a, low, mid, high);

        return inversionsLeft + inversionsRight + inversionsMerge;
    }


    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        int inversions = countInversions(a, 0, a.length-1);
        System.out.print("Inversions : " + inversions);
    }
}

The complexity of finding inversions in an array using the merge sort method is O(n log n).

Please share if there is something wrong or missing. If you are interested in contributing to the website and sharing your knowledge with learners, please contact us at communications@algorithmsandme.com

Dutch National Flag Algorithm

Given an array of 0s, 1s and 2s, sort the array in increasing order. Another way to phrase the problem: sort balls of three different colors (red, white and blue) so that balls of each color are placed together. This is commonly known as the Dutch national flag problem, and the algorithm that solves it is called the Dutch national flag algorithm. Example:
A = [0,1,0,1,1,2,0,2,0,1]
Output = [0,0,0,0,1,1,1,1,2,2]

A = [R,B,B,W,B,R,W,B,B]
Output = [R,R,W,W,B,B,B,B,B]

This problem can also be asked as a design question. Say you have to design a robot; all this robot does is look at three empty buckets and one bucket full of colored balls (red, white and blue), and fill each empty bucket with balls of a single color. Designing the instruction set for this robot is the same problem as the Dutch National Flag problem.

Counting method to sort an array of 0s, 1s and 2s

We have already seen a similar problem before: segregate 0s and 1s in an array. There, we explored how to count the elements and rewrite them back onto the array.

Let's apply the same method to this problem. Take an array of three integers in which each index stores the count of the corresponding number, e.g. count[0] stores the count of 0s and count[1] the count of 1s. Scan through the array and count each element. At the end, rewrite the numbers back onto the array starting from index 0, according to their counts. For example, if there are four zeros, then starting from index 0, write four zeros, followed by the 1s and then the 2s.

The complexity of the counting method is O(n), but note that we scan the array twice: once to count the elements and once to write them back.
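A minimal sketch of this counting approach in C (countingSort012 is a hypothetical name, and it assumes the array really contains only 0s, 1s and 2s):

#include <stdio.h>

/* Counting approach: count the 0s, 1s and 2s, then rewrite the array. */
void countingSort012(int a[], int size){
    int count[3] = {0, 0, 0};

    /* First pass: count occurrences of each value. */
    for(int i = 0; i < size; i++){
        count[a[i]]++;
    }

    /* Second pass: write 0s, then 1s, then 2s back into the array. */
    int k = 0;
    for(int v = 0; v <= 2; v++){
        for(int c = 0; c < count[v]; c++){
            a[k++] = v;
        }
    }
}

int main(void) {
    int a[] = {0,1,0,1,1,2,0,2,0,1};
    int size = sizeof(a)/sizeof(a[0]);

    countingSort012(a, size);

    for(int i = 0; i < size; i++){
        printf("%d ", a[i]);
    }
    return 0;
}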

Dutch national flag algorithm

  1. Start with three pointers: reader, low and high.
  2. reader and low are initialized to 0, and high is initialized to the last index of the array, size-1.
  3. reader is used to scan the array, while low and high are used to swap elements to their desired positions.
  4. Starting from the current position of reader, follow the steps below until reader crosses high (keep in mind that 0s belong at the start of the array):
    1. If the element at index reader is 0, swap the element at reader with the element at low and increment both low and reader.
    2. If the element at reader is 1, do not swap; just increment reader.
    3. If the element at reader is 2, swap the element at reader with the element at high and decrement high.

Effectively, the three pointers divide the array into four parts: red (0s), white (1s), unknown, and blue (2s). Every element is taken from the unknown part and put into its right place, so the three known parts grow while the unknown part shrinks.

Let's take an example and see how the Dutch national flag algorithm works.

First step, initialize reader, low and high.

The element at reader is 0, so swap the elements at reader and low, and increment both reader and low.

Following the same steps, check the element at reader again; it is 1, so just move reader forward by one.

Element at reader is now 2, swap element at reader with element at high and decrease high by 1.

Element at reader is 1, just increment reader.

Element at reader is now 2, swap element at reader with element at high and decrease high by 1.

Element at reader is 1, just increment reader.

Element at reader is 1, just increment reader.

The element at reader is 0, so swap the elements at reader and low, and increment both reader and low.

The element at reader is 0, so swap the elements at reader and low, and increment both reader and low.

Element at reader is now 2, swap element at reader with element at high and decrease high by 1.

Here, high becomes less than reader, so we can stop; the array is sorted.

Dutch national flag algorithm implementation

package com.company;

/**
 * Created by sangar on 5.1.18.
 */
public class DutchNationalFlag {

    public static void swap(int[] input, int i, int j){
        int temp =  input[i];
        input[i] = input[j];
        input[j] = temp;
    }

    public static void dutchNationalFlagAlgorithm(int [] input){

        //initialize all variables
        int reader = 0;
        int low = 0;
        int high = input.length - 1;

        while(reader <= high){
            if(input[reader] == 0){
                /*When element at reader is 0, swap 
                element at reader with element at index 
                low and increment reader and low*/
                swap(input, reader, low);
                reader++;
                low++;
            }
            else if(input[reader] == 1){
                /* If element at reader is 1, just 
                increment reader by 1 */
                reader++;
            }
            else if(input[reader] == 2){
                /* If element at reader is 2, swap
                 element at reader with element at
                 high and decrease high by 1 */
                swap(input, reader, high);
                high--;
            }
        }

    }
    public static void main(String[] args) {
        int[] input = {2,2,1,1,0,0,0,1,1,2};

        dutchNationalFlagAlgorithm(input);

        for(int i=0; i<input.length; i++){
            System.out.print(" " + input[i]);
        }
    }
}

The complexity of the Dutch National Flag algorithm is O(n), and unlike the counting method, we scan the array only once.

Please share if you have some suggestions or comments. Sharing is caring.

3-way quicksort (quick sort on non-unique elements)

In the post Dutch National Flag Algorithm, we discussed how to sort an array in linear time by partitioning it into four parts: red, white, unknown and blue. Can we apply the same technique to quick sort? As we know, quick sort relies on partitioning the input, and the input is partitioned into two parts. What if we divide the input space into three parts? Then it becomes 3-way quicksort.

The 3-way partition variation of quick sort has slightly higher overhead than the standard 2-way partition version. Both have the same best, typical and worst case time bounds, but the 3-way version is highly adaptive in the very common case of sorting with few unique keys.

Quicksort basics and limitation

Before going ahead with the 3-way partition method, I strongly recommend going through the usual quick sort algorithm: Quick sort algorithm in C

A big limitation of quick sort is its O(n²) worst-case running time. Some improvements have been suggested, as given below:

  • Cutoff to insertion sort. Switch to insertion sort for subarrays smaller than a pre-decided limit. We follow quick sort and partition the array; once the size of a partitioned subarray drops below the limit, apply insertion sort on it. The limit varies from system to system and is typically between 5 and 15.
  • Median-of-three partitioning. Use the median of a small sample of items taken from the array as the partitioning item. Doing so gives a slightly better partition, at the cost of computing the median; a sketch follows this list.
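For illustration, the median of the first, middle and last elements could be found with a small helper like the one below (a hedged sketch; medianOfThree is a hypothetical name, not part of the original code). The chosen element would then be swapped into whichever position the partition routine treats as the pivot.

/* Hypothetical helper: returns the index of the median of
   a[low], a[mid] and a[high], so that element can be swapped
   into the pivot position before partitioning. */
int medianOfThree(int a[], int low, int high){
    int mid = low + (high - low) / 2;

    if((a[low] <= a[mid] && a[mid] <= a[high]) ||
       (a[high] <= a[mid] && a[mid] <= a[low]))
        return mid;

    if((a[mid] <= a[low] && a[low] <= a[high]) ||
       (a[high] <= a[low] && a[low] <= a[mid]))
        return low;

    return high;
}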

There is another optimization, called entropy-optimal sorting or 3-way partition quick sort. This method is useful when the input array contains many duplicate values, which is very frequent in the real world. The idea is to divide the array into three parts rather than two. Let P be the pivot: the first part contains all numbers less than P, the second part contains numbers equal to P, and the last part contains numbers greater than P.

3-way quicksort algorithm 

3-way quicksort is an optimization that works in specific cases, for example when the input to be sorted contains only a few unique keys; in that case the traditional single-pivot approach does not perform as well as 3-way quicksort.
Some properties of 3-way quicksort:
  • It is not stable; when stability is a requirement, avoid quicksort in general.
  • It uses O(log n) extra space. Why? Because of the recursion.
  • Its worst-case running time is the same as classic quicksort, O(n²), but it typically runs in O(n log n) time.
  • Best of all, it takes O(n) time when there are only O(1) unique keys.

This algorithm partitions the array into three parts:
1. elements less than the pivot
2. elements equal to the pivot
3. elements greater than the pivot

3-way quicksort Implementation

#include <stdio.h>

int min(int a, int b){
	if(a > b)
		return b;
	return a;
}

void swap(int a[], int i, int j){
	int temp = a[i];
	a[i] = a[j];
	a[j] = temp;
}

void quicksort(int a[], int low, int high){
	if(low >= high) return;

	int pivot = a[high];

	int lt = low;
	int reader = low;
	int gt = high;

	/* Invariant: a[low..lt-1] < pivot, a[lt..reader-1] > pivot,
	   a[reader..gt-1] unknown, a[gt..high] == pivot */
	while(reader < gt){
		if(a[reader] < pivot){
			swap(a, reader, lt);
			reader++;
			lt++;
		}
		else if(a[reader] == pivot){
			gt--;
			swap(a, reader, gt);
		}
		else{
			reader++;
		}
	}

	/* Move the block of elements equal to the pivot from the end
	   of the range into the middle, between the < and > parts. */
	int minimum = min(gt-lt, high-gt+1);
	for(int i=0; i<minimum; i++){
		swap(a, lt+i, high-minimum+1+i);
	}

	quicksort(a, low, lt-1);
	quicksort(a, high-gt+lt, high);
}

int main(void) {
	int a[] = {4,3,3,2,7,9,2,3,5,6,7,4};
	int size = sizeof(a)/sizeof(a[0]);

	quicksort(a, 0, size-1);
	
	printf("\n");
	for(int i=0; i<size; i++){
		printf("%d ", a[i]);	
	}

	return 0;
}

The worst-case complexity of 3-way quicksort is O(n²), but it typically runs in O(n log n) time, and in O(n) time when there are only a constant number of unique keys.

Please share if something is wrong, or if you have any other suggestion or query. We would love to hear what you have to say.

Merge sort algorithm

We can classify sorting algorithms based on their complexity: selection sort, bubble sort and insertion sort have complexity O(n²), while heap sort, merge sort and quick sort (with some assumptions) have complexity O(n log n), and counting and radix sorts are linear, O(n), algorithms. In this post, we will concentrate on the merge sort algorithm.

Key points about merge sort algorithm

  • Class of algorithm: divide and conquer
  • Complexity: time O(n log n)
  • Space: O(n)
  • Stable: yes
  • In place: no

Merge sort falls in the divide and conquer class of algorithms, where the input space is divided into subproblems, the subproblems are solved individually, and their solutions are combined to solve the original problem. The figure below explains the divide and conquer paradigm.

(Figure: the divide and conquer strategy.)

In merge sort, the input space is divided into two subproblems until we reach the smallest subproblem; the smallest subproblems are then sorted and combined back up until the original input is sorted. The question is: what is the smallest subproblem? It is the point where the input cannot be divided further, a subproblem with only one item to sort.

Once the split is done down to the smallest size, merging starts. Going bottom-up, start with two one-item subproblems and combine them into a two-item subproblem, then combine two two-item subproblems into a four-item subproblem, and so on until the problem reaches the original size.

Implementation of Merge sort algorithm

As is the case with all divide and conquer problems, the same processing applied to the original problem is applied to each subproblem, which is a natural case for a recursive solution. Recursion needs a termination condition: the base condition in merge sort is a subproblem with only one element to sort. How do you sort a single element? You don't; you just return it.

Now, how do we combine the solutions of two subproblems? This step is the conquer part: sort the smaller parts and combine them with the merge operation. In the merge operation, scan both sorted arrays element by element and, based on the comparison, put the smaller of the two items into the output array, until both arrays are scanned. If one array finishes before the other, copy all remaining elements of the other array into the output array. Finally, copy the output array back into the original input array so that it can be combined with bigger subproblems, until the solution to the original problem is derived.

#include <stdio.h>

/* Merge two sorted subarrays a[start..mid] and a[mid+1..end] */
void merge(int a[], int start, int mid, int end){
 
    int i,j,k;
    int temp[end-start+1];
 
    i = start; j = mid+1; k =0;
    /* Compare all elements of two sorted arrays and store
      them in sorted array. */
    while(i <= mid && j <= end){
        if(a[i]< a[j]){
            temp[k++]= a[i++];
        }
        else {
            temp[k++]  = a[j++];
        }
    }
    while(i <= mid){
        temp[k++] = a[i++]; 
    }
    while(j <= end){
        temp[k++] = a[j++]; 
    }
    //Copy the temporary array into original array
    for(i=0; i<k; i++){
        a[i+start] = temp[i];
    }
}
void mergeSort(int a[], int start, int end){
 
    int mid  = (start + end)/2;
    if(start<end){
        //sort the left part
        mergeSort(a, start, mid);
        //sort right part
        mergeSort(a, mid+1, end);
 
        //merge them together
        merge(a, start, mid, end);
    }
}
int main(){
	int a[] = {2,3,4,1,8,7,6,9};
	int size = sizeof(a)/sizeof(a[0]);
 
	mergeSort(a, 0, size-1);
	for(int i=0; i<size; i++){
		printf("%d ", a[i]);
	}
}

Let's analyse the complexity of the merge sort algorithm; each divide and conquer step has two parts:
1. The split step, which takes a constant amount of time: all it does is compute the middle index to create two halves, irrespective of the size of the array. So the complexity of this step is O(1).

2. The combine (merge) step, whose complexity is O(n), where n is the number of elements.
So the complexity of each divide and conquer level is O(n). How many such levels are there?
For n elements, there are O(log n) levels. Hence, the complexity of merge sort is O(n log n). Equivalently, the running time satisfies the recurrence T(n) = 2T(n/2) + O(n), which solves to O(n log n).

However, there is one caveat: every time we merge two subarrays, an auxiliary array is needed to temporarily store the elements, which gives the merge sort algorithm O(n) space complexity.

There are some improvements which can be done on this algorithm.
1. When the number of elements is less than some threshold, one can use insertion or selection sort to sort those numbers. Why? Because when n is small, O(n²) is as good as O(n log n), and it saves the extra overhead of splitting and merging. All in all, using insertion sort on small subarrays can give better performance.

2. Before merging, check whether all elements in the right subarray are greater than those in the left subarray; if so, no merge is needed. This is easy to check by comparing a[mid] with a[mid+1]: if a[mid] is less than a[mid+1], the two subarrays are already in order and the merge can be skipped. A sketch of this check is given below.
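For illustration, the check could sit in the recursive step like this (a hedged sketch; mergeSortOptimized is a hypothetical name, and it reuses the merge() function from the implementation above):

/* Skip merging when the two sorted halves are already in order. */
void mergeSortOptimized(int a[], int start, int end){
    if(start < end){
        int mid = (start + end)/2;

        mergeSortOptimized(a, start, mid);
        mergeSortOptimized(a, mid+1, end);

        /* a[mid] is the largest element of the sorted left half and
           a[mid+1] the smallest of the sorted right half; if they are
           already in order, a[start..end] is sorted and merge is skipped. */
        if(a[mid] > a[mid+1]){
            merge(a, start, mid, end);
        }
    }
}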

Please let us know what you think of this article and whether something is wrong or missing.

Quick sort algorithm

Quick sort, like merge sort, is a sorting algorithm under the divide and conquer paradigm. The basic idea of the algorithm is to divide the input around a pivot, sort the two smaller parts recursively, and thereby get the original input sorted.

Selection of pivot

The entire idea of quick sort revolves around the pivot. The pivot is an element of the input around which the input is arranged such that all elements on its left side are smaller and all elements on its right side are greater than the pivot. The question is how to select the pivot and put it into its correct position.

To keep things simple to start with, let's take the first element of the input as the pivot.

To put this pivot at its correct position, start from the element after the pivot and move right until you find the first element greater than the pivot; let that be position i.

At the same time, start from the end of the array and move left until you find the first element smaller than the pivot; let that be position j.

If i and j have not crossed each other, i.e. i < j, swap the elements at positions i and j, then continue moving right to find an element greater than the pivot and moving left to find an element smaller than the pivot.
Once i and j cross each other, swap the pivot with the element at position j. After this step, the pivot is at its correct position and the array is divided into two parts: all elements on the left side are less than the pivot and all elements on the right side are greater than it.

Quick sort partition example

This is a lot to process, I know! Let's take an example, shown in the figures below, and see how it works.

Let's select the first element as the pivot: pivot = 3.

Start from the element after the pivot and move towards the right of the array until we see the first element greater than the pivot, i.e. greater than 3.

From the end of the array, move towards the left until you find an element less than the pivot.

Now there are two indices, i and j, where A[i] > pivot and A[j] < pivot. Since i and j have not yet crossed each other, we swap A[i] with A[j]. The array at the bottom of the picture shows the result after the swap.

Again, start from i+1 and follow the same rule: stop when you find an element greater than the pivot. In this case, 10 is greater than 3, so we stop.

Similarly, move left from the end again until we find an element less than the pivot. In this case, we end up at index 2, which holds the element 1.

Since i > j, the pointers have crossed. At this point, instead of swapping the elements at indices i and j, swap the element at index j with the pivot.

After swapping the pivot with the element at index j, the array is divided into two parts with the pivot as the boundary. All elements on the left side of the pivot are smaller (though not necessarily sorted) and all elements on the right side are greater than the pivot (again, not necessarily sorted).

We apply the same partition process to the left and right subarrays until the base condition is hit; in this case, the base condition is a subarray with at most one element.

Quick sort algorithm

quickSort(arr, start, end)
 1. If the array has more than one element, i.e. (start < end):
    1.1 Find the correct place for the pivot:
        pivot = partition(arr, start, end)
    1.2 Apply the same function recursively to the left of the pivot index:
        quickSort(arr, start, pivot - 1)
        and to the right of the pivot index:
        quickSort(arr, pivot + 1, end)

Quick sort implementation

#include <stdio.h>

void swap(int a[], int i, int j){
	int temp = a[i];
	a[i] = a[j];
	a[j] = temp;
}

int partition(int a[], int start, int end){
	
	int pivot = a[start];
	int i  = start+1;
	int j  = end;
	
	while(i <= j){
	    /* move right until an element not smaller than the pivot is found
	       (or the end of the range is reached) */
	    while(i <= end && a[i] < pivot) i++;
	    /* move left until an element not greater than the pivot is found;
	       a[start] == pivot guarantees this stops at start in the worst case */
	    while(a[j] > pivot) j--;
	
	    if(i >= j) break;
	
	    swap(a, i, j);
	    i++;
	    j--;
	}
	/* place the pivot at its correct position */
	swap(a, start, j);
	return j;
}

void quickSort(int a[], int start, int end){
    if(start < end){
        int p = partition(a, start, end);
	quickSort(a,start, p-1);
	quickSort(a, p+1, end);
    }
}

int main(void) {
	int a[]= {4,3,2,5,6,8,1};
	int size = sizeof(a)/sizeof(a[0]);
	
	quickSort(a, 0, size-1);
	
	for(int i=0; i < size; i++){
		printf(" %d", a[i]);
	}
	return 0;
}

There is another implementation based on the Lomuto partition scheme, in which the last element is chosen as the pivot. The implementation is more compact, but it typically performs more swaps than the original partition method.

#include<stdlib.h>
#include<stdio.h>
 
void swap(int *a, int *b){
    int temp = *a;
    *a = *b;
    *b = temp;
}
 
int partition(int a[], int low, int high)
{
    // choose the last element as the pivot
    int x  = a[high];
 
    // i marks the end of the region of elements <= pivot; it starts just before low
    int i = low - 1;
 
    for (int j = low; j <= high-1; j++)
    {
    	/* If the current element is less than or equal
        to the pivot, it belongs in the left region */
        if (a[j] <= x){
            //set the current low appropriately
            i++;
            swap(&a[i], &a[j]);
        }
    }
    //Now swap the element just after the <= region with the pivot
 
    swap(&a[i+1], &a[high]);
 
    // debug output: print the pivot and the array after this partition step
    printf("\n Pivot : %d\n", a[i+1]);
    for(int j=0; j<=high; j++){
 
    	printf("%d ", a[j]);
    }
    //return current low as partitioning point.
    return i+1;
}
 
/* A recursive implementation of quicksort using the Lomuto partition scheme */
void quickSortUtil(int a[], int low, int high)
{
    if (low < high)
    {
        int p = partition(a,low, high);
        quickSortUtil(a,low, p-1);
        quickSortUtil(a, p+1, high);
    }
}
 
/* Driver program to run above code */
int main(){
 
    int a[] = {5,4,2,7,9,1,6,10,8};
 
    int size = sizeof(a)/sizeof(a[0]);
    quickSortUtil(a, 0, size-1);
 
    for(int i=0; i<size; i++){
    	printf("%d ", a[i]);
    }
    return 0;
}

Complexity analysis of quick sort algorithm

If the pivot splits the original array into two equal parts (which is the intention), the complexity of quick sort is O(n log n). However, the worst case of quick sort happens when the input array is already sorted in increasing or decreasing order. In this case, the array is partitioned into two subarrays, one of size 1 and the other of size n-1. The subarray with n-1 elements is in turn divided into subarrays of size 1 and n-2, and so on. To completely sort the array there are n-1 such splits, and each split traverses the remaining elements to find the correct position of the pivot, giving (n-1) + (n-2) + … + 1 comparisons. Hence the overall complexity of quick sort comes out as O(n²).

There is a very interesting question here which tests your understanding of system basics: what is the space complexity of this algorithm? No memory is allocated explicitly. However, the recursive implementation internally pushes stack frames for the partition indices, the function call return address, and so on. In the worst case there can be n stack frames, hence the worst-case space complexity of quick sort is O(n).

How can we reduce that? If the partition with the fewest elements is (recursively) sorted first, it requires at most O(log n) space. The other partition is then sorted using tail recursion or iteration, which does not add to the call stack. This idea was described by R. Sedgewick; it keeps the stack depth bounded by O(log n), and hence the space complexity is O(log n).

Quick sort with tail recursion

Quicksort(A, p, r)
{
 while (p < r)
 {
  q = Partition(A, p, r)
  Quicksort(A, p, q)
  p = q+1
 }
}
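Note that the pseudocode above always recurses on the left part; to guarantee the O(log n) stack bound, the recursive call should go to the smaller partition and the loop should continue on the larger one. A hedged sketch in C (quickSortSmallerFirst is a hypothetical name; it reuses the partition() function from the implementation above, which returns the pivot's final index):

/* Recurse on the smaller partition and iterate on the larger one,
   keeping the stack depth bounded by O(log n). */
void quickSortSmallerFirst(int a[], int start, int end){
    while(start < end){
        int p = partition(a, start, end);

        if(p - start < end - p){
            /* left part is smaller: recurse on it, loop on the right part */
            quickSortSmallerFirst(a, start, p - 1);
            start = p + 1;
        }
        else{
            /* right part is smaller: recurse on it, loop on the left part */
            quickSortSmallerFirst(a, p + 1, end);
            end = p - 1;
        }
    }
}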

Selection of Pivot
If the array is completely sorted, the worst-case behaviour of quick sort is O(n²), which raises another problem: how can we select the pivot so that the two subarrays are of almost equal size? Many solutions have been proposed:
1. Taking the median of the array as the pivot. How to select the median of an unsorted array is a problem we will look into separately, but it does guarantee two halves of the same size.
2. Selecting the pivot randomly. A random pivot makes the worst case very unlikely for any particular input; a sketch is given below.
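A hedged sketch of random pivot selection (randomizedPartition is a hypothetical name; it reuses the swap() and partition() functions from the implementation above, where partition() treats a[start] as the pivot):

#include <stdlib.h>

/* Pick a random index in [start, end], move that element to a[start],
   then let the existing partition() use it as the pivot. */
int randomizedPartition(int a[], int start, int end){
    int randomIndex = start + rand() % (end - start + 1);
    swap(a, start, randomIndex);
    return partition(a, start, end);
}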

Please leave a comment if you find something wrong or have an improved version.