Inversions in array

Inversions in array

Let A[0…n – 1] be an array of n distinct positive integers. If i < j and A[i] > A[j] then the pair (i, j) is called an inversion of A. Given n and an array A, find the number of inversions in array A. For example: First array has two inversions (2,1) and (5,1) where as second array has 3 inversions, (2,1), (4,1) and (4,3)

inversions in array

How many inversion can be in a sorted array? There is no inversion in sorted array and nC2 inversions in completely inverted array.

Inversions in array : Thought process

What first thing which comes to mind? For each index i, check all j where j > i and see if A[j] < A[i]?
If A[j] is greater than current element A[i], increase inversion count. Implementation is given below.

package com.company;

/**
 * Created by sangar on 6.4.18.
 */
public class Inversions {
    public static int findInversions(int[] a){
        int count = 0;
        for(int i=0; i<a.length; i++){
            for(int j=i+1;  j<a.length; j++){
                if(a[i] > a[j]) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        int inversions = findInversions(a);
        System.out.print("Inversions : " + inversions);
    }
}

Worst case complexity of this method to find inversions in array is O(n2).

Can we use the information that a sorted array does not have any inversion? Let’s ask this question again with a tweak, how many inversions will there be if only one element is out of place for a completely sorted array? There will one. What if there are two elements out of place? Of course, there will be 2 inversion. Are we getting somewhere? Effectively, we have to count, how many swaps we need to completely sort array from it’s original state. If you noticed, the brute force algorithm, is nothing but selection sort, instead of swapping elements, count inversions.

What if we use any other sort algorithm like Merge sort algorithm? Complexity of merge sort is much better than selection sort, then it is possible that merge sort identifies inversions much efficiently than selection sort.
Let’s use merge step of merge sort algorithm with a bit modification. Divide part will remains the same. While merging two sorted subarrays, count number of times element on right part is put on result array before element on left side.
Every time A[i] is appended to the output of merge sort, no new inversions are encountered, since A[i] is smaller than everything left in array B.  If B[j] is appended to the output, then it is smaller than all the remaining items in A, we increase the number of count of inversions by the number of elements remaining in A.
Overall inversions will be inversion in left part + inversions of right part and inversion which are in merge step.

Let’s see how inversions in merge steps are counted and then we will see the overall algorithm.

inversion in array

 

inversions in array

Total number of inversions is 6.

Overall, algorithm looks like below.

inversions in array using merge sort

Algorithm to count inversions in array

First, let’s write an algorithm to count inversions in merge step. When we are at merge steps, there are two sorted arrays with us:  A and B

  1. Initialize i as start position of A and j as start position of B. These pointers will reference to currently compared indices in both arrays.
  2. While there are elements in both arrays, i.e. i < length(A) && j < length(B)
    1.  If B[j] < A[i], all elements from i to length(A) are greater than B[j],
      count += number of elements remaining in A. Increment j
    2. Else increment i
  3. Return count

Replace merge part of merge sort with this piece of algorithm and return sum of inversions in left + inversions in right + inversions in merge from function.

MergeSortAndCount(L):

  1. If L has one element return 0
  2. Divide L into A, B
    1. inversionsLeft = MergeSortAndCount(A)
    2. inversionRight = MergeSortAndCount(B)
    3. inversionMerge = MergeAndCount(A,B)
  3. return inversionLeft + inversionRight + inversionMerge

Inversions in array implementation

package com.company;

/**
 * Created by sangar on 6.4.18.
 */
public class Inversions {
    public  static int mergeAndCount(int[] a, int low, int mid, int high){
        int count  =  0;
        int[] temp = new int[high-low+1];

        int i = low;
        int j = mid+1;
        int k = 0;
        /*
            There are elements on both side of array
        */
        while(i<=mid && j<=high){

            if(a[i] > a[j]){
                //Number of elements remaining on left side.
                count+= (mid - i + 1);
                temp[k++] = a[j++];
            }
            else{
                temp[k++] = a[i++];
            }
        }
        while(i<=mid){
            temp[k++] = a[i++];
        }
        while(j<=high) {
            temp[k++] = a[j++];
        }

        for(i=low; i<k+low; i++){
            a[i] = temp[i-low];
        }

        return count;
    }

    public static int countInversions(int[] a, int low, int high){
        if(low >= high) return 0;

        int mid = low + (high - low) / 2;

        int inversionsLeft = countInversions(a, low, mid);
        int inversionsRight = countInversions(a, mid+1, high);
        int inversionsMerge = mergeAndCount(a, low, mid, high);

        return inversionsLeft + inversionsRight + inversionsMerge;
    }


    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        int inversions = countInversions(a, 0, a.length-1);
        System.out.print("Inversions : " + inversions);
    }
}

Complexity of finding inversions in arrays using merge sort method is  O(n log n).

Please share if there is something wrong or missing. If you are interested in contributing to website and share your knowledge with learners, please contact us at communications@algorithmsandme.com

References: Lecture 8

Dutch National Flag Problem

Dutch National Flag Problem

Given an array with 0s,1s and 2s, sort array in increasing order. Another way to phrase this problem is sort balls with three different colors : red, blue and white, where balls of each color are placed together. This is typically know as Dutch national flag problem and algorithm to solve it is called Dutch national flag problem. Example:
A = [0,1,0,1,1,2,0,2,0,1]
Output = [0,0,0,0,1,1,1,1,2,2]

A = [R,B,B,W,B,R,W,B,B]
Output = [R,R,W,W,B,B,B,B]

This problem can be asked as design question, as let’s say you have to design a robot. All this robot does is : it see three empty buckets and a bucket full of colored balls (red, blue and white). Design an instruction set for this robot that it fill each empty buckets with one color. It’s the same problem as Dutch National Flag problem.

Count to sort an array of 0s,1s and 2s

We have already seen similar problem before: Segregate 0s and 1s in an array. We explored how to count elements and re-write them again on to array.

Let’s apply same method for this problem. Take an array of three integers, index store corresponding count for that number. E.g count[0] stores count of 0 and count[1] stores count of 1. Scan through array and count each element. At last, re-write those numbers back on to array starting from index 0, according to their count. For example, if there are 4 zeros, then starting from 0 index, write 4 zeros, followed by 1s and then by 2.

Complexity of counting method is O(n), notice that we scan array twice, first time for counts and second time for writing on array.

Dutch national flag problem : algorithm

  1. Start with three pointers : reader, low and high.
  2. reader and low are initialized as 0 and high is initialized as last element of array as size-1.
  3. reader will be used to scan the array while low and high will be used to swap elements to their desired positions.
  4. Starting with current position of reader, follow below steps, keep in mind we need 0 at start of array
    1. If element at index reader is 0, swap element at reader with element at low and increment low and reader by 1.
    2. If element at reader is 1, do not swap and increment reader by 1.
    3. If element at reader is 2, swap element at reader with element at high and decrease high by 1

Actually, three pointers divide array into four parts.  Red, white, unknown and Blue. Every element is taken from unknown part and put into its right place. So all three other parts expand while unknown part shrinks.

Let’s take an example and see how dutch national flag algorithm works.

First step, initialize reader, low and high.

Element at reader is 0, hence swap element at reader and low,also increment reader and low.

dutch national flag algorithm

Follow the same step, check element at reader again, it’s 1, hence, just move reader by one.

Element at reader is now 2, swap element at reader with element at high and decrease high by 1.

Element at reader is 1, just increment reader.

Element at reader is now 2, swap element at reader with element at high and decrease high by 1.

Dutch national flag algorithm explanation

Element at reader is 1, just increment reader.

Element at reader is 1, just increment reader.

Dutch national flag algorithm explanation

Element at reader 0, hence swap element at reader and low,also increment reader and low.

Element at reader 0, hence swap element at reader and low,also increment reader and low.

Dutch national flag algorithm explanation

Element at reader is now 2, swap element at reader with element at high and decrease high by 1.

Here, high becomes less than reader, we can stop as array is already sorted.

Dutch national flag algorithm implementation

package com.company;

/**
 * Created by sangar on 5.1.18.
 */
public class DutchNationalFlag {

    public static void swap(int[] input, int i, int j){
        int temp =  input[i];
        input[i] = input[j];
        input[j] = temp;
    }

    public static void dutchNationalFalgAlgorithm(int [] input){

        //initialize all variables
        int reader = 0;
        int low = 0;
        int high = input.length - 1;

        while(reader &amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;lt;= high){
            /*
              input always holds a permutation of the original data with input(0..(lo-1)) =0, input(lo..(reader-1))=1, input(reader..hi) is
              untouched, and input((hi+1)..(input.length-1)) = 2
            */
            if(input[reader] == 0){
                /*When element at reader is 0, swap
                element at reader with element at index
                low and increment reader and low*/
                swap(input, reader, low);
                reader++;
                low++;
            }
            else if(input[reader] == 1){
                /* if element at reader is just
                increment reader by 1 */
                reader++;
            }
            else if(input[reader] == 2){
                /* If element at reader is 2, swap
                 element at reader with element at
                 high and decrease high by 1 */
                swap(input, reader, high);
                high--;
            }
            else{
               System.out.println("Bad input");
               break;
            }
        }

    }
    public static void main(String[] args) {
        int[] input = {2,2,1,1,0,0,0,1,1,2};

        dutchNationalFalgAlgorithm(input);

        for(int i=0; i&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;lt;input.length; i++){
            System.out.print(" " + input[i]);
        }
    }
}

Complexity of Dutch National Flag algorithm is O(n), however, we scan the array only once.

Please share if you have some suggestions or comments. Sharing is caring.

3-way quicksort (Quick sort on non unique elements)

3-way quicksort

In post Dutch National Flag Algorithm, we discussed how can we sort and array in linear time by partitioning it in four parts, red, white, unknown and blue. Can we apply the same technique on quick sort.? As we all know that quick sort relies on partitioning of input and input is partitioned in  two parts. What if we divide input space in three parts? Then it becomes 3-way quicksort.

The 3-way partition variation of quick sort has slightly higher overhead compared to standard 2-way partition version. Both have same best, typical, and worst case time bounds, but this version is highly adaptive in very common case of sorting with few unique keys.

Quicksort basics and limitation

Before going ahead with 3-way partition method, I would strongly recommend that you go through usual quick sort algorithm : Quick sort algorithm in C

A big limitation of quick sort is that it has O(n2) worst case running time. Some improvements have been suggested as given below:

  • Cutoff to insertion sort. Switch to insertion sort for arrays with size less than a pre-decided limit. We follow quick sort and partition array. Once, the size of partitioned array goes lower than limit, apply insertion sort on that array. Limit varies from system to system and typically it is between 5 to 15.
  • Median-of-three partitioning. Use median of a small sample of items taken from the array as the partitioning item. Doing so will give a slightly better partition, but at the cost of computing the median.

There is another optimization which is called as Entropy-optimal sorting or 3-way partition quick sort. This method is useful when inout array contains lot of duplicate values which is very frequent in real world. Idea is to divide array in three parts rather than two. Let’s say P be pivot. First part contains all numbers which are less than p, second part contains number equal to p and last part contains numbers which are greater than p.

3-way quicksort algorithm 

3-way quicksort is optimization which works in specific cases like when input to be sorted contains few unique keys, in this case having traditional approach of one pivot does not perform well compared to 3-way quicksort.
Some of the properties of 3-way quicksort are:
It is not stable, when stability is the requirement, avoid using quicksort in general.
It uses O(lg(n)) extra space, why? Because of the recursion.
Worst case run time is as same as classic quicksort, O(n2), but typically O(nlog(n)) time
Best part is it takes O(n) time when O(1) unique keys.

This algorithm partitions array into three parts:
1. one with is less than pivot
2. equal to pivot
3. one with elements greater than pivot

3-way quicksort

3-way quicksort Implementation

#include <stdio.h>

int min(int a, int b){
 if(a > b) 
    return b;
  return a;
}

void swap(int a[], int i, int j){
	int temp = a[i];
	a[i] = a[j];
	a[j] = temp;
}

void quicksort(int a[], int low, int high){
	int pivot = a[high];
	
	int lt = low;
	int reader = low ;
	int gt = high;
	
	while(reader < gt){
		if(a[reader] < pivot){
			swap(a, reader, lt);
			reader++;
			lt++;
		}
		else if(a[reader] == pivot){
			gt--;
		    swap(a, reader, gt);
		}
		else{
			reader++;
		}
	}
  
	int minimum = min( gt-lt, high-gt+1);
	for(int i=0; i<minimum; i++){
		swap(a, lt+i, high-minimum+1+i); 
	}

	if( low < high){
		quicksort(a, low, lt-1);
   		quicksort(a, high-gt+lt, high);
	}
	return ;
}

int main(void) {
	int a[] = {4,3,3,2,7,9,2,3,5,6,7,4};
	int size = sizeof(a)/sizeof(a[0]);

	quicksort(a, 0, size-1);
	
	printf("\n");
	for(int i=0; i<size; i++){
		printf("%d ", a[i]);	
	}

	return 0;
}

Complexity of 3-way quicksort is O(n2).

Please share if something is wrong or any other suggestion or query. We would love to hear what you have to say.

Merge sort algorithm

Merge Sort Algorithm

We can classify sorting algorithms based on their complexity, selection sort, bubble and insertion sort have complexity of O(n2) while heap sort, merge sort and quick sort (with some assumptions) have complexity of  O(nlogn) and count and Radix sorts are linear O(n) algorithms.In this post, we will concentrate on merge sort algorithm.

Key points about merge sort algorithm

  • class of algorithm : divide and conquer
  • Complexity : Time O(n2)
  • Space : O(n)
  • Stable : Yes
  • In place : NO

Merge sort algorithm falls in divide and conquer class of algorithms where input space is divided into subproblems and these subproblems are then solved individually and combined together to solve original problem. Below figure explains divide and conquer paradigm.

Merge sort algorithm
Divide and conquer strategy

In merge sort algorithm, input space is divided into two subproblems till the time we achieve smallest subproblem and then smallest sub-problem is sorted and then combined up till original input is sorted. Question arises is what is the smallest subproblem? Smallest subproblem is where input cannot be further divided, a subproblem with one item to sorted.

Once, split is done till smallest size, start merging. Going from bottom, start with two one-item subproblems, combine that to create a two item subproblem, then combine two two-items subproblem to create a four item subproblem. Go on till you get problem with original size.

merge sort algorithm

Implementation of Merge sort algorithm

As is the case with all divide and conquer algorithm problems, same processing is applied to subproblem which was applied to original problem, a case for recursive solution. Recursion needs a termination condition. Base condition to stop recursion in merge sort algorithm is when subproblem has only one element to be sorted. How can you sort single element? You don’t do, just return the element.

Now, how do we combine two subproblems solutions? This step is conquer part. Sort smaller parts and combine them together with merge operation. In merge operation, scan both sorted arrays one by one element and based on output of comparison, put one of the two items into output array, till both arrays are scanned. If one array is finished before other, just copy all elements from other array to output array. Copy this output array back to original input array so that it can be combined with bigger sub problems till solution to original problem is derived.

int merge(int a[], int start, int mid, int end){
 
    int i,j,k;
    int temp[end-start+1];
 
    i = start; j = mid+1; k =0;
    /* Compare all elements of two sorted arrays and store
      them in sorted array. */
    while(i <= mid && j <= end){
        if(a[i]< a[j]){
            temp[k++]= a[i++];
        }
        else {
            temp[k++]  = a[j++];
        }
    }
    while(i <= mid){
        temp[k++] = a[i++]; 
    }
    while(j <= end){
        temp[k++] = a[j++]; 
    }
    //Copy the temporary array into original array
    for(i=0; i<k; i++){
        a[i+start] = temp[i];
    }
}
int mergeSort(int a[], int start, int end ){
 
    int mid  = (start + end)/2;
    if(start<end){
        //sort the left part
        mergeSort(a, start, mid);
        //sort right part
        mergeSort(a, mid+1, end);
 
        //merge them together
        merge(a, start, mid, end);
    }
}
int main(){
	int a[] = {2,3,4,1,8,7,6,9};
	int size = sizeof(a)/sizeof(a[0]);
 
	mergeSort(a, 0, size-1);
	for(int i=0; i<size; i++){
		printf("%d ", a[i]);
	}
}

Let’s analyse complexity of merge sort algorithm, it can be divided into 2 parts :
1. Split step, which takes the constant amount of time, all it does is create two array half the size of original array, irrespective of size of array. So, complexity of this step is O(1).

2. Combine step complexity is O(n) where n is number of elements.
Complexity of each divide and conquer step is O(n). So, how many such divide and conquer steps do we have?
For n elements in array, there will be O(logn) such steps. Hence, complexity of merge sort algorithm is O(nlogn)

However, there is one caveat, every time we merge, two sub-arrays an auxiliary array is needed to temporarily store array elements and that incurs O(n) space complexity on merge sort algorithm.

There are some improvements which can be done on this algorithm.
1. When number of elements are less than some threshold, one can use insertion or selection sort to sort those numbers.  Why? Because when n is small, O(n2) is as good as O(nlogn) and it saves extra overhead of split and merging. All in all, using insertion sort in input array with small size, can give better performance.

2. Before calling merging, check if all the elements in right array are greater then left array, if yes, no need of merging. This can be easily checked by comparing a[mid] with a[mid+1]. If a[mid] is less than a[mid+1],  two sub arrays are already sorted and we don’t need to perform merge.

Please let us what do you think of this article and if there is something wrong or missing.

Merge two sorted linked lists

Merge two sorted linked lists

Problem statement is simple : Merge two sorted linked lists, without using extra space. To refer to the basics of linked list, please follow the post : Linked list data structure. This problem is commonly asked in telephonic round of Amazon and Microsoft. Let’s take an example and understand what is required as solution. Given two linked lists as following,

merge two sorted linked lists
Two sorted linked lists

Result should be like this:

merge two sorted linked list
Resultant linked list.

Merge two sorted linked lists : Thoughts

Consider following two steps to merge sorted linked lists. First, figure out which node should be head of result list. Compare head nodes of two give lists, which ever is smaller, that should be the head of result list.

Second, compare two nodes, one from each list and decide which should go next in result linked list.  Advance the pointer to next node of the node which is added to result list.

As no new node is allocated during this merge, we have to make sure that all the references are maintained when nodes are added to merged linked list.

merge two sorted linked list
Two sorted list to be merged

We can start with one list as merge list and add nodes from second list at appropriate place in that list. Let’s say L1 is our merged list and we always compare node on L2 to see if it should be placed in L1 at current position. L1 grows as more nodes are sorted in merge list.

We compare first two nodes L1 and L2, and decide that node(2) has to go in merged list as head. If it was head of L2, we would have swapped L1 and L2 heads and still L1 will be head of merged list. Why? Because we want that L1 always points to last node in merged list and L1 to represent sorted merged list till this point and L2 switches between two input lists.

As L1 always points to the last node of merged linked list, next node to compare should be L1.next i.e node(4) and L2 i.e node(3).

As L1 follows the merged linked list, we will move L1.next to point node(3), however doing it directly will lead to lose of entire linked list following it. So we do it in four steps : store L1 next as temp; link L2 to L1.next; L2 points to temp and then move L1 to L1.next

Node temp = L1.next;
L1.next = L2;
L2 = temp;
L1 = L1.next
merge two sorted linked lists

Next nodes to be compared are node(5), which is L1.next and node(5) which is L2.

Comparing node 4 and 5 to add in sorted merge list

Of course node(4) has to be added to merged linked list, what should we do? First save L1.next in temp, temp now points to node(5). Second, point L1.next to L2, point L2 to temp, and at last, L1 moves to L1.next. State of two sorted linked lists looks as follows.

merge two sorted linked lists

By this time you must have noticed that L1 and L2 do not point to the one list all the time, L1 always points to the last node of merged list and L2 points to first node of separated list which needs to be merged.

Now, L1.next which is node(7) and L2 which is node(5) will be compared.

node(5) is to be added in merged sorted list. Again same set of steps. L1.next stored as temp, L1.next points to L2 i.e. node(5) and then L2 points to temp i.e. node(7)

merge two sorted linked lists

Again, node(9) which is L1.next will be compared to L2 i.e node(7). L1.next should point to L2. Final state will be as follows

At this point, L1.next i.e node(8) is less than L2, this is simple case, where we just move L1 to L1.next and L2 remains as is.

merge two sorted linked lists

Next two nodes follow the same pattern and added to merged sorted linked list.

At this point, special condition occurs which is L1.next is null. In this case, point L1.next to L2 and two linked lists are merged.

Two sorted linked lists are merged into a sorted list

Merge two sorted linked lists : Implementation

#include <stdio.h>
#include <stdlib.h>

typedef struct node{
    int data;
    struct node *next;
} Node;
 
Node * createNode(int val){
  Node * temp = (Node *)malloc(sizeof(Node));
  if(temp){
    temp->data = val;
    temp->next = NULL;
  }
  return temp;
}
/* This function inserts node at the head of linked list */
void push(Node **headRef, int data){
	Node * newNode  = createNode(data);
	newNode->next = *headRef;
	*headRef  = newNode;
}

void printList(Node * head){
    while(head){
        printf("%d->" , head->data );
        head = head->next;
    }

    printf("NULL");
    printf("\n");
}
Node * MergeLists(Node *list1, Node *list2) {
  if (!list1) return list2;
  if (!list2) return list1;

  Node *head;
	//Chosing head of merged list
  if (list1->data < list2->data) {
    head = list1;
  } else {
    head = list2;
    list2 = list1;
    list1 = head;
  }
	
  while(list1->next && list2) {
    if (list1->next->data > list2->data) {
	//Step 1. Save the next pointer
      Node *tmp = list1->next;
	//Step 2. Change next pointer to point L2
      list1->next = list2;
	//Step 3. Move L2 to temp
      list2 = tmp;
    }
	//Step 4. Move L1 ahead
    list1 = list1->next;
  } 
  if (!list1->next) list1->next = list2;
  return head;
}
int main(){
        Node * L1 = NULL;
        Node * L2 = NULL;
        Node * result = NULL;
        int carry = 0 ;
        /* creating list 1 */
        push(&L1,7);
        push(&L1,6);
        push(&L1,4);
        push(&L1,3);
	
        /* creating list 2 */
        push(&L2,10);
        push(&L2,8);
        push(&L2,1);
 
        L1 = MergeLists(L1,L2); 
        printList(L1);
 
        return 0;
}

Complexity of this method to merge two sorted lists into one is O(n+m) where n and m are number of nodes in two sorted linked lists.

Recursive implementation to merge two sorted linked lists

#include<stdlib.h>
#include<stdio.h>
 
typedef struct node{
    int data;
    struct node *next;
} Node;
 
Node * mergeSort(Node *a, Node *b){
    Node *result = NULL;
    if(a ==  NULL)
        return b;
    else if(b == NULL)
        return a;

    /* For the first node, we would set the result to either a or b */
      if(a->data <= b->data){
         result = a;
        /* Result's next will point to smaller one in lists 
           starting at a->next  and b */
         result->next = mergeSort(a->next,b);
      }
      else {
        result = b;
       /*Result's next will point to smaller one in lists 
         starting at a and b->next */
        result->next = mergeSort(a,b->next);
      }
      return result;
}

Node * createNode(int val){
  Node * temp = (Node *)malloc(sizeof(Node));
  if(temp){
    temp->data = val;
    temp->next = NULL;
  }
  return temp;
}
/* This function inserts node at the head of linked list */
void push(Node **headRef, int data){
	Node * newNode  = createNode(data);
	newNode->next = *headRef;
	*headRef  = newNode;
}

void printList(Node * head){
    while(head){
        printf("%d->" , head->data );
        head = head->next;
    }

    printf("NULL");
    printf("\n");
}

/* Driver program to run above code */
int main(){
        Node * L1 = NULL;
        Node * L2 = NULL;
        Node * result = NULL;
        int carry = 0 ;
        /* creating list 1 */
        push(&L1,7);
        push(&L1,6);
        push(&L1,4);
        push(&L1,3);
        /* creating list 2 */
        push(&L2,10);
        push(&L2,8);
        push(&L2,1);
      
        L1 = mergeSort(L1,L2); 
        printList(L1);
        
        return 0;
}

Please share if there is something wrong or missing. If you want to take personalized coaching from our expert teachers, please signup for free demo class.

Find Kth smallest element in array

Kth smallest element in array

Given an array of integers which is non sorted, find kth smallest element in that array. For example: if input array is A = [3,5,1,2,6,9,7], 4th smallest element in array A is 5, because if you sort the array A, it looks like A = [1,2,3,5,6,7,9] and now you can easily see that 4th element is 5.

This problem is commonly asked in Microsoft and Amazon interviews as it has multiple layers and there is some many things that can be measured with this one problem.

Kth smallest element : Line of thought

First of all, in any interview, try to come up with brute force solution. Brute force solution to find Kth smallest element in array of integers would be to sort array and return A[k-1] element (K-1 as array is zero base indexed).

What is the complexity of brute force solution? It’s O(n2)? Well, we have sort algorithms like merge sort and heap sort which work in O(n log n) complexity. Problem with both searches is that they use additional space. Quick sort is another sort algorithm. It has problem that it’s worst case complexity will be O(n2), which happens when input is completely sorted.
In our case, input is given as unsorted already, so we can expect that quick sort will function with O(n log n) complexity which is it’s average case complexity. Advantage of using quick sort is that there is no additional space complexity.

Optimising quick sort

Let’s see how quick sort works and see if we can optimize solution further?
Idea behind quick sort is to find correct place for the selected pivot. Once pivot is at correct position, all the elements on left side of pivot are smaller and on right side of pivot are greater than pivot. This step is partitioning.

If after partitioning, pivot is at position j, can we say that pivot is actually jth smallest element of the array? What if j is equal to k? Well problem solved, we found the kth smallest element.

If j is less than k, left subarray is less than k, we need to include more elements from right subarray, therefore kth smallest element is in right subarray somewhere. We have already found j smallest elements, all we need to find is k-j elements from right subarray.

What if j is greater than k? In this case, we have to drop some elements from left subarray, so our search space would be left subarray after partition.

Theoretically, this algorithm still has complexity of O(n log n), but practically, you do not need to sort the entire array before you find k smallest elements.

Algorithm to find K smallest elements in array

  1. Select a pivot and partition the array with pivot at correct position j
  2. If position of pivot, j, is equal to k, return A[j].
  3. If j is less than k, discard array from start to j, and look for (k-j)th smallest element in right sub array, go to step 1.
  4. If j is greater than k, discard array from j to end and look for kth element in left subarray, go to step 1

Let’s take an example and see if this algorithm works? A =  [4, 2, 1, 7, 5, 3, 8, 10, 9, 6 ], and we have to find fifth smallest element in array A.

Kth smallest element in array

Start with pivot as first index of array, so pivot = 0, partition the array into two parts around pivot such that all elements on left side of pivot element, i.e. A[pivot] are smaller and all elements on right side are greater than A[pivot].

Start with pivot as first index of array, so pivot = 0, partition the array into two parts around pivot such that all elements on left side of pivot element, i.e. A[pivot] are smaller and all elements on right side are greater than A[pivot].

In our example, array A will look like below after pivot has found it’s correct position.

k smallest element
After partition, correct position of pivot is index 3

If pivot == k-1 (array is represented as zero base index), then A[pivot] is kth smallest element. Since pivot (3) is less than k-1 (4), look for kth smallest element on right side of the pivot.

k remains as it is as opposed to k-j mentioned in algorithm as pivot is given w.r.t entire array and not w.r.t subarray.

In second iteration, pivot = 4 (index and not element). After second execution of quick sort array A will be like

After partition of right subarray, correct position of pivot is index 4

pivot(4) which is equal to k-1(5-1). 5th smallest element in array A is 5.

Implementation

package com.company;

/**
	* Created by sangar on 30.9.18.
*/
public class KthSmallest {
	private void swap(int[] a, int i, int j){
		int temp = a[i];
		a[i] = a[j];
		a[j] = temp;
	}
	private int partition(int[] a, int start, int end){
		int pivot = a[start];
		int i  = start+1;
		int j  = end;

		while(i < j){
			while(a[i] < pivot) i++;
			while(a[j] > pivot) j--;

			if(i < j) {
				swap(a, i, j);
			}
		}
		swap(a, start, j);
		return j;
	}

	public int findKthSmallestElement(int a[], int start, 
				int end, int k){
		if(start < end){
		int p = partition(a, start, end);
		if(p == k-1){
			return a[p];
		}
		if(p > k-1)
			return findKthSmallestElement(a, start, p, k);
		if(p < k-1)
			return findKthSmallestElement(a, p+1, end, k);
		}
		return -1;
	}
}
package test;

import com.company.KthSmallest;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 28.8.18.
 */
public class KthSmallestTest {

	KthSmallest tester = new KthSmallest();
	private int[] a = {4, 2, 1, 7, 5, 3, 8, 10, 9};
	@Test
	public void kthSmallest() {
		assertEquals(7, tester.findKthSmallestElement(a,0,8,6));
	}

	@Test
	public void firstSmallest() {
		assertEquals(1, tester.findKthSmallestElement(a,0,8,1));
	}

	@Test
	public void lastSmallest() {
		assertEquals(10, tester.findKthSmallestElement(a,0,8,9));
	}

	@Test
	public void kGreaterThanSize() {
		assertEquals(-1, tester.findKthSmallestElement(a,0,8,15));
	}
	@Test
	public void emptyArray() {
		int[] a = {};
		assertEquals(-1, tester.findKthSmallestElement(a,0,0,1));
	}

	@Test
	public void nullArray() {
		assertEquals(-1, tester.findKthSmallestElement(null,0,0,1));
	}
}

Complexity of using quick sort algorithm to find kth smallest element in array of integers in still O(n log n).

Kth smallest element using heaps

Imagine a case where there are a billion integers in array and you have to find 5 smallest elements from that array. Complexity of O(n log n) is too costly for that use case. Above algorithm using quick sort does not take into consideration disparity between k and n.

We want top k elements, how about we chose those k elements randomly, call it set A and then go through all other n-k elements, call it set B, check if element from set B (n-k elements) can displace element in set A (k elements)?

What will be condition for an element from set B to replace an element in set A? Well, if the new element is less than maximum in set A, than maximum in set A cannot be in set of k smallest elements right?  Maximum element in set A would be replaced by the new element from set B.

Now, problem is how to quickly find maximum out of set A. Heap is the best data structure there. What kind of heap: min heap or max heap? Max heap as it store the maximum of set at the root of it.

Let’s defined concrete steps to find k smallest elements using max heap. 

  1. Create a max heap of size k from first k elements of array.
  2. Scan all elements in array one by one.
    1.  If current element is less than max on heap, add current element to heap and heapify.
    2. If not, then go to next element.
  3. At the end, max heap will contain k smallest elements of array and root will be kth smallest element.

Let’s take an example and see if this algorithm works? Input array is shown below and we have to find 6th smallest element in this array.

kth smallest element using heaps
input array

Step 1 : Create a max heap with first 6 elements of array.

Create a max heap with set A

Step 2 : Take next element from set B and check if it is less than root of max heap. In this case, yes it is. Remove the root and insert the new element into max heap.

Element from set B removes root from max heap and added to max heap

Step 2 : It continues to 10, nothing happens as new element is greater than root of max heap. Same for 9.  At 6, again root of max heap is greater than 6. So remove the root and add 6 to max heap.

Again, new element from set B is less than root of max heap. Root is removed and new element is added.

Array scan is finished, so just return root of max heap, 6 which is sixth smallest element in given array.

	public int findKthSmallestElementUsingHeap(int a[], int k){
	//https://stackoverflow.com/questions/11003155/change-priorityqueue-to-max-priorityqueue
	PriorityQueue<Integer>  maxHeap =
			new PriorityQueue<>(k, Collections.reverseOrder());

		if(a == null || k > a.length) return -1;
		//Create max with first k elements
		for(int i=0; i<k; i++){
			maxHeap.add(a[i]);
		}
		/*Keep updating max heap based on new element
		If new element is less than root, 
		remove root and add new element
		*/
		for(int i=k; i<a.length; i++){
			if(maxHeap.peek() > a[i]){
				maxHeap.remove();
				maxHeap.add(a[i]);
			}
		}
		return maxHeap.peek();
	}

Can you calculate the complexity of above algorithm? heapify() has complexity of log(k) with k elements on heap. In worst case, we have to do heapify() for all elements in array, which is n, so overall complexity of algorithm becomes O(n log k). Also, there is additional space complexity of O(k) to store heap.
When is very small as compared to n, this algorithm again depends on the size of array.

We want k smallest elements, if we pick first k elements from a min heap, will it solve the problem? I think so. Create a min heap of n elements in place from the given array, and then pick first k elements.
Creation of heap has complexity of O(n), do more reading on it. All we need to do is delete k times from this heap, each time there will be heapify(). It will have complexity of O(log n) for n element heap. So, overall complexity would be O(n + k log n).

Depending on what you want to optimize, select correct method to find kth smallest element in array.

Please share if there is something wrong or missing. If you are interested in taking coaching sessions from our experienced teachers, please reach out to us at communications@algorithmsandme.com

Quick sort algorithm

Quick sort Algorithm

Quick sort like merge sort is a sorting algorithm under divide and conquer paradigm of algorithms like merge sort. Basic idea of algorithm is to divide inputs around a pivot and then sort two smaller parts recursively and finally get original input sorted.

Selection of pivot

Entire idea of quick sort revolves around pivot. Pivot is an element in input around which input is arranged in such a way that all elements on left side are smaller and all elements on right side are greater than pivot. Question is how to find or select pivot and put it into correct position.

To make things simpler to start with, let’s assume first element of input is pivot element.

To put this pivot at correct position in input, start with next element of pivot in input space and find first element which is greater than pivot. Let that be ith position.

At the same time, start from end of array and find first element which is smaller than pivot. Let it be jth position.

If i and j have not crossed each other i.e i < j, then swap element at ith and jth positions, and continue moving right on input to find element greater than pivot and moving left to find element smaller than pivot.
Once i and j cross each other, swap pivot with element at jth position.  After this step, pivot will be at its correct position and array will be divided into two parts. All elements on left side will be less than pivot and all elements on right side will be greater than pivot.

Quick sort partition example

This is too much to process, I know! Let’s take an example and see how it does it work? We have an array as follows

quick sort

Let’s select first element as pivot, pivot = 3.

quick sort pivot selection

Start from next element of pivot, move towards right of array, till we see first element which is greater than pivot i.e. 3.

From end of array, move towards left till you find an element which is less than pivot.

Now, there are two indices, i and j, where A[i] > pivot and A[j] < pivot. See that i and j not yet crossed each other. Hence, we swap A[i] with A[j]. Array at the bottom of pic, shows resultant array after swap.

quick sort partition

Again, start with i+1 and follow the same rule : Stop when you find element greater than pivot. In this case, 10 is greater than 3, hence we stop.

Similarly, move left from end again, till we find an element which is less than pivot. In this case, we end up at index = 2 which is element 1.

Since, i > j, than means paths have been crossed. At this time, instead of swapping element at i and j index, swap element at j index with pivot.

After swapping pivot with jth index, we have array divided into two parts, pivot as boundary. All elements on left side of pivot are smaller (they may not be sorted) and all elements on right side of pivot are greater than pivot (again may not be sorted).

quick sort partitions

We, apply this same partition process to left and right arrays again, till base condition is hit. In this case, base condition would be if there is only one element in array to be partitioned.

Quick sort algorithm

quickSort([], start, end)
 1. If array has more than one elements i.e (start < end):
    1.1 Find correct place for pivot.
    pivot = partition(arr, low, high)
    1.2. Apply same function recursively to left of pivot index
         quickSort(arr, start, pivot -1 )
         and to the right of pivot index
         quickSort(arr, pivot + 1, end)

Quick sort implementation

#include <stdio.h>

void swap(int a[], int i, int j){
	int temp = a[i];
	a[i] = a[j];
	a[j] = temp;
}

int partition(int a[], int start, int end){
	
	int pivot = a[start];
	int i  = start+1;
	int j  = end;
	
	while(i < j){
	    while(a[i] < pivot) i++;
            while(a[j] > pivot) j--;
	
            if(i < j) {
                swap(a, i, j);
	    }
	}
	swap(a, start, j);
	return j;
}

void quickSort(int a[], int start, int end){
    if(start < end){
        int p = partition(a, start, end);
	quickSort(a,start, p-1);
	quickSort(a, p+1, end);
    }
}

int main(void) {
	int a[]= {4,3,2,5,6,8,1};
	int size = sizeof(a)/sizeof(a[0]);
	
	quickSort(a, 0, size-1);
	
	for(int i=0; i < size; i++){
		printf(" %d", a[i]);
	}
	return 0;
}

There is another implementation which is based on Lomuto partition scheme, in this scheme, we make last element as pivot. The implementation is compact but complexity is bit higher than the original partition methods in terms of number of swaps.

#include<stdlib.h>
#include<stdio.h>
 
void swap(int *a, int *b){
    int temp = *a;
    *a = *b;
    *b = temp;
}
 
int partition(int a[], int low, int high)
{
    // set pivot as highest element
    int x  = a[high];
 
    //Current low points to previous of low of this part of array. 
    int i = low - 1;
 
    for (int j = low; j <= high-1; j++)
    {
    	/*Move in the array till current node data is 
        less than the pivot */
        if (a[j] <= x){
            //set the current low appropriately
            i++;
            swap(&a[i], &a[j]);
        }
    }
    //Now swap the next node of current low with pivot
 
    swap(&a[i+1], &a[high]);
 
    printf("\n Pivot : %d\n", a[i+1]);
    for(int j=0; j<=high; j++){
 
    	printf("%d ", a[j]);
    }
    //return current low as partitioning point.
    return i+1;
}
 
/* A recursive implementation of quicksort for linked list */
void quickSortUtil(int a[], int low, int high)
{
    if (low < high)
    {
        int p = partition(a,low, high);
        quickSortUtil(a,low, p-1);
        quickSortUtil(a, p+1, high);
    }
}
 
/* Driver program to run above code */
int main(){
 
    int a[] = {5,4,2,7,9,1,6,10,8};
 
    int size = sizeof(a)/sizeof(a[0]);
    quickSortUtil(a, 0, size-1);
 
    for(int i=0; i<size; i++){
    	printf("%d ", a[i]);
    }
    return 0;
}

Complexity analysis of quick sort algorithm

If pivot splits original array into two equal parts (which is the intention), complexity of quick sort is O(n log n). However, worst case complexity of quick sort happens when input array is already sorted in increasing or decreasing order. In this case, array is partitioned into two subarrays, one with size 1 and other with size n-1. Similarly, subarray with n-1 elements, it again is divided into two subarrays of size 1 and n-2. In order to completely sort array it will split for n-1 times and each time it requires to traverse n element to find correct position of pivot. Hence overall complexity of quick sort comes out as O(n2).

There is a very interesting question, which tests your understanding of system basics. Question is what is space complexity of this algorithm? There is no apparent memory is used. However, recursive implementation internally puts stack frames on stack for partitioned indices and function call return address and so on. In worst case, there can be n stack frames, hence worst case complexity of quick sort will be O(n).

How can we reduce that? If the partition with fewest elements is (recursively) sorted first, it requires at most O(log n) space. Then the other partition is sorted using tail recursion or iteration, which doesn’t add to the call stack. This idea, was described by R. Sedgewick, and keeps the stack depth bounded by O(log n) and hence space complexity will be O(log n).

Quick sort with tail recursion

Quicksort(A, p, r)
{
 while (p < r)
 {
  q = Partition(A, p, r)
  Quicksort(A, p, q)
  p = q+1
 }
}

Selection of Pivot
If array is completely sorted, then worst case behavior of quick sort is O(n2), so there comes another problem. How can we select pivot so that two subarrays are almost equal size. There are many solutions proposed.
1. Taking median of array as pivot. So how to select median of an unsorted array. We look into this problem separately, but yes it guarantees two halves of same size.
2. Selecting pivot randomly. This requires heuristics algorithms to select pivot.

Please leave your comment in case you find something wrong or you have some improved version.