Difference between array and linked list

In the last post, Linked list data structure, we discussed the basics of linked lists, and I promised to go into detail on the difference between an array and a linked list. Before going into the post, I want to make sure you understand that no data structure is simply better than another. You choose one or the other based on your requirements and use cases; it depends on which operation your algorithm performs most frequently over its lifetime. That’s why interview processes have a data structure round: to see whether you can choose the right one for the problem.

What is an array?
An array is a linear, sequential and contiguous collection of elements which can be addressed using an index.

What is a linked list?
A linked list is a linear, sequential and non-contiguous collection of nodes, where each node stores a reference to the next node. To understand more, please refer to Linked list data structure.

Difference between arrays and linked list

Static Vs dynamic size

The size of an array is fixed when it is declared (in C, typically at compile time), whereas a linked list grows dynamically at run time based on need. If you know the maximum number of elements the algorithm will ever hold, you can confidently declare an array. If you do not know, a linked list is the better choice. There is a catch: what if the number of elements only rarely reaches the maximum and is far smaller most of the time? In that case, we would be unnecessarily allocating extra memory for the array, most of which may never be used.

Memory allocation

An array is given contiguous memory in the system. So, if you know the address of any element in the array, you can compute the address of any other element from its position.

[Figure: Statically allocated contiguous memory]

Linked lists are not stored contiguously; nodes are scattered around in memory. So, given a node, you can traverse forward in the linked list (using the next node reference), but you cannot access the nodes before it.

[Figure: Dynamically allocated non-contiguous memory]

Contiguous allocation requires sufficient memory up front for the whole array. For example, to store 20 integers of 4 bytes each, we need an 80-byte contiguous chunk. With a linked list, we can start with 8 bytes (a 4-byte value plus a 4-byte reference) and request more memory as and when required, and that memory may be anywhere. Contiguous allocation also makes it difficult to resize an array: we have to find a different chunk of memory that fits the new size and move all existing elements to that location. Linked lists, on the other hand, are dynamically sized and can grow without relocating existing elements.

Memory requirement

So is non-contiguous memory simply better? It comes at a cost. Each node of a linked list has to store a reference to the next node. This adds an extra payload to every node: 4 bytes with 32-bit pointers, 8 bytes with 64-bit pointers. An array does not require this extra payload. You have to trade off the extra space against the advantages you are getting. Sometimes spending extra space is better than having cumbersome operations like shifting elements on insert and delete in an array. And sometimes the value stored in a node is big enough to make the extra few bytes negligible in the analysis.

Operation efficiency

We perform operations on a data structure to get some output. There are four basic operations to consider: read, search, insert/update and delete.

Read on an array is O(1): you can directly access any element given its index, so the cost of a read does not depend on the size of the array.
Read on a linked list, by contrast, is O(n), where n is the number of nodes, because you have to follow the references from the head. So, if your problem requires many random reads, an array outweighs a linked list.
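A minimal Python sketch of the difference in read cost. The `Node` class and the `read_at` helper are hypothetical names introduced here for illustration, and a Python list stands in for the array.

```python
class Node:
    """Singly linked list node (illustrative, not from the original post)."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def read_at(head, index):
    """Follow `index` links starting at head: O(n) in the worst case."""
    current = head
    for _ in range(index):
        if current is None:
            break
        current = current.next
    if current is None:
        raise IndexError("index out of range")
    return current.data


array = [10, 20, 30, 40]

# Build the linked list 10 -> 20 -> 30 -> 40 by pushing at the head.
head = None
for value in reversed(array):
    head = Node(value, head)

print(array[2])          # direct index, cost independent of length
print(read_at(head, 2))  # follows two links before reaching the value
```

Both calls read the same value, but only the linked list read gets slower as the index grows.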

Because an array is contiguous and indexable, a sorted array can be searched with optimized algorithms like binary search, which has a complexity of O(log n). An unsorted array needs a linear scan, O(n). Search on a linked list requires O(n) either way, since the middle node cannot be reached directly.
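As a rough illustration of the two search strategies, here is a Python sketch. It assumes the array is already sorted and uses the standard `bisect` module; `Node`, `binary_search` and `linked_list_search` are hypothetical names chosen for this example.

```python
from bisect import bisect_left


class Node:
    """Singly linked list node (illustrative)."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def binary_search(sorted_values, target):
    """Return the index of target in a sorted list, or -1: O(log n)."""
    i = bisect_left(sorted_values, target)
    if i < len(sorted_values) and sorted_values[i] == target:
        return i
    return -1


def linked_list_search(head, target):
    """Return the position of target in a linked list, or -1: O(n)."""
    position = 0
    current = head
    while current is not None:
        if current.data == target:
            return position
        current = current.next
        position += 1
    return -1


values = [3, 8, 15, 23, 42]           # must be sorted for binary search
head = None
for value in reversed(values):
    head = Node(value, head)

print(binary_search(values, 23))      # halves the search range at every step
print(linked_list_search(head, 23))   # visits the nodes one by one
```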

Writing to an array is O(1) as well, as long as we write at an index within the size of the array. In a linked list, the complexity of an insert depends on where you want to put the new element. If the insert happens at the head, it is O(1); if it happens at the end, it is O(n), because the list has to be scanned to reach the last node.

[Figure: Insert node at start of linked list]
[Figure: Insert node at the tail of linked list]

Update here means changing the size of the array or linked list by adding one more element. In an array it is a costly operation, as it requires reallocating memory and copying all elements into the new block. It does not matter whether you add the element at the end or at the start; the complexity is O(n).
For a linked list, it varies: adding at the end is O(n), adding at the head is O(1).
In the same vein, deleting the first element of an array requires moving all the remaining elements, hence a complexity of O(n). Delete on a linked list is O(1) at the head and O(n) at the tail.
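The shifting cost can be made visible with a small Python sketch. The helpers below are hypothetical and written out by hand for illustration; a Python list plays the role of the array, and the shifting loops are spelled out instead of relying on built-in list methods.

```python
class Node:
    """Singly linked list node (illustrative)."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def array_insert_front(values, value):
    """Insert at index 0 by shifting every element one slot right: O(n)."""
    values.append(value)                    # grow the array by one slot
    for i in range(len(values) - 1, 0, -1):
        values[i] = values[i - 1]
    values[0] = value


def array_delete_front(values):
    """Delete index 0 by shifting every element one slot left: O(n)."""
    first = values[0]
    for i in range(1, len(values)):
        values[i - 1] = values[i]
    values.pop()                            # drop the now duplicated last slot
    return first


def list_insert_front(head, value):
    """New node points at the old head: O(1), nothing is moved."""
    return Node(value, head)


def list_delete_front(head):
    """The second node becomes the head: O(1), nothing is moved."""
    return head.next


values = [1, 2, 3]
array_insert_front(values, 0)
print(values)
array_delete_front(values)
print(values)

head = list_insert_front(list_insert_front(None, 2), 1)   # 1 -> 2
head = list_delete_front(head)                             # 2
print(head.data)
```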

To see the difference between O(1) and O(n), the graph below should be useful.

[Figure: Complexity analysis graph]

Key differences between arrays and linked lists are as follows:

  • Arrays are bad at insert and delete operations, because elements have to be shifted or the whole array reallocated.
  • Arrays are statically sized at compile time.
  • Array memory is contiguous, which makes accessing elements easy without any additional pointers. You can jump around the array without visiting the elements in between.
  • Linked lists have almost the same complexity as arrays when insert and delete happen at the end, but no memory shuffling happens.
  • Search on a linked list is bad; it usually requires a scan with O(n) complexity.
  • Linked lists are dynamically sized at run time.
  • Linked list memory is non-contiguous, and an additional pointer is required in each node to store the neighbor node reference. You cannot jump around in a linked list.

Please share if there is something wrong or missing. If you want to contribute to the website, please reach out to us at communications@algorithmsandme.com

Linked list data structure

A linked list is a very important data structure to understand, as a lot of interview problems at Amazon, Microsoft and Google are based on linked lists. Today, we will understand the basics of the linked list data structure and its implementation.

A linked list represents a linear sequence of elements. Each element is connected to the next element using a chain of references. Another data structure which stores a linear sequence of items is the array. There are use cases where the linked list way of storing a sequence is more efficient than an array; I will cover that in the next post: Arrays Vs Linked lists.

In the last paragraph, I emphasized that a linked list is a linear data structure. In a linear data structure, there is a sequence and order in which elements are inserted, arranged and traversed. In order to reach the tail of a linked list, we have to go through all of the nodes.

[Figure: Linear data structure, where elements can be traversed in only one order]


Non-linear data structures are the ones where elements are not arranged or traversed in a specific order. One element may be connected to many others, hence we cannot traverse them in the same order every time. Examples of non-linear data structures are maps, dictionaries, trees and graphs.

[Figure: Non-linear data structure, where nodes cannot always be traversed in one order]

Linked list implementation

A linked list consists of any number of nodes. Each node contains two things: first, the value of the node, which can be of any type (integer, string, or a user-defined type); second, a reference which points to the next node in the linked list. A node can be declared as follows:

typedef struct Node {
	int data;
	struct Node * next;
} Node;
[Figure: Node structure of a linked list]

What happens if the node is the last node in the linked list? At the last node, the next pointer points to null. It’s very important to understand this bit, as this condition is used in almost every linked list problem you will have to solve.
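To show how the null check drives a traversal, here is a short Python sketch, where `None` plays the role of the null pointer. `Node` and `to_list` are hypothetical names used only for this illustration.

```python
class Node:
    """Singly linked list node; next is None at the last node."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def to_list(head):
    """Collect values by following next references until None is reached."""
    values = []
    current = head
    while current is not None:     # the "next is null" condition ends the walk
        values.append(current.data)
        current = current.next
    return values


head = Node(1, Node(2, Node(3)))   # 1 -> 2 -> 3 -> None
print(to_list(head))
print(to_list(None))               # an empty list is just a None head
```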

A linked list is a dynamic data structure. By dynamic, we mean that its size is not defined at compile time but at run time. Every time a new node is added to the linked list, a new memory location is allocated and the previous node’s next pointer is made to point to the new node.

Operations of linked list

  • Adding node at the end of list
    There are two cases to handle when adding a node at the end of a linked list:
  1. Check if there is already a node
    1. If no, then create a new node and return it as head of linked list.
  2. If there is a node,
    1. Scan through the linked list using the next pointer until the last node is reached.
    2. Create a new node, and point the next pointer of the last node to this new node.
#include <stdlib.h>

Node * createNode(int val){
	Node * newNode = (Node *)malloc(sizeof(Node));
	if(newNode){
		newNode->data = val;
		newNode->next = NULL;
	}
	return newNode;
}

void addNode(Node **headRef, int value){
	//create new node; give up if allocation failed
	Node *newNode = createNode(value);
	if(!newNode) return;

	//find the last node
	Node *currentNode = *headRef;
	while(currentNode && currentNode->next != NULL){
		currentNode = currentNode->next;
	}
	if(currentNode){
		//list is not empty, append after the last node
		currentNode->next = newNode;
	}
	else{
		//list was empty, change headRef to point to new head.
		*headRef = newNode;
	}
}

Complexity of adding a node at the end of a linked list is O(n), since the whole list is scanned to find the last node.

  • Insert node at head of list
    In this case too, we allocate a new node; however, this time we do not have to scan the entire list. Every time we add a node this way, the head of the list changes.
  1. Check if there is already a node
    1. If no, then create a new node and return it as head of linked list.
  2. If there is a node,
    1. Create a new node, and point the next pointer of the new node to head.
    2. Return new node as head pointer.
Node * createNode(int val){
	Node * newNode = (Node *)malloc(sizeof(Node));
	if(newNode){
		newNode->data = val;
		newNode->next = NULL;
	}
	return newNode;
}

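void addNode(Node **headRef, int value){
	//create new node; give up if allocation failed
	Node *newNode = createNode(value);
	if(!newNode) return;

	//new node points to the old head and becomes the new head
	newNode->next = *headRef;
	*headRef = newNode;
}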

Linked list data structure problems

It’s very important to understand that a linked list is a recursive data structure. The base case is a linked list with no nodes, represented by a NULL node. Most problems on linked lists can be solved using one template: process one node, and then recursively process the remaining linked list.

In programming terms, a linked list is divided into two parts, head and tail. The node being processed is called the head and the rest of the linked list is the tail. The tail has exactly the same structure as the original list.

Problems like merging linked lists, reversing a linked list, or finding the length of a linked list can all be solved using the same template of processing one node and then recursively calling the function on the remaining nodes.
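Two of the problems mentioned above, written against that template, could look like the following Python sketch. `Node`, `length`, `reverse` and `to_list` are hypothetical names used for illustration; this is one possible way to apply the template, not the only one.

```python
class Node:
    """Singly linked list node (illustrative)."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def length(head):
    """Base case: empty list. Otherwise count the head, recurse on the tail."""
    if head is None:
        return 0
    return 1 + length(head.next)


def reverse(head):
    """Reverse the tail recursively, then hang the old head at its end."""
    if head is None or head.next is None:
        return head
    new_head = reverse(head.next)
    head.next.next = head      # old second node now points back at head
    head.next = None           # head becomes the last node
    return new_head


def to_list(head):
    """Helper for printing: collect the values in order."""
    values = []
    while head is not None:
        values.append(head.data)
        head = head.next
    return values


head = Node(1, Node(2, Node(3)))
print(length(head))
print(to_list(reverse(head)))
```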

Types of linked list

There are three types of linked lists :
1. Singly linked list 
Singly linked lists contain nodes with data and reference, i.e., next, which points to the next node in the sequence of nodes. The next pointer of the last node will point to null. In singly linked list you can traverse only in one direction.

[Figure: Singly linked list]

2. Doubly linked list
In a doubly linked list, each node contains two links: previous, which points to the node before the current node, and next, which points to the next node. The previous pointer of the first node and the next pointer of the last node point to null. A doubly linked list can be traversed in both directions. The two references add weight, as extra memory is required.

[Figure: Doubly linked list]

3. Circular linked list
In a circular linked list, the next pointer of the last node points to the first node. A circular linked list can be either singly or doubly linked.

[Figure: Circular doubly linked list]
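Since a circular list has no null at the end, a traversal needs a different stopping condition. The Python sketch below shows one way to do it for a singly linked circular list: stop when the walk arrives back at the head. `Node`, `make_circular` and `circular_values` are hypothetical names introduced for this example.

```python
class Node:
    """Singly linked node used to build a circular list (illustrative)."""
    def __init__(self, data):
        self.data = data
        self.next = None


def make_circular(values):
    """Link the values in order and point the last node back at the first."""
    head = None
    tail = None
    for value in values:
        node = Node(value)
        if head is None:
            head = node
        else:
            tail.next = node
        tail = node
    if tail is not None:
        tail.next = head
    return head


def circular_values(head):
    """Visit every node exactly once; stop when we are back at head."""
    values = []
    if head is None:
        return values
    current = head
    while True:
        values.append(current.data)
        current = current.next
        if current is head:
            break
    return values


head = make_circular([1, 2, 3])
print(circular_values(head))
```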

This was all for the basics of linked lists. I know problems on them are hard to solve, but if you look at all the problems, they boil down to one thing: understanding the node and how recursion can be used. In the next posts, we will solve many of these problems and see how we can use these basics.

Fill 4xN wall with 4×1 and 1×4 bricks

There is a wall with 4 x N dimensions and we have bricks with 4 x 1 dimensions. We have to fill the wall with the given bricks and find out how many ways there are to fill that wall.

For example, for a wall with N = 3, there is only one way to fill the wall: three bricks laid horizontally.

Whereas with N = 4, there are two ways: putting four bricks horizontally, or four bricks vertically.

Actually, the examples themselves give away the answer to our problem. Let’s start small and build on top of it. What if N = 1? Then the wall dimensions are 4 x 1, and there is only one way to fill that wall with a brick of 4 x 1, which is to lay the brick horizontally.

What if N = 2, i.e. the wall is 4 x 2? Again, there is only one way possible: put two bricks horizontally. We cannot put bricks vertically. Why? A vertical brick is 4 units tall, which is more than the height of the wall.

Take N = 3, i.e. a wall of 4 x 3. The only way to fill the wall is to put three bricks horizontally; again a vertical brick does not fit.

What if N = 4, a wall with 4 x 4 dimensions? In this scenario we have two options: put four bricks horizontally or four bricks vertically. So there are two ways to fill a wall of 4 x 4 with bricks of 4 x 1.

Now,  if number of ways to fill a wall of dimension 4 x N is f(N) then f(N) for values 1, 2 and 3 is as follows.

f(1)=1, f(2)=1, f(3)=1

For a wall larger than 4 x 3, we have two choices for each brick: keep it vertical or keep it horizontal.

If we keep the brick vertical, it covers four units of the N units of wall height, and we need four vertical bricks side by side to cover the width. The problem then reduces to N-4 units.

If we keep the brick horizontal, it covers only 1 unit of wall height, hence we still need to cover N-1 units of height.
So, for N we have the relationship

f(N) = f(N-1)  + f(N-4)
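As a quick sanity check, the recurrence can be unrolled by hand for the first few values. This worked example assumes f(0) = 1 (an empty wall can be filled in exactly one way), which is the base case the implementation in this post uses.

```latex
\begin{align*}
f(4) &= f(3) + f(0) = 1 + 1 = 2 \\
f(5) &= f(4) + f(1) = 2 + 1 = 3 \\
f(6) &= f(5) + f(2) = 3 + 1 = 4 \\
f(7) &= f(6) + f(3) = 4 + 1 = 5 \\
f(8) &= f(7) + f(4) = 5 + 2 = 7
\end{align*}
```

The value f(4) = 2 agrees with the two fillings of the 4 x 4 wall counted earlier.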

We have the recurrence relation and the base conditions, let’s implement it.

Fill wall with brick : recursive implementation

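#include <stdio.h>

int findWays(int n){
        //walls of height 0 to 3 can be filled in exactly one way
        if(n == 0 || n == 1 || n == 2 || n == 3) return 1;
        //one horizontal brick (n-1 left) or four vertical bricks (n-4 left)
        return findWays(n-1) + findWays(n-4);
}

int main(void) {
	int N = 5;
	int ways = findWays(N);
	printf("%d", ways);
	return 0;
}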

Do you think this solution is optimized? If you look closely at the recursion tree of the implementation, you will see the problem: some of the subproblems are solved repeatedly. Can we avoid solving them again and again? Yes, by storing the result of each subproblem the first time it is solved; that’s called memoization.
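A minimal memoized version could look like the Python sketch below. It uses `functools.lru_cache` from the standard library as the cache, and it assumes n is non-negative; the function name `find_ways` is chosen for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def find_ways(n):
    """Number of ways to fill a 4 x n wall with 4 x 1 bricks (n >= 0 assumed)."""
    if n < 4:
        return 1                       # base cases f(0) .. f(3)
    # Each distinct n is computed once; repeated calls hit the cache.
    return find_ways(n - 1) + find_ways(n - 4)


for n in range(4, 9):
    print(n, find_ways(n))
```

With the cache in place, each value of n is evaluated only once, so the number of calls grows linearly with n instead of exponentially.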

This problem can be solved using dynamic programming because two properties hold. First, optimal substructure: the solution to the original problem is built from solutions to its subproblems. Second, overlapping subproblems.

The dynamic programming approach is to fill a table bottom up, where table[N] will be the solution. table[0] = table[1] = table[2] = table[3] = 1, as discussed above.

Now from N = 4, we can fill the table bottom up as

table[N] = table[N-1] + table[N-4]

Fill wall with brick : dynamic programming implementation

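#include <stdio.h>

void find_ways(int n, int table[]){
	int i;
	//table[0..3] are already filled in by the caller
	for(i = 4; i<= n; i++){
		table[i] = table[i-1] + table[i-4];
	}
}

int main(void) {
	int N =5;
	//at least four slots, so that the base cases always fit
	int table[N < 3 ? 4 : N+1];
	table[0] = 1;
	table[1] = 1;
	table[2] = 1;
	table[3] = 1;
	find_ways(N, table);
	printf("%d", table[N]);
	return 0;
}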

Complexity of the dynamic programming approach is O(N), with space complexity of O(N).

Scheduling weighted jobs

Suppose we have been given n jobs j1, j2, j3, …, jn with their start times s1, s2, …, sn and finish times f1, f2, f3, …, fn. There is a value vi associated with each job. The problem is to schedule a set of mutually compatible jobs such that we get the maximum total value. Two jobs are said to be compatible if their execution times do not overlap.

For example, we have four jobs as shown below:

[Figure: scheduling weighted jobs]

In the above figure, the maximum value is achieved by scheduling job 1 and job 4, for a value of 250. Notice that there is one more schedule with compatible jobs (job 1, job 2 and job 3); however, the value we get from that schedule is only 170, which is less than what we got in the earlier schedule.

Scheduling weighted jobs : Line of thoughts

There is a strong urge to use a greedy algorithm here, as the problem is very similar to the Interval Scheduling Algorithm. However, the greedy algorithm works for this problem only when the value of all jobs is equal. Since the values of jobs differ here, the greedy algorithm fails.

Let’s consider a brute force solution. First of all, sort all jobs by finish time in increasing order. Now, for each job, decide whether including it in the schedule or excluding it gives us the maximum value. When we include a job, we have to check that it is compatible with the other jobs already included in the schedule. To determine compatibility quickly, we pre-calculate an array, called P, such that

p(j) = largest index i < j such that job i is compatible with j.

For the jth job to be compatible with the ith job (i < j), the start time of the jth job should be greater than or equal to the end time of the ith job.

For example: p(8) = 5, p(7) = 3, p(2) = 0.

[Figure: scheduling-weighted-jobs]
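Because the jobs are sorted by finish time, p(j) can also be found with a binary search over the finish times instead of a linear scan. The Python sketch below is one way to do that with the standard `bisect` module. It uses 0-based indices and -1 for "no compatible job", and the function name and the (start, finish, value) tuple layout are assumptions made for this example.

```python
from bisect import bisect_right


def compute_p(jobs):
    """jobs: list of (start, finish, value), sorted by finish time.

    Returns p where p[j] is the largest index i < j whose job finishes
    no later than job j starts, or -1 if there is no such job.
    """
    finish_times = [finish for _, finish, _ in jobs]
    p = []
    for j, (start, _, _) in enumerate(jobs):
        # Count jobs among the first j that finish by `start`.
        compatible = bisect_right(finish_times, start, 0, j)
        p.append(compatible - 1)
    return p


# The four jobs from this post, sorted by finish time.
jobs = sorted([(1, 3, 50), (3, 5, 20), (6, 9, 100), (3, 12, 200)],
              key=lambda job: job[1])
print(compute_p(jobs))
```

Each lookup costs O(log n), so building the whole array takes O(n log n) once the jobs are sorted.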

Now, let’s say OPT(j) represents the maximum value we can gain from jobs 1 to j. As mentioned above, there are two cases:

Case 1: OPT selects job j. In this case we cannot use the incompatible jobs {p(j) + 1, p(j) + 2, …, j – 1}, and the schedule must include an optimal solution to the problem consisting of the remaining compatible jobs 1, 2, …, p(j).

Case 2: OPT does not select job j. In this case the schedule must include an optimal solution to the problem consisting of the remaining jobs 1, 2, …, j-1.

For case 1, we already have P[j] calculated. With P[j] prepared, we know that we don’t have to check any job between P[j] and j, as all of them conflict with the current job. The recursive formula for calculating the maximum value for n jobs is:

OPT( j) = 0 if j = 0 
          max { vj + OPT( p(j) ), OPT(j-1)} otherwise

Scheduling weighted jobs : Recursive solution

package com.company;

import java.util.Arrays;

/**
 * Created by sangar on 4.5.18.
 */
public class ScheduleWeightedJobs {

    public static int optimalScheduling(Job[] jobs, int[] nonConflictJobs, int j){
        if(j == -1){
            return 0;
        }

        return Integer.max(optimalScheduling(jobs, nonConflictJobs, nonConflictJobs[j]) + jobs[j].getValue(),
                            optimalScheduling(jobs, nonConflictJobs, j-1));
    }

    public static void main(String[] args) {

        Job[] jobs = new Job[4];
        jobs[0] = new Job(1, 3, 50);
        jobs[1] = new Job(3, 5, 20);
        jobs[2] = new Job(6, 9, 100);
        jobs[3] = new Job(3, 12, 200);

        Arrays.sort(jobs, (o1, o2) -> o1.getEndTime() - o2.getEndTime());

        int[] nonConflictingJobs = new int[jobs.length];

        for (int j = 0; j < jobs.length; j++) {
            nonConflictingJobs[j] = -1;
            for(int i = j-1; i >= 0; i--) {
                if(jobs[i].getEndTime() <= jobs[j].getStartTime()) {
                    nonConflictingJobs[j] = i;
                    break;
                }
            } 
        }

        int maxValue = optimalScheduling(jobs,nonConflictingJobs, jobs.length-1);

        System.out.println(maxValue);
    }
}

This recursive algorithm has exponential complexity, as there are a lot of subproblems which are calculated repeatedly. For example,
[Figure: Schedule weighted jobs]

The recursive execution tree for the above problem would look like
[Figure: weighted jobs scheduling]

If we revisit the problem, it has two properties. First, optimal substructure, which means an optimal solution to a subproblem leads to an optimal solution to the bigger problem. Second, overlapping subproblems: from the figure, we can see that there are subproblems which are re-calculated. The typical way to avoid this repetition is to store the solutions to subproblems; this method is called memoization. It is a kind of cache where results of subproblems are stored and looked up whenever required.

This is a typical case for dynamic programming.

Scheduling weighted jobs : Dynamic programming implementation

package com.company;

import java.util.Arrays;

/**
 * Created by sangar on 4.5.18.
 */
public class ScheduleWeightedJobs {

    public static int optimalSchedulingDP(Job[] jobs, int[] nonConflictJobs){
        if(jobs.length == 0) return 0;

        int[] optimalValue = new int[jobs.length];

        optimalValue[0] = jobs[0].getValue();

        for(int i = 1; i < jobs.length; i++){
            // nonConflictJobs[i] is -1 when no earlier job is compatible with job i
            int valueBefore = nonConflictJobs[i] == -1 ? 0 : optimalValue[nonConflictJobs[i]];
            optimalValue[i] = Integer.max(valueBefore + jobs[i].getValue(),
                                optimalValue[i-1]);
        }
        return optimalValue[jobs.length-1];
    }

    public static void main(String[] args) {

        Job[] jobs = new Job[4];
        jobs[0] = new Job(1, 3, 50);
        jobs[1] = new Job(3, 5, 20);
        jobs[2] = new Job(6, 9, 100);
        jobs[3] = new Job(3, 12, 200);

        Arrays.sort(jobs, (o1, o2) -> o1.getEndTime() - o2.getEndTime());

        int[] nonConflictingJobs = new int[jobs.length];

        for (int j = 0; j < jobs.length; j++) {
            nonConflictingJobs[j] = -1;
            for(int i = j-1; i >= 0; i--) {
                if(jobs[i].getEndTime() <= jobs[j].getStartTime()) {
                    nonConflictingJobs[j] = i;
                    break;
                }
            }
        }

        int maxValue = optimalSchedulingDP(jobs,nonConflictingJobs);

        System.out.println(maxValue);
    }
}

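As implemented, the run time complexity of the dynamic programming approach is O(n^2): sorting takes O(n log n), filling the table of maximum values takes O(n), and pre-calculating the non-conflicting job for every job with a nested loop takes O(n^2).
If the non-conflicting job is found with binary search over the finish times instead, the pre-calculation drops to O(n log n), which is then the overall complexity. If the input is pre-sorted by finish time and P is already available, filling the table takes only O(n). Note that we need additional O(n) space for storing results of subproblems.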

How about finding the solution itself, i.e. finding which jobs actually give us the optimal value? This requires some post-processing. The algorithm is as follows:

Find-Solution(j) :
 if (j = 0) output nothing
 else if (vj + Table[p(j)] > Table[j-1])
     print j
     Find-Solution(p(j))
 else Find-Solution(j-1)
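The pseudocode above can be sketched in Python as follows. This version keeps the 1-based convention of the pseudocode (index 0 means "no job"), walks back iteratively instead of recursively, and returns the chosen jobs instead of printing them. The function name `schedule` and the (start, finish, value) tuple layout are assumptions made for this example.

```python
from bisect import bisect_right


def schedule(jobs):
    """jobs: list of (start, finish, value). Returns (best_value, chosen_jobs)."""
    jobs = sorted(jobs, key=lambda job: job[1])     # sort by finish time
    finish_times = [job[1] for job in jobs]
    n = len(jobs)

    # p[j] (1-based): largest i < j whose job finishes by the start of job j.
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        p[j] = bisect_right(finish_times, jobs[j - 1][0], 0, j - 1)

    # table[j]: best value using only the first j jobs.
    table = [0] * (n + 1)
    for j in range(1, n + 1):
        table[j] = max(jobs[j - 1][2] + table[p[j]], table[j - 1])

    # Walk back through the table, as in Find-Solution.
    chosen = []
    j = n
    while j > 0:
        if jobs[j - 1][2] + table[p[j]] > table[j - 1]:
            chosen.append(jobs[j - 1])
            j = p[j]
        else:
            j -= 1
    chosen.reverse()
    return table[n], chosen


best_value, chosen_jobs = schedule(
    [(1, 3, 50), (3, 5, 20), (6, 9, 100), (3, 12, 200)])
print(best_value)
print(chosen_jobs)
```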

Interval partitioning problem

In continuation of greedy algorithm problems (earlier we discussed even scheduling and coin change problems), we will discuss another problem today. The problem is known as the interval partitioning problem and it goes like this: there are n lectures to be scheduled and a certain number of classrooms. Each lecture has a start time si and a finish time fi. The task is to schedule all lectures in the minimum number of classrooms, and there cannot be more than one lecture in a classroom at any given point of time. For example, one way to schedule these nine lectures uses 4 classrooms, as shown below.

[Figure: interval partition]

However, with some tweaks we can manage to schedule the same nine lectures in three classrooms.

So, the second solution optimizes the output.

Another variant of this problem is :  You want to schedule jobs on a computer. Requests take the form (si , fi) meaning a job that runs from time si to time fi. You get many such requests, and you want to process as many as possible, but the computer can only work on one job at a time.

Interval partitioning : Line of thought

The first thing to note about the interval partitioning problem is that we have to minimize something, in this case, the number of classrooms. What template does this problem fit into? Greedy, maybe? Yes, it fits the greedy algorithm template. In a greedy algorithm, we make each decision based on the local optimum.

Before discussing the solution, be clear about what the resource is and what needs to be minimized. In this problem, the resource is the classroom, and the total number of classrooms needs to be minimized by arranging lectures in a certain order.

There are a few natural orders in which we can arrange all lectures or, for the sake of generality, tasks. First is to arrange them in order of finish time, second is in order of start time, third is by smallest duration of task, and fourth is by minimum number of conflicting jobs. Which one to choose?
You can come up with counterexamples showing that if lectures are assigned to classrooms in order of their end time, smallest duration, or minimum number of conflicting jobs, it does not lead to an optimal solution. So, let’s pick lectures based on earliest start time. At any given point of time, pick the lecture with the least start time that is not yet scheduled, and assign it to the first available classroom. Will it work? Sure it does. When you have assigned all lectures, the total number of classrooms used will be the minimum number of classrooms required.

Interval partitioning algorithm

1. Sort all lectures based on start time in ascending order.
2. Number of initial classrooms = 0
3. While there are lectures to be scheduled:
   3.1 Take first lecture not yet scheduled,
   3.2 If there is already a classroom available at lecture's start time
       Assign lecture to that classroom.
   3.3 If not, then allocate a new classroom
       number of classrooms = number of classrooms + 1
4. Return number of classrooms.
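The steps above can be sketched directly in Python, before any data structure tuning. This version simply scans the classrooms in order and takes the first one that is free, so it is a plain translation of the steps and not the optimized implementation. `count_classrooms` and the (start, finish) tuple layout are assumptions made for this example.

```python
def count_classrooms(lectures):
    """lectures: list of (start, finish). Returns the number of classrooms used."""
    room_finish = []                       # step 2: no classrooms yet
    for start, finish in sorted(lectures):  # steps 1 and 3: by start time
        for room, last_finish in enumerate(room_finish):
            if last_finish <= start:       # step 3.2: this classroom is free
                room_finish[room] = finish
                break
        else:
            room_finish.append(finish)     # step 3.3: allocate a new classroom
    return len(room_finish)                # step 4


lectures = [(930, 1100), (930, 1300), (930, 1100), (1130, 1300), (1100, 1400),
            (1330, 1500), (1330, 1500), (1430, 1700), (1530, 1700)]
print(count_classrooms(lectures))
```

Scanning all classrooms for every lecture costs O(n) per lecture in the worst case, which is what a better choice of data structure improves on.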

Before jumping into the code, let’s discuss some data structures which we can use to implement this algorithm.

Understand that we have to find a compatible classroom for a lecture. There are many classrooms; for each we need to check whether the finish time of the last lecture in that classroom is less than or equal to the start time of the new lecture. If yes, the classroom is compatible; if there is no such classroom, allocate a new one. If we store our allocated classrooms in such a way that we always get the classroom with the least finish time of the last lecture scheduled there, we can safely say that if this classroom is not compatible, none of the others will be, because every other classroom finishes even later. Every time we assign a lecture to a classroom, we sort the list of classrooms so that the first classroom is the one with the least finish time. Sort has a complexity of O(n log n), and if we do it for all n intervals, the overall complexity of the algorithm will be O(n^2 log n).

We are sorting just to find the minimum end time across all classrooms. This can easily be achieved with a min heap or priority queue keyed on the finish time of the last lecture of each classroom. Every time the finish time of the last lecture changes for a classroom, the heap is readjusted and the root gives us the classroom with the minimum finish time.

  • To determine whether lecture j is compatible with some classroom, compare sj to key of min classroom k in priority queue.
  • When a lecture is added to a classroom,  increase key of classroom k to fj.

Well, now we have the algorithm and the data structure to implement it with, so let’s code it.

PriorityQueue implementation is given below:

import heapq
# This is our priority queue implementation
class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0
 
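    def push(self, item, priority):
        # _index breaks ties so that items themselves are never compared
        heapq.heappush(self._queue, (priority, self._index, item))
        self._index += 1
 
    def pop(self):
        # return None when the queue is empty
        if not self._queue:
            return None
        return heapq.heappop(self._queue)[-1]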

Classroom class implementation

class Classroom:
	def __init__(self, number, finish_time):
		self.class_num = number
		self.finish_time = finish_time
	def __repr__(self):
		return 'Classroom({!r})'.format(self.class_num)

Interval partitioning problem : Implementation

from PriorityQueue import PriorityQueue
from Classroom import Classroom

jobs = [(1, 930, 1100),
        (2, 930, 1300),
        (3, 930, 1100),
        (5, 1100, 1400),
        (4, 1130, 1300),
        (6, 1330, 1500),
        (7, 1330, 1500),
        (8,1430,1700),
        (9, 1530, 1700),
        (10, 1530, 1700)
]

def find_num_classrooms():
	num_classrooms = 0
	priority_queue = PriorityQueue()

	# jobs are assumed to be sorted by start time
	for job in jobs:
		# we have job here, now pop the classroom with least finishing time
		classroom = priority_queue.pop()
		if classroom is None:
			#allocate a new class
			num_classrooms += 1
			priority_queue.push(Classroom(num_classrooms, job[2]), job[2])
		else:
			#check if finish time of current classroom is
			#less than or equal to start time of this lecture
			if classroom.finish_time <= job[1]:
				classroom.finish_time = job[2]
				priority_queue.push(classroom, job[2])
			else:
				num_classrooms += 1
				#Since last classroom needs to be compared again, push it back
				#with its own (unchanged) finish time
				priority_queue.push(classroom, classroom.finish_time)
				#Push the new classroom in list
				priority_queue.push(Classroom(num_classrooms, job[2]), job[2])

	return num_classrooms

print("Number of classrooms required: " + str(find_num_classrooms()))

Java Implementation

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static int findIntervalPartitions(ArrayList<Interval> intervals){
        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals) {
            if (queue.isEmpty()) queue.add(currentInterval);
            else {
                if (queue.peek().getEndTime() > currentInterval.getStartTime()) {
                    queue.add(currentInterval);
                } else {
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }
        return queue.size();
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        int minimumClassRooms = findIntervalPartitions(intervals);
        System.out.println(minimumClassRooms);
    }
}

This algorithm takes overall time of O(n log n), dominated by the sorting of jobs on start time. The total number of priority queue operations is O(n), as we have only n lectures to schedule and each lecture needs one push and at most one pop, each costing O(log n).

There is another method, using the binary search algorithm, which can be used to solve this problem. As per the problem statement, we have to find the minimum number of classrooms to schedule n lectures. What is the maximum number of classrooms required? It is the number of lectures, n, when all lectures conflict with each other.
The minimum number of classrooms is 0, when there is no lecture to be scheduled. Now we know the range of values for the number of classrooms. How can we find the minimum?

The basic idea is that if we can schedule all n lectures in m rooms, then we can definitely schedule them in m+1 or more rooms. So the minimum number of rooms required is m or less. In this case, we can safely discard all candidate solutions from m+1 to n (remember, n is the maximum number of classrooms).
Conversely, if we cannot schedule the lectures in m rooms, then there is no way we can schedule them in fewer than m rooms. Hence we can discard all candidate solutions less than or equal to m.

How can we select m? We can select it as the mid of the range, which is initially (0, n), and try to fit all lectures into those m rooms under the condition that no two lectures in a room conflict. Keep track of the end time of the last lecture of each classroom. If none of the classrooms has an end time less than or equal to the start time of the new lecture, allocate a new classroom. If the total number of classrooms is less than or equal to m, discard m+1 to n. If it is more than m, then discard 0 to m and search from m+1 to n.

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static boolean predicate(ArrayList<Interval> intervals, long candidateClassRooms){

        //Min-heap of lectures in progress, ordered by end time; its size is the number of rooms in use

        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals){
            if(queue.isEmpty()) queue.add(currentInterval);
            else{
                if(queue.peek().getEndTime() > currentInterval.getStartTime()){
                    queue.add(currentInterval);
                }
                else{
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }

        return queue.size() <= candidateClassRooms;
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        long low = 0;
        long high = intervals.size();

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        while(low < high){
            long mid  = low + ( (high - low) >> 1);

            if(predicate(intervals, mid)){
                high = mid;
            }else{
                low = mid+1;
            }
        }
        System.out.println(low);
    }
}

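The complexity of this algorithm depends on the number of lectures to be scheduled. Sorting takes O(n log n), and the binary search calls the predicate O(log n) times. Each predicate call performs O(n) priority queue operations of O(log c) each, so the total time is O(n log n log c), with additional space complexity of O(c), where c is the number of classrooms required.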
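The predicate above always processes every lecture before comparing against the candidate. A possible variation, sketched below, stops as soon as more than m rooms are in use. It is only an illustration: the names RoomFeasibility, canSchedule and minRooms are made up for this sketch, and lectures are assumed to be given as {start, end} integer pairs instead of the Interval class used above.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class RoomFeasibility {

    // Returns true if all lectures ({start, end} pairs) fit into at most m rooms.
    // Assumption: a lecture may start exactly when another one ends in the same room.
    static boolean canSchedule(int[][] lectures, int m) {
        int[][] sorted = lectures.clone();              // keep the caller's order untouched
        Arrays.sort(sorted, (x, y) -> Integer.compare(x[0], y[0]));

        // End times of the lectures currently occupying a room, earliest on top
        PriorityQueue<Integer> endTimes = new PriorityQueue<>();
        for (int[] lecture : sorted) {
            if (!endTimes.isEmpty() && endTimes.peek() <= lecture[0]) {
                endTimes.poll();                        // reuse the room that frees up first
            }
            endTimes.add(lecture[1]);
            if (endTimes.size() > m) return false;      // more than m rooms needed, stop early
        }
        return true;
    }

    // Binary search over the number of rooms, same idea as the main method above
    static int minRooms(int[][] lectures) {
        int low = 0;
        int high = lectures.length;
        while (low < high) {
            int mid = low + ((high - low) >> 1);
            if (canSchedule(lectures, mid)) high = mid;
            else low = mid + 1;
        }
        return low;
    }

    public static void main(String[] args) {
        int[][] lectures = {
            {930, 1100}, {930, 1300}, {930, 1100}, {1130, 1300}, {1100, 1400},
            {1330, 1500}, {1330, 1500}, {1430, 1700}, {1530, 1700}
        };
        System.out.println(minRooms(lectures));
    }
}
```

The early exit does not change the worst case, but it keeps the heap at no more than m+1 entries for small candidates.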

Please share your views and suggestions in comments and feel free to share and spread the word. If you are interested to share your knowledge to learners across the world, please write to us on communications@algorithmsandme.com

Median of two sorted arrays

Before going any further, let's understand what a median is. The median is the middle value in a sorted list of numbers. To find the median, the input must be sorted from smallest to largest. If the input is not sorted, we first have to sort it and then return the middle of that list. What if the number of elements in the list is even? In that case, the median is the average of the two middle elements. The ask of this problem is to find the median of two sorted arrays.
For example:

[Figure: median of two sorted arrays]

Before going into the post, find a pen and paper and try to work out an example. And as I say in our posts, come up with a method to solve this assuming you have all the time and resources you need. I mean, think of the most brute force solution.
Let's simplify the question first and then work upwards. If the question was to find the median of one sorted array, how would you solve it?
If the array has an odd number of elements, return A[mid], where mid = (start + end)/2; if the array has an even number of elements, return the average of A[mid] and A[mid+1]. For example, for array A = [1,5,9,12,15], the median is 9. Complexity of this operation is O(1).
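The single-array case can be sketched as below. This is a minimal illustration, not part of the original implementation; the class name SingleArrayMedian is made up, and the input is assumed to be sorted and non-empty.

```java
class SingleArrayMedian {

    // Median of an already sorted, non-empty array in O(1)
    static double median(int[] sorted) {
        int n = sorted.length;
        if (n == 0) throw new IllegalArgumentException("array must not be empty");

        int mid = (n - 1) / 2;                  // (start + end) / 2 with start = 0, end = n - 1
        if (n % 2 == 1) return sorted[mid];     // odd length: the middle element

        // even length: average of the two middle elements (long avoids int overflow)
        return ((long) sorted[mid] + sorted[mid + 1]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(new int[]{1, 5, 9, 12, 15}));
        System.out.println(median(new int[]{1, 3, 5, 5, 7, 9, 10, 12, 15, 17}));
    }
}
```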

Focus back on two sorted arrays. Finding the median of two sorted arrays is no longer a simple O(1) operation. For example, for A = [1,5,9,12,15] and B = [3,5,7,10,17], the median is 8. How about merging these two sorted arrays into one, so that the problem reduces to finding the median of one array? In the above example, the merged array is C = [1,3,5,5,7,9,10,12,15,17]. Although finding the median of a sorted array is O(1), the merge step takes O(N) operations. Hence, the overall complexity is O(N). Reuse the merge part of the Merge sort algorithm to merge the two sorted arrays.
Start from the beginning of the two arrays and advance the pointer of the array whose current element is smaller than the current element of the other. This smaller element is put on to the output array, which is the sorted merged array. The merge uses additional space to store N elements (note that N here is the sum of the sizes of both sorted arrays). The best part of this method is that it does not matter whether the two arrays are of the same size or not. It works for arrays of all sizes.

This can be optimized by counting the number of elements, N, in the two arrays in advance. Then we need to merge only the first N/2+1 elements (using integer division), whether N is even or odd. This saves us about N/2 elements of space.

There is another optimization: do not store the merged elements at all; keep track of only the last two elements placed in sorted order, and count how many elements have been placed. When N/2+1 elements have been placed, return the average of the last two elements if N is even, else return the last element as the median. With this optimization, time complexity remains O(N); however, space complexity reduces to O(1).

Median of two sorted arrays implementation

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedian(int[] A, int[] B){
        int[] temp = new int[A.length + B.length];

        int i = 0;
        int j = 0;
        int k = 0;
        int lenA = A.length;
        int lenB = B.length;

        while(i<lenA && j<lenB){
            if(A[i] <= B[j]){
                temp[k++] = A[i++];
            }else{
                temp[k++] = B[j++];
            }
        }
        while(i<lenA){
            temp[k++] = A[i++];
        }
        while(j<lenB){
            temp[k++] = B[j++];
        }

        int lenTemp = temp.length;

        if((lenTemp)%2 == 0){
            return ( temp[lenTemp/2-1] + temp[lenTemp/2] )/2.0;
        }
        return temp[lenTemp/2];
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedian(a,b);
        System.out.println("Median is " + median);
    }
}

Complexity to find median of two sorted arrays using merge operation is O(N).
Optimized version to find median of two sorted arrays

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedianOptimized(int[] A, int[] B){
        int i = 0;
        int j = 0;
        int lenA = A.length;
        int lenB = B.length;

        int mid = (lenA + lenB)/2;
        int midElement = -1;
        int midMinusOneElement = -1;

        /*
          Walk through the first mid+1 elements in merged order,
          remembering only the last two of them.
        */
        for(int k=0; k<=mid; k++){
            midMinusOneElement = midElement;
            if(i<lenA && (j>=lenB || A[i] <= B[j])){
                midElement = A[i++];
            }else{
                midElement = B[j++];
            }
        }

        //Divide by 2.0 so that the average is not truncated
        if((lenA+lenB)%2 == 0){
            return (midElement + midMinusOneElement)/2.0;
        }
        return midElement;
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedianOptimized(a,b);
        System.out.println("Median is " + median);
    }
}

Median of two sorted array using binary search

One property which leads us to think about binary search is that the two arrays are sorted. Before going deep into how the binary search algorithm can solve this problem, let's first find the mathematical condition which should hold true for the median of two sorted arrays.
As explained above, the median divides the input into two equal parts, so the first condition a median position satisfies is that the parts on its left and on its right are of equal size. We have two arrays A and B; let's split each of them into two. The first array, A, is of size m, and it can be split in m+1 ways, at any position from 0 to m. If we split at i, length(A_left) = i and length(A_right) = m-i.

When i=0, len(A_left) =0 and when i=m, len(A_right) = 0.

Similarly for array B, we can split it into n+1 way, j being from 0 to n.

After split at specific indices i and j, how can we derive condition for median, which is left part of array should be equal to right part of array?

If len(A_left) + len(B_left) == len(A_right) + len(B_right) , it satisfies our condition. As we already know these values for split at i and j, equation becomes

i+j = m-i + n-j

[Figure: median of two sorted arrays]

But is this the only condition to satisfy for the median? As the median is the middle of the sorted list, we also have to guarantee that all elements in the left part are less than or equal to the elements in the right part.
It is a must that the max of the left part is less than or equal to the min of the right part. What is the max of the left part? It can be either A[i-1] or B[j-1]. What is the min of the right part? It can be either A[i] or B[j]. We already know that A[i-1] <= A[i] and B[j-1] <= B[j], as arrays A and B are sorted. All we need to check is whether A[i-1] <= B[j] and B[j-1] <= A[i]. If indices i and j satisfy these conditions, then the median is the average of the max of the left part and the min of the right part if n+m is even, and max(A[i-1], B[j-1]) if n+m is odd.

Let's make an assumption that n >= m; then j = (n+m+1)/2 - i always gives a non-negative integer j for all possible values of i (0 to m), which avoids array out of bound errors and automatically makes the first condition true. The +1 puts the extra element in the left part when n+m is odd, which is why the max of the left part is the median in that case.
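To see that this choice of j keeps the two halves balanced, the small sketch below computes j for every possible i. The class and method names are made up for illustration; m and n are the sizes of A and B, with m <= n assumed.

```java
class PartitionSizes {

    // For a cut at i in A (size m), returns the matching cut j in B (size n).
    // Assumes m <= n, so that the result always lies in 0..n.
    static int cutInB(int m, int n, int i) {
        return (m + n + 1) / 2 - i;
    }

    public static void main(String[] args) {
        int m = 3;
        int n = 4;
        for (int i = 0; i <= m; i++) {
            int j = cutInB(m, n, i);
            int left = i + j;                   // len(A_left) + len(B_left)
            int right = (m - i) + (n - j);      // len(A_right) + len(B_right)
            System.out.println("i=" + i + " j=" + j + " left=" + left + " right=" + right);
        }
    }
}
```

For every i the left part has the same number of elements as the right part when m+n is even, and exactly one more when m+n is odd.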

Now, problem reduces to find index i such that A[i-1] <= B[j] and B[j-1]<=A[i] is true.

This is where binary search comes into the picture. We can start with i as the mid of array A and j = (n+m+1)/2-i, and see if this i satisfies the condition. There are three possible outcomes.
1. A[i-1] <= B[j] and B[j-1] <= A[i] is true; we return the index i.
2. B[j-1] > A[i]. In this case, A[i] is too small. How can we increase it? By moving towards the right. If i is increased, A[i] can only grow, and j decreases, so B[j-1] can only shrink. Together these move us towards B[j-1] <= A[i]. So, limit the search space for i to mid+1 to m and go to step 1.
3. A[i-1] > B[j], which means A[i-1] is too big, and we must decrease i to get A[i-1] <= B[j]. Limit the search space for i to 0 to mid-1 and go to step 1.

Let's take an example and see how this works. Our initial two arrays are as follows.

Index i is the mid of array A, and the corresponding j is as shown.

Since the condition B[j-1] <= A[i] is not met, we discard the left of A and the right of B, and find new i and j based on the remaining array elements.

Finally, our condition that A[i-1] <= B[j] and B[j-1] <= A[i] is satisfied. Find the max of the left and the min of the right and, based on whether the total length of the two arrays is even or odd, return the average of the max of left and min of right, or return the max of left.
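The walk-through can also be followed in code. The sketch below only records which (i, j) cuts the search tries; it uses the two arrays from the main methods in this post, and the class name PartitionTrace is made up for illustration. Both arrays are assumed sorted, with A not longer than B.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionTrace {

    // Returns the (i, j) cuts tried by the binary search, in order.
    // Assumes both arrays are sorted and A.length <= B.length.
    static List<int[]> trace(int[] A, int[] B) {
        List<int[]> steps = new ArrayList<>();
        int iMin = 0;
        int iMax = A.length;
        int half = (A.length + B.length + 1) / 2;

        while (iMin <= iMax) {
            int i = (iMin + iMax) / 2;
            int j = half - i;
            steps.add(new int[]{i, j});

            if (i < A.length && B[j - 1] > A[i]) {
                iMin = i + 1;       // A[i] is too small, move the cut in A to the right
            } else if (i > 0 && A[i - 1] > B[j]) {
                iMax = i - 1;       // A[i-1] is too big, move the cut in A to the left
            } else {
                break;              // both conditions hold, the cut is found
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 6, 7, 8, 9, 11};
        int[] b = {1, 4, 6, 8, 12, 14, 15, 17};
        for (int[] step : trace(a, b)) {
            System.out.println("i=" + step[0] + " j=" + step[1]);
        }
    }
}
```

For these arrays the search tries i=4, then i=6, and settles on i=5 with j=3, where max of left is 7 and min of right is 8, giving the median 7.5.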

This algorithm has a dangerous implementation caveat: what if i or j is 0? In that case i-1 or j-1 is an invalid index. Likewise, when i is m or j is n, A[i] or B[j] is invalid. When can j be zero? Only when i == m. As long as i < m, there is no need to worry about j being zero. So be sure to check i < m before comparing B[j-1] with A[i], and i > 0 before comparing A[i-1] with B[j].

Implementation

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedianWithBinarySearch(int[] A, int[] B){

        int[] temp;

        int lenA = A.length;
        int lenB = B.length;

        /*We want array A to be never longer than B
          so that j is never negative
         */
        if(lenA > lenB){
            temp = A;
            A = B;
            B = temp;
        }

        int iMin = 0;
        int iMax = A.length;
        int midLength =  ( A.length + B.length + 1 )/2;

        int i = 0;
        int j = 0;

        while (iMin <= iMax) {
            i = (iMin + iMax) / 2;
            j = midLength - i;
            if (i < A.length && B[j - 1] > A[i]) {
                // i is too small, must increase it
                iMin = i + 1;
            } else if (i > 0 && A[i - 1] > B[j]) {
                // i is too big, must decrease it
                iMax = i - 1;
            } else {
                // i is perfect
                int maxLeft = 0;
                //Nothing of A is in the left part, max comes from B
                if (i == 0) maxLeft = B[j - 1];
                //Nothing of B is in the left part, max comes from A
                else if (j == 0) maxLeft = A[i - 1];
                //We are in middle somewhere, we have to find max
                else maxLeft = Integer.max(A[i - 1], B[j - 1]);

                //If length of two arrays is odd, return max of left
                if ((A.length + B.length) % 2 == 1)
                    return maxLeft;

                int minRight = 0;
                if (i == A.length) minRight = B[j];
                else if (j == B.length) minRight = A[i];
                else minRight = Integer.min(A[i], B[j]);

                return (maxLeft + minRight) / 2.0;
            }
        }
        return -1;
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedianWithBinarySearch(a,b);
        System.out.println("Median is " + median);
    }
}

Complexity of this algorithm to find median of two sorted arrays is O(log(min(m,n))), where m and n are the sizes of the two arrays, because the binary search runs only over the smaller array.

Leaders in array

In the last post, we discussed inversions in array. Here is one more problem along similar lines: given an array of integers, find all leaders in the array. First of all, let's understand what a leader is. A leader is an element of the array which is greater than all elements on its right side. For example:
In the array below, elements 8, 5 and 4 are leaders. Note that the element at index 6 is a leader but the one at index 1 is not.

[Figure: leaders in array]

Another example; in this one there are only two leaders, which are 10 and 9.

[Figure: leaders in array, second example]

A clarifying question which becomes evident from the examples is whether the last element is considered a leader. Based on the answer from the interviewer, the function should or should not print the last element.

Leaders in array : thought process

What is brute force approach? Scan through all elements in array one by one and check if there is any greater element on right side. If there is no such element, number is leader in array.

package com.company;

import java.util.ArrayList;
import java.util.Stack;

/**
 * Created by sangar on 7.4.18.
 */
public class Leaders {

    public static ArrayList<Integer> findLeaders(int[] a){
        ArrayList<Integer> leaders = new ArrayList<>();

        for(int i=0; i<a.length; i++){
            int j = 0;
            for(j=i+1; j<a.length; j++){
                if(a[i] < a[j]){
                    break;
                }
            }
            if(j==a.length) leaders.add(a[i]);
        }

        return  leaders;

    }

    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        ArrayList<Integer> leaders = findLeaders(a);
        System.out.print("Leaders : " + leaders);
    }
}

Complexity of brute force solution to find leaders in array is O(n^2).

Let's go back to the basics of the question: all elements on the right side of an element should be less than it for that element to be a leader. Starting from index 0, we can assume that A[0] is a leader and move forward. Remove A[0] if A[1] > A[0], as A[0] is not a leader anymore. Now, if A[2] > A[1], then A[1] cannot be a leader.
What if A[3] < A[2]? Then A[2] may still be a leader, and A[3] may also be one.
What if A[4] > A[3]? Then A[3] cannot be a leader. Can A[2] be a leader? It depends on whether A[4] is less or more than A[2]. For each element, we go back through all previous candidate leaders in reverse order and drop all candidates which are less than the current element. Does it ring a bell? Well, the data structure which supports this kind of Last In First Out operation is a stack.
A stack supports two operations: push and pop. The question is when to push elements on to and pop elements from the stack for our problem.

Push the element if it is less than the top of the stack. If the top of the stack is less than the current element, pop elements from the stack until the top is not less than the current element or the stack is empty, then push the current element. When the entire array is scanned, the stack contains all leaders.

    • Start with empty stack. Push first element of array on to it.
    • For each remaining element in array
        • While stack is not empty and current element is greater than top, pop element.
        • Push current element on to stack.
    • At the end of processing, stack will contain all leaders.

Leaders in array : Implementation using stack

package com.company;

import java.util.ArrayList;
import java.util.Stack;

/**
 * Created by sangar on 7.4.18.
 */
public class Leaders {

    public static ArrayList<Integer> findLeadersUsingStack(int[] a){
        ArrayList<Integer> leaders =new ArrayList<>();

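        Stack<Integer> s = new Stack<>();
        if(a.length == 0) return leaders;
        s.push(a[0]);

        for(int i=1; i<a.length; i++){
            //Drop every candidate smaller than current element;
            //stop when stack is empty to avoid EmptyStackException
            while(!s.empty() && s.peek() < a[i]){
                s.pop();
            }
            s.push(a[i]);
        }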

        while (!s.empty()){
            leaders.add(s.pop());
        }
        return leaders;
    }
    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        ArrayList<Integer> leaders = findLeadersUsingStack(a);
        System.out.print("Leaders : " + leaders);
    }
}

Complexity of algorithm using stack to find leaders in array is O(n) with extra O(n) space complexity.

Scanning array in reverse
How can we avoid the additional space used by the stack? When we scan forward, there is always a chance that some later element is greater than the current candidate leader. That is why we keep track of all candidate leaders. How about scanning the array from the end, in reverse order? Start with the last index and keep track of the maximum seen so far. If the element at the current index is greater than the current max, save it as a leader and change the current max to that element.

Algorithm to find leaders without extra space
  • Set current max to a value smaller than every element (Integer.MIN_VALUE in the implementation), so that the last element is always reported as a leader.
  • For i = n-1 down to 0
    • if a[i] is greater than current max
      • add a[i] to leaders.
      • change current max to a[i]

Leaders in array implementation without extra space

package com.company;

import java.util.ArrayList;
import java.util.Stack;

/**
 * Created by sangar on 7.4.18.
 */
public class Leaders {

    public  static ArrayList<Integer> findLeadersWithoutExtraSpace(int[] a){
        ArrayList<Integer> leaders =new ArrayList<>();

        int currentMax = Integer.MIN_VALUE;
        for(int i=a.length-1; i>=0; i--){
            if(a[i] > currentMax ){
                currentMax = a[i];
                leaders.add(a[i]);
            }
        }

        return leaders;
    }
    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        ArrayList<Integer> inversions = findLeadersWithoutExtraSpace(a);
        System.out.print("Leaders : " + inversions);
    }
}

Complexity of reverse array algorithm to find leaders in array is O(n) with no added space complexity.
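One thing to note about the reverse scan is that leaders are collected from right to left. If the caller expects them in array order, the result can be reversed at the end, as in the sketch below. The class name LeadersInOrder is made up for illustration, and the sketch assumes that the last element counts as a leader.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LeadersInOrder {

    // Returns leaders in the order in which they appear in the array.
    // Assumption: the last element is always treated as a leader.
    static List<Integer> leaders(int[] a) {
        List<Integer> result = new ArrayList<>();
        if (a.length == 0) return result;

        int currentMax = a[a.length - 1];
        result.add(currentMax);

        // Scan from right to left, keeping the maximum seen so far
        for (int i = a.length - 2; i >= 0; i--) {
            if (a[i] > currentMax) {
                currentMax = a[i];
                result.add(a[i]);
            }
        }

        Collections.reverse(result);    // collected right to left, report left to right
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Leaders : " + leaders(new int[]{90, 20, 30, 40, 50}));
    }
}
```

Seeding currentMax with the last element, instead of Integer.MIN_VALUE, also covers an array whose last element is Integer.MIN_VALUE itself.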


Inversions in array

Let A[0..n-1] be an array of n distinct positive integers. If i < j and A[i] > A[j], then the pair (i, j) is called an inversion of A. Given n and an array A, find the number of inversions in array A. For example: the first array has two inversions, (2,1) and (5,1), whereas the second array has 3 inversions, (2,1), (4,1) and (4,3).

[Figure: inversions in array]

How many inversions can there be in a sorted array? There is no inversion in a sorted array, and there are nC2 = n(n-1)/2 inversions in a completely inverted array, because every pair of elements is out of order.
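The two extremes are easy to check with a small sketch. The class and method names below are made up for illustration; the count uses long because n(n-1)/2 overflows int for large n.

```java
class InversionBounds {

    // Maximum possible number of inversions for n distinct elements: nC2
    static long maxInversions(int n) {
        return (long) n * (n - 1) / 2;
    }

    // Straightforward pair-by-pair count, used here only to confirm the bounds
    static long countBruteForce(int[] a) {
        long count = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i] > a[j]) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countBruteForce(new int[]{1, 2, 3, 4, 5}));   // sorted array
        System.out.println(countBruteForce(new int[]{5, 4, 3, 2, 1}));   // completely inverted array
        System.out.println(maxInversions(5));
    }
}
```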

Inversions in array : Thought process

What is the first thing which comes to mind? For each index i, check all j where j > i and see if A[j] < A[i].
If A[j] is less than the current element A[i], increase the inversion count. Implementation is given below.

package com.company;

/**
 * Created by sangar on 6.4.18.
 */
public class Inversions {
    public static int findInversions(int[] a){
        int count = 0;
        for(int i=0; i<a.length; i++){
            for(int j=i+1;  j<a.length; j++){
                if(a[i] > a[j]) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        int inversions = findInversions(a);
        System.out.print("Inversions : " + inversions);
    }
}

Worst case complexity of this method to find inversions in array is O(n^2).

Can we use the information that a sorted array does not have any inversions? Let's ask this question again with a tweak: how many inversions are there if exactly one pair of adjacent elements is swapped in a completely sorted array? There is one. What if two separate adjacent pairs are swapped? Of course, there are 2 inversions. Are we getting somewhere? Effectively, we have to count how many swaps of adjacent elements are needed to completely sort the array from its original state. If you noticed, the brute force algorithm has the same loop structure as selection sort; instead of swapping elements, it counts inversions.

What if we use another sort algorithm, like the Merge sort algorithm? The complexity of merge sort is much better than that of selection sort, so it is possible that merge sort identifies inversions much more efficiently than selection sort.
Let's use the merge step of the merge sort algorithm with a small modification. The divide part remains the same. While merging two sorted subarrays, A on the left and B on the right, count the number of times an element from the right part is put on the result array before elements from the left part.
Every time A[i] is appended to the output, no new inversions are encountered, since A[i] is not greater than anything left in array B. If B[j] is appended to the output, then it is smaller than all the remaining items in A, so we increase the inversion count by the number of elements remaining in A.
Overall inversions are the inversions in the left part + inversions in the right part + inversions counted in the merge step.

Let's see how inversions in merge steps are counted and then we will see the overall algorithm.

[Figure: counting inversions in the merge step; total number of inversions is 6]

Overall, algorithm looks like below.

[Figure: inversions in array using merge sort]

Algorithm to count inversions in array

First, let's write an algorithm to count inversions in the merge step. When we are at the merge step, there are two sorted arrays with us: A and B

  1. Initialize i as start position of A and j as start position of B. These pointers refer to the currently compared indices in both arrays.
  2. While there are elements in both arrays, i.e. i < length(A) && j < length(B)
    1.  If B[j] < A[i], all elements from i to length(A)-1 are greater than B[j],
      count += number of elements remaining in A. Increment j
    2. Else increment i
  3. Return count
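The steps above can be sketched on two separate sorted arrays as follows. This is a counting-only illustration with made-up names; unlike the full implementation in this post, it does not write the merged output.

```java
class MergeStepCount {

    // Counts pairs (x from A, y from B) with x > y.
    // Assumes A and B are each sorted in ascending order.
    static long countCrossInversions(int[] A, int[] B) {
        long count = 0;
        int i = 0;
        int j = 0;
        while (i < A.length && j < B.length) {
            if (B[j] < A[i]) {
                // A[i..] are all greater than B[j], each one forms an inversion with it
                count += A.length - i;
                j++;
            } else {
                i++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countCrossInversions(new int[]{20, 30, 90}, new int[]{40, 50}));
    }
}
```

In the example, only 90 is greater than 40 and 50, so the merge step contributes two inversions.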

Replace merge part of merge sort with this piece of algorithm and return sum of inversions in left + inversions in right + inversions in merge from function.

MergeSortAndCount(L):

  1. If L has one element return 0
  2. Divide L into A, B
    1. inversionsLeft = MergeSortAndCount(A)
    2. inversionsRight = MergeSortAndCount(B)
    3. inversionsMerge = MergeAndCount(A,B)
  3. return inversionsLeft + inversionsRight + inversionsMerge

Inversions in array implementation

package com.company;

/**
 * Created by sangar on 6.4.18.
 */
public class Inversions {
    public  static int mergeAndCount(int[] a, int low, int mid, int high){
        int count  =  0;
        int[] temp = new int[high-low+1];

        int i = low;
        int j = mid+1;
        int k = 0;
        /*
            There are elements on both side of array
        */
        while(i<=mid && j<=high){

            if(a[i] > a[j]){
                //Number of elements remaining on left side.
                count+= (mid - i + 1);
                temp[k++] = a[j++];
            }
            else{
                temp[k++] = a[i++];
            }
        }
        while(i<=mid){
            temp[k++] = a[i++];
        }
        while(j<=high) {
            temp[k++] = a[j++];
        }

        for(i=low; i<k+low; i++){
            a[i] = temp[i-low];
        }

        return count;
    }

    public static int countInversions(int[] a, int low, int high){
        if(low >= high) return 0;

        int mid = low + (high - low) / 2;

        int inversionsLeft = countInversions(a, low, mid);
        int inversionsRight = countInversions(a, mid+1, high);
        int inversionsMerge = mergeAndCount(a, low, mid, high);

        return inversionsLeft + inversionsRight + inversionsMerge;
    }


    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        int inversions = countInversions(a, 0, a.length-1);
        System.out.print("Inversions : " + inversions);
    }
}

Complexity of finding inversions in arrays using merge sort method is  O(n log n).

Pair with given sum in array

Given an array a[] and a number X, find two elements or pair with given sum X in array. For example:

Given array : [3,4,5,1,2,6,8] X = 10
Answer could be (4,6) or (2,8).

Before looking at the post below, we strongly recommend that you take a pen and paper and give it a try yourself.

Pair in array with given sum : thought process

Ask some basic questions about the problem; it's a good way to dig deeper into the problem and gain more confidence. Remember, interviewers are not trained interrogators; they slip a hint or two about the solution when you ask relevant questions.

  • Is the array sorted? If not, think about the additional complexity you would be adding to sort it.
  • Are duplicates present in the array?
  • Is returning the first pair enough, or should we return all pairs with sum equal to X?
  • Can there be negative numbers in the array?

This problem is used regularly in interviews because it tests so many things about your programming knowledge.
It validates whether you can traverse an array properly, respecting both lower and upper bounds. It also checks your ability to optimize once you have a working solution. Can you work with additional constraints? Are you able to work with more than one data structure, like an array and a hash together, to solve a problem?

Find pairs with given sum : Using sorting

Let's go with the assumption that the input is a sorted array; if it is not, we will sort it. If you want to know how to sort an array efficiently, refer to Quick sort or Merge sort.
With a sorted array, we can apply the algorithm below to find a pair with the given sum.

  1. Initialize two variables, left = 0 and right = array.length-1. These variables are used to traverse the array from its two ends.
  2. While left and right do not cross each other, repeat steps 3 to 6.
  3. Get the sum of the elements at index left and right, i.e. A[left] + A[right].
  4. If the sum is greater than X, move towards the left from the end, i.e. decrease right by 1.
  5. Else if the sum is less than X, move towards the right from the start, i.e. increment left.
  6. Else the sum is equal to X, so return (left, right) as the pair.

Example

Let's see how this works with an example and then we will implement it. Given an array as shown and sum = 17, find all pairs which sum to 17.

Initialization step, left = 0 and right = array.length – 1

A[left] + A[right] = 20 which is greater than sum (17), move right towards left by 1.

Again, A[left] + A[right] = 18 which is greater than sum (17), move right towards left by 1.

At this point, A[left] + A[right] is less than sum(17), hence move left by 1

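Now, A[left] + A[right] is equal to sum, so add this pair to the result array. Also, decrease right by 1. Why? In a sorted array of distinct values, the element at right cannot form the sum with any other element, so we are done with it.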

At this point, A[left] + A[right] is less than sum(17), hence move left by 1

Again, A[left] + A[right] is less than sum(17), hence move left by 1

A[left] + A[right]  is equal to sum and so add this pair in result array. Also, decrease right by 1.

Since, left and right point to same element now, there cannot be a pair anymore, hence return.

package com.company;

import javafx.util.Pair;

import java.util.ArrayList;

/**
 * Created by sangar on 5.4.18.
 */
public class PairWithGivenSum {
    public static ArrayList<Pair<Integer, Integer>> pairWithGivenSum(int[] a, int sum){
        int left = 0;
        int right = a.length - 1;

        ArrayList<Pair<Integer, Integer>> resultList = new ArrayList<>();

        while(left < right){
            int current = a[left] + a[right];
            /*If sum of two elements is greater than
              sum required, move towards left */
            if(current > sum) right--;
            /*
              If sum of two elements is less than
              sum required, move towards right
            */
            else if(current < sum) left++;
            //Exactly one branch runs per iteration, so left never crosses right
            else{
                resultList.add(new Pair<>(left, right));
                right--;
            }
        }
        return resultList;
    }
    public static void main(String[] args) {
        int a[] = new int[] {10, 20, 30, 40, 50};

        ArrayList<Pair<Integer, Integer>> result = pairWithGivenSum(a,50);
        for (Pair<Integer, Integer> pair : result ) {
            System.out.println("("+ pair.getKey() + "," + pair.getValue()  + ")");
        }
    }
}

Complexity of this algorithm to find a pair of numbers in array with sum X depends on the sorting algorithm used, since the two-pointer scan itself is O(n). If it is merge sort, complexity is O(n log n) with added space complexity of O(n). If quick sort is used, worst case complexity is O(n^2) and there is no added space beyond the recursion stack.
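If the input is not sorted and should not be modified, one option is to sort a copy and run the same two-pointer scan on it. The sketch below does that and returns the values of each pair instead of their indices, since indices into the sorted copy would not match the original array. The names are made up for illustration, and distinct values are assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PairSumSorted {

    // Returns all {smaller, larger} value pairs adding up to sum.
    // Assumes distinct values; the input array is left untouched.
    static List<int[]> pairsWithSum(int[] input, int sum) {
        int[] a = input.clone();
        Arrays.sort(a);

        List<int[]> pairs = new ArrayList<>();
        int left = 0;
        int right = a.length - 1;
        while (left < right) {
            long current = (long) a[left] + a[right];   // long avoids int overflow
            if (current > sum) {
                right--;
            } else if (current < sum) {
                left++;
            } else {
                pairs.add(new int[]{a[left], a[right]});
                left++;     // with distinct values, neither element can be part
                right--;    // of another pair, so both pointers move
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (int[] pair : pairsWithSum(new int[]{3, 4, 5, 1, 2, 6, 8}, 10)) {
            System.out.println("(" + pair[0] + "," + pair[1] + ")");
        }
    }
}
```

For the array from the problem statement and X = 10 this finds (2,8) and (4,6).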

Find a pair with given sum in array : Without sorting

In the first method, the array is modified when it is not already sorted. Also, the preprocessing step (sorting) dominates the complexity of the algorithm. Can we do better than O(n log n), or in other words, can we avoid sorting?

The additional constraint put on the problem is that you cannot modify the original input. Use basic mathematics: if A + B = C, then A = C-B. Consider each element as B, for which we are looking for an A. The idea is to scan the entire array and record the A required for each element. Then scan the array again and check whether there was a B which required the current element as its A.
To keep track of the required A values, we create a hash; this makes each lookup in the second step O(1).
We can optimize further by scanning the array only once for both steps.

1. Create a hash.
2. Check the element at each index of the array.
    2.a If the element at the current index is already a key in the hash, return the pair of the index stored in the hash and the current index.
    2.b If not, subtract the element from sum and store the (sum-A[index], index) key-value pair in the hash.

This algorithm scans the array only once and does not change the input. Time complexity is O(n), assuming O(1) hash lookups, and the hash brings additional space complexity. How big should the hash be if it is a plain array indexed by value? Every value between sum - max and sum - min (max and min being the largest and smallest elements of the array) is a candidate A, so the hash needs max - min + 1 slots.

This solution does not work in C if there are negative numbers in the array and the hash is a plain array indexed by value, because a negative value cannot be used as an array index. It works as-is in languages which have a built-in HashMap. For C, we have to do some preprocessing, like adding the absolute value of the smallest negative number to all elements. That's where our fourth question above helps us decide.
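The array-as-hash idea can be sketched as below. This is only an illustration, not the implementation from this post: the class name PairWithGivenSumTable and the method pairsWithTable are made up for the example, and instead of adding an offset to every element it shifts the table index by sum - max, which has the same effect of keeping indexes non-negative. It assumes the range max - min is small enough to allocate a table for and that the arithmetic does not overflow int.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a plain int[] table used as the hash.
class PairWithGivenSumTable {

    // Returns index pairs {i, j} with i < j and a[i] + a[j] == sum.
    static List<int[]> pairsWithTable(int[] a, int sum) {
        List<int[]> result = new ArrayList<>();
        if (a.length == 0) return result;

        int min = a[0], max = a[0];
        for (int value : a) {
            min = Math.min(min, value);
            max = Math.max(max, value);
        }

        // Required values (sum - a[i]) lie in [sum - max, sum - min],
        // so shifting by base maps them to slots 0 .. max - min.
        int base = sum - max;
        int[] table = new int[max - min + 1];
        Arrays.fill(table, -1); // -1 means "nobody needs this value yet"

        for (int i = 0; i < a.length; i++) {
            int slot = a[i] - base;
            // Is some earlier element waiting for a[i]?
            if (slot >= 0 && slot < table.length && table[slot] != -1) {
                result.add(new int[] {table[slot], i});
            }
            // Record that index i needs the value sum - a[i].
            table[(sum - a[i]) - base] = i;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] a = {-3, 7, 1, 5};
        for (int[] pair : pairsWithTable(a, 4)) {
            System.out.println("(" + pair[0] + "," + pair[1] + ")");
        }
    }
}
```

The table trades memory for simplicity: it needs max - min + 1 slots regardless of how many elements the array has, so a HashMap is the better choice when values are spread far apart.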

Pairs with given sum : implementation

package com.company;

import javafx.util.Pair;

import java.util.ArrayList;
import java.util.HashMap;

/**
 * Created by sangar on 5.4.18.
 */
public class PairWithGivenSum {
    public static ArrayList<Pair<Integer, Integer>> pairsWithGivenSum2(int[] a, int sum){
        ArrayList<Pair<Integer, Integer>> resultList = new ArrayList<>();

        // Maps a required value (sum - a[j]) to the index j that needs it
        HashMap<Integer, Integer> pairMap = new HashMap<>();
        for(int i=0; i< a.length; i++){
            // Some earlier element is waiting for a[i] to complete the sum
            if(pairMap.containsKey(a[i])){
                resultList.add(new Pair<>(pairMap.get(a[i]), i));
            }
            pairMap.put(sum-a[i], i);
        }
        return resultList;
    }
    public static void main(String[] args) {
        int a[] = new int[] {10, 20, 30, 40, 50};

        ArrayList<Pair<Integer, Integer>> result = pairsWithGivenSum2(a,50);
        for (Pair<Integer, Integer> pair : result ) {
            System.out.println("("+ pair.getKey() + "," + pair.getValue()  + ")");
        }
    }
}

Please share if there is some error or a suggestion to improve. We would love to hear what you have to say. If you want to contribute to the learning process of others by sharing your knowledge, please write to us at communications@algorithmsandme.com

Number of occurrences of element

Given a sorted array and a key, find the number of occurrences of the key in that array. For example, in the array below, the number of occurrences of 3 is 3.

number of occurrences of element

The brute force method is to scan through the array, find the first instance of the element, then find the last instance, and then do the math. Complexity of that method is O(n). Can we do better than that?
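For reference, the brute force scan described above might look like the sketch below. The class name OccurrencesBruteForce and the method countOccurrences are made up for this example; it is meant as a baseline to compare against, not as the final solution.

```java
// Hypothetical baseline: one linear pass over a sorted array.
class OccurrencesBruteForce {

    // O(n) time, O(1) extra space.
    static int countOccurrences(int[] a, int key) {
        int first = -1, last = -1;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) {
                if (first == -1) first = i; // first instance seen
                last = i;                   // latest instance seen so far
            }
        }
        // The array is sorted, so all instances sit between first and last.
        return (first == -1) ? 0 : last - first + 1;
    }

    public static void main(String[] args) {
        int[] input = {3, 10, 11, 15, 17, 17, 17, 20};
        System.out.println(countOccurrences(input, 17)); // prints 3
    }
}
```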

Did you get a hint when the brute force method was described? Yes, we have already cracked the problem of finding the first occurrence and the last occurrence in O(log n) complexity earlier. We will use those two methods; all we need to do now is the math.

occurrences = lastInstance - firstInstance + 1

Number of occurrences of element : Implementation.

package com.company;

/**
 * Created by sangar on 25.3.18.
 */
public class BinarySearchAlgorithm {

    private static boolean isGreaterThanEqualTo(int[] a, int index, int key){
        if(a[index] >= key) return true;

        return false;
    }

    public static int findFirstInstance(int[] a, int start, int end, int key){

        while(start < end){
            int mid = start + (end - start) / 2;

            if(isGreaterThanEqualTo(a, mid, key)){
                end = mid;
            }
            else{
                start = mid + 1;
            }
        }

        return (a[start] == key) ? start : -1;
    }

    private static boolean isLessThanEqualTo(int[] a, int index, int key){
        if(a[index] <= key) return true;

        return false;
    }

    public static int findLastInstance(int[] a, int start, int end, int key){

        while(start < end){
            int mid = start +( (end - start) + 1) / 2;

            if(isLessThanEqualTo(a, mid, key)){
                start = mid;
            }
            else{
                end = mid - 1;
            }
        }
        return (a[start] == key) ? start : -1;
    }

    public static int numberOfOccurrences(int[] a, int key){
        // Empty array: nothing to search, and a[start] would be out of bounds
        if(a.length == 0) return 0;

        int firstInstance = findFirstInstance(a, 0, a.length-1, key);
        int lastInstance = findLastInstance(a, 0, a.length-1, key);

        return (firstInstance != -1) ? lastInstance-firstInstance + 1 : 0;
    }

    public static void main(String[] args) {
        int[] input = {3,10,11,15,17,17,17,20};

        int count = numberOfOccurrences(input,3);
        System.out.print("Number of occurrences : " + count);

    }
}

Worst case time complexity of the algorithm to find the number of occurrences of an element in a sorted array is O(log n). We use the iterative method to find the first and last instances, so there is no hidden space cost from recursion and the extra space is O(1).

Please share if there is something wrong or missing. Also if you want to contribute to algorithms and me, please drop an email at communications@algorithmsandme.com