Interval partitioning problem

Interval partitioning problem

In continuation of greedy algorithm problem, (earlier we discussed : even scheduling and coin change problems) we will discuss another problem today. Problem is known as interval partitioning problem and it goes like : There are n lectures to be schedules and there are certain number of classrooms. Each lecture has a start time si and finish time fi. Task is to schedule all lectures in minimum number of classes and there cannot be more than one lecture in a classroom at a given point of time. For example, minimum number of classrooms required to schedule these nine lectures is 4 as shown below.

interval partition

However,  we can do some tweaks and manage to schedule same nine lectures in three classrooms as shown below.

So, second solution optimizes the output.

Another variant of this problem is :  You want to schedule jobs on a computer. Requests take the form (si , fi) meaning a job that runs from time si to time fi. You get many such requests, and you want to process as many as possible, but the computer can only work on one job at a time.

Interval partitioning : Line of thought

First thing to note about interval partitioning problem is that we have to minimize something, in this case, number of classrooms. What template this problem fits into? Greedy may be? Yes it fits into greedy algorithm template. In greedy algorithm we take decision on local optimum.

Before discussing the solution, be clear that what is resource and what needs to be minimized? In this problem, resource is classroom and total number of classroom needs to be minimized by arranging lectures in certain order.

There are few natural orders in which we can arrange all lectures or for sake of generality, tasks. First is to arrange them in order of finish time,  second is to arrange in order of start time, third is to order them by smallest duration of task, fourth is by minimum number of conflicting jobs. Which one to chose?
You can come up with counter example when if lectures are arranged in classrooms by order of their end time, or smallest duration or minimum number of conflicting jobs, it does not end to optimal solution  So, let’s pick lectures based on earliest start time. At any given pint of time, pick lecture with least start time and yet not scheduled and then assign it to first available class. Will it work? Sure it does.  When you have assigned all lectures, total number of classrooms will be minimum number of classrooms required.

Interval partitioning algorithm

1. Sort all lectures based on start time in ascending order.
2. Number of initial classrooms = 0
3. While lecture to be scheduled:
   3.1 Take first lecture yet not scheduled,
   3.2 If there a already a class available for lecture's start time
       Assign lecture to the class.
   3.3 If not, then allocate a new classroom
       number of classroom = number of classroom + 1
4. Return number of classrooms.

Before jumping into the code, let’s discuss some data structures which we can use to implement this algorithm.

Understand that we have to find a compatible classroom for a lecture. There are many classrooms, we need to check if the finish time of lecture in that classroom is less than start time of new lecture. If yes , then classroom is compatible, if there is no such class, allocate a new class. If we store our allocated classrooms in such a way that it always gives classroom with least finish time of last lecture scheduled there, we can safely say that if this classroom is not compatible, none of the others will be.(Why?) Every time we assign a lecture to a classroom, sort the list of classroom, so that first classroom is with least finish time.  Sort has complexity of O(n log n) and if we do it for all n intervals, overall complexity of algorithm will be O(n2 log n).

We are sorting just to find minimum end time across all classrooms. This can easily be achieved by min heap or priority queue keyed on finish time of last lecture of class. Every time finish time of last lecture changes for a classroom, heap is readjusted and root gives us classroom with min finish time.

  • To determine whether lecture j is compatible with some classroom, compare sj to key of min classroom k in priority queue.
  • When a lecture is added to a classroom,  increase key of classroom k to fj.

Well know we have algorithm and data structure to implement in, so let’s code it.

PrioritityQueue implementation is given below:

import heapq
# This is our priority queue implementation
class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0
 
    def push(self, item, priority):
        heapq.heappush(self._queue, (priority, self._index, item))
        self._index += 1
 
    def pop(self):
        if(self._index == 0):
                return None
        return heapq.heappop(self._queue)[-1];

Classroom class implementation

class Classroom:
	def __init__(self, number, finish_time):
		self.class_num = number
		self.finish_time = finish_time
	def __repr__(self):
		return 'Classroom({!r})'.format(self.class_num)

Interval partitioning problem : Implementation

from PriorityQueue import PriorityQueue
from Classroom import Classroom

jobs = [(1, 930, 1100),
        (2, 930, 1300),
        (3, 930, 1100),
        (5, 1100, 1400),
        (4, 1130, 1300),
        (6, 1330, 1500),
        (7, 1330, 1500),
        (8,1430,1700),
        (9, 1530, 1700),
        (10, 1530, 1700)
]

def find_num_classrooms():
	num_classrooms = 0;
	priority_queue = PriorityQueue()

	for job in jobs:
		# we have job here, now pop the classroom with least finishing time
		classroom = priority_queue.pop();
		if(classroom == None) :
			#allocate a new class
			num_classrooms+= 1;
			priority_queue.push(Classroom(num_classrooms,job[2]),job[2]);
		else:
			#check if finish time of current classroom is
			#less than start time of this lecture
			if(classroom.finish_time  <= job[1]):
				classroom.finish_time = job[2]
				priority_queue.push(classroom,job[2])
			else:
				num_classrooms+= 1;
				#Since last classroom needs to be compared again, push it back
				priority_queue.push(classroom,job[2])
				#Push the new classroom in list
				priority_queue.push(Classroom(num_classrooms,job[2]),job[2])

    return  num_classrooms
	
print "Number of classrooms required: " +  find_num_classrooms();

Java Implementation

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static int findIntervalPartitions(ArrayList<Interval> intervals){
        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals) {
            if (queue.isEmpty()) queue.add(currentInterval);
            else {
                if (queue.peek().getEndTime() > currentInterval.getStartTime()) {
                    queue.add(currentInterval);
                } else {
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }
        return queue.size();
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        int minimumClassRooms = findIntervalPartitions(intervals);
        System.out.println(minimumClassRooms);
    }
}

This algorithm takes overall time of O(n log n) dominated by the sorting of jobs on start time. Total number of priority queue operations is O(n) as we have only n lectures to schedule and for each lecture we have push and pop operation.

Reference :

There is another method using binary search algorithm which can be used to solve this problem. As per problem statement, we have to find minimum number of classrooms to schedule n lectures. What are the maximum number of classrooms required? It will be number of lectures when all lectures conflict with each other.
Minimum number of classrooms will be 0 when there is no lecture to be scheduled. Now, we know the range of values of classrooms. How can we find minimum?

Basic idea is that if we can schedule all n lectures in m rooms, then we can definitely schedule them in m+1 and more rooms. So minimum number of rooms required will be either m or less than it. In this case, we can safely discard all candidate solution from m to n (remember n is the maximum number of classrooms).
Again what if we can not schedule lectures in m rooms, then there is no way we can schedule them in less than m rooms. Hence we can discard all candidate solutions less than m.

How can we select m? We can select is as mid of range which is (0,n). And try to fit all lectures on those m rooms based on condition that none of lecture conflicts. Keep track of end time of last lecture of each classroom. If none of the classroom has end time less than start time of new lecture, allocate new class. If total number of classrooms is less than or equal to m, discard m+1 to n. If it is more than m, then discard 0 to m and search for m+1 to n.

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static boolean predicate(ArrayList<Interval> intervals, long candidateClassRooms){

        int i = 0;

        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals){
            if(queue.isEmpty()) queue.add(currentInterval);
            else{
                if(queue.peek().getEndTime() > currentInterval.getStartTime()){
                    queue.add(currentInterval);
                }
                else{
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }

        return queue.size() <= candidateClassRooms;
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        long low = 0;
        long high = intervals.size();

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        while(low < high){
            long mid  = low + ( (high - low) >> 1);

            if(predicate(intervals, mid)){
                high = mid;
            }else{
                low = mid+1;
            }
        }
        System.out.println(low);
    }
}

Complexity of algorithm is dependent on number of lectures to be scheduled which is O(n log n ) with additional space complexity of O(c) where c is number of classrooms required.

Please share your views and suggestions in comments and feel free to share and spread the word. If you are interested to share your knowledge to learners across the world, please write to us on communications@algorithmsandme.com

Merge overlapping intervals

Merge overlapping intervals

Given N intervals S = {E1,E2,…..En} with each Ei has start time si and end time ei. Some of these intervals can be overlapping, Just to clarify, Ei and Ej overlap when start time of Ej i.e sj is less than end time of Ei i.e ei. For example, [(1,3),(2,4),(5,8), (6,9)] should transform into [(1, 4),(5,9)] has interval (1,3) and (2,4) overlap and interval (5,8) and (6,9) also overlap.

merge overlapping intervals

Merge overlapping intervals  : Thought process

As we always do, first try to come up with brute force solution, given enough time and space and money, how would you solve this?
Natural course is to take ith interval, and compare start time of all jth intervals with end time of ith, if start time of jth interval is less than end time of ith event, then you can merge two intervals. What should be end time for merged interval then?  It should be maximum of end times of two merged intervals.

What will be time complexity of this approach? We are not using any additional space, however, worst case time complexity is O(n2). Can we do better?

What are two times we are comparing in brute force solution? It’s start time of one interval with end time of another. If we arrange input in specific order, can we reduce processing some entries?

If we sort all intervals based on their start time, si < si+1< si+2. Also, interval is always forward looking, ei > si, ei+1 > si+1 and so on.

If si is greater ei-1, then si+1 will be greater than ei-1, so no need to compare si+1 with ei-1, that is no need to go beyond immediate previous interval for any interval Ei. If si is less than ei-1, update ei-1 with maximum of ei-1 and ei and move to Ei+1.
Notice that we need last interval Ei-1 to decide if to merge new interval into previous one or keep it as standalone. Stack is best data structure to use. Algorithm will look like:

  1. Consider interval Ei.
  2. If stack is empty, push Ei to stack.
  3. If stack is not empty, then pop interval at top of stack call it Ei-1.
  4. Compare si, start time of Ei with ei-1, end time of Ei-1.
  5. If si less than ei-1, update ei-1 as max(ei-1, ei), as in maximum of end times of two intervals and push back Ei-1on to stack.
  6. Else push Ei on to stack.
  7. Continue till all events are considered.
  8. At the end of processing, stack will contain all merged interval.

Let’s take an example and see how this algorithm works. We have following intervals and we have to merge overlapping intervals.

First of all, sort all interval based on their start time.

Create a stack, start with first interval, since stack is empty, we will push first event on to stack.

After pushing first event, problem state looks like this

Take second event, start time (2) of second event is less than end time of previous event on stack (3), hence, find maximum of end times of these two intervals and update last interval with that end time and push back on to stack.

 

Look at third interval, start time of it is greater than end time of interval on top of stack, just push interval on to stack.

Last interval, this time, start time of new interval is less than end time of interval on top of stack.

Find maximum of end times of two intervals and update previous interval with that end time and push it back on to stack.

merge overlapping intervals

At this point, when there is no more interval remaining, stack contains all merged overlapping intervals.

Merge overlapping intervals : Implementation

package com.company;


import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Stack;

/**
 * Created by sangar on 8.4.18.
 */
public class OverlappingIntervals {
    public  static ArrayList<Interval>
        mergeOverlappingIntervals(ArrayList<Interval> intervals){

        ArrayList<Interval> mergedIntervals = new ArrayList<>();
        Stack<Interval> s = new Stack();

        //Sort the arraylist of interval based on start time.
        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));
        for(Interval currentInterval : intervals){
            if(s.empty())s.push(currentInterval);
            else {
                Interval previousInterval = s.pop();
                if(previousInterval.getEndTime() > currentInterval.getStartTime()){
                    /*
                    If current interval's start time is less than end time of
                    previous interval, find max of end times of two intervals
                    and push new interval on to stack.
                     */
                    int endTime = Integer.max(previousInterval.getEndTime(), currentInterval.getEndTime());
                    /* Notice that we have created new interval and did not update the old one
                       This concept is called as immutability of class
                     */
                    s.push(new Interval(previousInterval.getStartTime(), endTime));
                }
                else{
                    s.push(previousInterval);
                    s.push(currentInterval);
                }
            }
        }
        while(!s.empty()){
            mergedIntervals.add(s.pop());
        }

        return mergedIntervals;
    }

    public static void main(String[] args) {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(1,3));
        intervals.add(new Interval(2,4));
        intervals.add(new Interval(5,8));
        intervals.add(new Interval(6,9));
        ArrayList<Interval> mergedIntervals = mergeOverlappingIntervals(intervals);
        for (Interval interval : mergedIntervals){
            System.out.print("(" + interval.getStartTime() +"," + interval.getEndTime() + ")");
        }
    }
}

Complexity of algorithm to merge overlapping intervals will be O(N log N) due to sorting with O(N) extra space.

Please share if there is something missing or wrong. Also, please reach out to us at communications@algorithmsandme.com if you want to contribute to website and help others to learn by sharing your knowledge.

Interval Scheduling Algorithm

Interval Scheduling Algorithm

Any interval has two time stamps, it’s start time and end time. To schedule number of intervals on to particular resource, take care that no two intervals are no overlapping, that is to say second interval cannot be scheduled while first is running. Given a set of intervals S with their start time si and end time ei, find out largest set R such that all events in R are mutually compatible. Two intervals are called compatible if they do not overlap (2nd job starts after or at the same time as the 1st one finishes). This problem is called as interval scheduling problem and algorithm which helps solve this class of problems is called as interval scheduling algorithm.
Example: 8 intervals{A,B,C,D,E,F,G,H}, optimal set would be {B,E,H}

interval scheduling

Interval Scheduling : Line of thought

Notice from problem statement that ask is to maximize output with given constraints. What template this kind of problems fit in? It’s greedy algorithm. We need to select each job which maximizes output, i.e gives us maximum number of compatible intervals. What should be the order of evaluation of intervals? There are some natural orders we can think of :
1. Order intervals by earliest start time first.
2. Order intervals by earliest end time first.
3. Order intervals by minimum number of overlapping jobs.
4. Order intervals by shortest job first.

Let’s take some examples and see how things work out with each criteria.
1. Earliest start time first

interval scheduling algorithm

In above arrangement, if we schedule interval with earliest start time first, only one interval will scheduled, however, optimally, 3 intervals, { B, C, D } should have been scheduled.

2. Minimum number of conflicting jobs

interval scheduling problem

If we select job with least conflicts first, we will select F ( 2 conflicts) then C ( with 3 conflicts ) and then E ( again with 3 conflicts ). However, ideal set should be { B, C, D, E }

3. Shortest job first.

In this case, if we select shortest job first, set will contain only interval A, where as optimal set is {B, C}.
These are counter examples for three of the four natural ordering, these three criteria cannot give us optimum output, which is maximum number of compatible intervals. If we take interval with earliest end time, it will give us optimal set. Can you check if above three examples give you correct answer if you take interval based on earliest end time first? If we take first example, when order by end time, intervals will look like this. From this we can easily find out that compatible intervals are B, E and H.

Interval scheduling algorithm

Sort all jobs which based on end time in increasing order.

  1. Take the interval which has earliest finish time.
  2. Repeat net two steps till all you process all jobs
  3. Eliminate all intervals which have start time less than selected interval’s end time.
  4. If interval has start time greater than current interval’s end time, at it to set. Set current interval to new interval
package com.company;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.PriorityQueue;

/**
 * Created by sangar on 25.4.18.
 */
public class IntervalScheduling {
    public static ArrayList<Interval> intervalScheduling(ArrayList<Interval> intervals){
        Collections.sort(intervals, Comparator.comparing(p -> p.getEndTime()));

        ArrayList<Interval> resultList = new ArrayList<>();

        for(Interval currentInterval : intervals) {
            if(resultList.isEmpty()) resultList.add(currentInterval);
            else{
                if(currentInterval.getStartTime() > resultList.get(resultList.size()-1).getEndTime()){
                    resultList.add(currentInterval);
                }
            }
        }
        return resultList;
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        ArrayList<Interval> compatibleIntervals = intervalScheduling(intervals);

        for(Interval interval : compatibleIntervals) {
            System.out.println("(" + interval.getStartTime() + "," + interval.getEndTime() + ")");
        }
    }
}

Complexity of algorithm is dominated by the sorting which is O(N log N) which is actually the complexity of sort algorithm.

Reference 
http://courses.cs.vt.edu/cs5114/spring2009/lectures/lecture04-greedy-scheduling.pdf

Please share if you find something missing or wrong. If you want to contribute to algorithm and me and share your knowledge with thousands of students across world, please reach out to us at communications@algorithmsandme.com