## Wednesday, May 25, 2016

### Data stream as disjoint intervals

https://discuss.leetcode.com/topic/96/data-stream-as-disjoint-intervals
Given a data stream input of positive integers a1, a2, ..., an, ..., summary the numbers seen so far as a list of disjoint intervals.
For example, suppose the input are 1, 3, 7, 2, 6, ..., then the summary will be:
[1, 1]
[1, 1], [3, 3]
[1, 1], [3, 3], [7, 7]
[1, 3], [7, 7]
[1, 3], [6, 7]

This problem is similar to Summary Ranges, except that it is expressed in a stream format. Also, the integers in the stream is not sorted.

Since the intervals are disjoint (non-overlapping), this problem can be solved efficiently with using Binary Search Tree. For overlapping intervals, you have to use Interval Tree.

Well, BST could not do the merging of intervals efficiently. I would say just use Binary Search to find the intervals to merge, which is possible in O(log n). By inserting a point to a list of sorted disjoint intervals, it would cause at most two intervals to be merged. So you only need to look at the interval before and after it.

Merging adjacent intervals or inserting an interval takes linear time, for an array / list. For doubly connected list, insertion or merge takes constant time, however finding the interval takes linear time. So BST is the choice that makes all operations O(log n).

I was thinking the same. But the problem is that to use BS effectively you need array, also there is dynamic in the problem (not by accident this is a stream), we insert numbers and we have to rearrange them in the proper place. With BST, I think that we have to search where to insert the number and check its left and right neighbour always. In the worst case we have to merge both intervals (left and right), this need to delete one node of the tree which is again log(n) operation
As you mentioned, the problem is easier because intervals dont overlapp

using a BST enables all operations to complete in O(log n). Then you will need to return the root of the tree, not `List<Interval>`. I don't think it's possible to do `insert` in O(log n) if your function signature has to return `List<Interval>`.

I agree with @elmirap that there may be two functions: "insert" and "get interval".
Complexity of "insert" is O(log n) while "get interval" is O(n).