[Solved] Interval range insert with split into unique ranges

Question

in the actual code there is a differentiation between old interval ranges and new ranges after applying certain operation those ranges would be merged together. I wanted to split the question into smaller chunk so the merging part is left out.

Solely: It is easier to merge in the new interval directly, instead of artificially splitting it up. So this is what I propose below (C++):

using DataType = /* whatever appropriate; double, int, unsigned int, ...*/;
using Interval = std::pair<DataType, DataType>;
std::vector<Interval> intervals;

void insert(Interval x)
{
    if(intervals.empty() || x.second < intervals.front().first)
    {
        intervals.insert(intervals.begin(), x); // (1)
    }
    else if(x.first > intervals.back().second)
    {
        intervals.push_back(x); // (2)
    }
    else
    {
        auto b = intervals.begin();
        while(b->second < x.first)
        {
            std::advance(b, 1);
        }
        // cannot be END iterator, otherwise would have been caught by (2)
        if(x.second < b->first)
        {
            // this is a new interval in between std::prev(b) and (b)
            intervals.insert(b, x);
        }
        else
        {
            // b is overlapping with x!
            if(b->first > x.first)
            {
                b->first = x.first;
            }

            auto e = std::next(b);
            while(e != intervals.end() && e->first <= x.second)
            {
                std::advance(e, 1);
            }
            // e is the first interval NOT overlapping with x!
            auto l = std::prev(e);
            b->second = std::max(x.second, l->second);
            // drop all subsequent intervals now contained in b (possibly none)
            intervals.erase(std::next(b), e);
        }
    }
}

Algorithm only, spared the design efforts of packing into class, having convenience function accepting begin/end markers (instead of interval), …

If the data type you intend to use does not provide a back accessor (C++: e. g. std::forward_list): No problem, just drop the second if (containing (2)); then, however, b can be the end iterator, so you’d have to test for and if the test succeeds, you can insert at end. You’d most likely not have an ‘insert before’ then, so you’d need to track b’s and later e’s predecessors separately, too…

Accepted Answer

in the actual code there is a differentiation between old interval ranges and new ranges after applying certain operation those ranges would be merged together. I wanted to split the question into smaller chunk so the merging part is left out.

Solely: It is easier to merge in the new interval directly, instead of artificially splitting it up. So this is what I propose below (C++):

using DataType = /* whatever appropriate; double, int, unsigned int, ...*/;
using Interval = std::pair<DataType, DataType>;
std::vector<Interval> intervals;

void insert(Interval x)
{
    if(intervals.empty() || x.second < intervals.front().first)
    {
        intervals.insert(intervals.begin(), x); // (1)
    }
    else if(x.first > intervals.back().second)
    {
        intervals.push_back(x); // (2)
    }
    else
    {
        auto b = intervals.begin();
        while(b->second < x.first)
        {
            std::advance(b, 1);
        }
        // cannot be END iterator, otherwise would have been caught by (2)
        if(x.second < b->first)
        {
            // this is a new interval in between std::prev(b) and (b)
            intervals.insert(b, x);
        }
        else
        {
            // b is overlapping with x!
            if(b->first > x.first)
            {
                b->first = x.first;
            }

            auto e = std::next(b);
            while(e != intervals.end() && e->first <= x.second)
            {
                std::advance(e, 1);
            }
            // e is the first interval NOT overlapping with x!
            auto l = std::prev(e);
            b->second = std::max(x.second, l->second);
            // drop all subsequent intervals now contained in b (possibly none)
            intervals.erase(std::next(b), e);
        }
    }
}

Algorithm only, spared the design efforts of packing into class, having convenience function accepting begin/end markers (instead of interval), …

If the data type you intend to use does not provide a back accessor (C++: e. g. std::forward_list): No problem, just drop the second if (containing (2)); then, however, b can be the end iterator, so you’d have to test for and if the test succeeds, you can insert at end. You’d most likely not have an ‘insert before’ then, so you’d need to track b’s and later e’s predecessors separately, too…