Yesterday, I came across an interesting question on StackOverflow. The question is as follows: assuming you have a sorted array, is it possible to increment locations by 1 (one at a time) and still ensure that the array is sorted in O(1)
? Assuming you are not allowed to use any other secondary data-structures, the answer is no. This becomes evident when you have repeated values. Assume a degenerate case where you have an array of identical values; let's say [0, 0, 0, 0, 0]
. If you increment the value in location 0
, you now have [1, 0, 0, 0, 0]
. For this array to be sorted, the value 1
must be moved to the end of the array (index 4
). This will require four comparisons, which means that this will inevitably turn into an O(n)
algorithm.
How can we do better?
The problematic case is with repeated values, since this is where we have the worst performance (O(n)
). So we need to figure out a way to make performance better in this case (all other cases can be handled in constant time). What is our problem exactly when we encounter a run of repeated values? The problem is that we need to go through all of the repeated values to figure out the last one, since we will be swapping our incremented value with that last repeated-value. What if we could easily find out where a run of repeated values ends? If that was the case, we could simply look up that information and swap our incremented value with the value in that location! It turns out that we can easily do this with a map (hash map or hash table). Lookups from a hash map are O(1)
and in our case we are unlikely to have collisions either, so now our O(n)
operation now runs in constant time! What would this map contain? It would only contain repeated values that are mapped to their ending index. So in our first example, where we had an array of zeroes, the map would be (0 => 4)
, because the run of zeroes ends at location 4
. After incrementing the value at location 0
, the map would be (0 => 3)
because our run of zeroes ends at location 3
now. If we incremented the first location again, the map would be (0 => 2, 1 => 4)
because we have a run of zeroes ending at location 2
and a new run of ones that end at location 4
. As you can see, it's a pretty simple structure.
By keeping track of this information, we can now accomplish our goal of maintaining a sorted array in O(1)
time! The algorithm has two parts: the first part does the incrementing and swapping (if necessary), while the second part performs book-keeping to update our map. First let's concentrate on the increment-and-swap part. We have to increment a location, but do we always have to swap? Let's look at the different cases:
- Incremented value is the last element in the array - No swap necessary.
- Incremented value is lesser than the next value - No swap necessary.
- Incremented value is the same as the next value - No swap necessary.
- Incremented value is greater than the next value - Swap necessary.
So it turns out that we actually only need to swap in one case: when the incremented value is greater than the next one. But we cannot simply swap the values of course, we have to check to see if we have a run of values as well, which we can easily do by checking our map. So the cases we need to consider are:
- If there is no entry for the next value in the map, simply swap the two values.
- If there is an entry for the next value in the map, swap the incremented value with the value at the ending index.
At this point our array is properly sorted. But there is still one more thing we need to do; we need to make sure that we perform book-keeping on our map to ensure that it is in a state that accurately reflects all the runs we have in our array (if any). There are a few things we need to do:
- Decrement the ending-index for a run of values, if we swapped the incremented value with this value.
- Remove the entry for a value from the map, if there is no longer a run of values for this value.
- Add an entry into the map if we created a new run by incrementing a value.
That's basically it! So let's see what this actually looks like in code form. I used JavaScript because it is pretty easy to prototype algorithms in it. The code is pretty well-commented and so you can see where I have handled the different cases:
function incrementAndMaintainSortedArray(array, index) {
var oldValue = array[index];
var newValue = ++array[index];
if(index === (array.length - 1)) {
//Incremented element is the last element.
//We don't need to swap, but we need to see if we modified a run (if one exists)
if(endingIndices[oldValue]) {
endingIndices[oldValue]--;
}
} else if(index >= 0) {
//Incremented element is not the last element; it is in the middle of
//the array, possibly even the first element
var nextIndexValue = array[index + 1];
if(newValue === nextIndexValue) {
//If the new value is the same as the next value, we don't need to swap anything. But
//we are doing some book-keeping later with the endingIndices map. That code requires
//the ending index (i.e., where we moved the incremented value to). Since we didn't
//move it anywhere, the endingIndex is simply the index of the incremented element.
endingIndex = index;
} else if(newValue > nextIndexValue) {
//If the new value is greater than the next value, we will have to swap it
var swapIndex = -1;
if(!endingIndices[nextIndexValue]) {
//If the next value doesn't have a run, then location we have to swap with
//is just the next index
swapIndex = index + 1;
} else {
//If the next value has a run, we get the swap index from the map
swapIndex = endingIndices[nextIndexValue];
}
array[index] = nextIndexValue;
array[swapIndex] = newValue;
endingIndex = swapIndex;
} else {
//If the next value is already greater, there is nothing we need to swap but we do
//need to do some book-keeping with the endingIndices map later, because it is
//possible that we modified a run (the value might be the same as the value that
//came before it). Since we don't have anything to swap, the endingIndex is
//effectively the index that we are incrementing.
endingIndex = index;
}
//Moving the new value to its new position may have created a new run, so we need to
//check for that. This will only happen if the new position is not at the end of
//the array, and the new value does not have an entry in the map, and the value
//at the position after the new position is the same as the new value
if(endingIndex < (array.length - 1) &&
!endingIndices[newValue] &&
array[endingIndex + 1] === newValue) {
endingIndices[newValue] = endingIndex + 1;
}
//We also need to check to see if the old value had an entry in the
//map because now that run has been shortened by one.
if(endingIndices[oldValue]) {
var newEndingIndex = --endingIndices[oldValue];
if(newEndingIndex === 0 ||
(newEndingIndex > 0 && array[newEndingIndex - 1] !== oldValue)) {
//In this case we check to see if the old value only has one entry, in
//which case there is no run of values and so we will need to remove
//its entry from the map. This happens when the new ending-index for this
//value is the first location (0) or if the location before the new
//ending-index doesn't contain the old value.
delete endingIndices[oldValue];
}
}
}
}
You can check out a runnable version of this code at jsfiddle.net. The code includes a test-harness that increments random locations in an array and ensures that the array is sorted after each iteration.