k-Sorted Matrix

Sorting a nearly ordered matrix

Feb 08, 2025

Problem Statement

You have an m×n integer matrix M. Each row is k-sorted, meaning every element in that row is at most k positions away from where it would be if you sorted the row in non-decreasing order. Your goal is to rearrange all elements so that the matrix is sorted in row-major order—the smallest elements fill the first row left to right, then the second row, and so on.

What Does “`k`-Sorted (per row)” Mean?

If a row is [4, 1, 2] and k = 2:

The sorted version of the row is [1, 2, 4].
Each element moves at most 2 positions from its current index to its sorted index (e.g., 4 goes from index 0 to 2).

Thus, the row is 2-sorted.

Examples

Input: [[2,1,3,5,4],[7,6,9,8,10]], k=1
Output: [[1,2,3,4,5],[6,7,8,9,10]]

Input: [[4,1,2],[9,7,5],[6,8,10]], k=2
Output: [[1,2,4],[5,6,7],[8,9,10]]

Theophilus Pedapolu

Feb 9, 2025

The obvious solution to writing a matrix in row-major order for this problem is to sort each row individually and then use a heap of size m containing the current smallest elements in each row to refill the matrix in row-major order. This is a O(mn * log(mn)) algorithm. However, we can make use of the fact that each row is k-sorted to make this algorithm slightly faster in the sorting phase.

Consider an individual row R. By the property of k-sorted rows, it follows that the ith smallest element in R must lie in the range of indices [max(0,i-k),min(len(R),i+k] < [0,min(len(R),i+k)]. Thus, the smallest element lies in [0,k], the 2nd smallest in [0,k+1], etc. This motivates us to use a heap of size k to sort the array. Namely, for each row R, we do the following:

1. Store the first k+1 elements of R in a heap. Keep a counter idx, for the next index not in the heap

2. For i = 0 to len(R)-1: pop the smallest element from the heap and place it at index I. If idx < len(R), add R[idx] to the heap and increment idx

We can see that this algorithm makes use of the k-sorted row property to ensure that when we are at index i, the heap contains all the elements at most k positions from i. This sorts each row in O(n* log k) time as opposed to O(n * log n) time. From here, we just proceed as normal, using a heap of size m to track the smallest elements in each row and refill the matrix:

1. Store a tuple of the first element of each row and its row index in a heap. Create a list L of size m that stores the next index that is not in the heap for each row (entries are initialized to 1)

2. Going left to right in row-major order: pop the smallest element from the heap and place it at the current position. Also use the popped element's row index to add the next element in its row to the heap (if possible) and increment the next index of the row in L.

Overall, this algorithm takes O(mn * log(km)) time and O(k+m) space.

Note: I would be interested in seeing what input sizes are needed for this algorithm to be faster than the naive algorithm (stated at the beginning) in practice. We are essentially doing a modified heapsort for each row, meaning it might still suffer from heapsort's poor cache locality. For certain inputs, quicksort on each row might still be faster. My guess is that k has to be small enough for the cache line to contain sufficient memory and yield good performance.

Puzzle Spree

Discussion about this post

Ready for more?