Massive Algorithms: LeetCode 955 - Delete Columns to Make Sorted II

LeetCode 955 - Delete Columns to Make Sorted II

We are given an array A of N lowercase letter strings, all of the same length.

Now, we may choose any set of deletion indices, and for each string, we delete all the characters in those indices.

For example, if we have an array A = ["abcdef","uvwxyz"] and deletion indices {0, 2, 3}, then the final array after deletions is ["bef","vyz"].

Suppose we chose a set of deletion indices D such that after deletions, the final array has its elements in lexicographic order (A[0] <= A[1] <= A[2] ... <= A[A.length - 1]).

Return the minimum possible value of D.length.

Example 1:

Input: ["ca","bb","ac"]
Output: 1
Explanation: 
After deleting the first column, A = ["a", "b", "c"].
Now A is in lexicographic order (ie. A[0] <= A[1] <= A[2]).
We require at least 1 deletion since initially A was not in lexicographic order, so the answer is 1.

Example 2:

Input: ["xc","yb","za"]
Output: 0
Explanation: 
A is already in lexicographic order, so we don't need to delete anything.
Note that the rows of A are not necessarily in lexicographic order:
ie. it is NOT necessarily true that (A[0][0] <= A[0][1] <= ...)

Example 3:

Input: ["zyx","wvu","tsr"]
Output: 3
Explanation: 
We have to delete every column.

Note:

1 <= A.length <= 100
1 <= A[i].length <= 100

https://zhanghuimeng.github.io/post/leetcode-955-delete-columns-to-make-sorted-ii/

这道题还挺有趣的。题解里给了一种很明显是stay ahead思路的正确性证明[1]：

如果当前列会导致之前留下的列加上这一列不满足字典序，则这一列必须删除
否则可以说明，留下当前列必然不比删除它更差
由上述说明可以得知，已经留下的列必然是满足字典序的，其中有一些满足严格字典序（大于），有一些则可能不是严格字典序（相等）
对于已经满足严格字典序的字符串，后面加什么都没问题
对于相等的字符串，后面加的第一列必须满足字典序
因此，加上当前这一列之后，相等的字符串可能会变成严格字典序，或者还是相等
也就是说，相等的字符串不会增加，添加新列的难度也不会增加
所以应该尽可能留下当前列

https://www.acwing.com/solution/leetcode/content/684/
相同长度序列字典序的比较方式：两个序列从头开始比较，如果中途出现了不一样的字符，则以该字符为大小为字典序的大小。
此时定义一个数组 end，表示当前序列是否还应该与后继序列作比较。初始时都是应该的。
然后从第一列开始枚举，如果按照 end 数组，当前列不符合单调不降，则删除该列且答案加一；否则，直接更新 end 数组，如果某个序列不再和后继序列作比较（即当前列的字符不同），则 end[i] 更新为 false 即可。
时间复杂度
共枚举 mm 列，每次都需要 O(n)O(n) 的时间判定，Θ(n)Θ(n) 的时间更新 end 数组，故总时间复杂度为 Θ(nm)Θ(nm)。

https://blog.csdn.net/gqxxxxxl/article/details/84928263

贪心算法。不考虑删除那些列，思考保留哪些列。第一列如果不符合字典顺序，则必须删除。其中若有部分字母相同，则比较他们下一位,若满足条件，则按列拼在一起，若不满足，则跳至下一列。

X. https://leetcode.com/problems/delete-columns-to-make-sorted-ii/discuss/203182/JavaC%2B%2BPython-Greedy-Solution-O(MN)

Solve it with a greed algorithme.

Initial N empty string.
For each column,
add the character to each row,
and check if all rows are still sorted.

If not, we have to delete this column.

Explanation

Initial a boolean array sorted,
sorted[i] = true if and only if A[i] < A[i + 1],
that is to say A[i] and A[i + 1] are sorted.

For each col, we check all unsorted rows.
If A[i][j] > A[i + 1][j], we need to deleted this col, res++.

Otherwise, we can keep this col, and update all sorted rows.

    public int minDeletionSize(String[] A) {
        int res = 0, n = A.length, m = A[0].length(), i, j;
        boolean[] sorted = new boolean[n - 1];
        for (j = 0; j < m; ++j) {
            for (i = 0; i < n - 1; ++i) {
                if (!sorted[i] && A[i].charAt(j) > A[i + 1].charAt(j)) {
                    res++;
                    break;
                }
            }
            if (i < n - 1) continue;
            for (i = 0; i < n - 1; ++i)
                if (A[i].charAt(j) < A[i + 1].charAt(j))
                    sorted[i] = true;
        }
        return res;
    }

https://leetcode.com/problems/delete-columns-to-make-sorted-ii/discuss/203341/Java-Solution-6ms-beats-100

// O(m * n), m = string length, n = array size

// (1) only need to compare ajacent strings
// (2) have a rank[] to record the rank of previous characters in a string. if there is rank already, there is no need to compare again.
rank[i] = true means rank of A[i] is higher than the rank of A[i-1]
// (3) if the column is not deleted, we will update the rank[].
// (4). if not in order is found, need to delete the column
/* Example
Input: ["aba","gbf","dcd"]

col: 012 when col = 1
0 aba rank[0] = false
1 gbf rank[1] = false
2 dcd rank[2] = true

when col = 0, we find we need to delete this column
when col = 1, all characters in this column are in order, so this column don't need to be deleted. But we need to update the rank[].

    boolean[] rank;
    public int minDeletionSize(String[] A) {
        int n = A[0].length();
        rank = new boolean[A.length];
        int res = 0;
        // from first character to last character
        for (int col = 0; col < n; col++) {
            boolean delCol = false;
            // compare ajacent string only
            for (int i = 1; i < A.length; i++) {
                if (rank[i]) continue; // meaning A[i] > A[i - 1] in rank
                if (A[i - 1].charAt(col) > A[i].charAt(col)) { // not in order, need to delete the column
                    res++;
                    // System.out.println("col: " + col + " A[i]: " + A[i]);
                    delCol = true;
                    break;
                }
            }
            if (!delCol) { // if the column is not deleted, update rank[]
                for (int i = 1; i < A.length; i++) {
                    if (!rank[i] && A[i].charAt(col) > A[i - 1].charAt(col)) {
                        rank[i] = true;
                    }
                }
            }

        }
        
        return res;
    }

https://leetcode.com/articles/delete-columns-to-make-sorted-ii/

It is also possible to implement the solution in Approach 1 without using as much time and space.

The key idea is that we will record the "cuts" that each column makes. In our first example from Approach 1 with A = ["axx","ayy","baa","bbb","bcc"] (and R defined as in Approach 1), the first column cuts our condition from R[0] <= R[1] <= R[2] <= R[3] <= R[4] to R[0] <= R[1] and R[2] <= R[3] <= R[4]. That is, the boundary "a" == column[1] != column[2] == "b" has 'cut' one of the conditions for R out.

At a high level, our algorithm depends on evaluating whether adding a new column will keep all the rows sorted. By maintaining information about these cuts, we only need to compare characters in the newest column.

Time Complexity:

O(NW)

, where

N

is the length of A, and

W

is the length of A[i].

public int minDeletionSize(String[] A) {

int N = A.length;

int W = A[0].length();

// cuts[j] is true : we don't need to check any new A[i][j] <= A[i][j+1]

boolean[] cuts = new boolean[N - 1];

int ans = 0;

search: for (int j = 0; j < W; ++j) {

// Evaluate whether we can keep this column

for (int i = 0; i < N - 1; ++i)

if (!cuts[i] && A[i].charAt(j) > A[i + 1].charAt(j)) {

// Can't keep the column - delete and continue

ans++;

continue search;

}

// Update 'cuts' information

for (int i = 0; i < N - 1; ++i)

if (A[i].charAt(j) < A[i + 1].charAt(j))

cuts[i] = true;

}

return ans;

}

Instead of thinking about column deletions, let's think about which columns we will keep in the final answer.

If the first column isn't lexicographically sorted, we have to delete it.

Otherwise, we will argue that we can keep this first column without consequence. There are two cases:

If we don't keep the first column, then the final rows of the answer all have to be sorted.
If we do keep the first column, then the final rows of the answer (minus the first column) only have to be sorted if they share the same first letter (coming from the first column).

The above statement is hard to digest, so let's use an example:

Say we have A = ["axx","ayy","baa","bbb","bcc"]. When we keep the first column, the final rows are R = ["xx","yy","aa","bb","cc"], and instead of the requirement that these all have to be sorted (ie. R[0] <= R[1] <= R[2] <= R[3] <= R[4]), we have a weaker requirement that they only have to be sorted if they share the same first letter of the first column, (ie. R[0] <= R[1] and R[2] <= R[3] <= R[4]).

Now, we applied this argument only for the first column, but it actually works for every column we could consider taking. If we can't take a column, we have to delete it. Otherwise, we take it because it can only make adding subsequent columns easier.

Algorithm

All our effort has led us to a simple algorithmic idea.

Start with no columns kept. For each column, if we could keep it and have a valid answer, keep it - otherwise delete it.

Time Complexity: $O(NW^2)$ , where $N$ is the length of A, and $W$ is the length of A[i].
Space Complexity: $O(NW)$ .

public int minDeletionSize(String[] A) {

int N = A.length;

int W = A[0].length();

int ans = 0;

// cur : all rows we have written

// For example, with A = ["abc","def","ghi"] we might have

// cur = ["ab", "de", "gh"].

String[] cur = new String[N];

for (int j = 0; j < W; ++j) {

// cur2 : What we potentially can write, including the

// newest column col = [A[i][j] for i]

// Eg. if cur = ["ab","de","gh"] and col = ("c","f","i"),

// then cur2 = ["abc","def","ghi"].

String[] cur2 = Arrays.copyOf(cur, N);

for (int i = 0; i < N; ++i)

cur2[i] += A[i].charAt(j);

if (isSorted(cur2))

cur = cur2;

else

ans++;

}

return ans;

}

public boolean isSorted(String[] A) {

for (int i = 0; i < A.length - 1; ++i)

if (A[i].compareTo(A[i + 1]) > 0)

return false;

return true;

}

LeetCode 955 - Delete Columns to Make Sorted II

Explanation

Labels

Popular Posts