Given a set of strings which just has lower case letters and a target string, output all the strings for each the edit distance with the target no greater thank.

You have the following 3 operations permitted on a word:

  • Insert a character
  • Delete a character
  • Replace a character

Have you met this question in a real interview?

Yes

Example

Given words =["abc", "abd", "abcd", "adc"]and target ="ac", k =1
Return["abc", "adc"]

在edit distance中,我们运用动态规划记录 f[i][j] 表示前i个字符在第一个词和前j个字符在第二个词中的编辑距离

转移方程是 f[i][j] = min (f[i - 1][j - 1], f[i - 1][j] + 1, f[i][j - 1] + 1) | word[i] == word[j]

                            min \(f\[i - 1\]\[j - 1\] + 1, f\[i - 1\]\[j\] + 1, f\[i\]\[j - 1\] + 1\) \| word\[i\] != word\[j\]

这题的不同之处在于需要查询字典中所有小于k edit distance的词。

  1. naive算法是算每一个字典中词的edit distance, 那么算法复杂度是 O(n* k^2) k is the avg length of the words in dict
  2. 如果字典中的词看起来是这样[abc, abcd, acd, ace, aed, bcd], 如果我们用第一个方法会发现ab, ac等前缀被重复计算了,如果我们把字典中的词建树可以减少这种重复计算,但是最坏情况的复杂度啊仍然是O(n * k ^ 2)在完全没有公用前缀的情况下。但如果有就会减少复杂度。

     a.这里的字典树只用到了插入的性质,后面需要遍历。
    
     b. 还有一个优化是如果词长一样的情况下当前的编辑长度已经多余k,则不必再向下递归
    
class TrieNode {
    HashMap<Character, TrieNode> children = null;
    boolean hasWord;
    String word;
    public TrieNode () {
        children = new HashMap<Character, TrieNode>();
        hasWord = false;
        word = "";
    }
}

class Trie {
    TrieNode root;
    public Trie() {
        root = new TrieNode();
    }

    public void insert(String word) {
        char[] wordA = word.toCharArray();
        TrieNode cur = root;
        for (int i = 0; i < wordA.length; i++) {
            if (!cur.children.containsKey(wordA[i])) {
                cur.children.put(wordA[i], new TrieNode());
            }
            cur = cur.children.get(wordA[i]);
        }
        cur.hasWord = true;
        cur.word = word;
    }
}
public class Solution {
    /*
     * @param words: a set of stirngs
     * @param target: a target string
     * @param k: An integer
     * @return: output all the strings that meet the requirements
     */
    public List<String> kDistance(String[] words, String target, int k) {
        // write your code here
        List<String> ans = new ArrayList<>();
        if (words == null || words.length == 0) {
            return ans;
        }
        Trie trie = new Trie();
        for (String word : words) {
            trie.insert(word);
        }

        int[] dp = new int[target.length() + 1];
        for (int i = 0; i <= target.length(); i++) {
            dp[i] = i;
        }

        find(trie.root, target, dp, ans, k);
        return ans;
    }

    private void find(TrieNode root, String target, int[] dp, List<String> ans, int k) {
        //System.out.println(root.word);

        if (root.hasWord && dp[target.length()] <= k) {
            ans.add(root.word);
        } else if (root.hasWord && root.word.length() >= target.length() && dp[target.length()] > k) {
            return;
        }


        int[] next = new int[target.length() + 1];
        for (int i = 0; i <= target.length(); i++) {
            next[i] = 0;
        }
        next[0] = dp[0] + 1;
        for (Character c : root.children.keySet()) {
            //System.out.println(c);

            for (int j = 1; j <= target.length(); j++) {
                if (target.charAt(j - 1) == c) {
                    int tmp = Math.min(dp[j - 1], dp[j] + 1);
                    next[j] = Math.min(tmp, next[j - 1] + 1);
                } else {
                    int tmp = Math.min(dp[j - 1], dp[j]);
                    next[j] = Math.min(tmp, next[j - 1]) + 1;
                }
            }


            find(root.children.get(c), target, next, ans, k);
        }
    }
}

results matching ""

    No results matching ""