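Java String Algorithms | AI-Generated and Translated
Strings are everywhere, and processing them efficiently is a key skill. This post looks at the KMP algorithm for pattern matching, the trie data structure, and the Rabin-Karp algorithm.
1. The KMP Algorithm: Efficient String Matching
The Knuth-Morris-Pratt (KMP) algorithm preprocesses the pattern so that it never re-compares text characters it has already matched, which gives O(n + m) running time, where n is the text length and m is the pattern length.
Java Implementation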
public class KMP {
    // Prints the start index of every occurrence of pat in txt.
    static void KMPSearch(String pat, String txt) {
        int M = pat.length(), N = txt.length();
        if (M == 0) return; // an empty pattern has no LPS table; nothing to search for
        int[] lps = new int[M];
        computeLPSArray(pat, M, lps);
        int i = 0, j = 0; // i indexes txt, j indexes pat
        while (i < N) {
            if (pat.charAt(j) == txt.charAt(i)) { i++; j++; }
            if (j == M) {
                System.out.println("Found at " + (i - j));
                j = lps[j - 1]; // continue, so overlapping matches are also reported
            } else if (i < N && pat.charAt(j) != txt.charAt(i)) {
                if (j != 0) j = lps[j - 1]; // reuse the matched prefix; i does not move back
                else i++;
            }
        }
    }

    // lps[i] is the length of the longest proper prefix of pat[0..i]
    // that is also a suffix of pat[0..i].
    static void computeLPSArray(String pat, int M, int[] lps) {
        int len = 0, i = 1;
        lps[0] = 0;
        while (i < M) {
            if (pat.charAt(i) == pat.charAt(len)) lps[i++] = ++len;
            else if (len != 0) len = lps[len - 1];
            else lps[i++] = 0;
        }
    }

    public static void main(String[] args) {
        String txt = "ABABDABACDABABCABAB";
        String pat = "ABABCABAB";
        KMPSearch(pat, txt);
    }
}
Output: Found at 10
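To make the preprocessing step more concrete, the following minimal sketch prints the LPS (longest proper prefix that is also a suffix) table for the pattern used above. The class name `LpsDemo` and the method `computeLps` are illustrative names introduced here, not part of the original listing; the logic mirrors `computeLPSArray`.

```java
import java.util.Arrays;

// Hypothetical helper class, used only to illustrate the LPS table.
class LpsDemo {
    // lps[i] is the length of the longest proper prefix of pat[0..i]
    // that is also a suffix of pat[0..i].
    static int[] computeLps(String pat) {
        int[] lps = new int[pat.length()];
        int len = 0; // length of the prefix matched so far
        int i = 1;
        while (i < pat.length()) {
            if (pat.charAt(i) == pat.charAt(len)) {
                lps[i++] = ++len;
            } else if (len != 0) {
                len = lps[len - 1]; // fall back to the next shorter candidate
            } else {
                lps[i++] = 0;
            }
        }
        return lps;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(computeLps("ABABCABAB")));
        // prints [0, 0, 1, 2, 0, 1, 2, 3, 4]
    }
}
```

The last entry, 4, says that after a full match of "ABABCABAB" the search can resume with the first four pattern characters ("ABAB") already matched, which is why the text index never moves backwards.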
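2. Trie: Prefix-Based Search
A trie stores strings in a tree in which each edge corresponds to one character, so looking up a word or prefix of length k takes O(k) time. Space grows in proportion to the total number of characters stored; with a fixed 26-slot child array per node, each node also carries a constant overhead for the alphabet.
Java Implementation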
public class Trie {
    static class TrieNode {
        TrieNode[] children = new TrieNode[26]; // one slot per lowercase letter 'a'..'z'
        boolean isEndOfWord;
    }

    TrieNode root = new TrieNode();

    // Inserts a word; only lowercase 'a'..'z' is supported.
    void insert(String word) {
        TrieNode node = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (index < 0 || index >= 26) throw new IllegalArgumentException("Unsupported character: " + c);
            if (node.children[index] == null) node.children[index] = new TrieNode();
            node = node.children[index];
        }
        node.isEndOfWord = true;
    }

    // Returns true only if the exact word was inserted (a mere prefix does not count).
    boolean search(String word) {
        TrieNode node = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (index < 0 || index >= 26) return false; // character outside 'a'..'z'
            if (node.children[index] == null) return false;
            node = node.children[index];
        }
        return node.isEndOfWord;
    }

    public static void main(String[] args) {
        Trie trie = new Trie();
        trie.insert("apple");
        System.out.println("Apple: " + trie.search("apple"));
        System.out.println("App: " + trie.search("app"));
    }
}
Output:
Apple: true
App: false
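The listing above only checks whole words, so "app" is reported as absent even though it is a prefix of "apple". A prefix query can be sketched as follows. The class `PrefixTrie` and the method `startsWith` are hypothetical names introduced for this illustration; like the original, the sketch assumes lowercase 'a' to 'z' input.

```java
// Hypothetical variant of the trie above, extended with a prefix query.
class PrefixTrie {
    private static class Node {
        Node[] children = new Node[26];
        boolean isEndOfWord;
    }

    private final Node root = new Node();

    // Assumes the word contains only 'a'..'z'.
    void insert(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';
            if (node.children[index] == null) node.children[index] = new Node();
            node = node.children[index];
        }
        node.isEndOfWord = true;
    }

    // Returns true if the path spelled by prefix exists in the trie.
    // Unlike search, it does not require isEndOfWord at the final node.
    // Note that the empty prefix always returns true.
    boolean startsWith(String prefix) {
        Node node = root;
        for (char c : prefix.toCharArray()) {
            int index = c - 'a';
            if (index < 0 || index >= 26 || node.children[index] == null) return false;
            node = node.children[index];
        }
        return true;
    }

    public static void main(String[] args) {
        PrefixTrie trie = new PrefixTrie();
        trie.insert("apple");
        System.out.println("app: " + trie.startsWith("app")); // prints app: true
        System.out.println("apx: " + trie.startsWith("apx")); // prints apx: false
    }
}
```

The only difference from `search` is the final return value: a prefix query succeeds as soon as the path exists, regardless of whether a word ends there.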
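3. Rabin-Karp: Hash-Based Matching
Rabin-Karp finds a pattern by comparing hash values: it hashes the pattern and each text window of the same length, and compares characters one by one only when the hashes agree. The average running time is O(n + m); the worst case, when many windows collide with the pattern's hash, is O(nm).
Java Implementation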
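public class RabinKarp {
    // Prints the start index of every occurrence of pat in txt.
    // q is the hash modulus; with int arithmetic it must stay small (such as 101)
    // so that d * (t - c * h) cannot overflow.
    public static void search(String pat, String txt, int q) {
        int d = 256, M = pat.length(), N = txt.length(), p = 0, t = 0, h = 1;
        if (M == 0 || M > N) return; // no window of length M exists in txt
        // h = d^(M-1) mod q, the weight of the leading character of a window
        for (int i = 0; i < M - 1; i++) h = (h * d) % q;
        // Hash the pattern (p) and the first text window (t)
        for (int i = 0; i < M; i++) {
            p = (d * p + pat.charAt(i)) % q;
            t = (d * t + txt.charAt(i)) % q;
        }
        for (int i = 0; i <= N - M; i++) {
            if (p == t) {
                // Equal hashes may be a collision, so verify character by character
                boolean match = true;
                for (int j = 0; j < M; j++) {
                    if (pat.charAt(j) != txt.charAt(i + j)) { match = false; break; }
                }
                if (match) System.out.println("Found at " + i);
            }
            if (i < N - M) {
                // Roll the hash: drop txt[i], append txt[i + M]
                t = (d * (t - txt.charAt(i) * h) + txt.charAt(i + M)) % q;
                if (t < 0) t += q; // Java's % can yield a negative result
            }
        }
    }

    public static void main(String[] args) {
        String txt = "GEEKSFORGEEKS";
        String pat = "GEEKS";
        int q = 101; // a prime modulus
        search(pat, txt, q);
    }
}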
Output:
Found at 0
Found at 8
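The step that keeps Rabin-Karp fast is the rolling update, which derives each window's hash from the previous one in constant time. The sketch below is one way to check that idea in isolation: it compares the rolled hash of every window against a hash recomputed from scratch. The class `RollingHashDemo` and its methods are illustrative names that do not appear in the original, and the sketch uses `long` arithmetic, an assumption made here to leave headroom for larger moduli.

```java
// Hypothetical demo class: checks the rolling hash against direct recomputation.
class RollingHashDemo {
    static final int D = 256; // base, matching d in the search code

    // Hash of s[start .. start + len - 1], computed from scratch.
    static long directHash(String s, int start, int len, int q) {
        long h = 0;
        for (int i = 0; i < len; i++) h = (D * h + s.charAt(start + i)) % q;
        return h;
    }

    // Hashes of all windows of length m, each derived from the previous one in O(1).
    static long[] rollingHashes(String s, int m, int q) {
        int n = s.length();
        if (m <= 0 || m > n) return new long[0];
        long high = 1; // D^(m-1) mod q
        for (int i = 0; i < m - 1; i++) high = (high * D) % q;
        long[] out = new long[n - m + 1];
        out[0] = directHash(s, 0, m, q);
        for (int i = 1; i <= n - m; i++) {
            long t = out[i - 1] - (s.charAt(i - 1) * high) % q; // drop the leading char
            t = (t * D + s.charAt(i + m - 1)) % q;              // shift and append
            if (t < 0) t += q;                                  // normalize to [0, q)
            out[i] = t;
        }
        return out;
    }

    public static void main(String[] args) {
        String txt = "GEEKSFORGEEKS";
        long[] rolled = rollingHashes(txt, 5, 101);
        boolean agree = true;
        for (int i = 0; i < rolled.length; i++) {
            if (rolled[i] != directHash(txt, i, 5, 101)) agree = false;
        }
        System.out.println("All windows agree: " + agree); // prints All windows agree: true
    }
}
```

Recomputing every window from scratch would cost O(m) per window; the rolling update is what brings the expected total down to O(n + m).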