Given a sequence of words, print all anagrams together | Set 2 | GeeksforGeeks
Given an array of words, print all anagrams together. For example, if the given array is {“cat”, “dog”, “tac”, “god”, “act”}, then output may be “cat tac act dog god”.
1) Create an empty Trie
2) One by one take all words of input sequence. Do following for each word
…a) Copy the word to a buffer.
…b) Sort the buffer
…c) Insert the sorted buffer and index of this word to Trie. Each leaf node of Trie is head of a Index list. The Index list stores index of words in original sequence. If sorted buffe is already present, we insert index of this word to the index list.
3) Traverse Trie. While traversing, if you reach a leaf node, traverse the index list. And print all words using the index obtained from Index list.
http://ajeetsingh.org/2013/10/17/write-a-dictionary-using-thousands-of-words-to-find-all-anagrams-for-a-given-string-with-o1-complexity/
Suppose we a thousands of words and we need to maintain these words in a data structure in such a way that we should be able to find all anagrams for a given string.
I tried to achieve this with O(1) complexity.
Don't need to sort word - get its hash code
here is my solution using hashmap....
http://ideone.com/4GVgla
http://www.cnblogs.com/grandyang/p/4882024.html
让我们给一个字符串数组排序,让所有的变位词Anagrams排在一起,关于变位词,LeetCode里有两道相关的题目Anagrams 错位词和Valid Anagram 验证变位词。那么对于这道题,我们有两种方法可以实现,先来看第一种方法,来重写sort中的比较函数compare
public static String[] sortStrings(String[] arr){
if(arr == null || arr.length == 0 || arr.length == 1)
return arr;
class StringComparator implements Comparator<String>{
private String sort(String s){
char[] letters = s.toCharArray();
Arrays.sort(letters);
return new String(letters);
}
public int compare(String s1, String s2){
return sort(s1).compareTo(sort(s2));
}
}
Arrays.sort(arr,new StringComparator());
return arr;
}
另一种解法较为复杂一些,用到了哈希表来建立排序后的字符串和其所有的异位词集合的映射,最后在按集合填充原数组
https://www.geeksforgeeks.org/given-a-sequence-of-words-print-all-anagrams-together/
Read full article from
Given a sequence of words, print all anagrams together | Set 2 | GeeksforGeeks
Given an array of words, print all anagrams together. For example, if the given array is {“cat”, “dog”, “tac”, “god”, “act”}, then output may be “cat tac act dog god”.
1) Create an empty Trie
2) One by one take all words of input sequence. Do following for each word
…a) Copy the word to a buffer.
…b) Sort the buffer
…c) Insert the sorted buffer and index of this word to Trie. Each leaf node of Trie is head of a Index list. The Index list stores index of words in original sequence. If sorted buffe is already present, we insert index of this word to the index list.
3) Traverse Trie. While traversing, if you reach a leaf node, traverse the index list. And print all words using the index obtained from Index list.
// This function traverses the built trie. When a leaf node is reached,
// all words connected at that leaf node are anagrams. So it traverses
// the list at leaf node and uses stored index to print original words
void
printAnagramsUtil(
struct
TrieNode* root,
char
*wordArr[])
{
if
(root == NULL)
return
;
// If a lead node is reached, print all anagrams using the indexes
// stored in index linked list
if
(root->isEnd)
{
// traverse the list
IndexNode* pCrawl = root->head;
while
(pCrawl != NULL)
{
printf
(
"%s \n"
, wordArr[ pCrawl->index ] );
pCrawl = pCrawl->next;
}
}
for
(
int
i = 0; i < NO_OF_CHARS; ++i)
printAnagramsUtil(root->child[i], wordArr);
}
// The main function that prints all anagrams together. wordArr[] is input
// sequence of words.
void
printAnagramsTogether(
char
* wordArr[],
int
size)
{
// Create an empty Trie
struct
TrieNode* root = NULL;
// Iterate through all input words
for
(
int
i = 0; i < size; ++i)
{
// Create a buffer for this word and copy the word to buffer
int
len =
strlen
(wordArr[i]);
char
*buffer =
new
char
[len+1];
strcpy
(buffer, wordArr[i]);
// Sort the buffer
qsort
( (
void
*)buffer,
strlen
(buffer),
sizeof
(
char
), compare );
// Insert the sorted buffer and its original index to Trie
insert(&root, buffer, i);
}
// Traverse the built Trie and print all anagrms together
printAnagramsUtil(root, wordArr);
}
Suppose we a thousands of words and we need to maintain these words in a data structure in such a way that we should be able to find all anagrams for a given string.
I tried to achieve this with O(1) complexity.
Don't need to sort word - get its hash code
Step
1
:Create an array of prime numbers.
int
primes[] = {
2
,
3
,
5
,
7
, ...};
We are using prime number to avoid
false
collisions.
Step
2
:Create a method to calculate hash code of a word\string.
int
getHashCode(String str){
int
hash =
31
;
for
(i =
0
to length of str){
hash = hash*primes[
'a'
- str.charAt[i]];
}
return
hash;
}
Step
3
: Now store all words in a HashMap<Integer, List<String>.
void
loadDictionary(String[] words){
for
( word from words
for
i =
0
to length of words) {
int
hash = getHashCode(word);
List<String> anagrams = dictionary.get(hash);
if
(anagrams ! =
null
){
anagrams.add(word);
}
else
List<String> newAnagrams =
new
ArrayList<String>();
newAnagrams.add(word);
dictionary.put(hash, newAnagrams);
}
}
}
Step
4
: Now here is the approach to find anagrams:
int
findNumberOfAnagrams(String str){
List<String> anagrams = dictionary.get(getHashCode(str));
return
anagrams.size();
}
http://ideone.com/4GVgla
- void printAnagram(string s[],int n)
- {
- unordered_map<string,vector<string> > m;
- string t;
- for(int i=0;i<n;i++)
- {
- t=s[i];
- sort(t.begin(),t.end());
- m[t].push_back(s[i]);
- }
- for(unordered_map<string,vector<string> >::iterator it=m.begin();it!=m.end();it++)
- {
- for(int i=0;i<it->second.size();i++)
- cout<<it->second[i]<<endl;
- }
- }
A simple method is to create a Hash Table. Calculate the hash value of each word in such a way that all anagrams have the same hash value. Populate the Hash Table with these hash values. Finally, print those words together with same hash values. A simple hashing mechanism can be modulo sum of all characters. With modulo sum, two non-anagram words may have same hash value. This can be handled by matching individual characters.
struct
Word
{
char
* str;
// to store word itself
int
index;
// index of the word in the original array
};
// structure to represent duplicate array.
struct
DupArray
{
struct
Word* array;
// Array of words
int
size;
// Size of array
};
void
printAnagramsTogether(
char
* wordArr[],
int
size)
{
// Step 1: Create a copy of all words present in given wordArr.
// The copy will also have orignal indexes of words
struct
DupArray* dupArray = createDupArray(wordArr, size);
// Step 2: Iterate through all words in dupArray and sort individual words.
int
i;
for
(i = 0; i < size; ++i)
qsort
(dupArray->array[i].str,
strlen
(dupArray->array[i].str),
sizeof
(
char
), compChar);
// Step 3: Now sort the array of words in dupArray
qsort
(dupArray->array, size,
sizeof
(dupArray->array[0]), compStr);
// Step 4: Now all words in dupArray are together, but these words are
// changed. Use the index member of word struct to get the corresponding
// original word
for
(i = 0; i < size; ++i)
printf
(
"%s "
, wordArr[dupArray->array[i].index]);
}
让我们给一个字符串数组排序,让所有的变位词Anagrams排在一起,关于变位词,LeetCode里有两道相关的题目Anagrams 错位词和Valid Anagram 验证变位词。那么对于这道题,我们有两种方法可以实现,先来看第一种方法,来重写sort中的比较函数compare
bool cmp(const string &a, const string &b) { string m = a, n = b; sort(m.begin(), m.end()); sort(n.begin(), n.end()); return !m.compare(n); } sort(array.begin(), array.end(), cmp);https://gist.github.com/anupsavvy/1113100
public static String[] sortStrings(String[] arr){
if(arr == null || arr.length == 0 || arr.length == 1)
return arr;
class StringComparator implements Comparator<String>{
private String sort(String s){
char[] letters = s.toCharArray();
Arrays.sort(letters);
return new String(letters);
}
public int compare(String s1, String s2){
return sort(s1).compareTo(sort(s2));
}
}
Arrays.sort(arr,new StringComparator());
return arr;
}
另一种解法较为复杂一些,用到了哈希表来建立排序后的字符串和其所有的异位词集合的映射,最后在按集合填充原数组
void sortArray(vector<string> &array) { unordered_map<string, vector<string> > m; for (auto &a : array) { string key = a; sort(key.begin(), key.end()); if (m.find(key) == m.end()) { m[key] = vector<string>(); } vector<string> &v = m[key]; v.push_back(a); } int idx = 0; for (unordered_map<string, vector<string> >::iterator it = m.begin(); it != m.end(); ++it) { for (auto &a : it->second) { array[idx++] = a; } } }
https://www.geeksforgeeks.org/given-a-sequence-of-words-print-all-anagrams-together/
A simple method is to create a Hash Table. Calculate the hash value of each word in such a way that all anagrams have the same hash value. Populate the Hash Table with these hash values. Finally, print those words together with same hash values. A simple hashing mechanism can be modulo sum of all characters. With modulo sum, two non-anagram words may have same hash value. This can be handled by matching individual characters.
Following is another method to print all anagrams together. Take two auxiliary arrays, index array and word array. Populate the word array with the given sequence of words. Sort each individual word of the word array. Finally, sort the word array and keep track of the corresponding indices. After sorting, all the anagrams cluster together. Use the index array to print the strings from the original array of strings.
Let us understand the steps with following input Sequence of Words:
"cat", "dog", "tac", "god", "act"
1) Create two auxiliary arrays index[] and words[]. Copy all given words to words[] and store the original indexes in index[]
index[]: 0 1 2 3 4 words[]: cat dog tac god act
2) Sort individual words in words[]. Index array doesn’t change.
index[]: 0 1 2 3 4 words[]: act dgo act dgo act
3) Sort the words array. Compare individual words using strcmp() to sort
index: 0 2 4 1 3 words[]: act act act dgo dgo
4) All anagrams come together. But words are changed in words array. To print the original words, take index from the index array and use it in the original array. We get
"cat tac act dog god"