Given a sequence of words, print all anagrams together


Given a sequence of words, print all anagrams together | Set 2 | GeeksforGeeks
Given an array of words, print all anagrams together. For example, if the given array is {“cat”, “dog”, “tac”, “god”, “act”}, then output may be “cat tac act dog god”.

1) Create an empty Trie
2) One by one take all words of input sequence. Do following for each word
a) Copy the word to a buffer.
b) Sort the buffer
c) Insert the sorted buffer and index of this word to Trie. Each leaf node of Trie is head of a Index list. The Index list stores index of words in original sequence. If sorted buffe is already present, we insert index of this word to the index list.
3) Traverse Trie. While traversing, if you reach a leaf node, traverse the index list. And print all words using the index obtained from Index list.

// This function traverses the built trie. When a leaf node is reached,
// all words connected at that leaf node are anagrams. So it traverses
// the list at leaf node and uses stored index to print original words
void printAnagramsUtil(struct TrieNode* root, char *wordArr[])
{
    if (root == NULL)
        return;
    // If a lead node is reached, print all anagrams using the indexes
    // stored in index linked list
    if (root->isEnd)
    {
        // traverse the list
        IndexNode* pCrawl = root->head;
        while (pCrawl != NULL)
        {
            printf( "%s \n", wordArr[ pCrawl->index ] );
            pCrawl = pCrawl->next;
        }
    }
    for (int i = 0; i < NO_OF_CHARS; ++i)
        printAnagramsUtil(root->child[i], wordArr);
}
// The main function that prints all anagrams together. wordArr[] is input
// sequence of words.
void printAnagramsTogether(char* wordArr[], int size)
{
    // Create an empty Trie
    struct TrieNode* root = NULL;
    // Iterate through all input words
    for (int i = 0; i < size; ++i)
    {
        // Create a buffer for this word and copy the word to buffer
        int len = strlen(wordArr[i]);
        char *buffer = new char[len+1];
        strcpy(buffer, wordArr[i]);
        // Sort the buffer
        qsort( (void*)buffer, strlen(buffer), sizeof(char), compare );
        // Insert the sorted buffer and its original index to Trie
        insert(&root,  buffer, i);
    }
    // Traverse the built Trie and print all anagrms together
    printAnagramsUtil(root, wordArr);
}
http://ajeetsingh.org/2013/10/17/write-a-dictionary-using-thousands-of-words-to-find-all-anagrams-for-a-given-string-with-o1-complexity/
Suppose we a thousands of words and we need to maintain these words in a data structure in such a way that we should be able to find all anagrams for a given string.
I tried to achieve this with O(1) complexity.
Don't need to sort word - get its hash code
Step 1:Create an array of prime numbers.
       int primes[] = {2, 3, 5, 7, ...};
       We are using prime number to avoid false collisions.
Step 2:Create a method to calculate hash code of a word\string.
       int getHashCode(String str){
         int hash = 31;
         for(i =0 to length of str){
            hash = hash*primes['a' - str.charAt[i]];
         }
         return hash;
       }
Step 3: Now store all words in a HashMap<Integer, List<String>.
   void loadDictionary(String[] words){
      for( word from words for i = 0 to length of words)   {
         int hash  = getHashCode(word);
         List<String> anagrams = dictionary.get(hash);
         if(anagrams ! = null){
             anagrams.add(word);
         } else
            List<String> newAnagrams = new ArrayList<String>();
            newAnagrams.add(word);
            dictionary.put(hash, newAnagrams);
         }
    }
   }
Step 4: Now here is the approach to find anagrams:
   int findNumberOfAnagrams(String str){
   List<String> anagrams = dictionary.get(getHashCode(str));
      return anagrams.size();
   }
here is my solution using hashmap.... 
http://ideone.com/4GVgla
  1. void printAnagram(string s[],int n)
  2. {
  3. unordered_map<string,vector<string> > m;
  4. string t;
  5. for(int i=0;i<n;i++)
  6. {
  7. t=s[i];
  8. sort(t.begin(),t.end());
  9. m[t].push_back(s[i]);
  10. }
  11. for(unordered_map<string,vector<string> >::iterator it=m.begin();it!=m.end();it++)
  12. {
  13. for(int i=0;i<it->second.size();i++)
  14. cout<<it->second[i]<<endl;
  15. }
  16. }
http://www.geeksforgeeks.org/given-a-sequence-of-words-print-all-anagrams-together/
simple method is to create a Hash Table. Calculate the hash value of each word in such a way that all anagrams have the same hash value. Populate the Hash Table with these hash values. Finally, print those words together with same hash values. A simple hashing mechanism can be modulo sum of all characters. With modulo sum, two non-anagram words may have same hash value. This can be handled by matching individual characters.
struct Word
{
    char* str;  // to store word itself
    int index; // index of the word in the original array
};
// structure to represent duplicate array.
struct DupArray
{
    struct Word* array;  // Array of words
    int size;   // Size of array
};
void printAnagramsTogether(char* wordArr[], int size)
{
    // Step 1: Create a copy of all words present in given wordArr.
    // The copy will also have orignal indexes of words
    struct DupArray* dupArray = createDupArray(wordArr, size);
    // Step 2: Iterate through all words in dupArray and sort individual words.
    int i;
    for (i = 0; i < size; ++i)
       qsort(dupArray->array[i].str,
             strlen(dupArray->array[i].str), sizeof(char), compChar);
    // Step 3: Now sort the array of words in dupArray
    qsort(dupArray->array, size, sizeof(dupArray->array[0]), compStr);
    // Step 4: Now all words in dupArray are together, but these words are
    // changed. Use the index member of word struct to get the corresponding
    // original word
    for (i = 0; i < size; ++i)
        printf("%s ", wordArr[dupArray->array[i].index]);
}
http://www.cnblogs.com/grandyang/p/4882024.html
让我们给一个字符串数组排序,让所有的变位词Anagrams排在一起,关于变位词,LeetCode里有两道相关的题目Anagrams 错位词Valid Anagram 验证变位词。那么对于这道题,我们有两种方法可以实现,先来看第一种方法,来重写sort中的比较函数compare
bool cmp(const string &a, const string &b) {
    string m = a, n = b;
    sort(m.begin(), m.end());
    sort(n.begin(), n.end());
    return !m.compare(n);
}
sort(array.begin(), array.end(), cmp);
https://gist.github.com/anupsavvy/1113100
public static String[] sortStrings(String[] arr){

if(arr == null || arr.length == 0 || arr.length == 1)
return arr;

class StringComparator implements Comparator<String>{

private String sort(String s){
char[] letters = s.toCharArray();
Arrays.sort(letters);
return new String(letters);
}

public int compare(String s1, String s2){
return sort(s1).compareTo(sort(s2));
}
}

Arrays.sort(arr,new StringComparator());
return arr;

}
另一种解法较为复杂一些,用到了哈希表来建立排序后的字符串和其所有的异位词集合的映射,最后在按集合填充原数组
    void sortArray(vector<string> &array) {
        unordered_map<string, vector<string> > m;
        for (auto &a : array) {
            string key = a;
            sort(key.begin(), key.end());
            if (m.find(key) == m.end()) {
                m[key] = vector<string>();
            }
            vector<string> &v = m[key];
            v.push_back(a);
        }
        int idx = 0;
        for (unordered_map<string, vector<string> >::iterator it = m.begin(); it != m.end(); ++it) {
            for (auto &a : it->second) {
                array[idx++] = a;
            }
        }
    }

https://www.geeksforgeeks.org/given-a-sequence-of-words-print-all-anagrams-together/
simple method is to create a Hash Table. Calculate the hash value of each word in such a way that all anagrams have the same hash value. Populate the Hash Table with these hash values. Finally, print those words together with same hash values. A simple hashing mechanism can be modulo sum of all characters. With modulo sum, two non-anagram words may have same hash value. This can be handled by matching individual characters.
Following is another method to print all anagrams together. Take two auxiliary arrays, index array and word array. Populate the word array with the given sequence of words. Sort each individual word of the word array. Finally, sort the word array and keep track of the corresponding indices. After sorting, all the anagrams cluster together. Use the index array to print the strings from the original array of strings.
Let us understand the steps with following input Sequence of Words:
"cat", "dog", "tac", "god", "act"
1) Create two auxiliary arrays index[] and words[]. Copy all given words to words[] and store the original indexes in index[]
index[]:  0   1   2   3   4
words[]: cat dog tac god act
2) Sort individual words in words[]. Index array doesn’t change.
index[]:   0    1    2    3    4
words[]:  act  dgo  act  dgo  act
3) Sort the words array. Compare individual words using strcmp() to sort
index:     0    2    4    1    3
words[]:  act  act  act  dgo  dgo
4) All anagrams come together. But words are changed in words array. To print the original words, take index from the index array and use it in the original array. We get
"cat tac act dog god"
    private static void printAnagrams(String arr[])
    {
        HashMap<Integer, List<String> > map = new HashMap<>();
  
        // loop over all words
        for (int i = 0; i < arr.length; i++) {
  
            // convert to char array, sort and
            // then re-convert to string
            String word = arr[i];
            char[] letters = word.toCharArray();
            Arrays.sort(letters);
            String newWord = new String(letters);
  
            // calculate hashcode of string
            // after sorting
            int n = newWord.hashCode();
            if (map.containsKey(n)) {
  
                // Here, we already have
                // a word for the hashcode
                List<String> words = map.get(n);
                words.add(word);
                map.put(n, words);
            }
            else {
  
                // This is the first time we are
                // adding a word for a specific
                // hashcode
                List<String> words = new ArrayList<>();
                words.add(word);
                map.put(n, words);
            }
        }
  
        // print all the values where size is > 1
        // If you want to print non-anagrams,
        // just print the values having size = 1
        for (Integer i : map.keySet()) {
            List<String> values = map.get(i);
            if (values.size() > 1) {
                System.out.print(values);
            }
        }
    }
  
Read full article from Given a sequence of words, print all anagrams together | Set 2 | GeeksforGeeks

Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts