LeetCode [271] Encode and Decode Strings


LeetCode [271] Encode and Decode Strings
http://segmentfault.com/a/1190000003791865
Design an algorithm to encode a list of strings to a string. The encoded string is then sent over the network and is decoded back to the original list of strings.
Machine 1 (sender) has the function:
string encode(vector<string> strs) { // ... your code return encoded_string; }
Machine 2 (receiver) has the function:
vector<string> decode(string s) { //... your code return strs; }
So Machine 1 does:
string encoded_string = encode(strs);
and Machine 2 does:
vector<string> strs2 = decode(encoded_string);
strs2 in Machine 2 should be the same as strs in Machine 1.
Implement the encode and decode methods.
Note: The string may contain any possible characters out of 256 valid ascii characters. Your algorithm should be generalized enough to work on any possible characters. Do not use class member/global/static variables to store states. Your encode and decode algorithms should be stateless. Do not rely on any library method such as eval or serialize methods. You should implement your own encode/decode algorithm.
Also http://www.meetqun.com/thread-11463-1-1.html
    // Encodes a list of strings to a single string.
    public String encode(List<String> strs) {
        StringBuilder output = new StringBuilder();
        for(String str : strs){
            // 对于每个子串,先把其长度放在前面,用#隔开
            output.append(String.valueOf(str.length())+"#");
            // 再把子串本身放在后面
            output.append(str);
        }
        return output.toString();
    }

    // Decodes a single string to a list of strings.
    public List<String> decode(String s) {
        List<String> res = new LinkedList<String>();
        int start = 0;
        while(start < s.length()){
            // 找到从start开始的第一个#,这个#前面是长度
            int idx = s.indexOf('#', start);
            int size = Integer.parseInt(s.substring(start, idx));
            // 根据这个长度截取子串
            res.add(s.substring(idx + 1, idx + size + 1));
            // 更新start为子串后面一个位置
            start = idx + size + 1;
        }
        return res;
    }
Better: https://leetcode.com/discuss/57716/java-solution-string-length-delimiter-as-header
public String encode(List<String> strs) { StringBuffer result = new StringBuffer(); if(strs == null || strs.size() == 0) return result.toString(); for(String str: strs){ result.append(str.length()); result.append("#"); result.append(str); } return result.toString(); } // Decodes a single string to a list of strings. public List<String> decode(String s) { List<String> result = new ArrayList(); if(s == null || s.length() == 0) return result; int current = 0; while(true){ if(current == s.length()) break; StringBuffer sb = new StringBuffer(); while(s.charAt(current) != '#'){ sb.append(s.charAt(current)); current++; } int len = Integer.parseInt(sb.toString()); int end = current + 1 + len; result.add(s.substring(current+1, end)); current = end; } return result; }

X. http://leetcode.tgic.me/encode-and-decode-strings/index.html TODO

X. http://nb4799.neu.edu/wordpress/?p=900
Also http://www.fgdsb.com/2015/01/06/encode-decode-strings/
class Codec:
    def encode(self, strs):
        return ''.join(s.replace('|', '||') + ' | ' for s in strs)
    def decode(self, s):
        return [t.replace('||', '|') for t in s.split(' | ')[:-1]]
Use ” | ” as a delimiter of strings. If any string has “|” in it, you need to replace it with “||” in advance. When decoding, you first get ” | ” splitted words. Then, if there is any “||” in splitted words, you need to replace it back with “|”. Don’t forget to exclude the last element ofs.split(' | ') since it is an empty string. 



http://shibaili.blogspot.com/2015/09/day-124-271-encode-and-decode-strings.html
  • The string may contain any possible characters out of 256 valid ascii characters. Your algorithm should be generalized enough to work on any possible characters.
  • Do not use class member/global/static variables to store states. Your encode and decode algorithms should be stateless.
  • Do not rely on any library method such as eval or serialize methods. You should implement your own encode/decode algorithm.
    // Encodes a list of strings to a single string.
    string encode(vector<string>& strs) {
        string rt = "";
        for (string s : strs) {
            rt += to_string(s.length()) + "/" + s;
        }
         
        return rt;
    }
    // Decodes a single string to a list of strings.
    vector<string> decode(string s) {
        vector<string> rt;
        int i = 0;
        while (i < s.length()) {
            int start = i;
            string part = "",length = "";
            while (isdigit(s[i])) {
                length += s[i];
                i++;
            }
            part = s.substr(i + 1,stoi(length));
            rt.push_back(part);
            i += stoi(length) + 1;
        }
        return rt;
    }

http://www.cnblogs.com/tonix/p/4768262.html
This is about https://en.wikipedia.org/wiki/Run-length_encoding. The trick is, for a valid char, we only compress up to 254 occurences - count 255 means end of a string.

    // Encodes a list of strings to a single string.
    string encode(vector<string>& strs) {
        string ret;
        for(auto s:strs){
            int i = 0, len = s.size(), cnt;
            while(i<len){
                cnt = 1;
                uchar c = s[i++];
                while(i<len && (uchar)s[i]==c && cnt<254){
                    i++; cnt++;
                }
                ret += (uchar)cnt;
                ret += c;
            }
            ret += (uchar)255;
        }
        return ret;
    }
    // Decodes a single string to a list of strings.
    vector<string> decode(string s) {
        vector<string> ret;
        string cur;
        int i=0, len = s.size();
        while(i<len){
            uchar c = s[i++];
            if(c==(uchar)255){
                ret.push_back(cur);
                cur.clear();
            }else{
                int cnt = c;
                c = s[i++];
                for(int i=0; i<cnt; ++i) cur += c;
            }
        }
        return ret;
    }
http://buttercola.blogspot.com/2015/09/leetcode-encode-and-decode-strings.html
    // Encodes a list of strings to a single string.
    public String encode(List<String> strs) {
        if (strs == null || strs.size() == 0) {
            return "";
        }
         
        StringBuffer sb = new StringBuffer();
         
        for (String str : strs) {
            int len = str == null ? 0 : str.length();
            sb.append(len);
            sb.append('#');
            sb.append(str);
        }
         
        return sb.toString();
    }
    // Decodes a single string to a list of strings.
    public List<String> decode(String s) {
        List<String> result = new ArrayList<String>();
        if (s == null || s.length() == 0) {
            return result;
        }
         
        int i = 0;
        while (i < s.length()) {
            int len = 0;
            // Get length
            while (i < s.length() && s.charAt(i) != '#') {
                len = len * 10 + Character.getNumericValue(s.charAt(i));
                i++;
            }
             
            String str = s.substring(i + 1, i + len + 1);
            result.add(str);
            i = i + len + 1;
        }
         
        return result;
    }
http://algobox.org/encode-and-decode-strings/


Encode: add the lengths of the strings seperated by ',' as metadata in front of the real data, then use ':' to separate the metadata and the real data. In the end append all the strings.
Decode: extract metadata by finding the first ':', then split the metadata and convert it to lengths. Finally, extract the strings from real data according to the lengths one by one.
    // Encodes a list of strings to a single string.
    public String encode(List<String> strs) {
        if (strs.isEmpty()) return "";
        StringBuilder builder = new StringBuilder();
        for (String str : strs) builder.append(str.length()).append(',');
        builder.setCharAt(builder.length() - 1, ':');
        for (String str : strs) builder.append(str);
        return builder.toString();
    }
    // Decodes a single string to a list of strings.
    public List<String> decode(String s) {
        List<String> ans = new ArrayList<>();
        int m = s.indexOf(':');
        if (m < 0) return ans;
        int i = m + 1;
        String[] strLens = s.substring(0, m).split(",");
        for (String strLen : strLens) {
            int len = Integer.parseInt(strLen);
            ans.add(s.substring(i, i + len));
            i += len;
        }
        return ans;
    }
Encode: pack every string into a package of (str.length+','+str).
Decode: find the comma and then get the length in front of it then extract the string behind it.
    public String encode(List<String> strs) {
        StringBuilder builder = new StringBuilder();
        for (String str : strs)
            builder.append(str.length()).append(',').append(str);
        return builder.toString();
    }
    // Decodes a single string to a list of strings.
    public List<String> decode(String s) {
        List<String> ans = new ArrayList<>();
        for (int i = 0, len; i < s.length(); i += len) {
            int comma = s.indexOf(',', i);
            len = Integer.parseInt(s.substring(i, comma));
            i = comma + 1;
            ans.add(s.substring(i, i + len));
        }
        return ans;
    }
X. https://leetcode.com/discuss/57472/java-with-escaping
Double any hashes inside the strings, then use standalone hashes (surrounded by spaces) to mark string endings. For example:
{"abc", "def"}    =>  "abc # def # "
{'abc', '#def'}   =>  "abc # ##def # "
{'abc##', 'def'}  =>  "abc#### # def # "
For decoding, just do the reverse: First split at standalone hashes, then undo the doubling in each string.
public String encode(List<String> strs) {
    StringBuffer out = new StringBuffer();
    for (String s : strs)
        out.append(s.replace("#", "##")).append(" # ");
    return out.toString();
}

public List<String> decode(String s) {
    List strs = new ArrayList();
    String[] array = s.split(" # ", -1);
    for (int i=0; i<array.length-1; ++i)
        strs.add(array[i].replace("##", "#"));
    return strs;
}
Or with streaming:
public String encode(List<String> strs) {
    return strs.stream()
               .map(s -> s.replace("#", "##") + " # ")
               .collect(Collectors.joining());
}

public List<String> decode(String s) {
    List strs = Stream.of(s.split(" # ", -1))
                      .map(t -> t.replace("##", "#"))
                      .collect(Collectors.toList());
    strs.remove(strs.size() - 1);
    return strs;
}
https://gist.github.com/gcrfelix/f342257a2a124f525c4c
Read full article from LeetCode [271] Encode and Decode Strings

Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts