Massive Algorithms: Design a data structure for sparse matrix + sparse vector

Design a data structure for sparse matrix + sparse vector

http://www.geeksforgeeks.org/sparse-matrix-representations-using-list-lists-dictionary-keys/

List of Lists (LIL)

One of the possible representation of sparse matrix is List of Lists (LIL). Where one list is used to represent the rows and each row contains the list of triples: Column index, Value(non – zero element) and address field, for non – zero elements. For the best performance both lists should be stored in order of ascending keys.

struct row_list

{

    int row_number;

    struct row_list *link_down;

    struct value_list *link_right;

};

// Node to represent triples

struct value_list

{

    int column_index;

    int value;

    struct value_list *next;

};

Dictionary of Keys

    map< pair<int,int>, int > new_matrix;

    for (int i = 0; i < R; i++)

        for (int j = 0; j < C; j++)

            if (Sparse_Matrix[i][j] != 0)

                new_matrix[make_pair(i+1,j+1)] =

                                Sparse_Matrix[i][j] ;

http://www.geeksforgeeks.org/sparse-matrix-representation/

A matrix is a two-dimensional data object made of m rows and n columns, therefore having total m x n values. If most of the elements of the matrix have 0 value, then it is called a sparse matrix.

Why to use Sparse Matrix instead of simple matrix ?

Storage: There are lesser non-zero elements than zeros and thus lesser memory can be used to store only those elements.
Computing time: Computing time can be saved by logically designing a data structure traversing only non-zero elements..

Representing a sparse matrix by a 2D array leads to wastage of lots of memory as zeroes in the matrix are of no use in most of the cases. So, instead of storing zeroes with non-zero elements, we only store non-zero elements. This means storing non-zero elements with triples- (Row, Column, value).

Sparse Matrix Representations can be done in many ways following are two common representations:

Array representation
Linked list representation

In linked list, each node has four fields. These four fields are defined as:

Row: Index of row, where non-zero element is located
Column: Index of column, where non-zero element is located
Value: Value of the non zero element located at index – (row,column)
Next node: Address of the next node

Other representations:
As a Dictionary where row and column numbers are used as keys and values are matrix entries. This method saves space but sequential access of items is costly.

As a list of list. The idea is to make a list of rows and every item of list contains values. We can keep list items sorted by column numbers.

struct row_list

{

    int row_number;

    struct row_list *link_down;

    struct value_list *link_right;

};

// Node to represent triples

struct value_list

{

    int column_index;

    int value;

    struct value_list *next;

};

An alternative representation of sparse matrix is Dictionary. For the key field of the dictionary, pair of row and column index is used that maps with the non – zero element of the matrix. This method saves space but sequential access of items is costly.

map< pair<int,int>, int > new_matrix;
Related: Design SparseVector
http://www.cs.cmu.edu/~scandal/cacm/node9.html

A =

2.0	-1.0	0	0
-1.0	2.0	-1.0	0
0	-1.0	2.0	-1.0
0	0	-1.0	2.0

is represented in this way as

A = [[(0, 2.0), (1, -1.0)],
     [(0, -1.0), (1, 2.0), (2, -1.0)],
     [(1, -1.0), (2, 2.0), (3, -1.0)],
     [(2, -1.0), (3, 2.0)]]

where A is a nested sequence.
http://introcs.cs.princeton.edu/java/44st/SparseMatrix.java.html

public class SparseMatrix {
    private final int N;           // N-by-N matrix
    private SparseVector[] rows;   // the rows, each row is a sparse vector

    // initialize an N-by-N matrix of all 0s
    public SparseMatrix(int N) {
        this.N  = N;
        rows = new SparseVector[N];
        for (int i = 0; i < N; i++)
            rows[i] = new SparseVector(N);
    }

    // put A[i][j] = value
    public void put(int i, int j, double value) {
        if (i < 0 || i >= N) throw new RuntimeException("Illegal index");
        if (j < 0 || j >= N) throw new RuntimeException("Illegal index");
        rows[i].put(j, value);
    }

    // return A[i][j]
    public double get(int i, int j) {
        if (i < 0 || i >= N) throw new RuntimeException("Illegal index");
        if (j < 0 || j >= N) throw new RuntimeException("Illegal index");
        return rows[i].get(j);
    }

    // return the number of nonzero entries (not the most efficient implementation)
    public int nnz() { 
        int sum = 0;
        for (int i = 0; i < N; i++)
            sum += rows[i].nnz();
        return sum;
    }

    // return the matrix-vector product b = Ax
    public SparseVector times(SparseVector x) {
        SparseMatrix A = this;
        if (N != x.size()) throw new RuntimeException("Dimensions disagree");
        SparseVector b = new SparseVector(N);
        for (int i = 0; i < N; i++)
            b.put(i, A.rows[i].dot(x));
        return b;
    }

    // return C = A + B
    public SparseMatrix plus(SparseMatrix B) {
        SparseMatrix A = this;
        if (A.N != B.N) throw new RuntimeException("Dimensions disagree");
        SparseMatrix C = new SparseMatrix(N);
        for (int i = 0; i < N; i++)
            C.rows[i] = A.rows[i].plus(B.rows[i]);
        return C;
    }
}

http://introcs.cs.princeton.edu/java/44st/SparseVector.java.html

public class SparseVector {
    private final int N;             // length
    private ST<Integer, Double> st;  // the vector, represented by index-value pairs

    // initialize the all 0s vector of length N
    public SparseVector(int N) {
        this.N  = N;
        this.st = new ST<Integer, Double>();
    }

    // put st[i] = value
    public void put(int i, double value) {
        if (i < 0 || i >= N) throw new RuntimeException("Illegal index");
        if (value == 0.0) st.delete(i);
        else              st.put(i, value);
    }

    // return st[i]
    public double get(int i) {
        if (i < 0 || i >= N) throw new RuntimeException("Illegal index");
        if (st.contains(i)) return st.get(i);
        else                return 0.0;
    }

    // return the number of nonzero entries
    public int nnz() {
        return st.size();
    }

    // return the size of the vector
    public int size() {
        return N;
    }

    // return the dot product of this vector a with b
    public double dot(SparseVector b) {
        SparseVector a = this;
        if (a.N != b.N) throw new RuntimeException("Vector lengths disagree");
        double sum = 0.0;

        // iterate over the vector with the fewest nonzeros
        if (a.st.size() <= b.st.size()) {
            for (int i : a.st)
                if (b.st.contains(i)) sum += a.get(i) * b.get(i);
        }
        else  {
            for (int i : b.st)
                if (a.st.contains(i)) sum += a.get(i) * b.get(i);
        }
        return sum;
    }

    // return the 2-norm
    public double norm() {
        SparseVector a = this;
        return Math.sqrt(a.dot(a));
    }

    // return alpha * a
    public SparseVector scale(double alpha) {
        SparseVector a = this;
        SparseVector c = new SparseVector(N);
        for (int i : a.st) c.put(i, alpha * a.get(i));
        return c;
    }

    // return a + b
    public SparseVector plus(SparseVector b) {
        SparseVector a = this;
        if (a.N != b.N) throw new RuntimeException("Vector lengths disagree");
        SparseVector c = new SparseVector(N);
        for (int i : a.st) c.put(i, a.get(i));                // c = a
        for (int i : b.st) c.put(i, b.get(i) + c.get(i));     // c = c + b
        return c;
    }
}

http://introcs.cs.princeton.edu/java/44st/ST.java.html
// we can use hashmap -> no need to sort by key(j).

public class ST<Key extends Comparable<Key>, Value> implements Iterable<Key> {

    private TreeMap<Key, Value> st;

    /**
     * Initializes an empty symbol table.
     */
    public ST() {
        st = new TreeMap<Key, Value>();
    }


    /**
     * Returns the value associated with the given key.
     * @param key the key
     * @return the value associated with the given key if the key is in the symbol table
     *     and <tt>null</tt> if the key is not in the symbol table
     * @throws NullPointerException if <tt>key</tt> is <tt>null</tt>
     */
    public Value get(Key key) {
        if (key == null) throw new NullPointerException("called get() with null key");
        return st.get(key);
    }

    /**
     * Inserts the key-value pair into the symbol table, overwriting the old value
     * with the new value if the key is already in the symbol table.
     * If the value is <tt>null</tt>, this effectively deletes the key from the symbol table.
     * @param key the key
     * @param val the value
     * @throws NullPointerException if <tt>key</tt> is <tt>null</tt>
     */
    public void put(Key key, Value val) {
        if (key == null) throw new NullPointerException("called put() with null key");
        if (val == null) st.remove(key);
        else             st.put(key, val);
    }

    /**
     * Removes the key and associated value from the symbol table
     * (if the key is in the symbol table).
     * @param key the key
     * @throws NullPointerException if <tt>key</tt> is <tt>null</tt>
     */
    public void delete(Key key) {
        if (key == null) throw new NullPointerException("called delete() with null key");
        st.remove(key);
    }

    /**
     * Does this symbol table contain the given key?
     * @param key the key
     * @return <tt>true</tt> if this symbol table contains <tt>key</tt> and
     *     <tt>false</tt> otherwise
     * @throws NullPointerException if <tt>key</tt> is <tt>null</tt>
     */
    public boolean contains(Key key) {
        if (key == null) throw new NullPointerException("called contains() with null key");
        return st.containsKey(key);
    }

    /**
     * Returns the number of key-value pairs in this symbol table.
     * @return the number of key-value pairs in this symbol table
     */
    public int size() {
        return st.size();
    }

    /**
     * Is this symbol table empty?
     * @return <tt>true</tt> if this symbol table is empty and <tt>false</tt> otherwise
     */
    public boolean isEmpty() {
        return size() == 0;
    }

    /**
     * Returns all keys in the symbol table as an <tt>Iterable</tt>.
     * To iterate over all of the keys in the symbol table named <tt>st</tt>,
     * use the foreach notation: <tt>for (Key key : st.keys())</tt>.
     * @return all keys in the sybol table as an <tt>Iterable</tt>
     */
    public Iterable<Key> keys() {
        return st.keySet();
    }

    /**
     * Returns all of the keys in the symbol table as an iterator.
     * To iterate over all of the keys in a symbol table named <tt>st</tt>, use the
     * foreach notation: <tt>for (Key key : st)</tt>.
     * @return an iterator to all of the keys in the symbol table
     */
    public Iterator<Key> iterator() {
        return st.keySet().iterator();
    }

    /**
     * Returns the smallest key in the symbol table.
     * @return the smallest key in the symbol table
     * @throws NoSuchElementException if the symbol table is empty
     */
    public Key min() {
        if (isEmpty()) throw new NoSuchElementException("called min() with empty symbol table");
        return st.firstKey();
    }

    /**
     * Returns the largest key in the symbol table.
     * @return the largest key in the symbol table
     * @throws NoSuchElementException if the symbol table is empty
     */
    public Key max() {
        if (isEmpty()) throw new NoSuchElementException("called max() with empty symbol table");
        return st.lastKey();
    }

    /**
     * Returns the smallest key in the symbol table greater than or equal to <tt>key</tt>.
     * @return the smallest key in the symbol table greater than or equal to <tt>key</tt>
     * @param key the key
     * @throws NoSuchElementException if the symbol table is empty
     * @throws NullPointerException if <tt>key</tt> is <tt>null</tt>
     */
    public Key ceil(Key key) {
        if (key == null) throw new NullPointerException("called ceil() with null key");
        SortedMap<Key, Value> tail = st.tailMap(key);
        if (tail.isEmpty()) throw new NoSuchElementException();
        return tail.firstKey();
    }

    /**
     * Returns the largest key in the symbol table less than or equal to <tt>key</tt>.
     * @return the largest key in the symbol table less than or equal to <tt>key</tt>
     * @param key the key
     * @throws NoSuchElementException if the symbol table is empty
     * @throws NullPointerException if <tt>key</tt> is <tt>null</tt>
     */
    public Key floor(Key key) {
        if (key == null) throw new NullPointerException("called floor() with null key");
        // headMap does not include key if present (!)
        if (st.containsKey(key)) return key;
        SortedMap<Key, Value> head = st.headMap(key);
        if (head.isEmpty()) throw new NoSuchElementException();
        return head.lastKey();
    }
}


    public SortedMap<K,V> headMap(K toKey) {

        return headMap(toKey, false);

    }

    public SortedMap<K,V> tailMap(K fromKey) {

        return tailMap(fromKey, true);

    }

java.util.TreeMap.headMap(K)

Returns a view of the portion of this map whose keys are less than (or equal to, if inclusive is true) toKey. The returned map is backed by this map, so changes in the returned map are reflected in this map, and vice-versa. The returned map supports all optional map operations that this map supports.

java.util.SortedMap<K, V>

Returns a view of the portion of this map whose keys are greater than (or equal to, if inclusive is true) fromKey. The returned map is backed by this map, so changes in the returned map are reflected in this map, and vice-versa. The returned map supports all optional map operations that this map supports.

http://www.sanfoundry.com/java-program-implement-sparse-matrix/
public class SparseMatrix
{
private int N;
private SparseArray sparsearray[];
}
public class SparseArray
{
private List start;
private int index; // length
}

class List
{
private int index;
private Object value;
private List nextindex;
}

http://www.sanfoundry.com/java-program-check-if-sparse-matrix/

http://yuanhsh.iteye.com/blog/2186422

What is a memory-efficient way to store a vector of integers? Follow-up question: using your proposed data structure, find an algorithm with constant memory usage to calculate the dot product of two vectors.

有两个很大的稀疏向量，问怎么存储和算他们的dot product. 只存储非零元素和他的index
，如果压缩后的向量大小为m,n, O(m+n)和O(mlogn)方法都不难想到。他问有没有更好
，提示divide and conquer，我就说先取一个向量的中间元素，然后搜索他在另一个向
量中对应元素的位置，这样就把两个矩阵都分别分为两半。他问复杂度，我说我要算一
下才知道，然后他说他也不知道，不过平均情况应该比前面的好。

https://gist.github.com/pedrofurtado/0a29fadeb098adc582df
class SparseMatrixNode {

public int key;

public int x;

public int y;

public SparseMatrixNode next = null;

SparseMatrixNode(int key, int x, int y) {

this.key = key;
this.x = x;
this.y = y;
}
}

Design a data structure for sparse matrix + sparse vector

Labels

Popular Posts