Massive Algorithms: B-Tree | Set 1 (Introduction)

B-Tree | Set 1 (Introduction) - GeeksforGeeks

The main idea behind B-Trees is to reduce the number of disk accesses. Most tree operations (search, insert, delete, max, min, etc.) require O(h) disk accesses, where h is the height of the tree. A B-Tree is a "fat" tree: its height is kept low by packing as many keys as possible into each node. Generally, the node size is chosen to equal the disk block size.
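
As a rough illustration of how node size can be tied to the disk block size, the sketch below picks the largest minimum degree t for which a full node (2t-1 keys and 2t child pointers) still fits in one block. The helper name `maxMinDegree` and the byte sizes passed to it are assumptions made for this example; a real on-disk layout would also need room for per-node metadata.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the article): largest t such that
//   (2t - 1) * keyBytes + 2t * pointerBytes <= blockBytes.
std::size_t maxMinDegree(std::size_t blockBytes,
                         std::size_t keyBytes,
                         std::size_t pointerBytes)
{
    assert(keyBytes > 0 && pointerBytes > 0);
    // Rearranging the inequality gives
    //   t <= (blockBytes + keyBytes) / (2 * (keyBytes + pointerBytes)).
    return (blockBytes + keyBytes) / (2 * (keyBytes + pointerBytes));
}
```

Under these assumptions, a 4096-byte block with 4-byte keys and 8-byte child pointers gives t = 170, so one node could hold up to 339 keys. A result below 2 would mean the block is too small for a B-Tree node.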

Properties of B-Tree
1) All leaves are at the same level.
2) A B-Tree is defined by its minimum degree ‘t’. The value of t depends on the disk block size.
3) Every node except the root must contain at least t-1 keys. The root of a non-empty tree must contain at least 1 key.
4) All nodes (including the root) may contain at most 2t-1 keys.
5) The number of children of an internal node is equal to the number of keys in it plus 1.
6) All keys of a node are sorted in increasing order. The child between two keys k1 and k2 contains all keys in the range from k1 to k2.
7) A B-Tree grows and shrinks at the root, unlike a Binary Search Tree, which grows and shrinks at the bottom.
8) As with balanced Binary Search Trees, the time complexity of search, insert and delete is O(log n).
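
The structural properties above (1, 3, 4, 5 and 6) can be checked mechanically. The following is a minimal sketch of such a checker, not part of the original article: the `Node` type, `checkSubtree` and `isValidBTree` are illustrative names, the node uses `std::vector` rather than the raw arrays used later in this post, and keys are assumed to be distinct.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative node type (an assumption for this sketch).
struct Node {
    std::vector<int> keys;
    std::vector<Node*> children;  // empty for a leaf
};

// Returns the height of the subtree if it satisfies the properties,
// or -1 if any property is violated. lo/hi are exclusive key bounds
// inherited from the parent (nullptr means unbounded).
int checkSubtree(const Node* node, std::size_t t, bool isRoot,
                 const int* lo, const int* hi)
{
    std::size_t n = node->keys.size();
    std::size_t minKeys = isRoot ? 1 : t - 1;
    if (n < minKeys || n > 2 * t - 1) return -1;             // properties 3, 4
    for (std::size_t i = 0; i < n; i++) {
        if (i > 0 && node->keys[i - 1] >= node->keys[i]) return -1;  // sorted
        if (lo && node->keys[i] <= *lo) return -1;           // key range (6)
        if (hi && node->keys[i] >= *hi) return -1;
    }
    if (node->children.empty()) return 0;                    // leaf
    if (node->children.size() != n + 1) return -1;           // property 5
    int height = -2;                                         // -2: not set yet
    for (std::size_t i = 0; i <= n; i++) {
        const int* childLo = (i == 0) ? lo : &node->keys[i - 1];
        const int* childHi = (i == n) ? hi : &node->keys[i];
        int h = checkSubtree(node->children[i], t, false, childLo, childHi);
        if (h < 0) return -1;
        if (height == -2) height = h;
        else if (h != height) return -1;                     // property 1
    }
    return height + 1;
}

bool isValidBTree(const Node* root, std::size_t t)
{
    assert(t >= 2);
    return root == nullptr ||
           checkSubtree(root, t, true, nullptr, nullptr) >= 0;
}
```

A checker like this is mainly useful in tests, for example by calling it after every insertion to confirm that the tree is still well formed.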

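#include <iostream>

class BTreeNode
{
    int *keys;     // An array of keys
    int t;         // Minimum degree (defines the range for number of keys)
    BTreeNode **C; // An array of child pointers
    int n;         // Current number of keys
    bool leaf;     // True when the node is a leaf, false otherwise
public:
    BTreeNode(int _t, bool _leaf);        // Constructor
    void traverse();                      // Print all keys in this subtree
    BTreeNode *search(int k);             // Search for key k in this subtree
    void insertNonFull(int k);            // Insert k; node must not be full
    void splitChild(int i, BTreeNode *y); // Split the full child y = C[i]
};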

BTreeNode::BTreeNode(int _t, bool _leaf)
{
    // Copy the given minimum degree and leaf property
    t = _t;
    leaf = _leaf;

    // Allocate memory for the maximum number of possible keys
    // and child pointers
    keys = new int[2*t-1];
    C = new BTreeNode *[2*t];

    // Initialize the number of keys as 0
    n = 0;
}

void BTreeNode::traverse()
{
    // There are n keys and n+1 children; traverse through the n keys
    // and the first n children
    int i;
    for (i = 0; i < n; i++)
    {
        // If this is not a leaf, then before printing keys[i],
        // traverse the subtree rooted at child C[i].
        if (leaf == false)
            C[i]->traverse();
        std::cout << " " << keys[i];
    }

    // Print the subtree rooted at the last child
    if (leaf == false)
        C[i]->traverse();
}

// Function to search for key k in the subtree rooted at this node
BTreeNode *BTreeNode::search(int k)
{
    // Find the first key greater than or equal to k
    int i = 0;
    while (i < n && k > keys[i])
        i++;

    // If the found key is equal to k, return this node.
    // The i < n check avoids reading past the last key.
    if (i < n && keys[i] == k)
        return this;

    // If the key is not found here and this is a leaf node
    if (leaf == true)
        return nullptr;

    // Go to the appropriate child
    return C[i]->search(k);
}
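
Since a node may hold hundreds of keys, the linear scan inside a node can be swapped for a binary search. This is a variation on the search code above, not what the original article does. The sketch uses `std::lower_bound` from the standard library; the helper name `findSlot` is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: index of the first key >= k among keys[0..n-1],
// or n if every key is smaller than k. The keys must be sorted.
int findSlot(const int* keys, int n, int k)
{
    assert(n >= 0);
    return static_cast<int>(std::lower_bound(keys, keys + n, k) - keys);
}
```

The returned index plays the same role as `i` after the `while` loop in `search`: if it is less than n and the key at that position equals k, the key is found; otherwise it is the index of the child to descend into. The number of disk accesses stays O(h) either way; only the in-memory work per node changes.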

http://www.geeksforgeeks.org/b-tree-set-1-insert-2/
A new key is always inserted at a leaf node. Let the key to be inserted be k. As in a BST, we start from the root and traverse down until we reach a leaf node. Once we reach a leaf node, we insert the key into that leaf.

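Because a node can hold at most 2t-1 keys, a full node must be split before a key can be added to it. In the splitChild operation, a full child y of x is split into two nodes, y and z, and the middle key of y moves up into x. Because splitChild moves a key up, B-Trees grow upward, unlike BSTs, which grow downward.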
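
To make the split concrete, here is a small sketch that applies it to the key array only. The `SplitResult` type and `splitKeys` function are illustrative names, not part of the original code, and `std::vector` is used for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// What a split produces from the 2t-1 keys of a full node.
struct SplitResult {
    std::vector<int> left;   // first t-1 keys, stay in y
    int median;              // middle key, moves up into the parent x
    std::vector<int> right;  // last t-1 keys, move to the new node z
};

// Hypothetical helper: splits the sorted keys of a full node.
SplitResult splitKeys(const std::vector<int>& keys, std::size_t t)
{
    assert(t >= 2 && keys.size() == 2 * t - 1);
    SplitResult r;
    r.left.assign(keys.begin(), keys.begin() + (t - 1));
    r.median = keys[t - 1];
    r.right.assign(keys.begin() + t, keys.end());
    return r;
}
```

For t = 3 and a full node holding 1, 2, 3, 4, 5, the left part is 1, 2, the median 3 moves up, and the right part is 4, 5. Both halves end up with exactly t-1 keys, the minimum allowed for a non-root node.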

Insertion
1) Initialize x as root. If the root itself is full, split it first; this creates a new root and increases the height of the tree by one.
2) While x is not a leaf, do the following:
..a) Find the child of x that is going to be traversed next. Let the child be y.
..b) If y is not full, change x to point to y.
..c) If y is full, split it and change x to point to one of the two parts of y. If k is smaller than the middle key of y, set x to the first part of y; otherwise set x to the second part. When we split y, we move a key from y to its parent x.
3) The loop in step 2 stops when x is a leaf. x must have space for one extra key because we have been splitting all full nodes in advance, so simply insert k into x.

Note that this algorithm follows the Cormen book. It is a proactive insertion algorithm: before going down to a node, we split it if it is full. The advantage of splitting in advance is that we never traverse a node twice. If we do not split a node before going down to it, and split it only when a new key is inserted (reactive), we may end up traversing all the nodes again from leaf to root. This happens when all nodes on the path from root to leaf are full: when we reach the leaf, we split it and move a key up, and moving a key up causes a split in the parent node (because the parent was already full), and so on up the path. This cascading effect never happens with proactive insertion. The disadvantage of proactive insertion is that we may do unnecessary splits.

// A utility function to insert a new key in this node.
// The assumption is that the node is non-full when this
// function is called.
void BTreeNode::insertNonFull(int k)
{
    // Initialize index as index of rightmost element
    int i = n-1;

    // If this is a leaf node
    if (leaf == true)
    {
        // The following loop does two things:
        // a) Finds the location of the new key to be inserted
        // b) Moves all greater keys one place ahead
        while (i >= 0 && keys[i] > k)
        {
            keys[i+1] = keys[i];
            i--;
        }

        // Insert the new key at the found location
        keys[i+1] = k;
        n = n+1;
    }
    else // If this node is not a leaf
    {
        // Find the child which is going to have the new key
        while (i >= 0 && keys[i] > k)
            i--;

        // See if the found child is full
        if (C[i+1]->n == 2*t-1)
        {
            // If the child is full, then split it
            splitChild(i+1, C[i+1]);

            // After the split, the middle key of C[i+1] goes up and
            // C[i+1] is split into two. See which of the two
            // is going to have the new key
            if (keys[i+1] < k)
                i++;
        }
        C[i+1]->insertNonFull(k);
    }
}

// A utility function to split the child y of this node.
// Note that y must be full when this function is called.
void BTreeNode::splitChild(int i, BTreeNode *y)
{
    // Create a new node which is going to store the last (t-1) keys
    // of y
    BTreeNode *z = new BTreeNode(y->t, y->leaf);
    z->n = t - 1;

    // Copy the last (t-1) keys of y to z
    for (int j = 0; j < t-1; j++)
        z->keys[j] = y->keys[j+t];

    // Copy the last t children of y to z
    if (y->leaf == false)
    {
        for (int j = 0; j < t; j++)
            z->C[j] = y->C[j+t];
    }

    // Reduce the number of keys in y
    y->n = t - 1;

    // Since this node is going to have a new child,
    // make space for the new child
    for (int j = n; j >= i+1; j--)
        C[j+1] = C[j];

    // Link the new child to this node
    C[i+1] = z;

    // A key of y will move to this node. Find the location of
    // the new key and move all greater keys one place ahead
    for (int j = n-1; j >= i; j--)
        keys[j+1] = keys[j];

    // Copy the middle key of y to this node
    keys[i] = y->keys[t-1];

    // Increment the count of keys in this node
    n = n + 1;
}
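
The two functions above rely on a caller that handles the empty tree and the full-root case. The following self-contained sketch shows one way the pieces might fit together. It is not the article's code: `MiniNode` and `MiniBTree` are hypothetical names, `std::vector` replaces the raw arrays, and deletion is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MiniNode {
    std::vector<int> keys;
    std::vector<MiniNode*> kids;               // empty for a leaf
    bool leaf() const { return kids.empty(); }
};

struct MiniBTree {
    std::size_t t;                             // minimum degree
    MiniNode* root = nullptr;

    explicit MiniBTree(std::size_t degree) : t(degree) { assert(t >= 2); }
    ~MiniBTree() { destroy(root); }
    MiniBTree(const MiniBTree&) = delete;
    MiniBTree& operator=(const MiniBTree&) = delete;

    void insert(int k) {
        if (!root) root = new MiniNode;
        if (root->keys.size() == 2 * t - 1) {  // full root: the tree grows upward
            MiniNode* s = new MiniNode;
            s->kids.push_back(root);
            splitChild(s, 0);
            root = s;
        }
        insertNonFull(root, k);
    }

    // Appends the keys of the subtree rooted at x to out, in sorted order.
    void inorder(const MiniNode* x, std::vector<int>& out) const {
        if (!x) return;
        for (std::size_t i = 0; i < x->keys.size(); i++) {
            if (!x->leaf()) inorder(x->kids[i], out);
            out.push_back(x->keys[i]);
        }
        if (!x->leaf()) inorder(x->kids.back(), out);
    }

private:
    // Splits the full child x->kids[i]; its middle key moves up into x.
    void splitChild(MiniNode* x, std::size_t i) {
        MiniNode* y = x->kids[i];
        MiniNode* z = new MiniNode;
        z->keys.assign(y->keys.begin() + t, y->keys.end());
        if (!y->leaf()) {
            z->kids.assign(y->kids.begin() + t, y->kids.end());
            y->kids.resize(t);
        }
        int median = y->keys[t - 1];
        y->keys.resize(t - 1);
        x->keys.insert(x->keys.begin() + i, median);
        x->kids.insert(x->kids.begin() + i + 1, z);
    }

    // Proactive insertion: a full child is split before descending into it.
    void insertNonFull(MiniNode* x, int k) {
        std::size_t i = 0;
        while (i < x->keys.size() && x->keys[i] < k) i++;
        if (x->leaf()) {
            x->keys.insert(x->keys.begin() + i, k);
            return;
        }
        if (x->kids[i]->keys.size() == 2 * t - 1) {
            splitChild(x, i);
            if (x->keys[i] < k) i++;
        }
        insertNonFull(x->kids[i], k);
    }

    void destroy(MiniNode* x) {
        if (!x) return;
        for (MiniNode* c : x->kids) destroy(c);
        delete x;
    }
};
```

With t = 3, inserting 10, 20, 5, 6, 12, 30, 7, 17 in that order fills the root with five keys, and the sixth insertion splits it, leaving a root that holds only 10 above two leaves. The only place the height changes is in `insert`, when the root is split.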

http://www.geeksforgeeks.org/b-tree-set-3delete/

http://algs4.cs.princeton.edu/62btrees/BTree.java