B-Tree | Set 1 (Introduction) - GeeksforGeeks
The main idea of using B-Trees is to reduce the number of disk accesses. Most of the tree operations (search, insert, delete, max, min, ..etc ) require O(h) disk accesses where h is height of the tree. B-tree is a fat tree. Height of B-Trees is kept low by putting maximum possible keys in a B-Tree node. Generally, a B-Tree node size is kept equal to the disk block size.
Properties of B-Tree
1) All leaves are at same level.
2) A B-Tree is defined by the term minimum degree ‘t’. The value of t depends upon disk block size.
3) Every node except root must contain at least t-1 keys. Root may contain minimum 1 key.
4) All nodes (including root) may contain at most 2t – 1 keys.
5) Number of children of a node is equal to the number of keys in it plus 1.
6) All keys of a node are sorted in increasing order. The child between two keys k1 and k2 contains all keys in range from k1 and k2.
7) B-Tree grows and shrinks from root which is unlike Binary Search Tree. Binary Search Trees grow downward and also shrink from downward.
8) Like other balanced Binary Search Trees, time complexity to search, insert and delete is O(Logn).
http://www.geeksforgeeks.org/b-tree-set-1-insert-2/
A new key is always inserted at leaf node. Let the key to be inserted be k. Like BST, we start from root and traverse down till we reach a leaf node. Once we reach a leaf node, we insert the key in that leaf node.
http://www.geeksforgeeks.org/b-tree-set-3delete/
http://algs4.cs.princeton.edu/62btrees/BTree.java
Read full article from B-Tree | Set 1 (Introduction) - GeeksforGeeks
The main idea of using B-Trees is to reduce the number of disk accesses. Most of the tree operations (search, insert, delete, max, min, ..etc ) require O(h) disk accesses where h is height of the tree. B-tree is a fat tree. Height of B-Trees is kept low by putting maximum possible keys in a B-Tree node. Generally, a B-Tree node size is kept equal to the disk block size.
Properties of B-Tree
1) All leaves are at same level.
2) A B-Tree is defined by the term minimum degree ‘t’. The value of t depends upon disk block size.
3) Every node except root must contain at least t-1 keys. Root may contain minimum 1 key.
4) All nodes (including root) may contain at most 2t – 1 keys.
5) Number of children of a node is equal to the number of keys in it plus 1.
6) All keys of a node are sorted in increasing order. The child between two keys k1 and k2 contains all keys in range from k1 and k2.
7) B-Tree grows and shrinks from root which is unlike Binary Search Tree. Binary Search Trees grow downward and also shrink from downward.
8) Like other balanced Binary Search Trees, time complexity to search, insert and delete is O(Logn).
class
BTreeNode
{
int
*keys;
// An array of keys
int
t;
// Minimum degree (defines the range for number of keys)
BTreeNode **C;
// An array of child pointers
int
n;
// Current number of keys
bool
leaf;
// Is true when node is leaf. Otherwise false
};
BTreeNode::BTreeNode(
int
_t,
bool
_leaf)
{
// Copy the given minimum degree and leaf property
t = _t;
leaf = _leaf;
// Allocate memory for maximum number of possible keys
// and child pointers
keys =
new
int
[2*t-1];
C =
new
BTreeNode *[2*t];
// Initialize the number of keys as 0
n = 0;
}
void
BTreeNode::traverse()
{
// There are n keys and n+1 children, travers through n keys
// and first n children
int
i;
for
(i = 0; i < n; i++)
{
// If this is not leaf, then before printing key[i],
// traverse the subtree rooted with child C[i].
if
(leaf ==
false
)
C[i]->traverse();
cout <<
" "
<< keys[i];
}
// Print the subtree rooted with last child
if
(leaf ==
false
)
C[i]->traverse();
}
// Function to search key k in subtree rooted with this node
BTreeNode *BTreeNode::search(
int
k)
{
// Find the first key greater than or equal to k
int
i = 0;
while
(i < n && k > keys[i])
i++;
// If the found key is equal to k, return this node
if
(keys[i] == k)
return
this
;
// If key is not found here and this is a leaf node
if
(leaf ==
true
)
return
NULL;
// Go to the appropriate child
return
C[i]->search(k);
}
A new key is always inserted at leaf node. Let the key to be inserted be k. Like BST, we start from root and traverse down till we reach a leaf node. Once we reach a leaf node, we insert the key in that leaf node.
child y of x is being split into two nodes y and z. Note that the splitChild operation moves a key up and this is the reason B-Trees grow up unlike BSTs which grow down.
Insertion
1) Initialize x as root.
2) While x is not leaf, do following
..a) Find the child of x that is going to to be traversed next. Let the child be y.
..b) If y is not full, change x to point to y.
..c) If y is full, split it and change x to point to one of the two parts of y. If k is smaller than mid key in y, then set x as first part of y. Else second part of y. When we split y, we move a key from y to its parent x.
3) The loop in step 2 stops when x is leaf. x must have space for 1 extra key as we have been splitting all nodes in advance. So simply insert k to x.
1) Initialize x as root.
2) While x is not leaf, do following
..a) Find the child of x that is going to to be traversed next. Let the child be y.
..b) If y is not full, change x to point to y.
..c) If y is full, split it and change x to point to one of the two parts of y. If k is smaller than mid key in y, then set x as first part of y. Else second part of y. When we split y, we move a key from y to its parent x.
3) The loop in step 2 stops when x is leaf. x must have space for 1 extra key as we have been splitting all nodes in advance. So simply insert k to x.
Note that the algorithm follows the Cormen book. It is actually a proactive insertion algorithm where before going down to a node, we split it if it is full. The advantage of splitting before is, we never traverse a node twice. If we don’t split a node before going down to it and split it only if new key is inserted (reactive), we may end up traversing all nodes again from leaf to root. This happens in cases when all nodes on the path from root to leaf are full. So when we come to the leaf node, we split it and move a key up. Moving a key up will cause a split in parent node (because parent was already full). This cascading effect never happens in this proactive insertion algorithm. There is a disadvantage of this proactive insertion though, we may do unnecessary splits.
// A utility function to insert a new key in this node
// The assumption is, the node must be non-full when this
// function is called
void
BTreeNode::insertNonFull(
int
k)
{
// Initialize index as index of rightmost element
int
i = n-1;
// If this is a leaf node
if
(leaf ==
true
)
{
// The following loop does two things
// a) Finds the location of new key to be inserted
// b) Moves all greater keys to one place ahead
while
(i >= 0 && keys[i] > k)
{
keys[i+1] = keys[i];
i--;
}
// Insert the new key at found location
keys[i+1] = k;
n = n+1;
}
else
// If this node is not leaf
{
// Find the child which is going to have the new key
while
(i >= 0 && keys[i] > k)
i--;
// See if the found child is full
if
(C[i+1]->n == 2*t-1)
{
// If the child is full, then split it
splitChild(i+1, C[i+1]);
// After split, the middle key of C[i] goes up and
// C[i] is splitted into two. See which of the two
// is going to have the new key
if
(keys[i+1] < k)
i++;
}
C[i+1]->insertNonFull(k);
}
}
// A utility function to split the child y of this node
// Note that y must be full when this function is called
void
BTreeNode::splitChild(
int
i, BTreeNode *y)
{
// Create a new node which is going to store (t-1) keys
// of y
BTreeNode *z =
new
BTreeNode(y->t, y->leaf);
z->n = t - 1;
// Copy the last (t-1) keys of y to z
for
(
int
j = 0; j < t-1; j++)
z->keys[j] = y->keys[j+t];
// Copy the last t children of y to z
if
(y->leaf ==
false
)
{
for
(
int
j = 0; j < t; j++)
z->C[j] = y->C[j+t];
}
// Reduce the number of keys in y
y->n = t - 1;
// Since this node is going to have a new child,
// create space of new child
for
(
int
j = n; j >= i+1; j--)
C[j+1] = C[j];
// Link the new child to this node
C[i+1] = z;
// A key of y will move to this node. Find location of
// new key and move all greater keys one space ahead
for
(
int
j = n-1; j >= i; j--)
keys[j+1] = keys[j];
// Copy the middle key of y to this node
keys[i] = y->keys[t-1];
// Increment count of keys in this node
n = n + 1;
}
http://algs4.cs.princeton.edu/62btrees/BTree.java
Read full article from B-Tree | Set 1 (Introduction) - GeeksforGeeks