Massive Algorithms: Count set bits in an integer

Count set bits in an integer | GeeksforGeeks

Brian Kernighan’s Algorithm:
Subtraction of 1 from a number toggles all the bits (from right to left) till the rightmost set bit(including the righmost set bit). So if we subtract a number by 1 and do bitwise & with itself (n & (n-1)), we unset the righmost set bit. If we do n & (n-1) in a loop and count the no of times loop executes we get the set bit count.
Beauty of the this solution is number of times it loops is equal to the number of set bits in a given integer.

int countSetBits(int n)

{

unsigned int count = 0;

while (n)

{

n &= (n-1) ;

count++;

}

return count;

}

Simple Method Loop through all bits in an integer, check if a bit is set and if it is then increment the set bit count.

int countSetBits(unsigned int n)

{

  unsigned int count = 0;

  while(n)

  {

    count += n & 1;

    n >>= 1;

  }

  return count;

}

http://brokendreams.iteye.com/blog/2089714
在Hackers Delight里面介绍了很好的技巧。
这种技巧是一种分治的思想，假设我们要求32位数值的bit-1个数，我们只要计算出前16位和后16位的bit-1个数，然后把他们加起来就行了，那么对于16位的bit-1个数的求法，我们采取同样的方式...直到我们计算2位bit-1个数的时候，我们只要把两个1位bit的值加起来就好了，这相当于是子问题的基本情况，递归的出口。
下面的图展示了这个过程：

public static int bitCount1(int n){
n = (n & 0x55555555) + ((n >>> 1) & 0x55555555);
n = (n & 0x33333333) + ((n >>> 2) & 0x33333333);
n = (n & 0x0f0f0f0f) + ((n >>> 4) & 0x0f0f0f0f);
n = (n & 0x00ff00ff) + ((n >>> 8) & 0x00ff00ff);
n = (n & 0x0000ffff) + ((n >>> 16) & 0x0000ffff);
return n;
}

https://yq.aliyun.com/articles/645729

三、查表法

空间换时间，是算法优化中最常见的手段，如果有相对充裕的内存，可以有更快的算法。

思路：一个uint32的正整数n，一旦n的值确定，n的二进制表示中包含多少个1也就确定了，理论上无需重新计算：

1的二进制表示中包含1个1

2的二进制表示中包含1个1

3的二进制表示中包含2个1

…

58585858的二进制表示中包含15个1

...

提前计算好结果数组：

result[1]=1;

result[2]=1;

result[3]=2;

…

result[58585858]=15;

…

伪代码：

return result[n];

查表法的好处是，时间复杂度为O(1)，潜在的问题是，需要很大的内存。

内存分析：

假如被分析的整数是uint32，打表数组需要记录2^32个正整数的结果。

n的二进制表示最多包含32个1，存储结果的计数，使用5个bit即可。

故，共需要内存2^32 * 5bit = 2.5GB。

画外音：5个bit，能表示00000-11111这32个数。帮忙看下，算错了没有，上一篇文章bit和Byte算错了8倍。

四、二次查表法

查表法，非常快，只查询一次，但消耗内存太大，在工程中几乎不被使用。

算法设计，本身是一个时间复杂度与空间复杂度的折衷，增加计算次数，往往能够减少存储空间。

思路：

（1）把uint32的正整数n，分解为低16位正整数n1，和高16正整数n2；

（2）n1查一次表，其二进制表示包含a个1；

（3）n2查一次表，其二进制表示包含b个1；

（4）则，n的二进制表示包含a+b个1；

伪代码：

uint16 n1 = n & 0xFFFF;

uint16 n2 = (n>>16) & 0xFFFF;

return result[n1]+result[n2];

问题来了：增加了一倍的计算量（1次查表变2次查表），内存空间是不是对应减少一半呢？

内存分析：

被分析的整数变成uint16，打表数组需要记录2^16个正整数的结果。

n1和n2的二进制表示最多包含16个1，存储结果的计数，使用4个bit即可。

故，共需要内存2^16 * 4bit = 32KB。

画外音：帮忙看下，算错了没有。

计算量多了1次（1倍），内存占用量却由2.5G降到了32K（1万多倍），是不是很有意思？

http://cs-fundamentals.com/tech-interview/c/c-program-to-count-number-of-ones-in-unsigned-integer.php
Counting Set Bits by Lookup Table

int countSetBits(unsigned int n)
{
  unsigned char lookup_t[256] = {0}; 
  int i, count = 0;
 
  // generate the lookup table
  for (i = 0; i < 256; i++)
  {
    lookup_t[i] = (i & 1) + lookup_t[i / 2];
  }
 
  /* Get sie of int program by size of operator. 
     It makes the solution consistent for different word sizes */
  for (i = 1; i <= sizeof(int) && n; i++)
  {
    count += lookup_t[n & 255];
    n >>= 8;
  }
 
  return count;
}

static const unsigned char lookup_t[256] =
{
#   define B2(n) n,     n+1,     n+1,     n+2
#   define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
#   define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)
    B6(0), B6(1), B6(1), B6(2)
};

Above table lookup_t uses macro definitions to form the table by patterns. They are recursive macros, B6 uses B4 and B4 uses B2. B6(0) will get expanded into B4(0), B4(1), B4(1), B4(2). Then B4(0) will get expanded into 0, 1, 1, 2.

Count Set Bits by Divide and Conquer Strategy
By divide and conquer technique of counting set bits we have to first set each 2-bit field equal to the sum of the two single bits that were originally in the field, and then sum adjacent 2-bit fields, putting the results in each 4-bit field, and so on. The method is illustrated in Figure 1, in which the first row shows a computer word whose 1-bits are to be summed, and the last row shows the result (decimal 6).
http://codingforspeed.com/a-faster-approach-to-count-set-bits-in-a-32-bit-integer/
The first line counts the number of ones every two bits and store them using two bits.

For example, 01101100 -> 01011000. The original can be grouped to 01 10 11 00, the number of ones is 1, 1, 2, 0 then written in binary is 01011000.

The second counts the number of ones every four bits and store them using four bits.
The number we obtained after the first operation is 01011000, then we group them into 0101, 1000 and add neighbour two bits, 01+01=10, 10+00=10, and represent them using four bits, which is 0010 0010.
And this goes on every 8, 16, 32 bits.

Counting set bits by divide and conquer strategy

Counting set bits by divide and conquer strategy

int countSetBits(unsigned int n)
{
  n = n - ((n >> 1) & 0x55555555);
  n = (n & 0x33333333) + ((n >> 2) & 0x33333333);
  n = (n + (n >> 4)) & 0x0F0F0F0F;
  n = n + (n >> 8);
  n = n + (n >> 16);
  return n & 0x0000003F;
}

In the above mentioned solution of counting set bits in a binary string, the ultimate small problems (summing adjacent bits) can all be done in parallel, and combining adjacent sums can also be done in parallel in a fixed number of steps at each stage. The result is an algorithm that can be executed in log₂⁽³²⁾ = 5 steps.

https://compprog.wordpress.com/2007/11/06/binary-numbers-counting-bits/
Read full article from Count set bits in an integer | GeeksforGeeks

Count set bits in an integer | GeeksforGeeks

Labels

Popular Posts