You Won’t Believe This One Weird CPU Instruction!

Vaibhav Sagar (@vbhvsgr)

popcount?

Population Count

Counts the number of bits set to 1 in a machine word.

popcount

popcount(00100110) == 3 popcount(01100000) == 2

That’s it, that’s all it does.

That doesn’t seem very useful??

CPUs with popcount

  • 1960: IBM STRETCH
  • 1964: CDC 6000
  • 1975: Cray
  • 2005: SPARC
  • 2005: ARM NEON
  • 2007: AMD K10
  • 2008: Intel Core

What’s going on?

Cryptography

The NSA Instruction

source

The NSA Instruction

It was almost a tradition that one of the first of any new faster CDC machine was delivered to a “good customer” - picked up at the factory by an anonymous truck, and never heard from again.

source

Hamming weight

The number of symbols that are different from the zero-symbol of the alphabet.

Cryptanalysis

  1. You have 60-bit words, enough to handle most alphabets
  2. Set a bit for each character in each line of the input
  3. Count distinct characters with popcount
  4. Use the count as a hash, or to do further analysis
  5. ???
  6. Profit!

source

popcount disappears from CPUs: a conspiracy???

Error correction

Hamming distance

The number of differing positions between two strings of identical length.

Hamming distance

For binary strings: popcount(x ^ y)

00100110
01100000
--------
01000110

popcount(01000110) = 3

Signal distance

Send some data over the wire and compare it at both ends to get an estimate of the error rate.

Error Correcting Codes

Based on your error correction requirements, choose sufficiently distant symbols to assemble your message out of!

Binary convolutional neural networks

Binary

Uses +1 and -1 (coded as 0) instead of 32-bit floating point numbers.

Convolutional

Matrix multiplication?

Neural Network

¯\(ツ)

Why?

Simpler and smaller, suitable for less powerful devices like mobile phones

Matrix multiplication


a = xnor(x, y)

b = popcount(a)

c = len(a)

dot(x, y) = 2 × b − c

source

Other fun uses of popcount

Chess Engines

Molecular Fingerprinting

Hash Array Mapped Tries

Succinct Data Structures

Compiler optimisations

GCC, Clang

Thanks!

Slides

https://vaibhavsagar.com/presentations/popcount