WHAT IS CRYPTO? PART 2: SECRET KEYS
MATT WEEKS | @SCRIPTJUNKIE1
MAR. 20, 2018 THE FRONT LINE
The last post introduced what crypto is at a high level. Now, let’s get into specifics. We will cover various types of encryption, including block ciphers and one-time pads.
A block cipher is the basic unit of what’s called symmetric, or secret key, cryptography. Given a key and a plaintext block of a set size (for example, 8 or 16 bytes), it produces a ciphertext block, or, given the ciphertext block and the same secret key, decrypts it back to the plaintext. Normally, the key is at least as long as the plaintext block, and the plaintext and ciphertext blocks are the same length.
Each ciphertext block must correspond to exactly one plaintext block (otherwise, whoever is decrypting the data would not know which message was sent). This also means the ciphertext block cannot be smaller than the plaintext block. For example, if the plaintext were 2 bytes, there would be 65,536 different possible messages that could be encrypted, but if the ciphertext were only 1 byte, there would only be 256 possible ciphertexts; not enough to decrypt all the possible messages!
It is important to note that nobody wants a block cipher that makes ciphertext blocks larger than the plaintext blocks. If a block cipher turned every 8-byte plaintext into a 10-byte ciphertext, it would inflate every transfer by 25%! Also, if you tried to use a cipher like that to encrypt your hard drive, in addition to reducing the usable space with its overhead, it would break the alignment of the hard drive sectors: reading a single plaintext sector would require reading two ciphertext sectors to get all the data. And if the size of the ciphertext could vary, you would not know where to start reading to find a given sector unless you read through every preceding sector to find out how large each was. So that is not an option either. Although we will see that there is always at least a little overhead when using cryptographic protocols, it comes as a one-time cost of a fixed size, not overhead that grows linearly with the data.
Clearly, it is important that blocks of ciphertext and plaintext be the same size. With the same number of bytes in each, this means that under any given key, every block of plaintext must encrypt to a different block of ciphertext and every block of ciphertext must decrypt to a different, valid block of plaintext; in other words, for a fixed key the cipher is a one-to-one mapping (a permutation) of blocks.
Every modern block cipher like AES applies extensive rotating, mixing, and flipping of the bits, and some include substitution boxes (s-boxes) as well. This process is run repeatedly, in many rounds; by the time a block cipher is done, hundreds of binary mathematical operations have commonly been applied to each block. Block ciphers have been analyzed to a high degree using advanced math, and adding additional scrambling on top is unlikely to add any security.
If used naively, block ciphers are not sufficient to handle encrypting longer messages. Since the same sequences of bytes frequently appear in plaintext, an attacker would be able to see repeated ciphertext blocks and deduce they came from the same plaintext blocks, allowing frequency analysis attacks. These types of attacks were very effective against earlier codes and ciphers. In a code book, every plaintext word would correspond to a different word in code, while in a simple substitution cipher, every letter would correspond to a different letter. Even though an attacker would not know exactly what one word or letter stood for, when viewing a longer message, patterns would stand out. The same is true with modern block ciphers if they are used like an “electronic code book,” i.e. in “ECB” mode. This problem is called “seeing penguins” thanks to a very common illustration of this effect (image courtesy of Wikipedia).
Block cipher modes of operation solve the repeatability problem and prevent the associated statistical attacks against a naive (ECB) use of a block cipher. Each mode uses a different strategy to mix something unique into each block so that identical plaintext blocks no longer produce identical ciphertext blocks. The common Cipher Block Chaining (CBC) mode will first XOR (see below) each plaintext block with the previous ciphertext block before encrypting (or after decrypting). To avoid repeatability of the first block, most modes mix in an initialization vector, or IV. Some modes effectively turn the block cipher into a stream cipher: a cipher that encrypts any length of plaintext, encrypting or decrypting sequentially from the first byte to the last.
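The ECB-versus-CBC difference can be sketched in a few lines of Python. The “block cipher” below is a toy 4-round Feistel network built from SHA-256 purely so the example is self-contained and runnable; it is an assumption of this sketch and is NOT secure – a real system would use AES. What matters is the chaining: ECB encrypts each block independently, while CBC XORs each plaintext block with the previous ciphertext block (or the IV) first.

```python
import hashlib

def _round(key: bytes, half: bytes, i: int) -> bytes:
    # Keyed round function: 8 bytes of SHA-256(key || round index || half).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel cipher over a 16-byte block (illustration only).
    L, R = block[:8], block[8:]
    for i in range(4):
        L, R = R, _xor(L, _round(key, R, i))
    return L + R

def decrypt_block(key: bytes, block: bytes) -> bytes:
    # Run the Feistel rounds in reverse to invert encrypt_block exactly.
    L, R = block[:8], block[8:]
    for i in reversed(range(4)):
        L, R = _xor(R, _round(key, L, i)), L
    return L + R

def ecb_encrypt(key: bytes, pt: bytes) -> bytes:
    # ECB: every block encrypted independently -- repeats leak through.
    return b"".join(encrypt_block(key, pt[i:i + 16])
                    for i in range(0, len(pt), 16))

def cbc_encrypt(key: bytes, iv: bytes, pt: bytes) -> bytes:
    # CBC: XOR each plaintext block with the previous ciphertext block.
    out, prev = [], iv
    for i in range(0, len(pt), 16):
        prev = encrypt_block(key, _xor(pt[i:i + 16], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, iv: bytes, ct: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(ct), 16):
        block = ct[i:i + 16]
        out.append(_xor(decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)

key = b"just a demo key!"
iv = b"0123456789abcdef"
pt = b"SAME BLOCK HERE!" * 2      # two identical 16-byte plaintext blocks

ecb = ecb_encrypt(key, pt)        # the two ciphertext blocks are identical
cbc = cbc_encrypt(key, iv, pt)    # chaining makes them differ
```

Running this shows the “penguin” problem directly: the two ECB ciphertext blocks are byte-for-byte identical, while the two CBC blocks look unrelated.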
What’s XOR? It stands for “exclusive-or,” a binary operation applied bit-by-bit to two binary inputs, outputting a 1 if exactly one of the two input bits is 1, but not both. In other words, if the two bits are different it outputs a 1, and if they are the same it outputs a 0. It has the interesting property that if you take a plaintext sequence of bits and XOR it with a random (key) sequence of bits of the same length, you will get an output sequence of bits that also looks random. Then if you XOR the output sequence with the key sequence again, you will get back your plaintext sequence of bits.
In fact, one well-known cipher, called the “one-time pad” (OTP), is basically just an XOR: it takes a string of random bits the same length as the plaintext, XORs the plaintext and key together to create the ciphertext, and XORs the ciphertext and key to decrypt. The name comes from the first of two ironclad rules: first, the key must never be re-used (in the old days keys were written on pads of paper, and each sheet was ripped off and used for only one message); second, the key must be truly random. If those two rules are followed, OTP is mathematically “perfect” – an attacker without the key learns nothing about the plaintext from the contents of the ciphertext (only metadata like length and timing are visible). This comes from the fact that an OTP ciphertext 100 bytes long could correspond to literally any plaintext 100 bytes long.
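A minimal sketch of the XOR/OTP mechanics, using Python’s os.urandom as the key source. Note the hedge from the rules above: os.urandom is a cryptographically strong generator, not a source of truly random bits, so strictly speaking this demonstrates the XOR construction rather than a provably perfect OTP. The last two lines illustrate why OTP is “perfect”: for any alternative message of the same length, there exists a key that decrypts the very same ciphertext to it.

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    # Bit-by-bit exclusive-or of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))

plaintext = b"ATTACK AT DAWN"
key = os.urandom(len(plaintext))          # key as long as the message, used once

ciphertext = xor_bytes(plaintext, key)    # encrypt
recovered = xor_bytes(ciphertext, key)    # decrypt: XOR with the key again

# Perfect secrecy in action: some key maps this ciphertext to ANY
# same-length message, so the ciphertext alone reveals nothing.
other = b"DEFEND AT DUSK"
fake_key = xor_bytes(ciphertext, other)   # "decrypts" ciphertext to `other`
```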
OTP is harder to use correctly than it first appears, since most random number generators, even cryptographically strong ones, use only a small truly random seed and then an algorithm to generate the rest of the bits, instead of generating all truly random bits. Most would-be OTP implementers do not realize that this fails the requirements of the security proof: they are simply implementing a traditional stream cipher, not a real OTP. Even if they do generate truly random keys, the method chosen to distribute them nearly always compromises the entire system. In practice, the problems of generating and distributing truly random keys that long make OTP impractical for nearly all uses. It is so bad that, ironically, when someone uses the term “one-time pad” to describe their cryptographic software, it is a strong signal that the system is not well thought out and will have fatal flaws.
Put together, stream ciphers or block ciphers with modes of operation provide protection – if you somehow securely pre-shared your keys – against a passive observer mounting what’s called a ciphertext-only attack. So we say Alice and Bob can communicate securely against Eve the eavesdropper. But a simple stream cipher, or a block cipher with a basic mode, does not protect against Mallory, the man-in-the-middle (MITM). In the real world, nearly anyone who can observe your network traffic can also manipulate it: setting up malicious WiFi hotspots at the coffee shop, pretending to be the router and/or your system on a legitimate network (ARP poisoning or simple spoofing), pretending to be a proxy (WPAD attacks), hacking a router, or using a routing-protocol attack. This means that most attackers can also change the bytes in the ciphertext you try to decrypt. Since each ciphertext block must decrypt to some plaintext block, decrypting the corrupted ciphertext block will “work,” but you will get a correspondingly corrupt data block (and possibly others, depending on the mode).
Mallory most likely has some knowledge of what you are downloading. For example, if a new game comes out that everyone wants and Mallory sees a large download from the game site to you, Mallory knows you are downloading the executable. Then, using her knowledge of the executable, Mallory can corrupt specific blocks or individual bytes to create a security vulnerability or backdoor in it. This would be a MITM attack using known plaintext.
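This malleability is easy to demonstrate with a keystream (XOR) cipher. The keystream below is derived by hashing a key with a counter – an assumption of this sketch, standing in for any stream cipher – and the message and attacker goal are invented examples. Knowing the plaintext but not the key, Mallory XORs the ciphertext with (known plaintext XOR desired plaintext); the receiver’s decryption then “works” and yields Mallory’s message.

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    # Toy keystream: SHA-256 of key || counter, repeated until n bytes exist.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, data: bytes) -> bytes:
    # XOR with the keystream; the same function also decrypts.
    return xor_bytes(data, keystream(key, len(data)))

key = b"shared secret"                      # Mallory never learns this
plaintext = b"PAY ALICE $100"
ciphertext = encrypt(key, plaintext)

# Mallory knows the plaintext, so she XORs in the difference between it
# and the (same-length) message she wants delivered -- no key required.
evil = b"PAY EVE $9,999"
tampered = xor_bytes(ciphertext, xor_bytes(plaintext, evil))
```

Decrypting `tampered` with the real key yields Mallory’s message, not the original – exactly why encryption alone, without authentication, is not enough.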
So, Alice and Bob need to be able to quickly authenticate that they are getting the bytes they intended. This requires adding some variety of message authentication code (MAC) to the ciphertext. One way to solve this problem is with an algorithm called a “Hash-based Message Authentication Code,” or HMAC. In these systems, a hash function runs on a special mixing of the ciphertext with another shared secret to create the HMAC. The HMAC’s size depends on the underlying hash function, such as SHA1 or SHA3 or any of the SHA2 family (SHA256, SHA384, SHA512). Before decrypting the message, the receiver must calculate the HMAC of the ciphertext and shared secret and compare it to the received HMAC to verify the message.
But wait, what are hash functions? Hash functions are one-way functions that output a random-looking sequence of bits of a fixed size (128, 160, or 256 bits are the most common) in a deterministic way, so that the same input bits always produce the same output bits, and yet even a slightly different input generates what looks like a completely different output. By design, given an output sequence of bits, you should not be able to figure out the input except by trying many inputs via brute force. HMAC adds a secret key into the mix so that you cannot create or validate the code unless you also know the secret key. HMACs are nearly as fast as the raw hash, and their output is no larger than the raw hash’s.
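Python’s standard library exposes both pieces directly: hashlib for the raw hash functions and hmac for the keyed construction. The sketch below (inputs are arbitrary examples) shows the determinism and “avalanche” behavior of SHA-256, then computes and verifies an HMAC tag. Note the use of hmac.compare_digest rather than == for the comparison – a constant-time check that avoids timing side channels.

```python
import hashlib
import hmac

# Deterministic: hashing the same input twice gives the same digest.
h1 = hashlib.sha256(b"hello world").hexdigest()
h2 = hashlib.sha256(b"hello world").hexdigest()

# Avalanche: change one character and the digest looks completely unrelated.
h3 = hashlib.sha256(b"hello worle").hexdigest()

# HMAC: mix a shared secret into the hash so only key-holders can
# create or validate the tag over the ciphertext.
secret = b"shared mac key"
ciphertext = b"...some ciphertext bytes..."
tag = hmac.new(secret, ciphertext, hashlib.sha256).digest()

# Receiver recomputes the tag and compares in constant time.
ok = hmac.compare_digest(
    tag, hmac.new(secret, ciphertext, hashlib.sha256).digest())
```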
An alternative to HMACs that is becoming more popular is the family of “authenticated encryption with associated data,” or AEAD, modes of block ciphers. An AEAD combines a mode of operation with an authentication block. It uses the block cipher itself to create a MAC (or “authentication tag,” as the AEAD people like to call it) by encrypting a mixing of the shared secret, various states of the block cipher mode (ciphertext, IV, etc.), and metadata about the message, like its length. “Galois/Counter Mode,” or GCM, is the most popular of these.
A critical error in many amateur cryptographic systems is to simply not include a MAC at all. Another critical error is to only include a hash instead of an HMAC. If you only use a hash, Mallory, who knows the plaintext of one message, will not only be able to corrupt the message, but will also likely be able to calculate a new valid hash for the evil data.
A third common error, though not as fatal as the last two, is to have the MAC verified *after* decryption instead of *before*. Once a key has been exchanged, a message receiver should never do any processing of received data before the MAC is verified. This is for a variety of reasons, but basically it limits the attack surface of the decryption process. Otherwise the attacker can make you perform many different steps in the decryption on maliciously constructed data, which makes side channels much easier to exploit. Though usually not a fully exploitable vulnerability by itself, it increases the risk of exploitation if a vulnerability is found.
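The verify-before-decrypt rule can be made concrete with an encrypt-then-MAC sketch. The function names (seal, open_) and the toy SHA-256-counter keystream standing in for the cipher are assumptions of this example; the point is the receiver’s ordering: check the MAC over the ciphertext first, and only decrypt if it verifies.

```python
import hashlib
import hmac

def _keystream(key: bytes, n: int) -> bytes:
    # Toy keystream (illustration only): SHA-256 of key || counter.
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    # Encrypt-then-MAC: append an HMAC over the CIPHERTEXT, not the plaintext.
    ct = bytes(a ^ b for a, b in zip(plaintext, _keystream(enc_key, len(plaintext))))
    return ct + hmac.new(mac_key, ct, hashlib.sha256).digest()

def open_(enc_key: bytes, mac_key: bytes, message: bytes) -> bytes:
    ct, tag = message[:-32], message[-32:]
    # Step 1: verify the MAC before doing ANY other processing.
    expected = hmac.new(mac_key, ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed; message rejected")
    # Step 2: only now decrypt.
    return bytes(a ^ b for a, b in zip(ct, _keystream(enc_key, len(ct))))
```

With this ordering, a tampered ciphertext is rejected outright and the decryption code never runs on attacker-controlled data.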
Alright, that is a lot about secret key cryptography and block ciphers. Believe it or not, those still only solve the easiest problems of cryptography that had solutions of varying degrees long ago. The harder ones – requiring real mathematical advances to solve – concerned how to get keys from one party to another, especially if they didn’t previously share a secret key in person. The next post will talk about how more advanced crypto can solve those kinds of problems.
ABOUT THE AUTHOR: Matthew Weeks has extensive experience in cyber operations, as well as security research and software development. He currently leads root9B’s research and development arm. Previously, he was the Officer In Charge of the US Air Force’s Intrusion Forensics and Reverse Engineering lab, a lead network defense tactician, and led the creation of the Air Force’s Defensive Counter Cyber forces, tactics, and mission. As a researcher, he has uncovered vulnerabilities found to have affected millions of networks. As a developer, he has placed in the top tier internationally in programming competitions and was the developer behind a significant portion of the Metasploit framework, the world’s most widely used vulnerability assessment suite. His work has been featured on CNN and in numerous national publications.