Really Applied Cryptography
Last Revision: 08-SEPT-2007
Currently Available Information
Lots of information on algorithms, algorithm
implementation, and communication protocols
(i.e. SSL).
Lots of babble on how secure a particular algorithm
is
Not so much information on how to use it, and when
to use it to get work done
My first attempt "Cryptography for Internet and Database
Applications" -- still too long.
Scope: Web-Dev Crypto
This focuses on application-level cryptography aka
"web-dev crypto."
- May not be appropriate for financial cryptography
- May not be appropriate for protocol or hardware use (e.g. SSL, VPNs, etc.)
- May not be appropriate for OS use (storing local passwords)
Crypto Mini-Review
- Hashes
- HMACs
- Secret key block ciphers
- Public key ciphers (brief)
Not included today: random number generation
Applications
- Storing user passwords
- Tamperproof urls/cookies
Hashes (non-cryptographic)
- The "hash" itself is a function that maps a variable
sized input (bytes, strings, etc) to a fixed sized output
(32-bit int)
- Non-unique, lots of inputs will produce the same
hash
- AND, once hashed, the original message is gone
- Output should appear more or less random
- Used in hash tables
Hash Examples
- Dumb: if input has even number of bytes, return 1, else
0
- Dumb: Add up every byte in input, and use result as
hash
- Good: SuperFastHash
- Good: Thomas Wang
These are not cryptographic
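A minimal sketch of the two "dumb" hashes above, to show how easily they collide. The function names and the truncation to 32 bits are assumptions made for illustration.

```python
def parity_hash(data: bytes) -> int:
    """First "dumb" hash: 1 if the input has an even number of bytes, else 0."""
    return 1 if len(data) % 2 == 0 else 0


def byte_sum_hash(data: bytes) -> int:
    """Second "dumb" hash: add up every byte (truncated to 32 bits, an assumption)."""
    return sum(data) & 0xFFFFFFFF


# Any two inputs with the same length parity collide under the first hash...
print(parity_hash(b"ab"), parity_hash(b"wxyz"))
# ...and any reordering of the same bytes collides under the second.
print(byte_sum_hash(b"listen"), byte_sum_hash(b"silent"))
```

Both are valid hash functions in the hash-table sense, but finding two inputs with the same output takes no effort at all.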
Cryptographic Hashes
- A cryptographic hash is similar, but the output is much larger
(e.g. ≥ 128 bits) and it has some special properties:
- Change 1 bit of input, and about 50% of the output bits change
- Given just a hash, it's "hard" to find a message that makes
that hash (one way transformation)
- Given a message and a hash, it's "hard" to find another message
that produces the same hash
- The short story is given a hash, it's near impossible to figure
out what the original message was, by inspection
- Once hashed, the original input is gone. (one-way)
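The "change 1 bit, 50% of output changes" property can be observed directly. This is a rough sketch using Python's standard hashlib with SHA-1; the helper name bit_difference is made up for the example.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


message = bytearray(b"this is a message")
original = hashlib.sha1(bytes(message)).digest()

message[0] ^= 0x01  # flip a single bit of the input
flipped = hashlib.sha1(bytes(message)).digest()

changed = bit_difference(original, flipped)
print(f"{changed} of {len(original) * 8} output bits changed")
```

The count should land near half of the output bits, with some variation from input to input.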
Fingerprints
Another way of thinking about cryptographic hashes is to
compare them to fingerprints.
- Given a person, it's hard to "guess" what their
fingerprint is (you actually have to look at the fingerprint
to know what it is).
- Given a fingerprint, it's hard to reconstruct the actual
person/body.
- It's hard to find two people with the same
fingerprints.
Compare these principles to the "dumb" non-cryptographic
hashes.
Crypto Hash: MD5
- MD5 was one of the first public general-purpose
cryptographic hashes
- 128-bit output
- Getting "old": practical collision attacks were published in 2004
- Still hard to reverse, but collisions are a real problem and 128 bits is too small
- MD5 is fine for file checksums against accidental corruption -- it's really fast.
Do not use for new applications
Crypto Hash: SHA-1
- Current standard (government, IEEE, ANSI)
- 160-bit output, and even that is getting "small" for modern computers
- Other variants (e.g. SHA-256, SHA-512) produce larger hashes
Use this
Everything else
From the home page of mhash:
At the time of writing this, the library supports the algorithms:
SHA1, SHA160, SHA192, SHA224, SHA384, SHA512, HAVAL128, HAVAL160, HAVAL192, HAVAL224, HAVAL256, RIPEMD128, RIPEMD256, RIPEMD320, MD4, MD5, TIGER, TIGER128, TIGER160, ALDER32, CRC32, CRC32b, WHIRLPOOL, GOST, SNEFRU128, SNEFRU256
BULLSHIT
you only need SHA-1
Command Line Fun
$ echo "this is a message" | md5
1fb0076c4f2eaa1c788679154c51aa89
$ echo "this is a message" | openssl dgst -md5
1fb0076c4f2eaa1c788679154c51aa89
$ echo "this is a message" | openssl dgst -sha1
54b7b1ec23aae1997dfb8d6fb0d94cac389aa2a0
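For comparison, roughly the same thing from Python's standard hashlib module. One detail worth noting is that echo appends a newline, so the sketch includes it in the input.

```python
import hashlib

# echo appends a newline, so it is part of the hashed input here too.
data = b"this is a message\n"

md5_hex = hashlib.md5(data).hexdigest()
sha1_hex = hashlib.sha1(data).hexdigest()

print(md5_hex)   # 32 hex characters (128 bits)
print(sha1_hex)  # 40 hex characters (160 bits)
```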
Secret Key Cryptography
- Requires a secret key to produce correct results
- Secret key can be anything... random bits are best
- Same key used for reading and writing
- Without the key, no deal
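Since random bits make the best keys, here is a small sketch of generating one with Python's standard secrets module, which draws from the operating system's random source. The 16-byte length is an assumption (a 128-bit key).

```python
import secrets

KEY_BYTES = 16  # 128 bits; an assumed key size for illustration

key = secrets.token_bytes(KEY_BYTES)
print(key.hex())  # hex form is convenient for config files
```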
Secret Key Ciphers
- A cipher turns plaintext into ciphertext (encryption)
- decryption is the reverse
- They use one secret key for encryption and decryption
- Only operate on 64 to 128 bits at a time (8-16 chars)
- Also there are stream ciphers that operate a byte at a time, but not discussed here
Ciphers
Three you need to know about
DES, Triple DES
Blowfish
AES
Cipher: DES, Triple-DES
- The granddaddy of all modern ciphers
- Still in use, so you might bump into it.
- DES has a 56-bit key -- too small, no longer secure
- Triple-DES runs DES three times for a longer key, but is slow
DO NOT USE
Cipher: Blowfish
- Drop-in replacement for DES
- Fast and lightweight
- Variable key size, up to 448 bits
- 64-bit block (really too small nowadays)
- Still secure
OK for New Projects, but...
Cipher: AES
- NIST/FIPS replacement for DES
- 128-bit block
- Variable key size: 128, 192, or 256 bits
YES Use this for new applications
Everything else
As usual there are dozens of other ciphers to use in
any given library.
Ignore them all
Modes of Operation
Block ciphers work on a single block. Modes of operation
deal with working with longer messages.
You have to specify the mode when using a crypto API
Only Two You Need to Know
- ECB mode, each block is treated independently -- RARE.
- CBC mode, each block is chained together -- COMMON
ECB Mode
- Stands for "Electronic Code Book"
- Each block is encrypted and decrypted separately
- If the original message has two identical blocks, they
will have the same encrypted output. This is bad.
- Use only for very small messages (smaller than 1 block),
if at all
CBC Mode
9 of 10 times you want this.
- "Cipher Block Chaining" mode
- It requires salt -- a random "first block" that is stored
or sent first. This is also called an initialization vector
(IV). It is public and sent in the clear.
- Each block depends on previous block. The first block
depends on the IV.
- Meaning if the message contains duplicate blocks, the
output will be different
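The difference between ECB and CBC can be seen with a small experiment. Python's standard library has no AES, so this sketch substitutes a toy 4-round Feistel cipher built from HMAC-SHA1. That toy cipher is purely an assumption for illustration and must not be used for real data; only the two modes wrapped around it are the point. The message is assumed to be an exact multiple of the block size.

```python
import hashlib
import hmac
import os

BLOCK = 16  # bytes per block (128 bits)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function for the toy cipher: a keyed hash of one half-block.
    return hmac.new(key, bytes([i]) + half, hashlib.sha1).digest()[: BLOCK // 2]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher. NOT a real cipher; it stands in for AES."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def blocks(data: bytes):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, msg: bytes) -> bytes:
    # ECB: each block is encrypted on its own.
    return b"".join(encrypt_block(key, b) for b in blocks(msg))


def cbc_encrypt(key: bytes, iv: bytes, msg: bytes) -> bytes:
    # CBC: each block is XORed with the previous ciphertext block (the IV first).
    out, prev = [iv], iv
    for b in blocks(msg):
        prev = encrypt_block(key, _xor(b, prev))
        out.append(prev)
    return b"".join(out)  # the IV goes first, in the clear


def cbc_decrypt(key: bytes, data: bytes) -> bytes:
    chunks = blocks(data)
    return b"".join(_xor(decrypt_block(key, c), prev)
                    for prev, c in zip(chunks, chunks[1:]))


key, iv = os.urandom(16), os.urandom(BLOCK)
msg = b"SIXTEEN BYTE BLK" * 2  # two identical plaintext blocks
ecb, cbc = ecb_encrypt(key, msg), cbc_encrypt(key, iv, msg)
print(ecb[:16] == ecb[16:])      # ECB: the duplicate blocks show through
print(cbc[16:32] == cbc[32:48])  # CBC: chaining hides them
```

With a real library the structure is the same: you pick the cipher, pick the mode, and supply a fresh random IV for each CBC message.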
Other Modes
- One problem with CBC is that if the encrypted message has
some corruption, the damaged block will be gibberish when
decrypted and the block after it will be damaged too
- The other modes are mostly involved with limiting damage
from error propagation.
- Only useful for people doing real-time communication
- Not useful if stored on disk or database
IGNORE
Other Issues
Lots of special considerations. Most of these are handled by
the crypto package
- What if the message isn't exactly the length of the
block? (padding)
- How do you encrypt a key?
- How do you generate a series of keys from a password?
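Two of these questions have well-known standard answers, sketched here under assumptions: PKCS#7-style padding for short messages, and PBKDF2 (available as hashlib.pbkdf2_hmac) for turning a password into a key. The slides do not name either technique, and the block size, iteration count, and function names are illustrative choices only.

```python
import hashlib
import os

BLOCK_SIZE = 16  # bytes; assumes a 128-bit block cipher such as AES


def pad(message: bytes) -> bytes:
    """PKCS#7-style padding: append N bytes, each of value N (1 <= N <= 16)."""
    n = BLOCK_SIZE - len(message) % BLOCK_SIZE
    return message + bytes([n]) * n


def unpad(padded: bytes) -> bytes:
    """Strip the padding added by pad(), rejecting malformed input."""
    n = padded[-1] if padded else 0
    if not 1 <= n <= BLOCK_SIZE or len(padded) % BLOCK_SIZE or padded[-n:] != bytes([n]) * n:
        raise ValueError("bad padding")
    return padded[:-n]


def key_from_password(password: str, salt: bytes, length: int = 16) -> bytes:
    """Derive a fixed-length key from a password with PBKDF2-HMAC-SHA1."""
    # The iteration count is an illustrative assumption, not a recommendation.
    return hashlib.pbkdf2_hmac("sha1", password.encode(), salt, 100_000, dklen=length)


salt = os.urandom(16)
key = key_from_password("correct horse", salt)
print(len(pad(b"hello")), len(key))
```

Note that a message already on a block boundary still gains a full block of padding, which is what makes the padding unambiguous to remove.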
Public Key Ciphers
- Different key for readers (decryption, private), and
writers (encryption, public)
- The writer key cannot read, and the reader key cannot write
- Far slower than secret key ciphers (often by orders of magnitude)
- Keys must be specially generated. They cannot be arbitrary random bits
like secret keys
- Block size is large ≥ 512 bits
- Great for semi-anonymous communication protocols, not
great for almost everything else
- Implementation and roll out quite complicated --
when was the last time you used PGP, GPG?
Two you need to know about
Not a focus of this talk, but to be complete:
RSA
- The first
- Most popular, most widely used
- Conceptually simple -- security rests on how hard it is to factor large integers
- RSA IS JUST FINE
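To back up "conceptually simple", here is textbook RSA with tiny primes. This is only a sketch of the arithmetic: real RSA uses primes hundreds of digits long plus padding, and should come from a library. The three-argument pow with a negative exponent needs Python 3.8 or later.

```python
# Toy RSA with tiny textbook primes; for illustration only.
p, q = 61, 53
n = p * q                  # public modulus (3233); factoring it reveals p and q
phi = (p - 1) * (q - 1)    # 3120, easy to compute only if you know the factors
e = 17                     # public (writer) exponent, coprime with phi
d = pow(e, -1, phi)        # private (reader) exponent: e * d == 1 (mod phi)


def encrypt(m: int) -> int:
    """Anyone holding the public key (n, e) can do this."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Only the holder of the private exponent d can undo it."""
    return pow(c, d, n)


message = 65
ciphertext = encrypt(message)
print(ciphertext, decrypt(ciphertext))
```

This also shows why keys cannot be arbitrary random bits: d has to be derived from the primes behind n.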
Elliptic Curve
- Newer, not as widely used and may not be available in all libraries.
- Much more complicated
- May be more efficient
PKC Implementations
- For Java, use Bouncy Castle and/or its JCE implementation.
- For C/C++ I would use OpenSSL. The documentation is
a bit incomplete, but it provides everything you are
going to need
- For scripting languages, there are a lot of half-assed
student projects floating around. I would find something
that wraps OpenSSL. Even here, be sure to do cross-version
testing with another implementation.
Application: Communication
- Mixes slow public key crypto to agree upon a key for fast
secret key crypto
- Use SSL. End of story.
- Don't reinvent the wheel
- Configuring SSL to work correctly is a topic in itself.
Application: Logging
Sometimes data/messages with sensitive information need to
be logged (e.g. suspected credit card fraud)
- Work hard at having to write no crypto code
- write to SSL socket
- have server write to encrypted disk volume
- ...or similar. PKC is hard. Try to avoid writing code -- get
involved with the sysadmin/operations group for solutions
Application: Storing Passwords
- Every web site does it
- Few do it correctly
I'll assume you know better than to store it as plaintext
Password Facts
- People are not very smart and reuse very common words and
phrases for passwords
- A large percentage of the passwords at any given site are
the same.
Take 1: Encrypt it
- You could encrypt the password, but then either the DB or
the AppDev guys will know the password
- (I prefer the application does it since then it's
encrypted ASAP and not going over the wire.)
- The application has the ability now to decrypt the
password, ripe for abuse.
Take 2: Hash it
- Storing the hash (SHA-1) of the password eliminates the password itself
- If the hash of the presented password matches the database entry, we are sure they typed in the correct password
- But hackers could (and do) compile a "dictionary" of common passwords and their hashed values
- Then they just compare their dictionary to your database
- This is called a "dictionary attack"
SHA1("password") → 5baa61e4c...
SHA1("qwerty") → b1b3773a0...
SHA1("123456") → 7c4a8d09c...
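The attack is short enough to sketch in full. The password list and the "leaked" table below are made-up examples.

```python
import hashlib


def sha1_hex(password: str) -> str:
    return hashlib.sha1(password.encode()).hexdigest()


# The attacker precomputes this once and reuses it against every site.
common_passwords = ["password", "qwerty", "123456", "letmein"]
dictionary = {sha1_hex(p): p for p in common_passwords}

# Hypothetical leaked table of user id -> unsalted SHA-1 hash.
leaked = {1: sha1_hex("qwerty"), 2: sha1_hex("Tr0ub4dor&3"), 3: sha1_hex("123456")}

# Cracking is a plain lookup: no brute force needed for the common passwords.
cracked = {uid: dictionary[h] for uid, h in leaked.items() if h in dictionary}
print(cracked)
```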
Take 3: HMAC it
- By using a secret key and a HMAC, you eliminate the dictionary attack since your HMAC results won't match their dictionary.
- But this opens up a new attack based on frequency
- Since so many people use the same password, you could look
at the frequency of entries in the database and infer the
original password
- The top entry in the database is probably the HMAC of "password" or "qwerty"
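A sketch of both halves of this slide, using Python's standard hmac module. The key and the list of user passwords are placeholders for illustration.

```python
import hashlib
import hmac
from collections import Counter

SECRET_KEY = b"hypothetical-server-side-key"  # placeholder; use random bytes in practice


def password_hmac(password: str) -> str:
    return hmac.new(SECRET_KEY, password.encode(), hashlib.sha1).hexdigest()


# The keyed result no longer matches an attacker's plain SHA-1 dictionary...
assert password_hmac("password") != hashlib.sha1(b"password").hexdigest()

# ...but identical passwords still produce identical database entries.
users = ["password", "hunter2", "password", "qwerty", "password"]
counts = Counter(password_hmac(p) for p in users)
top_entry, frequency = counts.most_common(1)[0]
print(frequency, top_entry == password_hmac("password"))
```

An attacker who sees only the stored values can still guess that the most frequent one belongs to a very common password.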
Take 4: HMAC + Salt
- One solution is to add some random bits (salt) to the
message, e.g. HMAC(message + salt), and store the salt with
the final hmac.
- Now the same message with the same key produces a different
hash
- The salt is public! This is ok!
- You need both the secret key and the salt to be secure
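A minimal sketch of storing and checking a password this way. The key is a placeholder, and the 8-byte salt and function names are assumptions.

```python
import hashlib
import hmac
import os

SECRET_KEY = b"hypothetical-server-side-key"  # placeholder


def make_record(password: str) -> tuple:
    """Return (salt, hmac) for storage; the salt is public, the key is not."""
    salt = os.urandom(8)
    digest = hmac.new(SECRET_KEY, password.encode() + salt, hashlib.sha1).hexdigest()
    return salt, digest


def check_password(password: str, salt: bytes, stored: str) -> bool:
    """Recompute the HMAC with the stored salt and compare in constant time."""
    candidate = hmac.new(SECRET_KEY, password.encode() + salt, hashlib.sha1).hexdigest()
    return hmac.compare_digest(candidate, stored)


salt_a, hmac_a = make_record("password")
salt_b, hmac_b = make_record("password")
print(hmac_a != hmac_b)                            # same password, different entries
print(check_password("password", salt_a, hmac_a))  # correct password verifies
print(check_password("passw0rd", salt_a, hmac_a))  # wrong password does not
```

Because each user gets a different salt, the frequency of database entries no longer says anything about which passwords are popular.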
HMAC + Key Management
- Another option is to have many keys and randomly assign a
user a key.
- Store the key-id, along with the hmac.
- In practice, this is very tricky and key management always
falls to the bottom of the priority list
- However, make a field for the keyid anyways. It will come in
handy if you need to do a migration....
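One way the key-id idea might look in code. The in-memory key ring is hypothetical; where the keys actually live is the hard part and is not shown.

```python
import hashlib
import hmac
import os
import secrets

# Hypothetical key ring: key id -> secret key, kept outside the database.
KEYS = {1: os.urandom(20), 2: os.urandom(20), 3: os.urandom(20)}


def make_record(password: str) -> dict:
    """Pick a key at random and store its id next to the salt and HMAC."""
    kid = secrets.choice(list(KEYS))
    salt = os.urandom(8)
    digest = hmac.new(KEYS[kid], password.encode() + salt, hashlib.sha1).hexdigest()
    return {"kid": kid, "salt": salt, "hmac": digest}


def check_password(password: str, record: dict) -> bool:
    """Look up the key by the stored key id, then verify as usual."""
    key = KEYS[record["kid"]]
    candidate = hmac.new(key, password.encode() + record["salt"], hashlib.sha1).hexdigest()
    return hmac.compare_digest(candidate, record["hmac"])


record = make_record("s3cret")
print(record["kid"], check_password("s3cret", record))
```

During a migration, new records can simply be written with a new key id while old ones keep verifying under the old key.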
Sample Schema
CREATE TABLE password (
  uid INT NOT NULL PRIMARY KEY,
  salt INT NOT NULL, /* could be char(X) too */
  hmac CHAR(48) NOT NULL, /* or so */
  kid INT NOT NULL DEFAULT 0 /* TBD */
);
New problem
- Perhaps your top coder or sysadmin just got fired.
- ...or you got bought or are merging
- How do you change keys???
- How do you migrate to the new user database?
You are stuck since you don't know the original password
Take 5: HMAC+Salt+PKC
Passwords are checked often, never read, and only rarely
written to.
- Store the HMAC+salt as above, and also store the password
encrypted using a public key (details: tbd)
- The registration application can't read this table since it
doesn't know the private key
- Burn the private key on a CD and give to your boss, the CFO and
the CEO
- When you have an unusual situation, use the private key to
recover the passwords so you can re-HMAC your database.
- While the migration is happening, you might be using two keys
for HMAC
Take 6: The HMAC of the Hash
By storing the HMAC of the hash of the password
(e.g. HMAC(SHA1(password) + salt)), you can allow
for client login where the password is never sent in the
clear. This is good if you don't allow SSL logins.
- On login form submit, rewrite the password field to
the MD5/SHA1 version of it, and flip an invisible form field
to tell the server the password is hashed.
- If javascript is turned off, then the server will need
to MD5 the password.
- Server looks at form to see if the password is MD5 or
not.
Really not that complicated
I know "Take 6" sounds insane, but here's the pseudocode:
salt = random();
stored = hex_encode(hmac(md5(password) + salt, key));
in practice it's probably 5 lines of code
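A sketch of the server side in Python, assuming the client submits an MD5 hex string and a flag saying whether it did so. The key, the function names, and the choice of HMAC-SHA1 for the outer step are assumptions.

```python
import hashlib
import hmac
import os

SECRET_KEY = b"hypothetical-server-side-key"  # placeholder


def client_hash(password: str) -> str:
    """What the login form's JavaScript would submit instead of the password."""
    return hashlib.md5(password.encode()).hexdigest()


def stored_value(hashed_password: str, salt: bytes) -> str:
    """HMAC(hash(password) + salt) under the server's secret key, hex encoded."""
    return hmac.new(SECRET_KEY, hashed_password.encode() + salt, hashlib.sha1).hexdigest()


def check_login(field: str, is_hashed: bool, salt: bytes, stored: str) -> bool:
    """Hash on the server only if the client did not (e.g. JavaScript was off)."""
    hashed = field if is_hashed else client_hash(field)
    return hmac.compare_digest(stored_value(hashed, salt), stored)


salt = os.urandom(8)
stored = stored_value(client_hash("hunter2"), salt)
print(check_login(client_hash("hunter2"), True, salt, stored))
print(check_login("hunter2", False, salt, stored))
```

Both login paths end up comparing the same stored value, so the database does not need to know which kind of client registered the user.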
Application: Tamper Proof URLs/Cookies
Prevent:
- Prevent damage from proxies, bad bookmark files, crappy
browsers, spyware
- Prevent parameter scanning
- Prevent user modification
Allow:
- URLs to "expire"
- Store expensive computations in the URL/Cookie instead of
a session
- Store authorization in the cookie or url
- Allow login authentication from SSL to go to non-SSL
- Store a session-id safely
Encode Recipe - Part 2
IMPORTANT: notice how you are hmac-ing the meta data too!
Decode Recipe
- Take last 48 characters from URL.
- Perhaps find "kid=XX" to find the key id, and get your key
- HMAC all but the last 48 characters
- Compare your hash with the hash from the URL
- Decode querystring into normal key-value pairs
- Apply application logic -- did it expire?
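The decode steps, together with a matching encode step, might look like the sketch below. Everything specific here is an assumption: the metadata names _created and _kid, the key table, and the use of a 40-character hex HMAC-SHA1 rather than the 48 characters the recipe mentions.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qsl, urlencode

KEYS = {1: b"hypothetical-secret-key"}  # key id -> key; placeholder value
MAC_LEN = 40  # hex characters of HMAC-SHA1 (the recipe above uses 48)


def encode(params: dict, kid: int = 1, now=None) -> str:
    """Append metadata, then an HMAC over the data *and* the metadata."""
    meta = {"_created": int(now if now is not None else time.time()), "_kid": kid}
    query = urlencode({**params, **meta})
    mac = hmac.new(KEYS[kid], query.encode(), hashlib.sha1).hexdigest()
    return query + mac


def decode(signed: str, max_age: int = 3600, now=None):
    """Return the parameters, or None if tampered with or expired."""
    query, mac = signed[:-MAC_LEN], signed[-MAC_LEN:]
    fields = dict(parse_qsl(query))
    try:
        key = KEYS[int(fields["_kid"])]  # find the key id, get the key
    except (KeyError, ValueError):
        return None
    expected = hmac.new(key, query.encode(), hashlib.sha1).hexdigest()
    if not hmac.compare_digest(expected.encode(), mac.encode()):
        return None
    # Application logic: did it expire? (Timestamps "in the future" also fail.)
    age = (now if now is not None else time.time()) - int(fields["_created"])
    if not 0 <= age <= max_age:
        return None
    return {k: v for k, v in fields.items() if not k.startswith("_")}


signed = encode({"uid": "42"}, now=1000)
print(decode(signed, now=1500))
print(decode(signed.replace("uid=42", "uid=43"), now=1500))
```

The expiry check only runs after the HMAC matches, so the timestamp it reads can be trusted.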
Hints
- Use created-on instead of expires-on -- allows better
control and debugging. You also aren't giving hints to a hacker
on when expiration occurs.
- Consider making the metadata start with an underscore so it
doesn't collide with real data.
- Consider adding the host name of the machine that did the
HMAC-ing -- if the timestamps are "in the future" you can find
the machine quickly.
- You can save a few chars by using a URL-safe base64
encoding instead of hex.
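A quick sketch of the saving from the last hint, using Python's standard base64 module on a 20-byte HMAC-SHA1. The key and message are placeholders.

```python
import base64
import hashlib
import hmac

digest = hmac.new(b"hypothetical-key", b"uid=42&_created=1000", hashlib.sha1).digest()

as_hex = digest.hex()
# The URL-safe alphabet uses '-' and '_' instead of '+' and '/'; '=' padding is dropped.
as_b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

print(len(as_hex), len(as_b64))  # 40 vs 27 characters for a 20-byte HMAC-SHA1


def b64_to_bytes(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))
```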
UNIT TEST MANIA
- You can't have bugs. Yes, bug-free code can be done.
- Check for truncation.
- Check that each character changed will cause the validity check to fail.
- Check that appending extra characters causes failure too.
- Check for "double hmac".
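These checks are easy to automate. The sketch below uses a stripped-down sign/verify pair with hypothetical names and a 40-character hex HMAC-SHA1. The reading of "double hmac" as the HMAC being appended twice is an assumption.

```python
import hashlib
import hmac

KEY = b"hypothetical-secret-key"  # placeholder
MAC_LEN = 40  # hex HMAC-SHA1 in this sketch


def sign(query: str) -> str:
    return query + hmac.new(KEY, query.encode(), hashlib.sha1).hexdigest()


def is_valid(signed: str) -> bool:
    query, mac = signed[:-MAC_LEN], signed[-MAC_LEN:]
    expected = hmac.new(KEY, query.encode(), hashlib.sha1).hexdigest()
    return hmac.compare_digest(expected.encode(), mac.encode())


def run_checks() -> None:
    signed = sign("uid=42&_created=1000")
    assert is_valid(signed)
    # Truncation: every proper prefix must fail.
    for i in range(len(signed)):
        assert not is_valid(signed[:i])
    # Mutation: changing any single character must fail.
    for i, ch in enumerate(signed):
        other = "x" if ch != "x" else "y"
        assert not is_valid(signed[:i] + other + signed[i + 1:])
    # Extension: appending extra characters must fail.
    assert not is_valid(signed + "0")
    # "Double HMAC", read here as the HMAC appended twice (an assumption).
    assert not is_valid(signed + signed[-MAC_LEN:])


run_checks()
print("all checks passed")
```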
Performance Impact
Summary: Zero
I've added tamper-proof cookies/url on every page to two
high volume websites (i.e. Alexa top 50) doing thousands of
requests per second
Unnoticeable on CPU load graphs when code went out.
Cost is sub-millisecond
Use It
More fun
- Integration without violating privacy policy
- Encrypting small numbers or limited-range data
- Not-quite crypto: GUIDs
- Safely putting binary data in URLs
- Secret key management
- Secret key generation from a master key
- Random number generation
- Database data integrity (prevent easy UPDATE to critical fields)
Change Log
- 08-Sept-2007: Major cleanup/revisions
- 05-Sept-2007: Initial NYCBUG presentation