This is the fourth entry in a blog series on using Java cryptography securely. The first entry provided an overview covering architectural details, using stronger algorithms and debugging tips. The second one covered Cryptographically Secure Pseudo-Random Number Generators. The third entry taught you how to securely configure basic encryption/decryption primitives. This post will teach you about message digests, walk you through some common use cases, and show you how to use them securely, with code examples. This blog series should serve as a one-stop resource for anyone who needs to implement a crypto-system in Java. My goal is for it to be a complementary, security-focused addition to the JCA Reference Guide.
A message digest algorithm, or hash function, is a procedure that maps input data of arbitrary length to an output of fixed length. The output is variously known as a hash value, hash code, hash sum, checksum, message digest, digital fingerprint or simply a hash. The output is generally shorter than the corresponding input message. Unlike other cryptographic algorithms, hash functions do not have keys.
Hash functions are an essential part of message authentication codes and digital signature schemes, which deserve special attention and will be covered in future posts. Hash functions are also used in varied cryptographic applications like integrity checks, password storage and key derivations, discussed in this post. They are also utilized in Secure Sockets Layer (SSL), Pretty Good Privacy (PGP), and various other cryptographic protocols.
JDK 8's Security API offers seven algorithms to choose from, of which only three are suitable for all applications. This means there is only about a 43 percent chance of making the right choice.
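To see which message digest algorithms your own JDK registers, you can query the installed providers. The following is a minimal sketch: the class name ListDigestAlgorithms and the helper availableDigests are illustrative names rather than part of the JDK, and the list you get depends on your JDK version and configured providers.

```java
import java.security.Security;
import java.util.Set;
import java.util.TreeSet;

class ListDigestAlgorithms {
    // Returns the MessageDigest algorithm names registered by all installed providers, sorted.
    static Set<String> availableDigests() {
        return new TreeSet<>(Security.getAlgorithms("MessageDigest"));
    }

    public static void main(String[] args) {
        for (String name : availableDigests()) {
            System.out.println(name);
        }
    }
}
```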
Members of the SHA2 family (SHA2-224, SHA2-256, SHA2-384 and SHA2-512) with a security strength[5] of at least 128 bits are safe for all security applications. All SHA2 algorithms except SHA2-224 fall under this category. Security strength can be roughly defined in terms of the number of repetitions required to find a collision (two messages with the same hash) by brute force. The SHA2-224 algorithm's output is 224 bits, so finding a collision takes about 2^(224/2) = 2^112 repetitions. That is a security strength of 112 bits, which is below the 128-bit threshold. Thus, SHA2-224 is the one member of the family to steer away from.
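The arithmetic above can be sketched in a few lines. This is an illustration only: CollisionStrength and collisionStrengthBits are hypothetical names, and the estimate is the generic "half the output length" bound, which ignores any algorithm-specific attacks. Note that Java's standard names for these algorithms are SHA-224, SHA-256, SHA-384 and SHA-512.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class CollisionStrength {
    // Generic estimate: collision strength in bits is half the digest length in bits.
    static int collisionStrengthBits(String algorithm) {
        try {
            int outputBits = MessageDigest.getInstance(algorithm).getDigestLength() * 8;
            return outputBits / 2;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        for (String algorithm : new String[] {"SHA-224", "SHA-256", "SHA-384", "SHA-512"}) {
            System.out.println(algorithm + ": " + collisionStrengthBits(algorithm) + " bits");
        }
    }
}
```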
In Java 8, the MessageDigest class provides hashing functionality. Add all the data you want to digest through repeated calls to the update method. Once done, call the digest method, which generates the digest and resets the instance for its next use.
Below is a secure way to use message digests:
/*
Most secure way to use Message Digest. Ideal for copy-pasting ;)
*/
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class SecureMessageDigest {
    public static void main(String[] args) throws NoSuchAlgorithmException {
        String algorithm = "SHA-512"; // Algorithm chosen for digesting
        String data = args[0]; // Any piece of data to be hashed; this example uses command line input
        // MessageDigest instance backed by a SHA-512 implementation; fails fast if none is available
        MessageDigest md = MessageDigest.getInstance(algorithm);
        // Repeatedly use the update method to add all inputs to be hashed.
        // An explicit charset keeps the result independent of the platform default.
        md.update(data.getBytes(StandardCharsets.UTF_8));
        byte[] hash = md.digest(); // Perform the actual hashing; this also resets md
        System.out.println("Base64 hash is = " + Base64.getEncoder().encodeToString(hash));
    }
}
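To make the update/digest/reset behaviour concrete, here is a small sketch. The class IncrementalDigestDemo and its helpers are illustrative names, not part of the JDK. It hashes the same bytes once in two chunks and once in a single chunk, reusing the same instance after digest has reset it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class IncrementalDigestDemo {
    static MessageDigest sha512() {
        try {
            return MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available", e);
        }
    }

    // Feeds each part to the digest with its own update call, then finishes the hash.
    static byte[] hashInParts(MessageDigest md, String... parts) {
        for (String part : parts) {
            md.update(part.getBytes(StandardCharsets.UTF_8));
        }
        return md.digest(); // digest() also resets md for the next use
    }

    public static void main(String[] args) {
        MessageDigest md = sha512();
        byte[] inParts = hashInParts(md, "hello, ", "world");
        byte[] atOnce = hashInParts(md, "hello, world"); // same instance, reused after the reset
        System.out.println(Arrays.equals(inParts, atOnce)); // prints "true"
    }
}
```

Chunked updates are what make it practical to hash data that does not fit in memory, such as large files or network streams.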
Note: Code examples/snippets referenced in the JavaDocs of the MessageDigest class and in the Java Cryptography Architecture guide use the "SHA" or "SHA-1" algorithms, which are not secure for many applications, as will be discussed below. Take care not to use them by accident.
MD* algorithms: The JDK supports the MD2 and MD5 algorithms, developed in 1989 and 1991, respectively. Over the years, the security of these algorithms has been severely compromised by attacks ranging from collisions[7][8] to brute force. NIST no longer approves the use of these algorithms[6].
SHA-0 and SHA-1: The collision resistance of these algorithms has been broken[9]. Because of this, they should not be used for any application that requires collision resistance, such as generating digital signatures or time stamps.[2][6] SHA-1 can still be used for applications other than digital signature generation, such as HMAC and key derivation (including password hashing through a key derivation function).
If you are using any of the above algorithms, please plan on upgrading soon.
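One way to keep the weaker algorithms out of new code is to route every digest request through an allow-list. The sketch below is an assumption-laden illustration rather than an established API: ApprovedDigests, isApproved and the contents of the list are hypothetical, with the list taken from the recommendations in this post. An allow-list is used instead of a deny-list so that aliases such as "SHA" (which resolves to SHA-1) are rejected too.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class ApprovedDigests {
    // Allow-list based on this post's recommendations; adjust it to your own policy.
    private static final Set<String> APPROVED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("SHA-256", "SHA-384", "SHA-512")));

    static boolean isApproved(String algorithm) {
        return algorithm != null && APPROVED.contains(algorithm.toUpperCase(Locale.ROOT));
    }

    // Returns a digest only for allow-listed names; anything else is rejected up front.
    static MessageDigest getInstance(String algorithm) {
        if (!isApproved(algorithm)) {
            throw new IllegalArgumentException("Digest algorithm not on the allow-list: " + algorithm);
        }
        try {
            return MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No provider offers " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isApproved("MD5"));     // prints "false"
        System.out.println(isApproved("SHA-512")); // prints "true"
    }
}
```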
SHA-3 algorithms are newer and are not supported by any default provider in Java 8. Support was first introduced in Java 9, by the SUN provider. The supported algorithms are SHA3-224, SHA3-256, SHA3-384 and SHA3-512. As with the SHA2 algorithms, all of them except SHA3-224 are safe for security usage. However, they are not yet commonly used or deployed. There have been discussions about the complicated specification[10], which implementers have interpreted differently. Thus, if you need to use SHA-3, keep an eye on future developments. These algorithms were introduced to be used alongside SHA2 rather than as its successor.
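Because SHA-3 availability depends on the Java version and providers, code that wants to use it has to cope with it being absent. The following is a minimal sketch with hypothetical names (Sha3WithFallback, newDigest): it asks for SHA3-256 and falls back to SHA-256 when no provider offers it.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha3WithFallback {
    // Prefers SHA3-256; falls back to SHA-256 where no installed provider offers SHA-3.
    static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("SHA3-256");
        } catch (NoSuchAlgorithmException sha3Missing) {
            try {
                return MessageDigest.getInstance("SHA-256");
            } catch (NoSuchAlgorithmException e) {
                // SHA-256 is one of the algorithms every Java platform is required to support.
                throw new IllegalStateException("SHA-256 is unexpectedly unavailable", e);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("Using " + newDigest().getAlgorithm());
    }
}
```

The two algorithms produce different digests for the same input, so if you adopt a fallback like this, record the algorithm name (md.getAlgorithm()) next to every stored hash.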
You have surely heard the advice to hash and salt your passwords before storing them. This is good practice, but dated. You should also be "stretching" your passwords. There have been far too many password breaches at major companies[11], affecting millions of users. This compelled me to talk a bit about how to store passwords so as to mitigate offline attacks. You should use the PBKDF2 algorithm offered by SecretKeyFactory, as discussed in my previous post on encryption/decryption. Under the hood, all PBKDF algorithms use hashing algorithms as pseudo-random generators, running them tens of thousands of times over a user-supplied password (stretching) together with a salt (a cryptographically random value mixed into the hash calculation). You then store the salt and the PBKDF output for user authentication. You can refer to a complete working example under crypto_usecase/password_management.
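As a rough illustration of salting and stretching with PBKDF2, consider the sketch below. It is not the example from crypto_usecase/password_management: the class PasswordHasher, its method names, and the parameter values (iteration count, salt size, output length) are assumptions chosen for illustration, and you should tune them for your own hardware and policy.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

class PasswordHasher {
    // Illustrative parameters (assumptions), not values prescribed by this post.
    static final int ITERATIONS = 100_000;
    static final int SALT_BYTES = 16;
    static final int KEY_BITS = 256;

    // A fresh random salt per password, generated by a CSPRNG.
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Stretches the password with PBKDF2 (HMAC-SHA512) over the given salt.
    static byte[] hash(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512");
            return factory.generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2WithHmacSHA512 is not available", e);
        } finally {
            spec.clearPassword(); // wipe the spec's internal copy of the password
        }
    }

    // Recomputes the hash for a login attempt and compares it with the stored value.
    static boolean verify(char[] candidate, byte[] salt, byte[] expected) {
        return MessageDigest.isEqual(hash(candidate, salt), expected);
    }

    public static void main(String[] args) {
        byte[] salt = newSalt();
        byte[] stored = hash("correct horse".toCharArray(), salt); // store salt + stored
        System.out.println(verify("correct horse".toCharArray(), salt, stored)); // prints "true"
        System.out.println(verify("wrong guess".toCharArray(), salt, stored));   // prints "false"
    }
}
```

The salt is not secret. It is stored next to the hash so that the same computation can be repeated when the user logs in.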
Contrary to popular belief, file integrity checks need to use collision-resistant algorithms[14]. CalculateChecksum.java is a complete working example of how to compute the hash of a file.
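For orientation, a file checksum can be computed by streaming the file through a MessageDigest in fixed-size chunks. The sketch below is not the CalculateChecksum.java file mentioned above: FileChecksumSketch, sha256 and toHex are hypothetical names, and the choice of SHA-256 and an 8 KB buffer is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class FileChecksumSketch {
    // Streams the file through SHA-256 so that large files never need to fit in memory.
    static byte[] sha256(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unexpectedly unavailable", e);
        }
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read); // add only the bytes actually read
            }
        }
        return md.digest();
    }

    // Lower-case hex, the format most published checksums use.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(toHex(sha256(Paths.get(args[0]))));
    }
}
```

The full hex string is what you would compare against a checksum obtained from a trusted source.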
Note: However tempting it may be, please don't truncate hash values. Use a hash output of at least 256 bits. Remember why we steered away from SHA2-224 hashes above?
At this point, we have spoken about three main cryptographic primitives: RNG, encryption and message digests. These are the fundamental building blocks of most cryptographic applications and protocols. Cryptographic systems rarely use these three in isolation; it's usually a combination, which we will start talking about in our following posts. Stay tuned!