Message Authentication

In information security, message authentication or data origin authentication is a property of data integrity and authenticity.

How this can be assured vie cryptography?
I've already wrote about hash functions, which generates a hash value that could serve as unique id for every message if not the fact that also attacker could easy generate new hash for a exchanged message. Sufficient for integrity, but other methods are needed for authentication. Two similar ways of doing this exists:

  • Message Authentication Code (MAC)
  • Authenticated Encryption (AE)

Let's go into details of each.

MAC

In contrast to hash functions MAC Algorithms are cryptographic primitives designed to assure Integrity and Authentication.
MAC values are both generated and verified using the same secret key. This implies that the sender and receiver of a message must have same secret key, as it was explained in the case of symmetric encryption, this fact of the common secret key makes authentication possible.

Generally speaking message authentication works with three algorithms:

  1. A key generation algorithm selects a key from the key space uniformly at random.
  2. A signing algorithm efficiently returns a tag for given the key (K) and the message(m)
  3. A verifying algorithm efficiently verifies the authenticity of the message given the key and the tag. That is, return accepted when the message and tag are not tampered with or forged, and otherwise return rejected.

For a secure unforgeable message authentication code, it should be computationally infeasible to compute a valid tag of the given message without knowledge of the key.
MAC principle

In a communication scenario MAC is attached to the message that travels over the network. Receiver applies same MAC algorithm to the payload part of the message and checks if it matches with the MAC part.

Keep in mind that MAC alone is not secure against replay attack. Widely used MAC is HMAC.

Keyed-hash message authentication code (HMAC)

HMAC (RFC 2104 is from 1997) is just a specific type of MAC that is based on hash functions. Any cryptographic hash function, such as MD5 or SHA-1, may be used in the calculation of an HMAC. Therefore cryptographic strength of the HMAC depends upon the cryptographic strength of the underlying hash function, the size of its hash output, and on the size and quality of the key.

HMAC_MD5("key", "The quick brown fox jumps over the lazy dog")    = 0x80070713463e7749b90c2dc24911e275  
HMAC_SHA1("key", "The quick brown fox jumps over the lazy dog")   = 0xde7c9b85b8b78aa6bc8a7a36f70a90701c9db4d9  
HMAC_SHA256("key", "The quick brown fox jumps over the lazy dog") = 0xf7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8  

HMAC was one of the first and is de facto standard with a lot of implementations.

Java example

Below you'll find how HMAC tag is created as result utilizing SHA1.

SecureRandom random = new SecureRandom();  
byte[] keyBytes = new byte[64];  
random.nextBytes(keyBytes);  
SecretKey key = new SecretKeySpec(keyBytes, "HMACSHA1");  
Mac mac = Mac.getInstance("HmacSHA1");  
mac.init(key);  
byte[] result = mac.doFinal("Authenticated Message".getBytes("UTF8"));  

Cipher-based Message Authentication Code (CMAC)

CMAC or CMAC-AES (RFC 4493 from 2006) is MAC algorithm for block ciphers.
CMAC can be calculated faster if target platform utilizes hardware optimization for block ciphers (e.g. dedicated AES opcodes).

UMAC

UMAC (RFC 4418 from 2006) is MAC based on universal hashing. The resulting digest or is then encrypted to hide the identity of the hash function was used giving additional security. UMAC has provable cryptographic strength and is usually a lot less computationally intensive than other MACs. UMAC's design is optimized for 32-bit architectures.

A closely related variant of UMAC that is optimized for 64-bit architectures is given by VMAC.

Poly1305

Another speed optimized algorithm is so called Poly 1305. Not going into mathematical details, Poly1305 is a fundamentally different class of MAC than HMAC. The addition of an nonce1 into the function makes it unsuitable for some environments but however Google has selected Poly1305 along with symmetric cipher ChaCha20 as a replacement for RC4 in TLS/SSL.

Authenticated encryption

As discussed MAC assures Integrity and Authentication. In a combination with basic modes of operation it enables confidentiality. E.G. an apllication could use CBC Mode with HMAC and with that two programming interfaces, two keys and two times relative much computation of cryptographic functions, which is not optimal.
Beside that handling with two functions instead of one increase the complexity and many attacks were described due implementation deitails problems of "Mac-that encrypt" over the years.

To improve this situation Authenticated Encryption (AE) or Authenticated Encryption with Associated Data (AEAD) was invented. In fact this is a block cipher mode of operation that in contrast to encryption only modes simultaneously provides confidentiality, integrity, and authenticity. This given by a single, easy to use programming interface.

In AE mode we typically would have to input

  • plaintext (will be encrypted)
  • key
  • Plaintext header (will not be encrypted)

The header part is not encrypted and is intended to provide authenticity and integrity protection for networking or storage metadata for which confidentiality is unnecessary, but authenticity is desired.

It is important to know that AE/AEAD use nonces to in increase semantic security.

Many AE modes of operation where developed (OCB,CCM,EAX..) but GCM is most popular and included NSA Suite B Cryptography.

Galois/Counter Mode (GCM)

Galois/Counter Mode (GCM) has been widely adopted because of its efficiency and performance. GCM throughput rates for state of the art, high speed communication channels can be achieved with reasonable hardware resources. GCM is defined for block ciphers with a block size of 128 bits. Galois Message Authentication Code (GMAC) is an authentication-only variant of the GCM. Both GCM and GMAC can accept initialization vectors of arbitrary length.GCM can take full advantage of parallel processing and implementing GCM can make efficient use of an instruction pipeline or a hardware pipeline.

The encrypted text then contains the IV, cipher text, and authentication code. It therefore has similar security properties to a HMAC.

In the case of very popular AES-GCM cipher the MAC uses a universal hash called GHASH, encrypted with AES-CTR (AES in Counter Mode).
Find java example of AES-GCM encryption below.

byte[] input = "AES-GCM Test ".getBytes();  
SecureRandom random = SecureRandom.getInstanceStrong();

//AES secrete key generation
KeyGenerator keyGen = KeyGenerator.getInstance("AES");  
keyGen.init(128, random);  
SecretKey key = keyGen.generateKey();

// Init cipher with nonce(IV) and tag length and AAD
Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding", "SunJCE");  
GCMParameterSpec spec = new GCMParameterSpec(GCM_TAG_LENGTH * 8, getIV(random));  
cipher.init(Cipher.ENCRYPT_MODE, key, spec);  
byte[] aadHeader = "Joe Your Friend".getBytes();  
cipher.updateAAD(aadHeader); //Additional Authentication Data (AAD) is plaintext

byte[] cipherText = cipher.doFinal(input);  

Find complete java examples at Github

  1. Cryptographic nonce is an arbitrary number that may only be used once. It is often a random or pseudo-random number issued in an authentication protocol to ensure that old communications cannot be reused in replay attacks.