In information security, message authentication or data origin authentication is a property of data integrity and authenticity.

How this can be assured via cryptography? I have already written about hash functions, which generates a hash value that could serve as a unique id for every message if not the fact that also attacker could easily generate a new hash for a exchanged message. Sufficient for integrity, but other methods are needed for authentication. Two similar ways of doing so are used:

  • Message Authentication Code (MAC)
  • Authenticated Encryption (AE)

Let’s go into the details of each.

MAC

In contrast to hash functions, MAC Algorithms are cryptographic primitives designed to assure Integrity and Authentication of the message. MAC values are both generated and verified using the same secret key (See Symmetric Cryptography).

And the fact of the existance of the shared secret key makes authentication possible.

Generally speaking, message authentication works with three algorithms:

  1. A key generation algorithm selects a key from the keyspace uniformly at random.
  2. A signing algorithm efficiently returns a tag for a given the key (K) and the message(m)
  3. A verifying algorithm efficiently verifies the authenticity of the message given the key and the tag.

That is, return accepted when the message and tag are not tampered with or forged, and otherwise return rejected.

For a secure unforgeable message authentication code, it should be computationally infeasible to compute a valid tag of the given message without knowledge of the key. MAC principle

In a communication scenario, MAC is attached to the message that travels over the network. The receiver applies the same MAC algorithm to the payload part of the message and checks if it matches with the MAC part.

Keep in mind that MAC alone is not secure against replay attacks. The widely used MAC algorithm realization is HMAC.

Keyed-hash message authentication code (HMAC)

HMAC (RFC 2104 is from 1997) is just a specific type of MAC that is based on hash functions. Any cryptographic hash function, such as MD5 or SHA-1, may be used in the calculation of an HMAC. Therefore cryptographic strength of the HMAC depends upon the cryptographic strength of the underlying hash function, the size of its hash output, and on the size and quality of the key.

HMAC_MD5("key", "The quick brown fox jumps over the lazy dog")    = 0x80070713463e7749b90c2dc24911e275
HMAC_SHA1("key", "The quick brown fox jumps over the lazy dog")   = 0xde7c9b85b8b78aa6bc8a7a36f70a90701c9db4d9
HMAC_SHA256("key", "The quick brown fox jumps over the lazy dog") = 0xf7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8

HMAC was one of the first and is a de facto standard with a lot of implementations.

Java example

Below you’ll find how HMAC tag is created as result utilizing SHA1.

SecureRandom random = new SecureRandom();
byte[] keyBytes = new byte[64];
random.nextBytes(keyBytes);
SecretKey key = new SecretKeySpec(keyBytes, "HMACSHA1");
Mac mac = Mac.getInstance("HmacSHA1");
mac.init(key);
byte[] result = mac.doFinal("Authenticated Message".getBytes("UTF8"));

Cipher-based Message Authentication Code (CMAC)

CMAC or CMAC-AES (RFC 4493 from 2006) is MAC algorithm for block ciphers. CMAC can be calculated faster if the target platform utilizes hardware optimization for block ciphers (e.g. dedicated AES opcodes).

UMAC

UMAC (RFC 4418 from 2006) is MAC based on universal hashing. The resulting digest or is then encrypted to hide the identity of the hash function was used giving additional security. UMAC has provable cryptographic strength and is usually a lot less computationally intensive than other MACs. UMAC’s design is optimized for 32-bit architectures.

A closely related variant of UMAC that is optimized for 64-bit architectures is given by VMAC.

Poly1305

Another speed-optimized algorithm is so-called Poly 1305. Not going into mathematical details, Poly1305 is a fundamentally different class of MAC than HMAC. The addition of an nonce into the function makes it unsuitable for some environments but however, Google has selected Poly1305 along with symmetric cipher ChaCha20 as a replacement for RC4 in TLS/SSL.

Authenticated encryption

So we learned MAC assures Integrity and Authentication. In a combination with basic modes of operation it we can add also a Confidentiality property. For E.G. an application could use CBC Mode with HMAC to ensure all thee properties. Practically it would mean dealing with two shared keys, two programming interfaces, and two times relative much computation of cryptographic functions. Besides that, handling with two functions instead of one increases the complexity and some attacks were described due to implementation details problems of “Mac-that encrypt” over the years.

To improve this situation Authenticated Encryption (AE) or Authenticated Encryption with Associated Data (AEAD) was invented. In fact, this is a block cipher mode of operation that in contrast to encryption only modes simultaneously provide confidentiality, integrity, and authenticity. This given by a single, easy-to-use programming interface.

In AE mode we typically would have the Interface with only 3 inputs

  • plaintext (will be encrypted)
  • key
  • Plaintext header (will not be encrypted)

The header part is not encrypted and is intended to provide authenticity and integrity protection for networking or storage metadata for which confidentiality is unnecessary, but authenticity is desired.

It is important to know that AE/AEAD uses nonces to increase semantic security.

Many AE modes of operation were developed (OCB, CCM, EAX..) but GCM is most popular and included NSA Suite B Cryptography.

Galois Message Authentication Code (GMAC)

Galois/Counter Mode (GCM) has been widely adopted because of its efficiency and performance. GCM throughput rates for state-of-the-art, high-speed communication channels can be achieved with reasonable hardware resources. GCM is defined as block ciphers with a block size of 128 bits. Galois Message Authentication Code (GMAC) is an authentication-only variant of the GCM. Both GCM and GMAC can accept initialization vectors (IV) of arbitrary length. GCM can take full advantage of parallel processing and implementing GCM can make efficient use of an instruction pipeline or a hardware pipeline.

The encrypted text then contains the IV, ciphertext, and authentication code. It, therefore, has similar security properties to a HMAC.

In the case of a very popular AES-GCM cipher, the MAC uses a universal hash called GHASH, encrypted with AES-CTR (AES in Counter Mode). Find a java example of AES-GCM encryption below.

byte[] input = "AES-GCM Test ".getBytes();
SecureRandom random = SecureRandom.getInstanceStrong();
		
//AES secrete key generation
KeyGenerator keyGen = KeyGenerator.getInstance("AES");
keyGen.init(128, random);
SecretKey key = keyGen.generateKey();

// Init cipher with nonce(IV) and tag length and AAD
Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding", "SunJCE");
GCMParameterSpec spec = new GCMParameterSpec(GCM_TAG_LENGTH * 8, getIV(random));
cipher.init(Cipher.ENCRYPT_MODE, key, spec);
byte[] aadHeader = "Joe Your Friend".getBytes();
cipher.updateAAD(aadHeader); //Additional Authentication Data (AAD) is plaintext

byte[] cipherText = cipher.doFinal(input);

Find complete java examples at Github