Tuesday, May 1, 2012

Secure Encryption in Java

Last time I wrote about cryptography, I outlined Apache Shiro crypto API and shown how to use its two symmetric ciphers. I also wrote that "You do not need more to encrypt and decrypt sensitive data in your applications."

I learned more about cryptography and found out that you need to know more. What I wrote is true to some extend, but unless you are careful your sensitive data may not be secure against all attackers.

Out of the box Shiro provides Blowfish-CBC and AES-CBC encryption methods and I recommended to use them. Both have been designed to protect against passive eavesdropping attacker and are good at it. Unfortunately, real attackers are more sophisticated and may break the system based on them.

Notice the "may". The attacker can succeed only if attacked system cooperates with him at least little bit. If you want to use these ciphers, you have to know how to write the system securely. Of course, the other option is to use stronger cipher and avoid the problem completely.

Few needed theoretical terms and concepts are explained in the first chapter. Second chapter shows how to encrypt data in a more secure way. Then we describe how Blowfish-CBC and AES-CBC work and show two possible attacks on them. Finally, the end of the post contains references to other resources.

Table of Contents

Theory

We start with a few theoretical concepts. If you do not want to read it, go directly to the chapter with the solution.

First important thing to describe is the difference between a passive and an active attacker. Then we explain what are block ciphers and what is an authenticated encryption. Last two subchapters list selected vulnerable and selected secure ciphers.

Active vs Passive Attacker
Eavesdropping only attacker is mostly passive. He is able to read encrypted communication, but is unable to modify it or send new ciphertexts to communicating applications. He is not able to influence the communication, he only listens to it. His passivity has only one exception: the attacker is able to give unencrypted information to one of communicating parties and obtain ciphertext with that exact information.

Real-world attackers are often more active. They compose their own messages and send them to communicating application. The application then assumes that those messages are encrypted, so it tries to decrypt them. The attacker observes its reaction (returned error code, time needed to answer and so on), and learn more about the cipher.

If he is lucky, he may use obtained knowledge to break the cipher or to plant false information.

Block Ciphers
Block ciphers are able to encrypt only short messages. For example, AES can encrypt only 16 bytes long messages and Blowfish is able to encrypt only 8 bytes long messages.

Longer messages are split into blocks. Each block is combined with previously encrypted blocks and passed to the block cipher. Block combining is called an operation mode and there are multiple secure ways how to do it.

Two active attacks discussed in this post are attacks on CBC operation mode. Block ciphers themselves are secure.

Authenticated Encryption
Authenticated encryption rejects any modified ciphertext as invalid. It is not possible to take encrypted data, modify them and end up with valid ciphertext. This property is also called a ciphertext integrity.

The cipher checks integrity first and rejects all modified messages the same way. As the attacker can not pass through the integrity check, he gains nothing from sending new messages to the application.

Authentication and ciphertext integrity are usually, but not always, provided by the operation mode.

Vulnerable Ciphers
Any cipher that does not provide ciphertext integrity or authenticated encryption is probably vulnerable to some active attack. It does not matter which encryption library is used. Encryption algorithms are defined in standards and amount to the same thing regardless of the used library.

Otherwise said, if it is possible to create valid ciphertext without knowing the secret key, then it is likely that some active attack exists, even if it is not know yet.

This post describes two attacks CBC operation mode. Once you understand both these attacks and differences between passive and active attacker, you should be able to come up with similar attacks on CFB, CTR, OFB or other non-authenticated cipher modes.

Secure Ciphers
Authenticated encryptions are secure against active attackers. Most common operation modes that provide also authentication are:
  • GCM
  • EAX
  • CCM

Replace CBC with one of these modes (e.g. AES-EAX) and you have a cipher secure against an active attacker.

Of course, secure against an active attacker does not mean that the cipher has no other real-world limitation. For example, most ciphers are safe only if the user changes the key after too much data have been encrypted. If you are serious about data encryption, you should study and know those limitations.

Authenticated Encryption with Shiro

This chapter shows how to use an authenticated encryption in Java. For those who skipped the theory, authenticated encryption protects against data tampering. Nobody without the secret key will be able to modify an encrypted message and an active attack on such cipher is impossible.

Apache Shiro does not implement its own encryption algorithms. It delegates all work to Java Cryptography Extension (JCE) which is available in each Java runtime. Shiro 'only' provides easy to use API and secure defaults.

Therefore, we have to do two things:
  • install authenticated operation modes into Java Cryptography Extension,
  • integrate Shiro with new authenticated operation modes.

Install Authenticated Operation Modes
Java Cryptography Extension is an extensible API. All classes and algorithms are created by providers and new provider can be installed into the system at any time. The provider can be either:
  • installed into java runtime and be available to all java applications,
  • distributed and initialized together with the application.

We will show how to add Bouncy Castle into the project. Bouncy Castle is a provider distributed under the MIT license and contains both EAX and GCM operation modes.

Bouncy Castle distributes multiple jar files, each optimized for different java version. As our demo project uses Java 6, we have to use bcprov-jdk16 library. Add maven dependency into pom.xml:
<dependency>
  <groupId>org.bouncycastle</groupId>
  <artifactId>bcprov-jdk16</artifactId>
  <version>1.46</version>
</dependency>

Once the library is present, you have to install its provider into java security system. Run following method at the application start-up:
private BouncyCastleProvider installBouncyCastle() {
  BouncyCastleProvider provider = new BouncyCastleProvider();
  Security.addProvider(provider);
  return provider;
}

Bouncy Castle is now installed and both authenticated operation modes are available.

Integrate with Shiro
Shiro cryptography package is basically an easy to use facade over JCE. It configures JCE objects to use secure defaults and adds thread safety to it.

Extend DefaultBlockCipherService class to take advantage of these features. Most of configuration is done by that class, but we still have to specify three things:
  • block cipher name and parameters,
  • operation mode,
  • padding.

Block cipher name is specified in constructor parameter. We will use AES, because it requires no additional configuration. Neither GCM nor CCM require padding, so we have to specify the padding NONE.

New cipher service:
class GCMCipherService extends DefaultBlockCipherService {

  private static final String ALGORITHM_NAME = "AES";

  public GCMCipherService() {
    super(ALGORITHM_NAME);
    setMode(OperationMode.GCM);
    setPaddingScheme(PaddingScheme.NONE);
  }

}

That is it. You may use the new authenticating cipher as any other Shiro cipher service.

Test Case
We created a simple test case to demonstrate that the integrity check works. It encrypts a message and changes third byte of its ciphertext. If the cipher provides an authentication, then an attempt to decrypt modified ciphertext results in runtime exception:
@Test
public void testGCMAuthentication() {
  String message = "secret message";

  GCMCipherService gcmCipher = new GCMCipherService();
  assertIngetrityCheck(message, gcmCipher);
}

private void assertIngetrityCheck(String message, DefaultBlockCipherService cipher) {
  byte[] key = cipher.generateNewKey().getEncoded();
  byte[] messageBytes = CodecSupport.toBytes(message);
  ByteSource encrypt = cipher.encrypt(messageBytes, key);

  // change the ciphertext
  encrypt.getBytes()[3] = 0;

  try {
    // it should be impossible to decrypt changed ciphertext
    cipher.decrypt(encrypt.getBytes(), key).getBytes();
  } catch (Exception ex) {
    return;
  }
  fail("It should not be possible to decrypt changed ciphertext.");
}

Note on Java 7
According to documentation, Java 7 supports two authenticated operation modes: CCM and GCM. Theoretically, you should not need a third party cryptography provider.

Unfortunately, Oracle could not provide a full implementation of these modes in first JDK 7 release. They would like to add it in an update release, so the situation may change in the future. Oracle bug database contains two related bugs.

Java 1.7.0_01 still does not have them.

Cipher Block Chaining (CBC)

The last thing we needed before we can describe promised attacks is cipher block chaining (CBC) operation mode. This operation mode is sufficiently secure against passive attacker, reasonably fast and easy to implement.

Unfortunately, it also is vulnerable to active attacks.

Basics
CBC is used to encrypt a long message with block cipher. Block ciphers are able to encrypt only short blocks of data, so it starts by splitting the message into short blocks.

First and last blocks are special cases. We will explain what to do with them in following subchapters. For now, assume that the message beginning is already encrypted and its i-th block mi corresponds to ciphertext ci.

Encrypt the next message block in two steps:
  • xor the block with ciphertext of the previous block (e.g. mi⊕ci-1),
  • encrypt the result with a block cipher (e.g. Blowfish(key, mi⊕ci-1)).

Example: suppose that the secret message has three blocks and we are trying to encrypt it with Blowfish-CBC. The first block is already encrypted and its ciphertext is 1, 2, 3, 4, 5, 6, 7, 8. The second block is a byte array 1, 0, 1, 0, 1, 0, 1, 0.

Step 1: xor the first block ciphertext with the second block:
1, 2, 3, 4, 5, 6, 7, 8 ⊕ 1, 0, 1, 0, 1, 0, 1, 0 = 0, 2, 2, 4, 4, 6, 6, 8

Step 2: encrypt the result with blowfish algorithm:
Blowfish(secret_key, {0, 2, 2, 4, 4, 6, 6, 8})

First Block - Initialization Vector
First block has no previous block to be combined with. Therefore, we will generate a block of random data called initialization vector. The initialization vector is used as a very first block of data. It is xor-ed with the first message block and the result is encrypted with a block cipher.

Initialization vector is send unencrypted as a first block of the ciphertext. The recipient would be unable to decrypt the ciphertext without it and there is no reason to keep it secret.

Example: suppose that the secret message has three blocks and we are trying to encrypt it with Blowfish-CBC. The first block is a byte array 1, 1, 1, 1, 1, 1, 1, 1.

Step 1: generate random initialization vector:
1, 8, 2, 7, 3, 6, 4, 5

Step 2: xor the first block with the initialization vector:
1, 8, 2, 7, 3, 6, 4, 5 ⊕ 1, 1, 1, 1, 1, 1, 1, 1 = 0, 9, 3, 6, 2, 7, 5, 4

Step 3: encrypt the result with blowfish algorithm:
Blowfish(secret_key, {0, 9, 3, 6, 2, 7, 5, 4})

Step 4: combine initialization vector and ciphertext. If the result of Blowfish function in previous step is 1, 2, 3, 4, 5, 6, 7, 8, then the ciphertext is:
1, 8, 2, 7, 3, 6, 4, 5, 1, 2, 3, 4, 5, 6, 7, 8

Last Block - Padding
Block ciphers are able to encrypt messages of fixed length and last block is often shorter than that. As the cipher is unable to encrypt it, we need a way to add additional bytes to the end of the message.

Shiro uses PKCS#5 padding by default. Each its byte is equal to the length of the padding:
  • If the last block is too short, count how many bytes are missing and fill missing bytes with that number.
  • If the last block has the right size, treat the message as if it would be missing whole block. Add a new padding block to it. Each its byte will be equal to the block size.

Example 1: suppose that the secret message has three blocks and we are trying to encrypt it with Blowfish-CBC. The last block is byte array 1, 1, 1, 1. Padded block:
1, 1, 1, 1, 4, 4, 4, 4

Example 2: suppose that the secret message has three blocks and we are trying to encrypt it with Blowfish-CBC. The last block is byte array 8, 7, 6, 5, 4, 3, 2, 1. Last block and padding:
8, 7, 6, 5, 4, 3, 2, 1, 8, 8, 8, 8, 8, 8, 8, 8

Attacks

Finally, we are ready to show two different attacks on CBC based ciphers and prove that the problem is real. Our sample project contains tests cases for both attacks on both AES-CBC and Blowfish-CBC ciphers.

First subchapter shows how to change the beginning of encrypted text to any message of our choice. Second subchapter explains how to decrypt the ciphertext.

Data Tampering
Data tampering attack is possible only if the attacker already knows the content of encrypted message. Such attacker can change first message block to whatever he wishes to.

In particular, it is possible to:
  • change first 16 bytes of AES-CBC encrypted message,
  • change first 8 bytes of Blowfish-CBC encrypted message.

Potential Danger
Whether this type of attack is dangerous depends a lot on circumstances.

If you use the cipher to send password through network, then data tampering is not so dangerous. At worst, a legitimate user will get login denied. Similarly, if your encrypted data are stored on some read-only storage, then you do not have to worry about data tampering.

However, if you are sending bank order through the network, data tampering is a real threat. If someone changes the message Pay Mark 100$ into Pay Tom 9999$, Tom will get 9999$ he should not get.

The Attack
Encrypted message has three parts:
  • random initial vector,
  • first block of ciphertext,
  • the rest of the message.

The recipient decrypts the first block of ciphertext with the block cipher. This gives him message block⊕initial vector. To get the message, he has to xor this value with the initial vector:
message block ⊕ initial vector ⊕ initial vector = message block

The initial vector is transferred together with the message and an active attacker can change it. If the attacker replaces the original initial vector with another iv, then the recipient will decrypt another message:
message block ⊕ initial vector ⊕ another iv = another message

Rearrange the previous equation and you get:
another iv = message block ⊕ initial vector ⊕ another message

If the attacker knows both initial vector and content of the encrypted message, then he can modify the message to anything he wants. All he has to do is to xor the original initial vector with both known message content and desired message.

Whoever decrypts the modified ciphertext will obtain a modified message instead of the original one.

Test Case
We created a simple test case that demonstrate this attack on a message encrypted with AES-CBC.

Imagine that an attacker captured an encrypted email and somehow knows what is in it:
//original message
private static final String EMAIL = "Hi,\n" +
  "send Martin all requested money please.\n\n" +
  "With Regards, \n" +
  "Accounting\n";

The attacker can change only first 16 bytes of the message, so he decides to redirect the money to Andrea:
//changed message
private static final String MODIFICATION = "Hi,\n" +
  "give Andrea all requested money please.\n\n" +
  "With Regards, \n" +
  "Accounting\n";

Following test case encrypts the message and modifies the ciphertext. Modified ciphertext is decrypted and compared to the expected message:
@Test
public void testModifiedMessage_AES() {
  //create cipher and the secret key 
  StringCipherService cipher = 
      new StringCipherService(new AesCipherService());
  byte[] key = cipher.generateNewKey();

  //encrypt the message
  byte[] ciphertext = cipher.encrypt(EMAIL, key);

  //attack: modify the encrypted message
  for (int i = 0; i < 16; i++) {
    ciphertext[i] = (byte)(ciphertext[i] ^ 
        MODIFICATION.getBytes()[i] ^ 
        EMAIL.getBytes()[i]);
  }

  //decrypt and verify
  String result = cipher.decrypt(ciphertext, key);
  assertEquals(MODIFICATION, result);
}

Of course, similar attack can be done on Blowfish-CBC. We can change only first 8 bytes this time:
@Test
public void testModifiedMessage_Blowfish() {
  String email = "Pay 100 dollars to them, but nothing more. Accounting\n";
  
  StringCipherService cipher = 
    new StringCipherService(new BlowfishCipherService());
  byte[] key = cipher.generateNewKey();
  byte[] ciphertext = cipher.encrypt(email, key);

  String modified = "Pay 900 dollars to them, but nothing more. Accounting\n";
  
  for (int i = 0; i < 8; i++) {
    ciphertext[i] = (byte)(ciphertext[i] ^ 
        modified.getBytes()[i] ^ 
        email.getBytes()[i]);
  }
  String result = cipher.decrypt(ciphertext, key);
  assertEquals(modified, result);
}

Decrypt the Cipher
The second attack allows an attacker to decrypt the secret message. The attack is possible only if the application that decrypts secret messages cooperates with the attacker.

Padding Oracle
The attacker creates a lot of fake ciphertexts and sends them to the recipient. As he tries to decrypt those messages, one of these things will happen:
  • the ciphertext decrypts to meaningless garbage,
  • modified message will not be valid ciphertext at all.

If the application behaves the same way in both cases, then everything is ok. If it behaves differently, then the attacker is able to decrypt the ciphertext. There is only one way how the CBC based ciphertext can be incorrect - if the padding is wrong.

For example, if ciphertext contains an encrypted password, the vulnerable server may respond with "login denied" in case of wrong decrypted password and with runtime exception in case of invalid ciphertext. If this is the case, then the attacker can recover the password.

General Idea
Each fake message has two parts: a fake initial vector and one message block. Both are sent to the server. If it answers "padding right", then we know that:
message ⊕ original iv ⊕ fake iv = valid padding

The only unknown variable in the above equation is the message. The original iv is previous ciphertext block, fake iv was created by us and the valid padding is one of 1 or 2, 2 or 3, 3, 3 or ... or 8, 8, ..., 8 and so on.

Therefore, we can calculate the block content as:
message = valid padding ⊕ original iv ⊕ fake iv 

Algorithm
Start by recovering the last block byte. Each fake initial vectors starts with a lot of 0 and ends with a different last byte. This way, we can be almost sure that the server answers "padding right" only on a message that ends with 1. Use the previous chapter equation to calculate the last block byte.

Getting the second last byte of the message is very similar. The only difference is that we have to craft a ciphertext that decrypts into the second shortest padding 2, 2. The last byte of the message is already known, so enforcing 2 as the last value is easy. The beginning of the initial vector is unimportant and set that to 0.

Then we try all possible values for the second last byte of the initial vector. Once the server answers "padding right", we can get the second last message byte from the same formula as before: original iv ⊕ fake iv ⊕ 2.

We calculate third last message byte out of fake message with padding 3, 3, 3; fourth out of message with padding 4, 4, 4, 4; and so on until the whole block is decrypted.

Test Case
The vulnerable server is simulated with a PaddingOraculum class. Each instance of this class generates a random secret key and keeps it private. It exposes only two public methods:
  • byte[] encrypt(String message) - encrypts a string with secret key,
  • boolean verifyPadding(byte[] ciphertext) - returns whether the padding is right.

The padding oraculum attack is implemented in decryptLastBlock method. The method decrypts last block of encrypted message:
private String decryptLastBlock(PaddingOraculum oraculum, byte[] ciphertext) {
  // extract relevant part of the ciphertext
  byte[] ivAndBlock = getLastTwoBlocks(ciphertext, 
      oraculum.getBlockSize());
  // modified initial vector
  byte[] ivMod = new byte[oraculum.getBlockSize()];
  Arrays.fill(ivMod, (byte) 0);

  // Start with last byte of the last block and 
  // continue to the first byte.
  for (int i = oraculum.getBlockSize()-1; i >= 0; i--) {
    // add padding to the initial vector    
    int expectedPadding = oraculum.getBlockSize() - i;
    xorPad(ivMod, expectedPadding);

    // loop through possible values of ivModification[i]
    for (ivMod[i] =  -128; ivMod[i] <  127; ivMod[i]++) {
      // create fake message and verify its padding
      byte[] modifiedCiphertext = replaceBeginning(ivAndBlock, ivMod);
      if (oraculum.verifyPadding(modifiedCiphertext)) {
        // we can stop looping
        // the ivModification[i] =
        //    = solution ^ expectedPadding ^ ivAndBlock[i]
        break;
      }
    }

    // remove the padding from the initial vector
    xorPad(ivMod, expectedPadding);
  }

  // modified initial vector now contains the solution xor 
  // original initial vector
  String result = "";
  for (int i = 0; i < ivMod.length; i++) {
    ivMod[i] = (byte) (ivMod[i] ^ ivAndBlock[i]);
    result += (char) ivMod[i];
  }
  return result;
}

Our sample project contains two test cases. One encrypts message with AES-CBC and then uses the padding oraculum to the last block of the ciphertext. The other do the same thing with Blowfish-CBC.

Decrypt Blowfish-CBC test case:
@Test
public void testPaddingOracle_Blowfish() {
  String message = "secret message!";

  PaddingOraculum oraculum = new PaddingOraculum(
      new BlowfishCipherService());
  //Oraculum encrypts the message with a secret key.
  byte[] ciphertext = oraculum.encrypt(message);
  //use oraculum to decrypt the message
  String result = decryptLastBlock(oraculum, ciphertext);
  
  //the original message had padding 1
  assertEquals("essage!"+(char)1, result);
}

Resources

Additional related resources:

End

No cipher provides absolute safety against all possible attacks. Instead, they provide protection only against well defined classes of attacks. Ciphers are secure only as long as the potential threat to the system matches the cipher strength.

Protection against an active attack can be done in two ways:
  • Make an active attack impossible by design.
  • Use an authenticated encryption.

Using an authenticated encryption is arguably easier and should be the preferred option. Ruling out the active attacker is error prone and more risky.

All code sample used in this post are available on Github.

12 comments:

Anonymous said...

When using the first test case under JUnit I get:

testGCMAuthentication(Security.GCMAuthenticationTest): Unable to acquire a Java JCA Cipher instance using javax.crypto.Cipher.getInstance( "AES/GCM/NoPadding" ). AES under this configuration is required for the Security.GCMCipherService instance to function.

Meri said...

Hi,

I updated the project on Github. Please, try again whether it works.

The change: I added a dependency on junit 4 into pom.xml. As it was not listed as maven dependency, the IDE/system used whatever junit version was default.

The BouncyCastle initialization is in @BeforeClass method which runs only under junit 4.

Anonymous said...

Thanks!

Good luck on the crypto class exam ;)

Meri said...

Hi Anonynous,

thank you :), good luck to you too and see you in fall :))

thanhvinh said...

Hi Meri,
Thanks for informative tutorials/articles.

If you write a security book, I'll buy it.

I'm writing JSF 2 application with login and registration form(name + address + SSN).

Can you or someone please share or point me to the real world examples - steps-by-steps on how to use Shiro to secure sensitive data in jsf app?

Any help is much appreciated! Thank you.

Vinh

Meri said...

Hi Unknown,

thank you for the compliments :).

As far as basics go, there is nothing special about Shiro in JSF project. You can use it pretty much the same way as in any other java project. The first part of the Shiro tutorial explains how to do that.

The above has one exception: Shiro JSP tags are not compatible with JSF. Fortunately, Deluan rewrote Shiro tags into JSF tags and released the result as open source. The library is explained on his blog.

Good luck with your project,
Meri

Unknown said...

Hi Meri,
thank you for this great post. Very useful and helpful to understand this new operation mode : GCM.

I've run perfectly your first example, but when I use stream instead byte[] I loose integrity checking Encrypt and decrypt operation are still running without exception.
Have you an idea.
Thank you for your help,
Pierrick

Meri said...

@Pierrick MOLERA That is interesting. It may be either some trap I do not know about or a bug.

Did you raised the issue with Apache Shiro or Bouncy Castle? If it is a bug they should be told and if it is a trap they will be able to explain it.

Amirah said...

Hey! Great information here on this post. Continue writing

Nathan said...

Thankyou for publishing a wonderful stories. It just great! Thanks

Terry said...

Extremely helpful info. I love this article of yours. best of luck.

Russell said...

Definitely a good blog, Thank you for sharing. Keep on blogging

Post a Comment