7.17.1 Problem
You want to represent cryptographic data such as public keys or certificates in a plaintext format, so that you can use it in protocols that don't accept arbitrary binary data. This may include storing an encrypted version of a private key.
7.17.2 Solution
The PEM format represents DER-encoded data in a printable format. Traditionally, PEM encoding simply base64-encodes DER-encoded data and adds a simple header and footer. OpenSSL provides an API for such functionality that handles the DER encoding and header writing for you.
OpenSSL has introduced extensions for using encrypted DER representations, allowing you to use PEM to store encrypted private keys and other cryptographic data in ASCII format.
7.17.3 Discussion
Privacy Enhanced Mail (PEM) is the original encrypted email standard. Although the standard is long dead, a small subset of its encoding mechanism has managed to survive.
Today, PEM-encoded data is usually just DER-encoded data with a header and footer. The header is a single line consisting of five dashes followed by the word "BEGIN", followed by anything. The data following the word "BEGIN" is not really standardized. In some cases, there might not be anything following this word. However, if you are using the OpenSSL PEM outputting routines, there is a textual description of the type of data object encoded. For example, OpenSSL produces the following header line for an RSA private key:
-----BEGIN RSA PRIVATE KEY-----
This is a good convention, and one that is widely used.
The footer has the same format, except that "BEGIN" is replaced with "END". You should expect that anything could follow. Again, OpenSSL uses a textual description of the content.
In between the two lines is a base64-encoded DER representation, which may contain line breaks (\r\n, often called CRLFs for "carriage return and line feed"), which get ignored. We cover base64 in Recipe 4.5 and Recipe 4.6, and DER encoding in Recipe 7.16.
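As a rough illustration of the header convention described above, the following sketch pulls the type description out of a BEGIN line. The function name pem_begin_label( ) is hypothetical and is not part of OpenSSL; the sketch assumes the line ends with five dashes, as OpenSSL's output does, and accepts a header with nothing after "BEGIN".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an OpenSSL function): copies the text between
 * "-----BEGIN" and the trailing "-----" of a PEM header line into label.
 * Returns 1 on success, 0 if the line is not a BEGIN line or label is
 * too small. An empty label is allowed. */
int pem_begin_label(const char *line, char *label, size_t label_size)
{
    static const char prefix[] = "-----BEGIN";
    static const char suffix[] = "-----";
    const size_t prefix_len = sizeof(prefix) - 1;
    const size_t suffix_len = sizeof(suffix) - 1;
    size_t line_len = strlen(line);
    const char *start;
    size_t label_len;

    /* Ignore a trailing CRLF or LF. */
    while (line_len > 0 &&
           (line[line_len - 1] == '\n' || line[line_len - 1] == '\r'))
        line_len--;

    if (line_len < prefix_len + suffix_len) return 0;
    if (memcmp(line, prefix, prefix_len) != 0) return 0;
    if (memcmp(line + line_len - suffix_len, suffix, suffix_len) != 0) return 0;

    start = line + prefix_len;
    label_len = line_len - prefix_len - suffix_len;
    if (label_len > 0) {
        /* A description, if present, is separated from BEGIN by a space. */
        if (*start != ' ') return 0;
        start++;
        label_len--;
    }

    if (label_len + 1 > label_size) return 0;
    memcpy(label, start, label_len);
    label[label_len] = '\0';
    return 1;
}
```

Given the header line shown earlier, this would store "RSA PRIVATE KEY" in label. The same check with "END" in place of "BEGIN" would apply to the footer.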
If you want to encrypt a DER object, the original PEM format supported that as well, but no one uses these extensions today. OpenSSL does implement something similar. First, we'll describe what OpenSSL does, because this will offer compatibility with applications built with OpenSSL that use this format, most notably Apache with mod_ssl. Next, we'll demonstrate how to use OpenSSL's PEM API directly.
We'll explain this format by walking through an example. Here's a PEM-encoded, encrypted RSA private key:
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,F2D4E6438DBD4EA8

LjKQ2r1Yt9foxbHdLKZeClqZuzN7PoEmy+b+dKq9qibaH4pRcwATuWt4/Jzl6y85
NHM6CM4bOV1MHkyD01tFsT4kJ0GwRPg4tKAiTNjE4Yrz9V3rESiQKridtXMOToEp
Mj2nSvVKRSNEeG33GNIYUeMfSSc3oTmZVOlHNp9f8LEYWNmIjfzlHExvgJaPrixX
QiPGJ6K05kV5FJWRPET9vI+kyouAm6DBcyAhmR80NYRvaBbXGM/MxBgQ7koFVaI5
zoJ/NBdEIMdHNUh0h11GQCXAQXOSL6Fx2hRdcicm6j1CPd3AFrTt9EATmd4Hj+D4
91jDYXElALfdSbiO0A9Mz6USUepTXwlfVV/cbBpLRz5Rqnyg2EwI2tZRU+E+Cusb
/b6hcuWyzva895YMUCSyDaLgSsIqRWmXxQV1W2bAgRbs8jD8VF+G9w==
-----END RSA PRIVATE KEY-----
The first line is as discussed at the beginning of this section. Table 7-4 lists the most useful values for the data type specified in the first and last line. Other values can be found in openssl/pem.h.
Table 7-4. PEM header types
| Header type | Description |
|---|---|
| RSA PUBLIC KEY | |
| RSA PRIVATE KEY | |
| DSA PUBLIC KEY | |
| DSA PRIVATE KEY | |
| DH PARAMETERS | Parameters for Diffie-Hellman key exchange |
| CERTIFICATE | An X.509 digital certificate |
| TRUSTED CERTIFICATE | A fully trusted X.509 digital certificate |
| CERTIFICATE REQUEST | A PKCS #10 certificate signing request |
| X509 CRL | An X.509 certificate revocation list |
| SSL SESSION PARAMETERS | |
The header line is followed by three lines (two headers and a blank line) that look like a block of MIME headers. Do not treat them as MIME headers, though. Yes, the base64-encoded text is separated from the header information by a line with nothing on it (two CRLFs). However, you should assume that there is no real flexibility in the headers. You should have either the two headers that are there, or nothing (and if you're not including headers, be sure to remove the blank line). In addition, the headers should be in the order shown above, and they should have the same comma-separated fields.
As far as we can determine, the second line must appear exactly as shown above for OpenSSL compatibility. There's some logic in OpenSSL to handle two other options that would add an integrity-checking value to the data being encoded, but it appears that the OpenSSL team never actually finished a full implementation, so these other options aren't used (it's left over from a time when the OpenSSL implementers were concerned about compliance with the original PEM RFCs). The first parameter on the "DEK-Info" line (where DEK stands for "data encrypting key") contains an ASCII representation of the algorithm used for encryption, which should always be a CBC-based mode. Table 7-5 lists the identifiers OpenSSL currently supports.
Table 7-5. PEM encryption algorithms supported by OpenSSL
| Cipher | DEK-Info identifier |
|---|---|
| AES with 128-bit keys | AES-128-CBC |
| AES with 192-bit keys | AES-192-CBC |
| AES with 256-bit keys | AES-256-CBC |
| Blowfish | BF-CBC |
| CAST5 | CAST-CBC |
| DES | DES-CBC |
| DESX | DESX |
| 2-key Triple-DES | DES-EDE-CBC |
| 3-key Triple-DES | DES-EDE3-CBC |
| IDEA | IDEA-CBC |
| RC2 | RC2-CBC |
| RC5 with 128-bit keys and 12 rounds | RC5-CBC |
The part of the DEK-Info field after the comma is a CBC initialization vector (which should be randomly generated), represented in uppercase hexadecimal.
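To make the layout of the DEK-Info line concrete, here is a minimal sketch that splits it into the algorithm identifier and the raw initialization vector. The function parse_dek_info( ) is a hypothetical helper, not an OpenSSL call, and it accepts only uppercase hexadecimal digits, matching the description above.

```c
#include <stddef.h>
#include <string.h>

/* Converts one uppercase hexadecimal digit to its value, or -1. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper (not an OpenSSL function): parses a line of the form
 * "DEK-Info: <identifier>,<hex IV>". The identifier is copied into cipher
 * and the decoded IV into iv. Returns the IV length in bytes, or -1 if the
 * line is malformed or an output buffer is too small. */
int parse_dek_info(const char *line, char *cipher, size_t cipher_size,
                   unsigned char *iv, size_t iv_size)
{
    static const char tag[] = "DEK-Info: ";
    const char *name, *comma, *hex;
    size_t name_len, hex_len, i;

    if (strncmp(line, tag, sizeof(tag) - 1) != 0) return -1;
    name = line + sizeof(tag) - 1;
    comma = strchr(name, ',');
    if (comma == NULL) return -1;

    name_len = (size_t)(comma - name);
    if (name_len == 0 || name_len + 1 > cipher_size) return -1;
    memcpy(cipher, name, name_len);
    cipher[name_len] = '\0';

    hex = comma + 1;
    hex_len = strcspn(hex, "\r\n");          /* stop at the line ending */
    if (hex_len == 0 || hex_len % 2 != 0 || hex_len / 2 > iv_size) return -1;
    for (i = 0; i < hex_len / 2; i++) {
        int hi = hex_value(hex[2 * i]);
        int lo = hex_value(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        iv[i] = (unsigned char)((hi << 4) | lo);
    }
    return (int)(hex_len / 2);
}
```

For the example key shown earlier, this would yield the identifier DES-EDE3-CBC and an 8-byte initialization vector, which matches the 64-bit block size of Triple-DES.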
The way encrypted PEM representations work in OpenSSL is as follows:
1. The data is DER-encoded.
2. The data is encrypted using a key that isn't specified anywhere (i.e., it's not placed in the headers, for obvious reasons). Usually, the user must type in a password to derive an encryption key. (See Recipe 4.10.) The key-from-password functionality has the initialization vector double as a salt value, which is probably okay.
3. The encrypted data is base64-encoded.
The OpenSSL API for PEM encoding and decoding (include openssl/pem.h) only allows you to operate on FILE or OpenSSL BIO objects, which are the generic OpenSSL IO abstraction. If you need to output to memory, you can either use a memory BIO or get the DER representation and encode it by hand.
The BIO API and the FILE API are similar. The BIO API changes the name of each function in a predictable way, and the first argument to each function is a pointer to a BIO object instead of a FILE object. The object type on which you're operating is always the second argument to a PEM function when outputting PEM. When reading in data, pass in a pointer to a pointer to the encoded object. As with the DER functions described in Recipe 7.16, OpenSSL increments this pointer.
All of the PEM functions are highly regular. All the input functions and all the output functions take the same arguments and have the same signature, except that the second argument changes type based on the type of data object with which you're working. For example, the second argument to PEM_write_RSAPrivateKey( ) will be an RSA object pointer, whereas the second argument to PEM_write_DSAPrivateKey( ) will be a DSA object pointer.
We'll show you the API by demonstrating how to operate on RSA private keys. Then we'll provide a table that gives you the relevant functions for other data types.
Here's the signature for PEM_write_RSAPrivateKey( ):
int PEM_write_RSAPrivateKey(FILE *fp, RSA *obj, EVP_CIPHER *enc,
                            unsigned char *kstr, int klen,
                            pem_password_cb callback, void *cb_arg);
This function has the following arguments:
- fp
Pointer to the open file for output.
- obj
RSA object that is to be PEM-encoded.
- enc
Optional argument that, if not specified as NULL, is the EVP_CIPHER object for the symmetric encryption algorithm (see Recipe 5.17 for a list of possibilities) that will be used to encrypt the data before it is base64-encoded. It is a bad idea to use anything other than a CBC-based cipher.
- kstr
Buffer containing the password or passphrase from which the encryption key is derived. If the data is not encrypted, this argument should be specified as NULL. Even if the data is to be encrypted, this buffer may be specified as NULL, in which case the password or passphrase is obtained through the callback.
- klen
If the buffer is not specified as NULL, this specifies the length in bytes of the password or passphrase it contains. If the buffer is specified as NULL, this should be specified as 0.
- callback
If the data is to be encrypted and the key buffer is specified as NULL, this specifies a pointer to a function that will be called to obtain the password or passphrase used to derive the encryption key. It may be specified as NULL, in which case OpenSSL will query the user for the password or passphrase to use.
- cb_arg
If a callback function is specified to obtain the password or passphrase for key derivation, this application-specific value is passed directly to the callback function.
If encryption is desired, OpenSSL will use PKCS #5 Version 1.5 to derive an encryption key from a password. This is an earlier version of the algorithm described in Recipe 4.10.
This function will return 1 if the encoding is successful, 0 otherwise (for example, if the underlying file is not open for writing).
The type pem_password_cb is defined as follows:
typedef int (*pem_password_cb)(char *buf, int len, int rwflag, void *cb_arg);
It has the following arguments:
- buf
Buffer into which the password or passphrase is to be written.
- len
Length in bytes of the password or passphrase buffer.
- rwflag
Indicates whether the password is to be used for encryption or decryption. For encryption (when writing out data in PEM format), the argument will be 1; otherwise, it will be 0.
- cb_arg
This application-specific value is passed in from the final argument to the PEM encoding or decoding function that caused this callback to be made.
Make sure that you do not overflow buf when writing data into it!
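Your callback function is expected to return the length in bytes of the password it wrote into buf, because OpenSSL uses the return value as the password length. If it fails to obtain a password, it should return 0 or a negative value (current OpenSSL documentation specifies -1).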
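The following is a minimal sketch of such a callback, under the assumption that the application passes a NUL-terminated passphrase as cb_arg. The name passphrase_from_arg( ) is made up for illustration. Note that OpenSSL treats the callback's return value as the length of the passphrase, so the sketch returns the number of bytes written to buf and returns -1 on failure.

```c
#include <string.h>

/* Sketch of a callback with the same parameters as pem_password_cb.
 * Assumption: cb_arg points to a NUL-terminated passphrase supplied by
 * the application. Returns the passphrase length, or -1 on failure. */
int passphrase_from_arg(char *buf, int len, int rwflag, void *cb_arg)
{
    const char *passphrase = (const char *)cb_arg;
    size_t pass_len;

    (void)rwflag;   /* the same passphrase is used for reading and writing */

    if (buf == NULL || len <= 0 || passphrase == NULL) return -1;

    pass_len = strlen(passphrase);
    if (pass_len > (size_t)len) return -1;   /* would overflow buf */

    /* The passphrase is copied without its terminating NUL byte. */
    memcpy(buf, passphrase, pass_len);
    return (int)pass_len;
}
```

A callback like this would be passed as the callback argument of a PEM function, with the passphrase string as cb_arg. An interactive program would more likely prompt the user here, which is why rwflag is available: when writing, it is common to ask for the passphrase twice.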
The function for writing an RSA private key to a BIO object has the following signature, which is essentially the same as the function for writing an RSA private key to a FILE object. The only difference is that the first argument is the BIO object to write to instead of a FILE object.
int PEM_write_bio_RSAPrivateKey(BIO *bio, RSA *obj, EVP_CIPHER *enc,
                                unsigned char *kstr, int klen,
                                pem_password_cb callback, void *cb_arg);
Table 7-6 lists the FILE object-based functions for the most useful PEM-encoding variants. The BIO object-based functions can be derived by inserting _bio after read or write (for example, PEM_write_RSAPrivateKey( ) becomes PEM_write_bio_RSAPrivateKey( )).
Table 7-6. FILE object-based functions for PEM encoding
| Kind of object | Object type | FILE object encoding function | FILE object decoding function |
|---|---|---|---|
| RSA public key | RSA | PEM_write_RSAPublicKey() | PEM_read_RSAPublicKey() |
| RSA private key | RSA | PEM_write_RSAPrivateKey() | PEM_read_RSAPrivateKey() |
| Diffie-Hellman parameters | DH | PEM_write_DHparams() | PEM_read_DHparams() |
| DSA parameters | DSA | PEM_write_DSAparams() | PEM_read_DSAparams() |
| DSA public key | DSA | PEM_write_DSA_PUBKEY() | PEM_read_DSA_PUBKEY() |
| DSA private key | DSA | PEM_write_DSAPrivateKey() | PEM_read_DSAPrivateKey() |
| X.509 certificate | X509 | PEM_write_X509() | PEM_read_X509() |
| X.509 CRL | X509_CRL | PEM_write_X509_CRL() | PEM_read_X509_CRL() |
| PKCS #10 certificate signing request | X509_REQ | PEM_write_X509_REQ() | PEM_read_X509_REQ() |
| PKCS #7 container | PKCS7 | PEM_write_PKCS7() | PEM_read_PKCS7() |
The last two rows enumerate calls that are intended for people implementing actual infrastructure for a PKI, and they will not generally be of interest to the average developer applying cryptography.