
SSL Distilled

A few days ago, my wife asked me: "why is it, sometimes when I access my web-mail account at work, I get a scary browser popup threatening and yelling at me that the site is dangerous and I should click 'OK'...?".
I am pretty sure this post will answer her question.

SSL without Sweating

Although SSL is very popular and well documented in many web articles, blogs, books & technical videos, when you try to really understand it and assemble all the bits and bytes, you will probably find yourself breaking a sweat, because of the many misconceptions about this subject.

In this article I am going to give a brief top-to-bottom review of the must-know parts of SSL and try to re-explain the less familiar parts of this protocol.

In order to make it easy for those of you that are familiar with SSL and only want to read the 'Hot' parts, I added an '*' as prefix to those points, so you guys can jump between them.

What is SSL?

SSL is a protocol widely used on the Internet and on Intranet networks in order to provide:

  • A secure way of communicating messages between two ends of a network connection, by providing the following:
    • Encryption: provides confidentiality by allowing only the intended recipient to read the message
    • Integrity: ensures that a message sent by one side is received intact & untampered by the other side
  • Authentication: verifies the identity of one or both ends of the communication

NOTE:
Although not strictly required by the protocol, at least as far as Web architectures go, the Certificate owner is usually the server, which needs to authenticate itself to the Client - typically a Browser. For the sake of simplicity, in this article I will call the Certificate Owner 'Server' and the Certificate receiver 'Client'.

Certificates

To accomplish these goals, the SSL protocol makes use of Certificates. Certificates are digital files that are either provided and signed by a Certificate Authority (from now on called 'CA') or generated and signed by you or one of your colleagues.

Certificates include both the Certificate owner's information and encryption information, in order to authenticate the owner and secure all types of messages. The most important Certificate information is listed below:

  • Email
  • Name
  • Expiration Date
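  • Common Name (CN) - The host name (domain) of the Certificate Owner
  • Public key - The matching Private key is never part of the Certificate; it stays with the Certificate Owner
  • Signature - A hash string generated from the Certificate's info and encrypted with the signer's private key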
  • Issued By - The identifier string of the Certificate that signed this Certificate
  • Issued To - The identifier string of the entity to which the signed Certificate was published

All of this information is validated by the client. For example, if the Expiration Date has passed, or if the CN value does not match the host name of the site, the client will reject the server's Certificate.
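To make these two checks concrete, here is a minimal sketch in Python. It assumes the Certificate has already been parsed into the dictionary shape that Python's `ssl` module returns from `getpeercert()`. The `certificate_problems` helper and the `sample_cert` values are made up for this example. Modern clients also look at the Subject Alternative Name field and support wildcards, which this sketch ignores.

```python
import ssl
import time


def certificate_problems(cert, hostname, now=None):
    """Return the reasons to reject `cert`; an empty list means it passed."""
    now = time.time() if now is None else now
    problems = []

    # 'notAfter' holds the Expiration Date, e.g. "Jun  1 12:00:00 2030 GMT".
    if now > ssl.cert_time_to_seconds(cert["notAfter"]):
        problems.append("expired")

    # 'subject' is a tuple of RDNs, each one a tuple of (field, value) pairs.
    common_names = [
        value
        for rdn in cert["subject"]
        for field, value in rdn
        if field == "commonName"
    ]
    if hostname.lower() not in (name.lower() for name in common_names):
        problems.append("common name mismatch")

    return problems


# Hypothetical Certificate, hand-written for this example.
sample_cert = {
    "subject": ((("commonName", "mail.example.com"),),),
    "notAfter": "Jun  1 12:00:00 2030 GMT",
}

print(certificate_problems(sample_cert, "mail.example.com"))
print(certificate_problems(sample_cert, "evil.example.com"))
```

Passing `now` explicitly makes it easy to see how the same Certificate behaves before and after its Expiration Date.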

Public & Private keys

The topic of Public & Private Key is widely documented, so I will only skim through it:

Public/Private keys are a popular cryptographic way to secure content between two ends of a communication.
The private key is always kept secret and is never transmitted over the wire, whereas the public key is public and can be distributed to clients. Both keys can perform one of two actions: Encrypt or Decrypt data, but neither of them can perform both on the same message. In other words, one encrypts and the other decrypts.

An asymmetric algorithm is used to make it computationally infeasible for anyone to derive the private key based only on their knowledge of the public key. For more info, read this wiki article.
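If you prefer to see it in numbers, below is a toy sketch of the "one encrypts, the other decrypts" idea, using textbook RSA with tiny primes in Python. The primes, the `apply_key` helper and the message value are chosen only for illustration. Real keys are hundreds of digits long and real implementations add padding, so please don't treat this as usable cryptography.

```python
# Toy RSA key pair built from tiny primes (illustration only).
p, q = 61, 53
n = p * q                    # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent  -> public key is (e, n)
d = pow(e, -1, phi)          # private exponent -> private key is (d, n)


def apply_key(value, exponent, modulus):
    """Encrypting and decrypting are the same operation with different keys."""
    return pow(value, exponent, modulus)


message = 65                 # any number smaller than n

# Direction 1: encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(message, e, n)
recovered = apply_key(ciphertext, d, n)

# Direction 2: encrypt with the private key, decrypt with the public key.
# This reversed direction is what digital signatures rely on.
signed = apply_key(message, d, n)
checked = apply_key(signed, e, n)

print(recovered == message, checked == message)
```

Deriving `d` here is easy only because `p` and `q` are known. With real key sizes, recovering them from `n` alone is what is considered infeasible.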

Certificates Handshake Steps

An SSL handshake is the first step in an SSL session. This step enables negotiation of session information between the client and the server:

NOTE:
The list below is a popular description of the SSL Handshake steps. I retrieved it from this site and added my comments.

  1. The client sends a Hello message to the server.
    The message includes a list of algorithms supported by the client and a random number that will be used to generate the keys.
  2. The server responds by sending a Hello message to the client. This message includes:
    • The algorithm to use. The server selected this from the list sent by the client.
    • A random number, which will be used for generating the keys.
  3. The server sends its Certificate to the client. The Certificate includes only the Public key (never the Private key) and carries a signature created with the private key of the Certificate that issued it.
  4. * The client authenticates the server by validating the Certificate and every Certificate in its chain, up to the self-signed root Certificate: for each one, it decrypts the signature with the issuer's public key, re-calculates the hash of the Certificate's info and compares the two. The root's public key is retrieved from the pre-installed Certificate store (take a breath, we will look into it later on...)
  5. The client generates a random value called: 'pre-master secret', encrypts it using the server's public key, and sends it back to the server.
  6. The server uses its private key for decrypting the message to retrieve the pre-master secret.
  7. The client and server separately calculate the keys which will be used in the SSL session.
    These keys are never transmitted, because each side calculates them from the pre-master secret and the random numbers, which are known to both sides.
    The keys include:

    • Encryption key that the client uses to encrypt data before sending it to the server
    • Encryption key that the server uses to encrypt data before sending it to the client
    • Key that the client uses to create a message digest of the data
    • Key that the server uses to create a message digest of the data

      * Important: This point highlights the fact that the public/private keys are relevant only for negotiating some data in the early stages of SSL communication. Afterwards, the messages are always encrypted/decrypted with these calculated keys.

  8. The encryption keys are symmetric (no more asymmetric keys), that is, the same key is used to encrypt and decrypt the data.
  9. The client and server send a Finished message to each other. These are the first messages that are sent using the keys generated in the previous step (the first "secure" messages).
  10. The Finished message includes a hash of all the previous handshake messages each side sent and received. Each side verifies that this hash matches the handshake messages it saw itself. This checks that the handshake messages were not tampered with.
  11. The client and server now transfer data using the encryption and hashing keys and algorithms.
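Step 7 is the one people usually find hardest to picture, so here is a simplified sketch of it in Python. Both sides feed the same three values into the same function and end up with the same four keys, without ever sending a key over the wire. The HMAC-SHA256 construction, the labels and the `derive_session_keys` name are my own simplification and are NOT the key derivation function the protocol actually specifies.

```python
import hashlib
import hmac
import os

# Labels for the four keys listed in step 7 (names invented for this sketch).
KEY_LABELS = (
    "client encryption key",
    "server encryption key",
    "client digest key",
    "server digest key",
)


def derive_session_keys(pre_master_secret, client_random, server_random):
    """Derive the session keys from values both sides already hold."""
    seed = client_random + server_random
    return {
        label: hmac.new(pre_master_secret, label.encode() + seed, hashlib.sha256).digest()
        for label in KEY_LABELS
    }


# Steps 1 and 2: each side contributes a random number, sent in the clear.
client_random = os.urandom(32)
server_random = os.urandom(32)

# Steps 5 and 6: the client creates the pre-master secret and the server
# recovers it with its private key, so both sides now hold the same bytes.
pre_master_secret = os.urandom(48)

# Step 7: each side runs the same calculation on its own.
client_keys = derive_session_keys(pre_master_secret, client_random, server_random)
server_keys = derive_session_keys(pre_master_secret, client_random, server_random)

print(client_keys == server_keys)
```

An eavesdropper sees both random numbers, but without the pre-master secret the calculation cannot be reproduced.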

SSL Handshake Diagram

Integrity via Digital Signature

As illustrated above, in the same way that data may be encrypted with a public key and decrypted with its corresponding private key, the reverse process also applies: data encrypted with a private key may be decrypted with the corresponding public key.
This property of keys is used to ensure the integrity of a digital Certificate in a process called "Digital Signature".

Hashing algorithms such as SHA1 or MD5 (although no longer considered secure, MD5 is still used in many industry systems) process all the bytes of a message and produce a hash string. The server encrypts the hash string with its private key and transmits the message followed by the encrypted hash string.
The client decrypts the hash string with the server's public key and re-calculates the hash of the message it received. If the client's result is different from the one that was sent, the client concludes that the message was corrupted or tampered with in transit.

Now a man-in-the-middle can't just change the message and create a matching hash string: the client decrypts the hash string with the public key it received from the server, so a valid one can only be produced with the private key, which the man-in-the-middle doesn't have.
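The hash-then-encrypt flow can be sketched in a few lines of Python. This is a toy: the key pair is built from two small, well-known Mersenne primes, the hash is SHA-256 rather than SHA1/MD5, and the `sign`/`verify` helpers are names I made up. Real signatures use much larger keys and a padding scheme.

```python
import hashlib

# Toy RSA key pair; both factors are Mersenne primes (illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent, kept by the server


def digest(message: bytes) -> int:
    """Hash all the bytes of the message and map the result below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes, private_exponent: int = D) -> int:
    """Server side: encrypt the hash with the private key."""
    return pow(digest(message), private_exponent, N)


def verify(message: bytes, signature: int, public_exponent: int = E) -> bool:
    """Client side: decrypt the hash with the public key and compare."""
    return pow(signature, public_exponent, N) == digest(message)


original = b"transfer 10 coins to Leah"
signature = sign(original)

print(verify(original, signature))                          # untouched message
print(verify(b"transfer 9999 coins to Eve", signature))     # tampered message
```

The man-in-the-middle can change the message, but cannot produce a signature that decrypts to the new hash, because that requires `D`.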

* Double Signing Flow

Exactly as the SSL handshake messages sent between client and server are signed, Certificates must be authenticated by the Client as well. For this reason the Digital Signature process is performed on both levels in a typical SSL flow: the Certificate's own Public/Private keys are used for negotiating the hash string and the "pre-master secret", and the Certificate itself is signed with another Certificate's private key.

Certificates are signed in the same way messages are, only with another Certificate's private key.

  1. The Client follows the process above to verify the Certificate against the list of CA Certificates & public keys installed on its PC: for every Certificate in the chain, it decrypts the signature with the corresponding public key and tries to reproduce the exact same hash of the Certificate's info that the Certificate's signer produced.
  2. After the Certificate has been validated, the same process is applied to the signed messages, this time using the public key sent by the server.

* Self Signed vs Intermediary Signed vs Root CA

Another popular misconception is related to CA, Intermediary and Self Signed.

CA Certificates are Certificates which have been signed by a CA. Those Certificates can be Self-Signed or Intermediary Signed.

Intermediary Certificates are Certificates whose hash string was signed by another Certificate's private key, and whose own private key was used to sign the hash of a third Certificate. Meaning they are NOT Self-Signed. For more information see the "Chain of Trust" section.

The word "Self" in "Self-Signed" is not related to the entity who performed the signing action; it's not called "Self" because a developer signed the Certificate by himself.
Instead, "Self" means that the Certificate's hash was signed/encrypted with its own private key (the one that is usually used to sign other messages/Certificates), and not with the private key of another Certificate. Certificates signed manually by developers are simply Non-CA Certificates.
The result can be seen when reading the Certificate's Issued To and Issued By fields: when the Certificate is self-signed, both of them are equal.
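As a quick illustration of the Issued To / Issued By comparison, here is a small Python sketch. It assumes Certificates in the dictionary shape returned by Python's `ssl` module (`getpeercert()`), where Issued To is called `subject` and Issued By is called `issuer`. Both sample Certificates and the `looks_self_signed` helper are hypothetical. Equal names are only a hint: a strict check would also verify the signature with the Certificate's own public key.

```python
def looks_self_signed(cert):
    """True when the Issued To and Issued By fields are identical."""
    return cert["subject"] == cert["issuer"]


# Hypothetical root Certificate: it names itself as its own issuer.
root_cert = {
    "subject": ((("commonName", "Example Root CA"),),),
    "issuer": ((("commonName", "Example Root CA"),),),
}

# Hypothetical server Certificate: issued by a different Certificate.
server_cert = {
    "subject": ((("commonName", "mail.example.com"),),),
    "issuer": ((("commonName", "Example Intermediate CA"),),),
}

print(looks_self_signed(root_cert), looks_self_signed(server_cert))
```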

Self-Signed Certificates that belong to a CA are also called Root Certificates, as they don't have a 'parent' Certificate signer; they can be found in the Root Certificates folder in the store.

Non-CA Certificates are considered (an arguable statement) less secure, not because a CA makes use of some bigger secret or a more secure algorithm, but because a man-in-the-middle can just as easily sign a Certificate on behalf of the Server you are consuming.

In case a client encounters a self-signed Certificate that is not in its store, it must manually 'believe' the Certificate sender is actually who it claims to be.
The built-in/automatic authentication purpose of Certificates vanishes in this case, and the client must explicitly approve the server's identity. In the case of a web browser, the browser pops up a scary message (here is your response Leah'le. They probably just use a Non-CA Certificate...). In custom proprietary client scenarios, the programmer must intervene in the SSL Authentication Handshake and programmatically approve the Certificate owner's identity.
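One common way for a custom client to "programmatically approve" a Non-CA Certificate is to pin it: the programmer records the fingerprint of the Certificate they decided to trust, and the client accepts only that exact Certificate. The Python sketch below shows the comparison. The helper names and the placeholder bytes are made up for this example. In a real client the bytes would be the DER-encoded Certificate, for instance from `SSLSocket.getpeercert(binary_form=True)`.

```python
import hashlib
import hmac


def fingerprint(der_bytes: bytes) -> str:
    """SHA-256 fingerprint of a Certificate's raw bytes, as a hex string."""
    return hashlib.sha256(der_bytes).hexdigest()


def is_pinned_certificate(der_bytes: bytes, pinned_fingerprint: str) -> bool:
    """Accept the Certificate only if it is exactly the one we pinned."""
    return hmac.compare_digest(fingerprint(der_bytes), pinned_fingerprint)


# Placeholder bytes standing in for real DER-encoded Certificates.
trusted_cert = b"bytes of the self-signed Certificate we decided to trust"
unknown_cert = b"bytes of a Certificate presented by somebody else"

# Recorded once, out of band, by the programmer.
PINNED = fingerprint(trusted_cert)

print(is_pinned_certificate(trusted_cert, PINNED))
print(is_pinned_certificate(unknown_cert, PINNED))
```

This replaces the CA's verification with the programmer's own decision, which is exactly the manual 'belief' described above.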

When using a CA Certificate, the Certificate is authenticated via a third-party company who explicitly verifies the identity of the Certificate's applicant. Well known CA Certificates are already pre-installed on our PCs, so the client doesn't need to manually believe the sender's identity.
When the client receives a Certificate signed by a well known Authority (GoDaddy, VeriSign etc.) she can be confident that:

  • The Certificate's information was validated against the Certificate's applicant - Via phone, emails, documentation etc.
  • The Certificate sent from the Server was really provided and signed by the underlying CA - Otherwise the client would never have been capable of decrypting the hash string with the public key stored in the Certificate Store on its PC.

* Chain of Trust

One additional concern still needs to be resolved:
What happens if the pre-installed CA Certificate encounters an issue that compromises its key? For example: the CA may suspect that one of its keys was obtained by a malicious party, or that one of its keys was lost. In theory, all the Certificates must be re-signed and redelivered. To prevent this issue, a process called Chain of Trust is used.

The root CA's private key is stored in a highly secure location and is only used for signing a few Intermediary Certificates, which will, in turn, sign other Certificates.

Every Certificate may carry the Certificates of its issuers (the Certificates which signed it), aka the Certificate's chain. In the end, a list of nodes (Certificates) is created. This list is called a Trust Chain. Each of the nodes must be validated, up to the topmost Certificate - the CA Root.

In case of compromise, the Intermediates can be revoked quickly (or even instantly), without having to reconfigure every single machine to trust a new CA.

An amusing, visual explanation can be found in this nice post.
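Here is how walking the Trust Chain might look in code. This Python sketch uses toy RSA keys built from small Mersenne primes and a made-up `Certificate` class, so every name in it is hypothetical. It only demonstrates the idea: each Certificate is checked with the public key of the next one, the root is checked against the pre-installed store, and a revoked Intermediate breaks the chain.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class KeyPair:
    n: int   # modulus
    e: int   # public exponent
    d: int   # private exponent, never leaves its owner


def make_key_pair(p, q, e=65537):
    return KeyPair(n=p * q, e=e, d=pow(e, -1, (p - 1) * (q - 1)))


@dataclass
class Certificate:
    subject: str          # Issued To
    issuer: str           # Issued By
    public_key: tuple     # (e, n) of the subject
    signature: int = 0    # hash of the info, encrypted by the issuer

    def info_hash(self, modulus):
        info = f"{self.subject}|{self.issuer}|{self.public_key}".encode()
        return int.from_bytes(hashlib.sha256(info).digest(), "big") % modulus


def issue(subject, subject_keys, issuer_name, issuer_keys):
    """Create a Certificate for `subject`, signed with the issuer's private key."""
    cert = Certificate(subject, issuer_name, (subject_keys.e, subject_keys.n))
    cert.signature = pow(cert.info_hash(issuer_keys.n), issuer_keys.d, issuer_keys.n)
    return cert


def signed_by(cert, issuer_cert):
    e, n = issuer_cert.public_key
    return pow(cert.signature, e, n) == cert.info_hash(n)


def validate_chain(chain, trusted_roots, revoked=frozenset()):
    """chain[0] is the server's Certificate and chain[-1] is the root."""
    # Pair every Certificate with its issuer; the root is paired with itself.
    for cert, issuer_cert in zip(chain, chain[1:] + [chain[-1]]):
        if cert.subject in revoked or cert.issuer != issuer_cert.subject:
            return False
        if not signed_by(cert, issuer_cert):
            return False
    # The root itself must match a Certificate pre-installed on the machine.
    return trusted_roots.get(chain[-1].subject) == chain[-1].public_key


root_keys = make_key_pair(2**107 - 1, 2**127 - 1)
inter_keys = make_key_pair(2**61 - 1, 2**89 - 1)
server_keys = make_key_pair(2**521 - 1, 2**607 - 1)

root_cert = issue("Example Root CA", root_keys, "Example Root CA", root_keys)
inter_cert = issue("Example Intermediate CA", inter_keys, "Example Root CA", root_keys)
server_cert = issue("mail.example.com", server_keys, "Example Intermediate CA", inter_keys)

trusted_roots = {"Example Root CA": root_cert.public_key}
chain = [server_cert, inter_cert, root_cert]

print(validate_chain(chain, trusted_roots))
print(validate_chain(chain, trusted_roots, revoked={"Example Intermediate CA"}))
```

Note that the root's private key is used exactly once after creating itself: to sign the Intermediate. Everything the servers receive is signed by the Intermediate.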

Let's glue everything together

This is the explanation of the Certificate validation process in a Nutshell:

  1. Your web browser receives the web server's Certificate, which contains (among all other details) the public key of the web server and a chain of the other Certificates which signed it. Usually this Certificate is signed with the private key of a trusted Certificate authority already installed on the client's PC.
  2. As web browsers arrive with the public keys of all of the major Certificate authorities pre-installed out of the box, they can use this list of public keys to verify that the web server's Certificate was indeed signed by the trusted Certificate authority.
    As explained above, this means that the server sends a hash of the Certificate's info encrypted with the CA's private key. The client looks up the matching public key in its list and, if it finds it, decrypts the hash with that public key and checks that it is identical to the hash it generates itself.
  3. The Certificate contains the domain name or the IP address of the web server. The web browser confirms that the address listed in the Certificate's CN value is the one to which it has an open connection.
  4. Your web browser generates a random secret (the 'pre-master secret'), from which the shared symmetric keys used for encrypting HTTP traffic during this connection are calculated.
  5. Your browser encrypts this secret with the web server's public key, then sends it back, thus ensuring that only the web server can decrypt it, since only the web server has its private key.
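In practice you rarely implement these steps yourself, because the client library performs them. As a hedged example, this is roughly how a Python client would get the same behaviour as the browser, using the standard `ssl` module (which today negotiates TLS, the successor of SSL). The two function names are mine, and the host you pass in is up to you. The connection part needs network access, so it is only defined here, not called.

```python
import socket
import ssl


def make_client_context():
    # Loads the CA Certificates pre-installed on the machine and turns on
    # both chain validation (steps 1-2) and host name checking (step 3).
    return ssl.create_default_context()


def fetch_server_certificate(host, port=443, timeout=5.0):
    """Run the handshake against `host` and return its validated Certificate."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Raises ssl.SSLCertVerificationError if the chain, the expiration
        # date or the host name check fails: the code equivalent of the
        # scary browser popup.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Steps 4 and 5 happen inside `wrap_socket`, so the application code never touches the secret or the derived keys.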
