## Keith M. Martin

Print publication date: 2012

Print ISBN-13: 9780199695591

Published to Oxford Scholarship Online: December 2013

DOI: 10.1093/acprof:oso/9780199695591.001.0001

Show Summary Details
Page of

PRINTED FROM OXFORD SCHOLARSHIP ONLINE (www.oxfordscholarship.com). (c) Copyright Oxford University Press, 2017. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a monograph in OSO for personal use (for details see http://www.oxfordscholarship.com/page/privacy-policy).date: 23 May 2017

# Basic Principles

Chapter:
(p.2) 1 Basic Principles
Source:
Everyday Cryptography
Publisher:
Oxford University Press
DOI:10.1093/acprof:oso/9780199695591.003.0001

# Abstract and Keywords

This chapter discusses the basic principles of cryptography. It covers the role of cryptography in securing information; the types of risk to which information is typically exposed; the main security services that are the focus of this book; the fundamentals of cryptosystems; the resources it is reasonable to assume that an attacker of a cryptosystem has access to; and the much misunderstood concept of ‘breaking’ a cryptosystem.

This chapter serves as an introduction to the environment in which cryptography finds common use today. We discuss the need for cryptography, as well as the basic language and concepts that are used to describe a cryptographic system.

# 1.1 Why information security?

It is very likely that anyone reading this book already understands the need for information security, and hence cryptography. However, we need to consider this question, at least briefly, because it is extremely important that we understand the role of cryptography in securing information. We will use the term information security in a generic sense to describe the protection of information and information systems. This involves the use of many different types of security technologies, as well as management processes and controls. Cryptography provides the techniques that underpin most information security technologies. This chapter will explore this concept in more detail. More precise explanations of the core definitions relating to cryptography are provided in Section 1.4.1 .

## (p.3) 1.1.1 The rising profile of information security

Even as recently as the end of the last century, cryptography was a topic of which only specialists and interested users were aware. In fact, this probably also applies to the much broader discipline of information security. So, what has changed?

Information is not a new concept and has always been of value. Society has always dealt with information that has needed some level of protection and has always used processes to safeguard that information. There is nothing new about the need for information security.

Likewise, cryptography is not a new science, although some would say that it has only recently been formally treated as such. It has been used for centuries to protect sensitive information, especially during periods of conflict.

However, information security is now a subject with a relatively high profile. Most people use information security mechanisms on a daily basis. Information security incidents are widely reported in the media. Information security protection features on government agendas. The reason for this increased profile has been the development of computer networks, particularly the Internet. This development has not necessarily resulted in an increase in the amount of information in the world, but data is now easier to generate, access, exchange and store. The benefits of lower communication and storage costs, as well as increased connectivity and higher processing speeds, have encouraged automation of business processes. As a result, more and more applications and services are conducted electronically. Since all this electronic data has the potential to be transmitted and stored in environments that are relatively insecure, the need for information security has become paramount.

The rise in significance of information security has brought with it an increase in the importance and widespread use of cryptography. As we shall see, cryptography lies at the heart of most technical information security mechanisms. As a result, cryptography has become something that most people use in everyday applications. Once largely the domain of government and the military, cryptography is now deployed on devices that can be found in the pockets of almost every consumer of technology.

## 1.1.2 Two very different office environments

It is worth briefly considering precisely what types of physical security mechanisms we used to rely on prior to computer communication. Indeed, we still rely on many of these in physical situations. The fact that these security mechanisms cannot easily be applied to electronic environments provides the central motivation for defining cryptographic mechanisms.

### (p.4) An Old Office

Imagine an office where there are no computers, no fax machines, no telephones and no Internet. The business conducted in this office relies on information coming from both external and internal sources. The employees in this office need to be able to make decisions about the accuracy and authenticity of information. In addition, they need mechanisms for controlling who has access to information. So, what basic security mechanisms allow people working in such an office to make decisions about the security of information that they receive and process?

We can fairly safely assume that most information dealt with in this office is either spoken or written down. Some basic security mechanisms for spoken information might be:

• facial or vocal recognition of people known to staff in the office;

• personal referrals or letters of introduction for people not known to staff in the office;

• the ability to hold a private conversation in a quiet corner of the room.

Some basic security mechanisms for written information might be:

• recognition of handwriting of people known to staff in the office;

• handwritten signatures on documents;

• sealing documents in an envelope;

• locking a document in a filing cabinet;

• posting a letter in an official post box.

Note that these security mechanisms are not particularly strong. For example, people who do not know each other well could misidentify a voice or face. An envelope could be steamed open and the contents altered. A handwritten signature could be forged. Nonetheless, these mechanisms tend to provide ‘some’ security, which is often ‘good enough’ security for many applications.

### A Modern Office

Now consider a modern office, full of computers that are networked to the outside world via the Internet. Although some information will undoubtedly be processed using some of the previous mechanisms, for reasons of convenience and efficiency there will be a vast amount of information handled by electronic communication and storage systems. Imagine that in this office nobody has considered the new information security issues.

Here is a list of just some of the security issues that staff in this office should be considering:

• How can we tell whether an email from a potential client is a genuine inquiry from the person that it claims to have come from?

• How can we be sure that the contents of an electronic file have not been altered?

• (p.5) How can we be sure that nobody else can read an email that we have just sent to a colleague?

• How can we accept an electronic contract received by email from a client on the other side of the world?

Without the adoption of some information security mechanisms, the answer to all of these questions is probably ‘with great difficulty’. While even a non-expert may notice that a physical envelope has a damaged seal (and hence get suspicious), it is almost impossible to recognise whether an unprotected email has been accessed by an unauthorised party. It is certainly possible to communicate much more easily in this modern office, but there is a real case for claiming that we have much less inherent security in this environment than in the strictly physical world of the old office.

## 1.1.3 Differing perspectives

It should already be clear that there is a need for translation of the basic security mechanisms used in the physical world into mechanisms suitable for application in an electronic environment. In essence, this is what modern cryptography is all about. A central aim of this book is to demonstrate precisely what role cryptography plays in this translation process.

If this book was just about cryptography itself, then we could immediately proceed to a discussion of cryptographic mechanisms. However, this book is not just about the principles, but also the application of cryptography. We thus need to understand in a wider sense how cryptography fulfils a role in the provision of information security.

We now identify three different perspectives on the use of cryptography. The vested interests that these represent have helped to shape the modern use of cryptography.

### Individual Perspective

Cryptography is a technology just like any other. Thus the perspective of many individuals is that they have a right to use cryptography for any purpose that they deem fit. As we later discuss, using cryptography to encrypt data can serve a similar function to sealing a document in an envelope in the physical world. Thus, why should individuals be denied the right to use encryption? Further, many people regard cryptography as a technology that enables them to realise other rights. Foremost amongst these are rights to privacy and freedom of expression.

For businesses, computer networks, especially open networks such as the Internet, provide both great opportunities and significant risks. From a business perspective, cryptography is a technology that can be used in order to implement (p.6) information security controls which, ultimately, result in the business making more money than if these controls were not adopted.

It is a common misconception that business automation is often driven by a desire for increased security. This is in fact very rarely the case. An important early business application of cryptography was the adoption of automatic teller machines in the banking industry. These were introduced not to increase security, but to increase availability, and hence to increase business. It is arguably easier to defraud an automatic teller machine than it is to extract money improperly from the counter of a bank.

Business automation usually leads to a significant change in the threats to which a business is exposed. Unless these are carefully addressed, business automation can lead to a decrease in the level of security. The main business security requirement is thus that increased automation and adoption of technologies such as cryptography should not lead to a decrease in the overall security of conducting business. For example, when the GSM system was developed for mobile telecommuniations (see Section 12.3 ), the designers intended that GSM should be ‘as secure’ as landlines.

### Government Perspective

Sometimes governments have conflicting requirements with respect to information security. On the one hand, they may wish to promote competitive business. They can do this by reducing costs and obstacles to business operation, such as reducing trade barriers and harmonising laws and regulations. On the other hand, governments may wish to control crime and manage issues of national security. They may try to do this by imposing certain barriers and introducing other laws and regulations.

In the case of cryptography, these different governmental roles have sometimes led to conflicts of interest. The fundamental problem they face is that the traditional model of a cryptosystem (which we discuss in Section 1.4.3 ) involves ‘good’ users deploying encryption to protect themselves from ‘bad’ attackers who attempt to access their information. However, from a government perspective it may be the case that they perceive ‘bad’ users are deploying encryption to hide their information from ‘good’ attackers (such as law enforcement officers) who could foil their activities if they had access to this information.

### Conflicts of Interest

These different perspectives clearly lead to potential conflicts of interest. For example:

• Some governments have a history of attempting to impose export controls on cryptographic algorithms, which clashes with individual freedom and business needs for the deployment of strong encryption in commercial products.

• Some governments have introduced regulations requiring access, in certain circumstances, to encrypted data. This has the potential to clash with individual (p.7) desires for privacy, as well as business needs for the protection of commercial secrets.

Indeed, cryptographic technology is sometimes labelled as a dual use good, which means that it is considered to be a technology with the potential to be both beneficial or harmful, depending on the perspective taken on a particular cryptographic application.

## 1.1.4 The importance of security infrastructure

The security commentator Bruce Schneier wrote a book called Applied Crypto-break graphy in the early 1990s. A few years later he wrote a book on computer security called Secrets and Lies. He claimed that during the writing of the second book he had an ‘epiphany’ in which he realised that all the cryptographic mechanisms in Applied Cryptography were almost immaterial compared to the ‘real’ security problems associated with the provision of a complete information security system. The biggest problem was not designing the cryptographic mechanisms themselves. The real problem was making the cryptography actually work in a practical system through the provision of an entire information security architecture, of which cryptography was only a small, but vital, component.

This is an important issue and one that needs to be kept in mind throughout this book. Cryptography, just like any security technology, cannot be made to work without having the infrastructure in place to support its implementation. By ‘infrastructure’ we mean the procedures, plans, policies, management, whatever it takes, to make sure that the cryptographic mechanisms actually do the job for which they were intended.

We will consider certain aspects of this infrastructure. However, there are many aspects of this infrastructure that are well beyond the scope of our discussions. Ideally, computer operating systems should be designed and used securely, networks should be implemented and configured securely, and entire information systems should be planned and managed securely. A perfectly good cryptographic mechanism can fail to deliver its intended security services if any one of these other areas of the security infrastructure fail.

This holistic attitude to information security is one that must always be kept in mind whenever a cryptographic application is designed or used. One of the aims of this book is to identify which elements of this wider security infrastructure are particularly relevant to the effectiveness of a cryptographic application.

# 1.2 Security risks

We now consider the types of risk to which information is typically exposed. We examine a very basic communication scenario and discuss some of the factors (p.8) that determine the choice of security mechanisms deployed to address these risks.

## 1.2.1 Types of attack

Risks to information can be assessed by identifying different types of possible attack that can be applied. These attacks are often classified by the type of action that an attacker is able to perform.

### Passive Attacks

The main type of passive attack is unauthorised access to data. This is a passive process in the sense that the data and the processes being conducted on that data remain unaffected by the attack. Note that a passive attack is often likened to ‘stealing’ information. However, unlike stealing physical goods, in most cases theft of data still leaves the owner in possession of that data. As a result, information theft may go unnoticed by the owner. Indeed, it may even be undetectable.

### Active Attacks

An active attack involves either data being changed in some way, or a process being conducted on the data. Examples of active attacks include:

• unauthorised alteration of data;

• unauthorised deletion of data;

• unauthorised transmission of data;

• unauthorised changing of the origin of data;

We will see that cryptography can be used as a tool to help prevent most passive and active attacks. A notable exception is denial of service. There is very little protection that cryptography can provide against this type of attack. Defence against denial of service normally requires security controls in other parts of the security infrastructure.

## 1.2.2 Security risks for a simple scenario

We now examine a very simple communication scenario and consider what security risks might exist. The simple scenario depicted in Figure 1.1 features a sender (who in the cryptographic world is often called Alice) and a receiver (who is usually called Bob). Alice wishes to transmit some information in an email to Bob. If Alice and Bob are to have any assurances about the security of the email that they have just exchanged then they should ask themselves some serious questions.

(p.9)

Figure 1.1 Simple communication scenario

For example, Alice might ask herself:
• Am I happy that anyone could read this email, or do I only want Bob to see it?

• How can I make sure that my email reaches Bob without being changed?

• Am I prepared (or allowed) to take any measures to protect my email before I send it?

• How can I have confidence that this email actually came from Alice?

• Can I be sure that this is the email that Alice intended to send me?

• Is it possible that Alice could deny in the future that she sent me this email?

This simple communication scenario (or variations thereof) is one that we will regularly return to when we consider different types of cryptographic mechanism. However, it is important to realise that not all applications of cryptography conform to this simple communication scenario. For example, we may need to secure:

• a broadcast environment, where one sender is streaming data to a large number of receivers;

• a data storage environment, which may not have an obvious recipient.

At this stage it suffices to appreciate that there are other basic scenarios that each come with their own players and security risks.

## 1.2.3 Choosing security mechanisms

Alice and Bob's concerns in Section 1.2.2 may seem rather paranoid. Some people regularly encrypt emails and so might regard these concerns as being important, while other people rarely encrypt emails and might regard the questions raised by Alice and Bob as being slightly absurd, or at least ‘over the top’ (for more discussion of this particular issue, see Section 12.7.2 ).

(p.10) These contradictory perspectives are not surprising. Risk is subjective, and risks differ between applications. Indeed the assessment and management of risk is a major information security topic in its own right and one that many organisations devote entire departments to studying. It is prudent to think about questions such as those identified in Section 1.2.2 , but whether we act on them and introduce security controls to address them is another issue altogether.

In fact there are at least three different issues to consider when contemplating the use of any security mechanism, including a cryptographic control:

• Appropriateness. Is it the right tool for the job? It is important to understand the precise properties that a cryptographic mechanism will provide. One aim of this book is to explain how the various tools of cryptography can (and in some cases cannot) be used to provide different notions of security.

• Strength. Why put in an expensive burglar alarm in situations where a warning sign would suffice? Different information security mechanisms provide different levels of protection for data, just as different security mechanisms in the physical world provide a range of strengths of physical protection.

• Cost. Do the security gains justify the costs? The cost of a security mechanism is of fundamental importance. By ‘cost’ we do not necessarily mean monetary value. Cost can be measured in terms of ease of use and efficiency of operation, as well as directly in terms of financial worth. As we will see throughout Chapter 12 , in many real applications it is cost considerations that determine the security mechanism adopted, rather than the strength of security that the mechanism provides. In the past, some military and government sectors may have opted for strong security, whatever the cost, but in most modern environments this is not appropriate. A sensible commercial question might be: what strength of security is appropriate given the value of our assets? A more commonly asked question of modern security managers however is: what strength of security can we obtain within our budgetary constraints? One of the challenges of managing security in such environments is to make a case for having information security controls. This case can often be built on the argument that good security may succeed in reducing other costs.

Returning to our email example, an appropriate tool for preventing emails from being read by unauthorised parties is encryption. The strength of the encryption used is dependent on the cryptographic algorithm and the number of decryption keys (which we will discuss later). The cost is that it is necessary to buy and install suitable software, manage the relevant keys, configure the email client appropriately and incur some small time and communication costs every time the software is used.

So, is it worth encrypting an email? There is of course no general answer, since this very much depends on the value of the information in the email and the perceived risks. However, an overall aim of this book is to advise how cryptography can help in this type of situation and what the related issues are. We will focus on explaining the appropriateness and strength of various (p.11) cryptographic mechanisms, however, we will also indicate where issues of cost may arise. Hopefully you will then be able to make up your own mind.

# 1.3 Security services

A security service is a specific security goal that we may wish to achieve. We now introduce the main security services that we will be concerned with in this book. Note that while security services sometimes relate directly to human beings, more often they relate to computers or other devices (often operating on behalf of human beings). While this potential difference is an important issue that can have important security implications (see also Section 8.3 ), we will normally avoid concerning ourselves with it directly and use the generic terms user and entity in an interchangeable way to mean whoever, or whatever, is taking part in the processing of data in an information system.

## 1.3.1 Basic definitions

• Confidentiality is the assurance that data cannot be viewed by an unauthorised user. It is sometimes referred to as secrecy. Confidentiality is the ‘classical’ security service that can be provided by cryptography and is the one implemented by most historical applications. While it remains an important security service, there are many modern applications of cryptography that do not require the provision of confidentiality. Even when confidentiality is wanted, it is rare for it to be the only security service that is required.

• Data integrity is the assurance that data has not been altered in an unauthorised (which includes accidental) manner. This assurance applies from the time that the data was last created, transmitted or stored by an authorised user. Data integrity is not concerned with the prevention of alteration of data, but provides a means for detecting whether data has been manipulated in an unauthorised way.

• Data origin authentication is the assurance that a given entity was the original source of received data. In other words, if a technique provides data origin authentication that some data came from Alice then this means that the receiver Bob can be sure that the data did originally come from Alice at some time in the past. Bob does not necessarily care exactly when she sent it, but he does care that Alice is the source of the data. Nor does he care from which immediate source he obtained the data, since Alice could have passed the data to an intermediary for forwarding (as is the case when data is passed over the Internet, where the immediate source of data may be a web server or router). For this reason, data origin authentication is sometimes referred to as message authentication since it is primarily concerned with the authentication of the data (message) and not who we are communicating with at the time the data is received.

• (p.12) Non-repudiation is the assurance that an entity cannot deny a previous commitment or action. Most commonly, non-repudiation is the assurance that the original source of some data cannot deny to a third party that this is the case. Note that this is a stronger requirement than data origin authentication, since data origin authentication only requires this assurance to be provided to the receiver of the data. Non-repudiation is a property that is most desirable in situations where there is the potential for a dispute to arise over the exchange of data.

• Entity authentication is the assurance that a given entity is involved and currently active in a communication session. In other words, if a technique provides entity authentication of Alice then this means that by applying the technique we can be sure that Alice is really engaging with us now, in ‘real time’. If we fail to establish this temporal aspect of entity authentication (which requires the adoption of a freshness mechanism, see Section 8.2 ) then we have failed to achieve entity authentication. In certain contexts, entity authentication is referred to as identification because it is concerned with determining who am I communicating with now, in real time?

## 1.3.2 Relationships between security services

It is important to recognise that these basic security services are all essentially different, even though on first encounter they may seem similar. The following statements further illustrate this.

### Data Origin Authentication Is A Stronger Notion Than Data Integrity

In other words, if we have data origin authentication then we also have data integrity (but most certainly not the other way around).

To see that data origin authentication would be meaningless without data integrity, suppose that Alice has sent us some data. If we have no data integrity then we cannot be sure that the data received has not been changed by an attacker in transit. The actual data that we received might therefore have come from the attacker and not from Alice. How could we possibly claim to have data origin authentication from Alice in this case? We have thus tied ourselves in a logical knot. Therefore data origin authentication can only be provided if data integrity is also provided. It can be helpful to think of data origin authentication as a stronger version of data integrity. More precisely, data origin authentication is data integrity with the extra property of assurance of the identity of the original source of the data.

A commonly offered attempt at a counter-example to this relationship is recognition of the source of a broken voice message over a noisy channel (such as a telephone call). Since the voice message is audibly broken, we clearly do not have data integrity. However, because the voice is recognisable it could be argued (p.13) that we do know the source of the voice data. This, though, is not an example of data origin authentication without data integrity. Even if the speaker's voice is recognisable, since an attacker could be inserting noise into the broken message signal we cannot be certain that all the data we receive has come from the speaker whose voice we recognise. Data origin authentication must apply to the entire received message, not just parts of it.

Note that in almost all environments where we wish to detect deliberate modification of data, we will require data origin authentication. The weaker notion of data integrity without data origin authentication is normally only required in situations where the sole integrity concern is accidental modification of data.

### Non-Repudiation Of A Source Is A Stronger Notion Than Data Origin Authentication

We have to be slightly careful when making statements about non-repudiation, since this security service can be applied in different situations. However, when applied to the source of some data (which is the context that we will focus on in this book) then it is clear that non-repudiation cannot be provided without data origin authentication (and hence data integrity) also being provided. We can only bind the source to the data, in a manner that cannot be later denied, if we have assurance that the data itself is from that source. As noted earlier, non-repudiation also typically requires this binding to be verifiable by a third party, which is a stronger requirement than that for data origin authentication.

### Data Origin Authentication And Entity Authentication Are Different

Data origin authentication and entity authentication are different security services. The best way to see this is to look at applications that require one, but not the other.

Data origin authentication is useful in situations where one entity is forwarding information on behalf of another, for example, in the transmission of an email message over a public network. Entity authentication is unlikely to be meaningful in this case since there may be significant delays between the time that the message is sent, the time that the message is received and the time that the message is actually read. However, whenever the message is read we would like assurance of the identity of the creator of the email. This is provided by data origin authentication.

On the other hand, entity authentication is the main security service required when accessing resources. A user logging on to a computer is required to provide real-time evidence of their identity. Normally, entity authentication is provided either by presenting a credential (such as a password) or performing a cryptographic computation. In both cases, entity authentication is provided by demonstrating an ability to conduct this process correctly and does not necessarily require the origin of any data to be checked.

### (p.14) Data Origin Authentication Plus A Freshness Check Can Provide Entity Authentication

As we have just discussed, data origin authentication on its own is only concerned with the origin of data, and not whether the sender of data is currently active. However, if we carefully combine data origin authentication with some sort of freshness check then we can often achieve entity authentication, since we know where the data originated and we know that the originator is involved in the current communication session. We will see examples of this in Section 8.5 and Chapter 9 .

### Confidentiality Does Not Imply Data Origin Authentication

A common mistake is to believe that providing data confidentiality (primarily through encryption) also provides assurance of who sent the data and that it is correct. There are special situations where this is a reasonable deduction, but it is generally not true. Where both of these security services are required (which is the case for many cryptographic applications) then they should both be provided explicitly, either by using separate cryptographic mechanisms or one that is specially designed to provide both services. We will discuss this issue in further detail in Sections 6.3.1 and 6.3.6 .

# 1.4 Fundamentals of cryptosystems

Having set the scene, it is now time to look at the concept of a cryptosystem. We examine the basic model of a cryptosystem and explain fundamental terminology that will be used throughout the rest of the book. We also explain the crucial difference between two important types of cryptosystem.

## 1.4.1 Different cryptographic concepts

Before proceeding further, it is important to explain some common cryptographic terminology.

• Cryptography is a generic term used to describe the design and analysis of mechanisms based on mathematical techniques that provide fundamental security services. We will use cryptography in a generic sense, but a more formally accurate term is cryptology, which is the scientific study of cryptography (the design of such mechanisms) and cryptanalysis (the analysis of such mechanisms). It is appropriate to think of cryptography as the establishment of a large toolkit of different techniques, the contents of which can either be used on their own, or combined, in security applications.

• A cryptographic primitive is a cryptographic process that provides a number of specified security services. If cryptography is a toolkit, then cryptographic (p.15) primitives are the basic generic tools in that kit. Examples of cryptographic primitives that we will later discuss are block ciphers, stream ciphers, message authentication codes, hash functions and digital signature schemes.

• A cryptographic algorithm is the particular specification of a cryptographic primitive. A cryptographic algorithm is essentially a ‘recipe’ of computational steps (rules such as ‘add these two values together’ or ‘replace this value by an entry from this table’). An algorithm is a sufficiently detailed specification that a computer programmer could implement it. For example, AES is a cryptographic algorithm that specifies a block cipher. The term cipher is sometimes associated with a cryptographic algorithm, particularly historical algorithms such as those that we discuss in Chapter 2 .

• A cryptographic protocol is a sequence of message exchanges and operations between one or more parties, at the end of which a series of security goals should have been achieved. Examples of cryptographic protocols that we will discuss include the STS protocol (see Section 9.4.2 ) and SSL/TLS (see Section 12.1 ). Cryptographic protocols typically employ a number of different cryptographic primitives at various stages. If cryptographic primitives are tools in the cryptography toolkit, then a cryptographic protocol is a way of taking a number of these tools and using them in a specific way in order to achieve more complex security goals. We discuss cryptographic protocols in Chapter 9 .

• A cryptosystem (or cryptographic scheme) is often used rather generically to refer to the implementation of some cryptographic primitives and their accompanying infrastructure. Thus, while a cryptosystem that is being used to provide data confidentiality might use a block cipher, the ‘cryptosystem’ may also include the users, the keys, the key management, etc. This term is most often used in association with cryptographic primitives that provide data confidentiality. A cryptosystem is sometimes also referred to as a cipher system.

## 1.4.2 Cryptographic primitives for security services

Having introduced the notion of a cryptographic primitive, we now indicate which common cryptographic primitives can be used to implement the various security services defined in Section 1.3.1 . Table 1.1 provides a mapping from our list of security services onto some of the cryptographic primitives that we will encounter in the remainder of the book. It shows the common use of cryptographic primitives used on their own to achieve security services. Note that we use the generic term ‘encryption’ in Table 1.1 to represent a range of cryptographic primitives, including block ciphers, stream ciphers and public-key encryption.

The immediately striking aspect of Table 1.1 is its sparseness with respect to ‘Yes’ entries. In particular, none of these primitives provides entity authentication when used on their own. However, if we relax the requirement used on their own and replace this with can be used to help provide then we obtain the much more ‘positive’ Table 1.2 .

(p.16)

Table 1.1: Mapping of primitives used on their own to provide security services

Confidentiality

Data integrity

Data origin auth.

Non-repudiation

Entity auth.

Encryption

Yes

No

No

No

No

Hash function

No

Sometimes

No

No

No

MAC

No

Yes

Yes

Sometimes

No

Digital signature

No

Yes

Yes

Yes

No

Table 1.2: Mapping of primitives that can be used to help provide security services

Confidentiality

Data integrity

Data origin auth.

Non-repudiation

Entity auth.

Encryption

Yes

Yes

Yes

Yes

Yes

Hash function

Yes

Yes

Yes

Yes

Yes

MAC

No

Yes

Yes

Yes

Yes

Digital signature

No

Yes

Yes

Yes

Yes

The entries in Table 1.2 should not be agonised over at too great a length, especially as we have yet to discuss any of the primitives described there. The main point is to indicate how intricately related the various standard cryptographic primitives are and, in particular, to flag that they are often combined to achieve security services. For example:
• Encryption can be used to design a message authentication code (MAC), which provides data origin authentication (see Section 6.3.3 ).

• Hash functions can be used to store special types of confidential data (see Section 6.2.2 ).

• In certain circumstances, MACs can be used to provide non-repudiation (see Section 7.2 ).

• Digital signatures can be used in entity authentication protocols (see Section 9.4 ).

In the second part of this book we develop the cryptographic toolkit in terms of these different security services. Chapters 4 and 5 will focus on providing confidentiality. Chapter 6 looks at mechanisms for providing data integrity and data origin authentication. Chapter 7 is concerned with the provision of non-repudiation. Chapter 8 considers entity authentication.

(p.17) For the remainder of this chapter, and indeed Chapters 2 and 3 , we continue with our background to cryptography mainly from the perspective of providing confidentiality. This is for two reasons:

1. 1. Confidentiality is the ‘oldest’ security service. The historical development of cryptography is thus easiest to illustrate in terms of the provision of confidentiality.

2. 2. Confidentiality is the most ‘natural’ security service. By this we mean that when presented with the idea of cryptography, confidentiality is the first security service that occurs to the majority of people.

## 1.4.3 Basic model of a cryptosystem

We now examine a simple model for a cryptosystem that is providing confidentiality. This basic model is depicted in Figure 1.2 . We make two restrictions in order to keep things as straightforward as possible. Please keep them in mind throughout the discussion.

1. 1. The only security service required for this cryptosystem is confidentiality. Hence the cryptographic primitive that is used within this cryptosystem is one that provides data confidentiality, such as a block cipher, a stream cipher, or a public-key encryption scheme. Although the rest of this chapter will focus on encryption and encryption algorithms, most of the issues that we address are relevant to other types of cryptographic primitive.

2. 2. The basic model that we describe is for a communications environment (in other words, Alice sending information to Bob across a communication channel of some sort). This basic model will look slightly different if we want data confidentiality in a different environment, such as for secure data storage.

Figure 1.2 Basic model of a cryptosystem

(p.18) Figure 1.2 depicts a sender who wishes to transfer some data to a receiver in such a way that any party intercepting the transmitted data cannot determine the content. The various components of the model are as follows:
• The plaintext is the raw data to be protected during transmission from sender to receiver. Raw data of this type is sometimes referred to as being in the clear. This is also often (ambiguously) referred to as the message. The intention is that at the end of the process only the sender and the receiver will know the plaintext. In particular, an interceptor cannot determine the plaintext.

• The ciphertext is the scrambled version of the plaintext that results from applying the encryption algorithm (and the encryption key) to the plaintext. It is sometimes referred to as the cryptogram. The ciphertext is not a secret and can be obtained by anyone who has access to the communication channel. In certain contexts this access is referred to as eavesdropping.

• The encryption algorithm is the set of rules that determines, for any given plaintext and encryption key, a ciphertext. Using our terminology more appropriately, it is a cryptographic algorithm that takes as input a plaintext and an encryption key, and outputs a ciphertext. The choice of encryption algorithm must be agreed between sender and receiver. An interceptor may or may not know the encryption algorithm used (see Section 1.5.3 ).

• The decryption algorithm is the set of rules that determines, for any given ciphertext and decryption key, a unique plaintext. In other words, it is a cryptographic algorithm that takes as input a ciphertext and a decryption key, and outputs a plaintext. The decryption algorithm essentially ‘reverses’ the encryption algorithm and is thus closely related to it. An interceptor may or may not know the decryption algorithm used (see Section 1.5.3 ).

• The encryption key is a value that is known to the sender. The sender inputs the encryption key into the encryption algorithm along with the plaintext in order to compute the ciphertext. The receiver normally also knows the encryption key. It may or may not be known by an interceptor (see Section 1.4.8 ).

• The decryption key is a value that is known to the receiver. The decryption key is related to the encryption key, but is not always identical to it. The receiver inputs the decryption key into the decryption algorithm along with the ciphertext in order to compute the plaintext. The interceptor must not know the decryption key. It may or may not be known by the sender (see Section 1.4.7 ). We call the collection of all possible decryption keys the keyspace.

• An interceptor (in a more general setting we also refer to an adversary or an attacker) is an entity other than the sender or receiver who attempts to determine the plaintext. The interceptor will be able to see the ciphertext. The interceptor may know the decryption algorithm (see Section 1.5.3 ). The one piece of information that the interceptor must never know is the decryption key.

To encrypt the plaintext the sender needs access to the encryption key and the encryption algorithm. The plaintext must be encrypted at the sender's end within (p.19) a secure environment. Cryptography cannot protect the plaintext before it has been converted into the ciphertext.

To decrypt the ciphertext the receiver needs access to the decryption key and the decryption algorithm. The receiver must keep the decryption key secret. The ciphertext must be decrypted at the receiver's end within a secure environment. Once the plaintext has been computed at the receiver's end then the receiver must take measures to protect (or destroy) it.

1. 1 Encryption does not prevent communication interception. There are security techniques that can be employed to prevent interception of communicated data, but encryption is not one of them. What encryption does is to render intercepted data unintelligible to anyone who does not have access to the appropriate decryption key. As such, encryption is a suitable tool to use to protect data being exchanged over open networks.

2. 2 Encryption of the communication channel does not guarantee ‘end-to-end’ confidentiality. It is true that (appropriate) encryption should guarantee that an interceptor who only has access to the ciphertext cannot decrypt it. However, the plaintext itself may be vulnerable at places within the system that are not protected by the encryption process. For example, the plaintext may exist in the clear on either the sender or receiver's computer. Other security mechanisms may be needed in order to protect plaintext data elsewhere in the system.

We note that since it does not make any sense to specify an encryption algorithm without specifying the decryption algorithm, we follow wider convention by using the term encryption algorithm to implicitly include the decryption algorithm. When dealing with the details we may refer to the encryption process or the decryption process but we assume that a specification of the encryption algorithm includes a specification of both processes.

This basic model of a cryptosystem may appear at this stage rather abstract. In Chapter 2 we will examine a number of simple cryptosystems of this type that will serve as illustrative examples.

## 1.4.4 Codes

The word ‘code’ is not one that we will be using within the context of cryptography, although it is a term that is often associated informally with cryptography. There are many different interpretations of the concept of a ‘code’.

Most generally, the term ‘code’ is often used for any scheme where data is replaced by alternative data before being sent over a communication channel. This replacement is usually dictated by the contents of a codebook, which states precisely which replacement data to use. A good example is Morse Code, which (p.20) replaces the letters of the alphabet with short sequences of dots and dashes. Note that Morse Code has nothing to do with secrecy, since the codebook in this case is well known. Morse Code was designed to efficiently transmit messages over telegraph wires. Another example of a code is ASCII, which provides a means of converting keyboard symbols into data suitable for processing on a computer (see the Mathematics Appendix).

If a codebook is kept secret, and is only known by the sender and the receiver of some data, then the resulting code can be regarded as a type of cryptosystem. In this case the encryption algorithm is simply to replace the plaintext with its matching ciphertext entry in the codebook. The decryption algorithm is the reverse process. The encryption (and decryption) key is the codebook specification itself. For example, Morse Code is not a cryptosystem because there is only one way of replacing letters by dots and dashes. However, if the rule for replacing letters by dots and dashes was kept secret from everyone except a chosen sender and receiver, then we could regard this as a cryptosystem.

In general, cryptosystems based on codebooks only tend to be referred to as ‘codes’ when the codebook describes ways of replacing dictionary words by other words. Thus the term ‘code’ is most likely to be encountered in reference to historical cryptosystems or recreational puzzles. The types of cryptosystem that we will be most interested in do not convert words into words, but rather convert sequences of ones and zeros into other sequences of ones and zeros. While we could produce ‘codebooks’ for these modern cryptosystems, the codebooks would have to be so large that they would be impractical to use.

The term ‘code’ is also often used as an abbreviated form of error-correcting code. This is a technique that can be deployed in order to enable the recovery of correct data from ‘noisy’ data containing accidental errors that are introduced in an unreliable channel. Error-correcting codes have nothing to do with preventing data from being seen by unauthorised users. While they are related to data integrity, error-correcting codes do not protect data from being deliberately manipulated by an attacker. Therefore, we cannot really regard them as cryptographic primitives.

## 1.4.5 Steganography

Another concept often confused with cryptography is steganography, which is also concerned with preventing unauthorised users from accessing plaintext data. However, the basic assumptions behind the use of steganography are rather different from those of cryptography. Steganography is essentially the study of information hiding. The main aim of steganography is for a sender to transfer a plaintext to a receiver in such a way that only the receiver can extract the plaintext because only the receiver knows that a hidden plaintext exists in the first place, and how to look for it (for example, by extracting information from a digital image). In steganography an ‘interceptor’ may well be unaware that observed data contains (p.21) hidden information. This is quite unlike cryptography, where an interceptor is normally fully aware that data is being communicated because they can see the ciphertext. Their problem in this case is that they cannot determine what data the ciphertext represents.

Cryptography and steganography are used in quite different applications. They can also be used together. In this case, steganography can be used to hide a ciphertext. This creates two layers of security:

1. 1. The first layer, steganography, tries to hide the fact that a ciphertext exists in the first place.

2. 2. In the event that this use of steganography is detected and the ciphertext is found, the second layer, cryptography, prevents the plaintext from being known.

We will not discuss steganography any further in this book. While it does potentially have niche applications, and might in some cases be regarded as a potential threat to an information system, steganography is rarely employed to secure information systems.

## 1.4.6 Access control

It is worth observing that there are in fact three different approaches that can be taken to providing data confidentiality. The one that we are most interested in is encryption, since this provides protection independently of the location where the data resides. As we have just seen, steganography relies on ‘hiding’ the data. A third approach is to control access to the (unencrypted) data. Access control is a major topic in its own right. Indeed, much of our data is not protected through the use of encryption, but rather through access control mechanisms on computers that use a combination of software and hardware techniques to prevent unauthorised users from accessing data.

Encryption can be regarded as a means of implementing a type of access control, where only those with access to the appropriate decryption key can access protected data. However, they are normally separate mechanisms. Indeed, just as we saw for steganography, they can be used together to provide two separate layers of security. Access control can be used to restrict access to data, which is itself encrypted. Thus an attacker who manages to get around the access control mechanism only manages to retrieve encrypted data.

## 1.4.7 Two types of cryptosystem

There are two different types of cryptosystem and understanding the differences between them is crucial. The difference hinges on the relationship between the encryption and the decryption key. In any cryptosystem these two values must obviously be closely related since we cannot expect to be able to encrypt a plaintext (p.22) with one key and then later decrypt the resulting ciphertext with a totally unrelated key. The precise relationship between these keys defines not only the type of cryptosystem, but also all of its resulting properties.

• In symmetric cryptosystems the encryption key and the decryption key are essentially the same (in situations where they are not exactly the same, they are extremely closely related). All cryptosystems prior to the 1970s were symmetric cryptosystems. Indeed, symmetric cryptosystems are still widely used today and there is no sign that their popularity is fading. The study of symmetric cryptosystems is often referred to as symmetric cryptography. Symmetric cryptosystems are also sometimes referred to as secret key cryptosystems.

• In public-key cryptosystems the encryption key and the decryption key are fundamentally different. For this reason, public-key cryptosystems are sometimes referred to as asymmetric cryptosystems. In such cryptosystems it is ‘impossible’ (we often use the phrase computationally infeasible to capture this impossibility) to determine the decryption key from the encryption key. The study of public-key cryptosystems is often referred to as public-key cryptography.

Symmetric cryptosystems are a ‘natural’ concept. In contrast, public-key cryptosystems are quite counterintuitive. How can the decryption key and the encryption key be ‘related’, and yet it be impossible to determine the decryption key from the encryption key?

The answer lies in the ‘magic’ of mathematics. It is possible to design a cryptosystem whose keys have this property, but it is not obvious how to do so. Within the context of cryptographic history, the concept of public-key cryptography is relatively new and there are far fewer public-key algorithms known than symmetric algorithms. They are, however, extremely important as their distinctive properties have useful applications, as we will see.

## 1.4.8 Secrecy of the encryption key

We already know that in any cryptosystem an interceptor must not know the decryption key. In a symmetric cryptosystem, the encryption key and the decryption key are the same. It follows that in a symmetric cryptosystem there is only one key, and that this key is used for both encryption and decryption, which is why it is often referred to as a symmetric key. The sender and the receiver must be the only people who know this key.

On the other hand, in a public-key cryptosystem the encryption key and the decryption key are different. Further, the decryption key cannot be determined from the encryption key. This means that as long as the receiver keeps the decryption key secure (which they must in any cryptosystem) there is no need for the corresponding encryption key to be kept secret. It follows that the encryption key could, at least in principle, be made publicly available (hence the term public (p.23) key) so that anyone could look it up and use it to send a ciphertext to the receiver. In contrast, the associated decryption key is usually called the private key, since it is a ‘private’ value known only to that particular receiver.

In order to clarify this fundamentally different property, it can be helpful to consider a physical world analogy that usefully demonstrates the main difference between symmetric and public-key cryptosystems. This is the analogy of locking a piece of paper in a box in order to provide confidentiality of a message written on the paper. The piece of paper is the analogue of the plaintext. The locked box containing the paper is the analogue of the ciphertext. Encryption can be thought of as being analogous to the process of locking up the piece of paper, and decryption analogous to the process of unlocking it. This analogy is particularly appropriate since the physical locking process also involves the use of keys.

We consider two different types of lock that are widely available in the physical world, as illustrated in Figure 1.3 :

1. 1. Conventional locks are those normally found on filing cabinets, cars or windows. In this case the sender needs a key to lock the paper in the box. The receiver needs an identical copy of the key to later unlock it. Thus, when using a conventional lock the sender and the receiver need to share the same key. This is analogous to symmetric cryptosystems.

2. 2. Self-locking locks are those found on padlocks and often on the front doors of houses. These locks do not require a key to conduct the locking operation (a padlock can simply be snapped shut and a front door can often just be closed). When using a self-locking lock, the sender does not need a key to lock the paper

Figure 1.3 Physical lock analogy for the two types of cryptosystem

(p.24)

Table 1.3: Basic properties and terminology for keys in the two types of cryptosystem

Relationship between keys

Encryption key

Decryption key

Symmetric cryptosystems

same key

symmetric

symmetric

Public-key cryptosystems

different keys

public

private

in the box. However, the receiver does need a key to open it. This is almost analogous to public-key cryptosystems. We say ‘almost’ analogous because, to make this analogy strictly accurate, we have to assume that the ability of anyone to lock the box without a key is ‘equivalent’ to having a key to lock the box that is made available to anyone who wants it.

We note that the term secret key is rather ambiguous, since it is often applied to both symmetric and private keys. We thus reserve the use of this term to situations where we we refer to either (or both) symmetric and private keys (mainly in Chapter 10 ). The relationship and terminology for encryption and decryption keys in the two types of cryptosystem is summarised in Table 1.3 .

The ability to make encryption keys public makes the concept of public-key cryptography seem extremely attractive for a number of different applications. However, public-key cryptography comes with its own set of problems and one of the aims of this book is to explain the various advantages and disadvantages of using symmetric and public-key cryptosystems. As we will learn later, symmetric and public-key cryptosystems are often both implemented and used together in real information systems.

# 1.5 Cryptosystem security assumptions

We now consider what resources it is reasonable to assume that an attacker of a cryptosystem has access to. We begin by looking at standard assumptions and attack models. We then have a short discussion about the extent to which revealing the details of the encryption algorithm might affect the security of a cryptosystem.

## 1.5.1 Standard assumptions

In order to assess the security of a cryptosystem we must first establish exactly what assumptions we are making about potential attackers of the cryptosystem. Identifying assumptions about the capabilities of attackers is standard practice in all areas of information security and forms part of the larger process of (p.25) risk assessment. If we underestimate an attacker's capabilities then the resulting security might be inadequate. It thus makes sense to be slightly conservative and take a ‘worst-case’ view.

In cryptography there are three standard assumptions that are normally made concerning an attacker's ability. These are that the attacker knows:

• All ciphertexts sent using the cryptosystem. It is entirely reasonable to assume that an attacker has access to all the ciphertexts sent using the cryptosystem. These are not hidden from public view by the encryption process.

• Some corresponding pairs of plaintexts and ciphertexts. At first glance, this might not seem such an obvious assumption to make, however, there are many circumstances where an attacker could have access to corresponding pairs of plaintexts and ciphertexts. Just some possible scenarios are:

• The receiver has been careless in failing to keep decrypted ciphertexts secret.

• The attacker has intelligently guessed some predictable plaintexts. A good example is predictable document headers.

• The attacker has been able to influence the choice of plaintexts encrypted by the sender.

• The attacker has (temporary) access to either the encryption or decryption device. Note that this does not imply that the attacker knows the encryption or decryption key. The keys might be embedded in secure hardware and the attacker only has access to the interface of the machine that conducts the encryption (decryption) process. Obviously, we assume that the attacker does not have permanent access to the decryption device, otherwise they are in a very strong position!

• We are using a public-key cryptosystem where the encryption key is known to any potential attacker. Thus an attacker can generate pairs of corresponding plaintexts and ciphertexts at their leisure.

• The details of the encryption algorithm. This is the standard assumption that sometimes causes the most confusion. We consider this issue in Section 1.5.3 .

## 1.5.2 Theoretical attack models

Simple attacks on cryptosystems have historically been classified using the following terminology:

• ciphertext-only attacks require the attacker to know the encryption algorithm and some ciphertext;

• known-plaintext attacks require the attacker to know the encryption algorithm and some plaintext/ciphertext pairs;

• chosen-plaintext attacks require the attacker to know the encryption algorithm and some plaintext/ciphertext pairs that correspond to plaintexts chosen by the attacker.

(p.26) These are increasingly powerful attacks, since an attacker who can choose which plaintext/ciphertext pairs to examine is clearly in a better position than an attacker who can only see arbitrary plaintext/ciphertext pairs.

Our ‘standard assumptions’ do not clearly differentiate between known and chosen-plaintext attacks, since this depends on whether the attacker can only see plaintexts chosen by the sender or was able to select plaintexts for encryption. It is safest to assume that an attacker has been able to choose the plaintexts for which they know plaintext/ciphertext pairs. Most modern cryptosystems (and all public-key cryptosystems) are thus designed to withstand chosen-plaintext attacks.

While it will suffice for us to remember the three standard assumptions about the knowledge of an attacker, it is worth recognising that cryptographic researchers often have even more stringent assumptions about the possible attack model. For example, in one strong theoretical model of security of a cryptosystem, an attacker should not be able to tell the difference between ciphertext that is produced using the cryptosystem and randomly generated data. While this is a good property that any cryptosystem should aspire to, for many practical applications it might be questionable whether it is strictly necessary to pass this ‘test’.

## 1.5.3 Knowledge of the encryption algorithm

As promised, we now consider the validity of the standard assumption that an attacker knows the encryption algorithm. There tend to be two different approaches to designing encryption algorithms, which result in most encryption algorithms being classified as either:

• publicly known algorithms: the full details of the algorithm are in the public domain and can be studied by anyone;

• proprietary algorithms: the details of the algorithm are only known by the designers and perhaps a few selected parties.

In the case of publicly known encryption algorithms, an attacker knows the encryption algorithm. In the case of proprietary encryption algorithms, an attacker may well know the name of the encryption algorithm and certain basic properties, but it is not intended that they know any of the details of how it performs the encryption and decryption processes.

Note that the term ‘proprietary’ is often used in other contexts to describe something that has an ‘owner’ (an individual or organisation) and may have been patented, hence our use of this term is slightly unusual. It is possible for a publicly known algorithm to be patented by an ‘owner’, and indeed there are several high-profile examples. Further, it is not necessarily the case that a proprietary algorithm has any patent issues, although its use will necessarily be restricted.

### (p.27) The Impact of Kerckhoffs’ Second Principle

At first glance, proprietary encryption algorithms would seem to be much more sensible, since they offer two distinct advantages:

1. 1. Hiding the details of the algorithm will make life much harder for any attacker who tries to attack any cryptosystem using this algorithm. Hiding the encryption algorithm thus provides an extra ‘layer’ of security.

2. 2. Proprietary encryption algorithms can be designed to meet the specific needs of a particular application.

However, there is a danger in relying on this first advantage. This is because there are many real examples of the details of proprietary encryption algorithms eventually becoming publicly known. This could happen if:

• a device on which the encryption algorithm is implemented is captured and expert attackers are able to investigate this device and somehow extract the algorithm (this process is often called reverse engineering);

• the details of the algorithm are accidentally or deliberately leaked to the public domain.

For this reason alone it is very unwise to rely on making an encryption algorithm proprietary. In fact, good cryptographic designers work on the principle that a proprietary encryption algorithm should still be secure in the event that the encryption algorithm becomes publicly known.

This principle is the most famous of six cryptosystem design principles that were identified by Auguste Kerckhoffs in the 19th century. More precisely, Kerckhoffs stated that the cryptographic algorithm should not be required to be secret. This principle is often misinterpreted as stating that the cryptographic algorithm should be publicly known, and hence rely only on the secrecy of the decryption key. However, Kerckhoffs did not say this. He simply pointed out that proprietary algorithms should not rely on their ‘extra layer of security’ and should be designed in such a way that public exposure of the algorithm does not compromise the security (more literally, he stated that the algorithm must be able to fall into the hands of the enemy without inconvenience).

### The Case for Publicly Known Algorithms

There are many reasons why it might be preferable to use a publicly known algorithm:

• Scrutiny. A cryptographic algorithm that is in the public domain has the chance to be studied by a wide range of experts. If they all agree that the algorithm seems to be a good one then there are strong reasons to believe that the algorithm is secure. Such an algorithm could then be adopted by public standardisation bodies. In contrast, a proprietary algorithm may only have been assessed by a handful of experts.

• (p.28) Interoperability. It is much easier to adopt and implement publicly known algorithms in open networks. If an organisation wishes to regularly secure communications with external clients then use of a proprietary algorithm means that all the clients will either have to be given the algorithm specification, or the software or hardware necessary to run it.

• Transparency. Businesses may find it easier to convince a trading partner that their systems are secure if the security techniques that they employ, which includes the cryptographic algorithms, are open to assessment by their partners. If an algorithm is proprietary then partners may want to perform independent evaluations of its strength.

### What Happens In Practice?

The different advantages and disadvantages associated with proprietary and publicly known algorithms mean that their adoption is application dependent. In practice, both proprietary and publicly known algorithms are used in modern information systems.

Proprietary algorithms are normally only adopted by organisations (such as governments) that are large enough to be able to employ their own high-quality cryptographic design teams. They are also typically only used in closed environments where interoperability issues are less problematic.

The vast majority of applications of cryptography use publicly known algorithms. Indeed, in any commercial environment it is probably unwise to rely on the security of any cryptosystem that claims to use a proprietary algorithm unless the source of the cryptosystem design can be identified and is regarded as being highly reputable.

## 1.5.4 Use of publicly known algorithms

We have just observed that one possible advantage of publicly known algorithms is that a wide range of experts will have had the chance to evaluate such algorithms. However, designing cryptographic algorithms requires a great deal of knowledge, experience and skill. Many well-qualified (and less-qualified!) people have designed cryptographic algorithms, but very few ever gain sufficient public confidence to become recommended for use in real applications. It is thus very important to appreciate that:

• just because an algorithm is publicly known does not imply that it has been studied by a wide range of experts;

• even if a publicly known algorithm has been fairly well scrutinised, it may not be wise to deploy it in an application from a security perspective (for example, the level of scrutiny may not be sufficient);

• relatively few publicly known algorithms are actually deployed in applications;

• very few publicly known algorithms are widely supported across different applications.

(p.29)

Figure 1.4 Taxonomy of publicly known encryption algorithms

To emphasise these points, Figure 1.4 presents a conceptual taxonomy of publicly known encryption algorithms. While this taxonomy is artificial, it is designed to emphasise the prudent ‘conservatism’ of adoption of publicly known encryption algorithms. The zones in the figure can be interpreted as follows:
• Unstudied algorithms (Zone A). This consists of a substantial number of encryption algorithms that have been proposed by designers, but never subjected to any serious analysis. There may well be some very good algorithms in this zone, but they have not been scrutinised enough to be relied upon. Algorithms in this zone include those used by a number of commercial products that claim to have designed their own encryption algorithm. Great caution should be applied before relying on such products.

• ‘Broken’ algorithms (Zone B). This consists of the many publicly known encryption algorithms that have been analysed and subsequently found to be flawed.

• Partially studied algorithms (Zone C). This consists of a reasonable number of publicly known encryption algorithms that have undergone some analysis without significant security weaknesses being found, but which have not subsequently attracted a great deal of attention. The most likely reason for this is that they do not appear to offer any significant benefits over algorithms in the next two zones. As a result, even though there may be very good algorithms in this zone, the extent to which they have been studied is probably not sufficient to justify deploying them in an application without good reason.

• Respected algorithms (Zone D). This consists of a very small number of publicly known encryption algorithms that have been subject to a great deal of expert scrutiny without any flaws being found. These algorithms might reasonably be regarded as being secure enough to deploy in an application. Some of the algorithms in this zone may appear in standards. However, they are not ‘default’ encryption algorithms and so there is the potential for interoperability problems when they are used, since they are not as widely deployed as encryption algorithms in Zone E.

• (p.30) Default algorithms (Zone E). This consists of a handful of publicly known encryption algorithms that are widely recognised and deployed. These are regarded as safe choices and likely to be supported by many cryptographic applications.

Note that a publicly known encryption algorithm may well move between these zones over time. The only modern encryption algorithms that we will make specific references to in this book are (or used to be) either default or respected algorithms. It would normally be unwise to deploy a publicly known encryption algorithm that belongs to any other zone.

# 1.6 Breaking cryptosystems

We now discuss the much misunderstood concept of ‘breaking’ a cryptosystem. We will focus on:

• Cryptosystems providing confidentiality based on encryption algorithms. We note that the general principles apply to other cryptosystems supporting other cryptographic primitives.

• ‘Breaks’ that are directly related to the underlying cryptographic primitives. There are many ways in which a cryptosystem could be ‘broken’ which have nothing to do with the underlying cryptographic primitives. We discuss these further in Section 3.2 .

## 1.6.1 Some useful preliminaries

An important objective of this book is to explain cryptography without the need for skills in mathematics. While we fully intend to honour this objective, there are some very basic pieces of notation and terminology that we will need. At the risk of insulting the intelligence of some readers, this is a good place to make them clear. Other (optional) mathematical ideas are relegated to the Mathematics Appendix.

### Binary Numbers

It is important to realise that although we will discuss some historical encryption algorithms in Chapter 2 which operate on letters of the alphabet, all the ‘real’ cryptographic algorithms that we will discuss in this book are designed to run on computers, and thus process information (including plaintexts, ciphertexts and cryptographic keys) as binary data consisting of zeros and ones. Individual zeros and ones are conventionally referred to as bits and groups of eight bits are referred to as bytes.

For much of our discussion we can probably ‘get away’ with just considering binary data as sequences of zeros and ones. However, it is important to realise (p.31) that a sequence (we sometimes refer to this as a string) of zeros and ones such as 1101001 does actually represent a binary number. A full explanation of binary numbers and how they relate to the more familiar decimal numbers is provided in the Mathematics Appendix. This also includes an explanation of hex, which is most useful to us as a compact way of representing binary numbers.

### XOR

Modern symmetric cryptographic algorithms process binary data by conducting various different operations on the data. One common operation is to compute the exclusive or, better known as XOR, of two binary strings (numbers). This is essentially the equivalent of ‘addition’ for binary numbers. Thus, every time that we refer to the binary operation XOR, it is reasonable to interpret this as ‘adding’ the two binary strings together. When we refer to this operation in text we use the term XOR, but when we write XOR in mathematical notation we commonly use the symbol ⊕ (which itself indicates that we are conducting a type of ‘addition’). The XOR operation is described in more detail in the Mathematics Appendix.

### Exponentiation

A mathematical operation that we will often need to refer to is exponentiation. This means raising a number to a power, which means multiplying the original number by itself a certain number of times. Commonly we will need to raise the number 2 to various powers. We use the conventional notation 2k to mean ‘raising the number 2 to the power k’, which means multiplying 2 by itself k times. In other words:

$Display mathematics$
where there is a total of k occurrences of 2 on the right. As an example with k=4:
$Display mathematics$
More generally, we use the notation ab to mean raising a to the power b, which just means multiplying the number a by itself b times. In other words:
$Display mathematics$
where there is a total of b occurrences of a on the right. As an example with a=3 and b=5:
$Display mathematics$
Another simple fact that we will need later is that if we raise a to the power b, and then raise the result to the power c, then this is the same as raising a to the power b× c. In other words:
$Display mathematics$

(p.32) As an example with a=2, b=3 and c=4:

$Display mathematics$
We will also use the fact that if we raise a to the power b, and then raise the result to the power c, then this is the same as raising a to the power c and then raising the result to the power b. In other words:
$Display mathematics$
Using our previous example:
$Display mathematics$

### Concatenation

At various points in our coverage of cryptographic mechanisms we will need a mathematical notation to mean the very simple process of writing two pieces of data (or numbers) ‘next to one another’. We say that the data (or number) x is concatenated to the data y, and write this as x ||y. In other words, x ||y consists of x (written on the left) next to y (written on the right). For example, the concatenation of x=1101 and y=11100011 is:

$Display mathematics$

## 1.6.2 Key lengths and keyspaces

Before proceeding further, it is important to understand various concepts relating to the number of possible different decryption keys in a cryptosystem, which we refer to as the size of the keyspace. This is important because one strategy for an attacker of a cryptosystem is to try to determine the decryption key, hence the size of the keyspace is certainly something that the attacker will be interested in. The majority of cryptosystems have a fixed size of keyspace. However, it is worth noting that:

• Some cryptosystems can provide a choice of size of keyspace. For example, the encryption algorithm AES can be used in three different ‘settings’, each of which has a different size of keyspace (see Section 4.5 ). While a cryptosystem using AES may select just one of these ‘settings’, it is also possible that it could support more than one.

• For some cryptosystems the size of the keyspace is highly flexible. For example, both the Vigenère Cipher (see Section 2.2.4 ) and one-time pad (see Section 3.1.3 ) have keyspaces whose sizes can (at least in theory) be made arbitrarily large.

Since the size of the keyspace in modern cryptosystems can be enormous, we tend to focus attention on the length of a cryptographic key (often also referred to as the size or strength of the key), which is the number of bits that it takes to (p.33) represent the key. For example, the cryptographic key 10011010 has length eight. The length of a cryptographic key is referred to more commonly than the size of the keyspace.

Regarding the relationship between key length and the size of the keyspace, there is an important difference between symmetric and public-key cryptosystems:

• Symmetric cryptosytems. By and large, the length of the key can be used to determine the size of the keyspace. If the key length is k bits (which we sometimes refer to by saying that we have a k-bit key) then the size of the keyspace is 2 k , since there are two choices (0 or 1) for each of the bits of the key, and thus the number of possible keys is:

$Display mathematics$
We used the caveat ‘by and large’ because some symmetric cryptosystems specify that particular keys should not be used, thus the keyspace is sometimes slightly smaller than 2 k . Also, some symmetric cryptosystems use keys that contain redundant bits (for example, while DES keys are normally 64 bits long, 8 of these bits are redundant and hence the effective key length is only 56 bits).

• Public-key cryptosystems}. The length of the decryption key provides an indication of the size of the keyspace, although the precise relationship between the length of the decryption key and the size of the keyspace will depend on which public-key cryptosystem is being used. We discuss this in more detail in Section 5.4.3 .

Note for symmetric cryptosystems that while the the length of a 256-bit key is indeed double the length of a 128-bit key, the size of the keyspace associated with a 256-bit key is vastly greater than that of a 128-bit key. To be precise, it is 2128 times as big!

The notions of the length of a key and the size of the keyspace are thus related and we often use these terms fairly interchangeably, keeping in mind the above discussion.

## 1.6.3 Breaking encryption algorithms

An encryption algorithm is often referred to as being broken if a method of determining the plaintext from the ciphertext is found that does not involve being legitimately given the decryption key. This is an extremely uncomfortable definition to work with since we will shortly see that under this definition every encryption algorithm can be broken. It might be better to suggest that an encryption algorithm is broken if a practical method to do this is found, but this too has problems, which we will come to in a moment.

The process of trying to determine a plaintext from a ciphertext is conducted under the standard assumptions of Section 1.5.1 . In many cases, ‘breaks’ of (p.34) encryption algorithms also involve large quantities of corresponding plaintext and ciphertext pairs being used during the cryptanalysis. There are generally two types of break:

1. 1. A method of determining the decryption key directly is found. This is the most powerful type of break, since obtaining knowledge of the decryption key allows decryption of all other ciphertexts that were generated using the corresponding encryption key.

2. 2. A weakness in the encryption algorithm is discovered that leads directly to a plaintext being deduced from the corresponding ciphertext without first determining the decryption key.

Determining the decryption key is the most common, and most natural, way of breaking an encryption algorithm, so we will focus on this type of break in the subsequent discussion.

It is very important to be aware of the fact that the term ‘break’ comes with a substantial health warning. Deciding when a method of determining the plaintext from a ciphertext is actually feasible (or relevant) is subjective and depends, to an extent, on what it is reasonable to expect an attacker to be able to do. It is thus highly plausible that an encryption algorithm that is regarded as ‘broken’ with respect to one application might still be suitable for another.

Another point, which we will make repeatedly, is that the most likely point of failure in any cryptosystem is in the management of the cryptographic keys. If a decryption key is inadequately protected then the cryptosystem becomes useless, regardless of the strength of the underlying encryption algorithm. It is surprisingly common for key management failures to be confused with breaks of encryption algorithms, especially in the media.

## 1.6.4 Exhaustive key searches

There is one important method that can be used to ‘break’ almost all known encryption algorithms (we will discuss the only exception in Section 3.1.3 ). This attack is so important that it provides a security ‘benchmark’ against which the effectiveness of other attacks can be measured.

### Conducting An Exhaustive Key Search

An exhaustive key search can be conducted by an attacker who is in possession of a target ciphertext that has been encrypted using a known encryption algorithm. The attacker:

1. 1. selects a decryption key from the keyspace of the cryptosystem;

2. 2. decrypts the target ciphertext using that decryption key;

3. 3. checks to see if the resulting plaintext ‘makes sense’ (we discuss this concept in a moment);

4. (p.35) 4. if the plaintext does make sense then the attacker labels the decryption key as a candidate decryption key;

5. 5. if the attacker can confirm that this decryption key is the correct decryption key then the attacker stops the search, otherwise they select a new decryption key from the keyspace and repeat the process.

In other words, an exhaustive key search involves decrypting the ciphertext with different decryption keys until candidates for the correct decryption key are found. If the correct decryption key can be identified as soon as it is tested then the attacker stops the search as soon as it is found. If it cannot be identified then the attacker searches all possible decryption keys until the list of candidate decryption keys is complete. This type of attack is sometimes also referred to as a brute-force attack, since in its simplest form it involves no sophisticated knowledge of the cryptosystem other than the encryption algorithm used.

### Identifying Candidate Decryption Keys

In order to decrypt a target ciphertext ‘correctly’, an attacker conducting an exhaustive key search needs to be able to recognise when they have found candidates for the correct decryption key. The attacker thus needs some information that can be used to identify candidate decryption keys. This type of information could be:

• Some known plaintext/ciphertext pairs}: the attacker could then apply each decryption key to the known ciphertexts to see if that decryption key successfully decrypts the known ciphertexts into the corresponding known plaintexts.

• Knowledge of the plaintext language: if the plaintext is in a known language, such as English, then the attacker will be able to use the statistical properties of the language to recognise candidate plaintexts, and hence candidate decryption keys.

• Contextual information: the attacker may have other information concerning the plaintext that allows candidate decryption keys to be identified (for example, perhaps the plaintext has a specific format or begins with a particular known string of characters).

### Determining The Correct Decryption Key

Suppose now that an attacker is not able to immediately identify the correct decryption key and thus generates a list of candidate decryption keys. If only one candidate decryption key is found then the attacker can of course reasonably deduce that this is the correct decryption key. If more than one candidate decryption key is found then the attacker will not necessarily be able to identify the correct decryption key from this list, unless they obtain some extra information (such as another valid plaintext/ciphertext pair).

However, it should be noted that the list of candidate decryption keys is likely to be very small. For example, suppose that a highly regarded encryption algorithm (p.36) with a keyspace of size 2128 is being used. If an attacker already knows one plaintext/ciphertext pair then it can be shown that an exhaustive key search that uses this known plaintext/ciphertext pair to identify candidate decryption keys for a new target ciphertext will result in at most a handful of candidate decryption keys. This is because the chances that a different decryption key to the correct one also successfully decrypts the known ciphertext into the known plaintext is extremely small. Use of the knowledge of a second plaintext/ciphertext pair will almost always be sufficient to identify the correct decryption key.

However, even without determining precisely which of a short list of candidate decryption keys is the correct decryption key, an attacker may still have enough information to proceed. To see this, suppose that the plaintext is the launch date for a new product and the attacker is a commercial rival. If the attacker has just reduced 2128 possible decryption keys to just three (say) candidate decryption keys then this is a spectacularly significant achievement from a security perspective. Even if all three candidate plaintexts are plausible (in this case they would all have to be plausible launch dates), the attacker could:

• proceed to develop three separate courses of action, each based on a different candidate launch date of the rival's product;

• simply guess which one is correct, since they have a one third chance of being correct.

Thus in most cases it is normally assumed that an exhaustive key search results in the attacker ‘knowing’ the correct decryption key and hence the correct plaintext, even if in practice they are left with a small degree of doubt. Indeed, in some cases it is probably reasonable to assume that an attacker can identify the correct decryption key as soon as it is tested, hence they do not need to complete a search of the entire keyspace.

### Protecting Against Exhaustive Key Searches

An exhaustive key search is indeed exhausting to conduct manually, but this is precisely the type of process that computers can perform with ease. To withstand an exhaustive key search there must be so many different decryption keys to try out that the search becomes impossible to conduct in practice (it either takes too much time or costs too much money). This is why most practical cryptosystems must have a sufficiently large keyspace that an exhaustive key search is infeasible.

We now briefly consider how big ‘sufficiently large’ might be. We make the assumptions that:

• All possible keys in the keyspace are available and equally likely to be selected. If this is not the case then the keyspace is smaller than we think and the subsequent analysis may be invalid.

• The attacker can identify the correct decryption key as soon as it is tested.

Estimating exactly how much time is needed to conduct an exhaustive key search requires assumptions about the resources available to an attacker. A good place to start is to try to estimate the amount of time that it might take an attacker (p.37) to test one decryption key. We can then use this estimate to assess how long an exhaustive key search might take. We first need one piece of statistical information. Namely, if the size of the keyspace of an encryption algorithm is 2 k then the laws of probability imply that, on average, an attacker can expect to find the correct decryption key in 2k-1 decryption attempts.

This is quite an intuitive statement. It says that an attacker can, on average, expect to find the correct decryption key about halfway through an exhaustive key search. It might be the first key that they try (lucky!) and it might be the last key that they try (unlucky!) but, on average, they will find it halfway through.

We can use this information to estimate how large a keyspace needs to be in order to reasonably withstand an exhaustive key search that takes one year. There are approximately 3× 107 seconds in one year, which is approximately 225 seconds (see the Mathematics Appendix for an explanation). For a number of assumed computational strengths of an attacker, Table 1.4 shows the approximate length of decryption key that is needed in order to protect against an exhaustive key search lasting one year.

To see where the figures in Table 1.4 come from, consider the third row. Note that one thousand is approximately 210 and that one million is approximately 220. Thus in one year, one thousand processors testing one million keys per second will be able to test an approximate total of

$Display mathematics$
keys. This means that a keyspace of 256 should suffice to make it likely that an exhaustive key search will, on average, take a full year. In other words, if we estimate that an attacker has one thousand processors that can each test one million keys per second then, if we want security for one year, the minimum key length that we should consider is 56 bits. In practice we would be wise to use keys longer than this, just to be on the safe side.

Note that, somewhat ironically, the threat of an exhaustive key search presents a case (at least in theory) for slowing down the speed of a decryption algorithm. While slowing down decryption makes the cryptosystem slightly more

Table 1.4: Key lengths needed to protect against an exhaustive key search that takes one year

Strength of attack

Key length

Human effort of one key per second

26 bits

One processor testing one million keys per second

46 bits

1000 processors testing one million keys per second

56 bits

One million processors testing one million keys per second

66 bits

(p.38) cumbersome to use, it has the potential to make an exhaustive key search much more cumbersome to conduct. We give an example of a cryptographic primitive that has been deliberately slowed down for this reason in Section 8.4.2 . Nonetheless, most applications tend to choose to maximise decryption speeds in order to make the cryptography as ‘seamless’ as possible. Hence making sure that the keys are sufficiently long is the only protection against an exhaustive key search.

## 1.6.5 Classes of attack

Although we do not plan to discuss the details of many cryptanalytic attacks, it is important to be aware of the types of attack that cryptosystems are commonly subjected to. A simple classification of the most common classes of cryptanalytic attack is as follows:

• Generic attacks. These are attacks that apply to a wide range of cryptographic primitives and do not normally employ knowledge of the working of the primitive itself. We have already discussed the most important member of this class, the exhaustive key search. Other examples are:

• Dictionary attacks. This term is used in a number of different contexts, all of which relate to attacks that involve compiling a type of ‘dictionary’. For example:

• An attacker of a simple cryptosystem (for example, one using a block cipher in ECB mode, see Section 4.6.1 ) with a fixed key might be able to build a dictionary which consists of ciphertexts corresponding to plaintexts that the attacker has been able to learn by some means. For example, if the plaintexts correspond to dates that an event will occur on, the attacker will learn the plaintext when they later observe the event occurring. When a future ciphertext is seen, the attacker looks up the dictionary in the hope that the observed ciphertext is listed, in which case the attacker can read off the corresponding plaintext.

• An attacker exploits a key derivation process (see Section 10.3.2 ) where keys are derived from passwords. In this case the attacker compiles a dictionary of likely passwords and then derives the resulting keys from them, which are then used in an ‘intelligent’ exhaustive key search.

• Time memory tradeoff attacks. These are related to both exhaustive key searches and dictionary attacks, and are based on balancing computational and memory resources in attempts to determine decryption keys. For example:

• An attacker builds tables which consist of ciphertexts corresponding to specific (commonly sent) plaintexts encrypted using a large number of keys. When a ciphertext is seen that the attacker suspects may correspond to one of the commonly sent plaintexts, the attacker looks up the tables in the hope that the observed ciphertext is listed, in which case the attacker can then read off which key is likely to have been used. The size of the tables that the attacker needs to store in memory can be traded off against the time saved by table lookups.

• (p.39) Primitive-specific attacks. These are attacks that apply generically to a specific class of cryptographic primitives. Examples include:

• Differential and linear cryptanalysis. These two cryptanalysis techniques are primarily targeted against block ciphers, which are now explicitly designed to try to resist them.

• Birthday attacks. This simple attack can be conducted against any hash function and is the baseline attack that determines the output length for modern hash functions (see Section 6.2.3 ).

• Statistical attacks. There is a suite of simple statistical attacks that can be conducted against deterministic generators (see Section 8.1.4 ) and any modern deterministic generator should be resistant to them.

• Algorithm-specific attacks. These are attacks that are designed for use against a specific cryptographic algorithm. Often such attacks are variants of more generic attacks that have been tailored to the working of the specific algorithm.

• Side-channel attacks. This is an important class of attacks that are not against the theoretical design of a cryptographic primitive, but rather the way in which the primitive is implemented. An increasing number of side-channel attacks are being discovered and thus implementers of cryptography need to pay close attention to developments in this area. Examples include:

• Timing attacks. These exploit the fact that different processor computations take slightly different times to compute. Hence, by measuring such timings, it may be possible to learn information about the nature of a computation that a processor is trying to conduct. For example, it may be possible to determine a key by noting the timings of several different operations conducted using that key.

• Power analysis. These attacks are similar to timing attacks except that power consumption is used to obtain information about the nature of the underlying computations.

• Fault analysis. These attacks involve an attacker inducing errors in a cryptosystem and studying the resulting output for useful information.

• Padding attacks. These attacks exploit the fact that plaintext usually needs to be ‘padded’ before processing (see Section 4.3.2 ). By manipulating this process and monitoring resulting error messages it can be possible to learn important information about the nature of the underlying data.

It is notable that the majority of attacks on modern cryptographic algorithms come from the academic community. However, these are often academic attacks in both their origin and applicability. Recall that the idea of ‘breaking’ a cryptographic algorithm is a subjective one and depends on what attack capabilities are considered to be reasonable. Security of modern encryption algorithms tends to be set very conservatively, so that even a very good attack that significantly improves on an exhaustive key search may still be well beyond a practical (p.40) attacker's capabilities. Indeed, many academic attacks involve quite unrealistic assumptions and thus do not have practical impact (for example, they require an impractical number of deliberately chosen plaintext/ciphertext pairs). Others only have practical security implications for some types of application. Nonetheless, the fact that any attack was found at all might be a cause for concern, particularly if the attack technique has the potential to be improved.

Thus caution should be applied before reacting to claims that a particular cryptographic algorithm has been broken. It is important to recognise that without context and detail, such a claim on its own has very little meaning. More detailed information should always be sought and, if necessary, expert opinion should be obtained.

# 1.7 Summary

In this chapter we motivated the need for cryptography and discussed issues concerning its use. In particular, we introduced the basic model of a cryptosystem, as well as important terminology. There are a number of lessons that have emerged:

• The need to secure information is not a new concept, but the environment within which information needs to be secured has changed significantly.

• Cryptography provides the technical means to replicate some of the fundamental security requirements of the physical world in an electronic environment.

• Cryptography can offer strong protection, but only against certain specific threats. It is just as important to be aware of the security threats that cryptography does not protect against, as to be aware of those threats that it does address.

• There are two different types of cryptosystem, symmetric and public-key. These have significantly different properties and each type of cryptosystem has its own inherent advantages and disadvantages, which we will discuss in later chapters. Symmetric and public-key cryptosystems are often combined in real systems.

• In order to assess the security offered by a cryptosystem, it is important to establish clear assumptions about what an attacker can do, and what resources they might make available to attack the cryptosystem.

Cryptography provides fundamental mechanisms for supporting information security. A reader wishing to explore information security in more detail has plenty of options. A good starting place is Schneier [170] , which provides a very accessible overview of different computer security problems and, in particular, places the role (p.41) of cryptography in context. A wider education in information security requires, at a minimum, a broad understanding of information security management, network security and computer security. While there are increasing numbers of texts on specialist aspects of these subjects, we recommend Dhillon [56] and Purser [159] for general introductions to the management of information security, Stallings [183] for network security, and Bishop [34] and Gollmann [92] for introductions to computer security. Anderson [23] provides an interesting read of relevance to all of those subjects, including cryptography. Although only of indirect relevance to cryptography, Stoll [187] is an entertaining story for anyone seeking further motivation for securing information systems.

Levy's highly recommended Crypto is a fascinating read, which covers the dramatic development of cryptography in the latter decades of the 20th century. The ‘crypto politics’ that surrounded these events provides a rich perspective on the different attitudes and perspectives that are held about cryptography. Levy brings this subject alive through interesting profiles of some of the main parties involved during this influential period. This book has been published under two different subtitles [117, 118] and, although sometimes hard to get hold of, is worth tracking down.

The different security services that we have introduced in this chapter are notoriously hard to formally define. Menezes, van Oorschot and Vanstone [123] contains a number of useful definitions, while Dent and Mitchell [55] cover the approach taken by ISO. For an introduction to coding theory, and how it relates to cryptography, approachable reads include Biggs [33] and Welsh [198] . More information about access control can be found in Gollman [92] and Anderson [23] . An accessible introduction to steganography and how it relates to cryptography is Wayner [197] , while Fridrich [85] provides a more detailed discussion of steganographic principles and techniques. Walton [195] provides a thought-provoking perspective on the changes that have occurred in the application environment in which cryptography is deployed since the early 1970s, and the subsequent implications. A useful portal for laws and regulations relating to cryptography is maintained by Bert-Jaap Koops [110] .

Auguste Kerckhoffs’ original article [108] is available online, as are various translations of his six principles for cryptosystem design. We have only touched on very basic attack models for cryptosystems in this chapter. An indication of the stronger and more rigorous attack models used to design modern cryptosystems can be found in, for example, Katz and Lindell [105] and Stinson [185] . The study of side-channel attacks is a very active area of current research, with The Side Channel Cryptanalysis Lounge [64] being a recommended starting point.

Finally, we mention two interesting perspectives on cryptography. Matt Blaze [35] takes our analogy between encryption and physical locks much further. Blaze caused a real stir in the locksmith world when he first published this article, which is an interesting read and illustrates lessons that can be learnt by both the cryptographic and locksmith communities from studying one another's design methodologies. Young and Yung [206] discuss a number of ways in which cryptography can be (p.42) exploited by attackers in order to attack computer systems, which is quite the opposite intention of most cryptographic applications.

# 1.9 Activities

1. 1. Unauthorised access to information could also be reasonably described as ‘stealing’ information. What is one significant difference between ‘stealing’ information and ‘stealing’ physical goods?

2. 2. Consider the two common (but analogous) scenarios of sending a letter in the post and sending an email message.

1. (a) Write down a few lines that summarise the differences between these two processes with respect to:

1. i. ease of creation of a message;

2. ii. ease of sending a message;

3. iii. ease of interception of a message;

4. iv. ease of forgery of a message;

5. v. ease of repudiation (denial of sending) of a message.

2. (b) Having outlined the differences in process, now comment in each case on how the two scenarios differ with respect to the security mechanisms in place at each stage.

3. (c) Is there an equivalent of registered post for sending an email?

4. (d) Is there an equivalent of secure email for sending a letter by post?

5. (e) In your opinion, which scenario is ‘more secure’ than the other?

3. 3. For each of the physical world and the electronic world, provide two examples of the following:

1. (a) Two weak security mechanisms that, when adopted together, represent a fairly strong security mechanism.

2. (b) A strong security mechanism that, when used incorrectly, becomes a weak security mechanism.

3. (c) A strong security mechanism that, when used without an appropriate security infrastructure, becomes a weak security mechanism.

4. 4. Provide an example of at least one application (if there are any such applications) where:

1. (a) data integrity is more important than data confidentiality;

2. (b) entity authentication is more important than data origin authentication;

3. (c) entity authentication and data origin authentication are both required;

4. (d) data origin authentication is necessary but non-repudiation is not necessarily required;

5. (e) data integrity is required but not data origin authentication;

6. (f) data origin authentication is required but not data integrity;

7. (g) entity authentication is provided using more than one mechanism.

5. (p.43) 5. Explain the differences between providing confidentiality through cryptography, steganography and access control mechanisms.

6. 6. In the 19th century, Auguste Kerckhoffs defined six design principles for encryption algorithms.

1. (a) State Kerckhoffs six design principles.

2. (b) Do these six design principles still hold today?

3. (c) Translate these six design principles into a more appropriate language for the cryptographic algorithms that are used today on modern computers.

7. 7. A government department decides that it needs to use encryption to protect communication between itself and its international counterparts. At a meeting with its counterparts it is decided to develop a proprietary cryptographic algorithm for this purpose.

1. (a) Is this decision justifiable?

2. (b) What risks are being taken?

8. 8. There are some encryption algorithms that are almost publicly known in the sense that most of the details are published, but some components of the encryption algorithm are kept secret (proprietary).

2. (b) Do you think this captures the ‘best of both worlds’ or the ‘worst of both worlds’ with respect to knowledge of the encryption algorithm?

9. 9. It is generally good practice in most situations to adopt publicly known and well-established encryption algorithms such as the AES. Some people might argue that this approach is akin to ‘putting all of your eggs into one basket’ and is inherently risky since, if a serious flaw is found in AES, then the implications could be disastrous.

1. (a) Although diversity can be a good thing in many aspects of life, explain why it is not necessarily good when it comes to use of encryption algorithms.

2. (b) How should we mitigate against the risk that a leading encryption algorithm, such as AES, does get unexpectedly broken in the near future?

10. 10. Consider the zoned classification of publicly known encryption algorithms in Section 1.5.4 :

1. (a) For each of the classification zones, explain the potential disadvantages of using an encryption algorithm belonging to that zone.

2. (b) To what extent do you think that such a zoning applies to publicly known cryptographic mechanisms for providing other security services, such as data origin authentication?

11. 11. Suppose that an attacker has got hold of 128 bits of ciphertext that have been encrypted using an encryption algorithm whose keys are known to be 128 bits long. How effective is an exhaustive key search if:

1. (p.44) (a) The attacker does not know the encryption algorithm that was used?

2. (b) The attacker knows the encryption algorithm, but does not know any previous plaintext/ciphertext pairs, and knows that the plaintext is randomly generated?

3. (c) The attacker knows the encryption algorithm, one previous plaintext/ ciphertext pair, and knows that the plaintext is randomly generated?

12. 12. Dan Brown's best seller Digital Fortress [42] features a machine which, it is claimed, can ‘break’ most cryptosystems. Comment on the practicality of building such a machine.

13. 13. Explain why a cryptographic designer might reasonably claim that the main security goal for designing a symmetric encryption algorithm is to make sure that the best attack against it is an exhaustive key search.

14. 14. We often lose perspective of very large numbers.

1. (a) Place the following values in order of increasing size:

• number of possible 40-bit keys

• number of possible 90-bit keys

• number of possible 128-bit keys

• number of web pages indexed by Google

• number of stars in our galaxy

• number of stars in the universe

• number of species of bird on Earth

• number of seconds since the beginning of the universe

2. (b) For each of the above, identify how many bits would be required to represent symmetric keys in a keyspace of that size.

15. 15. Encryption algorithm ALEX has a 40-bit key and encryption algorithm CARLOS has a 48-bit key. Assume that you have sufficient computing power to use an exhaustive key search to find the key of ALEX in one day.

1. (a) Assuming that they have similar computational complexity, how long would you expect it to take to find the key of CARLOS by means of an exhaustive key search?

2. (b) Assume now that the (bad) design of CARLOS allows it to be run in two separate stages such that it is possible to conduct an exhaustive key search for the first 40 bits of a CARLOS key and then perform a separate exhaustive key search for the last 8 bits. How long do you now expect to take to recover a CARLOS key by means of an exhaustive key search?

16. 16. The following table specifies a cryptosystem based around a very simple encryption algorithm with four different plaintexts A, B, C and D (one corresponding to each column) and four different ciphertexts A, B, C and D. The encryption algorithm has five different keys K 1, K 2, K 3, K 4, K 5 (one corresponding to each row). By writing EK (P)=C to mean that the encryption of (p.45) plaintext P using encryption key K is C, the entire cryptosystem is defined as follows:

$Display mathematics$

1. (a) What is the size of the keyspace?

2. (b) If an interceptor sees the ciphertext B then which plaintext can he rule out?

3. (c) What is the ciphertext that results from encrypting plaintext B with K 3, and is this a problem?

4. (d) Could we replace the bottom right-hand entry of the table with$E K 5 ( D ) = C$?

5. (e) Suppose that we define a sixth key K 6 by the rule$E K 6 ( P ) = E K 5 ( E K 1 ( P ) )$ for each plaintext P. In other words, for example,$E K 6 ( A ) = E K 5 ( E K 1 ( A ) ) = E K 5 ( B ) = D$. What are the values for$E K 6 ( B ) , = E K 6 ( C ) and E K 6 ( D ) ?$

6. (f) Could we use a table such as this to represent a real cryptosystem?

17. 17. Explain whether the following scenarios are possible for a symmetric cryptosystem:

1. (a) two different plaintexts encrypt to the same ciphertext under different keys (in other words,$E K 1 ( P 1 ) = E K 2 ( P 2 ) = C ) ;$

2. (b) two different plaintexts encrypt to the same ciphertext under the same key (in other words,$E K ( P 1 ) = E K ( P 2 ) = C ) ;$

3. (c) a plaintext encrypts to the same ciphertext under two different keys (in other words,$E K 1 ( P ) = E K 2 ( P ) = C )$.

18. 18. In most of this chapter we assumed that cryptography was being used to protect data in a communication scenario. However, cryptography can also be used to protect stored data. Which of the issues that we discussed in this chapter are exactly the same regardless of whether cryptography is being used to protect transmitted data or stored data, and which of these are subtly different? (You might like to consider who the likely players are in the basic model of a cryptosystem being used to protect stored data, which security questions they might ask, etc.)

19. 19. Most people generally regard cryptography as a ‘force for good’, which can be used to help protect computer systems.

1. (a) Explain why government organisations might not always regard cryptography as a ‘force for good’.

2. (b) Can you think of any ways in which cryptography could be used as a tool to attack computer systems?

20. (p.46) 20. In certain jurisdictions, cryptographic technology is subject to export controls.

1. (a) Find an example of a country that currently has export controls on cryptographic technology and summarise the extent of these controls.

2. (b) What potential problems do export controls present, given that cryptography is deployed in everyday applications?

21. 21. In contrast to our usage, the term ‘proprietary’ is often used to describe something that is subject to a patent.

1. (a) Find some examples of encryption algorithms that, using our terminology, are:

1. i. publicly known and subject to patent issues;

2. ii. proprietary but not subject to patent issues;

3. iii. proprietary and subject to patent issues.

22. (b) Do you think that it is more, or less, likely that a proprietary encryption algorithm is subject to patent issues than a publicly known encryption algorithm?