Content uploaded by Neil Johnson

Author content

All content in this area was uploaded by Neil Johnson on Sep 02, 2016

Content may be subject to copyright.

Chapter 3

A survey of steganographic techniques

Neil F. Johnson and Stefan C. Katzenbeisser

Many diﬀerent steganographic methods have been proposed during the last few

years; most of them can be seen as substitution systems. Such methods try to

substitute redundant parts of a signal with a secret message (as presented in Section

2.3); their main disadvantage is the relative weakness against cover modiﬁcations.

Recently, the development of new robust watermarking techniques led to advances

in the construction of robust and secure steganography systems. Therefore, some

of the methods presented here are strongly related to watermarking techniques of

Chapter 6.

There are several approaches in classifying steganographic systems. One could

categorize them according to the type of covers used for secret communication. A

classiﬁcation according to the cover modiﬁcations applied in the embedding process

is another possibility. We want to follow the second approach and group stegano-

graphic methods in six categories, although in some cases an exact classiﬁcation is

not possible:

Substitution systems substitute redundant parts of a cover with a secret

message;

Transform domain techniques embed secret information in a transform

space of the signal (e.g., in the frequency domain);

Spread spectrum techniques adopt ideas from spread spectrum commu-

nication;

43

44 Information hiding techniques for steganography and digital watermarking

Statistical methods encode information by changing several statistical prop-

erties of a cover and use hypothesis testing in the extraction process;

Distortion techniques store information by signal distortion and measure

the deviation from the original cover in the decoding step;

Cover generation methods encode information in the way a cover for secret

communication is created.

In the following sections these six categories will be discussed.

3.1 PRELIMINARY DEFINITIONS

Throughout the following sections we want to refer to the cover used in the embed-

ding step as c. We will further assume (without loss of generality) that any cover

can be represented by a sequence of numbers ciof length `(c) (i.e., 1 ≤i≤`(c)). In

the case of digital sound this could be just the sequence of samples over time; in the

case of a digital image, a sequence can be obtained by vectorizing the image (i.e.,

by lining up all pixels in a left-to-right and top-to-bottom order). Possible values

of ciare {0,1}in the case of binary images or integers greater than 0 and less than

256 in the case of quantized images or sound. We will denote the stego-object by s

which is again a sequence siof length `(c).

Sometimes we have to index all cover-elements ci; we will use the symbol j

for such an index. If the index is itself indexed by some set, we use the notation

ji. When we refer to the jith cover-element we mean cji. We will refer to a stego-

key as k; the structure of kwill be explained separately in each steganographic

application. The secret message will be denoted by m, the length of mby `(m),

and the bits forming mby mi,1≤i≤`(m). Unless otherwise stated, we assume

that mi∈{0,1}.

A color value is normally a three-component vector in a color space (a set of

possible colors), see [1]. A well-known color space is RGB. Since the colors red,

green, and blue are additive primaries, every color can be speciﬁed as a weighted

sum of a red, green, and a blue component. A vector in RGB space describes the

intensities of these components. Another space, known as YCbCr, distinguishes

between a luminance (Y) and two chrominance (Cb,Cr) components. Whereas the

Ycomponent accounts for the brightness of a color, Cb and Cr distinguish between

the color grades. A color vector in RGB can be converted to YCbCr using the

transform:

Y=0.299 R+0.587 G+0.114 B

Cb =0.5+(B−Y)/2 (3.1)

Cr =0.5+(R−Y)/1.6

A survey of steganographic techniques 45

An image Cis a discrete function assigning a color vector (of any color space) c(x, y)

to every pixel (x, y).

3.2 SUBSTITUTION SYSTEMS AND BITPLANE TOOLS

A number of methods exist for hiding information in various media. These meth-

ods range from LSB coding—also known as bitplane or noise insertion tools—

manipulation of image or compression algorithms to modiﬁcation of image proper-

ties such as luminance. Basic substitution systems try to encode secret information

by substituting insigniﬁcant parts of the cover by secret message bits; the receiver

can extract the information if he has knowledge of the positions where secret in-

formation has been embedded. Since only minor modiﬁcations are made in the

embedding process, the sender assumes that they will not be noticed by a passive

attacker.

3.2.1 Least signiﬁcant bit substitution

Bitplane tools encompass methods that apply LSB insertion and noise manipulation.

These approaches are common in steganography and are relatively easy to apply

in image and audio [2–6]. A surprising amount of information can be hidden with

little, if any, perceptible impact to the carriers [5, 7, 8].

Sample tools used in this group include StegoDos [9], S-Tools [10], Mandelsteg

[11], EzStego [12], Hide and Seek [13], Hide4PGP [14], White Noise Storm [15], and

Steganos [16]. The image formats typically used in such steganography methods

are lossless and the data can be directly manipulated and recovered. Some of these

programs apply compression and encryption in addition to steganography services.

These services provide better security of the hidden data. Even so, the bitplane

methods are rather brittle and vulnerable to corruption due to small changes to the

carrier.

The embedding process consists of choosing a subset {j1,...,j

`(m)}of cover-

elements and performing the substitution operation cjimion them, which ex-

changes the LSB of cjiby mi(mican either be 1 or 0). One could also imagine a

substitution operation which changes more than one bit of the cover, for instance

by storing two message bits in the two least signiﬁcant bits of one cover-element.

In the extraction process, the LSB of the selected cover-elements are extracted and

lined up to reconstruct the secret message. This basic scheme is presented in Algo-

rithms 3.1 and 3.2. One problem remains to be solved: in which way should the cji

be chosen?

In order to be able to decode the secret message, the receiver must have access

to the sequence of element indices used in the embedding process. In the simplest

46 Information hiding techniques for steganography and digital watermarking

Algorithm 3.1 Embedding process: least signiﬁcant bit substitution

for i=1,...,`(c)do

si←ci

end for

for i=1...,`(m)do

compute index jiwhere to store ith message bit

sji←cjimi

end for

Algorithm 3.2 Extraction process: least signiﬁcant bit substitution

for i=1,...,`(M)do

compute index jiwhere the ith message bit is stored

mi←LSB(cij)

end for

case, the sender uses all cover-elements for information transfer, starting at the

ﬁrst element. Since the secret message will normally have less bits than `(c), the

embedding process will be ﬁnished long before the end of the cover. In this case,

the sender can leave all other cover elements unchanged. This can, however, lead to

a serious security problem: the ﬁrst part of the cover will have diﬀerent statistical

properties than the second part, where no modiﬁcations have been made. To over-

come this problem, for instance the public domain program PGMStealth enlarges

the secret message with random bits—so that `(c)=`(m)—in an attempt to create

an equal change in randomness at the beginning and the end of the cover. The

embedding process thus changes far more elements than the transmission of the

secret would require. Therefore the probability that an attacker will suspect secret

communication increases.

A more sophisticated approach is the use of a pseudorandom number generator

to spread the secret message over the cover in a rather random manner; a popular

approach is the random interval method (e.g., [3]). If both communication partners

share a stego-key kusable as a seed for a random number generator, they can create

a random sequence k1,...,k

`(m)and use the elements with indices

j1=k1

ji=ji−1+ki,i≥2 (3.2)

for information transfer. Thus, the distance between two embedded bits is deter-

mined pseudorandomly. Since the receiver has access to the seed kand knowledge

of the pseudorandom number generator, he can reconstruct kiand therefore the

entire sequence of element indices ji. This technique—which is especially eﬃcient

A survey of steganographic techniques 47

Algorithm 3.3 Embedding process: random interval method

for i=1...,`(c)do

si←ci

end for

generate random sequence kiusing seed k

n←k1

for i=1,...,`(m)do

sn←cnmi

n←n+ki

end for

Algorithm 3.4 Extraction process: random interval method

generate random sequence kiusing seed k

n←k1

for i=1,...,`(m)do

mi←LSB(cn)

n←n+ki

end for

in the case of stream covers—is illustrated in Algorithms 3.3 and 3.4, which are

special cases of the general framework presented in Algorithms 3.1 and 3.2.

3.2.2 Pseudorandom permutations

If all cover bits can be accessed in the embedding process (i.e., if cis a random

access cover), the secret message bits can be distributed randomly over the whole

cover. This technique further increases the complexity for an attacker, since it is

not guaranteed that subsequent message bits are embedded in the same order.

In a ﬁrst attempt Alice could create (using a pseudorandom number gener-

ator) a sequence j1,...,j

`(m)of element indices and store the kth message bit in

the element with index jk. Note that one index could appear more than once in

the sequence, since we have not restricted the output of the pseudorandom number

generator in any way. We call such a case “collision.” If a collision occurs, Alice will

possibly try to insert more than one message bit into one cover-element, thereby

corrupting some of them. If the message is quite short compared with the number

of cover-elements, she hopes that the probability of collisions is negligible and that

corrupted bits could be reconstructed using an error-correcting code. This is, how-

48 Information hiding techniques for steganography and digital watermarking

ever, only the case for quite short secret messages. The probability pof at least one

collision can be estimated1by (provided that `(m)`(c)):

p≈1−exp −`(m)[`(m)−1]

2`(c)

For constant `(c), pconverges rapidly to 1 as `(m) increases. If, for example, a

digital image with 600×600 pixels is used as cover and about 200 pixels are selected

in the embedding process, pis approximately 5%. On the other hand, if 600 pixels

are used for information transfer, pincreases to about 40%. We can conclude that

only for very short messages the probability of collisions is negligible; if the message

size increases, collisions must deﬁnitely be taken into account.

To overcome the problem of collisions, Alice could keep track of all cover-

bits which have already been used for communication in a set B. If during the

embedding process one speciﬁc cover-element has not been used prior, she adds its

index to Band continues to use it. If, however, the index of the cover-element is

already contained in B, she discards the element and chooses another cover-element

pseudorandomly. At the receiver side, Bob applies a similar technique.

Another method has been proposed by Aura [18]; he uses the basic substitution

scheme of Algorithms 3.1 and 3.2 and calculates the index jivia a pseudorandom

permutation of the set {1,...,`(c)}. Suppose the number `(c) can be expressed as

a product of two numbers, Xand Y(recall that this is always the case for digital

images), and hKis an arbitrary cryptographically secure hash function depending

on a key k.Letk1,k2and k3be three secret keys. It can then be shown [19, 20]

that Algorithm 3.5 outputs a diﬀerent number jifor each input i(1 ≤i≤XY ),

(i.e., it produces a pseudorandom permutation of the set {1,...,`(c)}), provided

that the algorithm is evaluated with input i=1,...,`(c).

Alice ﬁrst splits the stego-key kinto three pieces k1,k2and k3. In the embed-

ding process she stores the ith message bit in the element with index ji,whichis

computed according to Algorithm 3.5. Collisions do not occur, since Algorithm 3.5

does not produce duplicate element indices. If Bob has access to the three keys k1,

k2and k3, he is able to reconstruct the positions where Alice embedded the secret

1 The problem of calculating pis an instance of the so-called birthday paradox: an urn is ﬁlled

with nballs, numbered from 1 to n. Suppose that mballs are drawn from the urn with

replacement and their numbers are listed. The probability P(n, m)thatatleastoneballis

drawn twice, provided that m=O(√n), is given by [17]

P(n, m)=1−

m−1

Y

i=0 1−i

n→1−exp −m(m−1)

2n+O1

√n

A survey of steganographic techniques 49

Algorithm 3.5 Computing the index jiusing pseudorandom permutations

v←idiv X

u←imod X

v←(v+hk1(u)) mod Y

u←(u+hk2(v)) mod X

v←(v+hk3(u)) mod Y

ji←vX +u

message bits. However, Aura’s method needs a considerable amount of computation

time, since the chosen hash function must be evaluated 3`(m)times.

3.2.3 Image downgrading and covert channels

In 1992, Kurak and McHugh [5] reported on a security threat in high-security

operating systems. Their fear was that a steganographic technique, called image

downgrading, could be used to exchange images covertly. Image downgrading is a

special case of a substitution system in which images act both as secret messages

and covers. Given a cover-image and a secret image of equal dimensions, the sender

exchanges the four least signiﬁcant bits of the cover’s grayscale (or color) values

with the four most signiﬁcant bits of the secret image. The receiver extracts the

four least signiﬁcant bits out of the stego-image, thereby gaining access to the most

signiﬁcant bits of the secret image. While the degradation of the cover is not visually

noticeable in many cases, 4 bits are suﬃcient to transmit a rough approximation of

the secret image.

In multilevel-secure operating systems, subjects (processes, users) and objects

(ﬁles, databases, etc.) are assigned a speciﬁc security level; for example, see the

famous Bell-LaPadula [21] model. Subjects are normally only allowed to access

objects with a lower security level (“no read up”), whereas they are only able to write

onto objects with a higher level (“no write down”). Whereas the reason for the ﬁrst

restriction is obvious, the second attempts to prohibit users from making conﬁdent

information available to subjects with a lower security classiﬁcation. Information

downgrading can be used to declassify or downgrade information (hence the name)

by embedding classiﬁed information into objects with a substantially lower security

classiﬁcation and thus subvert the principle of “no write down.” Chapter 4 will look

at possible counterstrategies.

50 Information hiding techniques for steganography and digital watermarking

3.2.4 Cover-regions and parity bits

We will call any nonempty subset of {c1,...,c

`(c)}a cover-region. By dividing the

cover in several disjoint regions, it is possible to store one bit of information in a

whole cover-region rather than in a single element. A parity bit of a region Ican

be calculated by

p(I)=X

j∈I

LSB(cj) mod 2 (3.3)

In the embedding step, `(m) disjoint cover-regions Ii(1 ≤i≤`(m)) are selected,

each encodes one secret bit miin the parity bit p(Ii). If the parity bit of one cover-

region Iidoes not match with the secret bit mito encode, one LSB of the values

in Iiis ﬂipped. This will result in p(Ii)=mi. In the decoding process, the parity

bits of all selected regions are calculated and lined up to reconstruct the message.

Again, the cover-regions can be constructed pseudorandomly using the stego-key as

a seed.

Although the method is not more robust than simple bit substitution, it is

conjectured to be more powerful in many cases. First, the sender can choose which

element should be modiﬁed in the cover-region; he can do it in a way that changes

the cover statistics least. Furthermore, the probability p∗

0that the parity bit of

a cover-region consisting of Nrandomly chosen elements is zero, is approximately

1/2, nearly independent of the probability p0that the LSB of one randomly selected

cover-element is zero, since

p∗

0=

bN/2c

X

i=0 N

2i(1 −p0)2ipN−2i

0

=pN

0

2"1+1−p0

p0N

+1−1−p0

p0N#

=1

21+(2p0−1)N(3.4)

Equation (3.4) follows from the fact that p(I) = 0 if and only if there is an even

number of pixels in the cover-region which have least signiﬁcant bit 1. Since (2p0−

1)N→0if0<p

0<1, we can conclude that p∗

0rapidly approaches 1/2asN

increases, regardless of p0. This indicates that the eﬀect of the embedding process

on the cover can be reduced by increasing N.

A survey of steganographic techniques 51

3.2.5 Palette-based images

In a palette-based image only a subset of colors from a speciﬁc color space can

be used to colorize the image. Every palette-based image format consists of two

parts: a palette specifying Ncolors as a list of indexed pairs (i, ci), assigning a color

vector cito every index i, and the actual image data which assign a palette index

to every pixel rather than the color value itself. If only a small number of color

values are used throughout the image, this approach greatly reduces the ﬁle size.

Two of the most popular formats are the graphics interchange format (GIF) and the

BMP bitmap format. However, due to the availability of sophisticated compression

techniques, their use declines.

Generally, there are two ways to encode information in a palette-based image:

either the palette or the image data can be manipulated. The LSB of the color

vectors could be used for information transfer, just like the substitution methods

presented in the last subsections. Alternatively, since the palette does not need to

be sorted in any way, information can be encoded in the way the colors are stored

in the palette. Since there are N! diﬀerent ways to sort the palette, there is enough

capacity to encode a small message. However, all methods which use the order of a

palette to store information, are not robust, since an attacker can simply sort the

entries in a diﬀerent way and destroy the secret message (he thereby does not even

modify the picture visibly).

Alternatively, information can be encoded in the image data. Since neighbor-

ing palette color values need not be perceptually similar, the approach of simply

changing the LSB of some image data fails. Some steganographic applications (e.g.,

the program EzStego) therefore sort the palette so that neighboring colors are per-

ceptually similar before they start the embedding process. Color values can, for

instance, be stored according to their Euclidian distance in RGB space:

d=√R2+G2+B2(3.5)

Since the human visual system is more sensitive to changes in the luminance of

a color, another (probably better) approach would be sorting the palette entries

according to their luminance component, see (3.1). After the palette is sorted, the

LSB of color indices can safely be altered.

Fridrich [22] proposes using a slightly diﬀerent technique which does not need

the palette to be sorted: for every pixel, the set of closest colors (in the Euclidian

norm) is calculated. Starting with the closest color, the sender proceeds to ﬁnd the

next-closest color until a color is found where its parity (R+G+Bmod 2) matches

with the secret bit to encode. Once such a color is found, the pixel is changed to

this new color.

Yet another steganographic application reduces the total number of color values

52 Information hiding techniques for steganography and digital watermarking

in a picture to bN/2cusing some dithering method, and doubles the entire palette;

thereby all doubled entries are slightly modiﬁed. After this preprocessing stage,

each color value of the dithered image corresponds to two palette entries, from which

one is chosen according to a secret message bit (e.g., Mandelsteg [11], S-Tools [10],

Hide4PGP [14], and Hide and Seek [13] apply variations of this method).

3.2.6 Quantization and dithering

Dithering and quantization of digital images can be used for embedding secret in-

formation. Matsui and Tanaka [23] presented two steganographic systems which

operate on quantized images. We brieﬂy review quantization in the context of pre-

dictive coding here. In predictive coding, the intensity of each pixel is predicted

based on the pixel values in a speciﬁc neighborhood; the prediction may be a lin-

ear or nonlinear function of the surrounding pixel values. In its simplest form, the

diﬀerence eibetween adjacent pixels xiand xi+1 is calculated and fed into a quan-

tizer Qwhich outputs a discrete approximation ∆iof the diﬀerence signal xi−xi−1

(i.e., ∆i=Q(xi−xi−1)). Thus, in each quantization step a quantization error

is introduced. For highly correlated signals we can expect ∆ito be close to zero,

so an entropy coder—which tries to create a minimum-redundancy code given a

stochastic model of the data to be transmitted—will be eﬃcient. At the receiver

side the diﬀerence signal is dequantized and added to the last signal sample in order

to construct an estimate for the sequence xi.

For steganographic purposes the quantization error in a predictive coding

scheme can be utilized; speciﬁcally, we adjust the diﬀerence signal ∆iso that it

transmits additional information. In this scheme, the stego-key consists of a table

which assigns a speciﬁc bit to every possible value of ∆i; for instance, the following

assignment could be made:

∆i-4-3-2-101234

010111001

In order to store the ith message bit in the cover-signal, the quantized diﬀerence

signal ∆iis computed. If ∆idoes not match (according to the secret table) with the

secret bit to be encoded, ∆iis replaced by the nearest ∆jwhere the associated bit

equals the secret message bit. The resulting values ∆iare then fed into the entropy

coder. At the receiver side, the message is decoded according to the diﬀerence signal

∆iand the stego-key.

Secret information can also be inserted into a signal during a dithering process;

see [23] and Baharav and Shaked [24] for details.

A survey of steganographic techniques 53

Algorithm 3.6 Zhao and Koch’s algorithm for data embedding in binary images

for i=1,...,`(M)do

do forever

pseudorandomly select a new image block Bj

/* Test, if block Biis valid */

if P1(Bj)>R

1+3λor P1(Bj)<R

0−3λthen continue

if (ci=1andP1(Bj)<R

0)or(ci=0andP1(Bj)>R

1)then

mark block Bjas unusable, i.e. modify block so that

either P1(Bj)<R

0−3λor P1(Bj)>R

1+3λ

continue

endif

break

enddo

/* Embed secret message bit in Bj*/

if ci=1then

modify Bjso that P1(Bj)≥R1and P1(Bj)≤R1+λ

else

modify Bjso that P0(Bj)≤R0and P0(Bj)≥R0−λ

end if

end for

3.2.7 Information hiding in binary images

Binary images—like digitized fax data—contain redundancies in the way black and

white pixels are distributed. Although the implementation of a simple substitution

scheme is possible (e.g., certain pixels could be set to black or white depending on

a speciﬁc message bit), these systems are highly susceptible to transmission errors

and are therefore not robust.

One information hiding scheme which uses the number of black pixels in a

speciﬁc image region to encode secret information was presented by Zhao and Koch

[25]. A binary image is divided into rectangular image blocks Bi;letP0(Bi)be

the percentage of black pixels in the image block Biand P1(Bi) the percentage

of white pixels, respectively. Basically, one block embeds a 1, if P1(Bi)>50%

and a 0, if P0(Bi)>50%. In the embedding process the color of some pixels is

changed so that the desired relation holds. Modiﬁcations are carried out at those

pixels whose neighbors have the opposite color; in sharply contrasted binary images,

modiﬁcations are carried out at the boundaries of black and white pixels. These

rules assure that the modiﬁcations are not generally noticeable.

In order to make the entire system robust to transmission errors and other

54 Information hiding techniques for steganography and digital watermarking

Algorithm 3.7 Extraction process (Zhao and Koch)

for i=1,...,`(M)do

do forever

pseudorandomly select image block Bj

if P1(Bj)>R

1+3λor P1(Bj)<R

0−3λthen continue

break

enddo

if P1(Bj)>50% then

mi←1

else

mi←0

end if

end for

image modiﬁcations, we have to adapt the embedding process. If it is possible

that some pixels change color during the transmission process, it could be the

case that for instance P1(Bi) drops from 50.6% to 49.5%, thereby destroying the

embedded information. Therefore two threshold values R1>50% and R0<50%

and a robustness parameter λ, which speciﬁes the percentage of pixels which can

change color during transmission, are introduced. The sender assures during the

embedding process that either P1(Bi)∈[R1,R

1+λ]orP0(Bi)∈[R0−λ, R0]instead

of P1(Bi)>50% and P0(Bi)<50%. If too many pixels must be changed in order

to achieve that goal, the block is marked as “invalid”: P1(Bi) is modiﬁed to fulﬁll

one of the two conditions

P1(Bi)<R

0(Bi)−3λ

P1(Bi)>R

1(Bi)+3λ

and another block is pseudorandomly chosen for bit i. In the decoding process,

invalid blocks are skipped. Otherwise, the information is decoded according to

P1(Bi). The embedding and extraction algorithms are outlined in Algorithms 3.6

and 3.7.

A diﬀerent embedding scheme, presented by Matsui and Tanaka [23], uses

the lossless compression system which is used to encode information in a facsimile

document. According to a recommendation of the former Comit´e Consultatif Inter-

national T´el´egraphique et T´el´ephonique (which is now the International Telecom-

munication Union) [26], fax images can be coded using a combination of run length

(RL) and Huﬀman encoding. RL techniques utilize the fact that in a binary image

successive pixels have the same color with high probability. Figure 3.1 shows one

scan line from a fax document; we will indicate positions with changing colors with

A survey of steganographic techniques 55

aa a aa

24310

Figure 3.1 One scan line of a binary image.

ai. Instead of coding the color of every pixel explicitly, RL methods code the posi-

tions of color changes (ai) together with the number RL(ai,a

i+1) of successive pixels

with the same color starting at ai. Our hypothetical scan line of Figure 3.1 would be

coded by ha0,3i,ha1,5i,ha2,4i,ha3,2i,ha4,1i. We can thus describe a binary image

as a sequence of RL elements hai,RL(ai,a

i+1)i.

Information can be embedded into a binary, run-length encoded image by modi-

fying the least signiﬁcant bit of RL(ai,a

i+1). In the encoding process we modify

the run lengths of the binary picture so that RL(ai,a

i+1)iseven,iftheith secret

message bit miis zero. If, however, RL(ai,a

i+1) is odd, miis one. This can be

achieved, for example, by the following manner: if miis zero but RL(ai,a

i+1)is

odd, we move the position of ai+1 one pixel to the left. On the other hand, we

move ai+1 one pixel to the right, if mi=1andRL(ai,a

i+1) is even. This insertion

technique, however, leads to problems if the run-length RL(ai,a

i+1) is one. If the

run-length needs to be changed in the embedding process, it could be lost. We

therefore have to assure that such a situation will never happen; for example, all

RL elements with run-length one could be dropped before starting the embedding

process.

3.2.8 Unused or reserved space in computer systems

Taking advantage of unused or reserved space to hold covert information provides

a means of hiding information without perceptually degrading the carrier. For

example: the way operating systems store ﬁles typically results in unused space that

appears to be allocated to a ﬁle. For example, under Windows 95 operating system,

drives formatted as FAT16 (MS-DOS compatible) without compression typically use

cluster sizes of 322kilobytes (Kb). This means that the minimum space allocated

to a ﬁle is 32 Kb. If a ﬁle is 1 Kb in size, then an additional 31 Kb is “wasted.” This

“extra” space can be used to hide information without showing up in the directory.

Unused space in ﬁle headers of image and audio can also be used to hold “extra”

information.

2 This depends on the size of the hard drive.

56 Information hiding techniques for steganography and digital watermarking

Another method of hiding information in ﬁle systems is to create a hidden

partition. These partitions are not seen if the system is started normally. However,

in many cases, running a disk conﬁguration utility (such as DOS’s FDISK) exposes

the hidden partition. These concepts have been expanded in a novel proposal of a

steganographic ﬁle system [27, 28]. If the user knows the ﬁle name and password,

access is granted to the ﬁle; otherwise, no evidence of the ﬁle exists in the system.

Protocols in the OSI network model have characteristics that can be used to

hide information [29]. TCP/IP packets used to transport information across the

Internet have unused space in the packet headers. The TCP packet header has six

unused (reserved) bits and the IP packet header has two reserved bits. Thousands

of packets are transmitted with each communication channel, which provides an ex-

cellent covert communication channel if unchecked. The ease in use and abundant

availability of steganography tools has law enforcement concerned in traﬃcking of il-

licit material via Web page images, audio, and other ﬁles being transmitted through

the Internet. Methods of message detection and understanding the thresholds of

current technology are necessary to uncover such activities (see Chapter 4).

3.3 TRANSFORM DOMAIN TECHNIQUES

We have seen that LSB modiﬁcation techniques are easy ways to embed information,

but they are highly vulnerable to even small cover modiﬁcations. An attacker can

simply apply signal processing techniques in order to destroy the secret information

entirely. In many cases even the small changes resulting out of lossy compression

systems yield to total information loss.

It has been noted early in the development of steganographic systems that

embedding information in the frequency domain of a signal can be much more robust

than embedding rules operating in the time domain. Most robust steganographic

systems known today actually operate in some sort of transform domain.

Transform domain methods hide messages in signiﬁcant areas of the cover image

which makes them more robust to attacks, such as compression, cropping, and

some image processing, than the LSB approach. However, while they are more

robust to various kinds of signal processing, they remain imperceptible to the human

sensory system. Many transform domain variations exist. One method is to use the

discrete cosine transformation (DCT) [30–33] as a vehicle to embed information in

images; another would be the use of wavelet transforms [34]. Transformations can be

applied over the entire image [30], to blocks throughout the image [35, 36], or other

variations. However, a trade-oﬀ exists between the amount of information added to

the image and the robustness obtained [7, 37]. Many transform domain methods

A survey of steganographic techniques 57

are independent to image format and may survive conversion between lossless and

lossy formats.

Before we describe transform domain steganographic methods, we will brieﬂy

review the Fourier and cosine transforms which can be used to map a signal into

the frequency domain. The discrete Fourier transform (DFT) of a sequence sof

length Nis deﬁned to be

S(k)=F{s}=

N−1

X

n=0

s(n)exp

−2inπk

N(3.6)

where i=√−1 is the imaginary unit. The inverse Fourier transform is given by

s(k)=F−1{S}=

N−1

X

n=0

S(n)exp

2inπk

N(3.7)

Another useful transform is the DCT, given by

S(k)=D{s}=C(k)

2

N

X

j=0

s(j)cos(2j+1)kπ

2N

s(k)=D−1{S}=

N

X

j=0

C(j)

2s(j)cos(2j+1)kπ

2N(3.8)

where C(u)=1/√2ifu=0andC(u) = 1 otherwise. The DCT has the primary

advantage that D{s}is a sequence of real numbers, provided that the sequence sis

real. In digital image processing, the two-dimensional version of the DCT is used:

S(u, v)= 2

NC(u)C(v)

N−1

X

x=0

N−1

X

y=0

s(x, y)cosπu(2x+1)

2Ncos πv(2y+1)

2N

s(x, y)= 2

N

N−1

X

u=0

N−1

X

v=0

C(u)C(v)S(u, v)cosπu(2x+1)

2Ncos πv(2y+1)

2N

The two-dimensional DCT is the “heart” of the most popular lossy digital image

compression system used today: the JPEG system [38, 39] (see Figure 3.2). JPEG

ﬁrst converts the image to be compressed into the YCbCr color space and breaks up

each color plane into 8×8 blocks of pixels. Then, all blocks are DCT transformed.

In a quantization step all DCT coeﬃcients are divided by some predeﬁned quan-

tization values (see Table 3.1) and rounded to the nearest integer (according to a

quality factor, the quantization values can be scaled by a constant). The purpose of

this process is to modulate the inﬂuence of the diﬀerent spectral components on the

58 Information hiding techniques for steganography and digital watermarking

Source image

8x8 blocks

JPEG compression interface

DCT Quantizer Huﬀman-

coder

Quantization

table Table

Compressed

image

Figure 3.2 Outline of the JPEG image compression algorithm.

(u,v)01234567

016 11 10 16 24 40 51 61

112 12 14 19 26 58 60 55

214 13 16 24 40 57 69 56

314 17 22 29 51 87 80 62

418 22 37 56 68 109 103 77

524 35 55 64 81 104 113 92

649 64 78 87 103 121 120 101

772 92 95 98 112 100 103 99

Table 3.1 Quantization values used in the JPEG compression scheme (luminance components).

image. In particular, the inﬂuence of the highest DCT coeﬃcients is reduced: they

are likely to be dominated by noise and are not expected to contribute signiﬁcant

details to the picture. The resulting quantized DCT coeﬃcients are compressed

using an entropy coder (e.g., Huﬀman [40] or arithmetic coding). In the JPEG

decoding step all DCT coeﬃcients are dequantized (i.e., multiplied with the quan-

tization values which had been used in the encoding step). Afterwards an inverse

DCT is performed to reconstruct the data. The restored picture will be close to

(but not identical with) the original one; but if the quantization values were set

properly, there should be no noticeable diﬀerence for a human observer.

3.3.1 Steganography in the DCT domain

One popular method of encoding secret information in the frequency domain is

modulating the relative size of two (or more) DCT coeﬃcients within one image

A survey of steganographic techniques 59

Algorithm 3.8 DCT–Steg encoding process

for i=1,...,`(M)do

choose one cover-block bi

Bi=D{bi}

if mi=0then

if Bi(u1,v

1)>B

i(u2,v

2)then

swap Bi(u1,v

1)andBi(u2,v

2)

end if

else

if Bi(u1,v

1)<B

i(u2,v

2)then

swap Bi(u1,v

1)andBi(u2,v

2)

end if

end if

adjust both values so that |Bi(u1,v

1)−Bi(u2,v

2)|>x

b0

i=D−1{Bi}

end for

create stego-image out of all b0

i

block. We will describe a system which uses digital images as covers and which is

similar to a technique proposed by Zhao and Koch [25].

During the encoding process, the sender splits the cover-image in 8×8pixel

blocks; each block encodes exactly one secret message bit. The embedding process

starts with selecting a pseudorandom block biwhich will be used to code the ith

message bit. Let Bi=D{bi}be the DCT-transformed image block.

Before the communication starts, both sender and receiver have to agree on

the location of two DCT coeﬃcients, which will be used in the embedding process;

let us denote these two indices by (u1,v

1)and(u2,v

2). The two coeﬃcients should

correspond to cosine functions with middle frequencies; this ensures that the infor-

mation is stored in signiﬁcant parts of the signal (hence the embedded information

will not be completely damaged by JPEG compression). Furthermore, we can as-

sume that the embedding process will not degenerate the cover heavily, because it is

widely believed that DCT coeﬃcients of middle frequencies have similar magnitudes

[41]. Since the constructed system should be robust against JPEG compression, we

choose the DCT coeﬃcients in such a way that the quantization values associated

with them in the JPEG compression algorithm are equal. According to Table 3.1

the coeﬃcients (4,1) and (3,2) or (1,2) and (3,0) are good candidates.

One block encodes a “1,” if Bi(u1,v

1)>B

i(u2,v

2), otherwise a “0.” In the

encoding step, the two coeﬃcients are swapped if their relative size does not match

with the bit to be encoded. Since the JPEG compression can (in the quantiza-

60 Information hiding techniques for steganography and digital watermarking

Algorithm 3.9 DCT–Steg decoding process

for i=1,...,`(M)do

get cover-block biassociated with bit i

Bi=D{bi}

if Bi(u1,v

1)≤Bi(u2,v

2)then

mi=0

else

mi=1

end if

end for

tion step) aﬀect the relative sizes of the coeﬃcients, the algorithm ensures that

|Bi(u1,v

1)−Bi(u2,v

2)|>xfor some x>0, by adding random values to both

coeﬃcients. The higher xis, the more robust the algorithm will be against JPEG

compression, however, at the expense of image quality. The sender then performs

an inverse DCT to map the coeﬃcients back into the space domain. To decode the

picture, all available blocks are DCT-transformed. By comparing the two coeﬃ-

cients of every block, the information can be restored. Embedding and extraction

algorithms are outlined in Algorithms 3.8 and 3.9.

If the constant xand the location of the used DCT coeﬃcients are chosen

properly, the embedding process will not degenerate the cover visibly. We can expect

this method to be robust against JPEG compression, since in the quantization

process both coeﬃcients are divided by the same quantization values. Their relative

size will therefore only be aﬀected in the rounding step.

Perhaps the most important drawback of the system presented above is the

fact that Algorithm 3.8 does not discard image blocks where the desired relation of

the DCT coeﬃcients cannot be enforced without severely damaging the image data

contained in this speciﬁc block.

Zhao and Koch [25, 31] proposed a similar system which does not suﬀer from

this drawback. They operate on quantized DCT coeﬃcients and use the relations of

three coeﬃcients in a block to store the information. The sender DCT transforms

the image block biand performs a quantization step to get BQ

i. One block encodes

a “1,” if BQ

i(u1,v

1)>B

Q

i(u3,v

3)+Dand BQ

i(u2,v

2)>B

Q

i(u3,v

3)+D.Onthe

other hand, a “0” is encoded, if BQ

i(u1,v

1)+D<B

Q

i(u3,v

3)andBQ

i(u2,v

2)+D<

BQ

i(u3,v

3). The parameter Daccounts for the minimum distance between two

coeﬃcients for representing an embedded bit; normally D= 1. The higher Dis,

the more robust the method will be against image processing techniques. Again,

the three selected coeﬃcients should be situated in the middle of the spectrum.

In the encoding step, the relations between these three coeﬃcients are changed

A survey of steganographic techniques 61

so that they represent one bit of the secret information. If the modiﬁcations required

to code one secret bit are too large, then the block is not used for information

transfer and marked as “invalid.” This is the case, if the diﬀerence between the

largest and the smallest coeﬃcient is greater than some constant value MD.The

higher MD is, the more blocks can be used for communication. In order to allow a

correct decoding, the quantized DCT coeﬃcients of an invalid block are changed so

that they fulﬁll one of the two conditions

BQ

i(u1,v

1)≤BQ

i(u3,v

3)≤BQ

i(u2,v

2) (3.9)

or

BQ

i(u2,v

2)≤BQ

i(u3,v

3)≤BQ

i(u1,v

1) (3.10)

Afterwards the block is dequantized and the inverse DCT is applied.

The receiver can restore the information by applying DCT and quantizing the

block. If the three selected coeﬃcients fulﬁll one of the conditions (3.9) or (3.10), the

block is ignored. Otherwise the encoded information can be restored by comparing

BQ

i(u1,v

1), BQ

i(u2,v

2), and BQ

i(u3,v

3). The authors claim that this embedding

method is robust against JPEG compression (with quality factors as low as 50%),

since all changes are made after the “lossy” quantization step.

3.3.2 Hiding information in digital sound: phase coding

Embedding secret messages in digital sound is generally more diﬃcult than em-

bedding information in digital images. Moore [42] noted that the human auditory

system is extremely sensitive; perturbations in a sound ﬁle can be detected as low as

one part in 10 million. Although the limit of perceptible noise increases as the noise

level of the cover increases, the maximum allowable noise level is generally quite

low. It is however known that the human auditory system is much less sensitive

to the phase components of sound; this fact has been exploited in numerous digital

audio compression systems.

In phase coding [2], a digital datum is represented by a phase shift in the phase

spectrum of the carrier signal; the carrier signal cis split into a series of Nshort

sequences, ci(n)oflength`(m), a DFT is applied, and a matrix of the phases φi(k)

and Fourier transform magnitudes Ai(k) is created. Recall that

Ai(k)=pRe[F{ci}(k)]2+Im[F{ci}(k)]2(3.11)

and

φi(k) = arctan Im[F{ci}(k)]

Re[F{ci}(k)] (3.12)

62 Information hiding techniques for steganography and digital watermarking

Since phase shifts between consecutive signal segments can easily be detected, their

phase diﬀerences need to be preserved in the stego-signal. The embedding process

thus inserts a secret message only in the phase vector of the ﬁrst signal segment:

˜

φ0(k)=π/2ifmk=0

−π/2ifmk=1 (3.13)

and creates a new phase matrix using the original phase diﬀerences

˜

φ1(k)=˜

φ0(k)+[φ1(k)−φ0(k)]

...

˜

φN(k)=˜

φN−1(k)+[φN(k)−φN−1(k)] (3.14)

The sender then uses the new phase matrix ˜

φi(k) and the original matrix of Fourier

transform magnitudes Ai(k) to construct the stego-signal using the inverse Fourier

transform. Since φ0(k) is modiﬁed, the absolute phases of all following segments are

changed, while their relative diﬀerences are preserved. Before the secret information

can be restored, some sort of synchronization must take place. Given the knowledge

of the sequence length `(m), the receiver is able to calculate the DFT and to detect

the phases φ0(k).

3.3.3 Echo hiding

Echo hiding [4] attempts to hide information in a discrete signal f(t) by introducing

an echo f(t−∆t) in the stego-signal c(t):

c(t)=f(t)+αf(t−∆t) (3.15)

Information is encoded in the signal by modifying the delay ∆tbetween the signal

and the echo. In the encoding step, the sender chooses either ∆tor ∆t0; in the ﬁrst

case, a “0” is encoded in the signal c(t), in the latter case a “1.” The delay times ∆t

or ∆t0are chosen in a way that the echo signal is not audible for a human observer.

The basic echo hiding scheme can only embed one bit in a signal; therefore a

cover signal is divided into `(m) blocks prior to the encoding process. Consecutive

blocks should be separated by a random number of unused samples so that the

detection and extraction of the secret message bits is harder. In each block one

secret bit is embedded according to (3.15); in the last step all signal blocks are

concatenated.

Before the secret message can be extracted out of the stego-signal, some sort

of synchronization must take place; the receiver must be able to reconstruct the

`(m) signal blocks the sender used to embed one secret message bit. Each signal

A survey of steganographic techniques 63

segment can then be decoded via the autocorrelation function of the signal’s cep-

strum. Gruhl et al. [4] show that the autocorrelation function shows a spike at the

delay time ∆t. For a further investigation of echo hiding see Section 7.5.1.

Chang and Moskowitz [43] analyze several methods usable for information hid-

ing in digital sound, among them low-bit coding (LSB), phase coding, spread spec-

trum techniques (see Section 3.4), and echo hiding. Low-bit coding techniques are

not robust, but have the highest data transmission rate. Phase coding provides

robustness against resampling of the carrier signal, but has a very low data trans-

mission rate since secret information is encoded only in the ﬁrst signal segment. On

the contrary, spread spectrum and echo hiding perform better in many cases.

3.3.4 Information hiding and data compression

In some cases, information hiding algorithms are incorporated in data compression

systems; one can think of a videoconferencing system which allows messages to be

hidden in the video stream while it is being recorded. Most research work focussed

on information hiding schemes for lossy video or image compression systems, but

it should be noted that lossless compression systems can also be used for secret in-

formation transfer; Cachin [44] showed how to construct an asymptotically optimal

steganographic system by modifying Willems [45] “repetition times” compression

algorithm.

Numerous steganographic systems for compressed video or images have been

proposed. In the simplest technique, applied by the tool Jpeg-Jsteg [46], informa-

tion is hidden in the way DCT coeﬃcients in the JPEG compression system (see

Section 3.3) are rounded. Since the DCT normally outputs noninteger sequences for

integer inputs, the JPEG system must quantize DCT coeﬃcients before the encod-

ing process. Information is hidden by rounding the coeﬃcients either up or down

according to the secret message bits. Although such a system is not robust, detec-

tion of the cover modiﬁcations seems to be diﬃcult. Westfeld and Wolf [47] describe

a similar technique. Their system operates on quantized, DCT-encoded blocks of

video frames. After distinguishing blocks which are suitable for secret transmission

from unusable blocks, the modulo-2 sum of the DCT coeﬃcients of the block is

changed in a way that it transmits secret information (see [47] for details). More

sophisticated methods combine video compression schemes with spread spectrum.

As an example, Hartung and Girod [48, 49] presented an information-hiding scheme

operating on precompressed video using their spread spectrum watermarking system

(see Section 6.4.1).

64 Information hiding techniques for steganography and digital watermarking

3.4 SPREAD SPECTRUM AND INFORMATION HIDING

Spread spectrum (SS) communication technologies have been developed since the

1950s in an attempt to provide means of low-probability-of-intercept and antijam-

ming communications. Pickholtz et al. [50] deﬁne spread spectrum techniques as

“means of transmission in which the signal occupies a bandwidth in excess of the

minimum necessary to send the information; the band spread is accomplished by

means of a code which is independent of the data, and a synchronized reception with

the code at the receiver is used for despreading and subsequent data recovery.” Al-

though the power of the signal to be transmitted can be large, the signal-to-noise

ratio in every frequency band will be small. Even if parts of the signal could be

removed in several frequency bands, enough information should be present in the

other bands to recover the signal. Thus, SS makes it diﬃcult to detect and/or re-

move a signal. This situation is very similar to a steganography system which tries

to spread a secret message over a cover in order to make it impossible to perceive.

Since spreaded signals tend to be diﬃcult to remove, embedding methods based on

SS should provide a considerable level of robustness. Since the landmark paper by

Tirkel et al. [51], spread spectrum methods are of increasing importance in the ﬁeld

of information hiding.

In information hiding, two special variants of SS are generally used: direct-

sequence and frequency-hopping schemes. In direct-sequence schemes, the secret

signal is spread by a constant called chip rate, modulated with a pseudorandom

signal and added to the cover. On the other hand, in frequency-hopping schemes

the frequency of the carrier signal is altered in a way that it hops rapidly from one

frequency to the another. SS are widely used in the context of watermarking, as

will be shown in Section 6.4.1. One particularly interesting direct-sequence water-

marking algorithm, invented by Hartung and Girod [48, 49], which could also be

used for steganographic purposes, will be described in Section 6.4.1.

Due to the similarity of SS watermarking and steganography algorithms, we will

limit the discussion in this chapter to presenting a mathematical model describing

the application of spread spectrum techniques in information hiding and discuss a

system called SSIS as a case study.

3.4.1 A spread spectrum model

Smith and Comiskey [52] presented a general framework for spread spectrum

steganography. Their approach originally used N×Mgrayscale images as cov-

ers; however, the work can easily be extended to all cover sets on which a scalar

product can be deﬁned. We will assume that Alice and Bob share a set of (at

A survey of steganographic techniques 65

least) `(m) orthogonal N×Mimages φias a stego-key. Alice ﬁrst generates a

stego-message E(x, y) by building the weighted sum

E(x, y)=X

i

miφi(x, y) (3.16)

The images φiare orthogonal to each other,

hφi,φ

ji=

N

X

x=1

M

X

y=1

φi(x, y)φj(x, y)=Giδi,j (3.17)

where Gi=PN

x=1 PM

y=1 φ2

i(x, y)andδi,j is the Kronecker delta function. Alice then

encodes the secret information Ein a cover Cby building the element-wise sum of

both images, creating the stego-cover S:

S(x, y)=C(x, y)+E(x, y) (3.18)

In the ideal case, Cis orthogonal to all φi,(sohC, φii= 0) and Bob can extract

the ith message bit miby projecting the stego-image Sonto the ith basis image φi:

hS, φii=hC, φii+*X

j

mjφj,φ

i+

=X

j

mjhφj,φ

ii

=Gimi(3.19)

Therefore, the secret information can be recovered by calculating mi=hS, φii/Gi.

Note that the original cover Cis not needed in the decoding phase. In practice,

however, Cwill not be completely orthogonal to all images φi, so an error term

hC, φii=∆Cihas to be introduced in (3.19):

hS, φii=∆Ci+Gimi(3.20)

We will now show that under reasonable assumptions the expected value of ∆Ci

is zero. Let both Cand φibe two independent NM-dimensional random variables.

If we assume that all basis images were created using a zero-mean random process

and they are independent from the messages to be transmitted, then

E[∆Ci]=

N

X

i=1

M

X

j=1

E[C(x, y)]E[φi(x, y)] = 0 (3.21)

Thus, the expected value of the error term in (3.20) is zero under these assumptions.

66 Information hiding techniques for steganography and digital watermarking

The decoding operation therefore consists of reconstructing a secret message by

projecting the stego-image Sonto all functions φiyielding an approximative value

si=hS, φii=∆Ci+Gimi(3.22)

Subject to the conditions stated above, the expected value of ∆Ciis zero, so si≈

Gimi. The ﬁnal task is to reconstruct mifrom si. If we encode secret messages as

strings of −1 and 1 instead of simply using binary strings, the values of mican be

reconstructed using the sign function, provided that Gi0:

mi=sign(si)=

−1if si<0

0if si=0

1if si>0

(3.23)

In the case of mi= 0 the encoded information has been lost. In some severe circum-

stances the quantity |∆Ci|could become so large (recall that we have only proved

that the expected value is zero) that the recovery of one bit is not possible. How-

ever, this case will not happen often and can be coped with by the implementation

of an error-correcting code.

The main advantage of using spread spectrum techniques in steganography

is the relative robustness to image modiﬁcations. Since the encoded information

is spread over a wide frequency band it is quite diﬃcult to remove it completely

without entirely destroying the cover. In practice, modiﬁcations of the stego-cover

will increase the value of ∆Ci. These modiﬁcations will not be harmful to the

embedded message, unless |∆Ci|>|Gimi|.

3.4.2 SSIS: a case study

Marvel et al. [53] presented a steganographic system called SSIS which we will

discuss here brieﬂy as a case study. SSIS uses a spread spectrum technique as

an embedding function; this mechanism can be described as follows. Before the

embedding process, the secret message is encrypted using a conventional symmet-

ric encryption scheme, thereby using a secret key k1. Furthermore, the encrypted

secret message will be encoded via a low-rate error-correcting code (such as a Reed-

Solomon code). This step will increase the robustness of the overall steganographic

application. The resulting encoded message is then modulated by a pseudoran-

dom sequence produced by a pseudorandom number generator using k2as seed.

The resulting (random-looking) signal is then input into an interleaver (which uses

k3as seed) and added to the cover. In a last step, the resulting stego-image is

appropriately quantized.

At the receiver side the embedding process is reversed. Since one design goal

of SSIS was to provide a blind steganographic system—thus, a system in which

A survey of steganographic techniques 67

the original image is not needed in the decoding process—an estimate of the orig-

inal image is obtained using an image-restoration technique such as an adaptive

Wiener ﬁlter. Subtracting the stego-image from the cover-image estimate yields an

estimate for the modulated and spread stego-message. The resulting bits are then

deinterleaved and demodulated (using k3and k2). Due to the poor performance

of the Wiener ﬁlter, the reconstructed secret message will contain incorrect bits;

the stego-system can thus be seen as a form of transmission on a noisy channel.

However, the use of an error-correcting code can help to recover corrupted message

bits. In a last step, the secret message is decrypted.

3.5 STATISTICAL STEGANOGRAPHY

Statistical steganography techniques utilize the existence of “1-bit” steganographic

schemes, which embed one bit of information in a digital carrier. This is done

by modifying the cover in such a way that some statistical characteristics change

signiﬁcantly if a “1” is transmitted. Otherwise the cover is left unchanged. So the

receiver must be able to distinguish unmodiﬁed covers from modiﬁed ones.

In order to construct a `(m)-bit stego-system from multiple “1-bit” stego-

systems, a cover is divided into `(m) disjoint blocks B1,...,B

`(m). A secret bit,

mi, is inserted into the ithblockbyplacinga“1”intoBiif mi= 1. Otherwise,

the block is not changed in the embedding process. The detection of a speciﬁc bit

is done via a test function which distinguishes modiﬁed blocks from unmodiﬁed

blocks:

f(Bi)=1blockBiwas modiﬁed in the embedding process

0 otherwise (3.24)

The function fcan be interpreted as a hypothesis-testing function; we test the null-

hypothesis “block Biwas not modiﬁed” against the alternative hypothesis “block

Biwas modiﬁed.” Therefore, we call the whole class of such steganography systems

statistical steganography. The receiver successively applies fto all cover-blocks Bi

in order to restore every bit of the secret message.

The main question which remains to be solved is how such a function fin (3.24)

can be constructed. If we interpret fas a hypothesis-testing function, we can use

the theory of hypothesis testing from mathematical statistics. Let us assume we

can ﬁnd a formula h(Bi), which depends on some elements of the cover-block Bi,

and we know the distribution of h(Bi) in the unmodiﬁed block (i.e., the hypothesis

holds in this case). We can then use standard procedures to test if h(Bi)equalsor

exceeds a speciﬁc value. If we manage to alter h(Bi) in the embedding process in a

way that its expected value is 0 if the block Biwas not modiﬁed, and its expected

68 Information hiding techniques for steganography and digital watermarking

value is much greater otherwise, we could test whether h(Bi) equals zero under the

given distribution of h(Bi).

Statistical steganographic techniques are, however, diﬃcult to apply in many

cases. First, a good test statistic h(Bi) must be found which allows distinction

between modiﬁed and unmodiﬁed cover-blocks. Additionally, the distribution of

h(Bi) must be known for a “normal” cover; in most cases, this is quite a diﬃcult

task. In practical implementations many (quite questionable) assumptions are made

in order to determine a closed formula for this distribution.

As an example, we want to construct a statistical steganography algorithm out

of Pitas’ watermarking system [54], which is similar to the Patchwork approach of

Bender et al. [2]. Suppose every cover-block Biis a rectangular set of pixels p(i)

n,m.

Furthermore, let S={s(i)

n,m}be a rectangular pseudorandom binary pattern of equal

size, where the number of ones in Sequals the number of zeros. We will assume

that both the sender and receiver have access to S, which represents the stego-key

in this application. The sender ﬁrst splits the image block Biinto two sets, Ciand

Di, of equal size (i.e., he puts all pixels with indices (n, m)intosetCwhere the

corresponding key bit sn,m equals zero):

Ci={p(i)

n,m ∈Bi|sn,m =1}

Di={p(i)

n,m ∈Bi|sn,m =0}(3.25)

The sender then adds a value k>0 to all pixels in the subset Cibut leaves all

pixels in Diunchanged. In the last step, Ciand Diare merged to form the marked

image block ˜

Bi.

In order to extract the mark, the receiver reconstructs the sets Ciand Di.If

the block contains a mark, all values in Ciwill be larger than the corresponding

values in the embedding step; thus we test the diﬀerence of the means of sets Ci

and Di. If we assume that all pixels in both Ciand Diare independent identically

distributed random variables with an arbitrary distribution, the test statistic

qi=Ci−Di

ˆσi

(3.26)

with

ˆσi=sVar[ Ci]+Var[Di]

|S|/2(3.27)

where Cidenotes the mean over all pixels in the set Ciand Var[Ci] the estimated

variance of the random variables in Ci, will follow a N(0,1) normal distribution

asymptotically due to the central limit theorem. If a mark is embedded in the

A survey of steganographic techniques 69

image block ˜

Bi, the expected value of qwill be greater than zero. The receiver is

thus able to reconstruct the ith secret message bit by testing whether the statistic

qiof block Biequals zero under the N(0,1) distribution.

3.6 DISTORTION TECHNIQUES

In contrast to substitution systems, distortion techniques require the knowledge of

the original cover in the decoding process. Alice applies a sequence of modiﬁcations

to a cover in order to get a stego-object; she chooses this sequence of modiﬁcations

in such a way that it corresponds to a speciﬁc secret message she wants to trans-

mit. Bob measures the diﬀerences to the original cover in order to reconstruct the

sequence of modiﬁcations applied by Alice, which corresponds to the secret message.

In many applications, such systems are not useful, since the receiver must have

access to the original covers. If Wendy also has access to them, she can easily

detect the cover modiﬁcations and has evidence for a secret communication. If the

embedding and extraction functions are public and do not depend on a stego-key, it

is also possible for Wendy to reconstruct secret messages entirely. Throughout this

section we will therefore assume that original covers can be distributed through a

secure channel.

An early approach to hiding information is in text. Most text-based hiding

methods are of distortion type (i.e., the arrangement of words or the layout of a

document may reveal information). One technique is by modulating the positions

of lines and words, which will be detailed in the next subsection. Adding spaces

and “invisible” characters to text provides a method to pass hidden information.

HTML ﬁles are good candidates for including extra spaces, tabs, and linebreaks.

Web browsers ignore these “extra” spaces and lines, and they go unnoticed until

the source of the Web page is revealed.

3.6.1 Encoding information in formatted text

Considerable eﬀort has been made to construct data-embedding methods for for-

matted text, which is interpreted as a binary image. Maxemchuk et al. [55–58]

presented text-based steganographic schemes which use the distance between con-

secutive lines of text or between consecutive words to transmit secret information.

It should be noted, however, that any steganographic system which uses the text

format to transmit information can easily be broken by retyping the document.

In line-space encoding, the positions of lines in the document are moved up

or down according to secret message bits, whereas other lines are kept stationary

for the purpose of synchronization (in the original implementation, information was

transmitted in every second line). One secret message bit is encoded in one line that

70 Information hiding techniques for steganography and digital watermarking

This is just an example

This is just an example

This is just an example

Figure 3.3 Encoding information in interword spaces (the vertical lines are provided for refer-

ence). Data is embedded in the ﬁrst and third sentences.

is moved; if a line is moved up, a 1 is encoded, otherwise a 0. When decoding a secret

message, centroid detection can be used; the centroid is deﬁned to be the center of

mass of the line about a horizontal axis. Let us denote with ∆R+the distance

between the centroids of a shifted line and the next stationary synchronization line

above, with ∆R−the distance of centroids between the shifted line and the next

stationary line below, and with ∆X+and ∆X−the corresponding centroid distances

in the unmodiﬁed document. The distance above one line was increased, if

∆R++∆

R−

∆R+−∆R−

>∆X++∆

X−

∆X+−∆X−

(3.28)

Similarly, if

∆R++∆

R−

∆R+−∆R−

<∆X++∆

X−

∆X+−∆X−

(3.29)

the distance above the line was decreased. Note that if the page was scaled by a

constant factor during reproduction, this factor cancels out because of the fraction

in (3.28) and (3.29). Similarly, changes in vertical print density should aﬀect all cen-

troids in approximately the same way. These properties make line-space encoding

techniques resistant to most distortion attacks. For an analysis of this embedding

technique see [57].

Another possible embedding scheme in formatted text is word-space encoding,

illustrated in Figure 3.3. According to a secret message bit, horizontal spaces be-

tween selected words of the carrier are altered. Theoretically, it is possible to alter

every space between two words; the only limitation is that the sum of all movements

in one speciﬁc line equals zero so that the line keeps properly aligned.

3.6.2 Distortion of digital images

Distortion techniques can easily be applied to digital images. Using a similar ap-

proach as in substitution systems, the sender ﬁrst chooses `(m) diﬀerent cover-pixels

he wants to use for information transfer. Such a selection can again be done using

pseudorandom number generators or pseudorandom permutations. To encode a 0

in one pixel, the sender leaves the pixel unchanged; to encode a 1, he adds a random

A survey of steganographic techniques 71

value ∆xto the pixel’s color. Although this approach is similar to a substitution

system, there is one signiﬁcant diﬀerence: the LSB of the selected color values do

not necessarily equal secret message bits. In particular, no cover modiﬁcations are

needed when coding a 0. Furthermore, ∆xcanbechoseninawaythatbetter

preserves the cover’s statistical properties. The receiver compares all `(m) selected

pixels of the stego-object with the corresponding pixels of the original cover. If the

ith pixel diﬀers, the ith message bit is a 1, otherwise a 0.

Many variants of the above method could be implemented: similar to the parity

bit method presented in Section 3.2.4, the parity bit of a certain image region can

be altered or left unchanged in order to encode a 1 or a 0. Furthermore, image

processing techniques could be applied to certain image regions so that they are not

visible to an observer.

Another image distortion technique, data embedding, has been introduced by

Sandford et al. [59, 60]. In contrast to all distortion techniques discussed so far, data

embedding tries to modify the order of appearance of redundant data in the cover

rather than to change values themselves; the embedding process therefore maintains

a “pair list” (i.e., a list of pairs of samples whose diﬀerence is smaller than a speciﬁc

threshold). The receiver can reverse the embedding process if he has access to the

pair list. This list can be seen as an analogon to a key in cryptography; it normally

cannot be restored out of the cover by the receiver (see [59] for details).

3.7 COVER GENERATION TECHNIQUES

In contrast to all embedding methods presented above, where secret information is

added to a speciﬁc cover by applying an embedding algorithm, some steganographic

applications generate a digital object only for the purpose of being a cover for secret

communication.

3.7.1 Mimic functions

Due to the explosion of information traﬃc it can be assumed that it is impossible

for a human being to observe all communications around the world; as noted in the

conclusion of Chapter 2, such a task can only be done using automated supervision

systems which are therefore of increasing importance. These systems check com-

munication by examining keywords and the statistical proﬁle of a message. It is

possible, for instance, to distinguish unencrypted from encrypted messages automat-

ically because of their diﬀerent statistical properties. Mimic functions,proposedby

Wayner [61], can be used to hide the identity of a message by changing its statistical

proﬁle in a way that it matches the proﬁle of any innocent looking text.

72 Information hiding techniques for steganography and digital watermarking

It is well known that the English language possesses several statistical prop-

erties. For instance the distribution of characters is not uniform (see for in-

stance the appendix of [62] for the frequency distributions of English di- and tri-

grams). This fact has been exploited in numerous data-compression techniques

(e.g., the Huﬀman coding scheme [40]). Given an alphabet Σ and a probability

distribution A, the Huﬀman coding scheme can be used to produce a minimum-

redundancy compression function fA:Σ→{0,1}∗,where∗denotes the Kleene-

Star (Σ∗=Si≥0{x1···xi|x1,...,x

i∈Σ}). A mimic function g:Σ

∗→Σ∗that

converts a message whose characters show a probability distribution Ato a mes-

sage which approximately mimics the statistical proﬁle B, can be constructed using

two Huﬀman compression functions:

g(x)=f−1

B(fA(x)) (3.30)

Thus, the ﬁle xis ﬁrst compressed using a Huﬀman scheme with distribution A.

This process will create a ﬁle of binary strings which can be interpreted as output

of a Huﬀman compression scheme (with distribution B) of a diﬀerent ﬁle. This

ﬁle can be reconstructed by applying the inverse Huﬀman compression function

f−1

Bto the ﬁle of binary strings and will act as a stego-object. Since both fA

and fBare one-to-one, the constructed mimic function will be one-to-one. Wayner

showed that this function is optimal in the sense that if fAis a theoretically optimal

Huﬀman compression function and xis a ﬁle of random bits, then f−1

A(x)isthebest

approximation of the statistical proﬁle Awhich is one-to-one.

Instead of using distributions of single characters, Huﬀman coding schemes

can be constructed to compress ncharacters at one time, based on the frequency

distribution of n-grams. However, the size of the compression tree created by the

Huﬀman schemes grows exponentially with n. Wayner instead proposed to exploit

the intercharacter dependencies by creating Huﬀman compression functions for ev-

ery string tof length n−1 to encode probabilities for each character which may

follow tin the ﬁle. A mimic function can be constructed out of the collection of

these Huﬀman compression functions (see [61] for details).

3.7.2 Automated generation of English texts

However, mimic functions can only be used to fool machines. Since the stego-objects

are created only according to statistical proﬁles, the semantic component is entirely

ignored. To a human observer the created texts look completely meaningless and

are full of grammatical and typographical errors.

To overcome this problem, the use of context-free grammars (CFG) has been

proposed; for a theoretical overview of CFG see [63]. Let G=hV,Σ,Π,Sibe a CFG,

where Vis the set of variables, Σ the set of terminal symbols, Π ⊆V×(V∪Σ)∗

A survey of steganographic techniques 73

the set of productions and S∈Vthe start symbol. The productions can be seen

as a substitution rule; they convert a variable into a string containing terminal or

variable symbols. A string s∈Σ∗which is deﬁned to be a sequence of termi-

nal symbols is said to be generated by G(formally: s∈L(G)) if scan be pro-

duced successively from the start symbol Sby substituting variables by sequences

of terminal or variable symbols according to Π. For example, from the grammar

h{S, A, B, C },{A,...,Z,a...,z},Π,Siwith

Π={S→Alice B,S→Bob B,S→Eve B,S→IA,

A→am working,A→am lazy,A→am tired,

B→is C,B→can cook,

C→reading,C→sleeping,C→working}

the sentences I am lazy,Alice is reading, etc. can be derived. If for every

string s∈L(G) there exists exactly one way scan be generated from the start

symbol, the grammar is said to be unambiguous.

Unambiguous grammars can be used as a steganographic tool. Wayner [61,

64] proposed an extension to the technique of mimic functions. Given a set of

productions, we assign a probability to each possible production for variable Vi.In

our example above, we could choose

Π={S→0.5Alice B,S→0.3Bob B,S→0.1Eve B,S→0.1IA,

A→0.3am working,A→0.4am lazy,A→0.3am tired,

B→0.5is C,B→0.5can cook,

C→0.5reading,C→0.1sleeping,C→0.4working}

Let ΠVi={πi,1,...,π

i,n}be the set of all productions associated with variable

Vi. The sender then constructs a Huﬀman compression function fΠifor every set

Πi. In Figure 3.4 possible Huﬀman trees for ΠSand ΠAare shown. Huﬀman

compression functions can easily be derived out of these trees; for example the

production “Eve B” will be encoded as 110, “Aam tired” as 11, etc.

For steganographic purposes, the inverse Huﬀman compression functions will

be used. In the encoding step, the sender derives one speciﬁc string out of the CFG

which will act as the stego-object. Starting from the start symbol S,theleftmost

variable Viis changed by a production. This production is determined by the secret

message and the Huﬀman compression function for ΠVi. Speciﬁcally, the Huﬀman

tree is traversed according to the next bits of the secret message until a node of the

tree is reached. The start symbol is then substituted by the production which can

be found at this node of the tree. This process is iterated (i.e., the leftmost variable

is exchanged by a production which is determined by its Huﬀman tree and the next

74 Information hiding techniques for steganography and digital watermarking

1

0

SA

1

1

IA

Eve B

Alice B

Bob B

0

0

1

1

0

0

am tiredam working

am lazy

Figure 3.4 Huﬀman compression functions for ΠSand ΠA.

few message bits), until all message bits are used and the string consists only of

terminal symbols. Continuing the previous example, suppose the secret message is

11110. In the ﬁrst step we traverse the Huﬀman tree for ΠSand eventually reach

the node “IA” by consuming the ﬁrst three secret message bits. Thus, the start

symbol Sis replaced by “IA.” We now traverse the Huﬀman tree ΠAand ﬁnd the

replacement “am working” by consuming another two secret message bits. Thus,

since the derived string consists of terminal symbols only and all secret message

bits were used, the stego-object representing 11110 is the sentence I am working.

In the decoding process, the cover is parsed in order to reconstruct the pro-

ductions which have been used in the embedding step; this can be accomplished by

the use of a parse tree for the given CFG, see [65]. Since the productions uniquely

determine the secret message and the underlying grammar is unambiguous, the

receiver is able to reconstruct the stego-message.

A similar system has been proposed by Chapman and Davida [66]. Their system

consists of two functions, NICETEXT and SCRAMBLE. Given a large dictionary

of words, categorized by diﬀerent types, and a style source, which describes how

words of diﬀerent types can be used to form a meaningful sentence, NICETEXT

transforms secret message bits into sentences by selecting words out of the dictionary

which conform to a sentence structure given in the style source. SCRAMBLE

reconstructs the secret message if the dictionary which has been used is known.

Style sources can either be created from sample natural-language sentences or be

generated using CFG.

A survey of steganographic techniques 75

3.8 CONCLUSION

In this chapter we gave an overview of diﬀerent steganographic methods which have

been proposed in the literature during the last few years. Many ﬂexible and simple

methods exist for embedding information in noisy communication channels.

However, covers and messages tend to have unique patterns a steganalyst could

exploit. Most of the simple techniques can be broken by careful analysis of the

statistical properties of the channel’s noise. Images and many other signals were

subject to quantization, ﬁlters, transformations, format converters, etc. Most of

these techniques left some sort of “ﬁngerprints” in the data. All these problems

must be addressed when designing a steganographic system; methods which use

these properties to break secret communication will be outlined in the next chapter.

REFERENCES

[1] Foley, J., et al., Computer Graphics, Principles and Practice, Reading, MA: Addison

Wesley, 1990.

[2] Bender, W., D. Gruhl, and N. Morimoto, “Techniques for data hiding,” IBM Systems

Journal, vol. 35, no. 3/4, 1996, pp. 131–336.

[3] M¨oller, S., A. Pﬁtzmann, and I. Stirand, “Computer Based Steganography: How It

Works and Why Therefore Any Restrictions on Cryptography Are Nonsense, At Best,”

in Information Hiding: First International Workshop, Proceedin gs , vol. 1174 of Lecture

Notes in Computer Science, Springer, 1996, pp. 7–21.

[4] Gruhl, D., A. Lu, and W. Bender, “Echo Hiding,” in Information Hiding: First Interna-

tional Workshop, Proceedin gs , vol. 1174 of Lecture Notes in Computer Science, Springer,

1996, pp. 295–316.

[5] Kurak, C., and J. McHughes, “A Cautionary Note On Image Downgrading,” in IEEE

Computer Security Applications Conference 1992, Proceedin gs , IEEE Press, 1992, pp.

153–159.

[6] van Schyndel, R. G., A. Tirkel, and C. F. Osborne, “A Digital Watermark,” in Pro-

ceedings of the IEEE International Conference on Image Processing , vol. 2, 1994, pp.

86–90.

[7] Johnson, N. F., and S. Jajodia, “Exploring Steganography: Seeing the Unseen,” IEEE

Computer , vol. 31, no. 2, 1998, pp. 26–34.

[8] Gerzon, M. A., and P. G. Graven, “A High-Rate Buried-Data Channel for Audio CD,”

Journal of the Audio Engineering Society, vol. 43, no. 1/2, 1995, pp. 3–22.

[9] “StegoDos—Black Wolf’s Picture Encoder v0.90B,” <ftp://ftp.csua.berkeley.edu/

pub/cypherpunks/steganography/stegodos.zip>, 1993.

[10] Brown, A., “S-Tools for Windows,” <ftp://idea.sec.dsi.unimi.it/pub/security/

crypt/code/s-tools4.zip>, 1996.

[11] Hastur, H., “Mandelsteg,” <ftp://idea.sec.dsi.unimi.it/pub/security/crypt/

code/steg.tar.Z>, 1994.

[12] Machado, R., “EzStego, Stego Online, Stego,” <http://www.stego.com>, 1997.

76 Information hiding techniques for steganography and digital watermarking

[13] Maroney, C., “Hide and Seek,” <ftp://ftp.csua.berkeley.edu/pub/cypherpunks/

steganography/hdsk41b.zip>,<http://www.rugeley.demon.co.uk/security/

hdsk50.zip>, 1994–1997.

[14] Repp, H., “Hide4PGP,” <http://www.rugeley.demon.co.uk/security/hide4pgp.

zip>, 1996.

[15] Arachelian, R., “White Noise Storm,” <ftp://ftp.csua.berkeley.edu/pub/

cypherpunks/steganography/wns210.zip>, 1994.

[16] Hansmann, F., “Steganos, Deus Ex Machina Communications,” <http://www.

steganography.com/>, 1996.

[17] Menezes, A. J., P. C. van Oorschot, and S. A. Vanstone, Handbook of Applied Cryptog-

raphy, Boca Raton: CRC Press, 1996.

[18] Aura, T., “Practical Invisibility in Digital Communication,” in Information Hiding: First

International Workshop, Proceedin gs , vol. 1174 of Lecture Notes in Computer Science,

Springer, 1996, pp. 265–278.

[19] Luby, M., and C. Rackoﬀ, “How to Construct Pseudorandom Permutations from Pseu-

dorandom Functions,” SIAM Journal on Computation, vol. 17, no. 2, 1988, pp. 373–386.

[20] Naor, M., and O. Reingold, “On the Construction of Pseudorandom Permutations: Luby-

Rackoﬀ Revisited,” Journal of Cryptology, vol. 12, no. 1, 1999, pp. 29–66.

[21] Bell, D. E., and L. J. LaPadula, “Secure Computer Systems: Mathematical Founda-

tions,” Mitre Report ESD-TR-73-278 (Vol. I–III), Mitre Corporation, Bedford, MA,

Apr. 1974.

[22] Fridrich, J., “A New Steganographic Method for Palette-Based Images,” in Proceedings

of the IS&T PICS conference, Savannah, Georgia, Apr. 1998, pp. 285–289.

[23] Matsui, K., and K. Tanaka, “Video-Steganography: How to Secretly Embed a Signature

in a Picture,” I MA I nte l lect ua l Pro pert y Pro ject Proceed ing s , vol. 1, no. 1, 1994, pp. 187–

205.

[24] Baharav, Z., and D. Shaked, “Watermarking of Dither Halftoned Images,” in Proceedin gs

of the SPIE 3657, Security and Watermarking of Multimedia Content, 1999, pp. 307–316.

[25] Zhao, J., and E. Koch, “Embedding Robust Labels into Images for Copyright Protec-

tion,” in Proceedings of the International Conference on Intel lectual Property Rights

for Information, Knowledge and New Techniques,M¨unchen, Wien: Oldenbourg Verlag,

1995, pp. 242–251.

[26] “CCITT Recommendation T6: Facsimile Coding Schemes and Coding Control Functions

for Group 4 Facsimile Apparatus for Document Transmission,” 1984.

[27] Anderson, R. J., R. Needham, and A. Shamir, “The Steganographic File System,” in

Proceedings of the Second International W orkshop on Information H iding , vol. 1525 of

Lecture Notes in Computer Science, Springer, 1998, pp. 73–82.

[28] “ScramDisk: Free Hard Drive Encryption For Windows 95 & 98,” <http://www.

scramdisk.clara.net>, 1998.

[29] Handel, T. G., and M. T. Sandford, “Data Hiding in the OSI Network Model,” in

Information Hiding: First International Workshop, Proceedin gs , vol. 1174 of Lecture

Notes in Computer Science, Springer, 1996, pp. 23–38.

[30] Cox, I., et al., “A Secure, Robust Watermark for Multimedia,” in Information Hid-

ing: First International Workshop, Proceedi ngs , vol. 1174 of Lecture Notes in Computer

Science, Springer, 1996, pp. 185–206.

[31] Koch, E., and J. Zhao, “Towards Robust and Hidden Image Copyright Labeling,” in

IEEE Workshop on Nonlinear Signal and Image Processing, Jun. 1995, pp. 452–455.

A survey of steganographic techniques 77

[32] Koch, E., J. Rindfrey, and J. Zhao, “Copyright Protection for Multimedia Data,” in

Proceedings of the International Conference on Digital Media and Electronic Publishing ,

Leeds, UK, Dec. 1994.

[33] ´

O Runaidh, J. J. K., F. M. Boland, and O. Sinnen, “Watermarking Digital Images for

Copyright Protection,” in Electronic Imaging and the Visual Arts, Proceedings ,Feb.

1996.

[34] Xia, X., C. G. Boncelet, and G. R. Arce, “A Multiresolution Watermark for Digital

Images,” in Proceedings of the IEEE International Conference on Image Processing

(ICIP’97), 1997.

[35] Rhodas, G. B., “Method and Apparatus Responsive to a Code Signal Conveyed Through

a Graphic Image,” U.S. Patent 5,710,834, 1998.

[36] Swanson, M. D., B. Zhu, and A. H. Tewﬁk, “Transparent Robust Image Watermarking,”

in Proceedings of the IEEE International Conference on Image Processing, vol. 3, 1996,

pp. 211–214.

[37] Langelaar, G., J. van der Lubbe, and R. Lagendijk, “Robust Labeling Methods for Copy

Protection of Images,” in Proceedings of the SPIE vol. 3022, Storage and Retrieval for

Image and Video Databases V , 1997, pp. 298–309.

[38] Pennebaker, W. B., and J. L. Mitchell, JPEG Still Image Compression Standard,New

York: Van Nostrand Reinhold, 1993.

[39] Wallace, G. K., “The JPEG Still Picture Compression Standard,” Communications of

the ACM , vol. 34, no. 4, 1991, pp. 30–44.

[40] Huﬀman, D. A., “A Method for the Construction of Minimum-Redundancy Codes,”

Proceed ings of the IRE , vol. 40, no. 10, 1952, pp. 1098–1101.

[41] Smoot, S., and L. A. Rowe, “DCT Coeﬃcient Distributions,” in Proceedings of t he SPIE

2657, Human Vision and Electronic Imaging, 1996, pp. 403–411.

[42] Moore, B. C. J., An Introduction to the Psychology of Hearing, London: Academic Press,

1989.

[43] Chang, L., and I. S. Moskowitz, “Critical Analysis of Security in Voice Hiding Tech-

niques,” in Proceedings of the Internat ional Conferen ce on In formation and Communi-

cat ion s Sec ur ity , vol. 1334 of Lecture Notes in Computer Science, Springer, 1997, pp.

203–216.

[44] Cachin, C., “An Information-Theoretic Model for Steganography,” in Proceedings of the

Second International Workshop on Information Hiding, vol. 1525 of Lecture Notes in

Computer Science, Springer, 1998, pp. 306–318.

[45] Willems, F. M., “Universal Data Compression and Repetition Times,” IEEE Transac-

tions on Information Theory, 1989, pp. 337–343.

[46] Upham, D., “Jpeg-Jsteg, modiﬁcation of the independent JPEG group’s JPEG software

(release 4) for 1-bit steganography in JFIF output ﬁles,” <ftp://ftp.funet.fi/pub/

crypt/steganography/>, 1992–1997.

[47] Westfeld, A., and G. Wolf, “Steganography in a Video Conferencing System,” in Proceed-

ings of the Second International Workshop on Information Hiding , vol. 1525 of Lecture

Notes in Computer Science, Springer, 1998, pp. 32–47.

[48] Hartung, F., and B. Girod, “Copyright Protection in Video Delivery Networks by

Watermarking of Pre-Compressed Video,” in Multimedia Applications, Services and

Techniques—ECMAST 97, vol. 1242 of Lecture Notes in Computer Science, Springer,

1997, pp. 423–436.

[49] Hartung, F., and B. Girod, “Watermarking of Uncompressed and Compressed Video,”

78 Information hiding techniques for steganography and digital watermarking

Signal Processing, vol. 66, no. 3, 1998, pp. 283–301.

[50] Pickholtz, R. L., D. L. Schilling, and L. B. Milstein, “Theory of Spread-Spectrum

Communications—A Tutorial,” IEEE Transactions on Communications, vol. 30, no. 5,

1982, pp. 855–884.

[51] Tirkel, A. Z., G. A. Rankin, and R. van Schyndel, “Electronic Watermark,” in Digi-

tal Image Computing, Technology and Applications—DICTA 93, Macquarie University ,

1993, pp. 666–673.

[52] Smith, J., and B. Comiskey, “Modulation and Information Hiding in Images,” in Infor-

mation Hiding: First International Workshop, Proceedi ngs , vol. 1174 of Lecture Notes

in Computer Science, Springer, 1996, pp. 207–227.

[53] Marvel, L. M., C. G. Bonclet, and C. T. Retter, “Reliable Blind Information Hiding for

Images,” in Proceedings of the Second International Workshop on Information Hidin g ,

vol. 1525 of Lecture Notes in Computer Science, Springer, 1998, pp. 48–61.

[54] Pitas, I., “A Method for Signature Casting on Digital Images,” in International Confer-

ence on Image Processing, vol. 3, IEEE Press, 1996, pp. 215–218.

[55] Maxemchuk, N. F., “Electronic Document Distribution,” AT&T Technical Journal,

September/October 1994, pp. 73–80.

[56] Low, S. H., et al., “Document Marking and Identiﬁcation Using Both Line and Word

Shifting,” in Proceed ings of In focom’ 95 , 1995, pp. 853–860.

[57] Low, S. H., N. F. Maxemchuk, and A. M. Lapone, “Document Identiﬁcation for Copyright

Protection Using Centroid Detection,” IEEE Transactions on Communications, vol. 46,

no. 3, 1998, pp. 372–383.

[58] Low, S. H., and N. F. Maxemchuk, “Performance Comparison of Two Text Marking

Methods,” IEEE Journal on Selected Areas in Communications, vol. 16, no. 4, 1998,

pp. 561–572.

[59] Sandford, M. T., J. N. Bradley, and T. G. Handel, “Data Embedding Method,” in

Proceedings of the SPIE 2615, Integration Issues in Large Commercial Media Delivery

Systems, 1996, pp. 226–259.

[60] Sandford, M. T., T. G. Handel, and J. M. Ettinger, “Data Embedding in Degenerate

Hosts,” Technical Report LA-95-4446UR, Los Alamos National Laboratory, 1996.

[61] Wayner, P., “Mimic Functions,” Cryptologia, vol. XVI/3, 1992, pp. 193–214.

[62] “Basic Cryptanalysis,” Headquarters Department of the Army, Field Manual NO 34-40-

2, <ftp://ftp.ox.ac.uk/cryptanalysis/basic_cryptanalysis.ps.tar.gz>.

[63] Hopcroft, J. E., and J. D. Ullman, Introduction to Automata Theory, Languages and

Computation, Reading, MA: Addison Wesley, 1979.

[64] Wayner, P., “Strong Theoretical Steganography,” Cryptologia, vol. XIX/3, 1995, pp. 285–

299.

[65] Aho, A., R. Sethi, and J. Ullman, Compilers: Principles, Techniques and Tools, Reading

(MA): Addison Wesley, 1986.

[66] Chapman, M., and G. Davida, “Hiding the Hidden: A Software System for Concealing

Ciphertext as Innocuous Text,” in Proceedings of the International Conference on Infor-

mation and Communications Security, vol. 1334 of Lecture Notes in Computer Science,

Springer, 1997, pp. 335–345.