[Written on February 24th 2004 - Updated on March 10th 2004 and March 14th 2004]

Two steganography programs broken (for the price of one):
C1oak v7.0 and Data Stea1th v1.0

Steganography strength (is it easy to see there is hidden data?): Low
Cryptography strength (is it easy to recover the hidden data?): Low

[Update 1: I notified the authors of this software, as I always do. I then had a cordial and constructive exchange of emails with the CEO of the company. Very quickly, almost one week after the publication of this article, they released a new version that corrects the problem, and asked me to review it. Unfortunately, I haven't had the time yet. But I think they handled the matter quickly and professionally, which is not that frequent. Everybody can make errors; not everyone takes the time to correct them. So keep in mind that this article is about an outdated version of their software, and that the security flaws are now probably corrected.]

[Update 2: Because these people seem to be nice (although their lawyer is not) and responsive, even if we disagree on my full disclosure philosophy, and maybe to avoid a DMCA lawsuit, I slightly modified the spelling of some keywords to lower the rank of this page in search engines.]

1. The software

A US company called 1nsight Concepts Inc sells a variety of software, including a few security products. More precisely, two steganography programs: C1oak (v7.0), which hides stuff in image files, and Data Stea1th (v1.0), which hides stuff in multimedia files (video, sound). The interface of their software is well done but so painfully slow that C1oak, for example, is not really usable on my old computer. Their website has a nice professional look.

Both of them are sold for US$ 35 (download) or US$ 48 (on a CD). I broke both of them. It's getting boring, frankly.

2. A first look at the documentation

There is some snake oil on their website and in their presentation files, but it stays at an average level. A few examples, just for fun:

- "Electronic steganography is a very complex and highly structured technology, and C1oak uses a more advanced form of steganography that will make your files virtually undetectable and also irretrievable as well"... Wrong, wrong, wrong, and wrong. Average steganography is a simple technology. C1oak uses a technique much simpler than average. Your hidden files will be very detectable and very retrievable.

- "that makes C1oak one of the most reliable secure forms of data protection in the world."... Do I really need to be cruel here? I broke it in less than one hour, without doing any reverse engineering, and I am a simple amateur.

- The encryption algorithms used are called "Scatter, Interlock, C1oak-256, Active, Particle, and Dynamic" and are "all very strong encryption algorithms"... The first rule of computer security is: if the encryption algorithm used is not named, or was developed in-house, and is not one of the few publicly known, studied and trusted strong algorithms, it smells bad. We don't really care anyway, because you can extract the data without breaking the encryption, just by overwriting the password with a new one.

- "the option Enhance Encryption systematically scrambles your data after it has been encrypted to further secure your information"... Okay. Writing this phrase means two things: 1. they never read a book about cryptography, and/or 2. their in-house algorithms are really bad.

Then there is this strange thing: Certificates. "Certificates are security templates that are used to encrypt and decrypt your data. Without the correct certificate, Data Stea1th cannot correctly decrypt your data."... What is that? A second key? Strong crypto algorithms don't need two keys. One is enough.

3. A first look

To simplify things and go faster, let's say right away that both of these programs use the same weak way to hide data. One is for images, the other one for multimedia files, but that's not important, because they both append the "hidden" data to the end of the file, a method unfortunately very simple, very bad and very popular, as we already know. The steganography routine doesn't care at all about what your file is (although these programs are able to view/play the file, so at some point they check the format, but this has nothing to do with steganography). On my self-defined steganography scale, these programs are category 1, the simplest method.

Here is an example of hiding a small text file (called "hiddenfile.txt") in a JPEG, with the only encryption algorithm available in the demo version ("Scatter"). The password is "a". The legend below the dump lists the few fields I could identify after playing around with these programs for a dozen minutes. Remember that FFD9 is the marker of the end of a JPEG file:


      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

05A0  BE 11 BC 13 3E 43 8B A5  1C 8D 1F FF D9 0E 07 05
05B0  0E 00 00 00 90 91 9C 9C  9D 96 9E 91 94 9D D6 8C
05C0  80 8C 20 00 00 00 C0 CD  CF CA CF 9C C9 CF C9 99
05D0  9D CD 9E CD C0 CA CB 9B  CD C0 CA C0 C0 C9 CB 9A
05E0  CF CB CF C0 9E 9B 20 00  00 00 C8 9B 9B C9 CF CD
05F0  9A C1 9B C8 9E C9 9A CE  99 C0 CB C9 9B CB C1 C1
0600  9D CA CE C1 CF CF CA CE  CE C9 20 00 00 00 00 B1
0610  EB DF 5D 53 CB EC C6 85  24 88 2F 99 93 CB 6E 89
0620  9E 51 E7 60 11 01 6B DF  67 2D 27 85 09 43 FA 0F
0630  E6 47 0C AD 05 00 00




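Fields, in the order they appear in the dump:

05AB-05AC  FF D9                    End of JPG
05AD-05AF  0E 07 05                 ?, Version?
05B0-05C1  0E 00 00 00 + 14 bytes   size / Name of file
05C2-05E5  20 00 00 00 + 32 bytes   size / MD5 Certificate?
05E6-0609  20 00 00 00 + 32 bytes   size / MD5 Password
060A-062D  20 00 00 00 + 32 bytes   size / Encrypted file
062E-0632  FA 0F E6 47 0C           ?
0633-0636  AD 05 00 00              Offset end original file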

So, as we can see, there are some typical errors here. I will go fast, because I'm tired of explaining for the hundredth time that: 1. Data appended to the end of a file is easily accessible for everybody to see. 2. When the "hidden" data is formatted nicely, with fields, the size of each field, etc., then an attacker can very easily find this "hidden" data automatically. 3. If you include the password alongside the hidden data, then everything is useless, the security is zero. Ah well, here it's not the password, but its MD5 hash. So, if you want to retrieve the "hidden" data, just replace the MD5 in this field with the MD5 of the password of your choice. Pooof. Cracked. End of story. Next one, please.
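To show how little work point 2 requires, here is a minimal Python sketch that walks the appended block of the dump above. Everything about the layout is an assumption drawn from that single dump: the last 4 bytes are read as a little-endian offset to the end of the original file, 3 bytes of unknown meaning are skipped, and four fields follow, each prefixed by a 4-byte little-endian size. The names `parse_appended_block` and `FIELD_NAMES` are mine, not the vendor's.

```python
import struct

# Tail of the dump above, starting at offset 0x05A0.
DUMP_TAIL = bytes.fromhex("""
BE 11 BC 13 3E 43 8B A5 1C 8D 1F FF D9 0E 07 05
0E 00 00 00 90 91 9C 9C 9D 96 9E 91 94 9D D6 8C
80 8C 20 00 00 00 C0 CD CF CA CF 9C C9 CF C9 99
9D CD 9E CD C0 CA CB 9B CD C0 CA C0 C0 C9 CB 9A
CF CB CF C0 9E 9B 20 00 00 00 C8 9B 9B C9 CF CD
9A C1 9B C8 9E C9 9A CE 99 C0 CB C9 9B CB C1 C1
9D CA CE C1 CF CF CA CE CE C9 20 00 00 00 00 B1
EB DF 5D 53 CB EC C6 85 24 88 2F 99 93 CB 6E 89
9E 51 E7 60 11 01 6B DF 67 2D 27 85 09 43 FA 0F
E6 47 0C AD 05 00 00
""")

# Hypothetical field names, in the order seen in the dump.
FIELD_NAMES = ("file_name", "certificate_md5", "password_md5", "payload")

def parse_appended_block(data: bytes) -> dict:
    """Return {name: (offset, raw_bytes)} for each length-prefixed field."""
    # Assumption: the last 4 bytes give the offset where the original file ends.
    (end_of_original,) = struct.unpack("<I", data[-4:])
    if end_of_original > len(data) - 4:
        raise ValueError("trailer does not point inside the file")
    pos = end_of_original + 3  # skip 3 bytes of unknown meaning (0E 07 05 here)
    fields = {}
    for name in FIELD_NAMES:
        (size,) = struct.unpack_from("<I", data, pos)
        pos += 4
        fields[name] = (pos, data[pos:pos + size])
        pos += size
    return fields

# Zero padding stands in for the first 0x05A0 bytes of the JPEG, not shown in the dump.
data = bytes(0x05A0) + DUMP_TAIL
fields = parse_appended_block(data)
for name, (offset, raw) in fields.items():
    print(f"{name:16} at 0x{offset:04X}, {len(raw)} bytes")
```

Starting from the trailer rather than searching for FFD9 avoids false hits on FF D9 byte pairs inside the image data, and it should work the same way for the multimedia files handled by Data Stea1th, although I only have this one JPEG example to go on.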

4. But I don't see any understandable data here?

Ah yes. I forgot to mention that everything is scrambled with an extremely simple fixed substitution algorithm. The changes are summarized in this table. You can see it's regular and simple. For the MD5 encoding you will only need the digits 0 to 9 and the lowercase letters a to f.

char  old new
----  --- ---

space 20  d8
!     21  d9
"     22  da
#     23  db
$     24  dc
%     25  dd
&     26  de
'     27  df
(     28  d0
)     29  d1
*     2A  d2
+     2B  d3
,     2C  d4
-     2D  d5
.     2E  d6
/     2F  d7
0     30  c8
1     31  c9
2     32  ca
3     33  cb
4     34  cc
5     35  cd
6     36  ce
7     37  cf
8     38  c0
9     39  c1
:     3A  c2
;     3B  c3
<     3C  c4
=     3D  c5
>     3E  c6
?     3F  c7
@     40  b8

char  old new
----  --- ---

A     41  b9
B     42  ba
C     43  bb
D     44  bc
E     45  bd
F     46  be
G     47  bf
H     48  b0
I     49  b1
J     4A  b2
K     4B  b3
L     4C  b4
M     4D  b5
N     4E  b6
O     4F  b7
P     50  a8
Q     51  a9
R     52  aa
S     53  ab
T     54  ac
U     55  ad
V     56  ae
W     57  af
X     58  a0
Y     59  a1
Z     5A  a2
[     5B  a3
\     5C  a4
]     5D  a5
^     5E  a6
_     5F  a7
`     60  98

char  old new
----  --- ---

a     61  99
b     62  9a
c     63  9b
d     64  9c
e     65  9d
f     66  9e
g     67  9f
h     68  90
i     69  91
j     6A  92
k     6B  93
l     6C  94
m     6D  95
n     6E  96
o     6F  97
p     70  88
q     71  89
r     72  8a
s     73  8b
t     74  8c
u     75  8d
v     76  8e
w     77  8f
x     78  80
y     79  81
z     7A  82
{     7B  83
|     7C  84
}     7D  85
~     7E  86

With this table, we can now decode by hand what was, for example, in the "Name of file" field above. The first "0E 00 00 00" is the size of the field, so it covers the next 14 bytes (0E hex is 14 decimal). These bytes and their decoded values are:


90 91 9C 9C 9D 96 9E 91 94 9D D6 8C 80 8C
h  i  d  d  e  n  f  i  l  e  .  t  x  t
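Every row of the substitution table is consistent with a single operation: XOR with 0xF8 (for example 0x20 XOR 0xF8 = 0xD8, 0x61 XOR 0xF8 = 0x99, 0x7E XOR 0xF8 = 0x86). A minimal Python sketch of the decoding under that assumption follows; the name `scramble` is mine, and the behaviour for bytes outside the printable range covered by the table is a guess.

```python
KEY = 0xF8  # inferred from the table, not taken from the program itself

def scramble(data: bytes) -> bytes:
    """XOR every byte with 0xF8; the same call encodes and decodes."""
    return bytes(b ^ KEY for b in data)

name_field = bytes.fromhex("90 91 9C 9C 9D 96 9E 91 94 9D D6 8C 80 8C")
print(scramble(name_field).decode("ascii"))  # hiddenfile.txt
```

Because XOR is its own inverse, the same function translates plain text into the substitution alphabet.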

Which is the name of the hidden file. But we don't really care about that. It's just to demonstrate the weakness of the cipher used here. More fundamental is the MD5 field, which is stored in a weird, bloated format (32 scrambled hexadecimal characters instead of 16 raw bytes), but nothing surprises me anymore these days. The first "20 00 00 00" is the size of the field, so it covers the next 32 bytes (20 hex is 32 decimal). These bytes and their decoded values are (I removed the spaces to fit on one line):


C89B9BC9CFCD9AC19BC89EC99ACE99C0CBC99BCBC1C19DCACEC1CFCFCACECEC9
0 c c 1 7 5 b 9 c 0 f 1 b 6 a 8 3 1 c 3 9 9 e 2 6 9 7 7 2 6 6 1

This is the MD5 of the string "a", which you can check with good free tools like HashCalc.
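The same check can be done with Python's standard `hashlib` module instead of a separate tool. This sketch assumes the substitution is equivalent to XOR with 0xF8, which every row of the table above satisfies; the variable names are mine.

```python
import hashlib

# The 32 bytes of the password field, copied from the dump.
field = bytes.fromhex(
    "C89B9BC9CFCD9AC19BC89EC99ACE99C0CBC99BCBC1C19DCACEC1CFCFCACECEC9"
)

# Undo the substitution: each byte becomes one ASCII hexadecimal character.
stored_hex = bytes(b ^ 0xF8 for b in field).decode("ascii")

print(stored_hex)
print(stored_hex == hashlib.md5(b"a").hexdigest())  # True
```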

5. So how do we extract data when we don't know the password?

I'm too bored to code a program for that. Here is how to do it by hand:

1. Calculate the MD5 of your favorite password. For example, the MD5 of the string "1" is "c4ca4238a0b923820dcc509a6f75849b".

2. Translate it into the C1oak / Data Stea1th substitution alphabet. For example, the above MD5 becomes "9bcc9b99cccacbc099c89ac1cacbc0cac89c9b9bcdc8c199ce9ecfcdc0ccc19a".

3. Find the password MD5 field in your file containing "hidden" data. A first idea that comes to mind is to search for the bytes "20 00 00 00": an MD5 is always 128 bits, so 16 bytes, written here as 32 hexadecimal characters, so this field will always be 32 bytes. Be careful, there are two fields with the same size (see the hex dump above); the password is the second one.

4. Overwrite the MD5 bytes with the new ones, using a hexadecimal editor.

5. Extract your data with C1oak or Data Stea1th, entering the new password "1". Et voilà.
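For readers who would rather not use a hex editor, steps 1 to 4 can be sketched in a few lines of Python. This is untested against the real programs and rests on assumptions taken from the single dump above: the substitution is XOR with 0xF8, the last 4 bytes of the file give the little-endian offset where the original file ends, 3 unknown bytes follow that offset, and the password field is the third length-prefixed field. The function names are mine.

```python
import hashlib
import struct

def scrambled_md5(password: str) -> bytes:
    """Steps 1-2: hex MD5 of the password, each character XORed with 0xF8."""
    digest_hex = hashlib.md5(password.encode("ascii")).hexdigest()
    return bytes(c ^ 0xF8 for c in digest_hex.encode("ascii"))

def replace_password(data: bytes, new_password: str) -> bytes:
    """Steps 3-4: locate the password MD5 field and overwrite it."""
    # Assumption: the trailer holds the offset where the original file ends.
    (end_of_original,) = struct.unpack("<I", data[-4:])
    pos = end_of_original + 3            # skip 3 bytes of unknown meaning
    for _ in range(2):                   # skip file name and certificate fields
        (size,) = struct.unpack_from("<I", data, pos)
        pos += 4 + size
    (size,) = struct.unpack_from("<I", data, pos)
    if size != 32:
        raise ValueError("unexpected password field size")
    start = pos + 4
    return data[:start] + scrambled_md5(new_password) + data[start + 32:]

# Step 2 of the manual procedure, reproduced for the password "1".
print(scrambled_md5("1").hex())
```

To use it, read the carrier file as bytes, pass it through `replace_password(data, "1")`, write the result to a new file, and continue with step 5.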

Have a nice day!

Guillermito, February 24th 2004