[Written on February 24th 2004]
Steganography strength (is it easy to see there is hidden data?): Low
Cryptography strength (is it easy to recover the hidden data?): Low
I generally like to write these papers in a funny way, to lighten up the austere technical side a bit, and sometimes, yes, I shamefully admit, I poke some fun at programmers, but I swear I try never to be too cruel, especially when we are talking about freeware and, even better, Open Source software. When people try to sell bad software, I rarely have pity.
But in the case of this program, it's not funny anymore. I think a line is definitely crossed here. It is plain irresponsible to sell such crap. First, this software does not do what it is supposed to do. Worse, it was written by someone who has no idea what steganography and file formats are; it is vaguely documented, buggy, and probably never tested. The result is that the steganography routine in FortKnox will actually corrupt your files, both the ones you try to hide and the carrier files, while you're using it exactly as they say it should be used. This program technically fits the definition of a trojan. And they try to sell it for 45 dollars.
Nevertheless, I contacted the author so he could remove the leaked information inside the demo BMP (see the very last chapter) before I put this article on-line.
1. A first look at the software presentation page
A company called ClickOK Ltd in the UK is selling a piece of software for 25 pounds, which is around 45 US$. It's called "FortKnox", after the very secure location where the US gold reserve is kept. Apparently, this software was previously (before February 2004, as they mention in the readme) known as "PalmTree", or just "Steganography". It bundles several security routines like encryption, hiding directories, managing passwords. But I will only have a look at the steganography part of it, apparently version 2.8. I discovered this software because they apparently bought the Google adword "steganography", which means that every time you type this keyword in the search engine (and I do that a lot), an advertisement for their product appears in a small box on the right side of the search results screen.
To be really frank, a simple look at their web page already raises a few mental alarms. As I have often written, you usually get a pretty good idea of what you will be dealing with just by reading the presentation of a product. It also gives you a few ideas and wild guesses that help in designing simple tests. Here we have a typical and nice example of snake oil:
- "The market leader in steganography"... I've never heard of them, which does not mean anything by itself, but I do a lot of searches about steganography products. Never seen a test of their software either.
- "Hide any file in any file type"... Well, if you can hide something in any file, that means that there is something wrong. An in-depth understanding of file formats is important in steganography. At this point, I suspect yet another case of fusing the "hidden" data onto the end of the file, with the hope that it will not mess up the carrier file. It's actually worse, as we will see. Note that the software package shown on the web page says "Hide data into images and most other file types". Well, they need to decide. "Any file type" is conceptually very different from "most file types".
- "Image file needs to be only 9 times larger than data file"... Now that seems to contradict my speculations about the above claim. If it's a simple file fusing, you don't care about size. The ratio here (close to 1 bit in a byte of 8 bits) seems to point to some kind of LSB embedding. We will see.
- "UNBREAKABLE MILITARY GRADE encryption"... A classic one in good old uppercase so you won't miss it. Of course "military grade" and even "unbreakable" do not mean anything.
- "Tried, tested and trusted by millions of users"... What? Millions of users? What about billions while we are at it? There are very few programs that actually have millions of users, and generally you have heard of them. The funniest part of this is that I know exactly how many clients they have for another thing they are selling, see the last section if you want to have fun.
- "Derived from our Blowfish, CryptoAPI and Rijndale algorithm lab tests"... That's misleading at best. First, the correct spelling is Rijndael, and it's now the AES encryption standard. Second, Blowfish was invented by Bruce Schneier, Rijndael by two Belgian academic cryptographers from the University of Leuven, and CryptoAPI is not an algorithm per se, but a library of cryptography routines (probably the Microsoft one, this program being written in Visual Basic) that includes several well-known algorithms.
Then there is this odd comment: "Known Errors - As this software allows you to hide any file within any other file, people tend to try some weird combinations (like database files in text files and movie files in html files etc). An error will only occur if the destination file (when encrypting) is smaller than the source file"... What does it mean? That we can do some "weird combination"? That it will work? Or not? I don't understand. Their second statement was "Hide any file in any file type". Yes or no? Incredibly vague. Some testing will be necessary.
2. A first test with BMP images
The programmers of this software nicely include a big 24 bits BMP image (see below what surprise I found inside this image), and a text for you to test. The demo version is limited in such a way that you can only hide text starting with the string "testing this software". So let's do it.
Well, it works. A few other simple tests show that they embed the hidden data by overwriting the two Least Significant Bits (LSBs) of each byte of the pixel data. Just as a reminder, in 24 bits BMP images, the most used type of BMP picture, you have a small header, and then each pixel is encoded flatly and sequentially (from the first pixel to the last pixel) as three bytes, one for the Red, one for the Green, and one for the Blue levels (RGB encoding). So it's 3 bytes, or 24 bits, hence the name.
This header in BMPs is in most cases 54 bytes long (36 in hexadecimal). Its size is stored in the header itself, at offset 10. After that, you have the pixel data. FortKnox leaves the very first pixel byte alone, and then starts to overwrite the 2 LSBs starting from byte 55. The point is, as with almost all BMP steganography programs, to hide the data by slightly changing the color of each pixel, modifying the 2 least significant bits out of a total of eight bits for each color coefficient.
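To make the layout concrete, here is a minimal Python sketch of this kind of 2-LSB embedding. It is not FortKnox's actual code: the function names `pixel_data_offset` and `embed_2lsb`, the synthetic carrier, and above all the bit order (each hidden byte split into four 2-bit groups, most significant pair first) are assumptions made for illustration. Only the offset-10 field and the skipped first pixel byte come from the observations above.

```python
import struct

def pixel_data_offset(bmp: bytes) -> int:
    """Read the 4-byte little-endian pixel data offset stored at offset 10."""
    return struct.unpack_from("<I", bmp, 10)[0]

def embed_2lsb(bmp: bytes, payload: bytes) -> bytes:
    """Overwrite the 2 LSBs of successive bytes, skipping the first pixel byte.

    Assumption: each payload byte is split into four 2-bit groups,
    most significant pair first. The real tool's bit order may differ.
    """
    out = bytearray(bmp)
    pos = pixel_data_offset(bmp) + 1  # the first pixel byte is left alone
    if pos + 4 * len(payload) > len(out):
        raise ValueError("carrier too small for payload")
    for byte in payload:
        for shift in (6, 4, 2, 0):
            pair = (byte >> shift) & 3
            out[pos] = (out[pos] & 0xFC) | pair  # keep the 6 high bits
            pos += 1
    return bytes(out)

# Synthetic carrier: a 54-byte header (only the fields used here are filled in)
# followed by 64 mid-grey pixel bytes.
header = bytearray(54)
header[0:2] = b"BM"
struct.pack_into("<I", header, 10, 54)
carrier = bytes(header) + bytes([0x80] * 64)

stego = embed_2lsb(carrier, b"te")
print(stego[:54] == carrier[:54])                        # is the header untouched?
print(max(abs(a - b) for a, b in zip(stego, carrier)))   # largest per-byte change
```

With two bits stored per carrier byte, each hidden byte costs four carrier bytes, and no color coefficient can move by more than 3 out of 255.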
The hidden data is not encrypted, compressed, encoded or modified. So it's very easy to retrieve it. I've coded a small utility to automatically extract the data, see below. In my self-defined steganography scale, the FortKnox steganography routine is category 3, one of the simplest to crack.
I will go into a little more detail than usual so that people understand why this program destroys your files. It will come later. First, just BMPs.
Here is a hands-on example. First, the very beginning of a BMP without anything hidden in it. The whole BMP header is in red. It's 54 bytes long, and this size is stored in the underlined double byte, at offset 10. The white bytes that follow are the actual pixel data in RGB format.
Now here is the same BMP with the demo text hidden with FortKnox. Obviously, the header does not change at all, because that would totally mess up the image, and it would become unreadable. As I said, the hidden data starts at byte 55, so you can see that the values of these white bytes have actually changed slightly.
Let's extract the beginning of the hidden message by hand. Remember that it is hidden in the 2 Least Significant Bits of each byte. If you want to extract them programmatically, you apply an "AND 3" mask to each byte, because 3 in decimal is 00000011 in binary. So, the 2 LSBs of:
If we re-arrange them in groups of 8 bits, we have two complete bytes and one incomplete (remember I just show a partial hexa dump):
Which are the ASCII numbers for the letters:
Obviously, you remember the demo text was "testing this software". So we got it. A point that will become important later is that FortKnox puts a zero at the end of the hidden text. Because zeros are not supposed to appear in an ASCII text file, the program will know it's time to stop when it meets a zero at extraction time.
Here is a small utility I've quickly coded to extract the data automatically. It's called FortKnox_Hidden_Text_Finder. Keep in mind that my program, unlike FortKnox, will not stop after meeting a zero in the extracted data. It's deliberate. So if you extract a small text, you will get this small text, a zero, and then a lot of random bytes.
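For readers who want to try the idea, here is a minimal Python sketch of such an extractor. It is not the source of FortKnox_Hidden_Text_Finder, which is not reproduced here. The function names and the bit order (four consecutive 2-bit groups form one byte, most significant pair first) are assumptions; like the utility described above, it deliberately does not stop at the first zero.

```python
import struct

def extract_2lsb(carrier: bytes) -> bytes:
    """Collect the 2 LSBs of every byte after the first pixel byte.

    Assumption: four consecutive 2-bit groups form one byte, most
    significant pair first. Does not stop at a zero byte.
    """
    start = struct.unpack_from("<I", carrier, 10)[0] + 1
    pairs = [b & 3 for b in carrier[start:]]          # the "AND 3" mask
    out = bytearray()
    for i in range(0, len(pairs) - 3, 4):             # ignore an incomplete tail
        p0, p1, p2, p3 = pairs[i:i + 4]
        out.append((p0 << 6) | (p1 << 4) | (p2 << 2) | p3)
    return bytes(out)

def up_to_first_zero(data: bytes) -> bytes:
    """What a reader that stops at the first zero would return."""
    return data.split(b"\x00", 1)[0]

# Tiny hand-built carrier: 54-byte header, one untouched pixel byte, then
# bytes whose 2 LSBs spell "t" (0x74 = 01 11 01 00), a zero byte, and one
# more byte of leftover noise.
header = bytearray(54)
header[0:2] = b"BM"
struct.pack_into("<I", header, 10, 54)
body = bytes([0xFF,
              0x81, 0x83, 0x81, 0x80,    # pairs 1,3,1,0 -> 0x74 "t"
              0x80, 0x80, 0x80, 0x80,    # pairs 0,0,0,0 -> 0x00
              0x82, 0x81, 0x80, 0x83])   # pairs 2,1,0,3 -> 0x93 (noise)
demo = bytes(header) + body

everything = extract_2lsb(demo)
print(everything)
print(up_to_first_zero(everything))
```

The difference between the two printed values is the whole point: whatever sits after the terminating zero is still in the carrier and still readable.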
3. So where is the scandal? I want blood, guts and tears!
Until now, the stego sub-program bundled in FortKnox looks like a normal simple and weak steganography software, like so many others before it, nothing to get scandalized about. If you want to hide a spicy letter from your lover from the eyes of your little sister, and if you plan to use only 24 bits BMP files, there is no problem. We are even used to some level of snake oil and somehow probably desensitized. Oh, and this software is extremely slow as well. My own small program, 2 Kb, is around a hundred times faster to extract data from carriers.
But the real fun is coming now.
The true problems started when I tested their next claims: it can hide any file (claim 1) into any file (claim 2).
3.1 Claim 1: hide any file... and destroy it.
Any file? No. Because the FortKnox steganography routine does not know how to handle zeros. Yes, zeros. Like a byte equal to 00 inside a file. Weird, huh? If you have a zero in your data stream to be hidden, it simply... stops! Apart from pure ASCII text files and a few other rare file types, almost all binary files have zeros in them, it's so evident. In strongly encrypted files, each byte is a zero with a probability of 1/256. Databases have zeros. Executables have huge islands of zeros. Image files have zeros. Zeros are everywhere! So you think you have hidden some kind of important file, just as they claim you can do, you wipe the original one, and oooops, FortKnox has destroyed your precious data. You're just left with the few bytes before the first evil zero from hell.
The programmer of FortKnox does not know that he could encode the size of the hidden data inside the hidden data itself, so he does not have to wait for a zero.
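A minimal Python sketch of the difference, under stated assumptions: `read_zero_terminated` mimics the stop-at-zero behaviour described above, while `frame` and `unframe` are hypothetical helpers showing one common fix, a 4-byte little-endian length prefix. Nothing here is taken from FortKnox itself, and the sample bytes are made up.

```python
import struct

def read_zero_terminated(stream: bytes) -> bytes:
    """Mimic the behaviour described above: stop at the first zero byte."""
    return stream.split(b"\x00", 1)[0]

def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte little-endian integer."""
    return struct.pack("<I", len(payload)) + payload

def unframe(stream: bytes) -> bytes:
    """Read the length prefix, then exactly that many bytes."""
    (length,) = struct.unpack_from("<I", stream, 0)
    if length > len(stream) - 4:
        raise ValueError("declared length exceeds available data")
    return stream[4:4 + length]

# A made-up binary payload containing zero bytes, followed by whatever
# noise happens to sit after it in the carrier.
binary = b"PK\x03\x04\x14\x00\x00\x00rest of a binary file"
trailing_noise = b"\x9c\x00\x17"

print(read_zero_terminated(binary + b"\x00" + trailing_noise))  # cut at the first zero
print(unframe(frame(binary) + trailing_noise) == binary)        # full payload recovered?
```

The length prefix costs four bytes of capacity and makes the payload content irrelevant: zeros inside it are just data.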
Nowhere is it said that you should hide only ASCII text files. Quite the opposite: they claim on their web site that you can hide "any file". When you click the "read this" button inside the software, it says, word for word, "allow you to hide the content of a text file (or any data file)". It's plainly irresponsible.
And just because of this bug, just because they have no idea what they are doing, they left their whole list of clients' emails exposed for the world to see. See below.
3.2 Claim 2: ... into any file... and destroy it too.
For the carrier files you may want to use, they give at least a few examples: "BMP, JPG, ZIP, DBF, almost anything" (on their web page), and once again when you click on the "Hide in other file types" button inside the program, it says "hide messages in an image file OR ANY FILE TYPE (JPG, Executables, DBF,... almost anything) - Just choose your alternative to an image file" (uppercase as in the original, so they really want to put an emphasis on that).
So let's try.
First with BMPs that are not 24 bits. I took a small image and converted it to BMP format with 24 bits for each pixel (16 million colors), 8 bits (256 colors), 4 bits (16 colors), or 1 bit (2 colors). Then I hid some text in it (the demo text followed by a bunch of "a"), close to the maximum capacity. Here are the results (for viewing on this webpage, the images are changed to GIF). I didn't enhance anything. These are the results you will see inside any image viewer:
Strange, isn't it? Actually, it looks like FortKnox does not care about the internal format of the image. It treats every BMP as if it were a 24 bits BMP. And of course, it screws up the images which are not 24 bits, because the pixel data is not stored the same way. The "Least Significant Bits" that FortKnox modifies become very significant when you don't use 24 bits. Because the header size of X bits BMP is constant, you can still see an image, although totally scrambled (in a sort of regular way because my hidden text was constant after the demo text, a lot of "a"). It's just plain chance.
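To see why those two bits stop being "least significant", here is a small Python sketch that interprets the same byte of pixel data at different bit depths, before and after its 2 LSBs are overwritten. The helper names and the sample byte are assumptions for illustration; the packing used (leftmost pixel in the most significant bits) is how 4 bits and 1 bit BMPs store their pixels.

```python
def overwrite_2lsb(byte: int, pair: int) -> int:
    """Replace the 2 least significant bits of a byte."""
    return (byte & 0xFC) | (pair & 3)

def pixels_in_byte(byte: int, bits_per_pixel: int) -> list:
    """Split one byte of pixel data into pixel values, leftmost pixel first."""
    count = 8 // bits_per_pixel
    mask = (1 << bits_per_pixel) - 1
    return [(byte >> (8 - bits_per_pixel * (i + 1))) & mask for i in range(count)]

original = 0b10100100
modified = overwrite_2lsb(original, 0b11)

for bpp in (8, 4, 1):
    print(bpp, pixels_in_byte(original, bpp), "->", pixels_in_byte(modified, bpp))
```

At 8 bits per pixel the byte is a palette index, so a change of 3 selects a different palette entry, which can be any color at all. At 4 bits the second pixel of the pair gets a new palette index, and at 1 bit two whole pixels can flip between the two colors. In a 24 bits image the same change is a shift of at most 3 out of 255 in one color coefficient.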
Then a question came to me: does it actually verify that carriers are indeed BMPs?
The answer is: no! Feed FortKnox any kind of file as a carrier, even files that they claim can be used, and it will treat them as 24 bits BMPs! Even if these are JPGs, ZIPs, executables, a file containing just zeros, or whatever you want. There is no verification that the file is a BMP.
And, of course, your carrier files will be destroyed in the process. You cannot treat a JPG or a ZIP the same way as a BMP, it's so evident. At the beginning of most file formats, you have a header with very important information absolutely needed for the file to work the normal way. Modify this information and your files will be definitely corrupted. Do I really have to explain that?
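The missing check is not hard to write. Here is a minimal Python sketch of what a carrier validation could look like; the function name `is_24bit_bmp` and the fake test files are hypothetical, and the sketch assumes the common Windows header layout (signature "BM" at offset 0, pixel data offset at 10, bits per pixel at 28, compression at 30). Older BMP header variants would need extra handling.

```python
import struct

def is_24bit_bmp(data: bytes) -> bool:
    """Accept only uncompressed 24 bits BMPs with a standard 54-byte-or-more header."""
    if len(data) < 54 or data[0:2] != b"BM":
        return False
    pixel_offset = struct.unpack_from("<I", data, 10)[0]
    bits_per_pixel = struct.unpack_from("<H", data, 28)[0]
    compression = struct.unpack_from("<I", data, 30)[0]
    return (bits_per_pixel == 24
            and compression == 0
            and 54 <= pixel_offset < len(data))

# Fake files: just enough bytes to exercise the check.
bmp_header = bytearray(54)
bmp_header[0:2] = b"BM"
struct.pack_into("<I", bmp_header, 10, 54)   # pixel data offset
struct.pack_into("<H", bmp_header, 28, 24)   # bits per pixel
fake_bmp = bytes(bmp_header) + bytes(12)
fake_jpg = b"\xff\xd8\xff\xe0" + bytes(60)   # starts with the JPEG Start of Image marker
fake_zip = b"PK\x03\x04" + bytes(60)         # starts with the ZIP local file header signature

for name, data in (("bmp", fake_bmp), ("jpg", fake_jpg), ("zip", fake_zip)):
    print(name, is_24bit_bmp(data))
```

A program that refuses anything failing such a test would at least leave JPGs, ZIPs and executables intact.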
Just an example with the beginning of a JPG file used as a carrier for hiding their demo text. The modifications, starting at offset 55d or 37h, are shown in red. FortKnox treats it as a BMP! Here is the beginning of the JPG file, before and after hiding something in it. I've underlined the markers for very important fields in the JPG header like FF D8 (Start of Image) or FF DB (Define Quantization Table). You can see that FortKnox destroys some of these markers. The JPEG image is corrupted.
Let's try using a ZIP, an executable, and a JPEG as carriers, just as they say it's possible. The result, when you try to use your carriers afterwards, will be:
It's actually so odd it's almost unbelievable, and I tried to think about all the weird consequences. For example, imagine you try to hide a JPG inside a JPG: you will destroy both files.
3.3 The funniest part of it all.
A first observation is that FortKnox does not know how to delete hidden messages, for example by setting all LSBs back to random values. So you cannot "clean" a BMP already used for a test. A second observation is that, if it hides a small message, it will not touch the remaining LSBs.
Because of these two observations, and because we now know how FortKnox works, think about what may happen if you first use a big BMP to hide a big text file, then use the very same BMP to hide a small text file.
The small text file, with an added zero at its end, will overwrite the beginning of the big text file. But the remaining part of the big text file will still be here. Something like this:
Hide this: [blabla first big text file hidden by FortKnox blabla]00
Then hide that: [2nd small text]00
Result: [2nd small text]00[ext file hidden by FortKnox blabla]00
If you use FortKnox to extract information, it will stop at the first zero, so it will just extract the last text you hid. But remember that I coded a small program to extract data hidden by FortKnox. And I didn't reproduce the "stop at the first zero" bug. I programmed it so it will extract everything from an image, from beginning to end. So it will extract the text you hid, plus all the rest (traces of older and longer hidden texts, and then random bytes) until the end of the carrier file.
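The layering effect is easy to reproduce. The Python sketch below models only the hidden channel (the stream of bytes carried by the LSBs), not a real image; the `hide` helper, the filler value and the sample texts, including the example.com address, are made up for illustration.

```python
def hide(channel: bytearray, text: bytes) -> None:
    """Overwrite the start of the hidden channel with the text plus a zero,
    leaving everything after it untouched, as observed above."""
    data = text + b"\x00"
    if len(data) > len(channel):
        raise ValueError("text too long for this carrier")
    channel[:len(data)] = data

# Stand-in for the original LSB noise of a fresh carrier.
channel = bytearray(b"\xaa" * 48)

hide(channel, b"big text with client@example.com inside")
hide(channel, b"small text")

stops_at_zero = bytes(channel).split(b"\x00", 1)[0]   # what a zero-terminated reader sees
everything = bytes(channel)                            # what a full extraction sees

print(stops_at_zero)
print(everything)
```

The first print shows only the small text; the second still contains the tail of the big one, address included.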
When I tested my program with the BMP image included in FortKnox, I had a surprise. I extracted the "testing this software" string ended by a zero, as expected, and then, I was expecting some random bytes, but.... There was some kind of Perl CGI script, with URL addresses in it, some password, and a long list of hundreds of email addresses. All of them interrupted each time with a zero. It was several layers of forgotten hidden information.
After close examination, it appears that these are several documents hidden one after the other in the demo BMP.
At the beginning of it, you have the script that prints a webpage after people complete a credit card transaction when buying another product from the same company that distributes FortKnox. Stuff like this:
Following this you have 885 email addresses (the first ones have been erased by hiding the preceding smaller script), the database of people who bought this product. A lot of "recruitment@", "ressource@", "jobs@", "cv@"...
So it appears that the programmer of FortKnox once tried his own program with the included BMP, and needed some kind of long text file. He probably chose what was lying around on his computer, the database of his clients and some script. And then he forgot to clean the BMP before including it in the software package, exposing passwords and clients' emails for everybody to see. That tells a lot about the seriousness of it all. By the way, in France, failing to adequately protect a database of personal information is a crime that can be punished by 5 years in jail. Kitetoa knows this very well, as he's always exposing company websites that do not follow the law by not protecting these client databases enough.
It reminds me of an anti-virus product whose executable had traces of a badly repaired CIH infection inside it. Anyway. To conclude, just remember this: some software is written by people who have no clue about what they are doing, and just want to make a quick buck. This one is an extreme example.
Have a nice day!
Guillermito, February 24th 2004