Analysing data files by drawing pictures

31 May

Not many people have to analyse completely undocumented data files to see what they are – but, every now and then, I do.

One surprisingly useful technique is to map the possible values of a byte (0 to 255) to unique colours, then display the file as a picture, one pixel per byte.
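As a rough illustration of the idea (not the tool used for the pictures in this post), here is a minimal Python sketch that needs only the standard library. The function names `palette_332` and `bytes_to_ppm` are made up for this example, and the 3-3-2 bit palette is just one assumed way of getting 256 distinct colours; any mapping that gives each byte value its own colour would do.

```python
def palette_332(value):
    """Map a byte (0-255) to a distinct RGB triple using a 3-3-2 bit split."""
    red = value & 0xE0            # top three bits
    green = (value & 0x1C) << 3   # middle three bits
    blue = (value & 0x03) << 6    # bottom two bits
    return red, green, blue


def bytes_to_ppm(data, width):
    """Render data as a binary PPM image, one pixel per byte, `width` pixels per row."""
    if width <= 0:
        raise ValueError("width must be positive")
    height = max(1, -(-len(data) // width))            # ceiling division
    padded = data + bytes(width * height - len(data))  # pad the last row with zero bytes
    pixels = bytearray()
    for value in padded:
        pixels.extend(palette_332(value))
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(pixels)
```

To try it, read a file in binary mode and write the result out, for example `open("picture.ppm", "wb").write(bytes_to_ppm(open("mystery.dat", "rb").read(), 250))`, where both file names are placeholders. Most image viewers can open PPM files.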

Here’s an old data file:

But that’s just drawing the pixels out at an arbitrary width (250 pixels wide, in this case).

Starting a new line after every 269 pixels, you get this:

Note the diagonal lines. Since they are at 45 degrees and slant from top-left to bottom-right, each record is starting one pixel further right on every row, so you can tell that the display needs to be 1 pixel wider.
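The arithmetic behind that slant can be checked with a small sketch. It assumes fixed-size records, and `marker_positions` is a hypothetical helper written for this example: it reports where the same byte of each record lands on screen for a given display width.

```python
def marker_positions(record_size, width, records, offset=0):
    """Return (row, column) of the byte at `offset` in each of the first `records` records."""
    return [divmod(r * record_size + offset, width) for r in range(records)]


# Display one pixel too narrow: the column creeps right by one each row.
too_narrow = marker_positions(270, 269, 4)   # [(0, 0), (1, 1), (2, 2), (3, 3)]

# Display matches the record size: the column stays put, giving a vertical line.
exact = marker_positions(270, 270, 4)        # [(0, 0), (1, 0), (2, 0), (3, 0)]

# Display one pixel too wide: the column creeps left, so the slant reverses.
too_wide = marker_positions(270, 271, 4)     # [(0, 0), (0, 270), (1, 269), (2, 268)]
```

A shallower or steeper diagonal would suggest the width is off by more than one pixel, though that is an extrapolation from the same arithmetic rather than something shown in the pictures here.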

Drawing the file out with 270 pixel rows, one gets:

Nice vertical lines show that one has found the record size, the first step in decoding the data.

Other types of file don’t display so nicely. Here are a couple of graphics files (a .gif and a .png):

No obvious patterning, but there’s still information to gain. In particular, it’s clear that both files have a header section which has much more structure than the image data that follows it. The fact that a file contains two or more different types of data can be very useful – sometimes you would have to step through hundreds of pages with a hex editor to spot it.
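The picture is doing the work here, but the same contrast can be put into rough numbers. The sketch below is an assumption of mine rather than part of the technique above: it computes the Shannon entropy of each fixed-size chunk of a file, on the theory that a structured header will tend to score lower than compressed image data. The name `chunk_entropy` and the default chunk size of 256 bytes are arbitrary choices for illustration.

```python
import math
from collections import Counter


def chunk_entropy(data, chunk_size=256):
    """Return Shannon entropy (bits per byte) for each consecutive chunk of data."""
    results = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        counts = Counter(chunk)          # how often each byte value occurs in this chunk
        total = len(chunk)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        results.append(entropy)
    return results
```

A chunk made of one repeated byte scores 0 bits per byte, and a chunk in which all 256 values appear equally often scores 8. A sudden jump in the list would be a hint that the file has switched from one kind of data to another.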
