From Fedora Project Wiki

X BitMaps: extract image data and display in ascii terminal

Intro

While being distracted from my previous distractions from an earlier distraction (fonts), I was intrigued by: [Great 202 Jailbreak - Computerphile]

The report [a Summer Vacation: Digital Restoration and Typesetter Forensics] included a link to [archive made available of Martin W. Guy's backup to tape from the 80s], where the authors found some data they used either directly or to confirm their earlier guesses about construction of the document. This appears to have taken about 6-8 weeks of work to rebuild one printed report from various information they were able to find or still had in hand. But I digress.

Within the archive index were images described as: Mike Hawley's collection of tiny X bitmaps (Dec 1988), including: [Brian Kernighan].

Unknown image type

After clicking the extension-less file I saw:

#define bwk_width 48
#define bwk_height 48
static char bwk_bits[] = {
0x00, 0x00, 0xc0, 0x3f, 0x00, 0x00, 
0x00, 0x00, 0xf8, 0xea, 0x01, 0x00, ...

Hoping to find an application that could display this source code, I saved it to disk and tried file, which reported: bwk.image.c_source: ASCII text. Seeing that this was C source code, I assumed it was meant to be compiled directly into a larger C application. What I could have done instead was attempt to identify the file with:

test                  result
ffprobe               bwk.image.c_source: Invalid data found when processing input
gimp                  bwk.image.c_source' failed: Unknown file type
imageinfo             XBM X Windows system bitmap (black and white) 1850 8 48x48
                      (using: imageinfo --format --fmtdscr --size --depth --geom bwk.image.c_source)
imagemagick identify  XBM 48x48 48x48+0+0 8-bit sRGB 2c 1.85KB 0.000u 0:00.000

Python workout

Given that the things I had tried hadn't made me any the wiser, I considered starting a C app to include the file and code something to view it somehow. Expanding my python skills was more important, so I began looking at the structure of the file to plan how to proceed:

- read the file
- get the width
- get the height
- get the image data
- transform / feed into an image creation library to create a png/bmp, whatever was easiest
- not knowing the file format, I decided to also grab the filename (bwk) from the defines, assuming that a file could define more than one image and that I would need to pick the right defines and data for a single image
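The steps in this plan can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the code developed in the following sections: the helper names parse_xbm and render_ascii are made up for this sketch, the sample is the first two rows of the bwk data quoted earlier with the height cut down to 2, and instead of writing a png/bmp it prints the pixels as characters. It relies on two properties of the XBM format: each row is padded to a whole number of bytes, and the least significant bit of each byte is the leftmost pixel.

```python
import re

# Truncated sample for illustration only: the first two rows of the bwk
# data quoted earlier, with the height changed from 48 to 2 so that the
# header agrees with the amount of data present.
SAMPLE = """\
#define bwk_width 48
#define bwk_height 2
static char bwk_bits[] = {
0x00, 0x00, 0xc0, 0x3f, 0x00, 0x00,
0x00, 0x00, 0xf8, 0xea, 0x01, 0x00};
"""


def parse_xbm(text):
    """Return (name, width, height, data) for the first image in text."""
    width = re.search(r'#define\s+(\w+)_width\s+(\d+)', text)
    height = re.search(r'#define\s+\w+_height\s+(\d+)', text)
    bits = re.search(r'_bits\s*\[\]\s*=\s*\{(.*?)\}', text, re.DOTALL)
    if not (width and height and bits):
        raise ValueError('not an XBM-style file')
    # Every 0x.. token inside the braces is one byte of pixel data.
    hex_tokens = re.findall(r'0[xX][0-9a-fA-F]+', bits.group(1))
    data = bytes(int(token, 16) for token in hex_tokens)
    return width.group(1), int(width.group(2)), int(height.group(1)), data


def render_ascii(width, height, data, on='#', off='.'):
    """Turn the packed bits into one string per image row."""
    row_bytes = (width + 7) // 8  # rows are padded to a byte boundary
    lines = []
    for y in range(height):
        row = data[y * row_bytes:(y + 1) * row_bytes]
        # Bit 0 of each byte is the leftmost of its eight pixels.
        lines.append(''.join(
            on if (row[x // 8] >> (x % 8)) & 1 else off
            for x in range(width)))
    return lines


name, width, height, data = parse_xbm(SAMPLE)
print(name, width, height)
for line in render_ascii(width, height, data):
    print(line)
```

With the full 48-row file in place of SAMPLE, the same two calls would print the whole picture, one terminal line per image row.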

Python development environment

Half the problem is finding and setting up a dev env to speed up development. I started with: python3, gedit, gnome-terminal, firefox (google, python manual, stackoverflow). Hacking involved trying stuff in the python3 interpreter, and then copy-pasting it into my code.py in gedit.

Later I started using bluefish editor, with a custom command for python:

gnome-terminal --geometry=100x50+1200+0 --working-directory='%c' -e "bash -c \"python3 '%f'; read -n1 junk\""

Clicking Python starts the terminal, with the correct directory, starts python3 with the file in the editor, and pauses the terminal output until a key is pressed - necessary to see interpreter messages and my hacking output.

file read

Getting the text of the file into a string in memory was easy:

# 'with' closes the file again once the text has been read
with open('bwk.image.c_source.txt') as fhand:
    sDataRaw = fhand.read()

regular expressions

I learnt a lot about regexes by using the re module, and then the extended regex module, to detect conforming file content. The [regex builder/tester] was useful. At first I tried to match the two #define lines and extract the match group data, leaving the pixel data for a second regex.

import re
...
# The standard re module has no POSIX classes such as [[:alpha:]], so spell out the ranges
matchobject = re.search(r'#define ([A-Za-z]{1,3})_([A-Za-z]{4,6}) ([0-9]{1,2})', sDataRaw)
if matchobject:
  print(matchobject.groups())

However, this would only show the first match. I extended this to match the overall file structure, extracting: imagename1, metric1, value1, imagename2, metric2, value2, imagename3, followed by data that looked like a C array of 0xab hex values. Since I needed multiple matches, I changed to the regex library instead:

import regex
...
pattern = regex.compile(r'#define ([[:alpha:]]{1,8})_([[:alpha:]]{4,6}) ([[:digit:]]{1,2})\n.*#define ([[:alpha:]]{1,8})_([[:alpha:]]{4,6}) ([[:digit:]]{1,2}).*static char ([[:alpha:]]{1,8})_bits\[\] *= *.*[, \n0x[:xdigit:]]+\};', regex.DOTALL)
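As a side note, the standard re module can also return every match through re.finditer, so collecting all the #define lines does not strictly need a third-party library. The sketch below is an alternative for comparison, not the approach taken here. The helper name collect_defines and the second image, demo, are invented for the example, and explicit character classes are used because re does not understand POSIX classes such as [[:alpha:]].

```python
import re
from collections import defaultdict

# Illustrative input only: the bwk header from the file, plus a made-up
# second image called "demo" to show several images in one text.
SAMPLE = """\
#define bwk_width 48
#define bwk_height 48
static char bwk_bits[] = {
0x00, 0x00, 0xc0, 0x3f, 0x00, 0x00};
#define demo_width 16
#define demo_height 8
static char demo_bits[] = {
0x01, 0x02};
"""

# One pattern for both kinds of #define line; named groups keep it readable.
DEFINE = re.compile(
    r'^#define\s+(?P<name>\w+)_(?P<metric>width|height)\s+(?P<value>\d+)',
    re.MULTILINE)


def collect_defines(text):
    """Map each image name to its {'width': ..., 'height': ...} values."""
    images = defaultdict(dict)
    for match in DEFINE.finditer(text):  # finditer yields every match in turn
        images[match['name']][match['metric']] = int(match['value'])
    return dict(images)


print(collect_defines(SAMPLE))
```

Grouping by the name prefix is what lets the right width and height be paired with the right data when a file holds more than one image.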

pixel data