Bush Hid the Facts

In early 2004, it appeared that Microsoft had a problem. By this time the United States-led coalition was almost a year into its invasion of Iraq, and people were starting to question the legitimacy of the war. What was being covered up? Who was lying? Was George W. Bush lying?

Iraq being bombed by the coalition

Now this wasn’t Microsoft’s problem per se; they likely had very little to do with the whole thing. But they did have a problem connected to it.

At some point, after seeing a blog post online, someone went to their Windows Accessories folder, opened up Notepad, typed “Bush hid the facts” and saved it. We’re not sure exactly why they did this, or who this person was. But when they re-opened the file, the text had gone, replaced with garbled characters that couldn’t be displayed on screen.

Bush hid the facts, replaced by a string of blocks

Quickly this popped up on Usenet groups, email and internet forums.

Hey all try this:
>>>
>>>For those of you using Windows, do the following:
>>>
>>>1.) Open an empty notepad file
>>>2.) Type “Bush hid the facts” (without the quotes)
>>>3.) Save it as whatever you want.
>>>4.) Close it, and re-open it.
>>>
>>>Real strange huh.

For all of you using Windows, do this
1) Open an empty notepad file
2) Type Bush Hid the Facts"
3) Save it
4) Close it, re-open it

People rapidly started reporting that they had the same results, and this apparent conspiracy spread like wildfire.

Not long after, someone reported that if you typed one of the flight numbers from 9/11 using the Wingdings font, you would get this. Similar instructions were given.

The Wingdings symbols depicting a plane, two towers and a skull and crossbones

Open  Notepad / WordPad or MS Word, type in that flight number i.e.Q33N
> > (Qand N Caps only)
> > * Increase the font size to 72
> > * Change the font to Wingdings
> > ……. U will be amazed by the findings!!!…………………..
> > !!!!!!!!!!!!!!!!!!! IS IT COINCIDENCE.!!!!!!!!!!!!!!!!!!!OR WHAT???

A Usenet post describing the behaviour and how to replicate it

Seemingly this was another Microsoft-derived message, one that must have held deeper meaning and that could be reproduced within Notepad.

What the hell was going on?

Did someone at Microsoft know something we didn’t? Did someone at Microsoft hold a grudge against Bush? Or was something weirder going on?

Well, it turns out that it was actually something a lot weirder.

It was actually David Cumps who, on 27th February 2004, originally reported this issue on his self-titled blog:

“Someone showed me a weird text file today. It was a bat file with ‘copy MeYou.bak MeYou.txt’. When you would ran it, it would work. But when you opened it in Notepad, there was nothing.

So we decided to look a bit into this and here is something we came up with to ‘create’ invisible text:

Open notepad and enter:
‘ abc.bak abc.txt’

(That is: space abc dot bak space abc dot txt, no line break, without the quotes)

It doesn’t work with every string, just follow us on this example and use that one.

Save your file. Notepad picks default ANSI as encoding.

Open your file, Notepad seems to open by default in Unicode encoding.

Your text is now invisible.”

A notepad file showing " abc.bak abc.txt"

As you’ll notice, this error doesn’t have anything to do with the text string “Bush hid the facts”. Instead the text is “ abc.bak abc.txt”, and the error was actually stumbled upon by accident. I fired him a message to see what he recalled, but it was a fairly inconsequential, run-of-the-mill batch scripting problem that cropped up just before he became an intern for Microsoft.

It was the fallout that was more significant.

The IsTextUnicode Problem

You see, the error is down to a fundamental function of Windows called IsTextUnicode that Notepad uses to determine the encoding of text files.

Every file stored on your hard drive is just a collection of bytes, each byte being 8 bits in length. Each of those bits can be on or off; a one or a zero. In the olden days we just had plain text, otherwise known as ASCII, which only defined 128 characters. This was extended by ANSI (Windows’ name for its 8-bit code pages, which are closely related to ISO/IEC 8859), and each character still took up exactly one byte. 8 bits in each byte gives the potential for 256 permutations, and so ANSI had 256 characters.

Some binary and corresponding hex codes

So, if this were the only encoding format, Notepad would look at a file, see the hex code for each byte, and translate it to a character on screen, as per ANSI rules.

But, by the time Windows was knocking about, there was a need for more than just ANSI. There are many languages around the world, and a lot of them have their own character sets. So, other encoding standards were developed.

ANSI and ASCII character sets

One of those was Unicode. The development of Unicode dates back to 1987, when Xerox employee Joe Becker, along with Apple employees Lee Collins and Mark Davis, started looking into a universal character set. Various other members would join the group, including from Microsoft, and on the 3rd January 1991 the Unicode Consortium was incorporated, with the first volume of the Unicode Standard following later that year.

Unicode 1.0.1 document

But, for Unicode to function, it needed more characters, and so 16 bits, or two bytes, were now used for each character. This gives 65,536 possible permutations, enough to include most languages. This encoding was known as UCS-2, and has since been superseded by the closely related UTF-16. But just like Microsoft does, I’ll refer to it as Unicode here.

Hex code from the notepad file
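To make the one byte versus two bytes point concrete, here’s a minimal Python sketch. It assumes “ANSI” means the Windows-1252 code page (`cp1252` in Python), which is what a Western European or US copy of Windows would typically use; other locales use different code pages.

```python
text = "Bush hid the facts"

# Assumption: "ANSI" is Windows-1252, one byte per character.
ansi = text.encode("cp1252")

# What Notepad calls "Unicode": 16-bit little-endian code units,
# so two bytes for every character in this sentence.
utf16 = text.encode("utf-16-le")

print(2 ** 8, 2 ** 16)        # permutations available in 8 and 16 bits
print(len(ansi), len(utf16))  # 18 bytes versus 36 bytes
print(ansi.hex(" "))          # the ANSI bytes as hex
print(utf16.hex(" "))         # the same letters, each followed by a 00 byte
```

For plain English text the second byte of every pair is simply zero, which is why the two files look so similar in a hex editor.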

So, say we open up Notepad, type “ abc.bak abc.txt”, save the file and then re-open it. Notepad now has to determine how the file is encoded, and display the appropriate characters on screen. If it assumes ANSI, reading each byte as a character, then we’ll get our text file as normal. However, if it decides the file is Unicode, and reads every two bytes as one character, then we’ll get either invisible text or a bunch of squares. This is because, on this computer at least, Notepad is trying to display characters from a character set that isn’t installed.

Splitting the hex codes into 2 bytes worth of data

This has a name: Mojibake. Its definition is the gibberish resulting from text being decoded using an unintended encoding method, and it happens more than you may realise. From websites to software, Mojibake is pretty common.
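You can reproduce this kind of misreading outside of Notepad. The sketch below is an illustration rather than what Notepad does internally: it assumes the file was saved as Windows-1252 (`cp1252`) and then decodes the very same bytes as little-endian UTF-16.

```python
# Assumption: the file was saved as Windows-1252 ("ANSI"), one byte per letter.
saved = " abc.bak abc.txt".encode("cp1252")

# Reading it back the way it was written gives the original text.
as_ansi = saved.decode("cp1252")

# Reading the same bytes as UTF-16 little endian glues each pair of
# bytes together into a single 16-bit character instead.
as_unicode = saved.decode("utf-16-le")

print(repr(as_ansi))
print(len(saved), "bytes ->", len(as_unicode), "characters")
print([f"U+{ord(ch):04X}" for ch in as_unicode])
```

Sixteen bytes become eight characters, and none of them are anything you typed. Whether you then see squares or nothing at all comes down to which fonts are installed.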

"The Gibberish resulting from text being decoded using an unintended encoding method"

So, if we go back, create the file and save it again, but this time tell Notepad to save it as Unicode, then open it again, there’s no problem. This is because we’re making Notepad specify the encoding using a BOM, or Byte Order Mark, which is essentially a marker at the start of the file to tell a parser what encoding format is being used. But text files don’t always have this marker, and so Notepad then falls back on the function IsTextUnicode to attempt to guess the encoding.
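As a rough illustration of what checking for a BOM looks like, here’s a small Python sketch. The helper name `encoding_from_bom` is made up for this example, it only knows about three common markers, and it isn’t how Notepad itself is written.

```python
import codecs

def encoding_from_bom(data: bytes):
    """Return an encoding name if data starts with a known BOM, else None."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None  # no marker, so the reader is left to guess

text = "Bush hid the facts"
with_bom = codecs.BOM_UTF16_LE + text.encode("utf-16-le")  # saved as "Unicode"
without_bom = text.encode("cp1252")                         # saved as "ANSI"

print(with_bom[:2].hex(" "))            # the two marker bytes
print(encoding_from_bom(with_bom))      # marker found
print(encoding_from_bom(without_bom))   # nothing to go on
```

It’s that last case, no marker at all, where the guessing starts.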

IsTextUnicode is a Win32 function that has been around since Windows NT 3.51. It was then passed down the NT lineage, including Windows 2000 and XP. The function behaves differently on the hybrid 16/32-bit versions of Windows, and so this behaviour won’t replicate on those.

If we take a look at the Notepad source code, we can see the point at which IsTextUnicode is called, in an attempt to ascertain whether our file is ANSI or Unicode.

Source code calling IsTextUnicode

IsTextUnicode will take a string of text, typically the first 256 bytes of a file, perform some statistical analysis, and then return whether it thinks the file is Unicode or not. In this instance, for whatever reason, it decides that our file is in fact Unicode and thus, Notepad presents it as such, resulting in what we see, or don’t see, depending on your choice of font.

On 24th March 2004, Microsoft developer Raymond Chen would publish a blog post entitled “Some files come up Strange in Notepad” in response to David Cumps’ post, where he explains the various encodings.

Blog post from Raymond Chen

Then on 30th January 2005, lead Microsoft developer Michael S. Kaplan posted “Why I don’t like the IsTextUnicode API”, detailing some of its pitfalls, and pointing out that this function had been written some 10 years prior by someone outside of the NLS (National Language Support) team, when there wasn’t as much Unicode awareness or acceptance. Nevertheless, it does sometimes assume Unicode when it’s not Unicode.

Blog post from Michael S. Kaplan

On 18th May 2006, someone going by the username Zoomba wrote on the WinCustomize forum about how you can type the sentence “this app can break” into Notepad, save it, reopen it, and voila. We have the same problem.

"How to break Windows Notepad" blog post

Kaplan responded to this with his own blog post, identifying that, in this instance, Notepad is actually trying to display a bunch of CJK ideographs. It seems then, that IsTextUnicode thinks that it’s more likely that these characters fit together, than the original ANSI characters. Which isn’t as outlandish as you may think.

Michael Kaplan's post about CJK ideographs
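You can check Kaplan’s observation for yourself. This sketch again assumes the file was saved as Windows-1252, misreads it as little-endian UTF-16, and asks Python’s `unicodedata` module what each resulting character is called.

```python
import unicodedata

# Assumption: saved as Windows-1252, then misread as UTF-16 little endian.
misread = "this app can break".encode("cp1252").decode("utf-16-le")

for ch in misread:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))

# The main CJK Unified Ideographs block runs from U+4E00 to U+9FFF.
all_cjk = all(0x4E00 <= ord(ch) <= 0x9FFF for ch in misread)
print(all_cjk)
```

Lowercase letters sit between 0x61 and 0x7A, so any pair with a letter as its second byte lands between U+6100 and U+7AFF, squarely inside that block.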

Now, I actually dug out the source code, although it’s also explained well by FlyTech Videos. The algorithm it employs looks at the two bytes which would make up each Unicode character. It tracks how much the first byte of each pair changes from one pair to the next, and does the same for the second byte. Let’s take the first few bytes of “this app can break” and convert them from hex into decimal to make it easier. So 74 becomes 116, 69 becomes 105 and 20 becomes 32, down the left hand column. On the right, 68 becomes 104, 73 becomes 115 and 61 becomes 97.

Source code for IsTextUnicode function

Ok, so we start with 116 on the left, and calculate the difference between it and the next value, 105, which is 11. The difference between 105 and 32 is 73. Now we do the same for the right hand column.

Splitting the bytes into values

It then adds these differences together for each column, and uses the final numbers to perform a calculation. That is, if the left hand sum > right hand sum*3 then it’s Unicode. This kind of makes sense, as in genuine Unicode text the left hand, or low, byte will typically vary a lot more from one character to the next than the right hand, or high, byte (assuming Little Endian formatting, otherwise it’s reversed, but forget about that), so even if you multiply the right hand sum by 3, it will still typically be less. BUT, it’s very much not without its flaws, as we can see, and we can easily force this algorithm to guess incorrectly every time, simply by limiting the amount of deviance (and therefore numerical difference) of every second character, which the algorithm will see as the right hand Unicode byte.

Low Order Bytes vs High Order Bytes
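Here’s the comparison described above as a short Python sketch. To be clear, this is not the real IsTextUnicode, which has other checks and arguments. It only models the “left hand sum > right hand sum*3” test as I’ve described it, and the function names are my own.

```python
def byte_variation(values: bytes) -> int:
    """Sum of the absolute differences between consecutive byte values."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def looks_like_unicode(data: bytes) -> bool:
    """Simplified model of the statistical test, NOT the real IsTextUnicode."""
    left = data[0::2]   # first byte of each pair (the low byte)
    right = data[1::2]  # second byte of each pair (the high byte)
    return byte_variation(left) > 3 * byte_variation(right)

for text in ("this app can break", "Bush hid the facts", "hello world!"):
    data = text.encode("cp1252")  # assumption: "ANSI" is Windows-1252
    left = byte_variation(data[0::2])
    right = byte_variation(data[1::2])
    print(f"{text!r}: left={left} right={right} -> {looks_like_unicode(data)}")
```

For “this app can break” the left hand sum comes out at 473 against 89 on the right, and for “Bush hid the facts” it’s 506 against 68, so both get flagged as Unicode. The reason is the spaces: in both sentences every space lands in the left hand column, so that column jumps back and forth between 32 and a letter, while the right hand column only ever contains letters that sit close together. In “hello world!” the space lands on the right instead, and the test comes back false.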

Now, you’ll probably have noticed that “this app can break” follows the same pattern as “Bush hid the facts”: a four letter word, two three letter words and a five letter word. It just so happens that this combination also triggers IsTextUnicode to return a TRUE value when asked. I mean, that’s simplifying the function’s input and output slightly, as it has various arguments, but it suffices for explaining this story.

If we add an extra character onto the end of this 18 character sentence, then Notepad will no longer make the mistake. That’s because there is now an extra byte. 19 bytes doesn’t fit into the Unicode bracket, as its bytes come in pairs, so IsTextUnicode returns FALSE and Notepad correctly presumes this is an ANSI file.
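The odd length point is easy to demonstrate too. This sketch doesn’t call the Windows function; it just shows that 19 bytes can’t be split into 16-bit pairs, using Python’s own UTF-16 decoder as a stand-in.

```python
# Assumption: "ANSI" is Windows-1252. One extra character gives 19 bytes.
data = "Bush hid the facts!".encode("cp1252")

fits_in_pairs = len(data) % 2 == 0
print(len(data), fits_in_pairs)

try:
    data.decode("utf-16-le")
    decoded_cleanly = True
except UnicodeDecodeError as err:
    decoded_cleanly = False
    print("not valid UTF-16:", err.reason)
```

With one byte left over there’s simply no way to read the file as a whole number of two byte characters.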

But, given what was in the news at the time, all the time, it didn’t take long for someone to come up with Bush hid the facts, and start spreading rumours of this weird apparent conspiracy.

Of course, the vast majority of people on the internet, even then, weren’t developers with a specific knowledge of the IsTextUnicode function, and therefore, upon seeing such strange behaviour, their instant reaction was to conclude some kind of weird foul play, and then, pass the message on, which led to the “Bush Hid the Facts” conspiracy spreading around forums and groups for years and years.

"bush hid the facts" in notepad

Usenet groups alone were brimming with the stuff.[1] But it also did the rounds in email, on forums, and it even made it into the press. Because of course it did.

a string of blocks in notepad

Hoaxes and conspiracies aside, this clearly wasn’t useful behaviour for a Notepad application to have. If you save a file, you kinda want to be able to read it again… and so, by April 2008 Microsoft decided to do something about it.

A post about how to replicate "Bush hid the facts" problem

This came in the form of Windows Vista SP1, where the IsTextUnicode calls from Notepad were bolstered with additional checks, and just like that, Bush apparently no longer hid the facts. I mean Saddam Hussein had been plucked from his hole by now anyway. The only problem of course is that it was Windows Vista. So most people just stuck with XP and Bush hiding the facts, and the conspiracy kept spreading. That is until enough posts popped up explaining the reality that it just kind of fizzled out. The website Hoax-Slayer being one of the first.[2]

Hoax-Slayer post about Bush Hid the Facts

The Wingdings Issue

But, that doesn’t explain the Wingdings issue, which was often shared in the same posts as the “Bush hid the facts” conspiracy. Now that, is a different deal entirely, and honestly, even more of a piss take.

After September 11th, an email began circulating claiming that “Q33 N” or “Q33 NY” was the flight number of the first plane to hit the Twin Towers. If you type this into Notepad, or anywhere else, using the Wingdings font, then it would generate a plane, two filing cabinets, a space, a skull and crossbones and, with the added Y, the Star of David.

A post about replicating the Q33N Wingdings problem

Now this was supposed to represent, apparently, the strike on 9/11. However, it’s driven off the back of a conspiracy theory from some ten years earlier, after the release of Windows 3.1, which came pre-loaded with the Wingdings font.[3]

The result of typing NYC in Wingdings

It was The New York Post who kicked off the controversy with the 1992 front page headline “PROGRAM OF HATE… Millions of Computers Carry Secret Message That Urges Death to Jews in New York City”. *sigh* A headline that was clearly going to sell copies of the rag, and it was all because typing NYC using Wingdings would result in this….

New York Post headline about Wingdings

The consultant was testing a mailing-address use of the program when he noticed the letters “NYC” had been replaced by a hateful message – a skull and crossbones, the Star of David and an approving thumbs-up symbol.

Microsoft strongly denies any hidden message. Others disagree.

“There’s no way it could be a random coincidence,” said Brian Young, a friend of the consultant, who does not wish to be named.

Apparently Brian Young, the friend of the anonymous consultant who discovered the issue, calculated that the odds of the three letters of the alphabet being combined with 255 symbols in this way were less than one in a trillion. Less than one in a trillion, hey Brian? Good work.

A newspaper story about "Hidden Images Offer a Window on Techno Graffiti"

This purported anti-Semitic message, apparently referencing New York’s large Jewish community, was nothing more than the random placement of glyphs against characters. However, that didn’t stop The Anti-Defamation League from sending a letter of complaint to Microsoft, or stop the story from bouncing around in the news for years, even after The League and other groups concluded there was no malicious intent by Bill’s software company.

NUN, IBM and MILLENNIUM in Wingdings

Microsoft’s Brad Silverberg commented “The Wingdings were like Rorschach blots, it probably said more about the person than the symbols”. Given you could extract meaning from pretty much any word you typed, it’s a valid statement to make: NUN is two poison signs with a cross; IBM is an open hand, an OK sign and a bomb; and MILLENNIUM, which would pop up in 1999, is this.

Brad Silverberg next to his quote
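A symbol font is really just a lookup table from character to picture, which is why any word produces “a message”. Here’s a toy Python version using only the letters mentioned in this story. The descriptions are my own labels rather than official glyph names, and a real font file obviously stores drawn shapes, not strings.

```python
# Character -> glyph description, limited to the ones mentioned in this story.
# These labels are informal descriptions, not official glyph names.
WINGDINGS = {
    "Q": "airplane",
    "3": "filing cabinet",
    "N": "skull and crossbones",
    "Y": "Star of David",
    "C": "thumbs-up",
    "U": "cross",
    "I": "open hand",
    "B": "OK sign",
    "M": "bomb",
    "T": "snowflake",
}

def in_wingdings(text: str) -> list:
    """Swap each character for its glyph description, if we know it."""
    return [WINGDINGS.get(ch, ch) for ch in text]

for word in ("Q33N", "NYC", "NUN", "IBM", "QUIT"):
    print(word, "->", in_wingdings(word))
```

Type enough words into a table like this and some of them are bound to look meaningful.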

A rebuke by PC Computing Magazine columnist, Penn Jillette, quickly did the rounds, correcting the ridiculous claims and odds and penning various tongue in cheek comments, including, among other things:

There is so much hate, paranoia, and bad math in this whole thing that I quit. (“QUIT,” incidently comes out;

[airplane] [cross] [hand held up] [snowflake]

This must mean “A plane carrying christians must be stopped if it’s snowing.” The odds of obtaining this message by chance are one in 4,228,250,625

The rebuke post from Penn Jillette
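Penn’s figure is worth a quick sanity check. Assuming the sum is simply 255 possible symbols for each letter typed (my reading of the numbers, not something either source spells out), four letters gives exactly his total, and three letters falls a long way short of a trillion.

```python
SYMBOLS = 255  # the symbol count quoted in the newspaper story

three_letters = SYMBOLS ** 3  # combinations for a three letter word like NYC
four_letters = SYMBOLS ** 4   # combinations for a four letter word like QUIT

print(f"{three_letters:,}")       # 16,581,375
print(f"{four_letters:,}")        # 4,228,250,625
print(three_letters < 10 ** 12)   # True
```

So on that basis the odds for NYC are about one in 16.6 million, not one in a trillion.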

In an attempt to smooth things over, Microsoft would roll out their Unicode compatible Webdings font a few years later, in 1997, using this now deliberate combination for NYC.[4] Something that only brought out the hoaxers once again. HOW CAN THIS BE A COINCIDENCE NOW.

NYC in Webdings

So, as you’d expect after 9/11, this rumour cropped up again and Microsoft felt obliged to issue a further statement:

“We can certainly understand how people would respond with some shock to this apparent issue. We did too when it first came up nine years ago and we investigated it thoroughly in partnership with the Anti-Defamation League. The conclusion was that the sequence in the Wingdings character set is coincidental and that there was no malicious intent. In fact, it impacted several software companies at the time and continues to do so. Unfortunately, there was not an easy way to fix the problem. We understand that this requires explanation.

At the simplest level, wingdings and webdings are much like an alphabet of characters and provide thousands of potential combinations from which a person could choose. Changing the character set would create an impact of unknown scale on existing data and code using the affected font. Again, using the example of the alphabet, what would happen to existing documents and applications if we switched around a handful of letters? The likely result is that we would create significant issues for people, cause some unintended humorous moments and several offensive ones. For that reason Wingdings has been left unaltered since its inception.”[4]

"Frauds and Hoaxsters thrive in a time of tragedy" news headline

Even so, they shouldn’t have needed to. If people had only looked at the facts, rather than being reactive knobends, they’d have found that the claim that Q33NY was the flight or tail number of either of the planes that hit the towers was completely fabricated. It was a hoax, designed to spread a new conspiracy, a new drama to propagate the web. Not that people ever let factual inaccuracy get in their way.

Combine it with the very real bug of “Bush hid the facts” and you had a rather sinister story pointed against Microsoft, and specifically Bill Gates… Because if you’re gonna throw a conspiracy out there, he is apparently the man.

Thankfully by the end of the decade things had calmed down. Most people now had a version of Windows that didn’t replicate the error, email was no longer the primary method of spreading information, and there was now social media and a whole spate of brand new conspiracies to keep people occupied instead.

Hurrah.

Fool me once, you can’t fool me again.

George Bush doing his infamous "fool me once" quote

Until next time, I’ve been Nostalgia Nerd.

Toodleoo.

  1. groups.google.com/g/sadgoshthi/c/CLdPdVNNKyU/m/oPDm5yEwKiYJ
  2. web.archive.org/web/20100315222317/http://www.hoax-slayer.com/bush-hid-the-facts-notepad.html
  3. www.vox.com/2015/8/25/9200801/wingdings-font-history
  4. web.archive.org/web/20140501070552/http://archive.wired.com/techbiz/media/news/2001/09/47042
