December 2008 Archives

♻ I just heard about that

| No TrackBacks

This happens to me a lot—I’m glad it’s not just me. I felt the same way after learning about floaters (and also felt relieved that my eyes weren’t permanently scarred).

[via the always awesome Trivium]


Somewhat related: Littlewood’s law codifies what I’ve said before: “People should be amazed more coincidences don’t happen to them. We just don’t understand the cardinality of the sets of ‘all possible coincidences’ and ‘all events perceived.’”

♻ This is Howie do it

| 1 TrackBack

This show looks terrible (and its promos infuriate me), but this Variety article has to have one of the worst titles I’ve ever seen: NBC asks Howie Mandel to ‘Do It’.

Clinton v. Franklin

| No TrackBacks

I asked myself the other day, “Are any city names so popular they’re in all 50 states?” With a little research and some UNIX magic (sort, uniq, regular expressions, etc.) I discovered that I was pretty wrong; there are no city names across all 50 states. However, the uniqueness of names makes for a nice distribution:

How to read this: There are 13,584 city names that are in only one state, unique in the nation. Similarly, 1,638 city names are in two states, and so on. The y-axis is logarithmic.

At the far right, we see that there are two city names in 27 states(!), but none in all 50. The two winners? Clinton and Franklin.
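For anyone who would rather not chain sort and uniq, here is a minimal Python sketch of the same counting idea. It is not the pipeline I actually ran; the input format (a list of (city, state) pairs) and the function name are assumptions made for illustration.

```python
from collections import Counter


def states_per_name_distribution(places):
    """Map "number of states a name appears in" -> "how many names do that".

    `places` is an iterable of (city_name, state) pairs; duplicates are fine.
    """
    states_by_name = {}
    for city, state in places:
        # A set keeps each state once, even if a name repeats within a state.
        states_by_name.setdefault(city, set()).add(state)
    return Counter(len(states) for states in states_by_name.values())


# Tiny made-up sample, not the real data set.
sample = [
    ("Clinton", "NY"),
    ("Clinton", "IA"),
    ("Clinton", "NY"),
    ("Franklin", "TN"),
    ("Watsonville", "CA"),
]
distribution = states_per_name_distribution(sample)
```

In this sample, two names appear in exactly one state and one name appears in two, which is the same shape of histogram as the chart above.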

♻ Lambert v. Hartmann v. Sanity

| No TrackBacks

After reading the opinion, petition for writ of cert, and brief in opposition of the writ of cert, it was refreshing to see this in the reply to the brief in opposition of the writ of cert (phew):

They call it a “Brief in Opposition,” but they do not actually “oppose” any argument in the Petition. … A more apt label is: “Brief in Obfuscation.”

The more these go on, the more aggressive and less passive these become…

[via]

♻ Fresh from California

| No TrackBacks

For Christmas, one of our family’s contributions to the East Coast was a four-pack of Martinelli’s Sparkling Cider. Not only is it from my mom’s hometown of Watsonville, but it’s also delicious. I do not hesitate in saying that Martinelli’s is Watsonville’s best brand (sorry, Driscoll’s).

The latest bottles we got looked different, though. Upon reading the back, I found that these were modeled to look more like the original champagne-style bottles. I think it’s great, and I’m about to play my own version of Brand New.

Before:

After:

I like the new font weight and color—they really classed up the faux champagne. Someone also did a pretty great job cataloging all their previous labels.

I’ve always been a fan of Martinelli’s style, sometimes sporting this sweet shirt I got when I was young that describes their apples (in great detail):

♻ Those bugs got fixed

| No TrackBacks

It’s bad when the CEO finds bugs in your product, right?

After a question on Gates’s programming, discussing a web browser mod he made:

That’s just actually a nice little VB application that one other guy and I hacked together. So I play around with that quite a bit. And it kind of gives me a sense that there are still some things that are fairly hard to program. I even found one or two bugs in the process.

♻ C'est la vie

| 1 TrackBack

Ha, Shorpy didn’t publish my comment (it’s approval-based) on how they photoshopped out the watermark from this photo. It’s totally cool to modify LOC photographs, but this is pretty dick of them.


Editor’s note: Mac OS X’s spellchecker thinks “photoshopped” is a valid word, so take that, Adobe.

♻ Pandering

| 1 Comment | No TrackBacks

Finally got around to reading some more Supreme Court FCC shit: ABC is fighting back against the Commission’s fine for airing an episode of “NYPD Blue” that began with a woman undressing to take a shower, showing her “buttocks” to the screen.

The details are (over-)analyzed in the briefs, along with a good history of other precedent-setting indecency cases. There are a few choice quotes I gleaned from the briefs, though.

ABC’s opening brief:

But buttocks are not a sexual or excretory organ, or indeed an organ at all.

The FCC’s response:

This is nonsense.

[via]


Another great touch: The “anti-virus certification” appended to the PDFs.

I was curious as to how rotary dial phones affected the distribution of area codes over time; would the speed at which an area code on a vintage phone really matter to Ma Bell? With a little research, I was able to make this chart:

On the y-axis, we have the time it takes to dial the area code on a rotary phone (the number of pulses sent)—this would be the sum of all the digits in the area code, plus ten for each zero. Area codes were introduced in 1947, and we definitely see consciousness regarding dialing time. New York receives the fastest area code, of course. (212 is the fastest because area codes cannot begin with 1, nor can they end in “11” due to their reserved status—411, 911, etc. At the time, the second digit had to be 0 or 1, precluding the other five-pulse area code of 221 from existing. To this day, it is not in use.) The other original area codes try hard to be quick to dial.
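The pulse arithmetic is easy to check in code. This is a small sketch, not the script behind the chart: the function names are made up, and the candidate rules encode only what is stated above (second digit 0 or 1, no trailing “11”), plus the assumption that the first digit runs from 2 to 9.

```python
def dial_pulses(area_code):
    """Pulses needed to dial a code on a rotary phone: each digit counts as
    itself, except 0, which sends ten pulses."""
    return sum(10 if digit == "0" else int(digit) for digit in area_code)


def original_format_codes():
    """Codes allowed by the rules described in this post (an assumption,
    not the full numbering plan)."""
    codes = []
    for first in "23456789":
        for second in "01":
            for third in "0123456789":
                code = first + second + third
                if not code.endswith("11"):
                    codes.append(code)
    return codes


fastest = min(original_format_codes(), key=dial_pulses)
```

Under those rules, `fastest` comes out as 212 with five pulses, and `dial_pulses("900")` gives the 29 mentioned in the trivia below.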

As telephones become more popular, we see new area codes being created for different locales. It’s not until the late ’80s that we see real splitting of area codes with the rise of fax machines, and in the ’90s with mobile phones.

With the number of phones increasing to one per person with mobile phones, we quickly saturate our existing area codes and an area code explosion emerges. Fortunately, touch tone phones had become more popular in the ’70s, letting us not care at all about the number of pulses it took to dial. These new area codes are fairly normally distributed.

Trivia I can’t leave out

The “worst” “area code” issued, with a dial time of 29, is 900, assigned to—ahem—“premium services.” The runners-up? San Bernardino, CA’s 909 created in 1992 and 1966’s 800 tie at 28.

The lowest numerical area code in use is 201 (\d00s are reserved). Hackensack, NJ is the recipient of this prize—probably because AT&T was headquartered there.


P.S.: I managed to do this whole investigation without listening to Ludacris once.

♻ Archetypes

| No TrackBacks

I’m surprised I hadn’t realized this earlier, but “The Onion” is using different “man on the street” pictures than in the normal section of the paper.

♻ Hated

| 1 Comment | 2 TrackBacks

Just read Roger Ebert’s latest post, composed of quotes from reviews of his most despised movies. It read like a condensed version of “I Hated, Hated, Hated This Movie”. Fantastic.

Alright, so here’s why I was poking around Google’s DOM: I was writing a little userscript that lets me know when a link is being prefetched. Prefetching is a way to signal a browser to download something in the background.

Google is the only site I know that uses prefetching actively; if Google is pretty sure a result is going to be clicked on, it will put a <link rel="prefetch" href="http://example.com/"> in the results page. My Greasemonkey script makes those prefetched links visible by dashing the link instead of leaving it solid.
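The detection half of that idea is simple enough to sketch outside the browser. The snippet below is not the Greasemonkey script itself (that one is JavaScript and restyles the links); it is a rough Python illustration, with made-up class and function names, of scanning a page for `<link rel="prefetch">` elements using the standard library’s `html.parser`.

```python
from html.parser import HTMLParser


class PrefetchFinder(HTMLParser):
    """Collects the href of every <link> whose rel includes "prefetch"."""

    def __init__(self):
        super().__init__()
        self.prefetched = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "prefetch" in rel_tokens and attributes.get("href"):
            self.prefetched.append(attributes["href"])


def find_prefetched(html):
    finder = PrefetchFinder()
    finder.feed(html)
    return finder.prefetched


page = (
    '<html><head>'
    '<link rel="stylesheet" href="style.css">'
    '<link rel="prefetch" href="http://example.com/">'
    '</head><body></body></html>'
)
hits = find_prefetched(page)
```

The userscript then only has to find the result link whose URL matches one of these hrefs and change its underline style.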

Basically, it turns this:

Into this:

To use it, follow these steps:

  1. Get Greasemonkey for Firefox, GreaseKit for Safari, etc.
  2. Head over to Prefetch Notifier’s userscripts.org page and hit “Install.”

Seeing when Google is confident you’ll click on stuff is interesting (and the reason I wrote the script). Tell me if you ever see anything weird, e.g., multiple prefetched links, another site actively using prefetching, etc.

Leave any feedback or comments you have!


Update: It looks like Google sniffs the browser and doesn’t send back prefetching <link> elements to Safari, apparently because Safari doesn’t support them. IE doesn’t support prefetching either (of course), so it looks like this is a Firefox(/Opera) feature for now. These guys don’t seem to realize that the HTML changes depending on the browser…

♻ RSS even makes mail better

| 1 TrackBack

Looks like isnoop finally made his awesome RSS package tracker a standalone product called Boxoh. Watch my mom’s present get delivered to me!

Subscribing to these feeds in Google Reader is a great way to be kept in the loop—on one occasion, I saw a “Delivered” message appear while going through my feeds. Incredulous, I opened my dorm room door only to see a package waiting for me.

♻ foo test test2

| 1 Comment | No TrackBacks

No, this is not a test post.

I searched for “foo” on Google as a test search (I wanted to inspect the DOM for something). To my surprise, the first result was… a test advertisement! Seems like even Google isn’t immune to some hacks from time to time:

As I look at the same search page now, the large horizontal ad is gone, but it now shows up in the “skyscraper” on the right. Perhaps I caught it at the right time? Particularly bad was that the “ad” link didn’t even point to a valid page; at least test.com could have gotten some hits!

♻ Alive

| 1 Comment | No TrackBacks

Don’t tell me I’m crazy for thinking about Daft Punk when I saw this picture from the Greek riots.

The Browser Security Handbook reads like half history lesson, half exposé on browsers’ terrible secrets …with charts to rival ppk!

♻ Audio Armistice

| No TrackBacks

Bloomers

| 2 Comments | No TrackBacks

Since I thought a Bloom filter might be to blame in Mac OS X’s spellchecker, I made a little Processing app to demonstrate how exactly a Bloom filter works.

First, a brief description of what Bloom filters are. They’re a data structure that relies on hash functions for representing a set. The actual structure is a bunch of bits and a few hash functions. To add an element to a Bloom filter, compute the hashes for each hash function, and set each bit high. To test if an element has been added to the Bloom filter, compute the hashes and return true if all the bits are already high. A quick overview of what this means:

Pros:

  • You don’t store any elements in the Bloom filter, just the bits their hashes point to. Very compact.
  • Lookup and adding to the set are extremely fast.
  • Union and intersection of Bloom filter sets are just bitwise operators.

Cons:

  • False positives are possible. (Since we don’t store the actual elements, we might see all the hashed bits as high when it was actually one or more previously added elements that made them high.)
  • There’s no way to remove an element from the set. (We don’t know if another element has hashed to the bits we want to set low.)
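To make the add and test operations concrete, here is a minimal Python sketch of a Bloom filter. It is not the applet’s Processing source: the class name is made up, and deriving the hash functions from salted SHA-256 digests is an assumption chosen for brevity. The defaults (625 bits, 10 hashes) mirror the applet on this page.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=625, num_hashes=10):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        """Yield one bit index per hash function (salted SHA-256 here)."""
        data = item.encode("utf-8")
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(bytes([salt]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for position in self._positions(item):
            self.bits[position] = True

    def __contains__(self, item):
        # True means "probably added"; False means "definitely not added".
        return all(self.bits[position] for position in self._positions(item))

    def union(self, other):
        """Bitwise OR of two filters with the same size and hash count."""
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = [a or b for a, b in zip(self.bits, other.bits)]
        return merged


words = BloomFilter()
words.add("snowman")
found = "snowman" in words
```

Note that there is no `remove` method: clearing a bit might also erase evidence of some other element that hashed to it.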

Without further ado, here’s the app. (It’s a little rough around the edges, and, yes, it’s a Java applet—Processing.js has a few bugs and is slow as balls for this type of application, apparently.) This is a Bloom filter with 625 bits and 10 hash functions. (More implementation details below.) Have fun—ask questions or leave feedback in the comments!

Directions:

  1. Give the applet focus by clicking on the dots.
  2. Start typing into it and you’ll see the bits your string is hashing to in real time.
  3. Hit enter to add the string and you’ll see the high bits turn red. Repeat!
<object classid="clsid:8AD9C840-044E-11D1-B3E9-00805F499D93" 
        codebase="http://java.sun.com/update/1.5.0/jinstall-1_5_0_15-windows-i586.cab"
        width="500" height="550"
        standby="Loading Processing software..."  >

    <param name="code" value="bloom" />
    <param name="archive" value="/files/processing/bloom.jar" />

    <param name="mayscript" value="true" />
    <param name="scriptable" value="true" />

    <param name="image" value="loading.gif" />
    <param name="boxmessage" value="Loading Processing software..." />
    <param name="boxbgcolor" value="#FFFFFF" />

    <param name="test_string" value="inner" />

    <p>
        <strong>
            This browser does not have a Java Plug-in.
            <br />
            <a href="http://java.sun.com/products/plugin/downloads/index.html" title="Download Java Plug-in">
                Get the latest Java Plug-in here.
            </a>
        </strong>
    </p>

</object>

For those curious, to get my ten hash functions, I’m double hashing two string hash functions I found: djb2 and sdbm.
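A rough Python sketch of that trick follows. The djb2 and sdbm recurrences are the commonly published ones; truncating to 32 bits, the combining formula (h1 + i * h2) mod m, and the function names are my assumptions here rather than a transcription of the Processing source.

```python
def djb2(text):
    """djb2 string hash: hash = hash * 33 + byte, starting from 5381."""
    value = 5381
    for byte in text.encode("utf-8"):
        value = (value * 33 + byte) & 0xFFFFFFFF  # keep it to 32 bits
    return value


def sdbm(text):
    """sdbm string hash: hash = byte + (hash << 6) + (hash << 16) - hash."""
    value = 0
    for byte in text.encode("utf-8"):
        value = (byte + (value << 6) + (value << 16) - value) & 0xFFFFFFFF
    return value


def bit_positions(text, num_bits=625, num_hashes=10):
    """Derive num_hashes bit indexes from just two base hashes."""
    h1, h2 = djb2(text), sdbm(text)
    return [(h1 + i * h2) % num_bits for i in range(num_hashes)]


positions = bit_positions("snowman")
```

One caveat of this scheme: whenever the second hash is a multiple of the bit count, all ten indexes collapse onto the same bit (the empty string is the easy example, since sdbm of it is 0).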

For the even more curious, here’s the Processing source.

Update: Added more explicit directions.

♻ Jockin' My Style

| No TrackBacks

Well, I guess I’m not the first one to use Unicode to its fullest. Looks like people have been using recycling symbols to indicate reuse of ideas on Twitter for a while. But do they have a trash motif on their websites? Huh? What now?

[via “It’s Just a Bunch of Stuff That Happens”]

Not bloomy enough

| No TrackBacks

Apparently, I use Mac OS X’s system-wide spellchecker more than others (or misspell things more), since I’ve noticed a bunch of false positives it’s suggested to me and it doesn’t seem like others have written about it. Examples:

(Never mind the irony of not knowing how to spell “knowledgeable” and posting to the world my ability to make myself “embarrassed.”)

I know that many spellcheckers have been implemented with Bloom filters. They’re a really cool data structure that can test for membership in a set with the benefit that you don’t actually have to store the element in the Bloom filter (among other benefits). The main downside is that you aren’t guaranteed to be correct; there can be false positives (times when you say an element is in the set when it actually isn’t).

It seems like that’s what’s happening with Mac OS X here—too many false positives. My guess is that its suggestion algorithm looks something like this:

if word not in spelling_bloom_filter
    for each suggestion_word with edit distance = 1 from word
        if suggestion_word in spelling_bloom_filter
            add suggestion_word to suggestions

This would result in tons of words being thrown at the Bloom filter, which, if it is nearing capacity for its size and number of hash functions, could produce false positives relatively frequently. Does anyone have a better idea of what’s going on?
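Here is a runnable Python take on that guess. It is purely my speculation about the shape of the algorithm, not Apple’s code: the function names are made up, “edit distance 1” is assumed to cover deletions, transpositions, substitutions, and insertions over lowercase ASCII, and `dictionary` can be anything that supports `in` (a plain set here, a Bloom filter in the hypothesis).

```python
import string


def edits1(word, alphabet=string.ascii_lowercase):
    """Every string one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    ]
    replaces = [
        left + letter + right[1:]
        for left, right in splits
        if right
        for letter in alphabet
    ]
    inserts = [left + letter + right for left, right in splits for letter in alphabet]
    return set(deletes + transposes + replaces + inserts) - {word}


def suggest(word, dictionary):
    """Suggestions for a word the dictionary does not recognize."""
    if word in dictionary:
        return []
    return sorted(candidate for candidate in edits1(word) if candidate in dictionary)


known_words = {"the", "ten", "tea", "cat"}
suggestions = suggest("teh", known_words)
```

A seven-letter typo generates several hundred candidates, and each one is a fresh chance for a nearly full Bloom filter to say yes to a non-word.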

One way Apple could fix this is by slimming down the list of words to test. Perhaps use a smarter suggestion algorithm that takes into account the keyboard layout of the user? Ideas?


Update: Fixed a spelling mistake in my rant on spellcheckers. <sigh>

♻ Honest Services

| 1 TrackBack

I love the Explainer. When it comes to news minutiae, Slate always answers the questions I have in the back of my head. Like this time: Why won’t federal prosecutors in Illinois use Barack Obama’s name?

Matt Kraning also pointed out to me the (very) interesting language in the complaint against Governor Blagojevich and his Chief of Staff Harris. They are being charged with “conspiring with each other and with others to devise and participate in a scheme to defraud the State of Illinois and the people of the State of Illinois of the honest services of ROD BLAGOJEVICH and JOHN HARRIS” (emphasis in original).

♻ What? No shoutout?

| No TrackBacks

The unicode snowman does not make an appearance at isthisyourpaperonsingleservingsites.com.

♻ Stanford Residences RWC

| No TrackBacks

Redfin switches from Microsoft Virtual Earth to Google Maps. They used to say Virtual Earth’s bird’s eye views were important enough to the buyer that they would want to see them on the main search screen—now they’ve been relegated to a link opening in a new window (probably all they can do without paying Microsoft).

♻ Of course

| No TrackBacks

♻ Oh, Snap

| No TrackBacks

The trader I photographed on my trip to visit David in New York City was just featured on Sad Guys on Trading Floors.

♻ No Intelligence Allowed

| 1 TrackBack

Sweet. Another one of the emails I received got published on My Right-Wing Dad.

Just heard one of Don Knuth’s Computer Musings. It was pretty amazing to watch him lecture; he’s so brilliant, the words can’t come out fast enough (at the age of 70!). Knuth reminded me that CS is an exciting, dynamic field, and breakthroughs are happening all the time. Mehran Sahami sums it up well (with a shoutout to Knuth, no less):

Catch all of Don Knuth’s Stanford videos at SCPD.

♻ Theocacaoheads

| 1 TrackBack

The next Cocoaheads meeting looks interesting: Google Earth iPhone application development. This Thursday at 7PM in San Jose. Free. Hit me up if you want to come with me.

♻ G > Hot

| 1 Comment | No TrackBacks

New Google Chrome dev release fixes some issues with Hotmail—which raises the question: Who uses Google Chrome and Hotmail?

(User-Agent sniffing seems to be the (non-Google-related) bug on Microsoft’s side.)

“The Controller” just popped up on Netflix’s feed of new “Watch Instantly” titles. The description killed me:

After his wife is abducted, billionaire William Fence (Bob Rue) has eight hours to win an online video game called Liberation Force Earth if he wants to see his wife again, so he enlists the country’s top-ranked video game players for the cyber-rescue. This scattered group of ragtag gamers, communicating with each other via online headsets, must work together as a team if they want to save the day in this inventive technological adventure.

It has not gone unnoticed by the gaming community, apparently. (Make sure to check out the trailer—the tackle at the end is my favorite.) A watching party is most necessary.

The Snowman Saga

| 2 Comments | No TrackBacks

Seeing as it’s my most well-known “work,” I guess I’ll satisfy some percentage of web citizens and bring you the story behind the Unicode Snowman.

Seed

On October 6th, after getting a little… not sober with roommates, I started talking with Ben about how much I liked Daring Fireball’s ★. John Gruber uses the Unicode Black Star symbol (U+2605) to demarcate his link posts from his self-generated content. Inspired by his approach, I decided to take a look at the rest of the Miscellaneous Symbols to see if anything would be appropriate for this site—a trashcan.

In my search, I stumbled upon U+2603—the Unicode Snowman, a.k.a. ☃.
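For the curious, the code point and the HTML numeric reference used later in this post are the same number written two ways. A quick Python check, using only the standard library (the variable names are just for illustration):

```python
import html
import unicodedata

snowman = chr(0x2603)

# The decimal form of the code point is what goes inside &#...; in HTML.
numeric_reference = "&#{};".format(ord(snowman))
decoded = html.unescape(numeric_reference)

# Official Unicode character names for the three symbols in this story.
names = {
    hex(code_point): unicodedata.name(chr(code_point))
    for code_point in (0x2603, 0x2605, 0x267B)
}
```

So U+2603 is 9731 in decimal, which is where &#9731; comes from.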

Birth

I had to let people know about this symbol. Why is there a text solution to displaying an image? I didn’t know, but its existence had to be exploited. For ten dollars, I registered unicodesnowmanforyou.com with my existing DreamHost account. The domain is simple and friendly, while still retaining some nerd appeal.

The page was always intended to be a single character, but it proved surprisingly challenging to display this glyph to everybody. Some browsers (cough Internet Explorer) don’t play nice with Unicode font selection, and some operating systems (cough Windows) don’t come with fonts that support U+2603.

Predicament

The difficulty was thus: a bare HTML &#9731; (wrapped in a text-centering and embiggening div, of course) will display as a giant “character not found” box in Internet Explorer because it doesn’t know which font to use. (Other browsers just choose a font that contains the given character based on some internal preferences when not given a font directive in a style sheet.) Specifying Arial Unicode MS as the font for the site seems wrong for two reasons, though:

  1. Different fonts render the snowman very differently. It’s almost like a font free-for-all when it comes to snowman actualization. Limiting it to one snowman takes out some of the magic.
  2. Not even all Windows machines have Arial Unicode MS. Only those with Microsoft Office installed have it. Even Leopard ships with it, for fuck’s sake.

Solution

Luckily, IE supports a couple non-standard features that let the site work as normal. If anybody with IE goes to the site, a conditional comment downloads an Embedded OpenType file containing a (very) pared down version of Arial Unicode MS with only the Latin alphabet and, of course, U+2603. A font normally 22 megabytes in size shrinks by more than two orders of magnitude, to 80 kilobytes.

Web font embedding is a hot issue now, as Firefox just implemented it, and Safari is on its way. IE, on the other hand, has had the ability to read Embedded OpenType since IE 4! The only problem is that in order to make the file, one must use a terribly designed tool called WEFT, which doesn’t appear to have been updated since 1999, despite the latest version being released in 2003. It’s no wonder font embedding didn’t catch on.

Non-IE users will not receive the style tag since they won’t interpret the comment, so they are free to make the best choice about which font to use, i.e., which of their fonts has a snowman. It’s even valid HTML 4.01 Strict!

An attentive reader may note that this leaves users in the dark if they go to the Unicode Snowman on Windows, on a non-IE browser, and don’t have MS Office installed. Yes, it does. I feel like I can live without this segment of the population seeing this Unicode entity in all its brilliance.

Getting Seen

With the website built, and Google Analytics integrated, the next step was showing it off. I first showed it to my friend (and extraordinary former RCC, I might add) Brendan O’Connor, who posted it on Hacker News. It remained on the front page a good five minutes before somebody booted it for not being news. Great.

An IM to another friend got the ball rolling again. Reddit was the target, and it stuck. It was on the front page for pretty much all of the 8th, racking up 8,000+ views from all over the world. It was refreshing to see nerds in so many countries!

Later, it was linked by NotCot, The Triumph of Bullshit, and (most importantly) Waxy, among other blogs. I was happy to have made something people liked, if only for fifteen seconds.

Aftermath

What did we learn from all of this? Well, for starters, we now know why he exists. Also, some 30,000+ people have experienced the glory of Unicode, which I’ll mark as a learning experience (or something).

I got some neat email at snowman@unicodesnowmanforyou.com, to each of which I dutifully replied. Cabel from Panic even mirrored it at http://www.☃.net/!

As far as the original search for a mark for my site, I eventually found U+267B, Black Universal Recycling Symbol (♻), and I use it well! More explanation in my Hello, World.


Added Ben shortly after the reddit-ing:

“Isn’t it kind of depressing how much traffic Unicode Snowman is getting?” No, why would you say that? “Just how little effort it took you and how it’s perhaps the most popular thing you’ve ever created.”

Ay, there’s the rub.

I just saw the TV spot for the “electronic banking” variant of Monopoly along with “Times Square” listed as a property (and a heavily inflated price tag). Apparently, the new cashless variant is using the “Here and Now” edition of Monopoly, which asked the public, “What would the Monopoly board look like if it were designed today?” Times Square is the new Boardwalk.

Slightly irked, I went online to check to make sure nothing in the standard version of the game had changed. I was wrong. Apparently, shit changed in September. I don’t even like Monopoly that much (it’s not a very well-designed game), and I’m fine with the other changes, but changing the color of Baltic and Mediterranean from purple to brown doesn’t jibe with me. Those colors are iconic! Presumably, the change was made to avoid confusion between the shitty pair and the less shitty purples of Virginia, States, and St. Charles.


Also of note, the cashless edition uses Visa debit cards. I don’t know if buying large properties on credit is what we need to ingrain in our children’s heads right now… I wonder how much Visa had to pay to be part of kids’ childhoods?

The Supreme Court decides today whether to grant a writ of certiorari in Curry v. Hensinger. The case has to do with a fifth grader distributing candy canes with a (very incorrect) description of their origins for a classroom-wide introduction to economics dubbed “Classroom City.” As it goes, the teacher denied Mr. Joel Curry the right to distribute his candy canes and attached story. The Currys were pissed, but Joel’s partner for Classroom City, Siddarth Reddy, summed it up nicely, saying, “Nobody wants to hear about Jesus” upon seeing Joel’s item to distribute. (Siddarth was assigned to decorate the storefront.)

The Sixth Circuit affirmed a lower court’s ruling that the public school’s principal did not infringe on Joel’s First Amendment rights by not permitting him to distribute the item. The rationale was that since the speech was unsolicited (as opposed to an assignment where viewpoints were asked for, e.g., an essay on an object important to the student) and would be exposed to very young children (the products were made available throughout the entire elementary school), it could be offensive to some students or parents.

It will be interesting to see if the court decides to take this issue on. Since Justices serve for life (technically, “during good Behaviour”), the Court isn’t subject to senioritis at the end of a President’s term. There are a lot of interesting facts the Sixth Circuit’s decision relies on that could be picked apart by a higher court: the age of the ones exposed to the speech (Would this be legal in a high school? Middle school?), the definition of solicitation, what “offensive” speech is, etc.

Interestingly, this candy cane bullshit has been pulled before. Walz ex rel. Walz describes how one mother used her child to push not only candy canes with the same dumbass story attached, but also pencils bearing the phrase “Jesus ♥s The Little Children.”

[via SCOTUSblog’s Petitions to Watch]


N.B.: The Third Circuit fails at using the heart symbol, instead opting for “‘Jesus [Loves] The Little Children’ (heart symbol).” Get with the times—Unicode is here to stay!

Update: Doesn’t look like it happened.

♻ Sage

| No TrackBacks

♻ Pills, Immolation, Nooses

| No TrackBacks

The Pepsi generation is suicidal. I do not believe suicide (especially of your product’s cute mascot) is culturally acceptable, let alone enticing for a brand’s image… I doubt these ads will make it far.