Wednesday, February 29, 2012

Glitch Helperator

I've been working on the Glitch Helperator. It is a collection of tools and things I've put together for Glitch. It has a few features that I haven't seen elsewhere including:

Favorite Streets
A notebook in which you can save information about interesting streets and later use it to find your way back to them.
Birthday
Find out how old your Glitch is and the date of your next birthday in Glitch time or Earth time.
API Update History
A history of changes to the streets, skills and achievements of Glitch noting when new ones are added and when existing ones are changed.
It also has an interactive skill tree, find nearest feature tool, and achievement display. If you play Glitch, check it out.

Monday, February 27, 2012

Client Side Cross Domain Data YQL Hack

One of the more limiting issues of writing client side script in the browser is the same origin limitations of XMLHttpRequest. The latest version of all browsers support a subset of CORS to allow servers to opt-in particular resources for cross-domain access. Since IE8 there's XDomainRequest and in all other browsers (including IE10) there's XHR L2's cross-origin request features. But the vast majority of resources out on the web do not opt-in using CORS headers and so client side only web apps like a podcast player or a feed reader aren't doable.

One hack-y way around this I've found is to use YQL as a CORS proxy. YQL applies the CORS header to all its responses and among its features it allows a caller to request an arbitrary XML, HTML, or JSON resource. So my network helper script first attempts to access a URI directly using XDomainRequest if that exists and XMLHttpRequest otherwise. If that fails it then tries to use XDR or XHR to access the URI via YQL. I wrap my URIs in the following manner, where type is either "html", "xml", or "json":

        yqlRequest = function(uri, method, type, onComplete, onError) {
            var yqlUri = "http://query.yahooapis.com/v1/public/yql?q=" + 
                encodeURIComponent("SELECT * FROM " + type + ' where url="' + encodeURIComponent(uri) + '"');

            if (type == "html") {
                yqlUri += encodeURIComponent(" and xpath='/*'");
            }
            else if (type == "json") {
                yqlUri += "&callback=&format=json";
            }
            ...

This also means I can get JSON data itself without having to go through JSONP.

Friday, February 24, 2012

Web Worker Initialization Race

Elaborating on a previous brief post on the topic of Web Worker initialization race conditions, there's two important points to avoid a race condition when setting up a Worker:

  1. The parent starts the communication posting to the worker.
  2. The worker sets up its message handler in its first synchronous block of execution.

For example the following has no race becaues the spec guarentees that messages posted to a worker during its first synchronous block of execution will be queued and handled after that block. So the worker gets a chance to setup its onmessage handler. No race:

'parent.js':
   var worker = new Worker();
   worker.postMessage("initialize");

'worker.js':
   onmessage = function(e) {
      // ...
   }

The following has a race because there's no guarentee that the parent's onmessage handler is setup before the worker executes postMessage. Race (violates 1):

'parent.js':
   var worker = new Worker();
   worker.onmessage = function(e) {
      // ...
   };

'worker.js':
   postMessage("initialize");

The following has a race because the worker has no onmessage handler set in its first synchronous execution block and so the parent's postMessage may be sent before the worker sets its onmessage handler. Race (violates 2):

'parent.js':
   var worker = new Worker();
   worker.postMessage("initialize");

'worker.js':
   setTimeout(
      function() {
         onmessage = function(e) {
            // ...
         }
      },
      0);

Monday, February 20, 2012

URI Percent Encoding Ignorance Level 2 - There is no Unencoded URI

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).

Getting into the more subtle levels of URI percent-encoding ignorance, folks try to apply their knowledge of percent-encoding to URIs as a whole producing the concepts escaped URIs and unescaped URIs. However there are no such things - URIs themselves aren't percent-encoded or decoded but rather contain characters that are percent-encoded or decoded. Applying percent-encoding or decoding to a URI as a whole produces a new and non-equivalent URI.

Instead of lingering on the incorrect concepts we'll just cover the correct ones: there's raw unencoded data, non-normal form URIs and normal form URIs. For example:

  1. http://example.com/%74%68%65%3F%70%61%74%68?query
  2. http://example.com/the%3Fpath?query
  3. "http", "example.com", "the?path", "query"

In the above (A) is not an 'encoded URI' but rather a non-normal form URI. The characters of 'the' and 'path' are percent-encoded but as unreserved characters specific in the RFC should not be encoded. In the normal form of the URI (B) the characters are decoded. But (B) is not a 'decoded URI' -- it still has an encoded '?' in it because that's a reserved character which by the RFC holds different meaning when appearing decoded versus encoded. Specifically in this case, it appears encoded which means it is data -- a literal '?' that appears as part of the path segment. This is as opposed to the decoded '?' that appears in the URI which is not part of the path but rather the delimiter to the query.

Usually when developers talk about decoding the URI what they really want is the raw data from the URI. The raw decoded data is (C) above. The only thing to note beyond what's covered already is that to obtain the decoded data one must parse the URI before percent decoding all percent-encoded octets.

Of course the exception here is when a URI is the raw data. In this case you must percent-encode the URI to have it appear in another URI. More on percent-encoding while constructing URIs later.

Friday, February 17, 2012

Why I Like Glitch

Sarah and I have been enjoying Glitch for a while now. Reviews are usually positive although occasionally biting (but mostly accurate).

I enjoy Glitch as a game of exploration: exploring the game's lands with hidden and secret rooms, and exploring the games skills and game mechanics. The issue with my enjoyment coming from exploration is that after I've explored all streets and learned all skills I've got nothing left to do. But I've found that even after that I can have fun writing client side JavaScript against Glitch's web APIs making tools (I work on the Glitch Helperator) for use in Glitch. And on a semi-regular basis they add new features reviving my interest in the game itself.

Wednesday, February 15, 2012

URI Percent-Encoding Ignorance Level 1 - Purpose

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).

Worse than the lame blog comments hating on percent-encoding is the shipping code which can do actual damage. In one very large project I won't name, I've fixed code that decodes all percent-encoded octets in a URI in order to get rid of pesky percents before calling ShellExecute. An unnamed developer with similar intent but clearly much craftier did the same thing in a loop until the string's length stopped changing. As it turns out percent-encoding serves a purpose and can't just be removed arbitrarily.

Percent-encoding exists so that one can represent data in a URI that would otherwise not be allowed or would be interpretted as a delimiter instead of data. For example, the space character (U+0020) is not allowed in a URI and so must be percent-encoded in order to appear in a URI:

  1. http://example.com/the%20path/
  2. http://example.com/the path/
In the above the first is a valid URI while the second is not valid since a space appears directly in the URI. Depending on the context and the code through which the wannabe URI is run one may get unexpected failure.

For an additional example, the question mark delimits the path from the query. If one wanted the question mark to appear as part of the path rather than delimit the path from the query, it must be percent-encoded:

  1. http://example.com/foo%3Fbar
  2. http://example.com/foo?bar
In the second, the question mark appears plainly and so delimits the path "/foo" from the query "bar". And in the first, the querstion mark is percent-encoded and so the path is "/foo%3Fbar".

Monday, February 13, 2012

Blackmail DRM - Stolen Thoughts

Most existing DRM attempts to only allow the user to access the DRM'ed content with particular applications or with particular credentials so that if the file is shared it won't be useful to others. A better solution is to encode any of the user's horrible secrets into unique versions of the DRM'ed content so that the user won't want to share it. Entangle the users and the content provider's secrets together in one document and accordingly their interests. I call this Blackmail DRM. For an implementation it is important to point out that the user's horrible secret doesn't need to be verified as accurate, but merely verified as believable.

Apparently I need to get these blog posts written faster because only recently I read about Social DRM which is a light weight version of my idea but with a misleading name. Instead of horrible secrets, they say they'll use personal information like the user's name in the DRM'ed content. More of my thoughts stolen and before I even had a chance to think of it first!

Friday, February 10, 2012

URI Percent Encoding Ignorance Level 0 - Existence

As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping). The basest ignorance is with respect to the mere existence of percent-encoding. Percents in URIs are special: they always represent the start of a percent-encoded octet. That is to say, a percent is always followed by two hex digits that represents a value between 0 and 255 and doesn't show up in a URI otherwise.

The IPv6 textual syntax for scoped addresses uses the '%' to delimit the zone ID from the rest of the address. When it came time to define how to represent scoped IPv6 addresses in URIs there were two camps: Folks who wanted to use the IPv6 format as is in the URI, and those who wanted to encode or replace the '%' with a different character. The resulting thread was more lively than what shows up on the IETF URI discussion mailing list. Ultimately we went with a percent-encoded '%' which means the percent maintains its special status and singular purpose.