Friday, May 23, 2014

Cloud Share - New App

I've put a new app on the Windows Store: Cloud Share. It connects the web to your Windows 8 share charm.

I did the development on GitHub and quite enjoyed myself. I wasn't sure I liked the game-ification of development in GitHub's dashboard showing you your longest development streak in days. However I realized that it encourages me to do work on my personal project and anything that aids in holding my attention on and helping me finish these projects is a good thing.

Thursday, May 22, 2014

Debugging anecdote - the color transparent black breaks accessibility

Some time back while I was working on getting the Javascript Windows Store app platform running on Windows Phone (now available on the last Windows Phone release!) I had an interesting bug that in retrospect is amusing.

I had just finished a work item to get accessibility working for JS WinPhone apps when I got a new bug: With some set of JS apps, accessibility appeared to be totally broken. At that time in development the only mechanism we had to test accessibility was a test tool that runs on the PC, connects to the phone, and dumps out the accessibility tree of whatever app is running on the phone. In this bug, the tool would spin for a while and then timeout with an error and no accessibility information.

My first thought was this was an issue in my new accessibility code. However, debugging with breakpoints on my code I could see none of my code was run nor the code that should call it. The code that called that code was a more generic messaging system that hit my breakpoints constantly.

Rather than trying to work backward from the failure point, I decided to try and narrow down the repro and work forwards from there. One thing all the apps with the bug had in common was their usage of WinJS, but not all WinJS apps demonstrated the issue. Using a binary search approach on one such app I removed unrelated app code until all that was left was the app's usage of the WinJS AppBar and the bug still occurred. I replaced the WinJS AppBar usage with direct usage of the underlying AppBar WinRT APIs and continued.

Only some calls to the AppBar WinRT object produced the issue:

        var appBar = Windows.UI.WebUI.Core.WebUICommandBar.getForCurrentView(); 
        // appBar.opacity = 1;
        // appBar.closeDisplayMode = Windows.UI.WebUI.Core.WebUICommandBarClosedDisplayMode.default;
        appBar.backgroundColor = Windows.UI.Colors.white; // Bug! 
Just setting the background color appeared to cause the issue and I didn't even have to display the AppBar. Through additional trial and error I was blown away to discover that some colors I would set caused the issue and other colors did not. Black wouldn't cause the issue but transparent black would. So would aqua but not white.

I eventually realized that predefined WinRT color values like Windows.UI.Colors.aqua would cause the issue while JS literal based colors didn't cause the issue (Windows.UI.Color is a WinRT struct which projects in JS as a JS literal object with the struct members as JS object properties so its easy to write something like {r: 0, g: 0, b: 0, a: 0} to make a color) and I had been mixing both in my tests without realizing there would be a difference. I debugged into the backgroundColor property setter that consumed the WinRT color struct to see what was different between Windows.UI.Colors.black and {a: 1, r: 0, g: 0, b: 0} and found the two structs to be byte wise exactly the same.

On a hunch I tried my test app with only a reference to the color and otherwise no interaction with the AppBar and not doing anything with the actual reference to the color: Windows.UI.Colors.black;. This too caused the issue. I knew that the implementation for these WinRT const values live in a DLL and guessed that something in the code to create these predefined colors was causing the issue. I debugged in and no luck. Now I also have experienced crusty code that would do exciting things in its DllMain, the function that's called when a DLL is loaded into the process so I tried modifying my C++ code to simply LoadLibrary the DLL containing the WinRT color definition, windows.ui.xaml.dll and found the bug still occurred! A short lived moment of relief as the world seemed to make sense again.

Debugging into DllMain nothing interesting happened. There were interesting calls in there to be sure, but all of them behind conditions that were false. I was again stumped. On another hunch I tried renaming the DLL and only LoadLibrary'ing it and the bug went away. I took a different DLL renamed it windows.ui.xaml.dll and tried LoadLibrary'ing that and the bug came back. Just the name of the DLL was causing the issue.

I searched for the DLL name in our source code index and found hits in the accessibility tool. Grinning I opened the source to find that the accessibility tool's phone side service was trying to determine if a process belonged to a XAML app or not because XAML apps had a different accessibility contract. It did this by checking to see if windows.ui.xaml.dll was loaded in the target process.

At this point I got to fix my main issue and open several new bugs for the variety of problems I had just run into. This is a how to on writing software that is difficult to debug.

location.hash and location.search are bad and they should feel bad

The DOM location interface exposes the HTML document's URI parsed into its properties. However, it is ancient and has problems that bug me but otherwise rarely show up in the real world. Complaining about mostly theoretical issues is why blogging exists, so here goes:
  • The location object's search, hash, and protocol properties are all misnomers that lead to confusion about the correct terms:
    • The 'search' property returns the URI's query property. The query property isn't limited to containing search terms.
    • The 'hash' property returns the URI's fragment property. This one is just named after its delimiter. It should be called the fragment.
    • The 'protocol' property returns the URI's scheme property. A URI's scheme isn't necessarily a protocol. The http URI scheme of course uses the HTTP protocol, but the https URI scheme is the HTTP protocol over SSL/TLS - there is no HTTPS protocol. Similarly for something like mailto - there is no mailto wire protocol.
  • The 'hash' and 'search' location properties both return null in the case that their corresponding URI property doesn't exist or if its the empty string. A URI with no query property and a URI with an empty string query property that are otherwise the same, are not equal URIs and are allowed by HTTP to return different content. Similarly for the fragment. Unless the specific URI scheme defines otherwise, an empty query or hash isn't the same as no query or hash.
But like complaining about the number of minutes in an hour none of this can ever change without huge compat issues on the web. Accordingly I can only give my thanks to Anne van Kesteren and the awesome work on the URL standard moving towards a more sane (but still working practically within the constraints of compat) location object and URI parsing in the browser.