semi/signal

Bridge in fog

Avishkar Autar · May 8 2012 · Random

Bridge in fog II

Having recently upgraded from a Palm Pixi to an iPhone 4S, leaving behind webOS for iOS, I’ve found many things that are vastly superior on iOS, which isn’t surprising given the superior hardware of the iPhone and the relative maturity of iOS as a platform. However, I find myself missing a few things I’d become accustomed to with webOS, things I think webOS simply did a better job at.

Notifications
While both platforms alert you to events, webOS also kept notifications stacked at the bottom of the display until you chose to swipe them away.

notifications

Integrated Contacts
webOS automatically imported and linked contacts from multiple sources (Facebook, Gmail, etc.), making it fairly simple to manage (or, more accurately, not have to manage) an address book.

integrated contacts

Multitasking
While iOS supports multitasking on a technical level, on a UI/UX level the focus is very much centered on one app at a time, as swapping between apps always requires a trip back to the home screen. The webOS process of sliding between cards was not only a slightly faster method to swap between apps but also fairly convenient when it came to glancing at something in another app and then getting back to what you were doing; the scenario that pops into my mind is texting something from a webpage but not remembering it exactly or entirely, and having to swap between the messaging app and the browser.

Also, swiping a card up and off the screen was a fairly elegant way to close it. Exiting apps is perhaps not as big of a deal on an iPhone due to the larger memory pool, but when you do near the memory limit, I’m not sure double tapping the home button, pressing and holding the app icon, and hitting the remove icon is the easiest or most intuitive action.

multitasking

Tags: integrated contacts, iOS, iphone, multitasking, notifications, user experience, user interface, webOS

Subway relics

Avishkar Autar · Apr 21 2012 · Random

Old, dirty, and one-third of them don’t work…

Subway Payphone

Tags: NYC, payphone, photo, subway

Implementing a “did you mean…?” function

Avishkar Autar · Apr 7 2012 · Random

A while ago I became interested in how one would go about implementing something akin to Google Search’s “did you mean…” function. This answer from StackOverflow provides a good overview of what Google does, which involves looking at a user’s incorrect entry as well as a subsequent correction provided by the user. Data mining both the incorrect entry and correction, for millions of users and billions (trillions?) of entries, Google Search can thus make an intelligent guess as to what a user really meant when an incorrect entry is submitted. While most applications can’t do this at Google Search’s scale or generality (which is a span of terms from across the entire web), I can certainly see this model working for smaller applications, which only need to deal with a smaller subset of terms (this blog for example only needs to handle terms that are in my posts). Remove the data capture aspect, providing a fixed database of terms from which to search, and implementation becomes even simpler!

did you mean google?

Digging deeper into implementation details, there’s the problem of figuring out how closely an input string (X) matches each string in a database of terms (T_i). The closeness or distance here is the edit distance between the 2 strings: the minimum number of single-character edits it takes to turn one string into the other. There are a number of edit distance algorithms, but the Levenshtein distance, which counts insertions, deletions, and substitutions, seems to be a popular choice.

For a simple did-you-mean suggester, computing the edit distance is the crux of the method.
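For reference, the Levenshtein distance is usually defined by the recurrence below, where D_{i,j} is the distance between the first i characters of X and the first j characters of a term T (the D notation is mine, introduced just for this sketch):

```latex
D_{i,0} = i, \qquad D_{0,j} = j,
\qquad
D_{i,j} =
\begin{cases}
D_{i-1,j-1} & \text{if } X_i = T_j \\
1 + \min\left(D_{i-1,j-1},\; D_{i-1,j},\; D_{i,j-1}\right) & \text{otherwise}
\end{cases}
```

The distance between the full strings is the last entry, D_{|X|,|T|}. As a quick check, “tillium” and “gallium” have the same length and differ only in their first two letters, so two substitutions are needed and the distance is 2.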

To test things out, I did a simple project in C++. Using a straightforward implementation of the Levenshtein distance and a vector of 112 chemical elements (up to copernicium), I wrote a program that prompts the user for the name of an element. If the element is found, it outputs “ELEMENT FOUND!”; if not, it suggests the name of the element closest to the user’s input.

Includes

#include <algorithm>  // std::min
#include <climits>    // INT_MAX
#include <iostream>
#include <string>
#include <vector>

Levenshtein distance implementation

int ld(const std::string& strA, const std::string& strB)
{
    int lenA = strA.length() + 1;
    int lenB = strB.length() + 1;

    int** mat;
    mat = new int*[lenA];
    for(int i=0; i<lenA; i++)
    {
        mat[i] = new int[lenB];
    }

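    // base cases: turning a prefix of length i (or j) into an empty string takes i (or j) edits
    for(int i=0; i<lenA; i++)
    {
        mat[i][0] = i;
    }

    for(int j=0; j<lenB; j++)
    {
        mat[0][j] = j;
    }

    // mat[i][j] = distance between the first i characters of strA
    // and the first j characters of strB
    for(int i=1; i<=lenA-1; i++)
    {
        for(int j=1; j<=lenB-1; j++)
        {
            if(strA[i-1] == strB[j-1])
            {
                // characters match, no edit needed
                mat[i][j] = mat[i-1][j-1];
            }
            else
            {
                // cheapest of a substitution, a deletion, or an insertion
                mat[i][j] = std::min(mat[i-1][j-1]+1, std::min(mat[i-1][j] + 1, mat[i][j-1] + 1) );
            }
        }
    }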

    int ret = mat[lenA-1][lenB-1];

    // memory cleanup
    for(int i=0; i<lenA; i++)
    {
        delete [] mat[i];
    }
    delete [] mat;

    return ret;
}

Function to construct std::vector of chemical elements

std::vector<std::string> make_elements_vector()
{
    std::vector<std::string> elements;

    elements.push_back("hydrogen");
    elements.push_back("helium");
    elements.push_back("lithium");
    elements.push_back("beryllium");
    elements.push_back("boron");
    elements.push_back("carbon");
    elements.push_back("nitrogen");
    elements.push_back("oxygen");
    elements.push_back("fluorine");
    elements.push_back("neon");
    elements.push_back("sodium");
    elements.push_back("magnesium");
    elements.push_back("aluminium");
    elements.push_back("silicon");
    elements.push_back("phosphorus");
    elements.push_back("sulfur");
    elements.push_back("chlorine");
    elements.push_back("argon");
    elements.push_back("potassium");
    elements.push_back("calcium");
    elements.push_back("scandium");
    elements.push_back("titanium");
    elements.push_back("vanadium");
    elements.push_back("chromium");
    elements.push_back("manganese");
    elements.push_back("iron");
    elements.push_back("cobalt");
    elements.push_back("nickel");
    elements.push_back("copper");
    elements.push_back("zinc");
    elements.push_back("gallium");
    elements.push_back("germanium");
    elements.push_back("arsenic");
    elements.push_back("selenium");
    elements.push_back("bromine");
    elements.push_back("krypton");
    elements.push_back("rubidium");
    elements.push_back("strontium");
    elements.push_back("yttrium");
    elements.push_back("zirconium");
    elements.push_back("niobium");
    elements.push_back("molybdenum");
    elements.push_back("technetium");
    elements.push_back("ruthenium");
    elements.push_back("rhodium");
    elements.push_back("palladium");
    elements.push_back("silver");
    elements.push_back("cadmium");
    elements.push_back("indium");
    elements.push_back("tin");
    elements.push_back("antimony");
    elements.push_back("tellurium");
    elements.push_back("iodine");
    elements.push_back("xenon");
    elements.push_back("caesium");
    elements.push_back("barium");
    elements.push_back("lanthanum");
    elements.push_back("cerium");
    elements.push_back("praseodymium");
    elements.push_back("neodymium");
    elements.push_back("promethium");
    elements.push_back("samarium");
    elements.push_back("europium");
    elements.push_back("gadolinium");
    elements.push_back("terbium");
    elements.push_back("dysprosium");
    elements.push_back("holmium");
    elements.push_back("erbium");
    elements.push_back("thulium");
    elements.push_back("ytterbium");
    elements.push_back("lutetium");
    elements.push_back("hafnium");
    elements.push_back("tantalum");
    elements.push_back("tungsten");
    elements.push_back("rhenium");
    elements.push_back("osmium");
    elements.push_back("iridium");
    elements.push_back("platinum");
    elements.push_back("gold");
    elements.push_back("mercury");
    elements.push_back("thallium");
    elements.push_back("lead");
    elements.push_back("bismuth");
    elements.push_back("polonium");
    elements.push_back("astatine");
    elements.push_back("radon");
    elements.push_back("francium");
    elements.push_back("radium");
    elements.push_back("actinium");
    elements.push_back("thorium");
    elements.push_back("protactinium");
    elements.push_back("uranium");
    elements.push_back("neptunium");
    elements.push_back("plutonium");
    elements.push_back("americium");
    elements.push_back("curium");
    elements.push_back("berkelium");
    elements.push_back("californium");
    elements.push_back("einsteinium");
    elements.push_back("fermium");
    elements.push_back("mendelevium");
    elements.push_back("nobelium");
    elements.push_back("lawrencium");
    elements.push_back("rutherfordium");
    elements.push_back("dubnium");
    elements.push_back("seaborgium");
    elements.push_back("bohrium");
    elements.push_back("hassium");
    elements.push_back("meitnerium");
    elements.push_back("darmstadtium");
    elements.push_back("roentgenium");
    elements.push_back("copernicium");

    return elements;
}

Application Logic

int main(int argc, char* argv[])
{

    std::vector<std::string> elements = make_elements_vector();

    std::cout << "What element are you attempting to find? ";
    std::string inputStr;
    std::cin >> inputStr;

    size_t minDistIndex = 0;
    int minDist = INT_MAX;

    for(size_t i=0; i<elements.size(); i++)
    {
        int dist = ld(elements[i], inputStr);

        if(dist < minDist)
        {
            minDist = dist;
            minDistIndex = i;
        }
    }

    if(minDist == 0)
    {
        std::cout << "ELEMENT FOUND!" << std::endl;
    }
    else
    {
        std::string dym = "Did you mean " + elements[minDistIndex] + "?";
        std::cout << dym << std::endl;
    }

    return 0;
}

Search for “hydrogen”

hydrogen, element found

Search for “tillium”

tillium, did you mean gallium?

This little demo works surprisingly well and suggestions are more-or-less in line with what you’d expect. There are obviously limitations as you think about applying this to other domains, as language, context, etc. are not taken into consideration, but as a simple suggester it holds up pretty well and is perhaps a nice addition to a number of search methods in a variety of applications (the majority of which don’t seem to implement anything of the sort).

Tags: algorithm, did-you-mean, edit distance, levenshtein distance, search, suggester

Mapping NYC subway stations

Avishkar Autar · Jan 21 2012 · Random

I previously wrote about showing the transit layer with the Google Maps API. This is somewhat of a continuation, but narrower in scope; here I’ll talk about showing custom markers for New York City subway stations, making use of data from NYC Open Data (formerly the NYC Data Mine).

The reason for doing this at all, given that a Google Map already shows subway stations, is that the default indicators are fairly inflexible:

You can’t use a custom icon to change how they look
You can’t fire off custom event handlers when the user interacts with them

NYC subway stations near city hall

(this is true for all of the point of interest indicators: parks, schools, etc.)

Replacing the point of interest indicators with markers turned out to be fairly simple, and the data from NYC Open Data was (relatively) clean and readily usable, a pleasant surprise given my previous experience. For this little project, I exported the Subway Stations dataset; it can be exported in a number of formats, but JSON is probably the easiest to work with client-side. With the data readily available as a JSON file, it can be loaded simply with an AJAX call to get the file and, once loaded, the subway stations can be plotted on the map by iterating through the list of stations in the JSON data.

It’s worth taking a look at the format of the JSON data, as the indexing of the nodes isn’t all that clear. There’s a meta element and a data element at the root; the data element contains a zero-indexed array of subway stations, and each subway station contains a zero-indexed array of attributes, notably:

10 = station name
12 = dash-delimited list of train lines
9 = object with latitude, longitude, and geometry fields
(Note that the values inside this object are referenced by field name, not by a numeric index.)

NYC Open Data, Subway Station, JSON format
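To make the indexing concrete, here’s a minimal sketch that pulls those three attributes out of a single station row. The helper name parseStation and the sample row are made up for illustration; the values are not taken from the actual dataset.

```javascript
// Hypothetical helper: extract the fields of interest from one station row.
function parseStation(row) {
    var location = row[9];                       // object, keyed by field name
    return {
        name: row[10],                           // station name
        lat: parseFloat(location.latitude),
        lon: parseFloat(location.longitude),
        lines: row[12].split('-')                // dash-delimited list -> array
    };
}

// Made-up row; only the positions discussed above are filled in.
var sampleRow = [];
sampleRow[9] = { latitude: '40.7130', longitude: '-74.0070' };
sampleRow[10] = 'Example St';
sampleRow[12] = '4-5-6';

var station = parseStation(sampleRow);
console.log(station.name + ' [' + station.lines.join(',') + ']');   // Example St [4,5,6]
```

The parseFloat calls assume the coordinates arrive as strings, as in the sample row; if they are already numbers, the result is unchanged.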

The code to get the JSON data (using jQuery’s .ajax), iterate through the array of subway stations, extract the relevant pieces of information about the stations, and create map markers for them is shown below. A label is attached to the marker by making use of Marc Ridey’s Label class.

$.ajax({
    url: 'http://whatever.com/subway-stops.json',
    dataType: 'json',   // make sure the response is parsed as JSON
    success: function (ret)
    {
        for(var i=0; i<ret.data.length; i++)
        {
            // extract station name, latitude, longitude, and dash-delimited list of train lines at station
            var stationName = ret.data[i][10];
            var lat = parseFloat(ret.data[i][9]['latitude']);
            var lon = parseFloat(ret.data[i][9]['longitude']);
            var trainLines = (ret.data[i][12]).split('-');

            // make comma-delimited list of train lines
            var trainLinesLbl = trainLines.join(',');

            // create marker
            var marker = new google.maps.Marker({
                "position": new google.maps.LatLng(lat, lon),
                "map": map,
                "title": stationName + " [" + trainLinesLbl + "]",
                "icon": "http://whatever.com/marker-subway.png"
            });

            // create label for marker
            // uses Label created by Marc Ridey
            var label = new Label({ map: map });
            label.bindTo('position', marker, 'position');
            label.bindTo('text', marker, 'title');
        }
    }
});

With some minor styling to the label and an icon from Map Icons Collection, here’s my result:

NYC subway stations with custom markers

Tags: google maps, google maps api, map, nyc data mine, nyc open data, NYC subway stations, subway, subway lines, subway stations

Arch Enemies

Avishkar Autar · Jan 2 2012 · Random

Arch Enemies by Jason Bergsieker,

Arch Enemies

Tags: arch, drawing, enemies, funny, illustration, Jason Bergsieker

Paintings @ Bright Lyons

Avishkar Autar · Dec 26 2011 · Random

Passed by Bright Lyons on Atlantic Avenue in Brooklyn and saw this awesomeness,

Bright Lyons, Homer and Bart

Tags: bart, bright lyons, homer, homer simpson, painting, photo

Batching, a basis for optimization

Avishkar Autar · Dec 18 2011 · Application Design

It’s interesting that in 3 distinct domains I’ve run across the same underlying basis for optimization:

Graphics: Modern GPUs depend heavily on batching primitives, typically triangles. Instead of rendering triangles individually, you get much better performance by batching primitives together in a list, sending it to the GPU via a single call, then letting the GPU pipelines do their thing. Even before modern GPUs existed, graphics cards supported techniques like BitBlt which, essentially, performed operations on batched blocks of pixels, to take advantage of the embarrassingly parallel nature of computer graphics.
Relational Databases: Issuing lots of small queries can kill performance. A better strategy is, usually, to issue fewer queries, joining and returning as much data as possible with each query. Even if these queries become complex and costly, the cost of a complex query will usually still be less than the aggregate cost of numerous simpler queries.
Networking: The speed of light sucks… server and packet switching latencies make things worse. I usually assume ~50ms baseline latency to send a request packet + get a reply packet back from an internet server (I use the term “packet” loosely, referring to programmer-defined, application-level “packets” or messages, or whatever you like to call them, not necessarily TCP/IP packets). Note that this baseline is regardless of the amount of information in a packet and is bound by the travel time between server and client. So, to optimize communication and bandwidth, a good strategy is to transfer as much as possible per-packet instead of depending upon numerous requests/responses to/from a server, which would mean lots of packets and lots of wasted time.
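A rough back-of-the-envelope sketch of the networking case, using the ~50ms baseline from above. The function name, the 1ms per-item transfer cost, and the assumption that requests are issued one after another are all mine, picked only to make the arithmetic visible.

```javascript
// Estimate the total time to move `items` items when they are grouped
// `batchSize` per request and requests are issued sequentially.
function estimateTotalMs(items, batchSize, baselineLatencyMs, perItemMs) {
    var requests = Math.ceil(items / batchSize);
    return requests * baselineLatencyMs + items * perItemMs;
}

var oneByOne = estimateTotalMs(100, 1, 50, 1);     // 100 requests, 1 item each
var batched = estimateTotalMs(100, 100, 50, 1);    // 1 request, 100 items

console.log(oneByOne);   // 5100
console.log(batched);    // 150
```

The payload cost is the same in both cases; the difference comes entirely from paying the baseline latency 100 times instead of once.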

Tags: batching, BitBlt, computer graphics, embarrassingly parallel, GPU, latency, networking, optimization, query optimization, relational database

Multi-faceted online identities

Avishkar Autar · Dec 12 2011 · Random

Incredibly insightful comments from Christopher Poole (“moot”, founder of 4chan and Canvas).

From SXSW earlier this year…

Zuckerberg’s totally wrong on anonymity being total cowardice. Anonymity is authenticity. It allows you to share in an unvarnished, unfiltered, raw and real way.

The cost of failure is really high when you’re contributing as yourself, to fail in an environment where you’re contributing with your real name is costly.

As for how anonymity connects to identity, he spoke to this at Web 2.0…

It’s not who you share with, it’s who you share as… We all have multiple identities, it’s part of being human, identity is prismatic. Google and Facebook would have you believe you are a mirror, but in fact we’re more like diamonds, you can look at people from any angle and see something totally different and yet they’re still the same

Tags: 4chan, anonymity, Christopher Poole, identity, moot, multi-faceted identity

Showing the transit layer with the Google Maps API

Avishkar Autar · Dec 8 2011 · Random

While Google Maps has a very useful transit layer (showing subway lines, bus stops, etc.) available when zoomed in on a city, this layer is unfortunately not exposed via the Google Maps API. However, as demonstrated on BlinkTag Inc. by Brendan Nee, it’s possible to load the transit layer as a custom tile layer, pulling the transit layer images directly from Google’s servers.

// add transit overlay
var transitOptions = {
    getTileUrl: function (coord, zoom)
    {
        return "http://mt1.google.com/vt/lyrs=m@155076273,transit:comp|vm:&" + "hl=en&opts=r&s=Galil&z=" + zoom + "&x=" + coord.x + "&y=" + coord.y;
    },

    tileSize: new google.maps.Size(256, 256),
    isPng: true
};

var transitMapType = new google.maps.ImageMapType(transitOptions);
map.overlayMapTypes.insertAt(0, transitMapType);

Google Maps, Transit Layer, Subway - NYC, City Hall

However, there are 2 issues you may quickly notice:

1. Custom styling applied to the base layer is lost.
This is because full image tiles are loaded, which completely obscure the lower layer. A solution to this is to find and copy the apistyle and style URL parameters when the base layer is loaded (you can do this by looking at the GET requests with a tool like Firebug). You then simply add these parameters to the URL returned by the getTileUrl() function.

getTileUrl: function (coord, zoom)
{
    return "http://mt1.google.com/vt/lyrs=m@155076273,transit:comp|vm:&" + "hl=en&opts=r&s=Galil&z=" + zoom + "&x=" + coord.x + "&y=" + coord.y + "&apistyle=s.t%3A3%7Cp.h%3A%23C5C5C5%7Cp.s%3A-100%7Cp.l%3A37%7Cp.v%3Aon%2Cs.t%3A35%7Cp.h%3A%23F284FF%7Cp.s%3A100%7Cp.l%3A-9%7Cp.v%3Aon%2Cs.t%3A81%7Cp.v%3Aoff&s=Gal&style=api%7Csmartmaps";
},

Google Maps, Transit Layer, Subway - NYC, City Hall

2. The large subway stop markers are useless
In New York City at least, it’s impossible to identify train lines by color alone, so it’s fairly important to see the letter or number identifier for trains at the different stations. The map has both large and small markers for each station, but only the small markers show this information. I couldn’t figure out a way to get the larger markers to show the letters/numbers, but by changing “vm:” to “vm:1” you can completely remove the large markers from the map. However, this also shrinks the size of the lines indicating the train routes.

getTileUrl: function (coord, zoom)
{
    return "http://mt1.google.com/vt/lyrs=m@155076273,transit:comp|vm:1&" + "hl=en&opts=r&s=Galil&z=" + zoom + "&x=" + coord.x + "&y=" + coord.y + "&apistyle=s.t%3A3%7Cp.h%3A%23C5C5C5%7Cp.s%3A-100%7Cp.l%3A37%7Cp.v%3Aon%2Cs.t%3A35%7Cp.h%3A%23F284FF%7Cp.s%3A100%7Cp.l%3A-9%7Cp.v%3Aon%2Cs.t%3A81%7Cp.v%3Aoff&s=Gal&style=api%7Csmartmaps";
},

Google Maps, Transit Layer, Subway - NYC, City Hall
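The three getTileUrl() variants above differ only in the “vm:” value and the appended style parameters, so they could be folded into one helper. This is just a sketch: buildTransitTileUrl, its options object, and the tile coordinates in the usage line are hypothetical, and the URL format is simply the one used in the snippets above.

```javascript
// Hypothetical helper that assembles the tile URL variants shown above.
//   options.hideLargeMarkers: use "vm:1" instead of "vm:"
//   options.styleParams: extra query string copied from the base layer requests
function buildTransitTileUrl(coord, zoom, options) {
    options = options || {};
    var vm = options.hideLargeMarkers ? 'vm:1' : 'vm:';
    var url = 'http://mt1.google.com/vt/lyrs=m@155076273,transit:comp|' + vm + '&' +
        'hl=en&opts=r&s=Galil&z=' + zoom + '&x=' + coord.x + '&y=' + coord.y;
    if (options.styleParams) {
        url += '&' + options.styleParams;
    }
    return url;
}

// Made-up tile coordinates, large markers hidden, no custom styling.
var tileUrl = buildTransitTileUrl({ x: 1205, y: 1539 }, 12, { hideLargeMarkers: true });
console.log(tileUrl);
```

Within transitOptions, getTileUrl would then just return the result of calling this helper with whichever options are wanted.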

There is a third issue that’s pretty noticeable as well: bus stops do not show the identifier of the buses that stop at them. I’ve yet to find a way to show them.

Tags: google maps, google maps api, google maps javascript api, map, subway lines, transit layer