Introduction

If you are unfamiliar with microformats or the Firefox extension called operator, you may want to read the basics first.

Microformats have a greater chance of adoption if we encourage an evolve-to-a-standard-as-you-use approach rather than a define-first approach. While we have a number of standardized representations to describe events, personal information, geographic locations, etc., they cover only a tiny speck of the overall semantic space. Consider the standardization efforts required to realize the following scenarios for the Semantic Web:

  1. Say I have an online stock trading account. One feature of this account is that it lets you maintain a watch list. I would like to be able to use operator to add stocks to this watch list while reading an online newspaper article about some company.
  2. Say I have an account on a movie rental website like Netflix. I would like to be able to use operator to add movies to my rental queue while reading a movie review or a blog, or while browsing through IMDB (if IMDB chooses to mark up its content).
  3. Use operator to search for jobs at a company mentioned in some blog/article by sending a query to a job portal.

poshZone is an attempt to enable the above. POSH stands for plain old semantic HTML. A web page that contains a few zones of semantic HTML can be excellent fodder for operator. Try this demo. It is a proof of concept for example 2 above.

A poshZone is any content described within a div that has a class of poshZone. The content may describe anything in any manner as long as it adheres to the following restrictions:

  1. Use only div and span to structure content. (TBD: support for header, p and a tags)
  2. Use class attribute to annotate content.
  3. divs are containers. spans are key-value pairs (the class is the key, the span's text is the value).

These restrictions are quite similar to the guidelines for semantic XHTML. Here is the markup used to describe a movie:

<div class="poshZone">
    <span class="poshZoneDescription">Movie: Cold Mountain</span>
    <div class="movie">
        <span class="movieName">Cold Mountain</span>.
        Made in <span class="movieYear">2003</span>. Watch it.
        <p>In the waning days of the American Civil War, a wounded soldier (Law)
        embarks on a perilous journey back home to Cold Mountain, North Carolina
        to reunite with his sweetheart (Kidman). Based on the novel by Charles Frazier.
        </p>
    </div>
</div>
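
To make the container and key-value rules concrete, here is a rough sketch of how such a zone could be read into a nested object. It is not taken from operator or from the poshZone script. The function name zoneToObject is made up for illustration, and the sketch assumes one class per element, ignores tags other than div and span, and collects repeated classes into a list.

```javascript
// Hypothetical helper: turn a poshZone node into a nested object.
// A "node" is anything with tagName, className, textContent and children,
// so DOM elements and plain stand-in objects both work.
function zoneToObject(node) {
    var result = {};
    Array.from(node.children || []).forEach(function (child) {
        var tag = String(child.tagName).toLowerCase();
        var key = child.className;
        var value;
        if (!key) {
            return; // unannotated elements carry no semantics
        }
        if (tag === "div") {
            value = zoneToObject(child); // divs are containers
        } else if (tag === "span") {
            value = String(child.textContent).trim(); // spans are key-value pairs
        } else {
            return; // assumption: every other tag is skipped
        }
        if (Object.prototype.hasOwnProperty.call(result, key)) {
            // Assumption: a repeated class is treated as a list.
            if (!Array.isArray(result[key])) {
                result[key] = [result[key]];
            }
            result[key].push(value);
        } else {
            result[key] = value;
        }
    });
    return result;
}

// In a browser, something like this would read the first zone on the page:
//   zoneToObject(document.querySelector("div.poshZone"));
```

For the movie markup above, the result would hold a poshZoneDescription string and a movie object with movieName and movieYear. The p element is skipped, which matches the TBD note in restriction 1.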

And here is a user script that asks operator for movie nodes in the page:

var rentalq = {
    description: "Add to RentalQ on myjavaserver.com",
    shortDescription: "RentalQ",
    scope: {
        semantic: {
            // Match poshZone nodes that have a movieName inside a movie.
            "poshZone": {
                custom: "['movie']['movieName']"
            }
        }
    }
};

Basically, it asks for all poshZone nodes that have a movieName node within a movie node. This could just as easily have been a request for all nodes that have a nyseSymbol node within a stockQuote node, or a request for all nodes that have a companyName node within a company node.

A div1/div2/spanX structure is accessible as [class1][class2][classX]. If any of the elements represents a list, its items are accessed with zero-based indexes. For example, if we had div1/div2a/spanX and div1/div2b/spanX where div2 represents a list, then we use [class1][class2][1][classX] to access the value of property X in the second item of list div2.
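
The bracket notation can be illustrated with a small lookup over a plain nested object. This is only a sketch of the addressing scheme, not the code operator or poshZone actually uses. The names parsePath and lookup are invented here, and the sketch assumes that unquoted all-digit segments are list indexes while everything else is a class name.

```javascript
// Hypothetical helper: split "[class1][class2][1][classX]" or
// "['movie']['movieName']" into an array of keys and indexes.
function parsePath(path) {
    var segments = [];
    var pattern = /\[([^\]]+)\]/g;
    var match;
    while ((match = pattern.exec(path)) !== null) {
        var token = match[1].trim();
        if (/^\d+$/.test(token)) {
            segments.push(Number(token)); // list index
        } else {
            segments.push(token.replace(/^['"]|['"]$/g, "")); // class name
        }
    }
    return segments;
}

// Hypothetical helper: walk the nested object along the path.
// Returns undefined as soon as a segment is missing.
function lookup(zone, path) {
    var current = zone;
    var segments = parsePath(path);
    for (var i = 0; i < segments.length; i++) {
        if (current === null || typeof current !== "object") {
            return undefined;
        }
        current = current[segments[i]];
    }
    return current;
}

var exampleZone = {
    class1: { class2: [{ classX: "first" }, { classX: "second" }] }
};
var secondValue = lookup(exampleZone, "[class1][class2][1][classX]"); // "second"
```

Returning undefined for a missing segment lets a script test whether a zone matches its scope by checking that the lookup yields a value.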

The action (user script) can then construct a URL with the available information and do a GET or POST to the target website, a mock rentalQ in this example. In some cases, the action will need information other than what's marked up in the page to complete a request. For example, a user script that allows you to look up flights for an event will need the 'origin' information from you. A user script that allows you to add stocks to a watch list on your trading account will need authentication information from you. The script could prompt for this at runtime. A nice addition to operator would be the ability to store this kind of information in the browser user preferences.
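
As an illustration of the URL construction step, the sketch below builds a GET URL from values found in the page plus values supplied by the user. The function name buildActionUrl, the endpoint http://example.com/rentalq/add and the parameter names are placeholders invented for this example. They are not the real rentalQ interface.

```javascript
// Hypothetical helper: append fields to an endpoint as a query string.
// Empty, null and undefined values are left out.
function buildActionUrl(endpoint, fields) {
    var pairs = [];
    Object.keys(fields).forEach(function (name) {
        var value = fields[name];
        if (value === undefined || value === null || value === "") {
            return;
        }
        pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(String(value)));
    });
    if (pairs.length === 0) {
        return endpoint;
    }
    // Respect a query string that is already part of the endpoint.
    var separator = endpoint.indexOf("?") === -1 ? "?" : "&";
    return endpoint + separator + pairs.join("&");
}

// Values read from the page markup (placeholder endpoint and names).
var pageFields = { movieName: "Cold Mountain", movieYear: "2003" };
// Values the user would be prompted for at runtime, e.g. an account name.
var userFields = { account: "demo" };

var rentalUrl = buildActionUrl(
    "http://example.com/rentalq/add",
    Object.assign({}, pageFields, userFields)
);
// rentalUrl is
// "http://example.com/rentalq/add?movieName=Cold%20Mountain&movieYear=2003&account=demo"
```

Keeping the page-derived and user-supplied values in separate objects until the last moment makes it easy to swap the runtime prompt for stored preferences later.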

How do we standardize?

I believe that standardization in this space should be bottom-up and driven by how widely a format is adopted. I find it hard to imagine a committee coming up with a standard way of describing movies, stock quotes and the other bits of information that make up the interaction between consumers and service providers. Some parties will have more influence than others. You can influence formats at the markup collaboration area.

Let's say a big online news website starts marking up movie and book reviews in a certain style. Now there is a good incentive for movie rental service providers and online bookshops to publish user scripts that understand the format used by the news website. Users just need to register these scripts with their copy of operator to enjoy semantic interactions between the news site and these service providers.

PoshZones get you up and running without waiting for a formal microformats specification. At the same time, they could serve as useful real-world input for ongoing formal spec efforts.

Try it out

As a blogger or other type of content provider, you can start marking up your content as suggested in the PoshZone Index. You could also suggest new PoshZones or enhance existing ones. Do take a look at existing microformat definitions so that we don't reinvent them here.

As a developer, you could write user scripts that do interesting things with marked up content. Refer to the tutorial page to get started.

Finally, if you are an online service provider, you could publish a script that lets users interact with your service via operator.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License