kafsemo.org: 2006-01-23: Larry Sanders and Dr. Katz (SPARQL and Mivvi)

Dr. Katz: Professional Therapist (coming to DVD this year?) and The Larry Sanders Show were both great '90s US comedies, albeit totally different in style. One similarity was the eclectic mix of comedians and actors as guests, a time capsule of significant performers; so who appeared on both shows?

The current data for Mivvi (introduction) includes, for some series, exactly this information, scraped from different sources but using common IMDb URIs for people.

SPARQL is an RDF query language, currently being prepared by the W3C’s RDF Data Access Working Group. (Disambiguation: Sparql is also the name of Danny Ayers’ cat.) The language is still under development, but there are many implementations of the various drafts. I chose Rasqal, with its Roqet front-end (available as rasqal-utils in Debian).

SPARQL Query

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX mvi: <http://mivvi.net/rdf#>
SELECT ?c ?title ?episode1 ?larryTitle ?episode2 ?katzTitle
FROM <a.rdf>
WHERE {
	<http://en.wikipedia.org/wiki/The_Larry_Sanders_Show#> mvi:seasons ?s1.
	?s1 ?w ?season1.
	?season1 mvi:episodes ?es1.
	?es1 ?x ?episode1.
	?episode1 dc:contributor ?c.

	?episode1 dc:title ?larryTitle.

	<http://www.sassman.com/katz/#> mvi:seasons ?s2.
	?s2 ?y ?season2.
	?season2 mvi:episodes ?es2.
	?es2 ?z ?episode2.
	?episode2 dc:contributor ?c.

	?episode2 dc:title ?katzTitle.

	?c dc:title ?title.
}

The intent should be apparent, even if the syntax is not: for each series, find every chain from the series through its seasons to an episode. Because both chains require the same contributor, ?c, the only results are people who appeared in both series. The output is the set of variable bindings that satisfied the match.
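The effect of the shared ?c variable can be sketched outside SPARQL as a join on contributor. This is only an illustration in Python, assuming each series has already been flattened to (contributor, episode title) pairs. The function names and the placeholder guests are hypothetical; the Jon Stewart rows are taken from the results further down.

```python
from collections import defaultdict

def appearances(pairs):
    """Map each contributor to the list of episode titles they appear in."""
    by_person = defaultdict(list)
    for person, title in pairs:
        by_person[person].append(title)
    return by_person

def in_both(series_a, series_b):
    """Keep only contributors present in both series, like the shared ?c."""
    a, b = appearances(series_a), appearances(series_b)
    return {person: (a[person], b[person]) for person in sorted(a.keys() & b.keys())}

larry = [
    ("Jon Stewart", "The Roast"),
    ("Jon Stewart", "Flip"),
    ("Placeholder Guest A", "Placeholder Episode"),  # hypothetical row
]
katz = [
    ("Jon Stewart", "Guess Who"),
    ("Placeholder Guest B", "Placeholder Episode"),  # hypothetical row
]

print(in_both(larry, katz))
# {'Jon Stewart': (['The Roast', 'Flip'], ['Guess Who'])}
```

The query itself does not group like this: it returns one row for every combination of a Larry Sanders episode and a Dr. Katz episode that share a contributor, and the grouping is left to the presentation step.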

(You could also use inference to bring down an mvi:series predicate for each episode. This would make the query far simpler, at the expense of adding an extra processing step or requiring an RDF store with inference.)
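As a rough sketch of that extra processing step, the following Python walks the same series-to-episode chain over plain tuples and emits an mvi:series triple for each episode. It is not how any particular RDF store does inference: the graph, the ex: identifiers and the function names are all hypothetical, and mvi:series is simply the predicate suggested above.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MVI = "http://mivvi.net/rdf#"

def is_member(predicate):
    """True for rdf:_1, rdf:_2, ... (the rdf:Seq membership predicates)."""
    index = predicate[len(RDF) + 1:]
    return predicate.startswith(RDF + "_") and index.isdigit()

def objects(triples, subject, test):
    """Objects of every triple with this subject whose predicate passes test."""
    return [o for s, p, o in triples if s == subject and test(p)]

def infer_series(triples):
    """Return the (episode, mvi:series, series) triples implied by the data."""
    inferred = set()
    for series, p, seasons in triples:
        if p != MVI + "seasons":
            continue
        for season in objects(triples, seasons, is_member):
            for episodes in objects(triples, season, lambda q: q == MVI + "episodes"):
                for episode in objects(triples, episodes, is_member):
                    inferred.add((episode, MVI + "series", series))
    return inferred

# Hypothetical miniature graph: one series, one season, two episodes.
graph = {
    ("ex:show", MVI + "seasons", "_:seasons"),
    ("_:seasons", RDF + "type", RDF + "Seq"),
    ("_:seasons", RDF + "_1", "ex:season1"),
    ("ex:season1", MVI + "episodes", "_:episodes"),
    ("_:episodes", RDF + "_1", "ex:episode1"),
    ("_:episodes", RDF + "_2", "ex:episode2"),
}

for triple in sorted(infer_series(graph)):
    print(triple)
```

With triples like these merged into the data, each half of the query could presumably shrink to two patterns: one matching the episode's mvi:series and one matching its dc:contributor.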

The version of Rasqal I was using had no support for multiple FROM graphs, so I merged the RDF ahead of time (cwm dapcentral/dr-katz.rdf epguides/the-larry-sanders-show.rdf extras/the-larry-sanders-show_guests.rdf >a.rdf). SPARQL doesn’t appear to have any special support for rdf:Seq, so the single-letter dummy variables (?w, ?x, ?y, ?z) stand in for rdf:_[0-9]+, meaning any indexed member of a sequence. This is only an approximation: a variable in the predicate position matches any predicate, not just the membership ones.
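To make the difference concrete, here is a small Python check for the real membership predicates, which a dummy variable only approximates. The function name is made up for illustration; the pattern follows the RDF convention that rdf:_n uses a positive integer with no leading zeros.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# rdf:_1, rdf:_2, ... : a positive decimal integer, no leading zeros.
MEMBERSHIP = re.compile(re.escape(RDF) + r"_[1-9][0-9]*")

def is_membership_property(predicate):
    """True only for container membership predicates such as rdf:_1."""
    return MEMBERSHIP.fullmatch(predicate) is not None

predicates = [
    RDF + "_1",
    RDF + "_12",
    RDF + "type",  # also present on an rdf:Seq node; a dummy variable matches it
    "http://purl.org/dc/elements/1.1/title",
]

print([is_membership_property(p) for p in predicates])
# [True, True, False, False]
```

In the query above the looser match does no harm, because the stray bindings (such as the rdf:type of the sequence) never go on to satisfy the later patterns.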

Presentation

SPARQL queries can result in tabular data or RDF graphs. For this query, and to present with XSLT, neither is perfect. Fresnel looks like it might be worth investigation but, for now, let’s go with tabular XML output (roqet -r xml-v1 multiple-appearances.sparql >contributors2.xml) and a whole load of XSLT munging.
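For comparison, the grouping that the XSLT has to do can be sketched in a few lines of Python. This assumes the final W3C SPARQL Query Results XML Format (namespace http://www.w3.org/2005/sparql-results#), which may well differ from the xml-v1 draft format that roqet produced here; the sample document and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

def group_by_performer(xml_text):
    """Collapse result rows into {performer: (larry titles, katz titles)}."""
    grouped = defaultdict(lambda: ([], []))
    root = ET.fromstring(xml_text)
    for result in root.findall("sr:results/sr:result", NS):
        # Map each variable name in this row to its text value.
        row = {b.get("name"): b[0].text for b in result.findall("sr:binding", NS)}
        larry, katz = grouped[row["title"]]
        # Rows are a cross product, so skip titles already seen.
        if row["larryTitle"] not in larry:
            larry.append(row["larryTitle"])
        if row["katzTitle"] not in katz:
            katz.append(row["katzTitle"])
    return dict(grouped)

# Hypothetical two-row result, using only three of the query's variables.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="title"/>
    <variable name="larryTitle"/>
    <variable name="katzTitle"/>
  </head>
  <results>
    <result>
      <binding name="title"><literal>Jon Stewart</literal></binding>
      <binding name="larryTitle"><literal>The Roast</literal></binding>
      <binding name="katzTitle"><literal>Guess Who</literal></binding>
    </result>
    <result>
      <binding name="title"><literal>Jon Stewart</literal></binding>
      <binding name="larryTitle"><literal>Flip</literal></binding>
      <binding name="katzTitle"><literal>Guess Who</literal></binding>
    </result>
  </results>
</sparql>"""

print(group_by_performer(SAMPLE))
# {'Jon Stewart': (['The Roast', 'Flip'], ['Guess Who'])}
```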

Results

Performer	The Larry Sanders Show	Dr. Katz
Al Franken	The Roast	Sharon Meyers
Andy Kindler	Conflict of Interest	Family Car; New Phone System; Mourning Person
Ben Stiller	Make a Wish	Ticket
Bob Goldthwait	Life Behind Larry; Like No Business I Know	Studio Guy
Catherine O'Hara	Talk Show	Bakery Ben
Dave Chappelle	Pilots and Pens Lost	Electric Bike
David Duchovny	The Bump; Everybody Loves Larry; Flip	Metaphors
Jake Johannsen	Where Is the Love?	Day Planner; Expert Witness
Jeff Goldblum	Nothing Personal; Just the Perfect Blendship	Sissy Boy
Jon Stewart	Everybody Loves Larry; The Roast; Another List; Flip; The Beginning of the End; Adolf Hankler	Guess Who; Walk for Hunger
Kevin Nealon	Life Behind Larry; Larry's Sitcom; The New Writer	Earring
Larry Miller	I Buried Sid	Everybody's Got a Tushy
Richard Lewis	Life Behind Larry	Undercover
Sandra Bernhard	Larry's on Vacation; Arthur After Hours	A Journey for the Betterment of People
Steven Wright	Life Behind Larry; Artie's Gone; Beverly's Secret	Bystander Ben; Mask
Teri Garr	The Breakdown (2)	Pullman Square
Wendy Liebman	Next Stop Bottom	Pretzelkins; Chain Letter
Winona Ryder	Another List	Monte Carlo

(Due to methodology, none of the principals were included: Jonathan Katz and Garry Shandling each guested on the other’s show, and Janeane Garofalo and Sarah Silverman were Sanders cast members who appeared on Katz.)

Conclusions

Jon Stewart and Steven Wright were the most significant cultural figures of 1990s television comedy. (Both also appeared in The Aristocrats; it’s no They Rule, but you might want to cross-reference that cast list.)

SPARQL is here and it works. Most RDF repositories had their own proprietary query languages before, but standardisation should make it easier to move between implementations.

The boundary between RDF and HTML still feels like an impedance mismatch at the structural level. I’m not sure which side needs to move, or if there’s simply a better approach that I’ve missed. It’s always possible to write code to get the presentation you need, but rarely desirable.

(Music: Paul Simon, “Graceland”)