triples | The Next Big...

Web Search: It’s a Maturity Thang

Posted on September 27, 2007 by jrotman

Growing pains: Search engines, web design, web writers, internet users.

If we could plot the web savvy of the virtual collective, at what point in a lifecycle–human, let’s say–would we be? Infancy? Dangerous toddler? Terrible twos? Confused tween? Self-destructive teen?

Factors that limit the comprehension and comprehensiveness, and challenge the “learning” abilities, of both keyword-based search engines and semantic (linguistic meaning) or natural language (Powerset’s term) search engines:

Lack of key keywords on a page. I’ve tried in vain to extract non-existent information from websites. If a site is about the real estate market in Soho, then why are those words missing? Or take a professional association website (this is common) with no informational content at all, just crap that applies only to “members.” You’re public, folks, and people like me are researching your services. “I need the info,” so says Dr. Evil, and so do the search engines. This applies to both keyword and semantic search. There must be some toehold by which each can scale the search cliff face, right?

Acronyms. I was using Hakia the other day and quickly realized that it had not yet “learned” the acronym MEMS (micro-electro-mechanical systems), a rapidly growing nanotechnology niche. But many websites relevant to this topic use only the acronym.

Just a matter of time before there is true semantic maturity.
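To make the acronym problem concrete, here is a toy sketch of plain keyword matching with and without acronym expansion. It is not how Hakia or any real engine works; the `ACRONYMS` table, the function names, and the sample page text are all made up for illustration.

```python
# Hypothetical acronym table; a real engine would have to learn or curate this.
ACRONYMS = {"mems": "micro-electro-mechanical systems"}


def tokenize(text):
    """Lowercase the text and split it into bare words."""
    return [word.strip(".,;:()").lower() for word in text.split()]


def expand_query(query):
    """Return the query terms plus any known acronym or expansion for them."""
    terms = set(tokenize(query))
    lowered = query.lower()
    for acronym, expansion in ACRONYMS.items():
        if acronym in terms:
            # Query used the acronym: add the spelled-out words.
            terms.update(tokenize(expansion))
        if expansion in lowered:
            # Query spelled it out: add the acronym.
            terms.add(acronym)
    return terms


def matches(page_text, query, expand=False):
    """True if the page shares at least one term with the (expanded) query."""
    terms = expand_query(query) if expand else set(tokenize(query))
    return bool(terms & set(tokenize(page_text)))


# Invented page text that, like many real sites, uses only the acronym.
PAGE = "Our MEMS foundry ships sensors worldwide."
QUERY = "micro-electro-mechanical systems"

print(matches(PAGE, QUERY))               # False: no shared keyword
print(matches(PAGE, QUERY, expand=True))  # True: the query now includes "mems"
```

The page never spells the term out, so the plain keyword match has no toehold until the query side knows what the acronym stands for.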

Beneath the visual images and words on each webpage lies the infra-data (data below, beneath): HTML, JavaScript, CSS, image files, metadata, and a growing lexicon of RDF, or the semantic web stuffing.

RDF (Resource Description Framework): It’s Simple, right?

I guess I thought it would be easy to distill a simple definition of RDF, but I’m still learning it myself. I have to say, faced with consistently mind-numbing strings of terms and acronyms such as XML, OWL, schema, interoperability, directed graphs, triples, and URIs, to name a few, I’m challenged. And An Idiot’s Guide to the Resource Description Framework is unpalatable as well.

RDF is a complicated W3C specification: a framework for describing resources on the web. RDF statements are written as triples, which hardwire meaning and nuance to words in a form that is accessible and usable across all web strata. Triples are the linguistic containers into which language is poured: subject, predicate, object. Here is a really good visual snippet from an RDF beer ontology:

<statement>
<subject type="uri">http://www.purl.org/net/ontology/beer.owl#IndiaPaleAle</subject>
<predicate>http://www.w3.org/2000/01/rdf-schema#subClassOf</predicate>
<object type="uri">http://www.purl.org/net/ontology/beer.owl#Ale</object>
</statement>
<statement>
<subject type="uri">http://www.purl.org/net/ontology/beer.owl#Barley</subject>
<predicate>http://www.w3.org/1999/02/22-rdf-syntax-ns#type</predicate>
<object type="uri">http://www.w3.org/2002/07/owl#Class</object>
</statement>
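Read as plain language, the two statements say "India Pale Ale is a subclass of Ale" and "Barley is a class." As a rough sketch of how a program might pull those triples out, here is a short Python example using only the standard library. The `<statements>` wrapper element and the helper names `read_triples` and `local_name` are my own assumptions, added so the fragment parses as a single XML document; they are not part of the ontology or of any RDF toolkit.

```python
import xml.etree.ElementTree as ET

# The two statements from the snippet, wrapped in a hypothetical
# <statements> root so the fragment is one well-formed document.
SNIPPET = """
<statements>
  <statement>
    <subject type="uri">http://www.purl.org/net/ontology/beer.owl#IndiaPaleAle</subject>
    <predicate>http://www.w3.org/2000/01/rdf-schema#subClassOf</predicate>
    <object type="uri">http://www.purl.org/net/ontology/beer.owl#Ale</object>
  </statement>
  <statement>
    <subject type="uri">http://www.purl.org/net/ontology/beer.owl#Barley</subject>
    <predicate>http://www.w3.org/1999/02/22-rdf-syntax-ns#type</predicate>
    <object type="uri">http://www.w3.org/2002/07/owl#Class</object>
  </statement>
</statements>
"""


def read_triples(xml_text):
    """Return a list of (subject, predicate, object) URI tuples."""
    root = ET.fromstring(xml_text)
    triples = []
    for statement in root.iter("statement"):
        triples.append(tuple(
            statement.findtext(part).strip()
            for part in ("subject", "predicate", "object")
        ))
    return triples


def local_name(uri):
    """Keep only the part after '#', e.g. '...beer.owl#Ale' -> 'Ale'."""
    return uri.rsplit("#", 1)[-1]


for subject, predicate, obj in read_triples(SNIPPET):
    print(local_name(subject), local_name(predicate), local_name(obj))
# IndiaPaleAle subClassOf Ale
# Barley type Class
```

Stripped of the long URIs, each statement is just three slots, which is the whole point of the subject, predicate, object container.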

Filed under: Data, linguistics, Metadata, Semantic Web | Tagged: Metadata, ontology, rdf, Semantic Web, triples
