Higher Order: Google-Facebook Mashup, Anyone?

Social networking sites do provide working models of disparate interconnectedness.

The Atlantic features a smartly titled article, About Facebook–as important as Google at “ordering the web”? Hmmmm. Originally launched as a campus tool to connect college students, Facebook has all but busted out for everyone now. I won’t argue its powers to order or the smorgasbord of mini applications being gobbled up for personal pages. But as far as organization of data goes, doesn’t any type of directory possess some mettle in the information management department?

Maybe my biggest beef with the notion is that Facebook must still inspire membership, must convince me that it’s useful in ways better than any process I have installed in my life right now. Interested parties must “buy into” the business, create a login and profile and actually provide information to order. FB’s new public search remains limited and FB members may block their profiles and information at any time. As far as I can foresee, member privatization undoes the site’s ability to access pockets of data that could potentially make it as big as Google–if it were possible.

I have to say I don’t use Facebook. I don’t use any type of social networking site, per se, and regardless of the millions of users the site may boast, there are still millions who abstain. We do, however, use search engines. So if Facebook is supposedly the next Google, I for one will not be included in any search results, and neither will millions of others, leaving a very incomplete (albeit well-ordered) planet of search. Maybe a Google-Facebook mashup?


The Question Asked

What Does Hakia Have that Google Does Not and What Does Google Have that Hakia Does Not?

Google’s Basics of Search tips say that words like who, what, and how are summarily dropped from Google search queries simply because this is how keyword-centric engines operate. I now know this is occasionally the reason for weak and untargeted results that send me clicking on over to Hakia.com to give the semantic search engine a drive. But maybe this is exactly what I’ll do the rest of my search years. There are instances in which Google returns more satisfying results and vice versa, so which search engine is better? Maybe it’s exactly in the question asked.

Who, what, how, and why questions clearly get more specific results with Hakia, but not always enough to fully answer my question, and some searches have still left me with nothing.

My simplistic question posed to both, who defines a minority student? clearly illustrates the divergent results. Google is able to return a couple of results that happen to directly reflect my query with the phrase define minority students, but without any direct association with a who.

Hakia, on the other hand, returns highlighted results that are specifically associated with the who part of the query.
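To see why the who gets lost, here is a minimal Python sketch of what a keyword-centric engine might do with my query. The stop-word list and both function names are my own assumptions for illustration; neither Google nor Hakia publishes its actual list or code.

```python
from typing import List, Optional

# Hypothetical lists for illustration only; real engines keep their own.
QUESTION_WORDS = {"who", "what", "how", "why"}
STOP_WORDS = QUESTION_WORDS | {"a", "an", "the"}


def keyword_terms(query: str) -> List[str]:
    """Return the terms left after question words and articles are dropped."""
    words = query.lower().strip("?!. ").split()
    return [w for w in words if w not in STOP_WORDS]


def question_word(query: str) -> Optional[str]:
    """Return the leading question word, if the query starts with one."""
    words = query.lower().split()
    if words and words[0] in QUESTION_WORDS:
        return words[0]
    return None


query = "who defines a minority student?"
print(keyword_terms(query))   # ['defines', 'minority', 'student']
print(question_word(query))   # who
```

The first function throws away exactly the piece of the query that the second one keeps, which is roughly the gap I see between the two sets of results.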

Proprietary Processes Hakia Boasts

A deeper dip into Hakia reveals a bit of the proprietary processes on which this search engine is built:

OntoSem, or Ontological Semantic parser, is “a linguistic theory of meaning in natural language.” OntoSem maintains a highly developed “language-independent ontology of thousands of interrelated concepts; an ontology-based English lexicon of 100,000 word senses, and counting (plus, the lexicons for several other languages under construction); and an ontological parser which ‘translates’ every sentence of the text into its text meaning representation, approximating the complete understanding of the sentence by the native speaker.”

QDEX, or Query Detection and Extraction, does a thorough “decomposition” of the WWW prior to any search queries being posed and stores all its possible queries, waiting for a user to ask some semantic twist of its data. “The critical point in QDEX system is to be able to decompose sentences into a handful of meaningful sequences without getting lost in the combinatory explosion space.” QDEX interfaces with OntoSem in the miasma of semantic meaning. OntoSem is able to determine which of the billions of semantic options are most meaningful and worthy of indexing.

Hakia’s QDEX
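As a rough illustration of indexing sequences before anyone searches, here is a small Python sketch. It is not Hakia's method, which is proprietary: I am assuming plain contiguous word sequences as a stand-in for "meaningful sequences," and the max_len cap is my own crude substitute for whatever OntoSem does to avoid the combinatory explosion. The function names and sample documents are hypothetical.

```python
from collections import defaultdict


def sequences(sentence, max_len=3):
    """Yield every contiguous word sequence of up to max_len words."""
    tokens = sentence.lower().split()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def build_index(docs, max_len=3):
    """Map each sequence to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, sentence in docs.items():
        for seq in sequences(sentence, max_len):
            index[seq].add(doc_id)
    return index


# Made-up sample documents, indexed ahead of any query.
docs = {"d1": "wiles proved the theorem", "d2": "the theorem is old"}
index = build_index(docs)
print(sorted(index["the theorem"]))         # ['d1', 'd2']
print(sorted(index["proved the theorem"]))  # ['d1']
```

Even in this toy version the number of stored sequences grows quickly with max_len, which hints at why deciding what is worth indexing is the hard part.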

If Semantic Rank sounds similar to Google’s PageRank, the similarity stops there. While Google is very good at determining the authority (which may not indicate relevancy) of a webpage based on linking strategies, Hakia and other semantic search engines have no such algorithmic variables. Semantic Rank instead ranks results by pure meaning, “based on advanced sentence analysis and concept match between the query and the best sentence of each paragraph.”

Hakia SemRank
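The "best sentence of each paragraph" idea can be sketched in a few lines of Python. To be clear about the assumptions: I am using simple word overlap in place of Hakia's concept match, which is not public, and the function names and sample paragraphs are made up for illustration.

```python
import re


def terms(text):
    """Lowercase a text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def best_sentence(query, paragraph):
    """Return (score, sentence) for the sentence sharing most words with the query."""
    q = terms(query)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    scored = [(len(q & terms(s)), s) for s in sentences]
    return max(scored, key=lambda pair: pair[0], default=(0, ""))


def rank_paragraphs(query, paragraphs):
    """Order paragraphs by the score of their single best sentence."""
    best = (best_sentence(query, p) for p in paragraphs)
    return sorted(best, key=lambda pair: pair[0], reverse=True)


paragraphs = [
    "Theorems are proved by mathematicians.",
    "Fermat was a lawyer. Andrew Wiles proved Fermat's last theorem.",
]
for score, sentence in rank_paragraphs("who proved fermat's last theorem", paragraphs):
    print(score, sentence)
```

Notice that nothing here looks at links or page authority; a paragraph rises or falls on its one best sentence.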

How Keywords May be Replaced by Old Fashioned WORDS

In the Google and Yahoo! universe, keywords have become a commodity–the monetary muscle that drives search development, much the same way as Big Oil has driven energy–up till now. A couple posts ago I referred to the next big push in search–natural language. This sounds way academic for most commoners, but really what it means is this: your next gen search engine may actually be more fine-tuned to word meanings than your current search vehicle.

I’m a big time Google-head. I do nothing but online research day in and day out. My search capabilities have matured in the last couple years. I’ve gone from one and two-word search queries to whole sentences. For example, today I wanted to know about a rumor I’d heard about a high-end grocery store going in, so I typed in: plans for fresh market chapel hill nc. I was immediately returned a page of results that included two local newspaper articles with the keyphrases: fresh market and chapel hill. I consider that a successful search. But have I become keyword-centered?


When I checked into the Powerset blog this morning I found one of the posts most illustrative of the language flexibility that the search engine will have. The example of a Powerset search query for who proved fermat’s last theorem? returns results that not only correspond to the terms fermat and last theorem, but also understand the question is about a “who.” Not very impressive maybe until I plugged the exact same phrase into a Google search box. My results only corresponded to the keywords fermat’s last theorem, with no apparent recognition of the fact that I had asked a question about a “who.” Results were not nearly as concise as those delivered by Powerset. This, then, is a small indicator of the linguistic muscle being built into next gen semantic web.

The Next Big

The next big, humongous, large, giant, monstrous, gigantic, monumental…

The current biggest thing around is really Google. But if you read anything about Google’s history, you quickly realize that the company that has come to revolutionize the way we find information on the internet, how we shop, and even think, was literally spawned from the doctoral work of a couple of computer geeks at Stanford. Add a bit of innovation, a heaping cup of chutzpah, and voila! A culture is ignited. However, given the idea that what is on top rarely stays on top, one has to wonder: what’s beyond Google? What academic is out there already ruminating on the how, where, and how much of the next big search engine? Deep web search trawler?

I’m interested in trying to examine some of the ideas that may be percolating along the periphery of technological research. Right now, and thanks to innovators like Google, significant databases of scholarly papers (i.e., Google Scholar) are available, as well as a wealth of information on dynamic sites, like blogs. This is what I’m up to with The Next Big Humongous…