For a couple of projects in the past year I’ve used the Google Search Appliance (from here on, the GSA). The GSA is a powerful tool – it’s like your own slice of Google, only you can customize it and, in theory, tweak the algorithm and manipulate your results. Think of it as an “SEO box.” To my understanding, the algorithm isn’t identical to the one behind Google’s public search, but in theory it works the same way.

The Versions of GSA

I’ve worked on a Google Mini, the Virtual GSA (version 5.0, no longer offered for download, for shame) and a 6.0 GSA. In terms of front-end coding, you edit the XSL/XSLT code to change the default appearance. It’s cool – you get the power of Google without the “branded” Google crap that comes with the customized Google. So: your own branded, somewhat manipulable Google box. Cool.
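To make the front-end idea a bit more concrete, here is a minimal Python sketch of how a search request to the appliance might be built. The parameter names (`q`, `site`, `client`, `output`, `proxystylesheet`) are from the GSA search protocol as I remember it; the host name and the `build_search_url` helper are made up for illustration, so check everything against the documentation for your version.

```python
from urllib.parse import urlencode


def build_search_url(host, query, frontend="default_frontend",
                     collection="default_collection", styled=True):
    """Build a GSA search URL (hypothetical helper, not part of any GSA SDK)."""
    params = {
        "q": query,              # the search terms
        "site": collection,      # which collection to search
        "client": frontend,      # which front end's settings to use
        "output": "xml_no_dtd",  # ask for XML results
    }
    if styled:
        # Ask the appliance to run the XML through the front end's XSLT,
        # which is where the custom look and feel comes from.
        params["proxystylesheet"] = frontend
    return f"http://{host}/search?{urlencode(params)}"


# gsa.example.com is a placeholder host.
print(build_search_url("gsa.example.com", "blue widgets"))
print(build_search_url("gsa.example.com", "blue widgets", styled=False))
```

With `styled=False` you should get the raw XML back, which is handy when you want to see what the stylesheet has to work with.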

What is Google?

Of course, Google is not Bing. It is not Yahoo. It is not a commerce engine – it is a relevancy machine. The sole purpose of a Google crawl is to gather documents, have you execute a query, and for Google to say:

“I think this is what you want?”

At least, that’s *MY* experience with the GSA. It has a capability referred to as “Results Biasing,” which allows you to “influence” the results. My issue with this is that you can make changes, but without arduous testing you are never sure how long it’ll take for the changed results to show up.

You can force a recrawl, but it seems the GSA “caches” the prior results for roughly 15 minutes. That means tweaking your results involves retesting every 15 minutes, and even with results biasing, you may still be out of luck. It’s a crapshoot. A coworker and I spent several hours manipulating and testing results, only to eventually make one result bump up one slot, and we were working against a tightly knit test set: we basically had 500 pages that were pure keywords plus some other custom attributes, to be used by a customized “Search As You Type” feature.
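If you end up in the same tweak-wait-retest loop, it helps to record the result order before and after each change instead of eyeballing it. Here is a small Python sketch of that bookkeeping. The function names and the sample URLs are mine, purely for illustration, and it assumes you have already pulled the ordered list of result URLs from the appliance somehow.

```python
def rank_of(url, results):
    """Return the 1-based position of url in an ordered result list, or None."""
    try:
        return results.index(url) + 1
    except ValueError:
        return None


def rank_changes(before, after):
    """Map each URL to (old_rank, new_rank) for URLs whose position changed."""
    changes = {}
    # dict.fromkeys gives an ordered, de-duplicated union of both lists.
    for url in dict.fromkeys(before + after):
        old, new = rank_of(url, before), rank_of(url, after)
        if old != new:
            changes[url] = (old, new)
    return changes


# Made-up snapshots: one taken before a biasing tweak, one after the wait.
before = ["/a", "/b", "/c"]
after = ["/a", "/c", "/b"]
print(rank_changes(before, after))
```

A URL that drops out of the results entirely shows up with `None` as its new rank, which is worth knowing about too.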

Search As You Type

Yes – this was cool. A mix of plain JavaScript and PHP that essentially created a JSON result of your Google Search query. It’s a very cool project, and it allowed for some unique experimentation with the results of a Google query. I can’t delve into it too much, but we manipulated the data to spit out the results in a different display, for a more “spread” experience and more relevant results. Very cool.
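I can’t share the real code, but the general shape of “turn the GSA’s XML into JSON” can be sketched. Ours was PHP and JavaScript; the version below is Python, purely as an illustration. The sample XML is hand-written and heavily simplified, loosely following the GSA’s result format (`R` for a result, `U` for the URL, `T` for the title, `S` for the snippet, `MT` for meta tags), and the `results_to_json` function is a made-up name.

```python
import json
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for a GSA XML response (an assumption,
# not captured output).
SAMPLE_XML = """<GSP VER="3.2">
  <Q>widgets</Q>
  <RES SN="1" EN="2">
    <R N="1">
      <U>http://example.com/blue-widget</U>
      <T>Blue Widget</T>
      <S>A very blue widget.</S>
      <MT N="category" V="widgets"/>
    </R>
    <R N="2">
      <U>http://example.com/red-widget</U>
      <T>Red Widget</T>
      <S>A very red widget.</S>
    </R>
  </RES>
</GSP>"""


def results_to_json(xml_text):
    """Convert a GSA-style XML result set into a JSON string."""
    root = ET.fromstring(xml_text)
    results = []
    for r in root.iter("R"):
        results.append({
            "rank": int(r.get("N")),
            "url": r.findtext("U", default=""),
            "title": r.findtext("T", default=""),
            "snippet": r.findtext("S", default=""),
            # Custom attributes come back as <MT N="name" V="value"/> elements.
            "meta": {mt.get("N"): mt.get("V") for mt in r.findall("MT")},
        })
    return json.dumps({"query": root.findtext("Q", default=""),
                       "results": results})


print(results_to_json(SAMPLE_XML))
```

Once the results are plain JSON, the JavaScript side can group and display them however it likes. Real responses carry markup inside titles and snippets, so a production version would need to deal with that as well.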

The Lessons Learned

I think the most difficult thing about the work was digging through the documentation. It was somewhat inaccurate at times, and important, pertinent information was very often buried in a single sentence in some random blurb. Like meta tag length limits (320 characters). Aggravating, but we were able to rewrite our original meta tag code (to be better, in my opinion) to generate structural details in the PHP instead of including them in the meta data (it was a prototype stage, okay?). The only problem was that the keywords were still being stuffed – way too many repeats and useless character phrases. This is a matter of educating the client, and clarifying that the GSA was not an “easy to manipulate” set of data. It was a complex beast, and ANY kind of manipulation should be taken with a grain of salt.
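For the keyword stuffing, a little clean-up before the meta tag is written goes a long way. Here’s a rough Python sketch (ours was PHP, and this is not that code) that drops repeated keywords and stops before the 320-character limit mentioned above. The function name is invented, and treating the limit as applying to the joined keyword string is my assumption.

```python
META_LIMIT = 320  # the meta tag length limit mentioned above


def build_keywords(keywords, limit=META_LIMIT, sep=", "):
    """Join keywords, dropping case-insensitive repeats, within a length limit."""
    seen = set()
    kept = []
    length = 0
    for kw in keywords:
        kw = kw.strip()
        key = kw.lower()
        if not kw or key in seen:
            continue  # skip blanks and repeats
        extra = len(kw) + (len(sep) if kept else 0)
        if length + extra > limit:
            break  # adding this keyword would exceed the limit
        seen.add(key)
        kept.append(kw)
        length += extra
    return sep.join(kept)


print(build_keywords(["Widget", "widget ", " Blue", "", "blue", "Red"]))
```

Keywords are kept in the order they arrive, so put the ones that matter most first in case the list gets cut off.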

By no means is this article definitive or, to be 100% honest, totally accurate. I spent many emails and phone calls with Google Support (which was a decent experience, I’d dare say surprisingly so) to arrive at many of our workarounds and to sort out our test data issues. Along the way we uncovered new information (little nuggets of wisdom) hidden in the documentation, and even a few inconsistencies in the documentation (from 2009) about how the GSA was handling result diagnostics.

It’s a powerful tool – but it’s not EXTREMELY customizable to fit every situation. I wish they had more “virtual” instances (like the VGSA they had) of all their products. I’d love to really see the full capabilities without having to drop a ton of cash just to play with some hardware.


CSS Best Practices and a Return to the Basics

October 26, 2009

This article is not about learning CSS. It’s about having a set architecture in a project when moving forward. One of the first things in beginning development is setting in place a best practice procedure when moving forward – it’s not saying “there is only one way”, it’s saying “for this project, this is how […]


Logan. The human-monkey hybrid.

October 25, 2009

Posted via web from keif’s posterous


My first posterous post.

October 25, 2009

I'm mainly testing cross-posting, and to see how damn easy it is to post on posterous. It may become my main source of "random thoughts" and let represent my main blogging articles. Debates of debates! Posted via email from keif’s posterous


IE6, VML, AlphaImageLoader and You (and Your E-Commerce Baby)

October 23, 2009

A constant debate is always “why support IE6? It’s expensive, it’s annoying.” Now, I’m not disagreeing with that. I don’t like IE6. It’s old. It’s broken. I’d prefer it went away. Unfortunately, IE6 represents over 30% of all IE traffic (which represents roughly 64% of all internet browsers). It’s not representative of my site. If […]
