SSI, CGI and PERL, Oh My!

Will the wonders of the world wide web ever cease? There are a lot of things about web technology that just don't make a lot of sense. A lot of unixisms show through like a bad one-coat paint job. (And not everything about Unix "makes sense" either. It's just the way things are done. After all, the internet is a Unix legacy.) What many people seem to love about the web is the way all these different technologies can work together to achieve results. My pet peeve is the way they don't work together. Or, worse still, the way they don't work the same in every implementation, forcing programmers like me into corners or making us resort to horrible hacks as workarounds.

My latest saga starts some time ago when I let the beast called server side includes (or SSI to obfuscate) out of its cage. SSI seems like a reasonable solution to certain kinds of problems. For example, I have a menu that I want to appear on several web pages, but if I change the menu, I don't want to have to go to every web page where it appears and make the same change. There is too much room for error and a lot of wasted time. Enter SSI. I can put the html (obfuscated from hypertext markup language) text for the menu into one file and include it in several web pages. All that has to be done is to tell the web server to pre-process the web page before serving. Doesn't sound like a problem.
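To make the menu idea concrete, here is a toy sketch of the pre-processing step, written in Python purely for illustration. It is not how Apache implements SSI. The directive form `<!--#include virtual="..." -->` is the one Apache uses, but the function name, the file name `menu.html`, and the dictionary standing in for files on disk are all my own assumptions.

```python
import re

# Matches directives such as <!--#include virtual="menu.html" -->
INCLUDE_RE = re.compile(r'<!--#include\s+(?:virtual|file)="([^"]+)"\s*-->')


def expand_includes(page_text, fragments):
    """Replace each include directive with the text of the named fragment.

    `fragments` is a dict standing in for files on disk (an assumption,
    so the sketch runs without touching the file system).
    """
    def substitute(match):
        name = match.group(1)
        # Leave a visible marker when the fragment is unknown.
        return fragments.get(name, "[missing include: %s]" % name)

    return INCLUDE_RE.sub(substitute, page_text)


# One shared menu, included by as many pages as you like.
fragments = {"menu.html": "<ul><li>Home</li><li>About</li></ul>"}
page = '<body>\n<!--#include virtual="menu.html" -->\n<p>Welcome.</p>\n</body>'

expanded = expand_includes(page, fragments)
print(expanded)
```

Change the menu text in one place and every page that includes it picks up the change the next time it is served.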

Well, you have to distinguish the pages that need pre-processing somehow. Maybe change their extension from html to shtml or maybe set the file's execute bit - I dunno, you pick. And what's this? For some reason when I turn on SSI my JPEG files start showing up as text, but my jpeg files don't. ?? Now once SSI is going you notice that aside from just including one html file into another, you can call little programs with SSI to generate html and include it. These generators give you dynamically generated pages (not to be confused with dynamic html, or dhtml to obfuscate, which usually means scripting that runs in the browser). If you are using Apache then there are two different ways to run these little programs, both of which don't work equally well. (Just like there were two ways to include a file with some subtle difference. Why is that keyword "virtual" anyway?)

So I wanted to write a program that would randomly insert different ads into my website. (2010 Note: I don't do ads anymore.) That way every time you load a page you would get (possibly) a different ad. I use generated ads from Google and Amazon. The calling interface to these little programs is called the common gateway interface, or CGI to further obfuscate. Now normally when a web page is processing a form, your CGI receives parameters from the web page's entry fields. You see it in the URL after the '?' (from uniform resource locator - uh, oh forget it). For example 'http://www.grab.your.personal.info.com?mothersmaiden=blah+blah+blah'
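For the curious, here is a small sketch of what happens to the part after the '?'. Under CGI the web server hands that text to the program in the QUERY_STRING environment variable, and the program has to decode it. The example below uses Python's standard library to do the decoding on the made-up URL from above; it is only an illustration, not the code I ran.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.grab.your.personal.info.com?mothersmaiden=blah+blah+blah"

# Everything after the '?' is the query string.
query_string = urlsplit(url).query
print(query_string)  # mothersmaiden=blah+blah+blah

# parse_qs splits on '&' and '=', and turns '+' back into spaces.
params = parse_qs(query_string)
print(params)  # {'mothersmaiden': ['blah blah blah']}
```

Each value comes back as a list because a form may send the same field name more than once.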

But guess what? When calling a CGI from an SSI, the parameters can't be passed as they normally would be from a form. Well, in theory, if you are using the "virtual" keyword it should be possible to tack them on yourself, but in practice I have not been able to get this to work. I must need to set the make-it-work-in-practice flag somewhere in Apache or something. Now a CGI program can be written in almost any language, but for mostly pragmatic reasons a whole lot of them are written in PERL, which stands for Practical Extraction and Report Language, and its acronym is, oddly enough, pretty easy to comprehend. The main reason for using PERL is that the language is almost universally available wherever web services are sold.

So to get around grabbing CGI parameters and parsing them (which never worked anyway) I used a nice facility of PERL which allows me to fetch an external variable known as an environment variable. And I can set these environment variables using the "set" keyword in SSI. Great! So now everything is in place and I can write my little PERL program. PERL is interpreted so any errors you make while programming will show up in a log somewhere. After tracking down bugs I had everything working. That is until I noticed a really odd behavior.
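Here is a rough sketch of the arrangement, in Python rather than the PERL I actually used. The page would contain something like `<!--#set var="AD_SIZE" value="160x600" -->` followed by an include of the ad program. The variable name AD_SIZE, the placeholder ad snippets, and the function names are all hypothetical; real ad code would come from the vendor.

```python
import os
import random

# Hypothetical ad snippets keyed by size; real ones would be vendor code.
ADS = {
    "468x60": ["<!-- banner ad A -->", "<!-- banner ad B -->"],
    "160x600": ["<!-- skyscraper ad A -->", "<!-- skyscraper ad B -->"],
}


def pick_ad(environ, rng=random):
    """Choose a random ad for the size named in the environment."""
    # AD_SIZE is an assumed variable name, set by the including page.
    size = environ.get("AD_SIZE", "468x60")
    # Fall back to the banner size if the requested size is unknown.
    choices = ADS.get(size, ADS["468x60"])
    return rng.choice(choices)


def render(environ):
    """Build a CGI response: a header line, a blank line, then the body."""
    return "Content-Type: text/html\n\n" + pick_ad(environ) + "\n"


# When run by the web server, the real environment supplies AD_SIZE.
print(render(os.environ), end="")
```

Passing the environment in as an argument, rather than reading it deep inside the function, makes the ad picker easy to try out from the command line without a web server.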

You see, these Google ads are delivered as embedded JavaScript, yet another language you can use with web pages. Now I was trying out different browsers and refreshing my page and watching my ads change. Cool. Except every once in a while a strange thing would happen. I would get a 160x600 pixel Google ad embedded in the 468x60 frame. Or vice versa. Yet if I looked at the source of the page everything seemed perfect. There were the correct codes in the correct place. I have to conclude that there is a bug in Safari (or perhaps in Apache, although I haven't seen this fail in other browsers) which causes a race condition in which the script that sets the variables is lagging and the other script is using the wrong values for these variables. Opening the page anew never fails, and linking to it from elsewhere hasn't failed either. Simply refreshing does this.

Now if you think about it, what does it mean to refresh a page? OmniWeb seems to just redisplay the exact same page. The ads won't change until I load the page again from scratch by either creating a new browser window or linking to it from elsewhere. On the other hand, you might actually call refresh because the page content was corrupt. And what happens when you refresh a page that has some content on it that you actually want to see change? For example, an eBay auction. Perhaps reloading it every time makes more sense. But I definitely think that when the page source and the display don't actually match, you have a problem.

Perhaps there is some more documentation I'll find that will shed light on why it needs to be this way. Or is there just a make-the-bad-software-go-away flag I forgot to set?
