Archive for August, 2011

08/08/11
Paul Savage

Using RegEx to prefix or postfix


While it’s probably not the best way to do this type of operation programmatically, you may find that you need to use a regular expression to prepend or append a string. I’ve used standard regular expression notation here, with the search and replacement patterns cited in [].

Prefix String with RegEx

Search for [(^)] and replace with [Pre: $1]; this will add Pre: to the start of all your strings.
E.g.

  • String : Blackdog.ie raises first round of VC
  • Search : (^)
  • Replacement : CNN : $1
  • Result : CNN : Blackdog.ie raises first round of VC

Postfix a string with RegEx

Search for [($)] and replace with [$1 : Post]; this will add : Post to the end of all your strings.

  • String : Blackdog.ie raises first round of VC
  • Search : ($)
  • Replacement : $1 : CNN
  • Result : Blackdog.ie raises first round of VC : CNN
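
Both replacements can be tried outside of Pipes as well. The following is a minimal sketch in Python, which is my choice for illustration and not something Pipes itself uses. Note that Python writes the first capture group as \1 where the notation above uses $1, and the variable names are made up for this example. It also assumes single-line titles with no trailing newline.

```python
import re

# Example title taken from the bullets above.
title = "Blackdog.ie raises first round of VC"

# (^) captures the empty string at the start, so the replacement
# inserts the prefix in front of the untouched title.
prefixed = re.sub(r"(^)", r"CNN : \1", title)

# ($) captures the empty string at the end, so the replacement
# appends the postfix after the untouched title.
postfixed = re.sub(r"($)", r"\1 : CNN", title)

print(prefixed)
print(postfixed)
```

Because the captured group is empty in both cases, the $1 (or \1) is not strictly needed, but it keeps the pattern in the same shape as the search and replace fields shown above.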

I use this for http://pipes.yahoo.com/, which is a handy tool for mashing RSS feeds together. I wanted to attribute where the feeds were coming from, and putting the source in the title really improves the feed.

08/01/11
Paul Savage

IrishRail.ie : a lot of potential wasted


Maybe it’s just me, but I really do get saddened when I see websites, especially large or popular websites, hitting below their mark in terms of SEO & user experience. Yesterday I stumbled across IrishRail.ie, Ireland’s national train operator. It’s currently ranked as Ireland’s 120th most popular website according to Alexa.com. I’ve used the IrishRail website in the past, and it did its job adequately, but I know it’s capable of more. Looking at the website yesterday in a bit of detail, I noticed a few items that could easily be improved upon or quickly fixed.

From my first impressions of the website, I was a little surprised that there isn’t much of that magic ingredient that search engines love, namely text. In all, the homepage has 142 words, and only 120 when you remove the navigation. This doesn’t give the search engines much to work with. Another aspect to keep in mind is that not every visitor will be able to read English. Using tools like Google Translate, visitors could translate the text on the page into their own language, but sadly most of the information is presented as images.

With regard to the language of the website, I was surprised that it isn’t available in Irish. Seeing as the Irish government is the sole owner of CIE, I would expect that this website would also fall under the Official Languages Act, 2003.

The Official Languages Act 2003 (section 9(3)) requires public bodies to ensure that where they are communicating for the purposes of providing information to the general public or to a class of the general public – in writing or by electronic mail – the communication shall be in the Irish language only or in the Irish and English languages.

But apparently it doesn’t (see end). As this is a website that is also used extensively by tourists, having German / French / Spanish versions would only help to improve conversions. For example, the German train operator Deutsche Bahn offers its site in 10 languages.

HTTP Redirects

Every site should really check that the headers they are sending are the correct ones. This means sending the right HTTP header status, see here for a list of HTTP header statuses. Some potential issues I noticed :

  • Main domain : www.irishrail.ie returns HTTP/1.1 302 Object moved, which is meant for when you move something temporarily. If the main page always redirects to /home/ then it should be a 301, a Moved Permanently status.
  • Missing pages : www.irishrail.ie/home/help.html returns an error page, saying that the page can’t be found. But the page’s header says HTTP/1.1 200 OK, which means that it was indeed found. In this case a 404 Not Found status should be returned. Doing this will help to stop error pages finding their way into the index. A wrong status can also have more serious implications, for example for your robots.txt, where search engines are expecting a file in a certain format and get something radically different while still being told that it’s a valid file.
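
The two checks above could be scripted roughly as follows. This is only a sketch: fetch_status and audit_status are hypothetical helper names I made up, and the permanent_move and page_exists flags are things you have to supply yourself, because the server cannot tell you what it should have sent. Python's http.client does not follow redirects, so it reports the raw status of the first response.

```python
import http.client
from typing import Optional


def fetch_status(host: str, path: str = "/") -> int:
    """Return the raw status code of one HEAD request (redirects are not followed)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def audit_status(status: int, permanent_move: bool = False,
                 page_exists: bool = True) -> Optional[str]:
    """Return a warning if the status contradicts what we know about the page."""
    if status == 302 and permanent_move:
        return "302 used for a permanent move; send 301 Moved Permanently"
    if status == 200 and not page_exists:
        return "200 OK on a missing page; send 404 Not Found"
    return None


# Statuses as reported in the bullets above, audited without a network call.
homepage_warning = audit_status(302, permanent_move=True)
missing_page_warning = audit_status(200, page_exists=False)
print(homepage_warning)
print(missing_page_warning)
```

To audit a live site you would pass the result of fetch_status("www.irishrail.ie", "/") into audit_status. Some servers answer HEAD differently from GET, so treat the result as a hint rather than proof.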

VIP : Visually Impaired site

I was really pleased to see that they had a section for the visually impaired.

This is a high contrast, low graphic version of their website, as I was expecting.

On further inspection I was a little surprised that not all the links worked. The links to “Your Journey”, “Projects” & “Opportunities” all brought me back to the main website. Clicking through to the full version of the Breaking News also returned me to the home page. While I commend the effort in building such a website, and things like the timetable search work perfectly, I do think that a site like this should either be done properly or not at all. Doing it properly could still mean not having every page available on the VIP version. But if you do include links to resources then they should really be consistent with the ‘expected experience’.

A quick look at the HTML code

Looking closer at the HTML code, there is plenty of room for optimisation. My quick run down would include :

  • Moving the 700+ lines of CSS code in the header into its own CSS file. This will allow the browser to cache the file, so the 2nd, 3rd and subsequent pages will load faster.
  • Moving the 1200+ lines of inline JavaScript code into included JavaScript files. This will allow the browser to cache these files, so the 2nd, 3rd and subsequent pages will also load faster.
  • Moving the included CSS & JS files into the header of the page. This will allow these files to be processed before the page is rendered. The one main exception would be the JS include for Google Analytics.
  • Removing redundant code. Commenting out code sections is great while you are testing or about to launch, but it also unnecessarily bloats the code.
  • Using tables for tabular data, and DIVs otherwise. This website’s code uses tables extensively for laying out objects in a grid format, which wasn’t the intended use case for tables in HTML.
  • Avoiding inline styles as much as possible; it is better to use CSS.

The total size of the homepage is 0.45 MB and it required 50 HTTP requests in total. Both could definitely be reduced with the suggestions above to make the page load faster.
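
If you want to measure how much inline CSS and JavaScript a page carries, something along these lines would do. It is a rough sketch using Python's standard html.parser module; the InlineCodeCounter class and the sample markup are made up for illustration and are not taken from the IrishRail homepage.

```python
from html.parser import HTMLParser


class InlineCodeCounter(HTMLParser):
    """Collects the text inside <style> tags and <script> tags without a src."""

    def __init__(self):
        super().__init__()
        self.current = None
        self.buffers = {"style": [], "script": []}

    def handle_starttag(self, tag, attrs):
        # A <script src="..."> is an external include, so it is not counted.
        if tag == "style" or (tag == "script" and "src" not in dict(attrs)):
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current:
            self.buffers[self.current].append(data)

    def line_counts(self):
        """Return the number of non-blank inline lines per tag type."""
        return {
            tag: sum(1 for line in "".join(chunks).splitlines() if line.strip())
            for tag, chunks in self.buffers.items()
        }


# Made-up sample page: two lines of inline CSS, one line of inline JS.
sample = """<html><head>
<style>
body { color: black; }
p { margin: 0; }
</style>
<script src="site.js"></script>
<script>
var a = 1;
</script>
</head><body><p>Hello</p></body></html>"""

counter = InlineCodeCounter()
counter.feed(sample)
counter.close()
counts = counter.line_counts()
print(counts)
```

Running this against a saved copy of a homepage before and after a clean-up gives a quick number to track, though it says nothing about inline style attributes on individual tags.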

Using AdSense

As the benefit of using AdSense on a sales website can be debated back and forth, I’m not going to go into that here. I do feel that the implementation here is suboptimal. In the screenshot above the top ads do blend in with the site, but the right hand side and bottom banners are image/rich media ads which don’t blend in as easily. One option here would be to turn off the image/rich media ads option. This would result in a uniform look to the page while also keeping the ads intact. As with most large companies, I’m sure they would prefer to have their website in line with their corporate visual identity. It is also worth asking: are 3 ad units really necessary? Why are they using the maximum number of ads? Do they all really convert?

Here we have the AdSense banners highlighted in green with the play button, and the internal banners highlighted in yellow and green marked with an X. While the 3rd party ads are outside the main content, they can be a distraction, and potentially annoying for customers. These ads are above the fold, which results in that little bit of extra scrolling to see the page’s content.

Also I’m not sure as to why you would include AdSense in an IFRAME as this will only skew your AdSense targeting. Also in this case this would be a 4th AdSense Banner, which won’t be displayed (happens automatically), as you are only allowed to use up to 3 AdSense banners on 1 page.

As IrishRail are fans of using AdSense, one tip would be to introduce some Link Units, blended in with some navigational items. These convert quite well; we use them extensively on our Irish Jobs site.

Trying the search function

I tested out the search function, and got a little bit confused. I did a search for the word [contact] and the first result appeared to be the contact page, but it’s located under an obscure URL http://search.irishrail.ie/highlight.aspx?aid=2901210&pckid=147268114&rn=1&sp_id=147267858&lid=113272535&highlight=contact#firsthighlight rather than the expected URL http://www.irishrail.ie/contact_us/ . It searches via a sub-domain search.irishrail.ie , which thankfully blocks indexing via their robots.txt, so this will avoid the duplicate content issue. I did some further looking around to try to see if I could search via http://search.irishrail.ie/ directly, but it comes up with a Danish language search page, and returns results for the domain http://blog.siteimprove.co.uk/ & not IrishRail.ie.
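
To check for yourself whether a robots.txt keeps a given URL out of a crawler's reach, Python's urllib.robotparser can evaluate the rules offline. The sketch below assumes a blanket "Disallow: /" rule for the search sub-domain; I have not reproduced the actual file here, so treat the rules and the is_blocked helper as illustrative only.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_lines, url, agent="Googlebot"):
    """Return True if the given robots.txt lines disallow the URL for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return not parser.can_fetch(agent, url)


# Assumed rules: block everything on the search sub-domain ...
search_rules = ["User-agent: *", "Disallow: /"]
# ... and block nothing on the main site (an empty Disallow allows all).
main_rules = ["User-agent: *", "Disallow:"]

search_blocked = is_blocked(
    search_rules, "http://search.irishrail.ie/highlight.aspx?aid=2901210"
)
main_blocked = is_blocked(main_rules, "http://www.irishrail.ie/contact_us/")
print(search_blocked, main_blocked)
```

For a live check you would call set_url and read on the parser instead of parse, which fetches the real file over the network.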

All in all, this website gives the appearance that it was either half-heartedly done or has grown out of control by adding on bits and pieces along the way. Either way this leads to a diminished user experience and damaged SEO potential.

A message from IrishRail

I did reach out to Irish Rail before I published this post and they offered the following comments:

  • They are planning on relaunching a brand new website in September 2011.
  • The website has become overgrown and disorganised as they’ve progressed, and the new website will address this.
  • They don’t fall under the Official Languages Act, as they are not a governmental institution. Update : Websites don’t fall under the OLA, rather written publications and electronic mail do.
  • The new website will have the facility to be multilingual, but they have not yet decided whether it will be available in multiple languages at launch.
  • And yes, the AdSense wouldn’t be there if it wasn’t pulling its own weight, and it’s quite a helpful source of income.