
I was thinking today how robust (read: cool) Lucene and SOLR are for indexing and searching content. You write the connector, parse the content, and place it in a Lucene index, either on disk (flat-file) or in memory (RAM directory).
When I developed a search solution for Buckinghamshire County Council, which would integrate with their Livesite CMS, I didn't fully realise how many search solutions would be powered by this technology. Get yourself over to here, and see the example code for creating and searching an index.
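To give a feel for what "placing content in an index" means, here is a toy in-memory inverted index in Python. It is only a sketch of the general idea, not Lucene's API or its file format; the class name TinyIndex and the sample page ids and text are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy in-memory inverted index: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index a document by recording its id against every term."""
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return the ids of documents containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


# Hypothetical documents, standing in for whatever the connector parsed.
index = TinyIndex()
index.add("page-1", "Council tax bands and payments")
index.add("page-2", "School term dates and council holidays")
print(sorted(index.search("council")))
print(sorted(index.search("council tax")))
```

A real engine adds analysis, scoring, and persistence on top, but the term-to-documents mapping is the core of it.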
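Don't forget to send a correct text/html header with your Python CGI:
------
#!/usr/bin/python
# The header line must be followed by a blank line.
# The string ends with one newline and print adds the second.
print("Content-type: text/html\n")
------
Find out where your Python binary is located so the first line points at it, do a chmod 755 on the .py file, and make sure the header is followed by a blank line, which means two newlines in total.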
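If you want the header handling in one place, something like the following would do. It is a minimal sketch for Python 3; the function name cgi_response and the sample page are my own, not part of any library.

```python
#!/usr/bin/env python3
import sys


def cgi_response(body, content_type="text/html"):
    """Return a CGI response: the header line, one blank line, then the body."""
    return "Content-Type: {}\n\n{}".format(content_type, body)


if __name__ == "__main__":
    # Hypothetical page body, just to show the shape of the output.
    page = "<html><body><p>Hello from CGI</p></body></html>\n"
    sys.stdout.write(cgi_response(page))
```

Writing the whole response through one function makes it hard to forget the blank line that separates the headers from the body.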
I like the BBC news website. It offers the closest to impartial journalism there is.
Whilst munching my tuna baguette, I read two technology stories which seem relevant today. One was how the Microsoft Kinect has been hacked to offer Linux support. This allows the technology to be used by DIY hackers in areas such as motion detection and robotics. The applications are endless, but the common denominator is "on a budget". This open approach to systems integration is nothing new, but as with Nokia's "open" N900 phone hack competition, the possibilities are endless.
The other story was about a charity called Refugees Reunites, whose aim is to reunite refugees with their families, wherever in the world they are. The website has a search facility, but the thing I like is that you can register with the website via text/SMS. This single feature will allow people in the "back end of beyond" to register, and increase their chances of filling the void created by a displaced family (I have no idea what this feels like).
Mobile phones are common in rural parts of developing countries, and a text message will get through when a mobile call will not. The text message standard is also backwards compatible across new and older mobile networks, and can be used virtually anywhere. So, while students are demonstrating, the credit crunch crunches, and lots of things have the potential to go to the wall, technology is helping. I never said it was the solution, but you need to start somewhere, and focussing on robust technology offers a trail to blaze.
Someone recently asked me what job I did....
Do I sit in the corner, eyes glued to a screen? Sometimes, but more time is spent working with colleagues, adding clarity to a customer's requirements, and fixing something which is "apparently" broken.
For me the main difference is that I don't sit in a corner coding all day. The software being written is likely part of a bigger system, and links into other things like a database, web server, or other system. Being aware of that is essential.
A couple of years ago I wrote a web crawler in core Java, which used an HTMLParser to parse HTTP headers, text held in divs, and complete web pages. It was multi-threaded, and did what it said on the tin: "feed a SOLR/Lucene search index with content".
Moving on to my last job, I wrote a number of socket-based "service status checker" tools in Perl, and a bulk data exporter/importer (via a web form) using Watij.
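For flavour, a socket-based status check of that kind might look roughly like this in Python 3. It is a sketch, not the original Perl tools; the function name port_is_open and the host and port list are assumptions for illustration.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed name lookups.
        return False


if __name__ == "__main__":
    # Hypothetical services; replace with the hosts and ports you care about.
    for name, port in [("ssh", 22), ("http", 80)]:
        state = "up" if port_is_open("localhost", port) else "down"
        print("{:5s} {:5d} {}".format(name, port, state))
```

This only proves that something is listening on the port. A fuller checker would also speak enough of the protocol to confirm the service is answering sensibly.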
The nice thing about Python is the speed at which you can script simple tools, which fall into the category of service checkers, or screen scraping.
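Type the following into the Python prompt (>>>):
# Python 2, with the BeautifulSoup 3 package installed
from BeautifulSoup import BeautifulSoup
import urllib2
address = ""  # set this to the URL of the page you want to scrape
# Fetch the raw HTML of the page
html = urllib2.urlopen(address).read()
# Parse it and print every <p> element
soup = BeautifulSoup(html)
print soup.findAll("p")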
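The snippet above relies on Python 2 and the older BeautifulSoup 3 package. If you only have Python 3 and its standard library to hand, roughly the same thing can be sketched with html.parser. The class and function names below are my own, and this swaps BeautifulSoup for the stdlib parser, so it is less forgiving of messy markup.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class ParagraphCollector(HTMLParser):
    """Collect the text content of every <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._current = None  # list of text chunks while inside a <p>

    def _flush(self):
        """Store the paragraph being collected, if any, with tidy whitespace."""
        if self._current is not None:
            text = " ".join("".join(self._current).split())
            if text:
                self.paragraphs.append(text)
            self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # an unclosed <p> ends when the next one begins
            self._current = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def close(self):
        super().close()
        self._flush()


def paragraphs(html):
    """Return the text of each <p> element found in an HTML string."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


def fetch_paragraphs(url):
    """Download a page and return the text of its paragraphs."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return paragraphs(response.read().decode(charset, errors="replace"))
```

Call fetch_paragraphs with the address of the page you want, or feed paragraphs a string of HTML you already have.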
I have just started tinkering with Erlang. Once I got used to putting a "." at the end of each line :O) - I was impressed at how compact it was.
I can recommend the Eclipse plugin - Erlide.
My current role as an automation tools developer means that I spend a fair bit of my time “gluing systems together” – or getting them to communicate with each other in an effective way.
This involves ports being open, and protocols like telnet or SSH being available.
In an outsourced IT environment I don’t have the luxury of sitting near operations or systems people (I wish I did) – they are likely to be offshore.
My second issue is the time we spend monitoring support queues.
Thirty percent of my current role is support. About half of this is first line support.
It’s not glamorous, and if you are not careful you will leave the (development) zone. There are several support queues – which, in other words, means I need an application capable of monitoring the incoming tasks.

I am not going to throw away Services Alive – I will still use the Swing application, but with a “tabbed look” so you can select what you want to see.
I want trackerDeluxe to be lightweight, easy to configure, and free of bloat.
Come back soon to see how things are progressing…..
Driving home, I idly wondered why a simple tool did not exist which would show me the status of a number of websites.
Enter the aptly named ...."Services Alive."

This is a simple Java Swing application which opens a number of properties files (each containing a specific URL).
The results of each check are represented by a green or red cross being displayed. This check is repeated every minute.
The frame grows dynamically depending on how many URLs/properties files are found.
There are numerous ways to improve this application - but I now have a means of monitoring our websites.
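Stripped of the Swing front end, the checking logic can be sketched in a few lines of Python 3. This is not the Java code of Services Alive, just an illustration of the same idea. The property key "url", the function names, and the OK/FAIL text output (standing in for the green and red marks) are all assumptions.

```python
import glob
import os
import time
from urllib.request import urlopen


def read_url(path, key="url"):
    """Return the value of `key` from a simple key=value properties file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith(("#", "!")):
                continue  # skip blank lines and comments
            name, sep, value = line.partition("=")
            if sep and name.strip() == key:
                return value.strip()
    return None


def is_alive(url, opener=urlopen, timeout=10):
    """Return True if the URL answers without an HTTP or network error."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # urllib's URLError and HTTPError are both subclasses of OSError.
        return False


def check_all(directory, opener=urlopen):
    """Check the URL named in every .properties file in a directory."""
    results = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.properties"))):
        url = read_url(path)
        if url:
            results[url] = is_alive(url, opener=opener)
    return results


def monitor(directory, interval=60):
    """Repeat the checks every `interval` seconds, printing one line per URL."""
    while True:
        for url, alive in check_all(directory).items():
            print("{} {}".format("OK  " if alive else "FAIL", url))
        time.sleep(interval)
```

Calling monitor on a directory of properties files gives the once-a-minute loop. The opener argument is only there so the checks can be exercised without touching the network.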
experience of content management systems, data feeds, parsers, search engines, indexing, and crawlers
currently working with Adobe CQ ensuring that the software and infrastructure does not misbehave.
prefers Linux, Unix, or Cygwin, common sense, pragmatism, O'Reilly, the Pragmatic Bookshelf, and the Pomodoro Technique.