To inform, entertain and excite my kids, Jamie, Patrick, Aaron & Sarah Middleburgh, our family and friends.
This immediately triggered a series of questions and some introspection: Why was the blog search not returning any results? What's the value of a site search in the toolbar if it doesn't return the expected results? How do other search engines see my blog and its content? What have I done "wrong"? (IMHO there is nothing wrong with a little paranoia!!)
To put things in context, I don't really care where my pages come in general search results, although it must be confusing for anyone searching for David Middleburgh to get results for my cousin or vice versa. It seemed to me, however, that a site search should at least "work", and since the one in the toolbar didn't, I reinstated the Picosearch which I had on the site previously. This does work and can be found on the sitemap tab.
This of course didn't explain why the Blogger search was "naff", nor why I couldn't find expected content on Google, Yahoo and other engines even when I did focused searches. It's worth remembering that searching in the widest sense involves 3 sequential processes.
The key question is whether the content was in fact indexed or not, and it quickly became apparent that there was a discovery/indexing problem. Google and Yahoo both offer sitemapping facilities to let you help them spider the site more effectively. So after checking and cleaning up broken links etc ....
I switched on the Google tools and discovered that the last time they spidered my blog was in July 2005, when they changed their spidering system (and, coincidentally, when I temporarily stopped blogging). All the decreasing traffic I was getting was coming from old indexes, which were being cleared down as a result of me switching off caching in May 2006. I have switched caching back on (it is the only way I can work out, on a page basis, what is being indexed and when) and prompted Google to respider the site. It only discovered the last post, and I came to the conclusion that the bot appears to discover/index through the default Blogger Atom feed (the first feed listed in the home page header) rather than the RSS feed listed second or the HTML itself. My blog settings are such that the home page and the Atom feed only show the last post. This means that if I did 2 postings a week and the bot visited once a week, only half the posts would get discovered. I removed the Atom feed from the header and pointed the sitemapping tool at the RSS feed (which contains the last 15 posts; I am still looking at this). Since there was a shortfall of about 45 posts (mostly prior to July 2005) not being indexed, I put up a temporary page on another (non-authoritative) property listing them for the bot to pick up. When it did, I took this page down, since it could distort inbound link counts (and I am actually quite ethical).
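The "only half the posts" arithmetic can be checked with a rough simulation. This is only a sketch under my own assumptions: posts and bot visits are evenly spaced, the bot reads nothing but the feed, and the function name and parameters are made up for illustration. It says nothing about how any real search bot actually behaves.

```python
def discovered_fraction(posts_per_week: int, visits_per_week: int,
                        feed_size: int, weeks: int = 52) -> float:
    """Fraction of posts that a feed-only bot would ever see.

    Assumptions (hypothetical model): post i is published at time
    i / posts_per_week weeks, visit v happens at time
    v / visits_per_week weeks, and the feed always holds the newest
    `feed_size` posts.
    """
    total_posts = posts_per_week * weeks
    total_visits = visits_per_week * weeks
    seen = set()
    for visit in range(1, total_visits + 1):
        # Number of posts published at or before this visit.
        published = min(total_posts,
                        (visit * posts_per_week) // visits_per_week)
        # The feed exposes only the newest `feed_size` of them.
        oldest_in_feed = max(published - feed_size, 0)
        seen.update(range(published, oldest_in_feed, -1))
    return len(seen) / total_posts


# Feed showing only the last post, 2 posts a week, 1 visit a week.
print(discovered_fraction(posts_per_week=2, visits_per_week=1, feed_size=1))
# Feed showing the last 15 posts, same posting and visiting rates.
print(discovered_fraction(posts_per_week=2, visits_per_week=1, feed_size=15))
```

Under these assumptions a one-post feed loses half the posts, while a 15-post feed gives the bot plenty of slack, which is why pointing the tool at the longer RSS feed looked like the better bet.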
The Yahoo tools work similarly, although in a couple of key respects they are not as advanced. They do not yet let you put a meta tag in the homepage to authenticate ownership, which means you can't see all the useful statistics. Apparently they are working on this. There is also no facility to indicate the priority that an author would put on pages. I think this facility is available(?) in Google XML sitemaps, not that it's any use if you are using the public Google blogging service, because you can't load an XML sitemap anywhere useful. If the author could indicate the page priority, then Google or other search engines would know which one to display when there are a number of "similar" pages. I am completely at a loss to understand why some of my pages display over other similar pages.
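For what it's worth, the XML sitemap format does define an optional priority element per URL, with a value between 0.0 and 1.0. Here is a minimal sketch of generating such a file. The example.com URLs, the priorities and the build_sitemap helper are all made up for illustration, and the namespace shown is the sitemaps.org 0.9 one, which may differ from the schema version Google's tool expects. As far as I understand it, priority is only a hint about the relative importance of pages within your own site, so it may not settle which of several "similar" pages gets displayed.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org 0.9 schema (assumed here).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, priority) pairs.

    `pages` is a hypothetical input format: an iterable of tuples
    whose priority lies in the range 0.0 to 1.0.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, priority in pages:
        if not 0.0 <= priority <= 1.0:
            raise ValueError(f"priority out of range for {url}: {priority}")
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Made-up pages: the preferred version of a post gets the higher priority.
xml_text = build_sitemap([
    ("http://www.example.com/2006/05/my-post.html", 0.8),
    ("http://www.example.com/2006/05/similar-post.html", 0.3),
])
print(xml_text)
```

The resulting file still has to be hosted somewhere the search engine will accept, which is exactly the sticking point on the public blogging service.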
I am now looking at how Technorati (a mystery) and IceRocket index the site. I have given up on MSN, which has no tools .....