What Search Engine ‘Spiders’ Are And How They Work
Search engine ‘spiders’ are robots that seek out webpages to display in search engines. Below we’ll discuss how they work and why they’re important.
A search engine robot is a fairly simple program with just enough functionality to make sense of web pages. Its ability to interpret websites is limited: spiders cannot interpret frames, Flash video, images, or JavaScript; they can’t enter password-protected areas or click buttons; and they can be stopped by dynamically generated URLs and JavaScript navigation. Within plain HTML code, however, they can retrieve data and follow links, travelling through the web to find information.
Spiders determine the content of your page by looking at the visible text, the HTML code, and the links. Based on the words it finds, a spider works out what the site is about, using a complex algorithm to decide what is and isn’t important. Spiders also collect links to follow later, which lets them hop from site to site. Because the web is held together by links between websites, robots use those links to make their way across it as they search.
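To make the idea of ‘reading’ a page concrete, here is a minimal sketch in Python of how a simple spider might pull the text and links out of HTML. It uses only the standard library; the names `PageReader` and `read_page` are invented for this illustration, and real search engine robots are certainly more sophisticated than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageReader(HTMLParser):
    """Collects text and outgoing links, roughly as a simple spider might."""

    # Content inside these tags is code or styling, not readable text.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.words = []       # words found in the page text
        self.links = []       # absolute URLs to follow later
        self._skip_depth = 0  # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.words.extend(data.split())


def read_page(html, base_url):
    """Return (words, links) found in an HTML string."""
    reader = PageReader(base_url)
    reader.feed(html)
    reader.close()
    return reader.words, reader.links
```

Note that the script content is simply skipped: the sketch never runs JavaScript, so anything a page builds with scripts is invisible to it, which mirrors the limitation described earlier.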
By collecting and following links, robots manage to transport themselves all over the internet. Think of links as the internet equivalent of the roads we use in our lives: robots travel on the roads and read the signposts so they know what leads where.
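The ‘roads and signposts’ idea can be sketched as a queue of links waiting to be followed. The Python example below is an illustration only: `crawl`, `TOY_WEB`, and the `.example` addresses are made up, and the tiny ‘web’ is a dictionary rather than real pages. The breadth-first order used here is one reasonable choice, not necessarily what any particular search engine does.

```python
from collections import deque


def crawl(start_url, get_links, max_pages=100):
    """Visit pages by following links, never visiting the same page twice."""
    seen = {start_url}            # every address discovered so far
    frontier = deque([start_url])  # addresses waiting to be visited
    visited = []                   # order in which pages were visited

    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# A hypothetical four-site web: each site lists the sites it links to.
TOY_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "a.example"],
    "c.example": ["d.example"],
    "d.example": [],
}

order = crawl("a.example", lambda url: TOY_WEB.get(url, []))
print(order)  # ['a.example', 'b.example', 'c.example', 'd.example']
```

Starting from a single site, the robot reaches `d.example` even though the starting page never links to it directly, because it follows the link it found on `c.example`.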
When the robots return, the information they gathered is added to the search engine’s database. Through a complex algorithm, this data is interpreted and websites are ranked according to how relevant they are to the topics people search for. Some bots are easy to recognise: Google’s is the appropriately named Googlebot, whereas Inktomi uses a more ambiguous bot named Slurp. Others may be difficult to identify at all.
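If you want to spot these robots in your own server logs, one common approach is to look for the bot’s name in the User-Agent string each visitor sends. The Python sketch below assumes that approach; `identify_bot` and `KNOWN_BOT_TOKENS` are names invented for this example, and because a User-Agent can be faked by anyone, a match should be treated as a hint rather than proof.

```python
# Lower-case text to look for, mapped to the bot name it suggests.
KNOWN_BOT_TOKENS = {
    "googlebot": "Googlebot",
    "slurp": "Slurp",
}


def identify_bot(user_agent):
    """Return the name of a known bot found in a User-Agent string, or None."""
    lowered = user_agent.lower()
    for token, name in KNOWN_BOT_TOKENS.items():
        if token in lowered:
            return name
    return None
```

For example, `identify_bot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")` returns `"Googlebot"`, while an ordinary browser’s User-Agent returns `None`.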
A robot ‘reads’ your site by collecting data on any visible text, on the tags in your page’s code, and on any links available. These are the things that determine what the search engines ‘think’ your content is about, so they are what you need to pay attention to when building a site that you want to rank well in search results.
The search engine sorts the information delivered to its databases, which then becomes part of the search engine and directory ranking process and allows results to be displayed. These databases are updated periodically: robots revisit your pages regularly to pick up any changes, so that the latest information is available. How often a robot visits depends on how the search engine is set up, and this varies from one search engine to another.
If your website is down or under heavy traffic, the robot may not be able to access the page it is trying to visit, and the site may not be re-indexed on that visit. How much this matters depends on how frequently the robot visits your site. The robot will come back later to check whether your site has become accessible again.
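The revisit behaviour described above can be sketched as a small scheduling rule. This Python example is purely illustrative: `plan_next_visit` is a made-up name, and the intervals (a return after 7 days normally, after 1 day when the site was unreachable) are assumptions chosen for the example, not figures published by any search engine.

```python
def plan_next_visit(fetch, url, today, normal_interval=7, retry_interval=1):
    """Try to fetch a page and decide when the robot should come back.

    `fetch` is any function that downloads `url` and raises OSError
    (for example ConnectionError or TimeoutError) when the site is down.
    Returns (reindexed, next_visit_day), with days counted as plain integers.
    """
    try:
        fetch(url)
    except OSError:
        # Site unreachable: nothing is re-indexed, so try again sooner.
        return False, today + retry_interval
    # Page fetched: it can be re-indexed, and the normal schedule resumes.
    return True, today + normal_interval
```

A site that is visited rarely loses more from a failed visit than one that is visited often, since the gap before its next successful re-index is longer.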
Justin Harrison is an internationally recognised Internet Marketing expert who provides world class SEO Services to website owners. For more information visit: http://www.seorankings.co.za
