Search Engine Spiders

What Spiders Do...



Search engine spiders are one of the most useful things to come along in the last 10 years of the internet. They are useful not only to the web sites that run them (Google and many others), but also to people who are searching for a particular site and to those who run web sites. Spiders allow your site to be seen by the millions of people who use search engines every day. In this newsletter, we will discuss what search engine spiders do, how they work, and how to set up a robots.txt file and upload it to your site to control whether spiders visit it.

What are spiders and what purpose do they serve?

Spiders are essentially programs that “crawl” sites and report their findings back to their superior (Google or whatever search engine they were created for). Their purpose is to make it easy for sites to get listed in search engines.

You might be wondering, what does it mean to “crawl” a site? Well, it means to visit a site and copy the information on it.

How do spiders work?

Spiders work by finding links to web sites, visiting those web sites, going through the content of each site, and then reporting that content back to the database of the search engine they work for. Google spiders, thus, crawl sites and report the information back to Google’s database. From there, the information is added to Google’s search engine, and the site can then show up in Google search results. Much the same process happens with any other search engine spider.
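To make the process a little more concrete, here is a minimal sketch in Python of the loop described above: visit a page, copy its content, and follow its links. It is only an illustration, not how Google or any real search engine builds its spiders. The three-page site in FAKE_WEB and the names LinkExtractor and crawl are made up for this example, and the pages are kept in memory so nothing is fetched over the network.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical three-page site held in memory (an assumption for this
# sketch); a real spider would download each page over HTTP instead.
FAKE_WEB = {
    "/index.html": '<a href="/news.html">News</a> <a href="/about.html">About</a>',
    "/news.html": '<a href="/index.html">Home</a>',
    "/about.html": "<p>No links here.</p>",
}


def crawl(start, fetch):
    """Visit pages starting from `start`, following links as they are found.

    Returns a dict mapping each visited URL to the content copied from it,
    which stands in for the search engine's database.
    """
    index = {}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in index:
            continue  # already visited, do not crawl it twice
        html = fetch(url)
        if html is None:
            continue  # link points to a page that does not exist
        index[url] = html  # "report back" the page content
        parser = LinkExtractor()
        parser.feed(html)
        queue.extend(parser.links)  # follow the links found on this page
    return index


index = crawl("/index.html", FAKE_WEB.get)
print(sorted(index))  # ['/about.html', '/index.html', '/news.html']
```

Notice that the spider only ever finds /news.html and /about.html because another page links to them. That is why links to your site matter so much for getting listed.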

How can I keep spiders from visiting my site?

You might be thinking, “why would I want to keep such a useful thing from visiting my site?” Well, the short answer is, sometimes site owners don’t want the spider to crawl a particular part of their site. Some site owners don’t want spiders to crawl their site at all. The reasons vary: the site may still be unfinished, or it may contain pages that the owner simply doesn’t want showing up in search results.

If you’re one of those site owners, then you’ll want to create and upload something called a robots.txt file. We will briefly go over how to do this.

A robots.txt file

The whole purpose of a robots.txt file is to tell a search engine spider which parts of a site it should not crawl. Spiders look for the file in one place only: the top level of the site, at yoursite.com/robots.txt.

Creating the file

Creating a robots.txt file that blocks out spiders is easy. First, open up Notepad. Then, copy and paste the following, keeping each instruction on its own line:

User-agent: *
Disallow: /

Once you’ve done that, save the file as robots.txt.

Uploading the file

Next, upload the file to the top-level (root) folder of your site, right alongside your index file, so that it can be reached at yoursite.com/robots.txt. Spiders only look for the file there; a robots.txt placed in a subfolder such as yoursite.com/news/ is ignored. With the file above in place, spiders are told to stay away from the whole site. If you only want to keep them out of yoursite.com/news/, change the last line to Disallow: /news/ instead. That’s all there is to it.
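If you would like to check how a spider reads your rules before you upload them, Python’s standard library includes a robots.txt parser. The sketch below assumes a single robots.txt at the root of the site that blocks only the news folder. The spider name ExampleSpider, the page names, and the helper allowed are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical contents of yoursite.com/robots.txt (an assumption for
# this sketch): keep every spider out of the /news/ folder only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /news/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url):
    """Return True if a spider named ExampleSpider may crawl `url`."""
    return parser.can_fetch("ExampleSpider", url)


print(allowed("http://yoursite.com/news/today.html"))  # False
print(allowed("http://yoursite.com/index.html"))       # True
```

The folder being blocked is named inside the file, on the Disallow: line. Where the file sits on the server does not change which folders it covers.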

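Using the robots.txt file to tell search engine spiders they MAY visit your site

Believe it or not, the robots.txt file can be used to both disallow and allow search engine spiders to crawl your site. Spiders will normally crawl a site that has no robots.txt file at all, so this version of the file cannot force a visit; it simply makes it clear that the whole site is open to them. Here’s how to create and upload such a file.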

Creating the file

Open up Notepad and copy and paste in the following, again keeping each instruction on its own line:

User-agent: *
Disallow:

You’ll notice that the only difference between this and the earlier example is that Disallow: is not followed by /. If it were, that would tell spiders to stay away. Once again, save the file as robots.txt.
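That single slash makes all the difference, and you can see it for yourself with a short check. The sketch below uses Python’s built-in robots.txt parser to compare the two files. The helper can_crawl, the spider name ExampleSpider, and the example URL are assumptions made up for this illustration.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, agent="ExampleSpider"):
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two files discussed in this newsletter, one instruction per line.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

url = "http://yoursite.com/news/today.html"  # hypothetical page
blocked = can_crawl(BLOCK_ALL, url)
open_to_all = can_crawl(ALLOW_ALL, url)

print(blocked)      # False: "Disallow: /" shuts spiders out of everything
print(open_to_all)  # True: an empty "Disallow:" blocks nothing
```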

Uploading the file

All you’ll do is upload the robots.txt file to the top-level folder of your site, right alongside the index file. That is the only place spiders look for it, and the empty Disallow: line tells them the whole site is open to them. And you’re done.

To your success,

Michael Thomas

P.S. Creating and uploading a robots.txt file to tell spiders where they are welcome on your site is fast and easy. So what are you waiting for? Create and upload that file now!

 

 