Behind the Scenes with Bot Manners
By Vickie J. Scanlon

Do bots have manners? You may be saying to yourself, "I really don't care if bots have manners." But you should. Why? If they behave badly, you are the one who loses bandwidth, traffic, content and email.

There are, however, some ways to keep search engine bots honest. How, you ask? The .htaccess file, the robots.txt file and the meta tag can save your traffic, content and email from being stolen or abused. The first two are simple text files; the last is a line of code. Each directs all bots, or specific bots, that enter the web page to follow the rules you set.

You may be a little programming challenged, and possibly balking at the challenge ahead of you. But when you begin to see your website under attack, or your traffic coming to a sudden standstill, you may change your mind. With that said, I will keep the information as simple as possible and explain how to set these files up and put them online.

How do you know you have bad bots?

You can check your traffic stats page, or a log file supplied by your server, to see how many times a bot is hitting your site. For example, let's say your traffic stats page has identified the search engines that are hitting your site. You notice that several bots are hitting your site many times, but you're not familiar with their names. What do you do? I would take the bots that you don't recognize or have doubts about and search for them in Google. In the search box put "BadBotName + bad bot". If it's a bad bot, you will quickly see webmasters complaining loudly about its transgressions. Sometimes the bot identifies itself with a URL; check it out and see what its operators are doing. Sometimes the "bad bot" label is subjective, so you will have to decide whether to allow the bot or not.
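If your host gives you only a raw log file rather than a stats page, a short script can do the counting for you. The Python sketch below is one possible approach, not a standard tool: it assumes the log uses the common Apache "combined" format, where the user agent is the last quoted field on each line. The sample lines, the addresses in them and the function name `count_user_agents` are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical sample lines in Apache "combined" log format. A real log
# would be read from the file your host provides.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "gimmeemailbot/1.0"',
    '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /contact.html HTTP/1.1" 200 734 "-" "gimmeemailbot/1.0"',
    '203.0.113.5 - - [10/Oct/2023:13:55:38 +0000] "GET /about.html HTTP/1.1" 200 901 "-" "gimmeemailbot/1.0"',
    '198.51.100.7 - - [10/Oct/2023:14:02:10 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '192.0.2.44 - - [10/Oct/2023:14:05:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

# Assumption: the user agent is the last quoted field on the line.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping each user-agent string to its hit count."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


hits = count_user_agents(SAMPLE_LOG)

# List the busiest user agents first; unfamiliar names near the top are
# the ones worth searching for.
for agent, total in hits.most_common():
    print(f"{total:4d}  {agent}")
```

Names you don't recognize near the top of the list are the ones to look up with the "BadBotName + bad bot" search described above.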
Note: If a bot wants to stay out of the "bad bot" file, its operators should have a page that identifies the bot and its purpose for webmasters, and that explains how to prevent the bot from entering the page if the webmaster decides this is not the type of search engine they want on their site.

How to set up a Robots.txt file

The good bots will abide by what you put into the Robots.txt file. The bad bots, however, will ignore your robots.txt instructions, taking your bandwidth, content and whatever else of quality they find on your website. I'll go into the "following no rules" bad bots later, but first, let's learn how to tame the bots that follow the rules. There are two things you can tell the bots: either not to hit a certain page, or not to bother you at all. Here is how you set the Robots.txt up to say, "Hey bot, don't bother me at all".

1. Open a Notepad session.
2. Put the following info into the file to deny a specific bot from entering your site:
User-agent: specificbadbot
Disallow: /
Example: Let's say your traffic stats show that a bot called "gimmeemailbot" is hitting your site. You did a search and found that this fellow is a bad bot that harvests emails.
Here is how you would enter the information to restrict "gimmeemailbot" from entering your website.
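As a rough sketch of such an entry, assuming the standard `User-agent` and `Disallow` pair, the Python snippet below embeds a short robots.txt for "gimmeemailbot" and checks it with the standard library's `urllib.robotparser`. The page path `/index.html` and the second bot name are only there for the test. Keep in mind that this shows what a rule-abiding bot would conclude; a bad bot can simply ignore the file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: shut one named bot out of the whole site.
# Lines starting with "#" are comments and are ignored by parsers.
ROBOTS_TXT = """\
# Keep the email harvester out of the entire site
User-agent: gimmeemailbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved bot asks this question before fetching a page.
harvester_allowed = parser.can_fetch("gimmeemailbot", "/index.html")
other_bot_allowed = parser.can_fetch("Googlebot", "/index.html")

print("gimmeemailbot may fetch /index.html:", harvester_allowed)
print("Googlebot may fetch /index.html:", other_bot_allowed)
```

Because no other rule is present, every bot that is not named in the file stays free to crawl the site.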
The first line of text, beginning with "#", is a comment stating the purpose of the .htaccess file.
The "^" is a regular-expression anchor for the start of the text: any user agent that begins with the pattern following it is included in my exclusion.
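To see what the "^" anchor does in practice, the small Python sketch below applies an anchored pattern to a few user-agent strings. It is an illustration only: the pattern and the sample strings are hypothetical, the case-insensitive flag is an assumption, and Python's `re` module stands in for the regular-expression engine the server uses, which treats this simple anchor the same way.

```python
import re

# Hypothetical pattern: the leading "^" ties the match to the start of
# the user-agent string. IGNORECASE is an assumption for this sketch.
PATTERN = re.compile(r"^gimmeemailbot", re.IGNORECASE)

user_agents = [
    "gimmeemailbot/1.0",                        # starts with the name
    "GimmeEmailBot (+http://example.com/bot)",  # same name, different case
    "Mozilla/5.0 (compatible; gimmeemailbot)",  # name appears, but not at the start
    "Googlebot/2.1",                            # unrelated bot
]

# Keep only the user agents that the anchored pattern would exclude.
matches = [ua for ua in user_agents if PATTERN.search(ua)]

for ua in user_agents:
    verdict = "excluded" if ua in matches else "allowed"
    print(f"{verdict:8s}  {ua}")
```

The third string is not excluded, because the bot name appears in the middle rather than at the beginning. A bot that hides its name behind a browser-style prefix would need a pattern without the anchor.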
5) You're done.