-
08-03-2009, 03:49 AM #1
Member
- Join Date
- Sep 2008
- Posts
- 53
- Thanks
- 3
- Thanked 2 Times in 2 Posts
How to control an "Unknown Robot" from wasting Bandwidth?
Hi,
I have a website, businesslawn.com, on which I have installed Pligg, a bookmarking script. For the past few weeks an unknown robot has been crawling the site more than 2,000 times per day and eating my bandwidth.
I don't know how to block this unknown robot. If anyone knows, please help me.
Thanks
Last edited by fourwings; 08-03-2009 at 03:50 AM. Reason: -
-
08-03-2009, 12:00 PM #2
Re: How to control an "Unknown Robot" from wasting Bandwidth?
I don't know too much about this area, but have you tried disallowing the bot via a robots.txt file? For example, to exclude a specific robot, you need to insert the following code:
User-agent: [insert name of bot]
Disallow: /
Read more instructions here
The problem is some bots will ignore instructions contained in a robots.txt file. If that happens, another option is as follows:
If the bad robot operates from a single IP address, you can block its access to your web server through server configuration or with a network firewall.
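For the robots.txt route, it can help to confirm that the rule blocks the bot you mean and nothing else before uploading it. Below is a minimal sketch using Python's standard urllib.robotparser. The bot name "BadBot" and the helper is_allowed are made up for illustration; substitute the name the robot actually reports. This only checks what a well-behaved robot would do with the file, it cannot make a rude one obey.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url="/"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "BadBot" is a placeholder name, not the robot from this thread.
RULES = """\
User-agent: BadBot
Disallow: /
"""

if __name__ == "__main__":
    print(is_allowed(RULES, "BadBot"))     # False: the named bot is shut out
    print(is_allowed(RULES, "Googlebot"))  # True: other robots are unaffected
```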
-
The Following User Says Thank You to Ceres For This Useful Post:
fourwings (08-04-2009)
-
08-03-2009, 01:57 PM #3
Senior Member
- Join Date
- Oct 2008
- Location
- Bits & Byte
- Posts
- 183
- Thanks
- 44
- Thanked 81 Times in 51 Posts
Re: How to control an "Unknown Robot" from wasting Bandwidth?
Put a statcounter / free counter on the site first, to identify whether the robot is coming from one IP or from multiple IPs.
If you have already done that, and the suggestion by Ceres doesn't help (some bots are obnoxious and don't follow the rules in robots.txt), it's time to get into .htaccess.
You can block or redirect the bot using .htaccess and prevent it from spidering your site. To do that you have to identify the IP, though, hence the need for a stat counter or anything else that helps to isolate it.
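If your host gives you the raw Apache access log, that is one of the "anything else" options: count the requests per IP straight from the log instead of waiting for a counter to collect data. Here is a rough sketch in Python. It assumes the log is in the usual Common/Combined Log Format, and the function name top_clients and the sample lines (with documentation-only IP addresses) are invented for the example.

```python
import re
from collections import Counter

# In Common/Combined Log Format the client IP is the first field and the
# response size in bytes follows the 3-digit status code ("-" means none).
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (\d+|-)')


def top_clients(lines, limit=5):
    """Return [(ip, hits, bytes_sent)] for the busiest clients, busiest first."""
    hits = Counter()
    traffic = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, size = match.group(1), match.group(2)
        hits[ip] += 1
        traffic[ip] += 0 if size == "-" else int(size)
    return [(ip, count, traffic[ip]) for ip, count in hits.most_common(limit)]


if __name__ == "__main__":
    # Invented sample lines; in practice pass open("access.log") instead.
    sample = [
        '203.0.113.9 - - [03/Aug/2009:03:49:00 +0000] "GET /story.php?id=1 HTTP/1.1" 200 5120',
        '203.0.113.9 - - [03/Aug/2009:03:49:02 +0000] "GET /story.php?id=2 HTTP/1.1" 200 4096',
        '198.51.100.7 - - [03/Aug/2009:03:50:00 +0000] "GET / HTTP/1.1" 200 1024',
    ]
    for ip, count, sent in top_clients(sample):
        print(ip, count, sent)
```

One address far ahead of the rest points to a single-IP robot; many addresses with similar counts suggest it is spread over several IPs.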
-
The Following User Says Thank You to pubdomains.in For This Useful Post:
Ceres (08-04-2009)
-
08-04-2009, 04:28 AM #4
Re: How to control an "Unknown Robot" from wasting Bandwidth?
Yes, it is true: it doesn't obey robots.txt. I will have to add a stats counter to find the IP address.
I have tried .htaccess as well, but it wasn't clear to me.
Now it is clear: find the IP address via a stats counter, then block or redirect via .htaccess.
Thanks to ceres and pubdomains.in
-
08-04-2009, 08:45 AM #5
Re: How to control an "Unknown Robot" from wasting Bandwidth?
If you identify the IP address, simply add a deny directive. For example, assuming the IP address is 192.168.1.1, put the following at the top of your .htaccess file:
Code:
order allow,deny
deny from 192.168.1.1
allow from all
That should do the trick.
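If the robot turns out to use several addresses, the same block just needs one deny line per address. A small Python sketch that builds it from a list and rejects typos could look like this. The function name build_deny_block is made up, and it sticks to the order/deny/allow syntax used in this thread (newer Apache releases, 2.4 and later, prefer Require directives for this).

```python
import ipaddress


def build_deny_block(addresses):
    """Return .htaccess text that denies the given IPs and allows everyone else."""
    lines = ["order allow,deny"]
    for raw in addresses:
        # ip_address() raises ValueError for anything that is not a valid
        # IPv4/IPv6 address, so a typo never reaches the config file.
        ip = ipaddress.ip_address(raw.strip())
        lines.append(f"deny from {ip}")
    lines.append("allow from all")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder addresses; use the ones your stats actually show.
    print(build_deny_block(["192.168.1.1", "192.168.1.2"]))
```

Paste the printed lines at the top of .htaccess and keep a backup of the old file in case something goes wrong.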
-
-
08-04-2009, 02:41 PM #6
Re: How to control an "Unknown Robot" from wasting Bandwidth?
Thanks,
I have only now added a stats counter to my website to detect the IP address that is eating my bandwidth.
Thanks




