INForum.in - Home of the Indian Domain Industry
  1. #1
    fourwings is offline Member
    Join Date
    Sep 2008
    Posts
    53
    Thanks
    3
    Thanked 2 Times in 2 Posts

    How to control an "Unknown Robot" from wasting Bandwidth?

    Hi,

    I have a website, businesslawn.com, on which I have installed the Pligg bookmarking script. For the past few weeks an unknown robot has been crawling the site more than 2,000 times per day and eating my bandwidth.

    I don't know how to block this unknown robot. If anyone knows, please help me.

    Thanks
    Last edited by fourwings; 08-03-2009 at 03:50 AM. Reason: -
    CollisionDomains.com(Buy & Sell Domains - Free Listing) - Zeeru.com Search Engine.

  2. #2
    Ceres is offline Senior Member
    Join Date
    Mar 2008
    Location
    Canada
    Posts
    2,206
    Thanks
    544
    Thanked 576 Times in 347 Posts

    Re: How to control an "Unknown Robot" from wasting Bandwidth?

    I don't know too much about this area, but have you tried disallowing the bot via a robots.txt file? For example, to exclude a specific robot, you need to insert the following code:

    User-agent: [insert name of bot]
    Disallow: /
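
    As a rough way to check what such a rule does before uploading it, the sketch below feeds the two lines to Python's standard `urllib.robotparser`. The name `BadBot` is a placeholder (the thread never names the robot), and `is_allowed` is a hypothetical helper written for this illustration. It only shows how a cooperating crawler would read the file; it cannot make a misbehaving bot obey it.

```python
from urllib.robotparser import RobotFileParser

# "BadBot" is a placeholder name; the robot in this thread is unidentified.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a rule-following crawler may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://businesslawn.com/"
    # The named bot is shut out of the whole site ...
    print("BadBot allowed:", is_allowed(ROBOTS_TXT, "BadBot", url))
    # ... while a bot that is not listed is unaffected by this rule.
    print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", url))
```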


    The problem is some bots will ignore instructions contained in a robots.txt file. If that happens, another option is as follows:

    If the bad robot operates from a single IP address, you can block its access to your web server through server configuration or with a network firewall.

  3. The Following User Says Thank You to Ceres For This Useful Post:

    fourwings (08-04-2009)

  4. #3
    pubdomains.in is offline Senior Member
    Join Date
    Oct 2008
    Location
    Bits & Byte
    Posts
    183
    Thanks
    44
    Thanked 81 Times in 51 Posts

    Re: How to control an "Unknown Robot" from wasting Bandwidth?

    First, install a stat counter (any free counter will do) to find out whether the robot is coming from a single IP address or from several.
    If you have already done that, and Ceres's suggestion doesn't help (some bots are obnoxious and don't follow the rules in robots.txt), it's time to get into .htaccess.
    You can block or redirect the bot using .htaccess and prevent it from spidering your site. To do that you have to identify its IP address, though, hence the need for a stat counter or anything else that helps isolate it.
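
    If the host also exposes the raw web server access log, counting requests per IP there is another way to isolate the robot. This is an assumption, not something the thread confirms: the sketch below presumes log lines in Apache's "combined" format, and the sample lines, IP addresses and the `UnknownBot/0.1` agent string are made up for illustration.

```python
from collections import Counter

# Made-up sample lines in Apache "combined" log format (assumed, not from the thread).
SAMPLE_LOG = [
    '203.0.113.9 - - [03/Aug/2009:03:50:01 +0000] "GET / HTTP/1.1" 200 5120 "-" "UnknownBot/0.1"',
    '203.0.113.9 - - [03/Aug/2009:03:50:02 +0000] "GET /story.php HTTP/1.1" 200 4096 "-" "UnknownBot/0.1"',
    '198.51.100.4 - - [03/Aug/2009:03:50:03 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [03/Aug/2009:03:50:04 +0000] "GET /user.php HTTP/1.1" 200 2048 "-" "UnknownBot/0.1"',
]


def top_clients(log_lines, limit=5):
    """Return the busiest (ip, user_agent) pairs with their request counts."""
    counts = Counter()
    for line in log_lines:
        # Splitting on double quotes yields: prefix, request, status/size,
        # referer, separator, user agent, trailing remainder.
        parts = line.split('"')
        prefix = parts[0].split()
        if len(parts) < 6 or not prefix:
            continue  # skip lines that are not in combined format
        counts[(prefix[0], parts[5])] += 1
    return counts.most_common(limit)


if __name__ == "__main__":
    for (ip, agent), hits in top_clients(SAMPLE_LOG):
        print(f"{hits:6d}  {ip:15s}  {agent}")
```

    One address dominating the list suggests a single-IP robot; many addresses sharing one agent string suggest the robot is spread across several IPs.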

  5. The Following User Says Thank You to pubdomains.in For This Useful Post:

    Ceres (08-04-2009)

  6. #4
    fourwings is offline Member

    Re: How to control an "Unknown Robot" from wasting Bandwidth?

    Yes, it is true: the robot doesn't obey robots.txt, so I have to add a stats counter to find its IP address.

    I had tried .htaccess as well, but I wasn't clear on how to use it.

    Now it is clear: find the IP address via a stats counter, then block or redirect it via .htaccess.

    Thanks to Ceres and pubdomains.in

  7. #5
    pubdomains.in is offline Senior Member

    Re: How to control an "Unknown Robot" from wasting Bandwidth?

    Once you have identified the IP address, simply add a deny directive. For example, assuming the IP address is 192.168.1.1, put the following at the top of your .htaccess file:
    Code:
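    # Evaluate Allow rules first, then Deny; a matching Deny wins
    order allow,deny
    # Block the offending robot's IP address (example value)
    deny from 192.168.1.1
    allow from all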
    That should do the trick.
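
    If the robot turns out to use several addresses, the same block needs one `deny from` line per address. The sketch below is a hypothetical helper, not part of Apache or Pligg, that builds the block from a list and uses Python's `ipaddress` module to reject typos before they reach .htaccess. The directive syntax is the one shown above.

```python
import ipaddress


def build_deny_block(addresses):
    """Build .htaccess lines that deny each given IP and allow everyone else."""
    lines = ["order allow,deny"]
    for raw in addresses:
        # ip_address() raises ValueError for anything that is not a valid IP.
        ip = ipaddress.ip_address(raw.strip())
        lines.append(f"deny from {ip}")
    lines.append("allow from all")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example addresses only; substitute the ones your stats actually show.
    print(build_deny_block(["192.168.1.1", "192.168.1.2"]))
```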

  8. The Following 2 Users Say Thank You to pubdomains.in For This Useful Post:

    Ceres (08-04-2009),fourwings (08-04-2009)

  9. #6
    fourwings is offline Member

    Re: How to control an "Unknown Robot" from wasting Bandwidth?

    Thanks,

    I have just added a stats counter to my website to detect the IP address that is eating my bandwidth.

    Thanks

 

 
