Blocking robots and website copiers
Tuesday, July 31, 2012, 06:33 PM - Php
On a CD I thought I had lost, I managed to recover this script, which blocks robots and some of the programs that download an entire website.
Note that the code expects a folder named data containing a file named banbot.txt. If you would rather not keep the log in that folder, use a different name, and remember to change it in the code as well.
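
The folder and the log file can also be created from PHP instead of by hand. This is only a rough sketch: the function name ensureLogFile and the 0775 mode are my own assumptions and are not part of the original script, which asks for 777 in its comments.

```php
<?php
// Hypothetical helper (not part of the original script): creates the log
// folder and an empty banbot.txt if they do not exist yet.
function ensureLogFile($dir, $file = 'banbot.txt')
{
    // 0775 is an assumption; the script's own comments ask for 777.
    if (!is_dir($dir) && !mkdir($dir, 0775, true)) {
        return false;
    }

    $path = $dir . '/' . $file;

    // touch() creates the file when it is missing and keeps existing content.
    if (!touch($path)) {
        return false;
    }

    // Report whether the web server user can actually append to the log.
    return is_writable($path);
}
```

Calling ensureLogFile(__DIR__ . '/data') once, for example from a setup page, should leave data/banbot.txt ready for the script. If it returns false, check the permissions of the parent folder.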





block.php

<?php
// UA BLOCK
// By: Christopher Lover - webmaster@icehousedesigns.com
// http://www.icehousedesigns.com
// This script is freeware. I accept no responsibility for
// damage it may cause (which should be none).
// This script can be freely modified, as long as this
// header is included.
// Place an include ("/path/to/block.php"); //
// at the TOP of your HTML documents

//List user-agents below you wish to ban in correct format
// List updated on 31/01/2001

$browser = array ("Aculinx PicRip","Autonomy","BackStreet Browser","Bloodhound","BlackWidow","ChinaClaw","CLIDTmex","Crescent Internet ToolPak","Disco Pump","DA","Deweb","DnloadMage","DownloadIt","Download Demon","DownLoad Express","Download Wonder","eCatch","Express WebPictures","FileHound","FlashSite","FastNet","GetRight","GetSmart","GetURL","Go!Zilla","hhjhj@yahoo.com","HTMLgobble","HTTrack","Internet Research Software","Java102","JetCar","JoBo","JoBo Java Web Robot","JOC Web Spider","larbin","Leech","LeechFTP","Links SQL","Mass Downloader","MemoWeb","MetaProducts Download Express","Mozilla/2.0 (compatible;)","Mozilla/2.02 (compatible;)","Mozilla/3.0 (compatible;)","Mozilla/3.01 (compatible;)","MyGetRight","MS-Catapult","MS-Catapult/0.9","MS FrontPage","MSIECrawler","MSProxy","MSProxy/2.0","MyGetRight","Naviscope","NetAccelerator","NetAnts","NetCaptor","NetCarta","NetDrag","NetJet","Net M@nager","NetSonic","nicerspro","netattache","Net Vampire","NetZip Downloader","No me jodas que si no","pavuk","PC Accelerator","pcBrowser","PeakJet","Prozilla","Octopus","Offline Commander","Offline Explorer","RealDownload","RoboFox","SilentSurf","SiteAccelerator","Site Analyst","SiteSnagger","SmartDownload","Space Bison","Stamina","SuperBot","tarspider","Teleport Pro","Teleport-Pro","Templeton","tivraSpider","SpiderBot","TransSoft's Mail Control 5.0","UdmSearch","UtilMind HTTPGet","w3mir","Weazel","webbandit","WebCapture","WebCopy","WebCopier","Web Downloader","WebExe","WebFetch","webfetcher","WebFountain","Wget","Webleech","WebMirror","WebReaper","WebRecorder","WebRifle","WebSauger","Website Extractor","Webster","WebStripper","WebVac","WebVacuum","webwalk","WebWhacker","WebZIP","www4mail","WWWOFFLE","wwWhoosh Accelerator","XGET","e-collector","EmailSiphon","EmailWolf");


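$punish = 0;
// register_globals no longer exists, so the agent is read from $_SERVER
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// each() was removed in PHP 8; foreach does the same job
foreach ($browser as $val) {
    if (strpos ($agent, $val) !== false) {
        $punish = 1;
        break;
    }
}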

//Be sure to edit the e-mail address and custom page info below

if ($punish) {

$momento=date("dS F Y h:i:s A");

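// Email the webmaster
// Request data comes from $_SERVER, since register_globals no longer exists
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$msg = "The following session generated banned browser agent errors:\n";
$msg .= "Host: " . $_SERVER['REMOTE_ADDR'] . "\n";
$msg .= "Agent: $ua\n";
$msg .= "Referrer: $referer\n";
$msg .= "Document: " . $_SERVER['SERVER_NAME'] . $_SERVER['REQUEST_URI'] . "\n";
$msg .= "Date and time: $momento\n";
$headers = "X-Priority: 1\r\n";
$headers .= "From: Ban_Bot <webmaster@tudomino.com>\r\n";
$headers .= "X-Sender: <webmaster@tudomino.com>\r\n";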

mail ("webmaster@tudomino.com", "BANNED BROWSER AGENT ERROR", $msg, $headers );


// Write a plain-text log entry
// A "data" folder with 777 permissions must already exist
// On the first line below, set the actual location of the file

$myfile = fopen("data/banbot.txt","a");
// Skip logging if the file could not be opened
if ($myfile) {
    fputs($myfile,$msg);
    fclose($myfile);
}


// Optionally, redirect to another site
// so the robot does not use up your bandwidth
// header("Location: http://www.unsitio.com/");


// Print custom page
echo "<html>
<head>
<META NAME=\"ROBOTS\" CONTENT=\"NOINDEX, NOFOLLOW\">
<META HTTP-EQUIV=\"Content-Type\" CONTENT=\"text/html; charset=iso-8859-1\">
<meta http-equiv=\"Content-Style-Type\" content=\"text/css\">
<META NAME=\"COPYRIGHT\" CONTENT=\"Copyright (c) 2000-2002 by Grupo Alternativo / México Extremo\">
<meta name=\"copyright\" content=\"© 2000 - 2002 Grupo Alternativo / México Extremo. All rights reserved.\">
<title>México Extremo - 403 - Access DENIED</title>
</head>
<body bgcolor=\"white\">

<table width=\"580\" border=\"0\" cellpadding=\"4\" cellspacing=\"4\"><tr><td>
<img src=\"extremo1.png\" width=\"240\" height=\"70\" border=\"0\" alt=\"México Extremo\">
<h1>403 - Access DENIED</h1>

<p>Information NOT available. Questions: contact the Webmaster of this website.</p>
<p><small>Your IP address (<strong>{$_SERVER['REMOTE_ADDR']}</strong>) has been logged. Thank you.</small></p>

</td></tr>
</table>
</body>
</html>";
exit;
}

?>



To finish, include the script in index.php or wherever it fits best:

<?php include('block.php'); ?>
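
If you want to try the matching on its own before putting the include on a live page, the check can be pulled out into a small function. This is just a sketch: isBannedAgent is a name I made up, the sample list is a short excerpt of the one in block.php, and I use stripos() for case-insensitive matching, which is a deliberate change from the case-sensitive comparison in the script.

```php
<?php
// Hypothetical helper: returns true when $userAgent contains any entry
// of the $banned list.
function isBannedAgent($userAgent, array $banned)
{
    foreach ($banned as $needle) {
        // stripos() ignores case; the original script compares case-sensitively.
        // Empty entries are skipped so they cannot match everything.
        if ($needle !== '' && stripos($userAgent, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Short excerpt of the list in block.php, for illustration only.
$sample = array("HTTrack", "Wget", "WebZIP");

var_dump(isBannedAgent("Wget/1.21.3", $sample));                     // bool(true)
var_dump(isBannedAgent("Mozilla/5.0 (X11; Linux x86_64)", $sample)); // bool(false)
```

Keep in mind that very short entries in the full list, such as "DA", match any agent string that contains them, and ignoring case makes them match even more often, so it may be worth reviewing the list before using it.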
