[Tutorial] [Blocking * spider crawlers on DiscuzX] Using the Bytespider spider as an example

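Posted on 2020-6-25 14:20:16
Many webmasters * have noticed a new * spider crawler, Bytespider, which crawls * in huge volumes. Its behavior is no less damaging than a sustained CC attack: it imitates mobile phone users and fetches large amounts of information. Its GET requests look like a crawler's, yet it did not label itself with any crawler information (webmasters already know it belongs to a certain domestic company that runs a news-headline product). Its request rate often brings servers down.
So we need to take protective measures. This post shows how to block search-engine bots.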


The example below blocks * spiders on a site built with DiscuzX 3.4 R20191201.


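First, thanks to @dgqjj for the approach: robots.txt + UA-based blocking in the application (PHP code, nginx config, .htaccess rules, and so on; pick any one) + blocking the spider's IP ranges.
This is written for beginners who need to solve the * spider problem. Veterans can skip it; it is not necessarily well written.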


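Open the logs folder and look through the log files. They show a large number of crawl requests from the * spider Bytespider, which consume a lot of traffic.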
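To put a number on that traffic, a quick tally per user agent can help. The following is a minimal Python sketch, assuming the logs use the common Apache/nginx "combined" format, where the user agent is the last double-quoted field on each line. The `count_user_agents` helper and the sample log lines are invented for illustration and are not taken from the original logs.

```python
import re
from collections import Counter

# In the "combined" log format the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping each user-agent string to its request count."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample data; in practice, pass an open log file instead,
# e.g. count_user_agents(open("logs/access.log", encoding="utf-8")).
BYTESPIDER_UA = "Mozilla/5.0 (Linux; Android 8.0) (compatible; Bytespider)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/83.0"
SAMPLE_LOG = [
    f'220.243.136.10 - - [25/Jun/2020:14:00:01 +0800] "GET /forum.php HTTP/1.1" 200 512 "-" "{BYTESPIDER_UA}"',
    f'220.243.136.11 - - [25/Jun/2020:14:00:01 +0800] "GET /portal.php HTTP/1.1" 200 734 "-" "{BYTESPIDER_UA}"',
    f'203.0.113.7 - - [25/Jun/2020:14:00:02 +0800] "GET /forum.php HTTP/1.1" 200 512 "-" "{BROWSER_UA}"',
]

counts = count_user_agents(SAMPLE_LOG)
# Print the busiest user agents first.
for agent, hits in counts.most_common():
    print(hits, agent)
```

If a single crawler dominates the top of this list, it is a reasonable candidate for the blocking steps that follow.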


At this point we need to block it, which takes four steps.
Step 1: block with robots.txt rules
Copy and paste the following into a text file:

#
# robots.txt for Discuz! X3
#


User-agent: Googlebot
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Googlebot-Image
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Baiduspider-news
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Baiduspider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Baiduspider-image
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: baiduboxapp
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Sosospider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: bingbot
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: 360Spider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: HaosouSpider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: yisouspider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: YoudaoBot
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Sogou Orion spider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Sogou News Spider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Sogou blog
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Sogou Spider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Sogou spider2
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Sogou inst spider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Sogou web spider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: EasouSpider
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: MSNBot
Disallow: /api/
Disallow: /data/
Disallow: /source/
Disallow: /install/
Disallow: /template/
Disallow: /config/
Disallow: /uc_client/
Disallow: /uc_server/
Disallow: /static/
Disallow: /admin.php
Disallow: /search.php
Disallow: /member.php
Disallow: /api.php
Disallow: /misc.php
Disallow: /connect.php
Disallow: /forum.php?mod=redirect*
Disallow: /forum.php?mod=post*
Disallow: /home.php?mod=spacecp*
Disallow: /userapp.php?mod=app&*
Disallow: /*?mod=misc*
Disallow: /*?mod=attachment*
Disallow: /*mobile=yes*


User-agent: Bytespider
Disallow: /


User-Agent: *
Disallow: /
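Before uploading the file, it can be worth checking that the rules behave as intended. The sketch below uses `urllib.robotparser` from the Python standard library on a shortened sample of the rules above. Note that this parser only does plain prefix matching and does not interpret `*` wildcards inside paths, so the sample keeps only plain paths; the `allowed` helper and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Shortened sample of the rules above (plain paths only, no wildcards).
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /api/
Disallow: /admin.php

User-agent: Bytespider
Disallow: /

User-agent: *
Disallow: /
"""


def allowed(agent, url):
    """Return True if the sample rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


for agent in ("Googlebot", "Bytespider", "SomeOtherBot"):
    print(agent, allowed(agent, "https://example.com/forum.php"))
```

This only shows what a well-behaved crawler would conclude from the file; a crawler that ignores robots.txt is not affected by it.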
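Step 2: UA-based blocking (.htaccess rules): redirect or reject
Of course, * spiders do not follow the rules, so we continue. Copy and paste the following into a text file: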

# Turn the rewrite engine on
RewriteEngine On
# Set RewriteBase to your forum's directory (for example /discuz); use / if the forum is installed in the site root
RewriteBase /
# Discuz system rewrite rules; do not modify
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^topic-(.+)\.html$ portal.php?mod=topic&topic=$1&%1
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^article-([0-9]+)-([0-9]+)\.html$ portal.php?mod=view&aid=$1&page=$2&%1
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^forum-(\w+)-([0-9]+)\.html$ forum.php?mod=forumdisplay&fid=$1&page=$2&%1
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^thread-([0-9]+)-([0-9]+)-([0-9]+)\.html$ forum.php?mod=viewthread&tid=$1&extra=page\%3D$3&page=$2&%1
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^group-([0-9]+)-([0-9]+)\.html$ forum.php?mod=group&fid=$1&page=$2&%1
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^space-(username|uid)-(.+)\.html$ home.php?mod=space&$1=$2&%1
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^blog-([0-9]+)-([0-9]+)\.html$ home.php?mod=space&uid=$1&do=blog&id=$2&%1
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^(fid|tid)-([0-9]+)\.html$ archiver/index.php?action=$1&value=$2&%1
RewriteCond %{QUERY_STRING} ^(.*)$
RewriteRule ^([a-z]+[a-z0-9_]*)-([a-z0-9_\-]+)\.html$ plugin.php?id=$1:$2&%1
<IfModule mod_rewrite.c>
## Block several * search-engine crawlers at once (case-insensitive substring match on the User-Agent)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (Bytespider|stagefright|ExtLinksBot|DuckDuckGo) [NC]
RewriteRule ^(.*)$ https://weibo.com/u/7430037395?topnav=1&wvr=6&topsug=1&is_all=1 [R=302,L,NC]
# RewriteRule ^(.*)$ - [F]
</IfModule>
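To see which user agents such a condition would catch before deploying it, the same match can be tried offline. This is a minimal Python sketch of a case-insensitive substring match on the four names from the rule above; it only approximates the behavior of the `[NC]` flag, and the `is_blocked` helper and the sample user-agent strings are invented for illustration.

```python
import re

# The four names from the RewriteCond above.
BLOCKED_AGENTS = ("Bytespider", "stagefright", "ExtLinksBot", "DuckDuckGo")

# re.IGNORECASE plays the role of the [NC] flag; search() matches anywhere
# in the string, like an unanchored RewriteCond pattern.
BLOCKED_PATTERN = re.compile(
    "|".join(re.escape(name) for name in BLOCKED_AGENTS), re.IGNORECASE
)


def is_blocked(user_agent):
    """Return True if the User-Agent contains one of the blocked names."""
    return bool(BLOCKED_PATTERN.search(user_agent or ""))


SAMPLES = [
    "Mozilla/5.0 (Linux; Android 8.0) (compatible; bytespider)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/83.0",
]
for sample in SAMPLES:
    print(is_blocked(sample), sample)
```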

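Step 3: UA-based blocking (PHP code) as a filter [optional; prefer the server configuration rules]
Add the PHP code below at the entry point, then include it. Thanks to @gaoyibobo for the code.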
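<?php
// Block and filter malicious crawlers, spiders, and bots
// Read the User-Agent header; fall back to an empty string when it is missing
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

// User-Agent substrings to block
$now_ua = array('Bytespider');

if ($ua === '') {
        // Requests without a User-Agent are rejected outright
        header("Content-type: text/html; charset=utf-8");
        die('Please do not scrape this site. If you have questions, contact support!');
} else {
        foreach ($now_ua as $value) {
                // stripos() makes the match case-insensitive
                if (stripos($ua, $value) !== false) {
                        header("Content-type: text/html; charset=utf-8");
                        die('Please do not scrape this site. If you have questions, contact support!');
                }
        }
}

?>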

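In the /source/class/ directory, create a file named robots.txt, paste the code above into it, save it, and rename it to robots.php. Then include it from class_core.php.

Add the include at the top of class_core.php: include('robots.php');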

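Step 4: block the spider's IP ranges (note: the * spider's). Prefer banning the crawler's IP ranges at the server level; use the forum's own IP ban as the second choice.
For example (6.6.6.6 is a placeholder address):
RewriteEngine On
RewriteCond %{HTTP:X-Forwarded-For}&%{REMOTE_ADDR}&%{HTTP:X-Real-IP} (6\.6\.6\.6) [NC]
RewriteRule (.*) - [F]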

The forum-level IP ban works as follows:

The known IP ranges of the * spider Bytespider are listed below (new ones will be added as they appear):
220.243.136.*
220.243.135.*
110.249.202.*
110.249.201.*
111.225.148.*
111.225.149.*
60.8.123.*


Ban the * spider's IP ranges in these two places.


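Finally, to summarize:
Step 1: block with robots.txt rules
Step 2: UA-based blocking (.htaccess rules): redirect or reject
Step 3: UA-based blocking (PHP code) as a filter
Step 4: block the spider's IP ranges (note: the * spider's)
If you know of any new or better methods, please leave a reply to help others who run into this. That is all for now; best wishes to everyone.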

