當(dāng)前位置:首頁 > IT技術(shù) > 微信平臺 > 正文

python微信域名或者鏈接批量檢測
2021-07-24 16:23:53

好久沒上來寫博客了,直入主題。

大家經(jīng)常用google搜索,如何提取搜索結(jié)果的鏈接呢

google搜索結(jié)果url提取,F12,來到console端; 粘貼下面語句,回車。

?

var tag=document.getElementsByClassName('r');

 for (var i=0;i<tag.length;i++){
        var a=tag[i].getElementsByTagName("a");
        console.log(a[0].href)
 }

提取出來,保存到url.txt. 待檢測的url和域名,一行一個,先經(jīng)過去重去空白行

import io
import shutil
readPath='oldurl.txt'
writePath='url.txt'
lines_seen=set()
outfiile=io.open(writePath,'a+',encoding='utf-8')
f=io.open(readPath,'r',encoding='utf-8')
for line in f:
    if not len(line):
        continue
    if line not in lines_seen:
        outfiile.write(line)
        lines_seen.add(line)

然后再批量檢測

ok.txt 域名正常

red.txt 已經(jīng)屏蔽的域名和鏈接

#! /usr/bin/env python
#coding:utf-8
import os,urllib,linecache
import sys
import time
import requests

result = list()
strxx = '"Code":"102"'
html = ''
for y in linecache.updatecache(r'url.txt'):
    try:
       headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}
       #response = urllib.urlopen(x)        
       #html = response.read()
       x = 'http://wx.rrbay.com/pro/wxUrlCheck.ashx?url=' +  y
       response = requests.get(x,headers=headers)
       html = response.text

       time.sleep(3)
       #print x,a
    except Exception,e:
        html = ''
        print e
    if strxx in html:
        print 'ok:'
        print x
        with open ('ok.txt','a') as f:  
            f.write(y)
    else:
        print 'error:'        
        print y
        html = ''
        with open ('red.txt','a') as f:  
            f.write(y)

?

本文摘自 :https://blog.51cto.com/u

開通會員,享受整站包年服務(wù)立即開通 >