
Scraping Moji Weather with Python 3 and sending it to WeChat friends (source code included)

news · Source: original · 2024/5/5 16:36:02

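Requirements:

1. Scrape weather information from Moji Weather (墨迹天气): temperature, humidity, wind speed, UV index, license-plate driving restrictions, lifestyle tips, and so on.

2. Take the city to query as input and scrape the matching information automatically.

3. Connect to WeChat and send the result to a chosen friend.

The approach is straightforward and has two parts: the scraper, and connecting to WeChat from Python (the personal edition, not the enterprise edition).

Start by looking at the Moji Weather page of any city. Shijiazhuang, for example, is at "https://tianqi.moji.com/weather/china/hebei/shijiazhuang". Comparing the URLs of a few cities shows the pattern: the prefix is always the same, and the path ends with the province in pinyin, a slash, and the city in pinyin. For the direct-administered municipalities the two parts are identical. There are exceptions, though. Shanxi (山西) and Shaanxi (陕西) have the same toneless pinyin, "shanxi", but the latter is romanized as Shaanxi, so take care with the input in that case.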

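from xpinyin import Pinyin

prov = input("请输入省份：")  # prompt: "Enter the province" (typed in Chinese)
city = input("请输入城市：")  # prompt: "Enter the city" (typed in Chinese)
pin = Pinyin()
prov_pin = pin.get_pinyin(prov, '')  # convert Chinese characters to pinyin, with no separator
city_pin = pin.get_pinyin(city, '')
url = "https://tianqi.moji.com/weather/china/"
url = url + prov_pin + '/' + city_pin
print(url)

The province and city entered by the user are appended to the fixed prefix to form the full URL to scrape. The user types Chinese, while the URL needs pinyin, so I installed the third-party library xpinyin for the conversion.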

from bs4 import BeautifulSoup
from urllib import request
import datetime

# --- fetch weather info: begin ---
htmlData = request.urlopen(url).read().decode('utf-8')
soup = BeautifulSoup(htmlData, 'lxml')
# print(soup.prettify())
weather = soup.find('div', attrs={'class': "wea_weather clearfix"})
# print(weather)
temp1 = weather.find('em').get_text()
temp2 = weather.find('b').get_text()
# With select(), a class attribute that contains spaces must be written with "." in place of each space
# Air quality index (AQI)
AQI = soup.select(".wea_alert.clearfix > ul > li > a > em")[0].get_text()
H = soup.select(".wea_about.clearfix > span")[0].get_text()  # humidity
S = soup.select(".wea_about.clearfix > em")[0].get_text()  # wind speed
if prov in ('北京', '天津'):
    F = soup.select(".wea_about.clearfix > b")[0].get_text()  # tail-number driving restriction
A = soup.select(".wea_tips.clearfix em")[0].get_text()  # today's weather tip
U = soup.select(".live_index_grid > ul > li")[-3].find('dt').get_text()  # UV intensity
# print(AQI, H, S, A, U)
DATE = str(datetime.date.today())  # today's date as ****-**-**
# Message lines: greeting, place and date, current temperature, humidity, wind speed, UV, today's tip
if prov in ('北京', '天津', '上海', '重庆'):
    if prov in ('北京', '天津'):
        # Beijing and Tianjin also get a "today's driving restriction" line
        info = '来自大明明的天气问候\n' + city + '市' + ',' + DATE + '\n' + '实时温度：' + temp1 + '℃' + ',' + temp2 + '\n湿度：' + H + '\n风速：' + S + '\n紫外线：' + U + '\n今日提示：' + A + '\n' + '今日限行：' + F
    else:
        info = '来自大明明的天气问候\n' + city + '市' + ',' + DATE + '\n' + '实时温度：' + temp1 + '℃' + ',' + temp2 + '\n湿度：' + H + '\n风速：' + S + '\n紫外线：' + U + '\n今日提示：' + A
else:
    info = '来自大明明的天气问候\n' + prov + '省' + city + '市' + ',' + DATE + '\n' + '实时温度：' + temp1 + '℃' + ',' + temp2 + '\n湿度：' + H + '\n风速：' + S + '\n紫外线：' + U + '\n今日提示：' + A
# print(info)

# Tomorrow's weather
tomorrow = soup.select(".days.clearfix")[1].find_all('li')
temp_t = tomorrow[2].get_text().replace('°', '℃') + ',' + tomorrow[1].find('img').attrs['alt']  # tomorrow's temperature
S_t1 = tomorrow[3].find('em').get_text()
S_t2 = tomorrow[3].find('b').get_text()
S_t = S_t1 + S_t2  # tomorrow's wind speed
AQI_t = tomorrow[-1].get_text().strip()  # tomorrow's air quality
# Message lines: "Tomorrow's weather", temperature, wind speed, air quality
info_t = '\n明日天气：\n' + '温度：' + temp_t + '\n' + '风速：' + S_t + '\n空气质量：' + AQI_t + '\n'
# print(info_t)
# --- fetch weather info: end ---
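The three `info` strings above differ only in the place header and in the optional restriction line. As a minimal sketch, they could be collapsed into one function. The names `build_today_message` and `MUNICIPALITIES` and all parameter names are hypothetical and do not appear in the original script.

```python
# Hypothetical helper, not part of the original script.
MUNICIPALITIES = ('北京', '天津', '上海', '重庆')  # the four direct-administered municipalities


def build_today_message(prov, city, date, temp, desc, humidity, wind, uv, tip,
                        restriction=None, greeting='来自大明明的天气问候'):
    """Assemble today's message; the restriction line is added only when given."""
    # Municipalities are shown as "北京市"; other cities as "河北省石家庄市".
    if prov in MUNICIPALITIES:
        place = city + '市'
    else:
        place = prov + '省' + city + '市'
    lines = [
        greeting,
        place + ',' + date,
        '实时温度：' + temp + '℃' + ',' + desc,
        '湿度：' + humidity,
        '风速：' + wind,
        '紫外线：' + uv,
        '今日提示：' + tip,
    ]
    if restriction is not None:
        lines.append('今日限行：' + restriction)
    return '\n'.join(lines)
```

In the script, `restriction` would be passed only for Beijing and Tianjin, where `F` is defined, for example `build_today_message(prov, city, DATE, temp1, temp2, H, S, U, A, restriction=F)`.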

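A few things to note:

1. Not every city has tail-number driving restrictions, so this needs a check.

2. For the municipalities, avoid printing something like "北京省北京市" (Beijing Province, Beijing City), which reads awkwardly.

3. select() filters by class name or id name; mind how same-level and next-level selectors are written. find and find_all look elements up by tag name.

4. For the attributes of a single tag such as <img alt=**** src='***************************.jpg'>, that is, the value after alt= or the link after src=, a regular expression feels cumbersome. Reading the attribute from the tag, as in .find('img').attrs['alt'] in the code above, is simpler.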
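The script reads the alt text through BeautifulSoup's `attrs`. For comparison only, here is a sketch of the same idea using just the standard library's `html.parser`, which also avoids regular expressions. The class name `ImgAltCollector` and the HTML fragment are made up for illustration and are not taken from the Moji Weather page.

```python
from html.parser import HTMLParser


class ImgAltCollector(HTMLParser):
    """Collect the alt text of every <img> tag fed to the parser (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        if tag == 'img':
            alt = dict(attrs).get('alt')
            if alt is not None:
                self.alts.append(alt)


# Made-up fragment for illustration only.
sample = '<li><span><img alt="晴" src="https://example.com/w0.png"></span>晴</li>'
collector = ImgAltCollector()
collector.feed(sample)
print(collector.alts)
```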

# --- fetch lifestyle tips: begin ---
# url1 = 'https://tianqi.moji.com/'
# url3 = '/china/beijing/beijing'
# Dictionary of tips: URL path segment -> Chinese label used in the message
# (cold risk, makeup index, UV level, clothing index, car washing, exercise)
tips_dict = {'cold':'感冒预测','makeup':'化妆指数','uray':'紫外线量','dress':'穿衣指数','car':'关于洗车','sport':'运动事宜'}
info_tips = ''
for i in tips_dict:
    url_tips = url.replace('weather', i)
    # url_tips = url1 + i + url3
    # print(url_tips)
    htmlData = request.urlopen(url_tips).read().decode('utf-8')
    soup = BeautifulSoup(htmlData, 'lxml')
    tips = soup.select(".aqi_info_tips > dd")[0].get_text()
    # print(tips)
    info_tips = info_tips + tips_dict.get(i) + ':' + tips + '\n'
# print(info_tips)
# --- fetch lifestyle tips: end ---

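The lifestyle tips live on separate pages. Those pages share the same URL form, with only the "weather" segment in the middle replaced by another word, so a single loop covers them.

A dictionary is used here because I wanted Chinese labels in the output.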
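The URL rewriting in that loop can be checked offline by separating it from the download step. This is a minimal sketch; `tip_urls` is a hypothetical name. It replaces only the first "/weather/" path segment, which is slightly stricter than the plain `replace('weather', i)` in the script.

```python
def tip_urls(weather_url, labels):
    """Return (label, url) pairs, one per tips page (hypothetical helper)."""
    pairs = []
    for segment, label in labels.items():
        # Swap only the first '/weather/' path segment for the tips segment
        pairs.append((label, weather_url.replace('/weather/', '/' + segment + '/', 1)))
    return pairs


base = 'https://tianqi.moji.com/weather/china/hebei/shijiazhuang'
for label, tips_url in tip_urls(base, {'cold': '感冒预测', 'car': '关于洗车'}):
    print(label, tips_url)
```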

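Connecting to WeChat requires the third-party library itchat, and the connection itself is a single line. The first login pops up a QR code, which you scan with your phone to log in.

# Connect to WeChat
itchat.auto_login(hotReload=True)  # for a while afterwards, runs do not need another QR-code scan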

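Full code

"""
Fetch weather information from Moji Weather and push it to WeChat friends.
"""
from bs4 import BeautifulSoup
from urllib import request
import datetime
import itchat
from xpinyin import Pinyin

prov = input("请输入省份：")  # prompt: "Enter the province" (typed in Chinese)
city = input("请输入城市：")  # prompt: "Enter the city" (typed in Chinese)
pin = Pinyin()
prov_pin = pin.get_pinyin(prov, '')  # convert Chinese characters to pinyin, with no separator
city_pin = pin.get_pinyin(city, '')
url = "https://tianqi.moji.com/weather/china/"
url = url + prov_pin + '/' + city_pin
print(url)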

# --- fetch weather info: begin ---
htmlData = request.urlopen(url).read().decode('utf-8')
soup = BeautifulSoup(htmlData, 'lxml')
# print(soup.prettify())
weather = soup.find('div', attrs={'class': "wea_weather clearfix"})
# print(weather)
temp1 = weather.find('em').get_text()
temp2 = weather.find('b').get_text()
# With select(), a class attribute that contains spaces must be written with "." in place of each space
# Air quality index (AQI)
AQI = soup.select(".wea_alert.clearfix > ul > li > a > em")[0].get_text()
H = soup.select(".wea_about.clearfix > span")[0].get_text()  # humidity
S = soup.select(".wea_about.clearfix > em")[0].get_text()  # wind speed
if prov in ('北京', '天津'):
    F = soup.select(".wea_about.clearfix > b")[0].get_text()  # tail-number driving restriction
A = soup.select(".wea_tips.clearfix em")[0].get_text()  # today's weather tip
U = soup.select(".live_index_grid > ul > li")[-3].find('dt').get_text()  # UV intensity
# print(AQI, H, S, A, U)
DATE = str(datetime.date.today())  # today's date as ****-**-**
# Message lines: greeting, place and date, current temperature, humidity, wind speed, UV, today's tip
if prov in ('北京', '天津', '上海', '重庆'):
    if prov in ('北京', '天津'):
        # Beijing and Tianjin also get a "today's driving restriction" line
        info = '来自XX的天气问候\n' + city + '市' + ',' + DATE + '\n' + '实时温度：' + temp1 + '℃' + ',' + temp2 + '\n湿度：' + H + '\n风速：' + S + '\n紫外线：' + U + '\n今日提示：' + A + '\n' + '今日限行：' + F
    else:
        info = '来自XX的天气问候\n' + city + '市' + ',' + DATE + '\n' + '实时温度：' + temp1 + '℃' + ',' + temp2 + '\n湿度：' + H + '\n风速：' + S + '\n紫外线：' + U + '\n今日提示：' + A
else:
    info = '来自XX的天气问候\n' + prov + '省' + city + '市' + ',' + DATE + '\n' + '实时温度：' + temp1 + '℃' + ',' + temp2 + '\n湿度：' + H + '\n风速：' + S + '\n紫外线：' + U + '\n今日提示：' + A
# print(info)

# Tomorrow's weather
tomorrow = soup.select(".days.clearfix")[1].find_all('li')
# Reading an attribute of a tag like <img alt=***** src="*************************.jpg">
temp_t = tomorrow[2].get_text().replace('°', '℃') + ',' + tomorrow[1].find('img').attrs['alt']  # tomorrow's temperature
S_t1 = tomorrow[3].find('em').get_text()
S_t2 = tomorrow[3].find('b').get_text()
S_t = S_t1 + S_t2  # tomorrow's wind speed
AQI_t = tomorrow[-1].get_text().strip()  # tomorrow's air quality
# Message lines: "Tomorrow's weather", temperature, wind speed, air quality
info_t = '\n明日天气：\n' + '温度：' + temp_t + '\n' + '风速：' + S_t + '\n空气质量：' + AQI_t + '\n'
# print(info_t)
# --- fetch weather info: end ---

# --- fetch lifestyle tips: begin ---
# url1 = 'https://tianqi.moji.com/'
# url3 = '/china/beijing/beijing'
# Dictionary of tips: URL path segment -> Chinese label used in the message
# (cold risk, makeup index, UV level, clothing index, car washing, exercise)
tips_dict = {'cold':'感冒预测','makeup':'化妆指数','uray':'紫外线量','dress':'穿衣指数','car':'关于洗车','sport':'运动事宜'}
info_tips = ''
for i in tips_dict:
    url_tips = url.replace('weather', i)
    # url_tips = url1 + i + url3
    # print(url_tips)
    htmlData = request.urlopen(url_tips).read().decode('utf-8')
    soup = BeautifulSoup(htmlData, 'lxml')
    tips = soup.select(".aqi_info_tips > dd")[0].get_text()
    # print(tips)
    info_tips = info_tips + tips_dict.get(i) + ':' + tips + '\n'
# print(info_tips)
# --- fetch lifestyle tips: end ---

# Connect to WeChat
itchat.auto_login(hotReload=True)  # for a while afterwards, runs do not need another QR-code scan

# Send a message to your own File Transfer helper (filehelper); this needs no access to the contact list
# itchat.send('❤来自大明明的天气问候❤', toUserName='filehelper')
# I = itchat.search_friends()  # your own info, returned as a dict of attributes
# friends = itchat.get_friends(update=True)  # returns <class 'itchat.storage.templates.ContactList'>; treat it as a list in which each element is a dict describing one friend
# userName = itchat.search_friends(userName='@b895b018931614e8d30a16b15a8db2da')  # look up the user with a specific UserName

info_all = '❤❤❤❤❤❤❤❤❤❤❤\n' + info + '\n' + info_tips + info_t + '❤❤❤❤❤❤❤❤❤❤❤'
print(info_all)


# Send to an individual friend
def sendToPerson(nickName):
    user = itchat.search_friends(name=nickName)  # searches by remark name or nickname (a WeChat ID does not work); all friends with a matching name are returned as a list
    # print(user)
    userName = user[0]['UserName']
    itchat.send(info_all, toUserName=userName)
    print('succeed')


# Send to a group chat
def sendToRoom(nickName):
    user = itchat.search_chatrooms(name=nickName)  # supports fuzzy matching
    # print(user)
    userName = user[0]['UserName']
    itchat.send(info_all, toUserName=userName)
    print('succeed')


sendToPerson(input("你要问候哪位小宝贝呀？"))  # prompt: "Which friend do you want to greet?"
sendToRoom(input("你要轰炸哪个群呀？"))  # prompt: "Which group do you want to message?"
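Both send functions index `user[0]` directly, which raises an IndexError when the search finds nothing. A minimal sketch of a guarded version follows. `first_user_name` and `send_if_found` are hypothetical names, and the send function is passed in as a parameter so the logic can be tried without logging in to WeChat.

```python
def first_user_name(matches):
    """Return the UserName of the first match, or None if the search found nothing."""
    if not matches:
        return None
    return matches[0]['UserName']


def send_if_found(matches, text, send):
    """Send `text` to the first match via `send`; return True if a message was sent."""
    user_name = first_user_name(matches)
    if user_name is None:
        return False
    send(text, toUserName=user_name)
    return True
```

Inside the script this might be called as `send_if_found(itchat.search_friends(name=nickName), info_all, itchat.send)`, printing a warning when it returns False.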

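How it looks in WeChat:

Room for improvement:

1. For some place names, the URL does not match the pinyin of the Chinese characters. Qiqihar (齐齐哈尔), for example, is qiqihaer in pinyin but qiqihar in the URL. There are many such cases, so it would be best to prepare a lookup dictionary in advance.

2. WeChat cannot hold a long-lived connection; it logs out after a while, so daily scheduled pushes are not possible.

3. This program only goes down to the city level. Moji Weather can drill down further to districts, which needs a dictionary of Chinese urban districts even more.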
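One way to approach the first point is a small override table consulted before falling back to pinyin conversion. The sketch below is illustrative only. `PROVINCE_SLUGS`, `CITY_SLUGS`, `to_slug` and `build_weather_url` are hypothetical names, the tables contain just the two exceptions mentioned in this article, and the converter is passed in so the example does not depend on xpinyin.

```python
# Hypothetical override tables; only the two exceptions named in this article are listed.
PROVINCE_SLUGS = {'陕西': 'shaanxi'}
CITY_SLUGS = {'齐齐哈尔': 'qiqihar'}

BASE_URL = 'https://tianqi.moji.com/weather/china/'


def to_slug(name, overrides, convert):
    """Return the override for `name` if one exists, otherwise convert(name)."""
    if name in overrides:
        return overrides[name]
    return convert(name)


def build_weather_url(prov, city, convert):
    """Build the weather URL, applying the override tables before pinyin conversion."""
    return (BASE_URL + to_slug(prov, PROVINCE_SLUGS, convert)
            + '/' + to_slug(city, CITY_SLUGS, convert))
```

With xpinyin installed, the converter could be `lambda s: Pinyin().get_pinyin(s, '')`, matching the conversion used in the script.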

Reposted from: https://blog.51cto.com/13870710/2300867
