Python: Accessing URLs That Contain Chinese Characters

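CCJ needed a speech API for a project, and that API's URLs contain Chinese characters. A URL can only carry ASCII, so the Chinese text has to be percent-encoded before the URL is requested and decoded again when the URL is parsed. Python strings are Unicode, and urllib percent-encodes them as UTF-8 by default, so the conversion has to be written explicitly in the code.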
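
To make the conversion concrete, here is a minimal sketch of what percent-encoding does to a Chinese string. The sample text is only an illustration; the point is that every escape produced by `quote()` is one byte of the string's UTF-8 encoding.

```python
from urllib import parse

text = "老王测试"  # sample text, chosen for illustration

# quote() encodes the string as UTF-8 and writes each byte as %XX
encoded = parse.quote(text)
print("Encoded:", encoded)

# Build the same escapes by hand from the UTF-8 bytes
utf8_escapes = "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))
print("Same as the UTF-8 bytes:", encoded == utf8_escapes)

# unquote() reverses the encoding
decoded = parse.unquote(encoded)
print("Round trip OK:", decoded == text)
```

Each Chinese character in this sample takes three bytes in UTF-8, which is why it expands to three `%XX` escapes.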

Eg.1 Accessing and parsing a URL with Chinese characters:

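import re
from urllib import parse

"""
This example:
    The URL to access contains Chinese characters. It is percent-encoded, decoded again,
    and the Chinese text is then extracted with a regular expression.
"""

# URL to access
url = "https://api.oick.cn/txt/apiz.php?text=老王测试&spd=2"

# Encode: percent-encode the Chinese characters in the URL (as UTF-8 bytes).
# safe= keeps the URL delimiters; the default safe='/' would also escape ':', '?', '=' and '&'.
data = parse.quote(url, safe=":/?=&")
print("Encoded URL:", data)

# Send the request... omitted here

# Assume data is the request that was received, and decode it
de_data = parse.unquote(data, encoding='utf-8')  # turn the percent-escapes back into Chinese characters
print("Decoded URL:", de_data)

# Match every character that is NOT a Chinese character
pattern = re.compile("[^\u4e00-\u9fa5]")
onlyChinese = re.sub(pattern, '', de_data)  # replace all non-Chinese characters with the empty string
print("Chinese text extracted:", onlyChinese)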
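
The regular expression keeps every Chinese character anywhere in the URL and discards everything else, so it would merge text from several parameters and drop any digits or Latin letters inside a value. If the goal is the value of one specific parameter, parsing the query string may be more robust. A minimal sketch, assuming the parameter names `text` and `spd` from the URL above:

```python
from urllib import parse

# Percent-encoded form of the example URL
encoded_url = (
    "https://api.oick.cn/txt/apiz.php"
    "?text=%E8%80%81%E7%8E%8B%E6%B5%8B%E8%AF%95&spd=2"
)

# Split off the query string, then parse it into a dict of lists.
# parse_qs() percent-decodes the values as UTF-8 by default.
query = parse.urlsplit(encoded_url).query
params = parse.parse_qs(query)

text_value = params["text"][0]
speed = params["spd"][0]
print("text:", text_value)
print("spd:", speed)
```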

Eg.2 Submitting a URL with Chinese characters:

from urllib import parse, request

url = "https://api.oick.cn/txt/apiz.php?"

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36"
}

formData = {
    "text": "老王说鸡哥牛逼",
    "spd": "5"
}

# Goal: url = "https://api.oick.cn/txt/apiz.php?text=老王说鸡哥牛逼&spd=5", with the Chinese percent-encoded

# Encode the parameters and append them to the URL as the query string.
# (Passing them as data= instead would send a POST with the fields in the body and leave the URL unchanged.)
data = parse.urlencode(formData, encoding="utf-8")
req = request.Request(url=url + data, headers=headers)

# Receive the response
response = request.urlopen(req)
move_info = response.read().decode()
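
Whether urllib puts the fields in the URL or in the request body depends on whether `data=` is passed to `Request`. The sketch below builds both kinds of request without sending anything, so the difference can be inspected offline. The sample text and the names `base_url` and `form_data` are illustrative, and which method the API actually accepts is not something this example can confirm.

```python
from urllib import parse, request

base_url = "https://api.oick.cn/txt/apiz.php"
form_data = {"text": "老王测试", "spd": "5"}  # sample values for illustration

# urlencode() percent-encodes each value (UTF-8 by default) and joins the pairs with '&'
query = parse.urlencode(form_data)
print("Query string:", query)

# No data= argument: the fields travel in the URL and the method is GET
get_req = request.Request(url=base_url + "?" + query)

# With data= (bytes): the fields travel in the body and the method becomes POST
post_req = request.Request(url=base_url, data=query.encode("ascii"))

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url)
```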


Posted on September 16, 2021 by Mustenaka