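Notes
iFLYTEK speech transcription API documentation: https://www.xfyun.cn/doc/asr/lfasr/API.html
iFLYTEK speech transcription Python 3 demo download: http://xfyun-doc.ufile.ucloud.com.cn/1564736425808301/weblfasr_python3_demo.zip
My previous post covered transcribing audio files with Baidu AI Cloud, but the results were pretty dismal, at least on my recordings. So I gave iFLYTEK a try, figuring that a company specializing in speech ought to do better. Its Python 3 demo for speech transcription is indeed quite good, with a clean function interface, and all the code in this post comes from that demo. Recognition accuracy is decent, and unlike Baidu, recognition does not wait until the top of the hour to start, so the result comes back quickly.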
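If your recording has more than one speaker, as in a phone call, and you want the transcript to separate what each person said, add two extra speaker-separation parameters to the pre-processing (prepare) request. The demo does not add these two parameters by default, and without them no speaker (role) separation is performed.
With them added, the speaker values in the transcription result are no longer all 0, and the result is split by speaker.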
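A rough sketch of such a change follows. `has_seperate` is the flag name the demo itself defines near the top of the file; `speaker_number` is my assumption for the second parameter's name, and `add_speaker_separation` is a hypothetical helper, so check both against the API documentation before relying on them.

```python
def add_speaker_separation(param_dict, speaker_number=2):
    """Return a copy of the prepare parameters with speaker separation enabled.

    Assumptions (not confirmed by the post): the two extra parameters are
    named 'has_seperate' and 'speaker_number', and both are sent as strings
    like the other prepare parameters.
    """
    params = dict(param_dict)  # leave the caller's dict untouched
    params['has_seperate'] = 'true'
    params['speaker_number'] = str(speaker_number)
    return params


# Example: extend a placeholder prepare parameter dict.
prepare_params = {'app_id': '***', 'file_name': 'call.wav', 'slice_num': '1'}
print(add_speaker_separation(prepare_params))
```

In the demo, the two assignments would most naturally go into the `api_prepare` branch of `gene_params`.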
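Operating system: Windows
Python: 3.6
Available quota: free accounts get 5 hours, so use them sparingly.
Audio properties: 16 kHz or 8 kHz sample rate, 8-bit or 16-bit depth, mono or multi-channel
Audio formats: wav/flac/opus/m4a/mp3
Audio size: no more than 500M
Audio duration: no more than 5 hours; 5 minutes or longer is recommended
Languages: Mandarin Chinese, English
Result retention: 30 days. (The same recording does not need to be uploaded again. Once it has been uploaded and recognized, calling api.get_result_request(taskid) fetches the result again, where taskid is the task ID assigned on the first upload. This avoids wasting quota on repeated uploads.)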
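Since quota is limited, it can help to check a file against the limits above before uploading. The sketch below only checks the extension and the size; sample rate, bit depth and duration are not inspected. `check_audio_file` is a hypothetical helper, and reading "500M" as 500 MiB is my assumption.

```python
import os

ALLOWED_EXTENSIONS = {'.wav', '.flac', '.opus', '.m4a', '.mp3'}
MAX_SIZE_BYTES = 500 * 1024 * 1024  # assumption: "500M" means 500 MiB


def check_audio_file(path):
    """Return a list of problems found; an empty list means the file looks OK."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append('unsupported format: ' + (ext or '(none)'))
    if not os.path.isfile(path):
        problems.append('file not found')
    elif os.path.getsize(path) > MAX_SIZE_BYTES:
        problems.append('file larger than 500M')
    return problems


# Example with a placeholder path; prints the problems found, if any.
print(check_audio_file('recording.wav'))
```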
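Getting the APP_ID and SECRET_KEY
iFLYTEK does not seem to need an API_KEY. Otherwise its authorization works much like that of other large vendors:
1. Click "Console" at the top right of the page and log in with your iFLYTEK account (register one if you don't have it) to enter the iFLYTEK Open Platform.
2. At the top of the left navigation bar, choose Speech Recognition -> Speech Transcription -> Offline Speech Transcription.
3. Apply for the service. Click "Create Application". The entries under "Interface Selection" are already ticked by default, so unless you need something else, leave them alone. Fill in the remaining details and click the "Create Now" button at the bottom. You can manually claim the 5-hour free trial package.
4. Once the application is created, the page shows "Creation Complete". Click "Back to Application List" and open the details of the new application. The AppID and SecretKey are shown in the service interface authentication panel.
Enough talk, here is the code.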
# -*- coding: utf-8 -*-
#
# author: yanmeng2
#
# Demo: calling the non-real-time transcription API
import base64
import hashlib
import hmac
import json
import os
import time

import requests

lfasr_host = 'http://raasr.xfyun.cn/api'

# Names of the API endpoints
api_prepare = '/prepare'
api_upload = '/upload'
api_merge = '/merge'
api_get_progress = '/getProgress'
api_get_result = '/getResult'

# File slice size: 10M
file_piece_sice = 10485760

# ---------- Configurable transcription parameters ----------
# The parameters are listed on the official site (https://doc.xfyun.cn/rest_api/%E8%AF%AD%E9%9F%B3%E8%BD%AC%E5%86%99.html);
# add or change them in the gene_params method as needed
# Transcription type
lfasr_type = 0
# Whether to enable word segmentation
has_participle = 'false'
has_seperate = 'true'
# Number of alternative candidates
max_alternatives = 0
# Sub-user identifier
suid = ''


class SliceIdGenerator:
    """Slice ID generator"""

    def __init__(self):
        # The last character is '`' (one before 'a'), so the first ID issued is 'aaaaaaaaaa'
        self.__ch = 'aaaaaaaaa`'

    def getNextSliceId(self):
        ch = self.__ch
        j = len(ch) - 1
        while j >= 0:
            cj = ch[j]
            if cj != 'z':
                # Increment this position and stop
                ch = ch[:j] + chr(ord(cj) + 1) + ch[j + 1:]
                break
            else:
                # Wrap 'z' back to 'a' and carry into the position to the left
                ch = ch[:j] + 'a' + ch[j + 1:]
                j = j - 1
        self.__ch = ch
        return self.__ch


class RequestApi(object):
    def __init__(self, appid, secret_key, upload_file_path):
        self.appid = appid
        self.secret_key = secret_key
        self.upload_file_path = upload_file_path

    # Build the parameters for the given apiname. This example does not use every parameter;
    # see the official site (https://doc.xfyun.cn/rest_api/%E8%AF%AD%E9%9F%B3%E8%BD%AC%E5%86%99.html)
    # and swap in the ones that suit your use case
    def gene_params(self, apiname, taskid=None, slice_id=None):
        appid = self.appid
        secret_key = self.secret_key
        upload_file_path = self.upload_file_path
        ts = str(int(time.time()))
        m2 = hashlib.md5()
        m2.update((appid + ts).encode('utf-8'))
        md5 = m2.hexdigest()
        md5 = bytes(md5, encoding='utf-8')
        # signa is the HMAC-SHA1 of the md5 above (the message), keyed with secret_key, then Base64-encoded
        signa = hmac.new(secret_key.encode('utf-8'), md5, hashlib.sha1).digest()
        signa = base64.b64encode(signa)
        signa = str(signa, 'utf-8')
        file_len = os.path.getsize(upload_file_path)
        file_name = os.path.basename(upload_file_path)
        param_dict = {}

        if apiname == api_prepare:
            # slice_num is the number of slices; for short audio you can skip slicing and simply set slice_num to 1
            slice_num = int(file_len / file_piece_sice) + (0 if (file_len % file_piece_sice == 0) else 1)
            param_dict['app_id'] = appid
            param_dict['signa'] = signa
            param_dict['ts'] = ts
            param_dict['file_len'] = str(file_len)
            param_dict['file_name'] = file_name
            param_dict['slice_num'] = str(slice_num)
        elif apiname == api_upload:
            param_dict['app_id'] = appid
            param_dict['signa'] = signa
            param_dict['ts'] = ts
            param_dict['task_id'] = taskid
            param_dict['slice_id'] = slice_id
        elif apiname == api_merge:
            param_dict['app_id'] = appid
            param_dict['signa'] = signa
            param_dict['ts'] = ts
            param_dict['task_id'] = taskid
            param_dict['file_name'] = file_name
        elif apiname == api_get_progress or apiname == api_get_result:
            param_dict['app_id'] = appid
            param_dict['signa'] = signa
            param_dict['ts'] = ts
            param_dict['task_id'] = taskid
        return param_dict

    # Send the request and parse the result. For the meaning of each field in the result, see:
    # https://doc.xfyun.cn/rest_api/%E8%AF%AD%E9%9F%B3%E8%BD%AC%E5%86%99.html
    def gene_request(self, apiname, data, files=None, headers=None):
        response = requests.post(lfasr_host + apiname, data=data, files=files, headers=headers)
        result = json.loads(response.text)
        if result["ok"] == 0:
            print("{} success:".format(apiname) + str(result))
            return result
        else:
            # Any non-zero "ok" is treated as fatal: print the response and stop the program
            print("{} error:".format(apiname) + str(result))
            exit(0)
            return result

    # Pre-processing
    def prepare_request(self):
        return self.gene_request(apiname=api_prepare,
                                 data=self.gene_params(api_prepare))

    # Upload
    def upload_request(self, taskid, upload_file_path):
        file_object = open(upload_file_path, 'rb')
        try:
            index = 1
            sig = SliceIdGenerator()
            while True:
                # Read the file one slice at a time
                content = file_object.read(file_piece_sice)
                if not content or len(content) == 0:
                    break
                files = {
                    "filename": self.gene_params(api_upload).get("slice_id"),
                    "content": content
                }
                response = self.gene_request(api_upload,
                                             data=self.gene_params(api_upload, taskid=taskid,
                                                                   slice_id=sig.getNextSliceId()),
                                             files=files)
                if response.get('ok') != 0:
                    # Uploading this slice failed
                    print('upload slice fail, response: ' + str(response))
                    return False
                print('upload slice ' + str(index) + ' success')
                index += 1
        finally:
            'file index:' + str(file_object.tell())
            file_object.close()
        return True

    # Merge
    def merge_request(self, taskid):
        return self.gene_request(api_merge, data=self.gene_params(api_merge, taskid=taskid))

    # Get progress
    def get_progress_request(self, taskid):
        return self.gene_request(api_get_progress, data=self.gene_params(api_get_progress, taskid=taskid))

    # Get result
    def get_result_request(self, taskid):
        return self.gene_request(api_get_result, data=self.gene_params(api_get_result, taskid=taskid))

    def all_api_request(self):
        # 1. Pre-processing
        pre_result = self.prepare_request()
        taskid = pre_result["data"]
        # 2. Upload the slices
        self.upload_request(taskid=taskid, upload_file_path=self.upload_file_path)
        # 3. Merge the file
        self.merge_request(taskid=taskid)
        # 4. Poll the task progress
        while True:
            # Fetch the task progress once every 20 seconds
            progress = self.get_progress_request(taskid)
            progress_dic = progress
            if progress_dic['err_no'] != 0 and progress_dic['err_no'] != 26605:
                print('task error: ' + progress_dic['failed'])
                return
            else:
                data = progress_dic['data']
                task_status = json.loads(data)
                # Status 9 means the task has finished
                if task_status['status'] == 9:
                    print('task ' + taskid + ' finished')
                    break
                print('The task ' + taskid + ' is in processing, task status: ' + str(data))

            # Wait 20 s between progress checks
            time.sleep(20)
        # 5. Get the result
        self.get_result_request(taskid=taskid)


# Note: if the requests module raises "NoneType" object has no attribute 'read',
# try upgrading requests to 2.20.0 or later (this demo was tested with 2.20.0)
# Fill in your iFLYTEK Open Platform appid and secret_key and the path of the file to transcribe
if __name__ == '__main__':
    APP_ID = "***"
    SECRET_KEY = "****"
    file_path = r"***.wav"
    api = RequestApi(appid=APP_ID, secret_key=SECRET_KEY, upload_file_path=file_path)
    api.all_api_request()
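The signature step inside `gene_params` may be easier to follow in isolation. This is a minimal sketch of the same computation: MD5 of appid plus timestamp, then HMAC-SHA1 keyed with the secret key, then Base64. `make_signa` is a hypothetical helper name, and the credentials and timestamp are placeholders.

```python
import base64
import hashlib
import hmac


def make_signa(appid, secret_key, ts):
    """signa = Base64(HMAC-SHA1(key=secret_key, msg=MD5hex(appid + ts)))."""
    md5_hex = hashlib.md5((appid + ts).encode('utf-8')).hexdigest()
    digest = hmac.new(secret_key.encode('utf-8'),
                      md5_hex.encode('utf-8'),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode('utf-8')


# Placeholder credentials and a fixed timestamp, for illustration only.
signa = make_signa('my_app_id', 'my_secret_key', '1566300000')
print(len(signa))  # 28: a 20-byte SHA-1 digest is 28 Base64 characters
```

Because the timestamp is part of the message, the demo recomputes the signature on every call rather than caching it.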
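Of course, you can adapt the demo to your own needs. For example, to transcribe recordings concurrently, you can add a function that runs in multiple threads. To make the taskid easy to get at, I added self.taskid = "None" to the class initializer and reassign it once the pre-processing result comes back.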
# This goes into the demo file above, which already imports json and os
import threading


def thread_func(wav_file_path, txt_file_path):
    # Thread function, so that recordings can be transcribed concurrently
    doc = open(txt_file_path, 'w', encoding='utf-8')
    # doc.close()
    api = RequestApi(appid=APP_ID, secret_key=SECRET_KEY, upload_file_path=wav_file_path)
    # In the demo this function runs the whole flow, but I moved the result-fetching step out of it
    api.all_api_request()
    print('taskid is: ' + api.taskid, file=doc)
    result = api.get_result_request(api.taskid)
    # 'data' is a JSON string, so parse it with json.loads
    result = json.loads(result['data'])
    # print(result)
    for x in result:
        print(x)
        print(x, file=doc)
    doc.close()


# Write the main block along these lines
if __name__ == '__main__':
    APP_ID = "***"
    SECRET_KEY = "***"
    file_read_path = r"D:\MyProject\Python\Voice_SDK\20190820\\"
    file_save_path = r"D:\MyProject\Python\Voice_SDK\20190820_xunfei\\"
    # Assumption: file_list is not defined in the post; here it holds the base names
    # (without extension) of the .wav files in file_read_path
    file_list = [os.path.splitext(name)[0] for name in os.listdir(file_read_path)
                 if name.lower().endswith('.wav')]
    for file in file_list:
        # Run the batch concurrently, one thread per recording
        wav_file_path = file_read_path + file + ".wav"
        txt_file_path = file_save_path + file + ".txt"
        t = threading.Thread(target=thread_func, args=(wav_file_path, txt_file_path))
        t.start()
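The loop above starts one thread per file with no upper limit. As an alternative sketch (not what the post's code does), `ThreadPoolExecutor` from the standard library caps the number of concurrent jobs. `fake_transcribe` and `transcribe_all` are hypothetical names; the stub stands in for the real upload, poll and fetch cycle so the example runs offline, and the `onebest` field name is an assumption about the result format.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def fake_transcribe(wav_file_path):
    """Stand-in for the real API calls; returns a JSON string shaped like
    the 'data' field (assumed to be a JSON array of segments)."""
    return json.dumps([{'speaker': '0', 'onebest': 'text from ' + wav_file_path}])


def transcribe_all(wav_paths, transcribe=fake_transcribe, max_workers=4):
    """Run transcribe() on every path using at most max_workers threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        raw_results = list(pool.map(transcribe, wav_paths))  # keeps input order
    # json.loads parses each payload without executing it
    return {path: json.loads(raw) for path, raw in zip(wav_paths, raw_results)}


results = transcribe_all(['a.wav', 'b.wav'])
print(results['a.wav'][0]['onebest'])
```

To use it for real, pass a function that wraps `RequestApi` and returns the `data` string in place of `fake_transcribe`. A small `max_workers` also keeps a batch from hitting the service with every file at once.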