Four Ways to Run OpenAI Whisper Inference Locally on CPU
ztj100 2024-12-17 17:48
In this article, I summarize my experiments with running inference for the Whisper automatic speech recognition model.
Whisper is an open-source, Transformer-based ASR model from OpenAI. In my case, the model was fine-tuned on a dataset of speech recordings from people with speech impairments.
I tried the following options for running inference on CPU:
- HuggingFace pipeline
- ONNX Runtime
- OpenVINO Runtime
- PyTorch inference
All of these approaches (except ONNX) are implemented in this Git repository.
TL;DR
Here are the final results; a rough timing-harness sketch follows the list:
- PyTorch (16 cores) ≈ 3.5 s
- OpenVINO, int4 ≈ 4.0 s
- OpenVINO, int8 ≈ 4.2 s
- PyTorch (8 cores) ≈ 4.2 s
- PyTorch (4 cores) ≈ 8.0 s
- HF pipeline ≈ 18.0 s
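These numbers are wall-clock latencies for transcribing the same short audio clip on the same machine. As a rough, hypothetical illustration of how such timings can be collected (this is not the benchmark code from the repository), assuming a synchronous transcribe(audio_path) callable for whichever backend is under test:

import time
import statistics

def benchmark(transcribe, audio_path, runs=5):
    """Return the median wall-clock latency of transcribe(audio_path) over several runs."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        transcribe(audio_path)  # backend under test: HF pipeline, OpenVINO, PyTorch, ...
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)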
Below, I walk through the implementation of each solution in detail.
1. Whisper inference with the HuggingFace pipeline
Since our model was fine-tuned with the transformers library and stored on the HuggingFace Hub, the first and most straightforward option is to use the built-in pipeline.
Here is the class that runs Whisper inference through the HF pipeline. There are plenty of tutorials on this, so I won't go into much detail here:
import os
import asyncio

import torch
from huggingface_hub import login
from peft import PeftConfig, PeftModel
from transformers import (
    AutomaticSpeechRecognitionPipeline,
    PreTrainedModel,
    WhisperForConditionalGeneration,
    WhisperProcessor,
    WhisperTokenizer,
    logging as transformers_log,
)

from src import log
from src.utils import utils


class WhisperService:
    _initialized = False

    def __init__(self, language='en'):
        if not WhisperService._initialized:
            os.environ["TRANSFORMERS_VERBOSITY"] = "error"
            transformers_log.set_verbosity_error()
            self.model_name = utils.MODEL_NAME
            self.language = language
            self.task = utils.TASK
            try:
                # Initialize model and related components
                log.info("Starting Whisper service...")
                self.peft_config = self.generate_model_config()
                self.model = self.get_whisper_model_from_hf(self.peft_config)
                self.tokenizer = self.create_tokenizer(self.peft_config)
                self.processor = self.create_processor(self.peft_config)
                self.pipeline_asr, self.forced_decoder_ids = self.create_whisper_pipeline(
                    self.model, self.tokenizer, self.processor
                )
                WhisperService._initialized = True
                log.info("Whisper service started with success!")
            except Exception as e:
                log.error(f"Error during Whisper service init: {str(e)}")
                raise

    def generate_model_config(self) -> PeftConfig:
        """Log in to HuggingFace and load the PEFT adapter configuration."""
        try:
            login(token=os.environ['API_TOKEN'])
            config = PeftConfig.from_pretrained(self.model_name)
            log.info("Model config generated")
            return config
        except Exception as e:
            log.error(f"Error during model config generation: {str(e)}")
            raise

    def get_whisper_model_from_hf(self, peft_config: PeftConfig) -> PeftModel:
        """Load the base Whisper model and attach the PEFT adapter."""
        try:
            model = WhisperForConditionalGeneration.from_pretrained(
                peft_config.base_model_name_or_path
            )
            # Check if GPU is available
            if torch.cuda.is_available():
                log.info("Model loaded on GPU")
            else:
                log.info("Model loaded on CPU")
            model = PeftModel.from_pretrained(model, self.model_name)
            log.info("Whisper model configured with PeftModel")
            return model
        except Exception as e:
            log.error(f"Error during Whisper model loading: {str(e)}")
            raise

    def create_processor(self, peft_config: PeftConfig) -> WhisperProcessor:
        """Create the WhisperProcessor for the base model."""
        try:
            processor = WhisperProcessor.from_pretrained(
                peft_config.base_model_name_or_path,
                language=self.language,
                task=self.task
            )
            log.info("WhisperProcessor created")
            return processor
        except Exception as e:
            log.error(f"Error during WhisperProcessor creation: {str(e)}")
            raise

    def create_tokenizer(self, peft_config: PeftConfig) -> WhisperTokenizer:
        """Create the WhisperTokenizer for the base model."""
        try:
            tokenizer = WhisperTokenizer.from_pretrained(
                peft_config.base_model_name_or_path,
                language=self.language,
                task=self.task
            )
            log.info("WhisperTokenizer created")
            return tokenizer
        except Exception as e:
            log.error(f"Error during WhisperTokenizer creation: {str(e)}")
            raise

    def create_whisper_pipeline(self, model: PreTrainedModel, tokenizer: WhisperTokenizer,
                                processor: WhisperProcessor) -> tuple:
        """Build the ASR pipeline and the forced decoder ids for the language/task."""
        try:
            feature_extractor = processor.feature_extractor
            pipe_lora = AutomaticSpeechRecognitionPipeline(
                model=model,
                tokenizer=tokenizer,
                feature_extractor=feature_extractor
            )
            forced_decoder_ids = processor.get_decoder_prompt_ids(language=self.language, task=self.task)
            log.info("Pipeline created")
            return pipe_lora, forced_decoder_ids
        except Exception as e:
            log.error(f"Error during Pipeline creation: {str(e)}")
            raise

    async def transcribe(self, audio_path: str) -> str:
        """Run the pipeline in a thread executor and return the transcription."""
        try:
            loop = asyncio.get_event_loop()
            log.info(f"Transcribing the following file audio: {audio_path}")
            with torch.cuda.amp.autocast():  # only affects CUDA ops; effectively a no-op on CPU
                text = await loop.run_in_executor(
                    None,
                    lambda:
                    self.pipeline_asr(audio_path, generate_kwargs={"forced_decoder_ids": self.forced_decoder_ids},
                                      max_new_tokens=255)["text"]
                )
            log.info("Transcription completed!")
            return text
        except Exception as e:
            log.error(f"Error during transcription: {str(e)}")
            raise
Here we fetch the model from the HuggingFace Hub (utils.MODEL_NAME is the HF model identifier, e.g. "miosipof/asr_EN_medium_v1").
Note that this model is an adapter, trained with the PEFT (Parameter-Efficient Fine-Tuning) framework. The generate_model_config() function extracts the configuration of the PEFT model.
The pipeline is set up with the following code:
pipe_lora = AutomaticSpeechRecognitionPipeline(
    model=model,
    tokenizer=tokenizer,
    feature_extractor=feature_extractor
)
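For reference, here is a minimal, hypothetical usage sketch of the class above (the audio path is a placeholder); transcribe() is a coroutine, so it has to be awaited:

import asyncio

# Hypothetical usage of the WhisperService defined above.
# Requires the API_TOKEN environment variable used by generate_model_config().
async def main():
    service = WhisperService(language="en")
    text = await service.transcribe("path/to/audio.wav")  # placeholder path
    print(text)

asyncio.run(main())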
2. ONNX Runtime
2.1 Converting the model format
The model first needs to be converted to ONNX format.
Let's import a few libraries:
from onnxruntime.quantization import quantize_dynamic, QuantType
import onnx
import numpy as np
import onnxruntime as ort
import torch
import torchaudio
Next, we use the transformers optimum library and its CLI to convert the model from HuggingFace into ONNX format:
pip install optimum[exporters]
optimum-cli export onnx --model local_path --task transcribe local_model_folder/
This produces a set of files in local_model_folder from the original model located at local_path.
Let's set up the ONNX session:
session_options = ort.SessionOptions()
session_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
session_options.execution_mode = ort.ExecutionMode.ORT_PARALLEL
session_options.intra_op_num_threads = 4
session_options.inter_op_num_threads = 16
We will use the encoder and the decoder separately:
sess_encoder = ort.InferenceSession("./path_to/encoder_q.onnx")
sess_decoder = ort.InferenceSession("./path_to/decoder_q.onnx")
To improve performance, we define a model quantization function and then apply it to both the encoder and the decoder:
def quantize_onnx_model(onnx_model_path, quantized_model_path):
    onnx_opt_model = onnx.load(onnx_model_path)
    quantize_dynamic(onnx_model_path,
                     quantized_model_path,
                     weight_type=QuantType.QUInt8)  # change QInt8 to QUInt8

quantize_onnx_model("./path_to/encoder.onnx", "./path_to/encoder_q.onnx")
quantize_onnx_model("./path_to/decoder.onnx", "./path_to/decoder_q.onnx")
2.2 Inference with the ONNX model
Let's initialize the processor and the tokenizer:
from transformers import WhisperProcessor
import whisper  # openai-whisper, used here only for its tokenizer

processor = WhisperProcessor.from_pretrained("./path_to/q_whisper_onnx")
# tokenizer = processor.tokenizer
tokenizer = whisper.decoding.get_tokenizer(
    model.is_multilingual,  # `model` is an openai-whisper Whisper model loaded separately
    task="transcribe",
    language="en",
)
An audio preprocessing function (similar to Whisper's log_mel_spectrogram()) converts a .wav file into a log-mel spectrogram array:
def preprocessing_torchaudio(audio_path):
    waveform, sample_rate = torchaudio.load(audio_path)
    waveform = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=16000)(waveform)
    mel = processor.feature_extractor(waveform[0], sampling_rate=16000).input_features
    return torch.tensor(mel, dtype=torch.float32)
For a sample .wav file, the log-mel array x_mel is obtained with:
x_mel = preprocessing_torchaudio("./path_to/audio.wav")
Finally, a custom loop that encodes the audio and greedily decodes the token sequence with our quantized ONNX models (the token sequence x_tokens is seeded with the start-of-transcript token):
max_tokens = 448
out_encoder, = sess_encoder.run(["last_hidden_state"], {"input_features": x_mel.numpy()})

next_token = tokenizer.sot
# next_token = "<|startoftranscript|>"
x_tokens = torch.tensor([[tokenizer.sot]], dtype=torch.int64)  # start the sequence from the SOT token

while x_tokens.shape[1] <= max_tokens and next_token != tokenizer.eot:
    out_decoder, = sess_decoder.run(
        ["logits"],
        {
            "input_ids": x_tokens.numpy(),
            "encoder_hidden_states": out_encoder,
        },
    )
    next_token = out_decoder[0, -1].argmax()
    next_token = torch.tensor(next_token)
    print(next_token, next_token.shape, x_tokens.shape)
    x_tokens = torch.concat(
        [x_tokens, next_token.reshape(1, 1)],
        axis=1,
    )

print(tokenizer.decode(x_tokens[0]))
I left this code in its rough form because ONNX inference was always noticeably slower than inference through OpenVINO or PyTorch, possibly because the ONNX format was originally developed for convolutional neural networks and may not be the best choice for optimizing Transformers.
3. OpenVINO Runtime
Implementing inference with OpenVINO is much simpler.
First, the necessary imports:
import os
from pathlib import Path
from transformers import WhisperProcessor, logging as transformers_log
from optimum.intel.openvino import OVModelForSpeechSeq2Seq
import torchaudio
import torch
import numpy as np
import time
from src import log
from src.utils import utils
import asyncio
3.1 Converting the model to OpenVINO format
We use the transformers optimum library to export our HuggingFace model to the OpenVINO format (you can replace openai/whisper-medium with your own model or any other Whisper model hosted on the HuggingFace hub):
pip install optimum[openvino,nncf]
optimum-cli export openvino --model openai/whisper-medium --weight-format int8 asr_openvino_int8
Note that we applied int8 quantization at export time. I also tried int4 quantization, but in my case it degraded transcription quality considerably.
Here is the method we will use to load the OpenVINO model:
def get_openvino_model(self):
    ov_config = {"CACHE_DIR": ""}
    self.model = OVModelForSpeechSeq2Seq.from_pretrained(self.ov_model_name, ov_config=ov_config, compile=False)
    log.info("OpenVino model loaded from " + str(self.ov_model_name))
    try:
        ov_model_path = Path("src/model/" + self.model_name.replace("/", "_"))
        ov_config = {"CACHE_DIR": ""}
        if not ov_model_path.exists():
            self.model = OVModelForSpeechSeq2Seq.from_pretrained(
                self.model_name,
                ov_config=ov_config,
                export=True,
                compile=False,
                load_in_8bit=False,
            )
            self.model.half()
            self.model.save_pretrained(ov_model_path)
            log.info("HF model converted to OpenVino and saved in " + str(ov_model_path))
        else:
            self.model = OVModelForSpeechSeq2Seq.from_pretrained(ov_model_path, ov_config=ov_config, compile=False)
            log.info("OpenVino model loaded from " + str(ov_model_path))
    except Exception as e:
        log.error(f"Error during OpenVino model loading: {str(e)}")
        raise
    return self.model
Here self.ov_model_name is the asr_openvino_int8 directory (plus its path) that we produced with the optimum-cli command above. I used an ugly self.model_name.replace("/", "_") call to turn the HuggingFace model URL into a local model name.
Next, the OpenVINO model must be compiled, since it will be loaded directly through the OpenVINO runtime:
def compile_openvino_model(self):
    """Move the OpenVINO model to the target device and compile it."""
    try:
        if torch.cuda.is_available():
            log.info("Model loaded on GPU")
            self.device = "GPU"
        else:
            log.info("Model loaded on CPU")
            self.device = "CPU"
        self.model.to(self.device)
        self.model.compile()
        log.info("OpenVino model compiled successfully")
    except Exception as e:
        log.error(f"Error during OpenVino model compilation: {str(e)}")
        raise
    return self.model
3.2 Inference with the OpenVINO model
Now we define two helper functions: one creates the Whisper processor used for encoding (its runtime is negligible compared to the forward pass), the other preprocesses the audio:
def create_processor(self):
    """Create the WhisperProcessor for feature extraction and decoding."""
    try:
        processor = WhisperProcessor.from_pretrained(
            self.model_name,
            language=self.language,
            task=self.task
        )
        log.info("WhisperProcessor created")
        return processor
    except Exception as e:
        log.error(f"Error during WhisperProcessor creation: {str(e)}")
        raise

def preprocess_audio(self, waveform):
    """Compute log-Mel input features from the input audio array."""
    audio_features = self.processor.feature_extractor(waveform, sampling_rate=self.sr).input_features[0]
    audio_features = torch.tensor(np.array([audio_features]))
    return audio_features
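Whisper's feature extractor pads or trims every input to 30 seconds, so for whisper-medium (80 mel bins) the tensor returned by preprocess_audio() should have shape (1, 80, 3000). A quick, hypothetical sanity check, assuming the class instance is called service and using a placeholder path:

import torchaudio

# Hypothetical shape check for the preprocessed features (placeholder path).
waveform, sample_rate = torchaudio.load("path/to/audio.wav")
waveform = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=service.sr)(waveform)[0]
features = service.preprocess_audio(waveform)
print(features.shape)  # expected: torch.Size([1, 80, 3000]) for whisper-medium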
Finally, the pipeline itself and an async transcription function, similar to the HuggingFace pipeline implementation:
def openvino_pipeline(self, audio_path):
    print("1 - starting audio load:", time.time())
    waveform, sample_rate = torchaudio.load(audio_path)
    waveform = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=self.sr)(waveform)[0]
    print("2 - starting preprocessing:", time.time())
    audio_features = self.preprocess_audio(waveform)
    print("3 - starting forward pass:", time.time())
    predicted_ids = self.model.generate(audio_features, max_new_tokens=224)
    print("4 - starting decoding:", time.time())
    transcription = self.processor.batch_decode(predicted_ids, skip_special_tokens=True)
    return transcription[0]

async def transcribe(self, audio_path: str) -> str:
    """Run the OpenVINO pipeline in a thread executor and return the transcription."""
    try:
        loop = asyncio.get_event_loop()
        log.info(f"Transcribing the following file audio: {audio_path}")
        print("0 - starting the loop:", time.time())
        text = await loop.run_in_executor(
            None,
            lambda: self.openvino_pipeline(audio_path)
        )
        print("5 - all done:", time.time())
        log.info("Transcription completed!")
        return text
    except Exception as e:
        log.error(f"Error during transcription: {str(e)}")
        raise
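Putting the pieces together, a minimal usage sketch could look like this. The OpenVinoService class name is hypothetical (it is just a wrapper around the methods above), and the audio path is a placeholder; the point is the load, compile, transcribe order:

import asyncio

# Hypothetical wiring of the methods above into a service object.
async def main():
    service = OpenVinoService()                    # placeholder class holding the methods above
    service.model = service.get_openvino_model()
    service.compile_openvino_model()               # compile before the first generate() call
    service.processor = service.create_processor()
    print(await service.transcribe("path/to/audio.wav"))  # placeholder path

asyncio.run(main())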
4. PyTorch inference
Running inference through the original PyTorch implementation of Whisper involves several steps:
- In my case, the fine-tuned model used for inference lives on the HuggingFace Hub, so I first have to fetch it from there;
- We also need the original base Whisper model from OpenAI's GitHub (its size should match our fine-tuned model, Whisper-Medium in my case);
- The fine-tuned model from HF has to be mapped to the OpenAI format (see the details here);
- Our pretrained weights are then applied to the base model;
- Then we can put the model into evaluation mode and run inference.
Let's start by fetching the model from the HuggingFace Hub:
def get_hf_model(self):
    """Download the merged HF model and save its state_dict in PyTorch format."""
    try:
        merged_model = WhisperForConditionalGeneration.from_pretrained(self.model_name)
        pt_model_name = os.path.basename(self.model_name) + ".pth"
        pt_dir_name = os.path.join("assets", "pt_models")
        self.pretrained_model_path = os.path.join(pt_dir_name, pt_model_name)
        if not os.path.exists(pt_dir_name):
            os.makedirs(pt_dir_name)
            log.info(f"Directory {pt_dir_name} created and will be used to store PyTorch models")
        else:
            log.info(f"Directory {pt_dir_name} exists, using it to save PyTorch model")
        torch.save(merged_model.state_dict(), self.pretrained_model_path)
        log.info(f"HF model saved to {self.pretrained_model_path} in PyTorch format for conversion")
    except Exception as e:
        log.error(f"Error during HuggingFace model loading: {str(e)}")
        raise
Here self.model_name is my model id on HuggingFace (note that it should not be the adapter, but the fully merged model).
4.1 Converting the model from HuggingFace to PyTorch
The layer names used in the transformers implementation of Whisper differ from those used in OpenAI's original repo. I wrote a short note about this here.
The mapping function (from HF to OpenAI) looks like this:
def map_hf_to_pt(self, pretrained_weights):
    def rename_key(key):
        new_key = key
        for k, v in self.mapping:
            new_key = new_key.replace(k, v)
        return new_key

    # Rename the keys in the state_dict
    updated_weights = {rename_key(k): v for k, v in pretrained_weights.items()}
    updated_weights.pop('proj_out.weight', None)
    return updated_weights
Now we apply this mapping to the base Whisper model, using the pretrained weights of the model we downloaded from the HuggingFace hub. The self.mapping list of replacements is:
self.mapping = [
    ('model.', ''),
    ('decoder.layers', 'decoder.blocks'),
    ('encoder.layers', 'encoder.blocks'),
    ('encoder.embed_positions.weight', 'encoder.positional_embedding'),
    ('self_attn.k_proj', 'attn.key'),
    ('self_attn.q_proj', 'attn.query'),
    ('self_attn.v_proj', 'attn.value'),
    ('self_attn.out_proj', 'attn.out'),
    ('self_attn_layer_norm', 'attn_ln'),
    ('final_layer_norm', 'mlp_ln'),
    ('fc1', 'mlp.0'),
    ('fc2', 'mlp.2'),
    ('encoder_attn.k_proj', 'cross_attn.key'),
    ('encoder_attn.v_proj', 'cross_attn.value'),
    ('encoder_attn.q_proj', 'cross_attn.query'),
    ('encoder_attn.out_proj', 'cross_attn.out'),
    ('encoder_attn_layer_norm', 'cross_attn_ln'),
    ('decoder.embed_positions.weight', 'decoder.positional_embedding'),
    ('decoder.embed_tokens', 'decoder.token_embedding'),
    ('encoder.layer_norm', 'encoder.ln_post'),
    ('decoder.layer_norm', 'decoder.ln'),
]
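As a quick illustration of how the substitutions compose, here is a representative HF state_dict key traced through a trimmed-down version of rename_key() (only the three relevant mapping entries are kept):

mapping = [
    ('model.', ''),
    ('decoder.layers', 'decoder.blocks'),
    ('self_attn.k_proj', 'attn.key'),
]

def rename_key(key):
    for k, v in mapping:
        key = key.replace(k, v)
    return key

print(rename_key('model.decoder.layers.0.self_attn.k_proj.weight'))
# -> decoder.blocks.0.attn.key.weight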
4.2 Inference with PyTorch
We are almost all set. Let's define the Whisper processor and the encoding function:
def create_processor(self):
    """Create the WhisperProcessor for feature extraction and decoding."""
    try:
        processor = WhisperProcessor.from_pretrained(
            self.model_name,
            language=self.language,
            task=self.task
        )
        log.info("WhisperProcessor created")
        return processor
    except Exception as e:
        log.error(f"Error during WhisperProcessor creation: {str(e)}")
        raise

def preprocess_audio(self, waveform):
    """Compute log-Mel input features from the input audio array."""
    mel = self.processor.feature_extractor(waveform, sampling_rate=self.sr).input_features
    return torch.tensor(mel, dtype=torch.float32)
Finally, the pipeline and the transcription function:
def inference_pipeline(self, audio_path):
    log.info("1 - Starting audio load:")
    # waveform, sample_rate = librosa.load(audio_path, sr=self.sr)
    waveform, sample_rate = torchaudio.load(audio_path)
    waveform = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=self.sr)(waveform)[0]
    log.info("2 - starting preprocessing:")
    audio_features = self.preprocess_audio(waveform)
    log.info("3 - Starting forward pass:")
    with torch.no_grad():
        result = whisper.decode(
            self.model,
            audio_features,
            options=whisper.DecodingOptions(
                fp16=False,
                language="it",
                without_timestamps=True,
                suppress_blank=False,
                suppress_tokens=[],
            ),
        )
    return result[0].text

async def transcribe(self, audio_path: str) -> DecodingResult | list[DecodingResult]:
    """Run the PyTorch inference pipeline in a thread executor."""
    try:
        loop = asyncio.get_event_loop()
        log.info(f"Transcribing the following file audio: {audio_path}")
        log.info("Transcription started...")
        text = await loop.run_in_executor(
            None,
            lambda: self.inference_pipeline(audio_path)
        )
        log.info("Transcription completed!")
        return text
    except Exception as e:
        log.error(f"Error during transcription: {str(e)}")
        raise
Here is the full code of the PyTorch inference class. Note the torch.set_num_threads(num_threads) call during initialization: this line sets the number of CPU cores used for inference, which has a major impact on performance:
import os
from src import log
from src.utils import utils
import asyncio
import whisper
from whisper import DecodingResult
from transformers import WhisperForConditionalGeneration, WhisperProcessor, logging as transformers_log
from huggingface_hub import hf_hub_download, login
import torch
import torchaudio
import torch.quantization


class InferenceService:
    _initialized = False

    def __init__(self, language='it', num_threads=1, quantization=True, device="cpu"):
        try:
            login(token=os.environ['API_TOKEN'])
            log.info("HuggingFace login successful")
        except Exception as e:
            log.error(f"Error during HuggingFace login: {str(e)}")
            raise
        if not InferenceService._initialized:
            os.environ["TRANSFORMERS_VERBOSITY"] = "error"
            transformers_log.set_verbosity_error()
            self.model_name = utils.MERGED_MODEL_NAME
            self.language = language
            self.pytorch_converted_model_source = utils.PRETRAINED_MODEL_PTH
            self.pytorch_converted_model_filename = utils.PRETRAINED_MODEL_FILENAME
            self.task = utils.TASK
            self.device = device
            self.sr = utils.SAMPLING_RATE
            self.mapping = utils.HF_PT_MAPPING
            try:
                # Initialize model and related components
                log.info("Starting PyTorch Inference service...")
                try:
                    self.pretrained_model_path = hf_hub_download(repo_id=self.pytorch_converted_model_source,
                                                                 filename=self.pytorch_converted_model_filename)
                    log.info(f"Whisper pretrained model downloaded to {self.pretrained_model_path}")
                except Exception as e:
                    log.info(f"Unable to download the PyTorch model: {str(e)} - switching to model from HF for conversion")
                    self.get_hf_model()
                self.model = self.set_pt_model()
                if quantization:
                    self.model = torch.quantization.quantize_dynamic(self.model,
                                                                     {torch.nn.Linear},
                                                                     dtype=torch.qint8)
                    self.model = self.model.cpu()
                self.processor = self.create_processor()
                InferenceService._initialized = True
                log.info("PyTorch Inference service started with success!")
            except Exception as e:
                log.error(f"Error during PyTorch Inference service init: {str(e)}")
                raise
        torch.set_num_threads(num_threads)
        log.info(f"Number of threads set to {num_threads} for PyTorch calculations")

    def get_hf_model(self):
        """Download the merged HF model and save its state_dict in PyTorch format."""
        try:
            merged_model = WhisperForConditionalGeneration.from_pretrained(self.model_name)
            pt_model_name = os.path.basename(self.model_name) + ".pth"
            pt_dir_name = os.path.join("assets", "pt_models")
            self.pretrained_model_path = os.path.join(pt_dir_name, pt_model_name)
            if not os.path.exists(pt_dir_name):
                os.makedirs(pt_dir_name)
                log.info(f"Directory {pt_dir_name} created and will be used to store PyTorch models")
            else:
                log.info(f"Directory {pt_dir_name} exists, using it to save PyTorch model")
            torch.save(merged_model.state_dict(), self.pretrained_model_path)
            log.info(f"HF model saved to {self.pretrained_model_path} in PyTorch format for conversion")
        except Exception as e:
            log.error(f"Error during HuggingFace model loading: {str(e)}")
            raise
        return 1

    def map_hf_to_pt(self, pretrained_weights):
        def rename_key(key):
            new_key = key
            for k, v in self.mapping:
                new_key = new_key.replace(k, v)
            return new_key

        # Rename the keys in the state_dict
        updated_weights = {rename_key(k): v for k, v in pretrained_weights.items()}
        updated_weights.pop('proj_out.weight', None)
        return updated_weights

    def set_pt_model(self):
        model = whisper.load_model("medium")
        log.info("Whisper base model loaded")
        pretrained_model = torch.load(self.pretrained_model_path)
        log.info(f"Whisper pretrained model loaded from {self.pretrained_model_path}")
        # Extract state_dict if the loaded model is not already a state_dict
        if hasattr(pretrained_model, "state_dict"):
            pretrained_weights = pretrained_model.state_dict()  # extract the state dict
        else:
            pretrained_weights = pretrained_model  # it's already a state_dict
        #######################################################################
        updated_weights = self.map_hf_to_pt(pretrained_weights)
        model.load_state_dict(updated_weights, strict=True)
        log.info(f"Model weights mapped from HuggingFace model to PyTorch")
        # Activate to save converted model and/or its weights
        # torch.save(model, 'src/model/whisper_pretrained_converted.pth')
        # torch.save(updated_weights, 'src/model/whisper_pretrained_converted_weights.pth')
        ######################################################################
        model.to(self.device)
        model.requires_grad_(False)
        model.eval()
        log.info("Whisper PyTorch model loaded on " + str(self.device))
        return model

    def create_processor(self):
        """Create the WhisperProcessor for feature extraction and decoding."""
        try:
            processor = WhisperProcessor.from_pretrained(
                self.model_name,
                language=self.language,
                task=self.task
            )
            log.info("WhisperProcessor created")
            return processor
        except Exception as e:
            log.error(f"Error during WhisperProcessor creation: {str(e)}")
            raise

    def preprocess_audio(self, waveform):
        """Compute log-Mel input features from the input audio array."""
        mel = self.processor.feature_extractor(waveform, sampling_rate=self.sr).input_features
        return torch.tensor(mel, dtype=torch.float32)

    def inference_pipeline(self, audio_path):
        log.info("1 - Starting audio load:")
        # waveform, sample_rate = librosa.load(audio_path, sr=self.sr)
        waveform, sample_rate = torchaudio.load(audio_path)
        waveform = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=self.sr)(waveform)[0]
        log.info("2 - starting preprocessing:")
        audio_features = self.preprocess_audio(waveform)
        log.info("3 - Starting forward pass:")
        with torch.no_grad():
            result = whisper.decode(
                self.model,
                audio_features,
                options=whisper.DecodingOptions(
                    fp16=False,
                    language="it",
                    without_timestamps=True,
                    suppress_blank=False,
                    suppress_tokens=[],
                ),
            )
        return result[0].text

    async def transcribe(self, audio_path: str) -> DecodingResult | list[DecodingResult]:
        """Run the inference pipeline in a thread executor and return the transcription."""
        try:
            loop = asyncio.get_event_loop()
            log.info(f"Transcribing the following file audio: {audio_path}")
            log.info("Transcription started...")
            text = await loop.run_in_executor(
                None,
                lambda: self.inference_pipeline(audio_path)
            )
            log.info("Transcription completed!")
            return text
        except Exception as e:
            log.error(f"Error during transcription: {str(e)}")
            raise
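To reproduce the thread-count comparison from the TL;DR, the service can be instantiated with different num_threads values and timed on the same file. A minimal, hypothetical sketch (the audio path is a placeholder, and resetting _initialized is only a convenience for this test):

import asyncio
import time

# Hypothetical comparison of CPU thread counts (placeholder audio path).
async def main():
    for threads in (4, 8, 16):
        InferenceService._initialized = False      # allow re-initialization for the test
        service = InferenceService(num_threads=threads)
        start = time.perf_counter()
        await service.transcribe("path/to/audio.wav")
        print(f"{threads} threads: {time.perf_counter() - start:.1f} s")

asyncio.run(main())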