20240202: Using whisper.cpp on Windows 10

2024/2/2 14:15


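[Conclusion: on Windows 10, transcribing a 7-minute Chinese video with the medium model (models/ggml-medium.bin, as logged below) took 83.7284 seconds, roughly 1.5 minutes. Too slow!]
83.7284 / 420 = 0.19935 (processing time in seconds divided by audio length in seconds)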
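
The division above is what is usually called the real-time factor: processing time divided by audio duration. A minimal sketch of the same arithmetic follows. The function name is illustrative, and 420 s is the rounded 7-minute duration used in these notes rather than the exact length of the file.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Figures from these notes: 83.7284 s of processing for about 420 s of audio.
rtf = real_time_factor(83.7284, 420.0)
speedup = 1.0 / rtf  # how many times faster than real time

print(f"real-time factor: {rtf:.4f}")
print(f"speed: {speedup:.2f}x real time")
```

A factor of about 0.2 means the run finished roughly five times faster than the audio plays.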

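Prerequisite: you can reach the internet outside China by technical means!
1. First, you need an NVIDIA graphics card. Mine is a second-hand GTX 1080 bought on PDD (Pinduoduo) for ¥800. [It is very likely an ex-mining card!]
2. Correctly install the latest NVIDIA driver (version 545), plus CUDA and cuDNN.
3. Install Torch.
4. Configure whisper.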


The generated subtitle file chs.srt is in Traditional Chinese; a way to switch it to Simplified Chinese still has to be found!
1
00:00:00,000 --> 00:00:01,400
前段時間有個巨石恆虎

2
00:00:01,400 --> 00:00:03,000
某某是男人最好的醫妹

3
00:00:03,000 --> 00:00:04,800
這裡的某某可以替換為減肥

4
00:00:04,800 --> 00:00:07,800
長髮 西裝 考研 速唱 永潔無間等等等等
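
One possible post-processing step for the Traditional-to-Simplified problem is a character-level conversion of the .srt text. The sketch below is only an illustration: the mapping table is a toy that covers just the characters in the four sample subtitles above, and the names `T2S_SAMPLE` and `to_simplified` are made up for this example. A real conversion would need a full dictionary, such as the one maintained by the OpenCC project.

```python
# Toy mapping: only the Traditional characters that occur in the sample
# subtitles above. This is NOT a complete conversion table.
T2S_SAMPLE = {
    "時": "时", "間": "间", "個": "个", "恆": "恒", "醫": "医",
    "這": "这", "裡": "里", "換": "换", "為": "为", "減": "减",
    "長": "长", "髮": "发", "裝": "装", "潔": "洁", "無": "无",
}
_TABLE = str.maketrans(T2S_SAMPLE)


def to_simplified(text: str) -> str:
    """Replace mapped Traditional characters; everything else is unchanged."""
    return text.translate(_TABLE)


sample_srt = """3
00:00:03,000 --> 00:00:04,800
這裡的某某可以替換為減肥
"""

# Index and timestamp lines contain no mapped characters, so the whole
# file can be passed through the same function.
print(to_simplified(sample_srt))
```

Because one-to-one character mapping ignores word context, some characters can convert incorrectly in a larger corpus; that is one reason to prefer a dictionary-based converter for real use.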


https://github.com/Const-me/Whisper/releases
https://www.cnblogs.com/jike9527/p/17545484.html
Batch-generating subtitles with whisper (whisper.cpp)
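
For batch use, the single command recorded later in these notes (`main.exe -l zh -osrt -m models/ggml-medium.bin chs.wav`) can be repeated for every .wav file in a folder. The sketch below only builds the argument lists, and it assumes `main.exe` and the model path are reachable from the working directory. The function name and defaults are illustrative, not part of whisper.cpp.

```python
from pathlib import Path
from typing import List


def build_commands(audio_dir: str,
                   exe: str = "main.exe",
                   model: str = "models/ggml-medium.bin",
                   language: str = "zh") -> List[List[str]]:
    """Build one argument list per .wav file, reusing the flags from these notes."""
    commands = []
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        commands.append([exe, "-l", language, "-osrt", "-m", model, str(wav)])
    return commands


# Print the commands for the current directory. To actually run them,
# pass each list to subprocess.run(cmd, check=True).
for cmd in build_commands("."):
    print(" ".join(cmd))
```

Keeping command construction separate from execution makes it easy to review the list before starting a long run.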

c:\>
c:\>git clone https://github.com/ggerganov/whisper.cpp
Cloning into 'whisper.cpp'...
remote: Enumerating objects: 6773, done.
remote: Counting objects: 100% (1995/1995), done.
remote: Compressing objects: 100% (275/275), done.
remote: Total 6773 (delta 1826), reused 1810 (delta 1714), pack-reused 4778
Receiving objects: 100% (6773/6773), 10.18 MiB | 6.55 MiB/s, done.
Resolving deltas: 100% (4368/4368), done.


c:\>cd whisper.cpp

c:\whisper.cpp>dir
 Volume in drive C is WIN10
 Volume Serial Number is 9273-D6A8

 Directory of c:\whisper.cpp

2024/02/02  14:20    <DIR>          .
2024/02/02  14:20    <DIR>          ..
2024/02/02  14:20    <DIR>          .devops
2024/02/02  14:20    <DIR>          .github
2024/02/02  14:20               863 .gitignore
2024/02/02  14:20                99 .gitmodules
2024/02/02  14:20    <DIR>          bindings
2024/02/02  14:20    <DIR>          cmake
2024/02/02  14:20            19,729 CMakeLists.txt
2024/02/02  14:20    <DIR>          coreml
2024/02/02  14:20    <DIR>          examples
2024/02/02  14:20    <DIR>          extra
2024/02/02  14:20            32,539 ggml-alloc.c
2024/02/02  14:20             4,149 ggml-alloc.h
2024/02/02  14:20             5,996 ggml-backend-impl.h
2024/02/02  14:20            69,048 ggml-backend.c
2024/02/02  14:20            11,932 ggml-backend.h
2024/02/02  14:20           451,408 ggml-cuda.cu
2024/02/02  14:20             2,156 ggml-cuda.h
2024/02/02  14:20             7,813 ggml-impl.h
2024/02/02  14:20             2,425 ggml-metal.h
2024/02/02  14:20           152,813 ggml-metal.m
2024/02/02  14:20           231,753 ggml-metal.metal
2024/02/02  14:20            87,989 ggml-opencl.cpp
2024/02/02  14:20             1,422 ggml-opencl.h
2024/02/02  14:20           411,673 ggml-quants.c
2024/02/02  14:20            13,983 ggml-quants.h
2024/02/02  14:20           696,627 ggml.c
2024/02/02  14:20            87,399 ggml.h
2024/02/02  14:20    <DIR>          grammars
2024/02/02  14:20             1,093 LICENSE
2024/02/02  14:20            15,341 Makefile
2024/02/02  14:20    <DIR>          models
2024/02/02  14:20    <DIR>          openvino
2024/02/02  14:20             1,835 Package.swift
2024/02/02  14:20            39,942 README.md
2024/02/02  14:20    <DIR>          samples
2024/02/02  14:20    <DIR>          spm-headers
2024/02/02  14:20    <DIR>          tests
2024/02/02  14:20           239,648 whisper.cpp
2024/02/02  14:20            30,873 whisper.h
              26 File(s)      2,620,548 bytes
              15 Dir(s)  128,119,971,840 bytes free

c:\whisper.cpp>
c:\whisper.cpp>
c:\whisper.cpp>
c:\whisper.cpp>cd models

c:\whisper.cpp\models>dir
 Volume in drive C is WIN10
 Volume Serial Number is 9273-D6A8

 Directory of c:\whisper.cpp\models

2024/02/02  14:20    <DIR>          .
2024/02/02  14:20    <DIR>          ..
2024/02/02  14:20                 7 .gitignore
2024/02/02  14:20             4,980 convert-h5-to-coreml.py
2024/02/02  14:20             7,584 convert-h5-to-ggml.py
2024/02/02  14:20            10,955 convert-pt-to-ggml.py
2024/02/02  14:20            12,761 convert-whisper-to-coreml.py
2024/02/02  14:20             1,799 convert-whisper-to-openvino.py
2024/02/02  14:20             2,272 download-coreml-model.sh
2024/02/02  14:20             1,440 download-ggml-model.cmd
2024/02/02  14:20             3,039 download-ggml-model.sh
2024/02/02  14:20           575,451 for-tests-ggml-base.bin
2024/02/02  14:20           586,836 for-tests-ggml-base.en.bin
2024/02/02  14:20           575,451 for-tests-ggml-large.bin
2024/02/02  14:20           575,451 for-tests-ggml-medium.bin
2024/02/02  14:20           586,836 for-tests-ggml-medium.en.bin
2024/02/02  14:20           575,451 for-tests-ggml-small.bin
2024/02/02  14:20           586,836 for-tests-ggml-small.en.bin
2024/02/02  14:20           575,451 for-tests-ggml-tiny.bin
2024/02/02  14:20           586,836 for-tests-ggml-tiny.en.bin
2024/02/02  14:20             1,506 generate-coreml-interface.sh
2024/02/02  14:20             1,355 generate-coreml-model.sh
2024/02/02  14:20             3,711 ggml_to_pt.py
2024/02/02  14:20                42 openvino-conversion-requirements.txt
2024/02/02  14:20             5,615 README.md
              23 File(s)      5,281,665 bytes
               2 Dir(s)  105,396,047,872 bytes free

c:\whisper.cpp\models>main.exe -f samples\jfk.wav
'main.exe' is not recognized as an internal or external command,
operable program or batch file.

c:\whisper.cpp\models>dir
 Volume in drive C is WIN10
 Volume Serial Number is 9273-D6A8

 Directory of c:\whisper.cpp\models

2024/02/02  14:23    <DIR>          .
2024/02/02  14:23    <DIR>          ..
2024/02/02  14:20                 7 .gitignore
2024/02/02  14:20             4,980 convert-h5-to-coreml.py
2024/02/02  14:20             7,584 convert-h5-to-ggml.py
2024/02/02  14:20            10,955 convert-pt-to-ggml.py
2024/02/02  14:20            12,761 convert-whisper-to-coreml.py
2024/02/02  14:20             1,799 convert-whisper-to-openvino.py
2024/02/02  14:20             2,272 download-coreml-model.sh
2024/02/02  14:20             1,440 download-ggml-model.cmd
2024/02/02  14:20             3,039 download-ggml-model.sh
2024/02/02  14:20           575,451 for-tests-ggml-base.bin
2024/02/02  14:20           586,836 for-tests-ggml-base.en.bin
2024/02/02  14:20           575,451 for-tests-ggml-large.bin
2024/02/02  14:20           575,451 for-tests-ggml-medium.bin
2024/02/02  14:20           586,836 for-tests-ggml-medium.en.bin
2024/02/02  14:20           575,451 for-tests-ggml-small.bin
2024/02/02  14:20           586,836 for-tests-ggml-small.en.bin
2024/02/02  14:20           575,451 for-tests-ggml-tiny.bin
2024/02/02  14:20           586,836 for-tests-ggml-tiny.en.bin
2024/02/02  14:20             1,506 generate-coreml-interface.sh
2024/02/02  14:20             1,355 generate-coreml-model.sh
2024/02/02  13:23        37,922,638 ggml-base-encoder.mlmodelc.zip
2024/02/02  13:23        59,707,625 ggml-base-q5_1.bin
2024/02/02  13:24       147,951,465 ggml-base.bin
2024/02/02  13:24        37,950,917 ggml-base.en-encoder.mlmodelc.zip
2024/02/02  13:24        59,721,011 ggml-base.en-q5_1.bin
2024/02/02  13:24       147,964,211 ggml-base.en.bin
2024/02/02  13:30     1,177,529,527 ggml-large-v1-encoder.mlmodelc.zip
2024/02/02  13:35     3,094,623,691 ggml-large-v1.bin
2024/02/02  13:31     1,174,643,458 ggml-large-v2-encoder.mlmodelc.zip
2024/02/02  13:30     1,080,732,091 ggml-large-v2-q5_0.bin
2024/02/02  13:35     3,094,623,691 ggml-large-v2.bin
2024/02/02  13:31     1,175,711,232 ggml-large-v3-encoder.mlmodelc.zip
2024/02/02  13:32     1,081,140,203 ggml-large-v3-q5_0.bin
2024/02/02  13:35     3,095,033,483 ggml-large-v3.bin
2024/02/02  13:57       567,829,413 ggml-medium-encoder.mlmodelc.zip
2024/02/02  13:57       539,212,467 ggml-medium-q5_0.bin
2024/02/02  14:03     1,533,763,059 ggml-medium.bin
2024/02/02  13:59       566,993,085 ggml-medium.en-encoder.mlmodelc.zip
2024/02/02  13:59       539,225,533 ggml-medium.en-q5_0.bin
2024/02/02  14:04     1,533,774,781 ggml-medium.en.bin
2024/02/02  14:08       163,083,239 ggml-small-encoder.mlmodelc.zip
2024/02/02  14:07       190,085,487 ggml-small-q5_1.bin
2024/02/02  14:09       487,601,967 ggml-small.bin
2024/02/02  14:09       162,952,446 ggml-small.en-encoder.mlmodelc.zip
2024/02/02  14:09       190,098,681 ggml-small.en-q5_1.bin
2024/02/02  14:11       487,614,201 ggml-small.en.bin
2024/02/02  14:10        15,037,446 ggml-tiny-encoder.mlmodelc.zip
2024/02/02  14:10        32,152,673 ggml-tiny-q5_1.bin
2024/02/02  14:11        77,691,713 ggml-tiny.bin
2024/02/02  14:11        15,034,655 ggml-tiny.en-encoder.mlmodelc.zip
2024/02/02  14:11        32,166,155 ggml-tiny.en-q5_1.bin
2024/02/02  14:12        43,550,795 ggml-tiny.en-q8_0.bin
2024/02/02  14:12        77,704,715 ggml-tiny.en.bin
2024/02/02  14:20             3,711 ggml_to_pt.py
2024/02/02  13:23             1,477 gitattributes
2024/02/02  14:20                42 openvino-conversion-requirements.txt
2024/02/02  13:23             1,311 README.md
              57 File(s) 22,726,106,592 bytes
               2 Dir(s)  105,396,191,232 bytes free

c:\whisper.cpp\models>cd ..

c:\whisper.cpp>dir

c:\whisper.cpp>
c:\whisper.cpp>
c:\whisper.cpp>main.exe -f samples\jfk.wav
Using GPU "NVIDIA GeForce GTX 1080", feature level 12.1, effective flags Wave32 | NoReshapedMatMul
Loaded MEL filters, 62.8 kb RAM
Loaded vocabulary, 51864 strings, 3050.6 kb RAM
Loaded 245 GPU tensors, 140.539 MB VRAM
Computed CPU base frequency: 2.29469 GHz
Loaded model from "models/ggml-base.en.bin" to VRAM
Created source reader from the file "samples\jfk.wav"

[00:00:00.000 --> 00:00:11.000]   And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.
    CPU Tasks
LoadModel       577.635 milliseconds
RunComplete     422.9 milliseconds
Run     319.505 milliseconds
Callbacks       5.4751 milliseconds, 2 calls, 2.73755 milliseconds average
Spectrogram     52.7935 milliseconds, 3 calls, 17.5978 milliseconds average
Sample  7.6473 milliseconds, 27 calls, 283.233 microseconds average
Encode  188.011 milliseconds
Decode  125.975 milliseconds
DecodeStep      118.306 milliseconds, 27 calls, 4.38169 milliseconds average
    GPU Tasks
LoadModel       249.459 milliseconds
Run     231.117 milliseconds
Encode  99.0044 milliseconds
EncodeLayer     77.7554 milliseconds, 6 calls, 12.9592 milliseconds average
Decode  132.112 milliseconds
DecodeStep      132.103 milliseconds, 27 calls, 4.89271 milliseconds average
DecodeLayer     87.4824 milliseconds, 162 calls, 540.015 microseconds average
    Compute Shaders
mulMatTiled     63.4898 milliseconds, 60 calls, 1.05816 milliseconds average
mulMatByRowTiled        50.9198 milliseconds, 1959 calls, 25.9928 microseconds average
softMaxLong     27.5314 milliseconds, 27 calls, 1.01968 milliseconds average
norm    12.3785 milliseconds, 526 calls, 23.5333 microseconds average
addRepeatGelu   11.9749 milliseconds, 170 calls, 70.4406 microseconds average
fmaRepeat1      7.652 milliseconds, 526 calls, 14.5475 microseconds average
addRepeatEx     7.4319 milliseconds, 498 calls, 14.9235 microseconds average
softMaxFixed    6.913 milliseconds, 168 calls, 41.1488 microseconds average
copyConvert     5.397 milliseconds, 348 calls, 15.5086 microseconds average
convolutionMain 5.3903 milliseconds
convolutionMain2Fixed   5.2572 milliseconds
copyTranspose   4.6246 milliseconds, 336 calls, 13.7637 microseconds average
scaleInPlace    4.5107 milliseconds, 168 calls, 26.8494 microseconds average
addRepeatScale  3.7607 milliseconds, 324 calls, 11.6071 microseconds average
softMax 2.9733 milliseconds, 162 calls, 18.3537 microseconds average
addRepeat       1.8574 milliseconds, 180 calls, 10.3189 microseconds average
diagMaskInf     1.3711 milliseconds, 162 calls, 8.46358 microseconds average
convolutionPrep1        439.3 microseconds, 2 calls, 219.65 microseconds average
convolutionPrep2        229.4 microseconds, 2 calls, 114.7 microseconds average
addRows 191.5 microseconds, 27 calls, 7.09259 microseconds average
add     60.4 microseconds
mulMatByScalar  29.7 microseconds, 6 calls, 4.95 microseconds average
mulMatByRow     27.6 microseconds, 6 calls, 4.6 microseconds average
    Memory Usage
Model   858.5 KB RAM, 140.539 MB VRAM
Context 1.19063 MB RAM, 186.732 MB VRAM
Total   2.02901 MB RAM, 327.271 MB VRAM

c:\whisper.cpp>main.exe -l zh -osrt -m models/ggml-medium.bin chs.wav
Using GPU "NVIDIA GeForce GTX 1080", feature level 12.1, effective flags Wave32 | NoReshapedMatMul
Loaded MEL filters, 62.8 kb RAM
Loaded vocabulary, 51865 strings, 3037.1 kb RAM
Loaded 947 GPU tensors, 1462.12 MB VRAM
Computed CPU base frequency: 2.29469 GHz
Loaded model from "models/ggml-medium.bin" to VRAM
Created source reader from the file "chs.wav"

[00:00:00.000 --> 00:00:01.400]  ?????????????
[00:00:01.400 --> 00:00:03.000]  ????????????
[00:00:03.000 --> 00:00:04.800]  ?????????????????
[00:00:04.800 --> 00:00:07.800]  ??? ?? ??? ?? ?????????
[00:00:07.800 --> 00:00:09.200]  ???????????
[00:00:09.200 --> 00:00:12.000]  ??????????????????????
[00:00:12.000 --> 00:00:13.400]  ?????????
[00:00:13.400 --> 00:00:14.400]  ???????
[00:00:14.400 --> 00:00:17.400]  ?????????????????????????
[00:00:17.400 --> 00:00:20.000]  ?????????????????????
[00:00:20.000 --> 00:00:21.600]  ???????????????
[00:00:21.600 --> 00:00:22.800]  ?????????
[00:00:22.800 --> 00:00:24.400]  ?????????????
[00:00:24.400 --> 00:00:29.600]  ?????????????????? ?????????????????????
[00:00:29.600 --> 00:00:32.400]  ??????? ???????? ???
[00:00:32.400 --> 00:00:34.600]  ??????????????????
[00:00:34.600 --> 00:00:36.200]  ???????????
[00:00:36.200 --> 00:00:37.000]  ???
[00:00:37.000 --> 00:00:38.000]  ?????
[00:00:38.000 --> 00:00:39.400]  ???????????
[00:00:39.400 --> 00:00:40.600]  ????????
[00:00:40.600 --> 00:00:41.800]  ????? ?????
[00:00:41.800 --> 00:00:44.000]  ???????????????????
[00:00:44.000 --> 00:00:46.600]  ?????????????????????????
[00:00:46.600 --> 00:00:49.600]  ???????????????????????
[00:00:49.600 --> 00:00:52.000]  ???????????????????
[00:00:52.000 --> 00:00:54.200]  ???????????????????
[00:00:54.200 --> 00:00:56.000]  ??????? ??????
[00:00:56.000 --> 00:00:58.000]  ???????????????????
[00:00:58.000 --> 00:01:00.000]  ??????????????
[00:01:00.000 --> 00:01:01.000]  ????????
[00:01:01.000 --> 00:01:02.600]  ???????????
[00:01:02.600 --> 00:01:04.800]  ????????????? ????????
[00:01:04.800 --> 00:01:07.000]  ??11 ??????????????????
[00:01:07.000 --> 00:01:10.000]  ?????????????????? ????????
[00:01:10.000 --> 00:01:13.200]  ???? ??????????????????296%
[00:01:13.200 --> 00:01:16.000]  ?????????????????????
[00:01:16.000 --> 00:01:20.000]  ??????11 ?????? ????????????7????????
[00:01:20.000 --> 00:01:21.000]  ?????????
[00:01:21.000 --> 00:01:22.400]  ???????????
[00:01:22.400 --> 00:01:24.200]  ???? ????????
[00:01:24.200 --> 00:01:26.800]  ???????????????????????
[00:01:26.800 --> 00:01:28.400]  ???? ?????????
[00:01:28.400 --> 00:01:29.800]  ??????????
[00:01:29.800 --> 00:01:31.800]  ?????????????? ????
[00:01:31.800 --> 00:01:33.400]  ??????????????
[00:01:33.400 --> 00:01:35.400]  ???????????????
[00:01:35.400 --> 00:01:37.600]  ??? ?????2198
[00:01:37.600 --> 00:01:40.600]  ????????? ??????699
[00:01:40.600 --> 00:01:42.200]  ?????? ???????
[00:01:42.200 --> 00:01:45.000]  400?????? ?????????300?
[00:01:45.000 --> 00:01:48.200]  ??????? ????????200???????????
[00:01:48.200 --> 00:01:51.600]  ????? ????????????Citywalk????
[00:01:51.600 --> 00:01:54.600]  ?????? ???????1000????
[00:01:54.600 --> 00:01:58.200]  ????????????????????????????
[00:01:58.200 --> 00:02:00.400]  ?????????????????
[00:02:00.400 --> 00:02:02.200]  ?????????????
[00:02:02.200 --> 00:02:05.000]  ???????????????????????
[00:02:05.000 --> 00:02:07.400]  ????????? ???????????
[00:02:07.400 --> 00:02:08.600]  ????????
[00:02:08.600 --> 00:02:10.000]  ??????????
[00:02:10.000 --> 00:02:13.400]  ???????????????????????? ????1?1???
[00:02:13.400 --> 00:02:15.800]  ??????????????? ?????
[00:02:15.800 --> 00:02:18.200]  ?????????? ?????????
[00:02:18.200 --> 00:02:20.600]  ???????????? ???????
[00:02:20.600 --> 00:02:22.400]  ?????????? ???
[00:02:22.400 --> 00:02:26.400]  ????????? ????? ???? ??????????
[00:02:26.400 --> 00:02:29.200]  ???????? ???????????????????
[00:02:29.200 --> 00:02:30.800]  ????????????
[00:02:30.800 --> 00:02:32.600]  ???? ???????
[00:02:32.600 --> 00:02:35.400]  ????????? ????????
[00:02:35.400 --> 00:02:38.600]  ????????????? ???????????
[00:02:38.600 --> 00:02:41.000]  ?????? ???????????
[00:02:41.000 --> 00:02:43.600]  ?????????1000? ???????
[00:02:43.600 --> 00:02:46.400]  500???????? 200???????
[00:02:46.400 --> 00:02:48.400]  ?99 ??????????
[00:02:48.400 --> 00:02:50.800]  ???????????? ?????????
[00:02:50.800 --> 00:02:53.800]  ???????GORTEX??????? ??3000??
[00:02:53.800 --> 00:02:56.200]  ???????????????????????
[00:02:56.200 --> 00:03:00.000]  ???????????GORTEX???????????4500
[00:03:00.000 --> 00:03:03.000]  ?????GORTEX ?????????????
[00:03:03.000 --> 00:03:05.800]  ????? ???????????????????
[00:03:05.800 --> 00:03:08.000]  ???????? ????? ????
[00:03:08.000 --> 00:03:09.800]  ?????????????????
[00:03:09.800 --> 00:03:11.800]  ????????????????????
[00:03:11.800 --> 00:03:14.200]  ???????? ????????????
[00:03:14.200 --> 00:03:17.000]  ???????????? ????????
[00:03:17.000 --> 00:03:20.000]  ??????????? ??????????
[00:03:20.000 --> 00:03:21.600]  ????????????
[00:03:21.600 --> 00:03:23.200]  ?????????????
[00:03:23.200 --> 00:03:26.000]  ????????????????? ?????????????
[00:03:26.000 --> 00:03:29.000]  ??????????? ????????? ?????????
[00:03:29.000 --> 00:03:31.800]  ?????????? ??????????????
[00:03:31.800 --> 00:03:35.000]  ??????? ????????????????????
[00:03:35.000 --> 00:03:36.800]  ????????????
[00:03:36.800 --> 00:03:40.000]  ???? ???????????? ???
[00:03:40.000 --> 00:03:42.600]  ?????????? ???????????
[00:03:42.600 --> 00:03:46.000]  ?????????? ????????????
[00:03:46.000 --> 00:03:49.200]  ??????????????? ?????????????
[00:03:49.200 --> 00:03:52.200]  ?????????? ??????????
[00:03:52.200 --> 00:03:55.000]  ???????????????? ?????
[00:03:55.000 --> 00:03:58.000]  ???????????? ?????????????
[00:03:58.000 --> 00:04:01.000]  ?????????????????????? ?????
[00:04:01.000 --> 00:04:04.000]  ??????????????? ??????
[00:04:04.000 --> 00:04:06.600]  ??????? ???????????????
[00:04:06.600 --> 00:04:08.800]  ???????????????
[00:04:08.800 --> 00:04:12.000]  ?????????????????? ?????????
[00:04:12.000 --> 00:04:13.600]  ??????????????
[00:04:13.600 --> 00:04:16.200]  ??????????? ??????????
[00:04:16.200 --> 00:04:18.400]  ???????? ???????
[00:04:18.400 --> 00:04:21.800]  ?? ?????? ??????????????
[00:04:21.800 --> 00:04:25.800]  ??????????????? ??????????????????
[00:04:25.800 --> 00:04:29.200]  ???????? ????????????????????
[00:04:29.200 --> 00:04:30.800]  ?????????????????
[00:04:30.800 --> 00:04:33.400]  ?????????? ?????????
[00:04:33.400 --> 00:04:36.200]  ??????? ????????????????
[00:04:36.200 --> 00:04:39.400]  ???????? ???????????????
[00:04:39.400 --> 00:04:41.200]  ??????????????
[00:04:41.200 --> 00:04:43.600]  ?????????? ?????????
[00:04:43.600 --> 00:04:45.000]  ??????????
[00:04:45.000 --> 00:04:47.600]  ????????????????????
[00:04:47.600 --> 00:04:51.600]  ????????????? ????????? ???????
[00:04:51.600 --> 00:04:53.200]  ???????????
[00:04:53.200 --> 00:04:55.800]  ??? ??????????????????????
[00:04:55.800 --> 00:04:57.400]  ????????????????
[00:04:57.400 --> 00:04:59.800]  ?????????????????????
[00:04:59.800 --> 00:05:03.000]  ?????????????? ???????????
[00:05:03.000 --> 00:05:04.800]  ?????????????????
[00:05:04.800 --> 00:05:07.200]  ???????????? ??????????
[00:05:07.200 --> 00:05:09.400]  ???? ??????????????
[00:05:09.400 --> 00:05:11.600]  ??????????????????
[00:05:11.600 --> 00:05:14.800]  ???????????????? ???????????
[00:05:14.800 --> 00:05:16.400]  ???? ??????
[00:05:16.400 --> 00:05:18.800]  ????? ??????????????
[00:05:18.800 --> 00:05:20.800]  ???????????????
[00:05:20.800 --> 00:05:23.200]  ????????? ????????????
[00:05:23.200 --> 00:05:25.600]  ????????? ??????????????
[00:05:25.600 --> 00:05:29.800]  ?????? ????????????????????881?
[00:05:29.800 --> 00:05:31.800]  ??????? ??2000?
[00:05:31.800 --> 00:05:34.600]  ?????? ??????????????????
[00:05:34.600 --> 00:05:38.400]  ?????????8000????????? 2000???????
[00:05:38.600 --> 00:05:41.200]  ????????? ????????????
[00:05:41.200 --> 00:05:43.600]  ?????? ??? ????????
[00:05:43.600 --> 00:05:46.600]  ??2000??8000????????????????
[00:05:46.600 --> 00:05:49.600]  ??????????? ?2018?2021?
[00:05:49.600 --> 00:05:52.200]  ?????4???????60%??
[00:05:52.200 --> 00:05:56.000]  ??5??? ?????????????20??????60??
[00:05:56.000 --> 00:05:59.200]  ?????????? ?????????????????
[00:05:59.200 --> 00:06:02.200]  ???????????? ?????????????????
[00:06:02.200 --> 00:06:05.200]  ?????????? ???????????????
[00:06:05.200 --> 00:06:09.600]  ??? ????????? ????????????????????
[00:06:09.600 --> 00:06:11.400]  ????????????
[00:06:11.400 --> 00:06:15.200]  ???? ?????????? ????????????????
[00:06:15.200 --> 00:06:17.800]  ???? ????????????????
[00:06:17.800 --> 00:06:20.600]  ?350?????????????????
[00:06:20.600 --> 00:06:23.000]  ??????? ??????????
[00:06:23.000 --> 00:06:25.000]  ?????????????????
[00:06:25.000 --> 00:06:27.400]  ??? ???????????OK
[00:06:27.400 --> 00:06:29.600]  ?????????????????????
[00:06:29.600 --> 00:06:31.800]  ???????????????????
[00:06:31.800 --> 00:06:36.600]  ???????????????? ???????????????????????
[00:06:36.600 --> 00:06:38.800]  ?????????????????
[00:06:38.800 --> 00:06:41.400]  ???????????????????
[00:06:41.400 --> 00:06:44.200]  ??????????????????????????
[00:06:44.200 --> 00:06:46.800]  ????????????????????
[00:06:46.800 --> 00:06:48.800]  ????????????????
[00:06:48.800 --> 00:06:51.200]  ???????????????????
[00:06:51.200 --> 00:06:53.000]  ????????????????
[00:06:53.000 --> 00:06:56.000]  ?????????????????????????
[00:06:56.000 --> 00:07:01.600]  ????????????IC????? ????? ??????
    CPU Tasks
LoadModel       1.43866 seconds
RunComplete     83.7284 seconds
Run     83.6255 seconds
Callbacks       457.784 milliseconds, 187 calls, 2.44804 milliseconds average
Spectrogram     1.21106 seconds, 90 calls, 13.4562 milliseconds average
Sample  1.01043 seconds, 3535 calls, 285.836 microseconds average
Encode  15.2296 seconds, 17 calls, 895.858 milliseconds average
Decode  67.9228 seconds, 17 calls, 3.99546 seconds average
DecodeStep      66.9103 seconds, 3535 calls, 18.928 milliseconds average
    GPU Tasks
LoadModel       1.03839 seconds
Run     83.4773 seconds
Encode  15.3219 seconds, 17 calls, 901.288 milliseconds average
EncodeLayer     13.0778 seconds, 408 calls, 32.0533 milliseconds average
Decode  68.1554 seconds, 17 calls, 4.00914 seconds average
DecodeStep      68.1535 seconds, 3535 calls, 19.2796 milliseconds average
DecodeLayer     61.7764 seconds, 84840 calls, 728.152 microseconds average
    Compute Shaders
mulMatByRowTiled        38.8209 seconds, 1016702 calls, 38.1831 microseconds average
mulMatTiled     15.8527 seconds, 8993 calls, 1.76278 milliseconds average
fmaRepeat1      3.71454 seconds, 258888 calls, 14.348 microseconds average
addRepeatEx     3.43395 seconds, 255336 calls, 13.4487 microseconds average
normFixed       3.29705 seconds, 258888 calls, 12.7354 microseconds average
softMaxLong     2.62421 seconds, 3535 calls, 742.351 microseconds average
copyConvert     2.6175 seconds, 171312 calls, 15.2791 microseconds average
addRepeatScale  2.43674 seconds, 169680 calls, 14.3608 microseconds average
copyTranspose   2.43484 seconds, 170496 calls, 14.2809 microseconds average
softMaxFixed    1.78188 seconds, 85248 calls, 20.9023 microseconds average
addRepeatGelu   1.39165 seconds, 85282 calls, 16.3182 microseconds average
softMax 1.27396 seconds, 84840 calls, 15.0161 microseconds average
scaleInPlace    1.00817 seconds, 85248 calls, 11.8264 microseconds average
addRepeat       954.089 milliseconds, 86064 calls, 11.0858 microseconds average
diagMaskInf     652.093 milliseconds, 84840 calls, 7.68616 microseconds average
convolutionMain2Fixed   388.382 milliseconds, 17 calls, 22.846 milliseconds average
convolutionMain 163.663 milliseconds, 17 calls, 9.62722 milliseconds average
convolutionPrep1        24.0373 milliseconds, 34 calls, 706.979 microseconds average
addRows 21.3709 milliseconds, 3535 calls, 6.04552 microseconds average
convolutionPrep2        7.0976 milliseconds, 34 calls, 208.753 microseconds average
add     1.8821 milliseconds, 17 calls, 110.712 microseconds average
    Memory Usage
Model   877.966 KB RAM, 1.42785 GB VRAM
Context 109.465 MB RAM, 785.219 MB VRAM
Total   110.322 MB RAM, 2.19467 GB VRAM
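
The profile above mixes seconds, milliseconds and microseconds, which makes the breakdown hard to compare at a glance. The sketch below normalizes a few of the lines to seconds and computes how much of the run was spent decoding. It assumes the line layout shown in this log (name, value, unit); the function name and the regular expression are illustrative and not part of the tool.

```python
import re

_UNITS = {"seconds": 1.0, "milliseconds": 1e-3, "microseconds": 1e-6}
_LINE = re.compile(r"^(\S+)\s+([\d.]+) (seconds|milliseconds|microseconds)")


def parse_timings(text: str) -> dict:
    """Map task name -> total seconds; the first occurrence of a name wins."""
    timings = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match and match.group(1) not in timings:
            timings[match.group(1)] = float(match.group(2)) * _UNITS[match.group(3)]
    return timings


# A few lines copied from the "CPU Tasks" section of the log above.
sample = """\
RunComplete     83.7284 seconds
Run     83.6255 seconds
Callbacks       457.784 milliseconds, 187 calls, 2.44804 milliseconds average
Encode  15.2296 seconds, 17 calls, 895.858 milliseconds average
Decode  67.9228 seconds, 17 calls, 3.99546 seconds average
"""

t = parse_timings(sample)
decode_share = t["Decode"] / t["Run"]
print(f"decode share of run time: {decode_share:.1%}")
```

On these numbers decoding takes a little over 80% of the run, so the decode steps rather than the encoder dominate the 83.7 seconds.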

c:\whisper.cpp>


https://github.com/ggerganov/whisper.cpp/tree/master/models
https://github.com/ggerganov/whisper.cpp
ggerganov/whisper.cpp


https://blog.csdn.net/aiyolo/article/details/129674728?share_token=2c48b804-37f6-43a8-9159-08b28147ad67
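Building and using Whisper.cpp
whisper.cpp is a C++ reimplementation of OpenAI's Whisper speech recognition model by the expert developer ggerganov. It is open source on GitHub and is lightweight, fast, and practical. This article mainly records how to use the model for local speech recognition on Windows.
The source is hosted at ggerganov/whisper.cpp: Port of OpenAI's Whisper model in C/C++ (github.com). First, download the project locally.
git clone https://github.com/ggerganov/whisper.cpp
The whisper.cpp project provides several ready-made models. Downloading a model of size small or larger is recommended; otherwise the recognition results are unusable.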


https://huggingface.co/ggerganov/whisper.cpp
ggerganov/whisper.cpp 
OpenAI's Whisper models converted to ggml format
Available models

Model      Disk    Mem      SHA
tiny       75 MB   ~390 MB  bd577a113a864445d4c299885e0cb97d4ba92b5f
tiny.en    75 MB   ~390 MB  c78c86eb1a8faa21b369bcd33207cc90d64ae9df
base       142 MB  ~500 MB  465707469ff3a37a2b9b8d8f89f2f99de7299dac
base.en    142 MB  ~500 MB  137c40403d78fd54d454da0f9bd998f78703390c
small      466 MB  ~1.0 GB  55356645c2b361a969dfd0ef2c5a50d530afd8d5
small.en   466 MB  ~1.0 GB  db8a495a91d927739e50b3fc1cc4c6b8f6c2d022
medium     1.5 GB  ~2.6 GB  fd9727b6e1217c2f614f9b698455c4ffd82463b4
medium.en  1.5 GB  ~2.6 GB  8c30f0e44ce9560643ebd10bbe50cd20eafd3723
large-v1   2.9 GB  ~4.7 GB  b1caaf735c4cc1429223d5a74f0f4d0b9b59a299
large-v2   2.9 GB  ~4.7 GB  0f4c8e34f21cf1a914c59d8b3ce882345ad349d6
large      2.9 GB  ~4.7 GB  ad82bf6a9043ceed055076d0fd39f5f186ff8062
note: large corresponds to the latest Large v3 model
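
The SHA column can be used to check a downloaded model file. The sketch below assumes the values are SHA-1 digests, which their 40 hexadecimal digits suggest but the table does not state. The function names are illustrative. The file is read in chunks so that a multi-gigabyte model is never loaded into memory at once.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-1 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_table(path: str, expected_sha: str) -> bool:
    """Compare a file's digest with a value copied from the table (case-insensitive)."""
    return sha1_of_file(path) == expected_sha.strip().lower()
```

For example, `matches_table("models/ggml-medium.bin", "fd9727b6e1217c2f614f9b698455c4ffd82463b4")` would check the medium model. A mismatch does not necessarily mean a corrupt download, since the table may describe a different revision of the file than the one on disk.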

For more information, visit:

https://github.com/ggerganov/whisper.cpp/tree/master/models
https://huggingface.co/ggerganov/whisper.cpp/tree/main

