Speech-to-Text » Whisper AI

Whisper is an advanced, revolutionary open-source model that also works for Serbian.

Introductory OpenAI post and demo at the Hugging Face Whisper Space by OpenAI. It is fascinating how well it works for Serbian too.

chidiwilliams/buzz: Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI’s Whisper. This is a Windows app that currently does not work for me, although it should.

Tools based on Whisper AI

Freesubtitles is a free tool based on Whisper that works perfectly. You can choose the Whisper model size and also request a translation, all in one step. The only limitation is the queue, which is sometimes several hours long. Translation is done with LibreTranslate, which turned out to be fantastic. Even if the service stops working, it proves that everything can be self-hosted; the self-hosted project is mayeaux/generate-subtitles: Generate transcripts for audio and video content with a user friendly UI, powered by Open AI’s Whisper with automatic translations and download videos automatically with yt-dlp integration

Reddit - Dive into anything

Hugging Face Space for Free Fast YouTube URL Video-to-Text using OpenAI’s Whisper Model

https://github.com/saharmor/whisper-playground https://github.com/tobiashuttinger/openai-whisper-realtime

Wow: https://github.com/ggerganov/whisper.cpp Unreal. WASM demo: https://whisper.ggerganov.com/

In this repository you can find a Flutter/Dart port of Whisper: https://github.com/azkadev/whisper_dart. It is based on the C++ port of Whisper: https://github.com/ggerganov/whisper.cpp

https://exemplary.ai/blog/openai-whisper https://www.assemblyai.com/blog/how-to-run-openais-whisper-speech-recognition-model/

https://huggingface.co/blog/fine-tune-whisper (fine-tuning with the Common Voice dataset)

https://github.com/openai/whisper/discussions/11 This one actually works: https://play.google.com/store/apps/details?id=com.whisper.android.tflitecpp which is this project: https://github.com/usefulsensors/openai-whisper There is also a Flutter version: https://github.com/azkadev/whisper_dart Similar: https://codeberg.org/pluja/web-whisper

How to transcribe and translate with OpenAI’s Whisper | by Jason Boog | Medium with this Google Colab notebook


This is how to generate perfect transcripts for audio podcasts for free


Whisper Ecosystem

Pretty much everything is listed here: sindresorhus/awesome-whisper: 🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI

Local processing

MacWhisper uses all CPU cores and really pushes the machine hard. Although it warns about this, it worked for me even with the large model. Because it is so good, I paid €25 for the Lifetime license.

Buzz Captions — Offline audio transcription and translation, with the repo at chidiwilliams/buzz: Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI’s Whisper. It can be installed through brew, although it is paid on the App Store. It is excellent because you can pick the fastest model on an M2, which I think is Faster-Whisper. It is completely free, but it crashed occasionally, so it is a bit buggy. It also supports the online OpenAI Whisper API, so in principle it offers the best choice of models. However, none of the models except Whisper worked and it kept crashing, so that choice is not much use.

Aiko — Sindre Sorhus is free but very bare-bones. I was about 90% sure it uses the largest model, although I had not found proof in black and white. Found it! "Hello. Which model does it use on Mac?" "It uses the large v2 model." https://www.reddit.com/r/macapps/comments/13qg1b7/comment/jlezj70/

How can I transcribe a YouTube video? Download the audio using a service like Dirpy and then open the file in Aiko.
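The "download the audio first" step can also be scripted. The sketch below swaps Dirpy for yt-dlp (mentioned earlier in these notes) and only assembles the command line; it assumes `yt-dlp` and `ffmpeg` are installed and on the PATH. The function names and the output template are illustrative choices, not something taken from Aiko or Dirpy.

```python
import subprocess


def build_audio_download_command(video_url: str,
                                 audio_format: str = "mp3",
                                 output_template: str = "%(id)s.%(ext)s") -> list:
    """Build a yt-dlp command that extracts only the audio track."""
    return [
        "yt-dlp",
        "--extract-audio",               # keep audio only (needs ffmpeg)
        "--audio-format", audio_format,  # convert to the requested format
        "--output", output_template,     # file name pattern (assumed default)
        video_url,
    ]


def download_audio(video_url: str) -> None:
    """Run yt-dlp; raises CalledProcessError if the download fails."""
    subprocess.run(build_audio_download_command(video_url), check=True)
```

The resulting audio file can then be opened in Aiko or any of the other local apps listed here.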

There is also Jojo Transcribe, which is free on the Mac App Store; at first the only promising thing about it was that it comes from a Swedish company. It turned out that, although free, the app works quite decently with the large model for Serbian.

WhisperScript, an Electron desktop app GUI

WhisperScript - Unlimited AI Transcription for Mac Free licence for lite: A115D6EC-052B4E7F-86D970D2-F9E1B6D9 or AE3C18C8-4CF44806-B36364D7-D6E54E08

WhisperScript, an Electron desktop app GUI for Whisper · openai/whisper · Discussion #1028

WhisperScript - the ultimate transcription app for Mac

WhisperScript - Product Information, Latest Updates, and Reviews 2023 | Product Hunt

Audapolis

bugbakery/audapolis: an editor for spoken-word audio with automatic transcription Here’s a multi-platform open source app that does the same thing but uses vosk instead of whisper. It clearly relies on other language models, but none of them cover any Balkan language.


Online processing

writeout.ai – Transcribe and translate any audio file uses the OpenAI API and is itself open-source, although the hosted service is paid. The repo is at beyondcode/writeout.ai: Transcribe and translate your audio files - for free

Revoldiv uses Whisper. The transcription quality is near perfect and it’s free.

superwhisper requires a license for Serbian, which is perpetual, but there is also a lifetime one for $250, so forget it. Still, I liked how easily it integrates into every application.

oTranscribe does not produce automatic transcripts; it helps you write a manual transcript yourself, and it works with both local video and YouTube.

If you want to host the API yourself, set up this Docker image: ahmetoner/whisper-asr-webservice: OpenAI Whisper ASR Webservice API
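Once such a container is running, a client could look roughly like the sketch below. It assumes, based on my recollection of the project's README rather than anything in these notes, that the service listens on port 9000 and exposes a POST `/asr` endpoint with `task`, `language` and `output` query parameters and an `audio_file` upload field; check the project documentation before relying on any of these. The helper names are made up for illustration.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_asr_url(base_url: str, task: str = "transcribe",
                  language: str = "sr", output: str = "srt") -> str:
    """Build the request URL (endpoint and parameter names are assumptions)."""
    query = urlencode({"task": task, "language": language, "output": output})
    return f"{base_url.rstrip('/')}/asr?{query}"


def build_multipart(field: str, filename: str, payload: bytes):
    """Encode a single file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe_remote(audio_path: str,
                      base_url: str = "http://localhost:9000") -> str:
    """Upload a local audio file and return the service response as text."""
    with open(audio_path, "rb") as handle:
        body, content_type = build_multipart("audio_file", "audio", handle.read())
    request = Request(build_asr_url(base_url), data=body,
                      headers={"Content-Type": content_type}, method="POST")
    with urlopen(request) as response:
        return response.read().decode("utf-8")
```

Only the standard library is used, so the client has no dependencies beyond a running service.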

saharmor/whisper-playground: Build real time speech2text web apps using OpenAI’s Whisper uses faster-whisper for real-time speech2text. There is also a demo, but it worked for neither Serbian nor English.

AI models for Whisper

official: openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision

faster: ggerganov/whisper.cpp: Port of OpenAI’s Whisper model in C/C++ and its Windows port: Const-me/Whisper: High-performance GPGPU inference of OpenAI’s Whisper automatic speech recognition (ASR) model

fastest: guillaumekln/faster-whisper: Faster Whisper transcription with CTranslate2
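For the official openai/whisper package, a minimal sketch of transcribing a Serbian recording into SRT subtitles might look like this. It assumes `pip install openai-whisper` and a working `ffmpeg`; the helper names, the default model size and the language code `sr` are my own choices for illustration. The two formatting helpers are plain Python and work on any list of segments shaped like Whisper's output.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn segments (dicts with start, end, text) into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, model_size: str = "base",
                      language: str = "sr", task: str = "transcribe") -> str:
    """Run Whisper on a file and return SRT text.

    task="translate" asks Whisper to translate the speech into English.
    """
    import whisper  # imported lazily so the helpers work without the package

    model = whisper.load_model(model_size)
    result = model.transcribe(audio_path, language=language, task=task)
    return segments_to_srt(result["segments"])
```

Larger model sizes are slower but, in my experience above, noticeably better for Serbian.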


Woow: Vaibhav (VB) Srivastav on X: “Insanely fast whisper - now with a CLI⚡️ You can now translate/ transcribe 100s of hours of data across 99 languages! - all from your terminal. Here’s how you can use it: 1. Install requirements pip install transformers, accelerate, optimum 2. Grab the transcribe py file and… https://t.co/XIjCe67svc” / X

Super-fast Whisper running locally: Vaibhavs10/insanely-fast-whisper





argmaxinc/WhisperKit: On-device Speech Recognition for Apple Silicon


Whisper, again:

Clients: macOS - MacWhisper is free, but not for the Medium or Large models. License key: 7832BA4B-E03740FB-96457D40-A7B3B5F0

Windows: Const-me/Whisper: High-performance GPGPU inference of OpenAI’s Whisper automatic speech recognition (ASR) model supports all models and is truly open source

In a conversation, can Whisper differentiate between the different speakers? By the way, this is called “speaker diarization”, and there is already a variant that does it: pyannote/pyannote-audio: Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
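One common way to combine the two is to run Whisper and a diarization tool separately and then label each transcript segment with the speaker whose turn overlaps it most. The sketch below shows only that merging step, under assumed input shapes: segments as dicts with `start`, `end` and `text`, and speaker turns as `(start, end, speaker)` tuples. How the turns are extracted from pyannote-audio is not shown, and the function names are hypothetical.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Attach to each segment the speaker with the largest time overlap.

    Segments that overlap no turn are labelled "unknown".
    """
    labelled = []
    for seg in segments:
        best_speaker, best_overlap = "unknown", 0.0
        for t_start, t_end, speaker in turns:
            shared = overlap(seg["start"], seg["end"], t_start, t_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labelled.append({**seg, "speaker": best_speaker})
    return labelled
```

This is a simple heuristic; it does not split a segment in which two people talk over each other.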

ter — like a live podcast, or both sides of a meeting — you could use an app like Audio Hijack to set system audio as a virtual microphone input.

date 10. Dec 2022 | modified 17. Aug 2024