Speech-to-Text » Whisper AI
Whisper is an advanced, revolutionary open-source model that also works on Serbian.
Introductory OpenAI post and demo at the Hugging Face Whisper Space by OpenAI. It is fascinating how solidly it handles Serbian as well.
chidiwilliams/buzz: Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI’s Whisper. This is a Windows app that currently does not work for me, although it should.
Tools based on Whisper AI
Freesubtitles is a free tool built on Whisper that works flawlessly. You can choose the Whisper model size and also request a translation, all in one step. The only limitation is the queue, which is sometimes several hours long. Translation is done with LibreTranslate, which turned out to be fantastic. Even if the site stops working, it is proof that all of this can be self-hosted, and the self-hosted project is mayeaux/generate-subtitles: Generate transcripts for audio and video content with a user friendly UI, powered by Open AI’s Whisper with automatic translations and download videos automatically with yt-dlp integration
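For the self-hosted route, the last step is turning timed transcript segments into a subtitle file. Below is a minimal sketch of that step, assuming segments shaped like the ones openai/whisper returns (dicts with `start`, `end` and `text`). The function names are my own and are not taken from generate-subtitles.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT text.

    The dict shape is an assumption modelled on Whisper's output.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)
```

Writing the returned string to a `.srt` file next to the video should be enough for most players to pick it up.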
Hugging Face Space for Free Fast YouTube URL Video-to-Text using OpenAI’s Whisper Model
https://github.com/saharmor/whisper-playground https://github.com/tobiashuttinger/openai-whisper-realtime
Wow: https://github.com/ggerganov/whisper.cpp Unreal, there is even a WASM demo: https://whisper.ggerganov.com/
In this repository you can find a Flutter/Dart port of Whisper: https://github.com/azkadev/whisper_dart It is based on the C++ port of Whisper: https://github.com/ggerganov/whisper.cpp
https://exemplary.ai/blog/openai-whisper https://www.assemblyai.com/blog/how-to-run-openais-whisper-speech-recognition-model/
https://huggingface.co/blog/fine-tune-whisper with the Common Voice dataset
https://github.com/openai/whisper/discussions/11 This one really works: https://play.google.com/store/apps/details?id=com.whisper.android.tflitecpp and it comes from this repo: https://github.com/usefulsensors/openai-whisper There is also a Flutter version: https://github.com/azkadev/whisper_dart Similar: https://codeberg.org/pluja/web-whisper
How to transcribe and translate with OpenAI’s Whisper | by Jason Boog | Medium with this Google Colab notebook
This is how to generate perfect transcripts for audio podcasts for free
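The transcribe-and-translate workflow from the links above can be sketched with the official `openai-whisper` Python package. This is a minimal sketch, not the notebook's code: it assumes the package and ffmpeg are installed, and the helper names and the default model size are my own choices.

```python
def transcribe_options(language=None, translate=False):
    """Build keyword arguments for whisper's model.transcribe()."""
    # task="translate" makes Whisper output English regardless of input language.
    options = {"task": "translate" if translate else "transcribe"}
    if language is not None:
        options["language"] = language  # e.g. "sr" for Serbian
    return options


def transcribe_file(path, model_name="base", language=None, translate=False):
    """Transcribe (or translate into English) one audio file and return the text."""
    import whisper  # third-party: pip install openai-whisper (requires ffmpeg)

    model = whisper.load_model(model_name)
    result = model.transcribe(path, **transcribe_options(language, translate))
    return result["text"]
```

For example, `transcribe_file("interview.mp3", model_name="large", language="sr")` would transcribe a (hypothetical) Serbian recording, and adding `translate=True` would return an English translation instead.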
Whisper Ecosystem
Pretty much everything is listed here: sindresorhus/awesome-whisper: 🔊 Awesome list for Whisper - an open-source AI-powered speech recognition system developed by OpenAI
Local processing
MacWhisper uses every core and really pushes the machine hard. Although it shows a warning, it worked for me even with the large model. Since it is that good, I paid €25 for the Lifetime license.
Buzz Captions - Offline audio transcription and translation, with the repo at chidiwilliams/buzz: Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI’s Whisper. It can be installed through brew, even though it costs money on the App Store. It is excellent because you can pick the fastest model on the M2, which I think is Faster-Whisper. It is completely free, but it crashed occasionally, so it is a bit buggy. It also supports the online OpenAI Whisper API, so in theory it offers the best choice of models. However, none of the models other than Whisper worked and it kept crashing, so that choice is not much use in practice.
Aiko - Sindre Sorhus is free but very bare-bones. I was about 90% sure it uses the largest model, although I had not found proof in black and white. Then I found it: "Hello. Which model does it use on Mac?" "It uses the large v2 model." https://www.reddit.com/r/macapps/comments/13qg1b7/comment/jlezj70/
How can I transcribe a YouTube video? Download the audio using a service like Dirpy and then open the file in Aiko.
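The download step can also be scripted. The sketch below swaps Dirpy for yt-dlp (the tool the generate-subtitles project integrates) and assumes `yt-dlp` and ffmpeg are installed. The function names and the output naming pattern are illustrative choices.

```python
def audio_download_options(output_dir="."):
    """Options for yt_dlp.YoutubeDL that keep only the audio track, as MP3."""
    return {
        "format": "bestaudio/best",
        # Name the file after the video id; yt-dlp fills in the extension.
        "outtmpl": f"{output_dir}/%(id)s.%(ext)s",
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"},
        ],
    }


def download_audio(url, output_dir="."):
    """Download the audio of one video into output_dir."""
    import yt_dlp  # third-party: pip install yt-dlp (ffmpeg must be on PATH)

    with yt_dlp.YoutubeDL(audio_download_options(output_dir)) as ydl:
        ydl.download([url])
```

The resulting MP3 can then be opened in Aiko or any of the other local apps listed here.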
There is also Jojo Transcribe, which is free on the Mac App Store; the only promising thing about it at first was that it comes from a Swedish company. It turned out that, although it is free, the app works quite decently with the large model on Serbian.
WhisperScript, an Electron desktop app GUI
WhisperScript - Unlimited AI Transcription for Mac Free licence for lite: A115D6EC-052B4E7F-86D970D2-F9E1B6D9 or AE3C18C8-4CF44806-B36364D7-D6E54E08
WhisperScript, an Electron desktop app GUI for Whisper · openai/whisper · Discussion #1028
WhisperScript - the ultimate transcription app for Mac
WhisperScript - Product Information, Latest Updates, and Reviews 2023 | Product Hunt
Audapolis
bugbakery/audapolis: an editor for spoken-word audio with automatic transcription Here’s a multi-platform open source app that does the same thing but uses vosk instead of whisper. It apparently has models for a number of languages, but none for any of the Balkan languages.
Online processing
writeout.ai – Transcribe and translate any audio file uses the OpenAI API and is itself open-source, although the hosted service is paid. The repo is at beyondcode/writeout.ai: Transcribe and translate your audio files - for free
Revoldiv uses Whisper. The transcription quality is near perfect and it’s free.
superwhisper requires a license for Serbian, which is perpetual, but there is also a lifetime option at $250, so forget it. Still, I liked how easily it integrates into every application.
oTranscribe is not a tool that produces a transcript automatically; it helps you write a manual transcript yourself, and it works for both local video and YouTube.
If you want to host the API yourself, set up this Docker image: ahmetoner/whisper-asr-webservice: OpenAI Whisper ASR Webservice API
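Once the container is running, a client can be as small as the sketch below. It assumes the service listens on `http://localhost:9000` and exposes an `/asr` endpoint that takes `task`, `language` and `output` query parameters plus an `audio_file` form field. That is how I remember the project's documentation, so verify it against the current README before relying on it. The function names are my own.

```python
from urllib.parse import urlencode


def asr_url(base_url, task="transcribe", language=None, output="json"):
    """Build the request URL for the (assumed) /asr endpoint."""
    query = {"task": task, "output": output}
    if language:
        query["language"] = language  # e.g. "sr"; omit to let the model detect it
    return f"{base_url.rstrip('/')}/asr?{urlencode(query)}"


def transcribe_remote(path, base_url="http://localhost:9000", **params):
    """Upload one audio file to the self-hosted service and return the raw response body."""
    import requests  # third-party: pip install requests

    with open(path, "rb") as audio:
        response = requests.post(
            asr_url(base_url, **params),
            files={"audio_file": audio},  # form field name is an assumption
            timeout=600,  # large models can take minutes on long recordings
        )
    response.raise_for_status()
    return response.text
```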
saharmor/whisper-playground: Build real time speech2text web apps using OpenAI’s Whisper uses faster-whisper for real-time speech2text. There is a demo, but it worked for neither Serbian nor English.
AI models for Whisper
official: openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision
faster: ggerganov/whisper.cpp: Port of OpenAI’s Whisper model in C/C++ and its Windows port: Const-me/Whisper: High-performance GPGPU inference of OpenAI’s Whisper automatic speech recognition (ASR) model
fastest: guillaumekln/faster-whisper: Faster Whisper transcription with CTranslate2
super-fast Whisper running locally: Vaibhavs10/insanely-fast-whisper
- WhisperFusion: Ultra-low latency conversations with an AI chatbot
- collabora/WhisperLive: A nearly-live implementation of OpenAI’s Whisper.
- collabora/WhisperSpeech: An Open Source text-to-speech system built by inverting Whisper.
argmaxinc/WhisperKit: On-device Speech Recognition for Apple Silicon
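As a point of comparison with the official package, here is a minimal sketch of running faster-whisper from Python. It assumes `faster-whisper` is installed. The CPU/int8 settings, the model size and the helper names are illustrative choices, not recommendations from the project.

```python
def format_segment(start, end, text):
    """Render one timed segment as a single readable line."""
    return f"[{start:7.2f} -> {end:7.2f}] {text.strip()}"


def transcribe_fast(path, model_size="large-v2", language=None):
    """Transcribe one file with faster-whisper and return formatted lines."""
    from faster_whisper import WhisperModel  # third-party: pip install faster-whisper

    # int8 on CPU keeps memory use low; on a GPU, device="cuda" with
    # compute_type="float16" is the usual alternative.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(path, language=language, beam_size=5)
    # `segments` is a generator, so decoding actually runs while we iterate.
    return [format_segment(s.start, s.end, s.text) for s in segments]
```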
Whisper, again:
Clients: macOS - MacWhisper is free, but not for the Medium or Large models. License key: 7832BA4B-E03740FB-96457D40-A7B3B5F0
Windows: Const-me/Whisper: High-performance GPGPU inference of OpenAI’s Whisper automatic speech recognition (ASR) model supports all models and is truly open source
In a conversation, can Whisper differentiate between the different speakers? By the way, this is called “speaker diarization”, and a variant that does it already exists: pyannote/pyannote-audio: Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
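Combining the two usually means running diarization and transcription separately and then matching them by time. The sketch below shows only that matching step, under my own assumptions: transcript segments are dicts with `start`, `end` and `text`, and speaker turns are plain `(start, end, speaker)` tuples that you would first extract from the diarization output. Neither shape is pyannote's actual API.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each transcript segment with the speaker whose turn overlaps it most.

    segments: dicts with 'start', 'end', 'text' (hypothetical shape)
    turns: (start, end, speaker) tuples (hypothetical shape)
    Segments that overlap no turn get speaker None.
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = None, 0.0
        for t_start, t_end, speaker in turns:
            shared = overlap(seg["start"], seg["end"], t_start, t_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labeled.append({**seg, "speaker": best_speaker})
    return labeled
```

Picking the speaker by maximum overlap is the simplest possible rule. It will mislabel segments in which two people talk over each other, so treat it as a starting point.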
ter — like a live podcast, or both sides of a meeting — you could use an app like Audio Hijack to set system audio as a virtual microphone input.