#Roma #fediverso, are you out there? Do we want to create and support it? Roma fans, we are everywhere in the world and surely scattered across the fediverse. Let's create the #romafediverso
@thelinuxEXP I really like Speech Note! It's a fantastic tool for quick and local voice transcription in multiple languages, created by @mkiol
It's incredibly handy for capturing thoughts on the go, conducting interviews, or making voice memos without worrying about language barriers. The app runs its models strictly locally, and its ease of use makes it a standout choice for anyone needing offline transcription.
I primarily use #WhisperAI for transcription and Piper for voice, but many other models are available as well.
It is available as a Flatpak, and the source code is at https://github.com/mkiol/dsnote
#TTS #transcription #TextToSpeech #translator #translation #offline #machinetranslation #sailfishos #SpeechSynthesis #SpeechRecognition #speechtotext #nmt #linux-desktop #stt #asr #flatpak-applications #SpeechNote
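Speech Note wraps these models behind a GUI, but for anyone curious what bare-bones local transcription looks like with the open-source openai-whisper package, here is a minimal sketch (not Speech Note's own code; the model size and audio filename are placeholders):

```python
import whisper

model = whisper.load_model("small")  # downloaded once, then everything runs offline
result = model.transcribe("voice_memo.wav", language="en")  # placeholder filename
print(result["text"])
```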
Yesterday, I ordered food online. However, it went a little wrong, so I contacted support. They called me, and for a moment I thought it was a bot or a recorded voice or something, and I hated it. Then I realized it was a human on the line.
I was planning to build an LLM + TTS + speech recognition pipeline and deploy it on an A311D, to see if I could practice a British accent with it. Now I'm rethinking what I want to do. The way we are going doesn't lead to a good destination. I would hate having to talk to a voice-enabled chatbot as a support agent rather than a human.
And don't get me wrong: voice-enabled chatbots can have tons of good uses. But replacing humans with LLMs is not one of them, I don't think.
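For what it's worth, the kind of pipeline I had in mind is roughly the sketch below: local STT with openai-whisper, plus placeholder functions standing in for the LLM coach and the TTS step (both are hypothetical stand-ins, and the audio path is a placeholder too).

```python
import whisper

stt_model = whisper.load_model("base.en")  # small English-only model, runs locally


def coach_reply(user_text: str) -> str:
    # Hypothetical stand-in for a local LLM that critiques pronunciation.
    return f"You said: '{user_text}'. Try lengthening the vowels a little."


def speak(text: str) -> None:
    # Hypothetical stand-in for a local TTS engine such as Piper.
    print(f"[TTS] {text}")


result = stt_model.transcribe("my_attempt.wav")  # placeholder recording
speak(coach_reply(result["text"]))
```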
Arm unleashes ASR upscaling to boost mobile gaming performance https://www.developer-tech.com/news/arm-asr-upscaling-boost-mobile-gaming-performance/ #arm #developers #coding #gamedev #programming #asr #mobile #gaming #tech #news #technology
Switching from ASR to CJIB, the Belastingdienst, or Bentacera: this platform in Leeuwarden makes it possible
How do you retain staff for the city? The Platform Zakelijke Dienstverlening Leeuwarden is meant, for example, to ensure that employees of the insurer ASR stay here.
https://lc.nl/economie/Overstappen-van-Asr-naar-CJIB-de-Belastingdienst-of-Bentacera-het-kan-dankzij-dit-platform-in-Leeuwarden-45836302.html
#Platform #ZakelijkeDienstverlening #ASR
Choose a genocide-free health insurer (#zorgverzekeraar)!
Dutch health insurers invest heavily in companies that contribute to the illegal Israeli settlement economy.
Your monthly health insurance premium should not finance human rights violations.
Switch now to a more ethical insurer and file a complaint. Make your voice heard against this injustice!
Say no to the following companies:
#Achmea and its subsidiaries:
– #ZilverenKruis
– #FBTO
– #CentraalBeheer
– #Interpolis
– #Eurocross
– #InShared
– De Christelijke Zorgverzekeraar
– De Friesland Zorgverzekeraar
#Aegon and #ASR
– They recently merged and now operate in the Netherlands under the name ASR.
Why:
- Aegon and ASR together invest more than 1 billion USD in companies that contribute to the Israeli settlement economy, thereby supporting the illegal occupation.
- Achmea invests 178 million USD in companies linked to this.
- Aegon invests 998.7 million euros in 10 arms companies that supply weapons to repressive regimes.
- Achmea invests 79.5 million euros in 2 of these companies.
- Aegon also invests in arms companies that supply weapons directly to #Israël.
Sources:
- Artsen voor Gaza: https://www.linkedin.com/posts/artsen-voor-gaza_maak-vandaag-het-verschil-kies-een-genocidevrije-activity-7271616636376256512-MWpa
- Stop Funding Genocide: https://stopfundinggenocide.nl/zorgverzekering
Two posts from British Library Oral History Archivist Charlie Morgan on the challenges of AI for oral history: key questions and theoretical and practical issues for automatic speech recognition (ASR) tools and chatbots:
https://blogs.bl.uk/digital-scholarship/2024/12/the-challenges-of-ai-for-oral-history-key-questions.html
https://blogs.bl.uk/digital-scholarship/2024/12/the-challenges-of-ai-for-oral-history-theoretical-practical-issues.html
#ASR #OralHistory #ML #AI
Each quarter, when the new @mozilla #CommonVoice #dataset is released, I do a #dataviz using @observablehq of its #metadata coverage, across all 100+ languages, based on the JSON summary that is part of the release.
Some of my observations from the v18 release are:
#Catalan (ca) now has a larger dataset than English, based on the number of audio recordings (including validated and yet-to-be-validated recordings). It’s also an interesting dataset because the number of recordings per unique contributor is relatively low (around 80). This means it’s likely to have a high diversity of speakers in the dataset, which is useful for building #ASR models that generalise well to many speakers.
Catalan also appears to have the highest percentage of audio recordings by older speakers - e.g. speakers in their forties, fifties and older. Again, this highlights the diversity of speakers in the Catalan dataset.
Although it's very early to see any trends from Common Voice's decision to expand the range of options for gender identity, we are starting to see some data tagged with the new options. For example, in #Uyghur (ug), we now have data tagged as "do not wish to say". Without more evidence, I don't want to draw connections between the geopolitical situation in that area and contributors' desire not to provide demographic data that might in some way identify them, but I think it's telling that the first use of these expanded metadata categories appears in a language spoken in a contested geography.
Similarly, it’s very early to identify trends in sentence domain classification - as most of the sentences that do have a domain tag are labelled “general”, although “health_care” sentences are occurring frequently in languages such as #Albanian (sq).
#Bangla (Bengali) (bn) continues to have a very large number of yet-to-be-validated audio recordings. Due to this, the train split for Bangla is quite small.
#Dholuo (luo), a language spoken in Kenya and Tanzania, is an outlier in terms of the number of distinct data contributors to the dataset - it has a very high average number of contributions per contributor. This is often seen in languages that are new to Common Voice, before they have been able to recruit more contributors. Dholuo has nearly 5 million speakers.
The language with the highest average utterance duration is by far #Icelandic (is) at over 7 seconds. This may be because Icelandic has many words with several syllables, which take longer to pronounce. Consider "the cat sat on the mat" in English, cf "kötturinn sat á mottunni" in Icelandic.
Big thanks to all data contributors in this release for your donated utterances, and to Dmitrij Feller, @jessie, Gina Moape, EM Lewis-Jong and the team for all your efforts.
What are your thoughts? What conclusions do you draw?
https://observablehq.com/@kathyreid/mozilla-common-voice-v18-dataset-metadata-coverage
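For anyone who wants to poke at the numbers themselves, the clips-per-contributor observation above comes from a calculation along these lines. The filename and field names ("locales", "clips", "users") are assumptions based on earlier releases and may differ in v18:

```python
import json

# The JSON summary that ships with each Common Voice dataset release.
with open("cv-corpus-18.0.json") as f:  # filename is an assumption
    stats = json.load(f)

# Rough clips-per-contributor figure for each language.
for locale, info in stats["locales"].items():
    clips = info.get("clips", 0)
    users = info.get("users", 0)
    if users:
        print(f"{locale}: {clips} clips, {clips / users:.0f} per contributor")
```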
Like many other technologists, I gave my time and expertise for free to #StackOverflow because the content was licensed CC-BY-SA - meaning that it was a public good. It brought me joy to help people figure out why their #ASR code wasn't working, or assist with a #CUDA bug.
Now that a deal has been struck with #OpenAI to scrape all the questions and answers on Stack Overflow to train #GenerativeAI models like #LLMs - without attribution to authors (as required under the CC-BY-SA license that covers Stack Overflow content), and to be sold back to us (the SA clause requires derivative works to be shared under the same license) - I have issued a data deletion request to Stack Overflow to disassociate my identity from my contributions, and I am closing my account, just like I did with Reddit, Inc.
https://policies.stackoverflow.co/data-request/
The data I helped create is going to be bundled in an #LLM and sold back to me.
In a single move, Stack Overflow has alienated its community - which is also its main source of competitive advantage - in exchange for token lucre.
Stack Exchange, Stack Overflow's former instantiation, used to fulfill a psychological contract - help others out when you can, for the expectation that others may in turn assist you in the future. Now it's not an exchange, it's #enshittification.
Programmers now join artists and copywriters, whose works have been snaffled up to create #GenAI solutions.
The silver lining I see is this: once OpenAI creates LLMs that generate code - as Microsoft has done with Copilot on GitHub - where will they go to get help with the bugs those generative AI models introduce, particularly given the recent GitClear report on the "downward pressure on code quality" caused by these tools?
While this is just one more example of #enshittification, it's also a salient lesson for #DevRel folks - if your community is your source of advantage, don't upset them.
Folks, I'm starting my post-#PhD job search low-key on the side while I write up my #thesis.
I have an odd collection of skills - #Linux, #Python, #Jupyter, #pandas, #DevRel - and I've done a lot of work in team leadership and management, and have led a multi-million-dollar not-for-profit in the past. Keynote speaker.
My speciality is #voice and #speech AI, more on the #ASR side with models like #Whisper.
I'm looking for something that harnesses all of these skills - and it will be a senior role with senior pay, given my experience, qualifications and proven capability. I have time and will be discerning about my next step.
Job titles that might fit here would be Senior Research Engineer, Engineering Lead, Lead AI Engineer or similar.
Looking for fully remote work, with one day a fortnight max in #Melbourne, AU. If you don't believe in #RemoteWork or #WFH, we're not a good fit.
Super keen on something full time rather than splitting my attention over multiple part-time roles.
Looking to start around August, so a fair amount of lead time.
Keen on organisations that have strong values alignment - #FAIR and #CARE data use, #EthicalAI, AI for social good.
No crypto, no web3, no deepfake stuff.
Check out my LinkedIn for more info on my background:
https://www.linkedin.com/in/kathyreid/
Two new ASR rules just released in preview
The latter is especially interesting...
For folks who work in #DataScience, what's the easiest way for me to calculate the #CosineSimilarity of two strings? I'm looking at sklearn's cosine_similarity first.
This relates to hallucination detection in #ASR - low cosine similarity can be indicative of hallucination.
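For what it's worth, the sklearn route I'm considering looks roughly like this: vectorise both strings (TF-IDF over character n-grams here, which tolerates small word-level differences) and compare the vectors. Whether this is the right representation for hallucination detection is exactly the open question.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over a lazy dog"

# Character n-gram TF-IDF keeps the comparison robust to small word changes.
vectoriser = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
vectors = vectoriser.fit_transform([reference, hypothesis])

score = cosine_similarity(vectors[0], vectors[1])[0, 0]
print(f"cosine similarity: {score:.3f}")  # low scores could flag hallucinated output
```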
Automatic Speech Recognition #AI Assistant
Turning a #RaspberryPi 4B into a satellite for a self-hosted language model, all with a sprinkle of #ASR and #NordVPN Meshnet
https://hackaday.io/project/193635-automatic-speech-recognition-ai-assistant
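The project page has the details; purely as an illustration, a "satellite" loop along these lines could record a clip on the Pi and hand it to a self-hosted ASR endpoint reachable over the Meshnet address. The URL, endpoint path, and response shape below are assumptions, not the project's actual API.

```python
import requests
import sounddevice as sd
import soundfile as sf

SAMPLE_RATE = 16_000
ASR_URL = "http://100.64.0.2:8000/transcribe"  # hypothetical Meshnet address and route

# Record five seconds from the default microphone on the Pi.
audio = sd.rec(int(5 * SAMPLE_RATE), samplerate=SAMPLE_RATE, channels=1)
sd.wait()
sf.write("clip.wav", audio, SAMPLE_RATE)

# Ship the clip to the self-hosted server and print whatever transcript it returns.
with open("clip.wav", "rb") as f:
    resp = requests.post(ASR_URL, files={"audio": f})
print(resp.json().get("text", ""))
```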
Architectural Decision Records (ADRs)
"An Architectural Decision Record (ADR) captures a single AD and its rationale; the collection of ADRs created and maintained in a project constitute its decision log. All these are within the topic of Architectural Knowledge Management (AKM), but ADR usage can be extended to design and other decisions ('any decision record')."
#Seabrook #nuclear plant faces ongoing challenge of managing concrete degradation
by Angeljean Chiaramida, July 12, 2023
"A 2023 report shows concrete degradation has expanded from seven to 10 structures at the Seabrook plant.
"Discovered and reported by NextEra’s personnel in 2009, #ASR is a slow-developing type of degradation that can occur in some #concrete when moisture is present. Found most often in dams and bridges, ASR manifests as micro-cracking, staining, expansion and deformation of concrete.
"So far, the #NRC’s repeated inspections determined ASR in Seabrook Station’s structures poses no immediate threat to public safety."
Do you work with #voice or #speech #data? You might contribute data, write data specifications for collection, perform filtering or pre-processing, train #ASR or #TTS models, or design or perform evaluations on #ML speech models.
If so, I’d love your help to understand current #dataset #documentation practices, and what we can do to make them better as part of my #PhD #research
The #survey takes 10-20 minutes to complete, and you can opt in to win one of 3 gift cards valued at $AUD 50 each.
Research Protocol 2021/427 approved by #ANU Human Research Ethics Committee