Convert audio and video, particularly interviews, to text quickly and accurately so that content can be created and news published with less effort. Monitor company names, brands, institutions, and other entities across a large number of audio and video files.
AGERPRES's main goals:
Provide faster and more accurate transcription of interviews than all other speech-to-text alternatives, including Google Speech-to-Text.
Publish content easily and speed up editing by searching time-stamped text to find the key moments fast.
Monitor brand names, companies, institutions, products, and persons with a high degree of accuracy.
Agerpres, Romania's leading press agency, faced a major challenge in 2022: how to achieve a higher level of accuracy and scalability in speech recognition. The agency's previous speech-to-text solution was slow, inaccurate, and could not handle the large volumes of audio data that Agerpres was processing. This was a major problem, as Agerpres relies on speech recognition to transcribe interviews, press conferences, and other audio recordings in order to release the news and monitor the media.
After an extensive evaluation of different speech-to-text solutions, Agerpres chose Vatis Tech. Vatis Tech’s solution converted audio into text much faster and more accurately than the previous options and, just as importantly, could handle large volumes of audio data. This has allowed Agerpres to improve the accuracy and timeliness of its news coverage, and it has also freed up staff time to focus on other tasks.
Vatis Tech achieved a transcription time of approximately 15% of the file length, meaning that 1 hour of audio could be transcribed in less than 10 minutes. It also achieved a content accuracy of over 95% for the Romanian language, which is often considered the hardest Romance language to learn.
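As a quick check of the arithmetic above, the short Python sketch below estimates processing time as a fixed fraction of the audio length. The function name and the constant 0.15 ratio are illustrative assumptions based on the approximate figure quoted here, not part of Vatis Tech's product or API.

```python
def estimated_transcription_minutes(audio_minutes: float, speed_ratio: float = 0.15) -> float:
    """Estimate processing time as a fixed fraction of the audio length.

    speed_ratio=0.15 reflects the approximate figure quoted in this case
    study; real processing times will vary from file to file.
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes * speed_ratio


# One hour of audio at roughly 15% of the file length.
print(f"{estimated_transcription_minutes(60):.1f} minutes")
```

For a 60-minute recording this gives about 9 minutes, which is consistent with the "less than 10 minutes" figure.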
The team implemented speaker diarization to identify multiple speakers throughout an audio or video file. As a result, the transcripts include diacritics and can be divided by paragraph, by predefined time interval, or by speaker.
Veronica Tudor
Deputy Chief Editor, AGERPRES
Vatis Tech’s speech-to-text solution reduced the time spent on transcription and helped the team focus on creating news instead of typing out recordings. The audio or video is pinned to the text, so the team can quickly refer back to the source to check exactly what was said and who said it.
With an accuracy rate of over 95%, the technology recognizes each speaker individually and also identifies a wide range of entities in the audio files, such as people, company names, dates, and locations.
With a transcription speed of approximately 15% of the file length, the team was able to transcribe 1 hour of audio in less than 10 minutes. This allowed them to post more content online at a faster pace, thus increasing their online presence.
The agency is now able to take on a higher business volume and to fulfill more complex projects with more demanding customer requirements.
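To illustrate how time-stamped, speaker-labelled transcripts make key moments easy to find, here is a minimal Python sketch of a keyword search over transcript segments. The `Segment` structure, the helper functions, and the sample sentences are all hypothetical and invented for illustration; they do not represent Vatis Tech's actual output format or API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript segment (hypothetical structure, for illustration only)."""
    start_seconds: float
    speaker: str
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def find_mentions(segments: list[Segment], term: str) -> list[Segment]:
    """Return the segments whose text contains the term, ignoring case."""
    needle = term.casefold()
    return [seg for seg in segments if needle in seg.text.casefold()]


# Made-up sample data standing in for a real transcript.
transcript = [
    Segment(12.0, "Speaker 1", "Welcome to today's press conference."),
    Segment(95.5, "Speaker 2", "The ministry will publish the report next week."),
    Segment(3725.0, "Speaker 1", "Can the Ministry confirm the budget figures?"),
]

for seg in find_mentions(transcript, "ministry"):
    print(f"[{format_timestamp(seg.start_seconds)}] {seg.speaker}: {seg.text}")
```

Each match carries its own start time and speaker label, so an editor can jump straight to that point in the recording and verify the quote against the source.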
Agerpres's experience shows that achieving speech recognition accuracy and scalability is a major challenge. However, it is a challenge that can be overcome with the right solution. By choosing a solution that is both accurate and scalable, businesses can improve their efficiency and productivity, and they can also provide their customers with a better experience.
Agerpres is planning to significantly increase the volume of business it runs with Vatis Tech. The press agency is looking to expand its use of Vatis Tech's speech recognition solutions to transcribe even more audio and video recordings. This will allow Agerpres to release the news even more quickly and to provide better media monitoring services.