How to improve the audio quality of voice recordings with artificial intelligence

Adobe, the software house that develops Photoshop and the Creative Cloud suite, has released an AI-powered online tool that cleans up and enhances audio recordings. The service is called Enhance Speech and, as the name suggests, it is meant to improve speech: it can transform a low-quality clip, captured with the microphone of a phone or computer, into audio that (in most cases) can pass for a studio recording.

Trying Enhance Speech is very simple: go to the site, register a new Adobe account or log in with your existing credentials, then upload an audio file (mp3 or wav only) by dragging it from the desktop or a folder into the upload area. The system starts processing the audio immediately; processing time varies with the file size and the length of the recording. The current version of Enhance Speech accepts files up to 1GB or up to an hour long.
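Before uploading, the limits mentioned above (mp3 or wav, 1GB, one hour) can be checked locally. What follows is only a minimal sketch, not anything Adobe provides: the function names are hypothetical, "1GB" is assumed to mean one billion bytes, and the duration is checked only for WAV files, because reading the length of an MP3 would require a third-party library.

```python
import wave
from pathlib import Path

MAX_BYTES = 1_000_000_000  # "1GB" from the article; decimal gigabyte is an assumption
MAX_SECONDS = 60 * 60      # one hour
ALLOWED_SUFFIXES = {".mp3", ".wav"}


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file using only the standard library."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def upload_problems(path):
    """Return the reasons a file might be rejected (empty list if none found)."""
    path = Path(path)
    suffix = path.suffix.lower()
    problems = []
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if path.stat().st_size > MAX_BYTES:
        problems.append("file is larger than 1 GB")
    # Duration is checked for WAV only; MP3 length is not read in this sketch.
    if suffix == ".wav" and wav_duration_seconds(path) > MAX_SECONDS:
        problems.append("recording is longer than one hour")
    return problems
```

Calling `upload_problems("interview.wav")` (a hypothetical file name) returns an empty list when none of the three checks fails.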

The test

We tried Enhance Speech with several files and the results are undoubtedly impressive. However, describing the processed files as “studio audio” seems like an exaggeration, as you can tell from the tests we performed in Italian and English. We used two different languages to understand whether Adobe’s AI works only at the level of spectral analysis or whether it also uses the language to work out how to isolate the voice. The difference does not seem relevant to us. A test carried out with a gibberish clip (not included in the video below) confirms that neither the language used nor the semantics of what is said affects the improvement in audio quality.

Training and errors

Adobe hasn’t revealed many details about how Enhance Speech was developed, but it’s safe to assume that the AI behind the system was trained on large amounts of studio audio and low-quality audio, so that it can recognize differences in the signal, the various types of background noise, and the many other elements that separate poor-quality audio from professional audio, such as presence, compression and others.

Enhance Speech isn’t perfect: in some cases, probably depending on the characteristics of the speaker’s voice, the result can leave something to be desired. Some frequencies are distorted more often than others, and sometimes the speaker sounds as if they had (for example) a stuffy nose. On various online forums, users have also reported outright hallucinations by the algorithm, with voices, sounds and other artifacts appearing that were not present in the original clips before Enhance Speech processing.

A suite for podcasts

Adobe Enhance Speech is part of a suite of three applications that goes by the name of Adobe Podcast. The suite also includes Mic Check and the eponymous Adobe Podcast.

Mic Check is an online tool that lets you check the quality of your microphone; it too is free of charge and requires registration. Just record a clip on the site and the system automatically checks several parameters that contribute to a good recording: distance from the microphone, gain, background noise and, finally, echo. Above, the result of a test performed with the microphone of a MacBook Air M1 (the same one used for the test recordings).

Adobe Podcast is perhaps the most interesting of the three, but for now it is available only to selected beta testers: it is a text-based editing tool for podcasts built on speech transcription. Using speech-to-text technology similar to what Adobe already integrates into Premiere or After Effects, it lets you cut and rearrange audio files simply by editing the automatically generated transcript. A similar system has been available for some time in online transcription applications such as Sonix or Descript. The latter also lets you apply the same approach to video editing.
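To give an idea of how transcript-based editing can work in general, here is a minimal sketch. It is not Adobe's, Sonix's or Descript's implementation: it simply assumes that the transcription step returns a start and end time for every word, and the `Word` type and `segments_to_keep` function are hypothetical names. Deleting words from the text then translates into a list of time ranges of audio to keep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds from the beginning of the clip
    end: float    # seconds from the beginning of the clip


def segments_to_keep(words, deleted_indices):
    """Merge the time spans of the words that survive the text edit."""
    segments = []
    previous_kept = None
    for index, word in enumerate(words):
        if index in deleted_indices:
            continue
        if previous_kept is not None and previous_kept == index - 1:
            # The previous word was kept too: extend the current segment.
            segments[-1] = (segments[-1][0], word.end)
        else:
            # First kept word, or first one after a deletion: new segment.
            segments.append((word.start, word.end))
        previous_kept = index
    return segments


transcript = [
    Word("So", 0.0, 0.3),
    Word("um", 0.3, 0.8),
    Word("welcome", 0.8, 1.4),
    Word("back", 1.4, 1.8),
]

# Removing the filler word "um" (index 1) leaves two ranges of audio.
print(segments_to_keep(transcript, deleted_indices={1}))
# prints [(0.0, 0.3), (0.8, 1.8)]
```

An audio editor would then export only those ranges and join them; a real tool would also need to handle crossfades and pauses, which this sketch ignores.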
