Microsoft creates an AI speech tool so realistic they decide not to release it
Hindustan Times
Microsoft has created VALL-E 2, a text-to-speech AI tool so realistic that the company has decided not to release it to the public, fearing it could be misused to impersonate other people’s voices. “Currently, we have no plans to incorporate VALL-E 2 into a product or expand access to the public,” the company said. The tech giant’s researchers say that VALL-E 2 has achieved “human parity” in speech generation, meaning that what the AI says cannot be distinguished from a real human voice. The AI owes its realism to two techniques, known as “Repetition Aware Sampling” and “Grouped Code Modeling.” Repetition aware sampling helps the AI cut down on monotonous speech by keeping track of small units of language it has already produced, such as words or syllables, so that it avoids repeating them and sounds more natural. Grouped code modeling shortens the sequence the AI has to work through by bundling units of speech into groups, so it processes fewer items at a time, which speeds up speech generation and eases the challenge of handling long sentences.
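For readers curious how the two techniques described above might look in practice, here is a toy sketch in Python. It is not Microsoft's code: the function names, the padding value, and the `top_p`, `window` and `max_ratio` settings are all hypothetical, chosen only to illustrate the general idea of grouping speech codes and of backing off when one code keeps repeating.

```python
import random


def group_codes(codes, group_size):
    """Bundle a flat list of speech codes into fixed-size groups.

    The model then handles one group per step, so the sequence it has to
    process is roughly group_size times shorter.
    """
    pad = (-len(codes)) % group_size
    padded = list(codes) + [0] * pad  # 0 is a hypothetical padding code
    return [tuple(padded[i:i + group_size])
            for i in range(0, len(padded), group_size)]


def nucleus_sample(probs, top_p, rng):
    """Sample only from the most likely codes whose total mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in ranked:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def repetition_aware_sample(probs, history, rng,
                            top_p=0.8, window=10, max_ratio=0.5):
    """Pick the next code, resampling if it dominates the recent history.

    Assumption for this sketch: a code counts as "too repetitive" when it
    fills more than max_ratio of the last `window` codes. In that case we
    draw again from the full distribution instead of the narrowed one.
    """
    token = nucleus_sample(probs, top_p, rng)
    recent = history[-window:]
    if recent and recent.count(token) / len(recent) > max_ratio:
        token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return token
```

With a group size of 2, for example, `group_codes([1, 2, 3, 4, 5, 6, 7, 8], 2)` turns eight codes into four groups, halving the number of steps the model would need to take.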