As I research further into Machine Learning to gain a better understanding of what’s possible and how it might be applied, I have found a couple of audio-related articles. While mostly still in the lab, this research will guarantee the perfect Frankenbite in the future!
There are several projects that will “rewrite” what someone says when you type the change into a transcript. VoCo is one, but I have seen demos of other projects that do the same, or something similar. With large enough data sets we can get results like this “fake Obama.”
Taken a step further, recreating a never-delivered speech from a President assassinated over 55 years ago just took more time.
While it will raise ethical questions (I hope), it probably guarantees perfect Frankenbites in the future.
In fact, I would love a trained model that “reinflects” a phrase so it ends naturally. The Australian version would take a rising inflection at the end of a sentence and phrase it naturally.