Language (Text)       
 Language the input text is in.
Language (Output)       
 Target language/accent to output.
Task       
Text Delimiter
 How to split the text into utterances.
Use the raw text as the input prompt instead of phonemizing it.
Auto-play the audio once it is generated (using sounddevice).
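
As a rough illustration of the Text Delimiter option, the sketch below shows one way input text could be split into utterances before synthesis. The function name split_utterances and the delimiter values "sentences" and "lines" are hypothetical, chosen for this example only; the actual option values and splitting rules may differ.

```python
import re


def split_utterances(text: str, delimiter: str = "sentences") -> list[str]:
    """Split input text into utterances.

    The delimiter names ("sentences", "lines") are assumptions made for
    this sketch, not values confirmed by the tool itself.
    """
    if delimiter == "lines":
        # One utterance per line of input.
        parts = text.splitlines()
    elif delimiter == "sentences":
        # Break after '.', '!' or '?' when followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", text)
    else:
        raise ValueError(f"unknown delimiter: {delimiter!r}")
    # Drop empty pieces and surrounding whitespace.
    return [p.strip() for p in parts if p.strip()]


if __name__ == "__main__":
    sample = "Hello there. How are you?\nFine!"
    for utterance in split_utterances(sample, "sentences"):
        print(utterance)
```

Splitting on sentences yields shorter utterances, while splitting on lines leaves the segmentation up to whoever formats the input.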