MicMonster: AI Speech Synthesis


I recently caught an annoying summer cold and lost my voice. Now that I'm feeling better, I decided to try out a text-to-speech service I bought called MicMonster. This is my honest review; I'm not being paid by MicMonster or anyone else for this. It's just free information.

There are quite a few AI text-to-speech services available now. Most follow a similar pay-as-you-go format: you buy a subscription and receive a certain number of text-to-speech recordings per month. I have mixed feelings about text-to-speech; I'm skeptical, yet also a fan. Years ago, I bought some text-to-speech voices for my Mac, which were very expensive and took up a significant amount of storage on my hard drive. Although these early computer voices were robotic and phonetically assembled, I was charmed by them, much as I am by 8-bit characters. Today’s voices, however, are far more sophisticated and less artificial; they resemble something out of Westworld more than Robby the Robot, though they can still sound monotonous and corporate at times.

MicMonster appealed to me because it doesn’t restrict audio sampling rates, which eliminates the low-quality "flutter" in the words. The downside is that MicMonster can't be queried in real time like Siri or Watson; it takes around 5 to 10 seconds to generate a response. Competitors tended to lock better sampling rates behind expensive enterprise accounts aimed at real-time customer service, which I don’t need. I value quality over real-time response, so MicMonster was the best choice for me.
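To put sampling rates in perspective, here is a minimal Python sketch of the Nyquist limit: a recording can only represent frequencies up to half its sampling rate, which is one reason low-rate speech sounds dull or artifact-prone. The rates listed are common industry values chosen for illustration; the review does not say which rates MicMonster or its competitors actually use.

```python
# Common sample rates, listed here as illustrative assumptions; they are
# not figures published by MicMonster or any competitor.
COMMON_RATES_HZ = {
    "telephone-grade": 8_000,
    "wideband speech": 16_000,
    "CD audio": 44_100,
    "video production": 48_000,
}


def nyquist_limit_hz(sample_rate_hz: int) -> float:
    """Return the highest frequency a given sample rate can represent."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return sample_rate_hz / 2


if __name__ == "__main__":
    for label, rate in COMMON_RATES_HZ.items():
        limit = nyquist_limit_hz(rate)
        print(f"{label:>18}: {rate:>6} Hz -> up to {limit:,.0f} Hz")
```

At telephone-grade rates everything above 4 kHz is lost, including much of the detail in consonants such as "s" and "f", so a service that caps the sampling rate is also capping how crisp the speech can sound.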

MicMonster features a wide variety of English voices, some with multiple voice styles such as cheerful, sad, shouting, whispering, and more. These styles aren’t always straightforward and can vary between voices. MicMonster lets you try these voice styles for free, but you’ll soon use up your free recordings. This technology is still a work in progress; it’s not perfect yet, but it’s continually evolving. The service includes five or six American English voices with multiple styles, but none of the British voices have styles. There are also many English voices with world accents, which could be great for adding diverse NPCs to projects.

For those wondering whether these voices can serve as secondary characters or placeholders until real voice actors can be funded, MicMonster offers potential. The service works as promised, but you should start with the free trial before committing to a paid subscription.

One feature I found useful is the phonetic panel, which allows you to type out sounds if required, although the AI typically interprets the script correctly without such tweaks. Another useful tool in the advanced editor is the ability to adjust inflections by clicking up to five points on a graph, though editing long scripts for emphasis can be time-consuming.

Scripts with different voices are another paid feature. For now, the service is only available as a website, with an app touted as "coming soon," and editing long scripts on a webpage can be cumbersome. MicMonster does allow saving projects, and the original text is retained even if emphasis data is lost upon reloading.

The subscription price is cheaper initially and then increases unless you contact them to cancel. Given my heavy usage needs, a one-time payment of over $100 for two years of service made sense for me. For making projects, multiple projects and advanced features are essential, and that's what you pay for with the subscription.

While MicMonster does not replicate your voice or offer an extensive API or tagging system, it uses a five-point graph for emphasis. This simpler approach works well enough, though it lacks fine-grained control.

Ultimately, I paid for MicMonster because of its sound quality and variety of voices. Audio at the high sampling rate is clean, which makes pitch adjustments feasible. While the "style" voices offer some options, they aren't yet fully developed for character voices. Instead, MicMonster seems better suited to clear, natural-sounding voiceovers and marketing videos.

Here's a brief rundown of what I've found:

  • Numerous English voices with various accents
  • High-quality sampling rate without a time restriction
  • Multiple voice styles, albeit inconsistently recorded
  • Project-saving features with some limits in advanced editing
  • No voice replication feature

For ambient noise, occasional NPCs, or secondary characters, MicMonster fits the bill. It even provides a commercial certificate of use, ensuring you can freely use your generated audio.

And, as I hit my 12,000-character limit, that concludes my review!
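For scripts that run past a character limit like the one mentioned above, one workaround is to split the text into pieces before pasting each into the editor. The following is a minimal Python sketch of that idea, not part of MicMonster itself: the function name `split_script` is hypothetical, and the 12,000-character default is simply the figure quoted in this review, so check your own plan for the real limit.

```python
import re

# Assumption: 12,000 characters per generation, taken from the limit the
# review mentions; the actual limit may differ by plan.
DEFAULT_LIMIT = 12_000


def split_script(text: str, limit: int = DEFAULT_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters,
    breaking after sentence-ending punctuation where possible."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Split on whitespace that follows ., ! or ? so sentences stay whole.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    demo = "First sentence. Second sentence! A third one? And a fourth."
    for number, chunk in enumerate(split_script(demo, limit=32), start=1):
        print(number, len(chunk), chunk)
```

Breaking at sentence boundaries keeps each generated clip from starting or ending mid-phrase, which makes the pieces easier to join in an audio editor afterward.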


Keywords

  • Text-to-speech
  • AI
  • MicMonster
  • Subscription
  • Sampling rate
  • Voice styles
  • Advanced editor
  • Audio quality
  • Natural-sounding voice
  • Accents

FAQ

Q: Does MicMonster offer a free trial? A: Yes, MicMonster provides a free trial, allowing you to test the voice quality and styles before committing to a subscription.

Q: What sets MicMonster apart from other text-to-speech services? A: MicMonster does not restrict audio sampling rates, ensuring high-quality sound without the "flutter" common at lower sampling rates. Most competitors lock higher sampling rates behind enterprise accounts.

Q: How fast does the service generate audio files? A: MicMonster can generate audio files in about 5 to 10 seconds, which is not fast enough for real-time interaction but suitable for projects where quality is more important than speed.

Q: Can I use MicMonster voices for secondary characters or placeholders in projects? A: Yes, MicMonster voices are suitable for secondary characters or placeholders, especially given their high-quality sampling rate and diverse accent options.

Q: Is there support for different voice styles? A: Yes, some voices come with multiple styles (cheerful, sad, shouting, whispering, etc.). However, these styles may not be consistently recorded across all voices.

Q: Does MicMonster offer voice replication? A: No, MicMonster does not offer a way to replicate your voice. Other services provide this but often limit the sampling rate unless you pay for an enterprise account.

Q: How user-friendly is the service? A: While the website interface could use improvement, it is functional. The advanced editor allows emphasizing inflection by selecting and adjusting points on a graph. However, editing long scripts can be cumbersome.

Q: Is the subscription worth it? A: For users who need high-quality, diverse voices along with multiple projects and the advanced features, the subscription can be worthwhile. However, expect a steeper price after the initial discounted subscription period unless you contact them to cancel.