.NET 3 lines of code to implement text-to-speech

Last updated 7/25/2022 9:42 PM
翔星 DotNet开发跳槽

In the era of artificial intelligence, text-to-speech is one of the more popular AI features. Most major vendors offer it as a service, exposing APIs that convert text to speech and can even imitate real human voices. Microsoft, the company behind .NET, offers such services as well. If your requirements for text-to-speech are modest, you can use Microsoft's own speech synthesis library, System.Speech, instead. The implementation is very simple: the core takes only three lines of code. This article shows how to use it.

Steps

  1. Environment Setup

Create a new console project and install the System.Speech package from NuGet (for example, with dotnet add package System.Speech). Alternatively, you can reference a local copy of the library.

  2. Add the directive using System.Speech.Synthesis; and enter the following code:
using System;
using System.Speech.Synthesis;

class Program
{
    static void Main(string[] args)
    {
        // Instantiate SpeechSynthesizer; the using declaration disposes it when Main exits.
        using SpeechSynthesizer synth = new SpeechSynthesizer();
        // Send the audio to the default output device.
        synth.SetOutputToDefaultAudioDevice();
        // Convert the string to speech (blocks until speaking finishes).
        synth.Speak("Hello! DotNet development career change");

        Console.WriteLine();
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}

Running the program speaks "Hello! DotNet development career change" through the default audio device. This is a console application; the same code also works in WinForms or WPF if you want a more polished interface.

Extended Example

Now let's implement the text-to-speech feature in a .NET web application. The requirement is to input a sentence in a text box, use the System.Speech library to convert it to audio, and load it on the page for playback.

  1. First, create a new .NET web application. Add the System.Speech NuGet package as in the previous example.

  2. Create an Index.cshtml page with a simple text box, a submit button, and an audio output control. The code is as follows:

<form>
  <p>
    Title: <input type="text" asp-for="Speektext" />
    <input type="submit" value="Generate Speech" />
  </p>
</form>
<audio style="width:350px;height:50px;" id="bofang" controls>
  <source src="@Model.filename" type="audio/wav" />
</audio>
  3. In the Index.cshtml.cs page, create properties for the input text and the file path, and inject the hosting environment in the constructor.
// IWebHostEnvironment replaces the obsolete IHostingEnvironment in ASP.NET Core 3.0 and later
private readonly IWebHostEnvironment _IhostingEnvironment;
// Text to be converted; SupportsGet lets the GET form bind to it
[BindProperty(SupportsGet = true)]
public string? Speektext { get; set; }
// Output file path, relative to the web root
public string filename { get; set; } = string.Empty;
// Inject the hosting environment to resolve the web root (wwwroot) path
public IndexModel(IWebHostEnvironment hostingEnvironment)
{
    _IhostingEnvironment = hostingEnvironment;
}
  4. Write the audio processing logic in the OnGetAsync method (the file also needs using System.Speech.Synthesis; and using System.Speech.AudioFormat;):
public Task OnGetAsync()
{
    string wavname = "test";
    string speechDir = System.IO.Path.Combine(_IhostingEnvironment.WebRootPath, "speech");
    // Make sure wwwroot/speech exists before writing to it
    System.IO.Directory.CreateDirectory(speechDir);
    string filePath = System.IO.Path.Combine(speechDir, $"{wavname}.wav");
    // Delete any file left over from a previous request
    if (System.IO.File.Exists(filePath))
    {
        System.IO.File.Delete(filePath);
    }
    // Fall back to a default sentence when nothing was submitted
    if (string.IsNullOrEmpty(Speektext))
        Speektext = "Hello! Welcome to follow 'dotnet development career change!'";
    using (SpeechSynthesizer synth = new SpeechSynthesizer())
    {
        // Write to a wave file: 32 kHz sample rate, 16 bits per sample, mono
        synth.SetOutputToWaveFile(filePath, new SpeechAudioFormatInfo(32000, AudioBitsPerSample.Sixteen, AudioChannel.Mono));
        // PromptBuilder starts empty and offers methods to add content, select a voice, and control voice properties and pronunciation
        PromptBuilder builder = new PromptBuilder();
        builder.AppendText(Speektext);
        // Synthesize the prompt into the file
        synth.Speak(builder);
    }
    // Return the file path relative to the web root, using URL-style slashes
    filename = $"/speech/{wavname}.wav";
    // Nothing here is awaited, so return a completed task
    return Task.CompletedTask;
}
  5. Use JavaScript on the frontend to load the audio into the control.
<input type="button" id="aa" value="Play" onclick="bofang()" />
<script type="text/javascript">
  function bofang() {
      // Escape any backslashes so the path survives inside the JavaScript string literal
      var url = "@Model.filename.Replace("\\","\\\\")";
      var audio = document.getElementById('bofang');
      // Point the audio element at the generated file, then play it
      audio.src = url;
      audio.play();
  }
</script>

Now you can enter text in the text box, and the audio will be generated and played. You can encapsulate this code and turn it into, for example, a news voice reader.
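One thing to watch if you encapsulate this: OnGetAsync always writes to the same test.wav, so two requests with different text would overwrite each other's audio. A minimal sketch of one way around that, assuming you are free to choose the file name, is to derive the name from a hash of the text. SpeechFileNames and its methods are hypothetical names introduced here for illustration; they are not part of System.Speech or of the original sample.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not part of the original sample): maps the input
// text to a stable .wav file name so each distinct sentence gets its own file.
public static class SpeechFileNames
{
    // Returns a 16-hex-character name plus ".wav", derived from the text.
    public static string ForText(string text)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(text ?? string.Empty));
        // A prefix of the hash keeps the name short; collisions are unlikely but possible.
        string shortHash = Convert.ToHexString(hash).Substring(0, 16).ToLowerInvariant();
        return shortHash + ".wav";
    }

    // Physical path under the web root: <webRootPath>/speech/<hash>.wav.
    public static string PhysicalPath(string webRootPath, string text)
        => Path.Combine(webRootPath, "speech", ForText(text));

    // URL path for the <audio> element; always uses forward slashes.
    public static string UrlPath(string text)
        => "/speech/" + ForText(text);
}
```

In OnGetAsync, filePath would then come from SpeechFileNames.PhysicalPath(_IhostingEnvironment.WebRootPath, Speektext) and filename from SpeechFileNames.UrlPath(Speektext). Skipping synthesis when the file already exists would also give a simple cache, at the cost of files accumulating in the speech folder.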

Sample code link:

https://pan.baidu.com/s/1IJadMleVEM3ePHE_KHqRqA?pwd=soiq
Extraction code: soiq

Conclusion

This article introduced a simple way to do text-to-speech in .NET; you can use it as a reference and wrap it in your own methods. Note that this example works only on Windows; for cross-platform projects, you will need to explore other options on your own. I hope this article is a useful reference for your study and work. Thank you for your support.
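Because the example only works on Windows, code that may also run elsewhere can check the platform before touching the synthesizer. The following is a rough sketch under the assumption that the project targets .NET 5 or later (for OperatingSystem.IsWindows and SupportedOSPlatform); PlatformSpeech and TrySpeak are hypothetical names, not part of the original sample.

```csharp
using System;
using System.Runtime.Versioning;
using System.Speech.Synthesis;

// Hypothetical helper (not from the original sample).
public static class PlatformSpeech
{
    // Speaks the text and returns true on Windows; returns false when the
    // text is empty or the program is running on another operating system.
    public static bool TrySpeak(string text)
    {
        if (string.IsNullOrWhiteSpace(text) || !OperatingSystem.IsWindows())
        {
            return false;
        }
        SpeakOnWindows(text);
        return true;
    }

    // Kept in its own method so the synthesizer is only created on Windows.
    [SupportedOSPlatform("windows")]
    private static void SpeakOnWindows(string text)
    {
        using SpeechSynthesizer synth = new SpeechSynthesizer();
        synth.SetOutputToDefaultAudioDevice();
        synth.Speak(text);
    }
}
```

A caller can then write bool spoken = PlatformSpeech.TrySpeak("Hello!"); and fall back to some other behavior when the result is false.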

Reference: Microsoft Official Technical Documentation

Webmaster's Extension

Based on the original article .NET 3 Lines of Code to Implement Text-to-Speech Functionality, the webmaster created a MAUI Blazor project, followed the steps, added the NuGet package, and successfully implemented text-to-speech. The key code is as follows:

@page "/"
@using System.Speech.Synthesis

<h1>Please enter the text to convert, then click the play button</h1>

<MRow>
    <MCol Cols="12" Md="4">
        <MTextField @bind-Value="_message" Label="Text to convert"></MTextField>
    </MCol>
</MRow>
<MRow>
    <MCol Cols="12" Md="4">
        <MButton Class="mr-4" Color="@(string.IsNullOrEmpty(_message) ? "warning" : "success")" OnClick="PlayWord">Play</MButton>
    </MCol>
</MRow>

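@code{

    private string _message;
    private SpeechSynthesizer _synth;

    private void PlayWord()
    {
        // Nothing to speak until the user has typed something.
        if (string.IsNullOrWhiteSpace(_message))
        {
            return;
        }
        // Create the synthesizer on first use and reuse it for later clicks.
        if (_synth == null)
        {
            _synth = new SpeechSynthesizer();
            _synth.SetOutputToDefaultAudioDevice();
        }
        _synth.Speak(_message);
    }

}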
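Speak is synchronous, so in a UI project like this the click handler does not return until the whole sentence has been spoken. A possible variation, sketched here as an assumption rather than as the webmaster's method, queues the speech with SpeakAsync and cancels anything still playing first. SpeechPlayer is a hypothetical wrapper class introduced for illustration.

```csharp
#nullable enable
using System;
using System.Speech.Synthesis;

// Hypothetical wrapper (not from the original article): speaks without
// blocking the caller, and replaces any speech that is still in progress.
public sealed class SpeechPlayer : IDisposable
{
    private SpeechSynthesizer? _synth;

    // Returns false when there is nothing to speak, true once speech is queued.
    public bool Play(string? message)
    {
        if (string.IsNullOrWhiteSpace(message))
        {
            return false;
        }
        // Create the synthesizer on first use, as the page code above does.
        if (_synth == null)
        {
            _synth = new SpeechSynthesizer();
            _synth.SetOutputToDefaultAudioDevice();
        }
        // Drop any queued or currently playing prompts, then queue the new text.
        _synth.SpeakAsyncCancelAll();
        _synth.SpeakAsync(message);
        return true;
    }

    public void Dispose() => _synth?.Dispose();
}
```

In the page, PlayWord would then reduce to a single call such as _player.Play(_message), with _player being a SpeechPlayer field that is disposed along with the component.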