:exclamation: conrad is a reboot of mscstts. Instead of httr, which is superseded and not recommended, we use httr2 to perform HTTP requests to the Microsoft Cognitive Services Text to Speech REST API.
conrad serves as a client to the Microsoft Cognitive Services Text to Speech REST API. The Text to Speech REST API supports neural text to speech voices, which support specific languages and dialects that are identified by locale. Each available endpoint is associated with a region.
Before you use the Text to Speech REST API, you must register a valid account with Microsoft Azure Cognitive Services and obtain an API key. Without an API key, this package will not work.
Install the CRAN version:
install.packages("conrad")
Or install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("fhdsl/conrad")
1. Click + Create a resource (below “Azure services” or click on the Hamburger button).
2. Select Create -> Speech.
3. Choose a Pricing tier (you can choose the free version with Free F0).
4. Click Review + create, review the Terms, and click Create.
5. If the deployment was successful, you should see :white_check_mark: Your deployment is complete on the next page.
6. Under Next steps, click Go to resource.
7. Under Resource Management, click Keys and Endpoint.
8. Copy KEY 1 or KEY 2 to clipboard. Only one key is necessary to make an API call.

Once you complete these steps, you have successfully retrieved your API keys to access the API.
:warning: Remember your Location/Region, which you use to make calls to the API. Specifying a different region will lead to an HTTP 403 Forbidden response.
You can set your API key in a number of ways:

- Edit ~/.Renviron and set MS_TTS_API_KEY = "YOUR_API_KEY"
- In R, use options(ms_tts_key = "YOUR_API_KEY").
- Add export MS_TTS_API_KEY=YOUR_API_KEY in .bash_profile/.bashrc if you’re using R in the terminal.
- Pass api_key = "YOUR_API_KEY" in arguments of functions such as ms_list_voice(api_key = "YOUR_API_KEY").
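As a rough illustration of how these options could fit together, the sketch below looks up a key from a function argument, then the `ms_tts_key` option, then the `MS_TTS_API_KEY` environment variable. The helper name `resolve_api_key()` and the precedence order are assumptions made for this example; they are not taken from conrad's source.

```r
# Hypothetical helper (not part of conrad): find an API key in the
# places listed above. The precedence order is an assumption.
resolve_api_key <- function(api_key = NULL) {
  # 1. An explicitly supplied argument wins
  if (!is.null(api_key) && nzchar(api_key)) {
    return(api_key)
  }
  # 2. Then the R option set with options(ms_tts_key = ...)
  key <- getOption("ms_tts_key", default = "")
  if (nzchar(key)) {
    return(key)
  }
  # 3. Finally the MS_TTS_API_KEY environment variable
  #    (set in ~/.Renviron or exported in the shell)
  key <- Sys.getenv("MS_TTS_API_KEY", unset = "")
  if (nzchar(key)) {
    return(key)
  }
  stop("No API key found; set MS_TTS_API_KEY or options(ms_tts_key = ...).")
}

# Usage: an explicit argument is returned unchanged
resolve_api_key(api_key = "YOUR_API_KEY")
```

Failing early with a clear message when no key is found is a design choice for the sketch; it mirrors the note above that the package will not work without a key.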
ms_list_voice() uses the tts.speech.microsoft.com/cognitiveservices/voices/list endpoint to get the full list of voices for a specific region. It attaches the region as a prefix to this endpoint. For example, to get a list of all the voices for the westus region, it uses the https://westus.tts.speech.microsoft.com/cognitiveservices/voices/list endpoint.
:warning: Be sure to specify the Speech resource region that corresponds to your API Key.
ms_list_voice(api_key = "YOUR_API_KEY", region = "westus")
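To make the region prefix concrete, here is a minimal sketch of how such a URL could be assembled in base R. The function name `voices_list_url()` is hypothetical and only illustrates the URL pattern described above; conrad's internal implementation may differ.

```r
# Hypothetical helper (not part of conrad): build the regional
# voices/list endpoint by prefixing the region to the host name.
voices_list_url <- function(region = "westus") {
  stopifnot(is.character(region), length(region) == 1, nzchar(region))
  paste0(
    "https://", region,
    ".tts.speech.microsoft.com/cognitiveservices/voices/list"
  )
}

# Usage: the URL for the westus region
voices_list_url("westus")
```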
ms_synthesize() uses the tts.speech.microsoft.com/cognitiveservices/v1 endpoint to convert text to speech. The endpoint requires Speech Synthesis Markup Language (SSML) to specify the language, gender, and full voice name.
:warning: Be sure to specify the Speech resource region that corresponds to your API Key.
library(conrad)
# Convert text to speech
res <- ms_synthesize(script = "Hello world, this is a talking computer", region = "westus", gender = "Male")
# Returns hexadecimal representation of binary data
# Create file to store audio output
output_path <- tempfile(fileext = ".wav")
# Write binary data to output path
writeBin(res, con = output_path)
# Play audio in browser
play_audio(audio = output_path)
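For readers unfamiliar with SSML, the sketch below shows roughly what a request body carrying the language, gender, and full voice name might look like. The helper `build_ssml()` is hypothetical, and the voice name `en-US-GuyNeural` is used purely as an illustration; the SSML that ms_synthesize() actually generates may be structured differently.

```r
# Hypothetical helper (not part of conrad): assemble a small SSML
# document that names the language, gender, and full voice name.
build_ssml <- function(script, voice, gender = "Male", language = "en-US") {
  # Escape characters that are special in XML ("&" must go first)
  script <- gsub("&", "&amp;", script, fixed = TRUE)
  script <- gsub("<", "&lt;", script, fixed = TRUE)
  script <- gsub(">", "&gt;", script, fixed = TRUE)
  paste0(
    "<speak version='1.0' xml:lang='", language, "'>",
    "<voice xml:lang='", language, "' xml:gender='", gender,
    "' name='", voice, "'>",
    script,
    "</voice></speak>"
  )
}

# Usage: the voice name here is an assumed example
ssml <- build_ssml(
  "Hello world, this is a talking computer",
  voice = "en-US-GuyNeural"
)
cat(ssml, "\n")
```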
ms_get_token() makes a request to the issueToken endpoint to get an access token. The function requires an API key and region as inputs. The access token is used to send requests to the API.
:warning: Be sure to specify the Speech resource region that corresponds to your API Key.
ms_get_token(api_key = "YOUR_API_KEY", region = "westus")
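The following is a minimal httr2 sketch of what a request to the issueToken endpoint could look like. The helper `build_token_request()` is hypothetical, and the URL pattern and `Ocp-Apim-Subscription-Key` header are assumptions based on the public Azure Speech REST API rather than on conrad's source. The request is only constructed here, not sent.

```r
library(httr2)

# Hypothetical helper (not part of conrad): construct, but do not send,
# a POST request to the regional issueToken endpoint.
build_token_request <- function(api_key, region = "westus") {
  url <- paste0(
    "https://", region,
    ".api.cognitive.microsoft.com/sts/v1.0/issueToken"
  )
  req <- request(url)
  req <- req_method(req, "POST")
  # The subscription key travels in a request header
  req_headers(req, `Ocp-Apim-Subscription-Key` = api_key)
}

req <- build_token_request(api_key = "YOUR_API_KEY", region = "westus")
req$url

# Sending it requires a real key and network access, for example:
# token <- resp_body_string(req_perform(req))
```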
With mscstts::ms_synthesize(), the problem arose due to the use of an invalid voice within the HTTP request, specifically concerning the chosen region. For instance, the SSML might have contained a voice name that was not supported in the westus region. As a consequence, the server would reject the HTTP request.

We believe that these improvements will greatly enhance the usability of the package and make it even more reliable in the long term.
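One way to guard against a voice and region mismatch is to check the requested voice against the voices available for the region before sending anything. The sketch below is an assumption about how such a check could look, not a description of conrad's implementation; `check_voice()` and the short voice list are made up for illustration.

```r
# Hypothetical helper (not part of conrad): stop early if the requested
# voice is not among the voices known for the chosen region.
check_voice <- function(voice, available_voices, region) {
  if (!voice %in% available_voices) {
    stop(
      sprintf("Voice '%s' is not available in region '%s'.", voice, region),
      call. = FALSE
    )
  }
  invisible(voice)
}

# Made-up subset of voices; a real list would come from the
# voices/list endpoint for the region.
westus_voices <- c("en-US-GuyNeural", "en-US-JennyNeural")

# Usage: passes silently for a listed voice
check_voice("en-US-GuyNeural", westus_voices, region = "westus")
```

Checking locally turns a rejected HTTP request into an immediate, readable error message.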
conrad wouldn’t be possible without prior work on mscstts by John Muschelli and httr2 by Hadley Wickham.