The [speech] command.
Introduction
The speech command lets you interact with the speech
synthesizer. Alpha currently has one speech synthesizer which can be
manipulated with the following commands.
Synopsis
The formal syntax of the [speech] command is:
speech subcommand ?options?
The possible subcommands are described below. Depending on the
subcommand, some options may be specified.
Speech subcommands
The [attributes] subcommand
This subcommand returns the attribute dictionary of a voice.
The complete syntax is:
speech attributes voice
The voice argument is the name of a voice (see the
voices subcommand).
The dictionary contains the following keys:
- name
- The name of the voice suitable for display.
- age
- The perceived age (in years) of the voice.
- gender
- The perceived gender of the voice.
- demo
- A demonstration string to speak.
- locale
- The canonical locale identifier string describing the voice's locale.
Currently only en_US seems to be supported.
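As an illustration, the attribute dictionaries of all the voices could be listed as in the sketch below. The describeVoice proc is a hypothetical helper, not part of Alpha. The [info commands] guard is there only so that the snippet also loads in a plain tclsh, where [speech] does not exist.

```tcl
# Hypothetical helper: format a voice attribute dictionary on one line.
proc describeVoice {attrs} {
    set parts {}
    foreach key {name age gender locale} {
        # dict exists avoids an error if a key happens to be missing.
        if {[dict exists $attrs $key]} {
            lappend parts "$key=[dict get $attrs $key]"
        }
    }
    return [join $parts ", "]
}

# Only call [speech] where it is defined (that is, inside Alpha).
if {[llength [info commands speech]] > 0} {
    foreach voice [speech voices] {
        puts [describeVoice [speech attributes $voice]]
    }
}
```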
The [busy] subcommand
This subcommand tells whether the voice synthesizer is currently speaking.
The complete syntax is:
speech busy ?-a?
The -a option is used to find whether any application running on
the machine is currently synthesizing text.
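One possible use is to poll the synthesizer until it falls silent. The sketch below assumes that [speech busy] returns a Tcl boolean (0 or 1) and that the event loop is running so that [after] timers fire. The whenSpeechIdle proc is a hypothetical name introduced for this example.

```tcl
# Hypothetical helper: evaluate $script once the synthesizer is idle.
# Assumes [speech busy] returns a boolean value.
proc whenSpeechIdle {script {intervalMs 200}} {
    if {[speech busy]} {
        # Still speaking: check again after a short delay.
        after $intervalMs [list whenSpeechIdle $script $intervalMs]
    } else {
        uplevel #0 $script
    }
}

# Only call [speech] where it is defined (that is, inside Alpha).
if {[llength [info commands speech]] > 0} {
    whenSpeechIdle {puts "The synthesizer is idle."}
}
```

When the text is started from your own script, the -command option of [speech start] is probably a simpler way to be notified that speaking has finished.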
The [continue] subcommand
This subcommand resumes speaking after a call to [speech pause]. The
complete syntax is:
speech continue
The [pause] subcommand
This subcommand pauses synthesis in progress. The complete syntax is:
speech pause
The [reset] subcommand
This subcommand is used to set the synthesizer back to its default state.
The syntax is:
speech reset
For example, you can use this subcommand to restore the speech pitch and
the speech rate to their default values. Note that it applies to the
current voice only.
The [set] subcommand
This subcommand is used to get or set certain properties of the speech
synthesizer. The syntax can take two forms:
speech set option
speech set option value
The first form retrieves the value of an option. The second form is used
to set the value of an option.
The possible options are:
- -feedback
- this option tells whether the synthesizer should use the Speech
Feedback Window (if it is visible). The default value is 0. This option
doesn't open the feedback window for you: it must be activated via the Accessibility preferences of your Mac.
- -modulation
- it is the synthesizer's pitch modulation. Its value is a floating-point number in
the range of 0.00 to 127.00. These values correspond to MIDI note values,
where 60.00 is equal to middle C on a piano scale. A pitch modulation value of
0.00 corresponds to a monotone in which all speech is generated at the
frequency corresponding to the speech pitch (see the -pitch option). Given a
speech pitch value of 46.00, a pitch modulation of 2.00 means that the
pitch of the generated speech can range at most from 44.00 to 48.00.
- -pitch
- it is the synthesizer's baseline speech pitch. Its value is a floating-point
number. Typical voice frequencies range from around 90 hertz for a
low-pitched male voice to perhaps 300 hertz for a high-pitched child's
voice. These frequencies correspond to approximate pitch values in the
ranges of 30.00 to 40.00 and 55.00 to 65.00, respectively. The most useful
speech pitches fall in the range of 40.00 to 55.00.
- -rate
- it is the speaking rate, expressed in words per minute. Its value is a
floating-point number. Average human speech occurs at a rate of 180 to 220
words per minute.
- -voice
- it is the name of the current voice.
- -volume
- it is the speaking volume. Its value is a floating-point number
between 0.0 and 1.0.
Important: each voice has its own characteristics in terms of
pitch, modulation, rate and volume. So, when you want to modify these
properties for a given voice, you must select the voice first. If you
change, say, the rate first and then change the voice, the rate is reset
to the value defined by the new voice.
Note also that changes in speech pitch or modulation may not be
noticeable until the next sentence or paragraph is spoken.
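The following sketch illustrates both forms of [speech set], selecting the voice before adjusting its properties as recommended above. The pitchRange proc is a hypothetical helper that reproduces the pitch and modulation arithmetic described for the -modulation option. The Agnes voice and the numeric values are assumptions chosen for illustration, and the guard only lets the snippet load in a plain tclsh.

```tcl
# Hypothetical helper: widest pitch interval for a given baseline pitch
# and pitch modulation (pitch 46.0 with modulation 2.0 gives 44.0 to 48.0).
proc pitchRange {pitch modulation} {
    return [list [expr {$pitch - $modulation}] [expr {$pitch + $modulation}]]
}

# Only call [speech] where it is defined (that is, inside Alpha).
if {[llength [info commands speech]] > 0} {
    # Select the voice first, then change its properties.
    speech set -voice Agnes
    speech set -rate 200.0
    speech set -volume 0.8
    # The one-argument form reads a value back.
    puts "rate: [speech set -rate]"
    puts "pitch range: [pitchRange [speech set -pitch] [speech set -modulation]]"
}
```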
The [start] subcommand
This subcommand starts speaking a synthesized text through the system's
default sound output device. The complete syntax is:
speech start ?-voice str? ?-command str? ?-wordCommand str? text
The last argument is a string containing the text to speak. Note that this
command returns immediately: it does not wait for the text to be spoken
entirely.
The -voice option specifies the name of a voice: you can use
any of the names returned by the [speech voices] command. If this
option is not specified, a default voice is used: either a voice declared
with the [speech set] command or the default voice defined in the
System preferences.
The -command option specifies a
Tcl proc which will be invoked when the synthesizer starts speaking, and
when it has finished. This proc takes a single argument whose value is
either started or finished.
The -wordCommand option specifies
a Tcl proc which will be invoked before each word is spoken. This proc
takes as its single argument a list of two positions representing the range of
the word in the string. One use of this proc might be to visually highlight
the words being spoken.
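A minimal sketch of the two callbacks might look as follows. The proc names speechEvent, speechWord and wordAt are hypothetical. The sketch assumes that the two positions passed to the -wordCommand proc are the indices of the first and last characters of the word (inclusive), which this document does not state explicitly.

```tcl
set speakText "Hello from the speech synthesizer."

# Hypothetical -command callback: receives "started" or "finished".
proc speechEvent {state} {
    puts "synthesizer $state"
}

# Hypothetical helper: extract the word designated by a two-position range.
# Assumes both positions are inclusive character indices.
proc wordAt {text range} {
    lassign $range first last
    return [string range $text $first $last]
}

# Hypothetical -wordCommand callback: prints each word before it is spoken.
proc speechWord {range} {
    global speakText
    puts "speaking: [wordAt $speakText $range]"
}

# Only call [speech] where it is defined (that is, inside Alpha).
if {[llength [info commands speech]] > 0} {
    speech start -command speechEvent -wordCommand speechWord $speakText
}
```

If the second position turns out to be exclusive, or to be a length rather than an index, only the wordAt helper needs to be adjusted.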
The [status] subcommand
This subcommand returns information about the current state of the
synthesizer. The complete syntax is:
speech status
The returned value is a dictionary with key/value pairs. Currently the
following keys may be found in the dictionary: PhonemeCode,
NumberOfCharactersLeft, OutputBusy, OutputPaused.
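Since the text only says that these keys may be found, a cautious script can test for a key before reading it. The statusValue proc below is a hypothetical helper written for this example, and the guard only lets the snippet load in a plain tclsh.

```tcl
# Hypothetical helper: read one key of the status dictionary, falling
# back to a default value when the key is absent.
proc statusValue {status key {default ""}} {
    if {[dict exists $status $key]} {
        return [dict get $status $key]
    }
    return $default
}

# Only call [speech] where it is defined (that is, inside Alpha).
if {[llength [info commands speech]] > 0} {
    set status [speech status]
    puts "characters left: [statusValue $status NumberOfCharactersLeft 0]"
    puts "paused: [statusValue $status OutputPaused 0]"
}
```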
The [stop] subcommand
This subcommand stops the synthesis in progress for good. The complete
syntax is:
speech stop
The [voices] subcommand
This subcommand returns the list of the available voices. The complete syntax is:
speech voices
Examples
# Get the list of voices
speech voices
# Set up a string to speak
set speakText "There were a king with a large jaw\
and a queen with a plain face,\
on the throne of England."
# Start speaking with the Agnes voice
speech start -voice Agnes $speakText
# Pause, resume, then stop the synthesis
speech pause
speech continue
speech stop
Last updated 2019-11-27 14:13:09