The [speech] command.

Introduction

The speech command lets you interact with the speech synthesizer. Alpha currently has one speech synthesizer, which can be manipulated with the subcommands described below.

Synopsis

The formal syntax of the [speech] command is:

speech subcommand ?options?

The possible subcommands are described below. Depending on the subcommand, some options may be specified.

Speech subcommands

The [attributes] subcommand

This subcommand returns the attribute dictionary of a voice. The complete syntax is:

    speech attributes voice

The voice argument is the name of a voice (see the voices subcommand). The dictionary contains the following keys:

name: The name of the voice suitable for display.
age: The perceived age (in years) of the voice.
gender: The perceived gender of the voice.
demo: A demonstration string to speak.
locale: The canonical locale identifier string describing the voice's locale. Currently only en_US seems to be supported.
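
As an illustration, the attribute dictionary can be read with Tcl's standard dict commands. The sketch below assumes that the returned value is an ordinary Tcl dictionary containing the keys listed above; the proc name describeVoice is hypothetical.

```tcl
# Build a one-line description of a voice from its attribute dictionary.
# Assumes [speech attributes] returns a regular Tcl dict with the keys
# documented above; describeVoice is a hypothetical helper name.
proc describeVoice {voice} {
    set attrs [speech attributes $voice]
    return [format "%s (%s, age %s, locale %s)" \
        [dict get $attrs name] \
        [dict get $attrs gender] \
        [dict get $attrs age] \
        [dict get $attrs locale]]
}
```

Calling describeVoice with one of the names returned by [speech voices] should then yield a short summary string for that voice.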

The [busy] subcommand

This subcommand tells whether the voice synthesizer is currently speaking. The complete syntax is:

    speech busy ?-a?

The -a option checks whether any application running on the machine is currently synthesizing text.
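
A script that needs to act once speaking has ended could poll this subcommand. The following is only a sketch: it assumes that [speech busy] returns a boolean value (1 while speaking, 0 otherwise), and whenSpeechDone is a hypothetical helper name.

```tcl
# Poll the synthesizer and run a script once it has stopped speaking.
# Assumes [speech busy] returns a boolean (1 while speaking, 0 otherwise);
# whenSpeechDone is a hypothetical helper name.
proc whenSpeechDone {script {intervalMs 200}} {
    if {[speech busy]} {
        # Still speaking: check again later without blocking.
        after $intervalMs [list whenSpeechDone $script $intervalMs]
    } else {
        # Nothing is being spoken anymore: run the script at global level.
        uplevel #0 $script
    }
}
```

Rescheduling with [after] rather than looping keeps the caller responsive while the text is being spoken.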

The [continue] subcommand

This subcommand resumes speaking after a call to [speech pause]. The complete syntax is:

    speech continue

The [pause] subcommand

This subcommand pauses synthesis in progress. The complete syntax is:

    speech pause

The [reset] subcommand

This subcommand is used to set the synthesizer back to its default state. The syntax is:

    speech reset

For example, you can use this subcommand to restore the speech pitch and speech rate to their default values. Note that it applies to the current voice.

The [set] subcommand

This subcommand is used to get or set certain properties of the speech synthesizer. The syntax can take two forms:

    speech set option 
    speech set option value

The first form retrieves the value of an option. The second form is used to set the value of an option.
The possible options are:

-feedback: tells whether the synthesizer should use the Speech Feedback Window (if it is visible). The default value is 0. This option does not open the feedback window for you: it must be activated via the Accessibility preferences of your Mac.
-modulation: the synthesizer's pitch modulation. Its value is a floating-point number in the range 0.00 to 127.00. These values correspond to MIDI note values, where 60.00 is equal to middle C on a piano scale. A pitch modulation value of 0.00 corresponds to a monotone in which all speech is generated at the frequency corresponding to the speech pitch (see the -pitch option). Given a speech pitch value of 46.00, a pitch modulation of 2.00 would mean that the widest possible range of pitches corresponding to the actual frequency of generated text would be 44.00 to 48.00.
-pitch: the synthesizer's baseline speech pitch. Its value is a floating-point number. Typical voice frequencies range from around 90 hertz for a low-pitched male voice to perhaps 300 hertz for a high-pitched child's voice. These frequencies correspond to approximate pitch values in the ranges of 30.00 to 40.00 and 55.00 to 65.00, respectively. The most useful speech pitches fall in the range of 40.00 to 55.00.
-rate: the speaking rate, expressed in words per minute. Its value is a floating-point number. Average human speech occurs at a rate of 180 to 220 words per minute.
-voice: the name of the current voice.
-volume: the speaking volume. Its value is a floating-point number between 0.0 and 1.0.
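
To relate pitch values to frequencies, the sketch below uses the standard equal-temperament MIDI convention (note 69 tuned to 440 hertz), which is consistent with 60.00 being middle C. Whether the synthesizer follows this tuning exactly is an assumption, and the proc names are hypothetical.

```tcl
# Convert between MIDI-style pitch values and frequencies in hertz,
# assuming standard equal temperament with note 69 tuned to 440 Hz.
proc pitchToHertz {pitch} {
    return [expr {440.0 * pow(2.0, ($pitch - 69.0) / 12.0)}]
}
proc hertzToPitch {hertz} {
    return [expr {69.0 + 12.0 * log($hertz / 440.0) / log(2.0)}]
}

# Frequency range covered by a given pitch and pitch modulation, for
# instance pitches 44.00 to 48.00 for a pitch of 46.00 and a modulation
# of 2.00.
proc pitchRangeHertz {pitch modulation} {
    return [list [pitchToHertz [expr {$pitch - $modulation}]] \
                 [pitchToHertz [expr {$pitch + $modulation}]]]
}
```

Under this convention, pitchToHertz 60 gives roughly 261.6 hertz, the usual frequency of middle C.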

Important: each voice has its own pitch, modulation, rate and volume settings. To modify these properties for a given voice, you must therefore select that voice first. If you change, say, the rate first and then change the voice, the rate will be reset when the new voice is selected.
Note also that changes in speech pitch or modulation may not be noticeable until the next sentence or paragraph is spoken.
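
The ordering constraint described above can be sketched as follows. The helper name useVoice is hypothetical; the option names are those listed in this section.

```tcl
# Select a voice, then adjust its rate and volume, in that order.
# useVoice is a hypothetical helper; the option names come from the
# list of options of the [speech set] subcommand.
proc useVoice {voice rate volume} {
    speech set -voice $voice    ;# change the voice first
    speech set -rate $rate      ;# these now apply to the new voice
    speech set -volume $volume
    return [speech set -rate]   ;# read back the rate now in effect
}
```

Reading the rate back with the one-argument form of [speech set] is a simple way to check which value the synthesizer actually retained.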

The [start] subcommand

This subcommand starts speaking a synthesized text through the system's default sound output device. The complete syntax is:

    speech start ?-voice str? ?-command str? ?-wordCommand str? text

The last argument is a string containing the text to speak. Note that this command returns immediately: it does not wait for the text to be spoken entirely.
The -voice option specifies the name of a voice: you can use any of the names returned by the [speech voices] command. If this option is not specified, a default voice is used: either a voice declared with the [speech set] command or the default voice defined in the System preferences.
The -command option specifies a Tcl proc which will be invoked when the synthesizer starts speaking, and when it has finished. This proc takes a single argument whose value is either started or finished.
The -wordCommand option specifies a Tcl proc which will be invoked before each word is spoken. This proc takes as its single argument a list of two positions representing the range of the word in the string. One use of this proc might be to visually highlight the words being spoken.
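
The two callbacks might be set up as in the sketch below. The proc and variable names are hypothetical, and the sketch assumes that the two positions passed to the word callback are zero-based indices with an inclusive end position, which the text above does not specify.

```tcl
# Callbacks for [speech start]. The proc and variable names are
# hypothetical; only the callback arguments are taken from the
# description of the -command and -wordCommand options.
proc speechStateChanged {state} {
    # state is either "started" or "finished"
    set ::speechDemo(state) $state
}
proc speechWordSpoken {range} {
    # range is a list of two positions delimiting the word. This sketch
    # assumes zero-based indices with an inclusive end position.
    lassign $range first last
    set ::speechDemo(word) [string range $::speechDemo(text) $first $last]
}
proc speakWithCallbacks {text} {
    # Remember the text so that the word callback can extract each word.
    set ::speechDemo(text) $text
    speech start -command speechStateChanged \
        -wordCommand speechWordSpoken $text
}
```

Since [speech start] returns immediately, the callbacks are the place to update any state that depends on the progress of the synthesis.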

The [status] subcommand

This subcommand returns information about the current state of the synthesizer. The complete syntax is:

    speech status

The returned value is a dictionary with key/value pairs. Currently the following keys may be found in the dictionary: PhonemeCode, NumberOfCharactersLeft, OutputBusy, OutputPaused.
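
Because the keys are only said to be possibly present, a script may want to test for a key before reading it. The sketch below toggles between pausing and resuming; it assumes that the OutputPaused entry, when present, holds a boolean value, and togglePause is a hypothetical helper name.

```tcl
# Pause or resume speech depending on the synthesizer's current state.
# Assumes the OutputPaused entry, when present, holds a boolean value;
# togglePause is a hypothetical helper name.
proc togglePause {} {
    set status [speech status]
    if {[dict exists $status OutputPaused] && [dict get $status OutputPaused]} {
        speech continue
        return resumed
    }
    speech pause
    return paused
}
```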

The [stop] subcommand

This subcommand stops the synthesis in progress for good, as opposed to merely pausing it. The complete syntax is:

    speech stop

The [voices] subcommand

This subcommand returns the list of the available voices. The complete syntax is:

    speech voices
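
The returned list can be combined with the attributes subcommand, for instance to keep only the voices of a given locale. This is a sketch under the assumption that [speech voices] returns a Tcl list of names accepted by [speech attributes]; voicesForLocale is a hypothetical helper name.

```tcl
# Return the voices whose locale attribute matches the given identifier.
# Assumes [speech voices] returns a Tcl list of names accepted by
# [speech attributes]; voicesForLocale is a hypothetical helper name.
proc voicesForLocale {locale} {
    set matches {}
    foreach voice [speech voices] {
        if {[dict get [speech attributes $voice] locale] eq $locale} {
            lappend matches $voice
        }
    }
    return $matches
}
```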

Examples

# Get the list of voices
speech voices
# Set up a string to speak
set speakText "There were a king with a large jaw\
	and a queen with a plain face,\
	on the throne of England."
# Start speaking with the Agnes voice
speech start -voice Agnes $speakText
# Pause the synthesis, resume it, then stop it
speech pause
speech continue
speech stop

Last updated 2019-11-27 14:13:09