Speech Synthesiser for the
Psion Organiser II

© Copyright Rovoreed Limited 1991.

All rights reserved. This Speech Synthesiser manual and software referred to herein are copyrighted works of Rovoreed Limited, England. Reproduction in whole or part, including utilisation in machines capable of reproduction or retrieval, without the express written permission of Rovoreed Limited is prohibited. Reverse engineering is also prohibited.

The information in this document is subject to change without notice.

This software, hardware and instruction manual are sold "as is" without warranty as to their performance, merchantability or fitness for any particular purpose. The entire risk as to the results and performance of this product is assumed by you.


Table of Contents

Section 1 - Introduction
  1. Introduction
  2. Installing the Speech Synthesiser Unit
  3. Removing the Speech Synthesiser Unit
  4. How it works

Section 2 - User Guide
  5. Using SETUP
     5.1 Leaving SETUP
  6. The EDITOR
     6.1 Using the EDITOR
     6.2 EDITOR Reference

Section 3 - Programmers Guide
  7. Programmers Guide

Appendix A
Appendix B


Section 1 - Introduction

1. Introduction

The Speech Synthesiser provides your Psion Organiser with another output medium. By default it will read out the text displayed on the screen, therefore it immediately interfaces to all packages and applications you might already have. It can also be made to read out text not displayed on the screen, thereby increasing the amount of information your Psion Organiser can present to you at any one time.

The inbuilt dictionary of over 900 words includes principal UK towns, colours, numbers, activities, error messages, and these may be supplemented by a user defined dictionary to provide further specialised vocabulary. The user defined dictionary is constructed using the in-built allophone editor. There is no restriction on the size of this user vocabulary, other than the size of the datapack on which it is held.

To summarise, the facilities provided are :-

  1. Ability to read straight off the screen.
  2. Inbuilt vocabulary of over 900 words.
  3. Ability to add user defined vocabulary.
  4. Programmer's interface.

2. Installing the Speech Synthesiser Unit

To use the Speech Synthesiser, first switch off your Organiser by selecting OFF from the top level menu. Slide open the door at the top of the Organiser & insert the Speech Synthesiser unit into the socket, making sure it slides in all the way until you feel a click. The unit will only go in one way.

To load the Speech Synthesiser software into the Organiser, press ON/CLEAR twice. The first press switches the Organiser on and the second press loads the software. The Speech Synthesiser is now ready for use and a new menu option, SPEECH, will have been inserted into the main Organiser menu, just before the last option, OFF.

The Speech Synthesiser software occupies about 4k of memory, so if the internal memory of the machine is already nearly full, the OUT OF MEMORY message may be displayed. If this happens, delete any non-vital files or records from A: or use the TIDY option in the DIARY to clear some memory, or move some files from A: to a datapack.

If no error message is displayed & the SPEECH option does not appear, switch the machine off, remove the Speech Synthesiser unit, carefully re-insert it & try again.

The unit contains a socket for a mains adaptor lead. You do not have to use a mains adaptor when using the Speech Synthesiser, but there are obvious advantages in doing so whenever you are conveniently near a mains socket.

The unit also contains a 3.5mm jack socket suitable for connecting a set of headphones (not supplied) of the kind commonly used on personal stereo systems. These headphones could be of use when creating your own vocabulary, or at any other time when you don't wish the unit to annoy other people!

3. Removing the Speech Synthesiser Unit

After using the Speech Synthesiser unit, switch off the Organiser by selecting OFF. To remove the unit from the Organiser, press the click switch & pull.

The software will still be stored in memory however, and should be removed by pressing ON/CLEAR to switch the machine back on, and at the top level menu, pressing ON/CLEAR again. The SPEECH option should now have disappeared from the main menu, and the memory used by the speech synthesiser will have been freed for other uses.

If the unit is unplugged during use, a DEVICE MISSING error will be reported. For obvious reasons, this should be avoided.

After removal of the Speech Synthesiser unit and software, 19 bytes of memory will remain allocated. These 19 bytes contain your preferred setups, and are used to save you having to configure the software each time you use it. Details on how to free up these 19 bytes will be found in the "Using SETUP" section.

4. How it works

To make full and proper use of the synthesiser it helps to have an understanding of how it works.

Basically, there are 3 types of computer generated speech.

The first, pulse code modulation (PCM), which is no more than digital recording, storage and playback, requires about 70,000 bits per second of speech. To say just a few words using this method would require several hundred thousand bits (tens of thousands of bytes) of data. Although the Organiser can now have over half a megabyte "online", you can see that this is not really an acceptable method for the vast range of words that will need to be spoken in any particular application.

The second method, linear predictive coding (LPC), which predicts a speech sample from a weighted combination of previous samples, requires only one to two thousand bits per second of speech. Using this method, approximately 15 words can be stored in 2k bytes of memory. Again, creating any reasonable vocabulary would exhaust the capacity of the Organiser.
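The storage figures quoted for PCM and LPC can be checked with a little arithmetic. The sketch below is written in Python purely as an illustration (it is not Organiser code). The bit rates are the ones quoted above; the three-second phrase length and the function name are assumptions.

```python
def speech_bytes(bits_per_second, seconds):
    """Bytes needed to hold `seconds` of speech at the given bit rate."""
    return bits_per_second * seconds / 8

PCM_RATE = 70_000            # bits per second, as quoted for PCM
LPC_RATES = (1_000, 2_000)   # bits per second, the quoted LPC range
PHRASE = 3                   # seconds; an assumed length for "a few words"

pcm_bytes = speech_bytes(PCM_RATE, PHRASE)
lpc_bytes = [speech_bytes(rate, PHRASE) for rate in LPC_RATES]

print(f"PCM: {pcm_bytes:,.0f} bytes for {PHRASE} s")
print(f"LPC: {lpc_bytes[0]:,.0f} to {lpc_bytes[1]:,.0f} bytes for {PHRASE} s")

# How many seconds of LPC speech fit in 2k bytes at each rate.
lpc_seconds_in_2k = [2048 * 8 / rate for rate in LPC_RATES]
print(f"2k bytes of LPC data lasts {lpc_seconds_in_2k[1]:.1f} to "
      f"{lpc_seconds_in_2k[0]:.1f} s")
```

On these assumptions a three-second PCM phrase needs a little over 26,000 bytes, while 2k bytes of LPC data lasts roughly 8 to 16 seconds, which is in line with the figure of about 15 words.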

The final method, allophone synthesis, has the major advantage of providing an unlimited vocabulary, since the stored units are not words, but individual speech sounds (allophones). These allophones are strung together to form complete words. Although completely understandable, the speech quality is not as good as it is for PCM or LPC.

The Speech Synthesiser for the Psion Organiser uses this allophone method of "talking". A vocabulary of over 900 words has been built up using these allophones and is available for immediate use.

A full description of allophones and how to use them will be found in Appendix B. It will only be necessary to read that section if you intend to create your own vocabulary.

When the software is first loaded, it will automatically intercept all diary alarms and read them out. This is because the software can read directly off the screen, and, when a diary alarm occurs, the text is displayed on the screen.

Imagine the diary entry "Check a/c has 200". The string is split up into distinct items. These items are words, numbers and special characters. Each item ends when a delimiter is detected. The delimiter depends on what the item is.

Words start with a letter and continue until something that is not a letter is detected. In the above example the first word is "Check" because there is a space after the "k", but also notice that the "a" from "a/c" will be treated as a word in its own right, as will "c". The end of the string also terminates a word. Upper & lower case is irrelevant so "CHECK", "check", "Check" and "cHeCk" are all equivalent.

Numbers are any string of numeric characters. Scientific notation is allowed so "1.3e-10" is valid as a number. Numbers are terminated when anything is detected that would cause the number to become invalid. e.g. "1.2.3" is treated as two numbers, "1.2" and "0.3" because the second decimal point terminated the first number.

Special characters are usually the delimiters themselves, and are almost always 1 character long. The special case is space, where multiple spaces are treated as a single space.
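The splitting rules above can be modelled in a few lines. This is only a sketch in Python of the behaviour described, not the synthesiser's own code: the regular expression, the function name and the item labels are all assumptions. The second number in "1.2.3" comes out as ".3", which corresponds to the "0.3" mentioned above.

```python
import re

# One alternative per item type; numbers are tried first, then words,
# then runs of spaces, then any other single character.
TOKEN = re.compile(r"""
    (?P<number>(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?)
  | (?P<word>[A-Za-z]+)
  | (?P<space>\ +)
  | (?P<special>.)
""", re.VERBOSE | re.DOTALL)

def split_items(text):
    """Split a string into (kind, value) items."""
    items = []
    for match in TOKEN.finditer(text):
        kind = match.lastgroup
        value = match.group()
        if kind == "word":
            value = value.upper()      # case is irrelevant for words
        elif kind == "space":
            value = " "                # multiple spaces count as one
        items.append((kind, value))
    return items

for item in split_items("Check a/c has 200"):
    print(item)
```

With the diary entry used above, "a" and "c" come out as separate words with the "/" between them as a special character, and "200" is a single number.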

When a word is detected it is looked up in the vocabulary file to see how to pronounce it. The software will look in ONLY two vocabulary files for each word. It will look first in your own vocabulary file and if it is not found there, or if you have not defined your own file, or if the software cannot find your file, then the word is looked up in the inbuilt vocabulary. If the word is not found there then the word is spelt out letter by letter.
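The search order just described can be sketched as follows. This is an illustrative Python model, not the real implementation: the dictionaries stand in for the vocabulary files, the user entry for ABLE is made up, and `None` is used here to represent a user file that is undefined or cannot be found. The inbuilt entry for ABLE is the one shown in the EDITOR Reference.

```python
def pronounce(word, user_vocab, inbuilt_vocab):
    """Return the allophone string for a word, or the word spelt out."""
    key = word.upper()                       # case is irrelevant
    for vocab in (user_vocab, inbuilt_vocab):
        if vocab and key in vocab:           # the user file is searched first
            return vocab[key]
    return " ".join(key)                     # not found: letter by letter

INBUILT = {"ABLE": "EY BB1 PA1 EL"}
USER = {"ABLE": "EY BB1 EL"}                 # hypothetical redefinition

print(pronounce("able", None, INBUILT))      # no user file: inbuilt entry
print(pronounce("able", USER, INBUILT))      # user entry takes priority
print(pronounce("zzz", USER, INBUILT))       # unknown word is spelt out
```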

The range of words available in the in-built vocabulary file is too large to print in this manual. Indeed, the vocabulary may vary for custom versions of this software. A program that will list the entire contents of the inbuilt vocabulary is given in Appendix A.

Obviously there will be some words you use often that will not be in the inbuilt vocabulary, and these you will have to add to your own file. Details on how to do this are in chapter 6. Note however that it is inadvisable to make your own vocabulary file too large, since this will have an adverse effect on the speed with which the software will be able to find the next word before it has to say it. If at all possible, create several small vocabulary files and switch between them, either under program control or via the SETUP option within the SPEECH menu item.

Numbers are treated in different ways depending on how the software has been configured. Suffice it to say a number like "200" can be pronounced as "two zero zero", "two hundred" or even "two hundred pounds". See chapter 5 (CCY option) for more details.

Special characters are ignored by default and they will not cause any output from the speech synthesiser. However, the software can be set up so that each delimiter does cause output. See chapter 5 (TEXT option) for more details.


Section 2 - User Guide

5. Using SETUP

When you select the SETUP option, you are presented with a full list of the parameters which alter the way in which the Speech Synthesiser operates. The list is arranged with the name of the parameter on the left of the line and the current setting on the right. Use the up & down arrow keys to scroll up and down the list.

The SETUP parameters and the things they set are as follows:

VOCAB  User defined vocabulary file.
CCY    Currency name used in amounts.
TEXT   Enables/disables pronunciation of punctuation characters.
PAUSE  Delay used when spelling.
VIDEO  Enables/disables reading of onscreen text.
SETUP  Allows you to free up the 19 bytes of allocated memory where your preferred setups are always held.

To edit a parameter value, first select the parameter using the UP and DOWN keys. The selected parameter is indicated with a ">" symbol after the parameter name. Change the selected parameter by pressing the RIGHT & LEFT keys to select a value from a list or by pressing EXE & then entering a value. Which method you use depends upon which parameter you are changing, see below under the relevant heading.

Pressing ON/CLEAR sets the selected parameter to its original value.

VOCAB

The Speech Synthesiser works by looking up each word that it wants to say in an inbuilt dictionary file. Appendix A contains a small program that will list the contents of this inbuilt file. If the word is not found, then it is spelt out letter by letter. This can become tedious, so a mechanism exists for creating your own vocabulary file (see chapter 6 on the Editor) and telling the speech synthesiser where your file is. In fact, your file will be searched BEFORE the inbuilt vocabulary, so it is possible to redefine words that already exist. For example, when "Bath" is looked up it can be made to say "B - AH - TH" rather than "B - AR - TH" or, more extreme still, when "pavement" is looked up it could say "sidewalk"!

The VOCAB parameter is used to indicate to the speech synthesiser software where this user defined file is, and what it is called.

If the VOCAB parameter is supplied then it must be a full filename, i.e. complete with a device name. The speech synthesiser software will not search packs looking for your file. It expects it to be on the pack you say it is & if it can't find it, then there is no warning or error, it simply carries on. If the software cannot find your file and the word doesn't exist in the inbuilt vocabulary, then the word is spelt out letter by letter. This is the only indication you will get that there may be a problem with this parameter.

It should also be noted that this file should ONLY contain data generated using the allophone editor. If, for example, you set this parameter to A:MAIN and you already use that file to hold names and addresses, then you will cause the speech synthesiser to treat those names and addresses as its own database, which will cause strange noises to be emitted. No actual harm will occur to the speech synthesiser; it will just not talk properly.

To change this parameter, either the LEFT or RIGHT arrow must be pressed. This "steps" the cursor into the field. Initially the field will have shown "None" and thus there is nothing there to edit. If a filename had already been specified, it will be presented for editing.

The following keys have special meaning in edit mode. Up arrow moves to the left end of the word, down arrow moves to the right hand side, left & right arrows move left and right, clear clears, delete deletes the character to the left, shift delete deletes the character to the right, any other characters get inserted at the cursor position. Exe is used to terminate the edit.

You must enter something that looks like a valid filename otherwise an error will be displayed & you will be left to correct your mistake. Note that it is not necessary for the file to exist. This parameter can be reset to "None" by using ON/CLEAR to clear the field & then EXE to terminate the edit.
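As a rough idea of what "looks like a valid filename" might mean, here is a Python sketch of such a check. The pattern is an assumption: it requires one of the device letters A:, B: or C: and then a name of up to 8 letters or digits starting with a letter, which follows the usual Organiser file naming convention. The manual does not spell out the exact rule the software applies.

```python
import re

# Device letter, colon, then a name of 1 to 8 characters starting with a
# letter. This pattern is an assumed rule, not taken from the software.
FILENAME = re.compile(r"[ABC]:[A-Z][A-Z0-9]{0,7}", re.IGNORECASE)

def looks_like_filename(text):
    """True if text has the shape device:name; the file need not exist."""
    return FILENAME.fullmatch(text) is not None

for candidate in ("A:MAIN", "a:myfile", "MYFILE", "D:WORDS"):
    print(candidate, looks_like_filename(candidate))
```

An empty field is a separate case and is not handled by this check, since clearing the field resets the parameter to "None".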

CCY

The LEFT & RIGHT arrows can be used to toggle this parameter through the valid settings. This parameter affects how the speech synthesiser will treat numbers.

OFF (Default)
Each digit in the number is treated as a separate entity. Consider the number 12.345. When this parameter is set to OFF, this number will be pronounced as "one two point three four five".

NONE
The number is now treated in a more English manner. Consider the number 12.345. When this parameter is set to NONE, this number will be pronounced as "twelve point three four five". The software knows about numbers up to 999,999,999,999, so it can also say "hundred", "thousand" and "million" as appropriate.

POUNDS & DOLLARS
The number is now treated as a currency amount. The number will be rounded to two decimal places (12.345 becomes 12.35) and then treated in an English manner throughout, to be pronounced as "twelve pounds thirty five" or "twelve dollars thirty five". A single pound or dollar will cause the "s" to be dropped, as one would expect.
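The settings above can be modelled with a short Python sketch. It is an illustration of the documented results only, not the synthesiser's own logic. Assumptions: the input is a plain string of digits with at most one decimal point (scientific notation is not handled), whole numbers are limited to the thousands and millions shown, and no "and" is inserted after "hundred", since the manual does not say whether the software does so.

```python
from decimal import Decimal, ROUND_HALF_UP

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def words_below_1000(n):
    parts = []
    if n >= 100:
        parts += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        parts.append(TENS[n // 10])
        n %= 10
        if n:
            parts.append(ONES[n])
    elif n or not parts:
        parts.append(ONES[n])
    return " ".join(parts)

def integer_words(n):
    if n < 1000:
        return words_below_1000(n)
    for value, name in ((1_000_000, "million"), (1000, "thousand")):
        if n >= value:
            head, rest = divmod(n, value)
            text = integer_words(head) + " " + name
            return text + (" " + integer_words(rest) if rest else "")

def speak_number(text, ccy="OFF"):
    """Return the words spoken for a number under a given CCY setting."""
    if ccy == "OFF":                       # digit by digit
        return " ".join("point" if ch == "." else ONES[int(ch)] for ch in text)
    if ccy == "NONE":                      # whole part as a number
        whole, _, frac = text.partition(".")
        spoken = integer_words(int(whole))
        if frac:
            spoken += " point " + " ".join(ONES[int(ch)] for ch in frac)
        return spoken
    # POUNDS or DOLLARS: round to two places, then speak as an amount.
    amount = Decimal(text).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    units, cents = divmod(int(amount * 100), 100)
    name = ccy.lower()[:-1] if units == 1 else ccy.lower()
    spoken = integer_words(units) + " " + name
    if cents:
        spoken += " " + integer_words(cents)
    return spoken

for setting in ("OFF", "NONE", "POUNDS", "DOLLARS"):
    print(setting, "->", speak_number("12.345", setting))
```

Decimal arithmetic is used for the rounding step so that 12.345 rounds up to 12.35, as in the manual's example; binary floating point would not guarantee this.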

PAUSE

The LEFT & RIGHT arrows can be used to toggle this parameter through the range of valid values. This pause length is used between each letter when spelling out unknown words. By default it is set to PA2 which may be a little too fast for some people. The full list of values and the corresponding delay times in milliseconds are :-

PA1 = 10 ms
PA2 = 30 ms (Default)
PA3 = 50 ms
PA4 = 100 ms
PA5 = 200 ms

TEXT

The LEFT & RIGHT arrows can be used to toggle this parameter ON or OFF. It will normally be left OFF. This parameter is analogous to the transparent text mode found on some terminals. Every character, as it is found, is treated as an entity in its own right. Words will always be spelt out and numbers will be spoken a digit at a time: this parameter overrides the setting of CCY, which behaves as if it had been set to OFF. More importantly, all punctuation characters will be spoken as well.

Eight bit ASCII will be converted to seven bit ASCII, therefore it is not possible to pronounce the extended character set.

The default value for this parameter is OFF.

VIDEO

The LEFT & RIGHT arrows can be used to toggle this parameter ON or OFF. When ON, text that is displayed on screen will be output through the speech synthesiser. There are two other conditions that must be fulfilled: the screen image must not change for a period of 2 seconds, and there must be no packs being accessed at the time. When OFF, this video interception is disabled.
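The three conditions for reading the screen can be written as a single test. This is a minimal Python sketch of the rule as stated, with assumed names; how the software actually times the screen or detects pack access is not described here.

```python
QUIET_PERIOD = 2.0   # seconds the screen must remain unchanged

def should_read_screen(video_on, seconds_since_change, pack_busy):
    """True only when all three documented conditions hold."""
    return video_on and seconds_since_change >= QUIET_PERIOD and not pack_busy

print(should_read_screen(True, 2.5, False))   # stable screen, packs idle
print(should_read_screen(True, 0.5, False))   # screen changed too recently
print(should_read_screen(True, 2.5, True))    # a pack is being accessed
```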

The default value for this parameter is ON.

SETUP

Normally this parameter will be set to SAVED. This means that when you remove the speech synthesiser and the software driver, your setups will remain saved in memory. These setups take up 19 bytes of memory. On the LZ range, they show up when using the Info option in the Utils sub-menu. If you do not wish to have your setups saved in this fashion, then toggle this parameter to LOST by using the LEFT or RIGHT arrow. You must then leave SETUP (see next section), switch off the Organiser, unplug the synthesiser unit, and press ON/CLEAR twice. Your setups will then not be saved. If you simply set this parameter to LOST, return to the top level menu and press ON/CLEAR twice (maybe by mistake), then this parameter will have reverted to SAVED. This is because the second press of ON/CLEAR causes all devices to unload and then reload, and the loading of the synthesiser software sets this parameter to SAVED.

The default value for this parameter is SAVED.

5.1 Leaving SETUP

When the parameters have been set to the required values, press the MODE key and a menu containing the following items is presented:-

EXIT     Exit SETUP keeping any changes.
ABANDON  Exit SETUP discarding any changes.
EDIT     Return to the parameter list to continue editing.
RESET    Reset to the default SETUP settings.

Pressing EXE selects the EXIT item which exits SETUP making the adjusted parameters current.

If you have been experimenting with the SETUP parameter editor, it is a good idea to set all the parameters back to their original values by selecting RESET.

6. The EDITOR

The Editor is used to edit your own personalised vocabulary files. It can find, amend and delete existing words in your file, or enter new ones.

As mentioned in the introduction, the speech synthesiser operates using allophones. The editor is used to build sounds up from the allophones and to assign the sound to a word. This word and string of allophones is then saved in a file for later use by the speech synthesiser software. In this way, the known vocabulary can be extended from the basic vocabulary supplied, to include your own specialised words.

6.1 Using the EDITOR

Imagine now you want to add a word to your own vocabulary file. The following gives a step by step guide to adding a new word. A more comprehensive description of each editor function is given in section 6.2 which should be treated as a reference section.

Select "Editor" from the SPEECH sub-menu. If you have already told the software what filename you are using, either using the SETUP menu item or a call to SAYSETUP:, then that filename will be presented to you for confirmation or amendment. If you haven't entered a filename, then you must enter one here. Entering or changing a filename here will not actually modify the parameters displayed by SETUP. To make use of any words you create during the edit session, you must change the VOCAB parameter, either manually with the SETUP menu option or by calling SAYSETUP: from within a program.

A blank filename will cause the editor to exit back to the SPEECH sub-menu. Remember, the filename must contain a valid device letter, (A:, B: or C:).

The editor is a bit more selective than setup about what can be entered as a file name. If you had supplied a name like "C:MYFILE" in the setup option, then, there must at least be a pack plugged into the bottom slot for the editor to get further.

After the filename is entered, you are placed directly into edit mode. If you have specified a file that already exists, then the first word in that file will be displayed ready for editing. If the file you specify does not yet exist, you will be prompted with the message "Create? Y/N". Entering "N" will return you back to the SPEECH sub-menu. Entering a "Y" will create the file and you will be placed in edit mode. For a new or empty file you will be presented with a blank screen.

Whatever happened above, the cursor will be flashing on the bottom line of the screen at the left hand side.

Assume you want to use the word "penknife". It may not be necessary. The two words "pen" and "knife" already exist. If you use the word "pen-knife" on screen or as an argument to SAY: the software will look up both parts of the word, and say "penknife". Therefore there is no real need to create that word. The hyphen acts as a delimiter, thus causing the software to treat each part of the word separately.

Many words can be built up in this way, e.g. "block-ing".

Assume we want to add the word "ORIENTAL". We could start by looking for a similar word. Press MODE, then "F" and the "Find:" prompt appears. Type in "ENTAL" followed by EXE - maybe we already have something like "CONTINENTAL", or "MENTAL". Doing a search like this may save us some work. If no match is found, a message "NOT FOUND" is briefly displayed and the editor returns to edit mode. If a word is found, it is displayed ready for editing. To find the next occurrence of the word, hit MODE again, then "F". The last search key is still there, so just hit EXE, and the next occurrence is returned. If there is only one occurrence on file, then you will keep getting the same one.

Assume the above "Find" operation didn't find anything, and we have to put the whole thing in. So that the rest of this step-by-step lesson makes sense, if the above "Find" did find something, you should delete all the allophones by pressing On/Clear.

The next thing to do is to add the word. Hit MODE then "W", and the "Word:" prompt will appear. If the above "Find" operation found something, then it can be deleted by hitting On/Clear. Now type in "ORIENTAL". It doesn't matter if it is in upper or lower case, because it will be converted to upper case by the editor.

Next the allophones have to be added.

Type "O", "R", space. After the space is hit, the synthesiser will say the sound corresponding to the allophone OR.

Type "I", "Y", space. After the space is hit, the synthesiser will say the whole word so far generated.

Next type "EH", "EH". Don't forget the space after each allophone mnemonic. The "EH" allophone is a bit special in that several can be placed together to increase the length of the sound. It is not possible to do this with all allophones. Only the ones indicated in Appendix B have this property.

Next type "N", "N". At this point the editor knows that the next character will either be a space or a number. It will automatically shift the keyboard into numeric mode, so when you type "U", it actually comes out as a "1".

Finally type in "TT2" and "EL".

We now have a basic word. It doesn't sound that good. Using the left arrow move down the word so that the cursor is on the "IY". As the cursor moves, the word is spoken out. Another way of hearing the word again and again is to simply hit the space or exe keys. Any other alphabetic keys cause that character to be output on the screen as part of an allophone. Also, the DEL key will delete the allophone the cursor is on completely. The On/Clear key deletes the whole string of allophones. Type in "RR1". As you type, the string of allophones expands to accept the new allophone.

Using the right arrow, move the cursor up to an "EH" allophone. Hit the DEL key. The "EH" allophone disappears and the word sounds a little better, but maybe the "T" sound is too strong. Move the cursor up to the "TT2" allophone and hit the down arrow. The up and down arrows are used to get allophones of a similar sound.

If you are now happy with the sound, hit the Mode key. This brings up the Editor sub-menu. Hit "S" for save, and the word is stored away for immediate use. After saving, the word is still available for editing should you wish to add suffixes or prefixes.

6.2 EDITOR Reference

On entry, you will be prompted for a file in which to save all words as they are generated. If you have already told the software what filename you are using - either using the SETUP menu item, or a call to SAYSETUP: then that filename will be presented to you for confirmation, or amendment. A blank filename will cause the editor to exit back to the SPEECH sub-menu.

After the filename is entered, you are placed directly into edit mode. If you have specified a file that already exists, then the first word in that file will be displayed ready for editing. For example:

On CM&XP

ABLE   EY BB1 PA1 EL

On LZ and LZ64

< 8:50a   ABLE   EY BB1 PA1 EL

The cursor will be under the "E" of EY.

If the file you specify does not yet exist, you will be prompted with the message "Create? Y/N". Entering "N" will return you back to the SPEECH sub-menu. Entering a "Y" will create the file and you will be placed in edit mode. For a new or empty file you will be presented with a blank screen, but, the cursor will be flashing on the bottom line of the screen.

All editing is done on the bottom line of the screen. The line above it shows the word that this sound will be saved as. You can use the left and right arrow keys to move up and down the word, and as the cursor moves through the word it is pronounced an allophone at a time.

Allophones are entered simply by typing the allophone mnemonic.

A full list of all the allophones and the sounds they produce is given in Appendix B. When entering an allophone there is no need to worry about shifting between letters and digits. The editor will handle this for you. The editor thinks that all allophones are 3 characters long, so if you need to use a 2 character allophone then the final character MUST be a space. If you mis-type a character, then the DEL key can be used to delete the character, and if the allophone mnemonic doesn't exist, then the editor will simply delete it after you end it.

Certain keys have special meaning during editing of allophones, and these are summarised below.

Up Arrow / Down Arrow
changes the current allophone into an allophone from the same "family". These families are, Stops, Fricatives, Affricates, Nasals, Resonants, High, Middle and Low vowels, and finally Pauses. Some of these families of sounds are quite small, for example affricates has just the CH and JH sounds. If you keep hitting the same arrow key, you will eventually get back to the original allophone mnemonic, it will take longer with some families than others. It is not possible to use the arrow keys to jump into another family. The families are structured so that mnemonics ending in digits are sequential, so if the cursor is on for example GG1, UP ARROW will change it to GG2, and again to GG3. Conversely if the cursor is on for example KK3, DOWN ARROW will change it to KK2 and again to KK1.

DEL       deletes the current allophone.
On/Clear  clears the whole line of allophones.
Mode key  will bring up the Editor sub-menu.
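The way the up and down arrows step through a family, wrapping back to the start, can be sketched in Python as below. Only the affricates family is given in full above; the GG list is just an illustrative fragment, since the real family containing GG1 to GG3 has more members (see Appendix B) and so would not wrap from GG3 straight back to GG1. The function name is an assumption.

```python
def step_allophone(family, current, direction):
    """Move up (+1) or down (-1) within a family, wrapping at the ends."""
    index = family.index(current)
    return family[(index + direction) % len(family)]

AFFRICATES = ["CH", "JH"]            # the complete family, as stated above
GG_FRAGMENT = ["GG1", "GG2", "GG3"]  # part of a larger family, for illustration

print(step_allophone(GG_FRAGMENT, "GG1", +1))   # UP ARROW from GG1
print(step_allophone(GG_FRAGMENT, "GG3", -1))   # DOWN ARROW from GG3
print(step_allophone(AFFRICATES, "JH", +1))     # wraps round to CH
```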

FIND

Find is used to locate a word within your vocabulary file. You may wish to do this if there is a similar sounding word to the one you wish to create, or indeed you may wish to find a word to delete it.

SAVE

Save is used to save the currently displayed item into your vocabulary file. A brief message "Saving..." is displayed as the data is stored away and the editor will then return to edit mode.

If you have not entered a word for the string of allophones, the save operation will terminate with an error message.

WORD

This option allows you to enter or change the word which will be saved in your vocabulary. The allophones will be saved alongside this word. It is necessary to enter both the word, and the allophones for the system to work correctly.

When selected the current word (if any) will be made available for editing, and the standard Psion editing rules apply. Up arrow moves to the left end of the word, down arrow moves to the right hand side, left & right arrows move left and right, clear clears, delete deletes the character to the left, shift delete deletes the character to the right, any other characters get inserted at the cursor position. Exe is used to terminate the edit.

Before accepting the word, the editor performs some simple validation. Any leading or trailing spaces will be automatically stripped from the word entered. The word is then scanned to ensure that it only contains alphabetic characters. It is not possible to enter hyphenated words, or words containing any digits or special characters. The reason is that, when the software is scanning the screen looking for things to say, a dash or hyphen acts as a delimiter. So, even if the editor did allow the entry of hyphenated words, they would never be sought in the vocabulary file. If you need hyphenated words, create two words, one for each part.
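The validation just described amounts to a strip followed by an alphabetic check. The Python sketch below models it. Returning `None` for a rejected entry is an assumption made for the example, as the manual does not say how the editor reports the failure; the conversion to upper case follows the editor behaviour described in section 6.1.

```python
def validate_word(entry):
    """Return the word as it would be stored, or None if it is rejected."""
    word = entry.strip(" ")                 # leading/trailing spaces removed
    if word.isascii() and word.isalpha():   # letters only, no digits or hyphens
        return word.upper()                 # stored in upper case
    return None

print(validate_word("  oriental "))
print(validate_word("pen-knife"))
```

A hyphenated entry such as "pen-knife" is rejected, so "PEN" and "KNIFE" would be entered as two separate words.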

DELETE

This deletes the current item from the vocabulary file.

After selecting this option you will be prompted "Sure? Y/N". Entering "Y" will cause the current item to be deleted. Entering "N" will return you into edit mode. Although this will appear to work 100% of the time, to really delete the word, it is necessary to have first found the entry using the "Find" option detailed above.

EXIT

When selected you will be prompted "Exit? Y/N". Entering "Y" will cause the program to return to the SPEECH menu display. Entering "N" will return you into edit mode. All other characters are ignored.


Section 3 - Programmers Guide

7. Programmers Guide

This chapter assumes that you are familiar with the Organiser programming language (OPL). For the LZ range of Organisers, OPL is described fully in the Programming Manual, and for the CM and XP models, it is described in detail in the Operating Manual that came with your Organiser.

When the Speech Synthesiser is fitted to the Organiser, OPL programs may directly access the speech synthesiser by making calls to the OPL language extensions. These are summarised below.

SAYSETUP  Sets the speech synthesiser parameters.
SAY       Says a string.
SAYALLO   Pronounces one or more allophones.

Note that the user can abort any of these functions by pressing ON/CLEAR, in which case the error 206 (Escape) will be raised. This feature can be disabled by using the ESCAPE OFF command.

If any other error occurs an error is "raised" in the normal way. See the Error Handling chapter in your Organiser manual for a description of OPL error handling facilities. If the OPL program does not trap errors, the program will terminate when an error occurs, displaying an error message that corresponds to the error number.

The following procedure names are used internally:

SPEECH SPEECHST SPEECHDG

and you should avoid using these names for your own procedures.

7.1 SAYSETUP

The SAYSETUP procedure is used to change or set the current operating parameters without having to use the SETUP option under the SPEECH menu item. Parameters which are changed by SAYSETUP remain changed when the OPL program terminates.

The syntax is :-

SAYSETUP:(vocab$,ccy%,pause%,text%,video%)

The meanings of the parameters are described fully in chapter 5 "Using SETUP". Except for the trailing % and $, the procedure parameter names used here correspond to the parameter names used in that section. It is not possible, by using a call to SAYSETUP:, to configure the software to lose the 19 bytes of memory used to save the setups. It is only possible to change this parameter by using the SPEECH sub-menu item SETUP.

Any procedure parameter may be set to -1 in which case the current value of that parameter is not changed by the call to SAYSETUP.

For example, the line :-

SAYSETUP:(-1,-1,1,-1,-1)

sets the PAUSE parameter to 1 (PA2) without affecting the current setting of the other SETUP parameters.

If there are trailing parameters which you do not wish to change, then they may be omitted. The line :-

SAYSETUP:(-1,-1,1)

has the same effect as the previous example, and another example:-

SAYSETUP:("a:myfile")

resets the user vocabulary file to "a:myfile".

Omitting all the parameters to SAYSETUP: is a special case & has the effect of restoring the complete set of parameters to their default values.

As in SETUP, the speech parameters may only take their value from a list of allowed values.

Parameter  Default       Allowed values
vocab$     Empty string  Any valid file name. A device must be specified. The file does not have to exist.
ccy%       0             0 to 3 (Off, None, Pounds, Dollars)
pause%     1             0 to 4 (PA1, PA2, PA3, PA4, PA5)
text%      0             0 to 1 (Off, On)
video%     1             0 to 1 (Off, On)

An example:-

SAYSETUP:(-1,2)

sets the currency to Pounds.
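
The rules described above (-1 leaves a value unchanged, trailing parameters may be omitted, no parameters restores the defaults) can be modelled in a few lines. The sketch below is written in Python for checking on a desktop machine; it is not OPL and is not the synthesiser's own code. The function name say_setup and the dictionary layout are hypothetical. The fifth parameter is named video% in the table and alarm% in the syntax line; the sketch uses the table's name. Raising an error for a value outside the allowed range is an assumption, as the manual does not say what happens in that case.

```python
# Python model of the documented SAYSETUP: parameter rules (not OPL).
# Defaults and ranges are copied from the parameter table in this section.
DEFAULTS = {"vocab": "", "ccy": 0, "pause": 1, "text": 0, "video": 1}
RANGES = {"ccy": (0, 3), "pause": (0, 4), "text": (0, 1), "video": (0, 1)}
ORDER = ["vocab", "ccy", "pause", "text", "video"]


def say_setup(current, *args):
    """Return the settings after a call; `current` itself is not modified."""
    if not args:
        # Special case: no parameters restores every default.
        return dict(DEFAULTS)
    if len(args) > len(ORDER):
        raise TypeError("too many parameters")
    new = dict(current)
    # Omitted trailing parameters are simply never visited by zip().
    for name, value in zip(ORDER, args):
        if value == -1:
            continue  # -1 means "leave this setting alone"
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                # Assumption: the manual does not describe this case.
                raise ValueError(f"{name} must be {low} to {high}")
        new[name] = value
    return new


settings = say_setup(DEFAULTS, -1, 2)      # like SAYSETUP:(-1,2)
settings = say_setup(settings, -1, -1, 3)  # like SAYSETUP:(-1,-1,3)
```

After these two calls the model holds ccy = 2 (Pounds) and pause = 3, with the other three settings still at their defaults.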

The default action of the Speech Synthesiser is to read out all on-screen text. You may wish to disable this function whilst your program is running, and restore it again after the program terminates. You will need two calls like:-

MYPROG:
SAYSETUP:(-1,-1,-1,-1,0)
REM Your code goes here
SAYSETUP:(-1,-1,-1,-1,1)

It might be prudent to remind you at this stage that if you are using many "-1" parameters it is more efficient to use "$FFFF" instead. However, if you are in the habit of keeping the OPL source code in your Organiser along with the object code, then the 1 byte of object code saved by using $FFFF instead of -1 is offset by the 3 extra bytes needed to hold the extra source. The decision is up to you and how you work.

7.2 SAY

The SAY procedure is the procedure that will be used most often. The OPL statement :-

SAY:("good morning")

will do just that.

As with all string based parameters, there is a limit of 255 characters to the string. Therefore, if you need to say more than this in one part of your program, then you will need to make two or more consecutive calls. All the rules in the setup, such as the way numbers are pronounced & transparent text are obeyed.
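
If the text to be spoken is prepared on another machine before being loaded into the Organiser, it can be cut into pieces of at most 255 characters in advance, one piece per call to SAY:. The sketch below does this in Python (not OPL). The function name split_for_say is hypothetical, and breaking at word boundaries is a design choice of the sketch rather than a requirement stated in this manual.

```python
import textwrap

SAY_LIMIT = 255  # maximum string length stated in this section


def split_for_say(text, limit=SAY_LIMIT):
    """Split text into pieces of at most `limit` characters.

    Breaks fall at whitespace where possible, so that a word is not
    cut in two between one SAY: call and the next.
    """
    return textwrap.wrap(text, width=limit)


pieces = split_for_say("word " * 200)  # 1000 characters of input
```

Each element of pieces would then be passed to its own SAY: call, in order.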

If it is necessary to have some words spelt out whilst others are spoken, it will be necessary to intersperse calls to SAY: with calls to SAYSETUP:. However, it is unlikely that this will be necessary, but obviously it will depend on your application.

7.3 SAYALLO

This is a more primitive function than SAY: & is unlikely to be used, but it is documented here for completeness. It is used to pronounce single or multiple allophones. Each allophone is specified by its decimal value. A complete list of allophones can be found in Appendix B. Any attempt to specify an allophone outside the valid range (0 to 63) will simply cause the number to be truncated to the least significant 6 bits, thus bringing it into the range 0 to 63. Each allophone is separated from its neighbour by any non-numeric character, so :-

SAYALLO:("27,7,45,0,45,53")
SAYALLO:("27 7 45 0 45 53")
SAYALLO:("27a7B45,,45.53")

are all equivalent, and

SAYALLO:("HELLO")

is equivalent to six PA1 allophones.
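
The examples above suggest how the string is read: every non-numeric character ends a field, an empty field counts as 0 (PA1), and each number is cut down to its lowest 6 bits. The following Python model (not OPL) is inferred from those examples only, not taken from the synthesiser's code. The function name parse_allophones is hypothetical, and the treatment of an empty string is an assumption.

```python
import re


def parse_allophones(text):
    """Model of how SAYALLO: appears to read its string argument.

    Every non-digit character separates two fields; an empty field is
    taken as 0 (PA1); each value is masked to its 6 least significant bits.
    """
    fields = re.split(r"[^0-9]", text)
    # Assumption: an empty string yields a single empty field, i.e. one PA1.
    return [int(field or "0") & 0x3F for field in fields]


hello = parse_allophones("27a7B45,,45.53")
pauses = parse_allophones("HELLO")
```

In this model the five letters of "HELLO" act as five separators around six empty fields, which matches the six PA1 allophones described above.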

Appendix A.

The inbuilt vocabulary file can be accessed using the following program - which will work on either the CM, XP, or LZ range; and will list the entire contents of the file before terminating. The inbuilt file is sorted in alpha-numeric sequence.

VOCAB:
ESCAPE ON
OPEN "d:vocab",a,w$
WHILE NOT EOF
  PRINT a.w$
  NEXT
ENDWH

You may stop the program at any time by pressing On/Clear followed by Q.

Obtaining a hardcopy of the inbuilt vocabulary is a little more difficult. The vocabulary first has to be copied to the internal RAM (A:) or to a datapack, and printed from there using Comms Link. Unfortunately, it is not possible to use the Copy function provided in the top level menu (on XP/CM) or in the Xfiles sub-menu on the LZ range. A simple program has to be written. Thus :-

CYVOCAB:
IF EXIST("a:vocab") :DELETE "a:vocab" :ENDIF
COPY "d:vocab","a:"

After that program has been run, the following program can be used to generate a hardcopy of the vocabulary.

PRVOCAB:
LOCAL fil$(10),a$(20),i%,l$(1)
fil$="a:vocab"
OPEN fil$,a,w$
WHILE NOT EOF
  a$=a.w$
  REM Start a new block when the initial letter changes
  IF l$<>LEFT$(a$,1)
    l$=LEFT$(a$,1) :i%=0
    LPRINT : LPRINT
  ELSEIF (i% AND 3)=0
    REM Four words per printed line
    LPRINT
  ENDIF
  REM Pad each word to a 15 character column
  WHILE LEN(a$)<15
    a$=a$+" "
  ENDWH
  LPRINT a$," ";
  i%=i%+1
  NEXT
ENDWH
LPRINT
LPRINT
LPRINT "Total words=",COUNT

Appendix B

In order to successfully use a set of allophone sounds to synthesise words, there are a few preliminary points which should be made about speech and language. First, there is no one-to-one correspondence between written letters and the sounds of a language; second, speech sounds are not discrete units like beads on a string; and lastly, speech sounds are acoustically different depending on what position in a word they occur, and what sounds precede or follow them.

The first of these is a problem which a child encounters when learning to read. Each sound in a language may be represented by more than one letter, and conversely, each letter may represent more than one sound. For example, "meet" is an example of the former whilst "people" is an example of the latter for just the letter "e". Because of these spelling irregularities we must be careful to think in terms of sounds, not letters, when dealing with speech.

The second point to be made concerns segmentation of the speech signal. An adult who has learnt how to read usually thinks of the acoustic stream of speech as a string of discrete sounds which he/she calls by the letter names. But, in fact, speech is a continuously varying signal which cannot be easily broken into distinct sound-size units. For example, if one attempts to extract the "b" from "bat" by taking successively larger chunks of signal from the beginning of the word, one at first hears a non-speech noise, and then at some point hears "ba". In other words, there is no point at which the "b" sound can be heard in isolation; one hears either a non-speech sound, or the syllable "ba".

Finally, the most important point to make for users of the allophone set, is that the acoustic signal of a speech sound may differ depending on whether it occurs in word initial or word final position; or in the environment of a vowel which is articulated in the front or back of the mouth, a long or short vowel, or a voiced or voiceless consonant. For example, the initial "p" in "pop" will be acoustically different from the "p" in "spy", and may be different from the final "p" in "pop". Furthermore, the ear will perceive the same acoustic signal differently depending on what sounds precede or follow it. The word "cot" can be made to sound like "cod" by lengthening the duration of the "o", and the converse is also true.

It will be useful to know what the speech sounds of English are. The sounds of a language are called phonemes, and each language has a set which is slightly different from those of other languages. It is for this reason, that the Speech Synthesiser only supports the English language, and even then, says it with an American accent. (The speech processor chip is of American design).

It will be useful to remember that sounds which have features in common behave in similar ways. For example, the voiceless stop consonants "PP", "TT" and "KK" require 50 to 80 msec of silence before them and the voiced stop consonants "BB", "DD" and "GG" require 10 to 30 msec of silence before them. When you find a particular technique that works well with one sound, try using that same technique with similar sounds. For example, if you decide that "KK1" sounds good before a front vowel ("IY"), use it before other front vowels ("YR", "IY", "IH", "EY", "EH", "XR", "AE").

The allophone set contains 2 or 3 versions of some phonemes. You may find that you need to use one allophone of a particular phoneme for word or syllable initial position and another for word or syllable final position. The following pages give a complete list of the allophone set together with a set of guidelines for using them. Note that these are suggestions and not rules.

One of the differences between initial and final position versions of a consonant is that the initial version may be longer. Therefore, to create an initial SS, you can use two "SS"s instead of the usual single SS at the end of a word or syllable, as in "sister". Note that this can be done with TH and FF and inherently short vowels, but with no other consonants. You will want to experiment with some consonant clusters to discover which versions work best in the cluster. For example, KK1 sounds good before LL as in "clown", and KK2 sounds good before WW as in "square". One allophone of a particular phoneme may sound better before or after back vowels and another before or after front vowels. KK3 sounds good before UH and KK1 sounds good before IY. Some sounds (PP, BB, TT, DD, KK, GG, CH and JH) require a brief duration of silence before them. For the most part, the silence has already been added, but you may decide to add more. Remember that you must always think about how the word sounds, not how it is spelt. For example, the NG allophone obviously belongs at the end of the words "sing" and "long", but notice that it is represented by the letter N in "uncle". Finally, remember that some sounds may not even be represented in words by any letters, as the YY in "computer".

Mnemonic  Value  Duration   Allophone
                 (in mSec)
PA1       0      10         Pause
PA2       1      30         Pause
PA3       2      50         Pause
PA4       3      100        Pause
PA5       4      200        Pause
OY        5      420        boy, noise, voice
AY        6      250        sky, kite, mighty
EH        7      70 *       end, extent, gentlemen
KK3       8      120        Before: UW, UH, OW, OY, OR, AR, AO; Initial clusters: crane, quick, scream, comb
PP        9      210        pow, ample, pleasure
JH        10     140        dodge, judge, injure
NN1       11     140        Before: YR, IY, IH, EY, EH, XR, AE, ER, AX, AW, AY, UW; Final clusters: earn, thin
IH        12     70 *       sit, stranded
TT2       13     140        When TT1 not used: test, street, to
RR1       14     170        read, write, x-ray
AX        15     70 *       lapel, instruct, succeed
MM        16     180        milk, alarm, ample
TT1       17     100        Final clusters before SS: tests, its
DH1       18     290        this, then, they
IY        19     250        treat, people, penny, see
EY        20     280        great, statement, tray, beige
DD1       21     70         Final position: played, end, could
UW1       22     100        After clusters with YY: computer, to
AO        23     100 *      talking, song, aught
AA        24     100 *      pat, action, pasta, umbrella
YY2       25     180        Initial positions: yes, yarn, yo-yo
AE        26     120 *      extract, acting, hat
HH1       27     130        Before front vowels: YR, IY, IH, EY, EH, XR, AE; he
BB1       28     80         Final position: rib; Between vowels: fibber; In clusters: bleed, brown, business
TH        29     180 *      thin
UH        30     100 *      book, cookie, full
UW2       31     260        Monosyllabic words: two, food
AW        32     370        sound, mouse, down, out
DD2       33     160        Initial position: down, do
GG3       34     140        Before low vowels: AE, AW, AY, AR, AA, AO, OR, ER and medial clusters: anger; Final position: peg
VV        35     190        vest, prove, even
GG1       36     80         Before high front vowels: YR, IY, IH, EY, EH, XR; guest
SH        37     160        ship, leash, nation
ZH        38     190        azure, pleasure
RR2       39     120        Initial clusters: brown, crane, grease, brain
FF        40     150 *      food
KK2       41     190        Final position: speak; Final clusters: sky
KK1       42     160        Before front vowels: YR, IY, IH, EY, EH, XR, AY, AE, ER, AX; Initial clusters: cute, clown, can't
ZZ        43     210        zoo, phase
NG        44     220        anchor, string, anger
LL        45     110        lake, hello, steel
WW        46     180        wool, we, warrant, linguist
XR        47     360        declare, stare, repair
WH        48     200        whig, white, twenty
YY1       49     130        Clusters: cute, beauty, computer, yes
CH        50     190        church, feature
ER1       51     160        letter, furniture, interrupt
ER2       52     300        Monosyllables: bird, fern, burn
OW        53     240        zone, close, snow, beau
DH2       54     240        Word final & between vowels: bathe
SS        55     90 *       vest
NN2       56     190        Before back vowels: UH, OW, OY, OR, AR, AA; no
HH2       57     180        Before back vowels: UW, UH, OW, OY, AO, OR, AR; hoe
OR        58     330        fortune, adore, store
AR        59     290        farm, garment, alarm
YR        60     350        hear, irresponsible, clear
GG2       61     40         Before high back vowels: UW, UH, OW, OY, AX; and clusters: green, glue, got
EL        62     190        little, angle, saddle
BB2       63     50         Initial position before a vowel: beast, business

* These allophones can be doubled.
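
When building a word for the user vocabulary it can be handy to know roughly how long it will take to speak. The Python sketch below (not OPL) adds up the durations from the table for the allophone string used in the SAYALLO: example in chapter 7. It assumes the durations simply add together, which is only an approximation. The function name total_duration_ms is hypothetical, and only the table rows needed for the example are copied in.

```python
# Durations in mSec, copied from the allophone table for the values used below:
# 0 = PA1, 7 = EH, 27 = HH1, 45 = LL, 53 = OW.
DURATION_MS = {0: 10, 7: 70, 27: 130, 45: 110, 53: 240}


def total_duration_ms(values):
    """Rough speaking time: the sum of the listed allophone durations."""
    return sum(DURATION_MS[value] for value in values)


# The allophones from SAYALLO:("27,7,45,0,45,53")
print(total_duration_ms([27, 7, 45, 0, 45, 53]))  # prints 670
```

To cover other words, extend DURATION_MS with further rows from the table.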