Jaap's Psion II Page

Psion II Machine Code Tutorial, Part 1


This is the first of five articles about how to program machine code on the Psion.
It is not written to explain the machine code instructions, but rather to show how to handle machine code programs in the Organiser. I hope that the examples will give you a bit more confidence to try programming your own routines. You will of course need to know how to program OPL.
A passing knowledge of what bytes and addresses are is necessary; a knowledge of hexadecimal notation is preferable but not strictly needed.

Where to store machine code?

One of the obstacles to overcome when programming machine code is deciding where to store it. A program has to be put somewhere in the Psion's memory before it can be run. For small programs, the easiest place is probably a variable, for example a string variable.

Let me illustrate this method with the following very short MC program.

3D    MUL     'Multiplies A and B together.
18    XGDX    'Put result in X
39    RTS     'Return

The codes for the instructions are 3D, 18, 39 in hexadecimal, or 61, 24, 57 in decimal notation. These can easily be put in a string variable using the CHR$() function.

MULT:
LOCAL MC$(3),M%,N%
MC$=CHR$(61)+CHR$(24)+CHR$(57)
INPUT M%
INPUT N%
PRINT USR(ADDR(MC$)+1,M%*256+N%)

The last line may need some explaining. ADDR() is a function that returns the memory address where a variable is stored. A string variable A$ is stored in the following way:

Memory address:   Contents:
ADDR(A$)-1        The maximal length, l say, as defined in a LOCAL or GLOBAL.
ADDR(A$)          Current length of string.
ADDR(A$)+1   \
ADDR(A$)+2    \
ADDR(A$)+3     >  Space where the contents of string are stored.
....          /
ADDR(A$)+l   /

If you declare LOCAL A$(10) or GLOBAL A$(10) then there will always be ten bytes reserved for the string, whether or not A$ has that length. Also, the first byte of the string contents, and therefore of any machine code program that is stored in it, is at address ADDR(A$)+1.
The USR instruction in the MULT procedure calls the code at the address ADDR(MC$)+1.
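
The string layout described above can be checked from OPL itself. The following is a minimal sketch that is not part of the original examples; the procedure name LAYOUT: is made up for illustration. It simply peeks at the three bytes of interest.

```opl
LAYOUT:
LOCAL A$(10)
A$="ABC"
REM Declared maximum length, current length, first character
PRINT PEEKB(ADDR(A$)-1),PEEKB(ADDR(A$)),PEEKB(ADDR(A$)+1)
GET
```

If the layout is as described, this should display 10, 3 and 65 (the code for "A").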

The other parameter of USR allows you to pass an integer value to the machine code program. This value is in the register D when the machine code starts. This program uses the high and low bytes of D (which are usually called A and B), separately. These are multiplied together by the first machine code instruction MUL. The value M%*256+N% consists of high byte M%, and low byte N%, so these are the two values that are multiplied. The outcome, which is in the D register, is now put into the X register using the XGDX instruction which swaps the values of X and D.
We do this because it is the value of X which is returned to the OPL program.
What is printed on the screen is therefore M%*N% (provided both are small enough that M%*256+N% and the product each fit in a signed 16-bit OPL integer).
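
The size restriction comes from OPL rather than from the machine code: OPL integers are signed 16-bit values, so M%*256+N% overflows once M% reaches 128, and a product above 32767 comes back as a negative number. A possible workaround is sketched below. It assumes both inputs are in the range 0 to 255; the procedure name MULT2: and the variables H%, R% and R are made up for this illustration.

```opl
MULT2:(M%,N%)
LOCAL MC$(3),H%,R%,R
REM Assumes 0<=M%<=255 and 0<=N%<=255
MC$=CHR$(61)+CHR$(24)+CHR$(57)
REM Map 128..255 onto -128..-1, which has the same bit pattern
H%=M%
IF H%>127
  H%=H%-256
ENDIF
R%=USR(ADDR(MC$)+1,H%*256+N%)
REM Treat the returned 16 bits as unsigned
R=R%
IF R<0
  R=R+65536.0
ENDIF
RETURN R
```

Under those assumptions PRINT MULT2:(200,150) should give 30000. The result is returned as a floating point number because products above 32767 do not fit in an OPL integer.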

This example is of course a little pointless, but does illustrate the way machine code is run from OPL. Now a more useful and easier example:

3F0D  OS BZ$ALRM  'Call the ROM routine that sounds an alarm call.
                  'This instruction is sometimes referred to as SWI 0D.
39    RTS         'Return (to OPL).

The codes for the instructions are 3F, 0D, 39 in hexadecimal, or 63, 13, 57 in decimal notation. The OPL implementation will be something like this:

ALARM:
LOCAL MC$(4)
MC$=CHR$(63)+CHR$(13)+CHR$(57)
USR(ADDR(MC$)+1,0)

In this case no integer needs to be passed to the machine code, so the second parameter in the USR function is ignored and might as well be 0. The machine code doesn't return any value of interest either, so we can use the USR as an instruction rather than a function.
Other OPL functions, for example GET, can be used as instructions in the same way if their return value is unimportant. When this program is run, it will sound one alarm ring.

For short programs like these, using CHR$ is a very good method.
For longer ones, however, a long list of CHR$ calls is not practical. In such cases it is probably easier to use a small procedure that converts a string of hexadecimal digits to a string containing the program. Like this:

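CONV$:(H$)
LOCAL B$(127),B%,C%,D%
REM Work backwards through H$, two hex digits at a time
REM (digits 0-9 and upper case A-F only)
B%=LEN(H$)-1
WHILE B%>=1
  C%=ASC(MID$(H$,B%,1))
  D%=ASC(MID$(H$,B%+1,1))
  REM Convert each digit character to its value 0-15
  IF C%>%9
    C%=C%-55
  ELSE
    C%=C%-%0
  ENDIF
  IF D%>%9
    D%=D%-55
  ELSE
    D%=D%-%0
  ENDIF
  B$=CHR$(C%*16+D%)+B$
  B%=B%-2
ENDWH
RETURN B$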

Make sure you save this procedure in a Datapak, in case of a system crash.

The ALARM procedure can now be rewritten in slightly shorter form as:

ALARM2:
LOCAL MC$(4)
MC$=CONV$:("3F0D39")
USR(ADDR(MC$)+1,0)

This is a little slower of course, but that hardly matters. The delay caused by calling CONV$: only has to be incurred once. After the machine code is stored in MC$, it can be called as often as needed. If MC$ is made a GLOBAL variable, any sub-procedures can call the routine too, in exactly the same way.
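
As an illustration of the GLOBAL approach, here is a minimal sketch of two separate procedures, listed one after the other. The names TWICE: and RING: are made up for this example. The first declares MC$ as a GLOBAL and converts the code once; the second is a sub-procedure that only calls the code.

```opl
TWICE:
GLOBAL MC$(4)
REM Convert once, then reuse
MC$=CONV$:("3F0D39")
RING:
RING:

RING:
REM MC$ is the GLOBAL declared by the calling procedure
USR(ADDR(MC$)+1,0)
```

Running TWICE: should sound the alarm ring twice, with the conversion delay incurred only once.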

Most machine code routines that you might want to write are fairly short, and can be programmed in this way. If a machine code program is longer than 255 bytes, it will no longer fit inside a string variable, because strings can be no more than 255 characters long.

For most small applications this is quite enough. In fact, none of the examples I will give in this article is even close to reaching that limit. If we were to need a longer program, we must work a little harder.

We would then need to use several variables at the same time, while ensuring that they are next to each other in memory so that the program has no gaps in it. The best way is to use an array.

Suppose we have declared a variable A$(k,m). It is stored in memory like this:

Address:            Contents:
ADDR(A$())-3     \  The number of strings in the array, k.
ADDR(A$())-2     /
ADDR(A$())-1        Maximal length of the string, m.
ADDR(A$())          Current length of A$(1)
ADDR(A$())+1     \
....              > Contents of A$(1)
ADDR(A$())+m     /
ADDR(A$())+m+1      Current length of A$(2)
ADDR(A$())+m+2   \
....              > Contents of A$(2)
ADDR(A$())+2*m+1 /
....

and so on for each of the k strings.

The array A$(k,m) therefore has k times (m+1) bytes of space available to store code in. We can use the 'Current length' bytes too. We could of course design the code to simply skip past those bytes, but it makes programming much easier if we don't have to worry about that. We can't really control those length bytes directly, so we will have to poke the code into place. Simply POKEB the first byte of the program at address ADDR(A$()), the second at address ADDR(A$())+1 and so on. Remember that in this case the program starts at ADDR(A$()), because the first length byte has been overwritten too.
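
To make the poking concrete, here is a minimal sketch that puts the three-byte alarm program into a small string array. The procedure name ALARMP: and the variable P% are made up for this illustration, and the array is far smaller than anything that would really need this method.

```opl
ALARMP:
LOCAL A$(2,3),P%
P%=ADDR(A$())
REM The first byte overwrites the length byte of A$(1)
POKEB P%,$3F
POKEB P%+1,$0D
POKEB P%+2,$39
REM The code starts at ADDR(A$()), not ADDR(A$())+1
USR(P%,0)
```

After this, A$(1) no longer holds a sensible string, so the array should be used for nothing but the code.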

Instead of a string array, we can use an integer array, say A%(k). It looks like this:

Address:          Contents:
ADDR(A%())-2   \  The number of integers in the array, k.
ADDR(A%())-1   /
ADDR(A%())     \  Contents of A%(1)
ADDR(A%())+1   /
ADDR(A%())+2   \  Contents of A%(2)
ADDR(A%())+3   /
ADDR(A%())+4   \  Contents of A%(3)
ADDR(A%())+5   /
....

and so on for each of the k integers.

Now we don't need to do any POKEing. We simply have to put the right values in A%() and call the code using USR(ADDR(A%()),parameter).

The alarm program could have been written using this method, which would look like this:

ALARM3:
LOCAL MC%(2)
MC%(1)=$3F0D
MC%(2)=$3900
USR(ADDR(MC%()),0)

Note that the starting address of the program is ADDR(MC%()). Again, short programs like this one can be put in an array directly.
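
This works because each integer is stored with its high byte first, so $3F0D followed by $3900 gives the bytes 3F, 0D, 39, 00 in that order. A minimal sketch to check this is given below; the procedure name ORDER: is made up for this illustration.

```opl
ORDER:
LOCAL MC%(2)
MC%(1)=$3F0D
MC%(2)=$3900
REM High byte of MC%(1), low byte of MC%(1), high byte of MC%(2)
PRINT PEEKB(ADDR(MC%())),PEEKB(ADDR(MC%())+1),PEEKB(ADDR(MC%())+2)
GET
```

It should display 63, 13 and 57, the same three codes used in the CHR$ version of the alarm program.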

For longer programs, it is more convenient to write an alternative to the CONV$:(H$) procedure. You could write a procedure CONV%:(H$,P%) which stores the hexadecimal code contained in H$ into a global array, whether it be an integer or string array, at byte position P%. Its return value could be the next empty byte position. I'll leave the writing of this procedure up to you, if and when you need it.
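
For readers who would like a starting point, here is one possible sketch of such a procedure. It rests on several assumptions that are not in the text above: the calling procedure has declared a GLOBAL integer array named MC%(), P% is a byte offset counted from 0 at ADDR(MC%()), and H$ holds an even number of upper case hex digits. The local variable names are also made up.

```opl
CONV%:(H$,P%)
LOCAL A%,I%,C%,D%
REM Assumes a GLOBAL array MC%() declared by the caller
A%=ADDR(MC%())+P%
I%=1
WHILE I%<LEN(H$)
  C%=ASC(MID$(H$,I%,1))
  D%=ASC(MID$(H$,I%+1,1))
  IF C%>%9
    C%=C%-55
  ELSE
    C%=C%-%0
  ENDIF
  IF D%>%9
    D%=D%-55
  ELSE
    D%=D%-%0
  ENDIF
  REM Store one byte and move on
  POKEB A%,C%*16+D%
  A%=A%+1
  I%=I%+2
ENDWH
REM Next empty byte position
RETURN P%+LEN(H$)/2
```

With this in place the alarm example could be written as follows, where ALARM4: is again a made-up name:

```opl
ALARM4:
GLOBAL MC%(2)
CONV%:("3F0D39",0)
USR(ADDR(MC%()),0)
```

A longer program would be loaded by calling CONV%: several times, passing each return value as the P% of the next call. The sketch does not check that the code fits inside the array.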

Other places to store machine code.

It is possible to store machine code in other parts of the memory. For example, you can use the area from 0400 to 1FFF if it is available on your Psion model. On the LZ Psion this area is used for devices like the Comms link, bar code readers and so on, but if there is no such device attached, you can safely use this whole area. If there is something attached, then the system variables at $2329/A and $232B/C contain the addresses of the lowest and highest byte used.
The advantages of this area are that the code does not have to be relocatable, i.e. you can use absolute jumps instead of only relative ones. Also, the code will stay in memory all the time. This makes it ideal for interrupt routines and the like. Unfortunately this area is not available on CM/XP machines, just on LA and LZ models.
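
If you want to look at those two system variables before using the area, a minimal sketch like the following could be used. The procedure name LOWMEM: is made up for this illustration, and the values are only meaningful on a model that has this area and has a device attached, as described above.

```opl
LOWMEM:
REM Addresses of lowest and highest byte used, in hexadecimal
PRINT HEX$(PEEKW($2329)),HEX$(PEEKW($232B))
GET
```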

Another possibility is to embed machine code in the translated procedure code. I have an assembler program that does this for me, as you can see in some of my games that are also available from this site. This technique is very difficult to implement, unless you have or are able to write a program to do it for you. If you have a Comms link, it is far easier to write such a program using the PC than it is using the Psion. You can use the Cross Assembler to build the code, and the MakeProc program to put it in a procedure file.

In Part 2 of this series I will cover passing parameters, and the MC examples will be more ambitious.