Thursday, November 29, 2007

Parsing take #3

I want more structured messages, so I turn the token lists into tuples with a function I call compose/1. For this I again rely on Erlang's excellent pattern matching.

So for GPGSA the compose looks like:


compose(["GPGSA",A,B,C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C12,D,E,F]) ->
    {gsa,
     {auto,list_to_atom(A)},
     {fix,list_to_integer(B)},
     {satellites,satellites([C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C12],[])},
     {pdop,list_to_float(D)},
     {horizdi,list_to_float(E)},
     {vertdi,list_to_float(F)}};


This clause only matches token lists for "GPGSA" sentences (my GPS device only emits the sentences GPGSA, GPGGA, GPRMC and GPGSV).

The structured data are simple tuples, with Erlang atoms as identifiers for the fields. As you can see, I also convert the numeric fields into integers (and, for other messages, into floats). A helper function keeps me from repeating myself too much:

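%% Stop when the list runs out (all twelve slots filled) ...
satellites([], Satellites) ->
    Satellites;
%% ... or at the first empty slot.
satellites([[] | _], Satellites) ->
    Satellites;
satellites([A | Rest], Satellites) ->
    satellites(Rest, [list_to_integer(A) | Satellites]).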


The rest of compose for parsing the remaining sentences looks like:

compose(["GPGGA",A,B1,B2,C1,C2,D,E,F,G1,G2,H1,H2,I,J]) ->
    {gga,
     {fixtime,A},
     {lattitude,[list_to_float(B1)/100,list_to_atom(B2)]},
     {longitude,[list_to_float(C1)/100,list_to_atom(C2)]},
     {quality,list_to_integer(D)},
     {numberofsatellites,list_to_integer(E)},
     {horizdi,list_to_float(F)},
     {altitude,[list_to_float(G1),list_to_atom(G2)]},
     {geoidheight,[list_to_float(H1),list_to_atom(H2)]},
     {timesincedgps,list_to_float(I)},
     {dgpsstation,J}};

compose(["GPRMC",A,B,C1,C2,D1,D2,E,F,G,H1,H2]) ->
    {rmc,
     {fixtime,A},
     {status,list_to_atom(B)},
     {lattitude,[list_to_float(C1)/100,list_to_atom(C2)]},
     {longitude,[list_to_float(D1)/100,list_to_atom(D2)]},
     {speed,list_to_float(E)},
     {trackangle,list_to_float(F)},
     {date,G},
     {magneticvariation,[H1,H2]}};

compose(["GPGSV",A,B,C | SateliteData ]) ->
    {gsv,
     {numberofsentences,list_to_integer(A)},
     {number,list_to_integer(B)},
     {numberofsatellites,list_to_integer(C)},
     satellitedata(SateliteData, [])}.


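I divide the latitudes and longitudes by 100 so that they roughly resemble the format Google Maps expects. Note that this is only an approximation: NMEA reports positions as degrees and minutes (ddmm.mmmm), so a true decimal-degree value would be degrees + minutes/60.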
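A minimal sketch of the exact conversion, for comparison with the divide-by-100 shortcut. The module name nmea_coord and the function to_decimal/2 are made up for this illustration and are not part of the driver; the input is assumed to be the float already produced by list_to_float plus the hemisphere atom.

```erlang
-module(nmea_coord).
-export([to_decimal/2]).

%% Convert an NMEA position value (ddmm.mmmm or dddmm.mmmm, already a float)
%% and its hemisphere atom into signed decimal degrees.
to_decimal(Value, Hemisphere) when is_float(Value) ->
    Degrees = trunc(Value / 100),        % leading digits are whole degrees
    Minutes = Value - Degrees * 100,     % the remainder is minutes
    Decimal = Degrees + Minutes / 60,
    case Hemisphere of
        'N' -> Decimal;
        'E' -> Decimal;
        'S' -> -Decimal;                 % south and west are negative
        'W' -> -Decimal
    end.
```

For example, nmea_coord:to_decimal(4830.0, 'N') gives 48.5, where the shortcut would give 48.3.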

The GPGSV sentences are variable length so I use a helper function for parsing those:


satellitedata([D,E,F,G | SateliteData ], Satellites) ->
    satellitedata(SateliteData,
                  [{sattelite,list_to_integer(D)},
                   {elevation,list_to_integer(E)},
                   {azimuth,list_to_integer(F)},
                   {snr,list_to_integer(G)} | Satellites]);
satellitedata([], Satellites) ->
    Satellites.


My final loop/1 looks like this:

loop(Port) ->
    receive
        {Port, {data, Data}} ->
            case Data of
                <<"$", Rest/binary>> ->
                    %% tokenize/1 returns {error} when the checksum fails
                    case tokenize(Rest) of
                        {error} ->
                            io:format("bad sentence: ~p~n", [Data]);
                        Tokens ->
                            Record = compose(Tokens),
                            io:format("~p~n", [Record])
                    end;
                _ ->
                    sl:setopt(Port, mode, line),
                    sl:update(Port)
            end,
            loop(Port);
        stop ->
            exit(normal);
        {'EXIT', Port, Reason} ->
            exit({port_terminated, Reason})
    end.


The resulting messages are reported in the shell like:

{rmc,{fixtime,"214612.620"},
{status,'A'},
{lattitude,[29.9792,'N']},
{longitude,[31.1343,'E']},
{speed,9.00000e-2},
{trackangle,81.9100},
{date,"291107"},
{magneticvariation,[[],[]]}}
{gga,{fixtime,"214613.620"},
{lattitude,[29.9792,'N']},
{longitude,[31.1343,'E']},
{quality,1},
{numberofsatellites,4},
{horizdi,4.00000},
{altitude,[146.2000,'M']},
{geoidheight,[0.0000,'M']},
{timesincedgps,0.00000e+0},
{dgpsstation,"0000"}}
{gsa,{auto,'A'},
{fix,3},
{satellites,[2,23,20,31]},
{pdop,7.30000},
{horizdi,4.00000},
{vertdi,6.00000}}
{rmc,{fixtime,"214613.620"},
{status,'A'},
{lattitude,[56.2778,'N']},
{longitude,[10.2091,'E']},
{speed,0.100000},
{trackangle,100.610},
{date,"291107"},
{magneticvariation,[[],[]]}}


Note that I still need to parse the dates and times into some nicer representation.
Next time I will add another process that receives these messages and reports only the position data, in a format compatible with Google Maps.
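As a rough idea of what that date and time parsing could look like, here is a sketch. The module nmea_time and its functions are hypothetical names; it assumes the fixtime field is hhmmss followed by optional fractional seconds, the date field is ddmmyy, and the two-digit year belongs to the 2000s.

```erlang
-module(nmea_time).
-export([parse_time/1, parse_date/1]).

%% "214612.620" -> {Hour, Minute, Second}
parse_time([H1,H2,M1,M2 | Seconds]) ->
    {list_to_integer([H1,H2]),
     list_to_integer([M1,M2]),
     to_number(Seconds)}.

%% "291107" -> {Year, Month, Day}; the century is an assumption.
parse_date([D1,D2,M1,M2,Y1,Y2]) ->
    {2000 + list_to_integer([Y1,Y2]),
     list_to_integer([M1,M2]),
     list_to_integer([D1,D2])}.

%% Seconds may arrive with or without a fractional part.
to_number(S) ->
    case string:to_float(S) of
        {error, no_float} -> list_to_integer(S);
        {F, _Rest} -> F
    end.
```

The {Year, Month, Day} tuple has the same shape as the dates used by the calendar module, which should make later conversions easier.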

Tuesday, November 20, 2007

Parsing take #2

NMEA messages are quite simple: comma-separated values (some of which may be absent) with start and end markers and a checksum.
The checksum is the XOR of the bytes between the start and end markers, and it is sent as a hex-encoded value after the end marker. Like this:


$GPGSA,A,3,11,20,01,17,,,,,,,,,7.0,2.4,6.6*36\r\n


Here $ is the start marker, * is the end marker, and 36 (hex) is the checksum.
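To make the checksum rule concrete, here is a small stand-alone sketch that XORs all bytes of a payload (the part between $ and *) and formats the result as two hex digits. The module name nmea_sum and both function names are only for illustration; the tokenizer further down folds the same calculation into its own loop.

```erlang
-module(nmea_sum).
-export([checksum/1, hex/1]).

%% XOR together every byte of the payload between the $ and * markers.
checksum(Payload) when is_binary(Payload) ->
    lists:foldl(fun(Byte, Acc) -> Acc bxor Byte end,
                0,
                binary_to_list(Payload)).

%% Format a checksum as two upper-case hex digits, as sent after the *.
hex(Sum) ->
    lists:flatten(io_lib:format("~2.16.0B", [Sum])).
```

For instance, nmea_sum:checksum(<<"AB">>) is 65 bxor 66, which is 3, and nmea_sum:hex(3) gives "03".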

Last time I made the decision to match on the start of the message like:

case Data of
    <<"$GPGSA,", Rest/binary>> ->
        getparams(Rest);
    _ ->
        io:format("~p~n", [Data])
end,


This was actually not a good decision. Because the NMEA messages are so alike, I find it better to simply match on the start marker and get back a list of parsed values (tokens).

So my new case looks like this:

case Data of
    <<"$", Rest/binary>> ->
        Tokens = tokenize(Rest),
        io:format("~p~n", [Tokens]);
    _ ->
        sl:setopt(Port, mode, line),
        sl:update(Port)
end,



The second clause (_) should never match, but when you call the stop/0 and then start/0 functions, the sl module sometimes (read: bug) loses the information that line mode was requested. In those cases this clause matches and I force line mode on the Port again, which works around the problem.

To split the received binary into tokens I call tokenize/1. I was hoping to keep the data in binaries and build a list of these, but as far as I can tell constructing binaries by repeatedly appending bytes is not well supported in Erlang, so for a first go I turn the binary into a list of strings. (A possible way to get my original wish is to scan the binary, record the comma positions and use the split_binary built-in function; I will try this later and compare the performance.)

So tokenize has two missions: tokenize the binary and verify the checksum.
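For reference, keeping the tokens as binaries can be sketched with binary:split/3. This assumes an Erlang/OTP release that ships the binary module, which may be newer than the one used in these posts; the module name nmea_split and the function fields/1 are made up, and the checksum is not verified here.

```erlang
-module(nmea_split).
-export([fields/1]).

%% Split a payload on commas, keeping every field as a binary.
%% Empty fields are kept as <<>> so positions stay aligned.
fields(Payload) when is_binary(Payload) ->
    binary:split(Payload, <<",">>, [global]).
```

Calling nmea_split:fields(<<"GPGSA,A,3,,7.0">>) gives a list of five binaries, with <<>> standing in for the absent value.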

First tokenize/1:

tokenize(<<$G:8, Rest/binary>>) ->
    tokenize(Rest, [$G], [], $G).


Here I rely on the fact that all the sentences my device sends start with "G". This lets me initialize the current token (second parameter) with $G ($ is Erlang's syntax for characters) and the running checksum (fourth parameter) with the same value. The third parameter is the list of tokens parsed so far, which starts out empty.

Now tokenize/4:

tokenize(<<$,:8, Rest/binary>>, Token, Tokens, Sum) ->
    tokenize(Rest, [], [lists:reverse(Token) | Tokens], Sum bxor $,);
tokenize(<<$*:8, Rest/binary>>, Token, Tokens, Sum) ->
    checksum(Rest, [lists:reverse(Token) | Tokens], [], Sum);
tokenize(<<N:8, Rest/binary>>, Token, Tokens, Sum) ->
    tokenize(Rest, [N | Token], Tokens, Sum bxor N).


The algorithm is simple. When a comma is first in the binary, a token is finished, so it is added to the list of parsed tokens (I add characters at the front of the list in the usual Lisp style, so lists:reverse needs to be called) and $, is XORed into the checksum. When a * is first, we move on to parse the checksum. In all other cases the first byte in the binary is added to the current token and XORed into the checksum.

Now the checksum needs to be parsed and verified, which the code below does. Note that the first clause, with the empty binary, should in theory never match, but in some circumstances (read: stop/start again) it can, and this clause takes care of that. Otherwise it is pretty straightforward: the checksum that was read is converted to its numerical value using {_,ReadSum,_} = io_lib:fread("~16u",Chk):

checksum(<<>>, _, _, _) ->
    {error};
checksum(<<$\r:8, Rest/binary>>, Tokens, Chk, Sum) ->
    checksum(Rest, Tokens, lists:reverse(Chk), Sum);
checksum(<<$\n:8>>, Tokens, Chk, Sum) ->
    {_,ReadSum,_} = io_lib:fread("~16u",Chk),
    case lists:nth(1,ReadSum) of
        Sum -> lists:reverse(Tokens);
        _ -> {error}
    end;
checksum(<<N:8, Rest/binary>>, Tokens, Chk, Sum) ->
    checksum(Rest, Tokens, [N | Chk ], Sum).


That's it: we get the binary split into a list of tokens. Next I will turn these lists into more structured records.

Thursday, November 15, 2007

Parsing take #1

Erlang is not famous for its string representation. Strings are just lists of characters and as such take up considerable memory: on 32-bit machines typically 64 bits per character, 32 for the character's value and 32 for the pointer to the next character (on 64-bit machines that doubles to 128 bits per character). Ouch (for more details of Erlang's built-in types see this).
Fortunately Erlang comes with a very compact binary type, which I will use. Last time we saw that the output was of the form "...", which indicates strings; Erlang uses the notation <<...>> for binaries.
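The difference can be checked from the shell. The sketch below uses erts_debug:flat_size/1, an internal debugging helper that reports the size of a term in machine words, and byte_size/1 for the payload of the equivalent binary. The module and function names (str_size, words_as_list/1, bytes_as_binary/1) are invented for this illustration.

```erlang
-module(str_size).
-export([words_as_list/1, bytes_as_binary/1]).

%% Size of a string (list of characters) in machine words:
%% two words per character, one for the value and one for the tail.
words_as_list(String) when is_list(String) ->
    erts_debug:flat_size(String).

%% Payload size in bytes of the same text stored as a binary
%% (the small fixed header of the binary is not counted).
bytes_as_binary(String) when is_list(String) ->
    byte_size(list_to_binary(String)).
```

For "hello" this gives 10 words as a list (40 bytes on a 32-bit machine, 80 on a 64-bit one) against 5 bytes of binary payload.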
To get the Port to send us binaries instead of strings, you simply pass [binary] to sl:start/1, like this: sl:start([binary]).

This is normal Erlang behavior for ports, and looking at the sl source you will see that start/1 simply calls open_port like this:


start(OpenOpts) ->
    %%...
    open_port({spawn, "sl_drv"}, OpenOpts).

Now running the previous code with this change results in:

<<"$GPGSV,3,1,12,20,89,000,00,17,45,000,27,11,45,000,33,23,43,000,00*7E\r\n">>
...


So even after specifying that you want to receive binaries, the undocumented mode=line option still breaks messages on linefeed boundaries, nice.

GPS receivers typically use NMEA messages when reporting position and other information. I use this as my NMEA documentation.

I will start by parsing the GPGSA message, and modify my loop/1 into this:



loop(Port) ->
    receive
        {Port, {data, Data}} ->
            case Data of
                <<"$GPGSA,", Rest/binary>> ->
                    getparams(Rest);
                _ ->
                    io:format("~p~n", [Data])
            end,
            loop(Port);
        stop ->
            exit(normal);
        {'EXIT', Port, Reason} ->
            exit({port_terminated, Reason})
    end.



The data received from the Port is matched against the pattern <<"$GPGSA,", Rest/binary>>. For matching messages I simply call getparams/1 with the rest of the message. Next time I will look at getparams/1 and a record representation of the received data.

Friday, November 9, 2007

Reading serial data

The way to communicate with foreign systems from Erlang is through so-called Ports. In Joe Armstrong's book, the recommended template for doing that goes something like this:

-module(gpsdriver).
-export([start/0, stop/0]).

start() ->
    spawn(fun() ->
              register(gps,self()),
              process_flag(trap_exit, true),
              %%openport
              loop(Port)
          end).

stop() ->
    gps ! stop.

loop(Port) ->
    receive
        {Port, {data, Data}} ->
            case Data of
                _ ->
                    io:format("Sum: ~p~n", [Data])
            end,
            loop(Port);
        stop ->
            exit(normal);
        {'EXIT', Port, Reason} ->
            exit({port_terminated, Reason})
    end.


The reason for opening the Port inside the spawned process is that the process creating the Port is also the one receiving messages from it.
The code above is slightly simplified, as it only accepts stop messages, data messages from the Port, and process exit messages.

To read data from the serial connection, I of course had to read the (somewhat sparse) sl documentation and the GPS device specs. This boils down to the communication settings 4800,N,8,1 from the device specs, and discovering that the device shows up as /dev/ttyUSB0 on my Linux machine (remember that tty devices are often only readable by root, so either chmod 666 the device or run the erl shell as root). According to the sl docs, a Port is created using start and options are set using setopt. My final start function ends up as follows; it also sets options like receive buffer size, hardware flow control, and an undocumented feature called line mode, which makes the driver break messages on line boundaries:

start() ->
    spawn(fun() ->
              register(gps,self()),
              process_flag(trap_exit, true),
              Port = sl:start(),
              sl:setopt(Port, dev, "/dev/ttyUSB0"),
              sl:setopt(Port, baud, 4800),
              sl:setopt(Port, csize, 8),
              sl:setopt(Port, stopB, 1),
              sl:setopt(Port, parity, 0),
              sl:setopt(Port, hwflow, true),
              sl:setopt(Port, bufsz, 100),
              sl:setopt(Port, mode, line),
              sl:update(Port),
              sl:open(Port),
              loop(Port)
          end).


Remember to put the module in a file named after the module with the extension .erl. Also, module names start with a lower-case letter (names starting with upper case are reserved for variables in Erlang).

Then in the Erlang shell typing:

c(gpsdriver).
gpsdriver:start().

Results in lines like:

"$GPGGA,071852.080,2997.9203,N,03113.4374,E,1,05,2.2,138.8,M,138.8,M,0.0,0000*4D\r\n"
"$GPGSA,A,3,17,12,15,28,18,,,,,,,,5.0,2.2,4.5*34\r\n"
...

The flow is stopped using gpsdriver:stop().

Next: parsing time!

Sunday, November 4, 2007

Erlang - Serial Blues

So serial communication does not work out of the box when sl is retrieved from CEAN. This of course boils down to the fact that sl contains a native driver written in C. A makefile exists, so I just tried to call make. Alas, no go! Apparently the makefile includes a file called ../../support/include.mk, and this file was nowhere in sight: not in any subdirectory of my Erlang installation, and not in the Erlang source distribution. Hmm. This was where luck struck: while looking around for this file I ended up in the Yaws source, and hey, there was a file with the right name. So I made a link to that and tried make again... YES, success, the C driver built.

And now doing P = sl:start(). in the erlang shell gave me: #Port<0.100>

Time to do some Erlang ...