Correct Approach To Decoding Streams

Post by Fred

Hopefully this will help people get it right from the start in future.

Pulling packets out of a block of data is pretty trivial, as is correctly verifying them.

========================================================================

Stage One

De-encapsulation!

Step One

Find the packets!

For every byte:

Is it a start?
    If it is, init your packet buffer and state. If we were already in a packet there's no need to do anything special: the partial packet is simply discarded. Increment a counter if you want.
Otherwise, if it's not a start:
    If we're in a packet:
        Is it a stop?
            If so close the packet buffer and transfer control to the next stage.
        Otherwise, if it's not a stop:
            Store the byte as is.
        End.
    Otherwise, if we're not in a packet:
        Drop the byte, it's noise.
    End.
End.

Repeat!
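
A minimal C sketch of the loop above, fed one byte at a time. The 0xAA start and 0xCC stop values, the Framer struct, MAX_PACKET and framerPush() are all assumptions made for illustration, so check the byte values against the protocol spec before relying on them.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed framing bytes; verify against the protocol spec. */
#define START_BYTE 0xAA
#define STOP_BYTE  0xCC
#define MAX_PACKET 1024 /* hypothetical buffer limit */

typedef struct {
    uint8_t  buffer[MAX_PACKET]; /* still-escaped bytes between start and stop */
    size_t   length;
    int      inPacket;
    unsigned restarts;  /* starts seen while already inside a packet */
    unsigned overflows; /* packets dropped for exceeding MAX_PACKET */
} Framer;

/* Feed one byte. Returns 1 when buffer/length hold a complete packet
 * that is ready for the next stage, otherwise 0. */
int framerPush(Framer *f, uint8_t byte) {
    if (byte == START_BYTE) {
        if (f->inPacket) {
            f->restarts++; /* partial packet is discarded */
        }
        f->length = 0;
        f->inPacket = 1;
        return 0;
    }
    if (!f->inPacket) {
        return 0; /* noise between packets, drop it */
    }
    if (byte == STOP_BYTE) {
        f->inPacket = 0;
        return 1; /* hand over to un-escaping */
    }
    if (f->length >= MAX_PACKET) {
        f->inPacket = 0; /* too long to be real, drop it */
        f->overflows++;
        return 0;
    }
    f->buffer[f->length++] = byte; /* store the byte as is */
    return 0;
}
```

Zero-initialise the Framer (for example with `Framer f = {0};`) before feeding it. Note that the start and stop bytes themselves never reach the buffer.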

Step Two

The next step is to un-escape the packet found. This is done by iterating over each byte between start and stop (which were not stored in our buffer!) looking for escape bytes.

For each non-escape byte found we store it in our output buffer as is.
For each escape byte found we treat the following one differently.

A naive implementation will XOR the assumed-to-be-escaped byte with 0xFF and store the result without checking it.

A better implementation will first check that it's one of the three valid escaped bytes. If it isn't, reject the packet as invalid. If it is, store the associated un-escaped byte into the output.
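
Here is a rough C sketch of the stricter variant. The escape byte 0xBB, the three escaped values (0x55, 0x44, 0x33) and the XOR-with-0xFF rule are assumed here rather than taken from this post, and unescapePacket() is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed protocol values; verify against the spec. */
#define ESCAPE_BYTE    0xBB
#define ESCAPED_START  0x55 /* 0xAA ^ 0xFF */
#define ESCAPED_ESCAPE 0x44 /* 0xBB ^ 0xFF */
#define ESCAPED_STOP   0x33 /* 0xCC ^ 0xFF */

/* Un-escape in[0..inLength) into out, which must hold inLength bytes.
 * Returns the output length, or -1 if the packet is invalid. */
long unescapePacket(const uint8_t *in, size_t inLength, uint8_t *out) {
    size_t outLength = 0;
    for (size_t i = 0; i < inLength; i++) {
        uint8_t byte = in[i];
        if (byte == ESCAPE_BYTE) {
            if (++i >= inLength) {
                return -1; /* escape with nothing after it */
            }
            byte = in[i];
            if (byte != ESCAPED_START && byte != ESCAPED_ESCAPE
                    && byte != ESCAPED_STOP) {
                return -1; /* not one of the three valid escaped bytes */
            }
            byte = (uint8_t)(byte ^ 0xFF); /* recover the original byte */
        }
        out[outLength++] = byte;
    }
    return (long)outLength;
}
```

The output can never be longer than the input, so a second buffer of the same size is always enough.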

Step Three

Finally we need to check the checksum. For the current format, simply add all bytes except the checksum together, keeping only the low 8 bits of the result (ignore any overflow beyond an unsigned byte), and compare that to the received checksum pulled from the last byte. After this the encapsulation validation is complete!
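
As a sketch, the 8 bit sum check described above might look like this in C. checksumValid() is a hypothetical name, and it expects the un-escaped packet with the checksum still on the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the last byte equals the 8 bit sum of all the bytes
 * before it, otherwise 0. An empty packet is never valid. */
int checksumValid(const uint8_t *packet, size_t length) {
    if (length < 1) {
        return 0;
    }
    uint8_t sum = 0;
    for (size_t i = 0; i + 1 < length; i++) {
        sum = (uint8_t)(sum + packet[i]); /* overflow is discarded */
    }
    return sum == packet[length - 1];
}
```

Keeping this in its own function means a change of sum style later only touches this one place.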

========================================================================

Stage Two

Encapsulation validation is complete! It's now time to parse the contents into header information and a usable body and structurally verify it in the process.

First ensure that what's left is at least 3 bytes long, as 3 is the minimum possible. If not, throw the toys out of the cot.

The header byte drives the verification process. Check the flags, build up the minimum length that those flags imply, then check the packet against it. If it's shorter, throw the lego out of the cot again.

If we're still happy, extract the available fields, e.g. ID, length, body, sequence, etc.

If there was a length, make sure the body matches this requirement. If not, you guessed it, kick barbie to the kerb.

Store these in your object format, whatever that is. In C, use a struct and consider some fields invalid if the corresponding flags are not set in the header. In more advanced languages, use an actual object.
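
A hedged C sketch of that parse follows. The flag bit values, the field order (flags, 2 byte ID, optional 1 byte sequence, optional 2 byte length, body), the big-endian byte order, and the names Packet and parsePacket() are all assumptions for illustration. Check the real layout against the protocol spec.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed header flag bits; verify against the spec. */
#define HEADER_HAS_LENGTH   0x01
#define HEADER_HAS_SEQUENCE 0x04

typedef struct {
    uint8_t        flags;
    uint16_t       payloadId;
    uint8_t        sequence;       /* valid only if HEADER_HAS_SEQUENCE set */
    uint16_t       declaredLength; /* valid only if HEADER_HAS_LENGTH set */
    const uint8_t *body;           /* points into the caller's buffer */
    size_t         bodyLength;
} Packet;

/* data is the un-escaped packet with the checksum already verified and
 * removed. Returns 0 on success, negative on a structural error. */
int parsePacket(const uint8_t *data, size_t length, Packet *p) {
    if (length < 3) {
        return -1; /* shorter than flags + ID */
    }
    p->flags = data[0];
    size_t headerLength = 3;
    if (p->flags & HEADER_HAS_SEQUENCE) headerLength += 1;
    if (p->flags & HEADER_HAS_LENGTH)   headerLength += 2;
    if (length < headerLength) {
        return -2; /* flags promise fields that are not there */
    }
    size_t offset = 1;
    p->payloadId = (uint16_t)((data[offset] << 8) | data[offset + 1]);
    offset += 2;
    p->sequence = 0;
    p->declaredLength = 0;
    if (p->flags & HEADER_HAS_SEQUENCE) {
        p->sequence = data[offset++];
    }
    if (p->flags & HEADER_HAS_LENGTH) {
        p->declaredLength = (uint16_t)((data[offset] << 8) | data[offset + 1]);
        offset += 2;
    }
    p->body = data + offset;
    p->bodyLength = length - offset;
    if ((p->flags & HEADER_HAS_LENGTH) && p->bodyLength != p->declaredLength) {
        return -3; /* body does not match the declared length */
    }
    return 0;
}
```

The struct only borrows the body pointer, so the source buffer has to outlive the Packet, or the body has to be copied out.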

========================================================================

Stage Three

Use the data! What you do here is largely up to you, however there are some recommendations to give.

Have one or more handlers registered to each interesting ID. Have a null handler for IDs that you know about but wish to ignore. Route the data to the appropriate handlers. If an ID is unhandled, route it to a default handler and dump out a message. Maybe have some code that monitors for _all_ errors and logs them, perhaps visually. Send nacks back to where they came from so that the failure can be appropriately handled in the right place.
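
One possible shape for that routing in C is a small table of ID and function pointer pairs, with a null handler and a default handler. Everything named here (Route, dispatch(), the handlers, the counters, and the example IDs used with it) is hypothetical, and sending nacks is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef void (*Handler)(uint16_t id, const uint8_t *body, size_t length);

typedef struct {
    uint16_t id;
    Handler  handler;
} Route;

unsigned datalogCount;   /* packets seen by the example handler */
unsigned unhandledCount; /* packets that fell through to the default */

/* Example handler for an interesting ID; real code would use the body. */
void datalogHandler(uint16_t id, const uint8_t *body, size_t length) {
    (void)id; (void)body; (void)length;
    datalogCount++;
}

/* Null handler: for IDs we know about but deliberately ignore. */
void nullHandler(uint16_t id, const uint8_t *body, size_t length) {
    (void)id; (void)body; (void)length;
}

/* Default handler: anything unregistered gets counted and reported. */
void defaultHandler(uint16_t id, const uint8_t *body, size_t length) {
    (void)body;
    unhandledCount++;
    fprintf(stderr, "Unhandled payload ID 0x%04X (%zu bytes)\n",
            (unsigned)id, length);
}

/* Linear search is fine for a handful of IDs. */
Handler findHandler(const Route *routes, size_t count, uint16_t id) {
    for (size_t i = 0; i < count; i++) {
        if (routes[i].id == id) {
            return routes[i].handler;
        }
    }
    return defaultHandler;
}

void dispatch(const Route *routes, size_t count, uint16_t id,
              const uint8_t *body, size_t length) {
    findHandler(routes, count, id)(id, body, length);
}
```

To get more than one handler per ID, the table could hold a list per entry instead. The point is only that the routing lives in one place.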

========================================================================

Optimisations

Firstly, fucking DON'T! OK, now that that's out of the way, there are two small ones that you could do. One is a grand idea, the other will screw you later, as premature optimisations so often do!

1) (do this) Combine steps 1 and 2 in de-encapsulation to reduce double handling. It's silly not to, though you should keep the two phases distinct in your mind. Because they're fairly closely related, this is very safe and not likely to change.
2) (don't do this) Combine step 3 in de-encapsulation too, also to reduce double handling. This is a terrible idea, though. If the sum style changes, your encapsulation code needs a rewrite and is vulnerable to breakage/regressions. Keep this distinct and separated!!!
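
For what it's worth, optimisation 1 might be sketched in C like this: one pass that both finds the packet and un-escapes it, with the checksum deliberately left for a separate function. The byte values (0xAA, 0xBB, 0xCC and their XOR 0xFF counterparts) are assumed, and Decoder and decoderPush() are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed protocol values; verify against the spec. */
#define START_BYTE  0xAA
#define ESCAPE_BYTE 0xBB
#define STOP_BYTE   0xCC
#define MAX_PACKET  1024 /* hypothetical buffer limit */

typedef struct {
    uint8_t  buffer[MAX_PACKET]; /* un-escaped bytes, checksum included */
    size_t   length;
    int      inPacket;
    int      escaped;  /* previous byte was an escape */
    unsigned rejected; /* packets thrown away for any reason */
} Decoder;

/* Feed one byte. Returns 1 when buffer/length hold a complete,
 * un-escaped packet whose checksum has NOT yet been checked. */
int decoderPush(Decoder *d, uint8_t byte) {
    if (byte == START_BYTE) {
        if (d->inPacket) d->rejected++; /* partial packet discarded */
        d->length = 0;
        d->inPacket = 1;
        d->escaped = 0;
        return 0;
    }
    if (!d->inPacket) return 0; /* noise */
    if (byte == STOP_BYTE) {
        d->inPacket = 0;
        if (d->escaped) { d->rejected++; return 0; } /* dangling escape */
        return 1;
    }
    if (d->escaped) {
        d->escaped = 0;
        if (byte != 0x55 && byte != 0x44 && byte != 0x33) {
            d->inPacket = 0; /* invalid escaped byte, reject the packet */
            d->rejected++;
            return 0;
        }
        byte = (uint8_t)(byte ^ 0xFF);
    } else if (byte == ESCAPE_BYTE) {
        d->escaped = 1;
        return 0;
    }
    if (d->length >= MAX_PACKET) {
        d->inPacket = 0;
        d->rejected++;
        return 0;
    }
    d->buffer[d->length++] = byte;
    return 0;
}
```

The two phases are still visible as separate branches, which is about as merged as they should get.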

Another thing that you should consider is buffer pooling of some sort to reduce memory churn. This applies to Java and C and everything in between.
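
As one very simple illustration of pooling in C, a fixed set of buffers can be handed out and returned instead of being allocated per packet. The sizes and the names BufferPool, poolAcquire() and poolRelease() are invented for this sketch, and it is not thread safe.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE   4    /* hypothetical number of buffers */
#define BUFFER_SIZE 1024 /* hypothetical size of each buffer */

typedef struct {
    uint8_t data[BUFFER_SIZE];
    int     inUse;
} PooledBuffer;

typedef struct {
    PooledBuffer buffers[POOL_SIZE];
} BufferPool;

/* Returns a free buffer, or NULL if every buffer is in use. */
PooledBuffer *poolAcquire(BufferPool *pool) {
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool->buffers[i].inUse) {
            pool->buffers[i].inUse = 1;
            return &pool->buffers[i];
        }
    }
    return NULL;
}

/* Hands a buffer back so that it can be reused. */
void poolRelease(PooledBuffer *buffer) {
    buffer->inUse = 0;
}
```

What to do when the pool runs dry (drop the packet, block, or fall back to allocating) is a design decision for the application.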

========================================================================

Keeping It Clean

It's important to split these ideas apart and make damn sure that only ONE piece of code "knows" about each aspect, so that your code stays reliable and robust to change over time.

OK, go forth and conquer!

Fred.