Threading

Aaron Barnes' wxPython based FreeEMS tuning tool. No longer maintained and out of date with the protocol requirements.
sry_not4sale
LQFP144 - On Top Of The Game
Posts: 568
Joined: Mon Mar 31, 2008 12:47 am
Location: New Zealand, land of the long white burnout
Contact:

Threading

Post by sry_not4sale »

Hopefully writing this out will help me understand it a bit more... and to get some feedback.

Goals of threading:
* Prevent the gui from becoming unresponsive during large amounts of serial activity. This is currently caused by packet decoding running in the same thread.
* Prevent a large send or receive action from locking up the serial connection for the other direction.
* Have the serial connection monitored constantly, not just during the gui idle event as it is currently.


Limitations:
* Reading from and writing to the serial connection cannot be done by two separate threads due to locking.
* Calling any method of a thread object other than run() executes it in the calling thread, which locks up that caller while it's being run. Hence all non-near-instant processing should be kept in the run() method (packet to/from raw bytes processing!).


Pythonisms:
* A class can be threaded by extending the threading.Thread class.
* The __init__ method is run when the object is created; the run() method of the class is run in the new thread once start() is called.
* The thread terminates when the run method does, so in essence only the run() method is threaded
* Either the threads must be notified of program shutdown, or run in daemon mode in which they will be terminated on parent thread completion.
* Other methods of the threaded class can be run from other threads
* Python has condition objects for locking, waiting, notifying logic
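
A minimal sketch of the points above, assuming nothing beyond the standard threading module. The Worker class and its attribute names are made up for illustration and are not taken from the tuner's code. It shows that run() executes in the new thread only after start() is called, while an ordinary method such as describe() runs in whichever thread calls it.

```python
import threading
import time


class Worker(threading.Thread):
    """Hypothetical worker: only run() executes in the new thread."""

    def __init__(self):
        # daemon=True: the thread is killed when the main thread exits
        super().__init__(daemon=True)
        self.stop_requested = threading.Event()
        self.run_thread_name = None
        self.ticks = 0

    def run(self):
        # Executes in the new thread, and only after start() is called.
        self.run_thread_name = threading.current_thread().name
        while not self.stop_requested.is_set():
            self.ticks += 1
            time.sleep(0.01)

    def describe(self):
        # An ordinary method: it runs in whichever thread calls it.
        return threading.current_thread().name


worker = Worker()
worker.start()
caller_name = worker.describe()  # runs in the calling thread, not the worker
time.sleep(0.05)
worker.stop_requested.set()      # explicit shutdown notification
worker.join(timeout=1.0)
```

Setting the event and joining is the "notify the thread of shutdown" option; the daemon flag is the fallback if the notification never arrives.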


My current idea:
* One thread for serial, one for receive, one for send, one for gui.
* Serial thread will have a send queue, containing raw packets. The run() method will send the oldest packet in the queue, check the receive buffer, pass any received bytes to the receive thread, then loop again. Does it need to sleep between loops? The serial thread must keep the connected flag up to date, as other threads will poll this continuously and can't wait for the run() method to answer.
* The send thread will use the current protocol interface, but instead of sending packets in each method, it will add them to a queue for processing. The run() method will loop through this queue, creating raw packets, which will then be appended to the serial thread's send queue.
* The receive thread will append all raw bytes received from the serial thread to its own buffer, and process this buffer in the run() method. It would then send completed packet objects to the gui thread, which would offload them to any watching classes.
* The gui thread would do its gui thing, send packet classes to the send thread, and receive packet classes from the receive thread ready for dispersing to watching classes.
* Gui methods that need to wait for responses will need to set up wxPython events, and the receive thread will need to have the functionality to raise events.
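
A rough sketch of the receive thread described above, assuming raw chunks arrive on a queue.Queue filled by the serial thread. The START/END framing bytes and the ReceiveThread name are invented for the example and are not the real FreeEMS packet format. In a wxPython app, the consumer of the packets queue would probably hand each packet to the gui with wx.CallAfter or a posted event rather than touching widgets directly.

```python
import queue
import threading

START, END = b"\xaa", b"\xcc"  # hypothetical framing bytes, illustration only


class ReceiveThread(threading.Thread):
    def __init__(self, raw_chunks, packets):
        super().__init__(daemon=True)
        self.raw_chunks = raw_chunks  # filled by the serial thread
        self.packets = packets        # drained by the gui/controller side
        self.buffer = bytearray()

    def run(self):
        while True:
            chunk = self.raw_chunks.get()  # blocks, so there is no busy loop
            if chunk is None:              # sentinel value: shut down
                break
            self.buffer.extend(chunk)
            self.extract_packets()

    def extract_packets(self):
        while True:
            start = self.buffer.find(START)
            if start < 0:
                self.buffer.clear()        # no start byte: nothing usable
                return
            end = self.buffer.find(END, start + 1)
            if end < 0:
                del self.buffer[:start]    # keep the partial packet for later
                return
            self.packets.put(bytes(self.buffer[start + 1:end]))
            del self.buffer[:end + 1]


raw_chunks, packets = queue.Queue(), queue.Queue()
receiver = ReceiveThread(raw_chunks, packets)
receiver.start()
# Two packets split awkwardly across three chunks, then the sentinel.
for chunk in (b"\x00\xaa\x01\x02", b"\x03\xcc\xaa\x04", b"\xcc", None):
    raw_chunks.put(chunk)
receiver.join(timeout=1.0)

decoded = []
while not packets.empty():
    decoded.append(packets.get())
```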

Hope that all makes sense. Writing it out at least has helped a lot for me :)
Owner / Builder: 1983 Mazda Cosmo 12at (1200cc 2-rotor turbo) coupe [SPASTK]
165hp @ 6psi standard - fastest production car in japan Oct 82
Fred
Moderator
Posts: 15431
Joined: Tue Jan 15, 2008 2:31 pm
Location: Home sweet home!
Contact:

Re: Threading

Post by Fred »

Caution, rambling post :-)

Another goal of threading is to keep the latency of any given piece of code down to a dead minimum. This way stuff is never waiting to start just because something else is operating.

On that note I think you are missing a thread between gui and send/receive. The gui one should just read data to be displayed that something else in the middle is updating. It should send events to the middle with user actions that need handling. The middle should action those things and contain the logic of the app. The gui should be thin and single purpose. MVC where M (the model) = the serial connection and internal representation of the data, not a database, V (the view) = the gui display and action receiver, C (controller) = the logic and middle layer that does the thinking.

The thread that handles the actual serial device should probably have an in and an out buffer with locking on them, or something similar, such that it purely handles bytes from a stream/array. It should have a short sleep in there to wait between bytes. Based on the receive buffer in pyserial, you'll probably want to check the outgoing buffer size before writing to it to ensure you aren't overflowing it by outrunning it. If there is an outgoing buffer like the incoming one, then you should be able to look at the size of it, see how much is free, grab your buffer, push all of it (or as much as there is room for) into the outgoing buffer, and then sleep for the time it takes to send that much or a little less (maybe half as long at minimum). It might even be nice to send it an event to wake it from sleep (can you do that in python like you can in java?) so that it pushes out whatever you just pushed in.
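
One way the byte-shovelling loop could look, as a sketch only. FakePort is a made-up loopback stand-in so the example runs without hardware; its in_waiting, read() and write() mirror the names pyserial's Serial object uses. The 115200 baud rate, the 64 byte chunk size and the "sleep half the transmit time" rule are assumptions picked to illustrate the idea, and the outgoing-buffer free-space check is not shown.

```python
import threading


class FakePort:
    """Loopback stand-in for a serial port (hypothetical, demo only)."""

    def __init__(self):
        self._wire = bytearray()

    @property
    def in_waiting(self):
        return len(self._wire)

    def read(self, size):
        data = bytes(self._wire[:size])
        del self._wire[:size]
        return data

    def write(self, data):
        self._wire.extend(data)
        return len(data)


def transmit_time(n_bytes, baud=115200, bits_per_byte=10):
    """Seconds needed to clock n_bytes out (8N1 is 10 bits per byte)."""
    return n_bytes * bits_per_byte / baud


def pump_once(port, outgoing, incoming, lock, max_chunk=64):
    """Move up to max_chunk bytes out and everything pending in.

    Returns a suggested sleep in seconds before the next call.
    """
    with lock:
        chunk = bytes(outgoing[:max_chunk])
        del outgoing[:max_chunk]
    if chunk:
        port.write(chunk)
    pending = port.in_waiting
    if pending:
        data = port.read(pending)
        with lock:
            incoming.extend(data)
    # Sleep roughly half the time the chunk takes to send, with a 1 ms floor.
    return max(transmit_time(len(chunk)) / 2, 0.001)


port, lock = FakePort(), threading.Lock()
outgoing, incoming = bytearray(b"x" * 100), bytearray()
first_sleep = pump_once(port, outgoing, incoming, lock)  # moves 64 bytes
pump_once(port, outgoing, incoming, lock)                # moves the last 36
```

A real thread would call pump_once() in its run() loop and sleep for the returned time, or wait on an event with that timeout so it can be woken early.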

Typically you would only use the run thread and have the rest of the methods/functions private to that thread such that it's obvious what will return and what won't. Does that make sense in the context of python?

With regards the packet processing from/to raw bytes, it needs to be efficient regardless of whether it is in a thread or not. You need to be able to rattle through 10kB/s of data at low - lowish cpu load such that other stuff has a chance. A thread that hogs CPU still hogs CPU even as a thread :-)

With regards thread termination at shut down, you might want the main.py thing to be just a place where all the threads are started that itself sleeps till it receives some sort of event that needs handling (eg exit). It's an implementation thing for sure, but it seems reasonable to have a single parent that controls the application overall.
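
A small sketch of that parent arrangement, assuming a single threading.Event is used as the exit signal. The worker names are placeholders, and the Timer only stands in for whatever would really request the exit (the gui closing, for example).

```python
import threading

shutdown = threading.Event()
finished = []


def worker(name):
    # Event.wait() doubles as an interruptible sleep: it returns True
    # as soon as the event is set, otherwise False after the timeout.
    while not shutdown.wait(timeout=0.01):
        pass  # one slice of this thread's work would go here
    finished.append(name)


def main():
    names = ("serial", "send", "receive")
    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    # Stand-in for the real exit request arriving some time later.
    threading.Timer(0.05, shutdown.set).start()
    shutdown.wait()  # the parent sleeps here until told to exit
    for t in threads:
        t.join(timeout=1.0)
    return threads


threads = main()
```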

Down to your idea section now.

1) Again, I don't think the serial thread should know about packets. It should just have the single purpose job of taking bytes from one buffer and stuffing them into another in a timely efficient fashion without any serial dead time or excess latency. It does need to sleep, but how long for and based on what is up for discussion :-) All threads need to sleep, usually they are woken for some reason, or poll each time they wake. The advantage to being woken is that they can have a longer sleep period and less or no polling overhead.

2) The send thread. I think you are using thread to describe the whole object. Perhaps you need to keep all the stuff that isn't run out of the thread object such that it is single purpose (just the looping and loading only). That would be more modular. Yes, the other stuff is related, but it's not part of the actual thread as such, so don't put it in a thread class.

3) Same comment for receive, keep the functionality separate.

4) As mentioned above, I think you need a middle layer thread, not just gui.

5) It's true that receive will need to send events, it's equally true that send will need to be sent events.

6) Ignore anything above that doesn't make sense. It was written sequentially and in a hurry, so it could be wrong in places. :-)
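
For point 1, "woken rather than polling" could be sketched in Python with a threading.Condition, which is roughly the counterpart of Java's wait/notify. The ByteBuffer class is hypothetical and only shows the wake-up mechanics; it knows nothing about packets.

```python
import threading


class ByteBuffer:
    """Byte buffer whose reader sleeps until a writer wakes it."""

    def __init__(self):
        self._data = bytearray()
        self._cond = threading.Condition()

    def push(self, data):
        with self._cond:
            self._data.extend(data)
            self._cond.notify()  # wake one sleeping reader

    def pop_all(self, timeout=None):
        with self._cond:
            # wait_for() releases the lock while asleep and re-checks the
            # predicate on every wake-up; it gives up after the timeout.
            self._cond.wait_for(lambda: len(self._data) > 0, timeout=timeout)
            data = bytes(self._data)
            self._data.clear()
            return data


buf = ByteBuffer()
received = []
consumer = threading.Thread(
    target=lambda: received.append(buf.pop_all(timeout=2.0)))
consumer.start()
buf.push(b"\x01\x02")  # wakes the consumer straight away
consumer.join(timeout=3.0)
empty = buf.pop_all(timeout=0.01)  # nothing left: returns after the timeout
```

The timeout acts as the "longer sleep period" fallback: the reader still wakes occasionally on its own, but normally it is woken the moment data arrives.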

Enjoy!

Fred.
DIYEFI.org - where Open Source means Open Source, and Free means Freedom
FreeEMS.org - the open source engine management system
FreeEMS dev diary and its comments thread and my turbo truck!
n00bs, do NOT PM or email tech questions! Use the forum!
The ever growing list of FreeEMS success stories!
sry_not4sale
LQFP144 - On Top Of The Game
Posts: 568
Joined: Mon Mar 31, 2008 12:47 am
Location: New Zealand, land of the long white burnout
Contact:

Re: Threading

Post by sry_not4sale »

Any idea on overheads of lots of threads?
Fred
Moderator
Posts: 15431
Joined: Tue Jan 15, 2008 2:31 pm
Location: Home sweet home!
Contact:

Re: Threading

Post by Fred »

It shouldn't be significant, I wouldn't worry if you are in the single digits or low 2 digits.

Fred.
Fred
Moderator
Posts: 15431
Joined: Tue Jan 15, 2008 2:31 pm
Location: Home sweet home!
Contact:

Re: Threading

Post by Fred »

Some snippets from msn :
(11:00:20 PM) Fred: actually, the send function will want to work a bit differently, won't it. You'll want to send a packet and then either wait a fixed time period for it to process, send out version requests till you get a response (ecu will ignore until it's finished with the last packet), or wait for an ack/reply/error from your command.
(11:00:26 PM) Fred: so it won't be an issue
(11:00:49 PM) Fred: as you'll need to artificially limit your output speed anyway in one way or another
By that I meant, send out version requests to see when it's ready to do more real work and then when you get a version back, send out the next burn command or table etc. Ditto for the delay or waiting for a response, but those should have been obvious.
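
The "probe until it answers" idea might look something like this, as a sketch. The wait_until_ready function, the FakeEcu class, the PROBE bytes and the reply contents are all invented for illustration; the only assumption about the real setup is that replies arrive on a queue.Queue filled by the receive side.

```python
import queue

PROBE = b"version?"  # hypothetical stand-in for a version request packet


def wait_until_ready(send, replies, probe, timeout=0.05, max_probes=20):
    """Send `probe` repeatedly until any reply arrives.

    Returns (number_of_probes_sent, reply) or raises TimeoutError.
    """
    for attempt in range(1, max_probes + 1):
        send(probe)
        try:
            return attempt, replies.get(timeout=timeout)
        except queue.Empty:
            continue  # still busy with the previous packet: probe again
    raise TimeoutError("no reply after %d probes" % max_probes)


class FakeEcu:
    """Ignores the first `busy_for` packets, then answers each one."""

    def __init__(self, replies, busy_for):
        self.replies = replies
        self.busy_for = busy_for
        self.seen = []

    def send(self, packet):
        self.seen.append(packet)
        if len(self.seen) > self.busy_for:
            self.replies.put(b"hypothetical-version")


replies = queue.Queue()
ecu = FakeEcu(replies, busy_for=2)
attempts, reply = wait_until_ready(ecu.send, replies, PROBE)
```

Once wait_until_ready() returns, the next burn command or table would be sent. The timeout per probe is what artificially limits the output speed.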

Fred.
sry_not4sale
LQFP144 - On Top Of The Game
Posts: 568
Joined: Mon Mar 31, 2008 12:47 am
Location: New Zealand, land of the long white burnout
Contact:

Re: Threading

Post by sry_not4sale »

Are you going to implement required responses to all packets?
Fred
Moderator
Posts: 15431
Joined: Tue Jan 15, 2008 2:31 pm
Location: Home sweet home!
Contact:

Re: Threading

Post by Fred »

There are a number of changes needed for the serial stuff and that is possibly one of them. The question is, should I just continue to handle both and leave it as your choice, or should I make the protocol less flexible and force replies? I'll have to think about it more. Right now it will reply to all things that need it, IIRC. Certainly most stuff that isn't a reset command has a possible/probable/always response.

I'll start a discussion on changes to the current protocol fairly soon. I just want to get the JSON out of the way first. Also, the pin out stuff still needs my attention. Jared at least is getting itchy and starting to talk about dreamy things for which there are no developers such as FPGA IO which isn't needed to do anything at all currently. So I need to get onto that again ASAP.

I have 11 days straight off over xmas, so I'll get some solid work done in that period for sure. I'd like to nail the JSON before then though. The hardware stuff is so so hard to do when I get home at 8pm and have to have dinner, prepare for the next day and discuss things on the forum etc.

Fred.
tobz
TO220 - Visibile
Posts: 14
Joined: Sat Apr 25, 2009 10:28 am
Location: Rhode Island, United States
Contact:

Re: Threading

Post by tobz »

Well, it looks like I certainly picked a great post to make my introduction on... haha.

The threading is simple, so long as you keep it simple.

GUI responsiveness is one thing. Over in C# land (my primary environment) when we use WinForms, you have to multithread to not block the UI from updating and keep it "responsive." It's the same everywhere. One notion we have to abide by with WinForms, though, is that all UI actions invariably get executed on the UI thread. I don't know if wxWidgets has the same paradigm (I would hope it does) but that is the separation you would want regardless.

Next is the networking. Again, in Windows/C# land, we deal with IOCP threads and worker threads when it comes to networking. I don't know how well this carries over to Python, and most importantly, serial communication in Python. The general idea is that you have at least 1 receiver thread and 1 worker thread. This is where your throughput comes from: you're never waiting to receive in the worker thread, or waiting to work in the receive thread. Well, insofar as waiting because other things are happening. You're only waiting to do your job, not because somebody else is busy doing theirs on your thread's time.

Usually we scale threads to a percentage of users, or on other metrics. Massive amounts of threads, in the Windows sense, is usually useless when it comes to increasing performance/throughput. Having the number of CPU cores, or double, is probably the most you want to go for thread count. Otherwise, you're getting very diminishing returns. In your case, you'd most likely want/have a minimum of three threads.


A breakdown of the ideal threading strategy:

You'd have your UI thread, which for all intents and purposes is your main worker thread. It handles all the UI logic, UI notifications/event processing, and general things like reading/writing to the configuration file, etc etc.

You'd then have your true worker thread, handling things like IO. This thread is the one you pass off to from the UI thread to do things that would otherwise block your UI thread and make the application unresponsive. This is where calls to send data to the ECU are handled. This is also where data received is finally acted upon.

You'd then have your receiver thread (also referred to as a producer in the Producer/Consumer software pattern). This is the thread that handles low-level IO, working directly with the serial port. This thread handles sending/receiving data, parsing the data to make sure it's not corrupt, etc, and handles the shipping of the data to our consumer, the main worker thread.


Now, because it's critical to how you organize your synchronization points, you have to decide how data will be shared.

The most basic approach is to actually share it. This would be allowing everybody to access a global variable. For something like this, you need a lock. You could do a mutex, semaphore, etc etc. The lock is required by virtue of only wanting one thread to operate on the data at a time. All it would take is the receiver thread storing newly received data, and the worker thread removing data it just parsed from the same structure, to essentially have memory corruption.
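
A minimal Python sketch of that shared-structure approach, assuming a plain threading.Lock as the mutex. SharedBuffer and the b"abcd" test data are made up for the example; the point is only that the read-then-delete in take() must happen as one step.

```python
import threading
import time


class SharedBuffer:
    def __init__(self):
        self._lock = threading.Lock()
        self._data = bytearray()

    def store(self, chunk):
        with self._lock:
            self._data.extend(chunk)

    def take(self, max_bytes):
        # Read and delete under one lock. Without it, a store() landing
        # between the two steps could have bytes deleted that were never read.
        with self._lock:
            taken = bytes(self._data[:max_bytes])
            del self._data[:max_bytes]
            return taken


shared = SharedBuffer()
TOTAL = 4000


def receiver():
    for _ in range(TOTAL // 4):
        shared.store(b"abcd")


producer = threading.Thread(target=receiver)
producer.start()

parsed = bytearray()
while len(parsed) < TOTAL:
    piece = shared.take(16)
    if piece:
        parsed.extend(piece)
    else:
        time.sleep(0.001)  # nothing yet: yield instead of spinning
producer.join()
```

Note that every caller has to know the lock exists and use the wrapper methods, which is the burden the message passing approach avoids.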

The other choice is to use a strategy called message passing. Message passing is, very simply, like voicemail. People need you to do something, so they leave you a message on your voicemail. You can get multiple voicemails, and every time you check your voicemail, you do each thing asked of you for each voicemail. Imagine having 10 people come up to your desk, each asking you to do something. You'd quickly lose track of things, miss things, etc, without any order to the requests. That order would come in the form of a lock. Message passing contrasts that by saying, hey, leave me a message telling me what you want me to do, and I'll get it to you eventually. Now, you still need a way to synchronize the receiving/parsing of messages, but it is internal to the object receiving the messages; the person sending them need not be concerned with anything other than sending the message.

My personal choice is message passing. It's a cleaner approach, in my opinion, and it forces the synchronization into the object getting messages which allows for a sort of.. zero-configuration approach to working with a new module. There are no synchronization points to learn, no corner cases to worry about... just send a message, with a callback if needed, and continue along your way.
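
In Python the voicemail idea could be sketched with queue.Queue, which does its own locking internally so the sender never sees it. The Mailbox class, its method names and the upper-casing handler are hypothetical. Note that the callback here runs on the mailbox's thread; in a wxPython app it would probably need to be wrapped with wx.CallAfter before touching the gui.

```python
import queue
import threading


class Mailbox(threading.Thread):
    """Hypothetical message-passing worker with an internal inbox."""

    def __init__(self, handler):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()  # synchronization lives in here
        self.handler = handler

    def post(self, message, callback=None):
        # Senders just leave a message and carry on; no locks to learn.
        self.inbox.put((message, callback))

    def close(self):
        self.inbox.put(None)  # sentinel: stop after earlier messages

    def run(self):
        while True:
            item = self.inbox.get()
            if item is None:
                break
            message, callback = item
            result = self.handler(message)
            if callback is not None:
                callback(result)  # runs on this thread, not the sender's


results = []
mailbox = Mailbox(handler=lambda text: text.upper())
mailbox.start()
mailbox.post("burn", callback=results.append)
mailbox.post("read", callback=results.append)
mailbox.post("ignored result")  # no callback needed
mailbox.close()
mailbox.join(timeout=1.0)
```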


I'm very interested in the arena of open-source/DIY automotive monitoring/tuning, so hopefully I can help contribute at some point in the future. It looks like you're certainly trying your best to take a clean and logical approach to this whole open-source ECU thing. :)
Fred
Moderator
Posts: 15431
Joined: Tue Jan 15, 2008 2:31 pm
Location: Home sweet home!
Contact:

Re: Threading

Post by Fred »

Hi, welcome along :-)

One question and one statement :

Q) What do you use for message architecture inside a single C# application? When I started writing a java tuning tool (didn't get far due to time etc) I started writing it using JMS internally with activemq. Maybe there was a lighter weight way to do it in java, but that to me seemed scalable to a wider setup than single app. I'm curious about the C# approach to this style. Also, I agree that it is the best approach :-)

S) With regards threads, it isn't quite fair to say you want only as many as you have cores or tasks. You certainly want each major thing that needs doing to have its own thread, but if for example you had 4 cores, 5 types of thing that needed doing, 4 of them were low intensity, 1 was high and slicable, then you would probably want 8 threads to be able to max out all 4 cores with the heavy duty slicable tasks threads. I guess I'm just clarifying more than anything :-)

That's all, thanks for your thorough input!!

Fred.
tobz
TO220 - Visibile
Posts: 14
Joined: Sat Apr 25, 2009 10:28 am
Location: Rhode Island, United States
Contact:

Re: Threading

Post by tobz »

Fred wrote: Q) What do you use for message architecture inside a single C# application? When I started writing a java tuning tool (didn't get far due to time etc) I started writing it using JMS internally with activemq. Maybe there was a lighter weight way to do it in java, but that to me seemed scalable to a wider setup than single app. I'm curious about the C# approach to this style. Also, I agree that it is the best approach :-)
.NET has no real messaging architecture. What I've used is a freely-available lock-free queue implementation (implemented as a singly-linked list) that works off of the x86 architecture's interlocked instructions. The locking is done at the CPU bus level, rather than in software. Either way works, but I wanted performance, since software locks don't scale whatsoever. :p

This is most likely a moot point as I haven't seen any support in Python yet for interlocked operations.

Long story short, it's not so much the architecture behind it as it is the practice. You can pass messages a million ways, but it must be synchronized and you never want to have to deal with the manual synchronization as a message sender, period. That's the baseline requirement.
Fred wrote: S) With regards threads, it isn't quite fair to say you want only as many as you have cores or tasks. You certainly want each major thing that needs doing to have its own thread, but if for example you had 4 cores, 5 types of thing that needed doing, 4 of them were low intensity, 1 was high and slicable, then you would probably want 8 threads to be able to max out all 4 cores with the heavy duty slicable tasks threads. I guess I'm just clarifying more than anything :-)
We're both right, pretty much. Aside from high high thread counts (say, a server application written to support each user with their own thread - very bad, hurts performance more than it helps) there isn't anything terribly wrong with a thread per each major task. Still, keeping the numbers low means the cores can stay saturated. Context switching is costly performance-wise, so having 4 threads on a quad-core (non-hyperthreaded) will always be as fast, most likely faster, than having 8 or 16 threads. This is where staying close to core count for thread count is good.

The only place where I'd detract from that thinking is where it's a number close to the number of cores. If I had 6 - 7 major tasks that I thought should be isolated, I'd probably do it with a thread for each task. Scaling threads to meet concurrency needs always needs to be evaluated for the situation. In .NET's case, the .NET runtime gives you 25 threads per core. This lets you maintain a high level of concurrency, but it's nothing crazy.

Above all of that, however, is the fact you want to be able to avoid costly locking between threads for performance reasons, and because no matter how you lock, you still have to lock, and that leaves room for bugs to be introduced and errors to occur. With the server software I work with, this is hard because we are constantly working from a large pool of threads, and have hundreds of modules, many with their own synchronization points. With the FreeEMS tuner, it's very very simple in comparison, and you have only two - four points where you ideally want to separate work and thus need to synchronize.

Parallelization is by no means a cut-and-dried subject, and now .NET is rolling out dedicated libraries to help parallelize code and provide schedulers/task managers designed to provide on-demand concurrency, work-stealing, etc, to maximize work performed. Python has nothing like this as far as I know, so careful initial design and testing will ultimately lead to the best performance. In reality, though, what needs to be done is very simple, it just needs to be designed properly from the start. :)

I'm actually working on a flowchart/UML diagram/whatever-you-want-call-it of how I think the software should look. I actually haven't even looked at the code, because I want to see how close I come to what exists, and I don't want the existing design to influence how I think it should be designed. Hopefully my thoughts can give a fresh insight to things that could be done differently/better.