Hey folks, todays post will feature some ideas, advises and conclusions about task-based concurrency programming (mostly in C++).
First of all, what actually is task-based concurrency? If it comes to threading there are basically two and a half approaches you can choose from: multithreading, multitasking, – multiprocessing.
Multithreading: This approach relies on multiple threads on which the developer can spread his workloads to reach an even resource usage and optimal performance. This approach is very low-level and requires a lot of knowledge and care of the person who is coordinating the threads. He always have to know: How many threads do I have, which threads are doing what and also (he better knows) where his threads are running. (On which cpu core for instance)
Multitasking: This one is a more high-level approach. There are some ways to get things done here. The first way: thread pooling. The developer initializes a pool of threads which keep active after they finished their work so he can pass them tasks and don’t have to create new threads all over. He (usually) don’t knows which thread is executing which task (and he also doesn’t care about it) since he just queues tasks up in the pool and the pool decides where they are going. The Second way: One could use wrappers like C++ std::async. Asyncs are (kind of) used like a thread pool. There is simply one difference. Not the developer but the “std::async framework” decides where or how the tasks are executed. If you hand a task to a std::async it could also be executed on the same thread where the result is required until you don’t explicitly tell the async object to behave another way.
Multiprocessing: You spawn multiple processes and let them communicate with each other. You are (kind of) running multiple programs wich do multiple tasks.
I will focus on multithreading/tasking in this post since I have little to no experience with multiprocessing.
Okay, this might sound like you have very little differences from multithreading to multitasking with the exception that multithreading seems to be more work. And in some way, this is actually not to far away from reality. But, and here is the point, there are reasons for both to exist.
A Realworld example for multithreading:
MessageLoop and GUI/Game thread. Let’s consider we have the following code:
We have our main thread which is basically our main function and we have our game thread, which is spawned and will run until the main thread exits it. The only communication between the two threads is managed through the function calls on the game object and the join on the game thread.
And an example for multitasking:
We have our game thread which does some logic and which is containing the game loop ( run() ) and a thread pool which is executing little independent tasks (in this case updating actors) which won’t report back to the game thread.
Alright, let’s see: On our msgloop/gamethread example exactly know which thread is dealing with which piece of work (main() with the msg loop, gameThread with the gameloop). But how does is look for the update example? Well, until we try to figure out which thread of our pool is executing which task we would have to call any kind of thread identification from our task, so simple we don’t know who is who.
And these two examples are good for two things: Giving you guys an example where which approach may fit and giving you an example how these approaches are working.
In example one each thread executes some work completely independent from each other and sometimes they communicate to trigger actions on the other one which is basically a simple event handling pattern.
In example two one thread is creating tasks, the others are just “consuming” the tasks the game thread generates. That’s why it’s called the producer-consumer pattern.
Both of those patterns are suited for different use cases as you might recognize. The first pattern is very handy if it comes to things like controls or long-term work wich can be handled parallel or any kind of input loop or audio output, the producer-consumer pattern on the other hand is better suited if you need some small pieces of independent work to be done in parallel to the work of your long running other threads, like drawing, animation updates or simple outputs.
The p-c-pattern benefits a lot from lock free tasks but suffers even more if you need to synchronize the tasks you are pooling, since if you are locking a pool thread, you also lock some future tasks from being executed while you can easily create a new worker thread if you are managing your thread lifetimes on your own.
The best performance you can reach with a thread pool depends on the continuous and spike free stream of tasks you assign to your pool. Let’s display this in a little image:
In this image we got 4 pool threads which are executing work and 4 more tasks which are queued up. The next step would be each thread would grab one tasks and the queue would be empty again (or 4 more tasks could be queued). Pretty optimal isn’t it? All work is getting done fast and clean.
And this is the way you won’t like it:
Again, 4 threads, but this time there are 6 tasks scheduled. Now let’s consider 4 are executed, 2 remain in the queue 5 more are getting added. As you might recognize (easy calculation 6 – 4 + 5 = 7) now we got 7 threads in the queue. And that’s the danger with pooling up threads and using task queues. If you throw too much work at a pool which is to small you will generate a growing backlog which will hurt your performance pretty bad. And that’s the reason why the std::async object has the option to spawn additional threads. Basically it’s the hybrid from pool to thread. I personally don’t like it too much since the standard constructor does not guarantee a parallel execution. Nevertheless, with some tricks (like the launch flag) it can get pretty handy since (at least the MSVC++ version) internally relies on a thread pool with the option to spawn a new thread if the pool is under heavy load. I have to admit, I did what each professional programmer would call bad style and implemented my own one with my own rules so I basically know how it will behave since I feel pretty confident with C++ as such but well that’s nothing I can recommend to others…
Ok, that was tasking, threading on the other hand needs a lot of synchronization to work well and sometimes it might be better for your performance since you don’t have to rely on such high-level constructs like async or pools, but well… it’s pain in the ass to get it working as intended. Your parallelization is good as long as you don’t need to access resources from multiple threads but even then, you won’t block other threads from spawning. You always have to remember: If you try to access a locked item, actually your program gets serial instead of parallel. The trick is to get your work lock free and good timed. Try to split task well thought to threads to minimize data sharing between threads and you will se a huge gain.
Usually I would hand you a picture about regular threading now but this time I decided to simply leave it be since in my opinion you can imagine it pretty well your self.
Try this. You got a railway with a crossing in the middle where the rails change the sides. (Don’t ask why, they simply do.) Now you have a train on each of the rails. Well, one of them will have to wait until the other one is passed since they don’t want to crash. And that’s exactly what your locks do. Now imagine you would do those change-of-side very often. The trains would take a lot longer to the end of the track if they would try to go parallel as they would do if they would go one after another since both could go, none of them had to wait.
Super beautiful metaphor of mine with one simple conclusion. Sometime it’s better to go for single threaded application instead of forcing threading into a super complicated program with a lot of locking work to do.
I hope this one were helpful to some of you. If you have any ideas, suggestions or criticism, feel free to comment.