Amarok 3 architecture brainstorm (was: GSoC considerations)
Matěj Laitl
matej at laitl.cz
Thu May 2 15:25:27 UTC 2013
On 12. 4. 2013 Edward Toroshchin wrote:
> First of all, thanks for reading everything I've written under
> intoxication. And for taking seriously the need of designing the
> architecture for the future.
>
> You also present some good ideas. However, I am sure that the process
> split is compatible with your solution, and I'll defend it hereafter.
>
> You basically say that almost all of the benefits may be achieved
> without the process split. This is of course true. But you don't say
> what is wrong with the process split and why we shouldn't do it.
You've basically answered yourself - the process split would be superfluous if
the architecture cleanup materializes in the way it was outlined, IMHO.
Fortunately enough, 95% (if not more) of the changes I proposed are
prerequisite of the process split, so we can just slowly start implementing
these and postpone the process split decision until it is done.
> On Fri, Apr 12, 2013 at 04:29:52PM +0200, Matěj Laitl wrote:
> > That won't go away, unless you disable threading entirely (which you
> > probably don't want, moving all our async code to separate processes
> > could be a lot of work). But see below.
>
> Yes, the threads in UI would (probably) stay. But most of all the other
> components (playback engine, databases) could remain completely
> single-threaded.
Well, this wouldn't change either in my proposal. (well, in fact, the
EngineController is already heavily threaded, but you don't see it - Phonon is
asynchronous and its backends are allowed to create threads - for example pgst
creates a plethora of them)
> Having threads in UI code is quite common and every UI developer should
> be comfortable about that. But when each component is allowed to create
> its own thread, the complexity grows beyond reason.
The components as outlined in my proposal would "live" in the main thread, but
yes, nobody would prevent them from creating their own (private) threads. It
is their job to ensure that their threads only call thread-safe API of other
Amarok components. My proposal would even help in this case, because they can
legally only call "core" API of other components, whose thread safety would be
much better documented.
I don't think that this is a real problem, threads of component A would be
largely invisible to components B - as it is currently the case for example
with EngineController - most Amarok devs don't really know it uses threads.
> We are an open source project. Complexity is our enemy: the existing
> developers have to spend extra unneccessary effort and time (which is
> unpaid), and new developers are forced to read all the code before they
> can fix a tiny bug. (I understand that the last statement holds more due
> to the intertanglement of all the code rather than due to running
> everything in the same process, but it still holds)
I completely agree with this. Process split would be the complexity for me.
> How many of the currently active developers _really_ know how many
> threads are out there, and what do they do at any given moment?
Perhaps none of us. But that's entirely fine, the responsibilities are
different: whey *you* introduce a thread, *you* must ensure it will only call
thread-safe or reentrant API in a correct way. Again, documented "core API"
would help here to make this easy.
> I think having at least some components that aren't physically subject
> to deadlocks, race conditions, memory leaks, etc. would be beneficial in
> itself.
Yes, but I see multi-process design even more prone to deadlocks and race
conditions. For memory leaks, checking memory leaks when unit-testing and
tools like valgrind's massif and its visualizer may help fighting them in a
monolithic program.
> > For me, multithreading *is* the fun part and classes like
> > ThreadWeaver::Job make it easy.
>
> Make what easy? Writing code, yeah. But the code is most of the time
> read and executed not written. So I wouldn't say that it's easy.
The idea is that you would be able to easily check validity of threaded code:
1. for each ThreadWeaver::Job::run() reimplementation code statement:
* if it uses locally-constructed per-thread object: continue
* [uses global object from a different thread]: mouse-over the call:
* does the docs contain "This method is thread-safe"? if so, continue
* else: investigate -> ( realize it is thread safe || make it thread
-safe || cease using it from another thread. ) && update
documentation
> > I think this is tied to more general idea of better independence of the
> > components rather than to IPC suggestion specifically. When it comes to
> > debugging, I think this is very subjective and varies case-by-case, for
> > example debugging out-of-process kio slaves in gdb sense is all but
> > pleasant [1].
>
> It doesn't say that debugging is unpleasant, it just says that since the
> processes are started somewhere inside other process, you have to jump a
> little to attach a debugger. Big surprise.
Have you actually debugged a kio in the past? I did, and it was far more
inconvenient than debugging a monolithic application.
> > Hmm, right, but what would be an advantage of that?
>
> First advantage is that we know precisely how much resources each
> component uses. For example, if amarok now would eat up 500M of the RAM,
> or run at 100% cpu, would you know what's caused it.
Profiling will give you much finer granularity than "oh, so the process of
CollectionManager it is". Linux's perf tool can profile without any speed
sacrifice.
Valgrind's massif will give you much finer granularity of heap usage profiling
than "oh, so the playback process it is".
> Then we would also learn how to limit it in each case. Personally, I
> believe playing music is not a task that should use more RAM than some
> videogames. At least not all the time.
Me either. That's why I designed memory-management of components and their
reclaim when unused. No more "let's instantiate this bunch of singletons so
that they are available in future".
> > I think that summary memory usage of the IPC solution would be higher
> > - a little duplication of otherwise shared data is IMO inevitable.
>
> Assuming amarok is currently leakproof, yes. But I don't need to run
> valgrind to doubt that.
Process split alone wouldn't fix the leaks. And I doubt it would help
significantly to fix them. "this process is leaking" is way too much coarse.
> > Please spend a few moments thinking about it.
>
> Your points are good, and I think quite important (and some neccessary)
> for a successful process split.
Good.
> > * when you minimize Amarok to tray, the whole mainwindow component
> > could be deleted; starting Amarok to tray should be blazing-fast as
> > the bulk of gui won't be loaded until you open it.
> >
> > * when you stop playing, the enginecontroller and perhaps all phonon
> > thingies could be destroyed. (questionable)
>
> True, but in IPC case the processes themselves could be stopped, thereby
> releasing all the leaked memory, static resources, libs and whatnot.
Yes, that would be a slight advantage. But let's hope we'll be able to fix the
most offending leaks. OTOH this comes with a cost: responsiveness. releasing
static resources, libs, etc. means you have to acquire them next time, and
this can take surprisingly long (for example linking in shared libs on program
startup can be rather expensive, that's why prelink exists). For example I'd
be willing to wait 0.2 secs for main window to show up from dock/playback
starts, but I wouldn't be willing to wait 1.5 s for it.
Because I believe that the bulk of our memory usage is heap allocated by
Amarok code/mysqld cache, and because read-only portions of linked libraries
are shared among processes, I believe that reclaiming components (as opposed
to processes) is a good compromise.
> > The single most important advantage if this approach is that it could be
> > done *gradually*.
>
> Well, it could be done *gradually* before a process split. Your solution
> requires removing a library at the end, which is as abrupt as a process
> split. So I don't really see a difference there.
You're partially right. But we can gradually move classes around from let's
say libamarok to amarokcore or qwidgetui just by moving the core around and
making slight changes to CMakeLists. I don't know how easily can be moving
classes around processes, but I suppose that IDE cannot do it for you.
Removing libamarok would be a milestone, however the process to do it would be
to gradually remove classes out of it, so the change shouldn't be really
"abrupt" as you would be essentially removing an empty library.
> Having said all that, I should emphasize, that your views on components
> and memory usage rules are quite good and I agree with them completely
> (at least in general, there are some details that should be polished
> out, naturally).
Good. While it may look as the contrary, I actually think we have a very
similar goals and approaches, just with one technical difference (to split or
not to split, that's the question). As I've said above, that question doesn't
have to be answered until very late in the process.
I'd also like to have feedback from other devs rather than a 2-man discussion.
Bart, Ralf, Markey, Sam, Sven, Téo!
Many of us seem to be coming to Akademy, maybe we can actually start a bit of
hacking there? (not just discussing)
Cheers,
Matěj
More information about the Amarok-devel
mailing list