Review Request 116461: KConfigSkeleton: avoid calling reparseConfiguration() immediately after creation.

Sun Mar 16 22:12:42 UTC 2014

> On Feb. 28, 2014, 8:41 p.m., Matthew Dawson wrote:
> > While I'm fine with the idea behind this optimization, I worry that this implementation could create situations were a configuration change is not picked up by the system.  For instance, what happens if the user doesn't immediately call readConfig?  This could create some subtle bugs in downstream code.
> > 
> > I had two ideas for how this optimization could be implemented:
> > 1) Lazy load the KConfig object the first time it is used.  Then, in readConfig, the call to be reparseConfiguration could be avoided if the KConfig is created due to its call.  This would retain the current behaviour, while ensuring readConfig reads in the most configuration.  Other uses of the KConfig will have to ensure the KConfig object has already been created, and if the user calls one of those functions before readConfig, they will still double read the configuration.  But since this is just status quo, I'm not too worried.
> > 2) Alternately, create a set of construction functions, like make_shared, that imitate the creation of a KConfigSkeleton subclass, and then reading the configuration through readConfig.  Internally, it can use a private readConfig function to ensure the configuration is no re-read.  This would require changes to applications to avoid the extra configuration call, unfortunately.
> > 
> > I saw RR #115964, and I assume that some of the reductions to the readings of oxygenrc are caused by the sharing the KConfig between some KConfigSkeleton's?  If so, I'd suggest implementing solution 1 for the general case, and then making a special construction function to handle shared KConfig's.  I don't want to avoid calling reparseConfiguration without some warning around this, as it may again cause some surprises.  A new appropriately named function shouldn't be too bad though, as opposed to changing the behaviour of the constructor.
> 
> David Faure wrote:
>     I've been thinking about this too, but what good is a KConfigSkeleton if you don't call readSettings() on it? You can't read any settings from it then, so all you can do is a) keep it around for later or b) use it purely for writing out a new config file. Use case b) is no problem, so I think we're talking about use case a). Yes in theory an app could see a behavior change in that the config file is loaded from disk at the time of creating the skeleton rather than at the time of calling readConfig the first time. But this is why I'm making this change in 5.0 and not in 4.13. I think it's an acceptable behavior (matching KConfig's behavior more closely - it parses in the ctor, not in readEntry) and I doubt it affects many apps, since all I see everywhere is singletons - i.e. the KConfigSkeleton is created on demand at the time where it's first needed, therefore the ctor is immediately followed by readConfig.
>     
>     My alternative idea (let's call it "3" to complete your list) was to pass a flag to the KConfig constructor to tell it "don't parse now", and setting that flag from the KConfigSkeleton constructor. Then readConfig can keep always calling reparseConfiguration(). This would work, right?
>     
>     Your suggestion 1 is somewhat equivalent, but since one of the ctors for KCoreConfigSkeleton takes a KSharedConfig::Ptr as input, it's not applicable, we can't delay the creation of the kconfig within KCoreConfigSkeleton since it's created beforehand by the application.
>     Your suggestion 2 requires changing all the apps, which lowers the benefits of the optimization.
> 
> Matthew Dawson wrote:
>     I agree, it is a weird use case, and the software should probably be adjusted.  However, if an app does rely on that, it is very hard for the author of the software to notice the change, even with the port to 5.  If I just looked at the functions names, I'd expect readConfig to read the file all the time.  Following the principle of least surprise, I'd like to avoid readConfig ever not reading the file.
>     
>     I'm fine with your alternate idea.  I prefer that over my first idea, as it effectively does the same thing while being less invasive.
>     
>     For my second suggestion, I realize its downsides.  I just like following the principle of least surprise.  If your alternate idea is implemented, I believe that would cover most cases.  Suggestion 2 can then be implemented, and its related constructor could be marked deprecated.  This would allow for existing programs to continue working, while allowing developers to change their code to take advantage of the optimization.
>     
>     As I stated earlier, I'm not sure about who KDE wants to handle source compatibility with kdelibs.  I also wouldn't mind just removing the second constructor, forcing all users to upgrade their code.  Since the function is a drop in replacement, it wouldn't be that hard for developers to upgrade.
> 
> David Faure wrote:
>     I don't agree that "readConfig" necessarily implies reading from disk. All of the KConfig API says things like "readEntry", which does not read from disk, but uses the in-memory cache. Doing the same one level up, in KConfigSkeleton, would be rather consistent.
>     
>     Your second suggestion is still a bit unclear to me; some function calling "new Configuration" (assuming that's the name of the derived class, like in kdnssd-framework/src/settings.h) is already what the generated code has with its self() method that takes care of calling the constructor anyway. So the app code is then just Configuration::self()->readConfig(); Configuration::someValue() (see e.g. kdnssd-framework/src/mdnsd-domainbrowser.cpp).
>     I'm not sure what you'd do in there to avoid the double parsing.
>     
>     My own alternative idea leads to http://www.davidfaure.fr/2014/delayedparsing.diff which indeed helps less - still 3 parsings of oxygenrc, because of the three skeletons sharing the same KSharedConfig, and each of them calling readConfig. As you anticipated in your paragraph about RR #115964. What's your suggestion then for this issue? You suggest a named function - but changing names doesn't matter, this is called from generated code there too. See kde-workspace/libs/oxygen/oxygeninactiveshadowconfiguration.cpp (in the builddir) for instance.
>     
>     My initial patch (the one in this RR) fixes that testcase fully because all skeletons share the same ready-to-use KSharedConfig, and none of them call reparseConfiguration in the first call to readConfig(). I.e. with sharing in mind, it makes more sense to say that it's the job of the KConfig to do the initial parsing than of KConfigSkeleton (of which there could be more than one). Of course this approach doesn't help with subsequent reparsings though (on a "config changed" signal) ... each readConfig() will trigger a full reparsing of the config file behind the scenes.... not great. It's starting to sound like we should offer a variant of readConfig() which just never calls reparseConfiguration(), i.e. leave that up to the application to deal with. At least for oxygen that would be perfect :-) For simple apps readConfig() would stay.
>     
>     Conclusion: if you don't like this RR, I'm ok with http://www.davidfaure.fr/2014/delayedparsing.diff + a readConfig(NoReparse) or a separate method for which I can't find a good name right now.
> 
> Matthew Dawson wrote:
>     While readConfig doesn't necessarily imply reading values from disk, the documentation specifically states that it does ( http://api.kde.org/frameworks-api/frameworks5-apidocs/kconfig/html/classKCoreConfigSkeleton.html#a3772e0cd58790c312b8d0802bc78a70c ).  As we both agree, most applications don't care about a change in behaviour here.  I just worry some might in a subtle way that is hard to detect with this change.  If readConfig broke in obvious way because of this change, I wouldn't care as developers would know to make this change.
>     
>     For my second suggestion, I'm basically suggesting a static function that looks like:
>     template<typename T>
>     T *makeConfigSkeletonWithoutReparse<T>(KSharedConfig::Ptr ptr)
>     {
>       T *obj = new T(ptr);
>       obj->privateReadConfigWithoutReparse();
>     
>       return obj;
>     }
>     The oxygen code can just call this function to get the right object back, and the privateReadConfigWithoutReparse will avoid the reparseConfiguration call.  Generated code from kconfig_compiler can easily use this for construction, and projects can automatically pick up the enhancement.  It would be similar to using the KConfig::openConfig call.
>     
>     I'd be fine with an alternate readConfig (name suggestion: parseConfig, thus (re-)parsing the given KConfig).  Then my second suggestion can just go away.  I do worry that readConfig is currently a virtual, and thus parseConfig won't make use of it.  Probably best would be to make readConfig non-virtual, and make it basically call KConfig::reparseConfiguration, then call parseConfig.  Then users just override parseConfig instead.  Code make break because of this, but any code that moves to using c++11's override will break in an obvious way.  Any other code will break if readConfig is called through virtual dispatch, so it should be relatively obvious.  Add some documentation updates, and it I think it will go over ok.
>     
>     For delayedparsing.diff, can you throw it up on RB?  I'd like to get it in too, but I have a few thoughts on it and I rather do that in a separate request.
> 
> David Faure wrote:
>     I don't like the template-factory-method, it makes for horrible API IMHO :)
>     
>     But I love your parseConfig() suggestion, so I shall go ahead an implement it. It is much better because it fixes two issues: the N reparse on startup when sharing the same config object between N skeletons, but *also* the N reparse whenever we are notified that the config changed. (all my patches until now didn't fix that). If e.g. oxygen takes care of calling reparseConfiguration by itself upon a change, then it can simply call parseConfig in all skeletons.
>     
>     The one thing that might be confusing is the naming:
>      * in kconfigskeleton,  read(Config) = from disk,  parse(Config) = in memory
>      * in kconfig,  read(Entry) = in memory, reparse(Configuration) = from disk
>     !!!
>     
>     If only I could rewrite the past, I would swap that in kconfigskeleton.
>     
>     Alternatively, it's probably better to find another name for the new method... loadConfig? (as in, loading from kconfig to skeleton members)
> 
> Matthew Dawson wrote:
>     I'm not sure about loadConfig, as it kinda of implies loading from disk, so I think the best solution would be rewriting history :)
>     
>     Instead, maybe the best is to use use loadConfig to mean reading from disk.  So the matrix looks like:
>     
>      * kconfigskeleton: load(Config) = from disk, parse(Config) = in memory
>      * kconfig:         (re?)load(Configuration) = from disk, read(Entry) = memory -> though maybe even change this to parseEntry?
>     
>     Assuming readEntry stays the same, the load* functions are just a rename.  So the old names can be marked as deprecated, and the documentation points to the names.  SC/BC is simple, since the one function just calls the other.  And people still using readConfig get a huge warning about the fact they need to change to parseConfig.
>     
>     For more symmetry, readEntry can change to parseEntry, with a similar deprecation.  I'm not sure this is really necessary, as parseConfig is different from readEntry.  Also, this change can be done separately, so it shouldn't hold up the more useful optimization.
>     
>     Alternate thought: why are the methods even named readEntry?  That implies they actually parse the data returned.  They could just as easily be named fetchEntry, getEntry, or just entry.  writeEntry is the same, it doesn't actually hit the disk, so it could be put* or set*.  Thoughts?
> 
> David Faure wrote:
>     I like load(). Anyone seeing two calls to load() on the same object will wonder about the possible bad effect on performance, since it sounds heavy :)
>     
>     Renaming readEntry: veto! This is the most called method of them all, the porting effort would be huge, just to make the two of us happy with naming :-)
>     And read/write from memory doesn't seem so crazy (QBuffer, QMetaProperty, Q*Item etc. do this too).
>     
>     Overall I like your suggested matrix, but now we should look at minimizing the porting.
>     
>     KConfig::reparseConfiguration() wasn't misleading, before KConfigXT. The only reason for "reparsing" is reading the ini file on disk an updating the in-memory cache. However the problem with KConfigSkeleton is that it's one layer on top of KConfig. So parsing/loading there is all about getting stuff from KConfig. So yeah, to align naming, I could see loadConfiguration() [Qt says: no abbreviations] in both layers for "loading from disk". (Not "reload", given the new Delayed flag (to be renamed to DelayedLoading), in which case this would also be about the initial loading from disk).
>     
>     It just occured to me that all this "Config" or "Configuration" in the method names is rather useless, this is about KConfig* anyway.
>     So how about just load()?
>     
>     With all this in mind my current suggestion for a game plan would be:
>     1) New method KConfigSkeleton::read() for reading from KConfig without loading (i.e. without disk usage)
>     2) KConfigSkeleton::readConfig() deprecated for new name KConfigSkeleton::load()
>     3) KConfig::reparseConfiguration() deprecated for new name KConfig::load()
>     
>     What do you think?
>
> 
> Matthew Dawson wrote:
>     I agree the porting effort for renaming readEntry would be huge.  I just threw it out there as it popped into mind.
>     
>     I like your final game plan, so I think it is the best path forward.  I just have a couple of additions.
>     
>     Since we are going down the renaming path, I was thinking of the write side.  I think it be good to standardize the naming.  Right now KConfig uses sync(), while KCoreConfigSkeleton uses writeConfig().  Since writeEntry() exists, and is the larger porting effort, I think both should become sync().  So, the game plan would get:
>     
>     4) Deprecate KCoreConfigSkeleton::writeConfig() for the new name KCoreConfigSkeleton::sync().
>     
>     Also, don't forget the usr*Config functions.  I don't think they should just be renamed, as quite a bit of software relies on them.  I think replacement functions could be created, and those replacements call the originals.  So, also add:
>     
>     5) Deprecate KCoreConfigSkeleton::usr*Config() for KCoreConfigSkeleton::usrLoad()/KCoreConfigSkeleton::usrSync().

Hmm, sync() could sound a bit like what load() does, i.e. "the file changed on disk, sync the kconfig object with it, i.e. load it". But of course historically it didn't mean that.

Since it's the opposite of load(), naming it save() would have seemed much more intuitive to me.

Especially since (see documentation), while it merges with independent changes on disk, it doesn't actually load these changes into the object. So it's really just a save(), not a sync=save+load.

In suggestion 5, you mean adding non-virtual forwarding methods? But these are protected virtual methods, so it would be pointless. IOW the only API there is "what should I reimplement", which can only be one or the other...

- David

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/116461/#review51374
-----------------------------------------------------------

On Feb. 27, 2014, 9:18 p.m., David Faure wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/116461/
> -----------------------------------------------------------
> 
> (Updated Feb. 27, 2014, 9:18 p.m.)
> 
> 
> Review request for KDE Frameworks and Matthew Dawson.
> 
> 
> Repository: kconfig
> 
> 
> Description
> -------
> 
> KConfigSkeleton: avoid calling reparseConfiguration() immediately after creation.
> 
> KConfig already parses the config files from disk in the constructor,
> which is necessary for non-KConfigXT users. However when using KConfigXT
> the first thing one has to do after creation is to call readConfig(),
> which should therefore not call reparseConfiguration the first time.
> 
> strace -e open kate |& grep -v NOENT | grep oxygenrc | wc -l
>   went from 4 to 1 --> bingo, goal reached!
>   (and when looking for kdeglobals, from 10 to 7)
> 
> 
> Diffs
> -----
> 
>   src/core/kcoreconfigskeleton.cpp 9c5fb4a80d500e81b483b749a137ad5f2c99a55f 
>   src/core/kcoreconfigskeleton_p.h 0b020ed3493186e699d872ddc7a9f9294d797a95 
> 
> Diff: https://git.reviewboard.kde.org/r/116461/diff/
> 
> 
> Testing
> -------
> 
> (see commit log) + unittests in kconfig still pass.
> 
> 
> Thanks,
> 
> David Faure
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/kde-frameworks-devel/attachments/20140316/11ccc4bd/attachment-0001.html>