[Kde-hardware-devel] [GSoC] Preliminary UPnP support proposal

Mon Apr 26 09:11:22 UTC 2010

Hello,

On Sunday, 11 April 2010, at 19:08, Tuomo Penttinen wrote:
> Hello all,
> 
> On Fri, April 9, 2010 4:20 pm, Friedrich W. H. Kossebau wrote:
> > On Friday, 9 April 2010, at 03:15, Tuomo Penttinen wrote:
> >> On Wed, April 7, 2010 3:09 pm, Friedrich W. H. Kossebau wrote:
> >> > On Tuesday, 6 April 2010, at 23:41, Kevin Ottens wrote:
> >> >> On Wednesday 31 March 2010 14:42:03 Tuomo Penttinen wrote:
> >> >> > I agree. The UPnP discovery protocol is lightweight enough that I
> >> >> > wouldn't worry about it in most environments.
> >> > 
> >> > Emphasis on _most_ environments ;)
> >> 
> >> Indeed. ;)
> > 
> > And add "today" ;)
> 
> Well, if it isn't a problem today, I'd say it's less likely to be a
> problem tomorrow, right? ;)

With more and more mobile devices, especially on the same local network,
I'd say: well, wrong :)

> >> Second, in the worst case the registrar would have to inform every
> >> registrant upon receiving a SSDP message via some IPC mechanism anyway.
> > 
> > Yes, but just the worst case (really, what kind of registrants would be
> > interested in all devices besides device browsers?).
> 
> And registrants who just don't care or want more control and want to make
> the decisions by themselves.

Registrants like what? 

> Device sniffers and validators probably
> belong to this category as well.

Special cases, no?

> >> Third, UPnP discovery is lightweight compared to the description phase,
> >> action invocation and eventing, so I'm not sure caching the results of
> >> discovery alone will have any meaningful impact on a large scale. I say
> >> this because if an application is interested in SSDP messages it is
> >> probably interested in UPnP in general, which includes action invocation
> >> and eventing. You can't really cache these and incidentally these two
> >> are potentially far worse resource consumers.
> > 
> > This doesn't stop SSDP from being outsourced to a central process with
> > some gain.
> 
> No, but my point was that optimizations are best targeted to issues that
> measurably matter. Inlining a call to a function that performs a
> bubblesort doesn't change the fact that the algorithm is still quadratic.
> ;)

Sure, but here UPnP = bubblesort, and the Cagibi approach = inlining, no? So
let's replace UPnP ;)

[snip content="some useful info about SSDP"]

> >> >> > In my opinion, even the description
> >> >> > phase doesn't add that much of network load that I'd want to pay
> >> >> > the price of added complexity, which the inter-process caching
> >> >> > solution introduces.
> >> >> 
> >> >> Fair enough, let's avoid premature optimization. And it's all buried
> >> >> in the backend anyway so it's not like we're going to break BC later
> >> >> on if we introduce complexity for the optimizations.
> >> 
> >> My thoughts exactly.
> > 
> > Well, having done some work already in this area I don't think this is
> > premature now. Remember Solid is linked to almost every KDE program.
> > Which means the backend to integrate UPnP stuff is active in many
> > processes, even those not interested in UPnP devices. And for the UPnP
> > backend to work I have to have a complete cache of all UPnP devices
> > there.  When I played with Cagibi as a lib for the Solid UPnP backend I
> > had to link Solid also to QtNetwork. With Cagibid as a daemon I don't.
> > And I also do not need to maintain a local cache, just like the HAL
> > backend on demand relays requests to the HAL daemon the Cagibi backend
> > relays requests to the Cagibi daemon. The code has become even smaller
> > compared to before. Also for the network:/ kio-slave this is all that is
> > needed. It just cares for the description and the presentation url,
> > never interacts with the devices itself. The overhead for UPnP listing
> > this way is just around a hundred lines of code altogether.
> 
> Unfortunately I can't really comment on this due to my lack of understanding
> of Solid. Although I must say that I don't like a situation either where a
> process *completely* uninterested in UPnP has a link-time dependency on a
> UPnP library. That should be avoided if at all possible.

Disclaimer: I am not yet a real Solid expert.
Solid as we discuss it here is basically one single hardware abstraction
abstraction library (tm). It wraps the operating-system-specific hardware
access libraries with a common Qt-style API, to achieve "code once, compile
everywhere".
So if you are interested in the status of the network (as KMail or Konqueror
are), you use and link to the Solid lib.
Or if you want to list the attached storage devices (as, indirectly, all
programs using a file dialog do, so almost all), you query Solid for these
devices and link to it.

So, unless Kevin or someone else implements on-demand loading of backends,
doing the UPnP backend for Solid with a full UPnP library would indeed mean
that almost all programs link against it.
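
To make the on-demand idea concrete, here is a minimal sketch in plain C++ of
a registry that constructs a backend only when it is first requested. All
names (Backend, UpnpBackend, BackendRegistry) are hypothetical and not Solid's
actual API; a real implementation would presumably have the factory load a
plugin instead of constructing an object directly.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical backend interface; not Solid's real API.
struct Backend {
    virtual ~Backend() = default;
    virtual std::string name() const = 0;
};

// Stand-in for a backend that would pull in a full UPnP library.
struct UpnpBackend : Backend {
    std::string name() const override { return "upnp"; }
};

// Holds cheap factories and constructs a backend only on first request.
class BackendRegistry {
public:
    using Factory = std::function<std::unique_ptr<Backend>()>;

    void registerFactory(const std::string& key, Factory factory) {
        factories_[key] = std::move(factory);
    }

    // Returns nullptr for unknown keys; otherwise creates the backend
    // on first use and reuses it afterwards.
    Backend* get(const std::string& key) {
        auto live = loaded_.find(key);
        if (live != loaded_.end()) return live->second.get();
        auto fac = factories_.find(key);
        if (fac == factories_.end()) return nullptr;
        auto inserted = loaded_.emplace(key, fac->second());
        return inserted.first->second.get();
    }

    std::size_t loadedCount() const { return loaded_.size(); }

private:
    std::map<std::string, Factory> factories_;
    std::map<std::string, std::unique_ptr<Backend>> loaded_;
};

// Usage: nothing is constructed until a caller actually asks for "upnp".
inline void exampleUsage() {
    BackendRegistry registry;
    registry.registerFactory("upnp", [] {
        return std::unique_ptr<Backend>(new UpnpBackend);
    });
    assert(registry.loadedCount() == 0);
    Backend* upnp = registry.get("upnp");
    assert(upnp != nullptr && upnp->name() == "upnp");
    assert(registry.loadedCount() == 1);
}
```

With something along these lines, a program that never asks for UPnP devices
would pay only for the registration of the factory, not for the backend.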

> >> > Currently KDE software basically will be a client to services from
> >> > UPnP devices (being control point in UPnP terms). So if there is a
> >> > convenience lib one should be just for client stuff IMHO. Server stuff
> >> > should be handled by a different lib. I suppose that code for server
> >> > stuff is larger and would just be unneeded payload for most
> >> > applications (also in disk size). (not sure how the P2P situation with
> >> > mobile devices proves me wrong here)
> >> > 
> >> > We also wouldn't put http server stuff into the http access lib, would
> >> > we?
> >> 
> >> Actually, that's not entirely true. UPnP eventing requires a control
> >> point to listen for asynchronous events published by UPnP devices. The
> >> protocol is called "GENA", which is layered above HTTP and it requires
> >> minimal HTTP servers on both sides. In addition, I wouldn't say that the
> >> server stuff is larger or more complex compared to a proper control
> >> point. In many ways a UPnP device is actually more straightforward to
> >> implement compared to a control point that implements the UPnP stack in
> >> full and provides some type of an API for users.
> > 
> > Now, I would also put the eventing stuff into a dedicated proxy process,
> > if only for firewall and security reasons. Or am I the only one to
> > consider it a less good idea to have a full UI program with my user
> > rights accessible from the network?
> > Just that I have no real clue yet how this could be done best. Due to
> > authorization stuff there possibly should be one central process per
> > user, not globally.
> 
> This is another very interesting topic for discussion. :) I'd say it is
> about UPnP security (or lack of it) in general, not just about listening
> sockets. I think it is fair to say that the UPnP base design is not
> secure. There are the DeviceSecurity and SecurityConsole device templates
> to address at least some of the more prevalent security issues, but the
> base architecture is not secure. Because of that the use of UPnP without
> additional security measures in a public network is an inherent security
> risk no matter where you offload the socket code.
> 
> So before getting too carried away with security issues related to UPnP, I
> think it would be important to define the use-cases and requirements to
> find out how you are really going to approach the UPnP world. These should
> help in defining the security requirements and responsibilities of the
> system. If the requirements define proper security as an important system
> design attribute and that it is something *you* are responsible for,
> you're going to need whole lot more than the aforementioned offloading of
> socket code into a presumably safe process.

So what are the use cases you have designed and developed HUPnP for so far?

> >> The HUPnP shared library is about 1.2 megabytes built on my 64 bit
> >> kubuntu machine. I haven't optimized for the binary size yet, so there
> >> could be some leeway. The object codes for control point and upnp device
> >> functionality are roughly equal in size and they actually share quite a
> >> bit of the code base, including the "device model" users code against on
> >> client and server side. Taking purely the server stuff out would
> >> probably yield a save of 300-350 kilobytes. Certainly, purely UPnP
> >> device (server) stuff isn't required at a client that is purely a
> >> control point. On the other hand, there are valid use-cases where a
> >> client application will need both the server and client code.
> > 
> > Poses the question if it is a good idea to implement both controlpoint
> > and device in the same instance. Again for firewall and security reasons
> > I would run the server stuff in a separate program/process. Don't you
> > think this is a valid concern?
> 
> As noted above, I don't think security in this case is this simple. But
> since I find data/information security fascinating, I can't help but bite
> and forget that. ;)
> 
> So, I'm guessing you mean there's more attack surface when client and
> server code are run in the same process? Basically you're indicating that
> it increases the possibility for an attack, where the exploitable bug is
> either in client or in server, but the input that triggers the control
> flow leading to that would come in from the other? I admit, this *could*
> be more prevalent when the two are in the same process, but there are no
> guarantees that the malicious input can't reach the exploitable code even
> if the two live in different processes. Furthermore, separating the two as
> described definitely incurs a performance + complexity overhead, which
> might matter.

The latter is the price of more security. The idea behind separate processes
is that the server process would run as a low-privilege system user, so the
impact of an exploit is reduced.

> >> For instance, there could be a device or
> >> an application that is both a media renderer (a UPnP device) and a
> >> control point. The control point fetches the content from a media server
> >> and uses the built-in media renderer to render the content.
> > 
> > The program implementing this kind of UPnP device simply links to both
> > libs, control-point and device-impl. But this kind of device is an
> > exception, not the rule, isn't it? I expect most UPnP usages in KDEs
> > software to be controlpoint-only, so the server part in a single lib
> > would be just bloat to them.
> 
> You have a point there, especially if the library is not deployed
> system-wide, but as stand-alone with programs that need it. In system-wide
> deployment the presence of bloat is far less evident, since the library is
> already in the system and a client application that does not use the
> server part will not contain nor reference any part of the server code at
> runtime. The linker most certainly will not index unreferenced symbols for
> the loader to load. In system-wide deployment I'd say the point is truly
> valid only if no program uses the server part or the server part is used
> very rarely.

True. Unless the server part has static data which gets initialized on
library load, but that could be avoided.

> That being said, of course such a separation could be done. In case of
> HUPnP it would mean that the code would be separated into three separate
> libraries; one for client, one for server and one for common used by both,
> each roughly around 400kb. But this is where opinions start to fly. Some
> want shared libraries small, some a bit bigger and some even very large.
> Consider any Qt library for instance. When you link to QtGui you rarely
> (if ever) make full use of it, but it still makes sense to have a single
> shared library instead of ten, even if it is a bit bigger and contains
> "bloat" to you in some case. I could throw in a joke or two regarding
> one's "preferences" concerning size, but I probably should not. ;)

I try not to imagine what you were thinking of ;)

Other than that, yes, it's hard to decide where to split. I don't have metrics
here either.

> >> > Doing the discovery in a proxy/cache not only avoids work duplication
> >> > (increase of resource needs per process), it has also the advantage to
> >> > be faster (cache). And it is done for other discovery systems, too:
> >> > There is Avahi for DNS-SD, as a central small proxy process (our code
> >> > layer to it is KDNSSD).
> >> 
> >> It could be faster if properly implemented, I agree, but are such
> >> (possibly very small) gains in speed in this particular area worth the
> >> trouble? Furthermore, if the work to be done is very small the benefits
> >> of avoiding it tend to be very small as well... ;) I'd still be inclined
> >> to properly profile and test the need for such a system before actually
> >> doing it. :)
> > 
> > Sure, if it's just about the pure resource usage I agree. But I hope you
> > understand that I think there is more than just this to consider.
> 
> Yes I do now, but I didn't before this mail of yours. :)
> 
> I see now that you're thinking of building some type of a "master" control
> point or a "activator/directory" service, which is instantiated either
> only once for system-wide deployment or once for every user. As you said,
> this is not just the matter of optimizing UPnP discovery any more. Or I
> should say that it probably never was, but I didn't see that before.
> Regardless, if you're going with some type of a central process idea that
> does all that has been described, then going "all the way" is probably the
> best way to do it. This could really have an impact on resource
> reservation and it could have some other notable benefits as well.
> 
> By "going all the way" I mean you design and implement this central
> "master control point" that practically does all the UPnP heavy lifting.
> Benefits for such a system include (I'm mostly summarizing the things you
> have already said):
> 
> * Resource reservation, in which various cache designs play a big role. At
> minimum it would cache the SSDP stuff, UPnP descriptions, possibly device
> events or at least relay them and perhaps even results of certain types of
> action invocations. This could lead to better efficiency as well, but only
> when cache hits are frequent.

While caching the results of action invocations (where the results are static
enough to allow it) is a nice idea, this might not work with what I had in
mind, which is one system-wide cache, not a per-session/user cache.
(Use case: a media server which grants access only to the family members who
know its password, not to a random guest using a guest account on the same
machine.)
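
A rough sketch of the distinction, in plain C++ and with purely hypothetical
class names (neither Cagibi nor HUPnP is structured like this as far as I can
say): discovery data can live in one shared cache, while anything obtained
with a user's credentials would have to be keyed by user as well.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// System-wide cache: holds only what any host on the LAN could learn from
// discovery anyway (device identifier -> description URL).
class DiscoveryCache {
public:
    void put(const std::string& udn, const std::string& descriptionUrl) {
        entries_[udn] = descriptionUrl;
    }
    // Returns nullptr if the device is unknown.
    const std::string* find(const std::string& udn) const {
        auto it = entries_.find(udn);
        return it == entries_.end() ? nullptr : &it->second;
    }
private:
    std::map<std::string, std::string> entries_;
};

// Per-user cache: results of action invocations are keyed by user too,
// so one user's results are never handed to another user.
class ActionResultCache {
public:
    void put(const std::string& user, const std::string& request,
             const std::string& result) {
        entries_[std::make_pair(user, request)] = result;
    }
    // Returns nullptr unless this very user has cached this request.
    const std::string* find(const std::string& user,
                            const std::string& request) const {
        auto it = entries_.find(std::make_pair(user, request));
        return it == entries_.end() ? nullptr : &it->second;
    }
private:
    std::map<std::pair<std::string, std::string>, std::string> entries_;
};

// Usage: the guest sees the device, but not the family member's results.
// The UDN, URL and request strings are made-up example values.
inline void cacheExample() {
    DiscoveryCache shared;
    shared.put("uuid:mediaserver", "http://192.0.2.10/description.xml");
    ActionResultCache perUser;
    perUser.put("alice", "Browse(0)", "<family photos>");
    assert(shared.find("uuid:mediaserver") != nullptr);
    assert(perUser.find("alice", "Browse(0)") != nullptr);
    assert(perUser.find("guest", "Browse(0)") == nullptr);
}
```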

> * Directory service. It could detail the UPnP capabilities of the entire
> host machine. Control points and devices.
> * "Activation" service. With the help of the Directory service this could
> save resources and enable the activation and de-activation of UPnP devices
> when needed.
> * Some security improvements. But as I noted before, this topic requires
> vastly more thought anyway.
> 
> These benefits are noteworthy and I personally find the idea truly
> interesting (and quite possibly worth implementing), but I'd like to raise
> some of the disadvantages here as well, since they may very well outweigh
> the benefits depending on the desired use:
> 
> * Possibly significant increase of complexity. Now, I don't consider
> writing such a central service a problem. It's basically a really robust
> control point that offers an IPC interface for clients to use. I do consider
> using such a service a "problem". A problem in the sense that it is much more
> difficult to use compared to a library loaded in-process, which provides a
> type-safe, hopefully very usable object model for interacting with UPnP
> devices. Certainly you could write a helper library for the clients that
> does exactly that: a decent object model for interacting with the central
> service. It could even be written in Qt to allow seamless integration with
> all the other Qt stuff.

Sure, and I would expect a complete solution for us to contain this.

> Regardless, all that definitely increases
> complexity in various ways. I'd say the impact as a whole is notable.

Numbers, please ;)

> * Writing multiplatform software becomes harder. I'm not sure if this is a
> valid concern or not, but having a dependency to such a central service
> requires the service to be present on all supported platforms as well. It
> is no longer a matter of writing an application and linking it to a UPnP
> library.

Well, the UPnP library ideally hides the complexity and differences of the
supported platforms behind its API.
I guess you are also targeting other platforms like Windows. Don't they
already provide their own libs for that, which you would ideally wrap with the
same HUPnP API?
So you would have different backends, chosen by compile flags. And for
Linux/*BSD/... you would offer a compile flag to have the SSDP classes forward
to Cagibi or a similar daemon.
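
Something like the following minimal sketch is what I have in mind. The flag
name USE_DISCOVERY_DAEMON and all class names are assumptions made up for
illustration; they are not taken from HUPnP or Cagibi, and the classes only
report which variant was compiled in.

```cpp
#include <cassert>
#include <string>

// Hypothetical discovery interface; not the HUPnP or Cagibi API.
struct SsdpBackend {
    virtual ~SsdpBackend() = default;
    virtual std::string describe() const = 0;
};

// Would open its own multicast socket inside the application process.
struct InProcessSsdp : SsdpBackend {
    std::string describe() const override { return "in-process SSDP listener"; }
};

// Would relay queries over IPC to a daemon such as Cagibi.
struct DaemonForwardingSsdp : SsdpBackend {
    std::string describe() const override { return "forwarding to discovery daemon"; }
};

// USE_DISCOVERY_DAEMON is an assumed flag name, set e.g. with
// -DUSE_DISCOVERY_DAEMON on the compiler command line.
#if defined(USE_DISCOVERY_DAEMON)
using DefaultSsdpBackend = DaemonForwardingSsdp;
#else
using DefaultSsdpBackend = InProcessSsdp;
#endif

// Client code only ever names DefaultSsdpBackend, so it stays the same
// whichever variant the build selected.
inline std::string activeBackendDescription() {
    DefaultSsdpBackend backend;
    assert(!backend.describe().empty());
    return backend.describe();
}
```

The public API stays identical on every platform; only the build decides
whether SSDP is handled in-process or handed to the daemon.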

> * Efficiency may be impaired in regard to action invocation. I don't think
> this is a concern, but since the topic has been on the wall, I thought I
> should mention this. I don't have any numbers to show right now, but it is
> easy to assume (and be wrong ;)) the vast majority of resources are spent
> on invoking actions and eventing. This is because to do anything with a
> UPnP device you have to invoke an action and it is SOAP all the way.
> Always going through a middle-man incurs overhead even if it only relays
> the data.

Which I am not advocating. Instead I think the SOAP communication should be
done directly, as is often the case with HTTP (though it might be an idea to
have a proxy process per remote UPnP device, similar to KIO slave processes).

> * If the majority of the benefits of the system depend on everybody using
> it, how can you enforce the use of it in favor of, say, using HUPnP
> directly?

Sorry, I do not understand what you mean here.

> All in all, this is a very interesting field of discussion and I'm glad to
> participate, but I must point out that I don't have such a good idea yet
> what exactly are you planning to do with UPnP, who and what are involved
> and so on. So, everything I just wrote could very well be pointless, which
> of course makes me feel very good about spending a fair amount of time and
> thought in writing this. ;-)

It is of great value, at least to me, as I have to defend my Cagibi idea and
see whether it holds up.

> >> >> > By the way, if any of you have ideas or feature requests for HUPnP,
> >> >> > or feedback in general, this is a good chance to influence the
> >> >> > development before the API & ABI is locked for the first major
> >> >> > release. I'd appreciate your comments.
> >> > 
> >> > Would like to give you some more comments, but please accept my
> >> > limited time and find my ideas/thoughts in these emails :)
> >> 
> >> Of course, I did and I thank you. Some interesting thoughts there to
> >> which I enjoyed answering. :)
> > 
> > Thanks. Which I enjoyed to reply to again :)
> 
> And the trend continues... :)

Hehe :)

Cheers
Friedrich
-- 
KDE Okteta - a simple hex editor - http://utils.kde.org/projects/okteta