Podcast Support

Bart Cerneels bart.cerneels at kde.org
Sun Nov 22 10:42:20 CET 2009


On Sun, Nov 22, 2009 at 00:48, Mathias Panzenböck
<grosser.meister.morti at gmx.net> wrote:
> On 11/21/2009 07:58 PM, Bart Cerneels wrote:
>> Long list. Hope I can give you some clarity.
>>
>> Answers and comments inline.
>>
>> On Fri, Nov 20, 2009 at 23:55, Mathias Panzenböck
>> <grosser.meister.morti at gmx.net> wrote:
>>>
>>> == Multiple Enclosures ==
>>
>> I've never really come across a feed with multiple enclosures per
>> item. Can you give me some examples?
>
> Well in practice I never encountered them either. I just read the standards and
> they all support multiple enclosures. I don't know if anyone uses them.
>
>> For the case of alternative formats we have the MultiSourceCapability.
>> Or at least I think we can use it for that.
>
> Ok, I'll look into that (some day when I've time, low priority after all).
>
>>> == Text Format ==
>>
>> Everything except description is indeed clear text.
>>
>> Here is what I've been planning to do since integrating your
>> PodcastReader rewrite:
>> Add an additional field to PodcastMetaCommon, enum DescriptionType{
>> HtmlBody, HtmlDescription, ItunesSummary, ClearTextDescription }. The
>> listed order is also the priority. We only save one "description"
>> type.
>
> Well, I think it would be better to have:
> enum ContentType { TextContent, HtmlContent, XHtmlContent };

Yeah, probably better. How are we going to handle XHtml though? Why
would it be different than normal HTML?

> Because the DescriptionType is really just used in the PodcastReader to know
> which kind of description I've currently read so that I can choose which to drop
> and which to keep in case I read another one. It's RSS 1.0/2.0 specific. From
> that alone I can't deduce the ContentType because sometimes even <description>
> elements contain HTML (as CDATA!). We have to guess that for RSS. Only Atom
> tells us exactly what type the content has.
>
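
A rough sketch of the Atom side, using the ContentType enum proposed above.
The helper name contentTypeFromAtom is made up for illustration and is not
existing Amarok code. It assumes the "type" attribute of an Atom text
construct is one of "text", "html" or "xhtml", with a missing attribute
meaning "text" (RFC 4287, section 3.1.1). Falling back to text for anything
unrecognized is my own assumption.

```cpp
#include <string>

// Enum as proposed in this thread; not (yet) part of PodcastMetaCommon.
enum ContentType { TextContent, HtmlContent, XHtmlContent };

// Hypothetical helper: maps the value of an Atom text construct's "type"
// attribute to a ContentType. Pass an empty string if the attribute is absent.
ContentType contentTypeFromAtom(const std::string &typeAttr)
{
    if (typeAttr == "html")
        return HtmlContent;
    if (typeAttr == "xhtml")
        return XHtmlContent;
    // "text", a missing attribute, or anything unrecognized (assumption).
    return TextContent;
}
```

For RSS 1.0/2.0 there is no such attribute, so the value would have to come
from guessing instead.
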
>> This was your idea and I'm convinced this is a very good and already
>> proven solution. No need to second guess yourself.
>
> It's not second guessing, another matter. ;)
>
>> I guess a more general name is in order to prevent confusion with the
>> RSS element. How about shownotes?
>
> For what? Description? It's already called description (and subtitle and
> summary) in PodcastMetaCommon.
>
>>> Question: Should I include the entity table and do the resolving and tag
>>> stripping by myself (won't be a problem for me)?
>>
>> There is no need to convert from clear-text to HTML. We save the
>> (cleaned up) HTML to database and use it directly in the info widget
>> if possible.
>
> I don't want to convert TO html. I want to convert html to plain text (strip
> tags, resolve entity refs), because in Atom even elements like title can contain
> (x)html!

Ah, I didn't know that. I have a feeling though that iTunes does not
support non-cleartext for any of the tags. So we should be ok with
just assuming it's clear text for all tags except rss:description and
atom:summary. For safety we should also check the content type of
itunes:summary.

I think there are Qt functions to convert HTML named refs and the XML
numeric character refs should be converted by
QXmlStreamReader::text(). Not sure about that last one though.
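
Just to pin down what "strip tags, resolve entity refs" would involve, here
is a minimal sketch in plain standard C++. It deliberately does not use Qt,
so it is a stand-in for whatever Qt offers, not a description of it. The
names htmlToPlainText, resolveReference and appendUtf8 are hypothetical, the
entity table only holds a handful of names, and the output is assumed to be
UTF-8.

```cpp
#include <cctype>
#include <cstdlib>
#include <map>
#include <string>

// Appends the code point cp to out, encoded as UTF-8.
static void appendUtf8(std::string &out, unsigned long cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Resolves the text between '&' and ';' (e.g. "amp", "#65", "#x41").
// Returns false if the reference is not understood; out is then untouched.
static bool resolveReference(const std::string &name, std::string &out)
{
    // Tiny illustrative table; a real one would be much longer.
    static const std::map<std::string, std::string> named = {
        {"amp", "&"}, {"lt", "<"}, {"gt", ">"},
        {"quot", "\""}, {"apos", "'"}, {"nbsp", " "}
    };
    if (name.empty())
        return false;
    if (name[0] == '#') {
        const bool hex = name.size() > 1 && (name[1] == 'x' || name[1] == 'X');
        const std::string digits = name.substr(hex ? 2 : 1);
        if (digits.empty() || !std::isxdigit(static_cast<unsigned char>(digits[0])))
            return false;
        char *end = nullptr;
        const unsigned long cp = std::strtoul(digits.c_str(), &end, hex ? 16 : 10);
        if (*end != '\0' || cp == 0 || cp > 0x10FFFF)
            return false;
        appendUtf8(out, cp);
        return true;
    }
    const auto it = named.find(name);
    if (it == named.end())
        return false;
    out += it->second;
    return true;
}

// Drops everything between '<' and '>' and resolves character references.
// Unknown references and a lone '&' are copied through unchanged.
std::string htmlToPlainText(const std::string &html)
{
    std::string out;
    std::size_t i = 0;
    while (i < html.size()) {
        if (html[i] == '<') {
            const std::size_t close = html.find('>', i);
            if (close == std::string::npos)
                break; // unterminated tag: drop the rest
            i = close + 1;
            continue;
        }
        if (html[i] == '&') {
            const std::size_t semi = html.find(';', i);
            if (semi != std::string::npos && semi - i <= 10
                && resolveReference(html.substr(i + 1, semi - i - 1), out)) {
                i = semi + 1;
                continue;
            }
        }
        out += html[i++];
    }
    return out;
}
```

With this, "<p>Tom &amp; Jerry</p>" comes out as "Tom & Jerry". It does no
whitespace cleanup, and it maps &nbsp; to an ordinary space for simplicity.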

>
>>> Apropos: For feeds that do not support a type attribute (RSS 1.0/2.0), I found
>>> out there is already a function in Qt to guess whether it is (or might be) html
>>> or not:
>>> http://doc.trolltech.com/4.5/qt.html#mightBeRichText
>>> Haven't used it yet, though.
>>
>> We'll start to use it and see if it works.
>
> Ok.
>
>>> == Fields ==
>>> There are some fields in PodcastMetaCommon that seem not to be used and were
>>> not even read: summary, subtitle and I think author wasn't read either (or was
>>> it?). I do read them from the feed. In RSS 1.0/2.0 I do some guessing about
>>> this, because there actually is only the <description> element in the standard
>>> but there are often other elements used. I decide what to use this way:
>>
>> I guess this is leftover from the 1.4 porting or perhaps I just added
>> them since these are tags in the feed. In any case, doesn't seem we
>> are using them, yet.
>>
>>>
>>> If only the description std element is there:
>>> description=description
>>>
>>> If itunes:summary is there:
>>> summary=description, description=itunes:summary
>>> (Hm, maybe not that good a guess on this one, but usually description is
>>> shorter than itunes:summary.)
>>
>> I say compare lengths and keep the longest as description.
>>
>
> Mhm, good idea. Much simpler. So I should ignore the subtitle and summary
> PodcastMetaCommon fields in the RSS parser altogether? (In Atom I have 1:1
> correlation to elements of the same name for these!)
>
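
The "compare lengths and keep the longest" rule could be sketched like this.
pickDescription is a hypothetical helper, not existing PodcastReader code.
Letting the standard <description> win on a tie is an arbitrary choice for
the sketch.

```cpp
#include <string>

// Hypothetical helper: returns whichever candidate is longer (in bytes).
// On equal length the standard <description> is kept.
std::string pickDescription(const std::string &rssDescription,
                            const std::string &itunesSummary)
{
    return itunesSummary.size() > rssDescription.size() ? itunesSummary
                                                        : rssDescription;
}
```

One caveat: if one candidate is HTML and the other is clear text, the markup
inflates the length, so the comparison may be fairer after stripping tags.
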
>>> However, subtitle and summary seem not to be used anywhere yet, or did I
>>> overlook something?
>>
>> Let's consider how and where we'll use them in the future then.
>> Subtitle: I would like to have it always visible, slightly desaturated
>> (grey), underneath the episode title in the podcast browser. Makes
>> sense, doesn't it ;)
>>
>
> Mhm. I guess no subtitle from RSS feeds then.

There is itunes:subtitle, so yes: RSS also has subtitle and we *will* use it.

>
>>> You see that Atom seems to be an awesome format that already thinks about a lot
>>> of cases that aren't covered by RSS 1.0/2.0. However, one thing it's missing is
>>> some kind of <description> for *feeds*. The summary and content elements are for
>>> episodes only, the feed element only has a subtitle child, so I guess users will
>>> likely use a provided RSS feed instead (where we have to guess the content type
>>> of the <description>).
>>
>> In my experience atom feeds are not available for podcasts unless
>> auto-generated by a CMS. Since iTunes doesn't support atom it can be
>> considered irrelevant for podcasting.
>
> I think iTunes supports it. I have to look that up to confirm it.
>
>> We have had users request
>> support though and sometimes the atom feed is just easier to find.
>> The only reason not to support it is no developer interest. But you
>> fixed that :) As long as the code is not causing bugs we can't fix, or
>> you stick around, there will be Atom support in Amarok.
>
> Atom support is about as easy as RSS support. You just have to consider the type
> attribute (but on the other hand you don't have to rely on guessing for the
> content type!). Other than that the elements are named and structured a tiny bit
> differently. It's not more or less complicated than RSS.
>
>>> But I have yet to find anything to put in the keywords field of
>>> PodcastMetaCommon. Maybe <category>?
>>
>> Some feed authoring tools (or CMS's) have a special keywords field.
>> But I think the itunes:category is indeed very suited to be added to
>> the list regardless of these special tags.
>
> I didn't mean <itunes:category> but the standard element:
> http://www.rssboard.org/rss-specification#ltcategorygtSubelementOfLtitemgt
> http://tools.ietf.org/html/rfc4287#section-4.2.2
>
>        -panzi
>


More information about the Amarok-devel mailing list