<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Dan Dennedy wrote:

<blockquote

 cite="mid:27dd34e60902121048w2491bc0bieeee9c36a5ec534d@mail.gmail.com"

 type="cite">

  <pre wrap="">I hope you realize by now that this is common - both the treatment and

your reaction / perception. It's probably best to think of Michael &Co

as a gatekeeper to the holy land. Unfortunately, sometimes I think

they have that perception of themselves as well. ;-) However, the fact

that he reads your patches and responds is actually a good sign.

  </pre>

</blockquote>

Yes, that's why I still bear with it :-)<br>

<br>

<blockquote

 cite="mid:27dd34e60902121048w2491bc0bieeee9c36a5ec534d@mail.gmail.com"

 type="cite">[...]

  <blockquote type="cite">

    <pre wrap="">OTOH, MLT has also some bugs in libavformat handler, most prominently

off-by-two frames for MPEG-2 and H.264 (and possibly other codecs) as

well as wrong usage of av_read_frame(), which can be attributed to

deficient documentation. If correct solution for AVCHD is built into

FFmpeg, MLT has to be fixed as well.

    </pre>

  </blockquote>

  <pre wrap=""><!---->

I will take a look at this. Have you noticed this only with transport

stream or with program stream and mp4 or mov as well? If you have some

diff in your working copy, not matter how ugly, please send it so I

can start looking at what you changed and why.

  </pre>

</blockquote>

There is one big problem: av_read_frame() doesn't return a video frame, as

one would expect for video streams, but a demuxer frame. This means that

for field-coded H.264 (e.g., AVCHD) streams you need two av_read_frame()

calls to get the next picture

(which seems to be the case at least for 1080i videos on Panasonic,

Sony and Canon camcorders). That by itself isn't the biggest problem, since

MLT does seem to loop, but I believe the termination condition is

possibly not sufficient. Similarly, there are video formats where one

demuxer frame might contain more than one video frame, i.e., you'd

probably need to feed avcodec_decode_video() with NULL buffers to get

the rest of the pictures in one demuxer frame.<br>
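<br>

As a rough illustration of the loop I mean, here is a minimal, self-contained C sketch. It deliberately does not call FFmpeg: MockPacket, MockDemuxer, MockDecoder, mock_read_packet(), mock_decode() and next_picture() are all hypothetical stand-ins, and the "one picture per two fields" rule is an assumption made for the example. The point is only the control flow: first try to drain a picture still buffered from the previous demuxer frame (the NULL-buffer call), then keep reading demuxer frames until a picture comes out or the stream ends.<br>

<pre wrap="">
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the FFmpeg API. A demuxer frame carries some
   number of coded fields; the mock decoder emits one picture per two fields. */
typedef struct { int fields; } MockPacket;

typedef struct {
    const MockPacket *packets;  /* demuxer output, in stream order */
    size_t count;
    size_t next;
} MockDemuxer;

typedef struct {
    int pending_fields;         /* fields consumed but not yet output */
} MockDecoder;

/* Plays the role of av_read_frame(): 1 and *pkt filled, or 0 at end of stream. */
static int mock_read_packet(MockDemuxer *d, MockPacket *pkt) {
    if (d->next >= d->count) return 0;
    *pkt = d->packets[d->next++];
    return 1;
}

/* Plays the role of one decode call. pkt == NULL means "no new data, just
   give me what you still have". Returns 1 if a picture was produced. */
static int mock_decode(MockDecoder *dec, const MockPacket *pkt) {
    if (pkt != NULL) dec->pending_fields += pkt->fields;
    if (dec->pending_fields >= 2) {
        dec->pending_fields -= 2;
        return 1;
    }
    return 0;
}

/* Get the next picture. Returns 1 on success, 0 at end of stream.
   *packets_read reports how many demuxer frames this call consumed. */
static int next_picture(MockDemuxer *d, MockDecoder *dec, int *packets_read) {
    MockPacket pkt;
    assert(packets_read != NULL);
    *packets_read = 0;
    /* A previous demuxer frame may have held more than one video frame. */
    if (mock_decode(dec, NULL)) return 1;
    /* Otherwise read demuxer frames until the decoder returns a picture. */
    while (mock_read_packet(d, &pkt)) {
        ++*packets_read;
        if (mock_decode(dec, &pkt)) return 1;
    }
    return 0;
}
```
</pre>

With demuxer frames of 1, 1, 4, 1 and 1 fields, the first picture costs two reads, the second one read, the third none (it is drained from the 4-field frame), and the fourth two reads again.<br>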

<br>

For now, stock FFmpeg doesn't return timestamps for field-coded videos

properly. Timestamps of frame-coded progressive H.264 (sort of) work.

For field-coded videos, the first field of the frame would have DTS/PTS

as usual, the second one either the same or offset by 1/2 duration.

This doesn't work as of today and computation of proper timestamps is

quite an obnoxious thing. One of my patches simply joined the two

fields in the parser, so that it returns a buffer with one whole video frame,

which sidesteps the problem.<br>

<br>

Seeking in an MPEG-TS stream (as produced by the camcorder) doesn't work

well in FFmpeg either, since frame positions in the stream are determined

incorrectly in general and in MPEG-TS in particular. Further, MPEG-TS

seeking doesn't look for a key frame; it just takes some frame with a

timestamp present. So decoding isn't restarted correctly. I posted a

series of patches for review a week ago, with no answer from the MPEG-TS

maintainer so far, in spite of 2 pings and his activity on other

topics...<br>

<br>

I didn't test with program stream and/or mp4. I wanted to test with

mov, but since remuxing to mov failed (video can't be read at all from

the resulting mov file), I cannot say much.<br>

<br>

I haven't studied the MLT code thoroughly yet, so I don't know what/how to

fix and don't have a patch yet. I noticed the following, though:<br>

<br>

1) The aforementioned problem with av_read_frame().<br>

<br>

2) MLT seeks to 1 second before the position actually needed. I suppose

this is a workaround for the aforementioned MPEG-TS seeking bug (and

possibly bugs in other formats), where the demuxer simply starts at some frame, not

necessarily a key frame. This one second won't be needed once MPEG-TS

seeking is fixed. For now, one GOP duration should suffice as a workaround

(which is, granted, close to 1 second anyway :-).<br>

<br>

H.264 complicates things further by having so-called recovery points

with a convergence duration. I.e., you don't have a single frame to

start from; you have a start frame and a count of frames to decode until

the output converges to what it would be when decoding from the start of

the stream. This should be addressed in FFmpeg seeking, though, by

positioning the stream at a keyframe at distance &gt;= convergence

duration. Thus, decoding until the requested PTS would converge the stream.

However, recovery point handling is so far only partially implemented in

FFmpeg. Fortunately, all H.264 videos I've seen so far have

recovery count set to 0, which effectively makes frames with recovery

count 0 unconditional keyframes. So we don't have to wait for full

solution there just to decode AVCHD from camcorders.<br>
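<br>

To make the intended behaviour concrete, here is a small C sketch of seeking without the fixed 1-second pre-roll: land on the last key frame at or before the requested PTS, then decode forward and drop pictures until the requested PTS is reached. This is only a model under stated assumptions: MockFrame, seek_to_keyframe() and seek_exact() are hypothetical names, the frames are listed in presentation order, frame 0 is assumed to be a key frame, and every key frame is treated as an unconditional restart point (recovery count 0). It is not how FFmpeg or MLT currently do it.<br>

<pre wrap="">
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry, not an FFmpeg type. */
typedef struct {
    int64_t pts;  /* presentation timestamp, in frame durations here */
    int key;      /* non-zero if decoding can restart at this frame */
} MockFrame;

/* Index of the last key frame with pts <= target.
   Falls back to index 0, which is assumed to be a key frame. */
static size_t seek_to_keyframe(const MockFrame *f, size_t n, int64_t target) {
    size_t best = 0;
    for (size_t i = 0; i < n; i++) {
        if (f[i].key && f[i].pts <= target) best = i;
    }
    return best;
}

/* Returns the index of the frame to display for `wanted`.
   *discarded counts the pictures decoded only to reach it. */
static size_t seek_exact(const MockFrame *f, size_t n, int64_t wanted,
                         size_t *discarded) {
    assert(n > 0 && discarded != NULL);
    size_t i = seek_to_keyframe(f, n, wanted);
    *discarded = 0;
    /* Decode forward from the key frame, dropping pictures before `wanted`. */
    while (i + 1 < n && f[i].pts < wanted) {
        i++;
        (*discarded)++;
    }
    return i;
}
```
</pre>

In this model the number of dropped pictures is bounded by the GOP length, which is why one GOP duration is enough as a pre-roll.<br>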

<br>

3) The off-by-two stems from using DTS instead of PTS for timestamps.

After stream start or seek, a key frame is read with PTS = DTS + delay,

followed by B frames as needed with PTS = DTS. So the first few

avcodec_decode_video() calls won't return a picture. The first picture

returned has the PTS of the key frame. MLT seems to use the DTS of the last-read

frame instead (and generally something seems to go wrong there). So if

I open a file from my camcorder, the first frame is displayed.

Single-stepping then duplicates it for the next 2 frames (the delay) and

only then starts displaying further frames. After a seek, single-stepping

does work from the beginning, but it's offset by the delay (since DTS

is used instead of PTS).<br>

<br>

I wish FFmpeg would pair each decoded picture with the PTS

of the frame read by av_read_frame(), so that this is clear. There actually is a field for

that, but it doesn't work correctly. So the only way to get proper

timestamps is to read frames at the beginning / after the seek until

the first picture is decoded. Then you know you are at PTS of the key

frame at start of the stream or after a seek (PTS of first frame read

via av_read_frame()). Then, for the next frame, simply read frames via

av_read_frame() &amp; decode until the next picture is decoded. Again, you

know you are at the PTS of the last frame + duration. Of course, if you

have a broken stream with gaps, this won't work (one can resync,

though, by computing the delay from a key frame and then correcting the PTS of

the picture after that many video frames have been read &amp; decoded to the

PTS of that last key frame).<br>
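<br>

The bookkeeping described above can be sketched in a few lines of C. Again this is a model, not FFmpeg or MLT code: TsPacket, PtsTracker and the tracker_*() functions are hypothetical names, and a constant frame duration with no gaps in the stream is assumed. The first packet read after a start or seek supplies the base PTS; every decoded picture after that advances the clock by one duration, regardless of how many packets the decoder swallowed before returning it.<br>

<pre wrap="">
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet timestamps, not an FFmpeg type. */
typedef struct {
    int64_t dts;
    int64_t pts;
} TsPacket;

typedef struct {
    int have_base;     /* set once the first packet after start/seek is seen */
    int64_t next_pts;  /* timestamp for the next decoded picture */
    int64_t duration;  /* assumed constant frame duration */
} PtsTracker;

/* Call at stream start and after every seek. */
static void tracker_reset(PtsTracker *t, int64_t duration) {
    t->have_base = 0;
    t->next_pts = 0;
    t->duration = duration;
}

/* Call for every packet read; only the first one after a reset matters. */
static void tracker_on_packet(PtsTracker *t, const TsPacket *p) {
    if (!t->have_base) {
        t->next_pts = p->pts;  /* PTS of the key frame we restarted at */
        t->have_base = 1;
    }
}

/* Call for every picture the decoder returns; yields its timestamp. */
static int64_t tracker_on_picture(PtsTracker *t) {
    assert(t->have_base);
    int64_t pts = t->next_pts;
    t->next_pts += t->duration;
    return pts;
}
```
</pre>

For example, if the first packet after a seek has PTS 12 and the decoder needs three packets before it returns anything, the first picture is still stamped 12 and the following ones 13, 14, and so on. The resync for broken streams mentioned above is not covered by this sketch.<br>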

<br>

I hope this clarifies things a bit...<br>

<br>

Regards,<br>

<br>

Ivan<br>

<br>

</body>

</html>