Bug 1356054 (Open), opened 7 years ago, updated 2 years ago

Standardise units used by EncodedFrame

Categories

(Core :: Audio/Video: Recording, enhancement, P3)

enhancement

Tracking


Tracking Status
firefox55 --- affected

People

(Reporter: bryce, Unassigned)

References

Details

EncodedFrame[1] contains time and duration members. At the time of writing, it appears that the timestamp is set by the Opus and VP8 track encoders, as well as by tests, and is always in microseconds[2].

The duration member of EncodedFrame is used in similar cases[3]; however, the units used here are inconsistent. The VP8 encoder sets the duration in microseconds, and the Opus encoder in samples. The comment on the underlying member, mDuration, suggests that it should contain the duration in samples.

Right now the downstream code that uses the duration[4] expects this behaviour: downstream VP8 code treats the duration as microseconds, while the Ogg code treats it as samples.

I would like to see the duration standardised to a single unit. This would make EncodedFrame easier to understand, and would also allow for more straightforward reasoning about adding the time and duration on a frame to get an end time.

I suggest we standardise this to microseconds. However, I would appreciate feedback on whether this will provide sufficient granularity for the Ogg case, or if we need to keep storing the exact number of samples as we currently do.

[1]: https://dxr.mozilla.org/mozilla-central/source/dom/media/encoder/EncodedFrameContainer.h#40
[2]: https://dxr.mozilla.org/mozilla-central/search?q=%2Bfunction-ref%3Amozilla%3A%3AEncodedFrame%3A%3ASetTimeStamp%28uint64_t%29
[3]: https://dxr.mozilla.org/mozilla-central/search?q=%2Bfunction-ref%3Amozilla%3A%3AEncodedFrame%3A%3ASetDuration%28uint64_t%29
[4]: https://dxr.mozilla.org/mozilla-central/search?q=%2Bcallers%3A%22mozilla%3A%3AEncodedFrame%3A%3AGetDuration%28%29+const%22
Andreas, what are your thoughts on this?
Flags: needinfo?(pehrson)
Another thing we could do is store the value as a rational; e.g. we could keep storing the sample-based unit.
(In reply to Bryce Van Dyk (:SingingTree) from comment #0)
> I suggest we standardise this to use microseconds.

That might introduce rounding errors when converting frames to microseconds, which is what MSG tried to move away from. I think Paul has the context on the MSG refactoring.
Flags: needinfo?(padenot)
Yes, we've had our fair share of issues when we used time units rather than sample count + sample rate. I think it's reasonable to continue doing what we do for audio.

It's a bit harder for video because, unlike audio, the time between two frames is not fixed, so we can't simply refer to a number of events and a rate for those events (i.e. sample count + sample rate). What is the approach taken by other software that needs to deal with this problem?
Flags: needinfo?(padenot) → needinfo?(bvandyk)
MediaData[1] stores more information to accommodate this. There are members for both microsecond-based times and codec-specific times. The duration is exposed in microseconds, but there is also a member for the number of frames.

The use case I'm looking at is ensuring sync of the data being written by the MediaEncoder. Since the EncodedFrames given to the encoder have durations written on them, it's possible to reason about when the next expected audio or video frame will arrive. I can do this right now by treating video durations as usecs and converting from samples to usecs for audio, but it feels a little wonky. Would exposing both a sample-based duration and a usec-based duration (similar to MediaData) be an agreeable way to address this?

More generally, I think EncodedFrame could be replaced by MediaData, to allow for more consistent data structures across media stacks. But perhaps that's a discussion for another bug.

[1]: https://dxr.mozilla.org/mozilla-central/source/dom/media/MediaData.h?q=file%3AMediaData.h&redirect_type=single#275
Flags: needinfo?(bvandyk)
Rank: 25
Priority: -- → P2
(In reply to Bryce Van Dyk (:SingingTree) from comment #5)
> More generally, I think EncodedFrame could be replaced by MediaData, to
> allow for more consistent data structures across media stacks. But perhaps
> that's a discussion for another bug.

Yes. The two separate classes are probably there for historical (i.e. not good) reasons, IIRC.
Is it possible to use the audio sample rate as the base for video? That would let us avoid a conversion, as this is how we get the frames from MediaStreamGraph. I don't know the reason for using usecs for video; I have assumed it's codec- or container-related. :bechen might know this better.
Flags: needinfo?(pehrson)
Mass change P2->P3 to align with new Mozilla triage process.
Priority: P2 → P3
Severity: normal → S3