Support low latency encoding in MediaRecorder
Categories
(Core :: Audio/Video: Recording, enhancement, P3)
People
(Reporter: quae, Unassigned)
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0
Actual results:
Currently using MediaRecorder with a stream from getUserMedia gives output every ~300ms.
Expected results:
I'd like to get a realtime(ish) Opus stream of a user's microphone.
Could you add support for low latency encoding?
Updated•5 years ago
Comment 1•5 years ago
While the MediaRecorder spec doesn't really say how to treat latency, it seems reasonable that one could affect it by calling requestData()
every so often, or setting a low enough timeslice.
Let's keep this bug around to track that work.
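As an illustration of the requestData() idea, here is a minimal sketch. Note this is only an assumption of how one might drive it from content: startLowLatencyRecording and its polling interval are hypothetical names, not part of any API, and how much latency this actually shaves off depends on the muxer.

```javascript
// Sketch (hypothetical helper): poll requestData() on a MediaRecorder-like
// object to bound how long encoded data sits in the recorder.
function startLowLatencyRecording(recorder, pollMs, onChunk) {
  recorder.ondataavailable = (e) => onChunk(e.data);
  recorder.start(); // no timeslice argument: we poll instead
  return setInterval(() => recorder.requestData(), pollMs);
}

// In a browser this would be used roughly like:
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
//   const rec = new MediaRecorder(stream, { mimeType: "audio/webm" });
//   const timer = startLowLatencyRecording(rec, 100, (blob) => send(blob));
//   // later: clearInterval(timer); rec.stop();
```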
In the meantime, we do have a workaround of sorts. If you're encoding both video and audio (video/webm) and set a timeslice, you'll get data at roughly the interval of the timeslice.
Our webm muxer doesn't pass data on to the gathered blob until it has finished a cluster, and by spec recommendation we wait for a key frame before finishing a cluster (so the next cluster starts on a key frame), which can take a while. We issue keyframes at the interval of the timeslice to reduce latency while we don't support passing partial clusters to the blob. Do note that frequent keyframes and small clusters mean more overhead, though.
If you're only encoding audio, I just fixed a bug that lets you get webm when you specify audio/webm. With audio/webm you get data at most every second. Before that fix, or if you don't set a mime type, you get audio/ogg, where we flush data after libogg considers a page full, which seems to happen when it reaches 4kB.
If you get data every 300ms and you're encoding only audio, I suppose your bitrate is around 100kbps? You should get lower latency if you could increase the bitrate (which you can affect with the audioBitsPerSecond MediaRecorderOptions member). Though it could be flaky if the Opus bitrate varies with content.
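The arithmetic behind that guess can be checked with a back-of-envelope sketch (assumptions: the ~4kB page size observed above, and that container overhead is negligible):

```javascript
// How often does an ogg page of a given size fill at a given bitrate?
// Assumes libogg flushes at ~4 kB, as observed above; ignores overhead.
function oggPageIntervalMs(bitsPerSecond, pageBytes = 4096) {
  return (pageBytes * 8 * 1000) / bitsPerSecond;
}

// At ~100 kbps a 4 kB page fills roughly every 328 ms, matching the
// observed ~300 ms; doubling audioBitsPerSecond roughly halves the interval.
```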
Comment 2•5 years ago
> If you get data every 300ms and you're encoding only audio I suppose your bitrate is around 100kbps? You should get lower latency if you could increase the bitrate
But I want a low bitrate stream....
Is there a way you could expose the Opus encoder options more directly? From the Wikipedia article (https://en.wikipedia.org/wiki/Opus_(audio_format)#Features):
> SILK supports frame sizes of 10, 20, 40 and 60 ms. CELT supports frame sizes of 2.5, 5, 10 and 20 ms. Thus, hybrid mode only supports frame sizes of 10 and 20 ms; frames shorter than 10 ms will always use CELT mode. A typical Opus packet contains a single frame, but packets of up to 120 ms are produced by combining multiple frames per packet.
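The constraints in that quote can be restated as a small lookup (a hypothetical helper for illustration only, not any encoder API):

```javascript
// Opus frame sizes per mode, in ms, as quoted above.
const OPUS_FRAME_MS = { silk: [10, 20, 40, 60], celt: [2.5, 5, 10, 20] };

// Which Opus modes can produce a frame of the given duration?
// Hybrid mode only supports the sizes common to both SILK and CELT.
function modesForFrameMs(ms) {
  const modes = [];
  if (OPUS_FRAME_MS.silk.includes(ms)) modes.push("silk");
  if (OPUS_FRAME_MS.celt.includes(ms)) modes.push("celt");
  if (ms === 10 || ms === 20) modes.push("hybrid");
  return modes;
}

// modesForFrameMs(2.5) → ["celt"]  (frames shorter than 10 ms are CELT-only)
// modesForFrameMs(40)  → ["silk"]
```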
Comment 3•5 years ago
(In reply to Daurnimator from comment #2)
>> If you get data every 300ms and you're encoding only audio I suppose your bitrate is around 100kbps? You should get lower latency if you could increase the bitrate
> But I want a low bitrate stream....
> Is there a way you could expose the opus encoder options more directly?
There's nothing like that in the spec.
As mentioned in comment 1, we could improve how we gather up data on requestData() and with small timeslices. Patches welcome.
Comment 4•5 years ago
> There's nothing like that in the spec.
Are you not free to implement MediaRecorder for any content type?
https://tools.ietf.org/html/rfc7587#section-6.1 defines the audio/opus type with parameters including maxplaybackrate, maxptime, cbr and useinbandfec.
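To make the suggestion concrete, here is a hypothetical sketch of pulling such parameters out of a mimeType string. parseOpusMimeType is invented for illustration; no browser exposes these RFC 7587 parameters through MediaRecorder today.

```javascript
// Hypothetical: split "audio/opus; maxptime=20; cbr=1" into the media
// type and its RFC 7587-style parameters. Not a real MediaRecorder API.
function parseOpusMimeType(mimeType) {
  const [type, ...rest] = mimeType.split(";").map((s) => s.trim());
  const params = {};
  for (const part of rest) {
    const [key, value] = part.split("=").map((s) => s.trim());
    params[key] = value;
  }
  return { type, params };
}

// An implementation could then map params.maxptime to the encoder's
// packet duration, which is what bounds per-packet latency.
```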
> we could improve how we gather up data on requestData() and with small timeslices. Patches welcome.
I wouldn't really know where to start; I'd need plenty of hand-holding if I attempt this.
Comment 5•5 years ago
(In reply to Daurnimator from comment #4)
>> There's nothing like that in the spec.
> Are you not free to implement MediaRecorder for any content type?
> https://tools.ietf.org/html/rfc7587#section-6.1 defines the audio/opus type with parameters including maxplaybackrate, maxptime, cbr and useinbandfec.
Right. That can be done -- parsing parameters from the mime type. But seems like a different issue from this bug.
>> we could improve how we gather up data on requestData() and with small timeslices. Patches welcome.
> I wouldn't really know where to start; I'd need plenty of hand-holding if I attempt this.
If you're willing, here are some resources:
In general, contributing code: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Introduction
Building from scratch: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions
When you've built, you can run our MediaRecorder tests with:
./mach mochitest dom/media/test/test_mediarecorder*
and ./mach wpt testing/web-platform/tests/mediacapture-record
For what I said we could do in comment 1, there are two parts:
- Make requestData(), and data requests based on timeslice, explicitly extract data into the blob before pushing it.
- Make the container writers able to flush data before they've finished writing a certain amount of data. For webm, the EbmlComposer currently only outputs full clusters; it could probably be convinced to output data directly after a block instead, but its internal state will need to be refactored to accommodate this. I'd say this is pretty fiddly to get right. Something similar could probably be done in the OggWriter as well. The work should essentially result in them being streamable.
Comment 6•5 years ago
> Right. That can be done -- parsing parameters from the mime type. But seems like a different issue from this bug.
Wouldn't support of maxptime effectively solve this issue?
But yes, your approach would too.
Though for my specific use case I don't need or want a container, so it really seems to be getting in the way.
Comment 7•5 years ago
(In reply to Daurnimator from comment #6)
>> Right. That can be done -- parsing parameters from the mime type. But seems like a different issue from this bug.
> Wouldn't support of maxptime effectively solve this issue?
It would produce a higher number of Opus packets, but those packets must still be muxed before reaching content.
Our ogg muxer waits for a page to become full (of packets). Our webm muxer waits for a second's worth of content before finishing a cluster, no matter what.
So no.
Comment 8•5 years ago
What if I don't want the stream muxed?
Comment 9•5 years ago
Then work that out with the W3C working group, because the spec doesn't define a containerless mode.
Updated•2 years ago