Open Bug 1738748 Opened 3 years ago Updated 2 years ago

[doml10n] Evaluate synchronous l10n frame opportunity for fully loaded l10n contexts

Categories

(Core :: Internationalization: Localization, enhancement)

People

(Reporter: zbraniecki, Unassigned)

Details

As listed in bug 1737951 and bug 1738056, we have cases where engineers would like to be able to rely on the l10n frame being synchronous.

The early design decision for Fluent DOM was to be predominantly asynchronous: async lazy I/O loading lets us perform deep error recovery, and keeping the whole API async preserves our ability to fine-tune that error recovery without having to rewrite tons of code from sync to async.

Several years into using Fluent, we have not been forced to rethink error recovery too much, but the forced asynchronicity of every l10n frame is creating paper cuts for engineers.

In some cases it's a race condition that is continuously won, so the code settles for it, but in others the API is inherently synchronous (popupshowing) and, as a result, in conflict with the async design of DOM L10n.

Asking developers to write their popupmenus as:

<menu>
  <item data-l10n-id="show-tab"/>
  <item data-l10n-id="show-tabs" data-l10n-args='{"count": 3}' hidden/>
</menu>

and flip between singular/plural as the count changes is an ugly dirty hack that doesn't scale, and I'd love to avoid using it as a solution.

Early Proposal

My early proposal is to relax the design decision around async a bit and allow a sync l10n frame to be executed under certain conditions: if no resource changes have been made since the last l10n frame, and the requested IDs are present in the loaded context.

We have a sync hasMessage method, so all it would take is, roughly:

  1. Collect IDs to be localized in https://searchfox.org/mozilla-central/source/dom/l10n/L10nMutations.cpp#127-150
  2. Run all pending l10n ids via context.hasMessage
  3. If all return true, execute a synchronous l10n frame
  4. Otherwise, fall back to the asynchronous path

We have all the right paths in place already (TranslateElements can be sync or async), but we currently do not allow for sync formatMessages to be called when the context is in async mode. We'd need to redesign that API to allow for that.
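
The four steps above can be sketched roughly as follows. This is only an illustration in Python, not the Gecko or fluent-rs implementation: `LoadedContext`, `format_sync`, `format_async`, `run_l10n_frame` and the `resources_changed` flag are all hypothetical names standing in for the real context, its sync and async formatting paths, and the "no resource changes since the last frame" condition.

```python
import asyncio


class LoadedContext:
    """Hypothetical stand-in for an already-loaded l10n context."""

    def __init__(self, messages):
        self._messages = dict(messages)

    def has_message(self, msg_id):
        # Synchronous presence check, analogous to the sync hasMessage method.
        return msg_id in self._messages

    def format_sync(self, msg_id):
        return self._messages[msg_id]

    async def format_async(self, msg_id):
        # Placeholder for the existing async path (lazy I/O, fallback, ...).
        await asyncio.sleep(0)
        return self._messages.get(msg_id, msg_id)


def run_l10n_frame(ctx, pending_ids, resources_changed):
    """Return ("sync", translations) or ("async", coroutine)."""
    ids = list(pending_ids)                                   # step 1
    all_present = all(ctx.has_message(i) for i in ids)        # step 2
    if not resources_changed and all_present:
        return "sync", [ctx.format_sync(i) for i in ids]      # step 3

    async def frame():
        return [await ctx.format_async(i) for i in ids]

    return "async", frame()                                   # step 4
```

In this sketch the caller only has to await when the second element is a coroutine; when every ID is already present and nothing changed, the translations are available before the function returns.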

Eemeli, Erik - thoughts?

Flags: needinfo?(enordin)
Flags: needinfo?(earo)
See Also: → 1737951
See Also: → 1738056

> requested IDs are present in the loaded context.

That means we'd close the door to remodelling fallback for the case where a message is present but is broken in one of two ways:

  • The message does not contain the shape we want
  • The message value/attributes formatting errors out (broken formatter, missing reference or broken variable)

At the moment, we only fall back if a message is missing. In all other error scenarios we take the broken message.

This design change would be observable and would irreversibly lock us out of the ability to perform a fallback load, because has_message alone would now be sufficient to start a sync frame.

[0] https://github.com/projectfluent/fluent-rs/blob/master/fluent-fallback/src/bundles.rs#L300-L312

Unless we expand "check if all messages are present" to "try to format them synchronously" and are comfortable introducing a double pass:

  1. Try sync
  2. Fall back to async on errors

This may work quite well since we do have the design assumption that errors are rare and performance in case of errors is a trade-off we're willing to take.
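
A minimal sketch of that double pass, written in Python purely for illustration. Everything here is an assumption: bundles are modelled as plain dicts of `str.format` templates rather than Fluent patterns, and `FormatError`, `format_sync`, `format_with_fallback` and `translate` are hypothetical names, not fluent-fallback APIs.

```python
import asyncio


class FormatError(Exception):
    """Hypothetical error for a missing or broken message."""


def format_sync(bundle, msg_id, args):
    # A str.format template stands in for a Fluent pattern here.
    try:
        return bundle[msg_id].format(**args)
    except (KeyError, IndexError, ValueError) as exc:
        raise FormatError(msg_id) from exc


async def format_with_fallback(bundles, msg_id, args):
    # Stand-in for the existing async path: walk the locale chain in order.
    for bundle in bundles:
        await asyncio.sleep(0)  # stands in for lazy resource loading
        try:
            return format_sync(bundle, msg_id, args)
        except FormatError:
            continue
    return msg_id  # last resort: show the ID


def translate(bundles, requests):
    """Pass 1: try sync on the primary bundle. Pass 2: async on any error."""
    try:
        return "sync", [format_sync(bundles[0], i, a) for i, a in requests]
    except FormatError:
        async def frame():
            return [await format_with_fallback(bundles, i, a)
                    for i, a in requests]
        return "async", frame()
```

The cost of an error in this sketch is that the successful messages get formatted twice, which matches the trade-off described above: the error case is slower, the common case never leaves the sync path.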

Gijs pointed out that an alternative to a sync l10n frame would be to keep the frame async, but block painting on it.

This could solve the popupshowing problem if the actual showing was delayed until the pending l10n frame is completed.

:smaug, :emilio - how viable would that be?

Flags: needinfo?(emilio)
Flags: needinfo?(bugs)

In a general sense, I've been thinking about this quite a bit recently, albeit in the context of MF2. Here's my current generic opinion:

  1. Resource loading should be async.
  2. Message formatting should be sync.

This doesn't necessarily directly map to the current state of Fluent and its API, but at least it helps me answer your question on my part: I would be fine with complex async fallback handling needing to be implemented separately from the actual message formatting. I don't see why formatting should ever be doing anything so heavy or slow that it can't return synchronously.

Flags: needinfo?(earo)

I don't know what "l10n frame" means.

Blocking painting in general sounds hard. What else would be painted at that point? Or would nothing be painted? That means user-visible jank in practice.
Also, blocking a popup from being shown isn't just about blocking painting but about opening the popup, so the popup implementation would need to be changed too, or all the callers would need to wait for some notification.

On which thread does the message formatting happen? Is it guaranteed to be fast and not blocked by anything else?

Could we force l10n promises (?) to be resolved when the refreshdriver ticks the next time?
I guess it depends on what https://searchfox.org/mozilla-central/rev/011ed92913b38e950977ab3fc56ae68a8f3bca12/intl/l10n/Localization.cpp#329-330 and other related code does. When is that promise resolved?

Flags: needinfo?(bugs)

If the work happens on the main thread, perhaps the internal task could use renderblocking priority. That would mean execution would normally happen right after the current task.
(I don't know if we expose task priorities to Rust)

There is still the problem of what happens if an animation frame callback wants to use formatMessages. If we want to support that case too, then formatMessages really needs to be synchronous. (How fast is that call?)

We sorta provide the tools for this already (for popups at least), don't we? If you preventDefault() the popupshowing event, we stop opening the popup.

You could use that to trigger content translation, and call openPopup* again once that's done, couldn't you?

Flags: needinfo?(emilio)

@eemeli - that's the current state of Fluent as well. The low level FluentBundle API is completely sync and formats messages (your (2)). The Localization class is high level, wraps around FluentBundle, hooks in I/O and is mostly async (it does in fact have a sync mode tho!).

The challenge at hand is when those two interplay. If the request is for formatting, but as a result of formatting we learn that we want to fall back, what should happen? If the outer API is async (as it is right now), we can; if it is sync, we have to fall back synchronously or give up.

The declarative bindings do give us a way around it which I suggested as the "double pass" or "optimistic pass" approach.

:smaug suggested another optimization on Matrix:

so on which thread does l10n do all the work? Is it on the caller? https://searchfox.org/mozilla-central/source/intl/l10n/rust/localization-ffi/src/lib.rs#372 is possibly hinting about that
if that is the case, one option could be to use renderblocking priority for the relevant task that would mean basically end-of-current-task handling

:smaug - can you shed more light on what renderblocking priority is? It seems like it may actually fit very well into this problem space at low cost!

renderblocking is just a higher than vsync priority task
https://searchfox.org/mozilla-central/rev/d21e359bd26dd0a7ba216472184d6fed8f0afd48/xpcom/threads/nsIRunnable.idl#34

but if Fluent does some work in some background thread to process formatMessages, then that doesn't help much.

> but if Fluent does some work in some background thread to process formatMessages, then that doesn't help much.

The scenario we're trying to improve is when the async work is close to a no-op, but since it is async it pushes the resolution to after paint.
What you described sounds like it might give us a chance to remain async but complete before paint?

(In reply to Zibi Braniecki [:zbraniecki][:gandalf] from comment #8)

> [...] If the request is for formatting, but as a result of formatting we learn that we want to fall back, what should happen? If the outer API is async (as it is right now), we can; if it is sync, we have to fall back synchronously or give up.

I think we should indeed fall back synchronously, but allow for error reporting and handling separately from that.

From my PoV an optimal formatting API would always return synchronously with fallback if necessary, and use a side channel of some description to report the error, which could then trigger an async process for improving the fallback or even resolving the original issue. And to be clear, I continue to look at this with MF2 primarily in mind, rather than Fluent directly.
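
One way to picture the "return synchronously, report through a side channel" shape described above is the following Python sketch. It is an assumption-laden illustration, not an MF2 or Fluent API: `format_message` and the `on_error` callback are hypothetical, and a `str.format` template stands in for a real message pattern.

```python
def format_message(bundle, msg_id, args, on_error):
    """Always return a string synchronously; report problems via on_error."""
    template = bundle.get(msg_id)
    if template is None:
        on_error(msg_id, "missing message")
        return msg_id  # fallback value: the message ID itself
    try:
        return template.format(**args)
    except (KeyError, IndexError, ValueError) as exc:
        on_error(msg_id, f"format error: {exc!r}")
        return template  # fallback value: the unformatted pattern
```

The caller always gets something to display immediately. Whatever consumes `on_error` is then free to start an async process later, for example to load another locale and retranslate, without the formatting call itself ever being async.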

> I think we should indeed fall back synchronously

How can we? If the user has the following fallback chain: ["sr-SR", "sr-RU", "ru", "fr", "en"] and we encountered an error, would you like to force synchronous I/O to load second locale resources?

I think this is a no-go from the architecture perspective. We want the load to be async.

Nika took the time to help us by writing more complete async support in moz_task, which makes it possible to specify the priority of runnables.

I'm going to try to use that to elevate the priority of the Fluent frame and see how it affects the microtask impact on front-end work.

Blocks: 1739727
Flags: needinfo?(enordin)
No longer blocks: 1739727
Depends on: 1739727

I spun off bug 1746124 to evaluate :smaug's suggestion from comment 9.

This bug should remain open as an additional optimization as described in the initial comment - if we have all messages available in loaded context, there's no need to spin a task.

That would require deeper refactoring of the Localization async methods in the fluent-fallback crate, so I'm leaving it for later.

No longer depends on: 1739727

Filed an issue in fluent-rs to discuss how such an API may work and whether MF2.0 is shaping up to allow for it - https://github.com/projectfluent/fluent-rs/issues/244

Blocks: 1738056