Bug 1883793 - Catch negative durations and round to integer. r?chutten!
ClosedPublic

Authored by janerik on Mar 18 2024, 5:16 PM.

Diff Detail

Repository
rMOZILLACENTRAL mozilla-central
Lint
Lint Not Applicable
Unit
Tests Not Applicable
Build Status
Buildable 640741
Build 739415: arc lint + arc unit

Event Timeline

phab-bot changed the visibility from "Custom Policy" to "Public (No Login Required)".
phab-bot changed the edit policy from "Custom Policy" to "Restricted Project (Project)".
phab-bot removed a project: secure-revision.
chutten requested changes to this revision.Mar 18 2024, 7:35 PM
chutten added inline comments.
toolkit/components/glean/bindings/private/TimingDistribution.cpp
183

lround docs suggest

If the rounded value is outside the range of the return type, the value returned is unspecified, and a domain error or an overflow range error may occur (or none, depending on implementation).

so maybe we.. or we could.. uh..

Okay, I've done a little bit of reading on cppreference and, well, it's not great.

The * 1000.0 we're doing (and that TimeDuration's doing to get us from ToSeconds() to ToMicroseconds()) risks overflow, but that shouldn't be a problem for any reasonable duration, though, as std::numeric_limits<double>::max() is around 1.7x10^308. This gives us roughly 1.7x10^299ns of room, or roughly 5.4x10^282 _years_. So far so good.

We then need to round to an integer and get it into a uint64_t. lround proposes to get us 3/4 of the way there, but it has unspecified behaviour if it is outside the range of a long int. That's roughly 2^63ns or about 292 years, so we _should_ be good. But unspecified behaviour is bad: we'd like the result of giving us a nonsense TimeDuration to be an error (or at least a comment tapping the bug 1691073 sign).

We can static_cast<uint64_t>(std::round(durationNs)). std::round will take a double and round it to the nearest integral double value. This is great! ...until we exceed the range where double can represent all whole numbers. Which is 2^53 or 9007199254740992ns. Or about 104 days. Above that time we're going to be rounding further and further from "true" (first to the nearest even whole number. Than those that div by 4. And so on)

This is probably fine. Rounding between only the even numbers of _nanoseconds_ is an acceptable precision loss on durations longer than three months. Especially since with all the * 1000.0 in BaseTimeDuration and here means we're way off already. So I think I'd prefer std::round here.

Now, converting a double containing an integral value to a uint64_t, how do we do that... Well, static_cast<uint64_t> springs to mind. What will it do? Well, it prefers class conversion, reference conversion, then implicit conversion... and there does exist an implicit floating-integral conversion from double to uint64_t:

The fractional part is truncated, that is, the fractional part is discarded. If the truncated value cannot fit into the destination type, the behavior[sic] is undefined (even when the destination type is unsigned, modulo arithmetic does not apply).

Dagnabbit. Undefined again.

All this is to say that, to avoid undefined behaviour, I think we're going to need something like:

// `* 1000.0` is an acceptable overflow risk as durations are unlikely to be on the order of (-)10^282 years.
double durationNs = aDuration.ToMicroseconds() * 1000.0;
double roundedDurationNs = std::round(durationNs);
if (MOZ_UNLIKELY(roundedDurationNs < std::numeric_limits<uint64_t>::min() || roundedDurationNs > std::numeric_limits<uint64_t>::max())) {
  // TODO(bug 1691073): Instrument this error.
  return;
}
fog_timing_distribution_accumulate_raw_nanos(mId, static_cast<uint64_t>(roundedDurationNs));

It doesn't end up much different from lround in actuality: we still get to guard against the same behaviour. But it saves us against large durationNs a little more often.

toolkit/components/glean/tests/gtest/TestFog.cpp
283–288

There _has_ to be a better way to do this than a sleep, right? I mean, could we do FromSeconds(-1.0)?

292

I'm not sure I understand what "cannot be printed directly" means.

This revision now requires changes to proceed.Mar 18 2024, 7:35 PM
janerik updated this revision to Diff 838042.
chutten added a project: testing-approved.
chutten added inline comments.
toolkit/components/glean/tests/gtest/TestFog.cpp
289

Oh, that's neat. Do you think we should support that for ergo? (If so, please file a bug for me to backlog)

This revision is now accepted and ready to land.Mar 19 2024, 2:24 PM