Make WordPress Core

Opened 13 months ago

Closed 9 months ago

#58592 closed defect (bug) (fixed)

E2E tests: Investigate inconsistent navigation timeouts

Reported by: joemcgill Owned by:
Milestone: 6.4 Priority: normal
Severity: normal Version:
Component: Build/Test Tools Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

Recently, our automated end-to-end (E2E) tests have been failing inconsistently due to a Navigation Timeout error:

TimeoutError: Navigation timeout of 30000 ms exceeded

You can see an example of this type of error in a recent run of the workflow on a PR to update the About Page, which should not have any effect on E2E tests.

The full list of runs where you can see this happening inconsistently, but frequently, is here: https://github.com/WordPress/wordpress-develop/actions/workflows/end-to-end-tests.yml

Rerunning these failing jobs often fixes the issue, but we should investigate why this is happening and try to reduce the frequency if possible, since it leads to a poor experience and the unnecessary overhead of rerunning the jobs.

Change History (24)

This ticket was mentioned in Slack in #core by joedolson. View the logs.


13 months ago

#2 @mukesh27
13 months ago

#58591 was marked as a duplicate.

This ticket was mentioned in Slack in #core by mukeshpanchal27. View the logs.


13 months ago

#4 @SergeyBiryukov
13 months ago

These failures happen on tests using visitAdminPage(), sometimes even after multiple attempts.

It looks like the tests already increase the timeout to 30 seconds from the default 5 seconds. Perhaps GitHub is sometimes low on resources, and in that case increasing the timeout a bit further might help.
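
For illustration, raising the limit further might look like the sketch below. It assumes a Puppeteer-style `page` object, whose `setDefaultNavigationTimeout()` method takes a value in milliseconds. The helper name `raiseNavigationTimeout` is hypothetical and is not part of the Core test suite.

```javascript
// Hypothetical helper: raise the navigation timeout on a Puppeteer-style page.
// Assumes `page` exposes setDefaultNavigationTimeout( ms ), as Puppeteer's Page does.
function raiseNavigationTimeout( page, seconds ) {
	if ( ! Number.isFinite( seconds ) || seconds <= 0 ) {
		throw new RangeError( 'Timeout must be a positive number of seconds.' );
	}
	// Puppeteer expects milliseconds.
	const ms = Math.round( seconds * 1000 );
	page.setDefaultNavigationTimeout( ms );
	return ms;
}
```

In a test setup file this could be called as `raiseNavigationTimeout( page, 60 )` before the first navigation. The 60-second value is only an example. A longer limit would mask slow runners rather than explain them.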

This ticket was mentioned in Slack in #core by nosolosw. View the logs.


13 months ago

#6 @talldanwp
13 months ago

Gutenberg's Puppeteer e2e tests also started failing at some point, with a similar timeout issue.

After bisecting, it seems to be this commit that's causing the issue:
https://github.com/WordPress/wordpress-develop/commit/8d702842ce2c7dbb457624b8d66fe0536e1a8b48

(this is the associated pull request, which has links to the trac tickets - https://github.com/WordPress/wordpress-develop/pull/4570)

I noticed that if I change page.waitForNavigation( { waitUntil: 'networkidle0' } ) to page.waitForNavigation() in the e2e test, the test passes, so possibly the change in the PR interferes with how Puppeteer detects network idle? I'm not entirely sure.

edit: I haven't made any progress on this, though I've narrowed the failing tests down to no-store as the cause. waitUntil: 'networkidle0' means the test should wait until there are no active network requests, so the failure indicates that a network request is still active. I haven't been able to spot any active ones when testing manually, so it is unusual.
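
To make the reasoning above concrete, here is a simplified model of a network-idle check. It is a sketch only and is not Puppeteer's implementation. The `IdleTracker` class and its method names are made up for illustration, and the 500 ms quiet window follows Puppeteer's documented description of `networkidle0`.

```javascript
// Simplified model of a "network idle" check; NOT Puppeteer's actual code.
// Idle means: zero in-flight requests for at least `quietMs` milliseconds.
class IdleTracker {
	constructor( quietMs = 500 ) {
		this.quietMs = quietMs;
		this.inFlight = new Set();
		this.lastActivity = 0;
	}

	requestStarted( id, now ) {
		this.inFlight.add( id );
		this.lastActivity = now;
	}

	requestFinished( id, now ) {
		this.inFlight.delete( id );
		this.lastActivity = now;
	}

	isIdle( now ) {
		return this.inFlight.size === 0 && now - this.lastActivity >= this.quietMs;
	}
}

// A page whose requests all finish becomes idle after the quiet window.
const ok = new IdleTracker();
ok.requestStarted( 'page', 0 );
ok.requestFinished( 'page', 100 );
console.log( ok.isIdle( 700 ) ); // true

// One request that never finishes keeps the page busy up to the 30000 ms limit.
const stuck = new IdleTracker();
stuck.requestStarted( 'page', 0 );
stuck.requestStarted( 'lingering', 50 );
stuck.requestFinished( 'page', 100 );
console.log( stuck.isIdle( 30000 ) ); // false
```

Under this model, a single request that stays open is enough to produce the navigation timeout. That matches the observation that the test passes once the `networkidle0` condition is dropped.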

Last edited 13 months ago by talldanwp (previous) (diff)

This ticket was mentioned in Slack in #core-editor by talldanwp. View the logs.


13 months ago

#8 @talldanwp
13 months ago

A PR was merged in Gutenberg that removes no-store for the test environment - https://github.com/WordPress/gutenberg/pull/51826.

Core could probably use a similar approach. I'm not sure if it's a long-term fix, but it may help get things moving at this important time.

Having said that, I also notice that core failures started a little earlier than the cache control changes, so it could be that there's more going on.

This ticket was mentioned in Slack in #core by ocean90. View the logs.


13 months ago

This ticket was mentioned in PR #4729 on WordPress/wordpress-develop by @Clorith.


13 months ago
#10

  • Keywords has-patch has-unit-tests added

This is a WIP (Work In Progress) PR for the E2E tests that have been failing recently, created to allow tests to run and check that enhancements work within the action runners, as well as locally.

Trac ticket: https://core.trac.wordpress.org/ticket/58592

#11 @Clorith
13 months ago

In 56089:

Build/Test Tools: Switch frame container when testing block editor output.

When the block editor is rendered, the editor content is wrapped inside an iframe tag. The tool used to run End to End (E2E) tests, Puppeteer, puts all such frames into separate containers, but the initial test was checking if the parent page had a given selector, which was leading to timeout failures. Actively switching the container to the iframe wrapper, and setting it as the active context, helps ensure the expected selectors can be found and their content can be verified.

See #58592.
Props joemcgill, SergeyBiryukov, talldanwp, oglekler, Clorith.
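
The frame-switching idea in this commit message can be sketched as follows. This is an illustration and not the committed code. It assumes Puppeteer-style `page.frames()` and `frame.name()` methods. The helper name `getEditorFrame` is hypothetical, and the default frame name `editor-canvas` is an assumption.

```javascript
// Hypothetical helper: pick the iframe that holds the editor content.
// Assumes `page.frames()` returns frame objects with a `name()` method,
// as Puppeteer's Page and Frame do. The default frame name is an assumption.
function getEditorFrame( page, frameName = 'editor-canvas' ) {
	const frame = page.frames().find( ( candidate ) => candidate.name() === frameName );
	if ( ! frame ) {
		throw new Error( `No frame named "${ frameName }" was found.` );
	}
	return frame;
}
```

A test would then query the returned frame instead of the parent page, for example `await getEditorFrame( page ).waitForSelector( selector )`. That way the lookup runs in the context that actually contains the editor markup.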

#12 @Clorith
13 months ago

In 56090:

Build/Test Tools: Switch frame container when testing block editor output.

The initial commit added the frame lookup within the wrong test. This follow-up restores the previous test runner and adds the container lookup to the correct test.

Follow-up to [56089].

See #58592.

@Clorith commented on PR #4729:


13 months ago
#13

Committed to trunk in 7bc7fcf

#14 @SergeyBiryukov
13 months ago

There is a PR for the e2e-test-utils package (still waiting for approval at the moment) that aims to fix a race condition within the login procedure and hopefully resolve these timeout issues.

This ticket was mentioned in Slack in #core by nekojonez. View the logs.


13 months ago

This ticket was mentioned in Slack in #core by sergey. View the logs.


12 months ago

#18 @SergeyBiryukov
12 months ago

I was hoping that bumping the package version in PR 4877 (with PR 52144 for the e2e-test-utils package included) would resolve the timeouts, but it did not, so this needs some further investigation.

#19 @audrasjb
12 months ago

  • Milestone changed from 6.3 to 6.4

Moving to milestone 6.4 as WP 6.3 RC3 has been released.

This ticket was mentioned in Slack in #core by sergey. View the logs.


12 months ago

#21 follow-up: @desrosj
12 months ago

While I was updating the PR associated with #56658 leading up to [56378], I realized I didn't receive a single E2E workflow failure. It's possible that this is now resolved as a result of those changes.

#22 in reply to: ↑ 21 @SergeyBiryukov
12 months ago

Replying to desrosj:

While I was updating the PR associated with #56658 leading up to [56378], I realized I didn't receive a single E2E workflow failure. It's possible that this is now resolved as a result of those changes.

It appears that Performance tests still fail with the same timeouts, as well as the tests on PR 4877.

This ticket was mentioned in Slack in #core by sergey. View the logs.


11 months ago

#24 @SergeyBiryukov
9 months ago

  • Resolution set to fixed
  • Status changed from new to closed

This appears to be resolved in [56926] / #59517.
