Make WordPress Core

Opened 20 months ago

Last modified 6 days ago

#57271 assigned defect (bug)

Cron unschedule / reschedule event errors

Reported by: domainsupport's profile domainsupport Owned by: audrasjb's profile audrasjb
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 6.0
Component: Cron API Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

Multiple users have reported errors since v6.0 across installs with completely different themes and plugins ...

[09-Nov-2022 11:28:28 UTC] Cron unschedule event error for hook: do_pings, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {“schedule”:false,”args”:[]}

[09-Nov-2022 18:05:03 UTC] Cron unschedule event error for hook: wordfence_processAttackData, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {“schedule”:false,”args”:[]}

[21-Nov-2022 07:53:04 UTC] Cron reschedule event error for hook: wf_scan_monitor, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {“schedule”:”wf_scan_monitor_interval”,”args”:[],”interval”:60}

[21-Nov-2022 07:53:04 UTC] Cron unschedule event error for hook: wf_scan_monitor, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {“schedule”:”wf_scan_monitor_interval”,”args”:[],”interval”:60}

There are many more examples on the support thread https://wordpress.org/support/topic/cron-unschedule-event-error-for-hook/

I have so far been unable to find a way to manually re-produce this issue but this is what I have discovered so far ...

So this error happens when wp_unschedule_event() and wp_reschedule_event() fails.

could_not_set error is presented by _set_cron_array() which is the return function for wp_unschedule_event() and wp_schedule_event().

could_not_set error is returned by _set_cron_array() when the updating of the cron table fails with update_option( 'cron', $cron );.

update_option() fails when …

1) The $option parameter is empty
This should never happen
2) The old option value is the same as the new option value
This suggests the cron job is running more than once but may not be the only reason ... ?
3) The database insert query $wpdb->update() fails
This is a bit worrying but shouldn’t be the case as surely there should be an accompanying database error?

I think we can rule out (1) seeing as the $option parameter is never empty.

We can maybe rule out (2) because from the thread above, this happens on high traffic sites as well as low traffic sites and on sites using the default wp-cron firing as well as cron fired by the system … ?

The leaves (3) being the most likely cause of the "bug".

I have tried to assign to @audrasjb (I hope that's OK to do?) because they added the commit on the 20th September that started to highlight the issue https://github.com/WordPress/WordPress/commit/84d19848666a81584e0007a6ab7f7ad3c990d71a

I realise that this is not going to be easy to diagnose until the issue can be manually replicated but I think it should be looked into.

Oliver

Change History (88)

#1 @peterwilsoncc
20 months ago

I can see that wp_unschedule_event() may throw an error if the event being unscheduled is not set. Similarly wp_schedule_event() may throw the error if the event been scheduled is already set.

wp> wp_unschedule_event( time() - MONTH_IN_SECONDS, '57271cc', [], true );
=> class WP_Error#2637 (3) {
  public $errors =>
  array(1) {
    'could_not_set' =>
    array(1) {
      [0] =>
      string(39) "The cron event list could not be saved."
    }
  }
  public $error_data =>
  array(0) {
  }
  protected $additional_data =>
  array(0) {
  }
}

wp> $event_time = time() + MONTH_IN_SECONDS;
=> int(1672799267)
wp> wp_schedule_event( $event_time, 'weekly', '57271bcc', [], true );
=> bool(true)
wp> wp_schedule_event( $event_time, 'weekly', '57271bcc', [], true );
=> class WP_Error#2636 (3) {
  public $errors =>
  array(1) {
    'could_not_set' =>
    array(1) {
      [0] =>
      string(39) "The cron event list could not be saved."
    }
  }
  public $error_data =>
  array(0) {
  }
  protected $additional_data =>
  array(0) {
  }
}

I don't think these functions should return could_not_set in either of these circumstances. Whether the functions should return an error if called in this fashion is a discussion for elsewhere.

In both of these circumstances, the scenario being hit is number two in the original description: the options value already matches.

@domainsupport I noticed you mentioned in the support ticket you linked to that these events are not in your cron array, the leads me to suspect you are hitting these false error reports with wp_unschedule_event().

#3 @peterwilsoncc
20 months ago

  • Keywords needs-unit-tests added

I've linked a pull request to this issue to prevent the false could_not_set errors. It's in draft as it still needs some unit tests though.

#4 follow-up: @domainsupport
20 months ago

Thank you so much for taking a look at this.

Our site was (I say "was" because we've not had a PHP error notice about this for a few days now) showing errors for both "reschedule" and "unschedule". Other users in the support thread are still receiving multiple entries in their log files.

What you've explained sounds right ... BUT ... surely if events are attempting to unschedule when they don't exist and schedule when they do exist then this would suggest that the events are somehow running more than once (concurrently perhaps?) and have already unscheduled / rescheduled themselves?

If not then what else could cause this behaviour? And surely preventing the error for this scenario won't actually be resolving the root of the issue?

Thanks again.

#5 in reply to: ↑ 4 ; follow-ups: @peterwilsoncc
20 months ago

Replying to domainsupport:

What you've explained sounds right ... BUT ... surely if events are attempting to unschedule when they don't exist and schedule when they do exist then this would suggest that the events are somehow running more than once (concurrently perhaps?) and have already unscheduled / rescheduled themselves?

If not then what else could cause this behaviour? And surely preventing the error for this scenario won't actually be resolving the root of the issue?

My first thought was concurrency too, but I can't figure out the best way to test that idea.

The only other possible cause I could think of is if the event is been scheduled or unscheduled without plugins checking wp_next_scheduled() first. I did a quick scan of the plugin repository for the events recorded in the support ticket but couldn't see any likely causes initially.

Were you seeing the error in local or development enviroments?

#6 in reply to: ↑ 5 @tynanbe
20 months ago

Replying to peterwilsoncc:

Were you seeing the error in local or development enviroments?

Several LiteSpeed Cache users have reported similar issues happening on production sites, and it looks like they're not limited to plugins alone.

Error de evento de reprogramación de cron para el gancho: action_scheduler_run_queue. Código de error: could_not_set. Mensaje de error: No se ha podido guardar la lista de eventos del cron.. Datos: {"schedule":"every_minute","args":["WP Cron"],"interval":60}
Error de evento de anulación de programación de cron para el gancho: action_scheduler_run_queue. Código de error: could_not_set. Mensaje de error: No se ha podido guardar la lista de eventos del cron.. Datos: {"schedule":"every_minute","args":["WP Cron"],"interval":60}

#7 in reply to: ↑ 5 ; follow-up: @domainsupport
20 months ago

Replying to peterwilsoncc:

Were you seeing the error in local or development enviroments?

My experience is with a live site https://ryesussex.uk which receives about 15,000 unique visitors per month.

The following cron hooks ...

  • do_pings (I've personally seen this one)
  • wp_update_plugins
  • wp_privacy_delete_old_export_files
  • wp_version_check

... are all core hooks (I believe) and have all been seen in the support thread which rules out a plugin not using wp_next_scheduled() properly I believe?

One theory I had was concurrent requests to a WordPress installation from different hostnames (with and without "www." or with and without SSL for example, they don't have to be completely different domains) firing the same cron event?

In fact, if a site were visited via "http://" and WordPress then redirected to "https://" would these two requests (so close together) be able to fire the same cron event before it had completed and been removed from the schedule?

Oliver

#8 in reply to: ↑ 7 ; follow-up: @peterwilsoncc
20 months ago

Replying to domainsupport:

[snip]
... are all core hooks (I believe) and have all been seen in the support thread which rules out a plugin not using wp_next_scheduled() properly I believe?

Yes, that's correct. They are all core hooks.

One theory I had was concurrent requests to a WordPress installation from different hostnames (with and without "www." or with and without SSL for example, they don't have to be completely different domains) firing the same cron event?

In fact, if a site were visited via "http://" and WordPress then redirected to "https://" would these two requests (so close together) be able to fire the same cron event before it had completed and been removed from the schedule?

wp-cron.php sets a transient in order to prevent events running in this manner but this might be possible if the time between two requests is very quick or if the site is running a plugin that disable transients.

It's certainly something worth looking in to, thanks.

#9 in reply to: ↑ 8 ; follow-up: @domainsupport
20 months ago

Replying to peterwilsoncc:

One theory I had was concurrent requests to a WordPress installation from different hostnames (with and without "www." or with and without SSL for example, they don't have to be completely different domains) firing the same cron event?

wp-cron.php sets a transient in order to prevent events running in this manner but this might be possible if the time between two requests is very quick or if the site is running a plugin that disable transients.

I'd forgotten about this comment where he says he's not a high traffic site and his cron is set to run by the system so does that not rule out the concurrent request theory? https://wordpress.org/support/topic/cron-unschedule-event-error-for-hook/page/2/#post-16255896

#10 in reply to: ↑ 9 ; follow-up: @chargeup
20 months ago

Thinking out loud. If the cron job is set on the server for this domain ��http://domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1 then redirects to https://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1 could that also cause a concurrent request? Just mentioning this as a possible cause in those cases for the concurrent request theory.

Replying to domainsupport:

Replying to peterwilsoncc:

One theory I had was concurrent requests to a WordPress installation from different hostnames (with and without "www." or with and without SSL for example, they don't have to be completely different domains) firing the same cron event?

wp-cron.php sets a transient in order to prevent events running in this manner but this might be possible if the time between two requests is very quick or if the site is running a plugin that disable transients.

I'd forgotten about this comment where he says he's not a high traffic site and his cron is set to run by the system so does that not rule out the concurrent request theory? https://wordpress.org/support/topic/cron-unschedule-event-error-for-hook/page/2/#post-16255896

#11 in reply to: ↑ 10 @domainsupport
20 months ago

Replying to chargeup:

Thinking out loud. If the cron job is set on the server for this domain http://domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1 then redirects to https://www.domain.com/wp-cron.php?doing_wp_cron >/dev/null 2>&1 could that also cause a concurrent request? Just mentioning this as a possible cause in those cases for the concurrent request theory.

Looking at the way the transient is set in wp-cron.php with microtime() it would be almost impossible for the cron to be running concurrently as @peterwilsoncc says without the transient being disabled (which it isn't on our site). It would require two requests to fire at exactly the same microsecond (different hostname request wouldn't matter as the transient is stored in the database). The chances of this happening would be infinitesimally small and would not present the quantity of errors we are seeing.

It's an absolute mystery!

#12 @domainsupport
20 months ago

Having not had these errors in the logs for a while, they have made a re-appearance ...

[16-Dec-2022 06:48:13 UTC] Cron reschedule event error for hook: wf_scan_monitor, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"wf_scan_monitor_interval","args":[],"interval":60}
[16-Dec-2022 06:48:13 UTC] Cron unschedule event error for hook: wf_scan_monitor, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"wf_scan_monitor_interval","args":[],"interval":60}
[16-Dec-2022 06:48:15 UTC] Cron reschedule event error for hook: wf_scan_monitor, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"wf_scan_monitor_interval","args":[],"interval":60}

I think this is significant because I thought that the update of core itself could have been a cause but there hasn't been an update for a long time. Furthermore, the schedule for this hook is 1 minute and it's a custom schedule / hook. This means that whatever the cause is, it is independent of core / custom schedule and independent of core / custom hook.

#13 @courageous999
19 months ago

We're a live but not a high traffic site that has also started seeing this exact error since upgrading WordPress to v6.1.1. The errors started to appear sometime in November but I've been patiently ignoring them (since at the time I did not find any mention of it by anyone online).

I had the following custom CRON event set in my child theme's functions.php file:

wp_schedule_event(strtotime('10:00:00'), 'daily', 'ft_remove_stripe_cron');

This was throwing the following error in my error_log about 50 times a minute:

[20-Dec-2022 15:59:02 UTC] Cron reschedule event error for hook: ft_remove_stripe_cron, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"daily","args":[],"interval":86400}

After going through this ticket, I was finally able to find out that what I needed was the following:

if ( !wp_next_scheduled('ft_remove_stripe_cron') )
    wp_schedule_event(strtotime('10:00:00'), 'daily', 'ft_remove_stripe_cron');

Suffice to say that I have now had over 20 minutes of peace and quiet in my error_log since adding this!

Now my understanding is that my implementation was always wrong and was always failing to schedule an already set CRON event (but silently before). However, I find it intriguing (and perhaps of value to your investigation) that I was receiving that error 50 times a minute. I guarantee you that our site was not receiving 50 visits a minute! I highly doubt it even had 1 link clicked a minute.

Lastly, I should also mention that I've occasionally (amidst the barrage of repeated CRON event errors) seen other CRON events (not sure if set by a plugin or WordPress itself) also failing with this exact error.

EDIT:
By the way, if you're going to introduce a new error check, it would have immensely helped if you added some documentation with the update about it. On the 2nd day after this update, I searched your forums and Google for this exact error and found exactly zero results. Rather than wait a month and a half for a ticket addressing this, it would have saved me a lot of time and headache if there was some documentation regarding this new error check somewhere that simply points in the right direction to fixing it (my case was a fairly simple fix).

EDIT 2:
Just found these on a completely different site today:

[07-Jan-2023 12:54:56 UTC] Cron reschedule event error for hook: action_scheduler_run_queue, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"every_minute","args":["WP Cron"],"interval":60}
[07-Jan-2023 12:54:56 UTC] Cron reschedule event error for hook: wp_session_database_gc, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"hourly","args":[],"interval":3600}
Last edited 19 months ago by courageous999 (previous) (diff)

#14 @johnbillion
19 months ago

#57448 was marked as a duplicate.

#15 @AndyHenderson
18 months ago

FWIW, I've just seen this error in our log:

[01-Feb-2023 19:17:22 UTC] Cron unschedule event error for hook: backwpup_cron, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":false,"args":{"arg":"restart"}}

We implemented 6.1.1 shortly after release but have not seen the error until now. OTOH, we implemented PHP 8.1 from 7.4 yesterday. Coincidence? I hope I'm not setting a hare running.

Andy

#16 follow-up: @domainsupport
18 months ago

I've not seen the errors appear in our logs for some time now (although I'm under no illusion that the issue has resolved itself). However in all instances that we have seen this issue, it was with PHP 7.4 so @AndyHenderson I don't think that your PHP update to v8.1 was relevant.

Oliver

#17 in reply to: ↑ 16 ; follow-up: @j3gaming
17 months ago

Replying to domainsupport:

I've not seen the errors appear in our logs for some time now (although I'm under no illusion that the issue has resolved itself). However in all instances that we have seen this issue, it was with PHP 7.4 so @AndyHenderson I don't think that your PHP update to v8.1 was relevant.

Oliver

I'm still experiencing this and it's deleting my cron jobs. Are you saying the fix is to update PHP? Is there any code changes for this coming up in the next version of wordpress?

#18 in reply to: ↑ 17 @courageous999
17 months ago

Replying to j3gaming:

Replying to domainsupport:

I've not seen the errors appear in our logs for some time now (although I'm under no illusion that the issue has resolved itself). However in all instances that we have seen this issue, it was with PHP 7.4 so @AndyHenderson I don't think that your PHP update to v8.1 was relevant.

Oliver

I'm still experiencing this and it's deleting my cron jobs. Are you saying the fix is to update PHP? Is there any code changes for this coming up in the next version of wordpress?

We're currently running WordPres 6.1.1 and we updated to PHP 8.1 a little over a month ago and I'm still seeing errors like this in the logs about once a month (some of these are from plugins like Wordfence it seems):

[17-Feb-2023 15:24:47 UTC] Cron reschedule event error for hook: action_scheduler_run_queue, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"every_minute","args":["WP Cron"],"interval":60}
[08-Feb-2023 21:36:00 UTC] Cron reschedule event error for hook: wf_scan_monitor, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"wf_scan_monitor_interval","args":[],"interval":60}

So I can confirm that updating to PHP 8.1 will not resolve this issue.

Furthermore, I don't think that WordPress is officially ready for PHP 8.1 yet as our logs are filled with errors from the "wp-includes" folder in WordPress.

Last edited 17 months ago by courageous999 (previous) (diff)

#19 @domainsupport
17 months ago

@j3gaming No, I'm not saying updating PHP will solve your issue, I'm saying that updating PHP didn't cause the issue. There is still no fix for the issue as the cause has still not presented itself or been discovered.

#20 @j3gaming
17 months ago

I figured out how to trigger this 100% of the time, which also ends up deleting 1 or multiple cron tasks.

Environment:
Linux, with linux system crons running once a minute (crontab -e)
Webmin
PHP 7.4
Wordpress 6.1.1 (DISABLE_WP_CRON = true)

Within webmin, run a database backup. This usually takes ~5 minutes to complete for my website. Doing anything that would generally freeze/stop the website & database would probably work

During that time, the linux cron will run. The result is this error message and 1 or more cron jobs being deleted.

Log: https://dev.fillmore.mobi/wp-content/uploads/fillmore.error.log

Last edited 17 months ago by j3gaming (previous) (diff)

#21 @j3gaming
17 months ago

-deleted

Last edited 17 months ago by j3gaming (previous) (diff)

#22 @domainsupport
17 months ago

@peterwilsoncc I'm confident that I now know what's causing this.

After the comment from @j3gaming, I modified line 520 of /wp-includes/option.php as follows ...

<?php
        $result = $wpdb->update( $wpdb->options, $update_args, array( 'option_name' => $option ) );
        if ( ! $result ) {
if ('cron' === $option) { error_log('Cron failed to save to the database'); }
                return false;
        }

Then I waited for the next could_not_set error notice and sure enough ...

[05-Mar-2023 01:01:02 UTC] Cron failed to save to the database
[05-Mar-2023 01:01:02 UTC] Cron reschedule event error for hook: wf_scan_monitor, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"wf_scan_monitor_interval","args":[],"interval":60}

So this confirms that the issue occurs when (for whatever reason) update_option() is unable to save to the database.

This makes me immediately ask two questions ...

  1. What mechanism exists whereby if $wpdb->update() fails in update_option(), the request is saved (maybe as .json to the file system) until such time as the issue with the database has been resolved and the $wpdb->update() failures can be re-run or does this need to be developed?
  2. What other update_option() failures are going un-recorded?!

If the answer to 2. is unknown then I think this ticket's severity needs to be upgraded to "major" because it would mean that swathes of update_option() requests could be (and probably are) going silently uncompleted if there is a glitch with the database.

Oliver

#24 follow-up: @galbaras
17 months ago

update_option() calls wpdb::update(), which doesn't check for errors, but it called wpdb::query(), which seems to be outputting messages when errors occur. However, it doesn't do this when the database connection is lost, in which case, I retries, and the dies if unsuccessful.

This may be our real culprit. See here.

Also, wpdb::check_connection() uses $this->reconnect_retries, which is set to 5 without a filter, and sleep(1), also without a filter in the sleep duration. So if the connection isn't restored in 5 seconds, it's game over, wheras in real life, restoring the connection can take a lot longer.

Changing these values arbitrarily could cause pages to stall for too long, but assuming this is a rare occurrence, or, as is the case with cron, waiting isn't an issue, this may not be such a bad idea.

In fact, checking again in 1 second, then 5 seconds, then 10 seconds, then 30 seconds, then 1 minute can help get a quick reconnection when possible, but still a reconnection when that takes longer. It's the sort of retry logic used in networking and other systems to increase reliability.

#25 in reply to: ↑ 24 @j3gaming
17 months ago

Replying to galbaras:

Does this logic re-fire a cron? This gets into a tricky situation where you're actually re-running a cron because of a database issue, but what if the cron is doing non-database related things, processing data or emailing people for example? You may not want the cron to be re-trying to fire for the next hour. Also, what if the cron is what's causing the connection issues.

Need to either be able to pick up from exactly where you left off, or if a cron hit a connection error, I wouldn't schedule it to try again asap. Just wait until it's regularly scheduled next. Or, maybe a user defined option for if a cron should retry if you get a database error.

Last edited 17 months ago by j3gaming (previous) (diff)

#26 @galbaras
17 months ago

Does this logic re-fire a cron?

No, it just (almost) ensures that settings are updated when they need to be, and the cron setting will be covered too.

You may not want the cron to be re-trying to fire for the next hour

It won't fire any differently. While the database connection is down, cron won't "know" what to do, but (nearly) as soon as it comes up, the cron setting will be updated, and the next run will do the right thing.

#27 @jbcouton
15 months ago

Hello @audrasjb,

To follow up on our discussion at WordCamp Paris about the CRON error
Here is the return of WP Rocket:
This CRON error about the preload process is not something that can be handled/caused by WP Rocket.
This cron error is related to WordPress itself. It seems that there is something in the latest version that is causing this error.

See this comment for reference: https://wordpress.org/support/topic/cron-unschedule-event-error-for-hook/page/2/#post-16255658

#28 follow-up: @Krstarica
15 months ago

Any update on this? This happens a lot on high traffic site.

#29 in reply to: ↑ 28 @domainsupport
15 months ago

Replying to Krstarica:

Any update on this? This happens a lot on high traffic site.

This is where I am with it https://wordpress.org/support/topic/cron-unschedule-event-error-for-hook/page/9/#post-16740674

#30 @swissspaceboy
14 months ago

Can we not add some extra traces in the code when we catch this error? It would help us to understand what has happened or what is ongoing.

Possible to add to the next patch? Or a beta fix that I am willing to test.

Didier.

This ticket was mentioned in Slack in #forums by jan_dembowski. View the logs.


13 months ago

#32 @domainsupport
13 months ago

This issue's corresponding topic on the support forum has been closed to new comments and many replies archived so I will re-iterate everything I have learned surrounding the could_not_set error notice here for completeness and to hopefully spark some kind of path to a resolution.

The could_not_set notice appears when ...

  • The $option parameter of the update_option() command is empty. This will never happen (it’s always set, to "cron" when the CRON schedule is being saved).
  • The old option value is the same as the new option value. This suggests the cron job is running more than once (the chances of this happening are almost non-existent as they would require the transient to be the same which would require the two concurrent requests to have been received within the same microsecond).
  • The database insert query $wpdb->update() fails.

So I realise it’s a bitter pill … but everything about this issue points to the database for whatever reason.

It also explains why a single solution cannot be found because there are several reasons that can cause an issue with the database. Perhaps the database has become disconnected (see this comment by @galbaras). Perhaps the database is overwhelmed / out of RAM / has too many connections. Perhaps something has locked the database up. Who knows, the reason will almost certainly be unique to your experience.

If you’re not seeing anything in your database log … maybe enable more verbose logging (if that is possible) and find out exactly what happens with the database at the same timestamp as the error notice. For example, if the database had disconnected then there may not be an actual error in the database log.

And just because your site has low legitimate traffic, doesn’t mean that the database can’t be overwhelmed. All WordPress sites are regularly scanned for vulnerabilities by hackers and each scan is an enormous strain on a server with hundreds or (more often) thousands of requests that can look (and act) like a DDoS attack. We see it all the time and it’s why we developed Deny All Firewall to only allow content that exists to be requested to mitigate most of these scans and attacks.

I looked into duplicate CRON causing the issue right at the beginning but ruled it out because the transient would require the two requests to be fired at exactly the same millisecond. An unlikely scenario that even if it were possible would not be so frequently observed.

However … transients are stored in the database … you see where I’m going here? 🙂

If the doing_cron transient cannot be stored by one CRON request because the database is unavailable (overwhelmed / out of RAM / has too many connections, locked, disconnected, etc, etc) then it cannot be checked by another CRON request and is another symptom of the same underlying issue … namely … the database.

The problem, as I mentioned above, is that it’s very likely that each database issue is different. If I’m right (and I suspect that I am) then this entire thread is just discussing the symptom of a completely different, multi-facetted problem; namely "any database issue".

So where do we go from here ... ?

I would suggest a two-pronged attack on this one:

  1. If your website is experiencing this issue then try to identify your database issue perhaps by asking your ISP:
    • How you can gain access to your verbose MySQL error logs?
    • How many concurrent connections are allowed to your database?
    • Can they see if you exceeding this?
    • How much RAM is given to your database?
    • Can they see if you exceeding that?
    • Has your MySQL service had any interruptions?
    • Has your MySQL connection disconnected in PHP?
    • I’m not 100% sure if all of these questions are valid (and it’s not an exhaustive list) but if you can get timestamps for anything highlighted from them and marry any database issues up to the PHP error notices … you may have something you can try to resolve.
  2. WordPress does not seem to identify and / or notify via the logs when there are database issues such as those mentioned above. I think there are strong grounds to look into some kind of active database error logging which would be extremely useful to site owners so they know when their database needs attention.

What do you think?

#33 follow-up: @Krstarica
13 months ago

After changing crontab entry from:

*/5 * * * * curl https://example.com/wp-cron.php?doing_wp_cron > /dev/null 2>&1

to:

*/5 * * * * cd /var/www/example.com/htdocs; wp cron event run --due-now > /dev/null 2>&1

don't see "Cron unschedule event error for hook" errors in the log anymore.

#34 in reply to: ↑ 33 @domainsupport
13 months ago

Replying to Krstarica:

After changing crontab entry from:

*/5 * * * * curl https://example.com/wp-cron.php?doing_wp_cron > /dev/null 2>&1

to:

*/5 * * * * cd /var/www/example.com/htdocs; wp cron event run --due-now > /dev/null 2>&1

don't see "Cron unschedule event error for hook" errors in the log anymore.

I'm not 100% sure but you may have only silenced the notifications by running WP CRON this way rather than fix anything?

As per the WP-CLI command documentation you may need to use the --debug parameter to see PHP errors?

#35 @Krstarica
13 months ago

I see other warnings in the log as a result of cron job, so they shouldn't be silenced, unless it is done in the WP code itself.

Debug output is too messy.

#36 @swissspaceboy
13 months ago

Since I have moved to system cron instead of the WP internal scheduler, I have NO errors anymore on 5 sites. My cron command is:

wget -q -O - <http://www.example.com/wp-cron.php> >/dev/null 2>&1

I see other examples with curl, or sh, but this wget is working fine for me. I execute it every 5 minutes.

Maybe the PHP errors are still there, but as I cannot do anything about them, they don't pollute my empty logs anymore

Didier.

Last edited 13 months ago by swissspaceboy (previous) (diff)

#37 @hansgv
13 months ago

For what it's worth, I'm experiencing the same issue.

I tried switching to executing the cron jobs through the system cron every 5 minutes and setting define( 'DISABLE_WP_CRON', true ); in wp-config.php. It didn't seem to do anything noticeable to the issue.

My site isn't "high traffic" and it's exclusively static page content. My db is on AWS Aurora (MySQL 5.7, db.medium) and I cache (page, object, database, fragment, browser) with W3 total Cache to minimize db requests.

My database stats are very moderate. Connection count never comes even close to the max (typically 2 simultaneous connections, 7 tops), about 600M of freeable memory available, CPU utilization is consistently under 15%, nothing in error logs, nothing gets logged into slow_log (> 2 sec).

Last edited 13 months ago by hansgv (previous) (diff)

#38 @hansgv
13 months ago

The patch in this thread doesn't fix the issue for me.

#39 @hansgv
9 months ago

@audrasjb This ticket has been open for 11 months. I wonder if it's still on the radar. I'm still experiencing from it on 6.4.1.

Happy to help test possible fixes and provide additional feedback, if needed.

#40 @gas1123
7 months ago

Pressable Team help comfirm this issue still exit

@audrasjb @peterwilsoncc pls help upate fix it

#41 @krasi1
7 months ago

facing the issue as well, happy to provide any debugging info

Cron reschedule event error for hook: litespeed_task_lqip, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"litespeed_filter","args":[],"interval":60}

#42 @galbaras
7 months ago

Oh, it's definitely still an issue for me with Broken Link Checker on multiple sites.

#43 @tonygao
6 months ago

Jan-2024 02:18:07 UTC] Cron reschedule event error for hook: action_scheduler_run_queue, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"every_minute","args":WP Cron?,"interval":60}

#44 @galbaras
6 months ago

Interestingly, some of my sites had nested schedules that made no sense, e.g. a WooCommerce task as the child of another plugin's task. So this issue could be caused by data corruption, possibly something that was created by a temporary bug, and then retained in the system.

I've just audited my sites' cron schedules again and found several sites with multipe itsec_cron entries with no data whatsoever, which should never have been saved.

I've also found cron tasks with no specified action, and tasks belonging to plugins that have been removed from the respective sites.

Is there some way to rebuild the cron schedule?

#45 @domainsupport
6 months ago

@galbaras We rarely see this error anymore but when we do see it it always points to some kind of issue with the database (usually a lack of resources, temporary or otherwise) which prevents the cron from being saved into the options table.

It's not a welcome explanation to some but we have yet to find a situation where this has not been the route cause.

We created a plugin to detect cron schedules that have no associated actions and find it to be very common that plugins forget to register a deactivation hook to clean up their cron schedules. So common that we have even found and raised issues with WPForms, MonsterInsights, Site Kit by Google and even Jetpack!

Duplicate cron schedule entries are also usually attributed to plugin error (or a race condition) which we have also seen fairly regularly. We don't know of a way to "rebuild" the cron schedule but our previously mentioned plugin uses this to remove empty cron schedules if it is of use ...

<?php
                $empty_event_hooks = wordpress_maintenance_get_empty_cron_event_hooks();

                foreach ($empty_event_hooks as $hook) {

                    wp_clear_scheduled_hook($hook);

                }

       function wordpress_maintenance_get_empty_cron_event_hooks() {

            $cron_events  = _get_cron_array();
            $empty_event_hooks = array();

            if (is_array($cron_events)) {

                global $wp_filter;

                foreach ($cron_events as $cron_event) {

                    foreach ($cron_event as $hook => $value) {

                        if ('wc_admin_unsnooze_admin_notes' !== $hook &&!isset($wp_filter[$hook])) {

                            $empty_event_hooks[] = $hook;

                        }

                    }

                                }

                        }

            return array_unique($empty_event_hooks);

        }

#46 follow-up: @galbaras
6 months ago

Would it be possible to call all the plugin and theme registration and/or activation hooks (after deleting all the cron schedules)?

Is it completely safe to delete tasks with no action? I can see some from respectable plugins, like Redirection and iThemes (now Solid) Security.

I support the @domainsupport suggestion of improving database error logging. If we know WHY things fail, or at least WHERE they fail, we should be able to fix them.

#47 in reply to: ↑ 46 @tonygao
6 months ago

Totally agree with you!

If we can know more details then we may fix it. At least, I can use error_log function to trace it.

Replying to galbaras:

Would it be possible to call all the plugin and theme registration and/or activation hooks (after deleting all the cron schedules)?

Is it completely safe to delete tasks with no action? I can see some from respectable plugins, like Redirection and iThemes (now Solid) Security.

I support the @domainsupport suggestion of improving database error logging. If we know WHY things fail, or at least WHERE they fail, we should be able to fix them.

#48 @domainsupport
6 months ago

@galbaras @tonygao It may be that `wpdb::print_error ` hooked onto the update_option_cron hook would be sufficient to find out what's going on but I seem to remember that this wasn't enough when I was trying to get to the bottom of the issue and that a disconnected database, a query timeout or other database issue didn't result in an error when update_option( 'cron', $cron ); failed. If so, this would require more detailed error logging from $wpdb which is a larger ask of WordPress Core ... ?

@galbaras As for regenerating cron, in theory, if you were to completely delete the cron setting from the wp_options table then WordPress would just rebuild it so long as everyone uses the recommended method to add cron schedules ...

<?php
if ( ! wp_next_scheduled( 'bl_cron_hook' ) ) {
    wp_schedule_event( time(), 'five_seconds', 'bl_cron_hook' );
}

... and does so whenever they are relying on using an existing schedule!

For example, I never assume that a cron schedule exists just because (for example) I added it in a plugin activation hook so I make sure it's still there when I need it when developing a plugin. But I would imagine that just as plugin developers don't clean up after themselves with a deactivation hook, they may not check to make sure their schedules exist either.

#49 @emilycestmoi
5 months ago

I've had this issue pop up in my error logs for some time now and finally spent 30 minutes or so debugging it today. This error is happening for me roughly every 30 minutes on a fairly high traffic site (3000 uniques per day) with DISABLE_WP_CRON = true and wp-cron.php being called */1 (every 1 minute) from crond.

The issue is as described in #comment:1. The $cron data being set with update_option('cron', $cron) is the same as the value returned by get_option('cron'). This causes update_option('cron', $cron) to return false which is then interpreted as an error in _set_cron_array().

From option.php:

/*
	 * If the new and old values are the same, no need to update.
	 *
	 * Unserialized values will be adequate in most cases. If the unserialized
	 * data differs, the (maybe) serialized data is checked to avoid
	 * unnecessary database calls for otherwise identical object instances.
	 *
	 * See https://core.trac.wordpress.org/ticket/38903
	 */
	if ( $value === $old_value || maybe_serialize( $value ) === maybe_serialize( $old_value ) ) {
	        // XXX-THIS IS RETURNING FALSE
		return false;
	}

It looks like the pull from #comment:1 (https://github.com/WordPress/wordpress-develop/pull/3726) will fix the issue, although I haven't tried it yet. There is no drawback of ignoring the error as it's simply discarding duplicate cron data, but I'm still unsure why get_option('cron') is actually returning the duplicate of what is attempting to be saved in some cases (my initial thought is caching and a race condition?)--this will take more time to investigate. For now I'm going to patch this on my site to stop the error from coming up when duplicate data is saved. I will investigate further when I have time.

Hopefully the fix gets committed soon.

Last edited 5 months ago by emilycestmoi (previous) (diff)

#50 @emilycestmoi
5 months ago

Just as I posted the above I checked my logs again and found another instance of "Cron reschedule event error". When I checked my debug logging I found that this time the error occurred for an entirely different reason--the $wpdb->update() command failing in option.php:

From option.php:

	$result = $wpdb->update( $wpdb->options, $update_args, array( 'option_name' => $option ) );
	if ( ! $result ) {
                // XXX-THIS IS RETURNING FALSE
		return false;
	}

This would mean that the issue is likely more complex than was initially described in #comment:1 and would not be resolved by the pull I linked above. I suspect there is a single issue causing both of these errors (feels like a race condition), but I will need to open $wpdb and see why the update is failing.

Last edited 5 months ago by emilycestmoi (previous) (diff)

#51 @galbaras
5 months ago

Good work, @emilycestmoi. Seems to point back to a need for more information from DB failures. Once we have that, other problems may also become detectable and solvable.

#52 @emilycestmoi
5 months ago

Spent a few more minutes on this today as I am consistently seeing this error every 30 minutes or so. I added some debug logging in class-wpdb.php. Basically what I am seeing when this error occurs is that mysqli_query("UPDATE wp_options SET option_value = 'THE_SERIALIZED_DATA' WHERE option_name = 'cron'"); does not throw any errors but the affected row count after the query is 0, which is the source of the error that propagates back up to cause the error we are seeing.

From class-wpdb.php:

private function _do_query( $query ) {
    if ( defined( 'SAVEQUERIES' ) && SAVEQUERIES ) {
	$this->timer_start();
    }

    if ( ! empty( $this->dbh ) ) {
        
    	$this->result = mysqli_query( $this->dbh, $query );
        // AFFECTED ROWS = 0 FOR THE UPDATE HERE -- mysqli_affected_rows( $this->dbh ) === 0
    }

...

I am attempting to take a look at the mysqli driver to see if I can get any additional error information out of it. Interestingly if I add a query like "mysqli_query($this->dbh, "SELECT COUNT(*) FROM wp_options WHERE option_name = 'cron'");" to check to see if there is a wp_options entry for cron before the mysqli_query($this->dbh, "UPDATE ... WHERE option_name = 'cron'") runs then the failed update moves to a different query (in my case a woocommerce UPDATE returns 0 affected rows where 1 is expected, and I get a different exception in woocommerce). It seems like the $this->dbh is not executing the update/or is overloaded/or is waiting or something and/or skipping the update for some reason/??.

My DB (Mysql 8.0.36) has plenty of resources, and has no issues with connection limits (Max connections I've ever seen are less than half of my mysql hard limit), additionally this error does not occur anywhere else on my server, so it must be something independent of mysql resources.

Will see if I can get more error information out of mysqli.

#53 @tonygao
5 months ago

@emilycestmoi

Thanks for all your effort! Wish this annoy problem solved soon!

#54 follow-ups: @emilycestmoi
5 months ago

Ok, I can confirm that the issue is due to the cron option_value being updated to the identical value that already exists in the database. That when the query runs:

UPDATE `wp_options` SET `option_value` = serialized_data WHERE `option_name` = 'cron'

Rows matches: 1 but Changed: 0.

This is why mysqli_affected_rows() returns 0 which triggers the whole error.

I can see in my logs that a query with an UPDATE is ran with serialized_data (and it succeeds) just before the UPDATE that fails with the identical serialized_data.

So the question is, can we safely ignore this error or do we need to have additional locking to prevent the query from running twice with the same serialized_data (or to reload the cron wp_options data before saving)?

#55 in reply to: ↑ 54 @j3gaming
5 months ago

Replying to emilycestmoi:

Ok, I can confirm that the issue is due to the cron option_value being updated to the identical value that already exists in the database. That when the query runs:

UPDATE `wp_options` SET `option_value` = serialized_data WHERE `option_name` = 'cron'

Rows matches: 1 but Changed: 0.

This is why mysqli_affected_rows() returns 0 which triggers the whole error.

I can see in my logs that a query with an UPDATE is ran with serialized_data (and it succeeds) just before the UPDATE that fails with the identical serialized_data.

So the question is, can we safely ignore this error or do we need to have additional locking to prevent the query from running twice with the same serialized_data (or to reload the cron wp_options data before saving)?

It's my opinion that the race condition is trigged by the cron running twice. I proved it with logging that it's possible in another ticket.

-> Fix the cron being called twice
-> Function won't be called twice, in an unexpected order
-> no logging error, and the cron list won't be wiped

The locking system to prevent double cron calls, I don't think is any kind of industry standard. It just saves a unix timestamp to the database, and has a check for it.
Meaning that if the cron is called twice VERY QUICKLY (which my overpowered servers are able to), then it is able to skip the check.

We probably need sql database locking at the database level for a permanent and reliable fix.
Note, that after doing all that work, it only fixes the error of this type, and not all of the various logs that this ticket can apply to.

Last edited 5 months ago by j3gaming (previous) (diff)

#56 in reply to: ↑ 54 ; follow-up: @domainsupport
5 months ago

Replying to emilycestmoi:

Ok, I can confirm that the issue is due to the cron option_value being updated to the identical value that already exists in the database. That when the query runs:

UPDATE `wp_options` SET `option_value` = serialized_data WHERE `option_name` = 'cron'

Rows matches: 1 but Changed: 0.

This is why mysqli_affected_rows() returns 0 which triggers the whole error.

I can see in my logs that a query with an UPDATE is ran with serialized_data (and it succeeds) just before the UPDATE that fails with the identical serialized_data.

So the question is, can we safely ignore this error or do we need to have additional locking to prevent the query from running twice with the same serialized_data (or to reload the cron wp_options data before saving)?

Further to @j3gaming reply ... if you are seeing multiple UPDATE commands for the same serialised data when the cron option is updated then the queries are being run within the same second since the key to the serialised data is a unix timestamp.

Furthermore, as @j3gaming mentions, the lock that cron uses (also saved in the same wp_options table as a transient) is a microtome() timestamp so in order for there to be a race condition, either the two cron events would have to fire fast enough to not have enough time to save the first transient to the database or they would have to fire in exactly the same microsecond (unlikely).

How are you firing cron events and if via the system cron (rather than relying on the default requests to the server method) have you disabled the default cron by adding define( 'DISABLE_W_CRON', true); to wp-config.php?

The only other thing I want to mention here is to pose a question ... if $wpdb thought that the UPDATE query had failed (for whatever reason) would it attempt to retry? And if the original query had actually not failed and instead both queries were successful could this be a situation that was occurring?

Oliver

#57 in reply to: ↑ 56 ; follow-up: @j3gaming
5 months ago

Replying to domainsupport:

Replying to emilycestmoi:

Ok, I can confirm that the issue is due to the cron option_value being updated to the identical value that already exists in the database. That when the query runs:

UPDATE `wp_options` SET `option_value` = serialized_data WHERE `option_name` = 'cron'

Rows matches: 1 but Changed: 0.

This is why mysqli_affected_rows() returns 0 which triggers the whole error.

I can see in my logs that a query with an UPDATE is ran with serialized_data (and it succeeds) just before the UPDATE that fails with the identical serialized_data.

So the question is, can we safely ignore this error or do we need to have additional locking to prevent the query from running twice with the same serialized_data (or to reload the cron wp_options data before saving)?

Further to @j3gaming reply ... if you are seeing multiple UPDATE commands for the same serialised data when the cron option is updated then the queries are being run within the same second since the key to the serialised data is a unix timestamp.

Furthermore, as @j3gaming mentions, the lock that cron uses (also saved in the same wp_options table as a transient) is a microtome() timestamp so in order for there to be a race condition, either the two cron events would have to fire fast enough to not have enough time to save the first transient to the database or they would have to fire in exactly the same microsecond (unlikely).

How are you firing cron events and if via the system cron (rather than relying on the default requests to the server method) have you disabled the default cron by adding define( 'DISABLE_W_CRON', true); to wp-config.php?

The only other thing I want to mention here is to pose a question ... if $wpdb thought that the UPDATE query had failed (for whatever reason) would it attempt to retry? And if the original query had actually not failed and instead both queries were successful could this be a situation that was occurring?

Oliver

I posted my logging code, and the output of the logs on the other ticket:
https://core.trac.wordpress.org/ticket/57924#comment:15

I was able to get an output of 2 distinct, but very close microtimes.

Not using wp cron, it is disabled in wp-config. On linux, I'm using the OS cron:
https://core.trac.wordpress.org/ticket/57924#comment:17

Last edited 5 months ago by j3gaming (previous) (diff)

#58 in reply to: ↑ 57 ; follow-up: @domainsupport
5 months ago

Replying to j3gaming:

I posted my logging code, and the output of the logs on the other ticket:
https://core.trac.wordpress.org/ticket/57924#comment:15

I was able to get an output of 2 distinct, but very close microtimes.

Not using wp cron, it is disabled in wp-config. On linux, I'm using the OS cron:
https://core.trac.wordpress.org/ticket/57924#comment:17

The difference between your microtimes is 0.000369 of a second which may well be fast enough to not have time to save the locking transient before trying to read it to see if there is a lock!

The proposed fix to your unique issue could be usleep(rand(500000, 2000000); (sleep for a random amount of time between half a second and two seconds) rather than sleep(2); which would just prolong the same issue for two seconds.

But I feel like the real fix for you is to find out why your system is firing cron twice at exactly the same time. Because if you've disabled wp cron then it surely must be your system that's causing the duplicate cron events? And maybe it only causes the error for you when you're running the backup because when that's happening perhaps it loads the database sufficiently to not save the transient in time to be read by the subsequent cron event ... ?

#59 in reply to: ↑ 58 @j3gaming
5 months ago

Replying to domainsupport:

Replying to j3gaming:

I posted my logging code, and the output of the logs on the other ticket:
https://core.trac.wordpress.org/ticket/57924#comment:15

I was able to get an output of 2 distinct, but very close microtimes.

Not using wp cron, it is disabled in wp-config. On linux, I'm using the OS cron:
https://core.trac.wordpress.org/ticket/57924#comment:17

The difference between your microtimes is 0.000369 of a second which may well be fast enough to not have time to save the locking transient before trying to read it to see if there is a lock!

The proposed fix to your unique issue could be usleep(rand(500000, 2000000); (sleep for a random amount of time between half a second and two seconds) rather than sleep(2); which would just prolong the same issue for two seconds.

But I feel like the real fix for you is to find out why your system is firing cron twice at exactly the same time. Because if you've disabled wp cron then it surely must be your system that's causing the duplicate cron events? And maybe it only causes the error for you when you're running the backup because when that's happening perhaps it loads the database sufficiently to not save the transient in time to be read by the subsequent cron event ... ?

Maybe, that code for sleep would probably work. My goal was to not edit the core files. I simply changed the way my cron events are processed. All is running fine here, even if 2 are fired at once.

I have our major PHP version change, and linux update upcoming here, so maybe it's been fixed and I'm just behind on updates.

I do plan to revisit this one day, but I would like to make sure I'm running all current. Linux, PHP, wordpress, before further time on it.

Last edited 5 months ago by j3gaming (previous) (diff)

#60 follow-up: @galbaras
5 months ago

I run O/S crons once every 15 minutes, so definitely not too quickly, and still have MANY unschedule & reschedule errors.

Also, those errors seem to correlate with actual issues, like missed scans, so I'm not sure it's just a matter of preventing an update with the same value.

#61 in reply to: ↑ 60 ; follow-up: @j3gaming
5 months ago

Replying to galbaras:

I run O/S crons once every 15 minutes, so definitely not too quickly, and still have MANY unschedule & reschedule errors.

Also, those errors seem to correlate with actual issues, like missed scans, so I'm not sure it's just a matter of preventing an update with the same value.

I don't the frequency you set the cron to run is the issue. I'm thinking that your 15min cron, for some reason runs twice. Are you also using Linux? Did we all just stumble across an OS bug and we're all in the wrong spot to fix it?

#62 in reply to: ↑ 61 ; follow-up: @domainsupport
5 months ago

Replying to j3gaming:

Replying to galbaras:

I run O/S crons once every 15 minutes, so definitely not too quickly, and still have MANY unschedule & reschedule errors.

Also, those errors seem to correlate with actual issues, like missed scans, so I'm not sure it's just a matter of preventing an update with the same value.

I don't the frequency you set the cron to run is the issue. I'm thinking that your 15min cron, for some reason runs twice. Are you also using Linux? Did we all just stumble across an OS bug and we're all in the wrong spot to fix it?

The main server that we were seeing this issue on (we don't see it anymore since we allocated more resources to MySQL) is running Ubuntu 20.04 and we use an hourly system fired cron. We don't get duplicate cron events and never have done that we know of.

#63 in reply to: ↑ 62 @galbaras
5 months ago

Replying to domainsupport:

Replying to j3gaming:

I don't the frequency you set the cron to run is the issue. I'm thinking that your 15min cron, for some reason runs twice. Are you also using Linux? Did we all just stumble across an OS bug and we're all in the wrong spot to fix it?

The main server that we were seeing this issue on (we don't see it anymore since we allocated more resources to MySQL) is running Ubuntu 20.04 and we use an hourly system fired cron. We don't get duplicate cron events and never have done that we know of.

I'm on CloudLinux and I'm pretty sure the jobs only run once, because I use them to preload my cache, which would overload my accound if run twice.

#64 follow-ups: @galbaras
3 months ago

This is happening on my sites now more than ever before, and likely means that some jobs are being missed, which could be causing unknown issues.

I really wish the core team members would come back to this ticket, cause they don't see to be notified anymore, so if anyone has direct contact with them...

#65 in reply to: ↑ 64 @audrasjb
3 months ago

  • Keywords needs-patch added; has-patch removed

Replying to galbaras:

I really wish the core team members would come back to this ticket, cause they don't see to be notified anymore, so if anyone has direct contact with them...

Several people are still following this ticket but as far as I know it still needs a patch proposal before eventually moving it to a future release.

#66 in reply to: ↑ 64 @domainsupport
3 months ago

Replying to galbaras:

This is happening on my sites now more than ever before, and likely means that some jobs are being missed, which could be causing unknown issues.

We also noticed a recent increase / reappearance of this error on two of our larger sites. All occurrences coincided with extremely high traffic caused by DDoS / hackers. The spike in traffic caused very, very high load averages on the servers due to the huge strain on the database server. We finally resolved the issue by adding custom WAF rules in Cloudflare to challenge all non-UK (they are UK based sites) traffic that's not a known bot. This reduced the load on both servers and in turn allowed WP cron to reliably save the schedule to the database again.

So three things ...

1) I re-affirm my position that in the majority of cases this issue is caused by a problem with the database (whatever that might be) with the caveat that if your server fired cron is somehow firing twice in the same microsecond (as seems to the the case for @j3gaming) that will also cause the error but that's not something I've experienced

2) If WP cron is unable to save to the wp_options table, then the chances are there are many other INSERT / UPDATE queries that have failed too ... just those fail silently

3) If cron schedules are being added correctly (checked that they exist and added if they don't) then WordPress should just add them again when it is once again able to which would mean that jobs may be late but they shouldn't be missed

So, if you're seeing this error, it's almost certainly bigger than WP cron and I would take a look at your server's database / load / traffic ... or identify why your server fired cron is firing twice in the same microsecond as per @j3gaming case

It might be that the solution here, rather than a patch, is to add a section to the "Dashboard - Tools - Site Health" that takes a look at the web / database servers to make sure they're not overloaded or low on RAM.

#67 follow-up: @galbaras
3 months ago

My sites are not overloaded to the point of database choking, and WP Cron runs on a 15-minute schedule, so record locking is highly unlikely. Also, the recent surge in this occurrence may be related to changes in the (CloudLinux, PHP, WordPress, MySQL, Litespeed) software, rather than load or concurrency.

Since the logged messages don't provide enough information, and the actions may not be retried, in order to progress this issue, the WordPress code needs to include better error handling and/or reporting propagation.

Personally, I don't feel comfortable contributing a patch, which may miss things and have other repercussions, which is why I was hoping for core team involvement.

On the other hand, I'll be willing to test proposed any code changes that will allow better debugging, even if it means hacking my sites for a bit, so if anyone has any suggestions, please share.

#68 @emilycestmoi
3 months ago

For my sites this is not a case of overloading either. The issue is:

UPDATE `wp_options` SET `option_value` = serialized_data WHERE `option_name` = 'cron';

mysqli_affected_rows() == 0;

The update is run and rows matches: 1 but changed: 0. The reason is because the update statement is attempting to set the same serialized_data that is already set in the database. I have not found any cron events that have not been rescheduled or have been lost due to this warning, so for my case this warning can be safely ignored.

Question: Does anyone have solid evidence (i.e. debug logs and not just speculation) that shows that cron events are actually missed?

Otherwise, I would propose a patch that just ignores the error.

#69 @galbaras
3 months ago

How did you find out what was going on?

#70 @emilycestmoi
3 months ago

@galbaras-- I spent a number of hours debugging the issue from my sites 2 months ago. For me it was easy to reproduce the warning as it was happening numerous times per day, consistently. I inserted debug statements into the wordpress db code which led me to find out the issue was the cron db update was being triggered with identical contents that were already present in the db. Because an update with identical contents causes mysqli_affected_rows() == 0 (what wordpress checks for) an error is triggered which trickles back up.

If there is interest I can create a patch that others can apply to their wordpress installs to log whether the issue is the same as the one I am seeing.

#71 @galbaras
3 months ago

I got the feeling we were circling back to something that's already been discussed and, indeed, this ticket already has a patch from almost the very beginning. It just needs to be rolled out.

Once the patch is in place, we'll see if this was the only issue or not.

@audrasjb please confirm and change the ticket status if you agree.

#72 in reply to: ↑ 67 @domainsupport
3 months ago

Replying to galbaras:

My sites are not overloaded to the point of database choking, and WP Cron runs on a 15-minute schedule, so record locking is highly unlikely. Also, the recent surge in this occurrence may be related to changes in the (CloudLinux, PHP, WordPress, MySQL, Litespeed) software, rather than load or concurrency.

Sounds to me like you have the same issue as @j3gaming which I mention in (1) ticket:57271#comment:66 whereby your cron is being fired in the same microsecond. This would also cause this scenario ...

Replying to emilycestmoi:

The update is run and rows matches: 1 but changed: 0. The reason is because the update statement is attempting to set the same serialized_data that is already set in the database. I have not found any cron events that have not been rescheduled or have been lost due to this warning, so for my case this warning can be safely ignored.

... because the WP cron event schedule being saved would be exactly the same. So it is my belief that in both of your cases that somehow WP cron is being fired more than once in the same microsecond causing a race condition.

In my opinion the only way WP cron can fire more than once at the same time is if it is within the same microsecond (due to the transient lock). And, this would also cause the new serialised cron option to be identical resulting in the PHP error.

In each of these cases I would recommend that you identify why / how WP cron is firing twice in the same microsecond and resolve because the issue would not be that WP cron events were being missed, but, inversely, they would actually be run twice because the cron option is only attempted to be saved after the events are run.

So we have two distinct causes here ...

  1. Database issue preventing update_option() from being successful
  2. Race condition saving the same serialised data also preventing update_option() from being successful

I am firmly in camp 1. You appear to be in camp 2. Both camps exist :)

#73 follow-up: @galbaras
3 months ago

Shouldn't the patch written by @peterwilsoncc help sort this out? It prevents duplicate unscheduling and reports duplicate scheduling attempts.

#74 in reply to: ↑ 73 @domainsupport
3 months ago

Replying to galbaras:

Shouldn't the patch written by @peterwilsoncc help sort this out? It prevents duplicate unscheduling and reports duplicate scheduling attempts.

I don't believe so. It would prevent the error from showing in the log but I don't believe it would resolve the cause of the issue for you. The cause for you appears to be WP cron firing twice in the same microsecond which means that the patch would not prevent your scheduled events from running twice.

#75 @galbaras
3 months ago

How would we catch this?

#76 follow-up: @emilycestmoi
3 months ago

I cannot reproduce "WP cron firing twice in the same microsecond". Can we get some instructions on how to log this behaviour?

For example, I just modified my wp-cron.php and inserted a line to log the transient:

// XXX-EN
error_log('got transient: ' . $doing_cron_transient . "\n", 0);

/*
 * The cron lock (a unix timestamp set when the cron was spawned),
 * must match $doing_wp_cron (the "key").
 */
if ( $doing_cron_transient !== $doing_wp_cron ) {
        return;
}

After running this overnight on one of my sites this is my cron log output:

got transient: 1714659602.8254570960998535156250

got transient: 1714659722.2510559558868408203125

got transient: 1714659842.5517539978027343750000

Cron reschedule event error for hook: newsletter_bounce_run, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"newsletter","args":[],"interval":300}
got transient: 1714659962.9193520545959472656250

got transient: 1714660082.4408359527587890625000

Cron reschedule event error for hook: cartflows_ca_update_order_status_action, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"every_fifteen_minutes","args":[],"interval":1200}
got transient: 1714660203.2717330455780029296875

got transient: 1714660322.7761120796203613281250

got transient: 1714660442.4618968963623046875000

What am I missing?

#77 in reply to: ↑ 76 @j3gaming
3 months ago

Replying to emilycestmoi:

I cannot reproduce "WP cron firing twice in the same microsecond". Can we get some instructions on how to log this behaviour?

For example, I just modified my wp-cron.php and inserted a line to log the transient:

// XXX-EN
error_log('got transient: ' . $doing_cron_transient . "\n", 0);

/*
 * The cron lock (a unix timestamp set when the cron was spawned),
 * must match $doing_wp_cron (the "key").
 */
if ( $doing_cron_transient !== $doing_wp_cron ) {
        return;
}

After running this overnight on one of my sites this is my cron log output:

got transient: 1714659602.8254570960998535156250

got transient: 1714659722.2510559558868408203125

got transient: 1714659842.5517539978027343750000

Cron reschedule event error for hook: newsletter_bounce_run, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"newsletter","args":[],"interval":300}
got transient: 1714659962.9193520545959472656250

got transient: 1714660082.4408359527587890625000

Cron reschedule event error for hook: cartflows_ca_update_order_status_action, Error code: could_not_set, Error message: The cron event list could not be saved., Data: {"schedule":"every_fifteen_minutes","args":[],"interval":1200}
got transient: 1714660203.2717330455780029296875

got transient: 1714660322.7761120796203613281250

got transient: 1714660442.4618968963623046875000

What am I missing?

This is exactly what I'm seeing. In your case, your cron attempted to fire 3 times it looks like. (Never had that happen myself, way to show off) All with a slightly different microsecond, which allows it to pass the check intended to stop multiple fires.

If you had set any custom crons with a plugin/custom code, you would find that they are now gone. The only reason the default cron schedule is not completely wiped, is because wordpress re-creates the list when wp_schedule_event is called.

In our case, this could actually easily be fixed changing this line:

if ( $doing_cron_transient !== $doing_wp_cron ) {

to check for a range of time, and not an exact match. One line fix and our cases would be solved.

As @domainsupport has pointed out, this error has revealed more than 1 problem. You just happen to have the same one as me.

#78 @galbaras
3 months ago

Maybe I'm missing something, but could this output not be described as "2 crons ran successfully, then 1 triggered an error, then 1 ran succefully, and another 1 triggering an error, then 3 ran successfully"?

The error MAY imply concurrent crons, but are we sure that's what's causing it?

@emilycestmoi Could you perhaps use this code to catch those duplicates in the act:

/*
 * The cron lock (a unix timestamp set when the cron was spawned),
 * must match $doing_wp_cron (the "key").
 */
if ( $doing_cron_transient !== $doing_wp_cron ) {
    error_log( "doing_cron_transient: "$doing_cron_transient, doing_wp_cron: $doing_wp_cron \n", 0 );
    return;
} else {
    error_log( "identical cron transients: $doing_cron_transient \n", 0 );
}

#79 follow-up: @emilycestmoi
3 months ago

I feel like I'm missing something too.

To clarify, the cron log I attached above shows I have a cron run firing every 2 minutes (120 seconds), each "got transient: xxxx" line is a separate cron run 2 minutes apart--this is entirely expected and how my cron runs are scheduled. I am not seeing any duplicate cron runs at all. The reschedule warning occurs during my expected cron runs, again, no duplicate cron runs are seen, no "cron firing twice in the same microsecond" is seen.

I am also not seeing any lost cron events or events that failed to reschedule.

The original patch suggested https://github.com/WordPress/wordpress-develop/pull/3726 (just ignore the warning) would fix the issue that I am observing on my sites.

#80 @galbaras
3 months ago

To clarify, the cron log I attached above shows I have a cron run firing every 2 minutes

Oh, right.

The original patch suggested patch (just ignore the warning) would fix the issue that I am observing on my sites.

Maybe not, because it prevents duplicates and fixes the error message, but if you're not getting duplicates, then it won't stop the problem of unschedule/reschedule failures.

Maybe we should look at environmental issues. How is everyone triggering their crons?

I use the cPanel cron once every 15 minutes and have DISABLE_WP_CRON defined to block it running on page loads. It's highly unlikely my crons run more than once in that timeframe, unless there's a cPanel issue, perhaps.

More specifically, I use a script on a central host to cURL wp-cron.php on my sites remotely, in case this matters. And I'm on a shared Litespeed server.

#81 in reply to: ↑ 79 ; follow-up: @domainsupport
3 months ago

Replying to emilycestmoi:

I feel like I'm missing something too.

To clarify, the cron log I attached above shows I have a cron run firing every 2 minutes (120 seconds), each "got transient: xxxx" line is a separate cron run 2 minutes apart--this is entirely expected and how my cron runs are scheduled. I am not seeing any duplicate cron runs at all. The reschedule warning occurs during my expected cron runs, again, no duplicate cron runs are seen, no "cron firing twice in the same microsecond" is seen.

I am also not seeing any lost cron events or events that failed to reschedule.

The original patch suggested https://github.com/WordPress/wordpress-develop/pull/3726 (just ignore the warning) would fix the issue that I am observing on my sites.

If you aren't seeing duplicate WP cron ... I would be wondering why the serialised cron option data is the same when rescheduling or unscheduling? By definition the serialised data should either have an updated event or a removed event which would result in a different serialised string in either case?

#82 in reply to: ↑ 81 ; follow-up: @emilycestmoi
3 months ago

Replying to domainsupport:

If you aren't seeing duplicate WP cron ... I would be wondering why the serialised cron option data is the same when rescheduling or unscheduling? By definition the serialised data should either have an updated event or a removed event which would result in a different serialised string in either case?

Yes, that is a good question. The short answer is I haven't had the time to trace through the scenarios where this is happening. The long answer is updating the 'cron' option isn't exactly an atomic operation in WordPress. Every time something is scheduled/unscheduled the 'cron' option is updated. I'm guessing there are multiple updates happening to the 'cron' option as cron is running, perhaps a cron event is rescheduling itself during the event run? I'm not sure, but I can think of scenarios where this is plausible.

I'm crazy busy atm but if I can find some time over the next week or two I'll try to setup some logging to trace the sequence of events that leads to the duplicate cron data update.

Last edited 3 months ago by emilycestmoi (previous) (diff)

#83 in reply to: ↑ 82 @domainsupport
3 months ago

Replying to emilycestmoi:

Replying to domainsupport:
I'm crazy busy atm but if I can find some time over the next week or two I'll try to setup some logging to trace the sequence of events that leads to the duplicate cron data update.

I would do this for you but, as you have probably read, all of my experiences of this error are explained by a database issue so I cannot replicate your scenario :(

I will wait with interest your findings.

#84 @galbaras
3 months ago

I would be wondering why the serialised cron option data is the same when rescheduling or unscheduling?

Are we SURE this is what's happening? I think the proposed patch can help isolate the duplicate cron situation from other issues.

This ticket was mentioned in PR #6504 on WordPress/wordpress-develop by @johnbillion.


3 months ago
#85

  • Keywords has-patch has-unit-tests added; needs-unit-tests needs-patch removed

#86 @johnbillion
3 months ago

In 58109:

Cron API: Add tests for associative and indexed array arguments when rescheduling a cron event.

Props johnbillion, peterwilsoncc

See #57271

@szepe.viktor commented on PR #3726:


6 days ago
#88

@johnbillion How can we proceed here?

I'm willing to sacrifice your time and resources 🙃

Note: See TracTickets for help on using tickets.