Make WordPress Core

Opened 10 months ago

Last modified 10 months ago

#59592 new enhancement

Store last changed value in the cache value instead of using it as a salt for query caches

Reported by: spacedmonkey Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version:
Component: Cache API Keywords: has-patch
Focuses: performance Cc:

Description

Query caches in core use a 'last changed' timestamp as a salt: when this value changes, the cache key also changes, creating a new cache entry. This method of cache invalidation has proven effective over the years but has a drawback: it generates an excessive number of caches.
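For illustration, a simplified sketch of the current salted-key pattern (the group name and key construction are condensed from core's WP_Query caching; $query_vars stands in for the real hashed query arguments):

{{{#!php
<?php
// Current pattern (simplified): the 'last changed' timestamp is part of
// the key itself, so every invalidation mints a brand-new key and the
// old entry is orphaned until the object cache evicts it.
$last_changed = wp_cache_get_last_changed( 'posts' );
$key          = md5( serialize( $query_vars ) );
$cache_key    = "wp_query:$key:$last_changed";

$cached = wp_cache_get( $cache_key, 'posts' );
}}}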

Imagine a high-traffic website constantly generating new content, with new posts added every 10 minutes. This results in the 'last changed' value being updated every time a new post or revision is created. Consequently, the WP_Query for the homepage could generate a new cache entry each time a post is updated, leading to the creation of hundreds or even thousands of caches. The concept behind an object cache is to allow unused keys to expire and be removed from the cache. However, this process is not swift; it may take hours or even days for these keys to naturally fall out of the cache. Furthermore, there's a risk that the object cache could become bloated and evict actively used cache keys.

The solution lies in reusing cache keys. Instead of using 'last changed' as a salt, store the 'last changed' values within the cache value itself. Then, retrieve the cache value and compare the 'last changed' value stored in the cache with the actual 'last changed' value. If they don't match, mark the cache as invalid, retrieve the values from the database, and resave the cache with an updated 'last changed' value.
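A minimal sketch of the proposed flow (this is not the attached PR itself; $query_vars, $request, and the 'posts' group are illustrative placeholders):

{{{#!php
<?php
// Proposed pattern (sketch): the key stays stable; freshness is checked
// by comparing the 'last changed' value stored inside the cached payload.
$last_changed = wp_cache_get_last_changed( 'posts' );
$cache_key    = 'wp_query:' . md5( serialize( $query_vars ) );

$cache = wp_cache_get( $cache_key, 'posts' );

if ( false === $cache || $cache['last_changed'] !== $last_changed ) {
	// Stale or missing: query the database and overwrite the same key.
	$post_ids = $wpdb->get_col( $request );
	$cache    = array(
		'post_ids'     => $post_ids,
		'last_changed' => $last_changed,
	);
	wp_cache_set( $cache_key, $cache, 'posts' );
}

$post_ids = $cache['post_ids'];
}}}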

This approach offers several benefits:

  • Cache keys are reused, ensuring that the number of query caches remains stable, and the cache entries clean up after themselves.
  • Predictable cache keys allow caches to be primed efficiently. For instance, a caching plugin can discern the types of queries on a page and prime multiple cache keys in a single request (see the sketch after these lists).
However, there are some downsides to consider:

  • Third-party plugins may be reliant on salted cache keys.
  • When comparing the 'last changed' value, the entire cache value must be retrieved. This means the first request after cache invalidation may be slightly heavier.
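To illustrate the priming benefit noted above: with stable, predictable keys, a plugin could warm several query caches in one round trip. A hypothetical sketch ($home_query_vars and $sidebar_query_vars are placeholder query arguments):

{{{#!php
<?php
// With stable keys, a caching plugin can compute the keys for the
// queries it expects on a page and fetch them all at once.
$cache_keys = array(
	'wp_query:' . md5( serialize( $home_query_vars ) ),
	'wp_query:' . md5( serialize( $sidebar_query_vars ) ),
);

// wp_cache_get_multiple() (WordPress 5.5+) batches the lookups on
// object caches that support it.
$caches = wp_cache_get_multiple( $cache_keys, 'posts' );
}}}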

Change History (6)

#1
10 months ago

This ticket was mentioned in PR #5458 on WordPress/wordpress-develop by @spacedmonkey.

  • Keywords has-patch added

#2 @spacedmonkey
10 months ago

See the attached PR for a proof of concept (POC) of this idea.

#3
10 months ago

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.

#4 @tillkruess
10 months ago

I think this is an excellent solution to the problem and would "truly" resolve #57625.

#5 follow-up: @nickchomey
10 months ago

Perhaps I'm missing something fundamental, but it seems to me that (at least part of) the conversation here and in #57625 is focusing on the wrong thing - largely how the cache is configured. Specifically, focusing on cache key TTL rather than having an appropriate cache eviction policy for the cache data (e.g. LRU, which I believe is the default for Redis). https://redis.io/docs/reference/eviction/

If you use something like allkeys-lru, then the only way actively-used keys could get evicted is if your cache is too small - a hosting error, not WP. If you do want to use TTL/expire on specific keys, then perhaps volatile-lru could be used. It seems to me that this decision should be up to the site owner, not WP Core.
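For context, a hypothetical redis.conf excerpt showing the two policies mentioned (the memory limit value is illustrative):

{{{
# Cap memory and evict least-recently-used keys across the whole keyspace.
maxmemory 256mb
maxmemory-policy allkeys-lru

# Alternative: evict LRU only among keys that have a TTL set.
# maxmemory-policy volatile-lru
}}}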

This all seems to roughly echo what Peter Wilson said in his initial comment: https://core.trac.wordpress.org/ticket/57625#comment:3

The proposed solution here seems to add unnecessary complexity and processing overhead. So, perhaps a solution to the present problems is to give users and plugins more control over caching policies, such as whether to apply a TTL to a specific cache key? Also, to provide documentation/guidance on proper server cache (Redis) configuration - both for users who manage their own servers and for hosting providers, so they can do a better job of it.

This article from the Ruby on Rails world goes into detail about key-based cache expiration, so it seems worth reviewing for inspiration.
https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works
I hope this is helpful rather than irrelevant/a distraction!

Last edited 10 months ago by nickchomey

#6 in reply to: ↑ 5 @owi
10 months ago

Replying to nickchomey:

Perhaps I'm missing something fundamental, but it seems to me that (at least part of) the conversation here and in #57625 is focusing on the wrong thing - largely how the cache is configured. Specifically, focusing on cache key TTL rather than having an appropriate cache eviction policy for the cache data (e.g. LRU, which I believe is the default for Redis). https://redis.io/docs/reference/eviction/

As I was the original poster of the aforementioned ticket, I wanted to clarify (in case it was not clear from my description) that I had a custom TTL for the query cache groups (24-72h) and an eviction policy in place, and this did the job for me.

Nevertheless, I still believe that the way Core handles keys for these groups is not right. On busy sites, last_changed can be regenerated tens or hundreds of times per day (as I showed in the original post, it was enough to open the Add New Post screen to regenerate it twice, without even clicking save). That can cause the number of query cache keys to grow extremely fast, and then the TTL becomes just a magic number set by trial and error (to balance cache efficiency against memory usage).

Also, I think (that's my personal opinion) that operating at the Redis max-memory limit all the time and relying on eviction is not a healthy state; it just covers up the problem.

All of the above doesn't change the fact that I understand the problem is not trivial to solve, and the nature of query caches makes it more difficult :)

Note: See TracTickets for help on using tickets.