Make WordPress Core

Opened 5 years ago

Last modified 22 months ago

#47511 new enhancement

Add specific "default settings" for different locales

Reported by: azaozz Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version:
Component: I18N Keywords: needs-patch
Focuses: Cc:

Description (last modified by azaozz)

In the UI there are several settings that are locale-specific, like "Date Format", "Time Format", "Week Starts On", etc. There are also some locale-specific differences in how information is displayed, like the user's name on the Profile screen. In East Asia and some parts of Europe (and elsewhere) the convention is not "John Doe" but "Doe, John" (with or without the comma). For these locales the order of the form fields for First Name and Last Name should be swapped.

This is handled well in macOS and Windows. Not sure if we need all the options they offer, but default per-locale settings would be great to have. They could then include word-count methods ("words" vs. characters), default lengths for strings/excerpts, (perhaps) some encoding changes for emails, decimal separator (3,14 vs. 3.14), grouping separator (1 000 000 vs. 1,000,000), and anything else that is locale-specific.
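To make the separator part concrete, here is a minimal sketch in plain PHP. The settings array and the function name example_format_number() are hypothetical (nothing like them exists in core under these names), and the per-locale values are illustrative only.

```php
<?php
// Hypothetical per-locale number settings. The values are illustrative,
// not taken from WordPress core or from any translation.
$example_number_settings = array(
	'en_US' => array( 'decimal_point' => '.', 'thousands_sep' => ',' ),
	'de_DE' => array( 'decimal_point' => ',', 'thousands_sep' => '.' ),
	'fr_FR' => array( 'decimal_point' => ',', 'thousands_sep' => ' ' ),
);

/**
 * Format a number with the separators of the given locale.
 * Falls back to the en_US entry when the locale is unknown.
 */
function example_format_number( $number, $locale, array $settings, $decimals = 2 ) {
	$s = isset( $settings[ $locale ] ) ? $settings[ $locale ] : $settings['en_US'];

	return number_format( $number, $decimals, $s['decimal_point'], $s['thousands_sep'] );
}
```

Only number_format() is a real PHP function here; everything else is an assumption made for the example.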

Change History (20)

#1 @azaozz
5 years ago

  • Keywords needs-patch added
  • Milestone changed from Awaiting Review to Future Release
  • Type changed from defect (bug) to enhancement

Thinking this can be (easily) implemented as a filterable white list. Then the different locale teams will be in charge of adding the specific defaults for each locale. Alternatively we can "extract" the common settings data from glibc, perhaps. There are tons of locale settings/data there; we only need a small subset.

It will also make redundant the current practice of passing locale-specific settings as translatable strings (which seems sub-optimal). Even if we still need to support that, at least it will be "centralized" and filterable.

Last edited 5 years ago by azaozz

#2 @azaozz
5 years ago

Related tickets: #46278, #44548, #39733, #37491, #36259, #29664, #25585, #20739, and probably others.

#3 @azaozz
5 years ago

  • Description modified (diff)

#4 follow-up: @swissspidy
5 years ago

  • Component changed from Administration to I18N

So there'd be a long list of defaults for every locale in PHP? That doesn't sound very easy to maintain for polyglots.

#5 in reply to: ↑ 4 @azaozz
5 years ago

Replying to swissspidy:

there'd be a long list of defaults for every locale...

Don't think it will be long, maybe just for a few locales :)

The way I see it is to have the "current defaults" in an array, and then override only what's needed/different for each locale. For several locales we won't need to change anything, and for most we will only change 1-2 things (as far as I see). Then wrap this into a function similar to get_bloginfo() which will make these settings filterable and easy to use (of course all of this will be in WP_Locale).
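A rough sketch of that idea in plain PHP, so it runs without WordPress. All function names are hypothetical, the override values are illustrative only, and the optional callback stands in for what would be a filter in core.

```php
<?php
// The "current defaults". Date/time formats and excerpt length follow the
// values quoted in this ticket; the rest are assumptions for the example.
function example_locale_defaults() {
	return array(
		'date_format'     => 'F j, Y',
		'time_format'     => 'g:i a',
		'start_of_week'   => 1,
		'word_count_type' => 'words',
		'excerpt_length'  => 55,
	);
}

// Per-locale overrides: only what differs from the defaults (illustrative values).
function example_locale_overrides() {
	return array(
		'ja'    => array(
			'word_count_type' => 'characters_including_spaces',
			'excerpt_length'  => 110,
		),
		'de_DE' => array(
			'date_format' => 'j. F Y',
			'time_format' => 'G:i',
		),
	);
}

/**
 * Get one setting for a locale. $filter is an optional callable that
 * receives ( $value, $name, $locale ), standing in for a core filter.
 */
function example_get_locale_setting( $name, $locale, $filter = null ) {
	$overrides = example_locale_overrides();
	$settings  = array_merge(
		example_locale_defaults(),
		isset( $overrides[ $locale ] ) ? $overrides[ $locale ] : array()
	);

	$value = isset( $settings[ $name ] ) ? $settings[ $name ] : null;

	if ( is_callable( $filter ) ) {
		$value = call_user_func( $filter, $value, $name, $locale );
	}

	return $value;
}
```

A locale with no entry in the overrides simply gets the defaults, which matches the "for several locales we won't need to change anything" case.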

This would be much better/easier than the current things like:

/**
 * Register date/time format strings for general POT.
 *
 * Private, unused method to add some date/time formats translated
 * on wp-admin/options-general.php to the general POT that would
 * otherwise be added to the admin POT.
 *
 * @since 3.6.0
 */
public function _strings_for_pot() {
	/* translators: localized date format, see https://secure.php.net/date */
	__( 'F j, Y' );
	/* translators: localized time format, see https://secure.php.net/date */
	__( 'g:i a' );
	...

Or even this:

// Set text direction.
...
/* translators: 'rtl' or 'ltr'. This sets the text direction for WordPress. */
} elseif ( 'rtl' == _x( 'ltr', 'text direction' ) ) {
	$this->text_direction = 'rtl';
}

(The list of RTL languages/locales is well known, we shouldn't push that to the translators to set).

In addition this will let us return the proper type (string vs. int) and sanitize things that may need it.
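For illustration, a small sketch of both points: deriving the text direction from a known list instead of a translated string, and returning typed, sanitized values. The function names are hypothetical and the RTL list is deliberately short and incomplete.

```php
<?php
/**
 * Derive the text direction from the locale code.
 * The list below is an illustrative, incomplete sample of RTL language codes.
 */
function example_text_direction( $locale ) {
	$rtl_languages = array( 'ar', 'fa', 'he', 'ur' );
	$parts         = explode( '_', strtolower( $locale ) );

	return in_array( $parts[0], $rtl_languages, true ) ? 'rtl' : 'ltr';
}

/**
 * Return a setting with the proper type, whatever the raw input was.
 * The setting names are assumptions made for this example.
 */
function example_sanitize_setting( $name, $raw ) {
	switch ( $name ) {
		case 'excerpt_length':
		case 'start_of_week':
			// Integers, never negative.
			return max( 0, (int) $raw );
		case 'text_direction':
			// Anything that is not exactly 'rtl' falls back to 'ltr'.
			return 'rtl' === $raw ? 'rtl' : 'ltr';
		default:
			return trim( (string) $raw );
	}
}
```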

Last edited 5 years ago by azaozz

#6 @miyauchi
5 years ago

I heard about the following specification from a developer of Concrete5.

The following is also an implementation of CLDR for PHP.

I guess there are some locales which have multiple time zones, date formats, and so on.
CLDR looks like a good implementation to me as default values for locale settings.

#7 @miyauchi
5 years ago

Oh, I forgot to paste the URL. :)
https://punic.github.io/

#8 @miyauchi
5 years ago

The following is an example of locale settings for Japanese.
http://www.unicode.org/cldr/charts/31/summary/ja.html

#9 follow-up: @Takahashi_Fumiki
5 years ago

I'm really interested in this ticket!

As mentioned above, the name fields in profile.php should be swappable. But there are still things to be considered.

  1. Names do not always have 2 parts in some locales (e.g. Philip K. Dick, Pyotr Ilyich Tchaikovsky).
  2. An API for name parts would also be preferable for plugins, similar to the document_title_parts hook.
  3. Name formats vary very much. So, if you are not a linguist, it's hard to make a whitelist.
  4. The gettext approach is a good starting point (e.g. $name_order = _x( 'first_name,last_name', 'name_parts' )), but translators are not always good developers. In terms of separation of concerns, the changeable settings should be kept out of .po files. A JSON file such as wp-setting.json is a good recommendation, but it may have a broad effect.
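As a sketch of how a name-parts API could handle the points above, the following plain PHP function builds a display name from an arbitrary set of parts and a comma-separated order string. The function name, the part keys, and the order-string format are all assumptions made for the example.

```php
<?php
/**
 * Build a display name from named parts in a locale-defined order.
 *
 * @param array  $parts Name parts keyed by a hypothetical part name,
 *                      e.g. 'first_name', 'middle_name', 'last_name'.
 * @param string $order Comma-separated part keys, e.g. 'last_name,first_name'.
 */
function example_format_name( array $parts, $order ) {
	$ordered = array();

	foreach ( explode( ',', $order ) as $key ) {
		$key = trim( $key );
		// Skip parts that are missing or empty, so two-part and
		// three-part names can share the same order string.
		if ( isset( $parts[ $key ] ) && '' !== $parts[ $key ] ) {
			$ordered[] = $parts[ $key ];
		}
	}

	return implode( ' ', $ordered );
}
```

The order string would come from wherever the per-locale settings end up living; the sketch does not depend on that choice.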

This W3C document really helps.

https://www.w3.org/International/questions/qa-personal-names

Anyway, I will make a patch for UI matters in profile.php.

Last edited 5 years ago by Takahashi_Fumiki

#10 @azaozz
5 years ago

CLDR looks like a good implementation to me as default values for locale settings.

Yeah, it looks like a really good starting point :)

The (main) intention here is to have WordPress specific per-locale settings that are easy to define and maintain. Mainly things like excerpt length, word-count method, etc. This would include a few of the "more general" locale settings too, like default date and time format, decimal separator, person's name display order, etc. (as long as they are needed/used in the UI).

The "full" locale settings include a lot of other things, but don't think we need most of them. They are targeted at operating system level support, we get that "for free" through the web browsers :)

The next step is to define what these specific WP locale settings should include. I've mentioned a few already but am pretty sure I'm missing some.

Then we can look at the best way to implement these settings. For now it seems a (hard-coded) white list in core would be perfect. We can add a default (en_US) map (array) with the specific settings, then generate the initial per-locale changes/overrides to these default settings (perhaps from CLDR), and finally each of the polyglots teams would be able to adjust these settings if they wish.

Alternatively, especially if a white list becomes too big, we can think about generating some sort of locale settings file from GlotPress (again, can probably use CLDR for that) and distribute it with the main wp-admin .po and .mo files. This will be significantly more difficult to set-up and maintain. Frankly I don't think it will be needed, at least not for now.

The good part is that once we add some API for the settings, we will be able to switch from a white list to ...any other type we need or want.
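A sketch of what "switching the type" behind a stable API could look like. The interface, both classes, and the lookup function are hypothetical; the JSON variant is just one possible alternative source.

```php
<?php
// Hypothetical contract: anything that can return the overrides for a locale.
interface Example_Locale_Settings_Source {
	public function load( $locale );
}

// Source 1: a hard-coded list, as proposed for the first iteration.
class Example_Array_Source implements Example_Locale_Settings_Source {
	private $data;

	public function __construct( array $data ) {
		$this->data = $data;
	}

	public function load( $locale ) {
		return isset( $this->data[ $locale ] ) ? $this->data[ $locale ] : array();
	}
}

// Source 2: a JSON document, e.g. a generated settings file.
class Example_Json_Source implements Example_Locale_Settings_Source {
	private $json;

	public function __construct( $json ) {
		$this->json = $json;
	}

	public function load( $locale ) {
		$decoded = json_decode( $this->json, true );

		if ( is_array( $decoded ) && isset( $decoded[ $locale ] ) && is_array( $decoded[ $locale ] ) ) {
			return $decoded[ $locale ];
		}

		return array();
	}
}

// Callers only see this function, so the source can change without breaking them.
function example_setting_from_source( Example_Locale_Settings_Source $source, $locale, $name, $default ) {
	$settings = $source->load( $locale );

	return array_key_exists( $name, $settings ) ? $settings[ $name ] : $default;
}
```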

Last edited 5 years ago by azaozz

This ticket was mentioned in Slack in #polyglots by swissspidy (5 years ago).

#12 in reply to: ↑ 9 @azaozz
5 years ago

Replying to Takahashi_Fumiki:

I'm really interested in this ticket!
...
Anyway, I will make a patch for UI matters in profile.php.

Sounds great! :)

Yes, the UI around the user name is not fully "internationalized". Generally WordPress asks the users to enter their name, and then has a setting of how it should display the name (used on the front-end for things like post-author and archives). There is a bit of JS there to offer combinations of what the user has entered in the first-name and last-name form fields, but the users can type something else. This is somewhat limiting as pretty much everywhere a person has more than two names :)

Thanks for that link to the w3.org article, really good read about names and (web) forms. Looking there, seems WP should have a single form field for a user's name:

ask yourself whether you really need to have separate fields for given name and family name

Let's continue the conversation on the new ticket :)

Last edited 5 years ago by azaozz

#13 @Takahashi_Fumiki
5 years ago

I've opened a new ticket #47522 just for the name order.

#14 follow-up: @ocean90
5 years ago

I guess I'm missing something but what issue are we trying to solve here? Why should we replace something that works really well with a static list that can only be updated by releasing a new WordPress version?

#15 in reply to: ↑ 14 @azaozz
5 years ago

Replying to ocean90:

Why should we replace something that works really well with a static list that can only be updated by releasing a new WordPress version?

The problem I see with it is that most of these settings don't apply/are redundant for most locales, but are still in the pot file. Another thing is that the "global" locale settings like decimal point, date format, etc. are well known. They still can be in the pot, but..?

Another problem is that when these settings are in the po/mo files, they are virtually hard-coded and hard to re-use. For example a theme or a plugin cannot do $excerpt_length = intval( _x( '55', 'excerpt_length' ) ); and get the same value as in core.

If you think it's best to keep these settings in the pot, let's keep them there. However, I think we need to have a "centralized" place to process and expose them. That way they can be filtered and sanitized/verified if needed (I think this will also group them in one place in the pot file, which will make it easier for translators to set). It will also make it trivial to expose them to themes and plugins and in the REST API.

This ticket was mentioned in Slack in #polyglots by nao (5 years ago).

#17 follow-up: @nilovelez
5 years ago

Most of those values can be filtered, and they work just fine.
The effort should be put into making the purpose of those strings clearer.

Take this string as an example:
https://translate.wordpress.org/projects/wp/dev/admin/es/default/?filters%5Bstatus%5D=either&filters%5Boriginal_id%5D=40444&filters%5Btranslation_id%5D=61179294

"0" is the default timezone, in es_ES we have set it to "Europe/Madrid" as it is the Spanish mainland Timezone (the official one). This works fine, but it should be better documented.

Before we realised what it was intended for, we left it as "0" for years.
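To show what such a value can mean, here is a hedged sketch of how a "0" or "Europe/Madrid" style string could be interpreted and validated. The function name and the accepted offset range (-12 to 14) are assumptions made for the example; timezone_identifiers_list() is a real PHP function.

```php
<?php
/**
 * Interpret a translated "default timezone" value.
 *
 * A numeric value is treated as a UTC offset in hours; any other value
 * must be a timezone identifier known to PHP. Invalid input falls back
 * to offset 0 and an empty timezone string.
 */
function example_parse_default_timezone( $value ) {
	$result = array(
		'gmt_offset'      => 0.0,
		'timezone_string' => '',
	);

	$value = trim( (string) $value );

	if ( is_numeric( $value ) ) {
		$offset = (float) $value;
		// Assumed valid range for the example.
		if ( $offset >= -12 && $offset <= 14 ) {
			$result['gmt_offset'] = $offset;
		}
	} elseif ( in_array( $value, timezone_identifiers_list(), true ) ) {
		$result['timezone_string'] = $value;
	}

	return $result;
}
```

A check like this also catches typos in the translated value, which a plain string in the pot file cannot do on its own.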

Last edited 5 years ago by nilovelez

#18 in reply to: ↑ 17 @azaozz
5 years ago

Replying to nilovelez:

The effort should be put in making the purpose of those strings more clear.
...
"0" is the default timezone, in es_ES we have set it to "Europe/Madrid" as it is the Spanish mainland Timezone (the official one). This works fine, but it should be better documented.

Before we realised what it was intended for, we left it as "0" for years.

Yeah, this is one of the things that would be fixed by having this type of "global" per-locale settings outside of the pot file. They are all "well known" and once added they don't change at all or very very rarely. Then, instead of relying on (scattered) inline comments in the code, we will have one place to maintain all of them. If this happens, translators won't have to deal with them at all.

The alternative solution is pretty good too: the settings will remain in the pot, but will be gathered in one place in the code. That would let us add more adequate/extended inline help (that will also be visible in the online Code Reference).

#19 @zodiac1978
3 years ago

We stumbled upon this problem in our plugin too. In our case it is the word count (words vs. characters):
https://github.com/pluginkollektiv/antispam-bee/issues/403

There is no way for us to read this information from the existing core translation, so we have to duplicate this "translation" in our plugin (which is in fact no translation but a choice) or we have to use the core translation.

The latter has the additional problem that the PO parser still adds this string to the plugin's translation strings, even though the plugin's text domain is missing from the call.

Having this word count in the WP_Locale object would make this much easier.
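As an illustration of what a plugin could do once the type is readable from one place, here is a small counting sketch in plain PHP. The function name is hypothetical; the three type values are the ones I understand to be used for the core string, so treat them as an assumption.

```php
<?php
/**
 * Count the length of a text according to a word count type.
 *
 * @param string $text Plain text (no HTML handling in this sketch).
 * @param string $type 'words', 'characters_excluding_spaces'
 *                     or 'characters_including_spaces'.
 */
function example_count_text( $text, $type ) {
	$text = trim( $text );

	if ( '' === $text ) {
		return 0;
	}

	switch ( $type ) {
		case 'characters_including_spaces':
			// One match per Unicode code point, whitespace included.
			return (int) preg_match_all( '/./us', $text );
		case 'characters_excluding_spaces':
			// One match per non-whitespace code point.
			return (int) preg_match_all( '/\S/u', $text );
		default:
			// 'words': split on runs of whitespace.
			return count( preg_split( '/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY ) );
	}
}
```

With the type exposed by the locale object, the plugin would pass that value in instead of duplicating or borrowing the translated string.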

At the moment there are 65 plugins shown for the string "Word count type. Do not translate!"
https://wpdirectory.net/search/01FBV7MTEHG0XDATA5823T85VV

WooCommerce and others are using their own translation, but Jetpack or SiteOrigin Page Builder (and others), for example, are using the core translation (like we do at the moment in Antispam Bee).

Neither duplicating this string nor using the core translation (which produces an unused string in GlotPress) is a good solution here, I think.

#20 @pedromendonca
22 months ago

Having this word count in the WP_Locale object would make this much easier.

@zodiac1978 please check the new ticket #56698 and PR https://github.com/WordPress/wordpress-develop/pull/3377/

If you think it's best to keep these settings in the pot, let's keep them there. However, I think we need to have a "centralized" place to process and expose them.

@azaozz I agree with this. This is why I made this table where the possible errors are exposed. Many strings do need sanitization, but currently GlotPress has no way of doing it.
https://wp-i18n.org/stats/language-settings/
Many locales have actually translated these settings strings; some others have not, but differ from the GP_Locale properties, like word_count_type and text_direction.

A suggestion for further development of a GlotPress sanitization feature: set a specific context for strings that are meant to work exclusively as settings, and a convention to pass some arguments, for example:

/* translators: This sets the text direction for WordPress. Options: 'ltr', 'rtl'. */
_x( 'ltr', 'no-translate-setting' );
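A possible way to use such a convention, sketched in plain PHP: read the allowed options from the translators comment and reject any translated value that is not among them. Both function names and the exact comment format ("Options:" followed by quoted values) are assumptions for the example, not an existing GlotPress feature.

```php
<?php
/**
 * Extract the allowed options from a translators comment that follows
 * the assumed convention "... Options: 'a', 'b'."
 */
function example_options_from_comment( $comment ) {
	if ( ! preg_match( '/Options:(.*)$/s', $comment, $matches ) ) {
		return array();
	}

	preg_match_all( "/'([^']+)'/", $matches[1], $quoted );

	return $quoted[1];
}

/**
 * Accept a translated setting only if it is one of the allowed options,
 * otherwise fall back to the original (untranslated) value.
 */
function example_sanitize_setting_translation( $original, $translation, array $allowed ) {
	$translation = strtolower( trim( (string) $translation ) );

	return in_array( $translation, $allowed, true ) ? $translation : $original;
}
```

The same check could run at submission time in the translation tool or at load time in core; the sketch does not assume either.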