CX2: Identify section types to exclude from MT abuse test
Closed, Resolved · Public

Description

In T162113 (CX2: Infrastructure for section-level progress calculation) we did a first iteration of progress calculation. Now we need to identify the section types that do not need abuse checking. Examples are infoboxes, tables, reference lists, and headings; there may be more. These sections should be filtered out of the abuse test. For now, only section titles are excluded.

Event Timeline

Pginer-WMF triaged this task as Medium priority. Aug 1 2018, 6:11 PM
Pginer-WMF moved this task from Needs Triage to CX2 on the ContentTranslation board.

Change 458634 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ContentTranslation@master] Create a list of exlcluded section types from MT abuse validation

https://gerrit.wikimedia.org/r/458634

Change 458634 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Create a list of excluded section types from MT abuse validation

https://gerrit.wikimedia.org/r/458634

@santhosh

[...] we need to identify some section types where we don't need to do abuse checking. Examples are infoboxes, tables, reference list, headings.

I checked the patch, which lists the following section types as excluded:

'cxBlockImage', 'mwBlockImage', // Both are required since new images can be inserted too.
'cxTransclusionBlock', 'mwTransclusionBlock',
'mwTable', 'list', 'mwHeading'
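
As a rough illustration of how such a list could be applied, here is a minimal sketch. Only the type names are taken from the patch; the shape of the section objects, the 'paragraph' type, and both function names are assumptions made for this example and are not the ContentTranslation implementation.

```javascript
// Type names copied from the patch quoted above.
const EXCLUDED_SECTION_TYPES = new Set( [
	'cxBlockImage', 'mwBlockImage', // Both, since new images can be inserted too.
	'cxTransclusionBlock', 'mwTransclusionBlock',
	'mwTable', 'list', 'mwHeading'
] );

// Hypothetical helper: assumes each section is an object with a `type` string.
function isExcludedFromMTAbuseCheck( section ) {
	return EXCLUDED_SECTION_TYPES.has( section.type );
}

// Hypothetical helper: returns only the sections that should still be validated.
function sectionsToValidate( sections ) {
	return sections.filter( ( section ) => !isExcludedFromMTAbuseCheck( section ) );
}
```

With this shape, a heading or table section is dropped before validation, while any type not on the list (for example an ordinary paragraph) is kept.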

I did not find an easy way to tell which elements on a page correspond to the types above, so I tested against the task description instead. The following are not counted as MT abuse:

  • Article title
  • Section titles
  • Images with MT translated description
  • Tables

I re-checked the following section types on testwiki (wmf.24) with CX2 enabled:

  • Article title
  • Section titles
  • Infoboxes
  • Images with MT translated description
  • Tables

Machine-translating the above does not trigger the MT abuse message. The progress bar calculation, however, still reports 100% machine translation (e.g. after translating only the images in en:Triode, the bar says 12% translated and 100% machine translated).
This is a relatively small usability issue: users may be confused about why 100% MT sometimes displays the warning and sometimes does not. It might be a good idea to exclude these MT-translated section types from the progress bar calculation as well.
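
To make the suggestion concrete, here is a minimal sketch of an MT share computed with and without the excluded types. The thread does not show the actual CX2 calculation, so the length-weighted formula, the `length` and `mtRatio` fields, the 'paragraph' type, and the function name are all assumptions for illustration only.

```javascript
// Type names from the patch; everything else in this block is hypothetical.
const TYPES_EXCLUDED_FROM_MT_CHECK = new Set( [
	'cxBlockImage', 'mwBlockImage',
	'cxTransclusionBlock', 'mwTransclusionBlock',
	'mwTable', 'list', 'mwHeading'
] );

// Length-weighted share of machine-translated content, in the range 0..1.
// Assumes each translated section has `type`, `length` and `mtRatio` (0..1).
function mtShare( translatedSections, skipExcluded ) {
	const counted = translatedSections.filter(
		( s ) => !( skipExcluded && TYPES_EXCLUDED_FROM_MT_CHECK.has( s.type ) )
	);
	const total = counted.reduce( ( sum, s ) => sum + s.length, 0 );
	if ( total === 0 ) {
		// Nothing eligible was translated, so there is no MT share to report.
		return 0;
	}
	const mt = counted.reduce( ( sum, s ) => sum + s.length * s.mtRatio, 0 );
	return mt / total;
}
```

Under these assumptions, a translation that contains only a machine-translated image yields a share of 1 when excluded types are counted and 0 when they are skipped, which would keep the bar consistent with the absence of the warning.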