Making WordPress.org

Opened 4 years ago

Last modified 2 months ago

#5207 assigned enhancement

Improve heuristics for detecting block assets

Reported by: tellyworth's profile tellyworth Owned by: coreymckrill's profile coreymckrill
Milestone: Priority: high
Component: Plugin Directory Keywords: needs-patch
Cc:

Description

The block directory attempts to detect block names and assets, so they can be returned by the API.

If the scripts and blocks are defined in a block.json file, there's no ambiguity. But since almost no plugins use block.json files as yet, the plugin directory's svn import code contains some heuristic code that attempts to guess which asset files are needed.

Currently those heuristics are lacking, so many (most?) blocks in the block directory don't have any assets stored or returned by the API.

Here's an example of a block that does have some block_assets:

https://api.wordpress.org/plugins/info/1.2/?action=query_plugins&request[block]=listicle&request[per_page]=3&request[locale]=en_US&request[wp_version]=5.5

And here's one that doesn't:

https://api.wordpress.org/plugins/info/1.2/?action=query_plugins&request[block]=rubi&request[per_page]=3&request[locale]=en_US&request[wp_version]=5.5

The heuristic code is here:

https://meta.trac.wordpress.org/browser/sites/trunk/wordpress.org/public_html/wp-content/plugins/plugin-directory/cli/class-import.php#L445

Currently it basically just looks for JS and CSS files in the build or dist subdirs. That needs to be made smarter, perhaps with special cases for some of the plugins currently in the block directory (https://wordpress.org/plugins/browse/block/).

Attachments (3)

5207_tag-path.diff (1.4 KB) - added by coreymckrill 4 years ago.
Fix block files path when $stable_tag is a version number
5207-assets.diff (6.1 KB) - added by coreymckrill 4 years ago.
First stab at refactoring block and asset detection
5207-subdir-check.diff (5.7 KB) - added by coreymckrill 4 years ago.

Download all attachments as: .zip

Change History (19)

#1 @ryelle
4 years ago

I think plugins that use /tags/* are not indexed correctly— check out Listicles, it returns assets as "/1.0.6/src/block/listdd.js", …, but there are two problems:

  1. That doesn't translate to the full path, it should return tags/1.0.6/…, since this creates the link https://plugins.svn.wordpress.org/listicles/1.0.6/src/block/listdd.js
  2. That's not the right file to include, it's not finding the files in dist — because of the above problem, export_and_parse_plugin is searching in {$plugin_slug}/{$stable_tag}, and $stable_tag is just the version number.

@coreymckrill
4 years ago

Fix block files path when $stable_tag is a version number

#2 @coreymckrill
4 years ago

With the above patch, which addresses the problem noted by @ryelle, the paths used to find files and the paths stored in postmeta will now include tags/ if the stable tag is a version number instead of trunk.

Running this on a couple of known block plugins, we get new results:

https://api.wordpress.org/plugins/info/1.2/?action=query_plugins&request[block]=listicle&request[per_page]=3&request[locale]=en_US&request[wp_version]=5.5

This now shows files from dist instead of src, and includes /tags in the file paths.

https://api.wordpress.org/plugins/info/1.2/?action=query_plugins&request[block]=rubi&request[per_page]=3&request[locale]=en_US&request[wp_version]=5.5

This didn't have any block assets shown before, but now has /tags/1.0.1/build/index.js

We should probably still add some additional heuristics here, but this is a good first step.

#3 @tellyworth
4 years ago

In 9892:

Plugin dir: improve block asset detection

Fixes the logic for handling tag paths.

Props coreymckrill, ryelle.
See #5207.

#4 @tellyworth
4 years ago

Nice! That takes it up from 13 plugins with asset files, to 28 (of 32 currently). Currently the ones missing assets are:

layout-grid
pdf-viewer-block
embed-block-for-github
starscape

There's a separate but similar issue with the block metadata stored in the all_blocks array. Mainly because of the limitations of using regexes to detect the registerBlockType calls. The plugins in the block directory missing all_blocks are:

scratch-pad
simple-definition-list-blocks
emoji-conbini
ghstwrtr
meme-me
lazy-lists
altered-reality

#5 @coreymckrill
4 years ago

The plugins in the block directory missing all_blocks

I haven't checked the whole list, but at least a couple of them (both from sortabrilliant) aren't actually blocks. scratch-pad is an editor plugin that adds a text field to the block editor sidebar, but doesn't register any blocks. ghstwrtr adds a block style, but no block types.

In simple-definition-list-blocks our regex isn't picking up the registerBlockType call because it expects there to be parentheses and parameters included, but in this plugin the minified JS looks like this:

Object(l.registerBlockType)("simple-definition-list-blocks/list",r);

What if we modified the regex to just look for registerBlockType? Seems like the possibility of false positives is pretty low.

#6 @coreymckrill
4 years ago

Ah nevermind, then we wouldn't be able to get the name and title properties for the block...

@coreymckrill
4 years ago

First stab at refactoring block and asset detection

#7 @coreymckrill
4 years ago

The above patch refactors block and asset detection to prioritize data from any existing block.json files first. Then if those aren't available, or they don't contain any of the style/editor properties, it goes back to hunting for patterns, starting with build/dist directories.

This narrows down the list of blocks with no assets to one: pdf-viewer-block. This block unfortunately, I believe, has a directory structure that we cannot easily traverse using heuristics, without potentially getting a bunch of false positives.

This patch also improves the hits on existing block assets lists, reducing false positives. There were a couple that previously included both src and build versions of their scripts, but this is more successful at isolating just build files.

This patch does not do much to fix the list of plugins that are missing blocks. As noted above, I believe some of these don't contain actual blocks to begin with.

#8 @coreymckrill
4 years ago

Alright, so from the list above of block plugins with nothing in all_blocks, only simple-definition-list-blocks actually registers a block type. As mentioned above, the usage of registerBlockType is in a minified JS file and does not match with our regex.

All of the other plugins in the list modify existing blocks, add block styles, and/or add functionality to the block editor UI. But they don't actually add new blocks. Should these be in the block directory?

#9 @tellyworth
4 years ago

In 9934:

Plugin dir: improve block asset detection during import.

This refactors the code and does a smarter job of finding asset files. It successfully works with most blocks currently in the directory.

Props coreymckrill.
See #5207

#10 @tellyworth
4 years ago

I think we should probably remove those questionable blocks from the directory for now. They can always be re-submitted later.

#11 @tellyworth
4 years ago

In 9937:

Plugin dir: improve detection of blocks in code.

Use a more flexible regex for catching registerBlockType calls in JS code, especially minified code.

See #5207

#12 @tellyworth
4 years ago

In 9938:

Plugin dir: fix double slashes in patchs when importing assets.

See #5207

#13 @ryelle
4 years ago

I noticed that Voice Blocks isn't showing up in the directory because block_files is empty. It looks like there's an issue finding files in subdirectories that aren't build or dist, so the importer isn't finding the files blocks/voice/*

#14 @coreymckrill
4 years ago

The above patch does three things:

  • Adds info/warnings to the block plugin checker tool output to show how many JS/CSS asset files were found during the check.
  • Ensures that the block checker's call to Import::find_possible_block_assets includes other directories that might have asset files besides the plugin's root directory.
  • Updates the heurisitics in the Import::find_possible_block_assets method to recursively search possible directories for asset files, rather than only the directory itself.

With these changes, the Voice Blocks plugin shows 2 JS files and 0 CSS files in the checker. It looks like there are CSS assets in the plugin, but they are in a separate directory with no connection to the block's script files, so I'm not sure how we'd find those without potentially introducing a ton of false positives.

These changes themselves already do find a lot more files that are considered assets, and I'm not sure how many are false positives. When running bin/check-block.php (with some output modifications), here's the before and after for this patch:

Before / After

Total JS assets: 129 / 160
Total CSS assets: 77 / 84
Block plugins with no assets: 3 / 0

I guess the question is, will introducing these additional assets, some of which might be false positives, break the block directory?

#15 @ryelle
4 years ago

In get_detailed_help, you don't need to do the block data check, since this is only output on error & warning issues. It should never get here if data > 0. If you want to leave the check in, it should return a string to display, or false.

Finding extra files shouldn't be a problem right now, but I'm also seeing it not find some valid files. Testing with a recipe block I wrote, it fails to find my CSS file, and instead finds an extra JS file:

Found 0 CSS assets.
Found 2 JS assets.
Found file assets/js/blocks/recipe-directions/index.js.
Found file /build/recipe-block.js.

vs before the patch, it pulls from block.json correctly:

Found file /build/recipe-block.js.
Found file /build/recipe-block.css.

#16 @khag7
4 years ago

This may be off-topic, but only slightly.

The API is called on from the Gutenberg editor to return lists of plugins which have blocks. In some cases, the API includes a plugin with an empty list of blocks. This causes an error (see Core#51018). If "blocks" is empty, that plugin probably shouldn't be returned as a result at all by the API, right?

Example query link below. See the second result: "Gutenberg Map Block for Google Maps" has an empty "blocks" property and causes an error on any wordpress site initiating this query.

https://api.wordpress.org/plugins/info/1.2/?action=query_plugins&request[block]=free

Assuming the detection of block assets improves, this is less of an issue, but until then is there a way for the API to intentionally discard results when request[block] is part of the query and "blocks" is empty on a result?

Note: See TracTickets for help on using tickets.