Openverse Monthly Priorities Meeting 2024-06-05

Openverse contributors will host a community meeting to discuss priorities for June at 1500 UTC on June 5th, 2024.

A sync video chat link will be provided in the #openverse channel of the Making WordPress Chat. We hope to see you there!

You can read the ongoing notes document for these meetings here.

This meeting in particular will also serve as a mid-year check-in for the project and our goals.

#openverse-priorities, #priorities

A week in Openverse: 2024-06-24 – 2024-07-01

openverse

Merged PRs

Analytics

  • #4550: Fix Plausible setup after domain was already set
  • #4568: Specify pull policy for `openverse-` images

API

  • #4499: Include xml in frontend attribution options
  • #4530: Drop FK constraint on media_obj in MediaDecisionThrough, update backfillmoderationdecision command
  • #4536: Do not use unstable pook reference for API
  • #4540: Add linting for Dockerfiles
  • #4544: Create sensitive and deleted media models for decisions
  • #4547: Publish changelog for api-2024.06.24.18.01.42
  • #4551: Shorten PDM hash to first 8 characters
  • #4554: Remove backfillmoderationdecision management command after production run
  • #4568: Specify pull policy for `openverse-` images

Catalog

  • #4475: Add DAG to decode and deduplicate image tags with escaped literal unicode sequences
  • #4495: Fix placing test S3 data into MinIO
  • #4497: Add CI/CD and PDM to new indexer worker
  • #4526: Fix separators in catalog and dev-env images and dev-env volume
  • #4540: Add linting for Dockerfiles
  • #4555: Ensure plpython3u exists in live db when using it
  • #4557: Remove `trim_and_deduplicate_tags` DAG after successful run
  • #4568: Specify pull policy for `openverse-` images

Documentation

  • #4417: Implementation Plan: Augment the catalog database with suitable Rekognition tags
  • #4475: Add DAG to decode and deduplicate image tags with escaped literal unicode sequences
  • #4546: combine frontend testing documentation pages
  • #4547: Publish changelog for api-2024.06.24.18.01.42
  • #4548: Publish changelog for frontend-2024.06.24.18.01.44
  • #4557: Remove `trim_and_deduplicate_tags` DAG after successful run
  • #4562: Publish changelog for frontend-2024.06.26.17.18.17

Frontend

  • #4291: Display generated tags separately
  • #4497: Add CI/CD and PDM to new indexer worker
  • #4499: Include xml in frontend attribution options
  • #4509: Replace "Over…" language with more precise "Top…"
  • #4516: Add caching to frontend Nginx configuration
  • #4523: Fix possible TypeError when accessing properties of `route.value`
  • #4540: Add linting for Dockerfiles
  • #4548: Publish changelog for frontend-2024.06.24.18.01.44
  • #4549: Re-add tags page text
  • #4559: Fix flaky VCollectionHeader snapshot tests
  • #4562: Publish changelog for frontend-2024.06.26.17.18.17

Infra

  • #4516: Add caching to frontend Nginx configuration

Ingestion Server

  • #4471: Remove single quotes in values of Ingestion Server's TSV files
  • #4529: Upload Ingestion Server's TSV files to AWS S3 (skip tags)

Management

  • #4497: Add CI/CD and PDM to new indexer worker
  • #4526: Fix separators in catalog and dev-env images and dev-env volume
  • #4539: Add dev tools jq and HTTPie to `ov`
  • #4540: Add linting for Dockerfiles
  • #4546: combine frontend testing documentation pages
  • #4568: Specify pull policy for `openverse-` images

Closed issues

API

  • #4430: Attribution: XML/RDF/Turtle please.
  • #4454: Determine if all tags in the catalog database have an associated provider
  • #4512: The `AbstractMediaDecisionThrough` class and its inheriting classes shouldn't use actual foreign keys to media tables
  • #4513: Creating `MediaDecision` has no effect on deindexed actions

Catalog

  • #663: Upgrade catalog to Python 3.11
  • #1464: Create a DAG to log and report code review response times
  • #4199: Remove and de-duplicate tags with leading/trailing whitespace
  • #4454: Determine if all tags in the catalog database have an associated provider
  • #4494: Test S3 inaturalist files are not found in MinIO

Documentation

  • #4040: Implementation Plan: Augment the catalog database with suitable Rekognition tags
  • #4514: Combine frontend testing documentation pages

Frontend

  • #461: Add a message to inform the user about more filters when one media type is chosen
  • #2130: Update sensitive browsing designs to allow re-blurring of search results
  • #2213: Frontend local dev error `Cannot convert undefined or null to object`
  • #4192: Displaying machine-generated content
  • #4379: Write a page describing the machine-generated tags for the frontend
  • #4430: Attribution: XML/RDF/Turtle please.
  • #4470: Add caching of static assets to frontend Nginx
  • #4522: TypeError: Cannot read properties of undefined (reading 'name') in `useMatchRoute()`
  • #4558: vcollectionheader storybook visual regression test broken

Infra

  • #4470: Add caching of static assets to frontend Nginx

Ingestion Server

  • #3912: Upload Ingestion Server's TSV files to AWS S3

openverse-infrastructure

Merged PRs

API

  • #951: Fix and improve api-management-command script

Frontend

  • #940: Cache frontend assets at edge for 3 days

Infra

  • #924: Add `StatusCheckFailed` alarms for EC2 services
  • #940: Cache frontend assets at edge for 3 days
  • #942: Add Grafana PDC
  • #944: Touch up indexer worker pools to match IP requirements
  • #947: Bypass WAF for Cloudflare Access services
  • #948: Ignore changes to `actions_enabled` on externally controlled alarm
  • #951: Fix and improve api-management-command script
  • #956: Block malicious ASNs and UA string pattern 2024-06-27/28 incident

Management

  • #955: 🔄 synced file(s) with WordPress/openverse

Closed issues

API

  • #950: Disable migration running during management command executions

Frontend

  • #927: Change frontend edge caching rules

Infra

  • #254: Audit logging costs and find savings
  • #792: Add EC2 instance state change monitor
  • #927: Change frontend edge caching rules
  • #941: Wire up Grafana PDC
  • #943: Add Cloudflare WAF skip rule for Airflow
  • #950: Disable migration running during management command executions

#openverse, #week-in-openverse

A week in Openverse: 2024-06-17 – 2024-06-24

openverse

Merged PRs

Analytics

  • #4330: Add catalog indexer worker

API

  • #4330: Add catalog indexer worker
  • #4500: Publish changelog for api-2024.06.17.15.33.56
  • #4508: Log dead link verification request timings

Catalog

  • #4330: Add catalog indexer worker
  • #4473: Fix trim and deduplicate tags deduplication
  • #4483: Changes all sensible occurrences of the just commands to have them run using ov
  • #4488: Publish changelog for catalog-2024.06.13.17.07.54
  • #4501: Publish changelog for catalog-2024.06.17.15.33.56
  • #4502: Update dependency apache-airflow to v2.9.2 [SECURITY]
  • #4524: Explicitly include Filter Data step in ingestion server removal IP
  • #4532: Bump requests from 2.31.0 to 2.32.2 in /indexer_worker

Documentation

  • #4465: Add data flow diagram for various ETL steps in pipelines
  • #4483: Changes all sensible occurrences of the just commands to have them run using ov
  • #4486: Update works count on the frontend
  • #4488: Publish changelog for catalog-2024.06.13.17.07.54
  • #4498: Publish changelog for frontend-2024.06.17.15.33.55
  • #4500: Publish changelog for api-2024.06.17.15.33.56
  • #4501: Publish changelog for catalog-2024.06.17.15.33.56
  • #4518: Update current_maintainers.md to add @zackkrida
  • #4524: Explicitly include Filter Data step in ingestion server removal IP

Frontend

  • #4446: Stop opening links in a new tab
  • #4483: Changes all sensible occurrences of the just commands to have them run using ov
  • #4486: Update works count on the frontend
  • #4498: Publish changelog for frontend-2024.06.17.15.33.55

Infra

  • #4491: Use a persistent container for `ov`
  • #4508: Log dead link verification request timings
  • #4527: Set `ov` workdir to current working directory

Ingestion Server

  • #4330: Add catalog indexer worker
  • #4519: Bump urllib3 from 2.2.1 to 2.2.2 in /ingestion_server
  • #4524: Explicitly include Filter Data step in ingestion server removal IP

Management

  • #4330: Add catalog indexer worker
  • #4483: Changes all sensible occurrences of the just commands to have them run using ov
  • #4496: Fix ov reference in hooks
  • #4503: Bump urllib3 from 2.1.0 to 2.2.2 in /utilities/project_planning
  • #4504: Make read contents permission explicit for PR automations
  • #4506: Prevent concurrency between release app and draft releases
  • #4511: Bump urllib3 from 2.2.1 to 2.2.2 in /automations/python
  • #4525: Make `ov clean` work when a container, image or volume does not exist
  • #4537: Sync the dependencies for PR automation init workflow to infra repo

Closed issues

API

  • #3199: Avoid API failure when requests URL params aren't fully encoded
  • #3480: Bad Request error for url from Europeana when requesting thumbnail

Catalog

  • #4147: Implement new catalog indexer-worker
  • #4456: Update ingestion server removal IP to include plan for filtering tags

Documentation

  • #4455: Document current & desired ETL steps and data flow
  • #4480: Update the record count on the homepage
  • #4482: Update references to our developer tools to have the `./ov` prefix

Frontend

  • #496: Do not open external links in new tabs
  • #519: `Unable to get property 'name' of undefined or null reference` in useMatchRoute on Edge
  • #520: `TypeMismatchError` on search in Edge
  • #4480: Update the record count on the homepage

Infra

  • #4490: Refactor `ov` to create a persistent container

Ingestion Server

  • #4456: Update ingestion server removal IP to include plan for filtering tags

Management

  • #4422: `ov` hooks should reference the `ov` script directly, rather than relying on it being in the PATH
  • #4505: Prevent race condition with "Draft release" and "Release app" workflows

openverse-infrastructure

Merged PRs

Catalog

  • #937: Bump airflow to rel-2024.06.17.15.33.56

Frontend

  • #938: Fix duplicate nuxt alarms clashing

Infra

  • #930: Remove unnecessary policy from ECS task role
  • #936: Fix ansible/exec recipe
  • #938: Fix duplicate nuxt alarms clashing

Ingestion Server

  • #931: Bump ingestion server to rel-2024.06.13.17.07.56

Management

  • #939: 🔄 synced file(s) with WordPress/openverse
  • #946: 🔄 synced file(s) with WordPress/openverse

Closed issues

Infra

  • #216: Remove the execution role from the task role
  • #844: Exclude `/api/event` endpoint from Nuxt HTTP 5XX response alarm

#openverse, #week-in-openverse

A week in Openverse: 2024-06-10 – 2024-06-17

openverse

Merged PRs

API

  • #4415: Add `backfillmoderationdecision` management command
  • #4444: More precisely handle waveform generation failures
  • #4467: Publish changelog for api-2024.06.07.17.19.06

Catalog

  • #4068: Add verbose logging option to `ProviderDataIngester`
  • #4429: Add DAG to trim and deduplicate tags
  • #4447: Capture thumbnails during europeana ingestion
  • #4460: Update the 'updated_on' column during popularity refresh
  • #4481: Moved by tag from the fuzzy match group to exact match

Documentation

  • #4441: Add favicon to Storybook
  • #4466: Publish changelog for frontend-2024.06.07.17.19.06
  • #4467: Publish changelog for api-2024.06.07.17.19.06
  • #4485: Order quickstart links, add missing catalog link
  • #4487: Publish changelog for ingestion_server-2024.06.13.17.07.56

Frontend

  • #4441: Add favicon to Storybook
  • #4442: Tags page copy
  • #4466: Publish changelog for frontend-2024.06.07.17.19.06

Ingestion Server

  • #4487: Publish changelog for ingestion_server-2024.06.13.17.07.56

Management

  • #4441: Add favicon to Storybook
  • #4472: Fix ov corepack and pdm existence issues

Closed issues

API

  • #3641: Create `ModerationDecision` backfill management command
  • #4218: Audio waveform should return 424 instead of 500 when waveform cannot be generated
  • #4474: The API `result_count` is no more than 240 for unauthenticated requests

Catalog

  • #1420: Add verbose logging option to `ProviderDataIngester`
  • #4403: Capture thumbnails during Europeana ingestion
  • #4453: Remove deny-listed tags in the catalog with the `batched_update` DAG
  • #4464: Move "by" tag contains filter to tag exact match filter

Documentation

  • #4479: Link to the catalog quickstart guide from the central quickstart page

Infra

  • #2037: Move Openverse API and catalog to `openverse.org` subdomains
  • #4489: Environment variables set when running `ov` not passed to the container

Management

  • #4468: `ov` will hang silently if `corepack` is used and there is an update to PNPM
  • #4469: `ov` does not capture error output if `pdm` not installed on host

openverse-infrastructure

Merged PRs

Infra

  • #920: Remove openverse.engineering Cf Access rules and update documentation
  • #928: Move Nuxt 3 to prod, create new listener rule for split testing

Management

  • #926: 🔄 synced file(s) with WordPress/openverse

Closed issues

Infra

  • #609: Use pre-commit and lint setup identical to the monorepo
  • #785: Remove any remaining Cloudflare resources from `openverse.engineering` zone

Management

  • #438: Enable merge queues and require PRs to be up-to-date before merging
  • #609: Use pre-commit and lint setup identical to the monorepo

#openverse, #week-in-openverse

A week in Openverse: 2024-06-03 – 2024-06-10

openverse

Merged PRs

API

  • #4402: Rename ContentProvider to ContentSource
  • #4419: Update docker.io/redis Docker tag to v7.2.5
  • #4434: Publish changelog for api-2024.06.03.15.35.02
  • #4440: Handle tags without provider in media admin view

Catalog

  • #4366: Add catalog media properties documentation

Documentation

  • #4366: Add catalog media properties documentation
  • #4432: Update docs to recommend blobless cloning strategy
  • #4435: Add a link to the committer announcements in the committer docs
  • #4436: Update assets in the documentation
  • #4448: Updated Playwright Codegen broken link
  • #4449: Jest docs broken link fixed

Frontend

  • #4420: Update pnpm to v9.1.4
  • #4423: Update Node.js to v20.14.0
  • #4424: Update dependency @playwright/test to v1.44.1
  • #4425: Update dependency eslint-plugin-tsdoc to ^0.3.0
  • #4426: Update dependency prettier-plugin-tailwindcss to v0.6.1
  • #4428: Ensure required DB extension is installed before attempting to setup plausible
  • #4431: Add Nuxt 3 folders to gitignore
  • #4433: Publish changelog for frontend-2024.06.03.15.35.03
  • #4437: Delete `frontend/src/stories/` directory
  • #4445: Update pnpm to v9.2.0

Ingestion Server

  • #4418: Update dependency elasticsearch to v8.13.2
  • #4443: Revert "Save cleaned data of Ingestion Server to AWS S3 (#4163)"

Management

  • #4392: Add load testing script for frontend
  • #4416: Move NGINX-based services out of the API profile
  • #4421: Update workflows
  • #4438: Overhaul the complete labelling system
  • #4450: Fix incorrect brackets in PR automation
  • #4451: Update pr_automations.yml with missing character
  • #4462: Bump tornado from 6.4 to 6.4.1 in /utilities/project_planning

Closed issues

API

  • #3943: Implement logging for moderation events
  • #3944: Implement and surface value-based deferred metrics
  • #3946: Implement and surface list-based deferred metrics
  • #4289: CI + CD builds `nginx` image during API up
  • #4346: Rename the `ContentProvider` model to `ContentSource`
  • #4439: `/api/api/admin/media_report.py, line 387, in change_view` can fail if the tag does not have a provider

Catalog

  • #2187: Create the media properties description file
  • #4255: iNaturalist is no longer able to access S3

Documentation

  • #4329: Dramatically improve cloning speed for contributors
  • #4395: Add a favicon to our Docs site

Frontend

  • #3972: Update references to audio works to use "audio track(s)"
  • #4391: Create a script for load-testing the frontend

Management

  • #1968: Implementation Plan: Computer vision metadata for content reports
  • #3823: Seek alternatives to `banyan/auto-label`
  • #4203: Stack label is not applied to contributor PRs
  • #4391: Create a script for load-testing the frontend
  • #4400: Local Plausible setup can fail

openverse-infrastructure

Merged PRs

Infra

  • #916: Redirect all .engineering API requests
  • #918: Add nuxt-preview cache rule
  • #921: Update .engineering to .org redirect to exclude Gutenberg media inserter requests
  • #922: Bypass cache and WAF for non-production frontends with load testing UA string

Management

  • #923: Add Princewill Onyenanu (madewithkode) as a committer

Closed issues

API

  • #781: Open PR in Gutenberg to point integration to `api.openverse.org`
  • #782: Open PR to point Jetpack integration to api.openverse.org
  • #783: Remove header check from Cloudflare redirect rule

Infra

  • #779: Redirect production API requests to `api.openverse.org` when a special testing header is present
  • #784: Replace API openverse.engineering Cloudflare domain records with noops
  • #787: Downgrade openverse.engineering Cloudflare plan to the free tier
  • #917: Add cache rules for `nuxt-preview.openverse.org` to not cache it in Cloudflare

Management

  • #740: PR labeller should apply stack labels for infrastructure repo

#openverse, #week-in-openverse

Openverse maintainers welcome Princewill Onyenanu as a new committer

It gives us great pleasure to announce that Princewill Onyenanu (@madewithkode) has been added as a committer to the Openverse project! His contributions to Openverse across the stack as well as his work on our documentation (for the catalog, Airflow, and in general) have helped us address long-desired improvements throughout the project. We’re thankful for his continued effort as a community contributor!

#openverse-committers

Community Meeting Recap (2024-06-03)

[Meeting start]

This week we discussed our process for updating data in the Catalog, and in particular the use of the batched update DAG versus the introduction of cleanup steps into the data refresh. As a result of this discussion noted here, this PR to decode and deduplicate tags during the data refresh was closed in favor of a solution that uses the batched update DAG, similar to this PR which trims and deduplicates tags using batched updates.

[Meeting end]

#openverse-weekly-community-meeting

A week in Openverse: 2024-05-27 – 2024-06-03

openverse

Merged PRs

API

  • #4198: Warn on `license_url` computation in the API
  • #4360: Add favicon to Django API
  • #4372: Reduce permissions of default authentication scope
  • #4376: Configure IPython configuration dir in the API
  • #4386: Make media items the centre for all moderation activity
  • #4387: Make miscellaneous improvements to the API developer experience
  • #4394: Publish changelog for api-2024.05.27.15.21.38
  • #4397: Revert "Change search query approach to include only available providers (#4238)"
  • #4398: Publish changelog for api-2024.05.28.21.25.54
  • #4414: Add report creation, better filtering and more improvements to admin views for media

Catalog

  • #4370: Modify `add_license_url` DAG to use `batched_update`
  • #4385: Always assume special urgency for contributor PR pings
  • #4388: Added documentations for how to run DAGs in development alongside how to add new documentations.

Documentation

  • #4385: Always assume special urgency for contributor PR pings
  • #4388: Added documentations for how to run DAGs in development alongside how to add new documentations.

Frontend

  • #4339: Fix recent searches keyboard navigation
  • #4393: Publish changelog for frontend-2024.05.27.15.21.40
  • #4396: Improve accessibility labels for filters tab and button

Ingestion Server

  • #4382: Drop `ORDER BY` clause from copy step when adding a limit
  • #4390: Publish changelog for ingestion_server-2024.05.27.13.36.10

Management

  • #4343: Dockerfy the Openverse development environment
  • #4389: Fix path to banner in `README.md`
  • #4396: Improve accessibility labels for filters tab and button
  • #4399: Add recipes for cleaning up
  • #4401: Make Dockerfied development environment compatible with macOS
  • #4409: Add support for aliases to `ov`
  • #4410: Ignore v8 compile cache

Closed issues

API

  • #3638: Add content moderation actions to expanded media admin view
  • #3639: Soft lock moderation actions for works in review by a moderator
  • #4324: Reduce permissions of default authentication scope
  • #4341: Add favicon to Django API
  • #4412: Make admin media endpoint work for all media items not just those with reports
  • #4413: Include report filing in admin media view

Catalog

  • #1093: Remove the Community Involvement handbook page
  • #1095: Remove the provider ingestion script refactor handbook page
  • #3885: Backfill `license_url` field for images where it's null in the meta_data
  • #4348: The `add_license_url` DAG keeps timing out

Documentation

  • #4356: Create a document for how to start the catalog stack and run a DAG for testing

Frontend

  • #480: Refactor recent searches to reduce code duplication
  • #957: Better accessible name for Filters button and tab
  • #3195: Looped in the recent searches when browsing with keyboard

Ingestion Server

  • #4381: Drop `ORDER BY` clause from copy step of image data refresh when adding a limit

Management

  • #2068: Make linting more contributor-friendly
  • #4137: Create a "dev dependencies check" script for identifying what a contributor may need
  • #4327: Update the PR Review Reminder DAG with special timing for non-maintainers
  • #4404: Replace single usage of perl with Python
  • #4407: Add support for aliases in `ov`

openverse-infrastructure

Merged PRs

Management

  • #913: 🔄 synced file(s) with WordPress/openverse

Closed issues

API

  • #895: Incorrect log line separator definition for API

Documentation

  • #786: Update references to openverse.engineering domains in our public and internal documentation to use openverse.org instead

Infra

  • #895: Incorrect log line separator definition for API

#openverse, #week-in-openverse

A week in Openverse: 2024-05-20 – 2024-05-27

openverse

Merged PRs

API

  • #4238: Change search query approach to include only available providers
  • #4334: Add 'revoked' field to ThrottledApplication to enable easily revoking access to client applications violating openverse TOS
  • #4362: Publish changelog for api-2024.05.20.15.14.53
  • #4377: Publish changelog for api-2024.05.23.15.02.00
  • #4380: Remove overridden function that doesn't do anything over super

Catalog

  • #4297: Set up airflow variable defaults with descriptions automatically
  • #4345: Fix Slack message formatting for ES health alert
  • #4357: Convert longer media `varchar` fields to `text` in the catalog db
  • #4369: Use `.venv` for catalog virtualenv instead of `venv`
  • #4378: Update dependency apache-airflow to v2.9.1 [SECURITY]

Documentation

  • #4302: Implementation Plan: Machine-generated tags on the frontend
  • #4326: Document retired node replacement in ES
  • #4383: Update link to openverse-attribution documentation

Frontend

  • #4313: Add frontend media documentation
  • #4361: Publish changelog for frontend-2024.05.20.15.14.53
  • #4363: Fix frontend to include languages that do not have iso-639-1 codes
  • #4368: Install caniuse-lite as a frontend dev dependency
  • #4375: Only set the user-agent header on the server

Ingestion Server

  • #4357: Convert longer media `varchar` fields to `text` in the catalog db
  • #4358: Add logs to cleaning steps in the ingestion server and skip saving tags
  • #4364: Publish changelog for ingestion_server-2024.05.20.19.47.22
  • #4365: Bump requests from 2.31.0 to 2.32.0 in /ingestion_server

Management

  • #4384: Bump requests from 2.31.0 to 2.32.2 in /automations/python

Closed issues

API

  • #673: Move audio thumbnail retrieval into grouped query
  • #688: Use domain in primary API docs README
  • #694: The mature filter is not working
  • #736: Use alternate method for getting fast subset of rows
  • #739: Notifications when receiving content reports
  • #1055: Test issue to check the CI
  • #1232: Integrity error causes oauth registration view to 500
  • #4076: Exclude media from sources without `ContentProvider` record from search
  • #4321: Add ability to revoke access to specific Openverse API registered client applications

Catalog

  • #1436: Configure pools & priority weights
  • #4109: Use `.venv` for catalog virtualenv
  • #4202: Set up Airflow Variable defaults with descriptions automatically
  • #4312: Convert longer media `varchar` fields to `text` in the catalog database

Documentation

  • #4039: Implementation Plan: Determine and design how machine-generated tags will be displayed/conveyed in the Frontend

Frontend

  • #2766: Set UA string for frontend API requests server-side only
  • #2904: Refused to set unsafe header "User-Agent"
  • #4025: Write TSDoc to document frontend fields
  • #4367: Browserlist (caniuse-lite) DB needs updating on the frontend

openverse-infrastructure

Merged PRs

API

  • #894: Improve support for initializing ES nodes in the userdata script and ansible playbook

Documentation

  • #912: Update contact information for Europeana

Infra

  • #884: Convert Kibana to `immutable-ec2-service`
  • #891: Use non-inference based container definition sensitivity filtering
  • #893: Remove dangling references to airflow.openverse.engineering
  • #905: Challenge repeat 401/403 requesters
  • #906: Fix immutable ec2 service deploy workflow expression usage
  • #907: Include user's SSH configuration file

Ingestion Server

  • #908: Rollback `prod` ingestion server, bump `dev`, re-enable data refresh limit and set `CLEANUP_BUFFER_SIZE`

#openverse, #week-in-openverse