Set up airflow variable defaults with descriptions automatically #4297
Merged
AetherUnbound
merged 4 commits into
WordPress:main
from
madewithkode:4202_set_up_airflow_variable_defaults_with_descriptions_automatically
May 20, 2024
3c1ecf3 Automatically set up airflow variable defaults with descriptions (madewithkode)
5f1997b Automatically set up airflow variable defaults with descriptions. (madewithkode)
d45461f Automatically set up airflow variable defaults with descriptions. (madewithkode)
f9b0a5d Update comments (obulat)
Automatically set up airflow variable defaults with descriptions
commit 3c1ecf32c41dfccca3a45ac4ae0c35f8253011c3
@@ -61,4 +61,63 @@ while read -r var_string; do
  # only include Slack airflow connections
done < <(env | grep "^AIRFLOW_CONN_SLACK*")

# Set up Airflow Variable defaults with descriptions automatically
# List all existing airflow variables
output=$(airflow variables list -o plain)
As you've captured below, Airflow adds
Suggested change
This can also be done for the creation of

found_existing_vars=true

# if there are no existing variables, print this notification and continue
if [[ -z $output || $output == "No data found" ]]; then
  header "No existing variables found, proceeding to set all variables"
  found_existing_vars=false
fi

# Initialize an empty array to store the variables from the output
existing_variables=()

# Iterate through each variable and add it to $existing_variables
while IFS= read -r variable; do
  # skip airflow's default descriptive 'key' output
  if [[ $variable == "key" ]]; then
    continue
  fi
  # Append the current variable to the array
  existing_variables+=("$variable")
done <<<"$output"

if $found_existing_vars; then
  header "Found the following existing variables(The values of these will not be overwritten):"
  for variable in "${existing_variables[@]}"; do
    echo "$variable"
  done
fi

# now iterate through each row of variables.tsv and only
# run airflow variables set --description <description> <key> <value>
# if the key doesn't already exist in the database, i.e. not found in
# $existing_variables
while IFS=$'\t' read -r column1 column2 column3; do
  # skip the first meta row
  if [[ $column3 == "description" ]] || [[ ${existing_variables[*]} =~ $column1 ]]; then
    continue
  fi

  if [ "$column1" != "Key" ]; then
    airflow variables set --description "$column3" "$column1" "$column2"
  fi
done <"variables.tsv"

# Print the new variables list
new_variables_list=$(airflow variables list -o plain)
header "The following variables are now set:"
echo "$new_variables_list"

# if the last line in variables.tsv did not correctly terminate
# with a new line character then this variable would not be empty
# and this means the last line would not be read correctly.
if [ -n "$column1" ]; then
  header "Missing new line character detected!!!"
  echo -e "Last variable added to variables.tsv might not be picked up,\nensure it ends with a new line character and retry."
fi

exec /entrypoint "$@"
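
A note on the existence check in the hunk above: `[[ ${existing_variables[*]} =~ $column1 ]]` joins every key into one string and runs a regex match against it, so a key that happens to be a substring of an existing key would be treated as already set. Below is a minimal sketch of an exact-match alternative. It is not what this PR merged; `key_exists` is a hypothetical helper introduced only for illustration, and the sample keys are taken from variables.tsv.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of this PR): succeed only when the first
# argument is exactly equal to one of the remaining arguments.
key_exists() {
  local needle=$1
  shift
  local candidate
  for candidate in "$@"; do
    if [[ $candidate == "$needle" ]]; then
      return 0
    fi
  done
  return 1
}

# Sample data standing in for the parsed `airflow variables list` output.
existing_variables=("SILENCED_SLACK_NOTIFICATIONS" "CONFIGURATION_OVERRIDES")

# The regex form used in the diff: "SLACK" matches inside a longer key.
if [[ ${existing_variables[*]} =~ SLACK ]]; then
  echo "regex: SLACK is reported as existing"
fi

# The exact-match form: "SLACK" is correctly reported as missing.
if ! key_exists "SLACK" "${existing_variables[@]}"; then
  echo "exact: SLACK is reported as new"
fi
```

In the loop, the condition would then read `key_exists "$column1" "${existing_variables[@]}"` in place of the `=~` test.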
@@ -0,0 +1,4 @@
key	default	description
SILENCED_SLACK_NOTIFICATIONS	{}	Configuration for silencing Slack notifications from a DAG. Mapping of DAG ID to a list of dictionaries containing the following keys: "issue" (a link to a GitHub issue which describes why the notification is silenced and tracks resolving the problem), "predicate" (Slack notifications whose text or username contain the predicate will be silenced, matching is case-insensitive), and "task_id_pattern" (a regex pattern that matches the task_id of the task that triggered the notification, optional). Declaration: https://github.com/WordPress/openverse/blob/d500b7764c411f7d228ae12c57dce519c8709610/catalog/dags/common/slack.py#L72-L86 Example: { "finnish_museums_workflow": [ { "issue": "https://github.com/WordPress/openverse/issues/1605", "predicate": "AirflowTaskTimeout", "task_id_pattern": "clean_data" } ] }
SKIPPED_INGESTION_ERRORS	{}	Configuration for silencing an ingestion error and preventing a Slack message from being sent. Mapping of DAG ID to a list of dictionaries containing the following keys: "issue" (a link to a GitHub issue which describes why the error is silenced and tracks resolving the problem), and "predicate" (errors whose classname or message contain the predicate will be skipped, matching is case-insensitive). Declaration: https://github.com/WordPress/openverse/blob/6636dcfbb57abca19ef32027975f78548e10411f/catalog/dags/providers/provider_api_scripts/provider_data_ingester.py#L53-L64 Example: { "science_museum_workflow": [ { "issue": "https://github.com/WordPress/openverse/issues/4013", "predicate": "Service unavailable for url" } ] }
CONFIGURATION_OVERRIDES	{}	DAG configuration overrides for the provider ingestion workflows. Currently only supports overriding the execution timeout for certain tasks, but allows dynamic overrides at DAG run time. Mapping of DAG ID to a list of dictionaries containing the following keys: "task_id" (a regex pattern that matches the task_id of the task to be modified), and "timeout" (str in "%d:%H:%M:%S" format giving the amount of time the task may take, example: 6d:10h:30m). Declaration: https://github.com/WordPress/openverse/blob/2cffcb9f8da6961e84a00854a3cd472fd0f9dad8/catalog/dags/providers/provider_workflows.py#L42-L58 Example: { "brooklyn_museum_workflow": [ { "task_id_pattern": "pull_image_data", "timeout": "10h" } ] }
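
Since the entrypoint warns when variables.tsv lacks a trailing newline, it may help to see how a reader loop can tolerate that case instead. The following is only a sketch under assumptions: `read_tsv_keys` is a hypothetical function, the sample rows are made up, and the PR itself keeps the warning approach. It relies on `read` still populating its variables when it hits end-of-file on an unterminated line, even though it returns a non-zero status.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of this PR): print the key column of a
# tab-separated file, skipping the header row.
read_tsv_keys() {
  local file=$1 key default description
  # `|| [[ -n $key ]]` keeps the loop going for a final line that has no
  # trailing newline: read fails at EOF but has already filled in $key.
  # Note: tab is IFS whitespace, so consecutive tabs (empty fields) collapse.
  while IFS=$'\t' read -r key default description || [[ -n $key ]]; do
    if [[ $key == "key" ]]; then
      continue
    fi
    echo "$key"
  done <"$file"
}

# Made-up sample file whose last row is deliberately left unterminated.
sample=$(mktemp)
printf 'key\tdefault\tdescription\nFOO\t{}\tFirst variable\nBAR\t{}\tLast row, no newline' >"$sample"
read_tsv_keys "$sample"
rm -f "$sample"
```

With this form the last row is processed either way, so the post-loop check on `$column1` would no longer be needed; whether that trade-off is preferable to an explicit warning is a judgment call for the maintainers.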
It would be good to add a header here, for clarity:
The other calls to header which have been added below should be changed to echo instead.