2
\$\begingroup\$

I have a scenario where I need to store values from some environment variables (the names of which I cannot control). My first attempt was pretty basic, but it is not usable because an exception would be raised for the first missing environment variable only. A requirement is to capture all missing environment variables and raise an exception listing them all.

For this, I have come up with two possible techniques, but I'm not sure which would be considered the most Pythonic, or whether there would be a better way of doing it?

Note that the name of the variable I store may not match the name of the environment variable (as some are very long, but I cannot change them).

Thoughts and criticisms on these techniques would be appreciated.

Technique 1

Each environment variable is checked individually. While this looks reasonable, in reality there will be 15-20 environment variables to set.

import os

environ_errors = []
environ1 = (os.environ.get('ENVIRON1')
    or environ_errors.append('ENVIRON1'))
environ2 = (os.environ.get('ENVIRON2')
    or environ_errors.append('ENVIRON2'))
environ3 = (os.environ.get('ENVIRON3_IS_A_REALLY_LONG_NAME')
    or environ_errors.append('ENVIRON3_IS_A_REALLY_LONG_NAME'))

if len(environ_errors) > 0:
    base_err = 'One or more required environment variables do not exist:'
    err_msg = ('%s %s' % (base_err, ', '.join(environ_errors)))
    raise Exception(err_msg)

Technique 2

Use a dict to store the variable name and environment variable to target, then iterate over the dict and update the value of each with the value of the corresponding environment variable.

My issue with this is that I won't be able to use python variables to represent each environment variable, but instead they are all in one dict.

I'm also unsure if updating a dict during a for loop is considered Pythonic.

import os

environs = {
    'environ1': 'ENVIRON1',
    'environ2': 'ENVIRON2',
    'environ3': 'ENVIRON3_IS_A_REALLY_LONG_NAME'
}
environ_errors = []
for k, v in environs.items():
    try:
        environs[k] = os.environ[v]
    except KeyError as e:
        environ_errors.append(e.args[0])

if len(environ_errors) > 0:
    base_err = 'One or more required environment variables do not exist:'
    err_msg = ('%s %s' % (base_err, ', '.join(environ_errors)))
    raise Exception(err_msg)
\$\endgroup\$
4
  • \$\begingroup\$ Do you know / have control of where these specific env variables are being stored? .env file / some other place or do you have to hardcode them in your code? \$\endgroup\$ Commented Jul 4 at 10:59
  • \$\begingroup\$ The scenario is that the Python script will be run as part of an Init Container (for a Kubernetes Job). The environment variables will be set through a Helm chart and will be injected into the container. As the environment variables are already being used in the 'main' container, for consistency reasons I have to use the same names. But the variable names can (should) be shorter. \$\endgroup\$
    – David Gard
    Commented Jul 4 at 11:05
  • \$\begingroup\$ Thanks to the time taken by genuinely helpful users, I have a better technique for achieving my goals. Sadly an overzealous moderator won't let me share that technique, and has deleted comments they made so that others are not aware. \$\endgroup\$
    – David Gard
    Commented 5 hours ago
  • \$\begingroup\$ You can open a new question here on CR if you want other solutions to be reviewed ^^ Editing your question directly is against the rules as it modifies the initial request \$\endgroup\$ Commented 3 hours ago

3 Answers

6
\$\begingroup\$

Neither of your options is very well-suited to static type analysis. To improve that, consider instead something like

import inspect
import os
import typing


class ConfigEnv(typing.NamedTuple):
    PS1: str
    PWD: str
    DOES_NOT_EXIST: str
    ALSO_MISSING: str

    @classmethod
    def from_environ(cls, environ: typing.Mapping[str, str] | None = None) -> typing.Self:
        if environ is None:
            environ = os.environ
        needed = inspect.get_annotations(cls).keys()
        missing = needed - environ
        if missing:
            raise EnvironmentError(
                'Missing environment variable ' + ', '.join(missing)
            )
        return cls(**{k: environ[k] for k in needed})


env = ConfigEnv.from_environ()
print(env)

Once the dust settles and you have an object, you no longer have to rely on a dictionary.
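One benefit of the optional environ parameter is that the loader can be exercised with a plain dict instead of the real process environment. The following is a minimal sketch of that idea, not the code above verbatim: the field names HOME_DIR and SHELL_NAME are made up for illustration, and it uses the NamedTuple's _fields attribute rather than inspect.get_annotations so that it also runs on older Python versions.

```python
import os
import typing


class ConfigEnv(typing.NamedTuple):
    # Hypothetical variable names, chosen only for this illustration.
    HOME_DIR: str
    SHELL_NAME: str

    @classmethod
    def from_environ(
        cls, environ: typing.Optional[typing.Mapping[str, str]] = None
    ) -> "ConfigEnv":
        if environ is None:
            environ = os.environ
        # Collect every missing name before raising, not just the first one.
        missing = [name for name in cls._fields if name not in environ]
        if missing:
            raise EnvironmentError(
                "Missing environment variable " + ", ".join(missing)
            )
        return cls(**{name: environ[name] for name in cls._fields})


# A plain dict stands in for os.environ, which keeps the check repeatable.
fake_environ = {"HOME_DIR": "/home/demo", "SHELL_NAME": "bash"}
env = ConfigEnv.from_environ(fake_environ)
print(env.HOME_DIR)

# With one variable absent, the error message names it.
try:
    ConfigEnv.from_environ({"HOME_DIR": "/home/demo"})
except EnvironmentError as exc:
    error_text = str(exc)
    print(error_text)
```

Attribute access such as env.HOME_DIR is what a static type checker can verify, which a string-keyed dictionary lookup does not offer.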

\$\endgroup\$
2
  • 2
    \$\begingroup\$ Using a dataclass instead would allow using dataclasses.fields(cls) - probably a bit cleaner than inspect.get_annotations. \$\endgroup\$
    – STerliakov
    Commented Jul 4 at 21:07
  • 2
    \$\begingroup\$ @SUTerliakov dataclasses have a vastly more complex API and are overkill here. \$\endgroup\$
    – Reinderien
    Commented Jul 4 at 21:57
5
\$\begingroup\$

The second technique, where you iterate over a sequence of name pairs, is much much better. There is nothing wrong with catching KeyError, but you may find it more convenient to use os.getenv(v), which will return None if the environment variable is missing.

Raising Exception is always a bad idea. Why? It prevents the caller from catching fine-grained exceptions. Raising an exception is good, but it should be more specific than the Exception class. Prefer to raise the pre-existing KeyError at the end, or define an app-specific exception class, perhaps named EnvError or ConfigError.
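As a rough sketch of the app-specific option, something like the following could work. The class name ConfigError is the one floated above; the helper require_env, its parameters, and the missing attribute are assumptions made for this example only.

```python
import os


class ConfigError(Exception):
    """Raised when one or more required environment variables are missing."""

    def __init__(self, missing):
        # Keep the names available so callers can inspect them programmatically.
        self.missing = list(missing)
        super().__init__("missing env var(s): " + ", ".join(self.missing))


def require_env(names, environ=None):
    """Return a dict of the requested variables, or raise ConfigError.

    `environ` defaults to os.environ; passing a plain dict is handy for tests.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if environ.get(name) is None]
    if missing:
        raise ConfigError(missing)
    return {name: environ[name] for name in names}


# The caller can now catch exactly this failure and nothing else.
try:
    require_env(["ENVIRON1", "ENVIRON2"], environ={"ENVIRON1": "a"})
except ConfigError as exc:
    caught = exc.missing
    print(exc)
```

Subclassing KeyError instead would let existing except KeyError handlers catch it too, at the cost that str() of a KeyError with a single argument shows the repr of that argument, quotes included.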

It’s enough to say if environ_errors:, without mentioning len().

The base_err could simply explain “missing env var(s)”.

Avoid using the % operator for string formatting. Here, simple + concatenation suffices. Prefer an f-string for fancier formatting needs.

If you’re not fond of [ ] de-referencing a dict for some reason, you can always define an env result object, and then do setattr(env, k, os.getenv(v)). That lets the caller use env.a and env.b notation to access individual attributes.
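A minimal sketch of that setattr idea, assuming types.SimpleNamespace as the result object. The function name load_env and the demo values are hypothetical; the mapping reuses names from the question.

```python
import os
import types

# Short attribute name -> actual environment variable name.
ENVIRON_MAPPING = {
    "environ1": "ENVIRON1",
    "environ3": "ENVIRON3_IS_A_REALLY_LONG_NAME",
}


def load_env(mapping, environ=None):
    """Build an attribute-style object from the mapped environment variables."""
    environ = os.environ if environ is None else environ
    env = types.SimpleNamespace()
    missing = []
    for attr_name, env_name in mapping.items():
        value = environ.get(env_name)
        if value is None:
            missing.append(env_name)
        else:
            setattr(env, attr_name, value)
    if missing:
        # Report every missing variable at once.
        raise KeyError("missing env var(s): " + ", ".join(missing))
    return env


# A plain dict stands in for os.environ in this demonstration.
demo_environ = {"ENVIRON1": "one", "ENVIRON3_IS_A_REALLY_LONG_NAME": "three"}
env = load_env(ENVIRON_MAPPING, demo_environ)
print(env.environ1, env.environ3)
```

The trade-off is that attributes created with setattr are invisible to static type checkers, so this buys the dotted notation but not the type safety of a declared class.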

\$\endgroup\$
2
  • \$\begingroup\$ Thanks for your reply. I'm unsure if you are suggesting I get rid of the raise statement entirely, or if you meant that I shouldn't re-raise? I have added Technique 3 to the question, which I believe captures what you have suggested, though if you could critique that suggestion that would be great. With regard to your other points: I've removed len(), I sadly cannot change the error message, and I use % because the message will be logged using the logging module, and that's what the docs suggest to use. \$\endgroup\$
    – David Gard
    Commented Jul 4 at 11:25
  • 1
  • \$\begingroup\$ @DavidGard Raising is fine, but raising the basic Exception type is not. Use a KeyError or an even better-suited exception type. \$\endgroup\$ Commented Jul 6 at 8:51
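On the % formatting point raised in the comments: the two styles need not conflict. A small sketch, assuming a logger named env_check (a made-up name), in which the logging call receives the %-style template and its arguments separately, while the exception message is built with an f-string:

```python
import logging

logger = logging.getLogger("env_check")  # hypothetical logger name

environ_errors = ["ENVIRON1", "ENVIRON3_IS_A_REALLY_LONG_NAME"]
base_err = "One or more required environment variables do not exist:"

# Pass the template and its arguments separately; logging fills in the
# %s placeholders itself when the record is actually handled.
logger.error("%s %s", base_err, ", ".join(environ_errors))

# The exception message is an ordinary string, so an f-string is fine here.
err_msg = f"{base_err} {', '.join(environ_errors)}"
```

Applying the % operator to the string before handing it to the logger gives up that deferred formatting, so it has no advantage over an f-string in that position.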
3
\$\begingroup\$

Both techniques have their pros and cons, but there is a slightly (IMO) better way to achieve this using a dictionary to map environment variables to their corresponding variable names, as in your second example, but with some small improvements.

Here’s a refined version that keeps your dictionary structure, but uses a more concise approach to handle missing environment variables.

Steps:

  • use a dictionary to map each short variable name to the name of its environment variable.
  • iterate over the dictionary to fetch the environment variables.
  • store the values in a new dictionary or individual variables.
  • collect all missing variables and raise an exception if any are missing.

Suggested code:

import os


ENVIRON_MAPPING = {
    'environ1': 'ENVIRON1',
    'environ2': 'ENVIRON2',
    'environ3': 'ENVIRON3_IS_A_REALLY_LONG_NAME'
}

env_values = {}
environ_errors = []

for var_name, env_name in ENVIRON_MAPPING.items():
    value = os.environ.get(env_name)
    if value is None:
        environ_errors.append(env_name)
    else:
        env_values[var_name] = value


if environ_errors:
    base_err = 'One or more required environment variables do not exist:'
    err_msg = f"{base_err} {', '.join(environ_errors)}"
    raise Exception(err_msg)

Additional Note:

If you need to access the variables individually later in the code, you can either keep them in the dictionary and access them as env_values['environ1'] or unpack them into individual variables as shown below:

environ1 = env_values.get('environ1')
environ2 = env_values.get('environ2')
environ3 = env_values.get('environ3')
\$\endgroup\$
1
  • \$\begingroup\$ Thanks for your reply. The idea of a separate ENVIRON_MAPPING dict seems sensible, so I've included that in Technique 3 in my question, along with some other suggestions for other answers so far. \$\endgroup\$
    – David Gard
    Commented Jul 4 at 11:26
