Environment variables are a simple way to configure software. They let you set a variable in the shell so that any processes you run can use the values provided.
This is extremely helpful for CLI tools, but it shouldn't be used to manage the configuration of complex software.
In order to understand why, we need to look at the downsides of using environment variables for configuration.
1. All Types Must be Parsed from Strings
Environment variables are always set as strings. This means any value the software needs as another type has to be parsed from a string.
For example, if I want an integer from the environment variable "FOO" in Python, I would need to run the following code:
import os
foo = int(os.environ['FOO'])
This requires an understanding of how the programming language parses strings into the appropriate data types, and it needs a lot more testing than using native data structures directly. The problem also extends to files.
If you want to configure file contents such as SSH keys or SSL certificates, doing so with environment variables can get very messy because of newlines and formatting.
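To make the parsing burden concrete, here is a minimal sketch of the kind of helper code that tends to accumulate around environment variables. The helper names `env_int` and `env_bool`, and the accepted boolean spellings, are assumptions made for illustration and do not come from any library.

```python
import os


def env_int(name, default=None):
    """Read an integer from the environment, failing loudly on bad input."""
    raw = os.environ.get(name)
    if raw is None:
        if default is None:
            raise KeyError(f"missing required environment variable {name}")
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None


def env_bool(name, default=False):
    """Read a boolean from the environment.

    bool("false") is True in Python because any non-empty string is truthy,
    so the accepted spellings have to be listed explicitly.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in ("1", "true", "yes", "on"):
        return True
    if value in ("0", "false", "no", "off"):
        return False
    raise ValueError(f"{name} must be a boolean, got {raw!r}")
```

Every additional type (lists, durations, file contents) needs a similar helper and its own tests, which is the overhead described above.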
2. No Nesting or Built In Structure
Environment variables are global values, which means their names can collide. For example, if you need to set the "URL" for multiple downstream services, each one needs a prefix to keep them apart, like this:
FOO_URL=x.com
BAR_URL=y.com
This lack of structure leads to long names that can unintentionally overlap with other services or features if you are not careful.
It also prevents encapsulation and grouping of configuration values. Structured data like JSON or YAML lets you pass a whole group of keys to the code that needs it, without that code having to know where the group sits in the overall configuration.
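As a rough sketch of what that grouping can look like, the example below parses a small JSON document and hands each service only its own settings. The keys (`foo`, `bar`, `url`, `timeout_seconds`) and the function `describe_service` are hypothetical, chosen to mirror the FOO_URL and BAR_URL example.

```python
import json

# Hypothetical configuration; in practice this would live in a file.
CONFIG_TEXT = """
{
  "foo": {"url": "x.com", "timeout_seconds": 5},
  "bar": {"url": "y.com", "timeout_seconds": 30}
}
"""


def describe_service(name, settings):
    """Use one group of settings without knowing where it sits in the config."""
    return f"{name}: {settings['url']} (timeout {settings['timeout_seconds']}s)"


config = json.loads(CONFIG_TEXT)

# Each service receives only its own group; both can use the key "url"
# without a prefix because the nesting keeps them apart.
for name, settings in config.items():
    print(describe_service(name, settings))
```

Note that `timeout_seconds` arrives as an integer with no parsing step, since JSON carries basic types natively.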
3. Global Access within a Process
Good software engineering minimises the use of global state so that code is easy to test and trace. Environment variables work against this: they encourage global state and make it very easy to pull in values from anywhere. The deeper in the code a variable is read, the harder it becomes to inject configuration and run tests.
import os

def a():
    foo = int(os.environ['FOO'])
    return foo * 42

def b(foo):
    return foo * 42
In the Python example above, function "a" is much harder to test than function "b", because "b" separates the management of configuration from the pure functionality. Avoiding environment variables does not make this problem go away, but in my experience they are a contributing factor.
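The difference shows up as soon as you write tests for the two functions. The sketch below uses `unittest.mock.patch.dict` from the standard library to set "FOO" temporarily. The test function names are hypothetical.

```python
import os
from unittest import mock


def a():
    foo = int(os.environ['FOO'])
    return foo * 42


def b(foo):
    return foo * 42


def test_a_needs_patched_environment():
    # The test has to know that a() reads FOO, and must patch global state
    # (and restore it afterwards) just to supply one input value.
    with mock.patch.dict(os.environ, {"FOO": "2"}):
        assert a() == 84


def test_b_takes_plain_argument():
    # No setup needed: the value is passed in directly.
    assert b(2) == 84
```

These functions can be collected by a test runner such as pytest or called directly. The test for "a" depends on knowing the function's hidden input, while the test for "b" only depends on its signature.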
4. Need to be Set at Runtime
Environment variables need to be set in the environment of a process before the application starts. This adds complexity to starting and managing processes. If you run a Python application that pulls its config from a file, no additional shell commands are required:
python app.py
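However, if you need to set environment variables, this has to be scripted in whichever shell the environment uses. In a POSIX shell the variable must be placed directly in front of the command (or exported first) for the application to see it:
FOO=BAR python app.py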
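For comparison, here is a minimal sketch of how an application might load its configuration from a file at startup, so that launching it needs nothing beyond the plain command. The default path `config.json` and the function name `load_config` are assumptions made for illustration.

```python
import json
from pathlib import Path

DEFAULT_CONFIG_PATH = Path("config.json")  # hypothetical default location


def load_config(path=DEFAULT_CONFIG_PATH):
    """Load configuration from a JSON file and return the parsed contents."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

An `app.py` written this way could call `load_config()` once at startup and pass the resulting values down to the functions that need them, which also keeps those functions free of global state.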
Summary
Environment variables are a great way to get started with managing configuration for software. However, any complex software should replace them with file-based configuration like JSON or YAML.
This makes managing the configuration simpler and enables a more mature implementation of types, testing, and process management.
For one example of how to implement configuration files refer to: How to Use Feature Flags.
For more content like this follow or contact me:
- Twitter: @BenTorvo
- Email: ben@torvo.com.au
- Website: torvo.com.au