Secrets and configuration – pragmatic approach

Most programs, services, scheduled jobs, and scripts that we create need to connect to an external resource to pull or push data. An external resource can be a database, distributed cache, message queue, object store, and so on. To connect to a resource we need, at minimum, an address and a secret (a user name and password). If the code runs in multiple environments, we end up with a separate set of credentials for every environment.

How do we handle all of this in the code?

Approach

We can call all the information that our code needs to run the Environment. The Environment is made up of two classes of data: configuration and secrets. Configuration is public information; it is not protected and is already shared. Examples of configuration are a database server domain, a service discovery URI, a proxy URI, etc. Secrets are private and carefully protected information. Under no circumstances should they be shared or made public. Examples of secrets are user names, passwords, private keys, JWT access and refresh tokens, etc.

Environment consists of Configuration and Secrets

There are a few approaches to setting up and reading Environment-related data from application code.

  • Hardcoded secrets and configuration.
  • Configuration File.
  • Read from a secret store.
  • Read from environment variables.

Hardcoding is the simplest, but also the most dangerous and least versatile approach. Even for experiments or a proof of concept, hardcoding secrets is dangerous, because the code can be accidentally committed and the secrets become public. There are malicious crawlers that scan code in public repositories trying to find and extract secrets. Even if we delete an accidentally committed secret from the repository, a crawler may have already fetched it. Pushing a delete commit on top of the leaked secret does not save us either, because the commit with the secret is still in the git history. We would need to deliberately remove that commit from the history. However, my advice is to change the secret immediately after you discover it was leaked. If the code runs in multiple environments, it is not convenient to hardcode even configuration. So in general, hardcoding is not something that I would recommend.

Reading configuration from a config file is not a bad idea at all. Keeping secrets in a config file is OK for local or development environments. For a production environment, however, we need to make sure the file is protected against unauthorized access. Reading from a config file in the code implies a dependency on the config file format and structure. This may not be ideal if we share configuration files between multiple applications, especially if the applications are written in different programming languages. Each application would need to implement code to read the file and understand its format and structure. Secondly, if we need to change the structure or format of the shared file, we’ll have to change every application that reads the file and therefore risk breaking some of them.
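To make the coupling concrete, here is a minimal sketch of an application reading a JSON config document. The layout and the key names (`database`, `server`, `port`) are assumptions made up for illustration; they are not taken from the example project.

```python
import json


def load_db_config(config_text: str) -> dict:
    """Parse a JSON config document and return the database settings.

    The key names are hypothetical. Every application that shares the
    file has to agree on them, which is the coupling described above.
    """
    config = json.loads(config_text)
    database = config["database"]
    return {"server": database["server"], "port": int(database["port"])}


example = '{"database": {"server": "db.example.internal", "port": "5432"}}'
print(load_db_config(example))
```

Renaming `database` to something else in the shared file would break this function, along with its counterpart in every other application that reads the file.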

Reading from a secret store, for example AWS Secrets Manager, is another approach. However, the code becomes bound to that particular type of secret store.

Reading configuration and secrets from environment variables is probably the most universal approach. I can’t recall any programming language that cannot read environment variables. This approach is also infrastructure agnostic: secrets and configuration can be read from anywhere and passed as environment variables before the application starts. It decouples the source of secrets and configuration from application code, so you can change the source without changing the code. The source can be a file, a secret store, or anything else. Because this approach is so versatile, we are going to focus primarily on it.

Starting application with Environment Variables

The idea is to read secrets and configuration from one or more sources before the application starts and pass them to the application process as environment variables. The shell script below demonstrates how we can read configuration and secrets and pass them to a Python program.

#!/bin/bash

WORKING_DIR="$(dirname "${BASH_SOURCE[0]}")"

read_secret="$WORKING_DIR/../read_secret.sh"
read_config="$WORKING_DIR/../read_config.sh"

# Reading environment variables and passing them to
# the Python process. Paths are quoted so that directories
# containing spaces do not break the script.
DB_USERNAME="$("$read_secret" db_username)" \
DB_PASSWORD="$("$read_secret" db_password)" \
DB_SERVER="$("$read_config" db_server)" \
DB_SERVER_PORT="$("$read_config" db_server_port)" \
python3.7 "$WORKING_DIR/simple_app.py" "$@"

read_secret and read_config are introduced for demonstration purposes. In your environment or on your machines there will most likely be a CLI available for reading secrets and configuration. In our example, read_secret and read_config play the role of such a CLI.
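The same launch step can also be sketched in Python for cases where a shell script is not convenient. The helper name `run_with_environment` and the placeholder values are assumptions for illustration; a real launcher would obtain the values from whatever CLI or store is available instead of writing them inline.

```python
import os
import subprocess
import sys


def run_with_environment(command: list, extra_env: dict) -> subprocess.CompletedProcess:
    """Run `command` with `extra_env` layered on top of the current environment."""
    child_env = {**os.environ, **extra_env}
    return subprocess.run(command, env=child_env, check=True,
                          stdout=subprocess.PIPE, text=True)


# The child process here only prints one variable to show that it arrived.
result = run_with_environment(
    [sys.executable, "-c", "import os; print(os.environ['DB_SERVER'])"],
    {"DB_SERVER": "db.example.internal", "DB_SERVER_PORT": "5432"},
)
print(result.stdout.strip())
```

Only the child process sees the extra variables; the launcher's own environment is left unchanged.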

Now we need to write code that reads the process environment variables. It is a good idea to keep such code in one place, so we can create an Environment class that is responsible for retrieving all the secrets and configuration from the process environment.

import os


class Environment:
    """Single place where the application reads its environment variables.

    os.environ[...] raises KeyError if a variable is not set at all.
    """

    @classmethod
    def db_user_name(cls):
        return os.environ["DB_USERNAME"]

    @classmethod
    def db_password(cls):
        return os.environ["DB_PASSWORD"]

    @classmethod
    def db_server(cls):
        return os.environ["DB_SERVER"]

    @classmethod
    def db_server_port(cls):
        return os.environ["DB_SERVER_PORT"]
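A variable that is missing or empty is easier to diagnose when the error names it. The sketch below is one possible way to do that; `require_env` and `MissingEnvironmentVariable` are hypothetical names introduced here and are not part of the example project.

```python
import os


class MissingEnvironmentVariable(RuntimeError):
    """Raised when a required environment variable is unset or empty."""


def require_env(name: str) -> str:
    """Return the value of `name`, or fail with a message that names it."""
    value = os.environ.get(name, "")
    if not value:
        raise MissingEnvironmentVariable(
            f"Environment variable {name} is not set or is empty")
    return value
```

With such a helper, a method like db_server() could return `require_env("DB_SERVER")`, and callers would not need to check for empty values themselves.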

The Environment class can now be used in our program, for example in a Repository class that reads the secrets and configuration necessary to construct a database connection string.

class Repository:
    def connect(self):
        if Environment.db_server() and Environment.db_password() and \
                Environment.db_user_name() and Environment.db_server_port():
            self._connection_string = (
                f"postgresql://"
                f"{Environment.db_user_name()}:{Environment.db_password()}@"
                f"{Environment.db_server()}:{Environment.db_server_port()}"
            )
        else:
            raise Exception("Unable to build connection string due to "
                            "missing configuration")

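One detail worth considering when building a connection URI is that passwords often contain characters such as `@` or `/`, which have a special meaning inside a URI. A minimal sketch of percent-encoding the credentials with the standard library follows; `build_connection_string` is a hypothetical helper and the values are placeholders.

```python
from urllib.parse import quote


def build_connection_string(user: str, password: str, server: str, port: str) -> str:
    """Build a PostgreSQL URI with percent-encoded credentials."""
    # safe="" makes quote() encode "/" too, which it leaves untouched by default.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{server}:{port}"
    )


print(build_connection_string("app", "p@ss/word", "db.example.internal", "5432"))
# prints postgresql://app:p%40ss%2Fword@db.example.internal:5432
```
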
Environment Variables and Tests

Our automated tests, especially integration and system tests, will need access to secrets and configuration. To solve this, we can set up environment variables before the tests run. This can be accomplished by reading secrets and configuration the same way as on application startup and then injecting the data into environment variables. In our case we are going to use read_secret and read_config. Notice that we avoid duplicating the code that reads secrets and configuration, and we don’t need to hardcode anything, even for tests.

import os
import subprocess

ENVIRONMENT_READ_TOOLS_PATH = "."

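def _read_secret(secret_name: str) -> str:
    """Return a secret by calling the read_secret.sh tool."""
    # check=True raises CalledProcessError if the tool fails, so a failure
    # does not silently inject an empty value.
    result = subprocess.run(
        [f"{ENVIRONMENT_READ_TOOLS_PATH}/read_secret.sh", secret_name],
        stdout=subprocess.PIPE, check=True)
    return result.stdout.decode('utf-8').strip()


def _read_config(config_name: str) -> str:
    """Return a configuration value by calling the read_config.sh tool."""
    result = subprocess.run(
        [f"{ENVIRONMENT_READ_TOOLS_PATH}/read_config.sh", config_name],
        stdout=subprocess.PIPE, check=True)
    return result.stdout.decode('utf-8').strip()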

# Injecting environment variables
os.environ["DB_USERNAME"] = _read_secret("db_username")
os.environ["DB_PASSWORD"] = _read_secret("db_password")
os.environ["DB_SERVER"] = _read_config("db_server")
os.environ["DB_SERVER_PORT"] = _read_config("db_server_port")

One thing that’s left is to ensure that environment variables are injected before the first test runs. It would also be nice to ensure that the code executes only once per test session. We could code this ourselves, but pytest has autouse fixtures for exactly this purpose; you can read more about them in the pytest documentation. We can place the environment injection code inside a session-scoped autouse fixture.

import pytest


@pytest.fixture(scope="session", autouse=True)
def test_env():
    os.environ["DB_USERNAME"] = _read_secret("db_username")
    os.environ["DB_PASSWORD"] = _read_secret("db_password")
    os.environ["DB_SERVER"] = _read_config("db_server")
    os.environ["DB_SERVER_PORT"] = _read_config("db_server_port")

scope="session" means that the fixture is executed only once per test session. autouse=True makes the fixture run implicitly, without any test having to request it, so it is set up before the first test runs. Now we can add tests to make sure that this indeed holds true.

import src.simple_app as simple_app


def test_db_username_is_present():
    assert simple_app.Environment.db_user_name()


def test_db_password_is_present():
    assert simple_app.Environment.db_password()


def test_db_server_is_present():
    assert simple_app.Environment.db_server()


def test_db_server_port_is_present():
    assert simple_app.Environment.db_server_port()
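For unit tests that never reach a real database, placeholder values may be enough. The sketch below uses `unittest.mock.patch.dict` from the standard library, a different technique from the session fixture above, to override a variable for the duration of a single test. The `db_server` function is a stand-in for Environment.db_server() so that the example is self-contained, and the host name is made up.

```python
import os
from unittest import mock


def db_server() -> str:
    # Stand-in for Environment.db_server().
    return os.environ["DB_SERVER"]


def test_db_server_reads_overridden_value():
    # patch.dict restores os.environ to its previous state when the block exits.
    with mock.patch.dict(os.environ, {"DB_SERVER": "fake-db.example.internal"}):
        assert db_server() == "fake-db.example.internal"
```

Because the original environment is restored afterwards, the override does not leak into other tests in the same session.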

Conclusion

To summarize the approach we took to handle secrets and configuration, let’s iterate over the main ideas.

  • When starting a program, script, application, service, etc., we pass secrets and configuration as process environment variables. This approach supports almost any source of configuration and secrets, and works on most operating systems and deployment environments.
  • The application implements a class or module that reads configuration and secrets from environment variables, and other classes use that class or module to access them. This keeps application code agnostic to the source of configuration and secrets.
  • For tests that require secrets and configuration, we can implement a fixture, or take a similar approach, to inject environment variables from the same source as on application startup.

Code examples can be found on github.com/PavelHudau.
