Keeping credentials out of the repo

24 June 2021

You use credentials in your (jupyter) notebook or python script and you are about to check it in to a (git) repository. What’s the quickest way to get the credentials into a separate file and load them?

Firstly, create a creds.env file with key value pairs in it in a similar way to a shell script

username="bob"
password="topsecret"
server="server.location"
database="database"

Secondly, add this to your .gitignore

echo creds.env >> .gitignore

Note that this assumes you are in the root of the repo so if not you will need to adjust the path to the .gitignore file

Thirdly, this is the minimal code to get a dict set up in python with the creds from your creds.env.

from pathlib import Path

# Note that this just takes the first match of *.env in the current directory
p = Path("creds.env")

# Need the first line as f-strings may not contain slashes
key_val_pairs = p.read_text().replace("\n", ",\n")
exec(f"creds = dict({key_val_pairs})")
# creds is now a dict with the key value pairs in

Clearly one should be careful about using exec in a production/professional environment because it facilitates arbitrary code execution but if we’re aiming for efficiency five lines is not bad.

By way of example one can then use this dictionary to construct e.g. a connection string to a database.

import pandas as pd
from sqlalchemy import create_engine

def js(s):
    return "".join(s.split())

conn_str = js(
    f"""
mssql+pyodbc://
{creds['username']}:{creds['password']}@{creds['server']}/{creds['database']}
?driver=SQL+Server
"""
)

query = "SELECT * FROM information_schema.tables;"

engine = create_engine(conn_str)
with engine.connect() as c:
    df = pd.read_sql(query, c)

Something a bit more robust

Let’s come up with something a bit more robost which:

  • doesn’t use exec
  • uses shell script syntax, such that we also have the option of running source creds.env in the shall to have exvironment variables available to us there too

Our creds.env now looks like:

export USERNAME="bob"
export PASSWORD="topsecret"
export SERVER="server.location"
export DATABASE="database"

And our python might look something like this:

import pathlib
import sys
import functools

from typing import Final

stderr_print = functools.partial(print, file=sys.stderr)


def bail(*args, **kwargs):
    stderr_print(*args, **kwargs)
    sys.exit(1)


def read_creds() -> dict[str, str]:
    """Assumes 'creds.env' in current working directory"""

    def remove_punctuation(s):
        return s.replace('"', "").replace("'", "")

    equals: Final = "="
    creds_file_name: Final = "creds.env"

    # Load creds file contents
    fd = pathlib.Path(creds_file_name)
    if fd.exists() and fd.is_file():
        pass
    else:
        bail(f"{fd.resolve()} does not seem to exist")

    list_of_lines = [line.replace("export ", "") for line in fd.read_text().strip().split("\n")]

    # Must be an '=' in every line
    for line in list_of_lines:
        assert equals in line

    # Split on '=' and remove punctuation
    new_list_of_lines = [[remove_punctuation(s) for s in line.split(equals)] for line in list_of_lines]

    # [['USERNAME', 'bob'], ['PASSWORD', 'topsecret'], ['SERVER', 'server.location'], ['DATABASE', 'database']]
    creds_d = dict()
    for list_ in new_list_of_lines:
        k, v = list_
        creds_d[k] = v

    # Return dict of creds
    return creds_d