
Implement API/backend versioning #76

Closed
campb303 opened this issue Oct 22, 2020 · 14 comments
Assignees
Labels
documentation Related to the writing documentation or the tools used to generate docs feature-request Request for functionality that has not already been implemented

Comments

@campb303
Collaborator

We need to implement some way of versioning the API and backend to denote changes.

@campb303 campb303 added the api, tooling, and feature-request labels Oct 22, 2020
@campb303 campb303 added this to the v1 milestone Oct 22, 2020
@campb303 campb303 removed the tooling label Nov 25, 2020
@campb303
Collaborator Author

campb303 commented Jan 4, 2021

This could be controlled using the packaging tools mentioned in #159. MkDocs also supports versioning through plugins. That conversation can be tracked in #25.

@campb303 campb303 added the documentation label Jan 4, 2021
@campb303 campb303 self-assigned this Jan 5, 2021
@campb303
Collaborator Author

campb303 commented Jan 5, 2021

Versioning for Python code can be controlled with packaging. See #159

@benne238
Collaborator

benne238 commented Jan 21, 2021

Methods for Storing the Version Number

Python offers a couple of methods for saving versions of different modules and packages.

A Dedicated version.py file

In this method, a version.py file contains a module-level variable, conventionally __version__, set to the version number as a string (storing the version as an integer or any other non-string type would not make sense here). The contents of version.py would look like this:

# version.py
__version__ = "1.0"

This file would be dedicated to storing the version number of a package. Depending on our implementation, we might need more than one version.py if the different backend scripts have to be versioned separately. For example, ECNQueue.py and api.py could each get their own version and their own dedicated version.py. That would be unusual, however, since both scripts are part of the same back end.

Packaging

The version is tracked in version.py, which serves the basic need of simply storing the version number. Accessing the version number could be as simple as reading the contents of version.py. However, it is also possible to set the version attribute of a package by having setup.py read the contents of version.py instead of hard-coding the version number in setup.py. While this is somewhat redundant, declaring the version in setup.py makes it possible to get the version number from the command line once the package is installed:

pip show package_name

Note: Updating the version number is a different issue, in that we need to have a standardized or even automated way to update the version number.

This guide offers a more extensive view on the different methods that can be used to store the version number of a package.
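
As a rough sketch of the "setup.py reads version.py" idea above, the version string can be pulled out of the file with a regular expression, so the package does not have to be imported at install time. The helper names parse_version and read_version are made up for this illustration and are not part of the webqueue2 code:

```python
import re
from pathlib import Path

# Matches a line such as: __version__ = "1.0"
VERSION_PATTERN = re.compile(
    r"""^__version__\s*=\s*['"]([^'"]+)['"]""", re.MULTILINE
)


def parse_version(source: str) -> str:
    """Return the string assigned to __version__ in the given source text."""
    match = VERSION_PATTERN.search(source)
    if match is None:
        raise RuntimeError("Unable to find a __version__ string.")
    return match.group(1)


def read_version(version_file: Path) -> str:
    """Read a version.py-style file and extract its __version__ value."""
    return parse_version(version_file.read_text(encoding="utf-8"))
```

Inside setup.py this could then be used as setup(version=read_version(Path(__file__).parent / "version.py"), ...), assuming version.py sits next to setup.py.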

@benne238
Collaborator

Versioning

https://semver.org/

Using the standard above as a basis, it is relatively easy to update a version based on changes made to the back end. Semantic Versioning (SemVer) uses a standard major.minor.patch syntax, in which one part of the version is incremented by one depending on the kind of change made to the Python code.

  • A major change is anything that would cause something that depends on the API to stop functioning as a result of the change
  • A minor change is anything relating to a feature request that adds features but does not cause anything depending on the API to stop functioning as a result of the change
  • A patch is anything that slightly changes functionality, typically comparable to a bugfix or anything of that nature. (I would assume an inline documentation update would fall under this category as well)

It seems there isn't really a way to automate the version number that would be easier than manually incrementing it based on the extent of the changes in a GitHub push. However, a method for accessing versions would need to be implemented for this to be useful when developing the frontend and backend asynchronously.
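
Even if the decision about which part to bump stays manual, the arithmetic itself is small. A minimal sketch, assuming plain numeric major.minor.patch strings with no pre-release or build suffixes (bump_version is a hypothetical helper, not existing project code):

```python
def bump_version(version: str, part: str) -> str:
    """Increment one part of a major.minor.patch version string.

    Following semver.org, bumping a part resets every part to its right to 0.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")
```

For example, a breaking API change on 1.4.2 would call bump_version("1.4.2", "major"), while a bugfix would call bump_version("1.4.2", "patch").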

@benne238
Collaborator

benne238 commented Jan 26, 2021

Git Repo + Pip

It is possible to install a Python package that is hosted in a git repository as opposed to PyPI. It is as simple as:

pip install git+<git_repo_url>#egg=<python_package_name>

This method requires the presence of a setup.py script within the Python package (#egg=<python_package_name> looks for a Python package based on the setup.py script).

Furthermore, it is possible to install a package from a repository with a specific tag using this slightly modified command:

pip install git+<git_repo_url>@<tag_name>#egg=<python_package_name>

By using tags, it would then be possible to implement a relatively easy way to version the backend api.
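
To show how the two pip commands above relate, here is a small sketch that assembles the requirement string, with the tag being optional. The function name and the example URL in the usage note are placeholders invented for this illustration:

```python
from typing import Optional


def pip_git_requirement(repo_url: str, package: str, tag: Optional[str] = None) -> str:
    """Build the argument passed to `pip install` for a git-hosted package.

    Without a tag pip installs the repository's default branch; with a tag
    it installs the commit that the tag points to.
    """
    ref = f"@{tag}" if tag else ""
    return f"git+{repo_url}{ref}#egg={package}"
```

With a hypothetical repository at https://example.com/org/repo.git, pip_git_requirement("https://example.com/org/repo.git", "webqueueapi", "1.0.0") yields a string that pins the install to the 1.0.0 tag.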

Possible Issues

• A standardized way of versioning would have to be implemented so that frontend development can easily reference/install the appropriate backend version.
• This would cause a change in workflow because commits would also need to be tagged with a version number
• I'm not certain how the front-end references the api, but using this method, it would have to reference the installed package in the dev environment, meaning it may be necessary to separate the frontend and the backend into different branches or even different repositories.

https://matiascodesal.com/blog/how-use-git-repository-pip-dependency/
https://pip.pypa.io/en/stable/reference/pip_install/#git
https://www.google.com/amp/s/www.freecodecamp.org/news/how-to-use-github-as-a-pypi-server-1c3b0d07db2/amp/

@campb303 campb303 modified the milestones: v1-proof-of-concept, v2-production-ready-read-only Feb 5, 2021
@benne238
Collaborator

Best Practices

https://packaging.python.org/guides/single-sourcing-package-version/

This guide outlines several different methods of versioning python packages, all of which are more or less the same in that the version is stored in a dedicated file, such as version.py, version.txt, __init__.py or even in the setup.py script.

Each method, however, centers on getting the version into setup.py. The version can be stored anywhere, as long as it ends up being referenced in setup.py, because that is ultimately where the package's version number is read from. The version can be stored in:

  • version.py, as a global variable (__version__) or a "normal" variable (version), as long as it can be read from setup.py
  • __init__.py, similarly to version.py
  • version.txt, storing the version directly in a text file without declaring it as a variable and parsing the .txt file for the version number within setup.py
  • setup.py, storing the version directly, alongside other metadata, in the module that sets up the Python package when it is installed

All of the above methods are similar and about equally easy to implement, with setup.py being the easiest because it requires no additional parsing of other files.

Additional methods

Alternatively, the guide recommends storing the version in GitHub tags (or in a different external repository) and reading it from there. While this is possible, it would prevent tags from being used for anything other than versioning, and it could add complexity because the version would not be stored locally with the package.

Recommendation

Storing the version number within setup.py should be perfectly acceptable and easy to implement, as there is no need to add extra files. This method assumes, however, that the two scripts, ecnqueue.py and api.py, will be maintained together and share the same version number. We are already doing this, as the version for webqueueapi is 1.0:

setup.py:

from distutils.core import setup
setup(name='webqueueapi',
      version='1.0',
      py_modules=['api', 'ECNqueue'],
      )

However, when pip freeze is run in the Python environment, the output for the package is not in the expected <package_name>==<version_num> form. Instead, it looks like this:

pip freeze:

...
typing-extensions==3.7.4.3
wcmatch==8.1.1
-e git+git@github.itap.purdue.edu:ECN/webqueue2.git@7fae107d7bc7d2b9981b309677577025a0dffd5a#egg=webqueueapi&subdirectory=api
Werkzeug==1.0.1
...

This is not useful to anyone attempting to find the version number. It is a result of how the package was installed: pip install -e webqueue2/api. The -e flag installs the specified package in "development mode", and the package version does not appear in pip freeze when a package is installed this way. To mitigate the problem, drop the -e flag:

pip install webqueue2/api

pip freeze:

...
typing-extensions==3.7.4.3
wcmatch==8.1.1
webqueueapi==1.0
Werkzeug==1.0.1
...
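
Once the package is installed, the version declared in setup.py can also be read from Python rather than by scanning pip freeze. A minimal sketch, assuming Python 3.8 or newer (where importlib.metadata is in the standard library); installed_version is a hypothetical helper name:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None
```

In an environment matching the pip freeze output above, installed_version("webqueueapi") would be expected to report the version that setup.py declares.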

@campb303
Collaborator Author

Functionally, the -e flag for installing packages changes the expected path to the module from {VENV DIR}/lib/<package> to an absolute path and, when in a git workspace, attaches the last associated git commit hash as the version number. This is useful for development because we can edit our code in place, but these features are not expected in production.

@benne238
Collaborator

Tags

In git, it is possible to point to a specific commit without using the checksum of the commit. This pointer is called a tag. It is typically human readable and, in our case, will be used to store the version number associated with a commit, like 1.0.0 or 0.9.1. Creating a tag via bash is relatively simple:

git tag <name_of_tag>

This creates a tag and applies it to HEAD, or the latest commit in the current branch. It is possible to apply the tag to a different commit by following this syntax:

git tag <name_of_tag> <commit>

However, like commits, tags must be pushed to the remote repository, which is not done by default with git push. To push a tag to the remote repo, the syntax looks like this:

git push origin <name_of_tag>

Deleting a tag uses a similar syntax. To delete a tag from the local repository:

git tag -d <name_of_tag>

To delete it from the remote repo:

git push origin --delete <name_of_tag>

Types of tags

Git allows for the creation of different types of tags. The difference between them is the amount of information they store: an annotated tag can store a message, who created the tag, and the date it was created, while a lightweight tag simply points to a commit and has no other information associated with it. By default, a tag is lightweight (which is what is demonstrated above). To create an annotated tag, pass the -a flag (git then asks for a tag message unless one is given with -m):

git tag -a <name_of_tag>

Taken from https://git-scm.com/book/en/v2/Git-Basics-Tagging, the output of git show for each type of tag looks like this:

Annotated tag:

git show v1.4
tag v1.4
Tagger: Ben Straub <ben@straub.cc>
Date:   Sat May 3 20:19:12 2014 -0700

my version 1.4

commit ca82a6dff817ec66f44342007202690a93763949
Author: Scott Chacon <schacon@gee-mail.com>
Date:   Mon Mar 17 21:52:11 2008 -0700

    Change version number

Lightweight Tag:

git show v1.4-lw
commit ca82a6dff817ec66f44342007202690a93763949
Author: Scott Chacon <schacon@gee-mail.com>
Date:   Mon Mar 17 21:52:11 2008 -0700

    Change version number

For our purposes, lightweight tags would work, but the choice depends on how much information we want to store in a tag. It may be better to use annotated tags for the sake of tracking as much as possible.
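
The tag commands covered in this comment can be collected into a few small Python helpers. This is only a sketch: the function names are invented here, the helpers just build argument lists, and actually running them requires git on the PATH and a working directory inside a repository:

```python
import subprocess
from typing import List, Optional


def create_tag_command(name: str, commit: Optional[str] = None,
                       message: Optional[str] = None) -> List[str]:
    """Build `git tag`; passing a message makes the tag annotated (-a -m)."""
    command = ["git", "tag"]
    if message is not None:
        command += ["-a", "-m", message]
    command.append(name)
    if commit is not None:
        command.append(commit)  # tag this commit instead of HEAD
    return command


def delete_tag_command(name: str) -> List[str]:
    """Build the command that deletes a tag from the local repository."""
    return ["git", "tag", "-d", name]


def push_tag_command(name: str, remote: str = "origin",
                     delete: bool = False) -> List[str]:
    """Build `git push` for publishing a tag, or removing it from the remote."""
    command = ["git", "push", remote]
    if delete:
        command.append("--delete")
    command.append(name)
    return command


def run(command: List[str], cwd: str = ".") -> None:
    """Run a built command, raising CalledProcessError if git reports failure."""
    subprocess.run(command, cwd=cwd, check=True)
```

Keeping command construction separate from execution makes it possible to log or review the exact git invocation before anything touches the repository.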

@benne238
Collaborator

gitpython

GitPython allows Python scripts to interact with git in an object-oriented way: the git repository is an object within the Python script, and the methods called on that object correspond to git commands that could be run in a bash session. An example script with very basic error handling is included below:

import git # Requires "pip install gitpython"
from os import path, environ
from pathlib import Path
from dotenv import load_dotenv

# Initialize the webqueue-api repository as an object
repository_path = Path(__file__).parent
#repository = git.Repo("/home/pier/e/benne238/webqueue2-api")
repository = git.Repo(repository_path)

version_file_path = Path(repository_path, "VERSION")
load_dotenv(version_file_path)
python_version = environ.get("VERSION")

def createGitTag(version):
    try:
        repository.create_tag(version)
    except git.InvalidGitRepositoryError:
        # Instead of print statements, an official logger will eventually be implemented
        print("The directory specified is not a git repository")
        exit()
    except git.GitCommandError as e:
        error_msg = e.stderr
        if ("already exists" in error_msg):
            print(f"The tag name \"{version}\" already exists")
            exit()
        else:
            print(e)
            exit()
    except Exception as e:
        print(e)
        exit()

def pushToRemote(tagname):
    try:
        #origin = repository.remote('origin')
        #origin.push(tagname)
        repository.remote('origin').push(tagname)
    except Exception as e:
        print(e)
        exit()

createGitTag(python_version)
pushToRemote(python_version)

This script works: running it directly creates a git tag named after the VERSION variable in the VERSION file on the most recent commit in the current branch, then pushes that tag to the GitHub repository.
Some changes that should be implemented include:

  1. implementation of a formal logger instead of print statements
  2. being able to pass arguments to this script, including which branch or possibly which commit to apply a tag to
  3. modularity changes and ensuring functions are effectively used
  4. Update the gitignore to ignore this file since this is not something that is actually included in the webqueue-api package

@campb303
Collaborator Author

Let's use annotated tags for a blame trail because they store who created the tag.

This is a great first draft of a script. I agree that logging should be implemented for something like this. We can likely lift the logging setup from venv-manager and repurpose it here. We can also use argparse from venv-manager here.

As for modularity, the way you've separated these functions is good. I'm imagining that the final product will allow us to do the following:

  • Create a local tag
    • Default to latest commit
    • Allow for commit override by passing the first 7 characters of a commit's hash
  • Delete a local tag
    • Default to latest tag (not latest commit)
    • Allow for tag override by passing tag name
  • Create a remote tag
    • As you've noted this is just pushing a local tag to a remote
  • Delete a remote tag
    • This is the same command as creating a remote tag with the delete flag.
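
For the "default to latest tag" behaviour in the list above, one possible reading is "the tag with the highest version number". A minimal sketch under that assumption, which also assumes tag names are plain dotted numbers such as 0.9.1, optionally prefixed with "v" (latest_tag and version_key are hypothetical names):

```python
from typing import Iterable, Optional, Tuple


def version_key(tag: str) -> Tuple[int, ...]:
    """Turn '1.10.0' or 'v1.10.0' into (1, 10, 0) so tags compare numerically."""
    return tuple(int(piece) for piece in tag.lstrip("v").split("."))


def latest_tag(tags: Iterable[str]) -> Optional[str]:
    """Return the tag with the highest version, or None when there are no tags."""
    return max(tags, key=version_key, default=None)
```

Comparing integer tuples rather than raw strings matters because "0.10.0" sorts before "0.9.1" as text. Tag names that are not purely numeric would raise a ValueError here and would need separate handling.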

Here is a great intro to argparse for argument parsing.
Here is a great intro to the logging library.

@benne238
Collaborator

updated tag_manager script

This script is still preliminary; however, it does make effective use of logging and argparse.

import git, logging, argparse
from os import path, environ
from pathlib import Path
from dotenv import load_dotenv

current_dir = Path(__file__).parent

logger_name = "tag_manager"
logger = logging.getLogger(logger_name)
logger.setLevel(logging.DEBUG)

# See Formatting Details: https://docs.python.org/3/library/logging.html#logrecord-attributes
# Example: Jan 28 2021 12:19:28 venv-manager : [INFO] Message
log_message_format = "%(asctime)s %(name)s : [%(levelname)s] %(message)s"
# See Time Formatting Details: https://docs.python.org/3.6/library/time.html#time.strftime
# Example: Jan 28 2021 12:19:28
log_time_format = "%b %d %Y %H:%M:%S"
log_formatter = logging.Formatter(log_message_format, log_time_format)

# Configure output to stdout
stream_handler = logging.StreamHandler()
stream_handler.setFormatter(log_formatter)
stream_handler.setLevel(logging.INFO)
logger.addHandler(stream_handler)

# Configure output to a log file, tag_manager.log, next to this script
log_file_path = current_dir / f"{logger_name}.log"
file_handler = logging.FileHandler(log_file_path)
file_handler.setFormatter(log_formatter)
logger.addHandler(file_handler)

# Initialize the webqueue-api repository as an object
#repository = git.Repo("/home/pier/e/benne238/webqueue2-api")
repository = git.Repo(current_dir)

# Initialize environment variable
logger.debug("Initializing version environment file.")
version_file_path = Path(current_dir, "VERSION")
load_dotenv(version_file_path)
python_version = environ.get("VERSION")
#print(python_version)

if(python_version == None or python_version == ""):
    logger.error("VERSION has not been defined: exiting script")
    exit()
else:
    logger.debug(f"VERSION: {python_version}")

# Initialize argument parser
logger.debug("Initializing the argument parser.")
parser = argparse.ArgumentParser()

scope = parser.add_mutually_exclusive_group(required=True)
scope.add_argument("-l", "--local", 
    action="store_const",
    const="local",
    dest="scope",
    help="create/delete a tag on the local branch"
)
scope.add_argument("-r", "--remote", 
    action="store_const",
    const="remote",
    dest="scope",
    help="create/delete a tag on the local branch and push it to the remote branch"
)
parser.add_argument("action",
    action="store",
    choices=["create", "delete"],
    help="specify the creation or deletion of a new tag"
)
parser.add_argument("-c", "--commit",
    action="store",
    dest="commit",
    default="HEAD",
    help="specify the first 7 characters of a commit hash to create the tag."
)
parser.add_argument("-t", "--tag",
    action="store",
    dest="tag",
    default=None,
    help="specify the tag name to be deleted"
)
logger.debug("Finished initializing the argument parser")
args = parser.parse_args()

#print(args.action)
#print(args.scope)
#print(str(args.commit_hash))

def argumentHandler(action, scope, commit, tag):
    if(action == "delete" and tag == None):
        logger.error("delete command passed without a tag")
        parser.error("delete requires -t/--tag to be specified.")

    if(action == "delete" and commit != "HEAD"):
        logger.info("unnecessary -c/--commit flag passed with delete command")

    if(action == "create" and tag != None):
        logger.info("unnecessary -t/--tag flag passed with create command")
    
    if action == "create": createLocalGitTag(commit)

def createLocalGitTag(commit):
    logger.debug("Creating local git tag")
    try:
        repository.create_tag(python_version, ref=commit)
    except git.InvalidGitRepositoryError:
        # Not expected here (git.Repo() above would raise first), kept as a safeguard
        logger.error(f"{current_dir} is not a git repository")
        exit()
    except git.GitCommandError as e:
        error_msg = e.stderr
        if ("already exists" in error_msg):
            logger.error(f"The tag name \"{python_version}\" already exists")
            exit()
        elif (f"Failed to resolve \'{commit}\' as a valid ref." in error_msg):
            logger.error(f"Invalid commit reference: \"{commit}\"")
            exit()
        else:
            logger.error(e)
            exit()
    except Exception as e:
        logger.error(e)
        exit()

def pushToRemote(tagname):
    try:
        #origin = repository.remote('origin')
        #origin.push(tagname)
        repository.remote('origin').push(tagname)
    except Exception as e:
        logger.error(e)
        exit()

#createGitTag(python_version)
#pushToRemote(python_version)
argumentHandler(args.action, args.scope, args.commit, args.tag)

@benne238
Collaborator

functioning tag manager script

import git, logging, argparse
from os import path, environ
from pathlib import Path
from dotenv import load_dotenv

current_dir = Path(__file__).parent

logger_name = "tag_manager"
logger = logging.getLogger(logger_name)
logger.setLevel(logging.DEBUG)

# See Formatting Details: https://docs.python.org/3/library/logging.html#logrecord-attributes
# Example: Jan 28 2021 12:19:28 venv-manager : [INFO] Message
log_message_format = "%(asctime)s %(name)s : [%(levelname)s] %(message)s"
# See Time Formatting Details: https://docs.python.org/3.6/library/time.html#time.strftime
# Example: Jan 28 2021 12:19:28
log_time_format = "%b %d %Y %H:%M:%S"
log_formatter = logging.Formatter(log_message_format, log_time_format)

# Configure output to stdout
stream_handler = logging.StreamHandler()
stream_handler.setFormatter(log_formatter)
stream_handler.setLevel(logging.INFO)
logger.addHandler(stream_handler)

# Configure output to a log file, tag_manager.log, next to this script
log_file_path = current_dir / f"{logger_name}.log"
file_handler = logging.FileHandler(log_file_path)
file_handler.setFormatter(log_formatter)
logger.addHandler(file_handler)

# Initialize the webqueue-api repository as an object
#repository = git.Repo("/home/pier/e/benne238/webqueue2-api")
repository = git.Repo(current_dir)

# Initialize environment variable
logger.debug("Initializing version environment file.")
version_file_path = Path(current_dir, "VERSION")
load_dotenv(version_file_path)
python_version = environ.get("VERSION")
#print(python_version)

if(python_version == None or python_version == ""):
    logger.error("VERSION has not been defined: exiting script")
    exit()
else:
    logger.debug(f"VERSION: {python_version}")

# Initialize argument parser
logger.debug("Initializing the argument parser.")
parser = argparse.ArgumentParser()
parser.add_argument("-r", "--remote", 
    action="store_true",
    default=False,
    dest="remote",
    help="create/delete a tag on the local branch and push it to the remote branch"
)
parser.add_argument("action",
    action="store",
    choices=["create", "delete"],
    help="specify the creation or deletion of a new tag"
)
parser.add_argument("-c", "--commit",
    action="store",
    dest="commit",
    default="HEAD",
    help="specify the first 7 characters of a commit hash to create the tag."
)
parser.add_argument("-t", "--tag",
    action="store",
    dest="tag",
    default=None,
    help="specify the tag name to create or delete (create defaults to VERSION)"
)
logger.debug("Finished initializing the argument parser")
args = parser.parse_args()

def argumentHandler(action, remote, commit, tag):
    # delete must name its tag explicitly; parser.error() prints usage and exits
    if action == "delete" and tag is None:
        logger.error("delete command passed without a tag")
        parser.error("delete requires -t/--tag to be specified.")

    if action == "delete" and commit != "HEAD":
        logger.info("unnecessary -c/--commit flag passed with delete command")

    # create falls back to the version read from the VERSION file
    tag_name = tag if tag is not None else python_version

    if action == "create": createLocalGitTag(commit, tag_name)
    if action == "delete": deleteLocalGitTag(tag_name)
    if remote: pushToRemote(tag_name, action)

def createLocalGitTag(commit, tag):
    logger.debug("Creating local git tag")
    try:
        repository.create_tag(tag, ref=commit)

    except git.InvalidGitRepositoryError:
        logger.error(f"{current_dir} is not a git repository")
        exit()

    except git.GitCommandError as e:
        error_msg = e.stderr
        if "already exists" in error_msg and args.remote:
            logger.info(f"Tag \'{tag}\' already exists, pushing to remote")

        elif "already exists" in error_msg and not args.remote:
            logger.error(f"Tag \'{tag}\' already exists, exiting.")
            exit()

        elif "fatal" in error_msg:
            logger.error(error_msg.strip("\n"))
            exit()

        else:
            logger.error(error_msg.strip("\n"))

    except Exception as e:
        logger.error(e)
        exit()

def deleteLocalGitTag(tag):
    logger.debug("Deleting local tag")
    try:
        repository.delete_tag(tag)

    except git.InvalidGitRepositoryError:
        logger.error(f"{current_dir} is not a git repository")
        exit()

    except git.GitCommandError as e:
        #logger.error(e.cmd)
        logger.error(e.stderr.strip())
        exit()

def pushToRemote(tagname, action):
    try:
        if action == "delete":
            logger.debug(f"Deleting tag \'{tagname}\' from remote.")
            # A refspec with an empty source (":<dst>") asks the remote to delete <dst>
            repository.remote('origin').push(refspec=f":refs/tags/{tagname}")

        elif action == "create":
            logger.debug(f"Pushing tag \'{tagname}\' to remote.")
            repository.remote('origin').push(tagname)

    except Exception as e:
        logger.error(e)
        exit()

argumentHandler(args.action, args.remote, args.commit, args.tag)

This script works, although it does not yet account for all edge cases. It is used as follows:

  1. Specify either create or delete when running the script. This creates or deletes a tag, respectively.
  2. Pass the optional -r or --remote flag to push the change to the remote repository. If this flag is not passed, the tag is only created or deleted in the local repository and the remote is left untouched.
  3. If delete has been specified, the -t/--tag flag is required. It specifies which tag is to be deleted. There is deliberately no default value for this flag, to ensure that the delete command is run with explicit knowledge of what it will do.
  4. If create has been specified, the optional -c/--commit flag can be passed to specify which commit the tag is applied to. If this option is omitted, the tag is applied to HEAD.
  5. create also accepts -t/--tag, which makes it possible to create a tag with a specific name. By default, the created tag name is the version from the VERSION file, which ensures that tags and version match, so using -t/--tag with create is not recommended. If it is used, it is advisable to pair it with the -c/--commit flag.
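
The rules in this list can be exercised in isolation from git. The sketch below rebuilds only the argument handling, not the tagging itself; the program name tag_manager and the parse helper are assumptions made for illustration:

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Recreate the command line interface described in the list above."""
    parser = argparse.ArgumentParser(prog="tag_manager")
    parser.add_argument("action", choices=["create", "delete"])
    parser.add_argument("-r", "--remote", action="store_true")
    parser.add_argument("-c", "--commit", default="HEAD")
    parser.add_argument("-t", "--tag", default=None)
    return parser


def parse(argv: List[str]) -> argparse.Namespace:
    """Parse argv and enforce that delete always names a tag explicitly."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.action == "delete" and args.tag is None:
        # parser.error() prints the usage message and exits with status 2
        parser.error("delete requires -t/--tag to be specified.")
    return args
```

Here parse(["create"]) leaves commit at HEAD and tag unset, parse(["delete", "-t", "1.0.0", "-r"]) targets the 1.0.0 tag both locally and on the remote, and parse(["delete"]) exits with a usage error.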

@campb303
Collaborator Author

After several days of experimenting with GitPython, it appears that the type of automation we're looking for is not possible. In lieu of this functionality, we will manage releases manually.

To create a release:

git tag version [commit]
git push origin version

To delete a release:

git tag --delete version
git push --delete origin version

@campb303
Collaborator Author

The package version number is managed inline with the setup() function in setup.py. Git tags are managed separately. Closing.
