Continuing from my last post on Python pipes and the reduce pipeline, today I released functional-pipeline, a library that packages the reduce pipeline along with a bunch of helpers that make working with functional pipelines easier.
I have already posted at length about functional programming in Python, and this library doesn't really add anything new beyond the helpers (which you can read about in the docs), so I won't go on more about that. Instead, in this post I want to go over the process of releasing that package, and the work surrounding it.
This was the first open source library I have taken to a full release since changelog-cli, which means it's been over a year since I last went through the rigor of open sourcing a codebase and using all of the tooling available for open source. Back then my workflow looked a bit like this:
- Check out the codebase and update to the latest
- Make any changes, test them, get everything running locally
- Make sure tox and the tests pass locally
- Push changes to GitHub and make sure Travis, Coveralls, and landscape.io all pass and look good
- Run `invoke` locally to use the changelog-cli to cut a release, tag it, build a wheel, and publish via twine
Some of the things this workflow didn't really account for:

- How do contributors help out or pick up the workflow?
- How is documentation done?
- The CI (Travis, Coveralls, landscape.io) is independent of the release and deploy
- PyPI supported RST but not Markdown (at the time)
So after a year of delivering software, getting intimately familiar with GitLab, Docker, and other build tools, and publishing a few private repositories at work, I decided to start refining my workflow.
The first thing I addressed was whether to continue using GitHub + Travis or go with the one-stop shop that is GitLab. Over the past few years of working in GitLab, I have fallen in love with its CI-configuration-as-code model. Travis had this as well, but having everything in the same platform, all just working together, is awesome. I ended up hosting the open source code on GitLab.
The next step was to set up a proper branching strategy. GitLab allows me to deny pushes and restrict merges to `master`. This way all changes need to pass CI to make it into `master`, and at minimum need to go through an MR. This not only gives a stop-gap for approval, code review, and ensuring that CI is passing; it also significantly cleans up the git history of the codebase, because I can force fast-forward merges and squash all branches before they make it to `master`. This lets me make a ton of "commit early and often" commits while I am working on a feature, but end up with detailed, almost book-length commit messages and a very descriptive history by the time it all gets to `master`.
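The squash-before-master flow can be sketched in plain git commands; this is a self-contained demo in a throwaway repo (branch and file names are just examples, not this project's actual branches):

```shell
set -e
# Work in a throwaway repo so the demo is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.email=me@example.com -c user.name=me commit -q --allow-empty -m "initial commit"

# "Commit early and often" on a feature branch.
git checkout -q -b feature/helpers
for i in 1 2 3; do
  echo "wip $i" >> helpers.py
  git add helpers.py
  git -c user.email=me@example.com -c user.name=me commit -q -m "wip $i"
done

# Back on the default branch, squash the whole branch into one change
# and write a single descriptive commit message for it.
git checkout -q -
git merge --squash -q feature/helpers
git -c user.email=me@example.com -c user.name=me commit -q -m "Add pipeline helpers (squashed from three wip commits)"
git log --oneline
```

The log ends up with one clean commit for the feature instead of three "wip" commits.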
Next, I changed to only doing deploys from CI, and only on tags. Like the `master` branch, I can restrict who can cut tags. Once a few changes have made it to `master` (along with the changelog changes), I cut a release branch, update the changelog and the `__version__` variable in the root `__init__.py`, and merge the release branch into `master`. As soon as it builds, I cut a tag off of `master`, which triggers an `only: tags` build in GitLab that pulls my protected username and password for PyPI and builds and deploys that tag as the release version. This way releases are 1:1 with tags, since creating a tag kicks off the build and deploy of a release.
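A tag-only deploy job in `.gitlab-ci.yml` can look roughly like this (the job name and image are my assumptions, not this project's actual config; `TWINE_USERNAME`/`TWINE_PASSWORD` would be protected CI variables):

```yaml
deploy:
  stage: deploy
  image: python:3.7
  script:
    - pip install twine wheel
    - python setup.py sdist bdist_wheel
    # twine reads TWINE_USERNAME / TWINE_PASSWORD from the environment,
    # which GitLab injects from protected variables on protected tags.
    - twine upload dist/*
  only:
    - tags
```

Because the job runs `only` on tags, pushing a tag is the release trigger; nothing ever gets uploaded from a branch build.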
One of the main things I have been wanting to upgrade about my workflow is documentation. Previously I just used Markdown files in the repo. Since then I have played with Sphinx, MkDocs, and a few other Python documentation generators. Because of how much I love Markdown, I decided to go with MkDocs. I wanted to publish the docs on ReadTheDocs, and luckily they have a guide on how to do that with MkDocs.
Along with documentation, testing has previously been tedious for me. I have had a long love/hate relationship with `pytest`, mostly due to some very poorly written tests that pytest allowed far too many layers of abstraction for. However, pytest has one of the best test discovery runners around, and for this project, the fact that it can run doctests as part of that discovery is great.
Quick sidenote on doctest. When I was first learning Python, doctest was taught to me as a quick way to prove something is working. Since then, I have almost never used it, always going with proper `unittest` suites and testing folders. A few years ago, while doing daily challenges, I found it nice to write the function name, its signature, and a few smoke tests in a doctest as a form of mini TDD. I would then work on the problem until the tests passed. In the end, I had the tests, the documentation, and the code all in one place, easy to read, like so:
```python
def fizzbuzz_gen(n: int) -> List[str]:
    ...
```

With doctest, I can write easier-to-read documentation with examples, use basic TDD principles, and have it all in one place.
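Fleshed out, that mini-TDD style looks something like this; the fizzbuzz implementation below is my own sketch, with the doctests playing the role of the up-front smoke tests:

```python
from typing import List


def fizzbuzz_gen(n: int) -> List[str]:
    """Classic fizzbuzz, returned as a list of strings.

    >>> fizzbuzz_gen(5)
    ['1', '2', 'Fizz', '4', 'Buzz']
    >>> fizzbuzz_gen(15)[-1]
    'FizzBuzz'
    """
    result = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            result.append('FizzBuzz')
        elif i % 3 == 0:
            result.append('Fizz')
        elif i % 5 == 0:
            result.append('Buzz')
        else:
            result.append(str(i))
    return result
```

Running `python -m doctest` on the file (or letting pytest pick the doctests up) executes the examples in the docstring, so the smoke tests, the documentation, and the code live together.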
The self-contained nature of doctest makes it perfect for testing small pure functions. So, writing a functional pipeline library, I decided to use docstring doctests. pytest can discover all doctests with the `--doctest-modules` flag, so it was an easy choice.
While I had previously used doctest for unit tests, this time I discovered something new: doctest can also read your documentation. By adding the `--doctest-glob` flag to pytest, it will scan all of my documentation examples and make sure they run. This often needs the multiline syntax of doctest:
```python
from functional_pipeline import pipeline, not_none, lens
```
This discovery ended up being my favorite part of the project. I could confidently demo all of my code in the documentation that gets deployed to ReadTheDocs while increasing coverage.
Overall, I am happy with the new pipeline-driven workflow for CI and releasing this package. It means a lot less manual work for me, and releases that are much better tested and documented. I was also happy to discover that the new version of PyPI supports Markdown `long_description`s, so I can just read in `README.md` at deploy time.
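That deploy-time read is a one-liner; here is a minimal sketch, shown as a plain kwargs dict rather than a full `setup()` call so it runs standalone (the `long_description_content_type` field is real setuptools/PyPI metadata, the rest is generic):

```python
from pathlib import Path

# Read the Markdown README so PyPI renders it on the project page.
readme = Path("README.md")
long_description = readme.read_text() if readme.exists() else ""

# Kwargs that would be passed through to setuptools.setup().
setup_kwargs = {
    "long_description": long_description,
    # Without this content type, PyPI assumes the description is reStructuredText.
    "long_description_content_type": "text/markdown",
}
```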