Solution Development in Python, Part 1

It’s been a while since I tackled anything too traditionally “technical.” Lately I’ve encountered many testers who are interested in using Python as their ecosystem of choice for test solutions, particularly in data science or machine learning environments. So here I’ll talk about being a test solution developer in a Python context and what it means to create solutions in this ecosystem.

Just to be clear: I won’t be talking about using test solutions written in Python; rather this is for you test solution developers who want to write such test solutions in the Python ecosystem, which includes packaging and distributing your hard work.

I’m doing this because I’ve written a lot of tooling to support testing in the Ruby and Java ecosystems. It’s generally very easy to do this due to packaging and distribution mechanisms around Ruby; less easy in Java. Python is in between the two in my opinion.

Unlike Ruby, however, packaging and distribution for Python can seem like a maze of twisty passages, none of which are all that well documented. At the very least, a lot of documentation seems out of date or conflicts with other documentation.

Python 2 or 3?

There is still a divide between Python 2.x and Python 3.x, particularly with regard to certain documentation and support for various packages. Here I’m pretty much going to gloss over all of that, creating a very simple solution to get a module that works on both Python 2 and 3 packaged up, tested, and uploaded to the Python Package Index, also called PyPI.

Oh, and please note that PyPI is pronounced “pie pee eye”, not “pie pie”. You’ll tick off a lot of Pythonistas if you get this wrong. Later I’ll also touch on the binary wheel format because I’ve found it to be another one of those confusing areas at first.

Gather Your Resources

Everything I say here, and in this post in general, is applicable to Unix/Linux, macOS, and Windows. I won’t cover the individual nuances of each operating system, but if you’re building test tools in Python, presumably you’ve already got some experience with Python on your operating system of choice.

Beyond needing Python, of course, you need pip. You probably got it when you installed Python: pip is installed for you if you are using Python 2.7.9 or greater, or Python 3.4 or greater. Python also provides isolated virtual environments for development, using tools like virtualenv or venv. If you’re working in one of those, pip will also be available automatically.

You’ll need setuptools and wheel. The good news is that you should likely have these already if you also have pip. Just do pip list at your Terminal window/command prompt and you’ll find out what you have.

If for some reason you don’t have pip, you can use the get-pip script and run python get-pip.py. Running this script will automatically get you setuptools and wheel.

Finally, you’ll probably want Twine. This you will likely not have automatically and so you install it by using pip: pip install twine.
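If you would rather check for these tools from Python than scan the output of pip list, something like the following sketch can help. It assumes Python 3.8 or greater, where importlib.metadata is part of the standard library; the helper name installed_versions is made up for this illustration.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    tools = ["pip", "setuptools", "wheel", "twine"]
    for name, version in installed_versions(tools).items():
        print("{}: {}".format(name, version or "not installed"))
```

Anything reported as not installed can then be added with pip install, as described above.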

Incidentally, if you dual installed Python 2 and 3, you’ll likely have two executables for Python (python and python3) as well as two for pip (pip and pip3). Just be aware of that. If you’re on Windows, I’ve previously covered a bit about dual installing Python versions. On Mac, I find it simplest to just use Homebrew to install both Python 2 and Python 3.

PyPI and PyPI Test

In order to deploy anything you write, you can register on PyPI Live and also on PyPI Test. You must create an account in order to be able to upload your code. I recommend using the same email/password for both accounts, just to make your life easier when it comes time to push up your work.

Not a requirement, but it might help you to create a .pypirc configuration file. Make sure to put this file in your home folder: its path should be ~/.pypirc (that is, $HOME/.pypirc). On Windows, the home folder is %USERPROFILE%.

This file can hold your information for authenticating with PyPI, both the live and the test versions. Again, this is just to make your life easier, so that when it comes time to upload you don’t have to type in your username and password.

Putting on my security hat, because this file holds your username and password, you may want to change its permissions so that only you can read and write it. From the terminal, run:

chmod 600 ~/.pypirc

On Windows, this likely means using icacls.
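The same permission change can be made from Python, which is handy if you script your setup. This is a minimal sketch for Unix-like systems; the function names are made up for this illustration. On Windows, os.chmod only toggles the read-only flag, so it is not a substitute for icacls there.

```python
import os
import stat


def restrict_to_owner(path):
    """Give the file's owner read/write access and nobody else any (mode 600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def is_owner_only(path):
    """Return True when group and others have no permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

You would call it as restrict_to_owner(os.path.expanduser("~/.pypirc")) once the file exists.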

In terms of what to put in this file, I’ve found tutorials on this vary a surprising amount, but what seems to work best is:

I’ve seen other ways in various tutorials but the above has consistently worked for me.
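A typical .pypirc layout can be sketched as follows. Treat it as an illustration under assumptions rather than the exact file described here: the section names follow the convention that twine understands, the repository URLs are the current upload endpoints for PyPI and TestPyPI, and the username and password values are placeholders. The Python around the text only parses it to confirm it is well-formed.

```python
import configparser

# Placeholder credentials; section names and URLs are assumptions.
PYPIRC_TEXT = """\
[distutils]
index-servers =
    pypi
    testpypi

[pypi]
repository = https://upload.pypi.org/legacy/
username = your-username
password = your-password

[testpypi]
repository = https://test.pypi.org/legacy/
username = your-username
password = your-password
"""

config = configparser.ConfigParser()
config.read_string(PYPIRC_TEXT)

# The multi-line value splits into one entry per index server.
servers = config["distutils"]["index-servers"].split()
for server in servers:
    print(server, config[server]["repository"])
```

The text inside PYPIRC_TEXT is what would go into ~/.pypirc itself.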

Starting Structure

I’m just going to show you a possible starting structure and then explain the files. Feel free to create all this as you read along. I’ll show you what to add to each file as we go. NOTE: if you do want to follow along, you’re going to have to choose a name other than “proverb” because when we get to the deploy part of this (which will be in the third post), you have to deploy with a unique name.

proverb/
    proverb/
        __init__.py
    .gitignore
    LICENSE
    MANIFEST.in
    README.rst
    setup.py

I’ll be the first to admit, a lot of tutorials start out a bit slower, incrementally building a relatively easy solution. In the case of Python, I’ve personally found it’s better to include as many of the moving parts as possible as early as you can. So now let’s break this down a little bit.

Structure Breakdown

The top level directory is the root of my code repository; basically, the project directory. The important point for now is that the second proverb directory is my actual Python module. So having two “proverb” directories is not a mistake here.

Basic Support Files

If you’ve deployed any project source code to GitHub, I’m fairly sure you’re familiar with why you want a .gitignore file. Here’s what mine looks like:

# Compiled python modules.
*.pyc
 
# Setuptools distribution folder.
/dist/
 
# Python egg metadata, regenerated from source files by setuptools.
/*.egg-info
/*.eggs
/*.egg

A license file is always good, of course, presuming you want people to use your solution.

You might notice my README file is not in Markdown format, which is the most common format you’ll generally see, particularly when coming from Ruby or Java. PyPI doesn’t pay attention to Markdown format at all. It won’t reject it but it also won’t use it. PyPI does recognize reStructuredText (RST). So if you want PyPI to render your no doubt carefully crafted README on the package homepage, it has to be in RST format. There are some tricks to get around this and I’ll likely come back to one of them later.

Manifests

The MANIFEST.in file is necessary to tell setuptools to include the README.rst file when generating source distributions. Otherwise, only Python files will be included. Here’s what my file looks like:

include README.rst

If you wanted to include other files, such as the LICENSE, or whatever else, you could just add more include lines. Basically you just have to add all files and directories that are not already packaged due to the “packages” keyword in the setup file. It’s that setup file which we’ll look at next.

setup.py

One of the most important files here is setup.py because it handles the coordinating aspects of making your module/package available. Every package on PyPI needs to have a file called setup.py at the root of the directory. PyPI uses the metadata in this file.

Every project that you want to package in a Python context needs a setup.py file that will be executed whenever you build a distribution and — unless you install a wheel — on each installation. Unlike other ecosystems there really isn’t much of an “init” process for constructing a new package, although PEP-0517 and PEP-0518 are attempting to address that. Again, it’s kind of odd to me that Python still doesn’t have this sort of mechanism in place.

The setup.py file is the file that describes the files in your project and other meta information. One of the problems you’ll likely run into when learning all this is that you’ll see a lot of examples that show this at the top of the file:

Others, however, will start with this:

The second form is the correct one for all current Python development.
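The two openings being contrasted are, in all likelihood, the distutils import and the setuptools import. As a sketch under that assumption:

```python
# Form seen in older tutorials. distutils was removed from the
# standard library in Python 3.12, so this no longer works there:
# from distutils.core import setup

# Form to use for current development:
from setuptools import setup
```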

When looking into Python development, what you have to do is learn to skip past the fact that there was once a thing called “distribute” because it was merged into setuptools. You also have to learn to block out mentions of “easy_install” because that too was merged into setuptools and then eventually came to be replaced by pip. You also have to put out of your mind any references to “distutils2” which was also referred to as “packaging” as these were entirely abandoned. However, the forerunner to “distutils2” — called “distutils” — is still in place alongside something called “distlib” but they both operate underneath setuptools and shouldn’t worry you.

The only names in that last paragraph you really need to concern yourself with are setuptools and pip. The quoted words in that paragraph are things you will find in various examples, tutorials, and source code that can distract you from what you need to concern yourself with.

Example Setup

I’m not going to go through a tutorial of everything you can put in setup.py. For that, I do recommend How To Package Your Python Code. Here’s what my example looks like:

So the main thing to note is that we are calling the setup() function of the setuptools package.

Since I was just talking about the manifest file prior to this, please note that in order for the files listed in the manifest to be copied at install time to the package’s folder (inside site-packages, which is where Python installs everything) you must pass include_package_data=True to setup().

The version is important for obvious reasons. But you should note that PyPI and the packaging ecosystem do have some opinionated stances on the structure of the version string. See PEP-0440 if you are curious. That said, if you’re used to semantic versioning or, quite frankly, just the idea of a major and minor build, you’re pretty much good to go.
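To get a feel for what the packaging ecosystem considers a well-formed version, you can test strings against the canonical-form pattern. The sketch below follows the regular expression given in the appendix of PEP 440 as best I can reproduce it, so treat the pattern as an assumption and check it against the PEP before relying on it.

```python
import re

# Canonical version form, following the pattern in the appendix of
# PEP 440 (reproduced here as an assumption, not copied from this post).
CANONICAL_VERSION = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?$"
)


def is_canonical(version):
    """Return True if the version string is already in canonical form."""
    return CANONICAL_VERSION.match(version) is not None


for candidate in ["0.1.0", "1.0rc1", "1.0-beta", "v1.0"]:
    print(candidate, is_canonical(candidate))
```

The first two candidates are canonical and the last two are not. Tools may still accept strings like the last two by normalizing them, but sticking to the canonical form avoids surprises.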

The classifiers field has a debatable amount of usefulness to it. However, if you are going to have it you must pick from the approved list of classifiers. If you don’t do this, and use unknown classifiers, PyPI will refuse your package.

Notice how the “long_description” makes a call to a readme() function I defined. This is a fairly common practice in the Python world, where the readme file serves for what is called the “long description” which simply prints out more information about your package. It seems useful to reuse the readme in many cases but, obviously, that wouldn’t apply if your readme became incredibly long and involved.
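Putting the fields discussed above together, a setup.py can be sketched as follows. Every metadata value here (version number, description, URL, author details, license, and the particular classifiers) is a placeholder chosen for illustration, and zip_safe is an optional extra; only the package name, the readme() helper, and include_package_data=True come from the discussion above.

```python
# setup.py: a sketch; every metadata value below is a placeholder.
import io

from setuptools import setup


def readme():
    """Return the README contents for use as the long description."""
    with io.open("README.rst", encoding="utf-8") as f:
        return f.read()


setup(
    name="proverb",
    version="0.1.0",
    description="Prints a proverb.",
    long_description=readme(),
    classifiers=[
        # Each entry must come from the approved list of classifiers.
        "Development Status :: 3 - Alpha",
        "Programming Language :: Python :: 2.7",
        "Programming Language :: Python :: 3",
        "Topic :: Software Development :: Testing",
    ],
    url="https://example.com/proverb",
    author="Your Name",
    author_email="you@example.com",
    license="MIT",
    packages=["proverb"],
    # Copy the files named in MANIFEST.in into the installed package.
    include_package_data=True,
    zip_safe=False,
)
```

The file is not meant to be run on its own; pip and the build commands execute it for you from the project root, where README.rst lives.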

Module File: __init__.py

The __init__.py files are required to make Python treat any directory that the file is in as one that contains packages. This is a very different concept if you are coming from other languages, so let me talk about this a little.

You will hear that Python uses these files to prevent directories with a common and possibly already-used name, such as ‘string’, from unintentionally hiding valid modules that occur deeper on the module search path.

The rationale here is that Python searches a list of directories to resolve names. This most notably occurs when import statements are handled. Because modules can be any directory, and arbitrary ones can be added by the end user, Python has to be concerned with directories that happen to share a name with a valid Python module, such as ‘string’. To alleviate this, Python ignores directories which do not contain a file named __init__.py.

It’s important to note that Python 3.3 and up has the concept of Implicit Namespace Packages. These allow you to create packages without an __init__.py file. If you want your code to also work in Python 2, however, you should keep these files in place. All of your Python 2 packages that do have __init__.py files will still work in terms of imports in Python 3.
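You can see the difference between the two kinds of package for yourself on Python 3. This sketch builds two throwaway directories, only one of which has an __init__.py, and imports both; the directory names are hypothetical and chosen only for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package names, chosen only for this demonstration.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "demo_regular_pkg"))
os.makedirs(os.path.join(root, "demo_namespace_pkg"))

# Only the first directory gets an __init__.py.
with open(os.path.join(root, "demo_regular_pkg", "__init__.py"), "w") as f:
    f.write("")

sys.path.insert(0, root)
importlib.invalidate_caches()

regular = importlib.import_module("demo_regular_pkg")
namespace = importlib.import_module("demo_namespace_pkg")

print(regular.__file__)                      # path ending in __init__.py
print(getattr(namespace, "__file__", None))  # None: no __init__.py backs it
```

On Python 2 the second import fails with an ImportError, which is why keeping the __init__.py files matters if you want to support both.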

To keep things simple for this initial post, put the following in this file:
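A minimal sketch of such a file follows. It assumes only what the interpreter session later in this post relies on, namely a saying() function that returns text; the proverb itself is a placeholder.

```python
# proverb/__init__.py


def saying():
    """Return the proverb as a string (placeholder text)."""
    return "A stitch in time saves nine."
```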

Install It!

Now we can install the package locally (for use on our system), with:

pip install .

This is very similar to rake install for a Ruby gem or mvn clean install for a Java package.

If you check pip list you should see “proverb” listed among whatever else you have installed. You can also try pip show proverb, which will give you information about your package.

Run It!

Now start up your Python interpreter and do the following:

>>> import proverb
>>> print(proverb.saying())

Bask in Your Pythonic Glory

How exciting is our life, huh?! We got us a working package. The next step would be to deploy this to PyPI. That’s where a lot of tutorials would now take you and, admittedly, that would give a sense of accomplishment beyond what we’ve done here.

But I want to build out this example a little bit more before we take that route. You can continue on in part 2 of this series.

About Jeff Nyman

Anything I put here is an approximation of the truth. You’re getting a particular view of myself … and it’s the view I’m choosing to present to you. If you’ve never met me before in person, please realize I’m not the same in person as I am in writing. That’s because I can only put part of myself down into words.

If you have met me before in person then I’d ask you to consider that the view you’ve formed that way and the view you come to by reading what I say here may, in fact, both be true. I’d advise that you not automatically discard either viewpoint when they conflict or accept either as truth when they agree.
