Writing Python command-line tools with cliff

As part of the Keepsafe Europe project at EDINA, we needed to build a tool to help us understand and manipulate files sent to us by publishers. We didn’t know much about the data formats we’d get (other than, as always, they would be completely inconsistent), but we knew that we would have to manipulate them in various ways, and bring them into a roughly standard format to interact with an API we’d built. We also knew we would have to be able to script this so that we could run it in the future with different data, and enable others outside the university to do this sort of manipulation for themselves.

As a first step, we wanted to build a command-line application which would let us manipulate CSV-like data and submit requests to our API. Luckily, using Python for this turned out to be pretty simple, thanks to a framework called cliff. I couldn’t find much about cliff online, so I thought I’d write up a quick intro/advert for it here.

Python’s built-in argparse module already makes it straightforward to build command-line scripts that take in arguments and handle them sensibly. What cliff does is build on top of that, allowing you to make an entire application with: subcommands (think git add, git commit, git ...); an interactive shell; base classes for common types of commands such as listing data; and output formatted in a variety of fashions.
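For comparison, here is a minimal sketch of wiring up a single subcommand with argparse alone. The program name and the select subcommand mirror the example later in this post; build_parser is just a name I’ve made up for illustration. Note that this gives you argument parsing only: the interactive shell and the output formatters are what cliff adds on top.

```python
import argparse


def build_parser():
    # Top-level parser for a hypothetical 'myapp' program.
    parser = argparse.ArgumentParser(
        prog='myapp', description='Does some awesome stuff')
    # Each subcommand gets its own sub-parser, wired up by hand.
    subparsers = parser.add_subparsers(dest='command', required=True)

    select = subparsers.add_parser(
        'select', help='display details of journal with given title')
    select.add_argument('title', help='Title of journal to show')
    return parser


# Parse an explicit argument list instead of sys.argv so this runs anywhere.
args = build_parser().parse_args(['select', 'Journal of Software'])
print(args.command, '->', args.title)
```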

I’m going to work through a very basic, stripped-down version of our app, to help you get a feel for what cliff can do.

First, set up an environment and install cliff in the standard fashion:

$ virtualenv -p /usr/bin/python3 env
$ source env/bin/activate
$ pip install cliff

Then, create a module, which I’ll just call app.py:

import os
import sys

from cliff.app import App
from cliff.command import Command
from cliff.commandmanager import CommandManager
from cliff.show import ShowOne

data = [
    [ 'Journal', 'Publisher', 'Print ISSN', 'Online ISSN' ],
    [ 'Journal of Software', 'Computer Publishings', '0000-0000', '0000-0001' ],
    [ 'Journal of Hardware', 'Computer Publishings', '1111-0000', '1111-0001' ],
    [ 'Software Development Monthly', 'Megacorp', '2222-0000', '2222-0001' ],
    [ 'Hardware Letters', 'XIT University Press', '3333-0000', '3333-0001' ],
]

class MyApp(App):

    def __init__(self):
        super().__init__(
            description='Does some awesome stuff',
            version='0.1',
            command_manager=CommandManager('myapp'),
            deferred_help=True,
        )

    def initialize_app(self, argv):
        commands = [ Select, ]
        for command in commands:
            self.command_manager.add_command(command.__name__.lower(), command)

class Select(ShowOne):
    'display details of journal with given title'

    def get_parser(self, prog_name):
        parser = super().get_parser(prog_name)
        parser.add_argument('title', help='Title of journal to show')
        return parser

    def take_action(self, parsed_args):
        headers = data[0]
        for d in data[1:]:
            if d[0] == parsed_args.title:
                return (headers, d)
        return (None, None)

def main(argv=sys.argv[1:]):
    myapp = MyApp()
    return myapp.run(argv)

if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))

You’ll also need to add a file called setup.py alongside it:

#!/usr/bin/env python

from setuptools import setup, find_packages

setup(
    name='myapp',
    version='0.1',
    description='Does awesome stuff',
    author='Steven Carlysle-Davies',
    author_email='steven.carlysle-davies@ed.ac.uk',
    entry_points={
        'console_scripts': [
            'myapp = app:main'
        ],
    },
)

Then, you just need to install your application into your environment:

$ pip install -e .

That’s it, you now have a full (if somewhat minimal) application ready to interact with:

$ myapp select "Journal of Software"
+-------------+----------------------+
| Field       | Value                |
+-------------+----------------------+
| Journal     | Journal of Software  |
| Publisher   | Computer Publishings |
| Print ISSN  | 0000-0000            |
| Online ISSN | 0000-0001            |
+-------------+----------------------+

You can also get the data in a different format:

$ myapp select -f json "Hardware Letters"
{
 "Online ISSN": "3333-0001",
 "Publisher": "XIT University Press",
 "Journal": "Hardware Letters",
 "Print ISSN": "3333-0000"
}
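The JSON format is handy when you want to drive the tool from another script. The following is only a sketch: it assumes the myapp console script is installed and on your PATH, and the function names select_journal and parse_select_output are made up for illustration. The parsing step is run here against the sample output shown above, so nothing is actually executed via subprocess unless you call select_journal yourself.

```python
import json
import subprocess


def parse_select_output(text):
    # The json formatter emits one object mapping field names to values.
    return json.loads(text)


def select_journal(title):
    # Assumption: 'myapp' is installed in the active environment.
    result = subprocess.run(
        ['myapp', 'select', '-f', 'json', title],
        capture_output=True, text=True, check=True,
    )
    return parse_select_output(result.stdout)


# Sample output copied from the command shown above.
sample = '''{
 "Online ISSN": "3333-0001",
 "Publisher": "XIT University Press",
 "Journal": "Hardware Letters",
 "Print ISSN": "3333-0000"
}'''
record = parse_select_output(sample)
print(record['Journal'], record['Print ISSN'])
```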

And use the interactive application:

$ myapp
(myapp) help select
usage: select [-h] [-f {json,shell,table,value,yaml}] [-c COLUMN]
 [--prefix PREFIX] [--max-width <integer>] [--print-empty]
 [--noindent]
 title

display details of journal with given title

positional arguments:
 title Title of journal to show

optional arguments:
 -h, --help show this help message and exit

...
(myapp) select -f value "Journal of Hardware"
Journal of Hardware
Computer Publishings
1111-0000
1111-0001
(myapp) quit
$

As you can see, a fairly powerful range of functionality for not much code. Let’s take a closer look at what’s going on.

First, we override App’s constructor, passing in parameters that tell cliff what our application is and what it does; these are used in the generated help.

    def __init__(self):
        super().__init__(
            description='Does some awesome stuff',
            version='0.1',
            command_manager=CommandManager('myapp'),
            deferred_help=True,
        )

Next, we need to tell cliff what commands are available.

    def initialize_app(self, argv):
        commands = [ Select, ]
        for command in commands:
            self.command_manager.add_command(command.__name__.lower(), command)

The recommended way of doing this is through entry points specified in your setup.py. That would complicate this example, and in practice we’ve found it more readable to register the commands directly in the application, instead of having to edit a list in two places.
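For reference, the entry-point approach would look roughly like the sketch below. It shows only the entry_points argument to setup(), written as a plain dictionary; the assumption is that the group name must match the namespace passed to CommandManager('myapp'), and that each entry maps a command name to a module:class path. You would need to reinstall the package after changing it.

```python
# Sketch of the entry_points argument to setup() when commands are
# registered through setuptools rather than in initialize_app().
entry_points = {
    'console_scripts': [
        'myapp = app:main',
    ],
    # Group name assumed to match the CommandManager('myapp') namespace.
    'myapp': [
        'select = app:Select',
    ],
}

# Show the command registrations that the command manager would look up.
for registration in entry_points['myapp']:
    print(registration)
```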

Next we define the command class. We’re inheriting from a built-in cliff command called ShowOne, which is useful for any commands where you want to show the details of one ‘thing’, e.g. a row, a file etc. The docstring is again used in the generated help text.

class Select(ShowOne):
    'display details of journal with given title'

Then, we specify what arguments our command takes. This uses the built-in argparse module, but cliff has already added some common options to it in the parent class. All we’re adding is that we take in one required argument, called ‘title’.

    def get_parser(self, prog_name):
        parser = super().get_parser(prog_name)
        parser.add_argument('title', help='Title of journal to show')
        return parser

Finally, we need a subroutine that actually does all the work once the command is called. This gets given the arguments that have been parsed according to the parser defined above, and ShowOne expects the subroutine to return a 2-tuple of (headers, values). In our case, we’re just looping through the data to find the matching row (and not handling errors very gracefully!).

    def take_action(self, parsed_args):
        headers = data[0]
        for d in data[1:]:
            if d[0] == parsed_args.title:
                return (headers, d)
        return (None, None)

In our actual application, this subroutine is usually just a line or two long, taking in the parsed_args variable and extracting the values to pass to another subroutine elsewhere in our application. That allows us to test the logic of the code more easily.
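As a rough sketch of that separation (not our real code), the lookup could live in a plain function that knows nothing about cliff. The name find_journal is hypothetical, and raising LookupError for a missing title is my own choice here, made so that the failure is explicit rather than a (None, None) return.

```python
def find_journal(rows, title):
    """Return (headers, row) for the journal whose title matches exactly."""
    headers, *records = rows
    for record in records:
        if record[0] == title:
            return headers, record
    # Raise instead of returning (None, None) so callers see a clear error.
    raise LookupError('No journal with title %r' % title)


# Inside the command class, take_action would then shrink to something like:
#     def take_action(self, parsed_args):
#         return find_journal(data, parsed_args.title)

rows = [
    ['Journal', 'Publisher', 'Print ISSN', 'Online ISSN'],
    ['Journal of Software', 'Computer Publishings', '0000-0000', '0000-0001'],
    ['Hardware Letters', 'XIT University Press', '3333-0000', '3333-0001'],
]
headers, row = find_journal(rows, 'Hardware Letters')
print(dict(zip(headers, row)))
```

Because find_journal takes the rows as an argument, it can be exercised in a unit test without building an App at all.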

Last, we need a bit of boilerplate that’ll be in almost all your cliff applications, but this is what actually instantiates your application and passes through the parameters from the command-line.

def main(argv=sys.argv[1:]):
    myapp = MyApp()
    return myapp.run(argv)

if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))

I’m going to gloss over setup.py, which is a standard setuptools build script; there’s more information about how cliff interacts with setuptools in the cliff documentation.

As you can see, quite a lot of the above code is just needed to initialise the application, and only a small part is actually needed to define the select command. Let’s add another command, one that displays multiple rows of the data:

from cliff.lister import Lister
...
commands = [ Filter, Select, ]
...
class Filter(Lister):
    'display selected columns of the data'

    def get_parser(self, prog_name):
        parser = super().get_parser(prog_name)
        parser.add_argument('--index', action='store_true', help='Use column index numbers instead of names')
        parser.add_argument('column', nargs='+', help='Selected columns to display')
        return parser

    def take_action(self, parsed_args):
        if parsed_args.index:
            columns = [int(c) for c in parsed_args.column]
        else:
            columns = [data[0].index(c) for c in parsed_args.column]

        selected = [ [d[c] for c in columns] for d in data ]
        return (selected[0], selected[1:])

This command uses the other major built-in cliff command, Lister. This is similar to ShowOne, but used where you want to display a list of things to the user. This time, we’re letting the user give us a list of columns to display. If they specify --index, they can give us column indices instead of headers.

$ myapp filter "Journal" "Publisher"
+------------------------------+----------------------+
| Journal                      | Publisher            |
+------------------------------+----------------------+
| Journal of Software          | Computer Publishings |
| Journal of Hardware          | Computer Publishings |
| Software Development Monthly | Megacorp             |
| Hardware Letters             | XIT University Press |
+------------------------------+----------------------+

$ myapp filter --format json --index 1 2
[
  {
    "Publisher": "Computer Publishings",
    "Print ISSN": "0000-0000"
  },
  {
    "Publisher": "Computer Publishings",
    "Print ISSN": "1111-0000"
  },
  {
    "Publisher": "Megacorp",
    "Print ISSN": "2222-0000"
  },
  {
    "Publisher": "XIT University Press",
    "Print ISSN": "3333-0000"
  }
]

$ myapp help filter
usage: myapp filter [-h] [-f {csv,json,table,value,yaml}] [-c COLUMN]
 [--noindent] [--quote {all,minimal,none,nonnumeric}]
 [--max-width <integer>] [--print-empty] [--index]
 column [column ...]

display selected columns of the data

positional arguments:
 column Selected columns to display

optional arguments:
 -h, --help show this help message and exit
 --index Use column index numbers instead of names
...
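The column-selection logic in Filter can be pulled out into a standalone function in the same way, which also gives a natural place to reject unknown column names. This is only a sketch: select_columns and its by_index parameter are names I’ve invented for illustration, and the ValueError for unknown names is an addition that the command above does not have.

```python
def select_columns(rows, columns, by_index=False):
    """Return (headers, records) restricted to the requested columns."""
    headers = rows[0]
    if by_index:
        # Command-line values arrive as strings, so convert them first.
        indices = [int(c) for c in columns]
    else:
        unknown = [c for c in columns if c not in headers]
        if unknown:
            raise ValueError('Unknown column(s): ' + ', '.join(unknown))
        indices = [headers.index(c) for c in columns]

    selected = [[row[i] for i in indices] for row in rows]
    # First row holds the headers; the rest are the records to list.
    return selected[0], selected[1:]


rows = [
    ['Journal', 'Publisher', 'Print ISSN', 'Online ISSN'],
    ['Journal of Software', 'Computer Publishings', '0000-0000', '0000-0001'],
    ['Software Development Monthly', 'Megacorp', '2222-0000', '2222-0001'],
]
by_name = select_columns(rows, ['Journal', 'Publisher'])
by_position = select_columns(rows, ['1', '2'], by_index=True)
print(by_name)
print(by_position)
```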

There’s a lot more to cliff, but hopefully this has given you a taste for what it can do. We’ve found it trivially easy to implement commands, and let cliff handle all the complications of parsing arguments and formatting the output appropriately. It lets us easily build scripts that manipulate data (and let others who don’t know Python build the same scripts). For anyone looking to build quick and powerful command-line applications, I’d highly recommend it.