Writing Python command-line tools with cliff

As part of the Keepsafe Europe project at EDINA, we needed to build a tool to help us understand and manipulate files sent to us by publishers. We didn’t know much about the data formats we’d get (other than, as always, they would be completely inconsistent), but we knew that we would have to manipulate them in various ways, and bring them into a roughly standard format to interact with an API we’d built. We also knew we would have to be able to script this so that we could run it in the future with different data, and enable others outside the university to do this sort of manipulation for themselves.

As a first step, we wanted to build a command-line application which would let us manipulate CSV-like data and submit requests to our API. Luckily, using Python for this turned out to be pretty simple, thanks to a framework called cliff. I couldn’t find much about cliff online, so I thought I’d write up a quick intro/advert for it here.

Python’s built-in argparse module already makes it straightforward to build command-line scripts that take in arguments and handle them sensibly. What cliff does is build on top of that, allowing you to make an entire application with: subcommands (think git add, git commit, git ...); an interactive shell; base classes for common types of commands such as listing data; and output formatted in a variety of fashions.
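To give a rough sense of what cliff is saving you from, here is a minimal sketch of a single subcommand written with plain argparse. The myapp and select names mirror the example later in this post; the wiring is only an illustration of argparse subparsers, not a description of how cliff works internally.

```python
import argparse

# Top-level parser for the application.
parser = argparse.ArgumentParser(prog='myapp')

# Each subcommand gets its own parser; dest records which one was chosen.
subparsers = parser.add_subparsers(dest='command', required=True)
select = subparsers.add_parser('select', help='display details of one journal')
select.add_argument('title', help='Title of journal to show')

# Parse an explicit argument list instead of sys.argv, to keep the sketch self-contained.
args = parser.parse_args(['select', 'Journal of Software'])
print(args.command, args.title)
```

With plain argparse you would still have to write the dispatching, the output formatting and the interactive shell yourself.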

I’m going to work through a very basic, stripped-down version of our app, to help you get a feel for what cliff can do.

First, set up an environment and install cliff in the standard fashion:

$ virtualenv -p /usr/bin/python3 env
$ source env/bin/activate
$ pip install cliff

Then, create a module, which I’ll just call app.py:

import os
import sys

from cliff.app import App
from cliff.command import Command
from cliff.commandmanager import CommandManager
from cliff.show import ShowOne

data = [
    [ 'Journal', 'Publisher', 'Print ISSN', 'Online ISSN' ],
    [ 'Journal of Software', 'Computer Publishings', '0000-0000', '0000-0001' ],
    [ 'Journal of Hardware', 'Computer Publishings', '1111-0000', '1111-0001' ],
    [ 'Software Development Monthly', 'Megacorp', '2222-0000', '2222-0001' ],
    [ 'Hardware Letters', 'XIT University Press', '3333-0000', '3333-0001' ],
]

class MyApp(App):

    def __init__(self):
        super().__init__(
            description='Does some awesome stuff',
            version='0.1',
            command_manager=CommandManager('myapp'),
            deferred_help=True,
        )

    def initialize_app(self, argv):
        commands = [ Select, ]
        for command in commands:
            self.command_manager.add_command(command.__name__.lower(), command)

class Select(ShowOne):
    'display details of journal with given title'

    def get_parser(self, prog_name):
        parser = super().get_parser(prog_name)
        parser.add_argument('title', help='Title of journal to show')
        return parser

    def take_action(self, parsed_args):
        headers = data[0]
        for d in data[1:]:
            if d[0] == parsed_args.title:
                return (headers, d)
        return (None, None)

def main(argv=sys.argv[1:]):
    myapp = MyApp()
    return myapp.run(argv)

if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))

You’ll also need to add a file called setup.py:

#!/usr/bin/env python

from setuptools import setup

setup(
    name='myapp',
    version='0.1',
    description='Does awesome stuff',
    author='Steven Carlysle-Davies',
    author_email='steven.carlysle-davies@ed.ac.uk',
    # app.py is a single top-level module, so list it explicitly
    py_modules=['app'],
    entry_points={
        'console_scripts': [
            'myapp = app:main'
        ],
    },
)

Then, you just need to install your application into your environment:

$ pip install -e .

That’s it, you now have a full (if somewhat minimal) application ready to interact with:

$ myapp select "Journal of Software"
+-------------+----------------------+
| Field       | Value                |
+-------------+----------------------+
| Journal     | Journal of Software  |
| Publisher   | Computer Publishings |
| Print ISSN  | 0000-0000            |
| Online ISSN | 0000-0001            |
+-------------+----------------------+

You can also get the data in a different format:

$ myapp select -f json "Hardware Letters"
{
 "Online ISSN": "3333-0001",
 "Publisher": "XIT University Press",
 "Journal": "Hardware Letters",
 "Print ISSN": "3333-0000"
}

And use the interactive application:

$ myapp
(myapp) help select
usage: select [-h] [-f {json,shell,table,value,yaml}] [-c COLUMN]
 [--prefix PREFIX] [--max-width <integer>] [--print-empty]
 [--noindent]
 title

display details of journal with given title

positional arguments:
 title Title of journal to show

optional arguments:
 -h, --help show this help message and exit

...
(myapp) select -f value "Journal of Hardware"
Journal of Hardware
Computer Publishings
1111-0000
1111-0001
(myapp) quit
$

As you can see, a fairly powerful range of functionality for not much code. Let’s take a closer look at what’s going on.

First, we override App’s default constructor, passing some parameters that tell cliff what our application is and what it does; these are used in the generated help.

    def __init__(self):
        super().__init__(
            description='Does some awesome stuff',
            version='0.1',
            command_manager=CommandManager('myapp'),
            deferred_help=True,
        )

Next, we need to tell cliff what commands are available.

    def initialize_app(self, argv):
        commands = [ Select, ]
        for command in commands:
            self.command_manager.add_command(command.__name__.lower(), command)

The recommended way of doing this is actually through specifying entry-points in your setup.py. That would complicate this example, and in practice we’ve actually found that it’s more readable to specify them directly in your application, instead of having to edit a list in two places.
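For comparison, here’s a rough sketch of what the entry-point approach might look like. It assumes that cliff’s CommandManager loads commands from the entry-point group whose name matches the namespace you give it ('myapp' here), so the dictionary below is what you’d pass as entry_points to setup(). Treat the exact layout as an assumption and check the cliff documentation before relying on it.

```python
# Sketch of the entry_points mapping for setup() in setup.py.
# Assumption: the 'myapp' group name must match CommandManager('myapp').
entry_points = {
    'console_scripts': [
        'myapp = app:main',
    ],
    'myapp': [
        # command name = module:class
        'select = app:Select',
    ],
}
```

With this in place, initialize_app would no longer need to call add_command, but every new command means another line in setup.py.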

Next we define the command class. We’re inheriting from a built-in cliff command called ShowOne, which is useful for any commands where you want to show the details of one ‘thing’, e.g. a row, a file etc. The docstring is again used in the generated help text.

class Select(ShowOne):
    'display details of journal with given title'

Then, we specify what arguments our command takes. This uses the built-in argparse module, but cliff has already added some common options to it in the parent class. All we’re adding is that we take in one required argument, called ‘title’.

    def get_parser(self, prog_name):
        parser = super().get_parser(prog_name)
        parser.add_argument('title', help='Title of journal to show')
        return parser

Finally, we need a subroutine that actually does all the work once the command is called. This gets given the arguments that have been parsed according to the parser defined above, and ShowOne expects the subroutine to return a 2-tuple of (headers, values). In our case, we’re just looping through the data to find the matching row (and not handling errors very gracefully!).

    def take_action(self, parsed_args):
        headers = data[0]
        for d in data[1:]:
            if d[0] == parsed_args.title:
                return (headers, d)
        return (None, None)

In our actual application, this subroutine is usually just a line or two long, taking in the parsed_args variable and extracting the values to pass to another subroutine elsewhere in our application. That allows us to test the logic of the code more easily.
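As a sketch of that split, the lookup could live in a plain function like the one below. The find_journal name and the LookupError are illustrative assumptions rather than code from the real application; take_action would then reduce to return find_journal(data, parsed_args.title).

```python
def find_journal(rows, title):
    """Return (headers, row) for the first row whose first column equals title.

    rows is a list of lists whose first entry is the header row, like the
    data list used throughout this post.
    """
    headers = rows[0]
    for row in rows[1:]:
        if row[0] == title:
            return headers, row
    # Hypothetical error handling: fail loudly instead of returning (None, None).
    raise LookupError('No journal titled {!r}'.format(title))


# Exercising the logic needs no cliff machinery at all.
sample = [
    ['Journal', 'Publisher'],
    ['Journal of Software', 'Computer Publishings'],
]
headers, row = find_journal(sample, 'Journal of Software')
print(headers, row)
```

Because the function takes the rows as an argument, a test can hand it a tiny table instead of the real data.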

Last, we need a bit of boilerplate that’ll be in almost all your cliff applications; this is what actually instantiates your application and passes through the parameters from the command line.

def main(argv=sys.argv[1:]):
    myapp = MyApp()
    return myapp.run(argv)

if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))

I’m going to gloss over setup.py, but it’s the standard setuptools build script, and there’s more information about how cliff interacts with it in the cliff documentation.

As you can see, quite a lot of the above code is just needed to initialise the application, and only a small part is actually needed to define the select command. Let’s add another command, one that displays multiple rows of the data:

from cliff.lister import Lister
...
commands = [ Filter, Select, ]
...
class Filter(Lister):
    'display selected columns of the data'

    def get_parser(self, prog_name):
        parser = super().get_parser(prog_name)
        parser.add_argument('--index', action='store_true', help='Use column index numbers instead of names')
        parser.add_argument('column', nargs='+', help='Selected columns to display')
        return parser

    def take_action(self, parsed_args):
        if parsed_args.index:
            columns = [int(c) for c in parsed_args.column]
        else:
            columns = [data[0].index(c) for c in parsed_args.column]

        selected = [ [d[c] for c in columns] for d in data ]
        return (selected[0], selected[1:])

This command uses the other major built-in cliff command, Lister. This is similar to ShowOne, but used where you want to display a list of things to the user. This time, we’re letting the user give us a list of columns to display. If they specify --index, they can give us column indices instead of headers.

$ myapp filter "Journal" "Publisher"
+------------------------------+----------------------+
| Journal                      | Publisher            |
+------------------------------+----------------------+
| Journal of Software          | Computer Publishings |
| Journal of Hardware          | Computer Publishings |
| Software Development Monthly | Megacorp             |
| Hardware Letters             | XIT University Press |
+------------------------------+----------------------+

$ myapp filter --format json --index 1 2
[
 {
 "Publisher": "Computer Publishings",
 "Print ISSN": "0000-0000"
 },
 {
 "Publisher": "Computer Publishings",
 "Print ISSN": "1111-0000"
 },
 {
 "Publisher": "Megacorp",
 "Print ISSN": "2222-0000"
 },
 {
 "Publisher": "XIT University Press",
 "Print ISSN": "3333-0000"
 }
]

$ myapp help filter
usage: myapp filter [-h] [-f {csv,json,table,value,yaml}] [-c COLUMN]
 [--noindent] [--quote {all,minimal,none,nonnumeric}]
 [--max-width <integer>] [--print-empty] [--index]
 column [column ...]

display selected columns of the data

positional arguments:
 column Selected columns to display

optional arguments:
 -h, --help show this help message and exit
 --index Use column index numbers instead of names
...

There’s a lot more to cliff, but hopefully this has given you a taste for what it can do. We’ve found it trivially easy to implement commands, and let cliff handle all the complications of parsing arguments and formatting the output appropriately. It lets us easily build scripts that manipulate data (and let others who don’t know Python build the same scripts). For anyone looking to build quick and powerful command-line applications, I’d highly recommend it.

 

 

Python: How you import impacts how you mock

Recently, I had a problem with monkeypatching a function that calls an external service. In a nutshell, monkeypatch is a pytest fixture that allows you to replace an object or function with your mock version. Try as I might, I could not get it to work. This blog post is something I wish I had read when trying to get my mocks to work the first time.

This is the module structure, with a get_user function that connects to LDAP. We want to mock get_user in our tests to avoid connecting to LDAP and to avoid using real users in tests.

base.py

def get_user(uun):
    # LDAP lookup omitted here; it produces ca_user
    return ca_user

SCENARIO 1: importing get_user function in the module where it’s required

forms.py

from central_auth.base import get_user

def clean_user(uun):
   ca_user = get_user(uun)

Trying to monkeypatch it as below will not have the desired effect. The real get_user still gets called.

tests.py

from central_auth import base

def test_form_POST_OK(monkeypatch):
    monkeypatch.setattr(base, 'get_user', get_mock_user_function)

    get_view_with_form() # where get_user is called

# DOES NOT WORK

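This is because when we import by name, i.e. import an object or function rather than its module, the importing module gets its own name bound to that object. What exists as central_auth.base.get_user is therefore also referred to as forms.get_user within forms.py. Monkeypatching base.get_user rebinds only the name in base; forms.get_user still points at the real function.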
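The effect can be reproduced without any real packages. In the sketch below, base and forms are hypothetical in-memory stand-ins built with types.ModuleType, and the plain assignment stands in for what monkeypatch.setattr does to the base module.

```python
import types

# Stand-in for central_auth.base (hypothetical, built in memory for the demo).
base = types.ModuleType('base')
base.get_user = lambda uun: 'real user'

# Stand-in for forms.py after "from central_auth.base import get_user":
# the same function object is bound to a second name inside forms.
forms = types.ModuleType('forms')
forms.get_user = base.get_user
forms.clean_user = lambda uun: forms.get_user(uun)

# Patching the name in base rebinds base.get_user only.
base.get_user = lambda uun: 'mock user'

print(base.get_user('abc'))     # goes through base, so this is the mock
print(forms.clean_user('abc'))  # forms still holds the real function
```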

To make the monkeypatch work, the base module should be imported like so:

SCENARIO 2: module import

forms.py

from central_auth import base

def clean_user(uun):
   ca_user = base.get_user(uun)

Then central_auth.base and forms.base both refer to the same base module object, so forms.py looks up get_user on that module at call time and sees the patched version.
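Here is the same kind of sketch for the module import, again with hypothetical in-memory stand-ins for base and forms. Plain setattr stands in for monkeypatch.setattr, which performs the same rebinding and additionally undoes it when the test finishes.

```python
import types

# Stand-in for central_auth.base (hypothetical, built in memory for the demo).
base = types.ModuleType('base')
base.get_user = lambda uun: 'real user'

# Stand-in for forms.py after "from central_auth import base":
# forms holds a reference to the module, not to the function.
forms = types.ModuleType('forms')
forms.base = base
forms.clean_user = lambda uun: forms.base.get_user(uun)

# Rebind the attribute on the shared module object.
setattr(base, 'get_user', lambda uun: 'mock user')

print(forms.clean_user('abc'))  # the lookup goes through base, so the mock is used
```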

Alternatively, we can use unittest.mock.patch to the same effect which also allows a greater level of granularity:

SCENARIO 1:

forms.py

from central_auth.base import get_user

def clean_user(uun):
   ca_user = get_user(uun)

tests.py

from unittest import mock

@mock.patch('forms.get_user', side_effect=get_mock_user_function)
def test_form_OK(form_get_user):
    get_view_with_form() # where get_user is called

SCENARIO 2:

forms.py

from central_auth import base

def clean_user(uun):
   ca_user = base.get_user(uun)

tests.py

from unittest import mock

@mock.patch('forms.base.get_user', side_effect=get_mock_user_function)
def test_form_OK(forms_get_user):
    get_view_with_form() # where get_user is called
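To see the decorator form run end to end, here is a self-contained sketch. forms_demo is a hypothetical stand-in module registered in sys.modules so that mock.patch can resolve it by name, and get_mock_user_function is an assumed implementation, since the original is not shown in this post.

```python
import sys
import types
from unittest import mock

# Hypothetical in-memory stand-in for forms.py (Scenario 1 style import).
forms_demo = types.ModuleType('forms_demo')
forms_demo.get_user = lambda uun: 'real user'
forms_demo.clean_user = lambda uun: forms_demo.get_user(uun)
sys.modules['forms_demo'] = forms_demo  # lets mock.patch find it by name


def get_mock_user_function(uun):
    # Assumed mock: returns a recognisable fake value.
    return 'mock user for ' + uun


@mock.patch('forms_demo.get_user', side_effect=get_mock_user_function)
def check_clean_user(forms_get_user):
    result = forms_demo.clean_user('abc')
    # The injected mock records how it was called.
    forms_get_user.assert_called_once_with('abc')
    return result


result = check_clean_user()
print(result)
```

Once the decorated function returns, mock.patch restores the original forms_demo.get_user.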

The only disadvantage of using mock.patch shows up when get_user is imported by name (as in Scenario 1) into several modules used by the code under test. Each module then holds its own reference, so you need to patch every one of them specifically, like so:

forms.py

from central_auth.base import get_user

def clean_user(uun):
    ca_user = get_user(uun)

views.py

from central_auth.base import get_user

def view_user(request, uun):
    ca_user = get_user(uun)

tests.py

@mock.patch('forms.get_user', side_effect=get_mock_user_function)
@mock.patch('views.get_user', side_effect=get_mock_user_function)
def test_form_OK(views_get_user, forms_get_user):
    get_view_with_form()

With module imports (Scenario 2), forms.base.get_user and views.base.get_user are the same attribute on the same base module, so a single patch covers both.

 

Since I am a newbie to Python and pytest, give me a shout if I got something terribly wrong. For now, my tests work 🙂

Cheers

Accelerating End User Testing

Do you develop software?

I suspect if you are looking at blogs in this space then that’s probably true, or at least it might be an area of interest for you.

Do you carry out End User Testing?

Well again, if you develop software then in most cases you are likely to be doing so with users in mind.

End User Testing can be a pretty difficult thing to really get to grips with. In Applications Division we have adopted a technology called TestRail to help us, and frankly, it’s really doing an impressive job.

http://www.gurock.com/testrail/

TestRail allows us to do things that have previously been terrifically hard and complex to coordinate. It allows us to understand how User Testing is going and to rapidly get defects straight back into development as they surface in our testing processes. Tracking progress is done through a very intuitive GUI, and when a tester finds a defect they record it and a JIRA issue is created immediately. No messing about, straight back to development. This is having a really positive impact on our productivity.

Often with big systems there can be quite literally hundreds of workflows and user test cases that need to be validated by teams of End User Testers. Keeping track of progress, or actually more importantly lack of progress, is a project manager’s nightmare. Traditionally, teams have used things like spreadsheets, email and word of mouth to know how far testers are getting on with their test scenarios and plans. Often, people unfortunately become distracted or have their plans interrupted or perhaps they might be unexpectedly absent from work. Knowing someone has not managed to complete a set of tests is crucial in making sure that things are not disappearing down rabbit holes and so that projects can complete in time.

Getting issues straight back to development allows us to start working on the problem or defect straight away; we don’t need to wait until the test run has completed. Getting the tester to create the JIRA issue when the defect is found really speeds things up.

Surfacing a lack of progress has often been a really difficult thing to do, but TestRail really helps to address this. It is easy to interpret and allows the project to adapt to the current situation in a way that would not really have been possible previously.

We have introduced TestRail as part of our Digital Transformation programme.

You can find out more here:

http://www.projects.ed.ac.uk/project/dti002

Currently we are using TestRail in Human Resources and Finance. We are extending its use to link into both agile and waterfall projects and expect to adopt this across the entire range of projects we undertake in Apps.

If you fancy finding out more about how we are getting on with this, please do get in touch.

Iain

Front-End Development Community Lightning Talks

Last Thursday the Front-End Development Community hosted our first Lightning Talks event. Eight speakers had five minutes each to introduce us to who they are, the work they do and the tools they work with.

We had contributors and attendees from various parts of the University including the Business School, Information Services and the Edinburgh Clinical Research Facility. Despite the wide range of different teams and systems represented, we found that we have a lot in common. Lots of University staff are using or want to use Git for version control, many are using Handlebars to display content through JavaScript, and quite a few of us are thinking about service oriented architecture.

It was great to hear from so many different voices, and to start piecing together some of the common themes that run through the University. We hope to be able to follow this up with talks diving into a bit more detail, and workshops to see how we’re putting some of our common tools into use.


UCISA17 and disruptive technologies

I recently attended the UCISA annual conference and exhibition.


UCISA (Universities and Colleges Information Systems Association) runs an annual conference, which is a great chance to meet with peers working in the sector, hear about how others have addressed challenges, and develop ideas on how we can overcome obstacles that are common to our community.

You can find out a bit more about UCISA here:

https://www.ucisa.ac.uk/

I am actually the vice chair for the infrastructure group, which specialises in looking at things like cloud computing, IT security, virtualisation and many other areas. If you want to know more about that, please check it out here:

https://www.ucisa.ac.uk/groups/ig

This year’s conference started with a bang: we had a fantastic presentation from Stefan Hyttfors.

Stefan is a freelance speaker who lectures on how innovation, disruptive technologies and behavioural change affect both the world of business and, of course, social change. Stefan presented a fantastic example of how new games like Pokémon Go have grabbed the attention of huge numbers of people and altered their behaviour. He showed a great example of hordes of people frantically chasing a virtual Pokémon in fields; from a non-participant’s point of view it looks simply incredible.

However, his lecture really discusses far more interesting questions about what we actually regard as success. He sees community and people as the vital component at the heart of success.

Stefan goes on to open our eyes to the fact that, for the first time in our history, we are truly connected, not in a hierarchy but in peer-to-peer collaboration. It is here that things really start to resonate for me when we think about the objectives of the software development community of practice.

I highly recommend taking a look at Stefan’s blog and his videos. This is really a person interested in creating a better future.


Hear him talking about the future here; you might like it:

 

 

 

Our University’s front-end development community

Last fall, I sent around a survey to members of the University in advance of setting up our new Front-End Development Community. The survey asked for information about how people might want the community to work, and also about what they were doing in their jobs. In this post, I’ll share some of the results along with my thoughts.

The survey was sent out to various groups that either do or engage with front-end development. Not all of these groups were technical – for example, I also included some of the user groups for our University CMS platform, EdWeb. This could include everything from content authors to administrators. Overall, only 66 people responded to the survey, so we are only seeing a small slice of the University as a whole. However, it still gives an interesting picture of some of the work which people are doing. I was surprised to see the diversity among some of the responses.


Improving curriculum data quality with better tools

Every student in the University of Edinburgh is enrolled on a degree programme and has some sort of “degree programme table” (DPT): a set of rules which guide the individual courses they’ll take during their studies.

For some programmes, the DPT is just a selection of the courses you must take each year. Others add choices for students (“select French 1A or Arabic 1A”) or let them select from a wide range of courses (“select any level 10 courses in the Moray House School of Education”). These rules are joined together with simple and/or logic.

On Wednesday we released a new version of our DPT editor. Those familiar with the old editor will be pleased to see a big UI update, plus features to natively support core courses and unstructured degrees. More exciting to us though are changes to improve the quality of future DPTs.


Harry Roberts: Refactoring CSS Without Losing Your Mind

Last Thursday we were lucky enough to be able to welcome Harry Roberts from CSS Wizardry (https://csswizardry.com/) to give a talk on “Refactoring CSS Without Losing Your Mind”. Many thanks to Harry for taking the time out to speak to us on this subject. The talk was organised and funded by the University’s Front-End Development Community, a new subset of the Software Development Community. If you aren’t part of the community yet, check out our community channel on Slack.

Slides and a video of the talk (for those who missed it) can be found here:

Overall the event was a huge success, with almost 100 people attending! Since this is one of our first community events, I thought people might be interested to learn a little bit more about who attended. The numbers come from the Events Booking application so won’t be exact – some people may have attended without booking, and others may have booked but not attended.
