IS at DrupalCon – Sessions Day 2

DrupalCon Barcelona 2015, Keynote Day 2 with Nathalie Nahai

On Tuesday and Wednesday I posted some session summaries and comments from DrupalCon 2015 in Barcelona, where a few colleagues and I are spending this week.

Day 2 of DrupalCon began with a short session celebrating those involved with Drupal, i.e. partners and contributors, and highlighting the importance of Drupal community members contributing through sprints. This was followed by the morning’s Keynote with Nathalie Nahai. Nathalie spoke about web psychology, providing a scientific perspective on how people see and react to different aspects of web content presentation. Admittedly the theme was more applicable to those Drupal users who deal with marketing aspects of websites, since it was concerned with how to get, and keep, the attention that you desire from your online audience. However, the principles apply equally well to any organisation or institution interested in engaging in the most effective way with visitors to their website.

The day continued with many more sessions across a broad range of topics.  We also took part in a couple of Birds of a Feather sessions, one of the great features of DrupalCon, allowing members of the Drupal community with a common background or interest an opportunity to discuss face-to-face the issues they deal with, sharing their knowledge and experience and exploring potential strategies to resolve those issues.

Our experiences of some of Wednesday’s DrupalCon sessions are outlined below. Thanks to Riky, Tim, Andrew, Adrian and Chris for contributing their thoughts on sessions they attended; any errors, omissions or misinterpretations in their edited notes are entirely mine. Most of the sessions mentioned below, along with many more interesting talks, are recorded and available on the DrupalCon YouTube channel.

How Changing our Estimation Process Took our Project Endgame from WTF to FTW

This session covered estimation and how to turn this critical project component from something that often leads to a project being perceived as a failure, into an accurate and more reliable part of the project process. A common issue within some organisations is that the person deciding the budget does not have the in-depth knowledge of the project deliverables required to make sensible decisions. A lot of work goes into creating an initial estimate without knowing the details of the deliverables; the “bid”, or in UoE terms the proposal estimate, needs to be made on the objectives, and it should be accepted that this is what the estimate reflects.

During the estimation process, it’s crucial to ask the right questions, to review and to explain the process outlined below (creating transparency), to define scope (get the business to say what they want), and to discuss and agree milestones.

The next part in the process is the discovery, which is best done with UX sketches, but this needs a designer! Rapid iterative design should be the approach, with sketch approval, continuing early tech planning with sketches in preference to wireframes; these sketches are not a full set of requirements but enable rough estimates to be produced with a goal of +/- 40% accuracy. This provides an early indication of feature complexity and expedites prioritisation before moving on to wireframes, and long before anything is actually built. Wireframes must be fully approved before beginning the next stage.

The next stage is full Tech Planning, which involves a larger group of people with a goal of achieving estimates with 10% accuracy, adding implementation notes. This stage comprises several 1.5 to 2 hour meetings over a couple of weeks where the deliverables are broken down into tasks and these are estimated in hours. These sessions involve lots of discussion using a kind of low level poker estimation, but do not involve the business. The project manager can then create a budget breakdown based on the estimates; the results, which are now deliverables, are then shared with the business and recommendations discussed with them.

If the estimation is over the available budget at this stage, the options are clear: descope, share work or find more budget.
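As a rough illustration of the arithmetic, the sketch below sums task estimates in hours, shows the ranges implied by the +/- 40% and 10% accuracy goals, and checks the total against a budget. The task names, hours and budget are invented for the example, and `estimate_range` and `over_budget` are hypothetical helpers rather than anything shown in the session.

```python
def estimate_range(total_hours, accuracy):
    """Return the (low, high) range implied by a +/- accuracy fraction."""
    return (total_hours * (1 - accuracy), total_hours * (1 + accuracy))


def over_budget(total_hours, budget_hours):
    """Return how many hours the estimate exceeds the budget by (0 if none)."""
    return max(0, total_hours - budget_hours)


# Hypothetical task breakdown in hours (illustrative figures only).
tasks = {"content types": 40, "search": 60, "theming": 100}
total = sum(tasks.values())

sketch_low, sketch_high = estimate_range(total, 0.40)  # discovery-stage goal
plan_low, plan_high = estimate_range(total, 0.10)      # tech-planning goal

print(f"Total estimate: {total}h")
print(f"Sketch stage (+/- 40%): {sketch_low:.0f}h to {sketch_high:.0f}h")
print(f"Tech planning (+/- 10%): {plan_low:.0f}h to {plan_high:.0f}h")
print(f"Over a 180h budget by: {over_budget(total, 180)}h")
```

Seeing the two ranges side by side makes it clearer why a budget conversation based on sketches alone should be treated as provisional.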

During the build pay close attention to:
1.    Large overspend on individual tasks;
2.    New requirements (these need prioritisation!);
3.    Weekly budget reviews and status check-ins;
4.    Demo as often as possible as this gets customers excited when they can see a concept come to life!

Finally, at project wrap-up, the project should be within 10% of the estimate. It is worth noting that this process doesn’t really work when the client provides UX, or for time and materials projects; in that case, just go Agile!

However, this approach does present two main challenges: it requires partner buy-in, and essential meetings are difficult to schedule.

My takeaways from this session are the need to review and constantly update estimates as the project moves forward, the importance of prioritisation and defining what is actually needed, and creating transparency throughout this process. On suitable projects, this could involve additional soft milestones for estimation. Another takeaway relates to the estimation process itself, to make it more accurate. To do this we need to have more detail before committing to delivery. As project managers, we need to be strong and not press ahead when there is insufficient detail; without this, there is a tendency to estimate at an abstract optimistic level.

Drupal Extreme Scaling

The old way to boost Drupal performance is to use the following technologies: memcache, APC/OPcache, Varnish, and server redundancy. We currently use all of these. We can now utilise elastic computing and containerisation to boost performance – as the presenter put it:

“this is not future technology, but present technology”

The speaker’s team was tasked to provide a minimum cost, automated, no-downtime hosting platform for 30 to 100 thousand Drupal sites. To do this, Amazon Web Services infrastructure was used to run a stack consisting of Docker containers, Nginx, MySQL and MongoDB, with Ansible for configuration management, a Node.js administrative application, Apache Mesos as an abstraction layer, and Marathon and Chronos running on Mesos to allow it to control Docker containers and scheduled tasks. The end result was a platform which could perform EC2 auto scaling and spin up Amazon Machine Images containing the three Docker containers used (one for the admin application, one for Varnish, and a Drupal container which would be used for every site). Databases were shared, with one per 500 sites, using a clever method of table prefixing to minimise their overheads.
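These notes do not record how the prefixing actually worked, so the following is only a minimal sketch of the general idea, assuming sites are numbered sequentially and each shared database holds up to 500 of them. The function names, the database naming scheme and the prefix format are all hypothetical.

```python
SITES_PER_DATABASE = 500  # figure quoted in the session


def locate_site(site_id):
    """Map a numeric site id to a (database name, table prefix) pair.

    Assumption: sites are numbered from 0 and packed into databases in
    order. The naming scheme is made up for illustration.
    """
    if site_id < 0:
        raise ValueError("site_id must be non-negative")
    database = f"drupal_shared_{site_id // SITES_PER_DATABASE}"
    prefix = f"s{site_id}_"
    return database, prefix


def table_name(site_id, table):
    """Return the prefixed table name a given site would use."""
    _, prefix = locate_site(site_id)
    return prefix + table


for site in (0, 499, 500, 1234):
    db, _ = locate_site(site)
    print(site, db, table_name(site, "node"))
```

Because every site gets its own prefix, many sites can share one database server process and connection pool without their tables colliding.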

This was a fascinating, very technical talk which I’d recommend watching to anyone with an interest in successfully solving a huge, complex infrastructural project with modern technologies. Although we only have one Drupal site to run and not tens of thousands, some very useful advice was given based on the presenter’s experiences: Nginx is very flexible and PHP-FPM increases performance significantly (as we found with our own testing); centralised logging is vital; always use authentication on REST APIs; and the combination of cloud hosting and containerisation was excellent. If the task were to be repeated today though, one would likely use the AWS EC2 Container Service instead of Mesos and Marathon. Possibly the most important thing to remember:

“Lazy DevOps is the best DevOps!”

Headless D8 in HHVM plus Angular.js and some other things you can do with

This session presented a deployment platform originally developed for PHP applications which now also supports Python and Node.js. It integrates with whatever Git repository you want, as well as HipChat, Jira and other tools, allowing multiple applications to be pushed into one build, e.g. front and back-end applications, and appear under separate hostnames. The platform handles all the DNS and Varnish config to create these pop-up environments and replicates the live database and configuration into your development area. It can also sanitise the database as it’s moved, stripping out user passwords, email addresses, etc.
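To make the sanitisation step concrete, here is a minimal sketch using an in-memory SQLite database. It assumes a simplified `users` table with `uid`, `mail` and `pass` columns; the table layout, the `sanitise_users` helper and the placeholder address format are assumptions for illustration, not the platform's actual implementation.

```python
import sqlite3


def sanitise_users(conn):
    """Overwrite e-mail addresses and password hashes in place."""
    conn.execute(
        "UPDATE users SET mail = 'user' || uid || '@example.invalid', pass = ''"
    )
    conn.commit()


# Build a throwaway copy of "live" data to sanitise.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, mail TEXT, pass TEXT)")
conn.executemany(
    "INSERT INTO users (uid, mail, pass) VALUES (?, ?, ?)",
    [(1, "alice@example.com", "hash-a"), (2, "bob@example.com", "hash-b")],
)

sanitise_users(conn)
rows = conn.execute("SELECT uid, mail, pass FROM users ORDER BY uid").fetchall()
print(rows)
```

Keeping the user ids while replacing the personal data means the copied database still behaves like the live one for development purposes.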

One really useful feature is the ability for developers to specify the version of PHP and control the php.ini file in YAML files. You can also specify which database should be set up for the environment.

The ability to control this non-code configuration and replicate a complex build process for all developers without them all needing the level of expertise to set up their own environment comes in very useful when working with multiple teams. This is especially true if external developers who don’t know your environment are involved.

The session also covered some of the performance gains that can be achieved by running Drupal on HHVM (HipHop Virtual Machine) rather than PHP 7 or PHP 5.

Defense in Depth: Lessons learned securing 100,000 Drupal Sites

Data breaches can be very expensive, so it is incredibly important to ensure that security consciousness is part of our mind-set in IS Applications. Breaches are typically due not to cracked encryption and hashes or the exploitation of unknown vulnerabilities, but rather to human error. Thought should be given to the “CIA Security Triad” of confidentiality, integrity and availability. Security lists can be used to find known vulnerabilities which need to be patched; these include US-CERT and CERT-EU, LWN, Drupal, and security releases by Red Hat.

The main thing we took away from this session was not the quality of the advice, which was all very sensible (do backups, patch your servers and applications, use a Version Control Repository), but the practicalities of implementing that advice. Recommendations like using two-factor authentication for our SSH keys are great, but we aren’t even using SSH keys for connecting to servers. Using enterprise login services so password hashes aren’t stored locally is also sound, but would only be possible if we used a technology like OAuth to allow it. We need to adopt more good practice when it comes to security; a greater security consciousness within IS Applications would be a great step in the right direction.

Local vs. Remote Development: Do Both by Syncing Your Site From Kitchen to Cloud With Jenkins

There are pros and cons to developing both locally (using VMs on a developer’s PC) and remotely (using Development servers provisioned by Development Technology). CASCADE is a new tool to streamline development workflows and add CI to local development. Effectively, it is extra code which uses Ansible to spin up and configure local Jenkins and GitLab Vagrant boxes and provide an interface to them.

While everyone in the audience was using Vagrant, very few had edited a Vagrantfile or run more than one Vagrant box at a time; a tool like CASCADE could provide developers with a simpler way to have a more advanced local CI environment. I don’t think its use would be appropriate to Development Services, but some of the ideas raised were interesting, especially as we are not generally using Vagrant for local development yet.

Behat+Mink+PhantomJS = Test ALL THE THINGS!

Integration testing is a topic that pops up regularly at DrupalCon, and in retrospect it was interesting to hear this talk on the same day as another talk on unit testing.  This particular session focused on the use of the BDD framework Behat for testing, coupled with Mink to simplify interaction with the browser emulator, provided in this case by PhantomJS (other browser emulators such as Selenium WebDriver can also be used).

Whilst the speaker was very engaging and did give a decent high level outline of the different components, including the Gherkin language used to define Behat tests, the outline didn’t have a clear structure, which made it difficult to get a grasp on how each component fits into the bigger picture.  That in turn makes it difficult to judge whether any/all of what was demonstrated would be useful in our context.  It was also disappointing to see that whilst the talk description mentioned screenshot comparison, the only real mention of this during the talk was to say that PhantomJS was not the best tool for UI comparison (one attendee suggested wraith as an alternative).  UI testing is something that we definitely need to explore further in the context of our Drupal CMS, and our current set of tools for automated testing (primarily Selenium WebDriver with test suites built in Java) may not be the best starting point.  Unfortunately, although it was interesting to see Mink, which I hadn’t come across before, there was nothing in this particular talk to help us find the best approach to UI testing where there are gaps in our own test suites.

Principles of Solitary Unit Testing

Whereas the earlier session I attended on Behat with Mink and PhantomJS was concerned more with integration testing, this session explored the principles of solitary unit testing, as contrasted with sociable unit testing. In solitary unit testing the idea is to limit what is being tested as much as possible, essentially to test one thing without “crossing boundaries” such as writing to disk or reading from a database.

The speaker provided a very clear and interesting summary of the principles of unit testing, exploring aspects such as:

  • the importance of testing “one concrete class” (not counting value objects as these don’t have behaviour), using “doubles” to represent dependencies and objects returned by collaborators, thereby eliminating crossing of boundaries;
  • the stages of solitary unit testing, namely Arrange (setting up the context, e.g. any data required, before carrying out the test), Act (which ideally should call only one method) and Assert (to test whether the test passes);
  • the principle of always asserting last, and limiting each test to one assertion, which is really a general principle rather than a hard and fast rule – it was pointed out by one attendee and acknowledged by the speaker that sometimes it is necessary to break this principle;
  • ways of handling some of the complexity issues around unit testing, for example using ‘object mothers’ or ‘data builders’ to encapsulate setup, writing custom assertions to avoid multiple asserts in one method, and eliminating dependencies, all of which reduce the lines of code in the actual test and help to avoid “fragile tests” which break easily when something changes that isn’t directly related to the specific test;
  • the ability of solitary unit testing to highlight bad OOP code – if the test is difficult to write, the problem could be the code.
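The principles above can be sketched in a few lines. The session was presumably aimed at PHP developers, so this Python version using `unittest` and `unittest.mock` is only an illustration of the Arrange, Act, Assert shape with a double in place of a collaborator; `PriceCalculator` and its tax service are hypothetical names invented for the example.

```python
import unittest
from unittest.mock import Mock


class PriceCalculator:
    """Hypothetical class under test; its collaborator is injected."""

    def __init__(self, tax_service):
        self.tax_service = tax_service

    def total(self, net_amount):
        rate = self.tax_service.rate_for("GB")
        return round(net_amount * (1 + rate), 2)


class PriceCalculatorTest(unittest.TestCase):
    def test_total_adds_tax_from_service(self):
        # Arrange: a double stands in for the real tax service, so no
        # boundary (network, disk, database) is crossed.
        tax_service = Mock()
        tax_service.rate_for.return_value = 0.2
        calculator = PriceCalculator(tax_service)

        # Act: call exactly one method on the class under test.
        result = calculator.total(100)

        # Assert: a single assertion, placed last.
        self.assertEqual(result, 120.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PriceCalculatorTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

If the Arrange step for a class like this grows to dozens of lines, that is the hint mentioned above that the code under test may have too many dependencies.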

The speaker noted that solitary unit testing is not a catch-all, and will not always provide the most appropriate benefit.  In our particular context, end-to-end integration testing is of greater importance than unit testing as we need to ensure that the complex set of contrib and custom modules and configuration settings which comprise our central Drupal CMS function correctly when deployed together in one of our deployment environments; integration testing using Selenium WebDriver is therefore incorporated in our automated deployment process.

Notwithstanding the focus of our own test suites however, the principles explored in this session such as clear code structure, isolating specific functionality, ensuring readability and clarity of tests, minimising what is covered by one test, and limiting the assertions performed, are equally applicable.  It seems to me that many of these principles are a starting point for best practice regardless of the particular type of testing being performed. This was an excellent talk which provoked an equally interesting conversation between attendees and the speaker on when it is appropriate to bend or break the principles.  I highly recommend watching the session recording to anyone with an interest in automated code testing.

SmarTest: Proposal for accelerating the detection of faults in Drupal

The SmarTest module has been developed at the University of Seville, extending SimpleTest to improve automated testing for widely varying system configurations.  Because Drupal has a high degree of configuration variability, different configurations can require many different test cases, which can be quite difficult to cover in testing.

As part of the studies performed by the speaker, Ana, and her team at the University of Seville, a diagram was drawn up showing the relationships between a set of modules (48 in their example). After querying how this was produced, I was told that it was quite an involved manual process and, without tools to assist, would be fairly time-consuming if we wanted to produce the same thing.

With the tests that they ran across various modules, they found some direct (and possibly obvious) relationships between certain aspects. They found that module size (lines of code) as well as the number of commits on a module directly related to the number of faults found in the modules which they tested, i.e. more code and/or more commits produced more faults. However, a greater number of contributors on a module had the opposite effect, reducing the number of faults. It was also found that migrating the same modules to a newer version of Drupal introduced more faults again.

The SmarTest module is something that could be interesting for us to run against the configuration of EdWeb, our central Drupal CMS. It aims, among other things, to highlight the most potentially problematic modules.  However, the problem of Drupal being a “variability-intensive system” is not so much of an issue for us as we don’t really expect to vary our configuration drastically or often.

Next generation graphics: SVG

SVG is making a comeback now that Flash is dying off, and high resolution mobile and touch-screen tablet devices require vector graphics to keep logos and icons sharp while keeping file size low.

While there are no SVGs in Drupal 7, Drupal 8 core is now making use of SVG assets to replace PNGs. This session covered SVG as a markup language, even how to write it by hand (if you are that way inclined), as well as the features available for animating and interacting with SVGs using JavaScript, and how well (or not so well) these features are supported by the different browsers.
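For anyone who has not looked inside an SVG file, the following minimal sketch shows what hand-written SVG markup looks like and confirms it is well-formed XML by parsing it with Python's standard library. The icon itself (a circle with a diagonal line) is invented for the example and was not shown in the session.

```python
import xml.etree.ElementTree as ET

# A hand-written SVG icon: a circle with a diagonal line through it.
SVG_ICON = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="24" height="24">
  <circle cx="12" cy="12" r="10" fill="none" stroke="black" stroke-width="2"/>
  <line x1="5" y1="19" x2="19" y2="5" stroke="black" stroke-width="2"/>
</svg>
"""

# Parsing raises an error if the markup is not well-formed XML.
root = ET.fromstring(SVG_ICON)

# Element tags carry the SVG namespace; strip it to list the shapes.
shapes = [child.tag.split("}")[1] for child in root]
print(shapes)
print(root.get("viewBox"))
```

The `viewBox` attribute is what lets the same few lines of markup scale cleanly to any display size, which is the point made above about keeping icons sharp on high resolution screens.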

There are also some quite significant security risks if you allow users to upload their own SVG files, but that aside, you should be looking to SVGs rather than icon fonts for those vector icons now.

Configuration management in Drupal 8

Configuration Management is a new feature in Drupal 8; in Drupal 7 the closest you have is the Features module. This session was a quick tour of how configuration in Drupal 8 can be exported and imported using Drush, how it is stored in YAML files, and where it is defined in custom modules.

Dependencies are fully managed in Drupal 8 through these YAML configuration files, and when dependencies are removed, the configuration is deleted. Demonstrations during the session showed how dependencies build up and apply as soon as you use them, for example, a role access filter being applied to a view.
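The cascading behaviour described here can be modelled in a few lines. This is a deliberately simplified conceptual sketch, not Drupal's API: the `remove_dependency` function is hypothetical, and the configuration names merely imitate the style of Drupal 8 config file names.

```python
def remove_dependency(configs, removed):
    """Remove an item and delete every config that depends on it,
    directly or indirectly. Returns the names of the deleted dependants.

    `configs` maps a config name to the set of names it depends on.
    """
    configs.pop(removed, None)
    deleted = []
    pending = [removed]
    while pending:
        current = pending.pop()
        # Collect dependants first so the dict is not changed mid-iteration.
        dependants = [name for name, deps in configs.items() if current in deps]
        for name in dependants:
            del configs[name]
            deleted.append(name)
            pending.append(name)
    return sorted(deleted)


# Hypothetical configuration: a view restricted by a role, and a block
# that displays the view.
configs = {
    "user.role.editor": set(),
    "views.view.drafts": {"user.role.editor"},
    "block.block.drafts_list": {"views.view.drafts"},
    "system.site": set(),
}

deleted = remove_dependency(configs, "user.role.editor")
print(deleted)
print(sorted(configs))
```

The example mirrors the role access filter on a view mentioned above: once the view depends on the role, removing the role takes the view, and anything built on the view, with it.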

Tips for best practice in changing these files and moving these files between environments were covered, as well as in-depth details of the Configuration Entity and third party settings.

Drupal architectures for flexible content

This session explored the need to understand the requirements of editors and how to deal with the demands of content editors in Drupal.

Current limitations in Drupal were also highlighted, such as the disconnection between content and layout, which is not always a problem, and also how Drupal does not currently have revision history in content editing.

Making Drupal fly – The fastest Drupal ever is here!

Quite simply, this “fastest Drupal ever” is Drupal 8.

Different caching options were discussed and it does look like, due to the issues seen in Drupal 7 and earlier, a lot of attention has been given to performance and customisation of caching options in Drupal 8.

As mentioned in previous session notes, full page caching via reverse proxy such as Varnish is also possible in Drupal 8.

Birds of a Feather Sessions

Design and Usability Critiques

This was a great BoF discussion, not so much for new information, but for confirmation that the UX approach taken during the EdWeb project was fundamentally correct, although the process of incorporating UX into an Agile project needs to be lighter, take place earlier and be more frequent. We had 3 hour sessions with a large group of people, who at that time in the iteration were all under pressure to get the iteration completed.

Some recommendations to make Agile UX sessions more effective:
1.    Run combined sessions with developers, business analyst, etc. but make them shorter;
2.    Present results at these sessions;
3.    Be clear about how UX sessions relate to prioritised stories;
4.    Be clear up front about what questions the UX session should answer.


Drupal in Higher Education

This session was set up by developers from the University of Adelaide in Australia and was well attended by representatives from Higher Education institutions in the UK and across Europe. The initial introductions showed that a wide range of Drupal experiences at various stages of maturity were represented, from the management of a small number of Drupal sites, through the distribution of a profile across more than 100 devolved Drupal sites, to the wholesale replacement of existing CMSs with a central Drupal service.  As is so often the case during meetings between Drupal users having a common background, the main topics under discussion were pain points; what was apparent in the conversation that ensued was the commonality of these among the experiences of those present.

One area of particular concern was hosting, with almost everyone present agreeing that Universities often suffer from a peculiar fetish for internal hosting which can make it controversial to explore external hosting options.  The main reason for this seems to be the desire to avoid exposing sensitive data, and that is clearly an important issue for many websites managed within the HE sector.  The desire to keep things internal can make Drupal hosting especially difficult where there is a dependency for stability reasons on old versions of infrastructure elements such as PHP.

The approach which is being taken in Adelaide is of particular interest and something that we should explore further, especially given our own desire to look into the possibilities of configuration deployment and tools such as Puppet, Vagrant and Docker for automated deployment of the required server environments. The developers who set up the BoF have created an evolving platform for automated deployment of Drupal 8 to get around their internal hosting issues; they hope to collaborate on this with other institutions and ultimately make it available for wider use.  We are not yet planning for Drupal 8, but the principles of how to manage deployment of Drupal in a devolved HE context are of interest regardless of the Drupal version being used.  We have a relatively sophisticated means of deploying updates to the Distribution Profile that is associated with our central Drupal CMS, but we can do more with our automated deployment process in the admittedly more complex area of configuration and server deployment for the central CMS itself.

Another topic covered was role management – how to ensure that only the appropriate people can perform tasks such as generating a new Drupal site, or use functionality within a site.  LDAP groups were discussed as one means of achieving this, with users automatically added and removed from roles within Drupal based on their LDAP group membership; this requires that groups be configured with the appropriate members, and that there are clearly defined mappings between those groups and roles in the CMS.
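A minimal sketch of the mapping logic follows, assuming the user's LDAP group names have already been fetched by some other means. The group names, role names and `sync_roles` function are hypothetical and not taken from any particular Drupal module.

```python
# Hypothetical mapping from LDAP group names to CMS role names.
GROUP_TO_ROLE = {
    "cms-editors": "editor",
    "cms-publishers": "publisher",
    "cms-site-admins": "site administrator",
}


def sync_roles(current_roles, ldap_groups):
    """Return (roles_to_add, roles_to_remove) for one user.

    Only roles that appear in GROUP_TO_ROLE are managed; roles assigned
    by other means are left alone.
    """
    managed = set(GROUP_TO_ROLE.values())
    entitled = {GROUP_TO_ROLE[g] for g in ldap_groups if g in GROUP_TO_ROLE}
    to_add = entitled - set(current_roles)
    to_remove = (set(current_roles) & managed) - entitled
    return sorted(to_add), sorted(to_remove)


add, remove = sync_roles(
    current_roles=["editor", "site administrator", "authenticated user"],
    ldap_groups=["cms-editors", "cms-publishers", "staff"],
)
print("add:", add)
print("remove:", remove)
```

Restricting removals to managed roles is one way to keep the clearly defined mapping mentioned above from interfering with roles granted manually within a site.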

Overall this was a really interesting BoF.  It was great to discuss both the positive aspects and common problems associated with using Drupal in an HE context, and to hear how other institutions are deploying and using Drupal. At the end of the session, contact details were shared and this will hopefully lead to further engagement beyond our meet-up in Barcelona.