The Last Measurable Ounce of Quality Can Be Expensive

A Brief Lexicon of the Software Quality Landscape

The quality of a software product is a multi-dimensional measurement, spanning such things as functionality, correctness, performance, documentation, ease of use, flexibility, and maintainability, among many others. Many of these qualities are difficult to measure, difficult to see, and hence difficult to manage. The result is that they are ignored by all but the most enlightened in management.

The topic of interest for this screed is correctness. This is not correctness in the end-user sense of correctly meeting requirements; it means “is the code doing what we think it should be doing?” Since we generally do not have provably correct programs, it is a matter of convincing ourselves, through a lack of evidence to the contrary, that our programs are working as we’d like. This precarious situation is perfectly captured by Edsger Dijkstra’s observation that program testing “can be used to show the presence of bugs, but never to show their absence.”

So we have several levels at which we convince ourselves of correctness. From most detailed to most abstract, they are unit testing, integration testing, functional testing, and system testing. Note that there is no industry-wide agreement on these exact terms, but the general concepts are recognized.

In typical object-oriented designs, unit testing involves isolating a given class, driving its state and/or behavior, and verifying that we see what we expect. Integration testing is a layer above that, where we use multiple classes in the tests. Functional testing sits above that, where we try to deploy our programs in the natural components they would inhabit in production, like a server or a process. Finally, system testing covers testing in the full production-like environment.

Unit Testing

Our focus will be on the lowest level, namely unit testing. The expectation is that unit tests are both numerous (think many hundreds or even thousands) and extremely fast (think milliseconds). To be effective, these tests should be run on every single compile on every developer’s machine across the organization. The goal is that the unit tests precisely capture the design intent behind the implementation of the class code, and that any violation of that intent results in immediate feedback to the developer making code changes.
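
To make that concrete, here is a minimal sketch of such a test (JUnit 4, with an invented ShippingRate class standing in for real production logic). It touches no network or database, runs in milliseconds, and each test encodes one piece of design intent.

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class ShippingRateTest {

  // One design decision per test: orders of $50 or more ship free.
  @Test
  public void ordersOfFiftyDollarsOrMoreShipFree() {
    assertEquals(0.00, new ShippingRate().costFor(75.00), 0.0001);
  }

  // Smaller orders pay the flat rate; if someone changes that rule,
  // this test fails on the very next compile-and-test cycle.
  @Test
  public void smallerOrdersPayTheFlatRate() {
    assertEquals(4.99, new ShippingRate().costFor(10.00), 0.0001);
  }
}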

I’d like to tell you that every developer is doggedly focused on both the quality of the production logic and the thoroughness of the unit tests that back that logic. Unfortunately, that is not the case: through a combination of poor training, lack of emphasis at the management level, and plain laziness, developers produce tests that range from great all the way down to downright destructive (more on that in another blog entry). One of the easiest ways to track this testing externally is through code coverage.

Code Coverage

Code coverage is a set of metrics that can give developers and other project stakeholders a sense of how much of the production logic has been exercised by the unit tests. The simplest metric is “covered lines of code,” aka line coverage. This is usually a percentage: if a class has 50 lines of code and 60% line coverage, then 30 of those lines of production logic are executed as part of running the unit tests for that class. There are other coverage metrics that can help you gauge the quality of your tests, such as branch coverage, class coverage, and method coverage, but here we will focus on line coverage since it is the most widely used.

The general, common-sense assumption is that “more is better,” so misguided management and deranged architects insist on 100% code coverage, thinking that it would give maximum confidence that the quality of the code is high. If we had an infinite amount of time and money to spend on projects, that might indeed be the optimum. Since that luxury has never existed in the last 4 billion years, we have to spend our money wisely, and that changes things drastically.

The truth is that it might cost M dollars and hours to achieve, say, 80% line coverage, but it might take M *more* dollars and hours to get the last 20%. In some cases, getting the last few percentage points can be extremely expensive. The reasons for this non-linear cost are several.

First, production logic should be tested through its public interface where possible, rather than through a protected or private interface. It can be laborious to construct the conditions necessary to hit a line of code buried in try/catches and conditional logic behind a public interface. This cost can be lowered by refactoring the code toward better testability, but this is a continuous struggle as new code is produced. There is a truism among veteran developers that increasing the testability of production logic improves its design, as the sketch below suggests.
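
As a small, hypothetical illustration of that kind of refactoring (the ReportService and ContentSource names are invented for the example), a hard-to-reach catch branch becomes reachable through the public interface once the risky dependency is injected:

import java.io.IOException;

// Originally render() called the file system directly, so the catch branch could
// only be covered by arranging a real I/O failure. Hiding the file access behind
// an injected interface lets a unit test supply a stub that throws, and the error
// branch can now be driven through the public render() method.
public class ReportService {

  public interface ContentSource {
    String read(String path) throws IOException;
  }

  private final ContentSource source;

  public ReportService(ContentSource source) {
    this.source = source;
  }

  public String render(String path) {
    try {
      String raw = source.read(path);
      return raw.isEmpty() ? "EMPTY" : raw.toUpperCase();
    } catch (IOException e) {
      return "ERROR";   // now reachable from a test via a throwing ContentSource stub
    }
  }
}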

Second, some code has high cyclomatic complexity, a measure of the number of independent paths through the logic. Arguably such code should be refactored, but projects do carry a certain percentage of high-complexity code forward from sprint to sprint.

The third reason is a bit technical. Java source code is compiled into bytecode, and the code coverage tools run off an analysis of that bytecode, not the source code. The Java compiler consumes the source code and may emit bytecode with extra logic in it, meaning code with extra branches. It might not be possible to control the conditions that would take one path or the other through such an invisible branch. Further complicating matters, the invisible logic can change from one Java compiler release to the next, putting a burden on the test logic to reverse engineer the conditions needed to cover it.
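
One common example, depending on the compiler and coverage tool version, is a switch over a String; the sketch below is illustrative:

// javac compiles a switch on a String into a switch on hashCode() followed by
// equals() checks to guard against hash collisions. A bytecode-based coverage
// tool can therefore report branches that appear nowhere in the source and that
// are practically impossible to exercise (you would need a second string with
// the same hash code). Newer coverage tools filter some of these cases; older
// ones do not.
public class StatusMapper {
  public static int toCode(String status) {
    switch (status) {          // source shows three cases; the bytecode has more branches
      case "UP":   return 1;
      case "DOWN": return 2;
      default:     return 0;
    }
  }
}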

Summary

Based on the discussion above, achieving 100% line coverage can be very expensive. On the teams that I have worked on over the years, a reasonable line coverage target has been 70% or more, but you should let the development team determine this limit. If you force your teams to get to 100% line coverage, you are spending money that might be better spent on higher-level automation tests. In addition, I have seen cases where developers short-circuit the unit tests by writing tests solely for the purpose of increasing the coverage. You can readily identify these tests because they have no assertion or verification check in them – they just make a call and never check the result.
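
Here is a minimal sketch of what such a coverage-gaming test looks like next to a real one (the DiscountCalculator class is invented for the example):

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class DiscountCalculatorTest {

  // Coverage-gaming "test": it executes the production lines, so they show up
  // as covered, but it asserts nothing and can never fail.
  @Test
  public void exercisesCalculatorButProvesNothing() {
    new DiscountCalculator().discountFor(3);
  }

  // A real test: identical line coverage, but it pins down the intended behavior.
  @Test
  public void threeItemsEarnTenPercentDiscount() {
    assertEquals(0.10, new DiscountCalculator().discountFor(3), 0.0001);
  }
}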

In short, be careful what you ask for. Make sure you involve the development team in the decision about code coverage targets. Spending another 50% of scarce testing dollars on the last 10% of coverage is unlikely to bring a return on investment.

Testing OpenDaylight controller with Postman tools

Introduction
The use of REST APIs has been gaining popularity in the networking industry as more vendors realize their ease of use and the standard interface they provide. With an HTTP-based REST API, a vendor can offer a standard interface to a device’s features, which helps with management, verification, and automation.
The primary interface to the OpenDaylight controller’s features is its YANG-model-based REST API, backed by the standard IETF RESTCONF specification.

As a Senior QA Engineer working on QA aspects of the OpenDaylight controller, I will describe various techniques for automating OpenDaylight REST API testing using the Postman tools.

A typical OpenDaylight automation testing workflow can be described as:

  1. Install a feature
  2. Verify that the feature is installed
  3. Execute a northbound REST API call against an underlying fabric or application
  4. Assert the impact of the call
  5. Run a test (positive, negative, performance, scale, etc.)
  6. Go to step 3

OpenDaylight primarily uses Robot Framework for automation testing. Although Robot Framework is very easy to learn, since it uses plain-English keywords for test cases, it gets complicated when parsing data structures or executing loops.

I will share my experience using Postman and its features, where test cases are vastly simplified and, as a side effect, much faster than executing the same tests in Robot Framework.

Most of the SDN community is familiar with Postman, the UI tool for executing REST APIs. But not many are familiar with its companion tool newman, which is available as an NPM (Node Package Manager) module. Using the newman utility, Postman collections can be run from the CLI as a bulk action. Some of the features of Postman are:

  1. grouping a set of logically similar APIs into collections
  2. writing tests on an API to check for correct return codes and values
  3. creating environments for a call and switching between environments
  4. generalizing constants through the environment
  5. running collection workflows using the Runner feature
  6. generating reports and documentation
  7. collaborating with other teams by sharing collections and environments, either through a shareable link or as JSON files

Installation
Postman can be installed either as a native application or as a Google Chrome browser extension. The Google Chrome extension will be deprecated soon, so it is recommended to install the native package, which is available for most platforms. If you sign up for a Postman account, you can share collections between multiple systems, which is a great benefit when testing from various platforms.

Request Options
A typical request in Postman has many options, such as selecting the type of request, authorization, headers, and body (the request payload), plus a very easy way to store the response of the request the user sends. This can be quite handy while writing tests; more about tests is covered in the subsequent sections.

Collection
Collections are a powerful feature of Postman. We can group the set of calls for a particular workflow into a Postman collection. Each collection can hold the set of APIs that verify a feature, a piece of functionality, or a sequence of steps. For example, the following screenshot shows a collection for verifying the data migration feature in OpenDaylight.

Environment
The environment feature in Postman can be used to define variables and change them for different scenarios. These variable values are substituted at runtime from the environment. A typical call on the ODL inventory looks like this.

Testing the Response
Let’s say you have defined a collection with a set of REST APIs for an environment. The response then has to be inspected for its values. This can be tedious, especially if the response runs into hundreds of lines. Even though Postman can prettify the JSON response, it still takes time to ensure a particular key-value pair is what is expected.

Postman comes to the rescue here: with a little bit of JavaScript knowledge, you can write tests that assert the values of the response. The following snippet shows the operational topology URL for a given environment and the tests written to verify it. As you can see, these tests verify the REST API return code and loop through the response. There are many other nice features in tests, such as waiting for a delay, or redirecting to another REST API depending on the current API’s result.
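
As a rough sketch of that kind of test script (the {{controller-ip}} variable name and the JSON keys below are assumptions based on the standard ODL RESTCONF operational topology response; your environment may differ):

// Request: GET http://{{controller-ip}}:8181/restconf/operational/network-topology:network-topology
// Postman "Tests" tab script (JavaScript); variable and key names are illustrative.

pm.test("Return code is 200", function () {
    pm.response.to.have.status(200);
});

pm.test("Every topology entry contains at least one node", function () {
    var body = pm.response.json();
    var topologies = body["network-topology"]["topology"];
    topologies.forEach(function (topology) {
        pm.expect(topology.node.length).to.be.above(0);
    });
});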

Collection Runner
The Collection Runner feature is very useful for running a particular Postman collection against a specific environment with various control options. Using this feature, a user can select a collection and an environment with options like iterations, delay between each request, and logging. When the runner is invoked on a collection with all these parameters set, it generates a report that can be shared. A typical collection run looks like the following:

Using newman
While Postman is packed with testing features, it is still a chore to execute tests via a UI tool. For effective automation, testing should be performed via the CLI. This is exactly what newman helps with. You have to install NodeJS first to use newman: go to the NodeJS website and install the latest LTS version for your platform.

From the terminal, execute

npm install newman -g

to install newman globally. The -g flag installs newman into npm’s global package directory and links the newman executable onto your PATH, so it can be run from any directory.

The newman tool can run any collection with a given environment from the CLI. We can supply the collection and the environment to the newman client either as URLs or as JSON files. There are many options provided by newman to control your output; see the newman documentation for a complete list. The newman utility can also generate reports in HTML, XML, or plain-text format. A typical newman run looks like this:

newman run intf_snake_test_collection -e specific_env -r cli --reporter-cli-export clirun --disable-unicode

Integrating Newman with Robot Framework
At Lumina Networks, we use Jenkins as our CI tool and Robot Framework as our primary automation testing tool. Robot Framework is a keyword-based framework built on top of Python. From Robot Framework you can call the newman tool, execute the collections, grab the results, inspect them for success or failure, and post the status to Jenkins.

Generating Documentation
Well, let’s say you have run all your tests successfully. How do you let the world know about the APIs used? Documentation can be a tedious job, but Postman provides a very nice feature for automating documentation and assisting in automating REST API calls. For every collection in Postman, there is an option to generate documentation. Once we generate the documentation, a shareable public URL is created for it. Using this link, we can generate snippets of code to automate the APIs in that particular collection. A good use of this is to share the URLs with customers. Here is a screenshot that shows how to generate and publish documentation:

Conclusion
In summary, Postman (and newman) is a fast and efficient toolset for verifying, automating, documenting, and publishing REST APIs. These features make it easy to collaborate and build on existing work without any hassle. Since the tool can also be integrated into CI easily, I consider it a one-stop shop for API development and automation testing.

Getting started with Lumina SDN Controller

Lumina SDN Controller (LSC) is Lumina’s SDN controller distribution based on the industry-leading OpenDaylight project. We recently released version 7 of LSC, which is based on the OpenDaylight Nitrogen release. Lumina offers a free license for LSC (to manage up to 5 network nodes for 1 year) so that interested users can try it out for themselves. In this introductory blog post, we will take you through the steps for downloading and installing LSC. Detailed steps are also available in the LSC Software Installation Guide.

Step 1: Create an account by visiting the My Account link on the Lumina Networks website. Submit your information using the Register form. Upon registering, you will receive two emails from Lumina – one for email verification and another with your Lumina user-id. Click on the verification link in the email to complete the registration process.

Step 2: Log in to the web site using the Login form with your registered credentials.

Step 3: Once you are logged in, click on the down arrow next to the Download link. Select the first option – Lumina SDN Controller Trial.

Step 4: Click on the “Download Trial” at the end of the description.

Step 5: Click on “Free Trial” button.

Step 6: Enter the billing details. Review and accept the Terms & Conditions and EULA. Click the Submit button.

Step 7: Depending on the target installation platform type, download either the Debian or RPM LSC package. (Optional) Download the LSC Documentation package.

The LSC Documentation package contains the following doc files:

You can review information in these files to become more familiar with LSC and the installation process.

  • lumina-sdn-controller-7.1.0-quick-start-guide.pdf – Summary of steps for installing LSC
  • lumina-sdn-controller-7.1.0-software-installation-guide.pdf – Detailed instructions for installing and configuring LSC
  • lumina-sdn-controller-7.1.0-release-notes.pdf – List of modified features and known issues in this version of LSC
  • lsc-app-topology-manager-7.1.0-release-notes.pdf – List of modified features and known issues in this version of the Lumina Topology Manager application that is part of the LSC distribution

The LSC DEB/RPM package contains the following files:

Step 8: Extract the LSC package on the target machine. Depending on which LSC features you want to use, run either the “install” or the “unpack” script.

Running the “install” script will install all the packages bundled in the distribution.
Running the “unpack” script will only unpack the packages bundled in the distribution and make them available for installation. You can then use the package manager commands for the target platform (apt-get or yum) to selectively install only the required packages.

NOTE: Both the “install” and “unpack” scripts must be run with superuser (sudo) privileges.

When the “install” script is run, output like the following will be displayed:

When prompted for confirmation, reply in the affirmative by typing “y” and pressing Enter.

The following screenshot shows sample output generated when the “unpack” script is run. By default, it prints out all the available extensions and apps that you can install.

After the “unpack” script finishes, you can install individual packages as needed. The following screenshot shows the installation of the BGP/PCEP extension of the Lumina SDN Controller.

Again, when prompted, answer in the affirmative by typing “y” and pressing Enter.

Step 9: Start the Lumina controller by running the “sudo service lumina-lsc start” command.

Step 10: Check the karaf log (located under the directory /opt/lumina/lsc/controller/data/log) and verify that the installed extensions come up successfully. Taking the BGP/PCEP module as an example, you will find lines in the log indicating that the corresponding module has been loaded.

Step 11: Log in to the Karaf client with the command “sudo -u lumina /opt/lumina/lsc/bin/client”.

Step 12: Verify that the LSC core and extensions are successfully installed and the service is started. For example, if you want to verify that the BGP/PCEP module is installed, execute the command “feature:list -i | grep bgpcep” in the Karaf console. You should see the LSC bgpcep extension listed.

  • “feature:list” lists all features available to the controller and managed by Karaf.
  • The “-i” option filters the feature list and displays only the installed features.
  • “grep xxx” filters the output and displays only the entries containing the keyword “xxx”.

Optionally, you can verify all installed LSC modules with command “feature:list -i | grep lsc”. You should see a bunch of LSC features listed.

Congratulations! You have successfully downloaded and installed the Lumina SDN Controller (LSC) distribution. More details about LSC components and installation options are available in the Software Installation Guide provided in LSC Documentation package.

Journey towards SDN services delivery

We at Lumina Networks believe that Software Defined Networking (SDN) is not a product by itself, but rather a set of use cases that use an assortment of technologies to meet customer needs. As we all know, the goal of SDN is to allow network administrators to respond quickly to changing business requirements. We deliver software development services and products, using agile processes, to make SDN possible and to help service providers move beyond lab trials and into production. I will share a few tips that have helped us achieve success in such SDN projects, from a project manager’s perspective.

Every project manager faces a multitude of challenges, such as managing expectations, controlling scope creep, keeping team resources accountable, managing risk, and delivering results on time. This blog provides some guidance on how to manage expectations and how to do value-based prioritization.

Expectations Management

Setting expectations helps ease the anxiety of outcomes. Usually, IT project goals are set by senior management without getting into much technical detail. This creates a huge gap between desired and actual outcomes. Therefore, the project manager must be involved in the planning process early to define measurable goals that can be agreed upon. When selecting which features to include in the first version, it’s important to make sure that each feature’s impact is measurable against the business goals. A Minimum Viable Product (MVP) is a development technique in which we build the first version with just enough features to satisfy early customers, while gathering feedback for future product versions. Here is an interesting quote:

“An MVP is a down payment on a larger vision.” — Johnny Holland

There needs to be a discovery period during which project stakeholders can effectively break down the goals and determine the true MVP scope. The project manager must also run the MVP planning process with the entire development team to get their feedback. Their input on effort estimates, timelines, and risks will be helpful in go/no-go decisions and in defining the project milestones. This costs time up front, but it saves a lot of stress and scope creep in the long run. Project milestones should be communicated to all the stakeholders, and the project manager should make sure that requirements are prioritized based on business value for each milestone (with the help of the product owner, discussed below). At the end of the day, the stakeholders have to feel confident that the project manager cares about what’s most important to them. With Agile processes, it becomes easier to communicate progress and changes during daily stand-ups and weekly sync-up meetings with stakeholders. When scope creep occurs, the project manager should clearly set (or reset) expectations as to whether the changed scope is feasible with or without additional resources.

Value based Prioritization

Prioritization can be defined as determining the order of execution and separating what must be done now from what needs to be done in the future. The Product Owner is the key person who decides whether certain requirements have more business value than others and how to prioritize them. It is very important to identify and assign the Product Owner role to a key project stakeholder who represents the customer. Agile processes aim to deliver maximum business value in a minimum time span through prioritization. If new requirements are important enough, the scope should change, since doing so increases the likelihood of delivering improved business value to the customer. High-value requirements are identified and moved to the top of the prioritized backlog by the Product Owner. Prioritization of user stories should be done throughout the project cycle, as the perceived value of the requirements and the scope will change over time. At some point in the project, we will face tradeoffs due to prioritization and scope changes. When such situations arise, we need to step back and ask, “What’s the MVP?” Take a step back to visualize what’s most important and discuss whether there are alternative ways to proceed. The Agile project team’s flexibility to change is of utmost importance during re-prioritizations as well.

To summarize, while delivering software development services, we need to establish trusted relationships with stakeholders by managing expectations, engage them quickly for value-based prioritization, and build flexible project teams to implement the MVP and the versions that follow.

The case for container-based builds

We have sat on the river bank and caught catfish with pin hooks. The time has come to harpoon a whale.
– John Hope

The hallmark of a healthy software ecosystem is the thrum of older technologies being displaced by the newer. But from time to time, a fascinating phenomenon occurs whereby the older becomes enveloped by the newer and the result is so compelling that they are rarely seen apart.

In this multi-part series, we’d like to share our experiences in transforming from our legacy, single-point-of-failure integration server to modern Continuous Integration (CI) services backed by an infrastructure-as-code (IaC) philosophy.

For building our open-source, OpenDaylight-based Lumina SDN Controller, our Lumina Networks developers use a common build tool stack of Jenkins, Maven, Java, Python, and NodeJS, plus a testing stack of Robot Framework and Nightwatch. Our build system is centralized, with about three dozen repositories. Like most projects that grow over time, ours developed friction in its build workflow, including:

  • An older Jenkins with a risky upgrade path (and a disruption to team productivity)
  • Inability to upgrade plugins/tools (which would risk incompatibility and broken builds)
  • Accumulation of ad hoc snippets of shell scripts embedded in Maven POM files
  • Developers using our CI build system to do compiles and test code (instead of doing that locally)
  • Spurious trouble with local builds passing and CI builds failing

Virtually every development team of any size, in every company, has suffered from these issues, particularly on projects that are more than a year or two old. Developers rely on doing builds dozens of times per day, so any slowdown or blockage in this workflow becomes very expensive, yet it has low visibility outside of development and its costs are hard to quantify.

To address these and other issues, we started looking at containerization to solve replication, scaling, and dependency management.

Along Came a Whale

Docker was first released for public consumption in 2013. Today, it is used in production in very diverse ecosystems, by tier-one telcos and by the largest tech companies. Docker has been downloaded billions of times and has enjoyed large-scale growth year over year.

Docker and containers are a natural fit for DevOps applications, and there are some compelling reasons to consider containerized builds. Here at Lumina Networks, we have just completed our conversion to containerized builds and want to enumerate the advantages we saw in this solution.

Advantages Of Containerized Builds

So what does containerizing the builds achieve? It means:

  • we can deploy onto a cloud with minimal work – this can address scaling issues effectively. Note that some builds will still depend on lab access to local devices, and those dependencies may not scale.
  • efficient resource management – instead of spinning up a VM per build, we can run 15-20 builds in a single VM, all securely isolated from each other.
  • easier upgrading – for example, running a component in its own container isolates it, so other containers that depend on it are forced through a standard, explicit interface.
  • better partitioning – instead of making environments that contain all possible tools and libraries, a container uses only what is needed for its specific purpose. This has the side effect of reducing defects due to interacting 3rd-party components.
  • a clean reset state – instead of writing reset scripts, the container is discarded and resurrected from a fixed image. This is a phoenix (burn-it-down), immutable view of containers, and it forces all build and configuration steps to be explicit (rather than accumulating in snowflake instances).
  • 100% consistency between local development and the build system, which should eliminate the “but-it-works-on-my-machine” problem.
  • effective postmortems for build failures, potentially leaving failed runs in the exact state of failure, rather than relying solely on extracted log files.
  • building and installing an intermediate, shared artifact once, instead of 52 times, which potentially speeds up the build cycle.
  • some tests can make temporary checkpoints via a temporary image/container and roll back to that state rather than tearing down and rebuilding, affording a quicker build.

Judicious use of containers might also help with diagnosing hard-to-reproduce issues in the field. We have seen instances of technical support sending and receiving VM images to and from customers; containers would be both simpler and a lot smaller.

Containerizing the build is considered a modern best practice and affords access to many kinds of build workflow and management tools. If you are a customer of ours and you have your own in-house software development, maybe this list will help you convince your management to do the same.

Use of containers is not limited to build contexts; containers are used in production environments too. Delivering software components in orchestrated containers has been under discussion here for some time.

This is Part 1 in our Build-Cycle Diaries series. Look here for future blog articles giving more details about our experiences.

Dependency Injection and Default Visibility Constructors

It has been a few years since I’d done more than review or read Java code. I’ve been using Java on and off over the years, but my last few years have been spent writing UI/JavaScript in browsers and Node.js (along with Python, Perl, Groovy, Bash scripting… but no Java). So as I approached the task of tackling an ODL ‘application’ for the first time, I doubted my previous Java stint with GWT years ago was going to pay off…

TDD and DI

The task was to create a new set of functional components that collaborated with the code layer immediately responsible for communicating with MDSAL and other ODL features. For this I used Spock and a TDD (test-driven development) approach to write tests that exemplify the required functionality, based on a hastily sketched interface and the acceptance criteria in the JIRA item. Typically, all that is needed for the implementation, especially when using dependency-injection-friendly patterns, are the bits that wire the components together – variables to hold references, and methods with no implementation beyond what will satisfy the compiler. Spock’s simple, lenient Mock and Stub capabilities, together with the use of Java interfaces, substitute whatever function results are needed. Once I am satisfied that the needed behaviors and the inter-relationships between components are captured, the set of unit tests becomes the acceptance criteria for the class implementations.

Constructor Visibility

To be friendly with unit-testing practices and the use of DI (Blueprint in ODL), Java’s visibility modifiers are used to define at least two constructors: a default-visibility (“package-private”) constructor and a public constructor. Unit tests reside in the same package as the ‘subject under test’ and access the default-visibility constructor, which takes the full set of dependency injection parameters. This includes such things as the Logger and all components with which the class collaborates. The public constructor exposes only a subset of the dependency injection parameters of the default-visibility constructor. This limits the exposure in DI to just the desired injection properties (e.g. the Logger is omitted, since that flexibility is unnecessary for normal use).

DI Constructor Pattern


package com.example.java;

import com.google.common.base.Preconditions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ServiceImpl implements Service {
  private final Logger logger;

  // Public constructor: the one Blueprint (or any other client) uses.
  // It supplies the default Logger and keeps that injection point hidden.
  public ServiceImpl() {
    this(LoggerFactory.getLogger(Service.class));
  }

  // Package-private constructor: used by unit tests in the same package
  // to inject every collaborator, including the Logger.
  ServiceImpl(Logger logger) {
    this.logger = Preconditions.checkNotNull(logger);
  }

  @Override
  public void op() {}
}

The test starts simply as:
Test Specification


package com.example.java

import spock.lang.Specification
import org.slf4j.Logger

class ServiceImplSpec extends Specification {
  def logger = Mock(Logger)

  def "construction succeeds"() {
    when:
      new ServiceImpl(logger)

    then:
      notThrown Exception
  }

  def "construction fails"() {
    when:
      new ServiceImpl(null)

    then:
      thrown NullPointerException
  }

  def "op called successfully"() {
    given:
      def sut = new ServiceImpl(logger)

    when:
      sut.op()

    then:
      notThrown Exception
  }
}

Along with providing a simple regression test and the beginnings of the unit tests for the class, this approach has additional architectural benefits:

  • the default-visibility constructor is DRY with regard to constructor parameter checking and construction code
  • the public constructors are only responsible for constructing the default injection properties for the class
  • increasing public access to injection properties, or altering the defaults, can be done with new public constructors if necessary to maintain backwards compatibility

In Action

One of the last challenges in this task turned out to be an integration issue. A component I was ‘hiding’ behind the public constructor actually required a reference to an instance found through Blueprint. This required adding a DI parameter to some public constructors to inject the needed Blueprint reference into the component system. Since the default-visibility constructors already provided those dependency injection properties, the unit tests and implementation remained unchanged in all other aspects, and the system functioned as desired.
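
A hedged sketch of that change is below, with TopologyReader standing in as a hypothetical name for the Blueprint-provided component (package and imports as in the earlier ServiceImpl listing). The public constructor grows one parameter, while the package-private constructor already accepted the full set of collaborators, which is why the tests kept working as-is.

public class ServiceImpl implements Service {
  private final Logger logger;
  private final TopologyReader topologyReader;   // hypothetical Blueprint-provided collaborator

  // Public constructor wired by Blueprint: the Blueprint reference is now exposed
  // here, while the Logger default stays hidden.
  public ServiceImpl(TopologyReader topologyReader) {
    this(LoggerFactory.getLogger(Service.class), topologyReader);
  }

  // The package-private constructor already took every collaborator,
  // so the unit tests did not have to change.
  ServiceImpl(Logger logger, TopologyReader topologyReader) {
    this.logger = Preconditions.checkNotNull(logger);
    this.topologyReader = Preconditions.checkNotNull(topologyReader);
  }

  @Override
  public void op() {}
}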
