The arXiv Next Generation project is an ambitious effort to renew the software that runs arXiv.org for long-term evolvability. In a previous post, I described some of the technical drivers for arXiv-NG, and our high-level approach. We’ve embarked on a mission to progressively rebuild the arXiv software by isolating components from the classic system, and reimplementing them in a more modular architecture.
Implementing our vision for arXiv-NG has entailed several significant changes in the way that the arXiv development team works. The team is learning new technologies, like Flask, Docker, and Kubernetes. We’ve also become increasingly adept at coordinating work among geographically distributed members of the team. Some of the most dramatic changes, however, have been in how we plan and prioritize development effort to advance the long-term goals of arXiv-NG. The complexity of the classic system, and thus the complexity of incrementally porting its components to a new architecture, require strategic planning on multiple time-scales.
In this post, I introduce some of the processes that we are using to plan and prioritize this complex effort. I’ll also talk a bit about our testing and release process. For those of you watching arXiv-NG development proceed, this will be especially useful background as NG components go into public beta testing over the coming weeks and months.
One of the most significant challenges of arXiv-NG planning has been to identify a long-term plan for development that takes into account the complex inter-dependencies of components in the classic system, and the challenges of shifting between two dramatically different architectures over time.
Our first step was to document the architecture of the classic system, as well as the architecture that we envision for the future. We then used the dependencies that emerged from the documentation process to identify critical paths in redevelopment. For example, we can’t migrate the arXiv API until we replace the ancient Lucene backend that powers our search interfaces. What emerged from that analysis was a set of checkpoints, or milestones—nodes in those critical paths for migration.
Our second step was to review the considerable backlog of wishes and dreams for the arXiv software that have accumulated over the past several years. This includes things that the operations team has identified as problematic for users, things that our stakeholders have asked for, things that we heard about from user surveys, and things that we simply know that we ought to do (consistent UTF-8 support, for starters!). We pinned those development goals onto the milestones that we had already identified. As some of those milestones got rather large and unwieldy, we split them into more manageable chunks.
The result is a collection of bite-sized development goals, each pinned to a milestone and linked to the others by explicit dependencies.
This milestones-and-dependencies format gives us a comprehensible high-level view of the project, and allows us to zoom in on specific bits of effort to move the project forward.
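As a purely illustrative sketch, the bookkeeping behind this kind of view can be as simple as a mapping from each milestone to its prerequisites, from which a valid ordering of work falls out of a topological sort. The milestone names below are hypothetical (except the search-before-API ordering mentioned earlier); this is not arXiv's actual plan:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical milestone dependency graph: each milestone maps to the set of
# milestones that must ship first. The one dependency taken from the post is
# that the API migration waits on replacing the Lucene-backed search.
dependencies = {
    "search-beta": set(),
    "api-migration": {"search-beta"},
    "submission-ui": set(),
    "moderation-tools": {"submission-ui"},
}

# A topological order is one valid sequence in which to tackle the milestones;
# any milestone appears only after all of its prerequisites.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Critical paths are then just the longest chains through this graph, which is why a late milestone on such a chain (like the search replacement) gets prioritized early.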
Milestones & sprints (getting work done)
A milestone refers to the completion of a collection of related development work in a particular arXiv-NG subsystem (e.g. submission & moderation, search & discovery). Work toward a particular milestone is intended to span from one to several development sprints, and usually results in one major release. Most of the items in the technical section of the 2018 arXiv roadmap correspond directly to these milestones.
In the past, we worked in two-week omnibus sprints, with each developer usually working on a separate project. As the team has grown, one of the procedural changes that we made recently was to support multiple simultaneous sprints of variable length. What this means in practice is that our sprints have become much more goal-oriented (get milestone X to beta this week!), rather than being merely a biweekly check-in and status update.
Release process & user testing
In order to focus our goals during sprint planning, and to help coordinate effort among arXiv team members and external participants (like our one thousand-plus volunteer testing force!), we divide up work toward each milestone into several phases.
Prior to beginning work toward a milestone, the IT lead and lead architect evaluate the planned development work described in that milestone, and make adjustments in consultation with the dev team and stakeholders. Members of the dev team assigned to the milestone review the goals and specific requirements for soundness, over- or under-specification, and feasibility. At this point, specific goals may be moved into or out of the milestone based on the latest understanding of constraints and priorities. Those goals are then elaborated as tasks and stories (we use Jira), and prioritized and assigned.
During pre-alpha, we identify quality and performance goals for the milestone. For example, we set goals for code quality, user experience, and accessibility. We also identify opportunities for incorporating user feedback, such as user surveys or focus groups.
Alpha: Does It Work?
Once most of the functional goals for a milestone, including all of the highest-priority ones, have been met, we enter alpha testing. We deploy the new software for internal review by other members of the dev team, as well as our operations and management teams. Everyone is involved in identifying and reporting bugs. The developers working on the milestone begin to prioritize and fix those bugs, and to evaluate the release against our quality and performance goals.
As alpha testing draws to a close, we start a new push toward a beta release. We patch bugs, address any remaining quality deficiencies (e.g. test coverage, documentation, performance issues), and perform an accessibility review; we target WCAG 2.0 Level A at a minimum for all new development.
This is also an opportunity for a final code review by the full dev team. Everyone participates in building and testing the new software, reviewing documentation, looking for ways to improve quality, and identifying bugs.
Beta: Will Stakeholders Be Satisfied?
In many cases, new components will be released for public beta testing. At this point the milestone is feature-complete (or very nearly so), and bugs and improvements identified during alpha testing and pre-beta dev team review have been addressed. If we are replacing existing functionality (which is often the case), we deploy the beta version in parallel to the interface(s) that it replaces. Depending on what we’re testing, we may send out a call to our user testing mailing list, the arXiv API list, or other stakeholders that can provide feedback.
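The post doesn't specify how this parallel deployment is wired; one minimal way to express the idea, sketched here at the WSGI level (the interface Flask apps also speak), is a path-prefix dispatcher that sends beta traffic to the new implementation while every other path continues to hit the classic one. All paths and response bodies below are hypothetical:

```python
def classic_app(environ, start_response):
    # Stand-in for the existing (classic) interface.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"classic search results"]

def beta_app(environ, start_response):
    # Stand-in for the reimplemented arXiv-NG interface.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"arXiv-NG search results"]

def dispatch(environ, start_response):
    # Serve the beta in parallel: /beta/* goes to the new code,
    # while every other path keeps hitting the classic implementation.
    path = environ.get("PATH_INFO", "")
    if path == "/beta" or path.startswith("/beta/"):
        return beta_app(environ, start_response)
    return classic_app(environ, start_response)
```

Both implementations stay live side by side, so testers can compare results for the same query and the classic interface is never interrupted while feedback comes in.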
After a final push to address issues that arose during beta testing, and to ensure that all of our goals have been met, software developed for the current milestone goes into production. Depending on the features involved, we may either immediately replace user-facing interfaces, or deprecate them with a concrete timeline for removal.