In 2016 we decided that we wanted to explore some new product ideas. We’d seen great success with Channels, but we’d also seen customers building other kinds of connected experiences that we thought we could also help them with.
As part of this process, we built some new infrastructure around our APIs that allowed us to prototype in different ways. We also built a new dashboard application to be the entry point for these new products.
We’ve just finished the first phase of a project to unify the dashboard, which provides a seamless experience for customers whether they need Channels or Beams.
We don’t think any company ever intends to have two dashboards, but that is exactly the situation we found ourselves in. We wanted to reflect on how this happened and explain the steps we have taken to remedy it.
Why did we have two dashboards?
In the beginning, Pusher had a single product: Channels. We built a dashboard in Ruby on Rails for customers to manage their apps. This served us well for a number of years, but it also started to develop some common problems:
- We had stopped using Ruby/Rails in other systems we were developing. We had lost some of the skills.
- The system had developed tech debt and feature creep that made it slow to work with.
When we started developing new products, we needed a dashboard for customers to manage them.
To build this quickly we decided to implement a new system based on technologies we were more familiar with at the time: React and Express, backed by Go microservices in front of Postgres databases. In hindsight, much of the backend system suffered from the second-system effect.
A deeper incentive for building a new system was Conway’s Law. The engineering team had grown significantly around this time. What was once a single high-context team was now multiple product teams. The team tasked with building dashboard functionality for the new products did not have much contact with the engineers who were most experienced with the old dashboard. The path of least resistance was to build something new rather than bridge team boundaries. In other words, the systems we were producing reflected the product teams that had formed.
The new dashboard solved our short-term problem (the new products had a dashboard). Our long-term intention was to build Channels support into the new dashboard, but we had not laid out a strategy for how we would achieve this.
It looked like this
The new dashboard grew and matured. Before long it was 40k lines of code. Over time we became increasingly frustrated by the pain caused by the two separate dashboards:
- Our customers who wanted to use Channels and any of the new products had to create separate accounts. The UX was different in each dashboard.
- Internally, having two systems meant more maintenance and more skills that we needed in the engineering team.
Interestingly, most of the advice against rewrites mentions the case where the new system fails to be delivered, but not the case where the new system and old system both exist indefinitely, each implementing a subset of functionality. We don’t think we’re the first company in the software industry to experience this problem!
We attempted some quick wins to improve the experience for our customers. We built a system to sync accounts between the two systems. We added a “product switcher” that made it easier to move between dashboards. These helped, but we knew these incremental steps were small hops between local optima. Meanwhile these mitigations made the system more complex and less reliable.
We needed to make a larger leap.
How we unified them
The obvious solution would be to finish the rewrite and implement all the Channels functionality in the new dashboard.
There were a number of reasons we decided against this:
- We decided the microservice architecture of the new dashboard was more complex than the requirements justified. We wanted to go back to a single Rails app.
- There were fewer features in the new dashboard than the old dashboard, so there was less to re-implement.
- The old dashboard was more reliable. It had four more years of battle testing than the new one.
Having said that, there were aspects of the new dashboard we liked. For example, some of the pages were much more effective as a single page app. We’re planning to slowly bring these ideas into the new “old” dashboard.
Ironically, although we wanted to move away from the microservice architecture, it made deprecating it much simpler. We could wrap the microservices of the new dashboard with the old dashboard, and then incrementally remove them. We could avoid one large data migration.
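The wrapping idea can be pictured as a thin facade inside the Rails app that is the only code allowed to talk to a legacy microservice. The following is a minimal sketch under stated assumptions, not the code described in this post: `BeamsInstances`, `LegacyInstancesClient`, and `LocalInstancesStore` are hypothetical names, and the in-memory classes stand in for an HTTP client and a database table.

```ruby
# Hypothetical stand-in for an HTTP client that calls a legacy microservice.
class LegacyInstancesClient
  def initialize(records)
    @records = records
  end

  def list_for(user_id)
    @records.select { |r| r[:owner_id] == user_id }
  end
end

# Hypothetical stand-in for data that now lives in the main app's own database.
class LocalInstancesStore
  def initialize(records)
    @records = records
  end

  def list_for(user_id)
    @records.select { |r| r[:owner_id] == user_id }
  end
end

# Facade used by the rest of the app. Callers never learn which backend
# answered, so a backend can be swapped without touching the pages.
class BeamsInstances
  def initialize(backend)
    @backend = backend
  end

  def list_for(user_id)
    @backend.list_for(user_id).map { |r| r[:name] }.sort
  end
end

records = [
  { name: "staging", owner_id: 1 },
  { name: "production", owner_id: 1 },
  { name: "other", owner_id: 2 }
]

wrapped  = BeamsInstances.new(LegacyInstancesClient.new(records))
migrated = BeamsInstances.new(LocalInstancesStore.new(records))
```

Because both backends expose the same `list_for` method, each microservice can be retired on its own schedule by changing which backend the facade receives.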
Our primary concern with this release was security: for example, making sure that only Beams instance owners and collaborators can view and update the instances they own or collaborate on. We also had to re-implement the Beams billing system, which we were particularly concerned about getting right. One of the benefits of moving to the “old” dashboard was that a lot of the existing battle-tested access control functionality could be re-used. In addition, we found that writing acceptance tests with Capybara was an effective way of increasing our confidence.
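The owner-or-collaborator rule is small enough to express as a single predicate. This is an illustrative sketch only: the `Instance` struct, its fields, and `can_access?` are hypothetical names chosen for the example, and a real Rails app would typically hang this check off its existing models and controllers.

```ruby
# Hypothetical model of a Beams instance with one owner and some collaborators.
Instance = Struct.new(:id, :owner_id, :collaborator_ids)

# Returns true only when the user owns the instance or collaborates on it.
# An anonymous user (nil id) is always refused.
def can_access?(user_id, instance)
  return false if user_id.nil?

  instance.owner_id == user_id || instance.collaborator_ids.include?(user_id)
end

instance = Instance.new("abc", 1, [2, 3])
```

Keeping the rule in one place means every page that shows or updates an instance can call the same check, and an acceptance test only has to cover the owner, collaborator, and stranger cases once.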
While the entire unification project was a big leap, we wanted to work as incrementally as possible. We didn’t think it was feasible to move the dashboard over page by page, but we successfully employed some techniques for reducing risk and the time to the initial releases. These are listed below.
Top three recommendations for unifying systems
Merge all changes to master, behind feature flags
To reduce risk we decided to release each page to production as soon as we implemented it. We kept them behind a feature flag so that they were only visible to Pusher employees. We liked being able to easily share the work we were doing and get early feedback.
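As a rough illustration of an employee-only flag, here is a minimal sketch. The `FeatureFlags` class, the `:beams_pages` flag name, and the `employee` field are all assumptions made for the example rather than details of our implementation.

```ruby
# Hypothetical user record; `employee` marks internal staff accounts.
User = Struct.new(:email, :employee)

# Minimal flag registry: a flag is off, on for employees only, or on for everyone.
class FeatureFlags
  def initialize(flags)
    @flags = flags
  end

  def enabled?(name, user)
    # Unknown flags default to :off so unfinished pages stay hidden.
    case @flags.fetch(name, :off)
    when :everyone then true
    when :employees then user.employee
    else false
    end
  end
end

flags = FeatureFlags.new(beams_pages: :employees)
staff = User.new("dev@example.com", true)
customer = User.new("customer@example.com", false)
```

With this shape, releasing a page to all customers is a one-line change from `:employees` to `:everyone`, with no new deploy of the page itself.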
Duplicate, release, then abstract
When merging two systems, it is tempting to create abstractions around common functionality as you go. This could be internal data models, or concepts that are exposed to the end user, e.g. “Channels apps and Beams instances are superficially similar – could we combine them?”. We think you should avoid this initially. It is hard to get abstractions right the first time: the rule of three suggests waiting until you have seen a pattern repeated before extracting it. Instead: duplicate; release; then abstract. You will ship value sooner. The abstractions can be added incrementally later on.
Don’t change functionality
It is tempting to make UX improvements to features as you port them over. We think you should avoid this. Before you know it you will be wrapped up in product debates about how exactly it should be improved. And where does one stop with these improvements? Instead you should copy the existing system as closely as possible, and maintain a list of potential improvements that you could work on later. Not changing functionality also has the advantage that the same tests can potentially be run against the old and new systems.
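The point about running the same tests against both systems can be sketched as one shared check applied to each implementation. Everything here is hypothetical: the two lambdas stand in for the same page rendered by the old and new systems, and `check_instance_page` is an invented helper name.

```ruby
# Hypothetical stand-ins for the same page served by the old and new systems.
old_system = ->(instance_name) { "Instance: #{instance_name}" }
new_system = ->(instance_name) { "Instance: " + instance_name }

# One shared check, written once and run against every implementation.
def check_instance_page(render)
  output = render.call("production")
  raise "unexpected output: #{output.inspect}" unless output == "Instance: production"

  true
end

results = [old_system, new_system].map { |system| check_instance_page(system) }
```

If the ported page changed behaviour, the shared check would need two sets of expectations, which is one more reason to copy first and improve later.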
If you are considering a rewrite, or need to bring divergent or duplicate systems together, it’s worth considering these lessons:
- It’s easy (and necessary) to create prototypes or start new projects without a view on their final integration with legacy systems. This shouldn’t be allowed to go on too long.
- Greenfield projects are so productive that they can lure us into a false sense of heading in the right direction (e.g. productivity gains vs working on existing systems).
- Favour incremental improvements to existing systems where possible. If you decide a rewrite is necessary, there are incremental approaches (e.g. the strangler fig application) that are less risky than a full rewrite.
- The systems we build really do reflect the shape of our organisation, for good or bad (i.e. our real life validation of Conway’s Law).
- The latest and greatest technology (microservices, SPAs) doesn’t always solve the fundamental issues with accrued complexity.
Where we are now
We’re now in a position where our customers have a seamless dashboard experience, and we have one fewer 40k LoC system to maintain. It has only been released for a week, but other than a couple of minor issues that have now been resolved, it has been working great!
Earlier I mentioned our challenges with Conway’s Law in the engineering team. This is something we have improved considerably. The dashboard is always a tricky case because the tech stack is quite different to our other systems, so it requires specialist skills. Despite this we have seen success fighting silos by structuring our team around projects (transient squads) rather than systems. But this is a story for another blog post!
The dashboard unification project is part of a larger plan to improve this important part of the customer experience. In the future we will be working on:
- Unifying Beams and Channels billing
- Organisations and projects
- Improving access controls
- More analytics
If you are a user of our dashboard we’d love to hear your feedback on the recent changes or ideas for the future.