If you are a tech person like me, you’ll probably share the sentiment: soft stuff like writing a blog gets pushed into the darkest corner of the schedule in the hope that it will just go away. Well, it won’t, and that’s not fair to you, our customers and partners, so I promise to deliver quarterly reports on what we’ve been up to and what’s ahead, and perhaps to share some juicy details.
What we’ve been up to
We ended 2016 heavily scarred after the infamous Black Friday incident, when we hit a snag with our storage subsystem in Frankfurt. Although no customer data was lost, it took a while to bring everything back to normal. Morale was low, and customers were angry. However, we pulled ourselves together and moved on.
Early in 2017, we contacted 42on B.V., a consultancy based in the Netherlands, and agreed to hold an intense two-day brainstorming session on a few things: how to avoid meltdowns in the future and how to make the storage faster. We designed a completely new cluster, got rid of the underperforming Ceph caching layer, bought new storage boxes, and went to the data center for installation. The results were amazing compared to the previous design: I/O latency was no longer unpredictable (we removed the tens of thousands of lines of caching-layer code that each I/O had to traverse), load was significantly reduced (more and faster SAS disks, better CPUs), IOPS were much higher thanks to NVMe backing, and power consumption was, in a way, lower thanks to the change from BASE-T to fiber-based networking. Without waiting, we performed the same upgrade in Chicago, and later that same year in Johannesburg.
Our network infrastructure had been growing organically over the past ten years, so it’s no wonder that we found ourselves surrounded by semi-random switches, routers, transceivers, and cables. We would bump into unexplainable packet drops, latency spikes, protocol compatibility issues, and so on. It did not fit our goal of automating router and switch configuration changes either, since developers would have needed to write yet another compatibility layer. We redesigned the data center network topology and, backed by experience and knowing what we wanted, decided to standardize on Juniper hardware. As of this writing, Frankfurt, Chicago, and Johannesburg have all been upgraded to the new spec. In addition to the hardware upgrades, we have considerably expanded our connectivity with local Internet exchanges and Tier 1 network providers, which brings not only higher availability but also a better price per Mbps for the end user.
Data Center Upgrades
In retrospect, it feels like something out of science fiction, but in between everything else we also built and migrated all services in two locations: São Paulo, Brazil, and Johannesburg, South Africa. Both were “legacy” locations where we owned only parts of the infrastructure. Not having direct contracts with Tier 1 upstream providers and not being able to offer a full set of Cloud services hampered our growth. With the upgrades complete, we have launched our standard suite of services, which was not previously available in either location: Enterprise Cloud, Virtuozzo, IP Transit, and Colocation. We also managed to cut end-user bandwidth costs and improve compute and storage performance thanks to the hardware upgrade.
All that shiny metal is not much good without automation. Our in-house development team has been as busy as ever. Here is a quick glimpse of their achievements this quarter.
Virtuozzo 7 Automation
For some time, we ran OpenVZ, an open-source cousin of Virtuozzo®, in production. It was good, but as our focus shifted towards business clients and partners, so did the requirements for this virtualization platform. The top requests from our customers were a newer kernel, faster backups, better OS support, and a stable tunnel module, while our engineering team wanted more efficient compute utilization, automated datastore management, and an improved management panel. Our dev team designed and developed a production-ready automation system in 8 months. The system is written in Go, and we rely heavily on AMQP for global message queuing and processing. It enables us to provide not only a full set of services, such as capacity monitoring, statistics collection, smart placement, and storage and backup management, but also better management capabilities for our system admin team. Most importantly, though, we’ve exposed all of this functionality via a publicly available API.
Our flagship hypervisor-based VM product has been in production for two years now and has quickly become a de facto standard for long-term infrastructure deployments. Developing this platform is a strategic goal for us, so we have been continuously releasing improvements:
– Template Builder, which keeps our operating system templates up to date with the latest security patches and improvements.
– CPU and memory hot-plugging, which allows our customers to upgrade these resources on supported OSes on the fly, without a system restart.
– Custom ISO upload, which allows you to install an operating system image or an application of your choice for maximum flexibility.
– Bandwidth consumption monitoring and alerting, which helps keep track of consumable resources.
– Lastly, updated backend logic that places VMs on dedicated NUMA nodes for improved VM performance and better resource utilization.
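To illustrate the idea behind the NUMA-aware placement mentioned in the last item, here is a minimal sketch in Python. It is not our production logic: the `NumaNode` class, the `place_vm` function, and the best-fit rule are all assumptions made up for this example. It only shows the general principle of keeping a VM's CPUs and memory within a single NUMA node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NumaNode:
    """Hypothetical view of one NUMA node's remaining capacity."""
    node_id: int
    free_cpus: int
    free_mem_gb: int
    vms: List[str] = field(default_factory=list)


def place_vm(nodes: List[NumaNode], name: str, cpus: int, mem_gb: int) -> Optional[int]:
    """Place a VM entirely on one NUMA node; return the node id, or None if none fits."""
    candidates = [n for n in nodes if n.free_cpus >= cpus and n.free_mem_gb >= mem_gb]
    if not candidates:
        return None
    # Best fit (an assumption for this sketch): pick the node that is left
    # with the fewest spare CPUs, to limit fragmentation across nodes.
    best = min(candidates, key=lambda n: (n.free_cpus - cpus, n.free_mem_gb - mem_gb))
    best.free_cpus -= cpus
    best.free_mem_gb -= mem_gb
    best.vms.append(name)
    return best.node_id


nodes = [
    NumaNode(0, free_cpus=8, free_mem_gb=64),
    NumaNode(1, free_cpus=16, free_mem_gb=128),
]
first = place_vm(nodes, "vm-a", cpus=6, mem_gb=32)
second = place_vm(nodes, "vm-b", cpus=12, mem_gb=96)
third = place_vm(nodes, "vm-c", cpus=8, mem_gb=16)
```

In this sketch, a VM that does not fit on any single node is rejected rather than split across nodes. A real scheduler would more likely fall back to another host.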
One of the strategic goals of our development has been to expose as much of our infrastructure platform’s functionality as possible programmatically via an API. Being a professional company at our core, we deal with APIs every hour of every day: the website checkout talks to the billing system, the billing system talks to the automation platform, automation talks to machines and network gear, the monitoring system talks to the chat system, and so on. All of this is possible thanks to a set of rules, definitions, and protocols that allow developers to build applications or integrations between otherwise unrelated systems. The Heficed API allows anyone with an account to grab an API token and control the full lifecycle of an Enterprise Cloud or Virtuozzo environment. We have also added reverse DNS control and IP management functionality. This means you can integrate virtual machine provisioning and management into your existing virtualized infrastructure workflow. We followed RESTful principles while developing our API endpoints. Please give it a try and let us know what you’d like to see improved.
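As a rough illustration of what token-based provisioning over a RESTful API can look like, here is a minimal Python sketch that only builds a request and does not send it. The base URL, the `/cloud/servers` path, the JSON field names, and the Bearer authorization scheme are all placeholders assumed for this example, not the documented API. Please check the API documentation for the real endpoints and parameters.

```python
import json
import urllib.request

# Placeholder base URL, assumed for illustration only.
API_BASE = "https://api.example.com/v1"


def build_create_vm_request(token: str, hostname: str, cpus: int, memory_mb: int) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks for a new VM."""
    # Field names below are hypothetical.
    payload = json.dumps(
        {"hostname": hostname, "cpus": cpus, "memory_mb": memory_mb}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/cloud/servers",  # hypothetical resource path
        data=payload,
        method="POST",
        headers={
            # Bearer scheme is an assumption; the real API may differ.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_create_vm_request("YOUR_API_TOKEN", "web-01", cpus=2, memory_mb=4096)
# Passing req to urllib.request.urlopen() would actually send it; that step
# is left out here because the endpoint above is only a placeholder.
```

The same pattern (a resource URL, an HTTP verb, and a token header) would apply to the other lifecycle operations, such as deleting a VM or updating a reverse DNS record.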
WHMCS Partner Module
Having an API allowed us to take our Partner Program to the next level. We have released an open-source module for WHMCS, the popular web hosting automation platform, which, after a few configuration steps, lets you offer Cloud and Virtuozzo services to your customers. You can find it on GitHub; it comes with documentation and with support from our developers in case you need assistance or have a feature request. Currently, it’s tailored for WHMCS 7 and PHP 7.
We strive to improve the user experience of our services, which are accessible via the Client Area. Big things are coming there, but in the meantime:
– If you have Premium Dedicated servers with us, you have probably noticed that the server dashboard now has a similar look and feel to the rest of our Cloud services: an informative dashboard with specifications and an event log, power and console control buttons, and a statistics tab.
– The Virtuozzo dashboard now provides more details on resource utilization and has a much more effective out-of-band management console, an improved event log, and a fast scheduled backups section.
– Interaction with support organizations has improved, and we now also show your account manager’s information.
– By popular request, we’ve added bulk reverse DNS management (with IPv6 support!).
It is almost mid-year, and we are well into several projects set to be delivered this year. Here is a quick glimpse of what is to come in the short to medium term.
– We have just launched Virtuozzo, improved our backup system, and massively improved networking in LAX (Los Angeles).
– Backups in SAO (São Paulo) are on their way. I know they are overdue, but we believe we have now fixed delivery and import to Brazil, so it should be fast from here on. We will also re-enable the local Internet Exchange in São Paulo for better connectivity and, hopefully, better Mbps pricing for you.
– We are launching an improved traffic accounting system in all of our locations. We’ll be able to track per-upstream bandwidth consumption with much finer granularity, which will let us report bandwidth consumption more transparently and let you choose which kind of bandwidth you need more of (e.g., more local traffic vs. expensive international traffic).
– We are deploying hardware firewalls in major locations, which we’ll virtualize to provide hardware firewall as a service, configurable via the Client Area.
– We are removing all single points of failure from our infrastructure. It’s a daunting task, but we are committed to getting rid of them and are already well into the process.
– Service status page: we’ll launch a site (and perhaps provide an API endpoint) for checking the health of our services, such as Billing, API, virtualization, storage, and other systems.
– Virtual Private Cloud: a much-awaited product that will allow you to take control of all infrastructure aspects and have better inter-VM communication.
– Instant Bare-metal over API: we are about to launch an Instant Bare-metal product in multiple locations around the globe, which you’ll be able to order via API in a matter of minutes. Essentially, it is the simplicity of Cloud on shared-nothing boxes. Stay tuned for an announcement very soon.
– Premium Dedicated Servers management console proxy: we’ll make it easier to grab the out-of-band IPMI console, without the need to open an additional page.
– Virtual Private Cloud interface and management: although we’ll offer a Managed VPC with its own console, we want to integrate it with the rest of our products in the Client Area.
– BYO-IP process simplification: for the times when you want to bring your own IPs to use on our infrastructure, we’ll simplify the process so that there’s as little human interaction as possible.
– Host1Plus SDK: to help you get started with our API quickly, we’ll provide libraries in popular programming languages.
– Client Area rewrite: we are getting ready to move the Client Area interface off legacy APIs and onto our own API. This will allow us to enable two-factor authentication, multiple user levels and roles, and other long-awaited features. Design consistency and UX are also a top priority.
– Enterprise Cloud improvements: introducing storage tiers and system backups.
Please join our community on the feature-voting page and let us know if you’d like to see a product improvement. We might add it to our list! What would you like to read about in the next quarterly report?
GDPR: please be assured that we are working very hard to make sure your (personal) data is secure. There will be a few announcements over the coming weeks on how and why we handle it the way we do in order to provide the best service. In the next quarterly blog update, I will write a section on the challenges we faced and the solutions we put in place for the greater good.