This is a text version of my talk given at DevopsConf on 2019-10-01 and at SPbLUG on 2019-09-25 (slides).
I would like to share the story of a project that used a custom configuration management solution. The migration away from it took 18 months. You might ask: ‘Why?’. Some answers are below, and they are mostly about changing processes, agreements, and workflows.
The infrastructure was a bunch of standalone Hyper-V servers. To create a VM, we had to perform several actions:
The process was only partially automated. Unfortunately, we had to track resource usage and VM placement manually. Fortunately, developers were able to change a VM's configuration in the git repo, reboot the VM and, as a result, get a VM with the desired configuration.
I guess the original idea was IaC: a bunch of stateless VMs that reset their state after every reboot. What did it look like?
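As a rough sketch (not the project's actual configuration), a stateless CoreOS VM could be described with a cloud-config like this; the unit name and script path are made up for illustration:

```yaml
#cloud-config
# Illustrative only: every unit and path below is hypothetical.
coreos:
  units:
    - name: provision.service
      command: start
      content: |
        [Unit]
        Description=Re-provision this VM from the git repo on every boot
        After=network-online.target

        [Service]
        Type=oneshot
        # The VM pulls its desired configuration and re-applies it,
        # so any manual drift disappears after a reboot.
        ExecStart=/opt/bin/apply-config.sh
```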
There were some flaws:
You could say that there was no CM at all: just a pile of loosely organized bash scripts and systemd unit files.
It was a standard environment for development and testing: Jenkins, test environments, monitoring, a registry, etc. CoreOS was created by its developers as an underlying OS for k8s or Rancher, so we had a problem: we were using a good tool in the wrong way. The first step was to determine the desired technology stack. Our idea was:
The next step was to capture the requirements and establish contracts, or, in other words, to have Agreements as Code. The path had to be: manual actions -> mechanization -> automation.
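For example, an agreement can be as simple as a reviewable YAML file instead of a wiki page. The sketch below is entirely hypothetical and only shows the idea:

```yaml
# group_vars/test_env.yml (hypothetical): an agreement like
# "every test VM gets 2 CPUs, 4 GB RAM and a standard naming scheme"
# encoded as code that can be reviewed and versioned.
vm_defaults:
  cpu: 2
  memory_mb: 4096
  name_pattern: "test-{{ team }}-{{ service }}"
```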
There were several processes. Let us look at each of them separately.
It was no picnic. Creating a VM on Hyper-V from Linux was a bit tricky:
We decided not to reinvent the wheel and used Packer:
It worked quite simply:
Agreements as Code was not enough for us. The amount of IaC kept growing, and the agreements kept changing, so we faced the problem of keeping the team's knowledge about the infrastructure in sync. The solution was to test our Ansible roles. You can read about that process in Test me if you can. Do YML developers Dream of testing Ansible?, or, more generally, in How to test Ansible and don’t go nuts.
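For reference, a minimal Molecule scenario for testing an Ansible role (the approach covered in those articles) can look roughly like this; the Docker driver and base image here are assumptions, not necessarily what we used:

```yaml
# molecule/default/molecule.yml (sketch): spin up a container,
# apply the role with Ansible, then verify it with Testinfra.
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: instance
    image: ubuntu:18.04
provisioner:
  name: ansible
verifier:
  name: testinfra
```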
As I mentioned, our infrastructure was like a living creature: it was alive, it was growing, it was changing. As part of that process and of the development process, we had to research whether it was possible to run our application inside Openshift/k8s. For details, it is better to read Let’s deploy to Openshift. Unfortunately, we were not able to reuse Openshift inside the development infrastructure.
Hyper-V & SCVMM were not user-friendly for us. There was a much more interesting thing: Windows Azure Pack, an SCVMM extension. It looked like Windows Azure and provided an HTTP REST API. Unfortunately, in reality it was an abandoned project; however, we still spent time researching it.
Windows Azure Pack looked interesting, but we decided it was too risky to use, so we stayed with SCVMM.
As you can see, a year later we had the foundation to start the migration. The migration had to be S.M.A.R.T. We created a list of VMs and started yak shaving: dealing with the old VMs one by one, creating Ansible roles for them and covering the roles with tests (a typical step is sketched below).
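A typical migration step turned an imperative bash snippet into a declarative, testable Ansible task. The role and names below are illustrative, not taken from the project:

```yaml
# roles/jenkins/tasks/main.yml (sketch): what used to be a bash script
# ("apt-get install ... && systemctl enable ...") becomes idempotent tasks
# that can be covered by Molecule tests.
- name: Install Jenkins
  apt:
    name: jenkins
    state: present

- name: Ensure Jenkins is enabled and running
  systemd:
    name: jenkins
    state: started
    enabled: true
```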
The migration was a fairly deterministic process. It followed the Pareto principle: