This follows on from my previous blog post, which covered the basics of getting started with Amazon’s cloud computing service, EC2. In this post I want to go into a lot more detail about the issues I had automating Windows networking on EC2.
I have very little knowledge about networking, but I quickly realised that each networked machine would need the IP Address of the domain controller (DC). For this reason I start up my EC2 network in several steps:
- Start up the domain controller machine instance and wait for it to become ready for use.
- Get the IP Address of the DC.
- Start the rest of the machines, passing them the IP Address using the runRequest.UserData property.
The second step can be done manually by monitoring the machine instance in Elasticfox and then pasting the IP Address into the C# code used to start the rest of the machines. Alternatively, it can be automated by running the ec2-describe-instances.cmd command line utility and parsing its output to pick out the IP Address. This IP Address then gets passed to every subsequent machine using the runRequest.UserData property as follows:
runRequest.UserData = Convert.ToBase64String(Encoding.ASCII.GetBytes(dcIpAddress));
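To sanity-check what actually gets sent, here is a minimal sketch of the same encoding together with its inverse. The class name, the method names and the sample address 10.0.0.10 are all made up for illustration; only the Convert and Encoding calls come from the line above.

```csharp
using System;
using System.Text;

static class UserDataExample
{
    // Encode the DC's IP address exactly as the launch code above does.
    public static string Encode(string dcIpAddress)
    {
        return Convert.ToBase64String(Encoding.ASCII.GetBytes(dcIpAddress));
    }

    // Reverse the encoding, to confirm what a launched machine should receive.
    public static string Decode(string userData)
    {
        return Encoding.ASCII.GetString(Convert.FromBase64String(userData));
    }

    static void Main()
    {
        string dcIpAddress = "10.0.0.10"; // hypothetical address, for illustration only
        string userData = Encode(dcIpAddress);
        Console.WriteLine("Encoded user data: " + userData);
        Console.WriteLine("Decoded again:     " + Decode(userData));
    }
}
```

The launched machine should not need to call anything like Decode itself, because the user-data URL hands back the original text rather than the Base64 form.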
The only thing that needs doing now is to make sure that each machine launched reads in this IP Address and joins the domain. To do this I wrote a C# executable, copied it to several of my personal AMIs and configured Windows Server to run it at start-up. Maybe there’s an easier, more intuitive way to do this, but I chose the executable and didn’t hit too many problems.
The only major problem was getting all my stuff to play nicely with the ec2config service, which runs on all EC2 Windows machines and seems to do important stuff like giving each machine a unique name (based on its internal IP address) and probably lots of networking stuff that I don’t pretend to understand. One thing I learned early on was that attempting to rename my machines to something more memorable than IP-0AE456B0 (for example) was very bad, as it seemed to start a recurring reboot issue. I never managed to figure this one out, and the Amazon support forums seemed to contain several unresolved threads of this nature. Similarly bad things seemed to happen when my C# code attempted to join the machine to the domain.
After a lot of trial and error I came to the conclusion that the ec2config service was causing the issues and that I would be better off not trying to compete with it. So I changed my start-up executable to detect whether it was the first boot of the machine and to follow a different code branch depending on the answer.
First boot? (detected by the absence of a file called firstboot on the C: drive)
- Write a file “C:\firstboot” so that my code knows after the next reboot that it’s free to do its work.
- Set the ec2config service to Manual so that it will not be running after the next boot.
- Wait a few minutes and if the ec2config service has not itself started a reboot, then force a reboot.
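The first-boot branch above could be sketched roughly as follows. This is not my original executable, just a minimal reconstruction under some assumptions: the service name passed to sc.exe is "ec2config" as I call it in this post (the registered name on your AMI may be spelled differently), "a few minutes" is taken to be five, and the --apply switch is a hypothetical safety catch so that running the program without it only reports which branch would be taken.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

static class FirstBootExample
{
    // True when the marker file is missing, i.e. this is the first boot.
    public static bool IsFirstBoot(string markerPath)
    {
        return !File.Exists(markerPath);
    }

    // Create the (empty) marker file so the next boot takes the other branch.
    public static void WriteMarker(string markerPath)
    {
        File.WriteAllText(markerPath, string.Empty);
    }

    // Run a command-line tool, wait for it to finish and return its exit code.
    static int Run(string fileName, string arguments)
    {
        var info = new ProcessStartInfo(fileName, arguments) { UseShellExecute = false };
        using (var process = Process.Start(info))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }

    // The three first-boot steps: marker file, service to Manual, reboot.
    public static void PrepareFirstBoot(string markerPath, string serviceName)
    {
        WriteMarker(markerPath);
        // "start= demand" is how sc.exe spells the Manual start-up type.
        Run("sc.exe", "config " + serviceName + " start= demand");
        // Give ec2config a chance to reboot the machine by itself first.
        Thread.Sleep(TimeSpan.FromMinutes(5)); // assumed value for "a few minutes"
        // Still running, so ec2config did not reboot: force it.
        Run("shutdown.exe", "/r /t 0 /f");
    }

    static void Main(string[] args)
    {
        const string markerPath = @"C:\firstboot";
        bool apply = args.Length > 0 && args[0] == "--apply"; // hypothetical switch
        if (!IsFirstBoot(markerPath))
        {
            Console.WriteLine("Not the first boot: carry on with the domain join.");
        }
        else if (apply)
        {
            PrepareFirstBoot(markerPath, "ec2config"); // assumed service name
        }
        else
        {
            Console.WriteLine("First boot detected (dry run, nothing changed).");
        }
    }
}
```

Writing the marker before touching the service means that even if ec2config reboots the machine halfway through, the next boot still goes down the domain-joining branch.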
Not the first boot -> Do the domain joining
- Check that we’re not already on example.com – if we are, exit the application.
- Create an HttpWebRequest object for http://169.254.169.254/latest/user-data. The corresponding HttpWebResponse will contain the IP Address of the domain controller that we specified in the machine start-up code.
- Use “netsh int ip set dns "local area connection 2" static <insert_ipaddress_here> primary” to set the primary DNS server on the local machine to the IP Address of the domain controller.
- Use “netdom join %computername% /d:example.com /UD:email@example.com /PD:password” to actually join the domain. Note that I have set up a user account on my domain controller called “sqlmonitortest” that has the necessary privileges to do this kind of stuff.
- Use “net localgroup Administrators /ADD firstname.lastname@example.org” in order to make that user a local administrator. This is a huge time saver if, like me, your AUT (Application Under Test) requires access to this machine.
- Because I’m primarily interested in SQL Servers and most of my machines have SQL instances, I use ADO.NET to create a login for the sqlmonitortest account and then add that login to the ‘sysadmin’ server role (this is overkill for what my AUT requires but much, much less of a headache):
CREATE LOGIN [example\sqlmonitortest] FROM WINDOWS;
EXECUTE sp_addsrvrolemember 'example\sqlmonitortest', 'sysadmin';
- Force a reboot using “shutdown /r /t 0 /f”.
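The first three steps of the domain-joining branch could be sketched along these lines. Again, this is a reconstruction rather than my actual executable: the class and method names are invented, the five-second timeout is an assumption, and the check that the user data parses as an IP Address is an extra safeguard I've added here. The sketch only builds and prints the netsh command instead of running it. You can pass an IP Address as the first argument to try it away from EC2; otherwise it asks the user-data URL, which only answers from inside an instance.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.NetworkInformation;

static class DomainJoinExample
{
    const string UserDataUrl = "http://169.254.169.254/latest/user-data";

    // True if the machine already reports the given DNS domain name.
    public static bool IsOnDomain(string domainName)
    {
        string current = IPGlobalProperties.GetIPGlobalProperties().DomainName;
        return string.Equals(current, domainName, StringComparison.OrdinalIgnoreCase);
    }

    // Fetch the user data supplied at launch (the DC's IP Address).
    public static string ReadUserData(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = 5000; // milliseconds; assumed value
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd().Trim();
        }
    }

    // Build the netsh arguments that point the primary DNS at the DC.
    public static string BuildDnsArguments(string connectionName, string dcIpAddress)
    {
        IPAddress parsed;
        if (!IPAddress.TryParse(dcIpAddress, out parsed))
        {
            throw new FormatException("User data is not an IP address: " + dcIpAddress);
        }
        return "int ip set dns \"" + connectionName + "\" static " + dcIpAddress + " primary";
    }

    static void Main(string[] args)
    {
        if (IsOnDomain("example.com"))
        {
            Console.WriteLine("Already on example.com, nothing to do.");
            return;
        }
        try
        {
            string dcIpAddress = args.Length > 0 ? args[0] : ReadUserData(UserDataUrl);
            Console.WriteLine("netsh " + BuildDnsArguments("local area connection 2", dcIpAddress));
        }
        catch (WebException ex)
        {
            // Expected when not running on an EC2 instance.
            Console.WriteLine("Could not read user data: " + ex.Message);
        }
    }
}
```

The netdom, net localgroup and shutdown steps can be launched in the same way as netsh, with Process.Start and the command lines quoted in the list above.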
Once your machine has surfaced after its second reboot, it’s on the network and ready to go. Any subsequent reboots won’t change anything, as the ec2config service is still set to Manual and my executable detects that it’s now on example.com and exits. Maybe this is a long-winded way to automate networking on EC2, but I really struggled to find information on how to do this. The majority of the support threads I visited seem to be geared towards Linux users.
Overall, it’s not the most intuitive process, but once I stopped fighting ec2config it wasn’t too difficult to get working.
Next, in the third and final part, I will cover using CloudWatch to monitor EC2 instances…