Datacenter in a box

Network setup

Running multiple VMs on one bare metal system like a datacenter.

Today I want to tell you something about my “datacenter in a box” setup that I’m currently running on my Root Server at Hetzner.

The base system is a bare metal EX root server with 32GB RAM and 2×2TB hard disks, running SLES 11 SP3 with KVM support.

On top of that, there are multiple virtual machines up and running that connect to the world via an internal gateway. This gateway runs SuSEFirewall2 (to forward some ports directly to some VMs), xinetd (for port redirection) and haproxy (mainly for WWW pages).

The virtual machines just use a private network – all external traffic is handled by the gateway.
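
The gateway details are material for another blog entry, but to make the picture concrete: a simple port redirection on the gateway via xinetd could look like this (a hypothetical example entry – the service name and the external port are made up, the target IP is the www1 VM from the network definition below):

# /etc/xinetd.d/www1-redirect – hypothetical example
service www1-redirect
{
    type        = UNLISTED
    socket_type = stream
    protocol    = tcp
    port        = 8080                # external port on the gateway
    redirect    = 192.168.155.10 80   # forward to the www1 VM
    wait        = no
    user        = root
    disable     = no
}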

I defined three RAID 1 arrays:

  1. /boot with 1 GB space (/dev/md0)
  2. swap with 16GB (/dev/md1) – might be reduced next time, but as long as I have some space, who cares 😉
  3. the “rest” (/dev/md2 with ~1.8TB), set up as an LVM volume group named “lss” (I’m using the machine name here – using something like “system” might be more common, but you might run into trouble once you mount the disks in another PC)

On the third RAID 1, the first logical volume is called /dev/mapper/lss-root with a size of 200GB. The rest of the space is used for the virtual machines. Each virtual machine gets an 8GB root filesystem, named after the internal machine name and the target (the command line looks like “lvcreate -n pgsql1_root -L 8G lss” – which results in /dev/mapper/lss-pgsql1_root).
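
For reference, the storage commands boil down to something like this (a minimal sketch – device names and sizes as described above; YaST may already have done the pvcreate/vgcreate part during installation):

# on the bare metal host: carve the third RAID 1 into LVM volumes
pvcreate /dev/md2
vgcreate lss /dev/md2               # volume group named after the machine
lvcreate -n root -L 200G lss        # host root -> /dev/mapper/lss-root
lvcreate -n pgsql1_root -L 8G lss   # per-VM root -> /dev/mapper/lss-pgsql1_root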

The first task was to install a “template” virtual machine:

  1. lvcreate -n template_root -L 8G lss
  2. start YaST -> Create virtual machines -> install an openSUSE 13 based system
  3. All disks use the “virtio” disk bus and the “unsafe” cache mode (“unsafe” means: most of the data that should end up on disk is cached in RAM first, which will result in data loss if the Hetzner data center has a power outage or my bare metal machine crashes. So far I benefit more from the performance than I see problems – but you might choose a safer configuration).
  4. The network uses the “virtio” device model.
  5. I added a serial device, type “pty”, to be able to connect via “virsh console <machinename>” later.
  6. After installation of the minimal system, adapt the boot command line to contain “console=tty0 console=ttyS0,115200n8 elevator=noop” (especially elevator=noop is important for performance: the host already schedules the I/O on the real disks, so the guest should not reorder it a second time).
  7. I also did some other basic configuration: set DAILY_TIME="19:30" in /etc/sysconfig/cron; configured /etc/ntp.conf to use/allow only the gateway as source; restricted SSH to ssh-keys only; configured /etc/securetty and systemd for the serial console; defined the correct update repositories (also pointing to the gateway here, to save some bandwidth with all the virtual machines running the same OS). All of this could also be done by Ansible (or any other configuration management tool), but having it in the template makes the setup of new machines easier.
  8. After that (and an initial “zypper up” to get the latest updates installed), I rebooted the system and checked that the serial console works and the machine gets the right IP address. The address assignment is done by the following configuration in /etc/libvirt/qemu/networks/internal.xml:

<network>
  <name>internal</name>
  <uuid>fceffece-47c6-4848-bda2-b6668e9999aa</uuid>
  <forward mode='nat'/>
  <bridge name='internal' stp='on' delay='0'/>
  <mac address='52:54:00:6e:ce:e1'/>
  <ip address='192.168.155.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.155.200' end='192.168.155.254'/>
      <host mac='fe:54:00:15:66:d1' name='gw' ip='192.168.155.1'/>
      <host mac='52:54:00:e0:b2:9c' name='mail' ip='192.168.155.40'/>
      <host mac='52:54:00:d0:a3:9d' name='www1' ip='192.168.155.10'/>
      <host mac='52:54:00:81:09:9e' name='www2' ip='192.168.155.30'/>
      <host mac='52:54:00:15:66:9f' name='mysql' ip='192.168.155.20'/>
      <host mac='52:54:00:e6:03:a0' name='pgsql' ip='192.168.155.50'/>
      <host mac='52:54:00:a5:bc:a1' name='www3' ip='192.168.155.70'/>
      <host mac='52:54:00:b5:a6:a2' name='www4' ip='192.168.155.80'/>
      <host mac='52:54:00:f4:62:a3' name='monitor' ip='192.168.155.100'/>
    </dhcp>
  </ip>
</network>

  9. Just link that file to /etc/libvirt/qemu/networks/autostart/, so the network gets started automatically during boot of the bare metal machine.
  10. (Do the same later with the virtual machine definitions stored in /etc/libvirt/qemu/<machinename>.xml: create a symlink in /etc/libvirt/qemu/autostart/ to get them started during boot – but not for the template machine.)
  11. As I use NAT in libvirt, I need to set up a bridge on the bare metal OS. Normally YaST should do this automatically for you. If not, here is my /etc/sysconfig/network/ifcfg-br0 definition:

BOOTPROTO='static'
BRIDGE='yes'
BRIDGE_FORWARDDELAY='0'
BRIDGE_PORTS='eth0'
BRIDGE_STP='off'
IPADDR='144.76.220.231/27'
IPADDR_0='2a01:4f8:200:93e6::2/64'
PREFIXLEN='27'
STARTMODE='onboot'
USERCONTROL='no'
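
Putting steps 9 to 11 together, the corresponding commands on the bare metal host look like this (a short sketch – the www1 symlink is just an example for a later production machine):

# autostart the internal network (step 9)
ln -s /etc/libvirt/qemu/networks/internal.xml /etc/libvirt/qemu/networks/autostart/
# autostart a production VM later (step 10) – but never the template
ln -s /etc/libvirt/qemu/www1.xml /etc/libvirt/qemu/autostart/
# bring up the external bridge (step 11) and start the internal network once by hand
ifup br0
virsh net-start internal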

  12. As my machines run on just a single bare metal host, I decided to make use of all the features of my real CPUs and copied the host CPU configuration (that is done very easily via the graphical virt-manager). Maybe I will remove the display and mouse definitions later, but they do no harm, and I can connect via virt-manager to get a graphical interface if I like. My final “template.xml” machine definition looks like this:

<domain type='kvm'>
  <name>template</name>
  <uuid>59851e51-9513-952b-2b31-fcde3f703b43</uuid>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static' current='2'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-1.4'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Haswell</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='tm2'/>
    <feature policy='require' name='est'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='osxsave'/>
    <feature policy='require' name='smx'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='ds'/>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='dtes64'/>
    <feature policy='require' name='abm'/>
    <feature policy='require' name='ht'/>
    <feature policy='require' name='acpi'/>
    <feature policy='require' name='pbe'/>
    <feature policy='require' name='tm'/>
    <feature policy='require' name='pdcm'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='require' name='ds_cpl'/>
    <feature policy='require' name='rdrand'/>
    <feature policy='require' name='f16c'/>
    <feature policy='require' name='xtpr'/>
    <feature policy='require' name='monitor'/>
  </cpu>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='unsafe' io='native'/>
      <source dev='/dev/lss/template_root'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <interface type='network'>
      <mac address='52:54:00:19:13:ab'/>
      <source network='internal'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </memballoon>
  </devices>
</domain>
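
If you create or edit such a definition by hand, a quick round trip with the standard virsh commands verifies everything (leave the console again with Ctrl+]):

virsh define /etc/libvirt/qemu/template.xml   # (re-)register the machine definition
virsh start template
virsh console template                        # works thanks to the serial pty device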

  13. Once I want to create a new virtual machine, I now just have to do the following steps:
    1. Create the additional logical volume(s): “lvcreate -n www1_root -L 10G lss“.
    2. Make sure that the template machine is not running.
    3. Go into virt-manager and “clone” the template system onto the new virtual root volume (Storage -> Clone this disk -> Details -> enter the path to the new LV here). virt-manager will warn you that cloning will overwrite an existing “file”, which is correct here.
    4. [Screenshot: cloning a VM to a logical volume in the virt-manager UI.]

      After the cloning has finished, you can start your new machine. It will get an IP address from the dynamic pool – until you add its MAC address to the network configuration listed above. I configure my productive virtual machines with static IP addresses at the moment, which allows me to define specific firewall rules for them (something for an additional blog entry).

  14. Now the firewall and the network configuration on the gateway need adaptation:
    1. All my virtual machines are in the “DMZ” zone of SuSEFirewall2 – so add an entry like FW_DEV_DMZ="internal […] vnet9" to assign the newly created virtual interface to this DMZ.
    2. Add an entry like FW_MASQ_NETS="192.168.155.150/32,0/0,tcp,80" to allow the VM to reach other web servers on the internet, or an entry like FW_MASQ_NETS="192.168.155.150/32" to NAT the machine completely (the addresses must come from the internal 192.168.155.0/24 network defined above). You can of course skip this if the machine itself should not reach the internet at all. Others can reach services on the machine via a redirect from haproxy or xinetd – but this is again something for another blog entry (especially as this one is already very long 😉).
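
To summarize the whole workflow for a new machine, here is a compact sketch (the machine name “www9” and its IP are hypothetical; the firewall variables are the ones quoted above):

# 1. storage for the new machine
lvcreate -n www9_root -L 10G lss
# 2. clone the stopped template – CLI alternative to the virt-manager dialog
virt-clone --original template --name www9 --file /dev/mapper/lss-www9_root
# 3. optional: pin a static IP by adding a <host/> entry to the internal network
virsh net-edit internal
# 4. gateway firewall in /etc/sysconfig/SuSEfirewall2:
#    FW_DEV_DMZ="internal ... vnet9"                 (new vnet interface into the DMZ)
#    FW_MASQ_NETS="192.168.155.150/32,0/0,tcp,80"    (let the VM reach external port 80)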