Author Archive

Using SSH ProxyCommand to Tunnel Connections

Monday, March 16th, 2009

My systems are usually configured to allow ssh connections only from a small set of trusted hosts or a bastion host. This is decent security practice but can be a pain when you want to scp a file or grab the stdout of a command from a host outside the trusted area. It can also be problematic if you have hosts on a private subnet and only one host (a bastion host or jump box) to get in through. This method enables transparent access to a host while tunneling through another host behind the scenes. No modification of the server is required; it just involves a few adjustments to .ssh/config (the ssh client config file) in your home directory.

Here’s how it works

A connection is established to the bastion host

+-------+            +--------------+
|  you  | ---ssh---> | bastion host |
+-------+            +--------------+

The bastion host runs netcat to establish a connection to the target server

+--------------+                +--------+
| bastion host | ----netcat---> | server |
+--------------+                +--------+

Your client then connects through the netcat tunnel and reaches the target server

+-----+                  +--------------+                +--------+
| you |                  | bastion host |                | server |
|     | ===ssh=over=netcat=tunnel======================> |        |
+-----+                  +--------------+                +--------+

So there are 3 things we need to have happen behind the scenes:

1. SSH to the bastion host.
2. Run a netcat command on the bastion host.
3. Connect through the netcat tunnel.

Here’s how to use the ssh proxycommand

#~/.ssh/config
 
Host superchunk.example.org
	ProxyCommand  ssh user@bastion.example.org nc %h %p

In the above we are telling ssh that when it establishes a connection to superchunk.example.org, it should do so using the stdin/stdout of the ProxyCommand as its transport. The ProxyCommand in turn tells the system to first ssh to our bastion host and from there open a netcat connection to host %h (the hostname supplied to ssh) on port %p (the port supplied to ssh).
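
The same trick extends to every host on a private subnet with a wildcard Host pattern. Here’s a minimal sketch; the *.internal.example.org domain and the user name are placeholders for your own naming scheme:

#~/.ssh/config
 
Host *.internal.example.org
	ProxyCommand  ssh user@bastion.example.org nc %h %p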

The result is a connection as if you were connecting from a trusted host:

$ ssh superchunk.example.org
Password: 
user@superchunk.example.org's password: 
Last login: Wed Jun 25 12:05:47 2008 from 10.0.0.221
[user@superchunk ~]$

Now you may be wondering why it prompted me for two passwords. This is because we are effectively sshing into two systems, one right after the other. This can be resolved through the use of pre-shared ssh keys or with more advanced methods such as kerberos ticket forwarding.
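
For example, a key pair installed on both hosts does away with the prompts; a quick sketch, assuming the same user and host names as above:

# generate a key pair (accept the defaults, optionally set a passphrase)
$ ssh-keygen -t rsa
# install the public key on the bastion host and on the target
$ ssh-copy-id user@bastion.example.org
$ ssh-copy-id user@superchunk.example.org

Since the ProxyCommand applies to anything that rides over ssh, the second ssh-copy-id (and any later scp) already travels through the tunnel.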

More info about ssh proxycommand

For more detail you can read the full ssh_config man page here: http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man5/ssh_config.5.

LACP – How to Configure Network Bonding in Linux

Friday, March 13th, 2009

(Update: Feb 5, 2010 – I even more recently obtained a Cisco IOS switch and have included the configuration bits for IOS below.)

I recently obtained a Dell PowerConnect 5224 Gigabit switch which has the ability to combine multiple twisted-pair or fiber Ethernet links into one fault-tolerant and load-balanced logical link. It also appears that its configuration syntax is very similar to that of a Cisco switch. In Linux this is called bonding; in switches it’s commonly referred to as a port channel. Either way, it’s using the LACP (802.3ad) protocol behind the scenes.

Configuring the switch for LACP bonding


Cisco IOS switch LACP configuration

Enabling LACP across two ports in IOS is pretty straightforward. The first thing to do is associate the ports with the channel-group. This is good to do early so that when you apply switchport parameters to the Port-channel interface it automagically applies them to the GigabitEthernet interfaces.

Here are the relevant portions of my running configuration.

interface Port-channel2
 description LACP Channel for mk2
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 1,2
 switchport mode trunk
 spanning-tree portfast trunk
!
interface GigabitEthernet1/0/23
 description mk2 eth0
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 2 mode active
!
interface GigabitEthernet1/0/24
 description mk2 eth1
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 2 mode active
!

Dell PowerConnect switch LACP configuration

The Dell switch configuration is surprisingly easy. A port-channel is automatically created when the Linux host brings up its bond interface(s). Just figure out which ports you want to use for your bond and enable LACP on them. I used ports 1/23 and 1/24 (ports 23 & 24 on switch 1).

Vty-0#config
Vty-0(config)#interface ethernet 1/23
Vty-0(config-if)#lacp
Vty-0(config-if)#exit
Vty-0(config)#interface ethernet 1/24
Vty-0(config-if)#lacp
Vty-0(config-if)#exit

‘show run’ now indicates that the selected ports are LACP enabled.

Vty-0#show run
building running-config, please wait.....
...
!
interface ethernet 1/23
 switchport allowed vlan add 1 untagged
 switchport native vlan 1
 lacp
!
interface ethernet 1/24
 switchport allowed vlan add 1 untagged
 switchport native vlan 1
 lacp
!
...

At this point your port-channel will be down. Don’t worry; it will automagically come up when the Linux host brings up the bond interface. You can verify that it’s down by issuing the following:

Vty-0#show interfaces status port-channel 1
% Trunk 1 does not exist.

Note: This assumes you have no pre-existing port-channels. If you do have other port-channels configured, increment the port-channel number to one more than the number of already defined port-channels.

Configuring the Linux host for LACP bonding:


There are a few places where you define the parameters of the bond. The kernel module defines the protocol, frequency and other attributes of the low-level bond channel configuration. The command ifenslave will create a bond device and allow you to manage the Ethernet devices within it (add/remove, etc.). Finally, the network address configuration is handled by ifconfig, consistent with most other network interfaces in Linux. Luckily most of this is taken care of automatically by the networking init scripts.
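
To make the moving parts concrete, here is the whole thing done by hand; a sketch only (the init script configurations below accomplish the same result, and the address matches the examples later in this post):

# load the bonding driver in LACP mode
modprobe bonding mode=4 miimon=100 lacp_rate=1
 
# assign an address and bring the bond up
ifconfig bond0 10.0.0.80 netmask 255.255.255.0 up
 
# enslave the physical interfaces to the bond
ifenslave bond0 eth0 eth1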

Linux Kernel Module Configuration


LACP is referred to in Linux as bonding mode 4, so we need to tell the kernel module to use this bonding mode. We’ll also pass it a few other parameters, like the frequency at which to scan for changes in link status.

Add the following to your module config file; the per-distribution locations are shown below. This will pass the options to the kernel module the next time it is loaded.

Red Hat and CentOS Kernel Module Configuration

#/etc/modprobe.conf
 
alias bond0 bonding
options bond0 miimon=100 mode=4 lacp_rate=1

RHEL 6 and CentOS 6 Kernel Module Configuration

#/etc/modprobe.d/bonding.conf
 
#Deprecated syntax
#alias bond0 bonding  
 
#Updated syntax
alias netdev-bond0 bonding
options bonding miimon=100 mode=4 lacp_rate=1

Debian Kernel Module Configuration

# /etc/modules: kernel modules to load at boot time.
 
bonding mode=4 miimon=100 lacp_rate=1

Ubuntu Kernel Module Configuration

# /etc/modprobe.d/bonding.conf
 
options bonding mode=4 miimon=100 lacp_rate=1

Gentoo Kernel Module Setup

#/etc/modules.autoload.d/kernel-2.6
 
bonding miimon=100 mode=4 lacp_rate=1

Linux Network Configuration


Red Hat and CentOS Network Setup

#/etc/sysconfig/network-scripts/ifcfg-eth0
 
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
#/etc/sysconfig/network-scripts/ifcfg-eth1
 
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
MASTER=bond0
SLAVE=yes
#/etc/sysconfig/network-scripts/ifcfg-bond0
 
DEVICE=bond0
IPADDR=10.0.0.80
NETMASK=255.255.255.0
BROADCAST=10.0.0.255
GATEWAY=10.0.0.1
ONBOOT=yes
BOOTPROTO=none
USERCTL=no

Debian / Ubuntu Network Setup

#/etc/network/interfaces 
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
 
auto bond0
iface bond0 inet static
	address 10.0.0.80
	gateway 10.0.0.1
	broadcast 10.0.0.255
	netmask 255.255.255.0
	up /sbin/ifenslave bond0 eth0 eth1
	down /sbin/ifenslave -d bond0 eth0 eth1

*Note* This depends on the ifenslave package; to install it, run the following:

apt-get install ifenslave

Ubuntu 10.04 and newer support an updated interfaces(5) syntax:

#/etc/network/interfaces 
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
 
auto eth0
iface eth0 inet manual
    bond-master bond0
 
auto eth1
iface eth1 inet manual
    bond-master bond0
 
auto bond0
iface bond0 inet static
    address 10.0.0.80
    gateway 10.0.0.1
    netmask 255.255.255.0
    bond-mode 802.3ad
    bond-miimon 100
    bond-lacp-rate 1
    bond-slaves none

Gentoo LACP bonding Setup

#/etc/conf.d/net
 
config_eth0=( "null" )
config_eth1=( "null" )
 
slaves_bond0="eth0 eth1"
 
config_bond0=( "10.0.0.80/24" )

We also need to create a symlink in /etc/init.d for the new bond0 interface and turn off eth0 as it is controlled by the bond now. The following will disable eth0 and enable bond0 on boot.

  cd /etc/init.d
  ln -s net.lo net.bond0
  rc-update del net.eth0 default
  rc-update add net.bond0 default

Now you can bring up the bond interface.

  /etc/init.d/net.bond0 start

Checking the Status of the bonded LACP interface


You can check the status of your bond now from within Linux by using the /proc and /sys interfaces into the Linux bond driver.

$ cat /proc/net/bonding/bond0  
 
Ethernet Channel Bonding Driver: v3.1.1 (September 26, 2006)
 
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
 
802.3ad info
LACP rate: fast
Active Aggregator Info:
	Aggregator ID: 1
	Number of ports: 2
	Actor Key: 17
	Partner Key: 1
	Partner Mac Address: 00:77:54:71:a8:6f
 
Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:99:97:60:9d:48
Aggregator ID: 1
 
Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:00:85:60:9d:49
Aggregator ID: 1
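
On reasonably recent kernels the same information is exposed through sysfs, which is handy for scripting; for example:

$ cat /sys/class/net/bond0/bonding/mode
802.3ad 4
$ cat /sys/class/net/bond0/bonding/slaves
eth0 eth1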

You can check the bond from the switch.

Cisco IOS

Switch#show interfaces Port-channel 2
Port-channel2 is up, line protocol is up (connected)
  Hardware is EtherChannel, address is 001b.0dbf.ba17 (bia 001b.0dbf.ba17)
  Description: LACP Channel for mk2
  MTU 1500 bytes, BW 2000000 Kbit, DLY 10 usec, 
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Full-duplex, 1000Mb/s, link type is auto, media type is unknown
  input flow-control is off, output flow-control is unsupported 
  Members in this channel: Gi1/0/23 Gi1/0/24 
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 1d23h, output 00:00:01, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 5000 bits/sec, 7 packets/sec
     1060041 packets input, 193406916 bytes, 0 no buffer
     Received 18241 broadcasts (0 multicast)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 11873 multicast, 0 pause input
     0 input packets with dribble condition detected
     3181997 packets output, 2735804051 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out

Dell PowerConnect:

Vty-0#show interfaces status port-channel 1
Information of Trunk 1
 Basic information: 
  Port type: 1000t
  Mac address: 00-30-F1-71-A8-82
 Configuration: 
  Name: 
  Port admin: Up
  Speed-duplex: Auto
  Capabilities: 10half, 10full, 100half, 100full, 1000full, 
  Flow control: Disabled
 Current status: 
  Created by: Lacp
  Link status: Up
  Port operation status: Up
  Operation speed-duplex: 1000full
  Flow control type: None
  Member Ports: Eth1/23, Eth1/24,

That’s all the configuration work that I needed to perform. I hope it saves you time. I spent a while digging through Dell’s site and the Linux kernel docs to find the right combination of options. Please let me know if you had trouble with these directions or if you have questions. keith (at) backdrift.org

Additional docs on the Linux bonding driver: https://www.kernel.org/doc/Documentation/networking/bonding.txt

Live Migration and Synchronous Replicated Storage With Xen, DRBD and LVM

Sunday, January 11th, 2009

Xen LVM & DRBD Overview


The Xen Hypervisor provides a great deal of flexibility and high availability options when it comes to deploying virtual machines. One of the most attractive features it offers is called live migration. Live migration is the ability to take a running virtual machine (“domU”) and move it from one Xen host server (“dom0”) to another. As you might expect, it is called “Live” because it is done while the virtual machine is on, without any noticeable degradation in performance or availability.

LVM is a logical volume manager for Linux. It allows you to take one or more disks and carve them up into dynamic volumes. Logical volumes are like really flexible partitions. They can be grown, shrunk, snapshotted, renamed, copied and moved around with minimal overhead and without ever needing to re-calculate partition sizes.

The DRBD Project provides storage replication at the block level; essentially it is a network-enabled RAID driver. With DRBD we can take an LVM volume and synchronize its contents on two servers. DRBD supports a multi-master architecture which, as you’ll read, is perfect for Xen.

Combining these technologies provides us with a serious feature set: dynamic volume management, snapshotting, synchronous replication, virtualization and, best of all, live migration. These are the fundamentals of massively expensive enterprise solutions and we’re going to implement them for free.

Architecture and Scalability


For the purposes of this howto I’ll be working with only two systems; however, it is fairly easy to scale this concept up by deploying additional servers. At a point of critical mass you could also begin decoupling the components into a highly available replicated storage system and Xen servers which connect via iSCSI.

Hardware

1x Intel Pentium 4 @
2x 80G SATA Hard Disks
2GB DDR2 RAM
2x Gigabit Ethernet Adapters

Software

CentOS 5.2
kernel-xen-2.6.18-92.1.6.el5
xen-3.0.3-64.el5_2.1
drbd82-8.2.6-1.el5.centos
kmod-drbd82-xen-8.2.6-1.2.6.18_92.1.6.el5

Setting up the DRBD Environment


The first thing DRBD requires is a backing block device, so let’s create an LVM volume for this purpose.

[root@mk1 ~]# lvcreate -L 512M -n vm_shaolin vg0
 
[root@mk1 ~]# lvs
 
LV         VG   Attr   LSize Origin Snap%  Move Log Copy%  Convert
vm_shaolin vg0  -wi-ao 512M

Next, we tell drbd.conf about the LVM block device.

/etc/drbd.conf is DRBD’s main configuration file. It is made up of a number of sub-sections and supports many more options than I currently utilize. man drbd.conf (http://www.drbd.org/users-guide-emb/re-drbdconf.html) does a great job of explaining every possible configuration option and should provide you with hours of entertainment.

#/etc/drbd.conf
common {
  protocol C;
}
 
resource vm_shaolin {
 
  disk      /dev/vg0/vm_shaolin;
  device    /dev/drbd1;
  meta-disk internal;
 
  syncer {
    rate 500M;
    verify-alg sha1;
  }
 
  on mk1 {
    address   10.0.0.81:7789;
  }
 
  on mk2 {
    address   10.0.0.82:7789;
  }
 
}

drbd.conf explained

common {
  protocol C;
}

“protocol” defines the method used to determine that data is synchronized. Protocol C is the safest; it ensures that a write has completed on both sides before reporting success. Other methods are defined in man drbd.conf.

resource vm_shaolin {
  disk      /dev/vg0/vm_shaolin;
  device    /dev/drbd1;
  meta-disk internal;

“resource” starts the section which describes a specific volume. The “disk” is the block device where data is actually stored; in our case this is an LVM volume. “device” is the name of the device presented by DRBD. “meta-disk” defines where the DRBD meta-data is stored. I chose internal because it is the most automatic option. If you want to squeeze out every ounce of IO performance or are paranoid about combining DRBD meta-data and filesystem data on the same block device, you may want to investigate using a separate block device for meta-data storage.

syncer {
  rate 500M;
  verify-alg sha1;
}

“syncer” defines the parameters of the synchronization system. I have upped the rate to 500M (megabytes per second) in order to let DRBD fully utilize the dual gigabit network interfaces I’ve given it. “verify-alg” defines the hashing method used to compare blocks between systems.
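
Since this resource defines a verify-alg, you can run an online verification pass whenever you like to confirm that both nodes really hold the same blocks; a quick sketch:

# start an online verify of the resource (run on one node only)
drbdadm verify vm_shaolin
 
# watch progress and check for out-of-sync blocks
cat /proc/drbd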

on mk1 {
  address   10.0.0.81:7789;
}
 
on mk2 {
  address   10.0.0.82:7789;
}

The “on” statements define parameters specific to each actual host involved in the DRBD cluster. Be aware that these names do need to resolve properly on each host in order for DRBD to start. “address” defines the IP and port DRBD will both listen on and connect to on each server. Make sure that your iptables rules allow access to these ports.
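
A couple of /etc/hosts entries on each node (matching the addresses used above) are enough to keep name resolution from tripping DRBD up:

#/etc/hosts (on both nodes)
10.0.0.81   mk1
10.0.0.82   mk2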

Initializing The DRBD Volume

Now that we’ve defined the working parameters of the DRBD replication subsystem we’re ready to initialize the volume.

We start off by initializing the meta data for the DRBD devices on the first node:

[root@mk1 ~]# drbdadm create-md vm_shaolin
 
v08 Magic number not found
v07 Magic number not found
v07 Magic number not found
v08 Magic number not found
Writing meta data...
initialising activity log
NOT initialized bitmap
New drbd meta data block sucessfully created.

Do the same on the second node.

[root@mk2 ~]# drbdadm create-md vm_shaolin
 
v08 Magic number not found
v07 Magic number not found
v07 Magic number not found
v08 Magic number not found
Writing meta data...
initialising activity log
NOT initialized bitmap
New drbd meta data block sucessfully created.

With the meta-data created we can now attach the drbd device to the backing block device and establish a network connection on both sides. This must be performed on both nodes.

[root@mk1 ~]# drbdadm up vm_shaolin
[root@mk2 ~]# drbdadm up vm_shaolin

Note: “up” is a shorthand command which runs the “attach” command followed by the “connect” command behind the scenes.

Let’s check the status of our volume through the /proc/drbd interface.

[root@mk1 ~]# cat /proc/drbd
1: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 oos:524236

This shows us that the volume is in a connected state and that both nodes are showing up as secondary and inconsistent. This is what we expect to see, as we have not yet put any actual data on the volume. Now we need to synchronize our array; the following only needs to be run on one node.

[root@mk1 ~]# drbdadm -- --overwrite-data-of-peer primary vm_shaolin

Now when we look at /proc/drbd we see progress as the volume is synchronized over the network.

[root@mk1 ~]# cat /proc/drbd
 
1: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
ns:19744 nr:0 dw:0 dr:19744 al:0 bm:1 lo:0 pe:0 ua:0 ap:0 oos:504492
[>....................] sync'ed:  4.0% (504492/524236)K
finish: 0:00:03 speed: 40,944 (40,944) K/sec

Once the volume has finished its synchronization we should see that both sides are showing “UpToDate” device status.

[root@mk1 ~]# cat /proc/drbd
1: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r---
ns:458700 nr:0 dw:0 dr:458700 al:0 bm:28 lo:0 pe:0 ua:0 ap:0 oos:0

Now that we’ve verified the device status we’re ready to promote the volume to “Primary” status on the primary server.

[root@mk1 ~]# drbdadm primary vm_shaolin

We should see this Primary/Secondary status reflected in the /proc/drbd interface.

[root@mk1 ~]# cat /proc/drbd
1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:458700 nr:0 dw:0 dr:458700 al:0 bm:28 lo:0 pe:0 ua:0 ap:0 oos:0

Setting up the Xen Environment


Our volume is now ready for data; we can format it with mkfs or populate it with a pristine block device image. Since this is going to be the root file system of a Xen system, I usually start with a file system image from stacklet. I’ll leave it up to you to get your favorite OS installed on this block device.
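
For example, on the node where the resource is Primary you might do one of the following (the image file name here is just a placeholder):

# create a fresh file system directly on the replicated device
[root@mk1 ~]# mkfs.ext3 /dev/drbd1
 
# or write a prebuilt root file system image onto it
[root@mk1 ~]# dd if=guest-root.img of=/dev/drbd1 bs=1M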

DomU configuration

The Xen virtual machine configuration file is pretty standard. The important piece is that we specify the resource name using the drbd Xen block script provided by the DRBD distribution.

#/etc/xen/configs/shaolin
name    = 'shaolin';
memory     = 512;
maxmem  = 4096;
kernel  = '/boot/xenU/vmlinuz-2.6.16.49-xenU-r1';
disk = [ 'drbd:vm_shaolin,hda1,w' ];
root = '/dev/hda1 ro';
vif = [ 'bridge=xenbr0, mac=a0:00:00:01:00:01' ];

Xend configuration

In order for live migration to work we need to enable it and define the network interfaces, ports and permitted hosts for xend. These configuration steps must be completed on both hosts.

#/etc/xen/xend-config.sxp
(xend-relocation-server yes)
(xend-relocation-port 8002)
(xend-relocation-address '')

“xend-relocation-server” switches the live migration functionality on or off. “xend-relocation-port” defines the TCP port used for incoming relocation; 8002 looks good. “xend-relocation-address” is the address xend listens on for relocation connections; I leave this empty so it listens on all interfaces and then restrict access using iptables.
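
For example, on mk2 you could restrict the relocation port to the peer dom0 (mk1 at 10.0.0.81 in this setup), mirroring the rule with 10.0.0.82 on mk1; a sketch to adapt to your existing ruleset:

# permit relocation traffic from the other dom0 only
iptables -A INPUT -p tcp --dport 8002 -s 10.0.0.81 -j ACCEPT
iptables -A INPUT -p tcp --dport 8002 -j DROP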

Once live migration has been enabled in your xend config you’ll need to restart xend:

# /etc/init.d/xend restart
 
restart xend:                                              [  OK  ]

Verify that xend is listening for relocation connections:

# netstat -nlp |grep 8002
 
tcp        0      0 0.0.0.0:8002                0.0.0.0:*                   LISTEN      4109/python

Starting the Virtual Machine

We can now start our virtual machine using the “xm create” command. This is critical: your virtual machine must be run on only one host at a time. Two virtual machines using the same block device simultaneously will severely corrupt your file system.
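
A cheap sanity check before starting the domain is to make sure it isn’t already running on the other node; no output from the peer means you are clear:

# verify the domU is not running on the peer before starting it here
[root@mk2 ~]# xm list | grep shaolin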

[root@mk1 ~]# xm create /etc/xen/configs/shaolin
Using config file "/etc/xen/configs/shaolin".
Started domain shaolin

To check in on it after it has started, we use the “xm list” command.

[root@mk1 ~]# xm list
 
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0      491     2 r-----    176.6
shaolin                                    1      511     1 -b----      7.1

Migrating the Virtual Machine

Once the virtual machine has been started on the first node we can migrate it over to the second. Run the following command on the first node:

[root@mk1 ~]# xm migrate --live shaolin mk2

It’s normal for this command to have no output. You should now see that your VM is no longer running on the first node.

[root@mk1 ~]# xm list
 
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0      489     2 r-----    208.7

Let’s verify that it migrated over…

[root@mk2 ~]# xm list
 
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     1509     2 r-----    263.8
shaolin                                    3      511     1 -b----     26.2

There it is. You may notice the counters have been reset.

A few notes and precautions about live migration:

  • In order to ensure that your switch knows that your virtual machine’s MAC address is now located on a different port, it is best to generate traffic by pinging out from the virtual machine continually during the migration, as sketched below.
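
For example, from a console inside the guest (10.0.0.1 being the gateway used elsewhere in this post):

# run inside the domU for the duration of the migration
[user@shaolin ~]$ ping 10.0.0.1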
