I first learned of Sheepdog from watching a 2012 presentation on Windows VDI (Virtual Desktop Infrastructure) by the CTO (now CEO) of Atlantis. The 68-minute talk is up on YouTube. I was shocked to learn that Software Defined Storage (SDS) in a distributed architecture with 5+ nodes could boast higher IOPS than enterprise SAN hardware.
At work, I have tested Ceph as a storage backend for Openstack, namely as a backend for Nova ephemeral VMs, Glance images, and Cinder block storage volumes.
According to the documentation from various versions of Openstack (from Juno onwards), the Sheepdog storage driver is supported. For example, here's what the Openstack Kilo docs say about Sheepdog and Cinder:
http://docs.openstack.org/kilo/config-reference/content/sheepdog-driver.html
This driver enables use of Sheepdog through Qemu/KVM.
Set the following volume_driver in cinder.conf:
volume_driver=cinder.volume.drivers.sheepdog.SheepdogDriver
In another post, I talk about setting up Sheepdog as a backend for Openstack Kilo.
Of course, Sheepdog can be used as distributed storage on its own without Openstack. In this post I will cover setting up Sheepdog on Fedora 22/23 and mounting an LVM block device using the sheepdog daemon sheep.
Compile Sheepdog v0.9 from Github
As of May 2016, the upstream version of sheepdog from Github is v0.9.0...
By contrast, the sheepdog package provided by the RDO (Redhat Distribution of Openstack) Kilo repos for Fedora is at version 0.3, which is incompatible with libcpg from corosync 2.3.5 in the default Fedora repos for F22/23.
When trying to start the v0.3 sheep daemon I got the following error in dmesg:
...
[Apr25 14:52] sheep[11897]: segfault at 7fdb24f59a08 ip 00007fdb2ccc7cd8 sp 00007fdb24f59a10 error 6 in libcpg.so.4.1.0[7fdb2ccc6000+5000]
...
As you can see above, the sheep daemon fails to start because of a segfault in libcpg, which is part of corosync.
This issue does not occur, however, when I use the v0.9 sheep daemon.
Here are the steps to compile Sheepdog from the upstream repo on Github:
(1) RENAME OLD SHEEPDOG 0.3 BINARIES
If you have RDO installed on your Fedora machine, sheepdog v0.3 binaries sheep and collie will already exist in /usr/sbin, but when you build sheepdog v0.9, it will install binaries into both /usr/sbin and /usr/bin:
- sheep will be created in /usr/sbin
- dog (the replacement for collie since v0.6) will be created in /usr/bin
To avoid naming conflicts, it's a good idea to rename the old binaries from sheepdog v0.3. You might wonder why I bother renaming the binaries instead of just doing dnf remove sheepdog. The reason is that you cannot simply remove the old package: sheepdog is a dependency of RDO, and even marking the package as "manually installed" and trying to remove it didn't work for me.
mv /usr/sbin/collie /usr/sbin/collie_0.3
mv /usr/sbin/sheep /usr/sbin/sheep_0.3
(2) BUILD FROM UPSTREAM SOURCE
git clone git://github.com/collie/sheepdog.git
sudo dnf install -y autoconf automake libtool yasm userspace-rcu-devel \
corosynclib-devel
cd sheepdog
./autogen.sh
./configure
If you wish to build sheepdog with support for ZooKeeper as the cluster manager (corosync is used by default), pass the --enable-zookeeper flag to configure instead:
./configure --enable-zookeeper
Finally, build and install:
make
sudo make install
Sheepdog 0.9 binaries will be installed into /usr/bin and /usr/sbin, so make sure the old sheepdog binaries in /usr/sbin have been renamed! BTW, there is no collie command in sheepdog v0.9. It has been replaced with dog.
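After installation, it's worth confirming that your shell picks up the new binaries rather than the renamed v0.3 ones; a quick check, based on the install locations mentioned above:
which sheep dog
# should print /usr/sbin/sheep and /usr/bin/dog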
Set Up Corosync
Before starting corosync, you must ensure that your firewall allows the cluster traffic on all the machines you plan to use as sheepdog storage nodes: sheep listens on TCP port 7000, and corosync uses UDP (the mcastport 5405 in the config below). In a simple lab environment, you may be able to get away with temporarily stopping your firewall with systemctl stop firewalld, but don't do this in a production environment!
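For example, with firewalld you could open just those ports; a minimal sketch, assuming the default zone (adjust to your environment):
sudo firewall-cmd --permanent --add-port=7000/tcp        # sheep
sudo firewall-cmd --permanent --add-port=5404-5405/udp   # corosync multicast (see mcastport below)
sudo firewall-cmd --reload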
(1) CREATE COROSYNC CONFIG FILES
sudo vim /etc/corosync/corosync.conf
# Please read the corosync.conf 5 manual page
compatibility: whitetank
totem {
    version: 2
    secauth: off
    threads: 0
    # Note, fail_recv_const is only needed if you're
    # having problems with corosync crashing under
    # heavy sheepdog traffic. This crash is due to
    # delayed/resent/misordered multicast packets.
    # fail_recv_const: 5000
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.95.146
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: yes
    # the pathname of the log file
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
amf {
    mode: disabled
}
For bindnetaddr, use your local server's IP on a subnet that is reachable by the other Sheepdog storage nodes. In my lab environment, my sheepdog nodes are at 192.168.95.{146,147,148}.
This probably isn't necessary, but if you want a regular user myuser to be able to access the corosync daemon, create the following file:
sudo vim /etc/corosync/uidgid.d/myuser
uidgid {
    uid: myuser
    gid: myuser
}
(2) START THE COROSYNC SERVICE
The corosync systemd service is not enabled by default, so enable the service and start it:
sudo systemctl enable corosync
sudo systemctl start corosync
When you check the corosync service status with systemctl status corosync you should see something like this:
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2016-05-09 11:45:06 KST; 1 weeks 4 days ago
Main PID: 2248 (corosync)
CGroup: /system.slice/corosync.service
└─2248 corosync
May 09 11:45:06 fx8350no3 corosync[2248]: [QB ] server name: cpg
May 09 11:45:06 fx8350no3 corosync[2248]: [SERV ] Service engine loaded: corosync...4]
May 09 11:45:06 fx8350no3 corosync[2248]: [SERV ] Service engine loaded: corosync...3]
May 09 11:45:06 fx8350no3 corosync[2248]: [QB ] server name: quorum
May 09 11:45:06 fx8350no3 corosync[2248]: [TOTEM ] A new membership (192.168.95.14...86
May 09 11:45:06 fx8350no3 corosync[2248]: [MAIN ] Completed service synchronizati...e.
May 09 11:45:06 fx8350no3 corosync[2236]: Starting Corosync Cluster Engine (corosync... ]
May 09 11:45:06 fx8350no3 systemd[1]: Started Corosync Cluster Engine.
May 09 11:48:46 fx8350no3 corosync[2248]: [TOTEM ] A new membership (192.168.95.14...88
May 09 11:48:46 fx8350no3 corosync[2248]: [MAIN ] Completed service synchronizati...e.
Hint: Some lines were ellipsized, use -l to show in full.
(3) REPEAT STEPS 1 & 2 ON ALL MACHINES YOU WISH TO USE AS STORAGE NODES
In /etc/corosync/corosync.conf, make sure to change bindnetaddr to each machine's own IP.
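Once corosync is running on every node, you can sanity-check membership from any one of them; a quick check, assuming corosync 2.x where corosync-cmapctl exposes the runtime membership:
corosync-cmapctl | grep members    # the ring IP of every storage node should appear here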
Launch Sheepdog Daemon on LVM Block Device
Sheepdog can use an entire disk as a storage node, but for testing purposes it is easier to create an LVM block device, mount it, and point the sheep daemon at the mountpoint.
(1) CREATE A MOUNTPOINT FOR SHEEP TO USE
sudo mkdir /mnt/sheep
(2) CREATE AN LVM BLOCK DEVICE FOR SHEEPDOG
sudo pvcreate /dev/sdxy
sudo vgcreate VGNAME /dev/sdxy
sudo lvcreate -L nG -n LVNAME VGNAME
where x is a letter (such as a, b, ... z), y is a partition number (1, 2, 3, ...), and n is the size of the logical volume in gigabytes.
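To make that concrete, here is a hypothetical run using a spare partition /dev/sdb1 and the made-up names sheepvg/sheeplv (substitute your own device and names):
sudo pvcreate /dev/sdb1
sudo vgcreate sheepvg /dev/sdb1
sudo lvcreate -L 20G -n sheeplv sheepvg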
(3) CREATE FILE SYSTEM ON LV
sudo mkfs.ext4 /dev/VGNAME/LVNAME
In this example, I created an ext4 file system, but you could use XFS or anything else.
(4) MOUNT BLOCK DEVICE ON MOUNTPOINT
sudo mount /dev/VGNAME/LVNAME /mnt/sheep
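If you want the mount to come back after a reboot, you could also add an fstab entry for it; a sketch, using the same placeholder VG/LV names and file system type as above:
/dev/VGNAME/LVNAME  /mnt/sheep  ext4  defaults  0  2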
(5) RUN SHEEP DAEMON ON MOUNTPOINT
sudo sheep /mnt/sheep
To make sure the daemon is running, you can try pidof sheep, which should return two process IDs.
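You can also confirm that sheep is listening on its default TCP port 7000, assuming the ss tool from iproute is installed:
sudo ss -tlnp | grep 7000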
(6) VERIFY DEFAULT FILES IN SHEEPDOG MOUNT
cd /mnt/sheep
ls
This should show the following files and directories:
config epoch lock obj sheep.log sock
If you don't see anything in the mount point, the sheep daemon failed to start.
(7) REPEAT STEPS 1-6 ON ALL MACHINES TO BE USED AS STORAGE NODES
(8) CHECK SHEEPDOG NODES
Now that sheep has been launched, you should check whether it can see the other sheepdog nodes. Sheepdog commands can be invoked as a regular user.
dog node list
Id Host:Port V-Nodes Zone
0 192.168.95.146:7000 128 2455742656
1 192.168.95.147:7000 128 2472519872
2 192.168.95.148:7000 128 2489297088
The sheepdog daemon should automatically be able to see all the other nodes on which sheep is running (if corosync is working properly, that is).
You can get a list of valid dog commands by just invoking dog without any arguments:
dog
Sheepdog administrator utility (version 0.9.0_352_g3d5438a)
Usage: dog
Available commands:
vdi check check and repair image's consistency
vdi create create an image
...
(9) DO INITIAL FORMAT OF SHEEPDOG CLUSTER
This step only needs to be done once from any node in the cluster.
dog cluster format
using backend plain store
dog cluster info
Cluster status: running, auto-recovery enabled
Cluster created at Mon Apr 25 19:22:14 2016
Epoch Time Version [Host:Port:V-Nodes,,,]
#2016-04-25 19:22:14 1 [192.168.95.146:7000:128, 192.168.95.147:7000:128, 192.168.95.148:7000:128]
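The Copies column in the vdi listing further below shows that my cluster keeps 3 copies of each object, which is sheepdog's default. If you want a different replication level, dog cluster format accepts a copies option; a hedged example, assuming the -c flag in v0.9:
dog cluster format -c 2    # keep 2 copies of each object instead of the default 3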
Convert RAW/QCOW2 Image to Sheepdog VDI Format
(1) INSTALL QEMU
sudo dnf install -y qemu qemu-kvm
(2) CONVERT A VM IMAGE TO SHEEPDOG VDI FORMAT
qemu-img convert -f qcow2 xenial-amd64.qcow2 sheepdog:xenial
In this example, I am converting an Ubuntu 16.04 64-bit cloud image to sheepdog VDI format. Note that cloud images do not contain any default username/password, so it is impossible to log in without first injecting an SSH keypair via cloud-init. This can be done by booting the image in Openstack, selecting a keypair, and then logging into the launched instance through the console in Horizon. Once you are logged in, you can create a user and password; then take a snapshot of the instance and download it for use in qemu or virt-manager.
NOTE: The format for sheepdog images is sheepdog:imgName
Converting a RAW or QCOW2 image to sheepdog format causes a new sheepdog VDI to be created across the distributed storage nodes. You can verify this by navigating to the sheep mountpoint and running ls, but the image file itself won't appear; you will just see a bunch of new object chunks, as this is object storage.
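One rough way to see that data actually landed in the store is to watch the obj directory (listed in step 6 above) grow after a conversion:
du -sh /mnt/sheep/obj    # total size of the stored object chunks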
(3) VERIFY SHEEPDOG VDI CREATION
dog vdi list
Name Id Size Used Shared Creation time VDI id Copies Tag Block Size Shift
xenial 0 2.2 GB 976 MB 0.0 MB 2016-04-25 20:22 4f6c3e 3 22
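You can also ask sheepdog to verify the new VDI's consistency with the vdi check subcommand shown in the dog help output above:
dog vdi check xenial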
(4) LAUNCH VDI VIA QEMU-SYSTEM & X11 FORWARDING
The Sheepdog storage nodes will probably be server machines without Xorg X11 / Desktop Environment installed.
You can still launch qemu-system if you use SSH X11 forwarding.
First, check that the server machine has xauth installed:
rpm -q xorg-x11-xauth
Then, from another machine that has X11 installed, invoke ssh with the -X option to run the remote program against your local X session and start qemu-system:
ssh -X fedjun@192.168.95.148 qemu-system-x86_64 -enable-kvm -m 1024 \
-cpu host -drive file=sheepdog:trusty64-bench-test1
The -m flag designates memory in MB; qemu's default is only 128 MB, so you need to specify this manually if you need more memory.
Note that qemu uses NAT (user-mode networking) for VMs by default. You cannot directly reach the VM from the host, but you can reach the host from inside the VM by SSHing or pinging 10.0.2.2.
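If you do need to reach the guest from outside (for example over SSH), qemu's user-mode networking can forward a host port into the guest. A sketch reusing the command above, assuming a virtio NIC and forwarding host port 2222 to guest port 22:
ssh -X fedjun@192.168.95.148 qemu-system-x86_64 -enable-kvm -m 1024 \
-cpu host -drive file=sheepdog:trusty64-bench-test1 \
-netdev user,id=net0,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=net0
With that in place, ssh -p 2222 to the storage node's IP should land you in the guest.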