For quite some time we have been using ceph-deploy to deploy OSDs in folders during the Ceph trainings held by Netways. This worked perfectly with Jewel, but newer versions no longer allow this behaviour.
There are several reasons for this. However, as we have a quite regulated setup for our training notebooks, we had to come up with some kind of workaround. While this approach works fine in our training setup, it is not recommended for production!
The following steps apply to a current CentOS 7 system.
As stated before, we will deploy an OSD on a block device. Though you could use a separate partition for this, we will use a loop device, which is backed by a file.
First, create an OSD directory:
$ mkdir -p /home/training/my-cluster/osd-$HOSTNAME
$ cd /home/training/my-cluster/osd-$HOSTNAME/
In this folder, create a file for later use:
$ fallocate -l 30G 30GB.img
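To make sure the allocation worked, you can check the file's size with stat. The following is a sketch using a small throwaway file (/tmp/test.img is a hypothetical name; a 1 MiB file stands in for the 30 GB training image):

```shell
# Allocate a small test image and verify its size in bytes.
fallocate -l 1M /tmp/test.img
stat -c '%s' /tmp/test.img
```

For the real image, the same stat call should report 32212254720 bytes for 30G.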
# losetup -P /dev/loop1 "/home/training/my-cluster/osd-$HOSTNAME/30GB.img"
# wipefs -a /dev/loop1
# lsblk
This should then display your new loop device.
As loop devices do not survive a reboot, you need to take a few extra steps. If you prefer to use rc.local for this, you're free to do so.
We're going to create a systemd service which essentially executes the losetup command mentioned above. For this, we need a script containing the command and a .service file which executes the script:
rebloop.sh:

#!/bin/bash
sudo losetup -P /dev/loop1 "/home/training/my-cluster/osd-$HOSTNAME/30GB.img"
and the service file:
rebloop.service:

[Unit]
Description=Reattach loop device after reboot

[Service]
Type=simple
ExecStart=/bin/bash /usr/bin/rebloop.sh

[Install]
WantedBy=multi-user.target
The script has to be executable, and both files must be copied to the correct locations. Afterwards, the service must be enabled and can be started:
# chmod +x rebloop.*
# cp rebloop.sh /usr/bin/rebloop.sh
# cp rebloop.service /etc/systemd/system
# systemctl enable rebloop.service
# systemctl start rebloop.service
Ceph, however, will still refuse to create an OSD on this device and will instead give you the following error message:
--> RuntimeError: Cannot use device (/dev/mapper/<name>). A vg/lv path or an existing device is needed
You have to make changes to /usr/lib/python2.7/site-packages/ceph_volume/util/disk.py on the OSD host:
In line 201, add "or TYPE == 'loop'":
# use lsblk first, fall back to using stat
TYPE = lsblk(dev).get('TYPE')
if TYPE:
    return TYPE == 'disk' or TYPE == 'loop'
and in line 286, change the "skip_loop" switch from "True" to "False":
def get_block_devs(sys_block_path="/sys/block", skip_loop=False):
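If you need to apply these edits on several OSD hosts, they can be scripted. The following is only a sketch: patch_disk_py is a hypothetical helper of our own, the sed patterns assume the unmodified source shown above, and line numbers and the file's location vary between Ceph releases, so always verify the result by hand.

```shell
# patch_disk_py: apply the two edits described above to a disk.py file.
# Hypothetical helper; patterns assume the stock source shown in this post.
patch_disk_py() {
    local f="$1"
    cp "$f" "$f.bak"    # keep a backup before editing
    # treat loop devices like disks
    sed -i "s/return TYPE == 'disk'\$/return TYPE == 'disk' or TYPE == 'loop'/" "$f"
    # stop skipping loop devices when enumerating block devices
    sed -i "s/skip_loop=True/skip_loop=False/" "$f"
}
```

Usage would then be: patch_disk_py /usr/lib/python2.7/site-packages/ceph_volume/util/disk.py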
For testing purposes, simply reboot your system and verify that the loop device gets reattached correctly.
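A quick post-reboot check can be scripted as well. This is a sketch: check_loop is a hypothetical helper; the device and image path are the ones used throughout this post.

```shell
# check_loop: report whether a loop device is attached and backed by the
# expected image file (quick post-reboot sanity check).
check_loop() {
    local dev="$1" file="$2"
    if losetup -a 2>/dev/null | grep -F "$dev:" | grep -qF "$file"; then
        echo "attached"
    else
        echo "not attached"
    fi
}

check_loop /dev/loop1 "/home/training/my-cluster/osd-$HOSTNAME/30GB.img"
```

If this prints "not attached" after a reboot, check the rebloop service with systemctl status rebloop.service.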
If it does, you can deploy an OSD. We're using ceph-deploy here:
$ ceph-deploy osd create --data /dev/loop1 $HOSTNAME
Once the command has been executed successfully on your hosts, you can create your first pool:
# ceph osd pool create rbd 100 100 replicated
examine the Ceph status:
# ceph status
and tag the pool with an application:
# ceph osd pool application enable rbd rbd
As you can see, quite a few changes have to be made, and each of them is error-prone.
Best practice is to simply use a real block device, and for production you really should stick to this.
If, however, there are certain needs to be fulfilled, Ceph can be convinced to comply.