Archive for the ‘SAN’ Category

Online LUN expansion and partition resizing without reboot under Linux

September 30th, 2009 liuk

If you need to expand the LUN where your Linux system is installed without rebooting the server, you may find the answer here. Your Linux must be recent enough to support features such as LVM version 2 and ext3 online resizing.

My setup in this test is the following:

A 64-bit Linux RedHat 5.3 Virtual Machine with a single 300 GB LUN, /dev/sda, in Raw Device Mapping mode (Physical Compatibility mode) under VMware 3.5 and a Compellent SAN. The single disk holds one Volume Group (vg0) with several Logical Volumes.

The SAN guys expanded the LUN online from 300 to 500 GB (a 30-second operation :-) ).

To force a rescan of the device, so that the kernel becomes aware of the new size (supposing your LUN is /dev/sda):

# echo 1 > /sys/block/sda/device/rescan

and then in dmesg you’ll see:

SCSI device sda: 1048576000 512-byte hdwr sectors (536871 MB)
sda: Write Protect is off
sda: Mode Sense: 8f 00 00 08
SCSI device sda: drive cache: write through
sda: detected capacity change from 322122547200 to 536870912000
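
As a quick sanity check on the numbers above: the sector count the kernel reports, multiplied by 512, should match the new capacity in bytes. This is only a minimal sketch, assuming bash; the helper names `sectors_to_bytes` and `bytes_to_gib` are made up for this example and are not part of any tool.

```shell
#!/bin/bash
# Hypothetical helper: convert a count of 512-byte sectors to bytes.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Hypothetical helper: convert bytes to whole GiB (1 GiB = 2^30 bytes).
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}
```

Since /sys/block/sda/size is expressed in 512-byte sectors, `sectors_to_bytes "$(cat /sys/block/sda/size)"` should print 536870912000 in this setup after the rescan, and `bytes_to_gib 536870912000` gives 500.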

At this point you have two choices: expand the partition containing the current volume group, or create a new partition and add it to the current volume group. I much prefer the latter, since resizing a partition with fdisk is a risky operation IMVHO.

Use fdisk to create a new partition of type Linux LVM (0x8e) in the free space. At this point, though, the kernel may refuse to re-read the partition table:

# sfdisk -R /dev/sda
BLKRRPART: Device or resource busy

So, to inform the OS of the partition table changes, use the partprobe(8) command, which comes with the parted package.

Verify in /proc/partitions that the kernel has updated the partition table.
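
The check can be scripted roughly like this. It is a sketch, assuming bash and awk; `part_kib` is a hypothetical helper name, and it relies only on the /proc/partitions column layout (major, minor, #blocks in 1 KiB units, name).

```shell
#!/bin/bash
# Hypothetical helper: print the size in KiB of one device or partition
# from a file in /proc/partitions format (columns: major minor #blocks name).
# Usage: part_kib NAME [FILE]   (FILE defaults to /proc/partitions)
part_kib() {
    awk -v name="$1" '$4 == name { print $3 }' "${2:-/proc/partitions}"
}
```

For example, after `partprobe /dev/sda`, running `part_kib sda3` should print the size of the new partition (assuming it was created as sda3); empty output means the kernel does not see it yet.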

Now run pvcreate /dev/sda3, then vgextend vg0 /dev/sda3, and you are done! Verify with vgdisplay that the Free PE count is consistent with the space you added.

As a further stress test I did the following:

While running this command on a newly created Logical Volume /dev/vg0/TRASHME mounted on /TRASH:
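
To get a rough idea of how many Free PE to expect, divide the added space by the extent size of the volume group. A minimal sketch, assuming bash; `extents_for` is a hypothetical helper, and the 4 MiB extent size used in the example is only an assumption (check the "PE Size" line of vgdisplay for the real value).

```shell
#!/bin/bash
# Hypothetical helper: how many physical extents fit in a given number of bytes.
# $1 = size in bytes, $2 = extent size in MiB (take it from "PE Size" in vgdisplay).
extents_for() {
    echo $(( $1 / ($2 * 1048576) ))
}
```

With the 200 GB added here, `extents_for $((536870912000 - 322122547200)) 4` prints 51200. The real figure will be slightly lower, since partition alignment and LVM metadata take some space.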

# dd if=/dev/zero of=TTTT bs=1024k count=30000
..... dd is running....

I tried an online resize of the Logical Volume and the ext3 filesystem inside it:

# lvextend -L+10G /dev/vg0/TRASHME
Extending logical volume TRASHME to 40.00 GB
Logical volume TRASHME successfully resized
# resize2fs /dev/vg0/TRASHME
resize2fs 1.39 (29-May-2006)
Filesystem at /dev/vg0/TRASHME is mounted on /TRASH; on-line resizing required
Performing an on-line resize of /dev/vg0/TRASHME to 10485760 (4k) blocks.
The filesystem on /dev/vg0/TRASHME is now 10485760 blocks long.

In the meantime dd finished its work:
30000+0 records in
30000+0 records out
31457280000 bytes (31 GB) copied, 175.612 seconds, 179 MB/s

Great! Remember to always back up your data before doing operations like this! YMMV.
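
The figures in the two outputs can be cross-checked with a bit of shell arithmetic. Again just a sketch, assuming bash; `fs_blocks` and `dd_bytes` are hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical helper: number of filesystem blocks in a volume.
# $1 = volume size in GiB, $2 = filesystem block size in bytes.
fs_blocks() {
    echo $(( $1 * 1073741824 / $2 ))
}

# Hypothetical helper: bytes written by dd with bs=1024k and count=$1.
dd_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}
```

`fs_blocks 40 4096` prints 10485760, which is the block count resize2fs reports for the 40 GB volume, and `dd_bytes 30000` prints 31457280000, the byte count in the dd summary.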

((enjoy))

SAN Migration and Linux Storage Device issues

July 1st, 2008 liuk

The scenario is the following:

- SAN migration from an old EMC Clariion CX-500 to a new CX3-40

- 2 Linux RedHat AS 3.0 hosts with Oracle RAC 9.2 and OCFS 1.x

The goal is to migrate all the Oracle data partitions (1 LUN in this case) to the new SAN using EMC SAN Copy.

All the hosts will be powered off (yes, we can live with this…).

The main issues are the following:

- PowerPath will mess up the device naming. If your device on the old SAN was seen as /dev/emcpowera, when you connect the hosts to the new SAN they will probably see the LUN as /dev/emcpowerb. To correct this problem the trick is the following:

  1. Stop PowerPath
  2. cd /etc
  3. mkdir /etc/EMC_BACKUP
  4. /bin/mv emcp_devicesDB.dat emcp_devicesDB.idx powermt.custom /etc/EMC_BACKUP
  5. Restart PowerPath (this will recreate the files you have moved above)
  6. powermt config
  7. powermt check
  8. powermt display dev=all (here you should see your LUN again as /dev/emcpowera)
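
Steps 2 to 4 can be wrapped in a small function so that a missing file does not stop the move half way. This is only a sketch, assuming bash; `backup_powerpath_db` is a hypothetical name, and the index file is assumed to be named like the .dat file (emcp_devicesDB.idx). Stopping and restarting PowerPath is left out on purpose.

```shell
#!/bin/bash
# Hypothetical helper: move the PowerPath device database files out of the way.
# $1 = directory holding the files (normally /etc), $2 = backup directory.
# Run it only while PowerPath is stopped.
backup_powerpath_db() {
    local etc_dir=$1 backup_dir=$2 f
    mkdir -p "$backup_dir" || return 1
    # Assumption: the index file is named like the .dat file.
    for f in emcp_devicesDB.dat emcp_devicesDB.idx powermt.custom; do
        # Skip files that are not there instead of failing half way.
        if [ -e "$etc_dir/$f" ]; then
            mv "$etc_dir/$f" "$backup_dir/" || return 1
        fi
    done
}
```

On a real host the call would be `backup_powerpath_db /etc /etc/EMC_BACKUP`, followed by the PowerPath restart and the powermt commands from steps 5 to 8.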

- A really strange issue that happened to us: we had to force the link speed on the new 4Gbps FC switch to 2Gbps, otherwise the lpfc Linux driver was unable to correctly detect all the I/O devices (the HBAs are quite old Emulex LP9002). Maybe some lpfc_* module parameter was missing, but there was no time to investigate further (and documentation about this is lacking IMVHO…).

((enjoy))

Categories: Linux, SAN

SAN Migration and VMware issues

July 1st, 2008 liuk

The scenario is the following:

- SAN migration from an old EMC Clariion CX-500 to a new CX3-40

- Several VMware ESX 3.x nodes, all with 2 HBAs

The goal is to migrate all the VMware LUNs to the new SAN using EMC SAN Copy.

All the hosts will be powered off (yes, we can live with this…).

The main issue is that when we connect the ESX hosts to the new SAN, they will see all the LUNs as snapshots and will disable access.

The message in /var/log/vmkernel should be similar to this:

Jul  1 12:58:59 esxnode00 vmkernel: 0:01:00:22.568 cpu15:1045)ALERT: \
LVM: 4903: vmhba2:0:6:1 may be snapshot: disabling access. \
See resignaturing section in SAN config guide.
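
To list which LUNs were flagged, the log can be filtered like this. A minimal sketch, assuming bash and sed; `snapshot_luns` is a hypothetical helper, and it assumes each alert sits on a single line in the real log (the backslashes above are only line wraps).

```shell
#!/bin/bash
# Hypothetical helper: list the device paths that vmkernel flagged as
# possible snapshots. $1 = log file (normally /var/log/vmkernel).
snapshot_luns() {
    sed -n 's/.*LVM: [0-9]*: \(vmhba[0-9:]*\) may be snapshot.*/\1/p' "$1" | sort -u
}
```

`snapshot_luns /var/log/vmkernel` prints one path such as vmhba2:0:6:1 per affected LUN, with duplicates removed.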

This seems to be the Right Way to solve the problem:

  1. Be sure that only 1 node has access to the LUNs and that no other node is writing to the LUNs involved
  2. From that node, in the Advanced Settings, enable LVM.EnableResignature (set it to 1)
  3. Rescan all HBAs
  4. All the LUNs will be renamed to /vmfs/volumes/snap-NNNNNNNN-ORIGNAME
  5. Reset LVM.EnableResignature to 0 (this is REALLY IMPORTANT, otherwise you risk corrupting VMFS data)
  6. You have to register all the VMs again, since the UUID has changed
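
Steps 2, 3 and 5 can probably be done from the ESX 3.x service console as well. The sketch below assumes bash and that the esxcfg-advcfg and esxcfg-rescan commands are available there; `run` and `resignature_luns` are hypothetical names, and by default the script only prints the commands so you can review them first.

```shell
#!/bin/bash
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# Hypothetical wrapper around steps 2, 3 and 5; arguments are the HBA names.
resignature_luns() {
    local hba
    run esxcfg-advcfg -s 1 /LVM/EnableResignature || return 1
    for hba in "$@"; do
        run esxcfg-rescan "$hba"
    done
    # Always switch resignaturing off again, even if a rescan failed.
    run esxcfg-advcfg -s 0 /LVM/EnableResignature
}
```

`resignature_luns vmhba1 vmhba2` shows the four commands in order; `DRY_RUN=0 resignature_luns vmhba1 vmhba2` actually runs them. The HBA names here are examples only. Switching the option back to 0 is part of the same function so that step 5 cannot be forgotten.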

You can also use the option AllowSnapshot, but this way you will keep the old UUID, and I don’t like this; I think this option should be used on a DR site.

A really interesting document (PPT) about all this is here.

((enjoy))

Categories: Linux, SAN, VMware