Solaris I/O multipathing gives you the ability to set up multiple redundant paths to a storage system and gives you the benefits of load balancing and failover.
We're talking about fibre-attached storage here. As far as I know, Sun don't yet support multipathing over SCSI connections (the way that the A3500 could with RAID Manager, for example). So you need to be using a SAN, or a new 3510/3511 array, or something like a T3 partner pair.
By default, multipathing is disabled. You need to explicitly enable and configure it.
The software is available in Solaris 10, but for earlier releases you'll need to download and install the SAN Foundation Kit and associated patches.
You need to make sure that your storage is exposed over multiple paths. With a SAN this should simply be a case of having more than one connection. A T3 partner pair should work as is. For a 3510 array, you'll need to set it up first.
Note: The instructions are different for Solaris 10 and earlier releases. Make sure you use the right one!
Solaris 10 is the easier case, because the mpxio capability is built in. You just need to turn it on!
To enable it, edit the file /kernel/drv/fp.conf. At the end it should say:
mpxio-disable="yes";
Just change yes to no and it will be enabled:
mpxio-disable="no";
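If you would rather script the change than edit by hand, the following is a minimal sketch. The function name enable_mpxio is made up for illustration, and it assumes the global setting sits on a line of its own starting with mpxio-disable, as shown above. It keeps a copy of the original file with a .orig suffix.

```shell
#!/bin/sh
# Sketch only: enable_mpxio is a hypothetical helper, not a Solaris command.
# Flips the global mpxio-disable setting in a driver .conf file.
enable_mpxio() {
    conf=$1
    # Keep a copy of the original so the change is easy to back out.
    cp "$conf" "$conf.orig" || return 1
    # Only the global line is touched. Per-port lines begin with name=...
    # and therefore do not match the ^ anchor.
    sed 's/^mpxio-disable="yes";/mpxio-disable="no";/' "$conf.orig" > "$conf"
}

# Example (as root):
#   enable_mpxio /kernel/drv/fp.conf
```

The edit only takes effect after the reconfiguration boot.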
Enabling mpxio enables it everywhere. This includes the internal drives if they're fibre-attached, like on a V880. Now, you can get an extra kit to add a second path to the internal storage, in which case you would want to use mpxio. But in that case you really want to use stmsboot instead. So I'm not going to cover that here; I'm just going to tell you how to disable it on a particular port.
Add a line like the following to the end of /kernel/drv/fp.conf:
name="fp" parent="/pci@8,600000/SUNW,qlc@2" port=0 mpxio-disable="yes";
That's the right entry for a V880 or V890. To find the parent and port numbers, look at the device entry for one of the internal disks:
lrwxrwxrwx 1 root 70 Nov 7 11:45 /dev/dsk/c1t0d0s2 -> ../../devices/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c6ce45a8,0:c
The parent is the bit between /devices and /fp, and the port is the first number after fp@.
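The same extraction can be done with sed. This is a rough sketch: fp_conf_entry is a made-up name, and it assumes the device path has the /devices/.../fp@PORT,... shape shown in the listing above.

```shell
#!/bin/sh
# Sketch only: fp_conf_entry is a hypothetical helper.
# Builds a per-port fp.conf line from a /devices path (or a whole ls -l line).
fp_conf_entry() {
    # \1 = everything between /devices and /fp@ (the parent)
    # \2 = the number straight after fp@ (the port)
    echo "$1" | sed -n \
        's|.*/devices\(/.*\)/fp@\([0-9a-fA-F]*\),.*|name="fp" parent="\1" port=\2 mpxio-disable="yes";|p'
}

# Example:
#   fp_conf_entry "`ls -l /dev/dsk/c1t0d0s2`"
```

For the V880 path above this prints the same line as the example entry. If the path does not match the expected shape, nothing is printed.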
Then do a reconfiguration boot and it will create the multipathed devices for you.
For releases before Solaris 10, the procedure is a bit more complex. You have to start out by installing the software, and you need to edit different files.
Go to Sun's download site and go to the A-Z tab, then look under S for "StorEdge SAN 4.4".
Once you've downloaded and unpacked it, add the packages:
#!/bin/sh
#
# installs the SFS base packages in the correct order
#
pkgadd -d . SUNWsan
pkgadd -d . SUNWcfpl
pkgadd -d . SUNWcfplx
pkgadd -d . SUNWfchba
pkgadd -d . SUNWcfclr
pkgadd -d . SUNWfchbr
pkgadd -d . SUNWfchbx
pkgadd -d . SUNWfcsm
pkgadd -d . SUNWfcsmx
pkgadd -d . SUNWcfcl
pkgadd -d . SUNWcfclx
pkgadd -d . SUNWmdiu
# JNI packages
pkgadd -d . SUNWjfca
pkgadd -d . SUNWjfcax
pkgadd -d . SUNWjfcau
pkgadd -d . SUNWjfcaux
# Emulex packages
pkgadd -d . SUNWemlxs
pkgadd -d . SUNWemlxu
pkgadd -d . SUNWemlxsx
pkgadd -d . SUNWemlxux
Then you need to get the patches and add those:
#!/bin/sh
#
# install SFK patches in the right order
#
patchadd 111847-08
patchadd 113046-01
patchadd 113049-01
patchadd 113039-11
patchadd 113040-17
patchadd 113041-10
patchadd 113042-12
patchadd 113043-12
patchadd 113044-05
patchadd 114476-06
patchadd 114477-03
patchadd 114478-07
# JNI and Emulex patches
patchadd 114878-10
patchadd 119914-05
All the above are for Solaris 9, and newer versions may be available when you're reading this.
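Before rebooting, it can be worth confirming that every patch went on. The sketch below reads showrev -p style output on stdin and prints any required patch that is missing or installed at a lower revision. The name missing_patches is made up. It assumes that each installed patch appears on a line starting with "Patch:" followed by the patch ID, and that your awk accepts -v (on Solaris, set AWK=nawk).

```shell
#!/bin/sh
# Sketch only: missing_patches is a hypothetical helper.
# Assumption: input lines look like "Patch: 111847-08 Obsoletes: ...".
AWK=${AWK:-awk}

missing_patches() {
    $AWK -v required="$*" '
        $1 == "Patch:" {
            # Remember the highest installed revision of each patch base ID.
            split($2, p, "-")
            if (!(p[1] in have) || p[2] + 0 > have[p[1]]) have[p[1]] = p[2] + 0
        }
        END {
            n = split(required, want, " ")
            for (i = 1; i <= n; i++) {
                split(want[i], w, "-")
                # Report the patch if absent or older than required.
                if (!(w[1] in have) || have[w[1]] < w[2] + 0) print want[i]
            }
        }'
}

# Example:
#   showrev -p | missing_patches 111847-08 113046-01 113049-01
```

A newer revision than the one asked for counts as installed, so the check still works if you pick up later patch versions.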
Then reboot.
The steps are the same as for Solaris 10; it's just that the files are different.
Edit /kernel/drv/scsi_vhci.conf and change the line that says:
mpxio-disable="yes";
Just change yes to no and it will be enabled:
mpxio-disable="no";
Enabling mpxio enables it everywhere. This includes the internal drives if they're fibre-attached, like on a V880. Now, you can get an extra kit to add a second path to the internal storage, in which case you would want to use mpxio. But in that case you really want to use stmsboot instead. So I'm not going to cover that here; I'm just going to tell you how to disable it on a particular port.
Add a line like the following to the end of /kernel/drv/qlc.conf:
name="qlc" parent="/pci@8,600000" unit-address="2" mpxio-disable="yes";
Note that both the filename and the syntax are different from the Solaris 10 case.
That's the right entry for a V880 or V890. To find the parent and unit-address, look at the device entry for one of the internal disks:
lrwxrwxrwx 1 root 70 Nov 7 11:45 /dev/dsk/c1t0d0s2 -> ../../devices/pci@8,600000/SUNW,qlc@2/fp@0,0/ssd@w21000011c6ce45a8,0:c
The parent is the bit between /devices and /SUNW,qlc@ (here /pci@8,600000), and the unit-address is the number after SUNW,qlc@.
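As a rough sketch, the qlc.conf line can be generated from the device path with sed. The name qlc_conf_entry is made up, and the pattern assumes the path has the /devices/.../SUNW,qlc@UNIT/fp@... shape shown above.

```shell
#!/bin/sh
# Sketch only: qlc_conf_entry is a hypothetical helper.
# Builds a per-HBA qlc.conf line from a /devices path (or a whole ls -l line).
qlc_conf_entry() {
    # \1 = everything between /devices and /SUNW,qlc@ (the parent)
    # \2 = what follows SUNW,qlc@ up to the next slash (the unit-address)
    echo "$1" | sed -n \
        's|.*/devices\(/.*\)/SUNW,qlc@\([0-9a-fA-F,]*\)/fp@.*|name="qlc" parent="\1" unit-address="\2" mpxio-disable="yes";|p'
}

# Example:
#   qlc_conf_entry "`ls -l /dev/dsk/c1t0d0s2`"
```

For the V880 path above this prints the same line as the example entry.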
Then do a reconfiguration boot and it will create the multipathed devices for you.
Before multipathing, you should see two copies of each disk in format. Afterwards, you'll just see the one copy.
Each multipathed disk gets the next available controller ID and a horrendously long target number, which is the device's GUID. For example:
Filesystem                                        kbytes     used      avail      capacity  Mounted on
/dev/dsk/c6t600C0FF000000000086AB238B2AF0600d0s5  697942398  20825341  670137634  4%        /test
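The long target number is the same identifier that appears (in lower case, after ssd@g) in the scsi_vhci device paths and log messages, so it is handy to be able to pull it out of a device name. A minimal sketch follows. The name guid_of_device is made up, and it assumes the usual cNtGUIDdNsN naming.

```shell
#!/bin/sh
# Sketch only: guid_of_device is a hypothetical helper.
# Prints the lower-case GUID embedded in a multipathed device name.
guid_of_device() {
    # Strip the leading path and cN t, then the trailing dN and optional sN.
    echo "$1" |
        sed 's|.*/c[0-9]*t\([0-9A-Fa-f]*\)d[0-9]*s*[0-9]*$|\1|' |
        tr '[A-F]' '[a-f]'
}

# Example: find log messages about one particular disk
#   grep "ssd@g`guid_of_device /dev/dsk/c6t600C0FF000000000086AB238B2AF0600d0s5`" /var/adm/messages
```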
There are two principal things you can do to check that everything is fine. The first is to look for log messages in /var/adm/messages. As the machine boots, you should see a message like:
Dec 18 11:42:24 vampire mpxio: [ID 669396 kern.info] /scsi_vhci/ssd@g600c0ff000000000086ab238b2af0600 (ssd11) multipath status: optimal, path /pci@9,600000/SUNW,qlc@1/fp@0,0 (fp1) to target address: 216000c0ff886ab2,0 is online. Load balancing: round-robin
If the state changes, it logs a new message. If everything's fine, the status is optimal. If there's a problem with one of the paths (HBA, controller, fibre, or a switch in a SAN), the status will be degraded. That means it's lost one of the paths but is still using another. What you don't want to see is the other message, when it's lost all paths. I think it says failed, but I can't remember (and don't want to get into that state anyway).
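A simple way to keep an eye on this is to filter the log for any multipath status line that is not optimal. The sketch below does that. The name non_optimal_status is made up. It matches on "not optimal" rather than on specific words like degraded or failed, so it does not depend on the exact wording of the bad states.

```shell
#!/bin/sh
# Sketch only: non_optimal_status is a hypothetical helper.
# Prints multipath status messages from a log file that are not "optimal".
# Exit status is 0 if any such line was found, non-zero otherwise.
non_optimal_status() {
    grep 'multipath status:' "$1" | grep -v 'multipath status: optimal'
}

# Example:
#   non_optimal_status /var/adm/messages
```

No output means every logged status was optimal. Remember that an older degraded message may have been followed by a later optimal one once the path came back.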
The other way is to look at luxadm output. Get luxadm to query the device:
root@vampire# luxadm display /dev/rdsk/c6t600C0FF000000000086AB238B2AF0600d0s2
DEVICE PROPERTIES for disk: /dev/rdsk/c6t600C0FF000000000086AB238B2AF0600d0s2
  Vendor:               SUN
  Product ID:           StorEdge 3510
  Revision:             413C
  Serial Num:           086AB238B2AF
  Unformatted capacity: 1397535.000 MBytes
  Write Cache:          Enabled
  Read Cache:           Enabled
    Minimum prefetch:   0x0
    Maximum prefetch:   0xffff
  Device Type:          Disk device
  Path(s):
  /dev/rdsk/c6t600C0FF000000000086AB238B2AF0600d0s2
  /devices/scsi_vhci/ssd@g600c0ff000000000086ab238b2af0600:c,raw
   Controller           /devices/pci@9,600000/SUNW,qlc@1/fp@0,0
    Device Address              216000c0ff886ab2,0
    Host controller port WWN    210000e08b14cc40
    Class                       primary
    State                       ONLINE
   Controller           /devices/pci@9,600000/SUNW,qlc@2/fp@0,0
    Device Address              266000c0fff86ab2,0
    Host controller port WWN    210000e08b144540
    Class                       primary
    State                       ONLINE
As you can see, it lists both paths and shows that they are ONLINE.
(If you have a T3 partner pair, then the output is slightly different. You should have 2 devices, one for each member of the pair. But the Class will be primary for one path and secondary for the other: the paths aren't symmetric and it sets the direct connection as the primary and uses the other one for failover.)
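The luxadm check is easy to script. As a minimal sketch, the following counts the paths reported as ONLINE in luxadm display output read from stdin. The name online_paths is made up, and it assumes each path's state appears on its own line as "State" followed by "ONLINE", as in the output shown above.

```shell
#!/bin/sh
# Sketch only: online_paths is a hypothetical helper.
# Counts "State ONLINE" lines in luxadm display output on stdin.
online_paths() {
    awk '$1 == "State" && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Example: warn if fewer than two paths are up
#   n=`luxadm display /dev/rdsk/c6t600C0FF000000000086AB238B2AF0600d0s2 | online_paths`
#   [ "$n" -ge 2 ] || echo "only $n path(s) online"
```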