
matching Hitachi and AIX FC adapters

Working with dlnkmgr, you quickly discover that it uses its own “numbering” for the AIX host FC adapters (HBAs), which may lead to surprises at the least expected time.
To get the HDS “view”, execute:

# dlnkmgr view -hba -portwwn
HbaID Port.Bus HBAPortWWN IO-Count IO-Errors Paths OnlinePaths
00000 00.05 10000000C985E7C6 2841716579 488 60 60
00001 00.02 10000000C987C25E 2838346781 397 60 60
00002 00.07 10000000C987BEAE 2814428535 1348 60 60
00003 00.01 10000000C987BFA2 2812541903 1373 60 60
KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2012/03/14 13:56:23

To get the AIX view of its HBAs, execute (for example) the command lscfg -vl fcsX | grep Netw for each adapter. With both outputs in hand, you can reconcile them manually, or you can save these few lines of code to do it automatically.

### W.M. Duszyk
### Match AIX versus HDS FC adapters

set -A HDS_FC `dlnkmgr view -hba -portwwn | \
                              grep '^00' | awk '{print $3}'`

for FC in `lsdev -Cc adapter | awk '/fcs/ {print $1}'`
do
          AIXWWPN=`lscfg -vl $FC | grep Netw | \
                                            sed 's/\./ /g' | \
                                            awk '{print $3}'`

          cntr=0    # HDS HbaID counter, restarted for every AIX adapter
          for HDSWWPN in ${HDS_FC[*]}
          do
            [[ $AIXWWPN = $HDSWWPN ]] && \
            { print "AIX $FC ($AIXWWPN) is HDS $cntr ($HDSWWPN)"; \
              break; }
            ((cntr=$cntr + 1))
          done
done

The sample output:

AIX fcs0 (10000000C987BFA2) is HDS 3 (10000000C987BFA2)
AIX fcs2 (10000000C985E7C6) is HDS 0 (10000000C985E7C6)
AIX fcs4 (10000000C987C25E) is HDS 1 (10000000C987C25E)
AIX fcs6 (10000000C987BEAE) is HDS 2 (10000000C987BEAE)
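
The two parsing steps can be rehearsed away from the SAN by feeding saved command output to small filters. What follows is only a sketch: it assumes the Network Address line of lscfg -vl fcsX has the usual dotted layout, the function names wwpn_from_lscfg and hds_id_for_wwpn are made up for this example, and it sticks to POSIX shell and awk so it should behave the same under ksh.

```shell
# Hypothetical helpers; the names are illustrative, not part of HDLM or AIX.

# Read `lscfg -vl fcsX` output on stdin and print the WWPN.
# Assumes a line such as "Network Address.............10000000C987BFA2".
wwpn_from_lscfg() {
    awk '/Network Address/ {
        line = $0
        sub(/.*\./, "", line)       # drop everything up to the last dot
        gsub(/[ \t]/, "", line)     # drop stray whitespace
        print line
    }'
}

# Read `dlnkmgr view -hba -portwwn` output on stdin and print the HbaID
# whose HBAPortWWN equals $1 (prints nothing when there is no match).
hds_id_for_wwpn() {
    awk -v wwpn="$1" '
        $1 ~ /^[0-9]+$/ && toupper($3) == toupper(wwpn) { print $1 }
    '
}

# Possible use on the AIX host:
#   wwpn=`lscfg -vl fcs0 | wwpn_from_lscfg`
#   dlnkmgr view -hba -portwwn | hds_id_for_wwpn "$wwpn"
```

Unlike the counter in the script above, this prints the HbaID exactly as dlnkmgr shows it (00003 rather than 3).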

Posted in HDS, Real life AIX.


san design guide for aix with mpio, sdddrv, pcmdrv

Anybody involved with AIX, SAN, and perhaps booting from SAN may find this PowerPoint presentation very useful. Straight from Dan Braden and John Hock (IBM): SanBoot MPIO SDD SDDPCM Presentation.

Posted in Real life AIX.


identifying HDS ShadowImage disks in mpio environment

In some places, an AIX host mirrors its data across disks from two HDS (Hitachi) SAN controllers. Some of these places associate one set of mirror disks with ShadowImage disks from the same HDS controller. Others maintain a separate set of ShadowImage disks on each HDS controller and activate only the set that seems appropriate at a given backup time. Along the same lines, some places run the ShadowImage based backup on the host that owns the data, while others vary on the ShadowImage based volume group on a backup server instead. In the latter case, the backup server might not have the dlnkmgr software installed; it may just use the AIX built-in MPIO drivers.

In that case, an AIX administrator may occasionally need to validate that the ShadowImage disks associated with a given backup server really come from a specific HDS controller …

Posted in Real life AIX.


recovering “lost” mpio disks

Last week something happened to fabrics, switches and God only knows what else. Some hosts lost some disks, but fortunately, since we mirror across separate fabrics, every volume group withstood the incident. Later, we had to re-acquire the lost or missing disks, which for most hosts was easily done by executing the cfgmgr command. For a few hosts this operation required a few additional steps, which are described here. As the first step, I ran rmdev -dl hdisk3 -R, but unfortunately cfgmgr did not “bring” it back…

# lspv
hdisk0     00cc7261a9f50481          rootvg          active
hdisk1     00cc72617b870835          rootvg          active
hdisk2     00cc7261e4fdc7b7          entoras1_vg     active
hdisk4     00cc7261d207431a          mksysbvg        active

After repeated executions, cfgmgr still cannot deliver the one missing disk that is required to mirror entoras1_vg. Just for “kicks”, I execute the sanscan utility.

# sanscan
sanscan v2.2
Copyright (C) 2010 IBM Corp., All Rights Reserved

Processing FC device:
    Adapter driver: fcs0
    Protocol driver: fscsi0
    Connection type: fabric
    Local SCSI ID: 0x083e00
    Local WWPN: 0x10000000c96ef366
    Local WWNN: 0x20000000c96ef366
Initializing device information...
Scanning SAN...
SCSI ID LUN ID           WWPN             WWNN             Vendor ID Product ID Rev  NACA Qualifier Device Type               Error(s)
080c00  0000000000000000 5005076801402afd 5005076801002afd IBM       2145       0000 yes  Connected Disk [hdisk2, path_id 0]
080c00  0001000000000000 5005076801402afd 5005076801002afd IBM       2145       0000 yes  Connected Disk [hdisk4, path_id 0]
081c00  0000000000000000 5005076801402ba4 5005076801002ba4 IBM       2145       0000 yes  Connected Disk [hdisk2, path_id 1]
081c00  0001000000000000 5005076801402ba4 5005076801002ba4 IBM       2145       0000 yes  Connected Disk [hdisk4, path_id 1]
603d00  0000000000000000 5005076801202a94 5005076801002a94 IBM       2145       0000 yes  Connected Disk [no ODM match]
601d00  0000000000000000 5005076801202d7f 5005076801002d7f IBM       2145       0000 yes  Connected Disk [no ODM match]
4 targets and 6 LUNs found in 0.015390 seconds

Yes, sanscan sees “something”, but cfgmgr is not able to get it. In a situation like this, it is often helpful to remove the parent (physical adapter) and all its children. Before any removal, we verify that at least one additional FC adapter is available (lsdev -Cc adapter | grep FC) and that all disks have paths leading through it.

# lspath
Enabled hdisk0 scsi1
Enabled hdisk1 scsi1
Enabled hdisk2 fscsi0
Enabled hdisk4 fscsi0
Enabled hdisk2 fscsi0
Enabled hdisk4 fscsi0
Enabled hdisk2 fscsi1
Enabled hdisk4 fscsi1
Enabled hdisk2 fscsi1
Enabled hdisk4 fscsi1
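
Reading a long lspath listing by eye is error prone, so the "does every disk have another way out?" check can be handed to awk. This is a minimal sketch, assuming the three-column lspath layout shown above (status, disk, parent); the function name disks_depending_on is hypothetical.

```shell
# Hypothetical helper: read `lspath` output (status, disk, parent) on stdin
# and print every FC disk whose only Enabled paths go through the protocol
# device named in $1, i.e. the one about to be removed.
disks_depending_on() {
    awk -v victim="$1" '
        $1 == "Enabled" && $3 ~ /^fscsi/ {
            seen[$2] = 1                        # disk has an Enabled FC path
            if ($3 != victim) other[$2] = 1     # ...and one avoiding the victim
        }
        END { for (d in seen) if (!(d in other)) print d }
    ' | sort
}

# Possible use on the AIX host:
#   lspath | disks_depending_on fscsi0
```

Empty output means no disk depends solely on that protocol device. For the listing above, both hdisk2 and hdisk4 also have Enabled paths through fscsi1, so nothing would be printed for fscsi0.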

Knowing that the host has paths to each disk through both FC adapters allows us to proceed. We start with smitty devices and select the Disable a FC SCSI Protocol Device menu. From the list, select the first adapter (fcs0) and confirm the selection with the RETURN key. Next, exit smitty and proceed with the removal of this adapter.

# rmdev -dl fcs0 -R
fcnet0 deleted
fscsi0 deleted
fcs0 deleted

Running cfgmgr followed by lspv shows a new disk. Let’s repeat the same process, this time disabling and then removing the second adapter (fcs1). With the new disk back on board, we have to verify that it is the missing mirror that we will use to re-mirror entoras1_vg. What determines that this is the “right” disk? First, it must have the same size as hdisk2.

# bootinfo -s hdisk2
# bootinfo -s hdisk3

Next, both disks have to belong to different fabrics/controllers.

# lsattr -El hdisk2 | awk '/unique_id/ {print $2}'
# lsattr -El hdisk3 | awk '/unique_id/ {print $2}'

Indeed, hdisk2 “belongs” to SVC 3148 and hdisk3 to SVC 037C. These are two different “fabrics”, so we can proceed with mirroring.

# extendvg -f entoras1_vg hdisk3
# mirrorvg -S -c 2 entoras1_vg hdisk3

Ramon, thanks for showing me how to use awk pattern matching …. 🙂

Posted in Real life AIX.


mirroring progress in aix

If you think that AIX does not have any tools to report on the progress of mirroring, you are mistaken. Every time you execute the lsvg command against a volume group, it returns (among other things) the number of stale partitions. You could also execute lsvg -M and count the stale partitions. Either option, executed at an interval, delivers a decreasing number of stale partitions, which can be used to report the progress…

What you have here are a few lines of code that provide the answer; nothing fancy.

### W.M. Duszyk, 3/2/12
### show percentage of mirrored PPs in a volume group

[[ $# -lt 1 ]] && { print "Usage: $0 vg_name"; exit 1; }

vg=$1

Stale=`lsvg $vg | grep 'STALE PPs:' | awk '{print $6}'`

[[ $Stale = 0 ]] && { print "$vg is fully mirrored."; exit 2; }

Total=`lsvg $vg | grep 'TOTAL PPs:' | awk '{print $6}'`

PercDone=$(( 100 - $(( $(( Stale * 50.0 )) / $Total )) ))

echo "Volume group $vg is mirrored $PercDone%."

If you decide to use this code (copy/paste), remember that each line above is really a single line of characters! Your browser may distort this fact. Have a good weekend!

UPDATE: This script works if mirrorvg -S was used to create the mirrors, but not if the syncvg command was used: syncvg "locks" the volume group being synced, while mirrorvg -S does not.
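
For anyone who wants to try the arithmetic without a volume group in the middle of a sync, the same calculation can be written as a filter over saved lsvg output. This is only a sketch: it reuses the formula from the script above unchanged, it assumes the STALE PPs: and TOTAL PPs: labels appear in the lsvg output as they do there, and the name mirror_percent is hypothetical.

```shell
# Hypothetical helper: read `lsvg <vg>` output on stdin and print the
# percentage computed with the same formula as the script above:
#   100 - (Stale * 50.0) / Total
mirror_percent() {
    awk '
        {
            # the label is two words, the value is the word that follows
            for (i = 1; i < NF; i++) {
                if ($i == "STALE" && $(i + 1) == "PPs:") stale = $(i + 2)
                if ($i == "TOTAL" && $(i + 1) == "PPs:") total = $(i + 2)
            }
        }
        END { if (total > 0) printf "%.1f\n", 100 - (stale * 50.0) / total }
    '
}

# Possible use on the AIX host, polling once a minute:
#   while :; do lsvg entoras1_vg | mirror_percent; sleep 60; done
```

With 100 stale partitions out of 1000 total, the filter prints 95.0. Searching for the label instead of counting on field 6 keeps it working if the left-hand column of the lsvg output changes width.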

Posted in Real life AIX, scripts.


mkuser does not work…..

Trying to create a user account failed with the following two messages:

3004-690 No default group.
3004-703 Check "/etc/security/mkuser.default" file.

Contrary to these messages, /etc/security did contain the mkuser.default and mkuser.sys files, which after a visual inspection were declared to be OK. So what could be the reason for mkuser failing to create a new user account?

These two files are also present (as symbolic links) in the directory /usr/lib/security, so it is prudent to investigate these links too.

# ls -l /usr/lib/security
dr-xr-xr-x 2 bin  bin   256 Dec 15 15:18 risk-manager
dr-xr-xr-x 2 bin  bin  4096 Dec 15 17:11 acl
dr-xr-xr-x 2 bin  bin   4096 Dec 15 17:11 64
lrwxrwxrwx 1 root security 16 Dec 15 17:11 methods.cfg -> /etc/methods.cfg
lrwxrwxrwx 1 root security 17 Dec 15 17:11 fpm -> /etc/security/fpm
lrwxrwxrwx 1 root system    10 Mar 01 11:21 mkuser.sys -> mkuser.sys

This output does not look good, for two reasons. First, there is no link to /etc/security/mkuser.default. Second, mkuser.sys, instead of being linked to /etc/security/mkuser.sys, is linked to itself… To fix it, we cd to /usr/lib/security, remove mkuser.sys, and re-create the links to both files in /etc/security, as shown next.

/usr/lib/security> rm mkuser.sys

/usr/lib/security> ln -s /etc/security/mkuser.default mkuser.default
/usr/lib/security> ls -ltr mkuser.default
lrwxrwxrwx 1 root system  28 Mar 02 07:50 mkuser.default -> /etc/security/mkuser.default

/usr/lib/security> ln -s /etc/security/mkuser.sys mkuser.sys
/usr/lib/security> ls -l mkuser.sys
lrwxrwxrwx 1 root system  24 Mar 02 07:50 mkuser.sys -> /etc/security/mkuser.sys

Following these repairs, mkuser started working again.
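
A link that points to itself is easy to miss in a long ls -l listing, but the shell can spot it: such a link, like any dangling link, passes the -L test and fails the -e test. The function below is a small sketch of that check; its name, broken_links, is made up for this example.

```shell
# Hypothetical helper: list symbolic links in directory $1 that do not
# resolve to an existing target (dangling links and links to themselves).
broken_links() {
    for f in "$1"/*; do
        # -L: the entry is a symbolic link; -e follows the link and fails
        # when the target is missing or the link loops back on itself
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            printf '%s\n' "${f##*/}"
        fi
    done
}

# Possible use:
#   broken_links /usr/lib/security
```

This would have reported mkuser.sys in the listing above. It cannot report the second problem, the missing mkuser.default link, because there is nothing in the directory to test.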

Posted in Real life AIX.


HDS (USP_V) hdisk/controller relationship

For an AIX host with HDS (dlnkmgr) delivered storage, especially when multiple HDS controllers are in use, it may be important to identify which controllers deliver which hdisks.

The following command delivers the answer:

# dlnkmgr view -lu -item -c
Product S/N     LUs iLU    SLPR HDevName VG    Paths  OnlinePaths
USP_V   0029386  14 00000A    0 hdisk14  -          8           8
                    00000D    0 hdisk15  -          8           8
                    000013    0 hdisk16  -          8           8
                    000019    0 hdisk17  -          8           8
                    00001D    0 hdisk18  -          8           8
                    00001E    0 hdisk19  -          8           8
                    00002B    0 hdisk20  -          8           8
                    00002E    0 hdisk21  -          8           8
                    00002F    0 hdisk22  -          8           8
                    000030    0 hdisk23  -          8           8
                    00004C    0 hdisk24  hb1_vg     8           8
                    000052    0 hdisk26  -          8           8
                    000065    0 hdisk29  -          8           8
                    000066    0 hdisk30  -          8           8
USP_V   0048835  16 00000A    0 hdisk2   epcdbm_vg  8           8
                    00000E    0 hdisk3   epcdbm_vg  8           8
                    00000F    0 hdisk4   epcdbm_vg  8           8
                    000010    0 hdisk5   epcdbm_vg  8           8
                    000011    0 hdisk6   epcdbm_vg  8           8
                    000012    0 hdisk7   epcdbm_vg  8           8
                    000013    0 hdisk8   epcdbmjrnl_vg 8        8
                    000018    0 hdisk31  -          8           8
                    00002D    0 hdisk9   epcdbmjrnl_vg 8        8
                    00002F    0 hdisk10  epcdbm_vg  8           8
                    000037    0 hdisk11  epcdbm_vg  8           8
                    000038    0 hdisk12  hb2_vg     8           8
                    000039    0 hdisk13  mksysbvg   8           8
                    000052    0 hdisk25  epcdbm_vg  8           8
                    000058    0 hdisk27  -          8           8
                    000059    0 hdisk28  epcdbm_vg  8           8
KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2012/03/01 13:12:01

This command is very useful when trying to establish mirroring with disks from the appropriate controllers. It easily identifies disks, volume groups, and the HDS controllers managing the disks.
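
Because dlnkmgr prints the serial number only on the first row of each controller, the listing is awkward to grep for a single hdisk. The filter below is a sketch that repeats the serial number on every row; it assumes the column layout shown above (nine fields on the first row of a controller, six on the rows that follow), and the name hdisk_controller_map is hypothetical.

```shell
# Hypothetical helper: read `dlnkmgr view -lu -item -c` output on stdin
# and print "hdisk serial vg" for every LU, repeating the serial number
# that dlnkmgr shows only on the first row of each controller.
hdisk_controller_map() {
    awk '
        $1 == "Product" || /^KAPL/ { next }     # header and completion message
        NF == 9 { sn = $2; print $6, sn, $7 }   # first row of a controller
        NF == 6 { print $3, sn, $4 }            # continuation rows
    '
}

# Possible use on the AIX host:
#   dlnkmgr view -lu -item -c | hdisk_controller_map | grep -w hdisk24
```

For the listing above, the lookup for hdisk24 would give hdisk24 0029386 hb1_vg.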

Posted in Real life AIX.


local print queue removal in AIX

I have a ticket in my work queue to remove a considerable number of print queues. Following the “usual” path and executing smitty rmque, I get an unexpected message:

There are currently no additional SMIT screen entries available for this item.  This item may require installation of additional software before it can be accessed.

Well, I do not have time to identify and install this “additional” software; that has to wait for tomorrow. Today, the selected print queues have to be deleted. AIX deletes a print queue in two steps. First, it removes the definition of the device that does the actual printing, and then it deletes the queue associated with the device removed in step one.

The lsque command followed by the name of a print queue produces all the details we may need, for example:

lsque -c -qspdB_land

The -c flag in the last command makes the output colon delimited. The first token repeats the queue name, while the second delivers the name of the associated device.

Now the script:



if [[ $# -ne 1 ]]
then
            print "\tMissing queue name, aborting."
            exit 1
fi

queue=$1

lsque -c -q$queue | grep $queue | awk -F ':' '{print $2}' | while read dev
do
            rmquedev -d $dev -q $queue 2>/dev/null
            rmque -q $queue 2>/dev/null
done

This script takes one argument – the name of the print queue to be removed.
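
With a long list of queues to delete, it may be worth printing the removal commands first and reviewing them before anything is run. The filter below is a dry-run sketch only: it relies on the colon-delimited layout described above (queue name first, device second), it assumes that any header line in the lsque -c output starts with #, and the name queue_removal_plan is hypothetical.

```shell
# Hypothetical helper: read `lsque -c -q<queue>` style lines on stdin
# (queue name in field 1, device name in field 2, colon delimited) and
# print the removal commands without running them.
queue_removal_plan() {
    awk -F ':' '
        /^#/ || NF < 2 { next }                 # assumed header, blank lines
        { print "rmquedev -d " $2 " -q " $1 }   # step one: remove the device
        !($1 in listed) { listed[$1] = 1; order[++n] = $1 }
        END { for (i = 1; i <= n; i++) print "rmque -q " order[i] }   # step two
    '
}

# Possible use, with queues.txt holding one queue name per line:
#   while read q; do lsque -c -q$q; done < queues.txt | queue_removal_plan
```

Once the printed plan looks right, it can be saved to a file and executed, or the script above can be run once per queue.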

Posted in AIX, Real life AIX.


installing Shadow Image on AIX host

An active node in a cluster has mirrored volume groups whose data we want to back up with minimal interruption (downtime). Each mirror consists of SAN disks (LUNs) from one of two SAN fabrics. One node in this cluster has a ShadowImage environment configured to use disks from one of these fabrics (PVOLs and SVOLs belong to the same fabric). Due to a site/host migration and re-configuration, ShadowImage has to be “moved” (re-installed) to the other node of the same cluster, using disks from the other SAN fabric. ShadowImage has to be installed and configured accordingly. This post documents how this is done.

Posted in Real life AIX.


removing orphaned home directories

To remove a user account, we execute the rmuser command, which by design does not remove the user’s home directory. From that moment on, the directory listing shows the removed user’s numeric ID in place of the login name. Eventually, after years of doing nothing, a host’s /home may end up like this one:

drwxr-xr-x    2 21564    lawson          256 Mar 04 2004  teagued
drwxr-xr-x    2 20337    lawson          256 Mar 04 2004  taylore
drwxr-xr-x    2 22391    lawson          256 Mar 04 2004  tantillo
drwxr-xr-x    2 258      lawson          256 Mar 04 2004  studnt9
drwxr-xr-x    2 256      lawson          256 Mar 04 2004  studnt7
drwxr-xr-x    2 255      lawson          256 Mar 04 2004  studnt6
drwxr-xr-x    2 305      lawson          256 Mar 04 2004  studnt50
drwxr-xr-x    2 254      lawson          256 Mar 04 2004  studnt5
drwxr-xr-x    2 294      lawson          256 Mar 04 2004  studnt40
drwxr-xr-x    2 253      lawson          256 Mar 04 2004  studnt4

It is time to clean up. Today, this host was migrated from local to LDAP authentication, and as part of this process we decided to clean /home. After the cleanup, only the locally authenticated accounts (files) will have their home directories. The home directories of LDAP defined users will be created automatically at first login, thanks to the following entry in /etc/security/login.cfg:

mkhomeatlogin = true

And now the script which did the cleaning:


for nbr in `ls -l /home | awk '{print $3}' | grep '^[0-9]'`
do
        for usr in `ls -l /home | grep $nbr | awk '{print $9}'`
        do
                lsuser -R files $usr
                # no local (files) account with this name: remove its home
                if [[ $? -ne 0 ]]
                then
                        rm -rf /home/$usr
                fi
        done
done
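
Before running anything that ends in rm -rf, it is worth listing the candidates first. The filter below is a sketch that only reports directories whose owner column is a bare number; it deletes nothing and does not perform the lsuser -R files check from the script above. It assumes the nine-column ls -l layout shown earlier, and the name orphaned_homes is hypothetical.

```shell
# Hypothetical helper: read `ls -l /home` output on stdin and print the
# names of directories whose owner column is a bare number, i.e. a UID
# that no longer maps to an account. Names containing spaces are not
# handled.
orphaned_homes() {
    awk '$1 ~ /^d/ && $3 ~ /^[0-9]+$/ { print $9 }'
}

# Possible use:
#   ls -l /home | orphaned_homes
```

Matching the owner column directly avoids the second grep on the UID, which could also match an unrelated line that merely contains the same digits (a size or a date, for example).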

Posted in ldap, Real life AIX, scripts.


Copyright © 2016 Waldemar Mark Duszyk. All Rights Reserved. Created by Blog Copyright.