

command line editing pains …

Hi,

This has been bugging me for some time already.
Do you know how to do this: we need to change u50_lv into u60_lv and /u50 into /u60 in a “single step”, kind of like a global change in vi?

crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50

I know how to turn each 5 into 6, but this requires two steps and I want to do it with just one (is this possible at all?).
Could someone please show me how this is done?

I want to do it as a command line edit, not as an edit of a line in a file.

Thanks,

MarkD 🙂
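
One possible one-step approach, offered only as a sketch: hold the command as text in a variable (the names cmd and new_cmd are made up for this illustration) and let sed rewrite every occurrence at once. This is not a true in-line edit of the shell's command line, but it does change both u50 strings in a single substitution.

```shell
# The command from the post, kept as text so nothing is executed yet.
cmd='crfs -v jfs2 -d u50_lv -A yes -a log=INLINE -m /u50'

# The g flag makes sed replace every u50 on the line, not just the first one.
new_cmd=$(printf '%s\n' "$cmd" | sed 's/u50/u60/g')

# Print the rewritten command for inspection before running it by hand.
printf '%s\n' "$new_cmd"
```

The ksh manual also describes a v command in vi editing mode that opens the current command line in a full editor, where a :s/u50/u60/g should do the same job; I have not verified this on every AIX level.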

Posted in Real life AIX.


to remove a file with special characters in its name

-rw-r-----  1 root   system 1582339  Apr 05 20:24 smit.log
-rw-------  1 root   system  14751   Apr 06 07:36 .lsof_lawaptpu001
-rw-r-----  1 root   system      0   Apr 06 08:02 *
-rw-r-----  1 root   system 136792   Apr 10 05:25 dsmerror.log
-rw-r-----  1 root   system 37641208 Apr 10 05:25 dsmsched.log

Looking at the output above, you may find a surprise, and if you have not dealt with such surprises before you may already be wondering how to remove the “offending” file. Executing rm * will quickly decrease the number of files in this location, which is most likely not what one intends to do. We have a few options.

One could execute the rm command while suppressing the special meaning of the character or characters included in the file name.

# rm '*'

Above, the single quotes stop the shell from “expanding” the * into the names of all files in the directory that do not start with a dot. The rm command receives the single literal character * and removes only the one file whose name is a single *.

What to do if one finds a file named * ?.? In that case, it is often difficult to establish the number of spaces in the file name. In this situation, you may use the ls -i command to find the inode number associated with the file.

# ls -i
   52 *
   36 *    ? 
   11 .java
   40 .lsof_lawaptpu001
   47 .sh_history
   64 .ssh
   31 .toc
16384 .topasrecrc
    6 .vi_history

Next, the find command is instructed to match the previously identified inode to its file name, which is then handed to the rm command for prompt removal.

# find . -inum 36 -exec rm '{}' \;

If instead of removal you are interested in renaming the file, you could proceed as follows:

# find . -inum 36 -exec mv '{}' new_file_name \;

where the new_file_name is the new name.
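
Both techniques can be rehearsed safely in a scratch directory before touching a real one. The sketch below assumes the mktemp command is available; the file name keep.log and the variables demo and inode are made up for the illustration. It also adds -xdev to find, because inode numbers are unique only within one file system.

```shell
# Scratch directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1

# One ordinary file plus two awkwardly named ones.
touch keep.log '*' '*    ?'

# Quoting makes rm see a literal *, so only that one file goes away.
rm -- '*'

# Look up the inode of the remaining odd file without typing its name.
inode=$(ls -i | awk '$2 == "*" && $3 == "?" {print $1}')

# Remove by inode; -xdev keeps find on this one file system.
find . -xdev -inum "$inode" -exec rm -- '{}' \;

# Only keep.log should be left.
ls -A
```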

Posted in Real life AIX.



argument list is too long……

Today, during our lunch break, we attempted to fix my friend's stationary bike (an exercise machine). Being the “real” admins, we attempted the repairs without reading the manual, but of course! Well, you are free to imagine the outcome; now Adi has one almost functioning exercise machine and a few spare parts!

After lunch, Adi calls for help: one of his file systems is 96% full, and in order to reclaim its capacity the files older than 90 days need to be removed. Although he has full access to the directory tree and its contents, and his command has always worked before, this time the procedure does not work. He gets an error message about “something” being “too long”. The command he executes is:

find . -mtime +90 -exec rm -f {} \;

In the same directory, I execute the command ls -ltr, which after a long while also refuses to work: “the argument list is too long”. In the past, I discovered that to prevent situations like this one I need to increase the value of the ncargs attribute, which belongs to the sys0 device. Currently, this attribute is set to 256 blocks of 4 KB each, and I cannot increase its value as I think a reboot is required to make the change effective.

# lsattr -El sys0 | grep ncargs
ncargs    256     ARG/ENV list size in 4K byte blocks     True
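
For scale, the value shown above can be converted to bytes. The sketch parses the lsattr line copied from this post instead of calling lsattr, so it runs on any host; the variable names are made up for the illustration.

```shell
# The lsattr output line from this post, kept as text.
line='ncargs    256     ARG/ENV list size in 4K byte blocks     True'

# The second field is the number of 4 KB blocks.
blocks=$(printf '%s\n' "$line" | awk '{print $2}')

# 256 blocks * 4096 bytes = 1048576 bytes (1 MB) for arguments plus environment.
bytes=$((blocks * 4096))
echo "$bytes"
```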

I gradually increase the age of the files to be removed from 90 through 180, 360, and 720 days, and still AIX responds with the same message: “the argument list is too long”. There must be a huge number of files in this directory.

It has to be my “lunch” time experience that made me execute the command man xargs. This decision proved to be a really good one indeed! I find that one can limit the number of arguments processed by the xargs command using its -L parameter followed by an appropriate number. With the freshly acquired knowledge, I modify the find command as follows:

find . -mtime +90 | xargs -L180 rm

Guess what? Now, the files are being removed!!!!!

I still have to digest the meaning of the -L flag of the xargs command. According to the man page, “The generated command line length is the sum of the size, in bytes, of the Command and each Argument treated as strings, including a null byte terminator for each of these strings. The xargs command limits the command line length”. Digging deeper into the man page, one finds that

-L Number
            Runs the Command parameter with the specified number of nonempty parameter lines read from standard input. The last invocation of the Command parameter can have fewer parameter lines if fewer than the specified Number remain. A line ends with the first new-line character unless the last character of the line is a space or a tab. A trailing space indicates a continuation through the next nonempty line.

I am not sure about the meaning of these statements, so I interrupted the last command and changed the -L value to ncargs * 4 = 1024, as in

find . -mtime +90 | xargs -L1024 rm

The procedure still works; files older than 90 days are being removed. Did I request up to 1024 lines (file names) at a time for the rm command to process? Is there any relation between ncargs and the -L value or not? If you know, please leave a comment; inquiring minds want to know 🙂
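
A small experiment that is safe to run anywhere hints at what -L counts. It feeds five input lines to xargs with -L 2 and uses echo in place of rm, so each output line shows the arguments of one invocation. The variable name out is made up for the illustration.

```shell
# Five input lines, at most two of them handed to each invocation of echo.
out=$(printf '%s\n' a b c d e | xargs -L 2 echo)

# One output line per invocation: two lines, two lines, then the leftover one.
printf '%s\n' "$out"
```

If this reading is right, -L is a count of input lines per invocation, while ncargs is a size in 4 KB blocks, so the two are measured in different units. Note also that xargs splits lines on blanks, so a file name containing spaces would reach rm as several arguments.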

Posted in Real life AIX.



centralized authentication – preparing for its failure

As the conventional wisdom goes, in UNIX (AIX) “plain” users can (and it is recommended that they do) authenticate via some global method like NIS, LDAP, Kerberos, and so forth. For the application (also known as administrative) and the “system” accounts, it is recommended that they authenticate locally, that is, that they are defined on the host.

I do not dispute that for the few “really” secure environments it is an excellent idea to authenticate administrative users with a token (so passwords are never the same), or to use the equivalent method of obtaining the password for a specific account from a specific location, then immediately changing it and recording the change in the same secure repository, so that the next time the password is needed it will be used and again immediately changed. Yes, some admins work like that, and I say it again: I understand and do not dispute the need for extreme security measures.

But “the shoes that fit John may not fit his little brother Johnny”. For some other organizations, the security requirements described above may not be appropriate.
There are already a number of organizations storing all login names and passwords in a central repository like LDAP or AD, to name just a few.
Without diving into details, the reasoning follows this path: if users cannot log in because our authentication mechanism is not functioning, why do I need to worry whether an admin account can or cannot do the same?
You may also ask a UNIX administrator how often he or she has to change the Oracle administrative password if there are 30 or more Oracle servers and 6 DBAs. Well, sometimes often, sometimes not, but it is always a source of the same pain for both UNIX and DB administrators.

If the communications between the password repository and the client are made secure (for example, using SSL) and the repository is protected from “outside” interference, why not authenticate even the administrative accounts centrally instead of locally?
I vote for centralized authentication; what about you?

Still, I believe that it is (if possible) a splendid idea to allow an administrative account to authenticate locally when the global authentication mechanism is not functional. Why not?
For example, isn’t it nice to be able to log in as root or as an application administrator to gracefully shut down the host or the application even when the LDAP server cannot resolve their credentials?

As always, AIX comes through and delivers. The rest of this post shows how to quickly allow a user to authenticate locally when the IBM TDS (LDAP) service is not available.

Posted in Real life AIX.



disabling journaling of AIX jfs2 file systems …

Since the introduction of version 6.1, file system journaling is no longer a permanent feature of AIX. If this is “news” to you, then you most likely wonder why. Well, let’s think about it for a moment.
Every “write” is written to the file system and is also “acknowledged” in the journal associated with the file system (the log logical volume). This implies that for each write there are actually two.
With this knowledge in mind, one may clearly see the beauty of INLINE logs by visualizing a disk head moving to write to a file system and then moving (relocating) to the location on the platter where the journal “sits”. The INLINE log is placed in the middle of its file system, so the time required for the disk head to travel is shorter than if the log is located outside of its file system.

Regardless of the journal (log) type, there is time spent moving back and forth, from the file system to its log and back again. This head movement sometimes represents a waste, as in specific circumstances the journal updates are counterproductive: they waste time.

Posted in Real life AIX.



fixing offline paths in dlnkmgr environment.

The dlnkmgr SAN driver is no exception: as in the case of sdddrv, sddpcm, or plain mpio, for one reason or another one or more “paths” may go missing. This post shows how to enable missing paths in the dlnkmgr environment.

Posted in HDS, Real life AIX.


matching Hitachi and AIX FC adapters

Working with dlnkmgr, you quickly discover that it uses its own “numbering” for the AIX host FC (HBA) adapters, which may lead to a surprise at the least expected time.
To get the HDS “view”, you need to execute:

# dlnkmgr view -hba -portwwn
HbaID Port.Bus HBAPortWWN IO-Count IO-Errors Paths OnlinePaths
00000 00.05 10000000C985E7C6 2841716579 488 60 60
00001 00.02 10000000C987C25E 2838346781 397 60 60
00002 00.07 10000000C987BEAE 2814428535 1348 60 60
00003 00.01 10000000C987BFA2 2812541903 1373 60 60
KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2012/03/14 13:56:23

To get the AIX view of its HBAs, one needs to execute (for example) the command lscfg -vl fcsX | grep Netw for each adapter. With both outputs in hand, you can reconcile them manually, or you can save these few lines of code to do it automatically for you.

#!/usr/bin/ksh
### W.M. Duszyk
### Match AIX versus HDS FC adapters
###

# WWPNs as reported by HDLM; the array index follows the order of the HbaID rows
set -A HDS_FC `dlnkmgr view -hba -portwwn | \
                              grep '^00' | awk '{print $3}'`

for FC in `lsdev -Cc adapter | awk '/fcs/ {print $1}'`
do
         # WWPN of this adapter as reported by AIX (the "Network Address" line)
         AIXWWPN=`lscfg -vl $FC | grep Netw | \
                                            sed 's/\./ /g' | \
                                            awk '{print $3}'`
         ((cntr=0))

         # Walk the HDS list until the same WWPN is found
         for HDSWWPN in ${HDS_FC[*]}
         do
            [[ $AIXWWPN = "$HDSWWPN" ]] && \
            { print "AIX $FC ($AIXWWPN) is HDS $cntr ($HDSWWPN)"; \
              break; }
            ((cntr=cntr + 1))
         done
done

The sample output:

AIX fcs0 (10000000C987BFA2) is HDS 3 (10000000C987BFA2)
AIX fcs2 (10000000C985E7C6) is HDS 0 (10000000C985E7C6)
AIX fcs4 (10000000C987C25E) is HDS 1 (10000000C987C25E)
AIX fcs6 (10000000C987BEAE) is HDS 2 (10000000C987BEAE)
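
The matching logic can also be tried away from the host on captured text. The sketch below is an alternative illustration, not the script above: it reads the dlnkmgr rows and the adapter/WWPN pairs shown in this post from two made-up variables (hds_view and aix_view) and lets awk join them on the WWPN. It takes the HDS number from the HbaID column instead of counting rows.

```shell
# dlnkmgr rows copied from the listing in this post.
hds_view='00000 00.05 10000000C985E7C6 2841716579 488 60 60
00001 00.02 10000000C987C25E 2838346781 397 60 60
00002 00.07 10000000C987BEAE 2814428535 1348 60 60
00003 00.01 10000000C987BFA2 2812541903 1373 60 60'

# AIX adapter and WWPN pairs, taken from the sample output in this post.
aix_view='fcs0 10000000C987BFA2
fcs2 10000000C985E7C6
fcs4 10000000C987C25E
fcs6 10000000C987BEAE'

# Feed both lists to one awk run, separated by a marker line.
match=$(
  { printf '%s\n' "$hds_view"; echo '---'; printf '%s\n' "$aix_view"; } |
  awk '
    $1 == "---" { second = 1; next }          # switch to the AIX list
    !second     { hba[$3] = $1 + 0; next }    # WWPN -> numeric HbaID
    ($2 in hba) { printf "AIX %s (%s) is HDS %d (%s)\n", $1, $2, hba[$2], $2 }
  '
)
printf '%s\n' "$match"
```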

Posted in HDS, Real life AIX.



san design guide for aix with mpio, sdddrv, pcmdrv

Anybody who is involved with AIX, SAN, and maybe with booting from SAN can find this PowerPoint presentation very useful. Straight from Dan Braden and John Hock (IBM): SanBoot MPIO SDD SDDPCM Presentation.

Posted in Real life AIX.



identifying HDS ShadowImage disks in mpio environment

In some places, certain AIX hosts mirror their data across disks from two HDS (Hitachi) SAN controllers. Among these, a number of places associate a set of mirror disks with a set of ShadowImage disks from the same HDS controller. There are also places that maintain a separate ShadowImage disk set on each HDS controller and activate only the set which, for some reason, seems appropriate at a given backup time. Following the same train of thought, some places execute the ShadowImage based backup on the host with the data, while other places vary on the ShadowImage based volume group on the backup server instead. In the last case, the backup server might not have the dlnkmgr software installed; it may just use the AIX built-in mpio drivers.

In that case, an AIX administrator may occasionally need to validate that the ShadowImage disks associated with a given backup server really come from the specific HDS controller.

Posted in Real life AIX.



recovering “lost” mpio disks

Last week something happened to fabrics, switches, and God only knows what else. Some hosts lost some disks, but fortunately, since we mirror to separate fabrics, every volume group withstood this incident. Later, we had to re-acquire the lost or missing disks, which for most hosts was easily done by executing the cfgmgr command. For a few hosts this operation required a few additional steps, which are described here. As the first step, I did rmdev -dl hdisk3 -R, but unfortunately cfgmgr did not “bring” it back.

# lspv
hdisk0     00cc7261a9f50481          rootvg          active
hdisk1     00cc72617b870835          rootvg          active
hdisk2     00cc7261e4fdc7b7          entoras1_vg     active
hdisk4     00cc7261d207431a          mksysbvg        active

After repeated executions, cfgmgr still cannot deliver the one missing disk that is required to mirror entoras1_vg. Just for “kicks”, I execute the sanscan utility.

# sanscan
sanscan v2.2
Copyright (C) 2010 IBM Corp., All Rights Reserved

Processing FC device:
    Adapter driver: fcs0
    Protocol driver: fscsi0
    Connection type: fabric
    Local SCSI ID: 0x083e00
    Local WWPN: 0x10000000c96ef366
    Local WWNN: 0x20000000c96ef366
Initializing device information...
Scanning SAN...
SCSI ID LUN ID           WWPN             WWNN             Vendor ID Product ID Rev  NACA Qualifier Device Type              Error(s)
-------------------------------------------------------------------------------------------------------------------------------------
080c00  0000000000000000 5005076801402afd 5005076801002afd IBM       2145       0000 yes  Connected Disk [hdisk2, path_id 0]
080c00  0001000000000000 5005076801402afd 5005076801002afd IBM       2145       0000 yes  Connected Disk [hdisk4, path_id 0]
081c00  0000000000000000 5005076801402ba4 5005076801002ba4 IBM       2145       0000 yes  Connected Disk [hdisk2, path_id 1]
081c00  0001000000000000 5005076801402ba4 5005076801002ba4 IBM       2145       0000 yes  Connected Disk [hdisk4, path_id 1]
603d00  0000000000000000 5005076801202a94 5005076801002a94 IBM       2145       0000 yes  Connected Disk [no ODM match]
601d00  0000000000000000 5005076801202d7f 5005076801002d7f IBM       2145       0000 yes  Connected Disk [no ODM match]
4 targets and 6 LUNs found in 0.015390 seconds

Yes, sanscan sees “something”, but cfgmgr is not able to get it. In a situation like this one, it is often helpful to remove the parent (physical adapter) and all its children. Before any removal, we verify that there is at least one additional available FC adapter (lsdev -Cc adapter | grep FC) and that all disks have paths leading through it.

# lspath
Enabled hdisk0 scsi1
Enabled hdisk1 scsi1
Enabled hdisk2 fscsi0
Enabled hdisk4 fscsi0
Enabled hdisk2 fscsi0
Enabled hdisk4 fscsi0
Enabled hdisk2 fscsi1
Enabled hdisk4 fscsi1
Enabled hdisk2 fscsi1
Enabled hdisk4 fscsi1
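
The same check can be scripted. The sketch below works on the lspath output captured above (held in the made-up variable lspath_out) instead of calling lspath, and reports any disk that uses the protocol device about to be removed (target, also a made-up name) without having an Enabled path through some other device.

```shell
# lspath output copied from the listing in this post.
lspath_out='Enabled hdisk0 scsi1
Enabled hdisk1 scsi1
Enabled hdisk2 fscsi0
Enabled hdisk4 fscsi0
Enabled hdisk2 fscsi0
Enabled hdisk4 fscsi0
Enabled hdisk2 fscsi1
Enabled hdisk4 fscsi1
Enabled hdisk2 fscsi1
Enabled hdisk4 fscsi1'

# The protocol device whose parent adapter is about to be removed.
target=fscsi0

# List disks that depend on the target and have no Enabled path elsewhere.
unsafe=$(printf '%s\n' "$lspath_out" | awk -v t="$target" '
  $3 == t                    { uses[$2] = 1 }   # disk has a path via the target
  $1 == "Enabled" && $3 != t { alt[$2] = 1 }    # disk has a working path elsewhere
  END { for (d in uses) if (!(d in alt)) print d }
')

if [ -z "$unsafe" ]; then
  echo "every disk on $target has another Enabled path"
else
  echo "no alternate path for: $unsafe"
fi
```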

Knowing that the host has paths to each disk through both FC adapters allows us to proceed. We start with smitty devices and select the Disable a FC SCSI Protocol Device menu. From the list, select the first adapter (fcs0) and confirm the selection with the RETURN key. Next, get out of smitty and proceed with the removal of this adapter.

# rmdev -dl fcs0 -R
fcnet0 deleted
fscsi0 deleted
fcs0 deleted

Execution of cfgmgr followed by lspv shows a new disk. Let’s repeat the same process as before, but this time we start by disabling the second adapter (fcs1) and then removing it. With the new disk back on board, we have to verify that it is the missing mirror that we will use to re-mirror entoras1_vg. What will determine that this is the “right” disk? First, it must have the same size as hdisk2.

# bootinfo -s hdisk2
599040
# bootinfo -s hdisk3
599040

Next, the two disks have to belong to different fabrics/controllers.

# lsattr -El hdisk2 | awk '/unique_id/ {print $2}'
33213600507680181003148000000000002ED04214503IBMfcp
# lsattr -El hdisk3 | awk '/unique_id/ {print $2}'
332136005076801830037C00000000000037904214503IBMfcp

Indeed, hdisk2 “belongs” to SVC 3148 and hdisk3 to SVC 037C, two different “fabrics”, so we can proceed with mirroring.

# extendvg -f entoras1_vg hdisk3
# mirrorvg -S -c 2 entoras1_vg hdisk3

PS
Ramon, thanks for showing me how to use awk pattern matching …. 🙂

Posted in Real life AIX.





Copyright © 2016 Waldemar Mark Duszyk. All Rights Reserved. Created by Blog Copyright.