Power7, SMT, CPU utilization, etc

There is a lot of room for misunderstanding CPU utilization when SMT is active (with either 2 or 4 threads). Lately, I have been in a situation where I not only have to know what is going on with CPU utilization, but also have to be able to show and explain it to my clients and my bosses.
For all of you who need to learn more about SMT and CPU utilization, check at least these two posts by Mr. Nigel Griffiths, IBM.

nmon – I can’t see all the CPUs on-screen. Please Help!

nmon – new online Physical CPU Graphs arrive for latest AIX 6.1

More reading material, following a comment from Rob: Power7 CPU and Virtual Processors. You may need to download this document to be able to read it (it is a PowerPoint presentation).

Posted in Real life AIX.


Method error (/usr/lib/methods/cfallvpath -2): 0514-068 Cause not known

To migrate to PowerHA SystemMirror 7.1, my cluster needs a pair of disks. But something is wrong: the cfgmgr command fails with the following messages:

# cfgmgr
Method error (/usr/lib/methods/cfallvpath -2):
        0514-068 Cause not known.
sh: /usr/lib/methods/cfallvpath:  not found.

What adds to the mystery is that /usr/lib/methods/cfallvpath is absent from this host, from all the other nodes in this cluster and, as I found out later, from all of my AIX boxes (6.1.8 and 7.1.3). Something here does not add up…. Do I really need to reboot these nodes in order to get the disks in? Maybe not, if only I could delete the offending “method” from the ODM.

Let’s start by backing up the current configuration rules, just in case…

# cd /etc/objrepos; cp Config_Rules Config_Rules.BACKUP

Now, let’s look at these rules/methods for the last time (this should really have been the first step).

# odmget -q "rule='/usr/lib/methods/cfallvpath -2' " Config_Rules

Config_Rules:
        phase = 2
        seq = 50
        boot_mask = 0
        rule = "/usr/lib/methods/cfallvpath -2"

Config_Rules:
        phase = 3
        seq = 50
        boot_mask = 0
        rule = "/usr/lib/methods/cfallvpath -2"

Check that the backup you made is really there, where you left it. Ready to go, let’s remove the rules.

# odmdelete -q "rule='/usr/lib/methods/cfallvpath -2' " -o Config_Rules
0518-307 odmdelete: 2 objects deleted.

Is it true, are they really gone?

# odmget -q "rule='/usr/lib/methods/cfallvpath -2' " Config_Rules

No output means that the rules have been removed. Now, it is time for some housekeeping.

# savebase -v
saving to '/dev/hd5'
81 CuDv objects to be saved
366 CuAt objects to be saved
27 CuDep objects to be saved
22 CuVPD objects to be saved
405 CuDvDr objects to be saved
110 CuPath objects to be saved
216 CuPathAt objects to be saved
0 CuData objects to be saved
0 CuAtDef objects to be saved
Number of bytes of data to save = 51308
Compressing data
Compressed data size is = 16280
        bi_start     = 0x3600
        bi_size      = 0x1b20000
        bd_size      = 0x1b00000
        ram FS start = 0x917e30
        ram FS size  = 0x10ec71a
        sba_start    = 0x1b03600
        sba_size     = 0x20000
        sbd_size     = 0x3f9c
Checking boot image size:
        new save base byte cnt = 0x3f9c
Wrote 16284 bytes
Successful completion

Now we are ready to run the ConfigurationMangler, as Mr. Mike F. affectionately calls it ….

# cfgmgr 

It returns with no errors and the lspv shows the two new disks as expected. Have a good weekend Sys Admins!
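
The manual check behind this fix (does the method named in a rule actually exist on disk?) can be sketched as a small filter. This is only a sketch under assumptions: missing_rule_methods is a hypothetical helper name, not an AIX command, and it assumes that every rule starts with the full path of its method, as the two rules shown above do.

```shell
# Read "odmget Config_Rules" style output on stdin and print every rule
# whose method program is missing (or not executable) on this host.
# missing_rule_methods is a hypothetical helper name.
missing_rule_methods() {
    # Keep only the text between the quotes of:   rule = "/path/to/method args"
    sed -n 's/^[[:space:]]*rule = "\(.*\)"[[:space:]]*$/\1/p' |
    while read -r program args; do
        # The first word of the rule is the program that cfgmgr would run.
        [ -x "$program" ] || printf '%s%s\n' "$program" "${args:+ $args}"
    done | sort -u
}
```

It only reports, it deletes nothing. Run it before cfgmgr and decide for yourself what to do with each hit:

# odmget Config_Rules | missing_rule_methods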

Posted in Real life AIX.


NIM, KRB5/AD rsh, ftp …….

Lately, I have been busy trying to get ftp and rsh to work with KRB5/AD as the authentication engine. Apparently, there are still applications that need both ftp and rsh… NIM is one such example, it still needs rsh. Well, this is exactly what I thought until this morning, when I discovered Chris Gibson's article http://www.ibmsystemsmag.com/aix/administrator/systemsmanagement/nimsh_nimadm/ showing what to do in order to remove this requirement! For me this is a “WIN” situation, as now I can put our NIM servers back under KRB5 and they will still work! This is really ironic, as two days ago during a meeting with IBM reps I expressed my surprise that NIM still needs rsh. As I see it now, my idea was already at least several months old.

Looking for more NIM info, I found a really nice blog post that I recommend to all: “NIM Less known features : HANIM, nimsh over ssl, DSM”
http://chmod666.org/index.php/nim-less-known-features-hanim-nimsh-over-ssl-dsm/.

This rocks! Thanks Gents!

Posted in Real life AIX.


AIX LDAP client + KRB5A with Active Directory 2012

If you select this method to authenticate/authorize, you may notice that user group membership is incomplete: a user who belongs to multiple groups will be shown as belonging to only one group!

# lsuser -a pgrp groups mannt
mannt pgrp=lawson groups=lawson

The mannt user really does belong to more than one group. So why, when we ask Active Directory (using the AIX LDAP client) to deliver this information, does it come to us truncated?

It could be that the Active Directory administrator did not follow this procedure:

Active Directory object management
As is the case with any other authentication mechanism, we need to configure the user objects for the users that are to use the system. However, if you are implementing this solution, more than likely your users already have Windows accounts. In that case, all we need to do is to modify the objects to be POSIX compliant.
1.	Open the Active Directory Users and Groups management tool.
   a.) Modify a group object to function as a POSIX group.
   b.) Right-click on the user group for assignment of a GID.
   c.) Click on the Unix Attributes tab.
   d.) Populate the NIS Domain dropdown and the GID number as appropriate.
2.	Modify a user object to function as a POSIX user.
   a.) Locate and activate the tab that says Unix Settings.
   b.) Under Unix Settings, set the UID and GID for the user, as well as the home directory location (on the Linux filesystem /home/).
   Note: You will need to ensure that the directory exists with the appropriate user object having access to the directory.
   c.) Reset the user's password. This causes the AD password and the Unix password attributes to synchronize.
3.	Add the user as a Unix member of the group.
   a.) After you have added the user as a Unix user, you will also need to come back to the group properties and add the user as a member on the Unix Attributes tab. Otherwise, the user will not be populated in the msSFU30PosixMember attribute.

Next, you have to modify the /etc/security/ldap/sfur2group.map file, whose default contents are shown below:

groupname  SEC_CHAR   cn                 s       na      yes
id         SEC_INT    gidNumber          s       na      yes
users      SEC_LIST   cn                 m       na      yes

Depending on how your user group membership is declared in Active Directory, you have to replace the last line of this file so it looks either like this:

users      SEC_LIST   msSFU30PosixMember m       na      yes

or like this:

users      SEC_LIST   member             m       na      yes

Follow these modifications by running the restart-secldapclntd command and list the user again. Now, his full group membership is shown.

# lsuser -a pgrp groups mannt
mannt pgrp=lawson groups=lawson,shell,payroll,operator,printq

I will be able to provide you with the Active Directory “side” of this procedure as soon as my colleague Igor Zilberman (the greatest AD/CITRIX administrator I have been lucky to work with! :-) ) documents this process. Igor, thanks in advance!
The two different attributes (member and msSFU30PosixMember) that you can use in sfur2group.map really do have an effect on how you assign UNIX attributes to AD users….

ATTENTION:
Tu Vo (IBM) just told me that the default *.map files may be overwritten the next time AIX is patched!!! With this knowledge in hand, I copied the original sfur2group.map into sfur2AD2012group.map (am I creative or not?), edited it as described above, and then modified the appropriate entry in /etc/security/ldap/ldap.cfg so that it now looks like this:

groupattrmappath:/etc/security/ldap/sfur2AD2012group.map
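
The copy-and-edit step can also be scripted, which helps when the same map has to be produced on many hosts. The following is a minimal sketch, not the official procedure: make_group_map is a hypothetical helper name, and it assumes the map is a plain whitespace-separated table with six columns, as in the default file shown earlier.

```shell
# make_group_map SRC DST ATTR
# Copy the group map SRC to DST, replacing the LDAP attribute in the
# "users" line with ATTR (for example member or msSFU30PosixMember).
# All other lines are copied unchanged. Hypothetical helper name.
make_group_map() {
    src=$1 dst=$2 attr=$3
    awk -v attr="$attr" '
        $1 == "users" {
            # Rebuild the line with the new attribute in the third column.
            printf "%-10s %-10s %-18s %-7s %-7s %s\n", $1, $2, attr, $4, $5, $6
            next
        }
        { print }
    ' "$src" > "$dst"
}
```

For example:

# cd /etc/security/ldap
# make_group_map sfur2group.map sfur2AD2012group.map member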

After a few days, a user tried to use the sftp command and failed. While fixing his issue, I noticed “strange” behavior (AIX 7.1 host): for a casual user the id command did not work properly:

# id
uid=934960 gid=4141 groups=216(operator)

The host lost the ability to translate (to show) the user's login name and group names; it showed just their numeric IDs. When the same user tried to ssh to another host, he would receive this pleasant message:

# ssh markd@hostB
You don't exist, go away!

Well, try telling the user that he does not exist! How dare you? Tu Vo (IBM) delivered the resolution to this issue, letting me know that KRB5A is deprecated (on its way out….). Tu Vo's advice was to replace every KRB5A in the /etc/methods.cfg file with just KRB5, like this:

KRB5:
        program = /usr/lib/security/KRB5
        program_64 = /usr/lib/security/KRB5_64
        options = authonly,is_kadmind_compat=no,tgt_verify=no

LDAP:
        program = /usr/lib/security/LDAP
        program_64 = /usr/lib/security/LDAP64

KRB5LDAP:
        options = auth=KRB5,db=LDAP

Next, you must do the same in the /etc/security/user file: make sure that registry and SYSTEM also show KRB5LDAP instead of KRB5ALDAP.

After the change, you either have to restart secldapclntd or flush its cache (flush-secldapclntd). Now, log in as the ordinary user and execute the id command. Does it work? YES!!!! Now it is time to try ssh and sftp. Do they work? YES!!!

Thanks Tu Vo!
:-)

What about the earlier ssh issue? Well, it was not just KRB5, it was also the key in his ~/.ssh/known_hosts …….

Posted in Real life AIX.



PCI DSS – how to show CVE report is wrong?

The Payment Card Industry Data Security Standard (PCI DSS) is a set of specific security standards designed to ensure that all companies that process, store or transmit credit card information maintain a secure environment during and after a financial transaction………

For a company to maintain a “good PCI standing”, its internet-facing infrastructure has to be “scanned” to verify that all identified security vulnerabilities have been addressed. If your company follows the PCI DSS, you may receive an email notification about PCI CVE vulnerabilities and exposures (or something like that). Often, these findings will be bogus. If the internet-facing hosts are systematically patched, they will be bogus, and you will have to prove it.

So how do you do that? If your http server is running Red Hat, you can use the method shown next.
Let’s say that the PCI compliance email identifies the following: “Apache HTTPD: error responses can expose cookies (CVE-2012-0053)”. “CVE-2012-0053” is a “Common Vulnerabilities and Exposures” identifier: entry number 0053 assigned in 2012.

It is a well-known fact that “scanning” tools (regardless of their intended targets, AIX or LINUX) as a rule do not follow UNIX patching “logic”, and flag as missing something that has already been fully addressed. Red Hat, for example, backports security fixes into its packages without changing the upstream version number, so a scanner that looks only at the version string reports the fix as missing. To verify that a particular CVE and the issues associated with it have already been addressed, log in to the appropriate host and identify all rpms that could be exposed (the web-based stuff: Apache and HTTPD):

# rpm -qa | egrep -i "apache|httpd"
httpd-tools-2.2.15-29.el6_4.x86_64
httpd-2.2.15-29.el6_4.x86_64

Now, you have to see whether the packages listed by the last command contain any information about CVE-2012-0053. The next command searches the changelog of the installed httpd rpm.

# rpm -q --changelog httpd | grep -C1 'CVE-2012-0053'
* Mon Feb 06 2012 Joe Orton <jorton@redhat.com> - 2.2.15-16
- add security fixes for CVE-2011-4317, CVE-2012-0053, CVE-2012-0031,
  CVE-2011-3607 (#787599)

Analyzing the output above, you clearly see that on 02/06/12 one Joe Orton added four fixes to the httpd rpms. One of them is the one that the PCI scan identified as missing ……
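
Scan reports rarely list a single CVE, so the same lookup can be repeated for a whole list. This is a minimal sketch under assumptions: cve_report is a hypothetical helper name, and a "mentioned" result only means that the changelog text contains the identifier. You still have to read the entry to confirm that it describes a fix.

```shell
# cve_report CVE-ID...
# Read an rpm changelog on stdin and report, for every CVE id given as an
# argument, whether the changelog mentions it. Hypothetical helper name.
cve_report() {
    changelog=$(cat)
    for cve in "$@"; do
        # -F: fixed string, -w: whole word, so CVE-2012-0053 does not
        # match inside a longer identifier.
        if printf '%s\n' "$changelog" | grep -F -w -q -e "$cve"; then
            echo "$cve mentioned"
        else
            echo "$cve not mentioned"
        fi
    done
}
```

For example:

# rpm -q --changelog httpd | cve_report CVE-2012-0053 CVE-2011-3607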

The credit goes to Mike “Ski” Swierczynski who showed me this procedure, thanks Mike.

Posted in Linux.



issues with alt_disk_install and AIX 6.1.8.3 and above…..

When “migrating” to this version of AIX using alt_disk_install, you may be presented with this unwelcome message:

0301-150 bosboot: Invalid or no boot device specified!
usage:  bosboot {-a | -v} [-d device] [-p proto] [-k kernel] [-l lvdev]
                [-b file] [-M primary|standby|both] [-D|-I] [-LTq]
        Where:
        -a              Create boot image and write to device or
                        file.
        -v              Verify, but do not build boot image.
        -d device       Device for which to create the boot image.
        -p proto        Use given proto file for RAM disk file
                        system.
        -k kernel       Use given kernel file for boot image.
        -l lvdev        Target boot logical volume for boot image.
        -b file         Use given file name for boot image name.
        -D              Load kernel debugger.
        -I              Load and Invoke kernel debugger.
        -M primary|standby|both Boot mode - primary or standby.
        -T platform     Specifies the hardware platform type.
        -q              Query disk space required to create boot
                        image.
        -L              Enable MP locks instrumentation.
0505-120 alt_disk_install: Error running bosboot in the cloned
root volume group.
Cleaning up.

Executing lspv or lsvg will show you that there is no altinst_rootvg. If you check the state of the current bootlist you most likely will find it in a questionable state….

# bootlist -m normal -o
hdisk0 blv=hd5

Why “questionable”? Because the output of the last command should really look like this

# bootlist -m normal -o
hdisk1 blv=hd5 pathid=0

Searching the net for 0505-120 alt_disk returned nothing for me. One short call to IBM SUPPORT and the issue was identified and resolved. Apparently there is a bug, and for alt_disk_install to work you first have to install the appropriate level of its fileset; in my case the one included in the AIX 6.1.8.3 collection. A quick NFS mount of the file system on our NIM server containing the AIX 6.1.8.3 filesets, followed by smitty install_all and selection of bos.alt_disk_install.rte from among the available filesets

# lslpp -l | grep alt_disk
bos.alt_disk_install.boot_images
bos.alt_disk_install.rte 6.1.8.16 COMMITTED Alternate Disk Install
bos.msg.en_US.alt_disk_install.rte
bos.alt_disk_install.rte 6.1.8.16 COMMITTED Alternate Disk Install

resolved the issue, and the follow-up execution of

# alt_disk_install -C -F update_all -I acNgX \
              -l /ptfs/tl8_sp3_update hdisk1

resulted (after a few minutes) in this very welcome display:

...........................................................
install_all_updates: Log file is /var/adm/ras/install_all_updates.log
install_all_updates: Result = SUCCESS
Modifying ODM on cloned disk.
Building boot image on cloned disk.
forced unmount of /alt_inst/var/nmon
forced unmount of /alt_inst/var/nmon
forced unmount of /alt_inst
...........................................................
forced unmount of /alt_inst
Changing logical volume names in volume group descriptor area.
Fixing LV control blocks...
Fixing file system superblocks...
Bootlist is set to the boot disk: hdisk1 blv=hd5
You have mail in /usr/spool/mail/root

# bootlist -m normal -o
hdisk1 blv=hd5 pathid=0

After the reboot this host shows the expected new version of AIX. By the way, before I migrate the next machine to 6.1.8.3, I will first upgrade its bos.alt_disk_install.rte ….. :-)

Today, I witnessed my upgrade from 7100-01-06 to 7100-03-01 fail in exactly the same way. So I installed the bos.alt from 7100-01-06, restarted the process and watched it work = install_all_updates: Result = SUCCESS

Posted in Real life AIX.



mksysb backups and their integrity

It is good practice to use this type of backup just in case “something” happens to the rootvg. In my opinion, a mksysb backup is the easiest and the fastest way to “recover/recreate” a host. Occasionally, it may also be used for a quick restore of a file or a directory that was removed but should not have been, as long as it resided in the rootvg file system structure, since mksysb works only with the rootvg.

Many of us automate mksysb backups with cron, using the value returned by this command as a trigger for an alert in case of failure. In a perfect world this would work every time…….

Over the weekend, I discovered that the backups of some of the servers did not contain the files I needed to restore. It could be that the emails triggered by a faulty mksysb were not delivered, or they were delivered but I missed them among the dozens of other emails delivered daily to my mailbox. Regardless of the reason or excuse, the result was the same ..... What I was looking for I could not find, because it was not there!

A few hours later, after the issues were finally resolved and my mind was free to think about something else, I started to think about the steps needed to prevent what happened today: what to do to make sure that the mksysb really contains everything it should, and that the administrator is notified and aware of the fact when the backup fails?

The email sent from the backup script is mandatory. Is it a good idea to re-send the "failure" email a few hours later? Maybe it is an even better idea to keep re-sending the email until a sys admin logs into the host sending these emails?

I modified the backup script. Now, a mksysb failure is marked by the creation of the "mksysb.failed" file, whose presence triggers the "mksysb failed on $HOST" emails (once a day). To disable this mechanism, the system administrator needs to log in and delete the "mksysb.failed" file. Simple, right?
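
The flag-file mechanism can be sketched in a few lines. This is not my production script, only an illustration of the idea: run_backup and backup_alert are hypothetical helper names, and the flag file location in the usage lines is an assumption.

```shell
# run_backup FLAG_FILE COMMAND [ARGS...]
# Run the backup command and create FLAG_FILE when it fails. A successful
# run never removes the flag, so an earlier failure stays visible until
# the administrator deletes the file by hand.
run_backup() {
    flag=$1; shift
    if "$@"; then
        return 0
    fi
    date > "$flag"      # record when the failure happened
    return 1
}

# backup_alert FLAG_FILE
# Print the alert text for as long as the flag exists. Meant to be called
# once a day from cron, with the output piped to mail.
backup_alert() {
    if [ -f "$1" ]; then
        echo "mksysb failed on $(uname -n), remove $1 after checking"
    fi
}
```

The backup job and the daily reminder then become:

# run_backup /var/adm/mksysb.failed /usr/bin/mksysb -i -e -X /mksysbimg/mksysbimg.`uname -n`
# backup_alert /var/adm/mksysb.failed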

Nothing, no procedure, can substitute for an "admin's eye". How do you make sure that the mksysb backup is indeed good and contains the data you may need? You have to do it yourself! So no matter how "sophisticated" the script is, occasionally you have to log in and interrogate its backup.
In this case there are at least two options: one uses the restore command and the other the lsmksysb command. For example:

# lsmksysb -lf /path/to/mksysb_backup
# restore -Tqvf /path/to/mksysb_backup

List its contents, look for a specific file or files. Can you "get" them?

Ok, so this is one side of the story. Now, let's talk about the other one. Where are this backup and the previous ones stored? In my case they are backed up and stored by TSM. To verify that they really are there, execute the next command.

# dsmc restore -pick /path/to/mksysb_backup -subdir=yes -inactive
...................................................
TSM Scrollable PICK Window - Restore

  #    Backup Date/Time     File Size A/I  File
        -------------------------------------------------------------
  1. | 01/22/14   07:36:50  14.69 GB  A   /mksysbimg/mksysbimg.tds1
  2. | 01/21/14   11:45:12  13.77 GB  I   /mksysbimg/mksysbimg.tds1
  3. | 01/21/14   02:00:14 308.25 MB  I   /mksysbimg/mksysbimg.tds1
  4. | 01/20/14   02:00:15 306.10 MB  I   /mksysbimg/mksysbimg.tds1
  5. | 01/19/14   02:00:16 304.63 MB  I   /mksysbimg/mksysbimg.tds1
  6. | 01/18/14   02:00:30   7.22 GB  I   /mksysbimg/mksysbimg.tds1
  7. | 01/17/14   08:41:29   7.23 GB  I   /mksysbimg/mksysbimg.tds1

The last output shows that something is not right. Why are the sizes so drastically different? Something needs to be checked here for sure.
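
Spotting such outliers by eye works for seven entries, less so for seventy. Below is a rough sketch of a filter for listings laid out like the one above. flag_small_backups is a hypothetical helper name, the default threshold of 50 percent is an arbitrary assumption, and the parsing relies on the column order shown in the pick window (number, bar, date, time, size, unit).

```shell
# flag_small_backups [PERCENT]
# Read pick-window style lines on stdin and print every backup whose size
# is below PERCENT (default 50) of the largest backup in the listing.
# Hypothetical helper name.
flag_small_backups() {
    awk -v pct="${1:-50}" '
        $2 == "|" {                          # only the numbered entries
            size = $5
            if ($6 == "GB") size *= 1024     # normalise everything to MB
            if ($6 == "KB") size /= 1024
            n++
            mb[n] = size
            line[n] = $3 " " $4 " " $5 " " $6
            if (size > max) max = size
        }
        END {
            for (i = 1; i <= n; i++)
                if (mb[i] < max * pct / 100) print line[i]
        }'
}
```

Save the listing to a file first (the pick window is interactive), then feed it to the filter, for example with a 10 percent threshold:

# flag_small_backups 10 < /tmp/pick.listing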

Also, the backup command has been modified to include the -p option - the backup seems to work better with it.

# /usr/bin/mksysb -i -e -X -p /mksysbimg/mksysbimg.${HOSTNAME}

Posted in Real life AIX.



aix printing to a specific tray

The qprt command does this trick. To send the file /tmp/joe2.ps to tray 2 (-u2) of the printer associated with the print queue called myPrintQueue, do the following:

/usr/bin/qprt -u2 -PmyPrintQueue /tmp/joe2.ps

Be aware that in most cases the destination print queue must match the type (PostScript, ASCII, PCL, ….) of the file to be printed, or nothing will print at all.
Does the target printer have the tray you are trying to print to? :-) Better check.

Posted in AIX.


automating your life ….

AIX administration often requires repeated execution of the same command, series of commands, script or set of scripts on each host in one’s environment. This may not be a big issue when both the number of hosts and the frequency of executions are small, but when one of these factors increases, the “not a big deal” quickly becomes a real inconvenience.
Imagine you “own” a few dozen hosts, and one day you are asked to produce a listing of their operating system versions, and the next day you have to change their root password, which also now must be longer than eight characters. Each of these requirements calls for you to log in to every host again and again. Who knows how often you will have to do it tomorrow or the following week? The sheer mindlessness of such tasks is enough for the few who dream about becoming system administrators to abandon their dreams and become software developers (this was a joke! I used to be a software developer myself).
Is there a way out? Yes, there is, and not just one! There are many commercial and noncommercial solutions that address this scenario, but they may not be applicable to everybody because of their costs, both in treasure and in human effort. For some it is the lack of money, for others it is the lack of time to develop the required skills.
In reality, everything that is needed to automate responses to many challenges faced by a system administrator is most likely already there: I am talking about ssh. If ssh is installed on your hosts, then nothing can stop you from automating many, if not most, of your daily tasks or procedures, and this article will show you how to do just that.
First, you have to configure your environment so that from at least one host you will be able to execute commands on all other hosts without being required to provide the root password. Log in to the host of your choice, become root and verify that the ~/.ssh directory exists. Absence of this directory could indicate that the ssl/ssh packages have not been installed. Check it by executing the following command:

# lslpp -l | grep 'ss[hl]'

If both the ssl and ssh packages are installed but the directory is missing, you have to create it and set its owner/permissions as shown next:

# mkdir -p ~/.ssh; chown root:system ~/.ssh; chmod 700 ~/.ssh

Next, move to ~/.ssh and create a public/private key pair.

# cd ~/.ssh
# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
3e:4f:05:79:3a:9f:96:7c:3b:ad:e9:58:37:bc:37:e4 

The newly generated file called id_rsa.pub needs to be copied to the other hosts in your environment. If your “master” server is an AIX host, this can be done by executing

# cat ~/.ssh/id_rsa.pub | ssh root@YourOtherHost \
                                    'cat >> .ssh/authorized_keys'

Execution of the last command requires entry of the root password; if everything works well, this is the last time you have to provide it. Repeat the last step for each host you have. To verify this procedure, execute a command on a remote host or hosts, for example

# ssh root@YourHostName 'lslpp -l | grep bos; date'

You will know when it works and when it does not…. By the way, have you noticed that two separate commands were executed with one invocation of ssh?
Now, let’s take the second step. For your new procedure to work, you must always be able to have or to generate the list of your hosts. For some of you, the last statement may sound funny – especially if you are the only administrator of an environment with a few hosts. If the environment is large and dynamic the number and names of its hosts may change often as well as their state (on/off-line).
To be able to generate the list of AIX hosts at will, you have to extend the ssh-based password-less functionality to your HMCs; they are the places to inventory! I am aware of two ways to accomplish this task. One procedure requires you to copy the previously created id_rsa.pub file to each HMC, as shown next.

# scp ~/.ssh/id_rsa.pub hscroot@YourHmcName:

Next, you have to log in to the HMC and append the contents of this file to its proper location.

# cat id_rsa.pub >> ./.ssh/authorized_keys2; rm ./id_rsa.pub

Do not forget to remove the id_rsa.pub file since you will no longer need it.
If you use “putty” to log in, you can instead use its built-in copy/paste functionality: display the contents of id_rsa.pub, copy them, and paste them as the argument to the next command, mkauthkeys.

# cat id_rsa.pub
# ssh hscroot@YourHmcName "mkauthkeys --add 'ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAsdpDqfALCSOc0ytZ+DC5cgpHNRwHFGi/SL+4UFNz7qrpuqvnWI0cFzmb5TW2Jtt5IEYcoGHDnwjMTnmYSdM7ThobLoj+JyxAY/4f7YPHFjQhkkaiy7Avv4Ov7l02kwxiJreyzN7aYzoEplv+xGTPi/VyKWu9puaCL/hl9hxMMKVcp3X7FPqoBOievbdxVD95xcXL8ZKEmz2SeStyFzkWe+WGRfY8hZ9Bhye66G3Ys5k8RXRNlFGtXryij89TZ4a+oQdBlv9mkcEg1J7HjexRSsRcGUWUsEk0e7JU7mHvSmMYl1JGkj6cwc2X4UkVVuHaAIu4Ndp7kn4xDtJ+DM7Urw== root@markd'"

To check your HMC setup, try to execute “something” remotely, for example:

# ssh hscroot@YourHmcName lspartition -dlpar

Again, you will know when it works or when it doesn’t. This functionality needs to be extended to all UNIX hosts. The public key can be pushed by executing

# cat ~/.ssh/id_rsa.pub | ssh root@host_name \
                   'cat >> .ssh/authorized_keys'

The last step becomes simpler if executed from a LINUX host:

# ssh-copy-id -i ~root/.ssh/id_rsa.pub root@host_name

At this moment, we are ready to address the original requirement: generation of the list of AIX hosts. Of course, we assume that all our AIX hosts are attached to one or more HMCs. In my case, I started by creating a file containing the DNS names of all my Hardware Management Consoles. Here it is:

# cat hmc.list
hmc1
hmc2
…………
hmc8

The actual script that harvests the names of AIX hosts (dlpars) is shown below. By the way, I call this script AixHarvester.ksh.

1: #!/usr/bin/ksh

2: ### Waldemar Mark Duszyk 05/17/2013
3: ### get AIX hostnames from all managed systems
4: ### communicating with HMCs identified in the file called "hmc.list"

5: [ -f hmc.list ] && echo "" || { echo "The file hmc.list containing HMC names is missing, aborting."; exit -1; }

6: set -A Dlpars
7: i=0

8: rm AixHosts.List >/dev/null 2>&1
9: rm HmcList >/dev/null 2>&1

10: for hmc in `cat hmc.list`
11: do
12:         ping -c 1 $hmc >/dev/null 2>&1 && echo $hmc >> HmcList \
                                           || echo "$hmc is DOWN."
13: done

14: for hmc in `cat HmcList`
15: do
16:         for  dlpar in `ssh hscroot@$hmc lspartition -dlpar \
17:                 | grep '<#' | awk -F "," '{print $2}'`
18:         do
19:                Dlpars[i++]="$dlpar\n"
20:         done
21: done

22: print "${Dlpars[@]}" | \
23:	sed '/^[[:space:]]*$/d' | \
24:	sed 's/^ //g' | \
25:	grep -v vio | \
26:	sort -u > AixHosts.List

The first line identifies the shell used to process this script. The fifth line checks for the presence of the file called hmc.list, which should contain the names of the HMCs we wish to process. If this file is missing, an appropriate error message is printed and further execution is aborted.
On the sixth line, we define an array variable called Dlpars, which will store the names of logical partitions. The following line (7) declares and initializes the array index. On line number 10, we start reading the names of the HMCs stored inside the hmc.list file. After the variable hmc is loaded with the name of an HMC, the ping command verifies that this HMC is UP and RUNNING. The name of a “ping-able” HMC is then copied into the file called HmcList. Otherwise, a message declaring the HMC as being DOWN is displayed on the screen. This process continues until all the contents of hmc.list are processed.
On line fourteen, we find another for loop, which loads (one at a time) the names of the “ping-able” HMCs into the variable hmc. We could use a different variable name, but why not re-use it?
Line sixteen marks the start of a nested loop generating the logical partition information associated with the HMC being processed. The following shows what type of information is returned for each partition.

<#17> Partition:<11*8233-E8B*1060FDP, oracledb1.wmd.edu, 10.119.80.124>
   Active:<1>, OS:<AIX, 6.1, 6100-07-06-1241>, DCaps:<0x2c5f>, CmdCaps:<0x1b, 0x1b>

The grep statement on line seventeen selects only the first line of each partition description: <#17> Partition:<11*8233-E8B*1060FDP, oracledb1.wmd.edu, 10.119.80.124>. The awk statement that follows it sets the comma character as the field delimiter and extracts the second field, i.e. the DNS name associated with the processed lpar; in this case it is oracledb1.wmd.edu.
Line nineteen: the host name, with its leading space character and an appended “new line” character, is loaded into the Dlpars array. Why the new line at the end? So that the resulting file will contain one host per line instead of just one line containing all the hosts.
Next, we print and filter the contents of the array and redirect them into the file called AixHosts.List. The sed statement on line 23 removes any blank lines from the output. The second sed statement removes the leading space character from lines containing host names. The instruction on line twenty-five filters out any VIOS partitions; you may have a different naming convention for VIO servers, and as such you may have to modify this line! Line twenty-six removes any duplicate host names and sends the data stream to AixHosts.List. Why do we need to worry about duplicate host names? Because two HMCs may be attached to a given managed system…. For a certain class of hardware this is mandatory (for example p595).
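
To see the filter chain at work without touching an HMC, the same steps can be put into one function and fed the sample output shown above. This is just a sketch: extract_hosts is a hypothetical helper name, and the vio pattern is the same site-specific assumption as in the script.

```shell
# extract_hosts
# Read "lspartition -dlpar" style output on stdin and print one host name
# per line, without VIO servers and without duplicates.
# Hypothetical helper name.
extract_hosts() {
    grep '<#' |                 # keep only the "<#NN> Partition:<...>" lines
    awk -F ',' '{print $2}' |   # second comma-separated field = DNS name
    sed 's/^ *//' |             # drop the leading space
    grep -v vio |               # site-specific: skip the VIO servers
    sort -u                     # two HMCs may report the same partition
}
```

For example:

# ssh hscroot@YourHmcName lspartition -dlpar | extract_hosts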
What do we do with the generated data? Your imagination sets the limits of what can be done. For example, say we want to check for any missing paths to SAN disks. In this case we may create the following script, which we appropriately call “pathcheck.ksh”. This script requires one argument: the number of paths to each disk we expect to see.

#!/usr/bin/ksh
### By Waldemar Mark Duszyk 12/19/13
### scan AIX hosts for missing paths to virtual (FC) disks
### This script does not look for SCSI/SAS/IDE disks!

[[ $# != 1 ]] && { print "syntax error, aborting.\n\
Example execution looking for 4 paths per a disk: $0 4"; exit -1; }

message='missing paths to its vFC disks.'

set -A vFCadapters `echo "vfcs" | kdb | grep -v vfcs | \
                grep fcs | awk '{if ($3 ~ /0x0008/) {print $1}}' | \
                awk 'BEGIN {ORS="\t";} {print}'`
set -A hdisks `lsdev -Cc disk | egrep -ive "scsi|sas|ide" | \
                awk '{print $1}' | awk 'BEGIN {ORS="\t"} {print}'`

integer paths=${#hdisks[@]}*${#vFCadapters[@]}*$1
integer PATHS=`lspath | egrep -ive "dac|sas|vscsi|failed|missing" | \
              wc -l`

print -n `hostname -s`
[[ $PATHS == $paths ]] && \
        { print " has no $message"; exit 0; } || \
        { print " has $message"; exit -1; }

Next, we need to push this script to the AIX hosts we want to work with. We can do it from a command line as shown next.

# for host in `cat /some/path/AixHosts.List`
do
scp /path/to/pathcheck.ksh root@$host:/usr/local/bin
done

Let's say that under normal circumstances each disk should have four (4) paths. From this moment on, whenever we want to scan the hosts for missing paths, all that we need to do is this:

# for host in `cat /some/path/AixHosts.List`
do
ssh root@$host /usr/local/bin/pathcheck.ksh 4
done

The time has come to change the root password again, and this time we want to make it longer than the current eight-character limit. After we decide what encryption type to use, we have to implement it on the selected AIX hosts.

# for host in `cat /some/path/AixHosts.List`
do
ssh root@$host 'chsec -f /etc/security/login.cfg -s usw -a pwd_algorithm=ssha256'
done

Now, let’s change the root password:

# for host in `cat /some/path/AixHosts.List`
do
ssh root@$host 'echo root:verylongpassword | chpasswd -c'
done

Our internal police, aka the security department, requests a list of all local user accounts:

# for host in `cat /some/path/AixHosts.List`
do
ssh root@$host 'hostname; lsuser -R files -a login ALL; echo "\n"'
done

You un-mirror/re-mirror often. You want to track the progress of mirroring on a remote host. Download the next script to it and then execute it whenever you want to know the percentage of "mirroring".

#!/usr/bin/ksh93
### W.M. Duszyk, 3/2/12
### show percentage of re-mirrored PPs in a volume group

[[ $# < 1 ]] && { print "Usage: $0 vg_name"; exit 1; }
vg=$1
Stale=`lsvg -L $vg | grep 'STALE PPs:' | awk '{print $6}'`
[[ $Stale = 0 ]] && { print "$vg is fully mirrored."; exit 2; }
Total=`lsvg -L $vg | grep 'TOTAL PPs:' | awk '{print $6}'`
PercDone=$(( 100 - $(( $(( Stale * 50.0 )) / $Total )) ))
echo "Volume group $vg is mirrored $PercDone%."
exit 0

Save the script as MirrorMeter.ksh and push it to the hosts:

# for host in `cat /some/path/AixHosts.List`
do
scp ./MirrorMeter.ksh root@$host:/usr/local/bin
done

If you want to execute the MirrorMeter against a specific host (the script expects the volume group name as its argument), do this:

# ssh root@specifichost /usr/local/bin/MirrorMeter.ksh vg_name

or against a list of specific hosts (a subset of the host list generated earlier), like this:

# for host in `cat SpecificHosts.List`
do
ssh root@$host /usr/local/bin/MirrorMeter.ksh vg_name
done

To add an entry to root's crontab:

# for host in `cat SpecificHosts.List`
do
ssh root@$host 'echo "0,5,10,15,20,25,30,35,40,45,50,55 * * * * /usr/sbin/restart-secldapclntd" >> /var/spool/cron/crontabs/root'
ssh root@$host 'kill -HUP $(ps -ef | grep "[c]ron" | awk "{print \$2}")'
done

As you can see, there is practically no task that cannot be automated. You either send it to the host as a script, which you execute later (often multiple times), or as a command or a whole series of commands. Regardless of how you do it, always remember to test it before deploying it.

By the way, while looking at the "stuff" above keep in mind that any line of code split on two or more lines in the absence of the \ character is really a single line.

Posted in Real life AIX.



comparing files in AIX and LINUX

It is easy: in AIX execute sdiff file1 file2, in LINUX execute the diff -i -a -w -y file1 file2 command. The result is two-column output that makes it easy to identify the lines that differ between the two files. You may consider sorting these files first, just a suggestion.

Posted in Real life AIX.





© 2008-2014 www.wmduszyk.com - best viewed with your eyes.