

How to capture boot debug of a SAN boot PowerVM Virtual I/O Server or AIX/NPIV client partition that is failing to boot?

Does it interest you? Go to page number 2.

Posted in Real life AIX.


links to various performance tools for AIX

You will find it all in one place following this link https://www.ibm.com/developerworks/mydeveloperworks/wikis/home?lang=en#/wiki/Power%20Systems/page/Other%20Performance%20Tools

Posted in AIX.


a few words about EtherChannel

Originally, this technology was meant to protect a host against the failure of a network adapter and/or a network switch. Additionally, some unscrupulous salesmen claimed a fantastic increase in throughput: two adapters tied together will double, three adapters tied together will triple the throughput of the resulting EtherChannel adapter. Yes, in a salesman's pure land of fantasy.

In reality you may get a few percent more (about 20% at most). You do not use EtherChannel for a volume increase but for a good night's sleep! There is a big difference between EtherChannel and (Link) Aggregation; these terms have different meanings! Check your switch documentation to see which term it uses for what AIX (IBM) calls EtherChannel, to avoid confusion, as in most cases aggregation means a trunk of ports, not EtherChannel. By the way, AIX also supports link aggregation.

As you contemplate EtherChannel for your own use, keep in mind that AIX has for a long time (I do not remember the version number) had a feature called the backup adapter. At a certain point (check it against your version of AIX), an EtherChannel adapter built on top of two physical NICs (one being the PRIMARY and the other the BACKUP) was not really an EtherChannel adapter, and it required no changes to the settings of the switch ports. Finally, it is a BAD idea to connect the EtherChannel adapter and its BACKUP adapter to a single switch, really.

Sometimes, the cables that you are convinced lead to different switches are in reality attached to the same switch: the one which just lost power. As a result, the most important application was just killed by the cluster daemon on the node that now has no network connectivity, and it was moved to the standby node in the other data center, with the understandable delay of a few minutes in the application services. It could not have happened at a better moment!

To make a long story short, at the end of the day the cables have to be traced and labeled, and a date scheduled for their swap. The question remains the same: how do you verify which cable goes to switch A and which cable goes to switch B? Why is that even a question? Well, as the BACKUP adapter carries no traffic, its switch port cannot “see” its MAC address.

Let’s agree that our EtherChannel adapter is ent8, that it consists of ent0, and that its backup adapter is ent4. The following shows how you can produce these details. In the output below, the entry adapter_names (plural) is not a mistake. An EtherChannel NIC may employ more than one physical NIC, and this set of NICs may be protected by yet another physical NIC whose role is to take over for all the component adapters if they all fail and die; this is the BACKUP adapter. For as long as a single component adapter of the EtherChannel is working, the BACKUP adapter does nothing; it springs to life when the last component NIC dies an honorable death.

# lsattr -El ent8 | grep adapter  
adapter_names  ent0 EtherChannel Adapters                      True
backup_adapter ent4 Adapter used when whole channel fails      True
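If you need these two values in a script rather than on the screen, something along these lines may do. It is only a sketch: the function name ethchan_members is made up for this example, and it assumes the attribute name sits in the first column and its value in the second, as in the listing above.

```shell
# Sketch: pull the primary list and the backup adapter out of
# `lsattr -El <etherchannel>` output read from stdin.
# ethchan_members is a hypothetical helper name, not an AIX command.
ethchan_members() {
    awk '
        $1 == "adapter_names"  { primary = $2 }   # may be a comma-separated list
        $1 == "backup_adapter" { backup  = $2 }
        END { printf "primary=%s backup=%s\n", primary, backup }
    '
}

# On the AIX host (ent8 is the EtherChannel from the text):
#   lsattr -El ent8 | ethchan_members
```

With more than one component adapter, the primary value is printed exactly as lsattr reports it.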

It is easy to verify that each participating adapter is connected to a different switch while you are implementing EtherChannel: just assign an IP address to each adapter, one by one, each time asking the LAN administrator to validate the connection with the switch (he can see the individual adapter's MAC address). Later it is not as easy. To repeat the same procedure you have to destroy the EtherChannel, and sometimes you may not be able to do it for reasons that are beyond you. So in this case just flip the roles: let the BACKUP become ACTIVE and the MAC should be seen (hopefully on the other switch).

MAC, is it the MAC or is it not the MAC? Well, EtherChannel is really cool, as it not only allows you to group NICs together to keep the resulting logical adapter alive if its components start to die, it also allows you to have a backup adapter just in case all the “primary” components (adapters) are no longer operational. The whole idea of multiple adapters sharing the same IP address raises an immediate question: “what about the MAC address associated with this IP address?” Why? A changing MAC address may not be well accepted by the operating system on the receiving end, by its applications, or by the intermediate routers or switches, as in reality it breaks one of the principles of TCP/IP. I think it is better to use EtherChannel's ability to assign one MAC to all its component adapters. In my case I create this new MAC by replacing the leading characters of the first MAC with the string BADBEEF, which consists of all valid hex characters …:-) This way, it is easy to spot the address of an EtherChannel adapter on a switch.
So consult your switch/router manual ahead of the go-live date. There is one more issue to consider here: are the switches capable of sharing VLANs? Can vlanA on port X of switch A also be assigned to port Y of switch B? Most likely this is not an issue; still, ask before the go-live.
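For what it is worth, the BADBEEF trick can be sketched as a tiny helper. Treat it as an illustration under my own assumptions: the name badbeef_mac is invented, and I assume the first seven hex digits are overwritten and the last five are kept. It only prints the new address; applying it to the EtherChannel is left to you and your change control.

```shell
# Hypothetical helper: overwrite the first seven hex digits of a
# 12-digit MAC with BADBEEF and keep the remaining five digits.
badbeef_mac() {
    # strip common separators and upper-case the hex letters
    mac=$(printf '%s' "$1" | tr -d ':.-' | tr 'abcdef' 'ABCDEF')
    case "$mac" in
        *[!0-9A-F]*) echo "not a hex MAC: $1" >&2; return 1 ;;
    esac
    [ "${#mac}" -eq 12 ] || { echo "expected 12 hex digits: $1" >&2; return 1; }
    # digits 8-12 of the original address survive
    tail5=$(printf '%s' "$mac" | cut -c8-12)
    printf 'BADBEEF%s\n' "$tail5"
}

# Example call (the input MAC is made up):
#   badbeef_mac 00:1a:64:9b:3c:7e
```

A side effect worth knowing: the first octet becomes BA, which has the locally administered bit set and the multicast bit clear, so the result is a proper unicast address of the “assigned by the administrator” kind.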

Since the existing EtherChannel adapter could not be destroyed, another way had to be found to validate the connectivity of its components. There is such a way! All that needs to be done is to flip the adapter roles! To flip the roles (so ent4 becomes the ACTIVE and ent0 becomes the BACKUP) you execute the ethchan_config command with the -f option against the EtherChannel adapter:

# /usr/lib/methods/ethchan_config -f ent8

I like to exercise my extremities so I do not have the /usr/lib/methods in my PATH ….. 🙂

So far it works because the EtherChannel adapter had only one primary adapter (ent0). What if there are more? You could use the -d command line option to remove component adapters until the EtherChannel has only one active physical adapter, and after testing you could re-add whatever you have removed with the -a option. For example:

# /usr/lib/methods/ethchan_config -d ent8 entX

Test connectivity, and re-add the adapter to the EtherChannel:

# /usr/lib/methods/ethchan_config -a ent8 entX

By the way, ethchan_config is a really cool tool to create and manipulate EtherChannel devices, highly recommended (of course, everything EtherChannel-wise can be done via smitty too).

If you can, I recommend you insist that the cables from the EtherChannel adapters be of a different color than the cable that connects its BACKUP adapter. Really, sometimes it is worth it.

In a few earlier posts I mentioned the fact that computer science is approx. 2.333% based on magic, and as such any procedure provided by a manufacturer and/or by these posts may fail when YOU attempt to perform it. This one, like any other of your failures as a system administrator, may also be the result of a combination of operating system and firmware versions, your overall luck and your accumulated karma. With this in mind, make sure you understand what you are about to do, set your expectations appropriately, test your procedure and schedule its date/time to minimize any negative outcome.

Play it safe in the data center; be a hero in a pub later, celebrating your victory over a machine!

Posted in Real life AIX.


10Gbit Ethernet, bad assumption and Best Practice

Lately, we have had some issues with network performance among our virtualized hosts. We found two priceless posts by the same author, Gareth M. Coates.
The first one can be read here https://www.ibm.com/developerworks/mydeveloperworks/blogs/aixpert/entry/10gbit_ethernet_bad_assumption_and_best_practice_part_137?lang=en. Good info indeed.

The second post by the same author, you can read here: https://www.ibm.com/developerworks/mydeveloperworks/blogs/aixpert/entry/powervm_virtual_ethernet_speed_is_often_confused_with_vios_sea_ive_hea_speed?lang=en

The next page is a printout of Gareth’s first post.

Posted in Real life AIX.


(re-)installing OSP’s VMware “tools” on a RedHat guest

VMware tools are not running on one of our guests and Igor, the creator of our VMware template is not around …. This post documents the process that I followed (after a few mistakes, u-turns, etc) to have these tools re-installed from the external repository instead of the outdated local one.

Posted in LINUX.



VIO Network Performance Tip

What follows is an excerpt from the February IBM News Letter USA East which you can read following this link http://www.wmduszyk.com/wp-content/uploads/2013/03/POWER-Sys-Newsletter-USA-East-3.pdf

By Doug Herman – hermand@us.ibm.com
When using the Virtual I/O server (VIO) for virtualizing physical networks to client logical partitions, a Shared Ethernet Adapter (SEA) is configured by using the physical Ethernet adapter and the virtual Ethernet adapter to create a layer 2 bridge. One way to improve network performance is to use the largesend option on the VIO SEA and the client logical partitions. The largesend feature allows sending large data packets over virtual Ethernet adapters without breaking up the packets into smaller MTU size packets. Starting with AIX 6.1 TL7-SP1 and AIX 7.1 TL0-SP1, the operating system supports the mtu_bypass attribute for the shared Ethernet adapter to provide a persistent way to enable the largesend feature:

ftp://public.dhe.ibm.com/common/ssi/ecm/en/pow03049usen/POW03049USEN.PDF

Using largesend (mtu_bypass) on the AIX interfaces boosts throughput between logical partitions within the hypervisor of the Power server, without using additional processor utilization. Set largesend on the VIO SEA, and mtu_bypass (largesend) on the AIX LPAR interfaces. This lowers both the sending AIX LPAR and the sending VIO processor usage when transferring to an outside machine. All MTU sizes remain at 1500. There is no requirement for Jumbo Frames.

Some examples of largesend attributes for performance:

# ifconfig en0 largesend 

(LPAR to LPAR, virtual to virtual, in same machine, single stream, binary FTP dd test)
1Gb per second without largesend
3.8Gb per second with largesend
– Higher throughput
– Processor utilization slightly higher on sender and slightly lower on receiver

largesend=1 on VIO SEA and largesend on client interfaces

– Much lower processor utilization on sender and on sending VIO

VIO SEA physical adapters should have both large_send and large_receive set to yes

$ lsdev -dev ent0 -attr |grep lar

large_receive yes Enable receive TCP segment aggregation True
large_send yes Enable hardware Transmit TCP segmentation
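To avoid eyeballing this on every VIO server, the check can be scripted. The sketch below is my own addition, not part of the newsletter; the function name check_large_offload is invented, and it assumes the attribute name is in the first column and its value in the second, as in the output above.

```shell
# Sketch: read `lsdev -dev <adapter> -attr` output on stdin and report
# whether large_send and large_receive are both "yes".
# check_large_offload is a hypothetical helper name.
check_large_offload() {
    awk '
        $1 == "large_send"    { send = $2 }
        $1 == "large_receive" { recv = $2 }
        END {
            if (send == "yes" && recv == "yes") { print "ok"; exit 0 }
            printf "check: large_send=%s large_receive=%s\n", send, recv
            exit 1
        }
    '
}

# On the VIO server:
#   lsdev -dev ent0 -attr | check_large_offload
```

The function returns a non-zero status when either attribute is not yes, so it can sit inside a loop over all the physical adapters behind your SEAs.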

To make the settings permanent:

VIO Server:

$ chdev -dev ent# -attr largesend=1 large_receive=yes

AIX LPAR:

# chdev -l en# -a mtu_bypass=on

Posted in Real life AIX.



fixing missing paths to SAN disks – part 2

In the past I posted information showing how to “activate” paths that for some reason were no longer on-line. I used a combination of the lspath command to weed out the Failed paths and the chpath command to change their state to Enabled.
Recently we cleaned the connectors of our FC fabric and, as expected, some of the affected paths switched state to Failed. As the number of paths to a disk (LUN) decreases, the I/O to this disk suffers too.

We noticed that this time the procedure described earlier did not work … To re-activate the missing paths we removed the FC adapter associated with them, immediately followed by a run of the “configuration mangler”, as we affectionately call the cfgmgr command here.

As a sort of reminder: to check the state of paths in the XIV environment you can use the command xiv_devlist. In the listing below, hdisk7 and hdisk15 show missing paths.

# xiv_devlist
XIV Devices
--------------------------------------------------------------------
Device        Size (GB)  Paths  Vol Name        Vol Id   XIV Id   XIV Host
--------------------------------------------------------------------
/dev/hdisk2   10.7       12/12  BIN_MS  243      7801370  BINORTPU001
--------------------------------------------------------------------
/dev/hdisk3   10.7       12/12  BIN_E6  364      7802518  BINORTPU001
--------------------------------------------------------------------
/dev/hdisk7   1754.5     6/12   BIN_E1  367      7802518  BINORTPU001
--------------------------------------------------------------------
/dev/hdisk15  532.6      6/12   BIN_D1  371      7802518  BINORTPU001
--------------------------------------------------------------------
/dev/hdisk16  532.6      12/12  BIN_D1  263      7801370  BINORTPU001
--------------------------------------------------------------------
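With more than a handful of LUNs it is easier to let awk do the looking. A minimal sketch, assuming the Paths column is the third field and has the x/y form shown above; the function name xiv_degraded is made up for this example.

```shell
# Sketch: read xiv_devlist output on stdin and print every device whose
# number of working paths is lower than the expected total.
# xiv_degraded is a hypothetical helper name.
xiv_degraded() {
    awk '
        $1 ~ /^\/dev\// && $3 ~ /^[0-9]+\/[0-9]+$/ {
            split($3, p, "/")                  # p[1] = working, p[2] = expected
            if (p[1] + 0 < p[2] + 0) print $1, $3
        }
    '
}

# On the host:
#   xiv_devlist | xiv_degraded
```

Header and separator lines are skipped because they do not start with /dev/.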

The previous post showed this method of locating the FC adapters associated with the missing paths. You could also use the lspath command.

# lspath -l hdisk7 -H -F"name parent path_id connection status"
name   parent path_id connection                     status
 
hdisk7 fscsi0 0       5001738009d60142,5000000000000 Enabled
hdisk7 fscsi0 1       5001738009d60182,5000000000000 Enabled
hdisk7 fscsi0 2       5001738009d60162,5000000000000 Enabled
hdisk7 fscsi1 3       5001738009d60142,5000000000000 Enabled
hdisk7 fscsi1 4       5001738009d60182,5000000000000 Enabled
hdisk7 fscsi1 5       5001738009d60162,5000000000000 Enabled
hdisk7 fscsi2 6       5001738009d60150,5000000000000 Enabled
hdisk7 fscsi2 7       5001738009d60190,5000000000000 Enabled
hdisk7 fscsi2 8       5001738009d60170,5000000000000 Enabled
hdisk7 fscsi3 9       5001738009d60150,5000000000000 Failed
hdisk7 fscsi3 10      5001738009d60190,5000000000000 Failed
hdisk7 fscsi3 11      5001738009d60170,5000000000000 Failed

Now, to selectively enable the paths execute:

# chpath -s enabled -l hdisk7 -p fscsi3
# lspath -l hdisk7 -H -F"name parent path_id connection status"
name   parent path_id connection                     status
 
hdisk7 fscsi0 0       5001738009d60142,5000000000000 Enabled
hdisk7 fscsi0 1       5001738009d60182,5000000000000 Enabled
hdisk7 fscsi0 2       5001738009d60162,5000000000000 Enabled
hdisk7 fscsi1 3       5001738009d60142,5000000000000 Enabled
hdisk7 fscsi1 4       5001738009d60182,5000000000000 Enabled
hdisk7 fscsi1 5       5001738009d60162,5000000000000 Enabled
hdisk7 fscsi2 6       5001738009d60150,5000000000000 Enabled
hdisk7 fscsi2 7       5001738009d60190,5000000000000 Enabled
hdisk7 fscsi2 8       5001738009d60170,5000000000000 Enabled
hdisk7 fscsi3 9       5001738009d60150,5000000000000 Enabled
hdisk7 fscsi3 10      5001738009d60190,5000000000000 Enabled
hdisk7 fscsi3 11      5001738009d60170,5000000000000 Enabled
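When many disks are involved, the same two steps can be generated instead of typed. The sketch below only prints the chpath commands so you can review them first; the function name failed_path_fixes is invented, and it assumes the field order used in the lspath calls above (status last, disk first, parent second).

```shell
# Sketch: read `lspath -F"name parent path_id connection status"` output
# on stdin and print one chpath command per disk/parent pair that has
# at least one Failed path. Nothing is executed.
# failed_path_fixes is a hypothetical helper name.
failed_path_fixes() {
    awk '
        $NF == "Failed" {
            key = $1 " " $2
            if (!(key in seen)) {
                seen[key] = 1
                printf "chpath -s enabled -l %s -p %s\n", $1, $2
            }
        }
    '
}

# On the host, for all disks at once:
#   lspath -F"name parent path_id connection status" | failed_path_fixes
```

Printing rather than running is deliberate; pipe the output to sh only after you have read it.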

As I mentioned above, this time the procedure could not “fix” all our issues with the Failed paths. So we resorted to something more drastic, which in this case was

# rmdev -dl fcs3 -R; cfgmgr

These two steps were executed against each adapter with missing paths, one adapter at a time. This procedure worked for all of our cases – XIV, SVC and HDS.

Posted in Real life AIX.



read iso images without a cd-drive

After gaining root access to the host (see the last post) it is finally time to upgrade its disk drivers … In this case, they landed on my desktop in the form of an iso image. This is all nice and dandy but the host does not have a cd/dvd drive … This post shows how to unpack an iso image without one.

Use whatever is appropriate to move the iso image to the host that needs it. Next, create a file system (I call it /iso) of the appropriate size and with the appropriate attributes (-pro, -tno).

# crfs -v jfs2 -g vg_name -a size=5G -m /iso -Ano -pro -tno

Since I allowed LVM to create the logical volume while creating the file system above, now it is time to learn its name:

# lsfs /iso | grep fslv
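The grep above relies on the default fslvNN naming. If you prefer to match on the mount point instead, a small awk filter may be a safer bet. This is a sketch under the assumption that lsfs prints the device name in the first column and the mount point in the third; the function name lv_for_mount is made up.

```shell
# Sketch: read lsfs output on stdin and print the device whose mount
# point (third column) equals the argument.
# lv_for_mount is a hypothetical helper name.
lv_for_mount() {
    awk -v mp="$1" '$3 == mp { print $1 }'
}

# On the host:
#   lsfs /iso | lv_for_mount /iso
```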

In the next line, if= defines the location of the image and of= defines the logical volume just identified. To load the image into the file system we use a command as old as UNIX:

# dd if=/home/duszyk/dlmglm_074001.iso of=/dev/fslv00 bs=100M

Now, mount the file system (mount /iso) and explore its contents.

It came quick and swift, I am talking about the two comments to this post 🙂 . Thank you Marcus and Ku!!!! For the reader: if you are on AIX 6.1.4 and above, please use the loopmount command instead of creating a file system and populating it with the image with the dd command!

By the way, see one of the previous posts for the loopmount command: http://www.wmduszyk.com/?p=7999&langswitch_lang=en
It looks like I need to add a few GBs to my own memory ….. 🙂

Posted in AIX, Real life AIX.



recovering forgotten root password with NIM

A decision was made that the fastest way to answer the need for a new AIX host would be to re-purpose an existing dormant partition. After the CPU and RAM quantities were set in its profile, the partition was powered on. Next came the discovery that the currently used root password does not work here …… With a working NIM server nothing seems to be a problem :-).

What follows documents the process which, with the help of a NIM server, allows you to reset the root password in AIX. It does not matter if this is a stand-alone host or a partition; the procedure remains the same.

Posted in Real life AIX.



password grammar in LINUX – PAM style

In AIX, controlling password grammar (composition rules), expiration and all the other “attributes” not mentioned here, on both a local and a global scale, is very easy. Everything is collected and ready to be manipulated to satisfy your fancy in a single file called user, located in the directory /etc/security.

This file, like many other files in AIX, is based on stanzas. There is the default stanza where all attributes are set to their “default” values. This stanza is followed by any number of other stanzas, each representing a particular user whose password/login attributes differ from the ones set in the default stanza. It is nice and easy. Does it answer all needs in this area? Maybe not, most likely not. Is it a dated solution? It could be. In my opinion it is clean and efficient, and if anything is called for beyond what this file provides, one has to remember that AIX has PAM; yes, AIX supports PAM. Having said that, I admit that I have never used PAM with AIX and have no idea to what extent PAM is supported by this operating system. What does PAM stand for? PAM is short for Pluggable Authentication Modules.

Here comes LINUX. A few days ago, I started to think about RedHat: how do you set a different set of password controls on a RedHat host with a configured and operational Active Directory authentication? What do you do to apply a different set of password attributes to the local users? This post is a way to share with you what I have discovered so far. As always, let me know if you find an error, an omission or a better solution to what I have shown here.

In RedHat, to make sure that the local users (the ones present in the file /etc/passwd) authenticate by local means and not via AD, LDAP or something else, you have to check the files inside the directory /etc/pam.d for the presence of the following entry:

account       sufficient             pam_localuser.so

On my host, this entry is present in a few files, namely fingerprint-auth, password-auth, password-auth-ac, smartcard-auth and system-auth. Not all of them are actually needed; for example, this host does not employ any fingerprint recognition device to authorize the user logging in. I think that in most cases it is enough to have this entry in both files whose names start with password.
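To see which files carry the entry on your own host, a one-line grep wrapped in a function is enough. A sketch with an invented function name (pam_localuser_files); it lists only files where the line is not commented out and takes the directory as an optional argument.

```shell
# Sketch: list PAM service files with an active (uncommented) line that
# mentions pam_localuser.so. The directory defaults to /etc/pam.d.
# pam_localuser_files is a hypothetical helper name.
pam_localuser_files() {
    dir=${1:-/etc/pam.d}
    # first non-blank character must not be '#'
    grep -l '^[[:space:]]*[^#[:space:]].*pam_localuser\.so' "$dir"/* 2>/dev/null
}

# On the RedHat host:
#   pam_localuser_files
```

It does not check the control flag (sufficient, required and so on), so still read the matching lines yourself.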

Next comes the grammar. To make sure that the local users' passwords are sufficiently strong and contain the correct ratio of numerals/upper/lower/special characters and so forth, we look into the same directory again, but now seeking a different PAM module. This time we look for the PAM module called pam_cracklib.so, which we find in the file called password-auth. For example, the following entry:

password	required	pam_cracklib.so \
                    dcredit=-1 ucredit=-1 lcredit=-1 minlen=8

requires a password to contain at minimum one (1) digit (dcredit), one (1) upper case character (ucredit), one (1) lower case character (lcredit), and to be no shorter than 8 characters (minlen). Not to mention that this module will automatically check the selected password against a dictionary, and if it is found to be a word …. guess what? It will be refused and the user will have to select a different one.
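To make the four rules tangible, here is a toy check that mirrors them in plain shell. It is an illustration only: the function name meets_example_policy is invented, this is not how pam_cracklib works internally, and there is no dictionary test.

```shell
# Illustration only: mirrors the four rules of the example entry
# (one digit, one upper-case, one lower-case, at least 8 characters).
# NOT pam_cracklib's algorithm; no dictionary check is performed.
# meets_example_policy is a hypothetical helper name.
meets_example_policy() {
    pw=$1
    [ "${#pw}" -ge 8 ] || return 1                              # minlen=8
    case "$pw" in *[[:digit:]]*) ;; *) return 1 ;; esac         # dcredit=-1
    case "$pw" in *[[:upper:]]*) ;; *) return 1 ;; esac         # ucredit=-1
    case "$pw" in *[[:lower:]]*) ;; *) return 1 ;; esac         # lcredit=-1
    return 0
}

# Example:
#   meets_example_policy 'Abcdefg1' && echo accepted
```

A password that passes this toy test can still be refused by the real module, for instance when it is a dictionary word.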

Now, what about the previous passwords? How to prevent their repetitive usage? First, verify that this file is present:

# ls -l /etc/security/opasswd
-rw-------. 1 root root 0 Apr  5  2012 /etc/security/opasswd

This file is the “keeper” of users' previous passwords. The number of previous passwords per user stored inside this file is controlled by the value of the token called remember. Thanks to the developer for using meaningful terms!!!!!!!!!! The next lines show this setting in action (an excerpt from the file /etc/pam.d/password-auth).

password    sufficient    pam_unix.so \
                 sha512 shadow nullok try_first_pass use_authtok \
                 remember=4

On this host, a user is not allowed to repeat any of his last four passwords. By the way, the maximum for remember is 400 (past passwords).

For now this is all I have, stay warm,

MarkD:-)

Posted in Linux.





Copyright © 2016 - 2017 Waldemar Mark Duszyk. All Rights Reserved. Created by Blog Copyright.