


possible paging issues with AIX 6.1.7

On hosts running one specific application (EPIC/CACHE) we observed incidents of overwhelming paging, which we subsequently traced to 64 KB memory pages being “enabled”. It started after the upgrade to AIX 6.1.7.5. All other applications (ORACLE, LAWSON, CLARITY, etc.) showed no adverse behavior and are running just fine. For the one that did “page a lot” we disabled support for 64 KB pages.

Here is the official explanation from our IBM support engineer:

There are not currently any hard and fast rules regarding the ratio of paging space to memory. The requirements change based on apps, workload, … Paging space is tuned based on the needs of the server.

Regarding your system continually running out of paging space we believe you may have hit an APAR.  IV26272: REDUCE EARLY WORKING STORAGE PAGING
http://www-01.ibm.com/support/docview.wss?uid=isg1IV26272 

An explanation and possible resolution 

The kernel parameter numperm_global was implemented and enabled with AIX 6.1 TL7 SP4/7.1 TL1 SP4 to be able to look at paging from a global perspective. That means that before AIX 6.1 TL7 SP4/7.1 TL1 SP4 the numperm_global tunable was not available. Unfortunately, numperm_global might cause early paging in some environments due to a failed pincheck on 64K pages related to different size mempools. There are two possible ways to prevent the problem from happening:

1) Disable global numperm (numperm_global=0)
2) Increase the number of unpinned pages for the page size that is close to maxpin. In the previous case, the 64K page pool is close to maxpin. Most of the pinned pages are kernel heap and there is a large number of 4K computational pages used by the customer's application. Forcing the application to use 64K pages (LDR_CNTRL) will reduce the percentage pinned count for the 64K pool and therefore prevent the problem from happening.
                                                                              
Customers who are running workloads like DB2 or Oracle and follow the best practices (using 64K page size) are unlikely to experience this early paging problem. 
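
The second workaround can be sketched in shell. This is only a sketch, not IBM's procedure: the helper name ldr_cntrl_for and the application path in the comment are made up for illustration, while DATAPSIZE, TEXTPSIZE and STACKPSIZE are the usual LDR_CNTRL page-size keywords. The first workaround is a vmo change (numperm_global=0); since that is a kernel tunable, confirm the exact procedure with IBM support before applying it.

```shell
#!/bin/sh
# Build an LDR_CNTRL value that requests one page size for the data,
# text and stack segments of a process.
ldr_cntrl_for() {
    size="$1"    # e.g. 64K
    printf 'DATAPSIZE=%s@TEXTPSIZE=%s@STACKPSIZE=%s\n' "$size" "$size" "$size"
}

# Show the value that would be used for 64K pages.
ldr_cntrl_for 64K

# Hypothetical usage: start one application with 64K pages without
# changing the setting for the rest of the system.
#   LDR_CNTRL=$(ldr_cntrl_for 64K) /path/to/application
```

Setting the variable only on the command line of the application keeps the change scoped to that one process tree.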

UPDATE:

If you want to learn more about AIX and memory pages, follow the link provided by Lonny Niederstadt from EPIC Corporation.
http://www-03.ibm.com/systems/resources/systems_p_os_aix_whitepapers_multiple_page.pdf

Thanks Lonny!

Posted in Real life AIX.


secldapclntd will not work with SSL

Once, there was an AIX system whose LDAP client refused to run on top of SSL. No way, ever! An AIX update did not help, an LDAP software update did not help, an SSH/SSL upgrade did not help, a GSKit patch did not help. It seemed that this system was cursed.

# start-secldapclntd
Starting the secldapclntd daemon.
3001-710 SSL initialization failed. Check the SSL key path and key password in the /etc/security/ldap/ldap.cfg file.
3001-710 SSL initialization failed. Check the SSL key path and key password in the /etc/security/ldap/ldap.cfg file.
The secldapclntd daemon failed to start.

The ldapsearch command executed with SSL and a key file kept failing with:

ldap_ssl_client_init failed! rc == -1, failureReasonCode == 804400244
Unknown SSL error

Well, here comes To Vo, who says: “Mark, please execute this command:”

/opt/IBM/ldap/V6.3/bin/idslink -igl32 -f

It works, it works like a charm, thanks To Vo!

Posted in Real life AIX.



ANS1030E The operating system refused TSM request for memory

One AIX machine refused the TSM backup of a certain file system, throwing the following messages:

ANS1999E Incremental processing of '/filesystem/name' stopped.

Followed by

ANS1030E The operating system refused a TSM request for memory allocation.

Investigation of /etc/security/limits revealed nothing unusual:

default:
        fsize = -1
        core = 2097151
        cpu = -1
        data = 262144
        rss = 65536
        stack = 65536
        nofiles = -1
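
For reference, AIX expresses the fsize, core, data, rss and stack limits in 512-byte blocks. A small sketch (the helper name blocks_to_mb is made up here) converts the values above to megabytes:

```shell
#!/bin/sh
# Convert a limit given in 512-byte blocks to whole megabytes.
blocks_to_mb() {
    echo $(( $1 * 512 / 1048576 ))
}

echo "data  = $(blocks_to_mb 262144) MB"
echo "rss   = $(blocks_to_mb 65536) MB"
echo "stack = $(blocks_to_mb 65536) MB"
```

So the default data limit in this stanza is 128 MB, and the default rss and stack limits are 32 MB each.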

Looking at nmon and vmstat did not indicate any memory shortages either. This host runs 64-bit AIX and never had this problem before… The uptime command showed just fourteen days. I asked for permission to reboot this machine; the reboot may have to wait a few days.

Looking on-line, I found a few IBM TSM notes on this subject and with the newly gained knowledge, I implemented the following two changes.

The following line was added to the file called dsm.sys:

memoryefficientbackup yes

The next line was added to the inclexcl file. This is one continuous line, not two as your browser may show.

INCLUDE.FS /filesystem/name MEMORYEFFICIENTBACKUP=DISKCACHEMETHOD DISKCACHELOCATION=/TSM_cache

Where /filesystem/name is the path which dsmc previously failed to back up.

Following these two changes and a refresh of the dsmc daemon, the next incremental worked like a charm.
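
The two edits can be scripted so that repeated runs do not duplicate the lines. This is only a sketch: the helper add_line_once is made up for illustration, and the demonstration writes to scratch files rather than to the real client option files, whose location depends on the installation. Keep in mind that an option in dsm.sys applies to the server stanza it sits in, so appending at the end of the file is only right when the last stanza is the one you mean.

```shell
#!/bin/sh
# Append a line to a file unless an identical line is already present.
add_line_once() {
    file="$1"; line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrate on scratch copies, not on the live configuration.
workdir=$(mktemp -d)
dsm_sys="$workdir/dsm.sys"
inclexcl="$workdir/inclexcl"

add_line_once "$dsm_sys" "memoryefficientbackup yes"
add_line_once "$inclexcl" "INCLUDE.FS /filesystem/name MEMORYEFFICIENTBACKUP=DISKCACHEMETHOD DISKCACHELOCATION=/TSM_cache"
# A second run changes nothing.
add_line_once "$dsm_sys" "memoryefficientbackup yes"

cat "$dsm_sys" "$inclexcl"
```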

Posted in Real life AIX.



AIX Support Center Tools

Do you know what zsnap is all about? What about devscan or VIOS Advisor? An IBM engineer just gave me this link to the IBM SUPPORT CENTER TOOLS. Something new, something good.

Posted in Real life AIX.


replacing FC adapters …… port speed matches adapter speed?

After relocating a host from one rack to another, its two 8Gb PCI Express Dual Port FC Adapters failed. They “live” in a 5802 I/O drawer attached to an 8204-E8A. Each adapter lost one port, the lower one. I guess misery loves company, right? The affected ones were fcs7 and fcs9. Both, when treated with the sanscan utility (described in an earlier post), returned the same message.

# sanscan fscsi7
sanscan v2.2
Copyright (C) 2010 IBM Corp., All Rights Reserved
Opening device /dev/fscsi7 failed with errno ENETUNREACH
Cleaning up...
Completed with error(s)

The diag routine, executed with and without the “wrap plug”, did not help: both adapters were declared OK. Still, I have seen bad adapters misdiagnosed by these utilities. New fiber cables were stretched from the switch to both ports, done. Still the issue persisted. Magic, pure magic.

Before you declare the cards bad or failed, make sure that the SAN administrator has set the switch ports to match the speed of the FC adapters attached to them.
If the adapters are 8Gbps make sure the ports are set to 8Gbps. If they are 4Gbps make sure the ports are set to 4Gbps. Otherwise you may spend more than 24 hours absolutely unnecessarily suspecting the cards, waiting for their replacements and facing the identical situation after the new cards are in, while listening to people around you spinning tales of bad motherboards, issues with AIX, the level of your own skills and so forth ……… life can be really entertaining.
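
On the AIX side, fcstat reports both the speed an adapter supports and the speed it is currently running at, which is a quick thing to look at before blaming the hardware. The sketch below is an illustration only: the helper port_speeds is made up, and the sample text fed to it is hand-written to resemble the two relevant fcstat lines, not captured from the failed adapters.

```shell
#!/bin/sh
# Read `fcstat fcsN` output on stdin and print the supported and
# running port speeds on one line.
port_speeds() {
    awk -F': *' '
        /Port Speed \(supported\)/ { sup = $2 }
        /Port Speed \(running\)/   { run = $2 }
        END { printf "supported=%s running=%s\n", sup, run }
    '
}

# Hand-written sample resembling the relevant fcstat lines.
sample='Port Speed (supported): 8 GBIT
Port Speed (running):   4 GBIT'

printf '%s\n' "$sample" | port_speeds

# On a live host: fcstat fcs7 | port_speeds
```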

UPDATE:

a. I wrote this post after a very long session (over 24 hours) which apparently impaired my brain….. As a result, I was (wrongly) under the impression that adapters and devices present in the I/O drawer attached to my host are not hot-pluggable….. Nothing could be further from the TRUTH!!!!!!! If lsdev can see them, the Hot Plug tasks in either smitty or diag will see them too. Yes, the contents of 5802 I/O drawers are fully hot-pluggable.

b. We use Brocade switches, which support up to 16Gbps when they are LICENSED to do so! A few minutes ago, we discovered that the new switch we failed to connect to last Saturday is not licensed for 8Gbps!!!! We used the licenseshow command to verify it.

Posted in Real life AIX.



TDS LDAP client issues

One AIX host went through a motherboard replacement. Surprisingly, after the host was powered ON nobody could log in. The only way to do that was via HMC. It did not take a long time to determine that any commands associated with secldapclntd did not work. The following commands failed: lsldap -a passwd, lsuser -R LDAP and ls-secldapclntd.

This host communicates with TDS LDAP servers over SSL so the “key” files and their password quickly became the primary suspects. We renamed the original /etc/security/ldap/xxxxxxx.kdb file and replaced it with a file copied from another host. This new file was renamed to match the name of the original. The secldapclntd was restarted and any expectations of fixing this issue died quickly. LDAP still not working!

Next, the ldapsearch command was tried with the “.kdb” file and the bind user passwords in plain text.

host:RDC:/root> ldapsearch -h tdsServerName -Z \
-K /etc/security/ldap/FileName.kdb -P kdbPassword \
-D cn=bindUser -w bindUserPassword \
-b ou=People,cn=aixdata,dc=wmd,dc=edu -s sub "objectclass=*"

The results were nothing short of spectacular! OK, so there is something fishy about one of the passwords….. When the “key” files were created (about two years ago) the keys were set to expire in 10 years…. Could it be that the encrypted password of the bind account querying the TDS LDAP servers on behalf of this host somehow stopped working? To test this hypothesis, the bindpwd line in /etc/security/ldap/ldap.cfg was changed: its content, aka the “encrypted” password, was replaced with its plain text version.

From this:

# LDAP server bind DN password
bindpwd:{DESv2}A3D8A8F5BCEA39 E 04599999996F2 E8F9CCC1AA3B68EC1DA

To this:

# LDAP server bind DN password
bindpwd:plaintextpassword

The secldapclntd was refreshed and the host exhibited full LDAP functionality again. It is obvious that the existing encrypted password string is no longer accepted. To create a new one, we executed the next command with the original password:

host:RDC:/root>secldapclntd -e original_password
{DESv2}744655DF EC085C3A53A5A7F436C6DC4host:RDC:/root>

The line above needs to be copied without the trailing host:RDC:/root> (which in my case is the host prompt) and pasted into the ldap.cfg file as shown next.

# LDAP server bind DN password
bindpwd:{DESv2}744655DF EC085C3A53A5A7F436C6DC4

Recycle secldapclntd after this change!
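
Replacing the bindpwd line can also be scripted. A minimal sketch, assuming the file contains a single line starting with bindpwd: (the helper set_bindpwd is made up for illustration, and the demonstration edits a scratch file, not the live /etc/security/ldap/ldap.cfg):

```shell
#!/bin/sh
# Replace the value of the bindpwd line in an ldap.cfg-style file,
# keeping a backup copy next to it.
set_bindpwd() {
    cfg="$1"; newval="$2"
    cp "$cfg" "$cfg.bak" || return 1
    awk -v v="$newval" '
        /^bindpwd:/ { print "bindpwd:" v; next }
        { print }
    ' "$cfg.bak" > "$cfg"
}

# Demonstrate on a scratch file.
cfg=$(mktemp)
printf '%s\n' '# LDAP server bind DN password' 'bindpwd:plaintextpassword' > "$cfg"

set_bindpwd "$cfg" '{DESv2}744655DF EC085C3A53A5A7F436C6DC4'
cat "$cfg"
```

Writing the result back into the existing file, rather than moving a new file over it, leaves the file's ownership and permissions as they were. On the real host the daemon still has to be recycled afterwards.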

Posted in ldap, Real life AIX.



Integrating Red Hat with Active Directory

This morning, I found this publication showing everything required for a successful integration with AD – including time synchronization, DNS, Samba setup and so forth.
If you are branching to LINUX or just have to support LINUX in addition to AIX then this document may help you.

Integrating Red Hat with Active Directory

Posted in Linux.



a nice VMWARE link

My friend Adam is into VMWARE. Today, he sent me an email with this link, which he highly recommends. If you are “into” VMWARE this could be something for you too.

http://vsphere-land.com/

My friend Tony recommends this link:

http://planet.vsphere-land.com/

Posted in Real life AIX.



any issues upgrading to AIX 6.1.7.5?

Recently, we upgraded two machines to AIX 6.1.7.5 and both experienced incidents of very heavy paging to the point that they had to be rebooted – we are not sure that there is any relation between this upgrade and paging. Has anybody else experienced the same?
If so, please let me know – we have to patch a few extremely “important” machines and we do not want to complicate our lives.

Thanks!!!!

Update:

So far it looks like TL7SP5 automatically turns ON support for 64kb memory pages. You can check it by executing vmstat -P ALL. To disable this feature, execute vmo -r -o vmm_mpsize_support=0, agree to update multibos and reboot the system.
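
A quick way to script the check is to look for a 64K row in the vmstat -P ALL output. The sketch below is illustrative: the helper has_64k_pages is made up, and the sample rows are hand-written in the general shape of that output, not taken from the affected hosts.

```shell
#!/bin/sh
# Read `vmstat -P ALL` output on stdin; succeed if a 64K row is present.
has_64k_pages() {
    awk '$1 == "64K" { found = 1 } END { exit !found }'
}

# Hand-written sample rows (page size first, then counters).
sample='   4K   308832   228825    41133     0     0     0    13     42     0
  64K    46118    45121      997     0     0     0     0      0     0'

if printf '%s\n' "$sample" | has_64k_pages; then
    echo "64K pages in use"
else
    echo "4K pages only"
fi

# On a live host: vmstat -P ALL | has_64k_pages
```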

Posted in Real life AIX.




Copyright © 2015 - 2016 Waldemar Mark Duszyk. - best viewed with your eyes.. Created by Blog Copyright.