
patching RedHat not only for AIX administrators

Red Hat Satellite server must be the source of all packages installed on each and every RedHat host in a given environment. If this is not the case, patching will never be 100% dull; instead it will always be an interesting and adventurous event…. The case in point: a system administrator complains that while patching a host, its operating system complains about a few python libraries. A quick investigation shows that the host has many duplicate packages – some from EPEL, some from IBM (in this particular case). What is going on is this: during an upgrade, yum identifies the package it knows about (as it was installed by yum from a known repo) and deletes it. Next, just prior to installing the updated version, it checks for the presence of the package using its general name… This is where the process fails, as it finds the other package still present – both packages share the same general name. What follows is an excerpt from the error messages generated while trying to update the package (python-ply).

In our case, one of the offending packages is python-ply-3.4-4.el6.noarch.

Error unpacking rpm package python-ply-3.4-4.el6.noarch
error: unpacking of archive failed on file /usr/lib/python2.6/site-packages/ply-3.4-py2.6.egg-info:
cpio: rename was supposed to be removed but is not!
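A quick way to spot this condition is to look for package names that appear more than once in rpm's listing. A minimal sketch (canned input here, since the exact host output is not reproducible):

```shell
# Look for package names that appear more than once in the installed-package list.
# Canned sample input below; on a real host, replace the printf with:
#   rpm -qa --qf '%{NAME}\n'
printf 'python-ply\nkernel\npython-ply\nbash\n' | sort | uniq -d
```

On a RHEL 6 host with yum-utils installed, `package-cleanup --dupes` performs a similar check using full version information.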

With some knowledge of yum and a bit of persistence, one may get the yum -y update to complete by stepping around the issues caused by the duplicate packages, but this does not mean that the system has really been patched, as the following command proves.

# yum --security check-update
Loaded plugins: product-id, refresh-packagekit, rhnplugin, security, subscription-manager
This system is receiving updates from RHN Classic or RHN Satellite.
prodclone-epel_rhel6_x86_64        | 1.3 kB     00:00
Limiting package lists to security relevant ones
epel/updateinfo                    | 1.0 MB     00:00
1 package(s) needed for security, out of 4 available
Security: kernel-2.6.32-573.el6.x86_64 is an installed security update
Security: kernel-2.6.32-504.16.2.el6.x86_64 is the currently running version

python-pip.noarch     7.1.0-1.el6      epel
python-ply.noarch     3.4-4.el6        prodclone-epel_rhel6_x86_64

To fully patch this host, one should first remove the package (or, even better, all packages) that were installed outside the Satellite server. In this case we will start with the first offender.

# yum remove

By the way, a lot of dependent IBM packages will be removed in this step. When the last command finishes, we will finally be able to install/upgrade the right python-ply package.

# yum -y update python-ply
Loaded plugins: product-id, refresh-packagekit, rhnplugin, security, subscription-manager
This system is receiving updates from RHN Classic or RHN Satellite.
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package python-ply.noarch 0:3.4-4.el6 will be installed
--> Finished Dependency Resolution

Now, let’s check if all security errata have been applied.

# yum --security check-update -q

The last command generates no output – there are no applicable security errata. Are there any other non-security related errata which are still waiting to be installed?

# yum update
Loaded plugins: product-id, refresh-packagekit, rhnplugin, security, subscription-manager
This system is receiving updates from RHN Classic or RHN Satellite.
Setting up Update Process
No Packages marked for Update

Nothing to patch here. You could also log in to the Satellite server, select the appropriate host and open its “errata” tab, which should be empty, validating the output of the last two commands.
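Checks like this also script nicely, because check-update signals its result through the exit code: 0 when nothing is pending, 100 when updates are available. A small sketch:

```shell
# yum check-update exit codes: 0 = nothing pending, 100 = updates available.
report_security_status() {
    yum --security check-update -q >/dev/null 2>&1
    case $? in
        0)   echo "no pending security errata" ;;
        100) echo "security updates pending" ;;
        *)   echo "yum error (or yum not present)" ;;
    esac
}
report_security_status
```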

Posted in Linux.


printing stanzas

If you have worked with AIX, you may have quickly gotten used to grep -p stanza: /some/pathto/file to display whole paragraphs of a file. Unfortunately, other UNIX flavors do not share the same idea of what a “stanza” is, and their grep does not have the -p option.
Recently, I was tasked with extracting information about the unpatched packages of a Linux host.

To generate the report (in text format) I used the following command:

# oscap xccdf eval \
            --results results-cve-`hostname`.xml \
            --report report-cve-`hostname -s`.html \
               com.redhat.rhsa-all.xccdf.xml \
               >>  report-cve-rhnprod.txt 

The resulting text file is in paragraph format.

Title   RHSA-2015:1137: kernel security and bug fix update (Important)
Rule    oval-com.redhat.rhsa-def-20151137
Ident   RHSA-2015-1137
Ident   CVE-2014-9420
Result  pass

Title   RHSA-2015:1139: kernel-rt security, bug fix, and enhancement update (Important)
Rule    oval-com.redhat.rhsa-def-20151139
Ident   RHSA-2015-1139
Ident   CVE-2014-9420
Result  fail

So, how do we extract from this file only the “stanzas” whose Result is fail? Like this:

# awk -vRS= -vORS='\n\n' '/fail/' report-cve-rhnprod.txt

Title   RHSA-2015:1081: kernel security, bug fix, and enhancement update (Important)
Rule    oval-com.redhat.rhsa-def-20151081
Ident   RHSA-2015-1081
Ident   CVE-2014-9419
Ident   CVE-2014-9420
Ident   CVE-2014-9585
Ident   CVE-2015-1805
Ident   CVE-2015-3331
Result  fail

Title   RHSA-2015:1221: kernel security, bug fix, and enhancement update (Moderate)
Rule    oval-com.redhat.rhsa-def-20151221
Ident   RHSA-2015-1221
Ident   CVE-2011-5321
Ident   CVE-2015-1593
Ident   CVE-2015-2830
Ident   CVE-2015-2922
Ident   CVE-2015-3636
Result  fail
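The same empty-RS trick composes nicely with a field scan. For example, to pull only the CVE identifiers out of the failing stanzas (shown here against a small canned sample built from the report above):

```shell
# Paragraph mode (-vRS=) plus a field loop: print only CVE idents from "fail" stanzas.
sample='Title   RHSA-2015:1137
Ident   CVE-2014-9420
Result  pass

Title   RHSA-2015:1081
Ident   CVE-2014-9419
Ident   CVE-2015-1805
Result  fail'
printf '%s\n' "$sample" |
    awk -vRS= '/Result[ \t]+fail/ { for (i = 1; i <= NF; i++) if ($i ~ /^CVE-/) print $i }'
```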

Posted in Linux.


AIX+cifs+WIN2012R2 = doesn’t work ?

Be aware that the group policy might need to be changed on the WIN2012 host side before it will allow an AIX host to mount its shares…
It could be that “SMB Signing”, which is incompatible with the current AIX cifs driver, is enabled on the Windows server whose share you want to mount…

Posted in AIX, Linux.


rpmdb: unable to join the environment error

See below what happened when I wanted to query an rpm package:

# rpm -q --last kernel
rpmdb: unable to join the environment
error: db3 error(11) from dbenv->open: Resource temporarily unavailable
error: cannot open Packages database in /var/lib/rpm
rpmdb: unable to join the environment
error: db3 error(11) from dbenv->open: Resource temporarily unavailable

Something was not correct with the rpm environment on this machine! After a quick search, I found an on-line article providing this solution:

# rm -f /var/lib/rpm/__db*
# echo "%__dbi_cdb create private cdb mpool mp_mmapsize=16Mb mp_size=1Mb" > /etc/rpm/macros
# rpm --rebuilddb

I tried it and it failed too.

I would probably have found the answer sooner had I checked the state of /var, but as they say – it is never too late. I decided to execute the previous command with more verbose output. Look what it said:

# rpm -vv --rebuilddb
D: rebuilding database /var/lib/rpm into /var/lib/rpmrebuilddb.51978
D: creating directory /var/lib/rpmrebuilddb.51978
error: failed to create directory /var/lib/rpmrebuilddb.51978:
No space left on device

What is at the very end? Yes, /var was full!!! From this point it was all easy – resize /var and re-execute rpm --rebuilddb.
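The moral generalizes: check /var before touching the rpm database. A small guard sketched below (the 100MB threshold is an arbitrary example):

```shell
# Refuse to proceed unless the given directory has at least N megabytes free.
require_free_mb() {
    dir=$1; need_mb=$2
    avail_kb=$(df -P "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_mb * 1024)) ]
}

if require_free_mb /var 100; then
    echo "/var has room - safe to run rpm --rebuilddb"
else
    echo "/var is too full - grow it first"
fi
```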

Posted in Linux.

SystemMirrors, building clusters and fighting with the CAA services

I am expecting something good to come my way soon. Why? I have suffered for a few days trying to get a very simple cluster to work, one that consistently refused all of my efforts, insisting on not letting the first node start the HA services and join the cluster. This is not an extraordinary cluster; there is nothing special about it – just two nodes and one resource group, and that is all.

But first, this is how the cluster was created.

# clmgr add cluster lawmsmpa1 \
            nodes=lawmsmpa1c1,lawmsmpa1c2 \
            type=NSC \
            heartbeat_type=unicast \

Note that hdisk2 has an identical PVID on both nodes and that its reserve_policy is set to no_reserve on both nodes.

# lsattr -El hdisk2 | grep reserve_policy
reserve_policy no_reserve 

Next, I defined the application controller, a.k.a. the “entity” identifying the highly available application's start and stop scripts.

# clmgr add application_controller lawmsmp \
            startscript=/usr/es/sbin/cluster/scripts/start_cluster.ksh \

The service address was defined next.

# clmgr add service_ip \
            netmask= \

Finally, the cluster resource group is defined as follows.

# clmgr add resource_group lawmsmpRG \
            nodes=lawmsmpa1c1,lawmsmpa1c2 \
            startup=OHN fallover=FNPN fallback=NFB \
            service_label=lawmsmpa1 \
            applications=lawmsmp \
            volume_group=lawson_vg \

At this time, we should have the caavg_private volume group present on hdisk2 on both nodes… Well, this was not the case. It was present on the second node (lawmsmpa1c2) but not on the first one, which, by the way, is the primary node of this cluster (it was declared first while defining the cluster).
Rebooting both nodes definitely did not help; the situation did not change. Every attempt to start cluster services failed on the primary node. The error message was always the same –

lawmsmpa1c1: rc.cluster: Error: CAA cluster services are not active on this node.


RSCT cluster services (cthags) are not active on this node

And the clconfd service was not running.

# lssrc -g caa
Subsystem         Group            PID          Status
  clcomd           caa              13041908     active
  clconfd          caa              failed

Indeed, it does not work. The cluster services and its resource group could be started and brought online on the second node but not on the first one… Looking at /etc/services on lawmsmpa1c1, I noticed that one HA entry was missing. The missing entry was

caa_cfg         6181/tcp
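A quick way to check for this entry on each node is a simple grep. The sketch below runs against a sample copy so the logic is visible; on the cluster nodes you would point it at the real /etc/services:

```shell
# Verify the CAA entry exists in a services file (sample copy used here).
f=$(mktemp)
printf 'ftp     21/tcp\ncaa_cfg 6181/tcp\n' > "$f"
if grep -q '^caa_cfg[[:space:]]' "$f"; then
    echo "caa_cfg present"
else
    echo "caa_cfg MISSING - add: caa_cfg 6181/tcp"
fi
rm -f "$f"
```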

Well, this could explain at least some of the reasons behind this disaster, but not all. The cluster sync following the update to the services file did not help. The repository was still mangled and the caavg_private volume group was still present only on the second node. It is time for some scrubbing!

Executed on both cluster nodes:

# rmcluster -f -r hdisk2
# lsattr -El cluster0
# rmdev -dl cluster0

On lawmsmpa1c1

# mkvg -f -y scrubvg hdisk2
# varyoffvg  scrubvg

On lawmsmpa1c2

# importvg -f -y scrubvg hdisk2
# varyoffvg scrubvg
# exportvg  scrubvg

On lawmsmpa1c1

# exportvg scrubvg

Yes, we validated it – the disk for the repository volume group is accessible from both nodes.
On both nodes

# shutdown -Fr

After the reboot, the next command, executed on both nodes, showed that the cluster definition was still present (do not be surprised, we only cleaned the repository disk!)

# odmget -q name=cluster0 CuAt
        name = "cluster0"
        attribute = "node_uuid"
        value = "749fb3f2-05f9-11e5-95f1-56bdbf443b02"
        type = "R"
        generic = "DU"
        rep = "s"
        nls_index = 3

I crossed my fingers and asked to synchronize the cluster, which, if everything works as intended, will create the caavg_private volume group on both nodes – the way it was designed to be!

# clmgr sync cluster

It worked like a charm and the cluster started like nothing bad ever happened…… Thank God it is Friday! :-)

By the way, PowerHA 7.1.3 has the following requirement:

CAAnodename  =   hostname   =  COMMUNICATION_PATH

Because hostname and COMMUNICATION_PATH are both in the short-name format, the contents of /etc/hosts also followed this formula:

lawmsmpa1c1
lawmsmpa1c2
lawmsmpa1
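A trivial sanity check of that requirement can be scripted. This is only a sketch: in real use the three values would come from `hostname`, the CAA node name, and the topology's COMMUNICATION_PATH, while the literal values below are just this cluster's node name repeated:

```shell
# All three names must be identical, in short-name form.
names_consistent() {
    [ -n "$1" ] && [ "$1" = "$2" ] && [ "$2" = "$3" ]
}

if names_consistent lawmsmpa1c1 lawmsmpa1c1 lawmsmpa1c1; then
    echo "naming requirement satisfied"
else
    echo "fix node naming before building the cluster"
fi
```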

Posted in HACMP, Real life AIX.


migrating from nslcd to sssd with AD/KRB

Recently, I have found some articles stating that Red Hat is deprecating nslcd and choosing sssd as its successor. Looking for more info about it, I learned that sssd offers caching of credentials, which means that even in the absence of the authentication authority – be it LDAP, AD, KRB and so forth – users will still be able to log into a host (as long as they have done it before and their data still resides in the local cache). This alone makes migration to sssd worth the effort! Can you imagine storing all application administrative accounts – like oracle, lawson, epicadm and many more – in AD instead of locally? Isn't it great?

If you look around this blog, you will find a few posts about implementing nslcd-based LDAP/KRB5/AD authentication/authorization with Red Hat. In this environment Active Directory also provides the LDAP and Kerberos services, and there is a PAM component in it as well. What follows is an illustration of the steps taken to migrate such an environment to sssd. There are many other ways sssd could be implemented!

First, I will back up the existing authentication/authorization configuration, just in case I run out of time switching to sssd and have to restore the system to its original shape.

# authconfig --savebackup=wmdbackup

Second, we will need to make sure the proper packages are installed:

# yum -y install sssd-client sssd-ldap sssd-ad sssd-krb5

Next, you will need to stop nslcd and nscd. We are replacing nslcd with sssd for authentication, and since sssd also does caching, we disable nscd to keep them from having conflicts:

# service nslcd stop
# service nscd stop
# chkconfig --level 2345 nslcd off
# chkconfig --level 2345 nscd off

Now, execute this little command to set sssd going.

# authconfig --enablesssd --enablesssdauth \
             --enablelocauthorize --enablemkhomedir \

Be aware that the last command changes nsswitch.conf and a few other files in the /etc/pam.d directory. Look for sss.

It is time to edit the /etc/sssd/sssd.conf starting with setting the appropriate permissions.

# chmod -R 600 /etc/sssd

If these permissions are not exactly as shown, the daemon will refuse to start.

In my case, I asked RedHat to provide me this file for my current version of RH, which is 6.6. I copied it to the host and, after starting sssd, I was able to log in remotely with no further issues. But there was one caveat with this file as delivered – it guarantees that any user log-in will result in an AD-wide search. The file sets ldap_search_base to the very top of my domain, so all searches have to start from the top and follow every branch till the result is found… This is not what I want. I want to limit the search (time) to the very specific branches of my AD tree, the ones storing UNIX users' log-in and group membership info only. Initially, I included in it the following statements:

ldap_user_search_base=OU=Secured,OU=Corporate Users,DC=wmd,DC=edu
ldap_group_search_base=OU=Unix,OU=Security Groups,OU=Corporate Groups,dc=wmd,dc=edu

Normally, a named parameter is followed by an equals sign and then the value assigned to it, right? Not in this case. With such entries inside its configuration file, sssd wouldn't start… After a while of searching, I found one site that has it right! Instead of the equals sign you have to use the comma character. The whole sssd.conf is shown below.

config_file_version = 2
services = nss, pam
domains = WMD.EDU
debug_level = 4

ldap_id_use_start_tls = False
cache_credentials = True
id_provider = ldap
auth_provider = krb5
chpass_provider = krb5
ldap_schema = rfc2307bis
ldap_force_upper_case_realm = True
ldap_user_object_class = user
ldap_group_object_class = group
ldap_user_gecos = displayName
ldap_user_home_directory = unixHomeDirectory
ldap_uri = ldap://
ldap_search_base = dc=wmd,dc=edu
ldap_user_search_base,OU=Managed By Others,DC=wmd,DC=edu
ldap_user_search_base,OU=Shared,OU=Corporate Users,DC=wmd,DC=edu
ldap_user_search_base,OU=Secured,OU=Corporate Users,DC=wmd,DC=edu
ldap_user_search_base,OU=ServiceAccounts,OU=Corporate Servers,DC=wmd,DC=edu
ldap_group_search_base,ou=Unix,ou=Security Groups,ou=Corporate Groups,dc=wmd,dc=edu
ldap_default_bind_dn = CN=aixldapquery,OU=ServiceAccounts,OU=Corporate Servers,DC=wmd,DC=edu
ldap_default_authtok_type = password
ldap_default_authtok = bind_account_password_goes_here
ldap_tls_cacertdir = /etc/openldap/cacerts
ldap_referrals = false

krb5_realm = WMD.EDU
krb5_kpasswd =
krb5_server =
krb5_canonicalize = False

To make sure that sssd starts at reboot, we will do this:

# chkconfig --level 2345 sssd on

Let’s start it and test it.

# service sssd start

# id duszyk
 uid=923810(duszyk) gid=216(operator) groups=216(operator)

# getent passwd duszyk
duszyk:*:923810:216:Duszyk, Waldemar M:/home/duszyk:/bin/bash
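Once lookups work for one account, it is worth looping over the accounts you actually care about. A sketch (the user names are just examples from this post):

```shell
# Confirm that a list of accounts resolves through the configured name services.
for u in duszyk oracle lawson; do
    if getent passwd "$u" >/dev/null 2>&1; then
        echo "$u: resolved"
    else
        echo "$u: NOT resolved"
    fi
done
```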

What follows are a few general observations that I have developed/learned in my so far very short interaction with sssd.
If a user for some reason has a changed UID/GID number, then the SSSD cache must be cleared for that user before that user can log in again. The sssd-tools package contains the necessary utilities.

# yum install sssd-tools
# sss_cache -u markd

There can be times when it is useful to seed users into the SSSD database rather than waiting for them to log in and be added (kickstart comes to mind). New users can be added using the sss_useradd command.

# sss_useradd --UID 501 --home /home/jsmith --groups admin,dev-group jsmith

The cache purge utility, sss_cache, invalidates records in the SSSD cache for a user, a domain, or a group. Invalidating the current records forces the cache to retrieve the updated records from the identity provider, so changes can be realized quickly. Most commonly, this is used to clear the cache and update all records:

# sss_cache -E

The sss_cache command can also clear all cached entries for a particular domain:

# sss_cache -Ed WMD.EDU

If the administrator knows that a specific record (user, group, or netgroup) has been updated, then sss_cache can purge the records for that specific account and leave the rest of the cache intact:

# sss_cache -u markd

Be careful when you delete a cache file. This operation has significant effects: Deleting the cache file deletes all user data, both identification and cached credentials. Consequently, do not delete a cache file unless the system is online and can authenticate with a user name against the domain’s servers. Without a credentials cache, offline authentication will fail. If the configuration is changed to reference a different identity provider, SSSD will recognize users from both providers until the cached entries from the original provider time out.

While troubleshooting a failed sssd, remove its cache before restarting the sssd service.

# rm -f /var/lib/sss/db/*
# service sssd start

Posted in Real life AIX.


re-sizing disks in VMware

It is not a good idea to partition disks with fdisk. It is better to let LVM work with whole physical disks – let LVM manage the disks. If the disks were not wholly owned by LVM, the following procedure would fail and a disk would not be able to “grow”.

# pvs
  PV         VG      Fmt  Attr PSize  PFree
  /dev/sda2  vg_sys  lvm2 a--  69.51g    6.91g
  /dev/sdb   vg_u01  lvm2 a--  30.00g   10.00g
  /dev/sdc   vg_sshi lvm2 a--  30.00g   10.00g
  /dev/sdd   soa_vg  lvm2 a--  16.00g 1020.00m

There is about 1GB left in soa_vg, but the application owner demands 15GB more. It is easy to address this request as long as the disks are not partitioned with fdisk, which is the case here. Using the vmware console, we “grow” the appropriate disk by an additional 15GB. Next, the operating system must be instructed to re-scan its disks, which is achieved with these steps.

# cd /sys/class/scsi_disk
# for i in `ls`; do echo "1" > $i/device/rescan; done 

Now, let the kernel know to “re-size” the disk.

# pvresize -v /dev/sdd
   DEGRADED MODE. Incomplete RAID LVs will be processed.
    Using physical volume(s) on command line
    Archiving volume group "soa_vg" metadata (seqno 4).
    Resizing volume "/dev/sdd" to 62912512 sectors.
    No change to size of physical volume /dev/sdd.
    Updating physical volume "/dev/sdd"
    Creating volume group backup "/etc/lvm/backup/soa_vg" (seqno 5).
  Physical volume "/dev/sdd" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized

Were we successful?

# pvs
  PV         VG      Fmt  Attr PSize  PFree
  /dev/sda2  vg_sys  lvm2 a--  69.51g  6.91g
  /dev/sdb   vg_u01  lvm2 a--  30.00g 10.00g
  /dev/sdc   vg_sshi lvm2 a--  30.00g 10.00g
  /dev/sdd   soa_vg  lvm2 a--  30.00g 15.00g

Now, we can increase the size of the file systems in soa_vg.
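That last step is the usual LVM/filesystem grow. A guarded sketch (the logical volume name soa_lv is hypothetical, and an ext3/ext4 filesystem is assumed; xfs would use xfs_growfs instead):

```shell
# Grow the LV into the freed extents, then the filesystem on top of it.
# Guarded so it is a no-op on machines without this volume group.
if vgs soa_vg >/dev/null 2>&1; then
    lvextend -L +15G /dev/soa_vg/soa_lv   # hypothetical LV name
    resize2fs /dev/soa_vg/soa_lv          # grows ext3/ext4 online
else
    echo "soa_vg not present on this host - nothing to do"
fi
```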

Posted in LINUX, Real life AIX.

re-synchronizing Satellite server

To re-synchronize (re-build) the metadata of a channel or channels, you can execute one of the following commands.

To rebuild an individual channel metadata:

# satellite-sync -c rhel-x86_64-server-6 --force-all-packages

To rebuild all channels:

# satellite-sync --force-all-packages

These operations are time consuming! The more channels you sync, the more time it will take.

Posted in Linux.

upgrading RH Satellite server schema …..

After upgrading the operating system (RedHat 6.6) on my Satellite server (version 5.7), the WEB GUI stopped functioning, but luckily for me the same GUI “told” me that in order to get it to work the Satellite server schema needs to be updated. The rest of this post documents this process. And no, I am not using Oracle as the database.

First, back up the Satellite database.

[root@rh_satellite1]# /usr/sbin/rhn-satellite stop
[root@rh_satellite1]# db-control backup /ColdBackup
[root@rh_satellite1]# /usr/sbin/rhn-satellite start

Stop the Satellite services except the database.

[root@rh_satellite1]# rhn-satellite stop --exclude postgresql
Shutting down spacewalk services...
Stopping RHN Taskomatic...
Stopped RHN Taskomatic.
Stopping cobbler daemon:                                   [  OK  ]
Stopping rhn-search...
Stopped rhn-search.
Stopping MonitoringScout ...
[ OK ]
Stopping Monitoring ...
[ OK ]
Shutting down osa-dispatcher:                              [  OK  ]
Stopping httpd:                                            [  OK  ]
Stopping tomcat6:                                          [  OK  ]
Terminating jabberd processes ...
Stopping s2s:                                              [  OK  ]
Stopping c2s:                                              [  OK  ]
Stopping sm:                                               [  OK  ]
Stopping router:                                           [  OK  ]

Find out the version of the current schema.

[root@rh_satellite1]# rhn-schema-version

Find out what is the “new” schema version waiting to be installed.

[root@rh_satellite1]# rpm -q satellite-schema

Upgrade the schema.

[root@rh_satellite1]# spacewalk-schema-upgrade
Schema upgrade: [satellite-schema-] -> [satellite-schema-]
Searching for upgrade path: [satellite-schema-] -> [satellite-schema-]
Searching for upgrade path: [satellite-schema-] -> [satellite-schema-]
The path: [satellite-schema-] -> [satellite-schema-]
Planning to run spacewalk-sql with [/var/log/spacewalk/schema-upgrade/20150505-130353-script.sql]

Please make sure you have a valid backup of your database before continuing.

Hit Enter to continue or Ctrl+C to interrupt:
Executing spacewalk-sql, the log is in [/var/log/spacewalk/schema-upgrade/20150505-130353-to-satellite-schema-].
The database schema was upgraded to version [satellite-schema-].

Verify the version.

[root@rh_satellite1]# rpm -q satellite-schema
[root@rh_satellite1]# rhn-schema-version

Start the Satellite.

[root@rh_satellite1]# rhn-satellite start --exclude postgresql
Starting spacewalk services...
Initializing jabberd processes ...
Starting router:                                           [  OK  ]
Starting sm:                                               [  OK  ]
Starting c2s:                                              [  OK  ]
Starting s2s:                                              [  OK  ]
Starting tomcat6:                                          [  OK  ]
Waiting for tomcat to be ready ...
Starting httpd:                                            [  OK  ]
Starting osa-dispatcher:                                   [  OK  ]
Starting Monitoring ...
[ OK ]
Starting MonitoringScout ...
[ OK ]
Starting rhn-search...
Starting cobbler daemon:                                   [  OK  ]
Starting RHN Taskomatic...

For good measure.

[root@rh_satellite1]# service rhn-search cleanindex
Stopping rhn-search...
Stopped rhn-search.
Starting rhn-search...

Posted in Linux.


multiple search bases in ldap and AD

The time required to search for data is a function of the repository's size, right? The same applies to LDAP and AD – they are both data repositories. In their case, one may speed up searches using multiple search bases. Case in point: in my Active Directory, users are defined in one of the following places:

OU=Secured,OU=Corporate Users,DC=wmd,DC=edu
OU=Users,OU=Ping,ou=Managed By Others,DC=wmd,DC=edu
OU=Users,OU=Research,ou=Managed By Others,DC=wmd,DC=edu
OU=ServiceAccounts,OU=Corporate Servers,DC=wmd,DC=edu

Their UNIX group definitions are stored here:

OU=Unix,OU=Security Groups,OU=Corporate Groups,DC=wmd,DC=edu

In the case of a Redhat, Centos or Scientific host using nslcd, one limits the scope of LDAP searches by adjusting the contents of /etc/nslcd.conf:

# Customize certain database lookups.
base passwd OU=Secured,OU=Corporate Users,DC=wmd,DC=edu
base passwd OU=Users,OU=Ping,ou=Managed By Others,DC=wmd,DC=edu
base passwd OU=Users,OU=Research,ou=Managed By Others,DC=wmd,DC=edu
base passwd OU=ServiceAccounts,OU=Corporate Servers,DC=wmd,DC=edu
base group OU=Unix,OU=Security Groups,OU=Corporate Groups,DC=wmd,DC=edu 

Any change in this file must be followed by a mandatory nslcd daemon restart.

# service nslcd restart

In the case of AIX, one goes straight to /etc/security/ldap/ldap.cfg.

userbasedn OU=Secured,OU=Corporate Users,DC=wmd,DC=edu
userbasedn OU=Users,OU=Ping,ou=Managed By Others,DC=wmd,DC=edu
userbasedn OU=Users,OU=Research,ou=Managed By Others,DC=wmd,DC=edu
userbasedn OU=ServiceAccounts,OU=Corporate Servers,DC=wmd,DC=edu
groupbasedn OU=Unix,OU=Security Groups,OU=Corporate Groups,DC=wmd,DC=edu 

A change in this file must be followed by a refresh of the LDAP client daemon. One way of doing it is shown below.

# restart-secldapclntd

Posted in Real life AIX.

© 2008-2015 - best viewed with your eyes.