yinxiuping
A problem with HACMP
Output of errpt:
1BA7DF4E 0618211806 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 0618211806 P S SRC SOFTWARE PROGRAM ERROR
BA431EB7 0618211806 P S SRC SOFTWARE PROGRAM ERROR
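For reference, the second column of the `errpt` summary is a timestamp in MMDDhhmmyy form, so 0618211806 reads as June 18, 21:18, year 06, which matches the Date/Time lines in the detailed reports. A minimal sketch of decoding it in shell follows; the function name `decode_errpt_ts` is made up for illustration and is not part of AIX.

```sh
# Decode an errpt summary timestamp (MMDDhhmmyy) into MM/DD/YY hh:mm.
# decode_errpt_ts is a hypothetical helper name, not an AIX command.
decode_errpt_ts() {
    ts=$1
    # Accept exactly ten digits, reject anything else.
    case $ts in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "not a 10-digit errpt timestamp: $ts" >&2; return 1 ;;
    esac
    mm=$(echo "$ts" | cut -c1-2)    # month
    dd=$(echo "$ts" | cut -c3-4)    # day
    hh=$(echo "$ts" | cut -c5-6)    # hour
    mi=$(echo "$ts" | cut -c7-8)    # minute
    yy=$(echo "$ts" | cut -c9-10)   # two-digit year
    echo "$mm/$dd/$yy $hh:$mi"
}

decode_errpt_ts 0618211806    # prints 06/18/06 21:18
```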
[H50-2][root][/usr/sbin/cluster]>errpt -aj 1BA7DF4E
---------------------------------------------------------------------------
LABEL: SRC_TRYX
IDENTIFIER: 1BA7DF4E
Date/Time: Sun Jun 18 21:18:44 BEIS
Sequence Number: 4933
Machine Id: 000055034C00
Node Id: H50-2
Class: S
Type: PERM
Resource Name: SRC
Description
SOFTWARE PROGRAM ERROR
Probable Causes
APPLICATION PROGRAM
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
DETERMINE WHY SUBSYSTEM CANNOT RESTART
Detail Data
SYMPTOM CODE
256
SOFTWARE ERROR CODE
-9020
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'343'
FAILING MODULE
clsmuxpdES
[H50-2][root][/usr/sbin/cluster]>errpt -aj BA431EB7
---------------------------------------------------------------------------
LABEL: SRC_RSTRT
IDENTIFIER: BA431EB7
Date/Time: Sun Jun 18 21:18:43 BEIS
Sequence Number: 4932
Machine Id: 000055034C00
Node Id: H50-2
Class: S
Type: PERM
Resource Name: SRC
Description
SOFTWARE PROGRAM ERROR
Probable Causes
APPLICATION PROGRAM
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
VERIFY SUBSYSTEM RESTARTED AUTOMATICALLY
Detail Data
SYMPTOM CODE
256
SOFTWARE ERROR CODE
-9035
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'217'
FAILING MODULE
clsmuxpdES
---------------------------------------------------------------------------
LABEL: SRC_RSTRT
IDENTIFIER: BA431EB7
Date/Time: Sun Jun 18 21:18:43 BEIS
Sequence Number: 4931
Machine Id: 000055034C00
Node Id: H50-2
Class: S
Type: PERM
Resource Name: SRC
Description
SOFTWARE PROGRAM ERROR
Probable Causes
APPLICATION PROGRAM
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
VERIFY SUBSYSTEM RESTARTED AUTOMATICALLY
Detail Data
SYMPTOM CODE
256
SOFTWARE ERROR CODE
-9035
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'217'
FAILING MODULE
clsmuxpdES
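The three detailed reports above differ mainly in their label and sequence number, and all name the same failing module. A rough sketch of condensing `errpt -a` output into one line per entry is shown below. It assumes the field layout seen in this post (a `LABEL:` line, a `Sequence Number:` line, and the module name on the line after `FAILING MODULE`); `summarize_errpt` is a hypothetical helper name.

```sh
# Condense errpt -a style text (read from stdin) into lines of
# "sequence label failing_module". Assumes the layout shown in this thread.
summarize_errpt() {
    awk '
        /^[ \t]*LABEL:/           { label = $2 }
        /^[ \t]*Sequence Number:/ { seq = $3 }
        # The line following "FAILING MODULE" carries the module name.
        want_module               { print seq, label, $1; want_module = 0; next }
        /^[ \t]*FAILING MODULE/   { want_module = 1 }
    '
}

# Possible use on the affected node:
#   errpt -a -j 1BA7DF4E,BA431EB7 | summarize_errpt
```

For the reports in this post that would give SRC_TRYX once (sequence 4933) and SRC_RSTRT twice (4932, 4931), all against clsmuxpdES, which reads as two automatic restart attempts followed by SRC giving up.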
[H50-2][root][/usr/sbin/cluster]>lssrc -g cluster
Subsystem Group PID Status
clstrmgrES cluster 22250 active
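Only clstrmgrES shows up as active in this listing. A small sketch for spotting which expected subsystems are absent from `lssrc -g` output is given below. The helper name `missing_subsystems` is hypothetical, and the list of names to expect is supplied by the caller; including clinfoES in the example is an assumption, since it is optional in many setups.

```sh
# Read lssrc -g style output on stdin and print every subsystem named in
# the arguments that is not listed with status "active".
missing_subsystems() {
    listing=$(cat)
    for name in "$@"; do
        echo "$listing" |
            awk -v n="$name" '$1 == n && $NF == "active" { found = 1 }
                              END { exit !found }' ||
            echo "$name"
    done
}

# Possible use on the affected node:
#   lssrc -g cluster | missing_subsystems clstrmgrES clsmuxpdES clinfoES
```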
[H50-2][root][/usr/sbin/cluster]>more /usr/es/adm/cluster.log
Jun 18 06:19:03 H50-2 RMCdaemon[8940]: (Recorded using libct_ffdc.a cv 2):::Error ID: 6eKora0Lz5Z2/JlI/422e.1...................:::Reference ID: :::Template ID: a6df45aa:::Details File: :::Location: RSCT,rmcd.c,1.37,202 :::RMCD_INFO_0_ST The daemon is started.
Jun 18 21:01:50 H50-2 syslog: 0821-285 ioctl returns 70
Jun 18 21:02:05 H50-2 syslog: 0821-285 ioctl returns 70
Jun 18 21:16:25 H50-2 topsvcs[22714]: (Recorded using libct_ffdc.a cv 2):::Error ID: 6UpNEL0d6JZ2/e3v0422e.1...................:::Reference ID: :::Template ID: 97419d60:::Details File: :::Location: rsct,bootstrp.C,1.176,4010 :::TS_START_ST Topology Services daemon started Topology Services daemon started by: SRC Topology Services daemon log file location /var/ha/log/topsvcs.18.211625.H50_cluster.en_/var/ha/run/topsvcs.H50_cluster/ Topology Services daemon run directory /var/ha/run/topsvcs.H50_cluster/
Jun 18 21:16:28 H50-2 grpsvcs[21492]: (Recorded using libct_ffdc.a cv 2):::Error ID: 63Y7ej0g6JZ2/je61422e.1...................:::Reference ID: :::Template ID: afa89905:::Details File: :::Location: RSCT,pgsd.C,1.51,541 :::GS_START_ST Group Services daemon started DIAGNOSTIC EXPLANATION HAGS daemon started by SRC. Log file is /var/ha/log/grpsvcs_2_12.H50_cluster.
Jun 18 21:16:40 H50-2 clstrmgrES[22250]: Sun Jun 18 21:16:40 HACMP/ES Cluster Manager Started
Jun 18 21:17:44 H50-2 HACMP for AIX: EVENT START: node_up H50_2
Jun 18 21:17:52 H50-2 HACMP for AIX: EVENT START: acquire_service_addr
Jun 18 21:17:57 H50-2 HACMP for AIX: EVENT START: acquire_aconn_service en0 net_ether_01
Jun 18 21:17:57 H50-2 HACMP for AIX: EVENT COMPLETED: acquire_aconn_service en0 net_ether_01
Jun 18 21:17:58 H50-2 HACMP for AIX: EVENT COMPLETED: acquire_service_addr
Jun 18 21:18:43 H50-2 clsmuxpdES[23012]: 3 clsmuxpd 23012 (root ) smuxp_doit: SMUX registration of 1.3.6.1.4.1.2.3.1.2.1.5 failed
Jun 18 21:18:43 H50-2 clsmuxpdES[23014]: 3 clsmuxpd 23014 (root ) smuxp_doit: SMUX registration of 1.3.6.1.4.1.2.3.1.2.1.5 failed
Jun 18 21:18:44 H50-2 clsmuxpdES[23016]: 4 clsmuxpd 23016 (root ) smuxp_doit: SMUX registration of 1.3.6.1.4.1.2.3.1.2.1.5 failed
Jun 18 21:18:44 H50-2 HACMP for AIX: clexit.rc : Unexpected termination of clsmuxpdES.
Jun 18 21:18:57 H50-2 HACMP for AIX: EVENT COMPLETED: node_up H50_2
Jun 18 21:18:59 H50-2 HACMP for AIX: EVENT START: node_up_complete H50_2
Jun 18 21:19:01 H50-2 HACMP for AIX: EVENT START: start_server H50_2_app
Jun 18 21:19:02 H50-2 syslog: entry not in table or multiple matches
Jun 18 21:19:03 H50-2 HACMP for AIX: EVENT COMPLETED: start_server H50_2_app
Jun 18 21:19:06 H50-2 HACMP for AIX: EVENT COMPLETED: node_up_complete H50_2
Jun 18 21:19:09 H50-2 HACMP for AIX: EVENT START: fail_interface H50_2 192.168.64.6
Jun 18 21:19:10 H50-2 HACMP for AIX: EVENT COMPLETED: fail_interface H50_2 192.168.64.6
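The log shows clsmuxpdES failing its SMUX registration three times under three different PIDs within about a second, right before clexit.rc reports its unexpected termination. That lines up with the two SRC_RSTRT entries and the one SRC_TRYX entry in errpt. A minimal sketch for pulling those failures out of cluster.log and counting them per OID follows; `smux_failures` is a hypothetical helper name, and the message wording it matches is simply the one seen in this post.

```sh
# Read cluster.log style text on stdin and print "OID count" for every
# OID whose SMUX registration was reported as failed.
smux_failures() {
    grep 'SMUX registration of' |
        sed 's/.*SMUX registration of \([0-9.]*\).*/\1/' |
        sort | uniq -c |
        awk '{ print $2, $1 }'
}

# Possible use on the affected node:
#   smux_failures < /usr/es/adm/cluster.log
```

clsmuxpd is the HACMP daemon that registers with the system SNMP agent as a SMUX peer to serve the cluster's status information, which is what clstat and clinfo read. If the registration keeps failing, a reasonable next step is to check that the SNMP agent is running and that its configuration file has a `smux` entry for the OID printed here. The exact file to look at depends on which agent version is in use, so treat that as something to verify locally.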
[H50-2][root][/usr/sbin/cluster]>./clstat
clstat - HACMP Cluster Status Monitor
-------------------------------------
THERE ARE NO CLUSTERS CURRENTLY ACTIVE
THE PROGRAM WILL CONTINUE SEARCHING FOR ONE
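That is all the information I have been able to gather. Could someone help me work out what the problem actually is? Also, what exactly does clsmuxpd do? Thanks :)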
yinxiuping
[H50-2][root][/usr/sbin/cluster]>netstat -i
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en0 1500 link#2 0.6.29.dc.95.e9 150401 0 127455 0 0
en0 1500 192.168.65 H50_2_boot 150401 0 127455 0 0
en0 1500 10.10.65 H50_2_svc 150401 0 127455 0 0
en1 1500 link#3 0.4.ac.49.7c.f8 0 0 38593 0 0
en1 1500 192.168.64 H50_2_stdby 0 0 38593 0 0
lo0 16896 link#1 100138 0 113191 0 0
lo0 16896 127 loopback 100138 0 113191 0 0
lo0 16896 ::1 100138 0 113191 0 0
This should mean the interfaces are not broken, right?
yixianq
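I have run into this problem too: the same errors, and ./clstat likewise cannot see the HA status. IBM's official documentation says that installing HACMP on AIX 5.2 hits an SNMP agent version mismatch, which is a bug (see below). With HA stopped on all nodes, I downgraded the operating system's SNMP agent from version 3 to version 1, but that did not solve the problem. When I start clsmuxpdES manually, it says it is already active, yet lssrc -g cluster shows that clsmuxpdES is not active, and the log reports the same errors as the original poster's. Does anyone have a reasonably complete solution? I would be very grateful!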
IY37779: DOC: AIX 5.2 SNMP CFG CHANGE NEEDED FOR CLSTAT, CLINFO AND CSPOC
APAR status
Closed as documentation error.
Error description
HACMP C-SPOC cluster start and stop, as well as the
CLINFO utility and CLSTAT require SNMP Version 1 agents.
These utilities will not work with the default AIX 5.2
configuration.
clstat fails with
"THERE ARE NO CLUSTERS CURRENTLY ACTIVE - THE PROGRAM WILL
CONTINUE SEARCHING FOR ONE"
Local fix
Problem summary
HACMP C-SPOC cluster start and stop, as well as the
CLINFO utility and CLSTAT require SNMP Version 1 agents.
These utilities will not work with the default AIX 5.2
configuration.
This APAR is being used to document this requirement.
Problem conclusion
The following information will be added to a future
version of the HACMP PTF README File.
=======================
SNMP Issue with AIX 5.2
=======================
AIX version 5.2 defaults to using SNMP version 3 agents,
where HACMP uses SNMP version 1 agents. Since HACMP uses
SNMP for C-SPOC cluster start and stop, as well as the
CLINFO utility, these features will not work under the
AIX 5.2 default configuration.
AIX 5.2 provides a utility to change which SNMP agent it
uses. By executing the following command, you can change the
SNMP agent used to version 1. This restores compatibility
with HACMP's use of SNMP.
/usr/sbin/snmpv3_ssw -1
Note that the command line parameter is a numeral one.
Temporary fix
Comments
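After running snmpv3_ssw it may be worth confirming which agent is actually in place before restarting cluster services. As far as I know, on AIX 5.2 /usr/sbin/snmpd is a symbolic link that snmpv3_ssw repoints to snmpdv1, snmpdv3ne or snmpdv3e; treat those target names as an assumption to verify on your own system. The sketch below only classifies a link target string, and `classify_snmpd_target` is a hypothetical helper name.

```sh
# Classify the target of the /usr/sbin/snmpd symbolic link.
# The target names are assumptions about AIX 5.2; verify them locally.
classify_snmpd_target() {
    case "${1##*/}" in                       # strip the directory part
        snmpdv1)            echo "v1" ;;
        snmpdv3ne|snmpdv3e) echo "v3" ;;
        *)                  echo "unknown" ;;
    esac
}

# Possible use on the affected node:
#   target=$(ls -l /usr/sbin/snmpd | awk '{ print $NF }')
#   classify_snmpd_target "$target"
```

If this reports v1 and clsmuxpdES still fails to register, as yixianq describes, the agent version alone is probably not the whole story, and the SMUX registration failures in cluster.log are the next thing to chase.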