wxqdelphi
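Two basic questions about MPICH
1. Does the executable need to be copied to "all" of the node machines?
2. Is it mandatory to create an /etc/hosts.equiv file so that the machines recognize each other?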
wxqdelphi
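Thanks for the suggestions, everyone!
I now have two workstations that can log in to each other over ssh without a password.
But why does running a very simple small program produce the following error: child process exited while making connection to remote process on?
The same program runs without problems on a single workstation.
I found the following two pieces of advice online: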
I think the message you got may have been caused by your rsh, rexec, or rlogin command not being set up correctly. You can run "rsh {node} {command}" to check whether the rsh command works.
I solved the same problem this way, so it is worth a try.
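The check described above can also be scripted. The following is only a minimal sketch, not part of MPICH: the function names, the 15-second timeout, and the rule "exit status 0 and no output at all" are assumptions made for illustration. It runs "true" on a node through ssh or rsh and reports whether the remote shell answered cleanly.

```python
import subprocess


def remote_check_command(host, command="true", shell="ssh"):
    """Build the argument list that runs `command` on `host` (hypothetical helper)."""
    if shell == "ssh":
        # BatchMode=yes makes ssh fail instead of prompting for a password;
        # -n stops ssh from reading standard input.
        return ["ssh", "-o", "BatchMode=yes", "-n", host, command]
    if shell == "rsh":
        return ["rsh", "-n", host, command]
    raise ValueError("unsupported remote shell: " + shell)


def remote_shell_works(host, shell="ssh", timeout=15):
    """Return (ok, output). ok is True only for exit status 0 and empty output."""
    try:
        result = subprocess.run(
            remote_check_command(host, "true", shell),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The remote shell program is missing, or the node did not answer in time.
        return False, str(exc)
    output = result.stdout.strip()
    return result.returncode == 0 and output == "", output
```

Calling remote_shell_works("client1.mydomain.com") from the master would then show whether the node is reachable without a password prompt and without stray warnings. Any extra output is treated as a failure here because the reply quoted further down shows tstmachines rejecting an unexpected response.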
-----------------------------------------
You reported:
util/tstmachines
Errors while trying to run ssh client1.mydomain.com -n true
Unexpected response from client1.mydomain.com:
--> Warning: No xauth data; using fake authentication data for X11 forwarding.
Check whether ssh is working, for example:
" % ssh slavehost ls "
Check that the passwordless login really works.
If you run into problems in the second step of tstmachines, check whether the folders are shared between the hosts.
I read online that compiling with mpiCC prevents the following errors that you reported:
p0_4293: p4_error: Child process exited while making connection to remote process on client1.mydomain.com: 0
p0_4293: (10.253768) net_send: could not write to fd=4, errno = 32
[root@master basic]#
It worked here.
Sorry, I do not have much experience with MPI yet.
I hope this helps.
-----------------------------------------
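The two checks suggested in the reply above (a clean ssh login and the same folders on every host) could be combined roughly as follows. This is a sketch under stated assumptions, not a tool shipped with MPICH: the function names are hypothetical, the machines file is assumed to hold one "host" or "host:ncpus" entry per line with "#" comments, and the path /home/user/cpi is made up. The -x option asks ssh to disable X11 forwarding, which may avoid the xauth warning shown in the tstmachines output.

```python
import shlex


def parse_machines(text):
    """Return the host names listed in a machines file (assumed format, see above)."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        entry = line.split()[0]               # first token on the line
        hosts.append(entry.split(":", 1)[0])  # strip an optional ":ncpus" suffix
    return hosts


def executable_check_commands(machines_text, exe_path):
    """Map each host to an ssh command that tests whether exe_path is executable there."""
    commands = {}
    for host in parse_machines(machines_text):
        commands[host] = [
            "ssh", "-x",              # -x disables X11 forwarding
            "-o", "BatchMode=yes",    # fail instead of asking for a password
            "-n", host,
            "test -x " + shlex.quote(exe_path),
        ]
    return commands
```

Each command can be passed to subprocess.run on the master. A non-zero exit status for a host suggests that the program is not present at that path on that machine, or that the login itself failed.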
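This looks complicated. Could it be a problem with the machines themselves?
I am using Dell 690 workstations, each with 4 CPU cores (two dual-core CPUs).
Could running in parallel on two dual-core CPUs be what causes the problem?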