AMD Processors
Topic Title: HP DL145 (2 AMD 250 CPUs) + SUSE 9.1 problem
Created On: 01/12/2005 09:46 PM
huangzx
Junior Member

Posts: 1
Joined: 01/12/2005

Hello all,


My 24-node cluster info:

HP DL145, 2P Opteron 2400 MHz + SUSE 9.1

kernel : 2.6.4-52-smp #1 SMP Wed Apr 7 01:58:54 UTC 2004 x86_64 x86_64 x86_64 GNU/Linux

After installing the OS and LSF, I ran a test job.

Test job: LSF submits 24 setiathome jobs to the compute nodes.

Compute nodes hang randomly (different nodes each time) and need a reset/reboot.

I submitted the jobs last night; this morning six nodes had hung, with no clue in the log files.

The hangs hit random nodes. I cannot get any temperature info from /proc/acpi, and when I checked the BIOS there was no temperature info there either. Our datacenter cooling system is OK and it is very cold inside; I checked the datacenter temperature records and they are all about 20 °C.
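
For anyone who wants to repeat the /proc/acpi check on every node, here is a minimal Python sketch. It assumes the 2.6-era layout where each ACPI thermal zone appears as /proc/acpi/thermal_zone/<zone>/temperature containing a line such as "temperature: 45 C"; that layout and the function name read_acpi_temperatures are assumptions, not something confirmed for the DL145. On a machine that exposes no thermal zones, as described above, it simply reports that nothing was found.

```python
import glob
import os
import re


def read_acpi_temperatures(root="/proc/acpi/thermal_zone"):
    """Return {zone_name: degrees_C} for each zone under root that reports a value.

    The directory layout is an assumption (2.6-era ACPI procfs interface).
    """
    readings = {}
    for path in sorted(glob.glob(os.path.join(root, "*", "temperature"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                text = f.read()
        except OSError:
            continue  # zone is listed but cannot be read; skip it
        # Expected content looks like "temperature:             45 C"
        match = re.search(r"(-?\d+)\s*C", text)
        if match:
            readings[zone] = int(match.group(1))
    return readings


if __name__ == "__main__":
    temps = read_acpi_temperatures()
    if not temps:
        print("no ACPI thermal zones report a temperature")
    for zone, deg in temps.items():
        print(f"{zone}: {deg} C")
```

Running this from cron on each node and appending the output to a file on shared storage would at least show the last reading before a hang, if the hardware reports one at all.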

If no jobs are running on the cluster nodes, all nodes are fine (so far, at least).

Has anyone seen this problem before? Any suggestions?

Thanks!!

--huangzx