[Wien] Wien2k with load-distributing systems

Valerio Bellini vbellini at unimo.it
Wed Sep 17 03:53:23 CEST 2003


Hi, 

> Although I can submit "run_lapw" via commandline to the batch system
> (circumventing w2web), only the main process is running as planned, the other
> forked processes (which use ssh to fork, normal k-parallel mode) run outside
> of the scheduler. Understandably the administrator is not too happy about
> this.
> 
> How can I modify WIEN to not fork processes by ssh but instead submit all
> processes to the scheduler?

If you are on a shared-memory system, you should not use the USEREMOTE variable
when you run the 'siteconfig' setup.
Instead, use a parallel environment command such as poe, mpiexec
or mpirun, depending on your computer.
See for instance the chapter 'Customizing Batch Jobs for LSF'
in your LSF User's Guide.

On an SP4 machine I do the following:
I changed the lapw1para script so that it creates a file (command_list)
listing the commands that lapw1para would otherwise launch one after
the other during a k-parallelized job, i.e.

time lapw1 uplapw1_1.def >>.time1_1 &
time lapw1 uplapw1_2.def >>.time1_2 &
time lapw1 uplapw1_3.def >>.time1_3 &
time lapw1 uplapw1_4.def >>.time1_4 &

and then I pass it to the poe command,
e.g. poe <command_list
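
A minimal sketch of how such a command_list could be generated, for
illustration only. It is not the modified lapw1para script itself. The
variable NPARTS (number of k-point parts) is an assumption, and the "up"
prefix of the def files is simply taken from the example above. poe is
called only if it is found on the machine.

```shell
#!/bin/sh
# Sketch: write one lapw1 line per k-point part into command_list.
# NPARTS is a hypothetical variable; it defaults to 4 as in the example above.
NPARTS=${NPARTS:-4}

: > command_list                      # start from an empty file
i=1
while [ "$i" -le "$NPARTS" ]; do
    # Same line format as in the list above
    echo "time lapw1 uplapw1_${i}.def >>.time1_${i} &" >> command_list
    i=$((i + 1))
done

# Hand the list to poe only where poe is actually installed
if command -v poe >/dev/null 2>&1; then
    poe < command_list
else
    echo "poe not found; command_list written but not executed" >&2
fi
```

Setting NPARTS in the environment before running the script changes the
number of lines written.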

But you actually need this only when you are on a shared-memory
architecture such as the IBM SP series.
If you have a cluster, I really don't see why one should not use ssh
to fork the processes.
As long as you fork them onto the CPUs reserved by the batch system,
it should not cause any problem.

PBS is just a different kind of batch queuing system.
With PBS you can do it on the fly with the command
  sed 's/^/1: /' $PBS_NODEFILE > .machines
which takes the names of the nodes PBS reserved for you from the file
that the environment variable $PBS_NODEFILE points to.
Something similar probably exists in LSF as well.
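
The same idea as a small job-script fragment, given as a sketch under
stated assumptions. write_machines is a hypothetical helper, not part of
WIEN2k. The LSF branch assumes that the scheduler exports the reserved
hosts as a space-separated list in LSB_HOSTS; please check this against
your LSF documentation before relying on it.

```shell
#!/bin/sh
# Sketch: build a .machines file from the scheduler's host list.
# write_machines is a hypothetical helper, not part of WIEN2k.
write_machines() {
    # $1: file with one host name per line (the layout of the PBS node file)
    # Same sed expression as in the command quoted above
    sed 's/^/1: /' "$1" > .machines
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    write_machines "$PBS_NODEFILE"
elif [ -n "${LSB_HOSTS:-}" ]; then
    # Assumption: LSB_HOSTS holds the reserved hosts separated by spaces.
    # The variable is left unquoted on purpose so that it splits into words.
    printf '%s\n' $LSB_HOSTS > .hosts.tmp
    write_machines .hosts.tmp
    rm -f .hosts.tmp
else
    echo "no PBS or LSF host list found; .machines not written" >&2
fi
```

Run inside the batch job, before run_lapw, this leaves a .machines file
in the working directory with one line per reserved slot.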

Regards,
Valerio
-- 

*******************************************************************************
  Valerio Bellini
  INFM-S3 National Research Center on nanoStructures and bioSystems at Surfaces 
  and Department of Physics, University of Modena and Reggio Emilia
  Via Campi 213/A, 41100 Modena, Italy.
  Phone:   ++39 059 2055301
  Fax:     ++39 059 374794
  E-mail:  vbellini at unimo.it
  WWW:     http://www.s3.infm.it
*******************************************************************************


