[Wien] time difference among nodes
Luis Ogando
lcodacal at gmail.com
Tue Sep 29 14:07:38 CEST 2015
Dear Prof. Marks,
Thanks!
I will send your message to the administrators!
All the best,
Luis
2015-09-29 8:57 GMT-03:00 Laurence Marks <laurence.marks at gmail.com>:
> If it happens again, one thing to ask them to check is swap usage and how
> much memory is cached. On some of my nodes I have noticed that they do not
> always release cached memory, and can start swapping. If this happens the
> job will get very slow. The commands to use to clear the cache can be found
> at
>
> http://www.tecmint.com/clear-ram-memory-cache-buffer-and-swap-space-on-linux/
> or similar (root access is needed). The top command can also show memory use.
>
> While there should be no need to do this, I have noticed that I need to do
> it every 3 hours on 4 nodes; the other 20 don't need it. It is mainly an
> issue for big calculations.
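
For the administrators, the check described above (swap in use and how much memory is cached) could be scripted roughly as follows. This is only a sketch: it assumes a Linux node where /proc/meminfo is readable, and the function names and the SAMPLE text are made up for illustration. It only reports numbers; it does not clear any cache, which still needs root access.

```python
"""Report swap usage and cached memory from /proc/meminfo (Linux only)."""
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only "Name: <number> [kB]" lines; skip anything malformed.
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_report(info):
    """Return swap in use, cached memory and free memory, all in kB."""
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    cached = info.get("Cached", 0) + info.get("Buffers", 0)
    return {
        "swap_used_kB": swap_used,
        "cached_kB": cached,
        "mem_free_kB": info.get("MemFree", 0),
    }


# Hypothetical values, used only when /proc/meminfo is not available.
SAMPLE = """\
MemTotal:       65841234 kB
MemFree:          512000 kB
Buffers:          204800 kB
Cached:         48000000 kB
SwapTotal:       8388604 kB
SwapFree:        6291452 kB
"""

if __name__ == "__main__":
    path = "/proc/meminfo"
    if os.path.exists(path):
        with open(path) as fh:
            text = fh.read()
    else:
        text = SAMPLE
    print(memory_report(parse_meminfo(text)))
```

A node with a large cached_kB, a small mem_free_kB and a growing swap_used_kB would be a candidate for the slowdown described above; running the script on each node at intervals would show which ones behave differently.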
>
> Alternatively, it may have been something else: a zombie process, big log
> files or other things. Rebooting gets rid of a lot of system caches and
> helps, even on my Android tablet every week or two. It's murky waters.
>
> ---
> Professor Laurence Marks
> Department of Materials Science and Engineering
> Northwestern University
> http://www.numis.northwestern.edu
> Corrosion in 4D http://MURI4D.numis.northwestern.edu
> Co-Editor, Acta Cryst A
> "Research is to see what everybody else has seen, and to think what nobody
> else has thought"
> Albert Szent-Gyorgyi
> Hi Elias,
>
> There were no other jobs in the specific queue I was using, and the
> nodes are dedicated to that queue, so it was an opportunity to reboot
> them without furious reactions from other users.
> After trying everything suggested by the Wien2k community, the
> administrators resignedly remembered the words of wisdom given by the
> cluster guru, Shakespeare, and followed the suggestion given by Lyudmila
> Dobysheva. In other words, they killed my job, restarted all the nodes, and
> I resubmitted the calculation.
> All the best,
> Luis
>
>
> 2015-09-29 3:50 GMT-03:00 Elias Assmann <elias.assmann at gmail.com>:
>
>>
>> On 09/28/2015 01:58 PM, Luis Ogando wrote:
>> > The problem is solved! The solution was the one suggested by Lyudmila
>> > Dobysheva: reboot the nodes. We will never know the origin of the
>> > problem but, honestly, I do not care!
>>
>> Good to hear that! So, how did you get the admins to reboot them?
>>
>> > "There are more things in heaven and earth, Horatio, Than are
>> > dreamt of in your philosophy."
>>
>> That is an apt quote for people working on clusters ;-).
>>
>>
>> Elias
>>
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>> SEARCH the MAILING-LIST at:
>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>>
>
>