[Wien] compile wien2k_11.1 in mkl8.1, fortcom error

Peter Blaha pblaha at theochem.tuwien.ac.at
Fri Apr 22 16:16:58 CEST 2011


I include a couple of routines, where I noticed problem with very old 
compilers. I concerns  the use of functions like dsqrt or acos in 
"parameter" statements as well as some subroutines which are not 
included in old mkl/vml versions.
You can replace the original ones by them.
PS: The list of subroutines may not be complete.

 > Dear all,
> Recently, I have compiled the latest version, wien2k_11.1(downloaded on April 21, 2011), and got the 'fortcom errors' below. 
> The errors mainly come from lapw2, so I also  present the content in SRC_lapw2/compile.msg file below.
> 
> Our system is a two nodes cluster, every machine has two EM64T Intel(R) Xeon(TM) Processor 3.6GHz/1M CPUs, and the system is the Redhat enterprise 9.0 linux distribution (somewhat too old, maybe). I used intel mkl 8.1  and intel ifort 9.1.032 compiler. All earlier versions of wien2k (from wien2k_05 to wien2k_10) have been successfully compiled and worked well in the last six years. 
> The compiler options used for the newest version are
> 
>  O   Compiler options:        -I/opt/intel/mkl/8.1/include -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback
>  L   Linker Flags:            $(FOPT) -L/opt/intel/mkl/8.1/lib/em64t -lpthread
>  P   Preprocessor flags       '-DParallel'
>  R   R_LIB (LAPACK+BLAS):     -lmkl_lapack64 -lmkl_em64t -lguide -lvml -pthread
> 
> By the way, no other errors occured except the list below. Are they bugs or my improper chosen compiler options?  Any comment or response is appreciated!
> 
> 
> 
> Compile time errors (if any) were:
> SRC_lapw1/compile.msg:make[1]: *** [lapw1c] Error 1
> SRC_lapw1/compile.msg:make: *** [complex] Error 2
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 680: This intrinsic function is invalid in constant expressions.   [DSQRT]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 779: This intrinsic function is invalid in constant expressions.   [COS]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 779: This intrinsic function is invalid in constant expressions.   [SIN]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 780: This intrinsic function is invalid in constant expressions.   [COS]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 780: This intrinsic function is invalid in constant expressions.   [SIN]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 999: This intrinsic function is invalid in constant expressions.   [DSQRT]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 1097: This intrinsic function is invalid in constant expressions.   [COS]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 1097: This intrinsic function is invalid in constant expressions.   [SIN]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 1098: This intrinsic function is invalid in constant expressions.   [COS]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 1098: This intrinsic function is invalid in constant expressions.   [SIN]
> SRC_lapw2/compile.msg:make[1]: *** [c3fft.o] Error 1
> SRC_lapw2/compile.msg:make: *** [real] Error 2
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 680: This intrinsic function is invalid in constant expressions.   [DSQRT]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 779: This intrinsic function is invalid in constant expressions.   [COS]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 779: This intrinsic function is invalid in constant expressions.   [SIN]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 780: This intrinsic function is invalid in constant expressions.   [COS]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 780: This intrinsic function is invalid in constant expressions.   [SIN]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 999: This intrinsic function is invalid in constant expressions.   [DSQRT]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 1097: This intrinsic function is invalid in constant expressions.   [COS]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 1097: This intrinsic function is invalid in constant expressions.   [SIN]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 1098: This intrinsic function is invalid in constant expressions.   [COS]
> SRC_lapw2/compile.msg:fortcom: Error: c3fft_tmp_.F, line 1098: This intrinsic function is invalid in constant expressions.   [SIN]
> SRC_lapw2/compile.msg:make[1]: *** [c3fft.o] Error 1
> SRC_lapw2/compile.msg:make: *** [complex] Error 2
> SRC_mixer/compile.msg:make: *** [mixer] Error 1
> 
> 
> The content in the SRC_lapw2/compile.msg :
> 
> fortcom: Error: c3fft_tmp_.F, line 680: This intrinsic function is invalid in constant expressions.   [DSQRT]
>       parameter (TAUR=-0.5D0, TAUI=dsqrt(0.75D0))
> -----------------------------------^
> fortcom: Error: c3fft_tmp_.F, line 779: This intrinsic function is invalid in constant expressions.   [COS]
>       parameter (TR11= cos(acos(-1.D0)*0.4D0), TI11=sin(acos(-1.D0)*0.4D0), &
> -----------------------^
> fortcom: Error: c3fft_tmp_.F, line 779: This intrinsic function is invalid in constant expressions.   [SIN]
>       parameter (TR11= cos(acos(-1.D0)*0.4D0), TI11=sin(acos(-1.D0)*0.4D0), &
> ----------------------------------------------------^
> fortcom: Error: c3fft_tmp_.F, line 780: This intrinsic function is invalid in constant expressions.   [COS]
>                  TR12=-cos(acos(-1.D0)*0.2D0), TI12=sin(acos(-1.D0)*0.2D0) )
> -----------------------^
> fortcom: Error: c3fft_tmp_.F, line 780: This intrinsic function is invalid in constant expressions.   [SIN]
>                  TR12=-cos(acos(-1.D0)*0.2D0), TI12=sin(acos(-1.D0)*0.2D0) )
> ----------------------------------------------------^
> fortcom: Error: c3fft_tmp_.F, line 999: This intrinsic function is invalid in constant expressions.   [DSQRT]
>       parameter (TAUR=-0.5D0, TAUI=-dsqrt(0.75D0))
> ------------------------------------^
> fortcom: Error: c3fft_tmp_.F, line 1097: This intrinsic function is invalid in constant expressions.   [COS]
>       parameter (TR11= cos(acos(-1.D0)*0.4D0), TI11=-sin(acos(-1.D0)*0.4D0), &
> -----------------------^
> fortcom: Error: c3fft_tmp_.F, line 1097: This intrinsic function is invalid in constant expressions.   [SIN]
>       parameter (TR11= cos(acos(-1.D0)*0.4D0), TI11=-sin(acos(-1.D0)*0.4D0), &
> -----------------------------------------------------^
> fortcom: Error: c3fft_tmp_.F, line 1098: This intrinsic function is invalid in constant expressions.   [COS]
>                  TR12=-cos(acos(-1.D0)*0.2D0), TI12=-sin(acos(-1.D0)*0.2D0) )
> -----------------------^
> fortcom: Error: c3fft_tmp_.F, line 1098: This intrinsic function is invalid in constant expressions.   [SIN]
>                  TR12=-cos(acos(-1.D0)*0.2D0), TI12=-sin(acos(-1.D0)*0.2D0) )
> -----------------------------------------------------^
> compilation aborted for c3fft_tmp_.F (code 1)
> make[1]: *** [c3fft.o] Error 1
> 
> 
>     PhD. candidate,  Zhen Chen
> ------------------------------
> Beijing Laboratory of Electron Microscopy 
> Institute of Physics 
> Chinese Academy of Sciences
> P. O. Box 603 
> Beijing 100190, China
> Tel: 86-10-82648001
>         zchen at blem.ac.cn
> 


                                      P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-15671             FAX: +43-1-58801-15698
Email: blaha at theochem.tuwien.ac.at    WWW: http://info.tuwien.ac.at/theochem/
--------------------------------------------------------------------------
-------------- next part --------------
      SUBROUTINE C3FFT(N1,N2,N3,C,LDC1,LDC2,ISIG,CWORK,DWORK,IERR)
!
!     ..................................................................
! 1.     PROGRAM UNIT 'C3FFT'
!           3-dimensional Fast Fourier Transform of a COMPLEX*16 array
!           FORTRAN 77 SUBROUTINE
!
! 2.     PURPOSE
!           This routine performs either a forward or a backward FFT of
!           three-dimensional double complex (COMPLEX*16) data.
!           The primefactors of the number of datapoints along each
!           dimension of the data array should be small to yield better
!           performance.
!
! 3.     USAGE
!        PARAMETER-DESCRIPTION
!           C      - COMPLEX*16 array                     (input/output)
!                    Input: The (three-dimensional) array containing the
!                           data to be transformed.
!                    Output: the transformed (three-dimensional) data
!           LDC1   - INTEGER variable                            (input)
!                    The exact dimension of the first index of C as
!                    defined in the calling routine.
!           LDC2   - INTEGER variable                            (input)
!                    The exact dimension of the second index of C as
!                    defined in the calling routine.
!           N1     - INTEGER variable                            (input)
!                    The number of datapoints in the first dimension of
!                    array C.
!                    Constraint: N1 .LE. LDC1 (not checked here)
!           N2     - INTEGER variable                            (input)
!                    The number of datapoints in the second dimension of
!                    array C.
!                    Constraint: N2 .LE. LDC2 (not checked here)
!           N3     - INTEGER variable                            (input)
!                    The number of datapoints in the third dimension of
!                    array C.
!           IERR   - INTEGER variable                           (output)
!                    Error indicator.
!                    IERR .EQ. 0 ... no error
!                    IERR .NE. 0 ... currently not supported
!           ISIG   - INTEGER variable                            (input)
!                    Specifies the direction of the transform:
!                       ISIG .LT. 0 ... forward FFT
!                       ISIG .GT. 0 ... backward FFT
!                       ISIG = 0 ...... nothing will be done
!           CWORK  - COMPLEX*16 array                     (input/output)
!                    A work array which must be dimensioned at least
!                    max(N1,N2,N3) in the calling routine.
!           DWORK  - DOUBLE PRECISION array               (input/output)
!                    A work array which must be dimensioned at least
!                    4*(max(N1,N2,N3))+15 in the calling routine.
!
!        USED SUBROUTINES (DIRECTLY CALLED)
!           CFFTB  - backward Fast Fourier Transform (from FFTPACK)
!           CFFTF  - forward Fast Fourier Transform (from FFTPACK)
!           CFFTI  - initialization for CFFTB and CFFTF (from FFTPACK)
!
!        INDIRECTLY CALLED SUBROUTINES
!           PASSB  - (from FFTPACK)
!           PASSB2 - (from FFTPACK)
!           PASSB3 - (from FFTPACK)
!           PASSB4 - (from FFTPACK)
!           PASSB5 - (from FFTPACK)
!           PASSF  - (from FFTPACK)
!           PASSF2 - (from FFTPACK)
!           PASSF3 - (from FFTPACK)
!           PASSF4 - (from FFTPACK)
!           PASSF5 - (from FFTPACK)
!           PIMACH - (from FFTPACK)
!
!        UTILITY-SUBROUTINES (USE BEFOREHAND OR AFTERWARDS)
!           none
!
!        INPUT/OUTPUT (READ/WRITE)
!           none
!
!        MACHINENDEPENDENT PROGRAMPARTS
!           The array containing the three-dimensional data to be
!           transformed is declared as COMPLEX*16.
!
! 4.     REMARKS
!           For repeated usage of this routine no separate set-up
!           routine is provided, because the time needed for this
!           set-up is neglectable compared to the actual transformation.
!
! 5.     METHOD
!           The three-dimensional FFT is done using one-dimensional
!           complex FFT routines (from FFTPACK). This is done by:
!              1. transforming data values along the 1. dimension of C
!              2. transforming data values along the 2. dimension of C
!              3. transforming data values along the 3. dimension of C
!           Because the data elements along the 2. and 3. dimension of
!           C are not stored in contiguous storage locations and
!           FFTPACK-routines don't provide means for specifying a
!           stride, the involved elements are copied into a temporary
!           one-dimensional field before transformation. Afterwards
!           the data values are stored back to their original
!           location in C.
!
! 6.     DATE
!           28. June 1993                                   Version 0.91
!
!        INSTITUT FUER TECHNISCHE ELEKTROCHEMIE            --  TU VIENNA
!     ..................................................................
!
      INTEGER            LDC1,  LDC2,  N1,  N2,  N3,  ISIG,  IERR
      DOUBLE PRECISION   DWORK(*)
      COMPLEX*16         CWORK(*)
      COMPLEX*16         C(LDC1,LDC2,N3)
!
      INTEGER            I,  J,  K
#ifdef FFTW2
        call exec_fftw(isig,n1,n2,n3,C,ierr)
        return
#else
!
      IERR = 0
      IF (ISIG .LT. 0) THEN
!
!        forward transform in 1. dimension
!
         CALL CFFTI(N1,DWORK)
         DO 20 K = 1, N3
            DO 10 J = 1, N2
               CALL CFFTF(N1,C(1,J,K),DWORK)
   10       CONTINUE
   20    CONTINUE
!
!        forward transform in 2. dimension
!
         CALL CFFTI(N2,DWORK)
         DO 60 K = 1, N3
            DO 50 I = 1, N1
!               DO 30 J = 1, N2
!                 CWORK(J) = C(I,J,K)
!  30          CONTINUE
               call zcopy(N2,C(I,1,K),LDC1,CWORK,1)
               CALL CFFTF(N2,CWORK,DWORK)
               call zcopy(N2,CWORK,1,C(I,1,K),LDC1)
!              DO 40 J = 1, N2
!                 C(I,J,K) = CWORK(J)
!  40          CONTINUE
   50       CONTINUE
   60    CONTINUE
!
!        forward transform in 3. dimension
!
         CALL CFFTI(N3,DWORK)
         DO 100 J = 1, N2
            DO 90 I = 1, N1
!               DO 70 K = 1, N3
!                  CWORK(K) = C(I,J,K)
!   70          CONTINUE
               call zcopy(N3,C(I,J,1),LDC1*LDC2,CWORK,1)
               CALL CFFTF(N3,CWORK,DWORK)
               call zcopy(N3,CWORK,1,C(I,J,1),LDC1*LDC2)
!              DO 80 K = 1, N3
!                 C(I,J,K) = CWORK(K)
!  80          CONTINUE
   90       CONTINUE
  100    CONTINUE
      ELSEIF (ISIG .GT. 0) THEN
!
!        backward transform in 1. dimension
!
         CALL CFFTI(N1,DWORK)
         DO 120 K = 1, N3
            DO 110 J = 1, N2
               CALL CFFTB(N1,C(1,J,K),DWORK)
  110       CONTINUE
  120    CONTINUE
!
!        backward transform in 2. dimension
!
         CALL CFFTI(N2,DWORK)
         DO 160 K = 1, N3
            DO 150 I = 1, N1
!              DO 130 J = 1, N2
!                 CWORK(J) = C(I,J,K)
! 130          CONTINUE
               call zcopy(N2,C(I,1,K),LDC1,CWORK,1)
               CALL CFFTB(N2,CWORK,DWORK)
               call zcopy(N2,CWORK,1,C(I,1,K),LDC1)
!              DO 140 J = 1, N2
!                 C(I,J,K) = CWORK(J)
! 140          CONTINUE
  150       CONTINUE
  160    CONTINUE
!
!        backward transform in 3. dimension
!
         CALL CFFTI(N3,DWORK)
         DO 200 J = 1, N2
            DO 190 I = 1, N1
!              DO 170 K = 1, N3
!                 CWORK(K) = C(I,J,K)
! 170          CONTINUE
               call zcopy(N3,C(I,J,1),LDC1*LDC2,CWORK,1)
               CALL CFFTB(N3,CWORK,DWORK)
               call zcopy(N3,CWORK,1,C(I,J,1),LDC1*LDC2)
!             DO 180 K = 1, N3
!                  C(I,J,K) = CWORK(K)
!  180          CONTINUE
  190       CONTINUE
  200    CONTINUE
      ENDIF
!
!        End of 'C3FFT'
!
#endif
      END
!     SUBROUTINE CFFTB(N,C,WSAVE)                                               
!                                                                               
!     SUBROUTINE CFFTB COMPUTES THE BACKWARD COMPLEX DISCRETE FOURIER           
!     TRANSFORM (THE FOURIER SYNTHESIS). EQUIVALENTLY , CFFTB COMPUTES          
!     A COMPLEX PERIODIC SEQUENCE FROM ITS FOURIER COEFFICIENTS.                
!     THE TRANSFORM IS DEFINED BELOW AT OUTPUT PARAMETER C.                     
!                                                                               
!     A CALL OF CFFTF FOLLOWED BY A CALL OF CFFTB WILL MULTIPLY THE             
!     SEQUENCE BY N.                                                            
!                                                                               
!     THE ARRAY WSAVE WHICH IS USED BY SUBROUTINE CFFTB MUST BE                 
!     INITIALIZED BY CALLING SUBROUTINE CFFTI(N,WSAVE).                         
!                                                                               
!     INPUT PARAMETERS                                                          
!                                                                               
!                                                                               
!     N      THE LENGTH OF THE COMPLEX SEQUENCE C. THE METHOD IS                
!            MORE EFFICIENT WHEN N IS THE PRODUCT OF SMALL PRIMES.              
!                                                                               
!     C      A COMPLEX ARRAY OF LENGTH N WHICH CONTAINS THE SEQUENCE            
!                                                                               
!     WSAVE   A REAL WORK ARRAY WHICH MUST BE DIMENSIONED AT LEAST 4N+15        
!             IN THE PROGRAM THAT CALLS CFFTB. THE WSAVE ARRAY MUST BE          
!             INITIALIZED BY CALLING SUBROUTINE CFFTI(N,WSAVE) AND A            
!             DIFFERENT WSAVE ARRAY MUST BE USED FOR EACH DIFFERENT             
!             VALUE OF N. THIS INITIALIZATION DOES NOT HAVE TO BE               
!             REPEATED SO LONG AS N REMAINS UNCHANGED THUS SUBSEQUENT           
!             TRANSFORMS CAN BE OBTAINED FASTER THAN THE FIRST.                 
!             THE SAME WSAVE ARRAY CAN BE USED BY CFFTF AND CFFTB.              
!                                                                               
!     OUTPUT PARAMETERS                                                         
!                                                                               
!     C      FOR J=1,...,N                                                      
!                                                                               
!                C(J)=THE SUM FROM K=1,...,N OF                                 
!                                                                               
!                      C(K)*EXP(I*(J-1)*(K-1)*2*PI/N)                           
!                                                                               
!                            WHERE I=SQRT(-1)                                   
!                                                                               
!     WSAVE   CONTAINS INITIALIZATION CALCULATIONS WHICH MUST NOT BE            
!             DESTROYED BETWEEN CALLS OF SUBROUTINE CFFTF OR CFFTB              
!
      SUBROUTINE CFFTB (N,C,WSAVE)                                              
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       C(*)       ,WSAVE(*)                                      
!                                                                               
      IF (N .EQ. 1) RETURN                                                      
      IW1 = N+N+1                                                               
      IW2 = IW1+N+N                                                             
      CALL CFFTB1 (N,C,WSAVE,WSAVE(IW1),WSAVE(IW2))                             
      RETURN                                                                    
      END                                                                       
      SUBROUTINE CFFTB1 (N,C,CH,WA,IFAC)                                        
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CH(*)      ,C(*)       ,WA(*)      ,IFAC(*)               
      NF = IFAC(2)                                                              
      NA = 0                                                                    
      L1 = 1                                                                    
      IW = 1                                                                    
      DO 116 K1=1,NF                                                            
         IP = IFAC(K1+2)                                                        
         L2 = IP*L1                                                             
         IDO = N/L2                                                             
         IDOT = IDO+IDO                                                         
         IDL1 = IDOT*L1                                                         
         IF (IP .NE. 4) GO TO 103                                               
         IX2 = IW+IDOT                                                          
         IX3 = IX2+IDOT                                                         
         IF (NA .NE. 0) GO TO 101                                               
         CALL PASSB4 (IDOT,L1,C,CH,WA(IW),WA(IX2),WA(IX3))                      
         GO TO 102                                                              
  101    CALL PASSB4 (IDOT,L1,CH,C,WA(IW),WA(IX2),WA(IX3))                      
  102    NA = 1-NA                                                              
         GO TO 115                                                              
  103    IF (IP .NE. 2) GO TO 106                                               
         IF (NA .NE. 0) GO TO 104                                               
         CALL PASSB2 (IDOT,L1,C,CH,WA(IW))                                      
         GO TO 105                                                              
  104    CALL PASSB2 (IDOT,L1,CH,C,WA(IW))                                      
  105    NA = 1-NA                                                              
         GO TO 115                                                              
  106    IF (IP .NE. 3) GO TO 109                                               
         IX2 = IW+IDOT                                                          
         IF (NA .NE. 0) GO TO 107                                               
         CALL PASSB3 (IDOT,L1,C,CH,WA(IW),WA(IX2))                              
         GO TO 108                                                              
  107    CALL PASSB3 (IDOT,L1,CH,C,WA(IW),WA(IX2))                              
  108    NA = 1-NA                                                              
         GO TO 115                                                              
  109    IF (IP .NE. 5) GO TO 112                                               
         IX2 = IW+IDOT                                                          
         IX3 = IX2+IDOT                                                         
         IX4 = IX3+IDOT                                                         
         IF (NA .NE. 0) GO TO 110                                               
         CALL PASSB5 (IDOT,L1,C,CH,WA(IW),WA(IX2),WA(IX3),WA(IX4))              
         GO TO 111                                                              
  110    CALL PASSB5 (IDOT,L1,CH,C,WA(IW),WA(IX2),WA(IX3),WA(IX4))              
  111    NA = 1-NA                                                              
         GO TO 115                                                              
  112    IF (NA .NE. 0) GO TO 113                                               
         CALL PASSB (NAC,IDOT,IP,L1,IDL1,C,C,C,CH,CH,WA(IW))                    
         GO TO 114                                                              
  113    CALL PASSB (NAC,IDOT,IP,L1,IDL1,CH,CH,CH,C,C,WA(IW))                   
  114    IF (NAC .NE. 0) NA = 1-NA                                              
  115    L1 = L2                                                                
         IW = IW+(IP-1)*IDOT                                                    
  116 CONTINUE                                                                  
      IF (NA .EQ. 0) RETURN                                                     
      N2 = N+N                                                                  
      DO 117 I=1,N2                                                             
         C(I) = CH(I)                                                           
  117 CONTINUE                                                                  
      RETURN                                                                    
      END                                                                       
!     SUBROUTINE CFFTF(N,C,WSAVE)                                               
!                                                                               
!     SUBROUTINE CFFTF COMPUTES THE FORWARD COMPLEX DISCRETE FOURIER            
!     TRANSFORM (THE FOURIER ANALYSIS). EQUIVALENTLY , CFFTF COMPUTES           
!     THE FOURIER COEFFICIENTS OF A COMPLEX PERIODIC SEQUENCE.                  
!     THE TRANSFORM IS DEFINED BELOW AT OUTPUT PARAMETER C.                     
!                                                                               
!     THE TRANSFORM IS NOT NORMALIZED. TO OBTAIN A NORMALIZED TRANSFORM         
!     THE OUTPUT MUST BE DIVIDED BY N. OTHERWISE A CALL OF CFFTF                
!     FOLLOWED BY A CALL OF CFFTB WILL MULTIPLY THE SEQUENCE BY N.              
!                                                                               
!     THE ARRAY WSAVE WHICH IS USED BY SUBROUTINE CFFTF MUST BE                 
!     INITIALIZED BY CALLING SUBROUTINE CFFTI(N,WSAVE).                         
!                                                                               
!     INPUT PARAMETERS                                                          
!                                                                               
!                                                                               
!     N      THE LENGTH OF THE COMPLEX SEQUENCE C. THE METHOD IS                
!            MORE EFFICIENT WHEN N IS THE PRODUCT OF SMALL PRIMES. N            
!                                                                               
!     C      A COMPLEX ARRAY OF LENGTH N WHICH CONTAINS THE SEQUENCE            
!                                                                               
!     WSAVE   A REAL WORK ARRAY WHICH MUST BE DIMENSIONED AT LEAST 4N+15        
!             IN THE PROGRAM THAT CALLS CFFTF. THE WSAVE ARRAY MUST BE          
!             INITIALIZED BY CALLING SUBROUTINE CFFTI(N,WSAVE) AND A            
!             DIFFERENT WSAVE ARRAY MUST BE USED FOR EACH DIFFERENT             
!             VALUE OF N. THIS INITIALIZATION DOES NOT HAVE TO BE               
!             REPEATED SO LONG AS N REMAINS UNCHANGED THUS SUBSEQUENT           
!             TRANSFORMS CAN BE OBTAINED FASTER THAN THE FIRST.                 
!             THE SAME WSAVE ARRAY CAN BE USED BY CFFTF AND CFFTB.              
!                                                                               
!     OUTPUT PARAMETERS                                                         
!                                                                               
!     C      FOR J=1,...,N                                                      
!                                                                               
!                C(J)=THE SUM FROM K=1,...,N OF                                 
!                                                                               
!                      C(K)*EXP(-I*(J-1)*(K-1)*2*PI/N)                          
!                                                                               
!                            WHERE I=SQRT(-1)                                   
!                                                                               
!     WSAVE   CONTAINS INITIALIZATION CALCULATIONS WHICH MUST NOT BE            
!             DESTROYED BETWEEN CALLS OF SUBROUTINE CFFTF OR CFFTB              
!                                                                               
      SUBROUTINE CFFTF (N,C,WSAVE)                                              
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       C(*)       ,WSAVE(*)                                      
!                                                                               
      IF (N .EQ. 1) RETURN                                                      
      IW1 = N+N+1                                                               
      IW2 = IW1+N+N                                                             
      CALL CFFTF1 (N,C,WSAVE,WSAVE(IW1),WSAVE(IW2))                             
      RETURN                                                                    
      END                                                                       
      SUBROUTINE CFFTF1 (N,C,CH,WA,IFAC)                                        
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CH(*)      ,C(*)       ,WA(*)      ,IFAC(*)               
      NF = IFAC(2)                                                              
      NA = 0                                                                    
      L1 = 1                                                                    
      IW = 1                                                                    
      DO 116 K1=1,NF                                                            
         IP = IFAC(K1+2)                                                        
         L2 = IP*L1                                                             
         IDO = N/L2                                                             
         IDOT = IDO+IDO                                                         
         IDL1 = IDOT*L1                                                         
         IF (IP .NE. 4) GO TO 103                                               
         IX2 = IW+IDOT                                                          
         IX3 = IX2+IDOT                                                         
         IF (NA .NE. 0) GO TO 101                                               
         CALL PASSF4 (IDOT,L1,C,CH,WA(IW),WA(IX2),WA(IX3))                      
         GO TO 102                                                              
  101    CALL PASSF4 (IDOT,L1,CH,C,WA(IW),WA(IX2),WA(IX3))                      
  102    NA = 1-NA                                                              
         GO TO 115                                                              
  103    IF (IP .NE. 2) GO TO 106                                               
         IF (NA .NE. 0) GO TO 104                                               
         CALL PASSF2 (IDOT,L1,C,CH,WA(IW))                                      
         GO TO 105                                                              
  104    CALL PASSF2 (IDOT,L1,CH,C,WA(IW))                                      
  105    NA = 1-NA                                                              
         GO TO 115                                                              
  106    IF (IP .NE. 3) GO TO 109                                               
         IX2 = IW+IDOT                                                          
         IF (NA .NE. 0) GO TO 107                                               
         CALL PASSF3 (IDOT,L1,C,CH,WA(IW),WA(IX2))                              
         GO TO 108                                                              
  107    CALL PASSF3 (IDOT,L1,CH,C,WA(IW),WA(IX2))                              
  108    NA = 1-NA                                                              
         GO TO 115                                                              
  109    IF (IP .NE. 5) GO TO 112                                               
         IX2 = IW+IDOT                                                          
         IX3 = IX2+IDOT                                                         
         IX4 = IX3+IDOT                                                         
         IF (NA .NE. 0) GO TO 110                                               
         CALL PASSF5 (IDOT,L1,C,CH,WA(IW),WA(IX2),WA(IX3),WA(IX4))              
         GO TO 111                                                              
  110    CALL PASSF5 (IDOT,L1,CH,C,WA(IW),WA(IX2),WA(IX3),WA(IX4))              
  111    NA = 1-NA                                                              
         GO TO 115                                                              
  112    IF (NA .NE. 0) GO TO 113                                               
         CALL PASSF (NAC,IDOT,IP,L1,IDL1,C,C,C,CH,CH,WA(IW))                    
         GO TO 114                                                              
  113    CALL PASSF (NAC,IDOT,IP,L1,IDL1,CH,CH,CH,C,C,WA(IW))                   
  114    IF (NAC .NE. 0) NA = 1-NA                                              
  115    L1 = L2                                                                
         IW = IW+(IP-1)*IDOT                                                    
  116 CONTINUE                                                                  
      IF (NA .EQ. 0) RETURN                                                     
      N2 = N+N                                                                  
      DO 117 I=1,N2                                                             
         C(I) = CH(I)                                                           
  117 CONTINUE                                                                  
      RETURN                                                                    
      END                                                                       
!     SUBROUTINE CFFTI(N,WSAVE)                                                 
!                                                                               
!     SUBROUTINE CFFTI INITIALIZES THE ARRAY WSAVE WHICH IS USED IN             
!     BOTH CFFTF AND CFFTB. THE PRIME FACTORIZATION OF N TOGETHER WITH          
!     A TABULATION OF THE TRIGONOMETRIC FUNCTIONS ARE COMPUTED AND              
!     STORED IN WSAVE.                                                          
!                                                                               
!     INPUT PARAMETER                                                           
!                                                                               
!     N       THE LENGTH OF THE SEQUENCE TO BE TRANSFORMED                      
!                                                                               
!     OUTPUT PARAMETER                                                          
!                                                                               
!     WSAVE   A WORK ARRAY WHICH MUST BE DIMENSIONED AT LEAST 4*N+15            
!             THE SAME WORK ARRAY CAN BE USED FOR BOTH CFFTF AND CFFTB          
!             AS LONG AS N REMAINS UNCHANGED. DIFFERENT WSAVE ARRAYS            
!             ARE REQUIRED FOR DIFFERENT VALUES OF N. THE CONTENTS OF           
!             WSAVE MUST NOT BE CHANGED BETWEEN CALLS OF CFFTF OR CFFTB.        
!                                                                               
      SUBROUTINE CFFTI (N,WSAVE)                                                
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       WSAVE(*)                                                  
!                                                                               
      IF (N .EQ. 1) RETURN                                                      
      IW1 = N+N+1                                                               
      IW2 = IW1+N+N                                                             
      CALL CFFTI1 (N,WSAVE(IW1),WSAVE(IW2))                                     
      RETURN                                                                    
      END                                                                       
      SUBROUTINE CFFTI1 (N,WA,IFAC)
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       WA(*)      ,IFAC(*)    ,NTRYH(4)
      DATA NTRYH(1),NTRYH(2),NTRYH(3),NTRYH(4)/3,4,2,5/
      NL = N
      NF = 0
      J = 0
  101 J = J+1
      IF (J-4) 102,102,103
  102 NTRY = NTRYH(J)
      GO TO 104
  103 NTRY = NTRY+2
  104 NQ = NL/NTRY
      NR = NL-NTRY*NQ
      IF (NR) 101,105,101
  105 NF = NF+1
      IFAC(NF+2) = NTRY
      NL = NQ
      IF (NTRY .NE. 2) GO TO 107
      IF (NF .EQ. 1) GO TO 107
      DO 106 I=2,NF
         IB = NF-I+2
         IFAC(IB+2) = IFAC(IB+1)
  106 CONTINUE
      IFAC(3) = 2
  107 IF (NL .NE. 1) GO TO 104
      IFAC(1) = N
      IFAC(2) = NF
      TPI = 2.0D+0*PIMACH(DUM)
      ARGH = TPI/DFLOAT(N)
      I = 2
      L1 = 1
      DO 110 K1=1,NF
         IP = IFAC(K1+2)
         LD = 0
         L2 = L1*IP
         IDO = N/L2
         IDOT = IDO+IDO+2
         IPM = IP-1
         DO 109 J=1,IPM
            I1 = I
            WA(I-1) = 1.0D+0
            WA(I) = 0.0D+0
            LD = LD+L1
            FI = 0.0D+0
            ARGLD = DFLOAT(LD)*ARGH
            DO 108 II=4,IDOT,2
               I = I+2
               FI = FI+1.0D+0
               ARG = FI*ARGLD
               WA(I-1) = COS(ARG)
               WA(I) = SIN(ARG)
  108       CONTINUE
            IF (IP .LE. 5) GO TO 109
            WA(I1-1) = WA(I-1)
            WA(I1) = WA(I)
  109    CONTINUE
         L1 = L2
  110 CONTINUE
      RETURN
      END
      SUBROUTINE PASSB (NAC,IDO,IP,L1,IDL1,CC,C1,C2,CH,CH2,WA)                  
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CH(IDO,L1,IP)          ,CC(IDO,IP,L1)          ,           &
                      C1(IDO,L1,IP)          ,WA(*)      ,C2(IDL1,IP),           &
                      CH2(IDL1,IP)                                              
      IDOT = IDO/2                                                              
      NT = IP*IDL1                                                              
      IPP2 = IP+2                                                               
      IPPH = (IP+1)/2                                                           
      IDP = IP*IDO                                                              
!                                                                               
      IF (IDO .LT. L1) GO TO 106                                                
      DO 103 J=2,IPPH                                                           
         JC = IPP2-J                                                            
         DO 102 K=1,L1                                                          
            DO 101 I=1,IDO                                                      
               CH(I,K,J) = CC(I,J,K)+CC(I,JC,K)                                 
               CH(I,K,JC) = CC(I,J,K)-CC(I,JC,K)                                
  101       CONTINUE                                                            
  102    CONTINUE                                                               
  103 CONTINUE                                                                  
      DO 105 K=1,L1                                                             
         DO 104 I=1,IDO                                                         
            CH(I,K,1) = CC(I,1,K)                                               
  104    CONTINUE                                                               
  105 CONTINUE                                                                  
      GO TO 112                                                                 
  106 DO 109 J=2,IPPH                                                           
         JC = IPP2-J                                                            
         DO 108 I=1,IDO                                                         
            DO 107 K=1,L1                                                       
               CH(I,K,J) = CC(I,J,K)+CC(I,JC,K)                                 
               CH(I,K,JC) = CC(I,J,K)-CC(I,JC,K)                                
  107       CONTINUE                                                            
  108    CONTINUE                                                               
  109 CONTINUE                                                                  
      DO 111 I=1,IDO                                                            
         DO 110 K=1,L1                                                          
            CH(I,K,1) = CC(I,1,K)                                               
  110    CONTINUE                                                               
  111 CONTINUE                                                                  
  112 IDL = 2-IDO                                                               
      INC = 0                                                                   
      DO 116 L=2,IPPH                                                           
         LC = IPP2-L                                                            
         IDL = IDL+IDO                                                          
         DO 113 IK=1,IDL1                                                       
            C2(IK,L) = CH2(IK,1)+WA(IDL-1)*CH2(IK,2)                            
            C2(IK,LC) = WA(IDL)*CH2(IK,IP)                                      
  113    CONTINUE                                                               
         IDLJ = IDL                                                             
         INC = INC+IDO                                                          
         DO 115 J=3,IPPH                                                        
            JC = IPP2-J                                                         
            IDLJ = IDLJ+INC                                                     
            IF (IDLJ .GT. IDP) IDLJ = IDLJ-IDP                                  
            WAR = WA(IDLJ-1)                                                    
            WAI = WA(IDLJ)                                                      
            DO 114 IK=1,IDL1                                                    
               C2(IK,L) = C2(IK,L)+WAR*CH2(IK,J)                                
               C2(IK,LC) = C2(IK,LC)+WAI*CH2(IK,JC)                             
  114       CONTINUE                                                            
  115    CONTINUE                                                               
  116 CONTINUE                                                                  
      DO 118 J=2,IPPH                                                           
         DO 117 IK=1,IDL1                                                       
            CH2(IK,1) = CH2(IK,1)+CH2(IK,J)                                     
  117    CONTINUE                                                               
  118 CONTINUE                                                                  
      DO 120 J=2,IPPH                                                           
         JC = IPP2-J                                                            
         DO 119 IK=2,IDL1,2                                                     
            CH2(IK-1,J) = C2(IK-1,J)-C2(IK,JC)                                  
            CH2(IK-1,JC) = C2(IK-1,J)+C2(IK,JC)                                 
            CH2(IK,J) = C2(IK,J)+C2(IK-1,JC)                                    
            CH2(IK,JC) = C2(IK,J)-C2(IK-1,JC)                                   
  119    CONTINUE                                                               
  120 CONTINUE                                                                  
      NAC = 1                                                                   
      IF (IDO .EQ. 2) RETURN                                                    
      NAC = 0                                                                   
      DO 121 IK=1,IDL1                                                          
         C2(IK,1) = CH2(IK,1)                                                   
  121 CONTINUE                                                                  
      DO 123 J=2,IP                                                             
         DO 122 K=1,L1                                                          
            C1(1,K,J) = CH(1,K,J)                                               
            C1(2,K,J) = CH(2,K,J)                                               
  122    CONTINUE                                                               
  123 CONTINUE                                                                  
      IF (IDOT .GT. L1) GO TO 127                                               
      IDIJ = 0                                                                  
      DO 126 J=2,IP                                                             
         IDIJ = IDIJ+2                                                          
         DO 125 I=4,IDO,2                                                       
            IDIJ = IDIJ+2                                                       
            DO 124 K=1,L1                                                       
               C1(I-1,K,J) = WA(IDIJ-1)*CH(I-1,K,J)-WA(IDIJ)*CH(I,K,J)          
               C1(I,K,J) = WA(IDIJ-1)*CH(I,K,J)+WA(IDIJ)*CH(I-1,K,J)            
  124       CONTINUE                                                            
  125    CONTINUE                                                               
  126 CONTINUE                                                                  
      RETURN                                                                    
  127 IDJ = 2-IDO                                                               
      DO 130 J=2,IP                                                             
         IDJ = IDJ+IDO                                                          
         DO 129 K=1,L1                                                          
            IDIJ = IDJ                                                          
            DO 128 I=4,IDO,2                                                    
               IDIJ = IDIJ+2                                                    
               C1(I-1,K,J) = WA(IDIJ-1)*CH(I-1,K,J)-WA(IDIJ)*CH(I,K,J)          
               C1(I,K,J) = WA(IDIJ-1)*CH(I,K,J)+WA(IDIJ)*CH(I-1,K,J)            
  128       CONTINUE                                                            
  129    CONTINUE                                                               
  130 CONTINUE                                                                  
      RETURN                                                                    
      END                                                                       
      SUBROUTINE PASSB2 (IDO,L1,CC,CH,WA1)                                      
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CC(IDO,2,L1)           ,CH(IDO,L1,2)           ,           &
                      WA1(*)                                                    
      IF (IDO .GT. 2) GO TO 102                                                 
      DO 101 K=1,L1                                                             
         CH(1,K,1) = CC(1,1,K)+CC(1,2,K)                                        
         CH(1,K,2) = CC(1,1,K)-CC(1,2,K)                                        
         CH(2,K,1) = CC(2,1,K)+CC(2,2,K)                                        
         CH(2,K,2) = CC(2,1,K)-CC(2,2,K)                                        
  101 CONTINUE                                                                  
      RETURN                                                                    
  102 DO 104 K=1,L1                                                             
         DO 103 I=2,IDO,2                                                       
            CH(I-1,K,1) = CC(I-1,1,K)+CC(I-1,2,K)                               
            TR2 = CC(I-1,1,K)-CC(I-1,2,K)                                       
            CH(I,K,1) = CC(I,1,K)+CC(I,2,K)                                     
            TI2 = CC(I,1,K)-CC(I,2,K)                                           
            CH(I,K,2) = WA1(I-1)*TI2+WA1(I)*TR2                                 
            CH(I-1,K,2) = WA1(I-1)*TR2-WA1(I)*TI2                               
  103    CONTINUE                                                               
  104 CONTINUE                                                                  
      RETURN                                                                    
      END                                                                       
      SUBROUTINE PASSB3 (IDO,L1,CC,CH,WA1,WA2)
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CC(IDO,3,L1)           ,CH(IDO,L1,3)           , &
                      WA1(*)     ,WA2(*)
!     DATA TAUR,TAUI /-0.5D+0,0.866025403784439D+0/
      parameter (TAUR=-0.5D0)
      TAUI=dsqrt(0.75D0)    ! dsqrt not allowed in PARAMETER on aurora-ifort
      IF (IDO .NE. 2) GO TO 102
      DO 101 K=1,L1
         TR2 = CC(1,2,K)+CC(1,3,K)
         CR2 = CC(1,1,K)+TAUR*TR2
         CH(1,K,1) = CC(1,1,K)+TR2
         TI2 = CC(2,2,K)+CC(2,3,K)
         CI2 = CC(2,1,K)+TAUR*TI2
         CH(2,K,1) = CC(2,1,K)+TI2
         CR3 = TAUI*(CC(1,2,K)-CC(1,3,K))
         CI3 = TAUI*(CC(2,2,K)-CC(2,3,K))
         CH(1,K,2) = CR2-CI3
         CH(1,K,3) = CR2+CI3
         CH(2,K,2) = CI2+CR3
         CH(2,K,3) = CI2-CR3
  101 CONTINUE
      RETURN
  102 DO 104 K=1,L1
         DO 103 I=2,IDO,2
            TR2 = CC(I-1,2,K)+CC(I-1,3,K)
            CR2 = CC(I-1,1,K)+TAUR*TR2
            CH(I-1,K,1) = CC(I-1,1,K)+TR2
            TI2 = CC(I,2,K)+CC(I,3,K)
            CI2 = CC(I,1,K)+TAUR*TI2
            CH(I,K,1) = CC(I,1,K)+TI2
            CR3 = TAUI*(CC(I-1,2,K)-CC(I-1,3,K))
            CI3 = TAUI*(CC(I,2,K)-CC(I,3,K))
            DR2 = CR2-CI3
            DR3 = CR2+CI3
            DI2 = CI2+CR3
            DI3 = CI2-CR3
            CH(I,K,2) = WA1(I-1)*DI2+WA1(I)*DR2
            CH(I-1,K,2) = WA1(I-1)*DR2-WA1(I)*DI2
            CH(I,K,3) = WA2(I-1)*DI3+WA2(I)*DR3
            CH(I-1,K,3) = WA2(I-1)*DR3-WA2(I)*DI3
  103    CONTINUE
  104 CONTINUE
      RETURN
      END
      SUBROUTINE PASSB4 (IDO,L1,CC,CH,WA1,WA2,WA3)                              
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CC(IDO,4,L1)           ,CH(IDO,L1,4)           ,           &
                      WA1(*)     ,WA2(*)     ,WA3(*)                            
      IF (IDO .NE. 2) GO TO 102                                                 
      DO 101 K=1,L1                                                             
         TI1 = CC(2,1,K)-CC(2,3,K)                                              
         TI2 = CC(2,1,K)+CC(2,3,K)                                              
         TR4 = CC(2,4,K)-CC(2,2,K)                                              
         TI3 = CC(2,2,K)+CC(2,4,K)                                              
         TR1 = CC(1,1,K)-CC(1,3,K)                                              
         TR2 = CC(1,1,K)+CC(1,3,K)                                              
         TI4 = CC(1,2,K)-CC(1,4,K)                                              
         TR3 = CC(1,2,K)+CC(1,4,K)                                              
         CH(1,K,1) = TR2+TR3                                                    
         CH(1,K,3) = TR2-TR3                                                    
         CH(2,K,1) = TI2+TI3                                                    
         CH(2,K,3) = TI2-TI3                                                    
         CH(1,K,2) = TR1+TR4                                                    
         CH(1,K,4) = TR1-TR4                                                    
         CH(2,K,2) = TI1+TI4                                                    
         CH(2,K,4) = TI1-TI4                                                    
  101 CONTINUE                                                                  
      RETURN                                                                    
  102 DO 104 K=1,L1                                                             
         DO 103 I=2,IDO,2                                                       
            TI1 = CC(I,1,K)-CC(I,3,K)                                           
            TI2 = CC(I,1,K)+CC(I,3,K)                                           
            TI3 = CC(I,2,K)+CC(I,4,K)                                           
            TR4 = CC(I,4,K)-CC(I,2,K)                                           
            TR1 = CC(I-1,1,K)-CC(I-1,3,K)                                       
            TR2 = CC(I-1,1,K)+CC(I-1,3,K)                                       
            TI4 = CC(I-1,2,K)-CC(I-1,4,K)                                       
            TR3 = CC(I-1,2,K)+CC(I-1,4,K)                                       
            CH(I-1,K,1) = TR2+TR3                                               
            CR3 = TR2-TR3                                                       
            CH(I,K,1) = TI2+TI3                                                 
            CI3 = TI2-TI3                                                       
            CR2 = TR1+TR4                                                       
            CR4 = TR1-TR4                                                       
            CI2 = TI1+TI4                                                       
            CI4 = TI1-TI4                                                       
            CH(I-1,K,2) = WA1(I-1)*CR2-WA1(I)*CI2                               
            CH(I,K,2) = WA1(I-1)*CI2+WA1(I)*CR2                                 
            CH(I-1,K,3) = WA2(I-1)*CR3-WA2(I)*CI3                               
            CH(I,K,3) = WA2(I-1)*CI3+WA2(I)*CR3                                 
            CH(I-1,K,4) = WA3(I-1)*CR4-WA3(I)*CI4                               
            CH(I,K,4) = WA3(I-1)*CI4+WA3(I)*CR4                                 
  103    CONTINUE                                                               
  104 CONTINUE                                                                  
      RETURN                                                                    
      END                                                                       
      SUBROUTINE PASSB5 (IDO,L1,CC,CH,WA1,WA2,WA3,WA4)                          
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CC(IDO,5,L1)           ,CH(IDO,L1,5)           ,           &
                      WA1(*)     ,WA2(*)     ,WA3(*)     ,WA4(*)                
     DATA TR11,TI11,TR12,TI12 /0.309016994374947D+0, &
       0.951056516295154D+0,-0.809016994374947D+0,0.587785252292473D+0/                                       
!       TR12 is cos(pi/5)
!       TR11 is cos(2*pi/5) (and similar)
!      parameter (TR11= cos(acos(-1.D0)*0.4D0), TI11=sin(acos(-1.D0)*0.4D0), & 
!                 TR12=-cos(acos(-1.D0)*0.2D0), TI12=sin(acos(-1.D0)*0.2D0) )
      IF (IDO .NE. 2) GO TO 102                                                 
      DO 101 K=1,L1                                                             
         TI5 = CC(2,2,K)-CC(2,5,K)                                              
         TI2 = CC(2,2,K)+CC(2,5,K)                                              
         TI4 = CC(2,3,K)-CC(2,4,K)                                              
         TI3 = CC(2,3,K)+CC(2,4,K)                                              
         TR5 = CC(1,2,K)-CC(1,5,K)                                              
         TR2 = CC(1,2,K)+CC(1,5,K)                                              
         TR4 = CC(1,3,K)-CC(1,4,K)                                              
         TR3 = CC(1,3,K)+CC(1,4,K)                                              
         CH(1,K,1) = CC(1,1,K)+TR2+TR3                                          
         CH(2,K,1) = CC(2,1,K)+TI2+TI3                                          
         CR2 = CC(1,1,K)+TR11*TR2+TR12*TR3                                      
         CI2 = CC(2,1,K)+TR11*TI2+TR12*TI3                                      
         CR3 = CC(1,1,K)+TR12*TR2+TR11*TR3                                      
         CI3 = CC(2,1,K)+TR12*TI2+TR11*TI3                                      
         CR5 = TI11*TR5+TI12*TR4                                                
         CI5 = TI11*TI5+TI12*TI4                                                
         CR4 = TI12*TR5-TI11*TR4                                                
         CI4 = TI12*TI5-TI11*TI4                                                
         CH(1,K,2) = CR2-CI5                                                    
         CH(1,K,5) = CR2+CI5                                                    
         CH(2,K,2) = CI2+CR5                                                    
         CH(2,K,3) = CI3+CR4                                                    
         CH(1,K,3) = CR3-CI4                                                    
         CH(1,K,4) = CR3+CI4                                                    
         CH(2,K,4) = CI3-CR4                                                    
         CH(2,K,5) = CI2-CR5                                                    
  101 CONTINUE                                                                  
      RETURN                                                                    
  102 DO 104 K=1,L1                                                             
         DO 103 I=2,IDO,2                                                       
            TI5 = CC(I,2,K)-CC(I,5,K)                                           
            TI2 = CC(I,2,K)+CC(I,5,K)                                           
            TI4 = CC(I,3,K)-CC(I,4,K)                                           
            TI3 = CC(I,3,K)+CC(I,4,K)                                           
            TR5 = CC(I-1,2,K)-CC(I-1,5,K)                                       
            TR2 = CC(I-1,2,K)+CC(I-1,5,K)                                       
            TR4 = CC(I-1,3,K)-CC(I-1,4,K)                                       
            TR3 = CC(I-1,3,K)+CC(I-1,4,K)                                       
            CH(I-1,K,1) = CC(I-1,1,K)+TR2+TR3                                   
            CH(I,K,1) = CC(I,1,K)+TI2+TI3                                       
            CR2 = CC(I-1,1,K)+TR11*TR2+TR12*TR3                                 
            CI2 = CC(I,1,K)+TR11*TI2+TR12*TI3                                   
            CR3 = CC(I-1,1,K)+TR12*TR2+TR11*TR3                                 
            CI3 = CC(I,1,K)+TR12*TI2+TR11*TI3                                   
            CR5 = TI11*TR5+TI12*TR4                                             
            CI5 = TI11*TI5+TI12*TI4                                             
            CR4 = TI12*TR5-TI11*TR4                                             
            CI4 = TI12*TI5-TI11*TI4                                             
            DR3 = CR3-CI4                                                       
            DR4 = CR3+CI4                                                       
            DI3 = CI3+CR4                                                       
            DI4 = CI3-CR4                                                       
            DR5 = CR2+CI5                                                       
            DR2 = CR2-CI5                                                       
            DI5 = CI2-CR5                                                       
            DI2 = CI2+CR5                                                       
            CH(I-1,K,2) = WA1(I-1)*DR2-WA1(I)*DI2                               
            CH(I,K,2) = WA1(I-1)*DI2+WA1(I)*DR2                                 
            CH(I-1,K,3) = WA2(I-1)*DR3-WA2(I)*DI3                               
            CH(I,K,3) = WA2(I-1)*DI3+WA2(I)*DR3                                 
            CH(I-1,K,4) = WA3(I-1)*DR4-WA3(I)*DI4                               
            CH(I,K,4) = WA3(I-1)*DI4+WA3(I)*DR4                                 
            CH(I-1,K,5) = WA4(I-1)*DR5-WA4(I)*DI5                               
            CH(I,K,5) = WA4(I-1)*DI5+WA4(I)*DR5                                 
  103    CONTINUE                                                               
  104 CONTINUE                                                                  
      RETURN                                                                    
      END                                                                       
      SUBROUTINE PASSF (NAC,IDO,IP,L1,IDL1,CC,C1,C2,CH,CH2,WA)                  
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CH(IDO,L1,IP)          ,CC(IDO,IP,L1)          ,           &
                      C1(IDO,L1,IP)          ,WA(*)      ,C2(IDL1,IP),           &
                      CH2(IDL1,IP)                                              
      IDOT = IDO/2                                                              
      NT = IP*IDL1                                                              
      IPP2 = IP+2                                                               
      IPPH = (IP+1)/2                                                           
      IDP = IP*IDO                                                              
!                                                                               
      IF (IDO .LT. L1) GO TO 106                                                
      DO 103 J=2,IPPH                                                           
         JC = IPP2-J                                                            
         DO 102 K=1,L1                                                          
            DO 101 I=1,IDO                                                      
               CH(I,K,J) = CC(I,J,K)+CC(I,JC,K)                                 
               CH(I,K,JC) = CC(I,J,K)-CC(I,JC,K)                                
  101       CONTINUE                                                            
  102    CONTINUE                                                               
  103 CONTINUE                                                                  
      DO 105 K=1,L1                                                             
         DO 104 I=1,IDO                                                         
            CH(I,K,1) = CC(I,1,K)                                               
  104    CONTINUE                                                               
  105 CONTINUE                                                                  
      GO TO 112                                                                 
  106 DO 109 J=2,IPPH                                                           
         JC = IPP2-J                                                            
         DO 108 I=1,IDO                                                         
            DO 107 K=1,L1                                                       
               CH(I,K,J) = CC(I,J,K)+CC(I,JC,K)                                 
               CH(I,K,JC) = CC(I,J,K)-CC(I,JC,K)                                
  107       CONTINUE                                                            
  108    CONTINUE                                                               
  109 CONTINUE                                                                  
      DO 111 I=1,IDO                                                            
         DO 110 K=1,L1                                                          
            CH(I,K,1) = CC(I,1,K)                                               
  110    CONTINUE                                                               
  111 CONTINUE                                                                  
  112 IDL = 2-IDO                                                               
      INC = 0                                                                   
      DO 116 L=2,IPPH                                                           
         LC = IPP2-L                                                            
         IDL = IDL+IDO                                                          
         DO 113 IK=1,IDL1                                                       
            C2(IK,L) = CH2(IK,1)+WA(IDL-1)*CH2(IK,2)                            
            C2(IK,LC) = -WA(IDL)*CH2(IK,IP)                                     
  113    CONTINUE                                                               
         IDLJ = IDL                                                             
         INC = INC+IDO                                                          
         DO 115 J=3,IPPH                                                        
            JC = IPP2-J                                                         
            IDLJ = IDLJ+INC                                                     
            IF (IDLJ .GT. IDP) IDLJ = IDLJ-IDP                                  
            WAR = WA(IDLJ-1)                                                    
            WAI = WA(IDLJ)                                                      
            DO 114 IK=1,IDL1                                                    
               C2(IK,L) = C2(IK,L)+WAR*CH2(IK,J)                                
               C2(IK,LC) = C2(IK,LC)-WAI*CH2(IK,JC)                             
  114       CONTINUE                                                            
  115    CONTINUE                                                               
  116 CONTINUE                                                                  
      DO 118 J=2,IPPH                                                           
         DO 117 IK=1,IDL1                                                       
            CH2(IK,1) = CH2(IK,1)+CH2(IK,J)                                     
  117    CONTINUE                                                               
  118 CONTINUE                                                                  
      DO 120 J=2,IPPH                                                           
         JC = IPP2-J                                                            
         DO 119 IK=2,IDL1,2                                                     
            CH2(IK-1,J) = C2(IK-1,J)-C2(IK,JC)                                  
            CH2(IK-1,JC) = C2(IK-1,J)+C2(IK,JC)                                 
            CH2(IK,J) = C2(IK,J)+C2(IK-1,JC)                                    
            CH2(IK,JC) = C2(IK,J)-C2(IK-1,JC)                                   
  119    CONTINUE                                                               
  120 CONTINUE                                                                  
      NAC = 1                                                                   
      IF (IDO .EQ. 2) RETURN                                                    
      NAC = 0                                                                   
      DO 121 IK=1,IDL1                                                          
         C2(IK,1) = CH2(IK,1)                                                   
  121 CONTINUE                                                                  
      DO 123 J=2,IP                                                             
         DO 122 K=1,L1                                                          
            C1(1,K,J) = CH(1,K,J)                                               
            C1(2,K,J) = CH(2,K,J)                                               
  122    CONTINUE                                                               
  123 CONTINUE                                                                  
      IF (IDOT .GT. L1) GO TO 127                                               
      IDIJ = 0                                                                  
      DO 126 J=2,IP                                                             
         IDIJ = IDIJ+2                                                          
         DO 125 I=4,IDO,2                                                       
            IDIJ = IDIJ+2                                                       
            DO 124 K=1,L1                                                       
               C1(I-1,K,J) = WA(IDIJ-1)*CH(I-1,K,J)+WA(IDIJ)*CH(I,K,J)          
               C1(I,K,J) = WA(IDIJ-1)*CH(I,K,J)-WA(IDIJ)*CH(I-1,K,J)            
  124       CONTINUE                                                            
  125    CONTINUE                                                               
  126 CONTINUE                                                                  
      RETURN                                                                    
  127 IDJ = 2-IDO                                                               
      DO 130 J=2,IP                                                             
         IDJ = IDJ+IDO                                                          
         DO 129 K=1,L1                                                          
            IDIJ = IDJ                                                          
            DO 128 I=4,IDO,2                                                    
               IDIJ = IDIJ+2                                                    
               C1(I-1,K,J) = WA(IDIJ-1)*CH(I-1,K,J)+WA(IDIJ)*CH(I,K,J)          
               C1(I,K,J) = WA(IDIJ-1)*CH(I,K,J)-WA(IDIJ)*CH(I-1,K,J)            
  128       CONTINUE                                                            
  129    CONTINUE                                                               
  130 CONTINUE                                                                  
      RETURN                                                                    
      END                                                                       
      SUBROUTINE PASSF2 (IDO,L1,CC,CH,WA1)                                      
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CC(IDO,2,L1)           ,CH(IDO,L1,2)           ,           &
                      WA1(*)                                                    
      IF (IDO .GT. 2) GO TO 102                                                 
      DO 101 K=1,L1                                                             
         CH(1,K,1) = CC(1,1,K)+CC(1,2,K)                                        
         CH(1,K,2) = CC(1,1,K)-CC(1,2,K)                                        
         CH(2,K,1) = CC(2,1,K)+CC(2,2,K)                                        
         CH(2,K,2) = CC(2,1,K)-CC(2,2,K)                                        
  101 CONTINUE                                                                  
      RETURN                                                                    
  102 DO 104 K=1,L1                                                             
         DO 103 I=2,IDO,2                                                       
            CH(I-1,K,1) = CC(I-1,1,K)+CC(I-1,2,K)                               
            TR2 = CC(I-1,1,K)-CC(I-1,2,K)                                       
            CH(I,K,1) = CC(I,1,K)+CC(I,2,K)                                     
            TI2 = CC(I,1,K)-CC(I,2,K)                                           
            CH(I,K,2) = WA1(I-1)*TI2-WA1(I)*TR2                                 
            CH(I-1,K,2) = WA1(I-1)*TR2+WA1(I)*TI2                               
  103    CONTINUE                                                               
  104 CONTINUE                                                                  
      RETURN                                                                    
      END                                                                       
      SUBROUTINE PASSF3 (IDO,L1,CC,CH,WA1,WA2)
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CC(IDO,3,L1)           ,CH(IDO,L1,3)           , &
                      WA1(*)     ,WA2(*)
     DATA TAUR,TAUI /-0.5D+0,-0.866025403784439D+0/
!     TAUR=-0.5D0
!     TAUI=-sqrt(0.75D0)
!      parameter (TAUR=-0.5D0, TAUI=-dsqrt(0.75D0))
      IF (IDO .NE. 2) GO TO 102
      DO 101 K=1,L1
         TR2 = CC(1,2,K)+CC(1,3,K)
         CR2 = CC(1,1,K)+TAUR*TR2
         CH(1,K,1) = CC(1,1,K)+TR2
         TI2 = CC(2,2,K)+CC(2,3,K)
         CI2 = CC(2,1,K)+TAUR*TI2
         CH(2,K,1) = CC(2,1,K)+TI2
         CR3 = TAUI*(CC(1,2,K)-CC(1,3,K))
         CI3 = TAUI*(CC(2,2,K)-CC(2,3,K))
         CH(1,K,2) = CR2-CI3
         CH(1,K,3) = CR2+CI3
         CH(2,K,2) = CI2+CR3
         CH(2,K,3) = CI2-CR3
  101 CONTINUE
      RETURN
  102 DO 104 K=1,L1
         DO 103 I=2,IDO,2
            TR2 = CC(I-1,2,K)+CC(I-1,3,K)
            CR2 = CC(I-1,1,K)+TAUR*TR2
            CH(I-1,K,1) = CC(I-1,1,K)+TR2
            TI2 = CC(I,2,K)+CC(I,3,K)
            CI2 = CC(I,1,K)+TAUR*TI2
            CH(I,K,1) = CC(I,1,K)+TI2
            CR3 = TAUI*(CC(I-1,2,K)-CC(I-1,3,K))
            CI3 = TAUI*(CC(I,2,K)-CC(I,3,K))
            DR2 = CR2-CI3
            DR3 = CR2+CI3
            DI2 = CI2+CR3
            DI3 = CI2-CR3
            CH(I,K,2) = WA1(I-1)*DI2-WA1(I)*DR2
            CH(I-1,K,2) = WA1(I-1)*DR2+WA1(I)*DI2
            CH(I,K,3) = WA2(I-1)*DI3-WA2(I)*DR3
            CH(I-1,K,3) = WA2(I-1)*DR3+WA2(I)*DI3
  103    CONTINUE
  104 CONTINUE
      RETURN
      END
      SUBROUTINE PASSF4 (IDO,L1,CC,CH,WA1,WA2,WA3)                              
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CC(IDO,4,L1)           ,CH(IDO,L1,4)           ,           &
                      WA1(*)     ,WA2(*)     ,WA3(*)                            
      IF (IDO .NE. 2) GO TO 102                                                 
      DO 101 K=1,L1                                                             
         TI1 = CC(2,1,K)-CC(2,3,K)                                              
         TI2 = CC(2,1,K)+CC(2,3,K)                                              
         TR4 = CC(2,2,K)-CC(2,4,K)                                              
         TI3 = CC(2,2,K)+CC(2,4,K)                                              
         TR1 = CC(1,1,K)-CC(1,3,K)                                              
         TR2 = CC(1,1,K)+CC(1,3,K)                                              
         TI4 = CC(1,4,K)-CC(1,2,K)                                              
         TR3 = CC(1,2,K)+CC(1,4,K)                                              
         CH(1,K,1) = TR2+TR3                                                    
         CH(1,K,3) = TR2-TR3                                                    
         CH(2,K,1) = TI2+TI3                                                    
         CH(2,K,3) = TI2-TI3                                                    
         CH(1,K,2) = TR1+TR4                                                    
         CH(1,K,4) = TR1-TR4                                                    
         CH(2,K,2) = TI1+TI4                                                    
         CH(2,K,4) = TI1-TI4                                                    
  101 CONTINUE                                                                  
      RETURN                                                                    
  102 DO 104 K=1,L1                                                             
         DO 103 I=2,IDO,2                                                       
            TI1 = CC(I,1,K)-CC(I,3,K)                                           
            TI2 = CC(I,1,K)+CC(I,3,K)                                           
            TI3 = CC(I,2,K)+CC(I,4,K)                                           
            TR4 = CC(I,2,K)-CC(I,4,K)                                           
            TR1 = CC(I-1,1,K)-CC(I-1,3,K)                                       
            TR2 = CC(I-1,1,K)+CC(I-1,3,K)                                       
            TI4 = CC(I-1,4,K)-CC(I-1,2,K)                                       
            TR3 = CC(I-1,2,K)+CC(I-1,4,K)                                       
            CH(I-1,K,1) = TR2+TR3                                               
            CR3 = TR2-TR3                                                       
            CH(I,K,1) = TI2+TI3                                                 
            CI3 = TI2-TI3                                                       
            CR2 = TR1+TR4                                                       
            CR4 = TR1-TR4                                                       
            CI2 = TI1+TI4                                                       
            CI4 = TI1-TI4                                                       
            CH(I-1,K,2) = WA1(I-1)*CR2+WA1(I)*CI2                               
            CH(I,K,2) = WA1(I-1)*CI2-WA1(I)*CR2                                 
            CH(I-1,K,3) = WA2(I-1)*CR3+WA2(I)*CI3                               
            CH(I,K,3) = WA2(I-1)*CI3-WA2(I)*CR3                                 
            CH(I-1,K,4) = WA3(I-1)*CR4+WA3(I)*CI4                               
            CH(I,K,4) = WA3(I-1)*CI4-WA3(I)*CR4                                 
  103    CONTINUE                                                               
  104 CONTINUE                                                                  
      RETURN                                                                    
      END                                                                       
      SUBROUTINE PASSF5 (IDO,L1,CC,CH,WA1,WA2,WA3,WA4)
      IMPLICIT REAL*8     (A-H,O-Z)
      DIMENSION       CC(IDO,5,L1)           ,CH(IDO,L1,5)           , &
                      WA1(*)     ,WA2(*)     ,WA3(*)     ,WA4(*)
     DATA TR11,TI11,TR12,TI12 /0.309016994374947D+0, &
       -0.951056516295154D+0,-0.809016994374947D+0, &
       -0.587785252292473D+0/
!      parameter (TR11= cos(acos(-1.D0)*0.4D0), TI11=-sin(acos(-1.D0)*0.4D0), & 
!                 TR12=-cos(acos(-1.D0)*0.2D0), TI12=-sin(acos(-1.D0)*0.2D0) )

      IF (IDO .NE. 2) GO TO 102
      DO 101 K=1,L1
         TI5 = CC(2,2,K)-CC(2,5,K)
         TI2 = CC(2,2,K)+CC(2,5,K)
         TI4 = CC(2,3,K)-CC(2,4,K)
         TI3 = CC(2,3,K)+CC(2,4,K)
         TR5 = CC(1,2,K)-CC(1,5,K)
         TR2 = CC(1,2,K)+CC(1,5,K)
         TR4 = CC(1,3,K)-CC(1,4,K)
         TR3 = CC(1,3,K)+CC(1,4,K)
         CH(1,K,1) = CC(1,1,K)+TR2+TR3
         CH(2,K,1) = CC(2,1,K)+TI2+TI3
         CR2 = CC(1,1,K)+TR11*TR2+TR12*TR3
         CI2 = CC(2,1,K)+TR11*TI2+TR12*TI3
         CR3 = CC(1,1,K)+TR12*TR2+TR11*TR3
         CI3 = CC(2,1,K)+TR12*TI2+TR11*TI3
         CR5 = TI11*TR5+TI12*TR4
         CI5 = TI11*TI5+TI12*TI4
         CR4 = TI12*TR5-TI11*TR4
         CI4 = TI12*TI5-TI11*TI4
         CH(1,K,2) = CR2-CI5
         CH(1,K,5) = CR2+CI5
         CH(2,K,2) = CI2+CR5
         CH(2,K,3) = CI3+CR4
         CH(1,K,3) = CR3-CI4
         CH(1,K,4) = CR3+CI4
         CH(2,K,4) = CI3-CR4
         CH(2,K,5) = CI2-CR5
  101 CONTINUE
      RETURN
  102 DO 104 K=1,L1
         DO 103 I=2,IDO,2
            TI5 = CC(I,2,K)-CC(I,5,K)
            TI2 = CC(I,2,K)+CC(I,5,K)
            TI4 = CC(I,3,K)-CC(I,4,K)
            TI3 = CC(I,3,K)+CC(I,4,K)
            TR5 = CC(I-1,2,K)-CC(I-1,5,K)
            TR2 = CC(I-1,2,K)+CC(I-1,5,K)
            TR4 = CC(I-1,3,K)-CC(I-1,4,K)
            TR3 = CC(I-1,3,K)+CC(I-1,4,K)
            CH(I-1,K,1) = CC(I-1,1,K)+TR2+TR3
            CH(I,K,1) = CC(I,1,K)+TI2+TI3
            CR2 = CC(I-1,1,K)+TR11*TR2+TR12*TR3
            CI2 = CC(I,1,K)+TR11*TI2+TR12*TI3
            CR3 = CC(I-1,1,K)+TR12*TR2+TR11*TR3
            CI3 = CC(I,1,K)+TR12*TI2+TR11*TI3
            CR5 = TI11*TR5+TI12*TR4
            CI5 = TI11*TI5+TI12*TI4
            CR4 = TI12*TR5-TI11*TR4
            CI4 = TI12*TI5-TI11*TI4
            DR3 = CR3-CI4
            DR4 = CR3+CI4
            DI3 = CI3+CR4
            DI4 = CI3-CR4
            DR5 = CR2+CI5
            DR2 = CR2-CI5
            DI5 = CI2-CR5
            DI2 = CI2+CR5
            CH(I-1,K,2) = WA1(I-1)*DR2+WA1(I)*DI2
            CH(I,K,2) = WA1(I-1)*DI2-WA1(I)*DR2
            CH(I-1,K,3) = WA2(I-1)*DR3+WA2(I)*DI3
            CH(I,K,3) = WA2(I-1)*DI3-WA2(I)*DR3
            CH(I-1,K,4) = WA3(I-1)*DR4+WA3(I)*DI4
            CH(I,K,4) = WA3(I-1)*DI4-WA3(I)*DR4
            CH(I-1,K,5) = WA4(I-1)*DR5+WA4(I)*DI5
            CH(I,K,5) = WA4(I-1)*DI5-WA4(I)*DR5
  103    CONTINUE
  104 CONTINUE
      RETURN
      END
      FUNCTION PIMACH (DUM)
      IMPLICIT REAL*8 (A-H,O-Z)
!     PI=3.1415926535897932384626433832795028841971693993751058209749446
!
      PIMACH = 4.0D+0*ATAN(1.0D+0)
      RETURN
      END
-------------- next part --------------
      REAL*8 FUNCTION GAUNT(L1,L2,L3,MM1,MM2,MM3)                       
      IMPLICIT REAL*8 (A-H,O-Z)
     DATA FPI/12.56637061436D0/                                        
!      parameter (FPI=4.D0*acos(-1.D0) )
      J1=2*L1                                                           
      J2=2*L2                                                           
      J3=2*L3                                                           
      M1=-2*MM1                                                         
      M2=2*MM2                                                          
      M3=2*MM3                                                          
!     FAC=(-1.D0)**MM1*SQRT( (J1+1)*(J2+1)*(J3+1)/FPI)                     
      FAC=SQRT(DBLE((J1+1)*(J2+1)*(J3+1))/FPI)
      IF( 2*(MM1/2) .ne. MM1)FAC=-FAC
      GAUNT=FAC*T3J0(J1,J2,J3)*T3J(J1,J2,J3,M1,M2,M3)                   
      RETURN                                                            
      END                                                               
-------------- next part --------------
       Subroutine W2kinit ()
!
!      Generic initialization with Wien2k
!
!      This code sets the stack limit to the hard limit
!      and setups up error-handlers.
!
!      The intent with the handlers is to ensure that a
!      mpi job will stop if one processor encounters an
!      error, but signalling an MPI_ABORT (via abort_parallel)
!      to all the other processors. Almost certainly this
!      will not catch network failures and some other issues.
!
!      It is probably only needed for mpi codes since in
!      other cases the fortran rtl routines will give more
!      information.
!
!      The handlers are in C, as this is 
!      more portable than Intel's ifport routines, plus some
!      handlers as well using ifport if -DINTEL_VML is in
!      the compile line. It appears that some fortran issues
!      are not going to the c rtl, so these are added as
!      well although this could be dangerous! 
!
!      L. D. Marks, January 2010
!
#ifdef INTEL_VML
!       include 'mkl_vml.fi'
#endif
       call w2k_extend_limits()

#ifdef Parallel
!      Register C signal dispatcher, needed for parallel only
       call w2k_register_signals()
#endif

#ifdef INTEL_VML
!      Initialize Intel Vector Math Library (part of MKL)
!       oldmode = vmlsetmode( IOR(VML_HA, VML_DOUBLE_CONSISTENT) )
#endif
       return
       end

        
        Subroutine W2k_catch_signal(wsig)
#ifdef Parallel
        use parallel, only: abort_parallel, myid
#else
        parameter (myid = 0)
#endif
        integer wsig(*)

!	Signal messages, for lack of Fortran psignal().
!	The sequence is W2k-internal and should match
!	W2kutils.c:w2k_dispatch_signal()
       character *50 signal_message(0:12)
       data signal_message / &
     &		 'Unknown signal received'		& ! 0
     &		,'Received an Interrupt'		& ! 1
     &		,'Found an Illegal Instruction'		& ! 2
     &		,'Abort'				& ! 3
     &		,'Bus Error'				& ! 4
     &		,'Floating-Point Exception'		& ! 5
     &		,'SIGSEGV, contact developers'		& ! 6
     &		,'Stack Fault, contact developers'	& ! 7
     &		,'Run out of CPU time allotted, sorry'	& ! 8
     &		,'Process kill signal received'		& ! 9
     &		,'Process termination signal received'	& ! 10
     &		,'Hangup signal received'		& ! 11
     &		,'IOT trap (abort)'			& ! 12
     & /
!
        i = wsig(1)
	if (i .lt. 1 .or. i .gt. 11) i = 0
	call outerr('Unknown', trim(signal_message(i)))
	write(*,*)'Child id',myid, trim(signal_message(i))

#ifdef Parallel
!       Ensure all mpi processes know something has gone wrong
        call abort_parallel
#endif
!     Just in case we are still alive
      stop 'WIEN2K ABORTING'
      return
      end

-------------- next part --------------
      SUBROUTINE HAMILT(LTOP,NAT,NV,U,UP,DU,DUP,MULT,R, &
                        POS,V,VOLUME,CHN,E,INS)
!
#ifdef Parallel
      use matrices, only : H, S,HSROWS, KZZ, XK, YK, ZK, RK
#else
      use matrices, only : HS, HSDIAG, HSROWS, KZZ, XK, YK, ZK, RK
#endif
      use loabc, only : ALO, BLO, CLO, ELO, PLO, DPLO, PELO, &
                        DPELO, PEILO, PI12LO, PE12LO
      use parallel, only : ICTXTALL, DESCHS,DESCZ, NPE, MYID,  &
                           NPROW, NPCOL, MYROWHS, MYCOLHS, &
                           BLOCKSIZE, LDHS, LCOLHS, LCOLZ, &
                           INIT_PARALLEL, BARRIER, CLOSE_PARALLEL
      use lapw_timer, only : time_albl, time_loop260, time_phase,   &
                        time_legendre, time_step_function, time_h,   &
        	       time_us, time_loop210, time_loop230,   &
      		       time_loop240, time_overlap, time_lo,   &
      		       START_TIMER, STOP_TIMER, READ_CPU_TIME, READ_wall_TIME, &
                       init_all_timer,time_distiouter
      use lolog, only : nlo, loor, lapw, ilo
      use rotmat, only: ROTIJ, ROTLOC
      use struk, only : NDF,multmax
      use out, only   : WARP
      use albl

!      use out, only   : WARP1, WARP2, WARP3
      IMPLICIT NONE
      INCLUDE 'param.inc'
!
!        Parameters
!
      DOUBLE PRECISION   ONE, TOL, UTOP, ZERO, HALF
      DOUBLE PRECISION   TWO, THREE, FOUR
      PARAMETER          (ONE=1.0D+0, TOL=1.0D-8)
      PARAMETER          (TWO=2.0D+0, THREE=3.0D0, FOUR=4.0D0)
      PARAMETER          (HALF=0.5D0)
      PARAMETER          (UTOP=0.9999999D+0, ZERO=0.0D+0)
      DOUBLE COMPLEX         CIMAG
      PARAMETER          (CIMAG = (0.0D0,1.0D0))
      DOUBLE COMPLEX         CZERO
      PARAMETER          (CZERO=(0.0D0,0.0D0))
      DOUBLE COMPLEX     CONE
      PARAMETER          (CONE=(1.0d0,0.0d0))
!
!        Arguments
!
      INTEGER            INS, LTOP, NAT, NV
      INTEGER            MULT(NAT)
      DOUBLE PRECISION   VOLUME
      DOUBLE PRECISION   R(NAT)
!      DOUBLE PRECISION   XK(HSROWS+1), YK(HSROWS+1), ZK(HSROWS+1)
      DOUBLE PRECISION   CHN(0:LMAX-1,NAT), DU(0:LMAX-1,NAT)
      DOUBLE PRECISION   DUP(0:LMAX-1,NAT), E(0:LMAX-1,NAT)
      DOUBLE PRECISION   POS(3,NDF)
      DOUBLE PRECISION   U(0:LMAX-1,NAT), UP(0:LMAX-1,NAT), V(NAT)

!        YL(lm,:,i) - temporary storage of spherical harmonics for
!                     local orbitals
!
      DOUBLE COMPLEX, allocatable :: YL(:,:,:)
!
!        Locals
!
      DOUBLE PRECISION   V3
      INTEGER            jlo,jlop
      DOUBLE PRECISION   atmp1, tmp3, stmp, htmp, btmp1
      DOUBLE PRECISION   akinlor,akinlorup,akinloiup,akinbr,akinbi
      INTEGER            I, INDATM, indatm0, J, JEQ, JEQO, JNEQ, K, L
      INTEGER            L0M0, MO1, MO2, N, NNLO, IATM, IHELP, IOUTER
      INTEGER            IMIN, IMAX, II
      DOUBLE PRECISION   ARGX, ARGY, ARGZ, ATMP, BESR, BTMP
      DOUBLE PRECISION   ARGXX, ARGYY, ARGZZ
      DOUBLE PRECISION   C1R, C1I, C2R, C2I, C3R, C3I
      DOUBLE PRECISION   C11R, C11I, C12R, C12I, C13R, C13I
      DOUBLE PRECISION   C1RUP, C1IUP, C2RUP, C2IUP, C3RUP, C3IUP
      DOUBLE PRECISION   C11RUP,C11IUP,C12RUP,C12IUP,C13RUP,C13IUP
      DOUBLE PRECISION   C6R, C6I, C7R, C7I, C8R, C8I
      DOUBLE PRECISION   CABR, CABI, CABCR, CABCI, CLR, CLI
      DOUBLE PRECISION   CLYLR, CLYLI, CLYLNR, CLYLNI
      DOUBLE PRECISION   DLL1, PI4R2V
      DOUBLE PRECISION   DTIME1, DTIME2, DTIMH, DTIMLG, DTIMMA, DTIMPH,DTIMDIST
      DOUBLE PRECISION   DTIME3, dtime240, dtime241, dtime242
      DOUBLE PRECISION   dtime260, dtime211, dtime231, dtime210,dtime230,time_albl_tot
      DOUBLE PRECISION   time_albl_totw, DTIMLGw, DTIMMAw, DTIMPHw,DTIMDISTw,DTIMUSw
      DOUBLE PRECISION   DTIMS, DTIMUS, PI, Q, RKN, TMP,  VIFPR4, rk_tmp1

      DOUBLE PRECISION, allocatable ::   DEN(:), DFJ(:), FJ(:), FK(:), rk_tmp(:)
      DOUBLE PRECISION, allocatable ::   PPLX(:), PPLY(:), ROTV1(:)
      DOUBLE PRECISION, allocatable ::   ROTV2(:), SLEN(:,:), TMP1(:)
      DOUBLE PRECISION, allocatable ::   TMP2(:), VEC(:)
      DOUBLE PRECISION, allocatable ::   X2(:,:), XL(:),x(:)
      DOUBLE PRECISION, allocatable ::   P(:,:,:)
      DOUBLE PRECISION, allocatable ::   help1(:), help_cos(:)
      DOUBLE PRECISION              ::   xk_tmp_c,yk_tmp_c,zk_tmp_c
      DOUBLE PRECISION, allocatable ::   xk_tmp_r(:),yk_tmp_r(:),zk_tmp_r(:)
!_COMPLEX      DOUBLE COMPLEX, allocatable ::     help_exp(:)
      DOUBLE PRECISION, allocatable ::   help_sin(:), tmp_y(:)
#if defined (INTEL_VML)
!_COMPLEX      DOUBLE PRECISION, allocatable ::   help_tmpsin(:), help_tmpcos(:)
      DOUBLE PRECISION, allocatable ::   tmp_ytmp(:)
#endif
!_REAL      DOUBLE PRECISION ::   PHS
!_REAL      DOUBLE PRECISION, allocatable ::   PHASE(:),spanelus(:,:)
!_COMPLEX      DOUBLE COMPLEX ::     PHS
!_COMPLEX      DOUBLE COMPLEX, allocatable ::     PHASE(:),spanelus(:,:)
      DOUBLE COMPLEX, allocatable ::         PHSC(:)
!_REAL      DOUBLE PRECISION, allocatable ::  HPANEL(:,:)
!_REAL      DOUBLE PRECISION, allocatable ::  SPANEL(:,:)
!_COMPLEX      DOUBLE COMPLEX, allocatable :: HPANEL(:, :)
!_COMPLEX      DOUBLE COMPLEX, allocatable :: SPANEL(:, :)
            integer, allocatable:: kzz_tmp(:,:), kzz_ihelp(:,:)
!
!        External Functions
!
!_REAL      DOUBLE PRECISION   USTPHX, WARPIN
!_COMPLEX      DOUBLE COMPLEX         USTPHX, WARPIN
      EXTERNAL           USTPHX, WARPIN
!
!        External Subroutines
!
      EXTERNAL           CPUTIM, DVBES1, ROTATE, SPHBES, YLM
!
!        Intrinsic Functions
!
      INTRINSIC          ABS, ATAN, DCONJG, COS, DBLE, DCMPLX, EXP, SIN
      INTRINSIC          SQRT

      integer i_g,j_g,hsdim_r,hsdim_c,i_col,ipr,ipc,ii_g
      integer global2local, jnnlo
      integer, allocatable :: j_end(:), j_end_1(:)

      CALL BARRIER

!
      allocate( DFJ(0:LMAX-1), FJ(0:LMAX-1), FK(3) )
      allocate( ROTV1(3),ROTV2(3),VEC(3)  )
      allocate( XL(LMAX-2) )

#ifdef Parallel
      HSdim_r=LDHS
      HSdim_c=LCOLHS
#else
      HSdim_c=HSROWS
      HSdim_r=HSROWS
#endif

      allocate( HPANEL( HSdim_r, BLOCKSIZE) )
      allocate( SPANEL( HSdim_r, BLOCKSIZE) )
      allocate( DEN(HSdim_r), rk_tmp(HSdim_r) )
      allocate( PPLX(HSdim_r), PPLY(HSdim_r) )
      allocate( SLEN(HSdim_r,blocksize), TMP1(HSdim_r) )
      allocate( TMP2(HSdim_r) )
      allocate( kzz_TMP(3,HSdim_r) )
      allocate( kzz_ihelp(3,blocksize) )
      allocate( X2(HSdim_r,blocksize), X(HSdim_r) )
      allocate( P(HSdim_r,0:LTOP-1,blocksize) )
      allocate( help1(HSdim_r), help_cos(HSdim_r), help_sin(HSdim_r) )
#if defined (INTEL_VML)
!_COMPLEX      allocate( help_tmpcos(HSdim_r), help_tmpsin(HSdim_r) )
      allocate( tmp_ytmp(HSdim_r) )
#endif
      allocate( tmp_y(HSdim_r) )
!_COMPLEX      allocate( help_exp(HSdim_r) )
      allocate( PHASE(HSdim_r) )
      allocate(spanelus(HSdim_r,blocksize))
      allocate(j_end(blocksize),j_end_1(blocksize))
      allocate(xk_tmp_r(HSdim_r),yk_tmp_r(HSdim_r),zk_tmp_r(HSdim_r))

      CALL init_albl(ltop-1,'HAM')

!_REAL     write(6,'(a20,f12.1,a,a,2i6)') 'allocate spanel',HSdim_r*blocksize*8.0/(1024.0*1024.0),' MB ',&
!_REAL           '         dimensions',HSdim_r,blocksize
!_REAL     write(6,'(a20,f12.1,a,a,2i6)') 'allocate hpanel',HSdim_r*blocksize*8.0/(1024.0*1024.0),' MB ',&
!_REAL           '         dimensions',HSdim_r,blocksize
!_COMPLEX  write(6,'(a20,f12.1,a,a,2i6)') 'allocate spanel',HSdim_r*blocksize*16.0/(1024.0*1024.0),' MB ',&
!_COMPLEX        '         dimensions',HSdim_r,blocksize
!_COMPLEX  write(6,'(a20,f12.1,a,a,2i6)') 'allocate hpanel',HSdim_r*blocksize*16.0/(1024.0*1024.0),' MB ',&
!_COMPLEX        '         dimensions',HSdim_r,blocksize
!_REAL      write(6,'(a20,f12.1,a,a,2i6)') 'allocate spanelus',HSdim_r*blocksize*8.0/(1024.0*1024.0),' MB ',&
!_REAL           '         dimensions',HSdim_r,blocksize
!_COMPLEX   write(6,'(a20,f12.1,a,a,2i6)') 'allocate spanelus',HSdim_r*blocksize*16.0/(1024.0*1024.0),' MB ',&
!_COMPLEX           '         dimensions',HSdim_r,blocksize
      write(6,'(a20,f12.1,a,a,2i6)') 'allocate slen',HSdim_r*blocksize*8.0/(1024.0*1024.0),' MB ',&
           '         dimensions',HSdim_r,blocksize
      write(6,'(a20,f12.1,a,a,2i6)') 'allocate x2',HSdim_r*blocksize*8.0/(1024.0*1024.0),' MB ',&
           '         dimensions',HSdim_r,blocksize
      write(6,'(a20,f12.1,a,a,3i6)') 'allocate legendre',dble((HSdim_r)*LMAX*blocksize*8)/(1024.0*1024.0),' MB ',&
           '         dimensions',HSdim_r,lmax,blocksize
      write(6,'(a20,f12.1,a,a,2i6)') 'allocate al,bl (row)',HSdim_r*ltop*2.0*2*8.0/(1024.0*1024.0),' MB ',&
           '         dimensions',HSdim_r,ltop
      write(6,'(a20,f12.1,a,a,2i6)') 'allocate al,bl (col)',blocksize*ltop*2.0*2*8.0/(1024.0*1024.0),' MB ',&
           '         dimensions',blocksize,ltop
      write(6,'(a20,f12.1,a,a,3i6)') 'allocate YL',dble((LOMAX+1)*(LOMAX+1)*HSdim_r*multmax*16)/(1024.0*1024.0),' MB ',&
           '         dimensions',(LOMAX+1)*(LOMAX+1)-1,HSdim_r,multmax

      PI = FOUR*ATAN(ONE)
      DO I = 1, LMAX - 2
         XL(I) = DBLE(I)/DBLE(I+1)
      enddo
      DTIMPH = ZERO
      DTIMUS = ZERO
      DTIMLG = ZERO
      DTIMMA = ZERO
      DTIMS = ZERO
      DTIMH = ZERO
      DTIMDIST=ZERO
      time_albl_tot=zero
      DTIMPHw = ZERO
      DTIMUSw = ZERO
      DTIMLGw = ZERO
      DTIMMAw = ZERO
      DTIMDISTw=ZERO
      time_albl_totw=zero
!
!        compute Overlap-matrix
!
      CALL START_TIMER(time_loop260)

      j=0
      DO J_g = 1,NV
#ifdef Parallel
         ipr=mod((j_g-1)/blocksize,nprow)
         if (ipr.ne.myrowhs) cycle
         j=j+1
#else
         j=j_g
#endif
         kzz_tmp(1:3,j)=KZZ(1:3,J_g) 
         xk_tmp_r(j)=XK(J_g)
         yk_tmp_r(j)=YK(J_g)
         zk_tmp_r(j)=ZK(J_g)
         rk_tmp(j)=RK(J_g)
      enddo

      iouter_loop: DO IOUTER = 0, NV/(BLOCKSIZE)

#ifdef Parallel
         ipc=mod(iouter,npcol)
         if (mycolhs.ne.ipc) cycle
#endif
         IMIN = IOUTER*BLOCKSIZE + 1
         IMAX = MIN(NV, (IOUTER + 1)*BLOCKSIZE)
!
!        precompute Legendre-Polynomials
!
         CALL START_TIMER(time_legendre)

         IHELP = 0
!DEC$ PARALLEL
         legrende: DO I_g = IMIN, IMAX
            IHELP = IHELP + 1

#ifdef Parallel
            j=0
            DO J_g = 1, I_g-1
               ipr=mod((j_g-1)/blocksize,nprow)
               if (ipr.ne.myrowhs) cycle
               j=j+1
            enddo
            j_end_1(ihelp)=j
            ipr=mod((i_g-1)/blocksize,nprow)
            if (ipr.eq.myrowhs) j=j+1
            j_end(ihelp)=j
#else
            j_end(ihelp)=i_g
            j_end_1(ihelp)=i_g-1
#endif

            kzz_ihelp(1:3,ihelp)=KZZ(1:3,I_g) 
            xk_tmp_c=xk(i_g)
            yk_tmp_c=yk(i_g)
            zk_tmp_c=zk(i_g)
            rk_tmp1=RK(I_g)

            DO J = 1,j_end(ihelp)
               X2(J,ihelp) =  XK_tmp_c*XK_tmp_r(J) +  &
                              YK_tmp_c*YK_tmp_r(J) +  &
                              ZK_tmp_c*ZK_tmp_r(J)
               SLEN(J,ihelp) = SQRT((XK_tmp_r(J)-XK_tmp_c)*(XK_tmp_r(J)-XK_tmp_c)+ &
                                    (YK_tmp_r(J)-YK_tmp_c)*(YK_tmp_r(J)-YK_tmp_c)+ &
                                    (ZK_tmp_r(J)-ZK_tmp_c)*(ZK_tmp_r(J)-ZK_tmp_c))
               x(j)=X2(J,ihelp)
               den(j)=rk_tmp1*rk_tmp(j)
            enddo

            DO J = 1,j_end(ihelp)
               IF (DEN(J) .GT. TOL) X(J) = X(J)/DEN(J)               
               PPLX(J) = X(J)*X(J)
               PPLY(J) = PPLX(J) - ONE
               P(J,0,ihelp) = ONE
               P(J,1,ihelp) = X(J)
            enddo

            DO K = 2, LTOP - 2
               DO J = 1, j_end(ihelp)
                  IF (ABS(X(J)) .LT. UTOP) THEN
                     P(J,K,ihelp) = PPLX(J) + XL(K-1)*PPLY(J)
                     PPLX(J) = X(J)*P(J,K,ihelp)
                     PPLY(J) = PPLX(J) - P(J,K-1,ihelp)
                     P(J,LTOP-1,ihelp) = PPLX(J) + XL(LTOP-2)*PPLY(J)
                  ELSE
                     P(J,K,ihelp) = X(J)*P(J,K-1,ihelp)
                  ENDIF
               enddo
            enddo
            
            DO J = 1, j_end(ihelp)
               IF (ABS(X(J)) .LT. UTOP) THEN
                  P(J,LTOP-1,ihelp) = PPLX(J) + XL(LTOP-2)*PPLY(J)
               ELSE
                  P(J,LTOP-1,ihelp) = X(J)*P(J,LTOP-2,ihelp)
               ENDIF
            enddo
         enddo legrende

         CALL STOP_TIMER(time_legendre)
         DTIMLG = DTIMLG + READ_CPU_TIME(time_legendre)
!         DTIMLGw = DTIMLGw + READ_wall_TIME(time_legendre)

!_REAL          SPANEL(:,:) = ZERO
!_COMPLEX       SPANEL(:,:) = CZERO            
!_REAL          spanelus(:,:)=ZERO
!_COMPLEX       spanelus(:,:)=CZERO
!_REAL          HPANEL(:,:) = ZERO
!_COMPLEX       HPANEL(:,:) = CZERO

         jneq_loop: DO JNEQ = 1, NAT

            if (jneq.eq.1) then
               INDATM0 = 0
            else
               indatm0= indatm0+mult(jneq-1)
            endif

            VIFPR4 = FOUR*PI*VOLUME*R(JNEQ)**4

!
!        precompute al(kn) and bl(kn)
!
            CALL START_TIMER(time_albl)

            call make_albl(jneq,r(jneq),ltop-1,NV,'HAM',imin,imax)
            
            CALL STOP_TIMER(time_albl)
            time_albl_tot=time_albl_tot+READ_CPU_TIME(time_albl)
!            time_albl_totw=time_albl_totw+READ_wall_TIME(time_albl)

            IMIN = IOUTER*BLOCKSIZE + 1
            IMAX = MIN(NV, (IOUTER + 1) * BLOCKSIZE)
            IHELP = 0
!DEC$ PARALLEL
            imin1: DO I_g = IMIN, IMAX

               IHELP = IHELP + 1
!
!        calculate phase-factors
!
               CALL START_TIMER(time_phase)

!_REAL         PHASE(:) = ZERO
!_COMPLEX      PHASE(:) = CZERO

               DO JEQ = 1, MULT(JNEQ)
            
                  INDATM = INDATM0 + jeq
                  
                  ARGXX = POS(1,INDATM)*KZZ_ihelp(1,ihelp)
                  ARGXX = ARGXX + POS(2,INDATM)*KZZ_ihelp(2,ihelp)
                  ARGXX = ARGXX + POS(3,INDATM)*KZZ_ihelp(3,ihelp)

                  DO J = 1,j_end(ihelp)
                     ARGX = POS(1,INDATM)*KZZ_tmp(1,J) - ARGXX
                     ARGX = POS(2,INDATM)*KZZ_tmp(2,J) + ARGX
                     ARGX = POS(3,INDATM)*KZZ_tmp(3,J) + ARGX
                     help1(j)=TWO*PI*(ARGX)
                  enddo

!MASS-Lib http://www.rs6000.ibm.com/resource/technology/MASS
!MASS-Lib help_cos(j)-cos(help1(j)), j=1,i
! or vmllib from Intel
!
#if defined (INTEL_VML)
!_REAL            call vdcos(j_end(ihelp),help1,help_cos)
! in case of older mk/vml you may miss vzcis (IPO Error: unresolved : vzcis_)
! Either upgrade the libraries (ifort+mkl) or:
! Comment the line below and uncomment the other ones. 
! Uncomment also all other lines above where   help_tmpcos occurs
!!_COMPLEX         call vzcis(j_end(ihelp),help1,help_exp)
!_COMPLEX         call vdcos(j_end(ihelp),help1,help_tmpcos)
!_COMPLEX         call vdsin(j_end(ihelp),help1,help_tmpsin)
!_COMPLEX         do j = 1, j_end(ihelp)
!_COMPLEX            help_exp(j) = dcmplx(help_tmpcos(j),help_tmpsin(j))
!_COMPLEX         end do
#else
!_REAL            call vcos(help_cos,help1,j_end(ihelp))
!_COMPLEX         call vcosisin(help_exp,help1,j_end(ihelp))
#endif
                  do j = 1,j_end(ihelp)
!_REAL               PHASE(j) = PHASE(j) + help_cos(j)
!_COMPLEX            PHASE(j) = PHASE(j) + help_exp(j)
                  enddo
               enddo 

               CALL STOP_TIMER(time_phase)
               DTIMPH = DTIMPH + READ_CPU_TIME(time_phase)
!               DTIMPHw = DTIMPHw + READ_wall_TIME(time_phase)

               CALL START_TIMER(time_us)
!
!        initialize Overlap-matrixelements with step-function values
!
ins=0    !fixed PB 20.5.08. because alternative (rm case.vns) does not work anymore
               IF (INS .EQ. 0) THEN

                  CALL START_TIMER(time_step_function)

                  do j=1,j_end_1(ihelp)
                     help1(j) = slen(j,ihelp)*r(jneq)
                     tmp_y(j) = help1(j)*help1(j)*help1(j)
                  end do
!MASS-Lib http://www.rs6000.ibm.com/resource/technology/MASS
!MASS-Lib help_sin(j)=sin(help1(j)), help_cos(j)-cos(help1(j))
#if defined (INTEL_VML)
                  call vdsincos(j_end_1(ihelp),help1,help_sin,help_cos)
#else
                  call vsincos(help_sin,help_cos,help1,j_end_1(ihelp))
#endif
!MASS-Lib http://www.rs6000.ibm.com/resource/technology/MASS
!MASS-Lib: tmp_y(j)=1/tmp_y(j), j=1,i-1

#if defined (INTEL_VML)
                  call vdinv(j_end_1(ihelp), tmp_y, tmp_ytmp)
                  call dcopy(j_end_1(ihelp), tmp_ytmp, 1, tmp_y, 1)
#else
                  call vrec(tmp_y,tmp_y,j_end_1(ihelp))
#endif
                  v3=THREE*V(JNEQ)
                  DO J = 1, j_end_1(ihelp)
                     BESR = (help1(j)*help_COS(j)-help_SIN(j))*tmp_y(j)
                     spanelus(j,ihelp)=spanelus(j,ihelp)+v3*PHASE(j)*BESR
                  enddo
#ifdef Parallel
                  ipr=mod((i_g-1)/blocksize,nprow)
                  if (ipr.eq.myrowhs) then
                     i=global2local(i_g,nprow,blocksize)                     
                     spanelus(i,ihelp)=spanelus(i,ihelp)+ONE/DBLE(NAT) - V(JNEQ)*DBLE(MULT(JNEQ))
                  endif
#else
                  spanelus(i_g,ihelp)=spanelus(i_g,ihelp)+ONE/DBLE(NAT) - V(JNEQ)*DBLE(MULT(JNEQ))
#endif
                  CALL STOP_TIMER(time_step_function)
                  DTIMS = DTIMS + READ_CPU_TIME(time_step_function)
                  
                  CALL START_TIMER(time_h)

                  if (jneq.eq.nat) then

                     j=0
                     DO J = 1, j_end(ihelp)
                        SPANEL(J,IHELP) = SPANEL(J,IHELP)+spanelus(j,ihelp)
                        HPANEL(J,IHELP) = HPANEL(J,IHELP)+X2(j,ihelp)*spanelus(j,ihelp)
                        HPANEL(J,IHELP) = HPANEL(J,IHELP) + &
                            WARPIN(KZZ_tmp(1,J)-KZZ_ihelp(1,ihelp), &
                                       KZZ_tmp(2,J)-KZZ_ihelp(2,ihelp), &
                                       KZZ_tmp(3,J)-KZZ_ihelp(3,ihelp))
                     enddo

                  endif

                  CALL STOP_TIMER(time_h)
                  DTIMH = DTIMH + READ_CPU_TIME(time_h)

               ELSE
                     
                  if (jneq.eq.nat) then
                     j=0
                     DO J_g = 1, I_g
#ifdef Parallel
                        ipr=mod((j_g-1)/blocksize,nprow)
                        if (ipr.ne.myrowhs) cycle
#endif
                        j=j+1                        
                        FK(1) = XK(J_g) - XK(I_g)
                        FK(2) = YK(J_g) - YK(I_g)
                        FK(3) = ZK(J_g) - ZK(I_g)
                        SPANEL(J,IHELP) = USTPHX(FK,PHASE,j,NAT,V,R,MULT)
                        HPANEL(J,IHELP) = X2(j,ihelp)*SPANEL(J,IHELP)
                     enddo
                  endif
                  
               ENDIF
!
               CALL STOP_TIMER(time_us)
               DTIMUS = DTIMUS + READ_CPU_TIME(time_us)
!               DTIMUSw = DTIMUSw + READ_wall_TIME(time_us)

               CALL START_TIMER(time_overlap)
!
!        calculate Overlap-matrixelement
!
               TMP1(:) = ZERO
               TMP2(:) = ZERO

               DO L = 0, LTOP - 1

                  if (.not.lapw(l,jneq)) then
                     ATMP = AL_c(ihelp,L)
                     atmp1 = ATMP*U(L,JNEQ)*DU(L,JNEQ)*R(JNEQ)**2
                     DLL1 = DBLE(L+L+1)
                     DO J = 1, j_end(ihelp)
                        TMP3 = DLL1*P(J,L,ihelp)
                        STMP = ATMP*AL_r(J,L)
                        STMP = STMP*TMP3
                        TMP2(J) = TMP2(J) + STMP
                        HTMP = STMP*E(L,JNEQ)
                        TMP1(J) = TMP1(J) + HTMP
!  surface part of kinetic energy 
                        tmp1(j) = tmp1(j)+TMP3*ATMP1*AL_r(J,L)
                     end do
                  else

!....for lapw all BL lines must be kept
                     ATMP = AL_c(ihelp,L)
                     BTMP = BL_c(ihelp,L)*CHN(L,JNEQ)
                     atmp1 = R(JNEQ)**2* (ATMP*U(L,JNEQ)*DU(L,JNEQ) + &
                          BL_c(ihelp,L)*UP(L,JNEQ)*DU(L,JNEQ))
                     btmp1 = R(JNEQ)**2*(BL_c(Ihelp,L)*UP(L,JNEQ)*DUP(L,JNEQ) + &
                          AL_c(ihelp,L)*U(L,JNEQ)*DUp(L,JNEQ))
                     DLL1 = DBLE(L+L+1)

                     DO J = 1, j_end(ihelp)
                        TMP3 = DLL1*P(J,L,ihelp)
                        STMP = ATMP*AL_r(J,L)
                        STMP = STMP + BTMP*BL_r(J,L)
                        STMP = STMP*TMP3
                        TMP2(J) = TMP2(J) + STMP
                        HTMP = ATMP*BL_r(J,L)
                        HTMP = HTMP*TMP3
                        HTMP = HTMP + STMP*E(L,JNEQ)
                        TMP1(J) = TMP1(J) + HTMP
!  surface part of kinetic energy 
                        tmp1(j) = tmp1(j)+&
                             TMP3*(ATMP1*AL_r(J,L)+Btmp1*BL_r(J,L))
                     end do
                  endif
                  
               enddo

               DO J = 1, j_end(ihelp)
                  TMP1(J) = TMP1(J)*VIFPR4
                  TMP2(J) = TMP2(J)*VIFPR4
                  SPANEL(J,IHELP) = SPANEL(J,IHELP) + PHASE(J)*TMP2(J)
                  HPANEL(J,IHELP) = HPANEL(J,IHELP) + PHASE(J)*TMP1(J)
               enddo

               CALL STOP_TIMER(time_overlap)
               DTIMMA = DTIMMA + READ_CPU_TIME(time_overlap)
!               DTIMMAw = DTIMMAw + READ_wall_TIME(time_overlap)

            END DO imin1
         enddo jneq_loop

         CALL start_TIMER(time_distiouter)

         IHELP = 0
         IMIN = IOUTER*BLOCKSIZE + 1
         IMAX = MIN(NV, (IOUTER + 1)* BLOCKSIZE)
         
         DO I_g = IMIN, IMAX
            IHELP = IHELP + 1
#ifdef Parallel
            i=global2local(i_g,npcol,blocksize)
!_REAL      CALL DCOPY(HSdim_r, HPANEL(1,IHELP), 1, H(1,I), 1 )
!_REAL      CALL DCOPY(HSdim_r, SPANEL(1,IHELP), 1, S(1,I), 1 )
!_COMPLEX   CALL ZCOPY(HSdim_r, HPANEL(1,IHELP), 1, H(1,I), 1 )
!_COMPLEX   CALL ZCOPY(HSdim_r, SPANEL(1,IHELP), 1, S(1,I), 1 )            
#else
!_REAL      CALL DCOPY(I_g-1, HPANEL(1,IHELP), 1, HS(I_g, 1), HSrows )
!_REAL      CALL DCOPY(I_g, SPANEL(1,IHELP), 1, HS(1,I_g), 1 )
!_COMPLEX   CALL ZCOPY(I_g-1, HPANEL(1,IHELP), 1, HS(I_g, 1), HSrows )
!_COMPLEX   CALL ZCOPY(I_g, SPANEL(1,IHELP), 1, HS(1,I_g), 1 )
            HSDIAG(I_g)=HPANEL(I_g,IHELP)
#endif
         END DO

         CALL STOP_TIMER(time_distiouter)
         DTIMDIST = DTIMDIST + READ_CPU_TIME(time_distiouter)
!         DTIMDISTw = DTIMDISTw + READ_wall_TIME(time_distiouter)

      END DO iouter_loop

      CALL STOP_TIMER(time_loop260)

      call end_albl('HAM')

      deallocate( PHASE )
      deallocate( spanelus )
!_COMPLEX      deallocate( help_exp )
      deallocate( DEN, DFJ, FJ, FK, rk_tmp )
      deallocate( PPLX, PPLY )
      deallocate( SLEN, TMP1,TMP2)
      deallocate( X, XL, x2 )
      deallocate( P )
      deallocate( help1, help_cos, help_sin )
#if defined (INTEL_VML)
!!_COMPLEX      deallocate( help_tmpcos, help_tmpsin )
      deallocate( tmp_ytmp )
#endif
      deallocate( tmp_y )
      deallocate( kzz_TMP,kzz_ihelp )
      deallocate( j_end, j_end_1)
      deallocate( xk_tmp_r,yk_tmp_r,zk_tmp_r)

      WRITE (6,'(a,2f12.1)') 'Time for al,bl    (hamilt, cpu/wall) : ', time_albl_tot, time_albl_totw
      WRITE (6,'(a,2f12.1)') 'Time for legendre (hamilt, cpu/wall) : ', DTIMLG, DTIMLGw
      WRITE (6,'(a,2f12.1)') 'Time for phase    (hamilt, cpu/wall) : ', DTIMPH, DTIMPHw
      WRITE (6,'(a,2f12.1)') 'Time for us       (hamilt, cpu/wall) : ', DTIMUS, DTIMUSw
      WRITE (6,'(a,2f12.1)') 'Time for overlaps (hamilt, cpu/wall) : ', DTIMMA, DTIMMAw
      WRITE (6,'(a,2f12.1)') 'Time for distrib  (hamilt, cpu/wall) : ', DTIMDIST, DTIMDISTw
!      WRITE (6,'(a,2f12.1)') 'Time sum iouter   (hamilt, cpu/wall) : ', &
!                     READ_CPU_TIME(time_loop260), READ_wall_TIME(time_loop260)

      
      CALL BARRIER
      WRITE (6,'(a,i8)') ' number of local orbitals, nlo (hamilt) ', NLO

       IF (NLO .NE. 0) THEN

      allocate( PHSC(HSrows) )
      allocate( YL(0:(LOMAX+1)*(LOMAX+1)-1,HSrows,multmax) )
      write(6,'(a20,f12.1,a,a,3i6)') 'allocate YL  ',dble((LOMAX+1)*(LOMAX+1)*HSrows*multmax*16)/(1024.0*1024.0),' MB ',&
           '         dimensions',(LOMAX+1)*(LOMAX+1)-1,HSrows,multmax
      write(6,'(a20,f12.1,a,a,i6)') 'allocate phsc',dble(HSrows*16)/(1024.0*1024.0),' MB ',&
           '         dimensions',HSrows

      CALL init_albl(lomax,'HAM')
!
!------ calculate LO-Overlap-matrixelements ---------------------------
!
         CALL START_TIMER(time_lo)
         INDATM = 0
         IATM = 0
         NNLO = 0
!
!        initialize matrixelements
!
!_REAL       HPANEL(1:HSdim_r, 1:BLOCKSIZE)=ZERO
!_REAL       SPANEL(1:HSdim_r, 1:BLOCKSIZE)=ZERO
!_COMPLEX       HPANEL(1:HSdim_r, 1:BLOCKSIZE)=CZERO
!_COMPLEX       SPANEL(1:HSdim_r, 1:BLOCKSIZE)=CZERO


         I_g = NV + 1
         IMIN = I_g
         IMAX = MIN(NV + NLO,((IMIN-1)/(BLOCKSIZE*NPCOL)+1)*BLOCKSIZE*NPCOL)

         jneq_lo_loop: DO JNEQ = 1, NAT
!
!        precompute al(kn) and bl(kn)
!
            call make_albl(jneq,r(jneq),lomax,NV,'HAM',1,0)

            DO JEQ = 1, MULT(JNEQ)
               IATM = IATM + 1
!
!        precalculate spherical-harmonics for Alm, Blm
!
               DO J_g = 1, NV
#ifdef Parallel
                  ipr=mod((j_g-1)/blocksize,nprow)
                  if (ipr.ne.myrowhs) cycle
#endif
                  VEC(1) = XK(J_g)
                  VEC(2) = YK(J_g)
                  VEC(3) = ZK(J_g)
                  CALL ROTATE(VEC,ROTIJ(1,1,IATM),ROTV1)
                  CALL ROTATE(ROTV1,ROTLOC(1,1,JNEQ),ROTV2)
                  CALL YLM(ROTV2,LOMAX,YL(0,J_g,JEQ))
               enddo
            enddo

            PI4R2V = FOUR*PI*SQRT(VOLUME)*R(JNEQ)*R(JNEQ)
            CLR = ONE
            CLI = ZERO
            
            l_loop: DO L = 0, LOMAX
               jlo_loop:  do  jlo=1,ilo(l,jneq)

!        precalculate factors of a,b,c for LO's
!        C1,C2,C3 are needed for the Overlap-matrix update
!        C11,C12,C13 are needed for the Hamilton-matrix update
!
                  C1R  = ALO(L,jlo,JNEQ) + PI12LO(L,JNEQ)*CLO(L,jlo,JNEQ)
                  C1I  = ZERO
                  C2R  = BLO(L,jlo,JNEQ)*CHN(L,JNEQ) + &
                       PE12LO(L,JNEQ)*CLO(L,jlo,JNEQ)
                  C2I  = ZERO
                  C11R = ALO(L,jlo,JNEQ)*E(L,JNEQ) + HALF*(BLO(L,jlo,JNEQ) + &
                       CLO(L,jlo,JNEQ)*PI12LO(L,JNEQ)*(ELO(L,jlo,JNEQ) + &
                       E(L,JNEQ)))
                  C11I = ZERO
                  C12R = BLO(L,jlo,JNEQ)*CHN(L,JNEQ)*E(L,JNEQ) + &
                       HALF*(ALO(L,jlo,JNEQ) + CLO(L,jlo,JNEQ)* &
                       PI12LO(L,JNEQ) + CLO(L,jlo,JNEQ)*PE12LO(L,JNEQ)* &
                       (ELO(L,jlo,JNEQ) + E(L,JNEQ)))
                  C12I = ZERO
                  C3R  = CLO(L,jlo,JNEQ) +  &
                       PE12LO(L,JNEQ)*BLO(L,jlo,JNEQ) + &
                       ALO(L,jlo,JNEQ)*PI12LO(L,JNEQ)
                  C3I  = ZERO
                  C13R = CLO(L,jlo,JNEQ)*ELO(L,jlo,JNEQ)+ &
                       HALF*(BLO(L,jlo,JNEQ)*PI12LO(L,JNEQ) + &
                       (ELO(L,jlo,JNEQ) + E(L,JNEQ))*(ALO(L,jlo,JNEQ)* &
                       PI12LO(L,JNEQ)+PE12LO(L,JNEQ)*BLO(L,jlo,JNEQ)))
                  C13I = ZERO
!
! Precalculate surface kinetic energy factor
!
                  akinloR=HALF*r(jneq)**2*(alo(l,jlo,jneq)*DU(L,JNEQ)+ &
                       blo(l,jlo,jneq)*DUP(L,JNEQ)+ &
                       CLO(l,jlo,jneq)*DPLO(l,jneq))

                  jeq0_loop: DO JEQO = 1, MULT(JNEQ)
                     mo1_loop:   DO MO1 = -L, L

                        NNLO = NNLO + 1

!
!    compute only local columns
!
                        IHELP=MOD(I_g,BLOCKSIZE)
                        IF (IHELP.EQ.0) IHELP=BLOCKSIZE

                        jeq_loop: DO JEQ = 1, MULT(JNEQ)
                           INDATM = INDATM + 1

!
!        calculate spherical-harmonics for Alm, Blm
!
                           VEC(1) = XK(NV+NNLO)
                           VEC(2) = YK(NV+NNLO)
                           VEC(3) = ZK(NV+NNLO)
                           CALL ROTATE(VEC,ROTIJ(1,1,INDATM),ROTV1)
                           CALL ROTATE(ROTV1,ROTLOC(1,1,JNEQ),ROTV2)
                           jnnlo=NV+NNLO
                           CALL YLM(ROTV2,L,YL(0,jnnlo,JEQ))

#ifdef Parallel
                           ipc=mod((i_g-1)/blocksize,npcol)
                           if (ipc.eq.mycolhs) then
#endif                           
!
!        determine phase factor
!
                              DO J_g = 1, NV + NNLO
#ifdef Parallel
                                    ipr=mod((j_g-1)/blocksize,nprow)
                                    if (ipr.ne.myrowhs) cycle
#endif                           

                                 ARGX = POS(1,INDATM)*(KZZ(1,NV+NNLO)-KZZ(1,J_g))
                                 ARGY = POS(2,INDATM)*(KZZ(2,NV+NNLO)-KZZ(2,J_g))
                                 ARGZ = POS(3,INDATM)*(KZZ(3,NV+NNLO)-KZZ(3,J_g))
                                 ARGX=(ARGX+ARGY+ARGZ)*TWO*PI
                                 PHSC(J_g) = DCMPLX(DCOS(ARGX),-DSIN(ARGX) )
!                                 PHSC(J_g) = EXP(CMPLX(ZERO,TWO)*PI*(ARGX+ARGY+ARGZ))
!                                 PHSC(J_g) = DCONJG(PHSC(J_g))
                              enddo

                              mo2_loop: DO MO2 = -L, L

                                 L0M0 = L*(L+1) + MO2
!
!        CLYLN = (CIMAG**L)*DCONJG(YL(L0M0,NV+NNLO,JEQ))
!
                                 CLYLNR = CLR*DBLE(YL(L0M0,jNNLO,JEQ)) + &
                                      CLI*DIMAG(YL(L0M0,jNNLO,JEQ))
                                 CLYLNI = CLI*DBLE(YL(L0M0,jNNLO,JEQ)) - &
                                      CLR*DIMAG(YL(L0M0,jNNLO,JEQ))
!
!        update C1,C2,C3 for the Overlap-matrix 
!        C11,C12,C13 for the Hamilton-matrix 
!
                                 C1RUP = C1R*CLYLNR*PI4R2V
                                 C1IUP = C1R*CLYLNI*PI4R2V
                                 C2RUP = C2R*CLYLNR*PI4R2V
                                 C2IUP = C2R*CLYLNI*PI4R2V
                                 C3RUP = C3R*CLYLNR*PI4R2V
                                 C3IUP = C3R*CLYLNI*PI4R2V
                                 C11RUP = C11R*CLYLNR*PI4R2V
                                 C11IUP = C11R*CLYLNI*PI4R2V
                                 C12RUP = C12R*CLYLNR*PI4R2V
                                 C12IUP = C12R*CLYLNI*PI4R2V
                                 C13RUP = C13R*CLYLNR*PI4R2V
                                 C13IUP = C13R*CLYLNI*PI4R2V
                                 akinlorup = akinloR*CLYLNR*PI4R2V
                                 akinloiup = akinloR*CLYLNI*PI4R2V

                                 j=0
                                 DO J_g = 1, NV
#ifdef Parallel
                                    ipr=mod((j_g-1)/blocksize,nprow)
                                    if (ipr.ne.myrowhs) cycle
#endif
                                    j=j+1

!
!        CLYL = (CIMAG**L)*DCONJG(YL(L0M0,J,JEQ))
!
                                    CLYLR = CLR*DBLE(YL(L0M0,J_g,JEQ)) + &
                                         CLI*DIMAG(YL(L0M0,J_g,JEQ))
                                    CLYLI = CLI*DBLE(YL(L0M0,J_g,JEQ)) - &
                                         CLR*DIMAG(YL(L0M0,J_g,JEQ))
!
!        C6 ... Alm, C7 ... Blm
!
                                    C6R = CLYLR*AL_r(J,L)*PI4R2V
                                    C6I = CLYLI*AL_r(J,L)*PI4R2V
                                    C7R = CLYLR*BL_r(J,L)*PI4R2V
                                    C7I = CLYLI*BL_r(J,L)*PI4R2V
!
!        CAB = DCONJG(C6)*C1 + DCONJG(C7)*C2
!
                                    CABR =  C6R*C1RUP + C7R*C2RUP + &
                                         C6I*C1IUP + C7I*C2IUP
                                    CABI = -C6I*C1RUP - C7I*C2RUP + &
                                         C6R*C1IUP + C7R*C2IUP
!
!        update Overlap-matrix
!
                                    SPANEL(J,IHELP) = SPANEL(J,IHELP) + &
                                         (CABR*DBLE(PHSC(J_g)) - CABI*DIMAG(PHSC(J_g)))
!_COMPLEX                           SPANEL(J,IHELP) = SPANEL(J,IHELP) +  &
!_COMPLEX                                    CIMAG *  (CABR*DIMAG(PHSC(J_g)) + &
!_COMPLEX                                             CABI*DBLE(PHSC(J_g)))
!
!        CAB = DCONJG(C6)*C11 + DCONJG(C7)*C12
!
                                    CABR =  C6R*C11RUP + C7R*C12RUP + &
                                         C6I*C11IUP + C7I*C12IUP
                                    CABI = -C6I*C11RUP - C7I*C12RUP + &
                                         C6R*C11IUP + C7R*C12IUP
                                    akinbr = (c6r*u(l,jneq)+c7r*up(l,jneq)) * akinlorup &
                                         +(c6i*u(l,jneq)+c7i*up(l,jneq)) * akinloiup
                                    akinbi = (c6r*u(l,jneq)+c7r*up(l,jneq)) * akinloiup &
                                         -(c6i*u(l,jneq)+c7i*up(l,jneq)) * akinlorup
!
!        update Hamilton-matrix
!
                                    HPANEL(J,IHELP) = HPANEL(J,IHELP) + &
                                         (CABR*DBLE(PHSC(J_g)) - CABI*DIMAG(PHSC(J_g)))
!_COMPLEX                           HPANEL(J,IHELP) = HPANEL(J,IHELP) + &
!_COMPLEX                                       CIMAG *(CABR*DIMAG(PHSC(J_g)) + &
!_COMPLEX                                               CABI*DBLE(PHSC(J_g)))
                                    HPANEL(J,IHELP) = HPANEL(J,IHELP) + &
                                         (akinbr*DBLE(PHSC(J_g)) - &
                                         akinbi*DIMAG(PHSC(J_g)))
!_COMPLEX                           HPANEL(J,IHELP) = HPANEL(J,IHELP) +  &
!_COMPLEX                                     CIMAG *  (akinbr*DIMAG(PHSC(J_g)) + &
!_COMPLEX                                               akinbi*DBLE(PHSC(J_g)))
                                 enddo
                                 
                                 jlop_loop: do  jlop=1,jlo
                                    DO J_g = NV + NNLO - (MO1+L) - &
                                         (2*L+1)*(JEQO-1) - (jlo-jlop)*(2*l+1)*mult(jneq), NV + NNLO &
                                         -(jlo-jlop)*(MO1+L+1)-(jlo-jlop)*(JEQO-1)*(2*l+1)
#ifdef Parallel
                                       ipr=mod((j_g-1)/blocksize,nprow)
                                       if (ipr.ne.myrowhs) cycle
                                       j=global2local(j_g,nprow,blocksize)
#else
                                       j=j_g
#endif

!
!        CLYL = (CIMAG**L)*DCONJG(YL(L0M0,J,JEQ))
!
                                       CLYLR = CLR*DBLE(YL(L0M0,J_g,JEQ)) + &
                                            CLI*DIMAG(YL(L0M0,J_g,JEQ))
                                       CLYLI = CLI*DBLE(YL(L0M0,J_g,JEQ)) - &
                                            CLR*DIMAG(YL(L0M0,J_g,JEQ))
!
!        C6 ... CONJG(ALOlm), C7 ... CONJG(BLOlm), C8 ... CONJG(CLOlm)
!
                                       C6R = CLYLR*ALO(L,jlop,JNEQ)*PI4R2V
                                       C6I = CLYLI*ALO(L,jlop,JNEQ)*PI4R2V
                                       C7R = CLYLR*BLO(L,jlop,JNEQ)*PI4R2V
                                       C7I = CLYLI*BLO(L,jlop,JNEQ)*PI4R2V
                                       C8R = CLYLR*CLO(L,jlop,JNEQ)*PI4R2V
                                       C8I = CLYLI*CLO(L,jlop,JNEQ)*PI4R2V
!                                                         
!        CABC = DCONJG(C6)*C1 + DCONJG(C7)*C2 + DCONJG(C8)*C3
!
                                       CABCR =  C6R*C1RUP + C6I*C1IUP + &
                                            C7R*C2RUP + C7I*C2IUP +    &
                                            C8R*C3RUP + C8I*C3IUP    
                                       CABCI = -C6I*C1RUP + C6R*C1IUP - &
                                            C7I*C2RUP + C7R*C2IUP - &
                                            C8I*C3RUP + C8R*C3IUP 
!
!        update Overlap-matrix
!
                                       SPANEL(J,IHELP) = SPANEL(J,IHELP) + &
                                            (CABCR*DBLE(PHSC(J_g)) - &
                                            CABCI*DIMAG(PHSC(J_g)))
!_COMPLEX                              SPANEL(J,IHELP) = SPANEL(J,IHELP) +  &
!_COMPLEX                                     CIMAG *  (CABCR*DIMAG(PHSC(J_g)) + &
!_COMPLEX                                               CABCI*DBLE(PHSC(J_g)))
!
!        CABC = DCONJG(C6)*C11 + DCONJG(C7)*C12 + DCONJG(C8)*C13
!
                                       CABCR =  C6R*C11RUP + C6I*C11IUP + &
                                            C7R*C12RUP + C7I*C12IUP +    &
                                            C8R*C13RUP + C8I*C13IUP    
                                       CABCI = -C6I*C11RUP + C6R*C11IUP - &
                                            C7I*C12RUP + C7R*C12IUP - &
                                            C8I*C13RUP + C8R*C13IUP 
!
!        update Hamilton-matrix
!
                                       HPANEL(J,IHELP) = HPANEL(J,IHELP) + &
                                            (CABCR*DBLE(PHSC(J_g)) - &
                                            CABCI*DIMAG(PHSC(J_g)))
!_COMPLEX                                 HPANEL(J,IHELP) = HPANEL(J,IHELP) +  &
!_COMPLEX                                      CIMAG * (CABCR*DIMAG(PHSC(J_g)) + &
!_COMPLEX                                               CABCI*DBLE(PHSC(J_g)))
                                    enddo

                                 enddo jlop_loop
                              enddo mo2_loop
#ifdef Parallel
                           endif
#endif

                              
                        enddo jeq_loop

                        INDATM = INDATM - MULT(JNEQ)

                        IF(I_g.EQ.IMAX) THEN

                           ii=0
                           DO II_g = IMIN, IMAX
#ifdef Parallel
                              ipc=mod((ii_g-1)/blocksize,npcol)
                              if (ipc.ne.mycolhs) cycle
                              ii=global2local(ii_g,npcol,blocksize)
                              IHELP = MOD(II_g,BLOCKSIZE)
                              IF(IHELP.EQ.0) IHELP=BLOCKSIZE
!_REAL                              CALL DCOPY(HSdim_r, HPANEL(1,IHELP), 1, H(1,II), 1 )
!_REAL                              CALL DCOPY(HSdim_r, SPANEL(1,IHELP), 1, S(1,II), 1 )
!_COMPLEX                           CALL ZCOPY(HSdim_r, HPANEL(1,IHELP), 1, H(1,II), 1 )
!_COMPLEX                           CALL ZCOPY(HSdim_r, SPANEL(1,IHELP), 1, S(1,II), 1 )
#else
                              ii=ii+1
                              ii=ii_g
                              IHELP = MOD(II,BLOCKSIZE)
                              IF(IHELP.EQ.0) IHELP=BLOCKSIZE
!_REAL                              CALL DCOPY(II-1, HPANEL(1,IHELP), 1, HS(II, 1), HSROWS )
!_REAL                              CALL DCOPY(II, SPANEL(1,IHELP), 1, HS(1,II), 1 )
!_COMPLEX                           CALL ZCOPY(II-1, HPANEL(1,IHELP), 1, HS(II, 1), HSROWS )
!_COMPLEX                           CALL ZCOPY(II, SPANEL(1,IHELP), 1, HS(1,II), 1 )
                              HSDIAG(II)=HPANEL(II,IHELP)
#endif
                           END DO

!_REAL                           HPANEL(:,:)=ZERO
!_REAL                           SPANEL(:,:)=ZERO
!_COMPLEX                        HPANEL(:,:)=CZERO
!_COMPLEX                        SPANEL(:,:)=CZERO
                           IMIN = IMAX+1
                           IMAX = MIN(NV + NLO,((IMIN-1)/(BLOCKSIZE*NPCOL)+1)*BLOCKSIZE*NPCOL)
                              
                        END IF
                        I_g = I_g + 1
                        
                     enddo mo1_loop
                  enddo jeq0_loop
                  
               enddo jlo_loop
            enddo l_loop
            TMP = CLR
            CLR = -CLI
            CLI = TMP
            INDATM = INDATM + MULT(JNEQ)

         enddo jneq_lo_loop

         nevertrue: if(imax.gt.imin) THEN
            ii=0
            DO II_g = IMIN, IMAX
#ifdef Parallel
               ipc=mod((ii_g-1)/blocksize,npcol)
               if (ipc.ne.mycolhs) cycle
               ii=global2local(ii_g,npcol,blocksize)
               IHELP = MOD(II_g,BLOCKSIZE)
               IF(IHELP.EQ.0) IHELP=BLOCKSIZE
!_REAL         CALL DCOPY(HSdim_r, HPANEL(1,IHELP), 1, H(1,II), 1 )
!_REAL         CALL DCOPY(HSdim_r, SPANEL(1,IHELP), 1, S(1,II), 1 )
!_COMPLEX      CALL ZCOPY(HSdim_r, HPANEL(1,IHELP), 1, H(1,II), 1 )
!_COMPLEX      CALL ZCOPY(HSdim_r, SPANEL(1,IHELP), 1, S(1,II), 1 )
#else
               ii=ii+1
               ii=ii_g
               IHELP = MOD(II,BLOCKSIZE)
               IF(IHELP.EQ.0) IHELP=BLOCKSIZE
!_REAL         CALL DCOPY(II-1, HPANEL(1,IHELP), 1, HS(II, 1), HSROWS )
!_REAL         CALL DCOPY(II, SPANEL(1,IHELP), 1, HS(1,II), 1 )
!_COMPLEX      CALL ZCOPY(II-1, HPANEL(1,IHELP), 1, HS(II, 1), HSROWS )
!_COMPLEX      CALL ZCOPY(II, SPANEL(1,IHELP), 1, HS(1,II), 1 )
               HSDIAG(II)=HPANEL(II,IHELP)
#endif
            END DO
         END IF nevertrue
	 CALL STOP_TIMER(time_lo)
!         WRITE (6,'(a,2f12.1)') 'Time for los      (hamilt, cpu/wall) : ', &
!                             READ_CPU_TIME(time_lo),READ_wall_TIME(time_lo)

         deallocate( PHSC )
         deallocate( YL )
         call end_albl('HAM')

      ENDIF

      deallocate( hpanel,spanel)
      deallocate(ROTV2,ROTV1,VEC)

      CALL BARRIER

      RETURN
!
!         End of 'HAMILT'
!
      END


More information about the Wien mailing list