
2 PGI Workstation 4.1
Release Notes


This document describes changes between PGI Workstation 4.1 and previous releases, as well as late-breaking information not included in the current printing of the PGI User's Guide.

2.1 Workstation 4.1 Contents

The PGI Workstation 4.1 includes the following components:

Depending on the product you purchased, you may not have received all of the above components.

2.2 Supported Systems and Licensing

PGI Workstation 4.1 is supported on systems using the Intel Pentium or Pentium Pro/II/III/4 or compatible processors, including AMD Athlon/AthlonXP, and running either Linux with a kernel version of 2.2.10 or above, or a Win32 operating system, including NT 4.0, Win98, Win2K, WinME, and WinXP. Supported versions of Linux include those that use glibc2.1.x or greater, such as Red Hat 6.0 to 8.0 and SuSE 6.1 to 8.1. For more information about release levels and operating systems supported, go to http://www.pgroup.com/faq/install.htm.

The compilers and tools are license-managed. For Workstation products using PGI-style licensing (the default), a single user can run as many simultaneous copies of the compiler as desired, on a single system, and no license daemon or Ethernet card is required. However, usage of the compilers and tools is restricted to a pre-specified username. If you would like our compilers and tools to be usable under any username, you must request FLEXlm-style license keys and use FLEXlm-style licensing. See section 1, The PGI Workstation 4.1 Installation Notes, for a more detailed description of licensing. Many questions about online license generation are answered at http://www.pgroup.com/faq/install.htm#install1m.

2.3 New Features

Following are the new features included in The PGI Workstation 4.1:

2.4 New Compiler Options

2.4.1 New Generic Options

Several new or updated generic compiler options (options which apply to all of our compilers) are present in release 4.1. These are in addition to those previously documented in the 4.0 Release Notes (http://www.pgroup.com/docs.htm). Some prior switches are mentioned here as well, to provide context or clarify their use.

* -Mscalarsse - Uses SSE and SSE2 instructions for scalar code on CPUs that support them. Pentium III and AthlonXP CPUs support SSE instructions, while the Pentium 4 also supports SSE2 instructions. In prior releases only the vectorizing optimizations used these instructions; they are now used wherever the compiler finds an opportunity. Note: some versions of Linux and cygwin have assemblers that do not support the newer SSE instructions. This switch should only be used if your assembler `as' accepts these instructions. Older versions of Linux (for example, Red Hat 6.2) do not accept the SSE2 instructions.

* -fastsse - This switch combines a set of optimizations that we believe to work well together to improve performance. This switch is meant only for machines with SSE instruction type support, as in the Pentium III and AthlonXP. It will also work on Pentium 4s, and supports SSE and SSE2 instruction types in that case. See the warning above about your assembler version.

* -Mnontemporal - This switch is worth noting when using the -fastsse and -Mvect=sse options. Some programs run slower with -fastsse, due to the prefetching used. -Mnontemporal offers a different data movement scheme. When tuning code, it is good practice to try `-fastsse -Mnontemporal' in addition to `-fastsse' alone, to see which is faster.
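
The two checks suggested in the notes above (does the assembler accept SSE2 instructions, and is -Mnontemporal faster for a given code) could be scripted roughly as follows. This is only a sketch: the names FC, SRC, DRYRUN, as_accepts and build_cmd are chosen for this example, prog.f90 stands in for your own source file, and the probe instruction cvtsi2sd is simply one SSE2 instruction picked for the test. By default the script only prints the compile commands.

```shell
#!/bin/sh
# Hypothetical names for this sketch: FC, SRC, DRYRUN, as_accepts, build_cmd.
FC=${FC:-pgf90}
SRC=${SRC:-prog.f90}

# Return 0 if the system assembler `as` accepts the instruction in $1.
# With no input file named, GNU as reads its source from standard input.
as_accepts() {
    tmp=${TMPDIR:-/tmp}/as_probe_$$.o
    printf '%s\n' "$1" | as -o "$tmp" 2>/dev/null
    status=$?
    rm -f "$tmp"
    return $status
}

# Print the compile command for flag set $1 and output name $2.
build_cmd() {
    printf '%s %s -o %s %s\n' "$FC" "$1" "$2" "$SRC"
}

if as_accepts 'cvtsi2sd %eax,%xmm0'; then
    echo "assembler accepts SSE2 instructions"
else
    echo "assembler is missing or rejects SSE2 instructions"
fi

# Build both variants; set DRYRUN=0 to actually invoke the compiler.
for n in 1 2; do
    case $n in
        1) flags="-fastsse" ;;
        2) flags="-fastsse -Mnontemporal" ;;
    esac
    cmd=$(build_cmd "$flags" "prog_$n")
    echo "+ $cmd"
    [ "${DRYRUN:-1}" = 1 ] || $cmd
done
```

After a real build (DRYRUN=0), run prog_1 and prog_2 on the same input and compare their run times to see which flag set suits the code.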

2.4.2 New Win32 Compiler Options

No new Win32 specific compiler options are present in 4.1.

2.5 Problems Corrected in 4.1

The following problems were corrected in the current release. A description of the problem is given, but some problems can only be described in general terms because of complexity or confidentiality.

Technical Problem Reports (TPRs) Corrected in 4.1-1


TPR: 2823
Lang: pgf90
Description: Reading real `3.2e4' with `I' edit descriptor in list directed integer input did not cause an error message.
Symptom: No error message produced during execution.

TPR: 2826
Lang: pgf90
Description: Intrinsic call with a derived data type caused an internal compiler error.
Symptom: pgf90 -c -Mfree xx.f
         ICE: mkexpr1: bad id 14 (xx.f: 185)

TPR: 2835
Lang: pgf90
Description: Fails to detect error with -Mstandard:
             read(...,...,SIZE=n) ...
             which should read
             read(...,...,SIZE=n,ADVANCE='NO') ...
Symptom: No compiler error detected.

TPR: 2837
Lang: pgf90
Description: A merge operation causes a false -Mbounds error.
Symptom: 0: Subscript out of range for ...

TPR: 2839
Lang: pgcc
Description: pgcc generates assembly output with -Kpic that is invalid:
             cvtsi2ss $88,%xmm0
Symptom: pgcc xx.c -fastsse -Kpic
         Error: suffix or operands invalid for `cvtsi2ss'

TPR: 2847
Lang: ALL
Description: It must be possible to undefine __STDC__ and __PGI, but pgxx -U__STDC__ fails.
Symptom: __STDC__, __PGI and other symbols remain defined.

TPR: 2849
Lang: pgf90 pgf77
Description: Programs compiled with -i8 fail unless IO specifiers like IOSTAT are declared integer*4 or logical*4.
Symptom: Compiler error messages.

TPR: 2851
Lang: pgf90
Description: -Mstandard was not flagging as an error our f90 extension that allows derived type members to be allocatable.
Symptom: pgf90 -Mstandard x.f does not report errors.

TPR: 2855
Lang: pgf90 pgCC
Description: pgf90 -U does not work. Related to 2847.
Symptom: pgf90 -U xxx does not become -undef xxx

TPR: 2865
Lang: pgCC
Description: s=NULL;cout<<s fails
Symptom: Program fails.

TPR: 2867
Lang: pgCC
Description: pgCC --one_instantiation_per_object p.cc
Symptom: C++ prelinker: executing --inf-loops

TPR: 2872
Lang: pgf90
Description: Array constructors with 2 or more implied DOs using the same DO index fail.
Symptom: x = (/ (x(j), j=Nx/2+1, Nx), (x(j), j=1, Nx/2) /) fails

TPR: 2873
Lang: pgf90
Description: A function declared in the interface block of a module and later, in a program contained in that module, passed as an ARGUMENT to another subprogram.
Symptom: Internal compiler error. Errors in Lowering

TPR: 2874
Lang: pgf90
Description: SCALE function in f90 gives wrong results for double precision.
Symptom: Wrong answers.

TPR: 2881
Lang: pgf90
Description: DOT_PRODUCT(CONJG(a(k1:k2)),u(i(k1:k2))) returns wrong results.
Symptom: Wrong results.

TPR: 2882
Lang: pgCC
Description: C++ ignores `pragma omp' in templates.
Symptom: 1 thread in omp areas.

TPR: 2902
Lang: pgf90
Description: -fastsse causes internal compiler error.
Symptom: ICE:replace_invar: nonzero subsc stride

TPR: 2907
Lang: pgf90
Description: matmul gives wrong complex answers.
Symptom: Wrong answers.

TPR: 2912
Lang: pgf90
Description: Program causes Internal Compiler Error.
Symptom: ICE. exp_ref:IM_BASE op#2 not based sym

TPR: 2916
Lang: pgf77 pgf90
Description: OPEN(...,...,...,CONVERT='NATIVE') should ignore -byteswapio.
Symptom: Results swapped when they shouldn't be.


2.6 PGCC C and C++ Compiler Notes

The Rogue Wave Standard Template Library has been replaced with STLport, version 4.5. Rogue Wave is no longer supported. Users should look at the STLport license for any usage issues.

2.7 The PGI Workstation 4.1 and libpthread

Previous releases of our Linux compiler products have included a customized version of libpthread.so called libpgthread.so. The purpose of this library is to give the user more thread stack space to run OpenMP and -Mconcur compiled programs. With Red Hat 8.0 and equivalent releases, we are seeing libpthread.so and libpthread.a with `re-sizeable' thread stack areas. In these cases:

1. The filename $PGI/linux86/lib/libpgthread.so is a soft link to /usr/lib/libpthread.so.

2. Instead of `setenv MPSTKZ 256M', for example, to increase the libpgthread.so thread stack area, the csh shell command `limit stacksize 256M' will now apply to thread stacks.
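
The `limit' command above is a csh built-in. For users of sh or bash, the corresponding built-in is ulimit, which takes the stack size in kilobytes. The following is a minimal sketch under that assumption; mb_to_kb is a helper name invented for this example, and -S restricts the change to the soft limit so that it can be changed again later in the same shell.

```shell
#!/bin/sh
# Convert megabytes to the kilobyte units that `ulimit -s` expects.
# mb_to_kb is a hypothetical helper used only in this sketch.
mb_to_kb() {
    echo $(( $1 * 1024 ))
}

# Raise the soft stack limit to 256 MB for this shell and its children.
if ulimit -S -s "$(mb_to_kb 256)" 2>/dev/null; then
    echo "stack limit now $(ulimit -s) KB"
else
    echo "could not change stack limit; current value: $(ulimit -s)" >&2
fi
```

The setting applies only to programs started from the same shell afterwards, so it belongs in the script or session that launches the OpenMP or -Mconcur program.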

2.8 The PGI Workstation 4.1 and glibc

Release 4.1 of the PGI Workstation compilers and tools is built and validated under Linux kernels 2.2.10 through 2.4.x. Distributions of Linux from Red Hat 6.0 to 8.0 and SuSE 6.1 to 8.1 incorporate revision 2.2.10 or greater of the Linux kernel and glibc2.1.x or greater. If you are using a version of Linux that is supported by our current release, the PGI installation script will automatically detect it and modify your installation as appropriate for that system.

2.9 The PGI Workstation 4.1 for Win32

2.9.1 Workstation Shell Environment

On Win32, a UNIX-like shell environment is bundled with the Workstation. After installation, a double-left-click on the Workstation icon on your desktop will launch a bash shell command window with pre-initialized environment settings. Most familiar UNIX commands are available (vi, emacs, sed, grep, awk, make, etc). If you are unfamiliar with the bash shell, refer to the user's guide included with the online HTML documentation.

Alternatively, you can launch a standard Win32 command window pre-initialized for usage of the compilers by selecting the appropriate option from the Workstation program group accessed in the usual way through the "Start" button.

Except where noted in the User's Guide, the command-level compilers and tools on Win32 function identically to their UNIX counterparts. You can customize your command window (white background with black text, add a scroll bar, etc.) by right-clicking on the top border of the PGI Workstation command window, selecting "Properties", and making the appropriate modifications. When the changes are complete, Win32 will allow you to apply the modifications globally to any command window launched using the Workstation desktop icon.

2.9.2 PGI Compilers and Mingw32

The PGI Workstation 4.1 has updated the mingw32 components it utilizes to maintain a UNIX-like development environment under Windows. With this release, we have the ability to create DLLs without having to use the cygwin environment.

2.9.3 Creating DLLs with PGI Compilers

Dynamically linked libraries, or DLLs, are a useful way for libraries to be shared and updated without relinking the application each time they are modified. However, not all libraries can be converted to DLLs, and it is important to understand the restrictions placed upon DLL candidates in order to use them successfully.

DLLs, in the way they are implemented here, are like an independent process that is started each time a DLL routine is called and stopped each time the code exits the DLL, to return to the process of the calling routine.

DLLs must be internally consistent, and creating them involves linking in all routines called by the main DLL routines. This linking could involve yet another DLL.

Data is not shared between the calling process and the DLL, so any data initialization must be performed within the DLL itself. The only information passed automatically to DLL routines is that contained in the arguments.

There is more information about sharing data with DLLs at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore/html/_core_how_do_i_share_data_in_my_dll_with_an_application_or_with_other_dlls.3f.asp

Information about mingw32 tools for creating DLLs can be found at http://www.mingw.org/ and at http://www.nanotech.wisc.edu/%7Ekhan/software/gnu-win32/.

We present the steps first without the shorthand of dllwrap.

The steps in creating a DLL libfoo.dll are:

1. Use dlltool to create an output-def file, foo.def, listing all of the objects in the DLL that you wish to call from outside the DLL.

2. Use ld, the DLL objects, the mingw32 dllcrt1.o object, and any necessary libraries to link the DLL in order to create a base-file, foo.bas. The linker output file can be discarded on this step (see Step 2 of the example below and notice the unused output file trash.dll).

3. Use dlltool, foo.def, and foo.bas to create the output-exp file, foo.exp.

4. Use ld, foo.exp, the DLL objects, the mingw32 dllcrt1.o object, and any necessary libraries to create the output DLL, libfoo.dll.

Example 1: Create a DLL for the pgf77 object foo.o, called by the main program in main.f:

c     main.f: calls foo, which will live in libfoo.dll
      program test
      integer n
      print *,"(main) calling foo"
      n=10
      call foo(n)
      print *,"(main) foo called"
      stop
      end

The subroutine, in foo.f:

c     foo.f: fills x with 1..n and prints the sum
      subroutine foo(n)
      integer n,i
      real x(n),y
      print *,"(foo) n=",n
      y=0
      do i=1,n
         x(i)=i
         y=y+x(i)
      end do
      print *,"total is",y
      return
      end

Step 1 - Use dlltool to create an output-def file (foo.def) listing all of the objects in the DLL that you wish to call from outside the DLL.

pgf77 -c foo.f
dlltool --export-all --output-def foo.def foo.o

Step 2 - Use ld, the DLL objects, the mingw32 dllcrt1.o object, and any necessary libraries to link the DLL in order to create a base-file (foo.bas). The linker output file can be discarded on this step. Note: we are using the mingw32 entry point _DllMainCRTStartup@12.

ld --base-file foo.bas \
--dll --entry _DllMainCRTStartup@12 \
-o trash.dll foo.o \
C:/PGI/nt86/mingw/lib/dllcrt1.o \
C:/PGI/nt86/lib/libpgftnrtl.a \
C:/PGI/nt86/lib/libpgc.a \
C:/PGI/nt86/mingw/lib/libmingw32.a \
C:/PGI/nt86/mingw/lib/gcc-lib/mingw32/2.95.3-5/libgcc.a \
C:/PGI/nt86/mingw/lib/libmoldname.a \
C:/PGI/nt86/mingw/lib/libmsvcrt.a \
C:/PGI/nt86/mingw/lib/libuser32.a \
C:/PGI/nt86/mingw/lib/libkernel32.a \
C:/PGI/nt86/mingw/lib/libadvapi32.a \
C:/PGI/nt86/mingw/lib/libcrtdll.a \
C:/PGI/nt86/mingw/lib/libshell32.a
rm -rf trash.dll

Step 3 - Use dlltool, foo.def, and foo.bas to create the output-exp file (foo.exp).

dlltool --base-file foo.bas --output-exp foo.exp \
--def foo.def

Step 4 - Use ld, foo.exp, the DLL objects, the mingw32 dllcrt1.o object, and any necessary libraries to create the output DLL (libfoo.dll).

ld --base-file foo.bas foo.exp \
--dll --entry _DllMainCRTStartup@12 \
-o libfoo.dll foo.o \
C:/PGI/nt86/mingw/lib/dllcrt1.o \
C:/PGI/nt86/lib/libpgftnrtl.a \
C:/PGI/nt86/lib/libpgc.a \
C:/PGI/nt86/mingw/lib/libmingw32.a \
C:/PGI/nt86/mingw/lib/gcc-lib/mingw32/2.95.3-5/libgcc.a \
C:/PGI/nt86/mingw/lib/libmoldname.a \
C:/PGI/nt86/mingw/lib/libmsvcrt.a \
C:/PGI/nt86/mingw/lib/libuser32.a \
C:/PGI/nt86/mingw/lib/libkernel32.a \
C:/PGI/nt86/mingw/lib/libadvapi32.a \
C:/PGI/nt86/mingw/lib/libcrtdll.a \
C:/PGI/nt86/mingw/lib/libshell32.a

Finally, create the executable test

pgf77 -o test main.f libfoo.dll

In order to determine if you have successfully created a DLL, run test

PGI$ test
(main) calling foo
(foo) n= 10
total is 55.00000
(main) foo called
FORTRAN STOP

After running test, modify the loop in foo.f to be

      do i=1,n
         x(i)=i
         y=y+2.0*x(i)
      end do

Then rebuild only libfoo.dll, without relinking test. If running test now produces

PGI$ test
(main) calling foo
(foo) n= 10
total is 110.00000
(main) foo called
FORTRAN STOP

then dynamic linkage is working.
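
Because the DLL is rebuilt every time foo.f changes, it can help to collect the four steps of Example 1 into one script. The following is a sketch only, not a supported tool: PGIROOT, DRYRUN, run and build_dll are names chosen for this example, and the library list is copied from the link commands above. By default the script only prints the commands; set DRYRUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: PGIROOT, DRYRUN, run and build_dll are names chosen for this
# example. Library paths are copied from the link commands in Example 1.
PGIROOT=${PGIROOT:-C:/PGI/nt86}
MINGW=$PGIROOT/mingw/lib
ENTRY=_DllMainCRTStartup@12

# Objects and libraries passed to both ld invocations, in the original order.
LINK_INPUTS="foo.o $MINGW/dllcrt1.o \
$PGIROOT/lib/libpgftnrtl.a $PGIROOT/lib/libpgc.a \
$MINGW/libmingw32.a $MINGW/gcc-lib/mingw32/2.95.3-5/libgcc.a \
$MINGW/libmoldname.a $MINGW/libmsvcrt.a $MINGW/libuser32.a \
$MINGW/libkernel32.a $MINGW/libadvapi32.a $MINGW/libcrtdll.a \
$MINGW/libshell32.a"

# Echo a command; execute it only when DRYRUN=0.
run() {
    echo "+ $*"
    [ "${DRYRUN:-1}" = 1 ] || "$@"
}

# Steps 1 to 4 of Example 1; stops at the first command that fails.
build_dll() {
    run pgf77 -c foo.f &&
    run dlltool --export-all --output-def foo.def foo.o &&
    run ld --base-file foo.bas --dll --entry "$ENTRY" \
        -o trash.dll $LINK_INPUTS &&
    run rm -f trash.dll &&
    run dlltool --base-file foo.bas --output-exp foo.exp --def foo.def &&
    run ld --base-file foo.bas foo.exp --dll --entry "$ENTRY" \
        -o libfoo.dll $LINK_INPUTS
}

build_dll
```

Keeping the library list in one variable means the two ld commands cannot drift apart, which is the easiest mistake to make when editing the long link lines by hand.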

2.9.4 Creating DLLs with PGF90

Because DLLs do not share data with the calling procedure, pgf90 compiled programs can fail because internal information is not conveyed to the DLL from the calling program. For this reason, a library libxxx.a cannot always be converted successfully to a dynamically linked library (i.e. libxxx.dll). In order to support all f90 constructs in DLLs, all of the pgf90 libraries we provide would need to be DLLs as well. We do not provide this in this release.

If we look at the last example with libfoo.dll, we see that the steps are the same except for four extra f90 libraries that need to be linked in (libpgf90.a, libpgf90_rpm1.a, libpgf902.a, and libpgf90rtl.a):

ld --base-file foo.bas foo.exp \
--dll --entry _DllMainCRTStartup@12 \
-o libfoo.dll foo.o \
C:/PGI/nt86/mingw/lib/dllcrt1.o \
C:/PGI/nt86/lib/libpgf90.a \
C:/PGI/nt86/lib/libpgf90_rpm1.a \
C:/PGI/nt86/lib/libpgf902.a \
C:/PGI/nt86/lib/libpgf90rtl.a \
C:/PGI/nt86/lib/libpgftnrtl.a \
C:/PGI/nt86/lib/libpgc.a \
C:/PGI/nt86/mingw/lib/libmingw32.a \
C:/PGI/nt86/mingw/lib/gcc-lib/mingw32/2.95.3-5/libgcc.a \
C:/PGI/nt86/mingw/lib/libmoldname.a \
C:/PGI/nt86/mingw/lib/libmsvcrt.a \
C:/PGI/nt86/mingw/lib/libuser32.a \
C:/PGI/nt86/mingw/lib/libkernel32.a \
C:/PGI/nt86/mingw/lib/libadvapi32.a \
C:/PGI/nt86/mingw/lib/libcrtdll.a \
C:/PGI/nt86/mingw/lib/libshell32.a

However, the program fails if we replace foo.f with a version that allocates the scratch array x dynamically, rather than declaring it as x(n).

The foo.f90 program could look like

subroutine foo(n)
  integer n,i
  real y
  real,allocatable::x(:)
  print *,"(foo) n=",n
  print *,"(foo) before allocate"
  allocate(x(n))
  print *,"(foo) after allocate"
  y=0
  do i=1,n
    x(i)=i
    y=y+x(i)
  end do
  print *,"total is",y
  deallocate(x)
  return
end

If libfoo.dll is built from this version of foo in the same way as above, the program fails at allocate(x(n)). The reason is that internal flags and tables used by pgf90 programs are not initialized properly in the DLL.

mingw32 provides source to dllinit.c (see ftp://ftp.xraylith.wisc.edu/pub/khan/gnu-win32/mingw32/misc/dllhelpers-0.2.5.zip), and we modify it here with an additional call to pghpf_init(), a routine that initializes internal pgf90 and pghpf information.

PGI$ diff dllinit.c dllinit_pgf90.c

39a40,41
> extern void pghpf_init(int*);
> static int z=0;
67c69
<
---
> pghpf_init(&z);
PGI$

By adding this call to dllinit.c to create dllinit_pgf90.c, compiling it to dllinit_pgf90.o, and adding that object to the link line, the example with foo.f90 now executes:

ld --base-file foo.bas foo.exp --dll --entry _DllMainCRTStartup@12 \
-o libfoo.dll foo.o \
dllinit_pgf90.o \
C:/PGI/nt86/mingw/lib/dllcrt1.o \
C:/PGI/nt86/lib/libpgf90.a \
C:/PGI/nt86/lib/libpgf90_rpm1.a \
C:/PGI/nt86/lib/libpgf902.a \
C:/PGI/nt86/lib/libpgf90rtl.a \
C:/PGI/nt86/lib/libpgftnrtl.a \
C:/PGI/nt86/lib/libpgc.a \
C:/PGI/nt86/mingw/lib/libmingw32.a \
C:/PGI/nt86/mingw/lib/gcc-lib/mingw32/2.95.3-5/libgcc.a \
C:/PGI/nt86/mingw/lib/libmoldname.a \
C:/PGI/nt86/mingw/lib/libmsvcrt.a \
C:/PGI/nt86/mingw/lib/libuser32.a \
C:/PGI/nt86/mingw/lib/libkernel32.a \
C:/PGI/nt86/mingw/lib/libadvapi32.a \
C:/PGI/nt86/mingw/lib/libcrtdll.a \
C:/PGI/nt86/mingw/lib/libshell32.a

There are many problems with supporting pgf90 in DLLs. For example, OPTIONAL arguments will be a problem if the if(present(xxx)) type of argument interrogation is used. The information present in the calling program has not been conveyed to the DLL, and is not updated by calling pghpf_init(). We have no workaround for this at this time.

