Re: cuda compilation


From: Tomas Oberhuber
Subject: Re: cuda compilation
Date: Sat, 9 Jan 2010 17:43:49 +0100
User-agent: KMail/1.9.10

Hi Ralf, again,

On Wednesday 06 of January 2010 20:44:57 Ralf Wildenhues wrote:
> Hello Tomas,
>
> * Tomas Oberhuber wrote on Sat, Jan 02, 2010 at 11:33:46AM CET:
> > Now I try to compile whole project with nvcc. It seems to work but I get
> > this
> >
> > libtool: link:
> > nvcc -shared -nostdlib   
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructu
> >re.o .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugGroup.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugParser.
> >o .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebug.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugScanner
> >.o 
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlParameterConta
> >iner.o .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlString.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerCPU.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerRT.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ion.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ionScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mpi-supp.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTester.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ionParser.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlObject.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-compress-file.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mfilename.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlLogger.o  
> > .libs/libtnl-0.1.lax/libtnlmatrix-0.1.a/libtnlmatrix_0_1_la-tnlBaseMatrix
> >.o   -L/usr/local/cuda/lib64 -lcppunit -lcudart     -Wl,-soname
> > -Wl,libtnl-0.1.so.0 -o .libs/libtnl-0.1.so.0.0.0 nvcc fatal   : Unknown
> > option 'nostdlib'
> >
> > which means that nvcc is also used as linker. Even if I remove -nostdlib,
> > nvcc complains about other parameters. So I think it would be better to
> > link with g++. Can I change linker somehow? And in that case if I do it
> > by hand (copy the command on the command line and replace nvcc by g++) I
> > get this
> >
> > g++ -shared -nostdlib   
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructu
> >re.o .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugGroup.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugParser.
> >o .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebug.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugScanner
> >.o 
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlParameterConta
> >iner.o .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlString.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerCPU.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerRT.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ion.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ionScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mpi-supp.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTester.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescript
> >ionParser.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlObject.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-compress-file.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mfilename.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlLogger.o  
> > .libs/libtnl-0.1.lax/libtnlmatrix-0.1.a/libtnlmatrix_0_1_la-tnlBaseMatrix
> >.o   -L/usr/local/cuda/lib64 -lcppunit -lcudart     -Wl,-soname
> > -Wl,libtnl-0.1.so.0 -o .libs/libtnl-0.1.so.0.0.0 /usr/bin/ld:
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructu
> >re.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when
> > making a shared object; recompile with -fPIC
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructu
> >re.o: could not read symbols: Bad value
> > collect2: ld returned 1 exit status
> >
> > Or maybe we can solve it using -Xcompiler and -Xlinker. May I ask what
> > libtool does now when we use nvcc to compile or link?
>
> You're right.  Libtool doesn't support CXX=nvcc yet, and we also forgot
> some bits of CC=nvcc support.  This still needs to be done in Libtool.
>
> Thanks,
> Ralf

Yesterday I found that this is not as simple as I thought. There is a problem
with dependencies: they cannot be generated by gcc, only by nvcc directly;
otherwise we get something like this:

cudafile.lo cudafile.o:  \
 /tmp/tmpxft_0000021f_00000000-10_cudafile.ii

Moreover, all other headers are omitted, I guess because they were already
processed by the nvcc preprocessor. As a result, we cannot generate the
dependencies with gcc; we have to use nvcc. I have found that nvcc has a flag
-M which generates dependencies. Unfortunately, fast dependencies are not
possible here (as I understand it, fast-dep mode means that gcc3 can compile
and generate dependencies in the same run; nvcc does not seem to do this). So
I erased all the stuff around am__fastdepnvcc which I introduced yesterday :-(.

Another ugly thing is that nvcc is not able to filter out system headers, so
the dependency files are pretty large :-(. Things get even more complicated:
nvcc uses the -o flag to name the output file for the dependencies, which
confuses libtool. I did not fully understand what is going on between libtool
and depcomp, but I think that when depcomp is called it is expected to
generate the dependencies as well as compile the source file. depcomp filters
the arguments and then in fact calls libtool. I handled this somehow in
depcomp, but it is by no means a nice solution. It seems to work :) and I
really hope that I will not run into further problems.

As I said yesterday, the way it works now is just the simplest solution to get
it working. I would like to solve this properly, and in my opinion nvcc should
be used only for .cu files.
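
To make the argument filtering described above more concrete, here is a rough
sketch in plain sh. It is not the code from depcomp; the function name
nvcc_dep_command and the file names are only made up for illustration, and it
prints the resulting command instead of running nvcc. It assumes that
arguments contain no spaces.

```sh
#!/bin/sh
# Hypothetical helper (not part of depcomp): turn a libtool-wrapped compile
# command into the plain "nvcc -M" command that only emits dependencies.
# Usage: nvcc_dep_command DEPDIR COMMAND...
nvcc_dep_command() {
  depdir=$1; shift
  cmd=
  skip_next=no
  for arg in "$@"; do
    if test "$skip_next" = yes; then
      skip_next=no            # this was the object name after -o: drop it
      continue
    fi
    case $arg in
      *bash|*/sh|*libtool|--tag*|--mode*) ;;          # libtool wrapper: drop
      -c) cmd="$cmd -M" ;;                            # compile -> dependencies
      -o) cmd="$cmd -odir $depdir"; skip_next=yes ;;  # redirect the output
      *)  cmd="$cmd $arg" ;;                          # keep everything else
    esac
  done
  echo "${cmd# }"             # strip the leading space and print the command
}

nvcc_dep_command .deps /bin/bash ../../libtool --tag=CXX --mode=compile \
  nvcc -c -o cudafile.lo cudafile.cu
# prints: nvcc -M -odir .deps cudafile.cu
```

The real script would redirect the output of that command into the temporary
dependency file and afterwards run the original, unfiltered command to do the
actual compilation.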

The following is a patch against yesterday's version. In fact, it reverts all
changes made to the files depend2.am and depend.m4. The main changes are in
depcomp.

diff -r automake-1.11.1/lib/am/depend2.am 
autotools/automake-1.11.1/lib/am/depend2.am
73a74,84
> if %FASTDEPNVCC%
> ## Fast-dep mode for nvcc is similar to gcc
> ## We just add -Xcompiler flag.
> ?!GENERIC?    
%VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler 
-MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%`test -f 
'%SOURCE%' || 
echo '$(srcdir)/'`%SOURCE%
> ?!GENERIC?    %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??!SUBDIROBJ? 
%VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler 
-MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%%SOURCE%
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??SUBDIROBJ?  %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$|
$(DEPDIR)/&|;s|\.o$$||'`;\
> ?GENERIC??SUBDIROBJ?  
%COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP 
-Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%%SOURCE% 
&&\
> ?GENERIC??SUBDIROBJ?  $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> else !%FASTDEPNVCC%
86a98
> endif !%FASTDEPNVCC%
102a115,125
> if %FASTDEPNVCC%
> ## In fast-dep mode, we can always use -o.
> ## For non-suffix rules, we must emulate a VPATH search on %SOURCE%.
> ?!GENERIC?    
%VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler 
-MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`if 
test -f '%SOURCE%'; then $(CYGPATH_W) '%SOURCE%'; else 
$(CYGPATH_W) '$(srcdir)/%SOURCE%'; fi`
> ?!GENERIC?    %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??!SUBDIROBJ? 
%VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler 
-MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% 
%SOURCEFLAG%`$(CYGPATH_W) '%SOURCE%'`
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??SUBDIROBJ?  %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$|
$(DEPDIR)/&|;s|\.obj$$||'`;\
> ?GENERIC??SUBDIROBJ?  
%COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP 
-Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% 
%SOURCEFLAG%`$(CYGPATH_W) '%SOURCE%'` 
&&\
> ?GENERIC??SUBDIROBJ?  $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> else !%FASTDEPNVCC%
115a139
> endif !%FASTDEPNVCC%
132a157,166
> if %FASTDEPNVCC%
> ## fast-dep mode for nvcc only add -Xcompiler
> ?!GENERIC?    
%VERBOSE%%LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD 
-Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% 
%SOURCEFLAG%`test -f '%SOURCE%' || 
echo '$(srcdir)/'`%SOURCE%
> ?!GENERIC?    %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> ?GENERIC??!SUBDIROBJ? 
%VERBOSE%%LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD 
-Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% 
%SOURCEFLAG%%SOURCE%
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> ?GENERIC??SUBDIROBJ?  %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$|
$(DEPDIR)/&|;s|\.lo$$||'`;\
> ?GENERIC??SUBDIROBJ?  
%LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP 
-Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%%SOURCE% 
&&\
> ?GENERIC??SUBDIROBJ?  $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> else !%FASTDEPNVCC%
141a176
> endif !%FASTDEPNVCC%
diff -r automake-1.11.1/lib/depcomp autotools/automake-1.11.1/lib/depcomp
97,98d96
< echo $@
< 
128,146c126,128
< ## nVidia CUDA 2.3 does not suppport fast-dep mode :-(
< ## this part is ugly someone should rewrite it
< ## 1. nvcc flag fro dependencies is -M
< ##    however nvcc does not filter system headers :-(
< ## 2. the output file for the dependencies is given by -o
< ##    which is confusing for libtool and so we proceed as  follows
< ##    a. we need to call directly nvcc therefore we filter out args like:
< ##      /bin/bash (this is not robust enough it works only with bash)
< ##      ../../libtool
< ##      --tag=CXX or --tag=CC
< ##      --mode=compile
< ##      what remains after filtering is
< ##      nvcc -M -odir $depfiledir $source
< ##    b. we call something like
< ##      nvcc -M -odir $depfiledir $source > $tmpdepfile
< ## 3. as I understood libtool assumes that calling depcomp
< ##    initiates compilation. Therefore we call again given arguments.
<   depfiledir=`dirname $depfile`
<   ARG_STORE=$@
---
> ## nVidia CUDA 2.3 compiler combined with gcc3
> ## here we just add -Xcompiler parameter to pass
> ## gcc3 parameters to gcc3
150,156c132
<     -c) set fnord "$@" -M ;;
<     -o) set fnord "$@" -odir "$depfiledir" ;;
<     "$object") set fnord "$@" ;;
<     *libtool) set fnord "$@" ;;
<     --tag*) set fnord "$@" ;;
<     --mode*) set fnord "$@" ;;
<     *bash) set fnord "$@" ;;
---
>     -c) set 
fnord "$@" -Xcompiler -MT -Xcompiler "$object" -Xcompiler -MD -Xcompiler -MP 
-Xcompiler -MF -Xcompiler "$tmpdepfile" "$arg" ;;
162,163c138,140
<   if "$@" > "$tmpdepfile"; then
<      mv "$tmpdepfile" "$depfile"
---
>   "$@"
>   stat=$?
>   if test $stat -eq 0; then :
166c143
<     exit 255;
---
>     exit $stat
168,170c145
< 
<   $ARG_STORE
<   exit $?;
---
>   mv "$tmpdepfile" "$depfile"
diff -r automake-1.11.1/m4/depend.m4 autotools/automake-1.11.1/m4/depend.m4
155a156,158
> AM_CONDITIONAL([am__fastdepnvcc$1], [
>   test "x$enable_dependency_tracking" != xno \
>   && test "$am_cv_$1_dependencies_compiler_type" = nvcc])
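
For readers skimming the depend2.am hunks above: the fast-dep variant only
prefixes every gcc dependency flag with -Xcompiler, so that nvcc hands it on
to the host compiler. A small sketch of that idea in sh follows; the function
name xcompiler_wrap and the file names are hypothetical, and the flags must
not contain spaces.

```sh
#!/bin/sh
# Hypothetical helper: prefix each argument with -Xcompiler so that nvcc
# forwards it unchanged to the host compiler (gcc).
xcompiler_wrap() {
  out=
  for flag in "$@"; do
    out="$out -Xcompiler $flag"
  done
  echo "${out# }"             # strip the leading space
}

# Build the dependency flags that gcc3 fast-dep mode would use.
depflags=$(xcompiler_wrap -MT cudafile.o -MD -MP -MF .deps/cudafile.Tpo)
echo "nvcc $depflags -c -o cudafile.o cudafile.cu"
```

This only assembles and prints the command line; whether nvcc then writes
usable dependencies is exactly the problem described in this mail.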


Cheers, Tomas.