Re: [bug #40639] GNU Make with profiling information
Tue, 14 Jan 2014 15:55:11 +0000
I forgot to say that start times don't need to be absolute times -
only relative to the start of the top level gmake if possible. That
creates a problem for submakes I suppose.
I would guess that one could put the absolute build start time in an
environment variable like MAKE_START_TIME and then use that in every
submake to get the relative start time.
I haven't looked at the patch - perhaps it's doing this?
In any case, fixed/floating point seconds since 1970 is the nicest
format to process from scripts in my experience.
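A minimal sketch of the environment-variable idea, written in Python purely for illustration and not taken from the patch. The variable name MAKE_START_TIME comes from the suggestion above; the helper names `build_start_time` and `relative_start` are hypothetical. The top-level process records its start time if the variable is absent, and every submake that inherits the environment reuses it.

```python
ENV_VAR = "MAKE_START_TIME"  # name suggested above; not an existing make variable


def build_start_time(env, now):
    """Return the top-level build start time, recording it in env on first use.

    env is a dict-like environment, now is seconds since 1970 as a float.
    """
    if ENV_VAR not in env:
        # Top-level invocation: publish our own start time for any submakes.
        env[ENV_VAR] = "%.3f" % now
    return float(env[ENV_VAR])


def relative_start(env, now):
    """Seconds elapsed between the top-level start and `now`."""
    return now - build_start_time(env, now)
```

The top-level process would call `relative_start(env, time.time())` and get 0.0; a submake started 2.5 seconds later with the inherited environment would get roughly 2.5.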
On 14 January 2014 15:49, Tim Murphy <address@hidden> wrote:
> To some, using a spreadsheet might not seem like the most worthwhile
> way to visualise timing information.
> If it was me, I'd be far more concerned about whether I could write a
> script that could easily cope with all this information. Builds with
> hundreds of thousands of targets were common for me at one point and
> nowadays I do Android stuff - how much is that? I think it's
> somewhere around 36,000.
> This scale makes spreadsheets relatively unimportant as an analysis
> tool and makes it necessary to pass information through a script to
> first extract or summarise the information to a level where humans can
> deal with it.
> a) an absolute start time and
> b) a duration
> ...are easy to process in scripts to reconstruct whatever form one
> needs - a spreadsheet for you and a different kind of special graph
> for me. Both examples of a profiling feature for make that I'm aware
> of already use this format to good effect.
> It's also worth trying to produce these figures as each job finishes
> and then throw them away because then the build doesn't have to finish
> before one is able to process the data. You might use it, for example
> to provide progress information. e.g. if you keep information from a
> previous build and combine it with profiling information coming out of
> the new build you can guess how long is left.
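The progress idea above could be sketched roughly as follows, in Python. This assumes per-target durations from a previous build are available as a dict and that finished targets arrive one at a time; the function name `estimate_remaining` is hypothetical. It sums serial durations only, so with parallel jobs it would overestimate the wall-clock time left.

```python
def estimate_remaining(previous, finished):
    """Guess the work left in the current build.

    previous: dict mapping target name -> duration in seconds from an
              earlier build.
    finished: iterable of target names already completed in this build.
    Returns the summed previous duration of targets not yet finished.
    """
    done = set(finished)
    return sum(d for target, d in previous.items() if target not in done)
```

Calling it again after each job finishes gives a running estimate without waiting for the build to end.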
> On 14 January 2014 13:48, Eddy Petrișor <address@hidden> wrote:
>> 2014/1/12 Paul Smith <address@hidden>:
>>> On Wed, 2013-12-18 at 13:28 +0200, Eddy Petrișor wrote:
>>>> Could you please confirm whether the general direction of the
>>>> latest patch I sent is OK?
>>> Conceptually it seems OK. I'm still not jazzed about having any more
>>> than one output format, and I'd prefer that format to be in a
>>> more-or-less readable form, more like the "long" form than the others.
>> If that will be the only format, then it would mean always imposing a lot
>> more work on the information processing stage due to necessary
>> filtering and transformations to fit a format accepted or easily parsed in a
>> tool such as Oocalc, Gnumeric or Excel.
>> Although human readable seems nice and understandable, it is the least
>> machine parseable, hence my choice for the 'simple' format to be default.
>>> I think the output should go in the standard make output format, so
>>> something like:
>>> make[<LVL>]: <target>: <details...>
>>> Or, alternatively:
>>> <target>[<LVL>]: <details>
>>> Also I think it's enough to show the start offset and the elapsed time.
>>> End offset is not necessary IMO.
>> Unfortunately, depending on the used tools, when processing the information
>> the end time is necessary. For instance, in Microsoft Excel the only way to
>> display graphs for intervals is via a graph designed for visualising stocks'
>> variation (and it even forces the insertion of an extra field).
>> Oocalc is fine with start and stop, but the graphs look awful and are
>> unusable with absolute values (it scales so the entire 0-timestamp interval
>> is visible, so a difference of a few seconds is invisible on the graph), so
>> relative values are better here.
>> OTOH, relative time stamps or just durations are useful for a human eye
>> examination, since it is easy to spot offenders that way.
>> These are three different scenarios which I myself encountered and had in
>> mind when designing the code the way I did.
>>> I'm unsure about the PID. This is the pid of the make process so I'm
>>> not sure what the goal is. Is it to be able to collect all the times
>>> together maybe?
>> The goal is to be able to:
>> - spot targets evaluated redundantly in a recursive makefile in different
>> processes but on the same recursion level (these targets could be candidates
>> to be moved in a parent make invocation)
>> - spot wasteful/needless recursive calls (e.g. several targets called in
>> different make processes when they could be grouped in a single call)
>> - be able to analyse a single make invocation or a call sub-tree
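The first goal in the list above (spotting targets evaluated in different processes at the same recursion level) might be checked with a script along these lines. This is only a sketch in Python: it assumes the profiling output has already been parsed into (level, pid, target) tuples, and the function name `redundant_targets` is hypothetical.

```python
from collections import defaultdict


def redundant_targets(records):
    """Find targets handled by more than one make process at one level.

    records: iterable of (level, pid, target) tuples.
    Returns {(level, target): sorted list of pids} for every target seen
    in at least two distinct processes at the same recursion level.
    """
    seen = defaultdict(set)
    for level, pid, target in records:
        seen[(level, target)].add(pid)
    return {key: sorted(pids) for key, pids in seen.items() if len(pids) > 1}
```

Without the PID in each record, the two evaluations of the same target at the same level could not be told apart.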
>>> Is it necessary to dump all the output times at the end? Doing so
>>> requires that we increase the size of the file structure to hold the
>>> information, and this is already large AND the most common structure in
>>> memory; there's one for every single target which for non-recursive
>>> builds can get really big. I'm trying to keep memory usage under
>>> If instead of that we print the information after each target is
>>> complete we can shift the storage of this information out of the file
>>> structure and into the commands structure or similar. To me it seems
>>> more useful to keep the elapsed time info right next to the command
>>> output rather than dumping it all at the end.
>> I'm afraid I am missing some details of the implementation so I can't answer
>> that question in any meaningful way.
>> I will have to look into the code, but if a single target does NOT have
>> multiple commands structures, it should work.
>> Any pointers to the appropriate code area or suggestions would be welcome.
>>> Some other comments:
>>> 1. In general remember that GNU make code must conform to ANSI
>>> C89 / ISO C90. We shouldn't be using newer features of the
>>> language or runtime library unless we need to, and most of those
>>> require some kind of autoconf test.
>> I'm sure you had some specific code in mind when you wrote this. I assumed
>> the build system would have the appropriate compiler options for the desired
>> compliance level. Should I compile with '-std=c99 -pedantic-errors' to
>> check, or do you have other options in mind?
>>> 2. Let's avoid float and double (and struct timeval). There's no
>>> reason why we can't fit enough precision in a uint32 to count
>>> elapsed time in milliseconds for a build: that gives 50 days or
>>> so. GNU make still supports running on systems where there is
>>> no floating point support (see the NO_FLOAT #define). Although
>>> I haven't tested it in a while.
>> I was aware of this since your first email. I wanted to know if the general
>> idea is OK.
>> I will change this, too.
>>> 3. The use of "$" tokens in printf() statements is likely
>>> problematic from a portability standpoint. It seems like this
>>> should be relatively easy to avoid.
>> I'll change them into simple format specifiers, in spite of the repetitions.
>>> 4. If the printed string contains text then it needs to be marked
>>> for translation (with the "_(...)" macro).
>> Since the profiling info should be machine parsable, I think the only
>> translatable string would be the long format. I will change this, although
>> the inconsistency rubs me the wrong way.
>>> 5. We don't want to be using fprintf() here. All output needs to
>>> go through the output.c module, so that it's properly managed
>>> via output sync.
>> I was aware of the output sync issue, I will look into it.
>>> 6. gettimeofday is not portable. Also, it's not really the best
>>> option for timing things because (due to NTP etc.) it can change
>>> (by that I mean that if it reports 10s elapsed it might not be
>>> that 10s really elapsed). Using a monotonic clock is better,
>>> although that's also not portable. But if we have to be
>>> non-portable maybe we should try to get an accurate accounting.
>>> On the gripping hand maybe it's not that important to be
>>> absolutely accurate.
>> Hmm, I didn't think of that.
>> Is there a portable timing API that has microsecond resolution? Not sure if
>> ms resolution is enough given the speed of current systems and Moore's Law
>> predicting even faster systems.
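Points 2 and 6 above can be illustrated together with a small sketch. It is written in Python only to show the arithmetic; make itself would need a C89-compatible equivalent, which this does not attempt. The helper names `now_ms` and `elapsed_ms` are hypothetical. It reads a monotonic clock, keeps milliseconds in an unsigned 32-bit value, and computes the range such a counter covers.

```python
import time

UINT32_MASK = 0xFFFFFFFF


def now_ms():
    """Monotonic time in milliseconds, truncated to 32 bits."""
    return (time.monotonic_ns() // 1000000) & UINT32_MASK


def elapsed_ms(start, end):
    """Unsigned 32-bit difference; still correct across one wraparound."""
    return (end - start) & UINT32_MASK


# Range of a 32-bit millisecond counter, in days: 2**32 ms is about 49.7 days,
# which matches the "50 days or so" figure quoted above.
MAX_DAYS = (UINT32_MASK + 1) / 1000.0 / 86400.0
```

Because the clock is monotonic, an NTP adjustment during the build does not distort the elapsed value, at the cost of the reading not being an absolute time.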
>>> You mentioned something about trying to send the start time through the
>>> environment but I don't see any code to that effect here; how were you
>>> doing that?
>> As said in my previous mail, I wanted to avoid confusion regarding what code
>> to review.
>> In reply to your newest mail:
>>> Sorry, I was not clear: I wasn't suggesting that it would be better to
>>> display the absolute time for the start time. I was wondering why we
>>> display the start time at all. Why not just show the elapsed time, and
>>> nothing else? That would avoid all of these issues.
>>> However Tim makes a reasonable point in his response so if it can be
>>> done without too much difficulty it would be good to show a relative
>>> start time.
>> Also, the start time is necessary to be able to see which target was waiting
>> for which in a graph. This way one can see a target starting later than it
>> should logically start, making it a good avenue for performance improvement.
>> Eddy Petrișor
> You could help some brave and decent people to have access to
> uncensored news by making a donation at: