bug-bash
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: Bug/limitation in 'time' (kernel setings?)...


From: Bruce Dawson
Subject: RE: Bug/limitation in 'time' (kernel setings?)...
Date: Wed, 20 Mar 2013 10:51:30 -0700

I looked at the CPU sampling data to try to confirm what is going on. The
profile data (captured by zoom, using perf) shows ~90% of the samples
hitting in various invocations of expr. The tool won't summarize call stacks
across processes so I manually looked at a bunch of them. I looked at
samples in 15 invocations of expr. Each invocation runs for less than a ms
and the sampling frequency is one KHz so no individual process got more than
one sample. The samples I saw were: 

- page_fault in module vmlinux
- page_fault-> flush_tlb_page in module vmlinux
- do_lookup_x in ld-2.15.so
- perf_event__output_id_sample in vmlinux
- page_fault-> do_page_fault in vmlinux
- do_lookup_x in ld-2.15.so
- stub_execve-> load_elf_binary-> exec_mmap in vmlinux
- page_fault-> handle_pte_fault-> copy_page_c in vmlinux
- page_fault-> handle_pte_fault-> page_add_file_rmap in vmlinux
- system_call_fastpath-> sys_mmap-> mmap_region in vmlinux
- page_fault-> do_page_fault-> handle_mm_fault in vmlinux
- <Unknown Symbol> in unknown module
- page_fault-> handle_pte_fault-> page_add_file_rmap in vmlinux
- <Unknown Symbol> in unknown module
- page_fault-> do_page_fault-> up_read in vmlinux

To summarize, the zoom profiler says that ~90% of CPU time when running
ExprCount() is spent in expr. Of that time it says that more than half of it
is spent in page_fault or mmap. That sounds consistent with the claim that
it is process startup that is consuming the missing CPU time that 'time'
fails to account for.

I see no indications in any of the data of a significant amount of CPU time
being spent in the OS scheduler. Neither the context switch data nor the
sampling data supports that conclusion.

I could be wrong of course but I think my conclusion makes sense, and it is
supported by significant evidence.

FWIW.

-----Original Message-----
From: Bruce Dawson [mailto:address@hidden 
Sent: Wednesday, March 20, 2013 7:44 AM
To: 'Linda Walsh'
Cc: 'bug-bash'
Subject: RE: Bug/limitation in 'time' (kernel setings?)...

Note that the ~2ms/process create is when the power management fails to get
invoked. It drops to ~0.6ms/process create when the CPU ramps up to full
speed. The chart you list from twelve years ago shows 6.0 ms to spawn a new
process, so my machine is about ten times faster -- seems about right.
Cloning a process is not comparable because in my test the kernel has to
load and exec() expr, not just fork() bash.

> Right there, it shows you that the scheduler is taking almost 33% of your
time.

I don't follow your logic. How do you know that it is the scheduler? I can
see that when you run your schedtool script it shows 15% to 34.4% of CPU
time consumed. I say that the rest is process startup, you say scheduler.
I'm not sure what evidence points to the scheduler, and I'm not sure what
math you used to come up with the 33% figure. You'll have to elucidate.

> Then your kernel is misconfigured for your workload.

What? My kernel is executing the tasks fine. When my CPU runs at full speed
it is slightly faster than yours, and when it runs at 1/3 speed it is slower
than yours. In both cases 'time' reports that the CPU is not pegged when it
is. Our kernels are behaving quite similarly.

> What documentation would you have updated?

The bash documentation where it mentions the 'time' command, and the 'time'
documentation.

-----Original Message-----
From: Linda Walsh [mailto:address@hidden 
Sent: Wednesday, March 20, 2013 1:12 AM
To: Bruce Dawson
Cc: bug-bash
Subject: Re: Bug/limitation in 'time' (kernel setings?)...



Bruce Dawson wrote:
> The reply below is way too long, but oh well.
> 
> The summary is:
> I am 100% certain that my ExprCount() loop is CPU bound.
----
        Your stats claim 530 iterations/second when spawning expr.

That's ~2ms/process create.

In an article from 14 years ago on a 200Mhz Pentium, Linux process creates
were ~1ms.

http://www.linuxjournal.com/article/2941
(table @
http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/029
/2941/2941f1.jpg)

If you believe that a modern processor running at 1.6GHz, takes twice as
long as a 14 year old pentium, then there is little likelihood that you will
be convinced by any logical argument.



You need to rebuild your kernel with the config option turned on -- in fact,
you need to build your own kernel if you want to do any sort of kernel
benchmarks.


> You seem convinced that the task is not CPU bound on my machine, but 
> I'm convinced that it is.
---
        Then your kernel is misconfigured for your workload.

You said:
> I am 100% certain that my ExprCount() loop is CPU bound. It is 
> unfortunate that time gives inaccurate information about this. It 
> would be nice if the documentation acknowledged that in order to save 
> future generations from confusion.
----
What documentation would you have updated?


Here is a perl program:
#!/usr/bin/perl
die "need count" unless @ARGV;
my $count=$ARGV[0];
my $counter=$count;
while ($counter-->0) {
  my $pid;
  if (my $pid=fork() ) {
    waitpid($pid,0);
  }else {
    exec "/bin/true"; # use 'exit(0)' to just time forks w/no image load
  }
}
-----------------------------
> /usr/bin/time /tmp/sp.pl 10000
 time schedtool -a 3 -e /tmp/sp.pl 10000 6.83sec 0.05usr 0.97sys (15.05%
cpu)
---
But if I don't confine it to 1 cpu:
> time /tmp/sp.pl 10000                  
10.54sec 0.20usr 2.59sys (26.52% cpu)

Just forks:
> time /tmp/sp.pl 10000
7.23sec 0.12usr 2.36sys (34.35% cpu)
> time /tmp/sp.pl 10000
7.21sec 0.12usr 2.35sys (34.39% cpu)
---
1cpu forks:

> time schedtool -a 3 -e /tmp/sp.pl 10000               
4.32sec 0.06usr 0.95sys (23.69% cpu)
Ishtar:law> time schedtool -a 3 -e /tmp/sp.pl 10000 4.29sec 0.05usr 0.96sys
(23.80% cpu)

-----------------

Right there, it shows you that the scheduler is taking almost 33% of your
time.

That may be cpu, but it is not cpu that your program is using.
That's why I say your accounting is off.







>From these figures, I would say it's closer to 1000/second





reply via email to

[Prev in Thread] Current Thread [Next in Thread]