qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 04/19] target/ppc: PMU Book3s basic insns count for pseries T


From: David Gibson
Subject: Re: [PATCH 04/19] target/ppc: PMU Book3s basic insns count for pseries TCG
Date: Tue, 17 Aug 2021 12:59:28 +1000

On Mon, Aug 16, 2021 at 02:53:13PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 8/10/21 12:39 AM, David Gibson wrote:
> > On Mon, Aug 09, 2021 at 10:10:42AM -0300, Daniel Henrique Barboza wrote:
> > > The PMCC (PMC Control) bit in the MMCR0 register controls whether the
> > > counters PMC5 and PMC6 are being part of the performance monitor
> > > facility in a specific time. If PMCC allows it, PMC5 and PMC6 will
> > > always be used to measure instructions completed and cycles,
> > > respectively.
> > > 
> > > This patch adds the barebones of the Book3s PMU logic by enabling
> > > instruction counting, using the icount framework, using the performance
> > > monitor counters 5 and 6. The overall logic goes as follows:
> > > 
> > > - a helper is added to control the PMU state on each MMCR0 write. This
> > > allows for the PMU to start/stop as quick as possible;
> > 
> > Um.. why does a helper accomplish that goal?
> > 
> > > 
> > > - only PMC5 and PMC6 are being set. PMC6 (cycles) is default to 4*insns
> > > (for cycles per instruction) for now;
> > 
> > What's the justification for that value?  With a superscalar core, I'd
> > expect average (median) cycles per instruction to be < 1 a lot of the
> > time.  Mean cycles per instruction could be higher since certain
> > instructions could take a lot.
> > 
> > > - the intended usage is to freeze the counters by setting MMCR0_FC, do
> > > any additional setting via MMCR1 (not implemented yet) and setting
> > > initial counter values,  and enable the PMU by zeroing MMCR0_FC. Software
> > > must freeze counters to read the results - on the fly reading of the PMCs
> > > will return the starting value of each one.
> > 
> > Is that the way hardware behaves?  Or is it just a limitation of this
> > software implementation?  Either is fine, we should just be clear on
> > what it is.
> > 
> > > 
> > > Since there will be more PMU exclusive code to be added next, let's also
> > > put the PMU logic in its own helper to keep all in the same place. The
> > > code is also repetitive and not really extensible to add more PMCs, but
> > > we'll handle this in the next patches.
> > > 
> > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> > > ---
> > >   target/ppc/cpu.h               |  4 ++
> > >   target/ppc/cpu_init.c          |  4 +-
> > >   target/ppc/helper.h            |  1 +
> > >   target/ppc/meson.build         |  1 +
> > >   target/ppc/pmu_book3s_helper.c | 78 ++++++++++++++++++++++++++++++++++
> > >   target/ppc/translate.c         | 14 ++++--
> > >   6 files changed, 97 insertions(+), 5 deletions(-)
> > >   create mode 100644 target/ppc/pmu_book3s_helper.c
> > > 
> > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > > index 4d96015f81..229abfe7ee 100644
> > > --- a/target/ppc/cpu.h
> > > +++ b/target/ppc/cpu.h
> > > @@ -1175,6 +1175,10 @@ struct CPUPPCState {
> > >       uint32_t tm_vscr;
> > >       uint64_t tm_dscr;
> > >       uint64_t tm_tar;
> > > +
> > > +    /* PMU registers icount state */
> > > +    uint64_t pmc5_base_icount;
> > > +    uint64_t pmc6_base_icount;
> > >   };
> > >   #define SET_FIT_PERIOD(a_, b_, c_, d_)          \
> > > diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> > > index 71062809c8..fce89ee994 100644
> > > --- a/target/ppc/cpu_init.c
> > > +++ b/target/ppc/cpu_init.c
> > > @@ -6822,7 +6822,7 @@ static void 
> > > register_book3s_pmu_sup_sprs(CPUPPCState *env)
> > >       spr_register_kvm(env, SPR_POWER_MMCR0, "MMCR0",
> > >                        SPR_NOACCESS, SPR_NOACCESS,
> > >                        &spr_read_pmu_generic, &spr_write_pmu_generic,
> > > -                     KVM_REG_PPC_MMCR0, 0x00000000);
> > > +                     KVM_REG_PPC_MMCR0, 0x80000000);
> > >       spr_register_kvm(env, SPR_POWER_MMCR1, "MMCR1",
> > >                        SPR_NOACCESS, SPR_NOACCESS,
> > >                        &spr_read_pmu_generic, &spr_write_pmu_generic,
> > > @@ -6870,7 +6870,7 @@ static void 
> > > register_book3s_pmu_user_sprs(CPUPPCState *env)
> > >       spr_register(env, SPR_POWER_UMMCR0, "UMMCR0",
> > >                    &spr_read_pmu_ureg, &spr_write_pmu_ureg,
> > >                    &spr_read_ureg, &spr_write_ureg,
> > > -                 0x00000000);
> > > +                 0x80000000);
> > >       spr_register(env, SPR_POWER_UMMCR1, "UMMCR1",
> > >                    &spr_read_pmu_ureg, &spr_write_pmu_ureg,
> > >                    &spr_read_ureg, &spr_write_ureg,
> > > diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> > > index 4076aa281e..5122632784 100644
> > > --- a/target/ppc/helper.h
> > > +++ b/target/ppc/helper.h
> > > @@ -20,6 +20,7 @@ DEF_HELPER_1(rfscv, void, env)
> > >   DEF_HELPER_1(hrfid, void, env)
> > >   DEF_HELPER_2(store_lpcr, void, env, tl)
> > >   DEF_HELPER_2(store_pcr, void, env, tl)
> > > +DEF_HELPER_2(store_mmcr0, void, env, tl)
> > >   #endif
> > >   DEF_HELPER_1(check_tlb_flush_local, void, env)
> > >   DEF_HELPER_1(check_tlb_flush_global, void, env)
> > > diff --git a/target/ppc/meson.build b/target/ppc/meson.build
> > > index b85f295703..bf252ca3ac 100644
> > > --- a/target/ppc/meson.build
> > > +++ b/target/ppc/meson.build
> > > @@ -14,6 +14,7 @@ ppc_ss.add(when: 'CONFIG_TCG', if_true: files(
> > >     'int_helper.c',
> > >     'mem_helper.c',
> > >     'misc_helper.c',
> > > +  'pmu_book3s_helper.c',
> > >     'timebase_helper.c',
> > >     'translate.c',
> > >   ))
> > > diff --git a/target/ppc/pmu_book3s_helper.c 
> > > b/target/ppc/pmu_book3s_helper.c
> > > new file mode 100644
> > > index 0000000000..fe16fcfce0
> > > --- /dev/null
> > > +++ b/target/ppc/pmu_book3s_helper.c
> > 
> > I'd prefer to call this just book3s_pmu.c.  Or better yet
> > "powerX_pmu.c", where X is the specific PMU model you're implementing
> > since IIRC the particulars of the PMU vary quite a lot from POWER7
> > through to POWER10.
> 
> I'll go with book3s_pmu.c because this PMU implementation will not go
> deep enough to touch the differences between POWER processors.

I think you are mistaken.

> The only aspects that will be implementation specific will be 2 perf
> events (0x2 and 0x1E) that the kernel has been using for a long time
> and only recently migrated to architected events. We'll support all
> architected events that are related to those events so that newer
> kernels - and other non-IBM processors - will use the PMU without
> the need of having to know about 0x2 and 0x1E.

So, possibly in the last few POWER chips, the only differences are the
table of event keys.  That is definitely not the case for the whole
POWER line though.  I remember from my time at IBM, the PMU underwent
huge changes much deeper than the event table.  Some had different
numbers of PMCs, different bit fields in the MMCRs, different methods
of event selection (in some cases through insanely compplex chains of
multiplexers).  And everything from POWER4 onwards could reasonably be
described as "book3s".  So we definitely need a different
name... working out what it should be is harder though.

If the modern core structure of the PMU got codified in a particular
BookS architecture version we could name it after that version, maybe?

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]