From: Eduardo Habkost
Subject: [Qemu-devel] MPOL_MF_STRICT & hugetlbfs (was Re: [PATCH v3.1 25/31] hostmem: add properties for NUMA memory policy)
Date: Tue, 10 Jun 2014 15:44:11 -0300
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Jun 09, 2014 at 10:12:07AM +0800, Hu Tao wrote:
[...]
> > > 
> > > >     mbind(ptr, sz, policy, maxnode ? backend->host_nodes : NULL, maxnode + 1, flags);
> > > > 
> > > > 
> > > > (I am starting to wonder if it was worth dropping the libnuma
> > > > requirement and implementing our own mbind()-calling code.)
> > > > 
> > > > > > +    if (mbind(ptr, sz, backend->policy, backend->host_nodes, maxnode + 2, 0)) {
> > > > > +        error_setg_errno(errp, errno,
> > > > > +                         "cannot bind memory to host NUMA nodes");
> > > > 
> > > > Don't we want to set flags to MPOL_MF_STRICT here? I believe we
> > > > shouldn't have any pages preallocated at this point, but in case we do,
> > > > I would expect them to be moved instead of ignoring the policy set by
> > > > the user.
> > > 
> > > MPOL_MF_STRICT | MPOL_MF_MOVE to move. Actually in this version the
> > > preallocation happens before mbind, which is fixed in v3.2.
> > 
> > If memory was already allocated in a different node and has to be moved
> > that early, that's a bug we want to detect and fix (instead of
> > triggering useless memory moves). So I would use only MPOL_MF_STRICT.
> 
> Fair enough. But what about huge pages? As the man page says, MPOL_MF_STRICT
> is ignored on huge page mappings. Is leaving a comment at the place of
> memory preallocation to warn people against allocating memory before mbind
> (like it's done in v3.2) the only thing we can do?

Well, maybe the kernel should be fixed to not ignore MPOL_MF_STRICT on
huge page mappings, then. Does anybody know if the warning on the
manpage still applies, and if this can be changed?

In the meantime, it looks like all we can do is to print a warning, or
refuse to preallocate before mbind().
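
To make the MPOL_MF_STRICT idea concrete, here is a rough, Linux-only
sketch of how a caller could tell "pages were already allocated on the
wrong node" apart from other mbind() failures. With MPOL_MF_STRICT the
man page documents EIO for existing pages that do not follow the policy.
The names bind_strict(), classify_mbind() and enum bind_result are made
up for illustration and are not part of the patch; the SKETCH_* constants
are copied from <linux/mempolicy.h> so the example does not depend on
libnuma. If the man page warning still applies, none of this helps for
hugetlbfs mappings.

```c
/* Linux-only sketch: strict mbind() plus errno classification. */
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values copied from the Linux UAPI header <linux/mempolicy.h>, so the
 * sketch does not need libnuma's <numaif.h>. */
#define SKETCH_MPOL_BIND      2
#define SKETCH_MPOL_MF_STRICT (1 << 0)

/* Hypothetical result type, not part of QEMU. */
enum bind_result {
    BIND_OK,        /* policy applied, no misplaced pages reported */
    BIND_MISPLACED, /* EIO: pages already exist outside the node mask */
    BIND_FAILED     /* any other error */
};

/* Map the return value and errno of mbind() to a result. */
enum bind_result classify_mbind(long ret, int err)
{
    if (ret == 0) {
        return BIND_OK;
    }
    return err == EIO ? BIND_MISPLACED : BIND_FAILED;
}

/* Bind [ptr, ptr + sz) to the nodes in nodemask, asking the kernel to
 * report already-allocated pages that violate the policy instead of
 * silently ignoring them. No pages are moved (no MPOL_MF_MOVE). */
enum bind_result bind_strict(void *ptr, unsigned long sz,
                             const unsigned long *nodemask,
                             unsigned long maxnode)
{
    long ret;

    assert(ptr != NULL && sz > 0);
    errno = 0;
    ret = syscall(SYS_mbind, ptr, sz, SKETCH_MPOL_BIND,
                  nodemask, maxnode, SKETCH_MPOL_MF_STRICT);
    return classify_mbind(ret, errno);
}
```

A caller would treat BIND_MISPLACED as a bug to report (memory was
touched before mbind()), rather than retrying with MPOL_MF_MOVE.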

-- 
Eduardo
