Re: [h5md-user] fields of observable group


From: Felix Höfling
Subject: Re: [h5md-user] fields of observable group
Date: Tue, 06 Sep 2011 11:00:57 +0200
User-agent: Opera Mail/11.11 (Linux)

On Mon, 05 Sep 2011 11:17:55 +0200, Konrad Hinsen
<address@hidden> wrote:

On 2 Sep, 2011, at 14:07 , Felix Höfling wrote:

How should the edges/offset scheme be extended to account for such things
as a truncated octahedron?

I see two options:

1) Introduce a special case for the truncated octahedral shape, which is probably the most frequent one. The size of the box is specified by the edges just like for a "normal" (parallelepipedic) box. Some additional label says that it is a truncated octahedral box, meaning that the unit cell of the parallelepipedic system actually contains two copies of the system.

2) Provide a general mechanism for specifying symmetry inside the box. This would allow the simulation of arbitrary crystals while maintaining their symmetry. The part of the simulation universe stored explicitly becomes the asymmetric unit, to which a set of symmetry transformations is applied implicitly to reconstruct the whole system. The most straightforward way to store the symmetry information is as a list of symmetry transformations, each consisting of one 3x3 rotation matrix plus a translation vector.

In my "molecular system" data model, I have chosen the second approach because in my field of work (molecular biophysics), crystals are important because crystallography is the main source for protein structures.

An optional set of affine symmetry transformations sounds very good. In
the most general case, this would include a matrix A (allowing for
reflections, rotations, rescaling) and a translation vector b for each
copy: x' = A x + b.

If we restrict to isometric transformations, the matrix shall be
orthogonal. This appears to be pretty general already, see
http://en.wikipedia.org/wiki/Euclidean_group.
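
To make the proposal concrete, here is a minimal sketch in plain Python of
how a reader might reconstruct the copies from the stored coordinates. The
function names (apply_affine, is_orthogonal, expand_copies) are
hypothetical and not part of any H5MD draft; the sketch only assumes that
each copy is given by a matrix A and a vector b with x' = A x + b.

```python
# Illustrative sketch only; all names are hypothetical, not part of H5MD.

def apply_affine(A, b, x):
    """Return x' = A x + b for a d x d matrix A and d-vectors b and x."""
    d = len(x)
    return [sum(A[i][j] * x[j] for j in range(d)) + b[i] for i in range(d)]

def is_orthogonal(A, tol=1e-12):
    """Check A^T A = 1, i.e. that A is an isometry (rotation or reflection)."""
    d = len(A)
    for i in range(d):
        for j in range(d):
            dot = sum(A[k][i] * A[k][j] for k in range(d))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

def expand_copies(positions, transformations, shifts):
    """Reconstruct all copies of the stored positions.

    The untransformed copy comes first; one further copy is appended per
    (matrix, vector) pair.
    """
    copies = [[list(x) for x in positions]]
    for A, b in zip(transformations, shifts):
        copies.append([apply_affine(A, b, x) for x in positions])
    return copies

# Example: a 90 degree rotation about z followed by a shift along x.
rot_z = [[0.0, -1.0, 0.0],
         [1.0,  0.0, 0.0],
         [0.0,  0.0, 1.0]]
shift = [5.0, 0.0, 0.0]
stored = [[1.0, 0.0, 0.0], [0.0, 2.0, 3.0]]

all_copies = expand_copies(stored, [rot_z], [shift])
```

A pure rescaling such as 2 times the unit matrix would fail the
is_orthogonal check, which is the distinction between the general affine
case and the isometric one.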

I am thinking of two optional attributes "transformation" and "shift"
attached to the "box" group. They hold datasets of ranks 3 and 2,
respectively: a square matrix and a vector for each copy of the stored
particle coordinates. If both attributes are present, their first
dimensions must agree; the remaining dimensions correspond to the space
dimension. [Alternatively, one may store a d by d+1 dimensional matrix (A,
b).] The order of operations is understood such that the matrix
multiplication is carried out first, then the translation. The identity
(unity) transformation shall not be specified and is always included [or
would it be better to require it explicitly if the attributes are
present?]. This would imply the following HDF5 structure:

parameters
   \-- box
        +-- [transformation]  [#copies-1][d][d]
        +-- [shift]  [#copies-1][d]
        +-- edges
        |    +-- sample
        |    +-- time
        |    \-- step
        \-- offset
             +-- sample
             +-- time
             \-- step

(#copies-1 because unity is included by default)

One problem appears here: if the box size fluctuates, the shift vector has
to be adjusted as the simulation progresses. If the matrix is orthogonal,
i.e., norm-preserving, it is unaffected. Maybe the shift vector should be
unified with 'offset'?
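
A small numerical sketch of this problem, assuming an isotropic rescaling
of the box by a factor s (all names and numbers below are made up for
illustration): rescaling the image x' = A x + b gives s x' = A (s x) + s b,
so the matrix stays the same while the shift must be rescaled with the box.

```python
# Illustrative sketch only; names and numbers are hypothetical.

def apply_affine(A, b, x):
    """Return x' = A x + b."""
    d = len(x)
    return [sum(A[i][j] * x[j] for j in range(d)) + b[i] for i in range(d)]

def scale(v, s):
    """Rescale a vector isotropically by the factor s."""
    return [s * c for c in v]

# 180 degree rotation about z (orthogonal) and a shift for a box of edge 8.
A = [[-1.0,  0.0, 0.0],
     [ 0.0, -1.0, 0.0],
     [ 0.0,  0.0, 1.0]]
b = [4.0, 4.0, 0.0]
x = [1.0, 2.0, 3.0]
s = 1.25  # the box grows by 25 percent

# Reference: build the image in the old box, then rescale everything.
image_then_scale = scale(apply_affine(A, b, x), s)

# In the rescaled box the same A works, but only with the rescaled shift.
with_old_shift = apply_affine(A, b, scale(x, s))
with_scaled_shift = apply_affine(A, scale(b, s), scale(x, s))
```

Here with_scaled_shift coincides with image_then_scale, whereas
with_old_shift does not. A time-dependent shift, stored like 'edges' and
'offset', would be one way to handle this.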

Shall the boundary conditions of the box be stored in an H5MD file? I can
think of open boundaries, periodic boundaries and (a bit weird)
Klein-bottle boundaries (a torus plus a twist). The same question arises
for the velocities in case of Lees-Edwards boundaries.


The offset can be useful if, e. g., different simulation snapshots shall
be glued together (for creating layered structures of pre-equilibrated
phases). And it is necessary for the complete description of an
arbitrarily positioned box in space. Of course, it is redundant for
particle positions reduced to the periodic box.

Ultimately this comes down to the question of what conditions you want to impose on the particle coordinates stored in the trajectory. There are various options:

1) No conditions at all. A particle implicitly stands for all its periodic images, which are constructed by applying integer multiples of the edge vectors. There is then no point in storing an offset. This convention makes life easy for programs generating a trajectory, but requires more work by programs that read a trajectory. It provides most freedom for pairs of generators/readers to establish their own conventions, but leaves the verification of these conventions to the readers.

2) Coordinates are required to be in the interval [offset ... offset+edge[. Trajectory generating programs must ensure this condition, but have the freedom of specifying the offset as it suits them. Reading programs gain a bit compared to 1) but must still handle arbitrary offsets.

3) Coordinates are required to be in an interval defined by the trajectory format specification, such as [0 ... edge[ or [-edge/2 ... edge/2[. This puts more constraints on trajectory generators but provides the most guarantees to readers.

There are also intermediate choices, such as permitting various conventions and indicating them in metadata.
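
As a minimal sketch of what conventions 2) and 3) ask of a trajectory writer, the following Python function reduces a position to the half-open interval [offset ... offset+edge[ along each axis. It assumes a rectangular box with edges along the coordinate axes; the name wrap_position is hypothetical.

```python
import math

# Illustrative sketch only; assumes a rectangular box aligned with the axes.

def wrap_position(x, edges, offset):
    """Reduce x to the interval [offset, offset + edge) along each axis."""
    wrapped = []
    for c, e, o in zip(x, edges, offset):
        n = math.floor((c - o) / e)  # number of box lengths to subtract
        wrapped.append(c - n * e)
    return wrapped

edges = [10.0, 10.0, 10.0]
x = [12.5, -0.5, 7.0]

# Convention 3) with the interval [0 ... edge[
in_zero_based_box = wrap_position(x, edges, [0.0, 0.0, 0.0])

# Convention 3) with the interval [-edge/2 ... edge/2[
in_centred_box = wrap_position(x, edges, [-5.0, -5.0, -5.0])
```

Under convention 1) a reader would have to apply such a reduction itself whenever it needs coordinates inside a particular cell. Note that floating-point rounding can produce a value equal to offset+edge for coordinates just below offset, which a careful implementation would need to handle.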

In biomolecular simulations, the usual convention is 1) because it permits a useful arrangement for visualization: coordinates can be arranged such that biologically important molecular assemblies are represented in the way that is most useful to a biologist. This comes down to having a specific arrangement for each individual simulation. All programs get a bit more complicated, but it's the users who gain. However, the same effect can be obtained otherwise, e.g. by storing an explicit "visualization offset" outside of the trajectory.


The role of the offset may change with the inclusion of Euclidean
transformations, see above. Just one remark: for studies of transport in
liquids, it is desirable to store the unwrapped trajectory of each
particle in periodic space, otherwise the displacements after long time
lags get unphysically truncated. Thus the positions should definitely not
be enforced to be within the unit cell, leaving it open to the writer
whether to reduce them or not.
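
A one-dimensional sketch of the truncation effect, with made-up numbers and
hypothetical function names: a particle drifting at constant velocity
through a periodic box of edge 10 has a true displacement of 15 after five
steps, but the wrapped coordinates suggest only 5.

```python
import math

# Illustrative sketch only; one-dimensional toy example.

def wrap_1d(x, edge):
    """Reduce a coordinate to the interval [0, edge)."""
    return x - edge * math.floor(x / edge)

def unwrap_1d(wrapped, edge):
    """Rebuild an unwrapped trajectory from wrapped coordinates.

    Only valid if the particle moves less than edge/2 between two
    consecutive stored frames (minimum image convention).
    """
    out = [wrapped[0]]
    for prev, cur in zip(wrapped, wrapped[1:]):
        step = cur - prev
        step -= edge * round(step / edge)  # minimum image of the step
        out.append(out[-1] + step)
    return out

edge = 10.0
unwrapped = [3.0 * t for t in range(6)]          # 0, 3, 6, 9, 12, 15
wrapped = [wrap_1d(x, edge) for x in unwrapped]  # 0, 3, 6, 9, 2, 5

sq_disp_true = (unwrapped[-1] - unwrapped[0]) ** 2
sq_disp_wrapped = (wrapped[-1] - wrapped[0]) ** 2
recovered = unwrap_1d(wrapped, edge)
```

The recovery in unwrap_1d works here only because consecutive frames are
closer than half a box length; with a coarser sampling interval the
information is lost, which is an argument for letting the writer store
unwrapped positions.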

Felix


