Re: [Discuss-gnuradio] comments on stream tags and metadata storage

From: Peter A. Bigot
Subject: Re: [Discuss-gnuradio] comments on stream tags and metadata storage
Date: Fri, 25 Jul 2014 06:00:45 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.0

I'd hoped my comments below would start a more extensive dialog on GNU Radio's metadata infrastructure. My several years of experience with this capability in a non-commercial C++ DSP framework suggest many possible enhancements in flow, representation, and utilities.

I have a slight itch to contribute to a solution, but without community involvement I can't hope to provide anything mergeable. Is this simply not something anybody feels needs to be addressed, or did I ask in the wrong forum?


On 07/17/2014 05:11 PM, Peter A. Bigot wrote:
Some comments after playing with stream tags and metadata this

(1) Although the discussion of stream tag insertion hints that this
should be done within the scheduler's call to work(), the documentation
could state more clearly that doing it in any other context can result
in race conditions.
(I did think I saw it stated more clearly somewhere, but can't find
that now, so maybe this point has been addressed.)

(2) In the current implementation it's further necessary that tags be
added to an output in monotonic non-decreasing offset order.
file_meta_sink does not sort the return value from get_tags_in_range(),
and emits all data up to the timestamp of the next tag, so a subsequent
tag with an earlier offset is dropped from the archive.
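
To make the failure mode concrete, here is a minimal standalone sketch. It
does not use GNU Radio; archive_tags() is a hypothetical model of a sink that
walks the tags in the order it receives them and writes samples up to each
tag's offset, and the tag tuples are invented for illustration.

```python
def archive_tags(tags, sort_first=False):
    """Model a sink that writes samples up to each tag's offset, then records the tag.

    `tags` is a list of (offset, key, value) tuples in the order a range
    query returned them. Returns the tags that reach the archive.
    """
    if sort_first:
        tags = sorted(tags, key=lambda t: t[0])
    written = 0      # number of samples already emitted to the archive
    archived = []
    for offset, key, value in tags:
        if offset < written:
            # Samples at this offset are already written, so the tag is lost.
            continue
        written = offset  # emit all samples up to this tag's offset
        archived.append((offset, key, value))
    return archived


# Hypothetical tags, deliberately out of offset order.
tags = [(10, "freq", 1e6), (4, "gain", 20), (12, "rate", 2e5)]
unsorted_result = archive_tags(tags)
sorted_result = archive_tags(tags, sort_first=True)
```

In this model the "gain" tag at offset 4 is missing from unsorted_result but
present in sorted_result, which is the difference a sort in the sink would
make.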

(I note that tagged_file_sink() does sort the tags it receives in one
case, but not in others.)

I don't see this requirement on ordered generation documented.  In some
cases, it may be inconvenient to do this, e.g. when a block's analysis
discovers after-the-fact that something interesting can be associated
with a past sample.  Similarly, a user might want a block to associate
a tag with a sample that has not yet arrived, to notify a downstream
block that will need to process the event.

A simple solution for the infrastructure is to require that tags only be
generated from within work(), with offsets corresponding to samples
generated in that call to work(), and in non-decreasing offset order
(though this last requirement could be handled by add_item_tag()).  The
developer must then handle the too-late/too-early tag associations
through some other mechanism, such as carrying the effective offset as
part of the tag value.
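
A rough sketch of what that could look like, under stated assumptions:
OutputTags and tag_late_event() are hypothetical names, this add_item_tag()
takes an extra window argument that the real API does not have, and nothing
here is GNU Radio code. The buffer rejects offsets outside the samples
produced by the current work() call, keeps tags in non-decreasing offset
order on insertion, and a late association is carried in the tag value.

```python
import bisect


class OutputTags:
    """Hypothetical per-output tag buffer; not GNU Radio's implementation."""

    def __init__(self):
        self._tags = []  # kept sorted by (offset, insertion sequence)
        self._seq = 0    # tie-breaker: preserves insertion order at equal offsets

    def add_item_tag(self, offset, key, value, window):
        start, end = window  # half-open sample range produced by this work() call
        if not (start <= offset < end):
            raise ValueError("tag offset outside the current work() window")
        bisect.insort(self._tags, (offset, self._seq, key, value))
        self._seq += 1

    def tags(self):
        return [(offset, key, value) for offset, _, key, value in self._tags]


def tag_late_event(out, window, key, value, effective_offset):
    """Tag the first sample of the window, carrying the real offset in the value."""
    payload = {"value": value, "effective_offset": effective_offset}
    out.add_item_tag(window[0], key, payload, window)


out = OutputTags()
window = (100, 200)                  # samples 100..199 produced in this call
out.add_item_tag(150, "b", 2, window)
out.add_item_tag(120, "a", 1, window)  # added later, but sorts earlier
tag_late_event(out, window, "burst", True, effective_offset=42)
```

The downstream block then has to know to look at effective_offset, which is
the cost of keeping the infrastructure rule simple.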

(3) Qt GUI Range with widget Counter + Slider invokes callbacks twice,
even if the value itself was set exactly once through the counter text
entry.  If the callback records the change by queuing a stream tag for
addition to the output, multiple tags with the same offset/key/value
will be generated.

There are ugly solutions to this but it's probably sufficient to note
somewhere that it can happen.  It's really not specific to tags, but is
clearly visible in that case.
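
One such workaround, as a sketch only: wrap the callback so that a repeated
notification carrying an unchanged value is ignored. ChangeFilter is a
hypothetical helper, not part of GNU Radio or Qt, and it has the obvious
drawback that a deliberate re-set to the same value is also suppressed.

```python
class ChangeFilter:
    """Forward a value to `on_change` only when it differs from the last one seen."""

    _UNSET = object()  # sentinel: no value has been seen yet

    def __init__(self, on_change):
        self._on_change = on_change
        self._last = self._UNSET

    def __call__(self, value):
        if self._last is not self._UNSET and value == self._last:
            return  # duplicate notification, nothing to queue
        self._last = value
        self._on_change(value)


queued = []                       # stands in for the block's pending-tag queue
set_freq = ChangeFilter(queued.append)
set_freq(1e6)
set_freq(1e6)                     # second callback for the same edit is dropped
set_freq(2e6)
```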

(4) The in-memory stream of tags can produce multiple settings of the
same key at the same offset.  However, when stored to a file only the
last setting of the key is recorded.

I believe this last behavior is incorrect and that it's a mistake to use
a map instead of a multimap or simple list for the metadata record of
stream tags associated with a sample.
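
The difference is easy to show outside GNU Radio. In this sketch the tag
tuples are invented, a plain dict stands in for the map, and a dict of lists
stands in for a multimap.

```python
from collections import defaultdict

# Hypothetical tags that all landed on the same sample offset.
stream = [
    (0, "freq", 100e6),
    (0, "freq", 101e6),
    (0, "freq", 102e6),
    (0, "gain", 20),
]

# Map-style record: one slot per key, so later settings overwrite earlier ones.
as_map = {}
for offset, key, value in stream:
    as_map[key] = value

# Multimap-style record: every setting is kept, in arrival order.
as_multimap = defaultdict(list)
for offset, key, value in stream:
    as_multimap[key].append(value)
```

Here as_map["freq"] holds only the last setting, while as_multimap["freq"]
keeps all three in order.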

One argument is that it's critical that a stream archive of a processing
session faithfully record the contents of the stream so that re-running
the application using playback reproduces that stream and thus the
original behavior (absent non-determinism due to asynchrony). This
faithful reproduction is what would allow a maintainer to diagnose an
operational failure caused by a block that fails at runtime when the
same tag is processed twice at the same offset.  This is true even if
the same key is set to the same value at the same sample offset multiple
times, which some might otherwise want to argue is redundant.

A corollary argument is that the sample number at which an event like a
tuner configuration change occurs usually can't be exactly associated
with a sample; the best estimate is likely to be the index of the first
sample generated by the next call to work.  But depending on processing
speed an application might change an attribute of a data source multiple
times before work was invoked.  The effect of those intermediate changes
may be visible in the signal, and to lose the fact they occurred by
discarding all but the last change affects both reproducibility and
interpretation of the signal itself.

(5) All stream tags are placed in the extras block, and when a segment
is completed file_meta_sink will generate a new header.  The new header
contains copies of the unique tags, but updates their offsets to be the
start of the new segment.

This is incorrect as the original stream did not have those tags
associated with those samples, so re-playing will introduce a behavioral
difference.  For example, a tag that is meant to be associated with the
start of a packet will be duplicated at an offset that is probably not
the start of a packet.

Solutions include (a) leave the original offset setting for tags in the
extras section when they're reproduced in a new segment, even though
that offset is not present in the segment; (b) treat stream tags as
ephemeral and do not persist them in the extras section when generating
a new segment; (c) extend the add_item_tag API to record whether the
tag is ephemeral or persistent.  Offhand I can see no argument
supporting persisting a tag and updating its offset, and only rare cases
where it's appropriate to replicate outdated information in a new
segment, so (b) seems to be the right move.
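
For comparison, here is a small model of the behavior described above next to
options (a) and (b). The function name, the policy strings, and the tag data
are all hypothetical; this is not how file_meta_sink is structured.

```python
def new_segment_extras(extras, segment_start, policy):
    """Return the tags written to the extras of a new segment header.

    `extras` is a list of (offset, key, value) tags from the previous segment.
    """
    if policy == "rewrite":       # behavior described in (5)
        return [(segment_start, key, value) for _, key, value in extras]
    if policy == "keep_offset":   # option (a): original offsets survive
        return list(extras)
    if policy == "ephemeral":     # option (b): tags are not carried forward
        return []
    raise ValueError("unknown policy: " + policy)


extras = [(37, "packet_start", True)]
rewritten = new_segment_extras(extras, 1000, "rewrite")
kept = new_segment_extras(extras, 1000, "keep_offset")
dropped = new_segment_extras(extras, 1000, "ephemeral")
```

With "rewrite", playback sees a packet_start tag at offset 1000 that the
original stream never contained; the other two policies avoid that.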

All the above is based on my understanding and expectations of how
stream tags are/should be used.  If my understanding is mistaken,
please let me know.

