help-octave
[Top][All Lists]

## Re: counting runs of a certain length in binary data

 From: Paul Kienzle Subject: Re: counting runs of a certain length in binary data Date: Sat, 4 Mar 2000 01:19:27 +0000 (GMT)

```>On  3-Mar-2000, Mike Miller <address@hidden> wrote:
>
>| Here's a problem:  I want to count the number of times in a sequence of
>| binary digits that I observe a run of N or more digits that are the same.
>| I think there must be an easy and efficient way to get Octave to pump out
>|
>| For example, if this were my vector of binary digits
>|
>| [0 1 1 1 1 0 0 0 1 0 1 0 0 0 1 1 1 1 0 1 0 1]'
>|    ^     # ^   #       ^   # ^     #
>|
>| I would count four runs of length 2 or greater.  (I've marked the
>| beginning of each run with '^' and the end with '#'.)
>|
>| I have been able to get Octave to tell me how many times I have a run of
>| length 2 or more using the code given in the third line below:
>|
>| octave:1> m=100; n=10;
>| octave:2> X=(rand(m,n)<.5);
>| octave:3> sum(diff([ones(1,n);abs(diff(X))])==-1)
>| ans =
>|
>|   24  25  27  24  25  31  23  26  25  26
>|
>| The first line tells the size of the matrix, the second line generates a
>| matrix of binary scores and the third line counts the number of runs of
>| length two or more for every column of the input matrix.
>|
>| Does anyone have any neat ways of counting runs of length greater than
>| two?  I assume 'diff' would be used repeatedly somehow.
>
>Would the following work?
>
>  y = [0; X; 0];
>  find (diff (y) == -1) - find (diff (y) == 1)
>
>X is the vector of binary digits.  The idea is that prepending and
>appending zeros ensures that the first nonzero element of diff(y) will
>be 1 and that the last nonzero element will be -1, and that there will
>be an equal number of 1 and -1 elements.  Then the results of the two
>find commands should always have the same length, and their difference
>should be the run-lengths for the ones in the original vector.
>
>jwe

More generally, the following will work with runs of arbitrary values:

n = length(x);
idx = find(x(1:n-1) != x(2:n); # index where the run changes
runs = diff([0 idx n]);        # convert indexes into run lengths,
# not forgetting the first or last runs
sum(runs>=N);                  # count the runs of length >= N

This can be expressed more compactly as:

n = length(x);
sum(diff([0 find(x(1:n-1) != x(2:n)) n])>=N)

Paul Kienzle

-----------------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.che.wisc.edu/octave/octave.html
How to fund new projects:  http://www.che.wisc.edu/octave/funding.html
Subscription information:  http://www.che.wisc.edu/octave/archive.html
-----------------------------------------------------------------------

```