[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[PATCH] Fix biased shuffle by avoiding already "struck" elements
|
From: |
Sergei Trofimovich |
|
Subject: |
[PATCH] Fix biased shuffle by avoiding already "struck" elements |
|
Date: |
Tue, 18 Jun 2024 22:37:54 +0100 |
From: Sergei Trofimovich <siarheit@google.com>
Artem Klimov noticed that current shuffle implementation suffers from
probability bias due to a typical bug in the shuffling implementation.
When we consider already shuffled element we slightly bias their
propability.
https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm
suggests shuffling each "struck" element only once.
Before the change probabilities of 4-element array to land from `i` index
to `j` index are:
0 1 2 3
_____ _____ _____ _____
0 | 25.00 29.30 24.60 21.10
1 | 25.00 22.25 28.13 24.63
2 | 25.00 23.44 22.26 29.30
3 | 25.00 25.01 25.01 24.98
Note that `0->1` probability is almost 29% while `0->3` os only 21%.
After the change probabilities are not as biased:
0 1 2 3
_____ _____ _____ _____
0 | 24.99 24.99 25.01 25.01
1 | 24.99 25.04 24.99 24.99
2 | 25.01 25.00 25.00 24.99
3 | 25.01 24.98 25.00 25.01
* src/shuffle.c (random_shuffle_array): Fix biased shuffle by avoiding
already "struck" elements.
---
src/shuffle.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/shuffle.c b/src/shuffle.c
index 8fbefc7b..cda28f20 100644
--- a/src/shuffle.c
+++ b/src/shuffle.c
@@ -104,12 +104,16 @@ static void
random_shuffle_array (void **a, size_t len)
{
size_t i;
- for (i = 0; i < len; i++)
+
+ if (len <= 1)
+ return;
+
+ for (i = len - 1; i >= 1; i--)
{
void *t;
/* Pick random element and swap. */
- unsigned int j = make_rand () % len;
+ unsigned int j = make_rand () % (i + 1);
if (i == j)
continue;
--
2.45.1
- [PATCH] Fix biased shuffle by avoiding already "struck" elements,
Sergei Trofimovich <=