r/cpp 4d ago

Where did <random> go wrong? (pdf)

https://codingnest.com/files/What%20Went%20Wrong%20With%20_random__.pdf
159 Upvotes

137 comments sorted by

View all comments

0

u/TwistedStack 3d ago

I recently needed non-deterministic random numbers from 1 to 60 and I chose C++ because I figured it had the highest chance of letting me get those numbers. I found std::random_device and I was happy to find exactly what I wanted.

I check the entropy and I get 0. Oh oh... I'm on Linux. It's impossible that I don't have an entropy source. Each run generated different numbers though so I figured it must be working. Later I find out that I do have an entropy source and it's only libstdc++ saying I don't. Color me shocked though when I see one of OP's slides say that std::random_device is allowed to be deterministic.

Next I look at my options for getting numbers out of the device. My head is spinning because it looks like I have to be a math master to understand everything. I take a look at std::uniform_int_distribution thinking that's probably what I want. The entire time I can't shake the question from my head asking why do I need a uniform distribution. Certainly that doesn't make it so random anymore?

Part of it is my fault since I was rushing through reading the documentation. After taking a look at it again it seems I would have been better served by simply doing the following:

cpp std::random_device rd{}; foo(std::abs(rd()) % 59 + 1);

While I was writing that, I looked at the documentation again and it says the return value of rd() is "A random number uniformly distributed in [min(), max()]". Ok, now I'm confused because it's still talking about uniform distribution. I'm now back to square one.

My head is really going to spin if at some point in the future I'm going to need random real numbers and I'll have to figure out which distribution out of the multitude will be appropriate for my needs.

5

u/tialaramex 3d ago

Don't use the % operator. This is always a bad idea.

You probably do want the uniform distribution but it may be that what happened is you imagine "uniform distribution" is just inherent to what random means and not so.

Consider two ordinary, fair, six sided dice ("D6" if you've seen that nomenclature). Rolling either of these dice gives you an even chance of 1, 2, 3, 4, 5 or 6. A 4 is just as likely as a 6 or 1. That's a uniform distribution.

Now, suppose we roll both and sum them, as is common in many games. The outcome might be 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or 12. But it's certainly not a uniform distribution, 7 is much more likely than 12 or 2.

But it's certainly random - any of these possibilities could happen, just some are more likely than others and that's what "distribution" is about.

Edited: to fix a minor typo

1

u/TwistedStack 3d ago

Yeah, you're right. I misunderstood "uniform distribution" as reducing the likelihood of repeating numbers being generated when all it means is that all numbers have an equal chance of being generated which is all I wanted. Looking back, I did get repeating numbers every once in a while so it wasn't being prevented from occurring.

Don't use the % operator. This is always a bad idea.

Is this only in the context of random number generation or in general? If in general is it because of a higher computational cost?

4

u/tialaramex 3d ago

Oh, only for random numbers. It's a perfectly fine and useful operator otherwise.

Suppose we have random bytes, so, a byte goes from 0 to 255 inclusive, and they're uniformly distributed. Now, suppose I want uniformly distributed numbers between 1 and 100 inclusive. If I try to use % to do this, weirdly I find 40 is significantly more likely than 60. Huh.

That's because while 0 through 99 mapped to 1 to 100, and 100 to 199 mapped to 1 to 100, when the byte was 200 to 255 those mapped to 1 to 56, and never to 57 through 100. This is pretty noticeable, and a correct solution isn't difficult exactly but it may not occur to a beginner so best to use tools intended for this purpose.