Generate Random Data
Generating random data for testing in YugabyteDB
I had to create a 10 million row table for testing recently, and put together a query to generate random data for it.
INSERT INTO my_table
(id,
mydatetime,
string1,
string2)
SELECT
(random() * 70 + 10)::int,
TIMESTAMP '2024-01-01 00:00:00.000000' + interval '1 millisecond' * (random() * 86400 * 1000 * 365),
(array['alligator','bear','cat','dog'])[(random() * 3 + 1)::int],
substr(md5(random()::text), 1, 10)
FROM generate_series(1, 10);
The id field is just a random integer in this example, but you’d probably use an identity column.
The timestamp field will have a random timestamp from January 1, 2024 through around the end of the year. 86,400 is the number of seconds in a day, which multiplied by 1000 equals the number of milliseconds in a day. I could have just used 86400000, but that’s harder to read if I want to edit this in the future.
The first string field will be a random pick from alligator, bear, cat, or dog.
The second string field is a purely random alphanumeric string of 10 characters.
The generate_series function will generate 10 rows at a time. To create 10 million, I did 100 loops through generate_series for 100,000 at a time. That seemed to be the sweet spot.
Hope this helps someone else!