random palette knife work

Generate Random Data

Share this post on:

I had to create a 10 million row table for testing recently, and put together a query to generate random data for it.

INSERT INTO my_table
(id,
mydatetime,
string1,
string2)

SELECT
(random() * 70 + 10)::int,
TIMESTAMP '2024-01-01 00:00:00.000000' + interval '1 millisecond' * (random() * 86400 * 1000 * 365),
(array['alligator','bear','cat','dog'])[(random() * 3 + 1)::int],
substr(md5(random()::text), 1, 10)

FROM generate_series(1, 10);

The id field is just a random integer in this example, but you’d probably use an identity column.

The timestamp field will have a random timestamp from January 1, 2024 through around the end of the year. 86,400 is the number of seconds in a day, which multiplied by 1000 equals the number of milliseconds in a day. I could have just used 86400000, but that’s harder to read if I want to edit this in the future.

The first string field will be a random pick from alligator, bear, cat, or dog.

The second string field is a purely random alphanumeric string of 10 characters.

The generate_series function will generate 10 rows at a time. To create 10 million, I did 100 loops through generate_series for 100,000 at a time. That seemed to be the sweet spot.

Hope this helps someone else!

Author: Valerie Parham-Thompson

View all posts by Valerie Parham-Thompson >

Leave a Reply

Your email address will not be published. Required fields are marked *