One project every month - making stuff better ...

Generating randomized incremental values

In the previous entries, I explained the basics of testdata_ninja and then I explained how to create foreign key relations between your test data and existing tables. This time I will show how you can create data that are random but always increasing, either to create time patterns or to create unique id columns for primary keys. I will extend the format that I have already used in the previous entries, so the code will be familiar. If we look at the output of the test_days generator we have created, it currently looks like this:

SQL> select * from table(tdg_test_days.test_days);

NAME                                               BIRTHDAY            MY_PK_REF
-------------------------------------------------- ------------------ ----------
Arianna Turner                                     02-MAR-93 00:24:58          1
Sofie Dam                                          17-MAY-91 17:21:09          1
Camila King                                        27-OCT-62 18:16:56         10
Malik Henriksen                                    24-NOV-65 06:23:51         10
.... [more rows] ....
There is no clear column with possible unique values, so I will create a new column where the data generator is an "incrementor". Like all the other different data fields, it is in the 3 field definition of the block that we define how to generate the data. So far the format looks like this:

So let us add a column in the beginning, that increments randomly but is unique nonetheless:

and here is the explanation of the fields, which are separated by the "~" character:
  • incrementor type (numiterate for numbers and datiterate for dates)
  • start from (number or date to start increment from)
  • increment range (for numbers min and max separated by the "¤" character and dates interval type (seconds, minutes etc), min and max separated by the "¤" character for example: seconds¤1¤5)
So if we now take a look at the full format in the code, it would look like this:

  the_format varchar2(4000) := 'people_pk#number#^numiterate~1~3¤5@name#varchar2(50)#person_random.r_name@birthday#date#person_random.r_birthday@my_pk_ref#number#£t_people¤person_num_pk¤simple¤5';
  testdata_ninja.generator_create('test_days', the_format);
And would create a result like the following:

1	Sigurd Andreasen	20-NOV-63 12:33:49	1
6	Ella Madsen	30-JAN-80 07:09:39	1
9	Alexandra White	16-JUL-53 16:32:07	1
13	Benjamin Holst	08-APR-88 13:39:38	1
17	Lea Jepsen	24-MAY-88 17:21:01	1
21	Eli Morgan	07-AUG-86 15:54:27	10
26	Oskar Bruun	20-JUL-69 22:40:56	10
31	Jens Friis	26-APR-63 16:32:21	10
35	Camila Hall	05-DEC-78 23:30:19	10

Another example could be if we wanted to create a stream of error messages. For that we would want a column with a time stamp that was increasing. So we can create a format like the following:

so to create the generator, we would run the following piece of plsql code:

  p_format varchar2(4000) := 'log_time#date#^datiterate~01022017-12:23:34~seconds¤1¤3@log_error#varchar2(500)#computer_random.r_error';
  testdata_ninja.generator_create('ERRORLOG', p_format);

We can then use the generator like this to get a list of error messages in chronological order:

SQL> select * from table(tdg_errorlog.errorlog);

LOG_TIME           LOG_ERROR                              
------------------ --------------------------------------
01-FEB-17 12:26:05 NameError: name nkbfj is not defined  
01-FEB-17 12:26:07 NameError: name vmfeu is not defined  
01-FEB-17 12:26:10 ORA-06949                             
01-FEB-17 12:26:12 errorcode=-345                         
01-FEB-17 12:26:14 NameError: name cftvl is not defined  
01-FEB-17 12:26:17 Msg 53938                             
01-FEB-17 12:26:19 errorcode=-767                        
01-FEB-17 12:26:20 NameError: name ocjzm is not defined  
01-FEB-17 12:26:21 errorcode=-249        
So as you can see we can very easily create these incremental but still random column values as well.

Next entry will be on how you can "back-reference" column values for later input arguments, and how you can get different output of the same data such as CSV, JSON and more.

Tagged in : RANDOM_NINJA