One project every month - making stuff better ...

Test data and foreign keys

In my last entry on testdata_ninja I showed the basics on how to create your own custom test data generators.

Once you have created the simple ones you probably want to extend it a little bit, and create test tables that has relations. So you want to create random data, but random data that actually can join to another table. The way to do this, is with a special data generator construct in the testdata_ninja format. If you read the old entry, you can see that testdata_ninja format is a series of columns seperated by the "@" sign. Each column has at least 3 parts (divided by the "#" character), where the third is the data generator.

So in the previous example I wrote about we used functions from the random_ninja package. You can however, reference existing tables instead. First we want to create a master table that we can reference in our new generator. We will use the basic data generators that are already part of the testdata_ninja package to create this:

create table t_people as select * from table(testdata_generator.people(100));
This statement will create a table with a hundred rows in. This table is what we will use to create our other test data table. One of the rows is called person_num_pk, and that is the column with the relation to our other table. The way we define that is with a special character as the first character in our generator field. So in my last entry I defined the test table with the following format:

So let us say that I wanted to add another column here, that linked to my primary key in the new table I just created. If I wanted to do that I simply create a third column definition like the following:

Notice the "£" sign. This is what specifies this generator as a reference to another table and not a function. The fields in the reference generator (divided with the "¤" character) are the following:

  • table name that we are referencing (t_people)
  • column in that table we are referencing (person_num_pk)
  • distribution rule (simple or range)
  • distribution count. If simple just one number, and every ref will this many rows in the new generator or range and you specify 2,5 then every reference will have between 2 and 5 rows in the new generator
So to create this new table we would simply run:

  the_format varchar2(4000) := 'name#varchar2(50)#person_random.r_name@birthday#date#person_random.r_birthday@my_pk_ref#number#£t_people¤person_num_pk¤simple¤5';
  testdata_ninja.generator_create('test_days', the_format);
and we can now select from the generator and see that each value in the my_pk_ref column will be a real reference back to the t_people table and there will be 5 "child" rows per real row from the t_people table:

  , b.birthday
  t_people a
  , (select name, birthday, my_pk_ref from table(tdg_test_days.test_days)) b
  a.person_num_pk = b.my_pk_ref;

FIRST_NAME                                                                                           BIRTHDAY         
---------------------------------------------------------------------------------------------------- ------------------
Parker                                                                                               14-SEP-64 13:35:56
Parker                                                                                               18-APR-62 05:02:35
Parker                                                                                               05-SEP-93 12:35:50
Parker                                                                                               16-AUG-82 21:19:14
Parker                                                                                               11-MAR-85 15:51:54
Bryson                                                                                               03-SEP-84 19:56:20
Bryson                                                                                               08-JUL-96 14:33:07
Bryson                                                                                               30-DEC-62 06:28:59
.......... [more rows]

If you want more variance in the number of child rows, you can use the range definition instead of the simple definition, like the following:

  the_format varchar2(4000) := 'name#varchar2(50)#person_random.r_name@birthday#date#person_random.r_birthday@my_pk_ref#number#£t_people¤person_num_pk¤range¤1,7';
  testdata_ninja.generator_create('test_days', the_format);
This way the generator will create a random number of "child" rows (between 1 and 7) for each of the rows in the parent table. You can add as many reference data generators as you want to your data generators, as long as you know that for every reference table, there is a required scan of that table to fetch and generate the key values whenever you use the generator.

Tagged in : RANDOM_NINJA