• #TESTDATA_NINJA
Picture of author Morten Egan

Generating randomized incremental values

- March 14, 2017

How to generate random sequencing

In the previous entries, I explained the basics of testdata_ninja and then I explained how to create foreign key relations between your test data and existing tables. This time I will show how you can create data that are random but always increasing, either to create time patterns or to create unique id columns for primary keys. I will extend the format that I have already used in the previous entries, so the code will be familiar. If we look at the output of the test_days generator we have created, it currently looks like this:

SQL> select * from table(tdg_test_days.test_days);

NAME                                               BIRTHDAY            MY_PK_REF
-------------------------------------------------- ------------------ ----------
Arianna Turner                                     02-MAR-93 00:24:58          1
Sofie Dam                                          17-MAY-91 17:21:09          1
Camila King                                        27-OCT-62 18:16:56         10
Malik Henriksen                                    24-NOV-65 06:23:51         10
.... [more rows] ....
There is no clear column with possible unique values, so I will create a new column where the data generator is an "incrementor". Like all the other different data fields, it is in the 3 field definition of the block that we define how to generate the data. So far the format looks like this:
name#varchar2(50)#person_random.r_name@birthday#date#person_random.r_birthday@my_pk_ref#number#£t_people¤person_num_pk¤simple¤5
So let us add a column in the beginning, that increments randomly but is unique nonetheless:
people_pk#number#^numiterate~1~3¤5
and here is the explanation of the fields, which are separated by the "~" character:
  • incrementor type (numiterate for numbers and datiterate for dates)
  • start from (number or date to start increment from)
  • increment range (for numbers min and max separated by the "¤" character and dates interval type (seconds, minutes etc), min and max separated by the "¤" character for example: seconds¤1¤5)
So if we now take a look at the full format in the code, it would look like this:
declare
  the_format varchar2(4000) := 'people_pk#number#^numiterate~1~3¤5@name#varchar2(50)#person_random.r_name@birthday#date#person_random.r_birthday@my_pk_ref#number#£t_people¤person_num_pk¤simple¤5';
begin
  testdata_ninja.generator_create('test_days', the_format);
end;
/
And would create a result like the following:
1	Sigurd Andreasen	20-NOV-63 12:33:49	1
6	Ella Madsen	30-JAN-80 07:09:39	1
9	Alexandra White	16-JUL-53 16:32:07	1
13	Benjamin Holst	08-APR-88 13:39:38	1
17	Lea Jepsen	24-MAY-88 17:21:01	1
21	Eli Morgan	07-AUG-86 15:54:27	10
26	Oskar Bruun	20-JUL-69 22:40:56	10
31	Jens Friis	26-APR-63 16:32:21	10
35	Camila Hall	05-DEC-78 23:30:19	10
  • #TESTDATA_NINJA
Picture of author Morten Egan

Test data and foreign keys

- March 13, 2017

Making the right connections

In my last entry on testdata_ninja I showed the basics on how to create your own custom test data generators.

Once you have created the simple ones you probably want to extend it a little bit, and create test tables that has relations. So you want to create random data, but random data that actually can join to another table. The way to do this, is with a special data generator construct in the testdata_ninja format. If you read the old entry, you can see that testdata_ninja format is a series of columns seperated by the "@" sign. Each column has at least 3 parts (divided by the "#" character), where the third is the data generator.

So in the previous example I wrote about we used functions from the random_ninja package. You can however, reference existing tables instead. First we want to create a master table that we can reference in our new generator. We will use the basic data generators that are already part of the testdata_ninja package to create this:

create table t_people as select * from table(testdata_generator.people(100));
This statement will create a table with a hundred rows in. This table is what we will use to create our other test data table. One of the rows is called person_num_pk, and that is the column with the relation to our other table. The way we define that is with a special character as the first character in our generator field. So in my last entry I defined the test table with the following format:
name#varchar2(50)#person_random.r_name@birthday#date#person_random.r_birthday
So let us say that I wanted to add another column here, that linked to my primary key in the new table I just created. If I wanted to do that I simply create a third column definition like the following:
name#varchar2(50)#person_random.r_name@birthday#date#person_random.r_birthday@my_pk_ref#number#£t_people¤person_num_pk¤simple¤5
Notice the "£" sign. This is what specifies this generator as a reference to another table and not a function. The fields in the reference generator (divided with the "¤" character) are the following:

  • table name that we are referencing (t_people)
  • column in that table we are referencing (person_num_pk)
  • distribution rule (simple or range)
  • distribution count. If simple just one number, and every ref will this many rows in the new generator or range and you specify 2,5 then every reference will have between 2 and 5 rows in the new generator
So to create this new table we would simply run:
declare
  the_format varchar2(4000) := 'name#varchar2(50)#person_random.r_name@birthday#date#person_random.r_birthday@my_pk_ref#number#£t_people¤person_num_pk¤simple¤5';
begin
  testdata_ninja.generator_create('test_days', the_format);
end;
/
and we can now select from the generator and see that each value in the my_pk_ref column will be a real reference back to the t_people table and there will be 5 "child" rows per real row from the t_people table:
select
  a.first_name
  , b.birthday
from
  t_people a
  , (select name, birthday, my_pk_ref from table(tdg_test_days.test_days)) b
where
  a.person_num_pk = b.my_pk_ref;

FIRST_NAME                                                                                           BIRTHDAY         
---------------------------------------------------------------------------------------------------- ------------------
Parker                                                                                               14-SEP-64 13:35:56
Parker                                                                                               18-APR-62 05:02:35
Parker                                                                                               05-SEP-93 12:35:50
Parker                                                                                               16-AUG-82 21:19:14
Parker                                                                                               11-MAR-85 15:51:54
Bryson                                                                                               03-SEP-84 19:56:20
Bryson                                                                                               08-JUL-96 14:33:07
Bryson                                                                                               30-DEC-62 06:28:59
.......... [more rows]
  • #TESTDATA_NINJA
Picture of author Morten Egan

Creating real test data - Fast and Efficiently

- March 12, 2017

Building on RANDOM_NINJA to create real data

When I build RANDOM_NINJA I knew already one of things I wanted to use that library for. Building good valid and life-like test data has and is always a really big problem. The base for good test data is good and valid looking random data. Without that, most tests are not really valid, as data can be clustered wrongly, indexes act strange because data does not mimic real life and relations and constraints are hard to define or validate on data that is created using most available randomizing packages. That is why it was important to me that RANDOM_NINJA would be able to create random data from as many different domains as possible. As of today it can create more than 185 different data points from as many as 14 different data domains.

But having good random data is only half of it. You still need something that can define and create those tables. You also need something that can still maintain relations between those test tables, and make sure that foreign key distributions are real as well. So I created TESTDATA_NINJA. This package has some generic generators to create simple tables of people, populations which are true according to UN demographics data, CDR records and credit card transactions. The real function in this package is the custom generation procedure. This procedure can parse a string representation of what your data looks like and from that create pipelined functiones that can create thousands of test rows extremely fast and efficiently.

The testdata_ninja.generator_create procedure takes 2 arguments. The name of the new generator and the format of the test data. Below is a short description on how the format looks like.

  • #PROMISES_NINJA
Picture of author Morten Egan

plsql promises - promise complete solution

- February 25, 2017

racing for the finish line, all is done.

Finally. I got the 2 last methods completed. promise.all and promise.race is finished. So now the plsql promises library is feature complete. promise.all method takes a list of promises, built with the help of the promises_ninja package, and once the entire list of promises is fulfilled the top promise is set to fulfilled. Here is an example:

-- Test all method
declare
test_promise promise := promise();
a_promise1 promise;
a_promise2 promise;
a_promise3 promise;
all_promises promises_ninja.promise_list_type;
begin
  a_promise1 := promise('p_multiplier', 10);
  promises_ninja.build_promise_list(all_promises, a_promise1);
  a_promise2 := promise('p_multiplier', 20);
  promises_ninja.build_promise_list(all_promises, a_promise2);
  a_promise3 := promise('p_multiplier', 30);
  promises_ninja.build_promise_list(all_promises, a_promise3);
  test_promise.all_p(promises_ninja.convert_promise_list(all_promises));
  dbms_lock.sleep(5);
  test_promise.check_and_set_value();
  dbms_output.put_line('State is: ' || test_promise.state);
end;
/

The output is the list of the promises with the fulfilled values. So you can loop through the output and display individual values like this:

declare
test_promise promise := promise();
a_promise1 promise;
a_promise2 promise;
a_promise3 promise;
all_promises promises_ninja.promise_list_type;
promise_list_o promises_list_obj;
promise_val promise;
begin
  a_promise1 := promise('p_multiplier', 10);
  promises_ninja.build_promise_list(all_promises, a_promise1);
  a_promise2 := promise('p_multiplier', 20);
  promises_ninja.build_promise_list(all_promises, a_promise2);
  a_promise3 := promise('p_multiplier', 30);
  promises_ninja.build_promise_list(all_promises, a_promise3);
  test_promise.all_p(promises_ninja.convert_promise_list(all_promises));
  dbms_lock.sleep(5);
  test_promise.check_and_set_value();
  dbms_output.put_line('State is: ' || test_promise.state);
  if test_promise.val.getObject(promise_list_o) = dbms_types.success then
    -- We converted the object successfully
    for i in 1..promise_list_o.promise_list.count loop
      if promise_list_o.promise_list(i).getObject(promise_val) = dbms_types.success then
        -- We have an individual promise output value
        dbms_output.put_line('Value of promise is: ' || promise_val.getanyvalue);
      end if;
    end loop;
  end if;
end;
/

Same code and technique can be used for the race method. The only difference is that the output in the race call is a normal promise and not a list of promises, as only the first promise to complete is returned.

  • #RANDOM_NINJA
Picture of author Morten Egan

Updated random_ninja and testdata_ninja packages

- February 18, 2017

Installing the random package manually

A couple of people have requested that I explain how to install the entire random_ninja and testdata_ninja packages manually. I have created a gist here with the full and complete order of the files: Full install list. So download random_ninja zipfile and testdata_ninja zipfile and follow the order of the gist to install all the code.

Once you have installed the packages you have the full random library available, which you can read in full detail about here: Post #1, Post #2 and Post #3. For testdata package you can see a quick demo here and here.

Release 3 of random_ninja is coming up and this is the list of new functionality that will be fully supported (HINT: The code is already in the github source):

  • company_random
    • r_companyname - generate a random company name.
    • r_industry - generate a random industry type.
    • r_companyid - generate a random country tax id.
    • r_employees - generate a random employee count.
    • r_revenue - generate a random revenue number.
  • phone_random
    • r_brand - generate a random brand name.
  • computer_random
    • r_error - generate a random error string.
  • util_random
    • ru_permute - Permute a string: Morten -> Metron.
    • ru_scramble - Scramble a string: Morten -> Diwnah.
    • ru_obfuscate - Obfuscate a string: Morten -> FasNituha.
  • transport_random
    • r_vehicle_registration - generate random country specific vehicle registration plate.
    • r_icao - generate a random country specific aircraft registration code.
    • r_imo - generate a random vessel registration code.

Also there are 3 new locale supported for names, addresses and more. These countries are Denmark, China and Dhubai. Performance improvements for Markov text generation are coming as well as new SWIFT and FIX financial random generation.