Data generation

Do you need a large amount of synthetic test data quickly, for example for a performance test? We can generate a set of test data for you on demand, in any desired volume. All we need is a sample record.

What is data generation?

Generating synthetic data is different from anonymizing or pseudonymizing data. Data generation involves creating new, fictitious test data. This fictitious data must meet the requirements you set in all respects, such as field length, data type, and mutual relationships between fields.
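To make this concrete, here is a minimal sketch of the general idea: deriving fictitious records from a sample record so that field lengths and types are preserved. The field names, constraints, and helper functions are assumptions for illustration only, not the actual DataFactory format or implementation.

```python
import random
import string

# A hypothetical sample record; field names and value ranges are
# assumptions for illustration, not EntrD's actual input format.
SAMPLE = {"customer_id": "C-10482", "postcode": "1017 AB"}

def synth_like(sample: str) -> str:
    """Generate a new value with the same layout as the sample:
    digits stay digits, letters stay letters, punctuation is kept."""
    out = []
    for ch in sample:
        if ch.isdigit():
            out.append(random.choice(string.digits))
        elif ch.isalpha():
            out.append(random.choice(string.ascii_uppercase))
        else:
            out.append(ch)
    return "".join(out)

def generate(sample: dict, n: int) -> list[dict]:
    """Produce n fictitious records matching the sample's
    field lengths and types."""
    return [{field: synth_like(value) for field, value in sample.items()}
            for _ in range(n)]

rows = generate(SAMPLE, 1000)
```

Each generated value here copies the sample's structure character by character; a real generator would additionally enforce cross-field relationships (for example, that a postcode and city belong together).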

Why choose synthetic data?

Sometimes it is desirable to have (large amounts of) synthetic test data, for example for a performance test that requires a wide variety of test cases. Or consider a new product that requires test cases that do not yet exist in the production environment. In those situations, you have to generate test data.

Our solution? The DataFactory

EntrD has developed a solution that allows us to provide a generated dataset on demand. This dataset is compiled specifically for you and fully meets your wishes in terms of both layout and content. Based on a sample record you provide, we can generate any volume of data for you. This can be done quickly: as a rule, it does not take more than a few days. You will receive the generated test set electronically and can use it immediately in any test environment.

Would you like to receive more information? Please contact us!

Product specifications DataFactory

General characteristics

  • Easy to implement
  • Quick to roll out (on average 2 to 6 weeks)
  • Low operating costs
  • Anonymized data cannot be traced back to the original (in line with the requirements of the GDPR)
  • Speeds up the development cycle
  • Aligns with agile working
  • Removes the need to hold risk capital
  • Prevents the impact of data leaks (fines and reputational damage)
  • Anonymized data can be widely used (Test, Analysis, Training, Demo, Support, Outsourcing, etc.)

Functional properties

  • Anonymizes consistently over time without using a ‘translation table’
  • Anonymizes an entire application chain consistently
  • Maintains relevant relationships (if desired)
  • Keeps the geographical distribution of relationships intact (if desired)
  • Leaves ages unchanged (if desired)
  • Generated data adheres to data-specific rules
  • Data quality remains unchanged
  • Anonymized data is easy to distinguish from production data
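A common way to achieve consistent anonymization without storing a translation table is deterministic masking with a keyed hash: the same input always maps to the same fictitious value, across runs and across systems sharing the key. The sketch below is a generic illustration of that technique, not EntrD's implementation; the key and the replacement list are assumptions.

```python
import hmac
import hashlib

# Assumed secret key; in practice this would be managed securely.
SECRET_KEY = b"example-masking-key"

# Hypothetical pool of fictitious replacement names.
FIRST_NAMES = ["Anna", "Bram", "Carla", "Daan", "Eva", "Finn"]

def mask_name(real_name: str) -> str:
    """Map a real name to a fictitious one deterministically.
    Because the mapping is a keyed hash of the input, repeated
    anonymization runs stay consistent over time and across an
    application chain, with no translation table to store."""
    digest = hmac.new(SECRET_KEY, real_name.encode(), hashlib.sha256).digest()
    index = int.from_bytes(digest[:4], "big") % len(FIRST_NAMES)
    return FIRST_NAMES[index]
```

The trade-off of this approach is that consistency depends on keeping the key stable and secret: rotating the key changes every mapping at once.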

Technical properties

  • Completely database independent
  • Easily scalable
  • High performance
  • Cross-platform
  • Minimal management effort
  • Easy integration with CI/CD pipeline
  • Supports large data sets
  • Anonymization is done completely in-memory
  • Ability to add your own masking rules
  • Comes standard with more than 10 options to anonymize data
