Sultan Alhusain

Assistant Professor of
Computer Science

Software Design Pattern Datasets

These datasets aim to pave the way for data-driven, machine-learning-based solutions to the problem of design pattern (DP) recognition.

The datasets and other resources to be made available on this website include:

  1. The 539 open-source Java systems from which the datasets have been constructed.
  2. The raw output of the 6 DP recognition tools that have been used as "voters" in the process of constructing the datasets.
  3. A separate dataset file (XML) for the instances of each DP and DP role.
    • The instances in each file will be organized based on the number of "votes" they have received.
  4. The tool developed and used to generate training datasets from the datasets of DP instances.
  5. Ready-made training datasets for DPs and DP roles.
    • A set of test datasets prepared based on the P-MARt benchmark repository (13/04/21 version) will also be provided.
  6. The tool developed and used to perform the recognition process based on any set of trained classifiers.
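
As an illustration of the vote-based organization mentioned in items 2 and 3, the following is a minimal sketch, not the tool used to build the datasets. It assumes each recognition tool's output has already been parsed into a set of (pattern, class name) pairs; the function name, the tool names, and the example classes are all hypothetical.

```python
from collections import defaultdict


def group_by_votes(tool_outputs):
    """Group candidate DP instances by how many tools reported them.

    tool_outputs: dict mapping a tool name to a set of
    (pattern, class_name) pairs. This input shape is an assumption
    made for illustration, not the format of the raw tool output.
    """
    voters = defaultdict(set)
    for tool, instances in tool_outputs.items():
        for instance in instances:
            voters[instance].add(tool)

    by_count = defaultdict(list)
    for instance, tools in voters.items():
        by_count[len(tools)].append(instance)

    # Highest vote count first; instances sorted for stable output.
    return {
        count: sorted(items)
        for count, items in sorted(by_count.items(), reverse=True)
    }


# Hypothetical example with three voters.
example = {
    "tool_a": {("Singleton", "org.example.Config"), ("Observer", "org.example.Bus")},
    "tool_b": {("Singleton", "org.example.Config")},
    "tool_c": {("Singleton", "org.example.Config"), ("Observer", "org.example.Bus")},
}
grouped = group_by_votes(example)
```

Here the Singleton candidate is reported by all three hypothetical tools and the Observer candidate by two, so they fall into separate vote groups.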

The datasets were developed as part of the work described in this PhD thesis, and they will be uploaded here by the end of August 2022!