java.lang.Object

org.variantsync.diffdetective.datasets.DatasetFactory

public class DatasetFactory extends Object

The DatasetFactory loads datasets and provides default values for DiffFilters and parse options. In particular, this class turns DatasetDescription objects into Repository objects.

Author: Paul Bittner

Field Summary

Fields

Modifier and Type

Field

Description

private final Path

cloneDirectory

static final DiffFilter

DEFAULT_DIFF_FILTER

Default value for diff filters.

static final String

LINUX

Name of Linux.

static final String

MARLIN

Name of Marlin.

static final String

PHP

Name of PHP.
Constructor Summary

Constructors

Constructor

Description

DatasetFactory(Path cloneDirectory)

Creates a new DatasetFactory that will clone any loaded datasets to the given directory.
Method Summary

Modifier and Type

Method

Description

Repository

create(DatasetDescription dataset)

Loads the repository of the given dataset description.

List<Repository>

createAll(Collection<DatasetDescription> datasets, boolean preload, boolean pull)

Runs create(org.variantsync.diffdetective.datasets.DatasetDescription) for all given dataset descriptions.

static DiffFilter

getDefaultDiffFilterFor(String repositoryName)

Returns the default DiffFilter for the repository with the given name.

private static PatchDiffParseOptions

getParseOptionsFor(String repositoryName)

Returns the default parse options for the repository with the given name.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- MARLIN
  
  public static final String MARLIN
  
  Name of Marlin.
  See Also:
  
  Constant Field Values
- LINUX
  
  public static final String LINUX
  
  Name of Linux.
  See Also:
  
  Constant Field Values
- PHP
  
  public static final String PHP
  
  Name of PHP.
  See Also:
  
  Constant Field Values
- DEFAULT_DIFF_FILTER
  
  public static final DiffFilter DEFAULT_DIFF_FILTER
  
  Default value for diff filters. It disallows merge commits, only considers patches that modified files, and only allows source files of C/C++ projects ("h", "hpp", "c", "cpp").
- cloneDirectory
  
  private final Path cloneDirectory
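
The three conditions listed for DEFAULT_DIFF_FILTER above can be illustrated with a small, self-contained sketch. It does not use DiffDetective's DiffFilter API: the class DefaultFilterSketch, the enum ChangeKind, and both method names are hypothetical stand-ins invented for this example. It also assumes that extensions are matched exactly and case-sensitively, which the documentation does not state.

```java
import java.util.Set;

/** Illustrative stand-in only; this is NOT DiffDetective's DiffFilter. */
final class DefaultFilterSketch {
    /** Hypothetical change kinds for this sketch. */
    enum ChangeKind { ADD, MODIFY, DELETE, RENAME }

    /** The extensions named in the documentation of DEFAULT_DIFF_FILTER. */
    private static final Set<String> ALLOWED_EXTENSIONS = Set.of("h", "hpp", "c", "cpp");

    private DefaultFilterSketch() {}

    /** Merge commits are rejected; every other commit is accepted. */
    static boolean acceptsCommit(boolean isMergeCommit) {
        return !isMergeCommit;
    }

    /** Accepts a patch only if it modifies a file with one of the allowed extensions. */
    static boolean acceptsPatch(ChangeKind kind, String fileName) {
        if (kind != ChangeKind.MODIFY) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        // Assumption: exact, case-sensitive extension match.
        return ALLOWED_EXTENSIONS.contains(fileName.substring(dot + 1));
    }
}
```

Under these assumptions, a modification of src/main.cpp in a non-merge commit passes both checks, whereas a newly added .cpp file or a modified README.md is rejected.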
Constructor Details
- DatasetFactory
  
  public DatasetFactory(Path cloneDirectory)
  
  Creates a new DatasetFactory that will clone any loaded datasets to the given directory.
  
  Parameters:
  
  cloneDirectory - Directory to clone remote repositories to upon dataset loading.
Method Details
- getDefaultDiffFilterFor
  
  public static DiffFilter getDefaultDiffFilterFor(String repositoryName)
  
  Returns the default DiffFilter for the repository with the given name. For Marlin, this applies the same DiffFilter as Stanciulescu et al. did in their ICSME paper.
  See Also:
  
  StanciulescuMarlin.DIFF_FILTER
- getParseOptionsFor
  
  private static PatchDiffParseOptions getParseOptionsFor(String repositoryName)
  
  Returns the default parse options for the repository with the given name. For Marlin, this uses Marlin.ANNOTATION_PARSER.
- create
  
  public Repository create(DatasetDescription dataset)
  
  Loads the repository of the given dataset description. This will load the repository with the DiffFilter and ParseOptions provided by getDefaultDiffFilterFor(java.lang.String) and getParseOptionsFor(java.lang.String), respectively.
  
  Parameters:
  
  dataset - The dataset to load.
  
  Returns:
  
  A repository referencing the loaded dataset.
- createAll
  
  public List<Repository> createAll(Collection<DatasetDescription> datasets, boolean preload, boolean pull)
  
  Runs create(org.variantsync.diffdetective.datasets.DatasetDescription) for all given dataset descriptions. Optionally preloads each repository, which means that the repository is cloned if it is remote or unzipped if it is a zip archive. Optionally also runs git pull on all repositories to update them.
  Parameters:
  
  datasets - Datasets to load.
  
  preload - Set to true iff the repositories should be cloned / unzipped in case they are not locally available already.
  
  pull - Set to true iff git pull should be run on all repositories before returning.
  
  Returns:
  
  Repository references for all dataset descriptions in the same order.
  
  See Also:
  
  GitLoader
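
The contract described for createAll (one repository per description, results in input order, optional preload and pull) can be sketched without the library. Everything in the block is a hypothetical stand-in: the Repo interface, RecordingRepo, and the generic createAll helper are invented for illustration and are not DiffDetective types. The order in which preloading and pulling happen is also an assumption; the documentation only says that pulling occurs before the method returns.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

/** Illustrative stand-in only; this is NOT DatasetFactory. */
final class CreateAllSketch {
    /** Hypothetical repository handle with the two optional operations. */
    interface Repo {
        void preload(); // stands in for "clone if remote, unzip if zipped"
        void pull();    // stands in for "git pull"
    }

    /** Test double that records which operations were invoked. */
    static final class RecordingRepo implements Repo {
        final String name;
        private final List<String> log;

        RecordingRepo(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override public void preload() { log.add("preload " + name); }
        @Override public void pull() { log.add("pull " + name); }
    }

    private CreateAllSketch() {}

    /** Creates one repository per description, preserving iteration order. */
    static <D, R extends Repo> List<R> createAll(
            Collection<D> datasets, Function<D, R> create, boolean preload, boolean pull) {
        List<R> repos = new ArrayList<>(datasets.size());
        for (D dataset : datasets) {
            R repo = create.apply(dataset);
            if (preload) {
                repo.preload();
            }
            repos.add(repo);
        }
        if (pull) {
            // Assumption: pulling happens after all repositories were created.
            for (R repo : repos) {
                repo.pull();
            }
        }
        return repos;
    }
}
```

Passing an ordered collection such as a List keeps the returned repositories aligned with the input descriptions, so the i-th result corresponds to the i-th description.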
