Class DatasetFactory
java.lang.Object
org.variantsync.diffdetective.datasets.DatasetFactory
The DatasetFactory loads datasets and provides default values for DiffFilters and parse options.
In particular, this class turns DatasetDescription objects into Repository objects.
- Author:
- Paul Bittner
-
Field Summary
Modifier and Type        Field                Description
private final Path       cloneDirectory
static final DiffFilter  DEFAULT_DIFF_FILTER  Default value for diff filters.
static final String      LINUX                Name of Linux.
static final String      MARLIN               Name of Marlin.
static final String      PHP                  Name of PHP.
-
Constructor Summary
Constructor                          Description
DatasetFactory(Path cloneDirectory)  Creates a new DatasetFactory that will clone any loaded datasets to the given directory.
-
Method Summary
Modifier and Type                     Method                                                                             Description
Repository                            create(DatasetDescription dataset)                                                 Loads the repository of the given dataset description.
List<Repository>                      createAll(Collection<DatasetDescription> datasets, boolean preload, boolean pull)  Runs create(org.variantsync.diffdetective.datasets.DatasetDescription) for all given dataset descriptions.
static DiffFilter                     getDefaultDiffFilterFor(String repositoryName)                                     Returns the default DiffFilter for the repository with the given name.
private static PatchDiffParseOptions  getParseOptionsFor(String repositoryName)                                          Returns the default parse options for the repository with the given name.
-
Field Details
-
MARLIN
Name of Marlin.
- See Also:
-
LINUX
Name of Linux.
- See Also:
-
PHP
Name of PHP.
- See Also:
-
DEFAULT_DIFF_FILTER
Default value for diff filters. It disallows merge commits, only considers patches that modified files, and only allows source files of C/C++ projects ("h", "hpp", "c", "cpp").
-
cloneDirectory
-
-
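The extension criterion in the DEFAULT_DIFF_FILTER description can be illustrated with a small stand-alone sketch. It is not DiffDetective's implementation: the class SourceFileExtensionCheck and its methods are hypothetical names introduced here, and the exact, case-sensitive comparison is an assumption. Only the four extensions are taken from the description above.

```java
import java.util.Set;

/** Illustrative stand-in; this class is not part of DiffDetective. */
final class SourceFileExtensionCheck {
    // The extensions named in the DEFAULT_DIFF_FILTER description.
    static final Set<String> ALLOWED_EXTENSIONS = Set.of("h", "hpp", "c", "cpp");

    /** Returns the text after the last '.' of the file name, or "" if there is none. */
    static String extensionOf(String path) {
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String fileName = path.substring(slash + 1);
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    /** Assumption: extensions are matched exactly (case-sensitive). */
    static boolean isAllowed(String path) {
        return ALLOWED_EXTENSIONS.contains(extensionOf(path));
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("Marlin/Configuration.h")); // prints true
        System.out.println(isAllowed("docs/README.md"));         // prints false
    }
}
```

The other two criteria (no merge commits, only modified files) depend on commit and patch metadata and are left out of this sketch.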
Constructor Details
-
DatasetFactory
Creates a new DatasetFactory that will clone any loaded datasets to the given directory.
- Parameters:
cloneDirectory - Directory to clone remote repositories to upon dataset loading.
-
-
Method Details
-
getDefaultDiffFilterFor
Returns the default DiffFilter for the repository with the given name. For Marlin, this applies the same DiffFilter as Stanciulescu et al. did in their ICSME paper.
- See Also:
-
getParseOptionsFor
Returns the default parse options for the repository with the given name. For Marlin, uses the Marlin.ANNOTATION_PARSER.
-
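Both getDefaultDiffFilterFor and getParseOptionsFor describe a default that is chosen by repository name, with a special case for Marlin. A minimal sketch of that pattern follows. Everything in it is a stand-in: the class DefaultsByRepositoryName is hypothetical, plain strings take the place of DiffFilter objects, the value "Marlin" for the MARLIN constant is a guess, and the fallback to a shared default for all other names is an assumption that the text above does not state.

```java
import java.util.Map;

/** Illustrative stand-in; this class is not part of DiffDetective. */
final class DefaultsByRepositoryName {
    // Assumption: the actual value of the MARLIN constant is not shown on this page.
    static final String MARLIN = "Marlin";

    // Placeholder labels standing in for real DiffFilter objects.
    static final String SHARED_DEFAULT = "shared default filter";
    static final Map<String, String> SPECIAL_CASES = Map.of(MARLIN, "Marlin-specific filter");

    /** Returns the special-case value for the name if one exists, else the shared default. */
    static String filterFor(String repositoryName) {
        return SPECIAL_CASES.getOrDefault(repositoryName, SHARED_DEFAULT);
    }

    public static void main(String[] args) {
        System.out.println(filterFor(MARLIN));   // prints Marlin-specific filter
        System.out.println(filterFor("Other"));  // prints shared default filter
    }
}
```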
create
Loads the repository of the given dataset description. This will load the repository with the DiffFilter and ParseOptions provided by getDefaultDiffFilterFor(java.lang.String) and getParseOptionsFor(java.lang.String), respectively.
- Parameters:
dataset - The dataset to load.
- Returns:
- A repository referencing the loaded dataset.
-
createAll
public List<Repository> createAll(Collection<DatasetDescription> datasets, boolean preload, boolean pull)
Runs create(org.variantsync.diffdetective.datasets.DatasetDescription) for all given dataset descriptions. Optionally preloads each repository, which means that the repository will be cloned if it is remote or unzipped if it is a zip archive. Optionally also runs git pull on all repositories to update them.
- Parameters:
datasets - Datasets to load.
preload - Set to true iff the repositories should be cloned / unzipped in case they are not locally available already.
pull - Set to true iff git pull should be run on all repositories before returning.
- Returns:
- Repository references for all dataset descriptions in the same order.
- See Also:
-
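The contract of createAll (one result per description, in input order, with two optional steps) can be sketched without the library. The sketch below is not DiffDetective's code: CreateAllSketch, Description, and Repo are hypothetical stand-ins for DatasetFactory, DatasetDescription, and Repository, and the boolean fields only record where a real implementation would clone, unzip, or run git pull.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Illustrative stand-in; none of these types are part of DiffDetective. */
final class CreateAllSketch {
    /** Hypothetical stand-in for a dataset description; only a name is modeled. */
    static final class Description {
        final String name;
        Description(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for a repository; records which optional steps ran. */
    static final class Repo {
        final String name;
        boolean preloaded;
        boolean pulled;
        Repo(String name) { this.name = name; }
    }

    static Repo create(Description description) {
        return new Repo(description.name);
    }

    static List<Repo> createAll(Collection<Description> datasets, boolean preload, boolean pull) {
        List<Repo> repos = new ArrayList<>(datasets.size());
        for (Description description : datasets) { // keeps the input's iteration order
            Repo repo = create(description);
            if (preload) repo.preloaded = true;    // real code would clone or unzip here
            if (pull) repo.pulled = true;          // real code would run git pull here
            repos.add(repo);
        }
        return repos;
    }

    public static void main(String[] args) {
        List<Description> input = List.of(new Description("A"), new Description("B"));
        for (Repo repo : createAll(input, true, false)) {
            System.out.println(repo.name + " preloaded=" + repo.preloaded + " pulled=" + repo.pulled);
        }
    }
}
```

With the real class, the corresponding call would be createAll(datasets, preload, pull) on a DatasetFactory constructed with a clone directory. How DatasetDescription objects are obtained is not covered on this page.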