Class Analysis
java.lang.Object
org.variantsync.diffdetective.analysis.Analysis
Encapsulates the state and control flow during an analysis of the commit history of multiple
repositories using VariationDiffs. The repositories are processed sequentially, but the commits
of each repository can be processed in parallel.
For thread safety, each thread receives its own instance of Analysis. The getters
provide access to the current state of the analysis in one thread. Depending on the current
phase, only a subset of the state accessible via getters may be valid.
- Author:
- Paul Bittner, Benjamin Moosherr
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
- static interface Analysis.Hooks: Hooks for analyzing commits using VariationDiffs.
- static final class: The effective runtime in seconds that we have when using multithreading.
- static final class: The total number of commits in the observed history of the given repository.
-
Field Summary
Fields
Modifier and Type, Field, Description
- static final int COMMITS_TO_PROCESS_PER_THREAD_DEFAULT: Default value for commitsToProcessPerThread
- protected org.eclipse.jgit.revwalk.RevCommit currentCommit
- protected CommitDiff currentCommitDiff
- protected PatchDiff currentPatch
- protected VariationDiff<DiffLinesLabel> currentVariationDiff
- static final String EXTENSION: File extension that is used when writing AnalysisResults to disk.
- protected org.eclipse.jgit.api.Git git
- protected final List<Analysis.Hooks> hooks
- protected final Path outputDir
- protected Path outputFile
- protected final Repository repository
- protected final AnalysisResult result
- static final String TOTAL_RESULTS_FILE_NAME: File name that is used to store the analysis results for each repository.
-
Constructor Summary
Constructors
Constructor, Description
- Analysis(String taskName, List<Analysis.Hooks> hooks, Repository repository, Path outputDir): Constructs the state used during an analysis.
-
Method Summary
Modifier and Type, Method, Description
- <T extends Metadata<T>> void append(AnalysisResult.ResultKey<T> resultKey, T value): Convenience function for AnalysisResult.append(AnalysisResult.ResultKey<T>, T) on getResult().
- static <T> void exportMetadata(Path outputDir, Metadata<T> metadata): Exports the given metadata object to a file named according to TOTAL_RESULTS_FILE_NAME in the given directory.
- static <T> void exportMetadataToFile(Path outputFile, Metadata<T> metadata): Exports the given metadata object to the given file.
- static AnalysisResult forEachCommit(Supplier<Analysis> analysis): Same as forEachCommit(Supplier, int, int).
- static AnalysisResult forEachCommit(Supplier<Analysis> analysisFactory, int commitsToProcessPerThread, int nThreads): Runs the analysis for the repository given in Analysis(String, List<Analysis.Hooks>, Repository, Path).
- static void forEachRepository(List<Repository> repositoriesToAnalyze, Path outputDir, BiConsumer<Repository, Path> analyzeRepository): Runs analyzeRepository on each repository, skipping repositories where an analysis was already run.
- static AnalysisResult forSingleCommit(String commitHash, Analysis analysis): Runs the analysis for the repository given in Analysis(String, List<Analysis.Hooks>, Repository, Path) on the given commit only.
- static void forSinglePatch(String commitHash, String fileName, Analysis analysis): Runs the analysis for the repository given in Analysis(String, List<Analysis.Hooks>, Repository, Path) on the given patch only.
- <T extends Metadata<T>> T get(AnalysisResult.ResultKey<T> resultKey): Convenience getter for AnalysisResult.get(AnalysisResult.ResultKey<T>) on getResult().
- org.eclipse.jgit.revwalk.RevCommit getCurrentCommit(): The currently processed commit.
- getCurrentCommitDiff(): The currently processed commit diff.
- getCurrentPatch(): The currently processed patch.
- getCurrentVariationDiff(): The VariationDiff of the currently processed patch.
- getOutputDir(): The destination for results which are written to disk.
- getOutputFile(): The destination for results which are written to disk and specific to the currently processed commit batch.
- getRepository(): The repository this analysis is run on.
- getResult(): The results of the analysis.
- protected void processCommit
- protected void processCommitBatch(List<org.eclipse.jgit.revwalk.RevCommit> commits): Sequential analysis of all commits as one batch.
- protected void processPatch
- protected <Hook> boolean runFilterHook(ListIterator<Hook> hook, org.apache.commons.lang3.function.FailableBiFunction<Hook, Analysis, Boolean, Exception> callHook)
- protected <Hook> void runHook(ListIterator<Hook> hook, org.apache.commons.lang3.function.FailableBiConsumer<Hook, Analysis, Exception> callHook)
- protected <Hook> void runReverseHook(ListIterator<Hook> hook, org.apache.commons.lang3.function.FailableBiConsumer<Hook, Analysis, Exception> callHook)
-
Field Details
-
EXTENSION
File extension that is used when writing AnalysisResults to disk.
-
TOTAL_RESULTS_FILE_NAME
File name that is used to store the analysis results for each repository.
-
COMMITS_TO_PROCESS_PER_THREAD_DEFAULT
public static final int COMMITS_TO_PROCESS_PER_THREAD_DEFAULT
Default value for commitsToProcessPerThread
-
hooks
-
repository
-
git
protected org.eclipse.jgit.api.Git git
-
currentCommit
protected org.eclipse.jgit.revwalk.RevCommit currentCommit
-
currentCommitDiff
-
currentPatch
-
currentVariationDiff
-
outputDir
-
outputFile
-
result
-
-
Constructor Details
-
Analysis
Constructs the state used during an analysis.
- Parameters:
taskName - the name of the overall analysis task
hooks - the hooks to be run for analysis
repository - the repository to analyze
outputDir - the directory where all results are saved
-
-
Method Details
-
getRepository
The repository this analysis is run on. Always valid.
-
getCurrentCommit
public org.eclipse.jgit.revwalk.RevCommit getCurrentCommit()
The currently processed commit. Valid during the commit phase.
-
getCurrentCommitDiff
The currently processed commit diff. Valid from the call to Analysis.Hooks.onParsedCommit(Analysis) until the end of the commit phase.
-
getCurrentPatch
The currently processed patch. Valid during the patch phase.
-
getCurrentVariationDiff
The VariationDiff of the currently processed patch. Valid only during Analysis.Hooks.analyzeVariationDiff(Analysis).
-
getOutputDir
The destination for results which are written to disk. Always valid.
-
getOutputFile
The destination for results which are written to disk and specific to the currently processed commit batch. Valid during the batch phase.
-
getResult
The results of the analysis. This may be modified by any hook and should be initialized in Analysis.Hooks.initializeResults(Analysis) (e.g. by using append(AnalysisResult.ResultKey<T>, T)). Always valid.
-
get
Convenience getter for AnalysisResult.get(AnalysisResult.ResultKey<T>) on getResult(). Always valid.
-
append
Convenience function for AnalysisResult.append(AnalysisResult.ResultKey<T>, T) on getResult(). Always valid.
-
forEachRepository
public static void forEachRepository(List<Repository> repositoriesToAnalyze, Path outputDir, BiConsumer<Repository, Path> analyzeRepository)
Runs analyzeRepository on each repository, skipping repositories where an analysis was already run. This skipping mechanism doesn't distinguish between different analyses, as it only checks for the existence of TOTAL_RESULTS_FILE_NAME. Delete this file to rerun the analysis.
For each repository, a directory in outputDir is passed to analyzeRepository. The results of that repository should be written there.
- Parameters:
repositoriesToAnalyze - the repositories for which analyzeRepository is run
outputDir - the directory where all repositories will save their results
analyzeRepository - the callback which is invoked for each repository
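The skipping behaviour described for forEachRepository can be illustrated with a minimal, self-contained sketch. It does not use the DiffDetective classes: repositories are modelled as plain names, and MARKER_FILE_NAME is a hypothetical stand-in for TOTAL_RESULTS_FILE_NAME, whose actual value is not stated here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

final class SkipAnalyzedRepositoriesSketch {
    // Hypothetical marker file name (assumption, stands in for TOTAL_RESULTS_FILE_NAME).
    static final String MARKER_FILE_NAME = "totalresult.txt";

    /**
     * Calls analyze once per repository name unless the repository's output
     * directory already contains the marker file. Returns the analyzed names.
     */
    static List<String> forEachRepository(
            List<String> repositoryNames, Path outputDir, BiConsumer<String, Path> analyze) {
        List<String> analyzed = new ArrayList<>();
        for (String name : repositoryNames) {
            Path repositoryOutputDir = outputDir.resolve(name);
            // Only the existence of the file is checked, not which analysis wrote it.
            if (Files.exists(repositoryOutputDir.resolve(MARKER_FILE_NAME))) {
                continue;
            }
            analyze.accept(name, repositoryOutputDir);
            analyzed.add(name);
        }
        return analyzed;
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("analysis-output");
        // Toy analysis: writes the marker file into the repository's directory.
        BiConsumer<String, Path> analyze = (name, dir) -> {
            try {
                Files.createDirectories(dir);
                Files.writeString(dir.resolve(MARKER_FILE_NAME), "results of " + name);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
        List<String> repositories = List.of("repoA", "repoB");
        System.out.println(forEachRepository(repositories, outputDir, analyze)); // both analyzed
        System.out.println(forEachRepository(repositories, outputDir, analyze)); // both skipped
    }
}
```

Deleting the marker file of one repository in this sketch makes the next run analyze that repository again, which mirrors the rerun instruction above.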
-
forSingleCommit
Runs the analysis for the repository given in Analysis(String, List<Analysis.Hooks>, Repository, Path) on the given commit only. The Analysis.Hooks passed to Analysis(String, List<Analysis.Hooks>, Repository, Path) are the main customization point for executing different analyses.
- Parameters:
commitHash - the commit to analyze relative to its first parent
analysis - the analysis to run
-
forSinglePatch
Runs the analysis for the repository given in Analysis(String, List<Analysis.Hooks>, Repository, Path) on the given patch only. The Analysis.Hooks passed to Analysis(String, List<Analysis.Hooks>, Repository, Path) are the main customization point for executing different analyses. The list of hooks is modified temporarily: a new hook for patch filtering is inserted as the first hook for as long as the analysis runs and is removed afterwards. It is assumed that this hook remains at the same position and is not manipulated by the user.
- Parameters:
commitHash - the commit to analyze relative to its first parent
fileName - the name of the file that was edited in the given commit
analysis - the analysis to run
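The temporary hook insertion described for forSinglePatch can be sketched as follows. This is an illustration under assumptions, not the library's implementation: hooks are modelled as Predicate<String> filters over file names, and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class TemporaryFilterHookSketch {
    /** Analyzes every changed file that is accepted by all filters, in filter order. */
    static List<String> analyze(List<String> changedFiles, List<Predicate<String>> filters) {
        List<String> analyzed = new ArrayList<>();
        for (String file : changedFiles) {
            if (filters.stream().allMatch(filter -> filter.test(file))) {
                analyzed.add(file);
            }
        }
        return analyzed;
    }

    /**
     * Restricts the analysis to a single file by inserting a filter as the first
     * hook for the duration of the run. The filter list must be mutable.
     */
    static List<String> forSinglePatch(
            String fileName, List<String> changedFiles, List<Predicate<String>> filters) {
        Predicate<String> onlyRequestedFile = fileName::equals;
        filters.add(0, onlyRequestedFile);
        try {
            return analyze(changedFiles, filters);
        } finally {
            // Assumes the inserted filter is still at index 0, i.e. nobody
            // reordered the list while the analysis was running.
            filters.remove(0);
        }
    }

    public static void main(String[] args) {
        List<Predicate<String>> filters = new ArrayList<>();
        filters.add(file -> file.endsWith(".c")); // a user-provided filter
        List<String> changedFiles = List.of("a.c", "b.c", "a.h");
        System.out.println(forSinglePatch("a.c", changedFiles, filters));
        System.out.println(filters.size()); // the user's filter list is restored
    }
}
```

Removing the filter by position rather than by identity is what makes the "remains at the same position" assumption matter: if user code moved or removed hooks during the run, a different hook would be dropped.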
-
forEachCommit
Same as forEachCommit(Supplier, int, int). Defaults to COMMITS_TO_PROCESS_PER_THREAD_DEFAULT as the batch size and to a machine-dependent number of threads given by Diagnostics.getNumberOfAvailableProcessors().
-
forEachCommit
public static AnalysisResult forEachCommit(Supplier<Analysis> analysisFactory, int commitsToProcessPerThread, int nThreads)
Runs the analysis for the repository given in Analysis(String, List<Analysis.Hooks>, Repository, Path). The repository history is processed in batches of commitsToProcessPerThread on nThreads in parallel. The Analysis.Hooks passed to Analysis(String, List<Analysis.Hooks>, Repository, Path) are the main customization point for executing different analyses. By default, only the total number of commits and the total runtime with multithreading of the VariationDiff parsing are recorded.
- Parameters:
analysisFactory - creates independent (at least thread safe) instances of the analysis state
commitsToProcessPerThread - the commit batch size
nThreads - the number of parallel processed commit batches
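The batching scheme described above can be illustrated with a small sketch built only on java.util.concurrent. It is not the library's implementation: State is a hypothetical stand-in for an Analysis instance, commits are plain strings, and creating one state per batch is a simplifying assumption (the class description only says that each thread receives its own instance).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

final class BatchedHistorySketch {
    /** Hypothetical analysis state; never shared between tasks. */
    static final class State {
        int processedCommits = 0;

        /** Sequentially processes all commits of one batch. */
        void processBatch(List<String> commits) {
            processedCommits += commits.size();
        }
    }

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            batches.add(items.subList(start, Math.min(start + batchSize, items.size())));
        }
        return batches;
    }

    /** Processes the batches on nThreads worker threads and sums the per-batch results. */
    static int forEachCommit(
            List<String> commits, Supplier<State> stateFactory, int commitsPerBatch, int nThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<String> batch : partition(commits, commitsPerBatch)) {
                futures.add(pool.submit(() -> {
                    State state = stateFactory.get(); // fresh, unshared state
                    state.processBatch(batch);
                    return state.processedCommits;
                }));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // merge the partial results
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> commits = List.of("c1", "c2", "c3", "c4", "c5");
        System.out.println(partition(commits, 2).size());
        System.out.println(forEachCommit(commits, State::new, 2, 3));
    }
}
```

Passing a factory instead of a single state object is the point of the Supplier parameter: no mutable state is shared between concurrently running batches, so the tasks need no synchronization until their results are merged.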
-
processCommitBatch
protected void processCommitBatch(List<org.eclipse.jgit.revwalk.RevCommit> commits) throws Exception
Sequential analysis of all commits as one batch.
- Parameters:
commits - the commit batch to be processed
- Throws:
Exception
-
processCommit
- Throws:
Exception
-
processPatch
- Throws:
Exception
-
runHook
protected <Hook> void runHook(ListIterator<Hook> hook, org.apache.commons.lang3.function.FailableBiConsumer<Hook, Analysis, Exception> callHook) throws Exception
- Throws:
Exception
-
runFilterHook
protected <Hook> boolean runFilterHook(ListIterator<Hook> hook, org.apache.commons.lang3.function.FailableBiFunction<Hook, Analysis, Boolean, Exception> callHook) throws Exception
- Throws:
Exception
-
runReverseHook
protected <Hook> void runReverseHook(ListIterator<Hook> hook, org.apache.commons.lang3.function.FailableBiConsumer<Hook, Analysis, Exception> callHook) throws Exception
- Throws:
Exception
-
exportMetadata
Exports the given metadata object to a file named according to TOTAL_RESULTS_FILE_NAME in the given directory.
- Type Parameters:
T - Type of the metadata.
- Parameters:
outputDir - The directory into which the metadata object file should be written.
metadata - The metadata to serialize
-
exportMetadataToFile
Exports the given metadata object to the given file. Overwrites existing files.
- Type Parameters:
T - Type of the metadata.
- Parameters:
outputFile - The file to write.
metadata - The metadata to serialize
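The relationship between the two export methods, including the overwrite behaviour, might look like the sketch below. It makes several assumptions: metadata is modelled as a Map<String, String> rather than a Metadata<T> object, the "key: value" line format is invented for illustration, and RESULTS_FILE_NAME is a hypothetical stand-in for TOTAL_RESULTS_FILE_NAME.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class ExportSketch {
    // Hypothetical file name (assumption, stands in for TOTAL_RESULTS_FILE_NAME).
    static final String RESULTS_FILE_NAME = "totalresult.txt";

    /** Writes the entries to a fixed file name inside the given directory. */
    static void export(Path outputDir, Map<String, String> entries) throws IOException {
        exportToFile(outputDir.resolve(RESULTS_FILE_NAME), entries);
    }

    /** Writes the entries to the given file, replacing any previous content. */
    static void exportToFile(Path outputFile, Map<String, String> entries) throws IOException {
        String text = entries.entrySet().stream()
                .map(entry -> entry.getKey() + ": " + entry.getValue())
                .collect(Collectors.joining(System.lineSeparator()));
        // Without explicit options, Files.writeString creates the file or
        // truncates an existing one before writing.
        Files.writeString(outputFile, text);
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("export-sketch");
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("commits", "42");
        export(outputDir, entries);
        entries.put("commits", "43");
        export(outputDir, entries); // overwrites the first export
        System.out.println(Files.readString(outputDir.resolve(RESULTS_FILE_NAME)));
    }
}
```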
-