jInfer
java.lang.Object
  cz.cuni.mff.ksi.jinfer.twostep.TwoStepSimplifier
public class TwoStepSimplifier
TwoStepSimplifier performs simplification in two steps. First, it looks up a suitable clusterer submodule and passes it the whole initialGrammar. The clusterer is responsible for clustering the elements properly. Each cluster is then treated as one and the same element, with multiple instances in the input files.
For every cluster of elements, the clusterProcessor submodule is called. Given the clusterer and the list of observed positive examples (grammar = list of elements), the cluster processor is expected to produce one Element instance that holds a proper regular expression representing the content model of the element's children.
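The two steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Observed`, `cluster`, and `processCluster` are hypothetical stand-ins, not the real jInfer `Element`, clusterer, or cluster processor. Clustering by element name and building a plain alternation of the observed child sequences are assumed strategies chosen for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins only: none of these types are the real jInfer classes.
class TwoStepSketch {

    /** One observed element instance: its name and its children's names, in order. */
    static final class Observed {
        final String name;
        final List<String> children;

        Observed(String name, List<String> children) {
            this.name = name;
            this.children = children;
        }
    }

    /** Step 1 (assumed strategy): cluster observed instances by element name. */
    static Map<String, List<Observed>> cluster(List<Observed> initialGrammar) {
        Map<String, List<Observed>> clusters = new LinkedHashMap<>();
        for (Observed e : initialGrammar) {
            clusters.computeIfAbsent(e.name, k -> new ArrayList<>()).add(e);
        }
        return clusters;
    }

    /** Step 2 (assumed strategy): merge one cluster into a single content model,
     *  here a naive alternation of the distinct child sequences that were observed. */
    static String processCluster(String name, List<Observed> members) {
        Set<String> alternatives = new LinkedHashSet<>();
        for (Observed e : members) {
            alternatives.add("(" + String.join(", ", e.children) + ")");
        }
        return name + " := " + String.join(" | ", alternatives);
    }

    /** Runs both steps: one resulting definition per cluster. */
    static List<String> simplify(List<Observed> initialGrammar) {
        List<String> result = new ArrayList<>();
        cluster(initialGrammar)
            .forEach((name, members) -> result.add(processCluster(name, members)));
        return result;
    }

    public static void main(String[] args) {
        List<Observed> grammar = List.of(
            new Observed("person", List.of("name", "email")),
            new Observed("person", List.of("name")),
            new Observed("name", List.of()));
        // Prints:
        // person := (name, email) | (name)
        // name := ()
        simplify(grammar).forEach(System.out::println);
    }
}
```

A real cluster processor would generalize the alternation into a more compact regular expression; the sketch stops at the point where each cluster has been reduced to one definition.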
The attributes of the elements in a cluster are processed separately afterwards. Currently only simple processing is done: each attribute is marked as required or optional. In the future this will be moved into a separate attribute processor submodule.
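One plausible reading of the required/optional rule is sketched below. It assumes that an attribute counts as required when it occurs on every element instance in the cluster and as optional otherwise; the documentation does not spell out the rule, so this is an assumption. The `AttributeSketch` class and `classify` method are hypothetical and not part of jInfer.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, not part of jInfer.
class AttributeSketch {

    /**
     * Each set holds the attribute names seen on one element instance of a cluster.
     * Returns attribute name -> true if required (assumed: present on every instance),
     * false if optional.
     */
    static Map<String, Boolean> classify(List<Set<String>> instances) {
        // Count on how many instances each attribute occurs.
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> attrs : instances) {
            for (String a : attrs) {
                counts.merge(a, 1, Integer::sum);
            }
        }
        // Required only if the count equals the number of instances.
        Map<String, Boolean> required = new TreeMap<>();
        counts.forEach((a, n) -> required.put(a, n == instances.size()));
        return required;
    }

    public static void main(String[] args) {
        List<Set<String>> cluster = List.of(Set.of("id", "lang"), Set.of("id"));
        System.out.println(classify(cluster)); // prints {id=true, lang=false}
    }
}
```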
The regular expressions produced are further refined in the RegularExpressionCleaner submodule.
Constructor Summary | |
---|---|
TwoStepSimplifier(ClustererFactory clustererFactory, ClusterProcessorFactory clusterProcessorFactory, RegularExpressionCleanerFactory regularExpressionCleanerFactory, ContentInferrerFactory contentInfererFactory) Creates a new simplifier and passes all submodule factories to it. |
Method Summary | |
---|---|
List<Element> | simplify(List<Element> initialGrammar) Does the main job of the simplifier: simplifies the given grammar. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public TwoStepSimplifier(ClustererFactory clustererFactory, ClusterProcessorFactory clusterProcessorFactory, RegularExpressionCleanerFactory regularExpressionCleanerFactory, ContentInferrerFactory contentInfererFactory)
Parameters:
clustererFactory - factory of the clusterer submodule
clusterProcessorFactory - factory of the ClusterProcessor submodule
regularExpressionCleanerFactory - factory of the cleaner submodule
Method Detail |
---|
public List<Element> simplify(List<Element> initialGrammar) throws InterruptedException
Parameters:
initialGrammar - grammar obtained from source files, in simple form (concatenations only).
Throws:
InterruptedException
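Because simplify declares InterruptedException, callers need some policy for interruption. The sketch below shows one common Java idiom: catch the exception, restore the thread's interrupt flag, and fall back to the unsimplified input. The `Simplifier` interface is a hypothetical stand-in with the same method shape as simplify, not a jInfer type, and returning the input unchanged is only one possible fallback.

```java
import java.util.List;

// Hypothetical wrapper; Simplifier only mirrors the shape of simplify().
class InterruptSketch {

    /** Stand-in for any object exposing simplify(List<E>) throws InterruptedException. */
    interface Simplifier<E> {
        List<E> simplify(List<E> initialGrammar) throws InterruptedException;
    }

    /** Runs the simplifier; on interruption restores the flag and returns the input as-is. */
    static <E> List<E> simplifyOrKeep(Simplifier<E> simplifier, List<E> grammar) {
        try {
            return simplifier.simplify(grammar);
        } catch (InterruptedException e) {
            // Re-set the flag so code further up the call stack can still see the interrupt.
            Thread.currentThread().interrupt();
            return grammar;
        }
    }

    public static void main(String[] args) {
        // Toy simplifier for the demo: keeps only the first element.
        Simplifier<String> firstOnly = g -> g.subList(0, 1);
        System.out.println(simplifyOrKeep(firstOnly, List.of("a", "b"))); // prints [a]
    }
}
```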