Class BoundedBreakIteratorScanner
- java.lang.Object
  - java.text.BreakIterator
    - org.apache.lucene.search.uhighlight.BoundedBreakIteratorScanner

All Implemented Interfaces:
java.lang.Cloneable
public class BoundedBreakIteratorScanner extends java.text.BreakIterator

A custom break iterator that is used to find break-delimited passages bounded by a provided maximum length in the UnifiedHighlighter context. This class uses a BreakIterator to find the last break after the provided offset that would create a passage smaller than maxLen. If the BreakIterator cannot find a passage smaller than the maximum length, a secondary break iterator is used to re-split the passage at the first boundary after maximum length. This is useful to split passages created by BreakIterators like `sentence` that can create big outliers on semi-structured text.

WARNING: This break iterator is designed to work with the UnifiedHighlighter.

TODO: We should be able to create passages incrementally, starting from the offset of the first match and expanding or not depending on the offsets of subsequent matches. This is currently impossible because FieldHighlighter uses only the first matching offset to derive the start and end of each passage.
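The bounding idea described above can be sketched with plain java.text.BreakIterator instances. This is a simplified illustration, not the class's actual implementation: the class name BoundedPassageSketch and the method boundedPassage are hypothetical, and the sketch only truncates the end of an oversized sentence, whereas the real scanner may also adjust the start so that the match stays inside the passage.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of the documented class.
class BoundedPassageSketch {

    /** Returns {start, end} of the sentence around offset, cut near maxLen if too long. */
    static int[] boundedPassage(String text, int offset, int maxLen, Locale locale) {
        BreakIterator sentence = BreakIterator.getSentenceInstance(locale); // primary breaks
        BreakIterator word = BreakIterator.getWordInstance(locale);         // secondary breaks
        sentence.setText(text);
        word.setText(text);

        // Sentence boundaries surrounding the offset.
        int start = Math.max(sentence.preceding(offset + 1), 0);
        int end = sentence.following(offset);
        if (end == BreakIterator.DONE) {
            end = text.length();
        }

        // Sentence too long: re-split at the first word boundary after maxLen.
        if (end - start > maxLen) {
            int split = word.following(start + maxLen);
            if (split != BreakIterator.DONE) {
                end = Math.min(end, split);
            }
        }
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        String text = "alpha beta gamma delta epsilon zeta eta theta";
        int[] p = boundedPassage(text, 0, 10, Locale.ROOT);
        System.out.println("[" + text.substring(p[0], p[1]) + "]");
    }
}
```

Because the split happens at the first boundary after maxLen, the resulting passage can be slightly longer than maxLen; it is a soft bound in this sketch.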
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| int | current() | |
| int | first() | |
| int | following(int offset) | Can be invoked only after a call to preceding(offset+1). |
| static java.text.BreakIterator | getSentence(java.util.Locale locale, int maxLen) | Returns a BreakIterator.getSentenceInstance(Locale) bounded to maxLen. |
| java.text.CharacterIterator | getText() | |
| int | last() | |
| int | next() | |
| int | next(int n) | |
| int | preceding(int offset) | Must be called with increasing offset. |
| int | previous() | |
| void | setText(java.lang.String newText) | |
| void | setText(java.text.CharacterIterator newText) | |
Method Detail
-
getText
public java.text.CharacterIterator getText()
- Specified by:
getText in class java.text.BreakIterator
-
setText
public void setText(java.text.CharacterIterator newText)
- Specified by:
setText in class java.text.BreakIterator
-
setText
public void setText(java.lang.String newText)
- Overrides:
setText in class java.text.BreakIterator
-
preceding
public int preceding(int offset)
Must be called with increasing offset. See FieldHighlighter for usage.
- Overrides:
preceding in class java.text.BreakIterator
-
following
public int following(int offset)
Can be invoked only after a call to preceding(offset+1). See FieldHighlighter for usage.
- Specified by:
following in class java.text.BreakIterator
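The call order required by preceding and following can be sketched as below. This is an assumed, simplified picture of how a highlighter might drive the iterator, not FieldHighlighter's actual code; the class PassageCallOrderSketch and the method passagesFor are hypothetical, and a plain sentence BreakIterator stands in for the scanner so the example depends only on the JDK.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical driver, for illustration only.
class PassageCallOrderSketch {

    /** Builds one passage per group of matches; matchOffsets must be sorted ascending. */
    static List<String> passagesFor(String text, int[] matchOffsets, BreakIterator bi) {
        bi.setText(text);
        List<String> passages = new ArrayList<>();
        int passageEnd = -1;
        for (int offset : matchOffsets) {
            if (offset < passageEnd) {
                continue; // match already covered by the current passage
            }
            // Offsets only ever increase, and preceding(offset + 1)
            // is always called before following(offset).
            int start = Math.max(bi.preceding(offset + 1), 0);
            int end = bi.following(offset);
            if (end == BreakIterator.DONE) {
                end = text.length();
            }
            passages.add(text.substring(start, end));
            passageEnd = end;
        }
        return passages;
    }

    public static void main(String[] args) {
        String text = "First sentence here. Second sentence there. Third one.";
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
        for (String passage : passagesFor(text, new int[] {6, 28}, bi)) {
            System.out.println("[" + passage + "]");
        }
    }
}
```

Only the first match in each passage determines its boundaries here, which mirrors the limitation noted in the class description's TODO.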
-
getSentence
public static java.text.BreakIterator getSentence(java.util.Locale locale, int maxLen)
Returns a BreakIterator.getSentenceInstance(Locale) bounded to maxLen. Secondary boundaries are found using a BreakIterator.getWordInstance(Locale).
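To see why a word iterator is a reasonable source of secondary boundaries, the sketch below compares how the two JDK iterators segment a semi-structured string. The class BoundaryGranularitySketch, the method countSegments, and the sample text are hypothetical and serve only as illustration.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical comparison helper, for illustration only.
class BoundaryGranularitySketch {

    /** Counts the segments that the given iterator produces for text. */
    static int countSegments(BreakIterator bi, String text) {
        bi.setText(text);
        int count = 0;
        bi.first();
        while (bi.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // No sentence terminators, so the sentence iterator finds nothing to split on.
        String text = "name=alpha;size=10;color=red;shape=round;owner=bob";
        int sentences = countSegments(BreakIterator.getSentenceInstance(Locale.ROOT), text);
        int words = countSegments(BreakIterator.getWordInstance(Locale.ROOT), text);
        System.out.println("sentence segments: " + sentences);
        System.out.println("word segments: " + words);
    }
}
```

On text like this the sentence iterator yields a single oversized passage, while the word iterator offers many candidate split points near any chosen length.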
-
current
public int current()
- Specified by:
current in class java.text.BreakIterator
-
first
public int first()
- Specified by:
first in class java.text.BreakIterator
-
next
public int next()
- Specified by:
next in class java.text.BreakIterator
-
last
public int last()
- Specified by:
last in class java.text.BreakIterator
-
next
public int next(int n)
- Specified by:
next in class java.text.BreakIterator
-
previous
public int previous()
- Specified by:
previous in class java.text.BreakIterator