DLESE Tools v1.2 |
java.lang.Object
  org.dlese.dpc.index.writer.FileIndexingServiceWriter
    org.dlese.dpc.index.writer.XMLFileIndexingWriter
Creates a Lucene Document from any XML file by stripping the XML tags to extract and index the content. The reader for this type of Document is XMLDocReader.
The Lucene Document fields that are created by this class are (in addition to the ones listed for FileIndexingServiceWriter):
collection
- The collection associated with this resource.
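As a rough illustration of what "stripping the XML tags" can mean, the sketch below uses the JDK's DOM parser to collect the character data of a document and discard the markup. The class and method names are hypothetical and are not part of DLESE Tools; the actual writer may extract content differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper (an assumption, not a DLESE class): reduces XML to plain text.
class XmlTextSketch {

    // Returns the text content of the XML with tags removed and whitespace collapsed.
    static String stripTags(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Note: this is the W3C DOM document, not a Lucene Document.
            org.w3c.dom.Document dom = builder.parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            collect(dom.getDocumentElement(), out);
            return out.toString().trim().replaceAll("\\s+", " ");
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Appends every text node below the given node, separated by spaces so that
    // words from adjacent elements do not run together.
    private static void collect(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            out.append(node.getNodeValue()).append(' ');
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        String xml = "<record><title>Plate tectonics</title><desc>An  overview</desc></record>";
        System.out.println(stripTags(xml)); // prints "Plate tectonics An overview"
    }
}
```

Attribute values are ignored in this sketch; only element text is kept.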
See Also: FileIndexingService, XMLDocReader
Field Summary | |
---|---|
protected RecordDataService |
recordDataService
Serves indexible data for a given record such as recordStatus, annotations, vocab ID mappings and associated IDs. |
protected MetadataVocab |
vocab
DESCRIPTION |
Constructor Summary | |
---|---|
XMLFileIndexingWriter(RecordDataService recordDataService)
Constructor for the XMLFileIndexingWriter. |
Method Summary | |
---|---|
protected void |
addCustomFields(Document newDoc,
Document existingDoc,
File sourceFile)
Adds the full content of the XML to the default search field. |
protected abstract void |
addFields(Document newDoc,
Document existingDoc,
File sourceFile)
Adds additional fields that are unique to the document format being indexed. |
protected abstract String |
getCollection()
Returns unique collection keys for the item being indexed, separated by spaces. |
protected String |
getFieldContent(String[] values,
String useVocabMapping)
Gets the vocab encoded keys for the given values, separated by the '+' symbol. |
protected String |
getFieldContent(String value,
String useVocabMapping)
Gets the encoded vocab key for the given content. |
protected String |
getFieldName(String fieldString)
Gets the fieldName attribute of the XMLFileIndexingWriter object |
protected abstract String |
getId()
Returns unique IDs for the item being indexed, one for each collection that catalogs the resource, separated by spaces. |
static String |
getOaiModtime(File sourceFile,
Document existingDoc)
Gets the oaiModtime for the given File or Document. |
Methods inherited from class org.dlese.dpc.index.writer.FileIndexingServiceWriter |
---|
abortIndexing, addToAdminDefaultField, addToDefaultField, create, destroy, getDeletedDoc, getDocType, getExistingDoc, getFileIndexingService, getReaderClass, getSourceDir, getSourceFile, getValidationReport, init, isValidationEnabled, prtln, prtlnErr, setDebug, setDefaultFieldName, setFileIndexingService, setValidationEnabled |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
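The abstract methods summarized above (getId and getCollection) both return a single String in which multiple values are separated by spaces. The following is a minimal sketch of that contract only. WriterContractSketch is a stand-in for the abstract class, and the IDs and collection keys are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for the abstract contract described above; not the real XMLFileIndexingWriter.
abstract class WriterContractSketch {
    protected abstract String getId() throws Exception;
    protected abstract String getCollection() throws Exception;
}

// Hypothetical writer for a resource that is cataloged by two collections.
class TwoCollectionWriterSketch extends WriterContractSketch {
    // Invented sample values: one ID per cataloging collection.
    private final List<String> ids = Arrays.asList("DEMO-000-001", "DEMO-000-002");
    private final List<String> collections = Arrays.asList("demoA", "demoB");

    @Override
    protected String getId() {
        // One ID per collection, separated by spaces.
        return String.join(" ", ids);
    }

    @Override
    protected String getCollection() {
        // Unique collection keys, separated by spaces.
        return String.join(" ", collections);
    }
}
```

A real subclass would derive these values from the file being indexed and would also implement addFields.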
Field Detail |
protected MetadataVocab vocab
protected RecordDataService recordDataService
Constructor Detail |
public XMLFileIndexingWriter(RecordDataService recordDataService)
recordDataService
- Used to get data about the file.
Method Detail |
protected abstract String getId() throws Exception
Exception
- This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract String getCollection() throws Exception
Exception
- This method should throw an Exception with an appropriate error message if an error occurs.
protected abstract void addFields(Document newDoc, Document existingDoc, File sourceFile) throws Exception
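Adds additional fields that are unique to the document format being indexed. Use the add method of the Lucene Document class to add a Field.
The following Lucene Field types are available for indexing with the Document: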
Field.Text(String name, String value) -- tokenized, indexed, stored
Field.UnStored(String name, String value) -- tokenized, indexed, not stored
Field.Keyword(String name, String value) -- not tokenized, indexed, stored
Field.UnIndexed(String name, String value) -- not tokenized, not indexed, stored
Field(String name, String string, boolean store, boolean index, boolean tokenize) -- gives full control over whether the value is stored, indexed, and tokenized
Example code:
protected void addFields(Document newDoc, Document existingDoc, File sourceFile) throws Exception {
    // Add a tokenized, indexed, stored field to the Document being built
    String customContent = "Some content";
    newDoc.add(Field.Text("mycustomfield", customContent));
}
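The Field types listed above differ only in three flags: whether the value is stored, indexed, and tokenized. The sketch below restates those combinations as a plain Java enum to make the trade-offs explicit. FieldKindSketch is a hypothetical stand-in that does not use or mirror the Lucene classes themselves.

```java
// Hypothetical stand-in that restates the flag combinations listed above; it does not use Lucene.
enum FieldKindSketch {
    TEXT(true, true, true),        // Field.Text: stored, indexed, tokenized
    UNSTORED(false, true, true),   // Field.UnStored: not stored, indexed, tokenized
    KEYWORD(true, true, false),    // Field.Keyword: stored, indexed, not tokenized
    UNINDEXED(true, false, false); // Field.UnIndexed: stored, not indexed, not tokenized

    final boolean stored;
    final boolean indexed;
    final boolean tokenized;

    FieldKindSketch(boolean stored, boolean indexed, boolean tokenized) {
        this.stored = stored;
        this.indexed = indexed;
        this.tokenized = tokenized;
    }

    // A field can be matched by a query only if it is indexed.
    boolean searchable() {
        return indexed;
    }

    // A field value can be read back from a search result only if it is stored.
    boolean retrievable() {
        return stored;
    }

    // Returns the named kind with exactly these flags, or null when only the
    // general five-argument Field constructor could express the combination.
    static FieldKindSketch match(boolean store, boolean index, boolean tokenize) {
        for (FieldKindSketch kind : values()) {
            if (kind.stored == store && kind.indexed == index && kind.tokenized == tokenize) {
                return kind;
            }
        }
        return null;
    }
}
```

For example, an identifier that must match exactly and be shown in results fits the KEYWORD combination, while large body text that only needs to be searchable fits UNSTORED.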
newDoc
- The new Document that is being created for this resource
existingDoc
- An existing Document that currently resides in the index for the given resource, or null if none was previously present
sourceFile
- The sourceFile that is being indexed
Exception
- This method should throw an Exception with an appropriate error message if an error occurs.
protected void addCustomFields(Document newDoc, Document existingDoc, File sourceFile) throws Exception
Overrides:
addCustomFields in class FileIndexingServiceWriter
newDoc
- The new Document that is being created for this resource
existingDoc
- An existing Document that currently resides in the index for the given resource, or null if none was previously present
sourceFile
- The source file that is being indexed
Exception
- This method should throw an Exception with an appropriate error message if an error occurs.
protected String getFieldContent(String[] values, String useVocabMapping) throws Exception
values
- The values to encode.
useVocabMapping
- The mapping to use, for example "contentStandards".
Exception
- If an error occurs.
protected String getFieldContent(String value, String useVocabMapping) throws Exception
value
- The value to encode.
useVocabMapping
- The vocab mapping to use, for example "contentStandard".
Exception
- If an error occurs.
protected String getFieldName(String fieldString) throws Exception
fieldString
- DESCRIPTION
Exception
- DESCRIPTION
public static final String getOaiModtime(File sourceFile, Document existingDoc)
sourceFile
- The source file
existingDoc
- The existing Doc
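The two getFieldContent methods described earlier map values to encoded vocab keys, and the array form joins the keys with the '+' symbol. The sketch below shows that behavior under stated assumptions: VocabKeySketch, its in-memory mapping table, and the sample keys are all hypothetical, whereas the real class obtains its keys from the MetadataVocab instance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for vocab key encoding; not the DLESE implementation.
class VocabKeySketch {
    // Assumed lookup table keyed by "mapping|value"; real keys come from MetadataVocab.
    private final Map<String, String> keys = new HashMap<>();

    // Registers an encoded key for a value under the given vocab mapping.
    void addMapping(String useVocabMapping, String value, String key) {
        keys.put(useVocabMapping + "|" + value, key);
    }

    // Returns the encoded key for a single value.
    String encode(String value, String useVocabMapping) {
        String key = keys.get(useVocabMapping + "|" + value);
        if (key == null) {
            throw new IllegalArgumentException("No vocab key for '" + value + "'");
        }
        return key;
    }

    // Returns the encoded keys for all values, separated by the '+' symbol.
    String encode(String[] values, String useVocabMapping) {
        List<String> encoded = new ArrayList<>();
        for (String value : values) {
            encoded.add(encode(value, useVocabMapping));
        }
        return String.join("+", encoded);
    }
}
```

This sketch raises an unchecked exception for an unknown value, which is a simplification; the documented methods declare a checked Exception.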