DLESE Tools v1.2 |
java.lang.Object
  org.dlese.dpc.index.writer.FileIndexingServiceWriter
    org.dlese.dpc.index.writer.XMLFileIndexingWriter
      org.dlese.dpc.index.writer.ItemFileIndexingWriter
Abstract class for writing a Lucene Document for a collection of item-level metadata records of a specific format (DLESE IMS, ADN-Item, ADN-Collection, etc.). The reader for this type of Document is XMLDocReader or ItemDocReader.
The Lucene Document fields created by this class, in addition to the ones listed for FileIndexingServiceWriter, are:
title
- The title for the resource. Stored.
description
- The description for the resource. Stored.
url
- The URL to the resource. Stored.
Stored. Prefixed with a '0' to support wildcard searching.
metadatapfx
- The metadata prefix (format) for this record, for example 'adn' or 'oai_dc'. Stored. Prefixed with a '0' to support wildcard searching.
accessionstatus
- The accession status for this record. Stored. Prefixed with a '0' to support wildcard searching.
annotypes
- Annotation types that refer to this record. Keyword.
annopathways
- Annotation pathways that refer to this record. Keyword.
associatedids
- A list of record IDs that refer to the same resource. Keyword.
valid
- Indicates whether the record is valid [true | false]. Not stored.
validationreport
- Text describing an error in the validation of the data for this record. Stored. Only indexed if there was a validation error, indicated by the valid field containing false.
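The '0' prefix noted for several fields above is presumably a workaround for query parsers that reject a leading wildcard: when every value starts with the same character, a query such as `0*` can match all values. The following is a minimal sketch of that idea in plain Java with no Lucene dependency. The class `PrefixedField`, its methods, and the trailing-wildcard matcher are hypothetical illustrations and are not part of DLESE Tools or Lucene.

```java
import java.util.List;
import java.util.stream.Collectors;

public class PrefixedField {

    /** Encodes a value as the field list describes: a '0' is placed in front. */
    static String encode(String value) {
        return "0" + value;
    }

    /** Hypothetical matcher that supports only a single trailing '*' wildcard. */
    static boolean matches(String pattern, String indexedValue) {
        if (pattern.startsWith("*")) {
            // Assumption: mimics a parser that refuses leading wildcards.
            throw new IllegalArgumentException("leading wildcard not allowed");
        }
        if (pattern.endsWith("*")) {
            String prefix = pattern.substring(0, pattern.length() - 1);
            return indexedValue.startsWith(prefix);
        }
        return indexedValue.equals(pattern);
    }

    /** Returns the indexed values that match the pattern. */
    static List<String> search(String pattern, List<String> indexedValues) {
        return indexedValues.stream()
                .filter(v -> matches(pattern, v))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> indexed = List.of(encode("adn"), encode("oai_dc"));
        System.out.println(search("0*", indexed));   // matches every value
        System.out.println(search("0adn", indexed)); // matches one value exactly
    }
}
```

Because the prefix is constant, an exact-match query has to include it as well, which is why the second search uses `0adn` rather than `adn`.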
See Also:
ItemDocReader, XMLDocReader, RecordDataService, FileIndexingServiceWriter
Field Summary |
---|
Fields inherited from class org.dlese.dpc.index.writer.XMLFileIndexingWriter |
---|
recordDataService, vocab |
Constructor Summary | |
---|---|
protected | ItemFileIndexingWriter(RecordDataService recordDataService): Creates an ItemFileIndexingWriter that indexes the given collection in field collection. |
Method Summary | |
---|---|
protected void | addFields(Document newDoc, Document existingDoc, File sourceFile): Adds fields to the index that are common to all collection-related documents. |
protected abstract void | addFrameworkFields(Document newDoc, Document existingDoc): Adds fields to the index that are unique to the given framework. |
protected abstract void | destroy(): This method is called at the conclusion of processing and may be used for tear-down. |
protected abstract String | getAccessionStatus(): Returns the accession status of this record, for example 'accessioned'. |
protected abstract String | getCreator(): Returns the item creator's full name. |
protected abstract String | getCreatorLastName(): Returns the item creator's last name. |
Document | getDeletedDoc(Document existingDoc): Creates a Lucene Document from an existing CollectionFileIndexing Document by setting the field "deleted" to "true" and making the modtime equal to the current time. |
protected abstract String | getDescription(): Returns a description for the document being indexed. |
abstract String | getDocType(): Returns a unique document type key for this kind of record, corresponding to the format type. |
protected abstract String | getKeywords(): Returns the item's keywords. |
abstract String | getReaderClass(): Gets the fully qualified name of the concrete DocReader class that is used to read this type of Document, for example "org.dlese.dpc.index.reader.ItemDocReader". |
protected abstract String | getTitle(): Returns a title for the document being indexed. |
protected abstract String | getUrl(): Returns the URL to the resource being indexed. |
protected abstract String | getValidationReport(): Gets a report detailing any errors found in the validation of the data, or null if no error was found. |
abstract void | init(File source, Document existingDoc): This method is called prior to processing and may be used for any necessary set-up. |
Methods inherited from class org.dlese.dpc.index.writer.XMLFileIndexingWriter |
---|
addCustomFields, getCollection, getFieldContent, getFieldContent, getFieldName, getId, getOaiModtime |
Methods inherited from class org.dlese.dpc.index.writer.FileIndexingServiceWriter |
---|
abortIndexing, addToAdminDefaultField, addToDefaultField, create, getExistingDoc, getFileIndexingService, getSourceDir, getSourceFile, isValidationEnabled, prtln, prtlnErr, setDebug, setDefaultFieldName, setFileIndexingService, setValidationEnabled |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
protected ItemFileIndexingWriter(RecordDataService recordDataService)
Creates an ItemFileIndexingWriter that indexes the given collection in field collection. The RecordDataService is used to get indexable data such as recordStatus, annotations, vocab ID mappings and associated IDs.
Parameters:
recordDataService - The RecordDataService used with this writer.

Method Detail |
protected abstract String getTitle() throws Exception
Returns a title for the document being indexed.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract String getDescription() throws Exception
Returns a description for the document being indexed.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract String getUrl() throws Exception
Returns the URL to the resource being indexed.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract String getKeywords() throws Exception
Returns the item's keywords.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract String getCreatorLastName() throws Exception
Returns the item creator's last name.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract String getCreator() throws Exception
Returns the item creator's full name.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract String getAccessionStatus() throws Exception
Returns the accession status of this record, for example 'accessioned'.
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

protected abstract void addFrameworkFields(Document newDoc, Document existingDoc) throws Exception
Adds fields to the index that are unique to the given framework.
The following Lucene Field types are available for indexing with the Document:
Field.Text(String name, String value) -- tokenized, indexed, stored
Field.UnStored(String name, String value) -- tokenized, indexed, not stored
Field.Keyword(String name, String value) -- not tokenized, indexed, stored
Field.UnIndexed(String name, String value) -- not tokenized, not indexed, stored
Field(String name, String string, boolean store, boolean index, boolean tokenize) -- gives full control over storing, indexing, and tokenizing
Example code:
    protected void addFrameworkFields(Document newDoc, Document existingDoc) throws Exception {
        // Add a tokenized, indexed, and stored field to the new Document
        String customContent = "Some content";
        newDoc.add(Field.Text("mycustomefield", customContent));
    }
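As one way of reading the list of Field types above, each factory method can be viewed as a fixed choice of the three boolean flags taken by the general Field constructor. The sketch below models that mapping with a stand-in enum so it runs without Lucene on the classpath. `FieldKinds`, `FieldKind`, and `describe` are hypothetical names used only for illustration and do not exist in Lucene or DLESE Tools.

```java
public class FieldKinds {

    /** Hypothetical stand-in: one constant per factory method in the list above. */
    enum FieldKind {
        TEXT(true, true, true),        // Field.Text: tokenized, indexed, stored
        UNSTORED(false, true, true),   // Field.UnStored: tokenized, indexed, not stored
        KEYWORD(true, true, false),    // Field.Keyword: not tokenized, indexed, stored
        UNINDEXED(true, false, false); // Field.UnIndexed: not tokenized, not indexed, stored

        final boolean store;
        final boolean index;
        final boolean tokenize;

        FieldKind(boolean store, boolean index, boolean tokenize) {
            this.store = store;
            this.index = index;
            this.tokenize = tokenize;
        }
    }

    /** Formats the flags in the same order as the general constructor's arguments. */
    static String describe(FieldKind kind) {
        return kind + ": store=" + kind.store
                + ", index=" + kind.index
                + ", tokenize=" + kind.tokenize;
    }

    public static void main(String[] args) {
        for (FieldKind kind : FieldKind.values()) {
            System.out.println(describe(kind));
        }
    }
}
```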
Parameters:
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

public abstract String getDocType() throws Exception
Returns a unique document type key for this kind of record, corresponding to the format type. The key is processed by the StandardAnalyzer, so it must be lowercase and should not contain any stop words.
getDocType in interface DocWriter
getDocType in class FileIndexingServiceWriter
Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.

public abstract String getReaderClass()
Gets the fully qualified name of the concrete DocReader class that is used to read this type of Document, for example "org.dlese.dpc.index.reader.ItemDocReader".
getReaderClass in interface DocWriter
getReaderClass in class FileIndexingServiceWriter
Returns:
The fully qualified class name of the DocReader.

public abstract void init(File source, Document existingDoc) throws Exception
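This method is called prior to processing and may be used for any necessary set-up.
init in class FileIndexingServiceWriter
Parameters:
source - The source file being indexed
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
Throws:
Exception - If an error occurred during set-up.

protected abstract void destroy()
This method is called at the conclusion of processing and may be used for tear-down.
destroy in class FileIndexingServiceWriter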
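
The set-up and tear-down hooks suggest a template-method arrangement: the framework calls init, asks the subclass for its field values, and finishes with destroy. The sketch below illustrates that shape with stand-in types only. `SketchWriter`, `DemoWriter`, and the `create` method shown here are hypothetical, a `Map` plays the role of the Lucene Document, and the exact call order and the use of `finally` are assumptions rather than the documented behavior of FileIndexingServiceWriter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WriterLifecycleSketch {

    /** Hypothetical stand-in for the writer; a Map stands in for a Lucene Document. */
    abstract static class SketchWriter {
        abstract void init(String source) throws Exception;
        abstract String getTitle() throws Exception;
        abstract String getValidationReport() throws Exception;
        abstract void destroy();

        /** Assumed call order: init, field getters, validation report, destroy. */
        final Map<String, String> create(String source) throws Exception {
            Map<String, String> doc = new LinkedHashMap<>();
            init(source);
            try {
                doc.put("title", getTitle());
                String report = getValidationReport();
                doc.put("valid", report == null ? "true" : "false");
                if (report != null) {
                    // The report is only added when validation failed.
                    doc.put("validationreport", report);
                }
            } finally {
                destroy(); // assumption: tear-down runs even if a getter throws
            }
            return doc;
        }
    }

    /** Minimal concrete writer used to exercise the sketch. */
    static class DemoWriter extends SketchWriter {
        private String source;
        boolean destroyed = false;

        void init(String source) { this.source = source; }
        String getTitle() { return "Title of " + source; }
        String getValidationReport() { return source.isEmpty() ? "Empty source" : null; }
        void destroy() { destroyed = true; }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new DemoWriter().create("record-001.xml"));
        System.out.println(new DemoWriter().create(""));
    }
}
```

In this stand-in the valid entry is simply placed in the map. The distinction between stored and unstored fields described for the real index is not modeled.
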
protected abstract String getValidationReport() throws Exception
Gets a report detailing any errors found in the validation of the data, or null if no error was found. This method is called after the other methods (getTitle(), addFrameworkFields(Document, Document), etc.) so that data verification can be done during those calls, if needed.
getValidationReport in class FileIndexingServiceWriter
Throws:
Exception - If an error occurs while performing the validation.

protected final void addFields(Document newDoc, Document existingDoc, File sourceFile) throws Exception
Adds fields to the index that are common to all collection-related documents.
addFields in class XMLFileIndexingWriter
Parameters:
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
sourceFile - The source file that is being indexed.
Throws:
Exception - If an error occurs

public Document getDeletedDoc(Document existingDoc) throws Throwable
Creates a Lucene Document from an existing CollectionFileIndexing Document by setting the field "deleted" to "true" and making the modtime equal to the current time.
getDeletedDoc in class FileIndexingServiceWriter
Parameters:
existingDoc - An existing FileIndexingService Document that currently resides in the index for the given resource.
Throws:
Throwable - Thrown if an error occurs
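
The transformation described for getDeletedDoc can be pictured with a small stand-in in which a `Map` takes the place of the Lucene Document. The field name "modtime", its millisecond-string encoding, and the decision to copy every existing field are assumptions made for illustration. Only the "deleted" field and its "true" value come from the description above, and `DeletedDocSketch` is a hypothetical class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DeletedDocSketch {

    /**
     * Returns a copy of the existing document marked as deleted.
     * The clock value is passed in so the result is deterministic.
     */
    static Map<String, String> getDeletedDoc(Map<String, String> existingDoc, long nowMillis) {
        // Copy first so the document already in the index is left untouched.
        Map<String, String> deleted = new LinkedHashMap<>(existingDoc);
        deleted.put("deleted", "true");
        // Assumed field name and encoding for the modification time.
        deleted.put("modtime", Long.toString(nowMillis));
        return deleted;
    }

    public static void main(String[] args) {
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("title", "Sample record");
        existing.put("modtime", "1000");

        Map<String, String> deleted = getDeletedDoc(existing, System.currentTimeMillis());
        System.out.println(deleted);
    }
}
```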