DLESE Tools v1.2
java.lang.Object
  org.dlese.dpc.index.writer.FileIndexingServiceWriter
    org.dlese.dpc.index.writer.XMLFileIndexingWriter
      org.dlese.dpc.index.writer.ItemFileIndexingWriter
Abstract class for writing a Lucene Document for a
collection of item-level metadata records of a specific format (DLESE IMS, ADN-Item,
ADN-Collection, etc). The reader for this type of Document is XMLDocReader or ItemDocReader.
The Lucene Document fields that are created by
this class are (in addition to the ones listed for FileIndexingServiceWriter):
title - The title for the resource. Stored.
description - The description for the resource. Stored.
url - The URL to the resource. Stored.
Stored. Prefixed with a '0' to support wildcard searching.
metadatapfx - The metadata prefix (format) for this record, for
example 'adn' or 'oai_dc'. Stored. Prefixed with a '0' to support
wildcard searching.
accessionstatus - The accession status for this record. Stored.
Prefixed with a '0' to support wildcard searching.
annotypes - Annotation types that refer to this record.
Keyword.
annopathways - Annotation pathways that refer to this
record. Keyword.
associatedids - A list of record IDs that refer to the same
resource. Keyword.
valid - Indicates whether the record is valid [true | false]. Not
stored.
validationreport - Text describing an error in the validation of
the data for this record. Stored. Only indexed if there was a validation error
indicated by the valid field containing false.
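The '0' prefix mentioned for several fields above can be illustrated with a small sketch. The names below (`WildcardPrefixSketch`, `withPrefix`, `matchAllQuery`) are hypothetical and not part of DLESE Tools, and the query string assumes Lucene's usual `field:prefix*` syntax. The presumed rationale, which the page does not spell out, is that a constant leading character lets a trailing-wildcard query match every value in the field.

```java
// Hypothetical illustration; these helpers are not part of DLESE Tools.
final class WildcardPrefixSketch {

    /** Prepends the constant '0' marker to a field value before it is indexed. */
    static String withPrefix(String value) {
        return "0" + value;
    }

    /** Builds a prefix query string intended to match every value in the field. */
    static String matchAllQuery(String fieldName) {
        return fieldName + ":0*";
    }

    public static void main(String[] args) {
        System.out.println(withPrefix("adn"));            // 0adn
        System.out.println(matchAllQuery("metadatapfx")); // metadatapfx:0*
    }
}
```

A search for one specific value would then need to include the same prefix, for example `metadatapfx:0adn`.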
See Also: ItemDocReader, XMLDocReader, RecordDataService, FileIndexingServiceWriter

| Field Summary |
|---|
| Fields inherited from class org.dlese.dpc.index.writer.XMLFileIndexingWriter: recordDataService, vocab |
| Constructor Summary | |
|---|---|
| protected | ItemFileIndexingWriter(RecordDataService recordDataService) Creates an ItemFileIndexingWriter that indexes the given collection in field collection. |
| Method Summary | |
|---|---|
| protected void | addFields(Document newDoc, Document existingDoc, File sourceFile) Adds fields to the index that are common to all collection-related documents. |
| protected abstract void | addFrameworkFields(Document newDoc, Document existingDoc) Adds fields to the index that are unique to the given framework. |
| protected abstract void | destroy() This method is called at the conclusion of processing and may be used for tear-down. |
| protected abstract String | getAccessionStatus() Returns the accession status of this record, for example 'accessioned'. |
| protected abstract String | getCreator() Returns the item creator's full name. |
| protected abstract String | getCreatorLastName() Returns the item creator's last name. |
| Document | getDeletedDoc(Document existingDoc) Creates a Lucene Document from an existing CollectionFileIndexing Document by setting the field "deleted" to "true" and making the modtime equal to the current time. |
| protected abstract String | getDescription() Returns a description for the document being indexed. |
| abstract String | getDocType() Returns a unique document type key for this kind of record, corresponding to the format type. |
| protected abstract String | getKeywords() Returns the item's keywords. |
| abstract String | getReaderClass() Gets the fully qualified name of the concrete DocReader class that is used to read this type of Document, for example "org.dlese.dpc.index.reader.ItemDocReader". |
| protected abstract String | getTitle() Returns a title for the document being indexed. |
| protected abstract String | getUrl() Returns the URL to the resource being indexed. |
| protected abstract String | getValidationReport() Gets a report detailing any errors found in the validation of the data, or null if no error was found. |
| abstract void | init(File source, Document existingDoc) This method is called prior to processing and may be used for any necessary set-up. |

| Methods inherited from class org.dlese.dpc.index.writer.XMLFileIndexingWriter |
|---|
| addCustomFields, getCollection, getFieldContent, getFieldContent, getFieldName, getId, getOaiModtime |

| Methods inherited from class org.dlese.dpc.index.writer.FileIndexingServiceWriter |
|---|
| abortIndexing, addToAdminDefaultField, addToDefaultField, create, getExistingDoc, getFileIndexingService, getSourceDir, getSourceFile, isValidationEnabled, prtln, prtlnErr, setDebug, setDefaultFieldName, setFileIndexingService, setValidationEnabled |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Constructor Detail |
protected ItemFileIndexingWriter(RecordDataService recordDataService)
Creates an ItemFileIndexingWriter that indexes the given collection in field collection. The RecordDataService is used to get
indexable data such as recordStatus, annotations, vocab ID mappings and associated
IDs.
recordDataService - The RecordDataService used with this writer.

| Method Detail |
protected abstract String getTitle()
throws Exception
Exception - This method should throw an Exception with an appropriate error
message if an error occurs.

protected abstract String getDescription()
throws Exception
Exception - This method should throw an Exception with an appropriate error
message if an error occurs.

protected abstract String getUrl()
throws Exception
Exception - This method should throw an Exception with an appropriate error
message if an error occurs.

protected abstract String getKeywords()
throws Exception
Exception - This method should throw an Exception with an appropriate error
message if an error occurs.

protected abstract String getCreatorLastName()
throws Exception
Exception - This method should throw an Exception with an appropriate error
message if an error occurs.

protected abstract String getCreator()
throws Exception
Exception - This method should throw an Exception with an appropriate error
message if an error occurs.

protected abstract String getAccessionStatus()
throws Exception
Exception - This method should throw an Exception with an appropriate error
message if an error occurs.

protected abstract void addFrameworkFields(Document newDoc,
Document existingDoc)
throws Exception
The following Lucene Field types are available for
indexing with the Document:
Field.Text(String name, String value) -- tokenized, indexed, stored
Field.UnStored(String name, String value) -- tokenized, indexed, not stored
Field.Keyword(String name, String value) -- not tokenized, indexed, stored
Field.UnIndexed(String name, String value) -- not tokenized, not indexed, stored
Field(String name, String string, boolean store, boolean index, boolean tokenize) --
gives full control over whether the value is stored, indexed, and tokenized
Example code:
protected void addFrameworkFields(Document newDoc, Document existingDoc) throws Exception {
    // Field.Text is tokenized, indexed, and stored
    String customContent = "Some content";
    newDoc.add(Field.Text("mycustomfield", customContent));
}
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that
currently resides in the index for the given resource, or null if none was
previously present
Exception - This method should throw an Exception with an appropriate error
message if an error occurs.
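The field types listed for addFrameworkFields can be read as fixed combinations of the three boolean flags taken by the general Field constructor. The enum below is a stand-in written for illustration, not the Lucene Field class; the name `FieldKind` is hypothetical, and the flag order (store, index, tokenize) follows the constructor signature quoted in this section.

```java
// Stand-in for illustration only; this is not the Lucene Field class.
enum FieldKind {
    TEXT(true, true, true),        // tokenized, indexed, stored
    UNSTORED(false, true, true),   // tokenized, indexed, not stored
    KEYWORD(true, true, false),    // not tokenized, indexed, stored
    UNINDEXED(true, false, false); // not tokenized, not indexed, stored

    final boolean store;
    final boolean index;
    final boolean tokenize;

    FieldKind(boolean store, boolean index, boolean tokenize) {
        this.store = store;
        this.index = index;
        this.tokenize = tokenize;
    }

    /** True if a value of this kind can be found by a search. */
    boolean isSearchable() {
        return index;
    }

    /** True if the original value can be read back from a search hit. */
    boolean isRetrievable() {
        return store;
    }
}
```

Under this reading, a value that must be displayed in results but never searched would use the UnIndexed combination, and a large body of text that is searched but never displayed would use UnStored.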
public abstract String getDocType()
throws Exception
Returns a unique document type key for this kind of record, corresponding to the format type. The key is analyzed with StandardAnalyzer, so it
must be lowercase and should not contain any stop words.
getDocType in interface DocWriter
getDocType in class FileIndexingServiceWriter
Exception - This method should throw an Exception with an appropriate error
message if an error occurs.

public abstract String getReaderClass()
Gets the fully qualified name of the concrete DocReader class that is used to read this type of Document, for example
"org.dlese.dpc.index.reader.ItemDocReader".
getReaderClass in interface DocWriter
getReaderClass in class FileIndexingServiceWriter
DocReader.
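For getDocType(), the requirement that the key be lowercase and free of stop words could be checked with a helper along the following lines. This is a sketch under stated assumptions: `DocTypeKeyCheck` and `isUsableKey` are hypothetical names, and the stop-word set is a small illustrative sample, not the analyzer's actual list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not part of DLESE Tools or Lucene.
final class DocTypeKeyCheck {

    // Illustrative sample only; the analyzer's real stop-word list may differ.
    private static final Set<String> SAMPLE_STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "to", "in"));

    /** Returns true if the key is non-empty, lowercase, and has no sample stop word as a token. */
    static boolean isUsableKey(String key) {
        if (key == null || key.isEmpty()) {
            return false;
        }
        if (!key.equals(key.toLowerCase(Locale.ROOT))) {
            return false;
        }
        // Split on anything that is not a lowercase letter or digit.
        for (String token : key.split("[^a-z0-9]+")) {
            if (SAMPLE_STOP_WORDS.contains(token)) {
                return false;
            }
        }
        return true;
    }
}
```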
public abstract void init(File source,
Document existingDoc)
throws Exception
init in class FileIndexingServiceWriter
source - The source file being indexed
existingDoc - An existing Document that currently resides in the index for
the given resource, or null if none was previously present
Exception - If an error occurred during set-up.

protected abstract void destroy()
destroy in class FileIndexingServiceWriter
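The method summary describes init as being called prior to processing and destroy as being called at its conclusion. The call order can be sketched as below. Everything here is a stand-in: `LifecycleSketch`, `process`, and `recording` are hypothetical names, the real driver lives in the indexing service and is not shown on this page, and running destroy in a `finally` block is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the writer lifecycle; not the real DLESE base class.
abstract class LifecycleSketch {
    final List<String> calls = new ArrayList<>();

    abstract void init(String sourceName) throws Exception;
    abstract String getTitle() throws Exception;
    abstract void destroy();

    /** Hypothetical driver: set-up, field extraction, then tear-down. */
    final String process(String sourceName) throws Exception {
        init(sourceName);
        try {
            return getTitle();
        } finally {
            destroy(); // assumption: tear-down runs even if extraction fails
        }
    }

    /** Minimal concrete writer that records the order of the calls. */
    static LifecycleSketch recording() {
        return new LifecycleSketch() {
            @Override
            void init(String sourceName) {
                calls.add("init");
            }

            @Override
            String getTitle() {
                calls.add("getTitle");
                return "Sample title";
            }

            @Override
            void destroy() {
                calls.add("destroy");
            }
        };
    }
}
```

A subclass would typically parse the source file in init, answer the getter calls from the parsed data, and release it in destroy.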
protected abstract String getValidationReport()
throws Exception
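Gets a report detailing any errors found in the validation of the data, or null if no error was found. This method is called after the other methods (getTitle(),
addFrameworkFields(Document, Document), etc.) so that data verification can
be done during those calls, if needed.
getValidationReport in class FileIndexingServiceWriter
Exception - If an error occurs while performing the validation.
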
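Because data verification can be done during calls such as getTitle(), one possible pattern is to collect problems as the getters run and hand them back from getValidationReport(). The class below is a self-contained sketch, not a real subclass of ItemFileIndexingWriter; `ValidationSketch`, the `rawTitle` field, and the message text are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer fragment; names and messages are illustrative only.
final class ValidationSketch {
    private final String rawTitle;
    private final List<String> problems = new ArrayList<>();

    ValidationSketch(String rawTitle) {
        this.rawTitle = rawTitle;
    }

    /** Extracts the title and records a problem if it is missing. */
    String getTitle() {
        if (rawTitle == null || rawTitle.trim().isEmpty()) {
            problems.add("title is missing");
            return "";
        }
        return rawTitle.trim();
    }

    /** Returns the collected problems, or null if none were found. */
    String getValidationReport() {
        return problems.isEmpty() ? null : String.join("; ", problems);
    }
}
```

Returning null for a clean record matches the documented contract that the report is null when no error was found.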
protected final void addFields(Document newDoc,
Document existingDoc,
File sourceFile)
throws Exception
addFields in class XMLFileIndexingWriter
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that currently resides in the index for
the given resource, or null if none was previously present
sourceFile - The source file that is being indexed.
Exception - If an error occurs

public Document getDeletedDoc(Document existingDoc)
throws Throwable
Creates a Lucene Document from an existing
CollectionFileIndexing Document by setting the field "deleted" to "true" and making
the modtime equal to the current time.
getDeletedDoc in class FileIndexingServiceWriter
existingDoc - An existing FileIndexingService Document that currently resides
in the index for the given resource.
Throwable - Thrown if an error occurs
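The behavior described for getDeletedDoc can be sketched with a plain Map standing in for a Lucene Document. This is an illustration under assumptions, not the actual implementation: `DeletedDocSketch` is a hypothetical name, the "modtime" field name and its millisecond string format are guesses, and only the "deleted" field set to "true" is taken from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A Map stands in for a Lucene Document; the real method works on Document objects.
final class DeletedDocSketch {

    /** Copies the existing fields, marks the copy as deleted, and refreshes its modtime. */
    static Map<String, String> getDeletedDoc(Map<String, String> existingDoc, long nowMillis) {
        Map<String, String> deleted = new LinkedHashMap<>(existingDoc);
        deleted.put("deleted", "true");
        deleted.put("modtime", Long.toString(nowMillis)); // assumed field name and format
        return deleted;
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a caller could supply `System.currentTimeMillis()`.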