Sunday, July 4, 2010

PDFViewer using ICEFaces, ICEPDF and Lucene

PDFViewer
This tutorial sheds some light on a few ICEFaces UI components: the collapsible panel, file upload, and input/output text. We will also see how to use the effect attribute. Using the ICEPDF and Lucene APIs, we build a searchable document and see how easy it is to extract text from a PDF and search it by term. Finally, we wrap everything into a JSF web application.
Our application looks like this:
Let's begin.
Before delving into the UI components, I would like to describe the use case. Imagine a government worker who, for security reasons, is not allowed to install a PDF reader. He explains his requirements to the IT team: he wants to upload a PDF file to a web server and read it there, and he also wants to be able to search the uploaded file. The IT team scratch their heads and realize that they can implement file uploading, but they don't know how to extract text from a PDF or how to provide search. One very talented programmer on the team learns that ICESoft also offers a good, reliable package named ICEPDF that is especially suited to that goal. He also suggests that using the Lucene API will put search within reach.

Well, it's time to turn the requirements into a real JSF application. As we can see, at least two pages are needed: one for file upload and one for the viewer. The first question is how to navigate between pages. ICEFaces has many navigational UI controls (tabs, collapsible panels or accordions, and more). My favorite is the collapsible panel. It consists of two parts, a header and a content area; clicking the header collapses the content area so that it is hidden, or expands it so that it is visible. The code for file uploading looks like this:

<ice:panelCollapsible id="upload" expanded="true">
    <f:facet name="header">
        <ice:panelGroup>
            <ice:outputText id="imageHeader" value="upload pdf files" />
        </ice:panelGroup>
    </f:facet>
    <ice:panelGroup style="width: 100%">
        <ice:inputFile id="inputFileName"
            autoUpload="#{inputFileController.autoUpload}"
            actionListener="#{inputFileController.uploadFile}"
            progressListener="#{inputFileController.fileUploadProgress}" />
        <ice:outputProgress value="#{inputFileController.fileProgress}" />
        <ice:inputHidden id="pathLucene" value="#{inputFileController.pathLuceneIndex}" />
        <ice:inputHidden id="totalPages" value="#{inputFileController.totalPages}" />
    </ice:panelGroup>
</ice:panelCollapsible>


After the file is uploaded, we extract its text using the ICEPDF API:

PageText pageText = document.getPageViewText(pageNumber);
Once the text is extracted, we build the Lucene index. Think about a PDF document: it contains pages, pages contain text, and text is made up of words. Projected onto Lucene's world, this gives a document that wraps fields (pages), where each field wraps terms (words). The index itself is a searchable data structure (an inverted index that maps terms to the documents containing them), and it works pretty well.
Here is a piece of code for building the index:

public void buildIndex() {
    // One Lucene document for the whole PDF; each page becomes one field.
    Document document = new Document();
    String txt = null;
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
    this.indexDirectory = getIndexPath();
    IndexWriter writer;
    try {
        writer = new IndexWriter(FSDirectory.open(new File(getIndexDirectory())), analyzer, IndexWriter.MaxFieldLength.UNLIMITED);
        /* Creates Lucene document */
        // NOTE: the loop condition was lost in the original listing;
        // totalPages is an assumed name for the number of pages in the PDF.
        for (int currentPage = 0; currentPage < totalPages; currentPage++) {
            try {
                txt = getTe().getTextFromPage(currentPage);
                // Field name = page number, field value = page text (stored and analyzed).
                document.add(new Field(String.valueOf(currentPage), txt, Field.Store.YES, Field.Index.ANALYZED));
            } catch (Exception e) {
                // A page that fails to extract is skipped; the remaining pages are still indexed.
                System.err.println(e.getLocalizedMessage());
            }
        }
        writer.addDocument(document);
        writer.optimize();
        writer.close();
    } catch (CorruptIndexException e1) {
        e1.printStackTrace();
    } catch (LockObtainFailedException e1) {
        e1.printStackTrace();
    } catch (IOException e1) {
        e1.printStackTrace();
    }
}
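
To make the page/field/term idea more concrete, here is a small plain-Java sketch of an inverted index. It is only an illustration and does not use Lucene: the class ToyPageIndex and its methods are hypothetical names, and the word splitting is far cruder than what StandardAnalyzer does. It also shows how a "next hit" lookup, such as the search arrows might need, could be answered from such a structure.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy stand-in for the page -> field -> term idea; this is NOT Lucene. */
final class ToyPageIndex {
    // term -> sorted set of page numbers that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Splits a page's text into lower-cased words and records the page for each word. */
    void addPage(int pageNumber, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(pageNumber);
            }
        }
    }

    /** Pages containing the term, in ascending order (empty if there are none). */
    TreeSet<Integer> pagesFor(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    /** First page after currentPage that contains the term, or -1 if there is none. */
    int nextHit(String term, int currentPage) {
        Integer next = pagesFor(term).higher(currentPage);
        return next == null ? -1 : next;
    }

    public static void main(String[] args) {
        ToyPageIndex index = new ToyPageIndex();
        index.addPage(0, "Lucene builds an index");
        index.addPage(1, "ICEPDF extracts text");
        index.addPage(2, "The index is searchable");
        System.out.println(index.pagesFor("index"));   // prints [0, 2]
        System.out.println(index.nextHit("index", 0)); // prints 2
    }
}
```

Lucene does the same job with proper analysis, scoring and on-disk storage, which is why the real application delegates to it.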


The second page is the PDF viewer:

<ice:panelCollapsible id="pdfViewer" expanded="false">
    <f:facet name="header">
        <ice:panelGroup>
            <ice:outputText id="textHeader" value="view pdf" />
        </ice:panelGroup>
    </f:facet>
    <ice:panelGrid columns="8" style="width: 70%">
        <ice:commandButton id="arrowDown"
            image="./images/Arrowdowngreen.png"
            actionListener="#{buttonsInputBean.imageButtonListener}" partialSubmit="true" />
        <ice:commandButton id="arrowUp" image="./images/Arrowupgreen.png"
            actionListener="#{buttonsInputBean.imageButtonListener}" partialSubmit="true" />
        <ice:inputText id="currentPage"
            value="#{buttonsInputBean.pageNumber}" partialSubmit="true"
            valueChangeListener="#{buttonsInputBean.inputTextListener}">
            <f:converter converterId="javax.faces.Integer" />
        </ice:inputText>
        <ice:outputText id="separator" value="/"/>
        <ice:inputText id="totalPage" value="#{buttonsInputBean.totalPages}" disabled="true" partialSubmit="true"/>
        <ice:inputText id="search" value="#{buttonsInputBean.searchTerm}"
            partialSubmit="true" effect="#{buttonsInputBean.effectOutputText}"
            valueChangeListener="#{buttonsInputBean.inputTextListener}" />
        <ice:commandButton id="searchRight"
            image="./images/arightbl2_search.gif"
            actionListener="#{buttonsInputBean.imageButtonListener}" partialSubmit="true" />
        <ice:commandButton id="searchLeft"
            image="./images/aleftbl4_search.gif"
            actionListener="#{buttonsInputBean.imageButtonListener}" partialSubmit="true" />
    </ice:panelGrid>
    <ice:panelGroup style="width: 50%; align: middle">
        <ice:outputText id="textPage" style="align: center;" value="#{buttonsInputBean.regularText}" escape="false"/>
    </ice:panelGroup>
</ice:panelCollapsible>
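
All four image buttons in the markup above share one listener, so the bean presumably tells them apart by component id (in a JSF listener this is available through event.getComponent().getId()). The following plain-Java sketch shows one way the page-navigation part could look. PageNavigator is a hypothetical class, not the author's buttonsInputBean, and it assumes that "arrowDown" moves forward, "arrowUp" moves back, and pages are numbered from 1 in the UI.

```java
/** Hypothetical page-navigation logic; names are illustrative, not taken from the real bean. */
final class PageNavigator {
    private final int totalPages;
    private int pageNumber = 1; // assumed 1-based, as a user would read it in the "currentPage" field

    PageNavigator(int totalPages) {
        if (totalPages < 1) {
            throw new IllegalArgumentException("totalPages must be >= 1");
        }
        this.totalPages = totalPages;
    }

    /** Reacts to a button click identified by its component id; unknown ids leave the page unchanged. */
    int onButton(String componentId) {
        if ("arrowDown".equals(componentId)) {
            return goTo(pageNumber + 1); // assumption: down arrow = next page
        } else if ("arrowUp".equals(componentId)) {
            return goTo(pageNumber - 1); // assumption: up arrow = previous page
        }
        return pageNumber;
    }

    /** Jumps to the requested page, clamped to the range 1..totalPages. */
    int goTo(int requested) {
        pageNumber = Math.max(1, Math.min(totalPages, requested));
        return pageNumber;
    }

    public static void main(String[] args) {
        PageNavigator nav = new PageNavigator(3);
        System.out.println(nav.onButton("arrowDown")); // prints 2
        System.out.println(nav.goTo(99));              // prints 3
    }
}
```

Clamping in one place means that both the arrow buttons and a number typed into the page field can never leave the valid range.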


Now that the text is extracted and the index is built, we can try a search. After typing a word (the query) and pressing the Enter key, the text in the inputText field is colored orange and the page appears with the matching words emphasized. How do we achieve that? It works like a charm: the ICEfaces Component Suite provides special attributes that can be used to invoke effects on components.


<ice:inputText id="search" value="#{buttonsInputBean.searchTerm}"
partialSubmit="true"
effect="#{buttonsInputBean.effectOutputText}"
valueChangeListener="#{buttonsInputBean.inputTextListener}" />


The effect is initialized in the backing bean:

Highlight effectOutputText = new Highlight(WHITE_BKG);
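
The effect only colors the search field. The emphasized words on the page are a separate matter: the page text is rendered with escape="false", which suggests that the bean returns HTML. The sketch below is one possible way to produce such markup in plain Java. TermHighlighter is a hypothetical helper, not the code used in the PDFViewer, and it does simple case-insensitive substring matching rather than using Lucene's analysis.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that marks search hits in page text as HTML. */
final class TermHighlighter {

    /** Escapes the characters that are significant in HTML text content. */
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps every case-insensitive occurrence of term in <b>...</b> and escapes everything else. */
    static String highlight(String pageText, String term) {
        if (term == null || term.isEmpty()) {
            return escapeHtml(pageText);
        }
        // Pattern.quote treats the term literally, so characters like '.' or '(' are safe.
        Matcher m = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE).matcher(pageText);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(escapeHtml(pageText.substring(last, m.start())));
            out.append("<b>").append(escapeHtml(m.group())).append("</b>");
            last = m.end();
        }
        out.append(escapeHtml(pageText.substring(last)));
        return out.toString();
    }

    public static void main(String[] args) {
        // prints <b>Lucene</b> &amp; <b>lucene</b> &lt;rock&gt;
        System.out.println(highlight("Lucene & lucene <rock>", "lucene"));
    }
}
```

Escaping the surrounding text matters here because, with escape="false", any markup-like characters extracted from the PDF would otherwise be interpreted by the browser.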


That's it.
The source code of the PDFViewer is located here:





3 comments:

david said...

Hi,

Great job.

I wonder how you got your collapsible panel to expand to 100% of the page.

I am not able to make the collapsible panel take the full width of the page, whether it is collapsed or expanded.

Did you do something special with the style?

Thanks for your blog.
David

Oleg said...

Hi David.
Thanks for appreciating this modest post. It would be much healthier for the ICEFaces community if you posted your question on http://www.icefaces.org/JForum/forums/list.page and/or in the LinkedIn ICEfaces group. I guess you should set width=100% on the panelGroup, i.e. on the top container where the panelCollapsible resides. Think of it as a Russian "Matryoshka" doll: what you are trying to do is increase the size of the inner doll, but the outer "grandma" does not allow it. :-)

david said...

Hi, Oleg.

Thanks for the comment.

I asked about and searched for this in the ICEfaces forum, and I see many other people having the same problem.

Anyway, I'll try your suggestion.

Thanks