Full Text Search using Apache Lucene (Part-I)


    Lucene is an open-source, highly scalable text search engine library from the Apache Software Foundation. Its APIs focus mainly on text indexing and searching. It can be used to build search capabilities into applications such as e-mail clients, mailing lists, web search, and database search. Websites such as Wikipedia and LinkedIn have used Lucene to power their search.

    Lucene has many features. It:

    • Has powerful, accurate, and efficient search algorithms.
    • Calculates a score for each document that matches a given query and returns the most relevant documents ranked by the scores.
    • Supports many powerful query types, such as PhraseQuery, WildcardQuery, RangeQuery (TermRangeQuery in Lucene 3.0 and later), FuzzyQuery, BooleanQuery, and more.
    • Supports parsing of human-entered rich query expressions.
    • Allows users to extend the searching behavior using custom sorting, filtering, and query expression parsing.
    • Uses a file-based locking mechanism to prevent concurrent index modifications.
    • Allows searching and indexing simultaneously.
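
    To make the scoring and ranking feature concrete, here is a minimal sketch in plain Java that ranks documents with a simplified TF-IDF-style score. It does not use Lucene and does not reproduce Lucene's actual similarity formula; the class ToyScorer and its methods are hypothetical names chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy scorer: illustrates ranking by relevance, not Lucene's API.
class ToyScorer {

    // Lowercase and split on non-alphanumerics (a stand-in for a real analyzer).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Score = sum over query terms of (term frequency in doc) * (inverse document frequency).
    static double score(List<String> query, List<String> doc, List<List<String>> corpus) {
        double total = 0.0;
        for (String term : query) {
            int tf = Collections.frequency(doc, term);
            if (tf == 0) {
                continue; // term absent from this document
            }
            long df = corpus.stream().filter(d -> d.contains(term)).count();
            double idf = Math.log(1.0 + (double) corpus.size() / df);
            total += tf * idf;
        }
        return total;
    }

    // Returns the indices of matching documents, highest score first.
    static List<Integer> rank(String queryText, List<String> docs) {
        List<List<String>> corpus = new ArrayList<>();
        for (String d : docs) {
            corpus.add(tokenize(d));
        }
        List<String> query = tokenize(queryText);
        Map<Integer, Double> scores = new HashMap<>();
        for (int i = 0; i < corpus.size(); i++) {
            double s = score(query, corpus.get(i), corpus);
            if (s > 0) {
                scores.put(i, s);
            }
        }
        List<Integer> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b); // ties: lower index first
        });
        return ranked;
    }
}
```

    In this sketch a document that mentions a query term more often scores higher, and documents with no matching term are left out of the result entirely.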


    Steps to build an application using Apache Lucene:

    The image below shows the stages involved in building an application with Lucene.

    1. Indexing data
    2. Analysing data
    3. Searching indexed data
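
    The three stages above can be sketched with a toy inverted index in plain Java. This is an illustration of the idea under simplifying assumptions, not Lucene code: ToyIndex, analyze, add, and search are hypothetical names, the stop-word list is arbitrary, and the search requires every query term to match. In Lucene, analysis runs as part of indexing and is typically applied again to the query text, which the sketch mirrors by calling analyze in both places.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical toy index: shows analyze -> index -> search, not Lucene's API.
class ToyIndex {
    // Arbitrary stop-word list, assumed for illustration.
    private static final Set<String> STOP_WORDS = Set.of("a", "an", "the", "is", "of");

    // Inverted index: term -> ids of the documents that contain it.
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    // Original text of each document, addressed by id.
    private final List<String> stored = new ArrayList<>();

    // Analysis: lowercase, split on non-alphanumerics, drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Indexing: store the text and record each analyzed term in the postings.
    int add(String text) {
        int id = stored.size();
        stored.add(text);
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // Searching: return documents that contain every analyzed query term.
    List<String> search(String queryText) {
        List<String> terms = analyze(queryText);
        if (terms.isEmpty()) {
            return List.of();
        }
        SortedSet<Integer> hits = null;
        for (String term : terms) {
            SortedSet<Integer> ids = postings.getOrDefault(term, Collections.emptySortedSet());
            if (hits == null) {
                hits = new TreeSet<>(ids); // copy so the postings are not mutated
            } else {
                hits.retainAll(ids);       // intersect with the next term's documents
            }
        }
        List<String> results = new ArrayList<>();
        for (int id : hits) {
            results.add(stored.get(id));
        }
        return results;
    }
}
```

    Because the same analyze step is applied to documents and queries, a query such as "SEARCH" matches a document containing "search"; mismatched analysis at index time and query time is a common reason for missing results.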



    In subsequent posts, I will discuss how to index data to make it searchable and how to search Lucene-indexed data.

