Index Developer Guide

Introduction

An index is a data structure that can accelerate certain queries on a table. Developers can implement different kinds of indexes. Currently, CarbonData supports three types of indexes:

  1. BloomFilter Index: A space-efficient probabilistic data structure that is used to test whether an element is a member of a set.
  2. Lucene Index: A high-performance, full-featured text search engine.
  3. Secondary Index: Secondary index tables that hold blocklets are created as indexes and managed internally by CarbonData as child tables.
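
To make the BloomFilter description above more concrete, here is a minimal sketch of a Bloom filter in Java. It is not CarbonData's implementation: the class name SimpleBloomFilter, the FNV-1a based hashing, and the sizing parameters are all assumptions chosen for illustration. It shows the key property that makes such a structure useful for pruning: a value that was added is never reported as absent, while a value that was never added may occasionally be reported as present.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Illustrative Bloom filter (hypothetical class, not CarbonData code). */
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    public SimpleBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Seeded 32-bit FNV-1a hash; chosen only because it is short to write.
    private static int fnv1a(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    // Double hashing: derive the i-th bit position from two base hashes.
    private int position(byte[] data, int i) {
        int h1 = fnv1a(data, 0);
        int h2 = fnv1a(data, 0x9E3779B9);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Records a value by setting numHashes bits. */
    public void add(String value) {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(data, i));
        }
    }

    /** Returns false only if the value was definitely never added. */
    public boolean mightContain(String value) {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(data, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.add("alice");
        filter.add("bob");
        // Added values are always reported as possibly present.
        System.out.println(filter.mightContain("alice"));
        // A value that was never added is usually rejected, but a false positive is possible.
        System.out.println(filter.mightContain("carol"));
    }
}
```

In an index setting, a negative answer would let a query skip the data covered by the filter, while a positive answer only means that the data still has to be scanned.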

Index Provider

When a user issues CREATE INDEX index_name ON TABLE main AS 'provider', the corresponding IndexProvider implementation is created and initialized. Currently, the provider string can be:

  1. The class name of an IndexFactory implementation: developers can implement a new type of index by extending IndexFactory.

When a user issues DROP INDEX index_name ON TABLE main, the corresponding IndexFactory class is called.
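
The create and drop flow described above can be pictured with the following self-contained Java sketch. Everything in it is hypothetical: SketchIndexFactory is a stand-in for the real IndexFactory (whose actual methods and constructor are not shown in this guide), and ProviderLookupSketch, NoOpIndexFactory, createIndex, and dropIndex are names invented for illustration. The sketch assumes that the provider string is a fully qualified class name that is resolved by reflection, which is one plausible way to implement such a lookup.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of resolving a provider class name; not CarbonData code. */
public class ProviderLookupSketch {

    /** Stand-in for IndexFactory; the real class has a different, larger API. */
    public abstract static class SketchIndexFactory {
        public abstract void init(String indexName, String tableName);
        public abstract void drop();
    }

    /** A trivial developer-provided implementation used for the demo. */
    public static class NoOpIndexFactory extends SketchIndexFactory {
        @Override
        public void init(String indexName, String tableName) {
            System.out.println("init " + indexName + " on " + tableName);
        }

        @Override
        public void drop() {
            System.out.println("drop");
        }
    }

    // Live factories keyed by "table.index".
    private final Map<String, SketchIndexFactory> factories = new HashMap<>();

    /** Mirrors CREATE INDEX: load the provider class, instantiate it, initialize it. */
    public SketchIndexFactory createIndex(String indexName, String tableName, String provider) {
        try {
            Class<?> cls = Class.forName(provider);
            if (!SketchIndexFactory.class.isAssignableFrom(cls)) {
                throw new IllegalArgumentException(provider + " does not extend SketchIndexFactory");
            }
            SketchIndexFactory factory =
                (SketchIndexFactory) cls.getDeclaredConstructor().newInstance();
            factory.init(indexName, tableName);
            factories.put(tableName + "." + indexName, factory);
            return factory;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create provider " + provider, e);
        }
    }

    /** Mirrors DROP INDEX: look up the factory for the index and let it clean up. */
    public boolean dropIndex(String indexName, String tableName) {
        SketchIndexFactory factory = factories.remove(tableName + "." + indexName);
        if (factory == null) {
            return false;
        }
        factory.drop();
        return true;
    }

    public static void main(String[] args) {
        ProviderLookupSketch registry = new ProviderLookupSketch();
        String provider = NoOpIndexFactory.class.getName();
        registry.createIndex("index_name", "main", provider);
        registry.dropIndex("index_name", "main");
    }
}
```

Validating the loaded class against the expected base type before instantiating it keeps a mistyped provider string from failing later with a less helpful cast error.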

For more details, see the documentation on Index Management and the supported DSL.