DataForge is a scalable engine for querying a heterogeneous set of data sources, both relational (SQL) and non-relational (anything!). Any data source can be queried as long as it can be expressed as a tabular dataset and a transformer can be written for it.
Through its streaming pipeline, DataForge provides a simple, efficient, and powerful way to integrate Java code with data sources.
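To make the transformer idea concrete, here is a minimal sketch in plain Java. It does not use DataForge's actual API: the `Transformer` interface, `CsvLinesTransformer`, and `select` are hypothetical names invented for illustration. It only shows the general shape of the approach, assuming a transformer exposes a source as a stream of rows that a query then filters and projects.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TransformerSketch {

    /** Hypothetical contract (not DataForge's API): expose a raw source as tabular rows. */
    interface Transformer<S> {
        Stream<Map<String, String>> rows(S source);
    }

    /** Example transformer for CSV-like lines; the first line holds the column names. */
    static final class CsvLinesTransformer implements Transformer<List<String>> {
        @Override
        public Stream<Map<String, String>> rows(List<String> lines) {
            if (lines.isEmpty()) {
                return Stream.empty();
            }
            String[] header = lines.get(0).split(",");
            return lines.stream().skip(1).map(line -> {
                String[] cells = line.split(",", -1);
                Map<String, String> row = new LinkedHashMap<>();
                for (int i = 0; i < header.length; i++) {
                    // Missing trailing cells are treated as empty strings.
                    row.put(header[i].trim(), i < cells.length ? cells[i].trim() : "");
                }
                return row;
            });
        }
    }

    /** Streams rows through a filter and projects a single column, one row at a time. */
    static <S> List<String> select(Transformer<S> transformer, S source, String column,
                                   Predicate<Map<String, String>> where) {
        return transformer.rows(source)
                .filter(where)
                .map(row -> row.get(column))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> csv = Arrays.asList("name,lang", "alice,java", "bob,sql", "carol,java");
        List<String> names = select(new CsvLinesTransformer(), csv, "name",
                row -> "java".equals(row.get("lang")));
        System.out.println(names); // prints [alice, carol]
    }
}
```

In this sketch, supporting a new kind of source means writing another `Transformer` implementation; the query code in `select` stays unchanged because it only sees rows.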
New reference manual: available as a PDF reference or an HTML reference.