Transwarp Inceptor

Overview
Transwarp Inceptor is a database for batch processing and analysis. It is compatible with SQL 2003, Oracle PL/SQL, and DB2 SQL PL, a combination not previously available in the Hadoop industry. ACID support is another competitive advantage, ensuring consistency during data processing. Inceptor delivers high performance for big data analysis (more than 10x faster than Apache Hadoop and more than 5x faster than MPP databases) and outperforms other Hadoop and MPP products on both the TPC-DS and TPC-H benchmarks. Inceptor is widely used to build data warehouses and data marts. Currently, more than 500 customers in China run their applications on it.


Transwarp Inceptor Composition

Transwarp Inceptor is an independent analytical database composed of five layers, from top to bottom: the service layer, compiler layer, execution layer, interface layer, and storage layer. Its architecture is as follows:



Technical Advantages

SQL Support

TDH is the first SQL-on-Hadoop engine in the industry to provide full SQL 2003 support. With this support, traditional applications built on Oracle or DB2 can be conveniently migrated to TDH to take advantage of big data processing power.

To accommodate differences between databases, TDH lets users set the database dialect. The Oracle, DB2, and Teradata dialects are well supported so far.


Superior Performance

The superior performance and scalability of Transwarp Inceptor make it well suited to big data analysis.

The distributed computing engine in Inceptor is deeply optimized and scales flexibly. It includes a SQL optimizer made up of four modules: Inter-SQL-Optimization (ISO), Rule-Based-Optimization (RBO), Materialization-Based-Optimization (MBO), and Cost-Based-Optimization (CBO). When a query enters the computing engine, it is first converted into a logical expression, then passed through the four-module optimizer, and finally executed in the vectorized processing engine.
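The flow described above (logical expression, then a chain of optimizer modules, then execution) can be pictured with a toy pipeline. This is a minimal sketch and not Inceptor's code: the `LogicalPlan` class, the two example rewrites, and the order in which the modules run are all assumptions made for illustration. The ISO and MBO stages are left as pass-through placeholders because the text does not describe what they do.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class LogicalPlan:
    """Toy logical expression: tables to join plus filter predicates."""
    tables: Tuple[str, ...]
    filters: Tuple[str, ...]


Stats = Dict[str, int]  # hypothetical statistics: table name -> row count
OptimizerPass = Callable[[LogicalPlan, Stats], LogicalPlan]


def passthrough(plan: LogicalPlan, stats: Stats) -> LogicalPlan:
    """Placeholder for a module whose behavior is not described here."""
    return plan


def rule_based(plan: LogicalPlan, stats: Stats) -> LogicalPlan:
    """Example rule: drop a predicate that is always true."""
    kept = tuple(f for f in plan.filters if f != "1 = 1")
    return replace(plan, filters=kept)


def cost_based(plan: LogicalPlan, stats: Stats) -> LogicalPlan:
    """Example cost decision: join the smallest tables first."""
    ordered = tuple(sorted(plan.tables, key=lambda t: stats[t]))
    return replace(plan, tables=ordered)


# Assumed ordering; the text names the four modules but not their sequence.
PIPELINE: List[Tuple[str, OptimizerPass]] = [
    ("ISO", passthrough),
    ("RBO", rule_based),
    ("MBO", passthrough),
    ("CBO", cost_based),
]


def optimize(plan: LogicalPlan, stats: Stats) -> LogicalPlan:
    """Run the plan through every optimizer module in turn."""
    for _name, optimizer_pass in PIPELINE:
        plan = optimizer_pass(plan, stats)
    return plan


raw = LogicalPlan(tables=("orders", "nation"), filters=("1 = 1", "o_total > 100"))
print(optimize(raw, {"orders": 1_000_000, "nation": 25}))
```

In this sketch each module is a plain function from plan to plan, so modules can be added, removed, or reordered without touching the others.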

Inceptor tunes its data shuffling and broadcasting logic for better performance. Reads are accelerated by Holodesk, a columnar storage engine that uses SSD and memory to reduce I/O overhead. Data processing is improved by the Cost-Based-Optimizer and Rule-Based-Optimizer, which generate efficient execution plans for queries. Together, these features make batch processing efficient and scalable, enabling Inceptor to complete the TPC-DS benchmark with strong results.

Inceptor also suits interactive data analysis and OLAP scenarios. Holodesk supports indexes and uses SSD effectively to speed up scans, so interactive analysis workloads can run several times faster. For fixed-pattern reporting workloads, users can apply the OLAP Cube technique to improve analysis performance by 10x to 100x.
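To illustrate why a cube helps fixed-pattern reports: the aggregates for every combination of dimensions are computed once, and each report then becomes a lookup instead of a scan. The sketch below is a generic pre-aggregation example with made-up sales rows; the function names and data are hypothetical, and it does not show Inceptor's cube syntax or implementation.

```python
from collections import defaultdict
from itertools import combinations

# Made-up fact rows for illustration only.
ROWS = [
    {"region": "east", "product": "a", "amount": 10},
    {"region": "east", "product": "b", "amount": 5},
    {"region": "west", "product": "a", "amount": 7},
]
DIMENSIONS = ("region", "product")


def build_cube(rows, dimensions, measure):
    """Pre-compute SUM(measure) for every subset of the dimensions."""
    cube = defaultdict(int)
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for row in rows:
                key = (dims, tuple(row[d] for d in dims))
                cube[key] += row[measure]
    return dict(cube)


def query(cube, **fixed):
    """Answer a report by lookup; `fixed` maps dimension name to value."""
    # Keep the dimensions in the same order used when the cube was built.
    dims = tuple(d for d in DIMENSIONS if d in fixed)
    return cube.get((dims, tuple(fixed[d] for d in dims)), 0)


cube = build_cube(ROWS, DIMENSIONS, "amount")
print(query(cube, region="east"))  # total for one region
print(query(cube))                 # grand total
```

The trade-off is the usual one for pre-aggregation: building the cube costs time and space up front, which pays off only when the same report patterns are queried repeatedly.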


Transaction Support

Transwarp Inceptor is the first product in the Hadoop industry to provide complete transaction support. It achieves serializable transaction isolation and maintains data consistency through two-phase locking and an MVCC (multi-version concurrency control) protocol. Its transaction performance satisfies the requirements of most batch processing scenarios.
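As a rough illustration of the MVCC idea mentioned above: every write creates a new version tagged with a commit timestamp, and a reader sees only the versions committed at or before its snapshot, so later writes do not disturb it. The sketch below is a generic, single-threaded toy; the `MVCCStore` class and its methods are hypothetical, it omits two-phase locking entirely, and it does not represent how Inceptor implements transactions.

```python
import itertools


class MVCCStore:
    """Toy multi-version store: readers see a consistent snapshot."""

    def __init__(self):
        self._versions = {}               # key -> list of (commit_ts, value)
        self._clock = itertools.count(1)  # monotonically increasing timestamps
        self._last_commit = 0

    def write(self, key, value):
        """Commit a new version of `key` and return its commit timestamp."""
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def snapshot(self):
        """Return the timestamp a new reader should read at."""
        return self._last_commit

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before `snapshot_ts`."""
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None


store = MVCCStore()
store.write("balance", 100)
reader_snapshot = store.snapshot()   # reader starts here
store.write("balance", 80)           # a later writer commits
print(store.read("balance", reader_snapshot))   # reader still sees its snapshot
print(store.read("balance", store.snapshot()))  # a new reader sees the update
```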

The table on the right compares Inceptor with other mainstream solutions. As shown, Inceptor comes close to Oracle in ACID and CRUD support, while Hive and Impala support only part of this functionality and therefore cannot meet production requirements.


Scheduler - SLA scheduler

The SLA scheduler is a new component in TDH 5.0. It addresses a problem many users encounter during batch processing: heavy workloads easily block lighter tasks that have higher priority. The SLA scheduler solves this by assigning dedicated channels to tasks according to their size and priority, using methods such as priority differentiation and pool isolation.

The structure on the left shows the architecture of the SLA scheduler, which consists of a strategy analysis module and a strategy distribution module. For each SQL statement, it selects a suitable pool, sets a priority, and assigns a resource weight based on the user and on system load information collected in real time. The newly integrated Furion Scheduler is finer-grained than the FAIR Scheduler and takes weights and user strategies into account, so it provides more flexible scheduling.
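The combination of pool isolation and priority differentiation can be sketched in a few lines. This is only an illustration under stated assumptions: the `Task` and `PoolScheduler` names, the two pools, and the row-count threshold are hypothetical and are not taken from the SLA scheduler or Furion Scheduler. It shows the idea that heavy and light tasks wait in separate pools, and that within a pool the most urgent task runs first.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    estimated_rows: int  # hypothetical size estimate for the SQL statement
    priority: int        # larger value means more urgent


class PoolScheduler:
    LIGHT_LIMIT = 1_000_000  # hypothetical threshold between light and heavy

    def __init__(self):
        self._pools = {"light": [], "heavy": []}
        self._seq = itertools.count()  # tie-breaker that preserves FIFO order

    def choose_pool(self, task):
        """Pool isolation: route a task by its estimated size."""
        return "light" if task.estimated_rows <= self.LIGHT_LIMIT else "heavy"

    def submit(self, task):
        """Queue the task in its pool and return the pool name."""
        pool = self.choose_pool(task)
        # heapq is a min-heap, so the priority is negated to pop urgent tasks first.
        heapq.heappush(self._pools[pool], (-task.priority, next(self._seq), task))
        return pool

    def next_task(self, pool):
        """Priority differentiation: pop the most urgent task in the pool."""
        if not self._pools[pool]:
            return None
        return heapq.heappop(self._pools[pool])[2]


scheduler = PoolScheduler()
scheduler.submit(Task("nightly_etl", 50_000_000, priority=1))
scheduler.submit(Task("daily_report", 10_000, priority=5))
scheduler.submit(Task("dashboard_query", 5_000, priority=9))
print(scheduler.next_task("light").name)  # not blocked by the heavy ETL job
```

Because the heavy job sits in its own pool, the light tasks never queue behind it, which is the blocking problem the scheduler is described as solving.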

With the SLA scheduler, Inceptor can flexibly handle applications deployed under various mixed workloads, which makes it easier for users to plan their business workloads on the system.