Transwarp Inceptor is a database for batch processing and analysis. It supports SQL 2003 and is compatible with Oracle PL/SQL and DB2 SQL PL, which is unprecedented in the Hadoop industry. ACID support is another competitive advantage, ensuring consistent data processing. Inceptor delivers high performance for big data analysis (more than 10x faster than Apache Hadoop and more than 5x faster than MPP), and performs much better than other Hadoop and MPP products on both the TPC-DS and TPC-H benchmarks. Inceptor is widely used to build data warehouses and data marts. Currently, over 500 customers in China run their applications on it.
Transwarp Inceptor Composition
Transwarp Inceptor is an independent analytical database composed of five layers, from top to bottom: the service layer, compiler layer, execution layer, interface layer, and storage layer. Its architecture is as follows:
- Service layer: The service layer provides standard JDBC, ODBC, and shell access. It also supports external service modules such as session management, security management, HA management, and the data dictionary.
- Compiler layer: This layer compiles, optimizes, and processes SQL, stored procedures, and transactions. It takes the programs submitted by applications and translates them into executable plans.
- Execution layer: The execution layer consists of a scheduler that provides QoS management and a self-developed distributed engine. The first SLA scheduler was released with TDH 5.0 to ease the situation in which heavy tasks block small tasks under mixed load. The SLA scheduler can assign different priorities to tasks, limit the maximum number of tasks a user can submit, explicitly separate system tasks from user tasks, and handle mixed load better through fine-grained resource scheduling.
- Interface layer: The Data Description Layer (Stargate) bridges the computing engine and storage, and links third-party storage and databases to Inceptor, which makes it the foundation for database federation.
- Storage layer: In TDH 5.0, Holodesk in the storage layer has been substantially improved in both functionality and performance to better serve interactive workloads under higher concurrency.
TDH is the first SQL-on-Hadoop engine in the industry to provide full SQL 2003 support. With this functionality, traditional applications built on Oracle or DB2 can be conveniently migrated to TDH to leverage the horsepower of big data.
To accommodate differences between databases, TDH lets users set a database dialect. The Oracle, DB2, and Teradata dialects are well supported so far.
The superior performance and scalability of Transwarp Inceptor make it well suited to big data analysis.
The distributed computing engine in Inceptor is deeply optimized and scales flexibly. It introduces a SQL optimizer made up of four modules: Inter-SQL-Optimization (ISO), Rule-Based-Optimization (RBO), Materialization-Based-Optimization (MBO), and Cost-Based-Optimization (CBO). When queries enter the computing engine, they are first converted into logical expressions and then passed to the four-module optimizer before being executed in the vectorized processing engine.
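The idea of passing a logical expression through successive optimizer modules can be illustrated with a minimal Python sketch. It is not Inceptor's implementation: the source does not describe what each module does internally, so the `LogicalPlan` type, the filter-pushdown rule, and the smallest-table-first cost heuristic are textbook stand-ins chosen for illustration, and the ISO and MBO stages are left out.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class LogicalPlan:
    """Hypothetical toy logical expression: tables to join plus filters."""
    join_order: Tuple[str, ...]
    filters: Tuple[str, ...]
    filters_pushed_down: bool = False


Pass = Callable[[LogicalPlan], LogicalPlan]


def rule_based_pass(plan: LogicalPlan) -> LogicalPlan:
    # Rule-based step (assumed rule): always apply filters before joins.
    # No statistics are consulted.
    return replace(plan, filters_pushed_down=True)


def make_cost_based_pass(row_counts: Dict[str, int]) -> Pass:
    def cost_based_pass(plan: LogicalPlan) -> LogicalPlan:
        # Cost-based step (assumed heuristic): join the smallest tables first,
        # using table row counts as the only statistic.
        ordered = tuple(sorted(plan.join_order, key=lambda t: row_counts[t]))
        return replace(plan, join_order=ordered)
    return cost_based_pass


def optimize(plan: LogicalPlan, passes: Iterable[Pass]) -> LogicalPlan:
    # Each pass receives the plan produced by the previous one.
    for optimizer_pass in passes:
        plan = optimizer_pass(plan)
    return plan


stats = {"fact_sales": 1_000_000, "dim_store": 500, "dim_date": 3_650}
plan = LogicalPlan(
    join_order=("fact_sales", "dim_date", "dim_store"),
    filters=("year = 2016",),
)
optimized = optimize(plan, [rule_based_pass, make_cost_based_pass(stats)])
print(optimized.join_order)
```

Modelling each module as a function from plan to plan keeps the stages independent, so further passes could be added to the list without changing `optimize`.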
Inceptor tunes its data shuffling and broadcasting logic for better performance. Data reads are faster with Holodesk, a columnar storage engine accelerated by SSD and memory, which avoids I/O bottlenecks. Data processing is improved by the Cost-Based Optimizer and the Rule-Based Optimizer, which generate the best execution plans for queries. Together, these features make batch processing efficient and scalable, enabling Inceptor to pass the TPC-DS benchmark test with remarkable performance.
Inceptor also fits interactive data analysis and OLAP scenarios well. Holodesk provides index support and uses SSD effectively to speed up scans, so workloads in interactive analysis scenarios can be processed several times faster. For reports with a fixed pattern, users can apply the OLAP Cube technique to improve analysis performance by 10x to 100x.
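Why a cube helps fixed-pattern reports can be shown with a small Python sketch of pre-aggregation. This is a generic illustration of the technique, not Holodesk's cube format or API: the function names `build_cube` and `lookup`, the dimension names, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

DIMS = ("region", "year")  # hypothetical report dimensions


def build_cube(rows, dims, measure):
    """Pre-aggregate the measure for every combination of dimensions."""
    cube = defaultdict(float)
    for row in rows:
        for k in range(len(dims) + 1):
            for group in combinations(dims, k):
                key = (group, tuple(row[d] for d in group))
                cube[key] += row[measure]
    return dict(cube)


def lookup(cube, dims, **where):
    """Answer a report query with one dictionary lookup, without scanning rows."""
    group = tuple(d for d in dims if d in where)
    return cube.get((group, tuple(where[d] for d in group)), 0.0)


rows = [
    {"region": "east", "year": 2015, "sales": 10.0},
    {"region": "east", "year": 2016, "sales": 5.0},
    {"region": "west", "year": 2015, "sales": 7.0},
]
cube = build_cube(rows, DIMS, "sales")
print(lookup(cube, DIMS, region="east"))  # total sales for one region
print(lookup(cube, DIMS, year=2015))      # total sales for one year
```

The cost of aggregation is paid once when the cube is built, so each report query afterwards is a lookup whose cost does not depend on the number of underlying rows.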
Transwarp Inceptor is the first product in the Hadoop industry to provide complete transaction support. It achieves serializable transaction isolation and maintains data consistency through two-phase locking and the MVCC protocol. Its transactional performance satisfies the requirements of most batch processing scenarios.
The table on the right compares Inceptor with other mainstream solutions. As shown, Inceptor comes close to Oracle in ACID and CRUD support, while Hive and Impala support only part of this functionality and therefore cannot meet production requirements.
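The MVCC half of that protocol can be sketched in Python as follows. This is a generic illustration of multi-version reads, not Inceptor's implementation: the `VersionedStore` class and its methods are hypothetical, and the two-phase locking that coordinates concurrent writers is deliberately omitted. Snapshot reads like these do not give serializable isolation on their own.

```python
class VersionedStore:
    """Toy multi-version store: each write adds a new version, old ones remain."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical commit timestamp

    def begin(self):
        """Start a reader; its snapshot is the latest commit timestamp so far."""
        return self._clock

    def commit_write(self, key, value):
        """Commit a new version of `key` at the next timestamp."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before the snapshot."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None


store = VersionedStore()
store.commit_write("balance", 100)
snapshot = store.begin()            # reader starts here
store.commit_write("balance", 80)   # a later writer commits
print(store.read("balance", snapshot))       # reader still sees its snapshot
print(store.read("balance", store.begin()))  # a new reader sees the update
```

Because readers consult older versions instead of waiting for writers, long analytical reads and concurrent updates do not block each other.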
Scheduler - SLA scheduler
The SLA scheduler is a new component in TDH 5.0. It addresses a problem many users encounter in batch processing: heavy workloads tend to block light tasks, even when the light tasks have higher priority. The SLA scheduler solves this by assigning dedicated channels to tasks according to their size and priority, using methods such as priority differentiation and pool isolation.
The structure on the left shows the architecture of the SLA scheduler, which consists of a strategy analysis module and a strategy distribution module. For each SQL statement, it selects a suitable pool, sets the priority, and assigns a resource weight according to the user and the system load information collected in real time. The newly integrated Furion Scheduler provides better scheduling strategies than the FAIR Scheduler: it is finer-grained and takes weights and user strategies into account, and therefore schedules more flexibly.
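Pool isolation combined with priority ordering can be illustrated with a short Python sketch. It is not the SLA scheduler or the Furion Scheduler: the `PoolScheduler` class, the pool names, and the slot counts are hypothetical, and resource weights and real-time load information are left out.

```python
import heapq
from itertools import count


class PoolScheduler:
    """Toy scheduler: each pool has its own queue and its own slot budget."""

    def __init__(self, pool_slots):
        self._slots = dict(pool_slots)  # pool name -> tasks allowed per round
        self._queues = {name: [] for name in pool_slots}
        self._seq = count()             # tie-breaker keeps submission order

    def submit(self, pool, priority, task):
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._queues[pool], (-priority, next(self._seq), task))

    def next_round(self):
        """Pick up to `slots` tasks per pool, highest priority first."""
        picked = {}
        for pool, queue in self._queues.items():
            n = min(self._slots[pool], len(queue))
            picked[pool] = [heapq.heappop(queue)[2] for _ in range(n)]
        return picked


scheduler = PoolScheduler({"batch": 1, "interactive": 2})
scheduler.submit("batch", 1, "nightly_etl")
scheduler.submit("batch", 5, "rebuild_cube")
scheduler.submit("interactive", 3, "dashboard_query")
first_round = scheduler.next_round()
second_round = scheduler.next_round()
print(first_round)
print(second_round)
```

Because the interactive pool has its own slots, the light query runs in the first round regardless of how many heavy tasks are queued in the batch pool.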
With the SLA scheduler, Inceptor can flexibly handle application deployments under various mixed loads and makes it easier for users to plan their workloads on the system.