Transwarp Discover

Transwarp Discover, which combines Spark, MapReduce and R, is a distributed machine learning engine designed for big data platforms. Discover lets users access data stored in HDFS, Hyperbase and Inceptor distributed memory by running R programs from the R command line interface or the graphical RStudio interface. Discover ships with distributed implementations of a large number of machine learning algorithms, which can be used alongside existing algorithms in R. When combined with the highly optimized proprietary algorithms built into TDH, these machine learning algorithms can efficiently analyze graph data such as association networks. In addition, Discover integrates several machine learning libraries that contain common algorithms for cluster analysis, classification, association analysis, anomaly detection and similar tasks.

Transwarp Discover Architecture
Transwarp Discover Functionalities
Functionality: Description
Statistics Library: A library of high-performance, parallel statistical algorithms for removing noise, outliers and missing values from data, normalizing data, working with statistical distributions, and so on. It is a basic toolkit for machine learning and data mining.
Machine Learning Library: A library of high-performance, parallel machine learning algorithms covering data classification, association, clustering, prediction and recommendation. It can be used to build a high-precision recommendation engine or prediction engine.
R Support: Transwarp Data Hub supports R, the powerful mainstream language for statistics and graphics, and the RStudio web GUI. Together they support statistics and mining on large data sets by calling TDH's built-in parallel algorithm library.
Integrated Solution Support: Provides a range of solutions, including text analysis, fraud detection, risk analysis, recommendation systems and fault detection, enabling users to build solutions tailored to their own needs.
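
The data preparation steps named for the Statistics Library (handling missing values, removing outliers, normalizing data) can be illustrated on a single machine. The following is a minimal sketch in plain Python using only the standard library; it does not use Discover's R interface, whose function names are not given here, and it is not distributed. The helper names, the use of None for missing values, the 1.5 x IQR outlier rule and z-score normalization are all assumptions chosen for illustration.

```python
import statistics


def drop_missing(values):
    """Remove missing entries (represented here as None; an assumption)."""
    return [v for v in values if v is not None]


def remove_outliers_iqr(values, k=1.5):
    """Keep values within k * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def z_score_normalize(values):
    """Rescale values to zero mean and unit population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical sample: two missing readings and one extreme value.
raw = [10.0, 12.0, None, 11.0, 13.0, 250.0, 12.0, None, 11.0]
cleaned = remove_outliers_iqr(drop_missing(raw))
normalized = z_score_normalize(cleaned)
```

In this sample the two None entries and the extreme value 250.0 are dropped before normalization, so the rescaled values are not distorted by the outlier. A distributed library would apply the same ideas across partitions of a much larger data set.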