| | Cloudera | EMC Greenplum | Hortonworks | IBM | MapR | Microsoft | Platform Computing |
---|---|---|---|---|---|---|---|
Product Name | Cloudera's Distribution including Apache Hadoop | Greenplum HD | Hortonworks Data Platform | InfoSphere BigInsights | MapR | Big Data Solution | Platform MapReduce |
Free Edition | CDH: Integrated, tested distribution of Apache Hadoop | Community Edition: 100% open source certified and supported version of the Apache Hadoop stack | | Basic Edition: An integrated Hadoop distribution. | MapR M3 Edition: Free community edition incorporating MapR's performance increases | | Platform MapReduce Developer Edition: Evaluation edition, excludes resource management features of regular edition |
Enterprise Edition | Cloudera Enterprise: Adds management software layer over CDH | Enterprise Edition: Integrates MapR's M5 Hadoop-compatible distribution, replaces HDFS with MapR's C++-based file system. Includes MapR management tools | | Enterprise Edition: Hadoop distribution, plus BigSheets spreadsheet interface, scheduler, text analytics, indexer, JDBC connector, security support. | MapR M5 Edition: Augments M3 Edition with high availability and data protection features | Big Data Solution: Windows Hadoop distribution, integrated with Microsoft's database and analytical products | Platform MapReduce: Enhanced runtime for Hadoop MapReduce, API-compatible with Apache Hadoop |
Hadoop Components | Hive, Oozie, Pig, Zookeeper, Avro, Flume, HBase, Sqoop, Mahout, Whirr | Hive, Pig, Zookeeper, HBase | Hive, Pig, Zookeeper, HBase, None, Ambari | Hive, Oozie, Pig, Zookeeper, Avro, Flume, HBase, Lucene | Hive, Pig, Flume, HBase, Sqoop, Mahout, None, Oozie | Hive, Pig | |
Security | Cloudera Manager: Kerberos, role-based administration and audit trails | | | Security features: LDAP authentication, role-based authorization, reverse proxy | | Active Directory integration | |
Admin Interface | Cloudera Manager: Centralized management and alerting | Administrative interfaces: MapR Heatmap cluster administrative tools | Apache Ambari: Monitoring, administration and lifecycle management for Hadoop clusters | Administrative interfaces: Administrative features including Hadoop HDFS and MapReduce administration, cluster and server management, view HDFS file content | Administrative interfaces: MapR Heatmap cluster administrative tools | System Center integration | Administrative interfaces: Platform MapReduce Workload Manager |
Job Management | Cloudera Manager: Job analytics, monitoring and log search | High-availability job management: JobTracker HA and Distributed NameNode HA prevent lost jobs, restarts and failover incidents | Apache Ambari: Monitoring, administration and lifecycle management for Hadoop clusters | Job management features: Job creation, submission, cancellation, status, logging. | High-availability job management: JobTracker HA and Distributed NameNode HA prevent lost jobs, restarts and failover incidents | | |
Database connectors | | Greenplum Database | | DB2, Netezza, InfoSphere Warehouse | | SQL Server, SQL Server Parallel Data Warehouse | |
Interop features | | | | | | Hive ODBC Driver, Excel Hive Add-in | |
HDFS Access | Fuse-DFS: Mount HDFS as a traditional filesystem | NFS: Access HDFS as a conventional network file system | WebHDFS: REST API to HDFS | | NFS: Access HDFS as a conventional network file system | | |
Installation | Cloudera Manager: Wizard-based deployment | | | Quick installation: GUI-driven installation tool | | | |
Additional APIs | | | | Jaql: A functional, declarative query language designed to process large data sets. | REST API | JavaScript API: JavaScript Map/Reduce jobs, Pig-Latin, and Hive queries | Includes R, C/C++, C#, Java, Python |
Volume Management | | | | | Mirroring, snapshots | | |