Settings : Manage Projects

  Name I Use This Lines of Code   Current Committers Community Rating Reviews Description  
Apache Sentry 0 192965
6 none 0 Apache Sentry™ is a system for enforcing fine-grained, role-based authorization to data and metadata stored on a Hadoop cluster.
CarbonData 0 345448
59 none 0 Apache CarbonData (incubating) is an indexed columnar data format for fast analytics on big data platforms, e.g. Apache Hadoop, Apache Spark, etc. You can find the latest CarbonData documentation and learn more at: http://carbondata.incubator.apache.org
Apache Milagro 1 666749
3 none 0 Apache Milagro (incubating) establishes a new internet security framework purpose-built for cloud-connected app-centric software and IoT devices that require Internet scale. Milagro's purpose is to provide a secure, free, and positive open source alternative to centralised and proprietary monolithic trust providers such as commercial certificate authorities and the certificate backed cryptosystems that rely on them.
Apache Pirk 0 103333
0 none 0 Pirk is a framework for scalable Private Information Retrieval (PIR). The goal of Pirk is to provide a landing place for robust, scalable, and practical implementations of PIR algorithms. Apache Pirk (incubating) is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the name of Apache TLP sponsor. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
Apache SystemML 0 2074936
7 none 0 Declarative large-scale machine learning (ML) that aims at flexible specification of ML algorithms and automatic generation of hybrid runtime plans ranging from single-node, in-memory computations, to distributed computations on Apache Hadoop and Apache Spark. ML algorithms are expressed in an R-like or Python-like syntax that includes linear algebra primitives, statistical functions, and ML-specific constructs. This high-level language significantly increases the productivity of data scientists as it provides (1) full flexibility in expressing custom analytics, and (2) data independence from the underlying input formats and physical data representations. Automatic optimization according to data and cluster characteristics ensures both efficiency and scalability.
Apache HAWQ 1 1350139
8 5.0 0 Apache HAWQ (incubating) is an advanced, MPP, elastic query engine and analytic database for enterprises running on top of Apache Hadoop.
Apache Toree (inc... 1 23863
2 none 0 Toree provides applications with a mechanism to interactively and remotely access Apache Spark.
Apache Kudu 1 513771
43 none 0 Apache Kudu
Apache Arrow 1 12543848
202 none 0 Apache Arrow is a columnar in-memory analytics layer designed to accelerate big data. It houses a set of canonical in-memory representations of flat and hierarchical data along with multiple language-bindings for structure manipulation. It also provides IPC and common algorithm implementations.
Apache Ambari 1 1249573
0 none 0 The Apache Ambari project is aimed at making Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs.
Apache Knox 2 210320
24 none 0 REST API Gateway for the Hadoop Ecosystem
Apache Slider 1 0
0 none 0 Apache Slider is a YARN application to deploy existing distributed applications on YARN, monitor them, and make them larger or smaller as desired, even while the application is running.
Apache Tez 0 226958
13 5.0 0 Mirror of Apache Tez
Apache Apex 6 284119
1 none 0 Apache Apex is an enterprise-grade big data-in-motion platform that unifies stream and batch processing. Apex was built for scalability and low-latency processing, high availability and operability. The Apex engine is supplemented by Malhar, the library of pre-built operators, including connectors that integrate with many existing technologies as sources and destinations, like message buses, databases, files or social media feeds.
MXNet 1 424442
208 none 0 Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Go, and more. MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix the flavours of symbolic programming and imperative programming together to maximize efficiency and your productivity. At its core is a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer is built on top, which makes symbolic execution fast and memory efficient. The library is portable and lightweight, and scales to multiple GPUs and multiple machines.
Apache REEF 2 194671
4 5.0 0 Mirror of Apache REEF
Apache Ignite 4 1532573
0 none 0 Apache Ignite In-Memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies.
Apache NiFi 8 1871574
115 5.0 0 Apache NiFi is an easy to use, powerful, and reliable system to process and distribute data.
Apache Zeppelin 4 475816
57 none 0 A web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and more.
Apache Trafodion 2 0
28 none 0 Apache Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or operational workloads on Hadoop. The name "Trafodion" (the Welsh word for transactions, pronounced "Tra-vod-eee-on") was chosen specifically to emphasize the differentiation that Trafodion provides in closing a critical gap in the Hadoop ecosystem. Trafodion builds on the scalability, elasticity, and flexibility of Hadoop. Trafodion extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop.