Tags : Browse Projects

Apache Apex

Claimed by Apache Software Foundation Analyzed 5 days ago

Apache Apex is an enterprise-grade big data-in-motion platform that unifies stream and batch processing. Apex was built for scalability and low-latency processing, high availability and operability. The Apex engine is supplemented by Malhar, the library of pre-built operators, including connectors that integrate with many existing technologies as sources and destinations, like message buses, databases, files or social media feeds.

284K lines of code

1 current contributor

over 4 years since last commit

6 users on Open Hub

Inactive
0.0
 

deeplearning4j

Analyzed about 4 hours ago

Deeplearning4j is the first commercial-grade, open-source, distributed deep-learning library, designed to be used in business environments. Deeplearning4j aims to be cutting-edge plug and play, more convention than configuration, which allows for fast prototyping for non-researchers. Vast support of scale out: Hadoop, Spark and Akka + AWS et al. It includes both a distributed, multi-threaded deep-learning framework and a normal single-threaded deep-learning framework. Iterative reduce net training. First framework adapted for a micro-service architecture. A versatile n-dimensional array class. GPU integration.

1.1M lines of code

17 current contributors

2 months since last commit

5 users on Open Hub

Low Activity
4.0
   

Apache Bigtop

Claimed by Apache Software Foundation Analyzed about 10 hours ago

Bigtop is an Apache Foundation project for infrastructure engineers and data scientists looking for comprehensive packaging, testing, and configuration of the leading open source big data components. Bigtop supports a wide range of components/projects, including, but not limited to, Hadoop, HBase and Spark. Bigtop supports many operating systems, including Debian, Ubuntu, CentOS, Fedora, openSUSE and many others. Bigtop includes tools and a framework for testing at various levels (packaging, platform, runtime, etc.) for both initial deployments and upgrade scenarios for the entire data platform, not just the individual components.

112K lines of code

13 current contributors

about 11 hours since last commit

4 users on Open Hub

Moderate Activity
5.0
 

Apache Flume

Claimed by Apache Software Foundation Analyzed about 4 hours ago

Apache Flume is a system for reliably collecting high-throughput data from streaming data sources like logs.

72.8K lines of code

3 current contributors

11 months since last commit

4 users on Open Hub

Very Low Activity
0.0
 

Facebook Presto

Claimed by Facebook Analyzed about 18 hours ago

Distributed SQL query engine for big data. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook. Presto is a distributed SQL query engine optimized for ad-hoc analysis at interactive speed. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions.

2.88M lines of code

125 current contributors

about 1 month since last commit

4 users on Open Hub

Very High Activity
0.0
 

Apache Drill

Claimed by Apache Software Foundation Analyzed about 6 hours ago

Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google's Dremel. It is a top-level project of the Apache Software Foundation.

729K lines of code

37 current contributors

4 days since last commit

4 users on Open Hub

Moderate Activity
5.0
 
Licenses: No declared licenses

StreamSets Data Collector

Claimed by StreamSets No analysis available

Open source software for the rapid development and reliable operation of complex data flows.

0 lines of code

60 current contributors

0 since last commit

4 users on Open Hub

Activity Not Available
5.0
 
Mostly written in language not available
Licenses: apache_2

OpenIMAJ

Analyzed about 2 hours ago

OpenIMAJ is a collection of libraries for multimedia analysis written in the Java programming language. OpenIMAJ contains classes that can perform processing, analysis and content-creation of many kinds of multimedia data, including images, video, audio and text. OpenIMAJ also incorporates a number of tools to enable extremely-large-scale multimedia analysis using a distributed computing approach based on Apache Hadoop.

819K lines of code

3 current contributors

over 3 years since last commit

3 users on Open Hub

Inactive
0.0
 

Apache Giraph

Claimed by Apache Software Foundation Analyzed about 1 year ago

Giraph builds upon the graph-oriented nature of Pregel but additionally adds fault tolerance to the coordinator process by using ZooKeeper as its centralized coordination service. It implements a graph-processing framework that is launched as a typical Hadoop job to leverage existing Hadoop infrastructure, such as Amazon's EC2. Giraph follows the bulk-synchronous parallel model relative to graphs, where vertices can send messages to other vertices during a given superstep.

141K lines of code

5 current contributors

over 3 years since last commit

3 users on Open Hub

Activity Not Available
0.0
 

snowplow

Analyzed 1 day ago

Code base for computer science projects.

5.14K lines of code

15 current contributors

3 months since last commit

3 users on Open Hub

Very Low Activity
5.0
 