Tags : Browse Projects

Apache Spark

Claimed by Apache Software Foundation. Analyzed about 2 hours ago.

Apache Spark is an open source cluster computing system that aims to make data analytics fast, both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly more rapidly than with disk-based systems like Hadoop. To make programming faster, Spark offers high-level APIs in Scala, Java and Python, letting you manipulate distributed datasets like local collections. You can also use Spark interactively to query big data from the Scala or Python shells. Spark integrates closely with Hadoop to run inside Hadoop clusters and can access any existing Hadoop data source.

1.21M lines of code

431 current contributors

about 5 hours since last commit

50 users on Open Hub

Very High Activity
5.0
 
collectd

  Analyzed about 4 hours ago

collectd is a small daemon which collects system information periodically and writes the results to an RRD-file. What does collectd do? collectd collects information about the system it is running on and writes this information into special database files. These database files can then be used to generate graphs of the collected data. collectd itself does not generate graphs; it only collects the data. You should use software like drraw to generate pretty pictures from these RRD-files. Nonetheless, sample scripts are included to get you started on your own graphing scripts.

278K lines of code

56 current contributors

2 days since last commit

33 users on Open Hub

High Activity
4.81818
   
Licenses: BSD-3-Clause, MIT

Apache Mesos

Claimed by Apache Software Foundation. Analyzed about 3 hours ago.

Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.

664K lines of code

114 current contributors

2 days since last commit

31 users on Open Hub

Very High Activity
4.0
   
Apache Hive

Claimed by Apache Software Foundation. Analyzed 16 days ago.

Hive is a data warehouse infrastructure built on top of Hadoop that provides tools to enable easy data summarization, ad hoc querying and analysis of large datasets stored in Hadoop files. It provides a mechanism to put structure on this data, and it also provides a simple query language called Hive QL, which is based on SQL and enables users familiar with SQL to query this data. At the same time, this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers to do more sophisticated analysis which may not be supported by the built-in capabilities of the language.

1.49M lines of code

93 current contributors

16 days since last commit

23 users on Open Hub

Very High Activity
5.0
 
GlusterFS

  Analyzed about 3 hours ago

GlusterFS is a distributed file system capable of scaling to several petabytes. It aggregates various storage bricks over InfiniBand RDMA or TCP/IP into one large parallel network file system. Storage bricks can be made of any commodity hardware, such as x86-64 servers with SATA RAID, and can use InfiniBand HBAs.

1.23M lines of code

137 current contributors

2 days since last commit

22 users on Open Hub

Very High Activity
4.6
   
Licenses: GNU GPL v2, LGPL-3.0+

DragonFly BSD

  Analyzed about 1 hour ago

DragonFly BSD is a UNIX-like operating system that has been continuously developed since it forked from FreeBSD 4.8 in 2003. The development focus is on innovation and performance, as well as usability. Nearly 22,000 third-party software packages are available through the Ports Collection it shares with FreeBSD (known as DPorts). Currently only the x86_64 architecture is officially supported.

8.57M lines of code

20 current contributors

about 1 hour since last commit

20 users on Open Hub

Very High Activity
4.78571
   
Proxmox VE

  Analyzed about 21 hours ago

Proxmox VE is a complete open source virtualization management solution for servers based on KVM and OpenVZ. It manages virtual machines, storage, virtualized networks, and HA Clustering. Proxmox VE is an open source project, developed and maintained by Proxmox Server Solutions GmbH.

106K lines of code

16 current contributors

3 days since last commit

20 users on Open Hub

High Activity
3.0
   
hazelcast

  Analyzed about 3 hours ago

Hazelcast is a clustering and highly scalable data distribution platform for Java.

Features:
- Distributed implementations of java.util.{Queue, Set, List, Map}
- Distributed implementation of java.util.concurrent.locks.Lock
- Distributed implementation of java.util.concurrent.ExecutorService
- Distributed MultiMap for one-to-many relationships
- Distributed Topic for publish/subscribe messaging
- Transaction support and J2EE container integration via JCA
- Socket level encryption support for secure clusters
- Synchronous (write-through) and asynchronous (write-behind) persistence
- Second level cache provider for Hibernate
- Monitoring and management of the cluster via JMX
- Dynamic HTTP session clustering
- Support for cluster info and membership events
- Dynamic discovery
- Dynamic scaling
- Dynamic partitioning with backups
- Dynamic fail-over

Hazelcast is for you if you want to:
- share data/state among many servers (e.g. web session sharing)
- cache your data (distributed cache) for better performance
- cluster your application
- provide secure communication among servers
- partition your in-memory data
- send/receive messages among applications
- distribute workload onto many servers
- take advantage of parallel processing
- provide fail-safe data management

Hazelcast is pure Java. JVMs that are running Hazelcast will dynamically cluster. Although by default Hazelcast will use multicast for discovery, it can also be configured to only use TCP/IP for environments where multicast is not available or not preferred. Communication among cluster members is always TCP/IP with Java NIO beauty. Default configuration comes with 1 backup, so if one node fails, no data will be lost. It is as simple as using java.util.{Queue, Set, List, Map}. Just add hazelcast.jar to your classpath and start coding. A test application comes with the Hazelcast distribution that simulates the queue, set, map and lock APIs. You may want to watch the following 12 minute screencast to quickly get started.

763K lines of code

56 current contributors

2 days since last commit

14 users on Open Hub

Very High Activity
5.0
 
Kazoo Platform

  Analyzed about 18 hours ago

Kazoo is a scalable, distributed, cloud-based telephony platform that allows you to build powerful telephony applications with a rich set of APIs. Designed to handle anything from large carriers to small countries, the Whistle infrastructure can do it all. There are no lock-ins, and the software is open-source to give you complete freedom.

Services include:
- Complete redundancy and failover between data centers
- Complete replication of all data
- Use of Map/Reduce algorithms inside NoSQL databases
- Multi-master replication and caching of registrations, active channels and call lookups
- Load balancing built-in
- Event driven messaging for managing and using calls
- A complete REST interface for implementing call flow features

269K lines of code

36 current contributors

1 day since last commit

14 users on Open Hub

High Activity
5.0
 
ceph

  Analyzed about 2 hours ago

Ceph is a distributed network file system designed to provide excellent performance, reliability, and scalability. Ceph fills two significant gaps in the array of currently available file systems:
1. Robust, open-source distributed storage: Ceph is released under the terms of the LGPL, which means it is free software (as in speech and beer). Ceph will provide a variety of key features that are generally lacking from existing open-source file systems, including seamless scalability (the ability to simply add disks to expand volumes), intelligent load balancing, and efficient, easy to use snapshot functionality.
2. Scalability: Ceph is built from the ground up to seamlessly and gracefully scale from gigabytes to petabytes and beyond. Scalability is considered in terms of ...

990K lines of code

346 current contributors

about 22 hours since last commit

13 users on Open Hub

Very High Activity
5.0
 