Moderate Activity
Analyzed about 8 hours ago, based on code collected about 8 hours ago.

Project Summary

Declarative large-scale machine learning (ML) that aims at flexible specification of ML algorithms and automatic generation of hybrid runtime plans, ranging from single-node, in-memory computations to distributed computations on Apache Hadoop and Apache Spark.
ML algorithms are expressed in an R-like or Python-like syntax that includes linear algebra primitives, statistical functions, and ML-specific constructs. This high-level language significantly increases the productivity of data scientists as it provides (1) full flexibility in expressing custom analytics, and (2) data independence from the underlying input formats and physical data representations. Automatic optimization according to data and cluster characteristics ensures both efficiency and scalability.

Tags

cluster distributed dml hadoop java machine_learning pydml python spark

Apache License 2.0
Permitted: Commercial Use, Modify, Distribute, Place Warranty, Sub-License, Private Use, Use Patent Claims
Forbidden: Hold Liable, Use Trademarks
Required: Include Copyright, State Changes, Include License, Include Notice

These details are provided for information only. Nothing here is legal advice, and it should not be used as such.

This project has no vulnerabilities reported against it.

Languages

  • HTML: 60%
  • Java: 28%
  • JavaScript: 7%
  • 13 Other: 5%

30 Day Summary

Jan 11 2026 — Feb 10 2026

12 Month Summary

Feb 10 2025 — Feb 10 2026
  • 218 Commits
    Down 199 (47%) from previous 12 months
  • 38 Contributors
    Down 5 (11%) from previous 12 months