Apache Tika

Users: 23

Activity Not Available

Analyzed 4 months ago, based on code collected 4 months ago.

Project Summary

The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries.

Tika is a project of the Apache Software Foundation, and was formerly a subproject of Apache Lucene.

Tags

apache content java lucene metadata mime parser tika

In a Nutshell, Apache Tika...

has had 14,973 commits made by 192 contributors, representing 441,758 lines of code
is mostly written in Java, with an average number of source code comments
has a well-established, mature codebase, maintained by a large development team, with stable year-over-year commits
took an estimated 118 years of effort (COCOMO model), starting with its first commit in March 2007 and ending with its most recent commit 4 months ago
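
The 118-year effort figure above is attributed to the COCOMO model. As a rough sketch of how such an estimate can be computed, the snippet below applies the basic COCOMO formula in organic mode (effort in person-months = 2.4 × KLOC^1.05) to the 441,758 lines of code reported above. The function name `basic_cocomo_person_years` is hypothetical, and the organic-mode coefficients are an assumption: this page does not state which COCOMO variant or which line count the estimate uses, so the result should only be expected to land in the same range as the published figure.

```python
def basic_cocomo_person_years(lines_of_code: int,
                              a: float = 2.4,
                              b: float = 1.05) -> float:
    """Estimate effort in person-years with basic COCOMO.

    a and b default to the organic-mode coefficients; whether the
    figure on this page was produced with them is an assumption.
    """
    if lines_of_code < 0:
        raise ValueError("lines_of_code must be non-negative")
    kloc = lines_of_code / 1000          # thousands of lines of code
    person_months = a * kloc ** b        # basic COCOMO effort equation
    return person_months / 12            # convert months to years


if __name__ == "__main__":
    # Line count taken from the summary above.
    estimate = basic_cocomo_person_years(441_758)
    print(f"Estimated effort: {estimate:.1f} person-years")
```

Because the exponent is slightly above 1, the estimate grows a little faster than the code size, so small differences in the counted lines shift the result by a few person-years.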

Quick Reference

Organization: Apache Software Foundation

Code Locations: 2

Licenses

Apache License 2.0

Permitted: Commercial Use, Modify, Distribute, Place Warranty, Sub-License, Private Use, Use Patent Claims

Forbidden: Hold Liable, Use Trademarks

Required: Include Copyright, State Changes, Include License, Include Notice

These details are provided for information only. Nothing here is legal advice, and it should not be used as such.

Project Security

[Chart: Vulnerabilities per Version (last 10 releases)]

[Scale: Security Confidence Index, from poor to favorable security track record]

[Scale: Vulnerability Exposure Index, from many to few reported vulnerabilities]

Code: [chart: Lines of Code]

Activity: [chart: Commits per Month]

Community: [chart: Contributors per Month]

Languages

Java	84%
XML	13%
13 Other	3%

30 Day Summary

Feb 22 2026 — Mar 24 2026

104 Commits
5 Contributors

12 Month Summary

Mar 24 2025 — Mar 24 2026

881 Commits
Down 49 (5%) from previous 12 months
19 Contributors
Down 4 (17%) from previous 12 months

Most Recent Contributors

Tamas Cservenak
tallison
Tim Allison
Tilman Hausherr
Nicholas DiPiazza
dependabot[bot]

Ratings

6 users rate this project: 5.0/5.0
