openhub.net
WatchList
Inactive
Commits: Listings
Analyzed 3 days ago, based on code collected 3 days ago.
Mar 17, 2025 — Mar 17, 2026
Remove some anntaylor parser codes
juny78
about 15 years ago
Merge branch 'master' of github.com:sichen/WatchList
juny78
about 15 years ago
Burberry Parser
juny78
about 15 years ago
add a class to initialize database. we can use the following command to initialize database used by watch list: bin/nutch initwldb script/initdb.sql
Si Chen
about 15 years ago
re-factor the configuration of watchlist database access, added a class for connection pooling.
Si Chen
about 15 years ago
Implement the IdGenerator
Si Chen
about 15 years ago
Test the repo
juny78
about 15 years ago
test the repo
juny78
about 15 years ago
Add an option to log_analysis script, to skip test data generation. This is good for statistics collection only
Si Chen
about 15 years ago
Add the functionality to log_analysis script to count the total extracted product records
Si Chen
about 15 years ago
Fix some misc bugs.
Si
about 15 years ago
A little optimization for crawl, get rid of some urls
Si Chen
about 15 years ago
Reduce the number of test cases to avoid Java OutOfMemory Exception.
Si Chen
about 15 years ago
Fix bugs I encountered in my first round of real jcrew site crawl. Added lots of test cases. All of them pass, but I have to disable most of them because there are too many of them and Java VM run out of memory.
Si Chen
about 15 years ago
Add a python script to parse hadoop.log to detect JCrewParser errors. This script also helps generating Java unit test from those failed pages.
Si Chen
about 15 years ago
1. Fix a bug I encountered when I am doing real JCrew site crawl 2. add a automatic crawl script to run Nutch
Si
about 15 years ago
Implement the parser and index/query filter for JCrew. Added JUnit test cases for the JCrewParser class.
Si Chen
about 15 years ago
Si Chen: create the initial repository with Nutch source code use this as the code baseline for future development
Si Chen
about 15 years ago
first commit with README
Si Chen
over 15 years ago