Inactive

Commits : Listings

Analyzed 3 days ago, based on code collected 3 days ago.
Mar 17, 2025 — Mar 17, 2026
Commit Message Contributor Files Modified Lines Added Lines Removed Code Location Date
Remove some anntaylor parser codes | about 15 years ago
Merge branch 'master' of github.com:sichen/WatchList | about 15 years ago
Burberry Parser | about 15 years ago
add a class to initialize database. we can use the following command to initialize database used by watch list. bin/nutch initwldb script/initdb.sql | about 15 years ago
re-factor the configuration of watchlist database access, added a class for connection pooling. | about 15 years ago
Implement the IdGenerator | about 15 years ago
Test the repo | about 15 years ago
test the repo | about 15 years ago
Add an option to log_analysis script, to skip test data generation. This is good for statistics collection only | about 15 years ago
Add the functionality to log_analysis script to count the total extracted product records | about 15 years ago
Fix some misc bugs. Si | about 15 years ago
A little optimization for crawl, get rid of some urls | about 15 years ago
Reduce the number of test cases to avoid Java OutOfMemory Exception. | about 15 years ago
Fix bugs I encountered in my first round of real jcrew site crawl. Added lots of test cases. All of them pass, but I have to disable most of them because there are too many of them and Java VM run out of memory. | about 15 years ago
Add a python script to parse hadoop.log to detect JCrewParser errors. This script also helps generating Java unit test from those failed pages. | about 15 years ago
1. Fix a bug I encountered when I am doing real JCrew site crawl 2. add a automatic crawl script to run Nutch Si | about 15 years ago
Implement the parser and index/query filter for JCrew. Added JUnit test cases for the JCrewParser class. | about 15 years ago
Si Chen: create the initial repository with Nutch source code use this as the code baseline for future development | about 15 years ago
first commit with README | over 15 years ago