Boosting Algorithm
2008
- Lecture Slides: (pdf) (6/24, 2008)
- Report Submission Site: http://lecture.utgenome.org/lms2008/
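- Note: If you use a u-tokyo.ac.jp address, the registration confirmation e-mail may arrive late. See also: ECC, "E-mail does not arrive". Be sure to complete at least the registration step early.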
How to Submit Your Report
- First, open http://lecture.utgenome.org/lms2008/ and click the 'New User?' link. Enter your student ID number, full name (Kanji is allowed), and e-mail address, then press the register button.
- Be careful not to input the wrong information.
- Then wait for the confirmation e-mail to arrive. It may take a few minutes (ECC intentionally delays some e-mails).
- The confirmation step is required in order to reset your password.
- The title of the confirmation email is "Welcome to the Report Submission System", and the sender is "noreply@…". If you cannot find the e-mail, check your spam mail folder.
- Click the link in your confirmation e-mail.
- Enter your e-mail address and password to log in at http://lecture.utgenome.org/lms2008/
- Pack your source code and report documents into a single zip file.
- Select your zip file and push the submit button to send your report.
- You can confirm the content of your report file by clicking the link on the file name.
Report Task
Deadline: July 31st (reports submitted by 8:00 a.m. on August 1st will be accepted)
Task 1. Implement the AdaBoost algorithm. (You may use any programming language, but I recommend Java for practice, since the other classes in our curriculum are also based on Java.)
- If you use the sample program as it is, you will receive a mark of C (50) or lower.
- If you implement the algorithm from scratch, you will receive a mark of C (50) or higher.
- You may use OptionParser, Logger, etc., because these components are merely helper utilities that can be used for other purposes.
- We highly appreciate reproducible experimental results (Tasks 3 to 6). Prepare JUnit test code, Eclipse launch files, a Makefile (GNU Make), an Ant build.xml, etc., so that we can repeat your experiments. You may use any method to run your experiments, including scripts, as long as it works properly.
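As a starting point for Task 1, one round of boosting can be sketched as follows. This is a minimal sketch, assuming the AdaBoost.M1 formulation with beta = error / (1 - error), which is consistent with the error and beta values in the sample output on this page. The class and method names (BoostingRound, weightedError, beta, updateWeights) are hypothetical and are not taken from the sample program.

```java
/** Hypothetical helper: one round of AdaBoost.M1 on 0/1 labels (not the sample program's code). */
final class BoostingRound {

    /** Sum of the weights of the samples that the weak learner gets wrong. */
    static double weightedError(double[] weights, boolean[] predicted, boolean[] label) {
        double error = 0.0;
        for (int i = 0; i < weights.length; i++) {
            if (predicted[i] != label[i]) {
                error += weights[i];
            }
        }
        return error;
    }

    /** beta = error / (1 - error); this sketch assumes 0 < error < 0.5. */
    static double beta(double error) {
        return error / (1.0 - error);
    }

    /** Multiplies the weights of correctly classified samples by beta, then renormalizes to sum 1. */
    static double[] updateWeights(double[] weights, boolean[] predicted, boolean[] label, double beta) {
        double[] next = new double[weights.length];
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            next[i] = (predicted[i] == label[i]) ? weights[i] * beta : weights[i];
            sum += next[i];
        }
        for (int i = 0; i < next.length; i++) {
            next[i] /= sum;
        }
        return next;
    }
}
```

For example, with four samples of weight 0.25 each and one misclassified sample, the error is 0.25 and beta is 1/3; after the update the misclassified sample carries weight 0.5, so the next weak learner is pushed to classify it correctly.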
Task 2. Explain the data structures and program flow of your source code.
- You can do this by adding appropriate comments to your source code.
- Readable names for classes, methods, and variables also help to make your code easy to understand.
Task 3. Show the classifiers generated from sample input data (0/1 binary data).
Task 4. Analyse how the weight values and error ratio vary according to several parameters (e.g., the number of classifiers, loop count of the weak learning process, data sizes, etc.).
- Use graphs or tables to show these changes.
- Then, discuss the results.
Task 5. Measure the computation time and error ratio when you increase the size of the input data (e.g. # of rows = 100, 1000, ...)
- Draw graphs or tables, then discuss the results.
- Characteristics of the input data might be an important factor.
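For the timing part of Task 5, a harness along the following lines may be enough. It is only a sketch: TimingHarness, randomBinaryData, and trainPlaceholder are hypothetical names, the random 0/1 generator is an assumption about what the input could look like, and trainPlaceholder must be replaced by a call to your own training code.

```java
import java.util.Random;

/** Hypothetical timing harness; trainPlaceholder stands in for your own AdaBoost code. */
final class TimingHarness {

    /** Generates rows x cols random 0/1 values with a fixed seed so that runs are repeatable. */
    static int[][] randomBinaryData(int rows, int cols, long seed) {
        Random random = new Random(seed);
        int[][] data = new int[rows][cols];
        for (int[] row : data) {
            for (int c = 0; c < cols; c++) {
                row[c] = random.nextInt(2);
            }
        }
        return data;
    }

    /** Placeholder workload that just counts the 1s; replace with your training call. */
    static long trainPlaceholder(int[][] data) {
        long ones = 0;
        for (int[] row : data) {
            for (int v : row) {
                ones += v;
            }
        }
        return ones;
    }

    /** Returns the elapsed wall-clock time of one run in milliseconds. */
    static double timeMillis(int[][] data) {
        long start = System.nanoTime();
        trainPlaceholder(data);
        return (System.nanoTime() - start) / 1e6;
    }

    public static void main(String[] args) {
        // Tab-separated output that can be pasted into a spreadsheet or plotting tool.
        System.out.println("rows\tmillis");
        for (int rows = 100; rows <= 100000; rows *= 10) {
            int[][] data = randomBinaryData(rows, 5, 42L);
            System.out.println(rows + "\t" + timeMillis(data));
        }
    }
}
```

Single measurements on the JVM are noisy because of JIT compilation and garbage collection, so repeating each size several times and reporting the median is usually advisable.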
Task 6. Extend the weak learning process so that the generated classifiers use two or more attribute values of the input data.
- For example, you can do this by implementing the Classifier interface in the sample code (e.g., as MyClassifier) and writing the necessary methods: predict(), getBetaValue(), etc.
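One possible shape for such a classifier is sketched below. The signatures of predict() and getBetaValue() in the sample program's Classifier interface are not shown on this page, so SketchClassifier is a hypothetical stand-in with assumed signatures, and TwoColumnClassifier is an illustrative name. Adapt both to the real interface.

```java
/** Hypothetical stand-in for the sample program's Classifier interface; the real signatures may differ. */
interface SketchClassifier {
    boolean predict(int[] row);
    double getBetaValue();
}

/** Predicts true only when both chosen columns hold their expected 0/1 values. */
final class TwoColumnClassifier implements SketchClassifier {
    private final int firstColumn;
    private final int firstValue;
    private final int secondColumn;
    private final int secondValue;
    private final double beta;

    TwoColumnClassifier(int firstColumn, int firstValue,
                        int secondColumn, int secondValue, double beta) {
        this.firstColumn = firstColumn;
        this.firstValue = firstValue;
        this.secondColumn = secondColumn;
        this.secondValue = secondValue;
        this.beta = beta;
    }

    @Override
    public boolean predict(int[] row) {
        // Conjunction of two attribute tests, i.e. a rule over two columns.
        return row[firstColumn] == firstValue && row[secondColumn] == secondValue;
    }

    @Override
    public double getBetaValue() {
        return beta;
    }
}
```

A weak learner built on this would enumerate pairs of columns and expected values, compute the weighted error of each candidate, and keep the best one. The conjunction used here is only one choice; a disjunction or a small decision table over the two columns would also satisfy the task.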
2007
- Lecture Slides: (pdf) (7/10, 7/17, 2007)
Sample Program
- Source Codes: sandbox/trunk/AdaBoost
- If you have the Subclipse plug-in for Eclipse, or TortoiseSVN (on Windows), check out the following path as an Eclipse project:
- JAR Package: http://www.xerial.org/svn/project/sandbox/trunk/AdaBoost/AdaBoost.jar
Usage
> java -jar AdaBoost.jar -v (training data file) (test data file)
- Without the -v option, the program prints nothing.
- With the -m (num weak learner) option, you can change the number of WeakLearners to be generated.
- Sample Output
leo@leopardcat:~/../work/workspace/AdaBoost> java -jar AdaBoost.jar test/org/xerial/mining/adaboost/sampledata.tab test/org/xerial/mining/adaboost/sampledata.tab -v
[WeakLearner]: learner: col=3 on true error= 0.1875, beta= 0.23076923076923078
[WeakLearner]: learner: col=0 on false error= 0.23076923076923075, beta= 0.3
[WeakLearner]: learner: col=2 on true error= 0.31666666666666665, beta= 0.46341463414634143
[WeakLearner]: learner: col=3 on true error= 0.23780487804878045, beta= 0.31199999999999994
[WeakLearner]: learner: col=2 on true error= 0.3280000000000001, beta= 0.4880952380952383
[AdaBoost]: num learner:5
[AdaBoost]: learner:[col=3 on true, col=0 on false, col=2 on true, col=3 on true, col=2 on true]
[AdaBoost]: sample: 0 final=4.117466979642955 lowerBound =2.6607198919844453
[AdaBoost]: sample: 1 final=4.117466979642955 lowerBound =2.6607198919844453
[AdaBoost]: sample: 2 final=4.117466979642955 lowerBound =2.6607198919844453
[AdaBoost]: sample: 3 final=1.4863778196768729 lowerBound =2.6607198919844453
[AdaBoost]: sample: 4 final=1.4863778196768729 lowerBound =2.6607198919844453
[AdaBoost]: sample: 5 final=2.6310891599660815 lowerBound =2.6607198919844453
[AdaBoost]: sample: 6 final=2.6310891599660815 lowerBound =2.6607198919844453
[AdaBoost]: sample: 7 final=2.6310891599660815 lowerBound =2.6607198919844453
[AdaBoost]: sample: 8 final=3.835061964292018 lowerBound =2.6607198919844453
[AdaBoost]: sample: 9 final=5.3214397839688905 lowerBound =2.6607198919844453
[AdaBoost]: sample: 10 final=3.835061964292018 lowerBound =2.6607198919844453
[AdaBoost]: sample: 11 final=3.835061964292018 lowerBound =2.6607198919844453
[AdaBoost]: sample: 12 final=3.835061964292018 lowerBound =2.6607198919844453
[AdaBoost]: sample: 13 final=2.690350624002809 lowerBound =2.6607198919844453
[AdaBoost]: sample: 14 final=1.2039728043259361 lowerBound =2.6607198919844453
[AdaBoost]: sample: 15 final=2.690350624002809 lowerBound =2.6607198919844453
[AdaBoost]: [true, true, true, false, false, false, false, false, true, true, true, true, true, true, false, true]
[AdaBoost]: 1 0 1 1 1 => 1
[AdaBoost]: 1 0 1 1 1 => 1
[AdaBoost]: 1 1 1 1 1 => 1
[AdaBoost]: 1 1 1 0 0 => 0
[AdaBoost]: 1 0 1 0 0 => 0
[AdaBoost]: 1 1 0 1 0 => 0
[AdaBoost]: 1 0 0 1 0 => 0
[AdaBoost]: 1 1 0 1 0 => 0
[AdaBoost]: 0 1 0 1 1 => 1
[AdaBoost]: 0 0 1 1 1 => 1
[AdaBoost]: 0 1 0 1 1 => 1
[AdaBoost]: 0 1 0 1 1 => 1
[AdaBoost]: 0 0 0 1 1 => 1
[AdaBoost]: 0 0 1 0 0 => 1
[AdaBoost]: 0 1 0 0 0 => 0
[AdaBoost]: 0 0 1 0 0 => 1
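The numbers in the sample output can be checked by hand. They are consistent with beta = error / (1 - error), a vote weight of ln(1 / beta) for each learner, a "final" score equal to the summed vote weights of the learners that predict 1, and a lowerBound equal to half of the total vote weight. This reading is inferred from the printed values, not from the sample program's source, and it assumes that the first four values of each data row are attribute columns 0 to 3. A minimal sketch that recomputes the scores (SampleOutputCheck and its members are hypothetical names):

```java
/** Hypothetical check of the sample output; the formulas are inferred from the printed numbers. */
final class SampleOutputCheck {
    // beta values copied from the [WeakLearner] lines of the sample output
    static final double[] BETAS = {
        0.23076923076923078, 0.3, 0.46341463414634143, 0.31199999999999994, 0.4880952380952383
    };
    // Column tested by each learner, and the attribute value on which it votes for class 1
    // ("col=3 on true" becomes column 3 with value 1, "col=0 on false" becomes column 0 with value 0).
    static final int[] COLUMNS = {3, 0, 2, 3, 2};
    static final int[] VOTE_ON = {1, 0, 1, 1, 1};

    /** Sum of ln(1 / beta) over the learners that vote for class 1 on this row. */
    static double finalScore(int[] row) {
        double score = 0.0;
        for (int t = 0; t < BETAS.length; t++) {
            if (row[COLUMNS[t]] == VOTE_ON[t]) {
                score += Math.log(1.0 / BETAS[t]);
            }
        }
        return score;
    }

    /** Half of the total vote weight of all learners. */
    static double lowerBound() {
        double total = 0.0;
        for (double beta : BETAS) {
            total += Math.log(1.0 / beta);
        }
        return 0.5 * total;
    }

    public static void main(String[] args) {
        int[] sample0 = {1, 0, 1, 1};  // attribute columns of sample 0
        System.out.println("final=" + finalScore(sample0) + " lowerBound=" + lowerBound());
    }
}
```

For sample 0 this gives a score of about 4.1175 against a lower bound of about 2.6607, matching the log. In the output every sample whose score lies above the lower bound is classified as true, so samples 13 and 15 (score about 2.6904, label 0) are the two training errors of the combined classifier.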
Programming Tips
Logging
- See the logger code in sandbox/trunk/AdaBoost/src/org/xerial/util/log/Logger.java, which makes it easier to debug your programs.
- Related : log4j http://logging.apache.org/log4j/docs/
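The API of the Logger.java mentioned above is not reproduced on this page. As a rough illustration of the same idea, the sketch below uses the JDK's built-in java.util.logging instead (a swapped-in library, not the xerial Logger or log4j) and maps a verbose flag such as -v to a log level. VerboseLogging, levelFor, and the logger name are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example: maps a verbose flag to a java.util.logging level. */
final class VerboseLogging {
    private static final Logger LOGGER = Logger.getLogger("adaboost.sketch");

    /** INFO when verbose, otherwise WARNING so that progress messages are suppressed. */
    static Level levelFor(boolean verbose) {
        return verbose ? Level.INFO : Level.WARNING;
    }

    public static void main(String[] args) {
        boolean verbose = args.length > 0 && args[0].equals("-v");
        LOGGER.setLevel(levelFor(verbose));
        LOGGER.info("learner: col=3 on true");      // shown only with -v
        LOGGER.warning("no training data given");   // shown in both modes
    }
}
```

Routing debug output through a logger rather than System.out.println lets you switch it off for timing experiments without editing the algorithm code.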
JUnit Tests
Links
Attachments
- AdaBoost.zip (262.6 kB) - added by leo 18 months ago.
- enshu20070710.pdf (263.0 kB) - added by leo 18 months ago.
- adaboost.pdf (469.1 kB) - added by leo 7 months ago.


