o Configure the runtime options as:
• Name: AdLoad
• Main->Project: bcounts
• Main->Main class: com.cloudera.bayesiancounters.util.Driver
• Arguments->Program arguments: loader /tmp/ad.small ad
• Arguments->VM arguments: -Dlog4j.configuration=file:debug-log4j.properties -Xmx1024M
o Click Apply, then click Run, and view the Console tab of the parent window

6.33

Perform NB inference on the Ad dataset
Run->Run Configurations…
o Java Application-> (right click) -> New
o Click on the New_configuration to edit its settings on the right
o Configure the runtime options as:
• Name: AdInference
• Main->Project: bcounts
• Main->Main class: com.cloudera.bayesiancounters.util.Driver
• Arguments->Program arguments: nb ad 604800 "sepal_length=5;petal_length=1.4" class=2
• Arguments->VM arguments: -Dlog4j.configuration=file:debug-log4j.properties
o Click Apply, then click Run, and view the Console tab of the parent window
o Wait for the results to appear in the Console tab and for execution to complete.
o Close Eclipse
Note: These results come from running bcounts on only 2 lines of the input data. A small or medium-sized cluster is recommended for processing the entire ad.data file. See the Cloudera Manager documentation for cluster size specifications.
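
The AdInference run configuration can also be written as a plain command line. The following is a rough sketch, not a tested recipe: the classpath in BCOUNTS_CP is a placeholder, because this guide does not say where Eclipse places the compiled bcounts classes, and the command is assumed to be run from the directory that contains debug-log4j.properties. The main point is the quoting: in a shell the feature list must be quoted, since an unquoted semicolon would end the command.

```shell
# Placeholder classpath (an assumption): replace with the real bcounts build output and jars.
BCOUNTS_CP="${BCOUNTS_CP:-/path/to/bcounts/classes:/path/to/bcounts/lib/*}"

# Build the argument list once; single quotes keep the semicolon inside one argument.
set -- java \
  -Dlog4j.configuration=file:debug-log4j.properties \
  -cp "$BCOUNTS_CP" \
  com.cloudera.bayesiancounters.util.Driver \
  nb ad 604800 'sepal_length=5;petal_length=1.4' class=2

# Dry run: print one argument per line to check what the JVM would receive.
for arg in "$@"; do
  printf '[%s]\n' "$arg"
done

# Uncomment to launch for real once the classpath is correct:
# "$@"
```

The dry run makes it easy to confirm that the feature list arrives as a single argument before anything is launched.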

6.34

Create a bag-of-words file from the configuration file
su poulin
cd /home/poulin/bcounts-0.1.0-SNAPSHOT/
python273 ./bin/sp_bag_of_words.py ./conf/bayesiancounters-site.xml /tmp/bag-of-words
tail /tmp/bag-of-words

worker
working
wreckage
xvi
yates
young
SP_increase
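
Beyond looking at the tail, a quick sanity check of the generated file can be useful. The sketch below assumes the file holds one entry per line, as the output above suggests. The helper name bag_summary is made up for this illustration and is not part of bcounts. It reports the number of entries, any repeated entries, and how many entries start with SP_ (like SP_increase above; this guide does not explain what that prefix means).

```shell
# bag_summary FILE: print basic statistics for a one-entry-per-line word file.
# Hypothetical helper for illustration only; not shipped with bcounts.
bag_summary() {
  file="$1"
  if [ ! -f "$file" ]; then
    echo "missing: $file" >&2
    return 1
  fi
  # wc -l counts newline-terminated lines, i.e. one per entry.
  echo "entries: $(wc -l < "$file" | tr -d ' ')"
  # Sorting makes repeats adjacent; uniq -d then prints each repeated entry once.
  sort "$file" | uniq -d | sed 's/^/duplicate: /'
  # grep -c prints the number of matching lines (0 when none match).
  echo "SP_ entries: $(grep -c '^SP_' "$file")"
}

# Summarize the file produced in this step; ignore the error if it does not exist yet.
bag_summary /tmp/bag-of-words || true
```

If no "duplicate:" lines are printed, every entry in the file is unique.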