<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="/syllabus.xsl"?>
<syllabus><subject>Computer Engineering</subject><prefix>COEN</prefix><course>281</course><title>Pattern Recognition and Data Mining</title><term>Spring</term><year>2011</year><section>67863</section><component><what>Lec (67863)</what><when>TTh 07:10-09pm</when><where>KENNA 308</where></component><note id="526F56FC"><label>Description</label><text>How does an online retailer decide which product to recommend to you based on your previous purchases? How do bio-scientists decide how many different types of a disease exist? How do computers rank web pages in response to a user query? In this two-part course we introduce some of the computational methods currently used to answer these and similar questions. Topics include association rules, clustering, data visualization, logistic regression, neural networks, decision trees, ensemble methods, and text mining. We'll describe each covered algorithm as a "tuple" consisting of &lt;task, model structure, score function, parameter search method, data management technique&gt;, which provides a useful framework for understanding and comparing different techniques.</text><table><type>bulleted</type></table></note><note id="44E217B2"><label>Units</label><text>4 </text><table><type>bulleted</type></table></note><note id="27735804"><label>Prerequisites</label><text>Introductory courses in probability and linear algebra (e.g., AMTH 210 and 245), and some programming experience beyond a first course (or permission from instructor).</text><table><type>bulleted</type></table></note><note id="2086C01E"><label>Expected Learning Outcomes</label><text>Use the language R to conduct statistical and graphical analysis of data. 
Build regression and classification models from data.</text><table><type>bulleted</type></table></note><note id="40F1CD4B"><label>Evaluation</label><text>There will be bi-weekly assignments in R to apply the methods discussed in class, and a take-home final project. Bi-weekly work is to be done in groups of two; a partner will be assigned randomly for each project. </text><table><type>bulleted</type></table></note><note id="2FC0341B"><label>References</label><text> </text><table><type>bulleted</type><item><col>Duda, Hart, Stork, "Pattern Classification", Wiley, 2nd ed., 2001.</col><col>Required</col><col></col></item><item><col>Seni, Elder, "Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions", Morgan and Claypool, 2010.</col><col>Required</col><col></col></item><item><col>Hand, Mannila, Smyth, "Data Mining", MIT Press, 2001.</col><col>Optional</col><col></col></item><item><col>Venables, Ripley, "Modern Applied Statistics with S", Springer, 2003.</col><col>Optional</col><col></col></item></table></note><note id="3224A097"><label>Week, Topics, Reading</label><text> </text><table><type>bulleted</type><item><col>Week 1</col><col>Introduction; R</col><col></col></item><item><col>Week 2</col><col>Bayesian Decision Theory; Parameter Estimation</col><col>2.1-2.6, 2.9; 3.1-3.4; see also 4.5 HMS</col></item><item><col>Week 3</col><col>Linear Discriminant Functions; Regularization </col><col>3.8.2, 5.1-5.8; Ch.3 SE</col></item><item><col>Week 4</col><col>Neural Networks</col><col>6.1-6.6, 6.8, 8.3, 8.4</col></item><item><col>Week 5</col><col>Support Vector Machines</col><col>5.11</col></item><item><col>Week 6</col><col>Decision Trees</col><col>8.3; Ch.2 SE</col></item><item><col>Week 7</col><col>Ensemble Models</col><col>Ch.4, Ch.5 SE</col></item><item><col>Week 8</col><col>Clustering</col><col>10.6, 10.7; see also 9.3-9.6 HMS 10.9, 10.10</col></item><item><col>Week 9</col><col>Non-metric: Association Rules; Visualization: SOMs </col><col>5.3.2 HMS; 
10.14</col></item><item><col>Week 10</col><col>Text Retrieval</col><col>14.1-14.3 HMS</col></item></table></note><note id="2E12C401"><label>Angel</label><text>Visit angel.scu.edu weekly for lecture notes. Post homework questions here first for the benefit of all students (instead of mailing the instructor directly).</text><table><type>bulleted</type></table></note><staff><name>Giovanni Seni</name><webpage></webpage><photo>graphic/giovanni.seni.gif</photo><attribute><name>Role</name><value>Instructor</value></attribute><attribute><name>Email</name><value>gseni<at>@</at>scu.edu</value></attribute><attribute><name>Hours</name><value>By Appointment</value></attribute><attribute><name>Personal Page</name><value><url>http://gseni.minedata2learn.com/</url></value></attribute></staff><staff><name>Stacie Riddle</name><webpage></webpage><photo>graphic/371e245.gif</photo><attribute><name>Role</name><value>TA</value></attribute><attribute><name>Email</name><value>coen281<at>@</at>gmail.com</value></attribute></staff></syllabus>
