Quantitative Textual Analysis: Introduction to Text Mining|
|Certificates: Applied Data Science|
Computerized text processing appears in almost every area of life. Google tries to infer the meaning of our search queries, online review engines try to extract information about which products are popular with users, and scholars across fields analyze text for insights into the processes and phenomena they study. This course introduces the skills needed to mine text for information and knowledge. You will learn how to use R to retrieve text from a variety of sources, how to use regular expressions to identify the pieces of text relevant to your study, and how to apply data mining techniques to the processed text for information extraction, classification, and prediction.
||Gen Ed Area Dept:
|Course Format: Lecture / Discussion||Grading Mode: Graded|
||Prerequisites: QAC211 OR ECON300 OR [GOVT367 or QAC302]
||Fulfills a Major Requirement for: (CADS)(DATA-MN)
||Past Enrollment Probability: Not Available
|Special Attributes: CQC|
|Major Readings: Wesleyan RJ Julia Bookstore
Aggarwal, Charu, and ChengXiang Zhai, eds., MINING TEXT DATA. Springer Verlag, 2012. Available online through Wesleyan library: http://link.springer.com/book/10.1007%2F978-1-4614-3223-4
Jockers, Matthew L., TEXT ANALYSIS WITH R FOR STUDENTS OF LITERATURE. Springer Verlag, 2014. Available online through Wesleyan library: http://link.springer.com/book/10.1007%2F978-3-319-03164-4
|Examinations and Assignments: |
Several homework assignments, two take-home midterms, and a final project. Part of the grade depends on in-class participation/preparedness.
|Additional Requirements and/or Comments: |
An introductory statistics/data analysis background is a prerequisite for the course, which is why QAC201, 211, or 221 is listed as a formal prerequisite. The Professor will approve prerequisite overrides for students who satisfy this basic requirement through other coursework. The course includes a strong lab component, and programming in R is a significant part of the coursework.
|Instructor(s): Oleinikov,Pavel V Times: ..T.R.. 01:10PM-02:30PM; Location: ALLB107; |
|Total Enrollment Limit: 19||SR major: 0||JR major: 0|
|Seats Available: 13||GRAD: 1||SR non-major: 7||JR non-major: 7||SO: 4||FR: 0|
|Drop/Add Enrollment Requests|
|Total Submitted Requests: 0||1st Ranked: 0||2nd Ranked: 0||3rd Ranked: 0||4th Ranked: 0||Unranked: 0|