Proseminar: Web Scraping
QAC 239
Spring 2018
Section: 01 |
This course may be repeated for credit. |
Crosslisting: CIS 239 |
Certificates: Applied Data Science |
"Scraping the Web" is an introduction to the collection, measurement, and management of publicly available information (data) from the World Wide Web, using Python and R programming tools. |
Credit: .5 |
Gen Ed Area Dept: None |
Course Format: Laboratory | Grading Mode: Student Option |
Level: UGRD |
Prerequisites: COMP112 OR QAC155 OR QAC156 |
Fulfills a Requirement for: (CADS)(DATA-MN)(PSYC) |
Past Enrollment Probability: 90% or above |
SECTION 01 - 3rd Quarter | Special Attributes: CQC |
Major Readings: Wesleyan RJ Julia Bookstore
Moat, Helen Susannah et al. QUANTIFYING WIKIPEDIA USAGE PATTERNS BEFORE STOCK MARKET MOVES. The University of Warwick. Scientific Reports, Volume 3. Article number 1801. Available online at: http://wrap.warwick.ac.uk/54525/1/WRAP_Moat_srep01801.pdf
Morstatter, Fred and Huan Liu. DISCOVERING, ASSESSING, AND MITIGATING DATA BIAS IN SOCIAL MEDIA. Elsevier preprint. Available online at: http://www.public.asu.edu/~fmorstat/paperpdfs/osnem_preprint.pdf
Munzert, Simon, Christian Rubba, Peter Meissner, and Dominic Nyhuis. AUTOMATED DATA COLLECTION WITH R. A PRACTICAL GUIDE TO WEB SCRAPING AND TEXT MINING. Wiley Publishers, Hoboken, 2014. Available online through Wesleyan library at: https://ebookcentral.proquest.com/lib/wesleyan/detail.action?docID=1824310
Examinations and Assignments: Weekly programming assignments, discussion of methodological literature, and a term project. |
Additional Requirements and/or Comments: The course requires a basic programming background, which is why courses such as COMP112, QAC155, and QAC156 are formal prerequisites. The professor will approve prerequisite overrides for students who satisfy this basic requirement through other coursework. The course includes a strong lab component, and programming in R and Python is a significant part of the coursework. |
Instructor(s): Oleinikov, Pavel V | Times: ...W... 07:10PM-10:00PM; Location: ALLB204 |
Total Enrollment Limit: 19 | SR major: 0 | JR major: 0 |
Seats Available: 7 | GRAD: X | SR non-major: 7 | JR non-major: 7 | SO: 5 | FR: 0 |
Drop/Add Enrollment Requests |
Total Submitted Requests: 0 | 1st Ranked: 0 | 2nd Ranked: 0 | 3rd Ranked: 0 | 4th Ranked: 0 | Unranked: 0 |