Data Matching
Read the book chapter draft on string matching that I will send to the class mailing list. You should read and understand the material in Sections 1.1, 1.2.1 (here read only the edit distance, Needleman-Wunsch, and Smith-Waterman measures), 1.2.2, 1.2.3 (here read only the Generalized Jaccard Measure and the soft TF/IDF measure), and 1.3 (here read only Inverted Index over Strings, Size Filtering + B-Tree Index, and Prefix Filtering + Inverted Index over Prefixes).

IR / Web Search
Read Chapter 27 (IR and XML Data) of the Cow Book, but only from 27.1 to 27.5.
Read the paper IR overview. Ignore Sections 2.2 and 2.3.
Read Web search, PageRank.

Information Extraction
Scan the tutorial Managing information extraction.
Read Datalog applied to information extraction (scan only Section 5).

Data Warehousing, OLAP
Read Chapter 25 (Data Warehousing and Decision Support) of the Cow Book.

Data Mining
Read Chapter 26 (Data Mining) of the Cow Book (scan Sections 26.4 to 26.8).

Big Data Analysis
Read the following papers:
MapReduce: simplified data processing on large clusters
MapReduce and parallel DBMSs: friends or foes?
MapReduce: a flexible data processing tool

User Feedback / Mass Collaboration
Read the following papers:
Building community wikipedias: a human-machine approach
Mass collaboration systems on the World-Wide Web