Real Estate II
Abstract
This data set lists houses for sale.
It is intended for schema matching experiments, in particular complex schema matching.
Contribution
Original Owner and Donor
Anhai Doan
Department of Computer Science
University of Illinois, Champaign-Urbana
anhai@cs.uiuc.edu
Date Donated: February 6, 2004
Description
- The data was created from Real Estate I by removing, merging, and splitting certain schema elements.
The objective was to create real-estate schemas that have a fair number of complex matchings.
- The data was collected as a designed experiment for schema matching.
- Publications that have used this data so far:
  - Learning to Map between Structured Representations of Data, A. Doan. Ph.D. Dissertation, Univ. of Washington-Seattle, 2002.
  - iMAP: Discovering Complex Semantic Matches between Database Schemas, A. Doan, Y. Lee, R. Dhamankar, A. Halevy, and P. Domingos. Proc. of the ACM SIGMOD Conf. on Management of Data. To appear.
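The removing, merging, and splitting of schema elements mentioned in the description can be pictured with a minimal Python sketch. It is only an illustration: every attribute name and value below (first_name, last_name, location, fax, price) is a hypothetical placeholder and is not taken from Real Estate I or from this data set.

```python
# Illustrative sketch only: all attribute names and values are hypothetical
# and do not come from the Real Estate I or Real Estate II files.

def derive_record(record):
    """Derive a new record by removing, merging, and splitting elements."""
    out = dict(record)  # work on a copy so the input stays unchanged

    # Remove: drop an element entirely.
    out.pop("fax", None)

    # Merge: combine two elements into one.
    out["agent_name"] = f'{out.pop("first_name")} {out.pop("last_name")}'

    # Split: break one element into two.
    city, state = out.pop("location").split(", ", 1)
    out["city"] = city
    out["state"] = state
    return out


base = {
    "first_name": "Pat",
    "last_name": "Lee",
    "location": "Urbana, IL",
    "fax": "n/a",
    "price": 150000,
}
derived = derive_record(base)
```

Matching the derived schema back to the base one then requires a complex match for the merged element (agent_name corresponds to a combination of first_name and last_name) rather than a one-to-one match.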
Data Format
This data set consists of the original data set and one sample mapping, whose target data was generated from the original data.
- Original Data Set
  - The data was selected from Real Estate I.
  - Each XML file can be browsed individually, or all files can be downloaded as a single gzipped tar archive.
- Sample Mapping
  - We asked a volunteer to examine the attributes and create complex query formulas that combine them.
  - We created sample target sources by applying the query formulas to a subset of the original data.
  - The target source is divided into two files so that it contains a join path. (Here, the join path is "agent-id".)
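The way the sample mapping was built can be sketched roughly as follows: query formulas are applied to source rows, and the resulting target is divided into an agent part and a house part that share the agent identifier. This is a minimal Python sketch under stated assumptions: the formulas, attribute names, and rows are hypothetical and are not the volunteer's actual formulas or the contents of the mapping files.

```python
# Illustrative sketch only: the formulas, attribute names, and rows below are
# hypothetical; they are not the formulas or data shipped with this data set.

source_rows = [
    {"agent_id": 1, "agent_first": "Pat", "agent_last": "Lee",
     "full_baths": 2, "half_baths": 1, "price": 200000},
    {"agent_id": 1, "agent_first": "Pat", "agent_last": "Lee",
     "full_baths": 1, "half_baths": 0, "price": 120000},
    {"agent_id": 2, "agent_first": "Sam", "agent_last": "Kim",
     "full_baths": 3, "half_baths": 2, "price": 350000},
]

# Each target attribute is defined by a formula over source attributes.
formulas = {
    "agent_name": lambda r: f"{r['agent_first']} {r['agent_last']}",
    "num_baths": lambda r: r["full_baths"] + 0.5 * r["half_baths"],
    "price": lambda r: r["price"],
}


def apply_formulas(rows):
    """Build target rows by evaluating every formula on every source row."""
    return [
        {"agent_id": r["agent_id"],
         **{name: formula(r) for name, formula in formulas.items()}}
        for r in rows
    ]


def split_target(target_rows):
    """Divide the target into agent rows and house rows joined on agent_id."""
    agents = {}
    houses = []
    for row in target_rows:
        agents[row["agent_id"]] = {
            "agent_id": row["agent_id"],
            "agent_name": row["agent_name"],
        }
        houses.append({
            "agent_id": row["agent_id"],
            "num_baths": row["num_baths"],
            "price": row["price"],
        })
    return list(agents.values()), houses


target = apply_formulas(source_rows)
agents, houses = split_target(target)
```

A matcher working on such a target has to recover both the formulas and the join on the agent identifier in order to relate the two target files back to the source.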
Data Files
- Data Sources
- Mapping
  - Source
  - Target (join path: "agent_id")
    - Agent Details
    - House Details
Illini Semantic Integration Archive
Department of Computer Science
University of Illinois, Champaign-Urbana
Urbana, IL 61801
Last modified: February 6, 2004