CS 838 Data Science Project (Spring 2017)
Perform entity matching on heterogeneous, structured datasets derived from two leading restaurant review sites (Yelp and Zomato) to extract insights.
Stage 0
- Form the project team: Tarun Bansal (tbansal@wisc.edu), Ayush Gupta (gupta92@wisc.edu) and Rohit Damkondwar (damkondwar@wisc.edu)
Stage 1
- Define the DS problem and collect structured data
- Report of this stage can be found here.
Stage 2
- Perform information extraction (IE) from natural text documents, using a supervised learning approach.
- Link to directory containing the text documents.
- Link to directory containing the text documents in Set I (200 documents).
- Link to directory containing the text documents in Set J (105 documents).
- Link to directory containing the code.
- Link to a compressed file that stores all of the above directories.
- Report of this stage can be found here.
Stage 3
- Entity Matching
- Link to directory containing the data.
- Link to directory containing the code.
- Report of this stage can be found here.
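
As a rough illustration of what entity matching between two restaurant tables involves, the sketch below blocks candidate pairs on zip code and scores them by Jaccard similarity of name tokens. It is not the project's code: the field names (`id`, `name`, `zip`), the sample rows, and the 0.5 threshold are all hypothetical, and the actual Stage 3 pipeline may use different features, blocking rules, or a learned matcher.

```python
import re

def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def match_records(table_a, table_b, threshold=0.5):
    """Return (id_a, id_b, score) for same-zip pairs whose names are similar."""
    # Blocking step: index table B by zip code so only same-zip pairs are compared.
    by_zip = {}
    for rec in table_b:
        by_zip.setdefault(rec["zip"], []).append(rec)

    matches = []
    for a in table_a:
        for b in by_zip.get(a["zip"], []):
            score = jaccard(tokens(a["name"]), tokens(b["name"]))
            if score >= threshold:
                matches.append((a["id"], b["id"], score))
    return matches

# Made-up rows standing in for the Yelp and Zomato tables (hypothetical schema).
yelp = [
    {"id": "y1", "name": "Blue Door Cafe", "zip": "53703"},
    {"id": "y2", "name": "Luigi Pizza", "zip": "53703"},
]
zomato = [
    {"id": "z1", "name": "Blue Door", "zip": "53703"},
    {"id": "z2", "name": "Luigi Pizza Kitchen", "zip": "53703"},
    {"id": "z3", "name": "Blue Door Cafe", "zip": "60601"},
]

matches = match_records(yelp, zomato)
for id_a, id_b, score in matches:
    print(id_a, id_b, round(score, 2))
```

Blocking keeps the number of compared pairs small. The price is that a record with a wrong or missing zip code is never compared, so a real pipeline would usually check how many true matches the blocking step drops.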
Stage 4
Stage 5
- Data Analysis
- E.csv
- Python script to run the prediction models
- Jupyter Notebook
- Report
TEAM MEMBERS
Tarun Bansal (tbansal@wisc.edu), Ayush Gupta (gupta92@wisc.edu) and Rohit Damkondwar (damkondwar@wisc.edu)