Google Books Visualization Services

Michael Gleicher and Xiujun Li


Status: not open yet!


Data  §   Multi-Words Search  §   Event Detections  §   Words Percent  §   Words Ring
Words Tree  §   3Grams Words 1  §   3Grams Words 2  §   2Grams Trees

Aggregations
Long S problem  §   Sentence Matrix  §   Top N words (Bars & Lines)
Normalized vs. Non-normalized lines  §   Word Frequency & Activity  §   ongoing

Introduction

We created this project to investigate and explore the Google Books datasets. I worked on it under the supervision of Prof. Michael Gleicher, who originally proposed the idea.

All samples here are live online and backed by the back-end database, so you can try them out.


Server


Web Search Service

Sentence Visualization

http://sepc111.se.cuhk.edu.hk:8080/gbooks/sent/sen.php

http://sepc111.se.cuhk.edu.hk:8080/gbooks/sent/

Description:
Separate multiple phrases with a comma (',').


About Data

A page on data quality:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/data

Sample homepage:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/data

A graph search engine for word counts by year, decade, or century:

http://sepc111.se.cuhk.edu.hk:8080/gbooks/wordcent/

Sample homepage:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/wordcent

Normalized Data:

http://sepc111.se.cuhk.edu.hk:8080/gbooks/nm/fre.php

Sample homepage:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/fre

Description:
Normalization amplifies or dampens the influence of factors such as the number of books, and focuses on the time period of interest.
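
As a rough illustration of the normalization idea, the sketch below divides a word's yearly count by the number of books for that year and keeps only a chosen time window. The function name, the input dictionaries, and the per-book scaling are assumptions made for illustration; the formula actually used by the service is not described on this page.

```python
def normalize_counts(word_counts, book_counts, start_year, end_year):
    """Return {year: occurrences per book} for years in [start_year, end_year].

    word_counts: {year: raw occurrences of the word}   (hypothetical input)
    book_counts: {year: number of books in the corpus} (hypothetical input)
    """
    normalized = {}
    for year in range(start_year, end_year + 1):
        books = book_counts.get(year, 0)
        if books == 0:
            continue  # no books that year, so the ratio is undefined
        normalized[year] = word_counts.get(year, 0) / books
    return normalized


# Made-up numbers, for illustration only (not taken from the dataset).
raw = {1900: 50, 1901: 120, 1902: 30}
books = {1900: 10, 1901: 40, 1902: 10}
print(normalize_counts(raw, books, 1900, 1902))
```

With these toy numbers, 1901 has the largest raw count but no longer stands out once the larger number of books in that year is taken into account.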

Example: Top 10 by my algorithm

thefe, fuch, fome, moft, muft, firft, fhould, himfelf, againft, themfelves

Description:
The algorithm aggregates the interesting words so that they rise to the top of the dataset.


Applications

Sentence

http://sepc111.se.cuhk.edu.hk:8080/gbooks/sent/

http://sepc111.se.cuhk.edu.hk:8080/gbooks/sent/sen.php

Description:
Separate multiple phrases with a comma (',').

3Grams Words

http://sepc111.se.cuhk.edu.hk:8080/gbooks/3grams/

Sample homepage:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/3grams

Multi-words Search Graph

http://sepc111.se.cuhk.edu.hk:8080/gbooks/multiwords.php

Sample homepage:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/multiwords

Description:
Separate multiple phrases with a comma (',').
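
The comma convention can be sketched as a small parsing step. This is only a guess at how such input might be split; the helper name `split_phrases` and the trimming of surrounding whitespace are assumptions, not the service's documented behavior.

```python
def split_phrases(query):
    """Split a comma-separated query into trimmed, non-empty phrases."""
    return [part.strip() for part in query.split(",") if part.strip()]


print(split_phrases("Olympics, world war , peace"))
```

Note that a phrase may itself contain spaces, which is why the comma rather than the space serves as the separator.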

Events Detection

Sample homepage:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/events

Words Percent

A pie chart showing word counts by year, decade, or century.

Sample homepage:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/wordcent
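
The year, decade, and century groupings behind the Words Percent chart could be computed along the following lines. This is a minimal sketch under assumed inputs: the function names and the yearly-count dictionary are hypothetical, and the back-end database may well do this grouping in a different way.

```python
from collections import defaultdict


def aggregate(counts_by_year, bucket_size):
    """Sum yearly counts into buckets of bucket_size years (10 = decade, 100 = century)."""
    totals = defaultdict(int)
    for year, count in counts_by_year.items():
        bucket_start = (year // bucket_size) * bucket_size
        totals[bucket_start] += count
    return dict(totals)


def percentages(totals):
    """Convert bucket totals into percentage shares, as a pie chart would display them."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {bucket: 100.0 * value / grand_total for bucket, value in totals.items()}


# Made-up numbers, for illustration only (not taken from the dataset).
yearly = {1895: 10, 1899: 30, 1901: 20, 1905: 40}
by_decade = aggregate(yearly, 10)
by_century = aggregate(yearly, 100)
print(by_decade, by_century, percentages(by_decade))
```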

Words Tree

Explores the correlation between character strings.

Application homepage:

http://sepc111.se.cuhk.edu.hk:8080/gbooks/wordtree/

Sample homepage:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/wordtree

Words Ring

Explores the correlation between character strings.

Application homepage:

http://sepc111.se.cuhk.edu.hk:8080/gbooks/wordring/

Sample homepage:

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/wordring


Events Detection with Usage Frequency

An Example: Olympics

http://sepc111.se.cuhk.edu.hk:8080/gbooks/search.php?keyword=Olympics&submit=Search+Graph

Description:
In the small trend graph, some points stand out because they are higher than their surrounding points.
The large, detailed trend graph gives more precise information. These points correspond to:
the 1976 Montreal Summer Olympics, 1980 Moscow, 1988 Seoul, 1996 Atlanta, 2000 Sydney, and 2004 Athens.
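
The idea of a point standing higher than its neighbours can be sketched as a simple local-maximum test. This is an illustrative assumption about how such points might be picked out, not the detection method used by the service, and the frequencies below are invented.

```python
def find_peaks(series):
    """Return the years whose value is strictly higher than both neighbouring years.

    series: list of (year, frequency) pairs sorted by year.
    """
    peaks = []
    for i in range(1, len(series) - 1):
        prev_value = series[i - 1][1]
        year, value = series[i]
        next_value = series[i + 1][1]
        if value > prev_value and value > next_value:
            peaks.append(year)
    return peaks


# Made-up frequencies, for illustration only (not taken from the dataset).
toy = [(1994, 2.0), (1995, 2.1), (1996, 3.5), (1997, 2.4),
       (1998, 2.2), (1999, 2.6), (2000, 3.9), (2001, 2.8)]
print(find_peaks(toy))
```

A real series would probably need smoothing or a minimum height threshold first, since this test also flags small fluctuations.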

An Example: Bush

http://sepc111.se.cuhk.edu.hk:8080/gbooks/search.php?keyword=Bush&submit=Search+Graph

Description:
In the small trend graph, we can see two peaks:
One falls within George H. W. Bush's administration (1989-1993), and the other within George W. Bush's administration (2001-2009).
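
Both example links above follow the same query pattern, so a link for another keyword could be built as sketched below. The `keyword` and `submit` parameters are taken from the example URLs; the helper name `search_graph_url` is hypothetical, and how the server handles other keywords is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "http://sepc111.se.cuhk.edu.hk:8080/gbooks/search.php"


def search_graph_url(keyword):
    """Build a graph-search URL in the same form as the example links."""
    # urlencode escapes the values and turns the space in "Search Graph" into "+".
    query = urlencode({"keyword": keyword, "submit": "Search Graph"})
    return f"{BASE_URL}?{query}"


print(search_graph_url("Olympics"))
```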

More interesting examples are listed under Examples.

Examples


A test sample

http://pages.cs.wisc.edu/~lixiujun/samples/gbooks/test/sample

Description:
This is my first, very simple sample. I built it to test the Google dataset and see whether there are any interesting graphs to explore.

1-grams word history trend Text Search

http://sepc111.se.cuhk.edu.hk:8080/gbooks/

Description:
A text search engine for word history trends.

1-grams word history trend Graph Search

http://sepc111.se.cuhk.edu.hk:8080/gbooks/

Description:
A graph search engine for word history trends: overview (left) and details (right).

Note: view this page in Firefox or Chrome.