A few examples of recovered documents are shown in Table 1.
Tables 2 and 3 report the results of our experiments on
different combinations of index types, heuristics, pruning strategies, domains,
and document lengths, measured by BLEU score.
We make the following observations:
1. The results show two conditions under which we can recover documents with good
success: (i) if the original document is short (Table 2, ``'' row), or (ii) if the index is a bigram
count vector (Table 2, ``bigram'' column). Long documents, given a unigram BOW, are much more difficult to
recover. This makes intuitive sense: when the original document is short or the index
preserves ordering constraints, the feasible set is small, which makes recovery
easier.
2. Among unigram BOWs, the index type affects the recovery rate. It is
easiest to recover documents from count BOWs, somewhat harder to recover from
indicator BOWs, and hardest of all from stopwords-removed count BOWs (Table 2, ``counts, stopwords, indicator'' columns). The fact
that we must infer the document length from the BOW contributes to the
difficulty of the latter two index types; when we artificially substitute the
true original document length for our estimated document length, recovery
improves, especially for short documents where each word is relatively more
important (Table 2, ``'' vs. ``
'' columns).
3. The domains vary in difficulty of recovery. The medical and stock domains seem the easiest (rows in Table 2). This may be because they are both more similar to general Web text than, for example, Switchboard, and our language model is trained on Web text.
4. Finally, Table 3 shows that our choice of heuristic and pruning strategy affects recovery. The
empirical heuristic performs consistently better than the admissible heuristic.
As for pruning strategies, is substantially the worst, but the other
two strategies,
and
, yield more or less equally good
results. However, it is worth noting that A*
search is much faster using
than it is using
. For example, producing the third
column of Table 3 took about an hour, while the second column took
over a day.
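
To make the index types in observation 2 and the feasible-set argument in observation 1 concrete, the following is a minimal sketch, not our experimental code. It assumes whitespace tokenization, a toy stopword list (`STOPWORDS`), and hypothetical document-boundary markers (`<d>`, `</d>`) for the bigram index; all function names are illustrative. For a unigram count BOW with word counts c_1, ..., c_k and length n, the number of distinct orderings is the multinomial coefficient n! / (c_1! ... c_k!), which grows rapidly with document length.

```python
from collections import Counter
from math import factorial

# Toy stopword list, assumed for illustration only.
STOPWORDS = {"the", "a", "of", "to", "and"}


def count_bow(tokens):
    """Unigram count BOW: word -> number of occurrences."""
    return Counter(tokens)


def indicator_bow(tokens):
    """Indicator BOW: only which words occur; counts and length are lost."""
    return set(tokens)


def stopwords_removed_bow(tokens, stopwords=STOPWORDS):
    """Count BOW with stopwords dropped; document length is lost."""
    return Counter(t for t in tokens if t not in stopwords)


def bigram_bow(tokens):
    """Bigram count vector over the document padded with boundary markers
    (the markers are an assumption of this sketch)."""
    padded = ["<d>"] + list(tokens) + ["</d>"]
    return Counter(zip(padded, padded[1:]))


def num_orderings(bow):
    """Number of distinct word sequences consistent with a unigram count BOW:
    n! / (c_1! * ... * c_k!)."""
    total = factorial(sum(bow.values()))
    for c in bow.values():
        total //= factorial(c)
    return total


doc = "the cat sat on the mat".split()
print(num_orderings(count_bow(doc)))  # 6! / 2! = 360
```

Even this six-word document admits 360 orderings under a count BOW. The indicator and stopwords-removed indices additionally hide the document length, so the set of consistent documents is larger still, while the bigram index restricts the candidates to orderings that reproduce every adjacent word pair.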