van Deemter, Krahmer, and Theune. (2004). Real vs. template-based natural
language generation: a false opposition?
http://www.csd.abdn.ac.uk/~kvdeemte/mock-rae-2004/templates-squib-8pages.pdf

Argues that the line between template-based NLG systems and "real" NLG systems
(where every word is allowed to vary) is blurry.

Walker et al. (2002). Training a Sentence Planner for Spoken Dialogue Using
Boosting. http://www.dcs.shef.ac.uk/~walker/spot-csl-5.pdf

Uses a boosting-based ranking algorithm to learn a ranking function over
candidate sentence plans, which is then used to select the best plan when
planning text. Also has references to ML work in NLG.

Langkilde and Knight. (1998). Generation that Exploits Corpus-Based Statistical
Knowledge.
http://portal.acm.org/citation.cfm?id=980451.980963&coll=GUIDE&dl=GUIDE,

Langkilde. (2000). Forest-Based Statistical Sentence Generation.
http://citeseer.ist.psu.edu/langkilde00forestbased.html

Uses a "forest" instead of a "lattice" to represent the set of possible
alternative phrases. Because shared sub-phrases are stored only once,
exponentially fewer candidates need to be evaluated in order to choose the
optimal sentence to generate. Very useful, I think.
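
A minimal sketch of why a packed representation helps, under toy assumptions
(this is not the paper's actual data structure). The forest is modeled as a
sequence of independent choice points, the per-phrase scores are made up, and
the scorer is assumed to decompose per choice point, which a real n-gram model
would not do exactly. All names here are hypothetical.

```python
from itertools import product
from math import prod

# Toy "forest": each slot is an independent choice point with alternatives.
# Scores are hypothetical per-phrase log-probabilities.
FOREST = [
    {"the dog": -1.0, "a dog": -1.5},
    {"barked": -0.5, "has barked": -2.0, "did bark": -2.5},
    {"loudly": -0.7, "with a loud voice": -3.0},
]

def count_sentences(forest):
    """Number of distinct sentences the packed forest stands for."""
    return prod(len(slot) for slot in forest)

def count_stored_phrases(forest):
    """Number of phrases actually stored in the packed forest."""
    return sum(len(slot) for slot in forest)

def best_by_enumeration(forest):
    """Score every full sentence: work grows as the product of slot sizes."""
    best = max(
        product(*(slot.items() for slot in forest)),
        key=lambda combo: sum(score for _, score in combo),
    )
    return " ".join(phrase for phrase, _ in best)

def best_by_packing(forest):
    """Pick the best phrase per slot: work grows as the sum of slot sizes."""
    return " ".join(max(slot, key=slot.get) for slot in forest)
```

Both functions return the same sentence on this toy input, but enumeration
touches every combination while the packed version touches each stored phrase
once.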

Langkilde-Geary. (2002). An Empirical Verification of Coverage and Correctness
for a General-Purpose Sentence Generator.
http://www.cs.rutgers.edu/~mdstone/inlg02/112.pdf

Describes a sentence generator, HALogen, which generates a "forest" of possible
sentences and ranks them using a statistical method. Also describes a way to
evaluate sentence generators: take a sentence from the Penn Treebank, derive a
generator input from its treebank parse, generate a sentence from that input,
and calculate the BLEU score between the original sentence and the generated
one (if I understand correctly).
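
A rough sketch of the scoring step only, assuming a simplified sentence-level
BLEU (clipped n-gram precision up to bigrams, with a brevity penalty). Standard
BLEU is computed over a whole corpus and usually up to 4-grams; the function
names below are made up for illustration and are not from the paper.

```python
from collections import Counter
from math import exp, log

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(reference, candidate, max_n=2):
    """Simplified sentence-level BLEU between two whitespace-tokenized strings."""
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clipped matches: a candidate n-gram counts at most as often
        # as it appears in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(log(matched / total))
    # Brevity penalty: punish candidates shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else exp(1 - len(ref) / len(cand))
    return brevity * exp(sum(log_precisions) / max_n)
```

In the evaluation described above, `reference` would be the original treebank
sentence and `candidate` the generator's output for the input derived from its
parse.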

Learning templates:

Lu, Zhou, Li, Huang, and Zhao. (2001). Automatic Translation Template
Acquisition Based on Bilingual Structure Alignment.

Uses an aligned bilingual corpus + LM to extract templates for translation.

McTait. (2001). Linguistic Knowledge and Complexity in an EBMT System Based on
Translation Patterns.

Extracts sentences from bilingual corpus that are translations of each other.

Carl. (1999). Inducing Translation Templates for Example-Based Machine
Translation.

Extracts more complicated, general templates from bilingual corpus.

--
Poetry specifically:

Manurung, Graeme Ritchie, Henry Thompson. (2000). Towards A Computational Model
of Poetry Generation.  http://citeseer.ist.psu.edu/manurung00towards.html

Uses stochastic hill climbing search (evolutionary algorithm) to find the best
poem. They evaluate candidates based on their rhythm; they mutate them using a
``semantic explorer'', a ``semantic realizer'', and a ``syntactic
paraphraser''. They use hand-crafted grammar and lexicon to do all this,
however.
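
A toy sketch of stochastic hill climbing against a rhythm objective, to make
the search loop concrete. Everything here is an assumption for illustration:
the mini-lexicon, the stress patterns, the target meter, and the purely lexical
mutation operators (the paper's operators work on semantics and syntax, and a
full evolutionary algorithm would keep a population rather than one candidate).

```python
import random

# Hypothetical mini-lexicon: word -> stress pattern (1 = stressed, 0 = unstressed).
LEXICON = {
    "the": "0", "a": "0", "night": "1", "moon": "1", "cold": "1",
    "alone": "01", "above": "01", "silver": "10", "river": "10",
}
TARGET = "01010101"  # iambic tetrameter, written as a stress string

def stress(words):
    """Concatenate the stress patterns of a word sequence."""
    return "".join(LEXICON[w] for w in words)

def score(words):
    """Negative count of mismatches against the target meter (0 is perfect)."""
    pattern = stress(words)
    mismatches = sum(a != b for a, b in zip(pattern, TARGET))
    return -(mismatches + abs(len(pattern) - len(TARGET)))

def mutate(words, rng):
    """Randomly replace, insert, or delete one word."""
    words = list(words)
    op = rng.choice(["replace", "insert", "delete"])
    pos = rng.randrange(len(words))
    if op == "replace":
        words[pos] = rng.choice(list(LEXICON))
    elif op == "insert":
        words.insert(pos, rng.choice(list(LEXICON)))
    elif len(words) > 1:  # delete, but never empty the line
        del words[pos]
    return words

def hill_climb(start, steps=2000, seed=0):
    """Keep a single candidate; accept a mutation if it is no worse."""
    rng = random.Random(seed)
    best = list(start)
    for _ in range(steps):
        candidate = mutate(best, rng)
        if score(candidate) >= score(best):  # accept ties to keep exploring
            best = candidate
    return best
```

Calling `hill_climb(["moon"])` returns a word list whose score is never lower
than the starting line's, since worse candidates are always rejected.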

Manurung. (2003). An evolutionary algorithm approach to poetry generation.
(Ph.D. thesis).

Continues the above. Includes lots of relevant discussions.

Gerv\'as. (2000). An Expert System for the Composition of Formal Spanish
Poetry.
and
Gerv\'as. (2000). WASP: Evaluation of Different Strategies for the Automatic
Generation of Spanish Verse.

Uses a rule-based system to generate Spanish poetry. The rules were manually
made by reviewing academic literature on poetry.

Gerv\'as. (2002). Exploring Quantitative Evaluations of the Creativity of
Automatic Poets.

Develops evaluation metrics for poetry.

Pease, Winterstein, and Colton. (2001). Evaluating Machine Creativity.

Distinguish between the process and the results, among other things.

Ritchie. (2001). Assessing Creativity.

What makes a program creative?

--
MARGINALLY RELEVANT:

Freund et al. (2003). An Efficient Boosting Algorithm for Combining
Preferences. JMLR 4.
http://jmlr.csail.mit.edu/papers/volume4/freund03a/freund03a.pdf

Collins and Koo. (2003). Discriminative Reranking for Natural Language Parsing.
http://citeseer.ist.psu.edu/cache/papers/cs2/128/http:zSzzSzpeople.csail.mit.eduzSzmaestrozSzpaperszSzcollins05cl.pdf/collins00discriminative.pdf