Title: Generating Different Tellings of Stories and Answers to Questions from Narrative Representations
Abstract: Sharing our experiences by storytelling is a fundamental and prevalent aspect of human social behavior. A critical aspect of storytelling in the wild is that it is socially interactive and situation dependent. Storytellers dynamically adjust their narratives to the context and their audience, telling and retelling the same story in many different ways depending on who the listener is. In order to tell stories in different voices for different audiences, interactive story systems require: (1) a semantic representation of the narrative structure, and (2) the ability to generate expressively different utterances from this semantic representation. This talk presents our current research on methods for linking narrative representations to the Personage expressive NLG engine in order to automatically generate tellings and engage in dialogue using the narrative representation. Such a capability has applications far beyond interactive story systems: the inability of various NLP applications to talk back in a coherent way is becoming more and more apparent as NLU technology becomes more advanced.
Bio: Marilyn Walker is a Professor of Computer Science, head of the Natural Language and Dialogue Systems Lab, and Associate Dean of Graduate Studies in the Baskin School of Engineering at the University of California, Santa Cruz (UCSC). She received her Ph.D. in Computer Science in 1993 from the University of Pennsylvania. Before coming to UCSC, she was a Professor of Computer Science at the University of Sheffield, where she was a Royal Society Wolfson Research Merit Fellow, recruited to the U.K. under Britain’s “Brain Gain” program. From 1996 to 2003, she was a Principal Member of Research Staff in the Speech and Information Processing Lab at AT&T Bell Labs and AT&T Research. While she was at AT&T, she was a PI on two DARPA projects. The first was the Communicator Evaluation project, where she chaired the Evaluation Committee and led the design of the cross-site evaluation experiments with implementation help from NIST. The second DARPA-funded project was the AT&T Communicator project, where she developed a new architecture for spoken dialogue systems and statistical methods for dialogue management and generation. While at AT&T she received the AT&T Labs Mentoring Award in 2001 for her excellence in mentoring Ph.D. students and junior researchers. She has served on many program committees both as reviewer and as senior area chair, organized dozens of workshops, and was the Program Chair for ACL 2004 and IVA 2012. She was a member of the founding board of the North American ACL, serving to set up and orchestrate its first conference between 1998 and 2001. She has supervised 16 doctoral students and 30 undergraduate researchers. She has published over 200 papers and holds 10 granted U.S. patents. Her H-index, a measure of research impact, is 43.
Title: Large-scale paraphrasing for natural language generation
Abstract: I will present my method for learning paraphrases – pairs of English expressions with equivalent meaning – from bilingual parallel corpora, which are more commonly used to train statistical machine translation systems. My method equates pairs of English phrases like thrown into jail and imprisoned when they share an aligned foreign phrase like festgenommen. Because bitexts are large and because a phrase can be aligned to many different foreign phrases (including phrases in multiple foreign languages), the method extracts a diverse set of paraphrases. For thrown into jail, we not only learn imprisoned, but also arrested, detained, incarcerated, jailed, locked up, taken into custody, and thrown into prison, along with a set of incorrect/noisy paraphrases. I’ll show a number of methods for filtering out the poor paraphrases: defining a paraphrase probability calculated from translation model probabilities, and re-ranking the candidate paraphrases using monolingual distributional similarity measures.
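The pivoting idea above can be sketched in a few lines: a paraphrase probability is obtained by marginalizing over shared foreign pivot phrases, p(e2|e1) = Σ_f p(e2|f) · p(f|e1). The toy translation tables and all probabilities below are invented for illustration, not real model output:

```python
# Sketch of bilingual pivoting for paraphrase extraction.
# p(e2 | e1) = sum over foreign pivots f of p(e2 | f) * p(f | e1).
# The phrase tables here are hypothetical toy data.

from collections import defaultdict

# p(f | e): foreign phrase given an English phrase (toy values)
p_f_given_e = {
    "thrown into jail": {"festgenommen": 0.6, "eingesperrt": 0.4},
}

# p(e | f): English phrase given a foreign phrase (toy values)
p_e_given_f = {
    "festgenommen": {"arrested": 0.5, "imprisoned": 0.3, "thrown into jail": 0.2},
    "eingesperrt": {"imprisoned": 0.6, "locked up": 0.3, "thrown into jail": 0.1},
}

def paraphrase_probs(e1):
    """Score candidate paraphrases of e1 by marginalizing over pivots."""
    scores = defaultdict(float)
    for f, p_f in p_f_given_e.get(e1, {}).items():
        for e2, p_e in p_e_given_f.get(f, {}).items():
            if e2 != e1:  # a phrase is not a paraphrase of itself
                scores[e2] += p_e * p_f
    # Highest-probability candidates first
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))

print(paraphrase_probs("thrown into jail"))
# -> {'imprisoned': 0.42, 'arrested': 0.3, 'locked up': 0.12}
```

Note how imprisoned outranks the noisier candidates because it is reachable through both pivots; this is the diversity effect the abstract describes.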
In addition to lexical and phrasal paraphrases, I’ll show how the bilingual pivoting method can be extended to learn meaning-preserving syntactic transformations like the English possessive rule or dative shift. I’ll describe a way of using synchronous context-free grammars (SCFGs) to represent these rules. This formalism allows us to re-use much of the machinery from statistical machine translation to perform sentential paraphrasing. We can adapt our “paraphrase grammars” to do monolingual text-to-text generation tasks like sentence compression or simplification.
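To illustrate the idea of a synchronous rule, here is a deliberately minimal stand-in for the English possessive alternation, written as a paired source/target pattern. A real SCFG-based system would apply learned grammar rules via a parser; this regex version only shows the synchronous-rewrite concept, and the rule and example sentence are invented for illustration:

```python
# Toy synchronous rewrite for the English possessive alternation:
#   source side:  "the N1 of the N2"
#   target side:  "the N2's N1"
# A genuine SCFG rule would pair nonterminals on both sides; the regex
# capture groups below play that role in this simplified sketch.

import re

POSSESSIVE_RULE = (
    re.compile(r"\bthe (\w+) of the (\w+)\b"),  # source-side pattern
    r"the \2's \1",                             # target-side template
)

def apply_rule(sentence):
    """Apply the paired rewrite wherever the source pattern matches."""
    pattern, template = POSSESSIVE_RULE
    return pattern.sub(template, sentence)

print(apply_rule("the opinion of the president"))
# -> "the president's opinion"
```

Because both sides of the rule share the same variables, the transformation is meaning-preserving by construction, which is exactly what the SCFG formalism guarantees at the grammar level.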
I’ll also briefly sketch future directions for adding a semantics to the paraphrases, which my lab will be exploring in the DARPA DEFT program.
Bio: Chris Callison-Burch is an assistant professor in the Computer and Information Science Department at the University of Pennsylvania. Before joining Penn, he was a research faculty member at the Center for Language and Speech Processing at Johns Hopkins University for six years. He was the Chair of the Executive Board of the North American chapter of the Association for Computational Linguistics (NAACL) from 2011 to 2013, and has served on the editorial boards of the journals Transactions of the ACL (TACL) and Computational Linguistics. Callison-Burch has more than 80 publications, which have been cited more than 5000 times. He is a Sloan Research Fellow, and has received faculty research awards from Google, Microsoft and Facebook in addition to funding from DARPA and the NSF.