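Re: Linguistic and methodological foundations of multimodal discourse research
Quoting xujiajin's post of 2005-6-30 16:41:57:
Everyone is welcome to help look for literature on the multimodal approach and for information on related research.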
Several sessions at the LREC 2004 conference focus on this topic. If you are interested in any of the papers, I will try to upload them.
--------------------------------------------------------------------------------
VOLUME I
Introductory Messages
Maria Teresa Lino, Message of the Chair of the Local Organising Committee
Nicoletta Calzolari, Introduction of the Conference Chair
Khalid Choukri, Message from ELRA CEO
Joseph J. Mariani, Message from ELRA President
Biographical Notes
ANTONIO ZAMPOLLI - A life for Computational Linguistics
DON ÁNGEL MARTÍN MUNICIO - Biography
Messages in Memory of Antonio Zampolli - To remember Ángel Martín Municio
Bernard Quemada, A Antonio ZAMPOLLI / To Antonio ZAMPOLLI
Makoto Nagao, In memory of Professor Antonio Zampolli
Martin Kay, Antonio Zampolli
Panel Summary
Bente Maegaard, Industrial Needs for Language Resources
Session O1-TW: Ontologies
Sergei Nirenburg, Marjorie McShane, Stephen Beale, The Rationale for Building an Ontology Expressly for NLP
Paul Buitelaar, Daniel Olejnik, Mihaela Hutanu, Alexander Schutz, Thierry Declerck, Michael Sintek, Towards Ontology Engineering Based on Linguistic Analysis
Jordi Atserias, Salvador Climent, German Rigau, Towards the Meaning Top Ontology: Sources of Ontological Meaning
Bodil Nistrup Madsen, Hanne Erdman Thomsen, Carl Vikner, Principles of a System for Terminological Concept Modelling
Session O2-W: Learning & Acquisition (I)
Dekai Wu, Grace Ngai, Marine Carpuat, Raising the Bar: Stacked Conservative Error Correction Beyond Boosting
Juan Fernández, Mauro Castillo, German Rigau, Jordi Atserias, Jordi Turmo, Automatic Acquisition of Sense Examples Using ExRetriever
Michael Schiehlen, Kristina Spranger, Automatic Methods to Supplement Broad-Coverage Subcategorization Lexicons
Jordi Atserias, Bernardo Magnini, Octavian Popescu, Eneko Agirre, Aitziber Atutxa, German Rigau, John Carroll, Rob Koeling, Cross-Language Acquisition of Semantic Models for Verbal Predicates
Session O3-W: Tagging & Grammar
Dan Tufis, Liviu Dragomirescu, Tiered Tagging Revisited
Jesús Giménez, Lluís Màrquez, SVMTool: A general POS Tagger Generator Based on Support Vector Machines
Eneko Agirre, Aitziber Atutxa, Koldo Gojenola, Kepa Sarasola, Exploring Portability of Syntactic Information from English to Basque
Alessandro Mazzei, Vincenzo Lombardo, Building a Large Grammar for Italian
Session O4-S: Speech Corpora with Linguistic Annotations
Ineke Schuurman, Wim Goedertier, Heleen Hoekstra, Nelleke Oostdijk, Richard Piepenbrock, Machteld Schouppe, Linguistic Annotation of the Spoken Dutch Corpus: If We Had To Do It All Over Again
Kris Demuynck, Tom Laureys, Patrick Wambacq, Dirk Van Compernolle, Automatic Phonemic Labeling and Segmentation of Spoken Dutch
Stephanie Strassel, Linguistic Resources for Effective, Affordable, Reusable Speech-to-Text
Christopher Cieri, David Miller, Kevin Walker, The Fisher Corpus: a Resource for the Next Generations of Speech-to-Text
Session O5-T: Terminology & Knowledge
Costanza Navarretta, Bolette Sandford Pedersen, Dorte Haltrup Hansen, Human Language Technology Elements in a Knowledge Organisation System - The VID Project
Patrick Drouin, Detection of Domain Specific Terminology Using Corpora Comparison
Henk Harkema, Robert Gaizauskas, Mark Hepple, Neil Davis, Yikun Guo, Angus Roberts, Ian Roberts, A Large-Scale Resource for Storing and Recognizing Technical Terminology
M. Teresa Cabré, Carme Bach, Rosa Estopà, Judit Feliu, Gemma Martínez, Jorge Vivaldi, The GENOMA-KB Project: Towards the Integration of Concepts, Terms, Textual Corpora and Entities
Session O6-GSW: Large Programs on LRS
J. C. Roux, P. H. Louw, T. R. Niesler, The African Speech Technology Project: An Assessment
Henk van den Heuvel, Phil Hall, Harald Höge, Asunción Moreno, Antonio Rincon, Francesco Senia, SALA II Across the Finish Line: A Large Collection of Mobile Telephone Speech Databases from North and Latin America completed
Asunción Moreno, Khalid Choukri, Phil Hall, Henk van den Heuvel, Eric Sanders, Francesco Senia, Herbert Tropf, Collection of SLR in the Asian-Pacific Area
Catia Cucchiarini, Elisabeth D'Halleweyn, The New Dutch-Flemish HLT Programme: a Concerted Effort to Stimulate the HLT Sector
Bente Maegaard, NEMLAR - An Arabic Language Resources Project
Peter Wittenburg, Greg Gulrajani, Daan Broeder, Marcus Uneson, Cross-Disciplinary Integration of Metadata Descriptions
Session O7-GW: Standards
Susanne Salmon-Alt, Laurent Romary, Towards a Reference Annotation Framework
Sue Ellen Wright, A Global Data Category Registry for Interoperable Language Resources
David Dalby, Lee Gillam, Christopher Cox, Debbie Garside, Standards for Language Codes: developing ISO 639
Francesca Bertagna, Alessandro Lenci, Monica Monachini, Nicoletta Calzolari, Content Interoperability of Lexical Resources: Open Issues and "MILE" Perspectives
Nancy Ide, Laurent Romary, A Registry of Standard Data Categories for Linguistic Annotation
Syd Bauman, Alejandro Bia, Lou Burnard, Tomaž Erjavec, Christine Ruotolo, Susan Schreibman, Migrating Language Resources from SGML to XML: The Text Encoding Initiative Recommendations
Session O8-W: Lexicon & Semantics (I)
Javier Farreres, Horacio Rodríguez, Selecting the Correct English Synset for a Spanish Sense
Jer Hayes, Tony Veale, Nuno Seco, Enriching WordNet Via Generative Metonymy and Creative Polysemy
Iulia Nica, Mª Antònia Martí, Andrés Montoyo, Sonia Vázquez, Enriching EWN with Syntagmatic Information by Means of WSD
Rita Marinelli, Proper Names and Polysemy: From a Lexicographic Experience
Jordi Atserias, Luís Villarejo, German Rigau, Spanish WordNet 1.6: Porting the Spanish Wordnet Across Princeton Versions
Antonietta Alonge, Birte Lönneker, Metaphors in Wordnets: From Theory to Practice
Session O9-SE: Speech, Expression & Emotion
A. Batliner, C. Hacker, S. Steidl, E. Nöth, S. D'Arcy, M. Russell, M. Wong, "You Stupid Tin Box" - Children Interacting with the AIBO Robot: A Cross-linguistic Emotional Speech Corpus
Albert Rilliard, Véronique Aubergé, Nicolas Audibert, Evaluating an Authentic Audio-Visual Expressive Speech Corpus
Véronique Aubergé, Nicolas Audibert, Albert Rilliard, E-Wiz: a Trapper Protocol for Hunting the Expressive Speech Corpora in Lab
Nick Campbell, Speech & Expression; the Value of a Longitudinal Corpus
Robert S. Melvin, Win May, Shrikanth Narayanan, Panayiotis Georgiou, Shadi Ganjavi, Creation of a Doctor-Patient Dialogue Corpus Using Standardized Patients
Session O10-MSE: Multimodal Resources, Tools & Applications
Baden Hughes, David Penton, Steven Bird, Catherine Bow, Gillian Wigglesworth, Patrick McConvell, Jane Simpson, Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project
Laila Dybkjær, Niels Ole Bernsen, Towards General-Purpose Annotation Tools - How Far Are We Today?
A. Braffort, A. Choisier, C. Collet, P. Dalle, F. Gianni, F. Lenseigne, J. Segouat, Toward an Annotation Software for Video of Sign Language, Including Image Processing Tools and Signing Space Modelling
Stelios Piperidis, Iason Demiros, Prokopis Prokopidis, Peter Vanroose, Anja Hoethker, Walter Daelemans, Elsa Sklavounou, Manos Konstantinou, Yannis Karavidas, Multimodal Multilingual Resources in the Subtitling Process
Ielka van der Sluis, Emiel Krahmer, Evaluating Multimodal NLG Using Production Experiments
Hennie Brugman, Onno Crasborn, Albert Russel, Collaborative Annotation of Sign Language Data with Peer-to-Peer Technology
Session P1-W: Corpus & Lexicon Tools
Ajay S. Bhaskarabhatla, Sriganesh Madhvanath, An XML Representation for Annotated Handwriting Datasets for Online Handwriting Recognition
Umut Özge, Bilge Say, Development of a Corpus Workbench for the METU Turkish Corpus
Salvador España, María José Castro, José Luis Hidalgo, The SPARTACUS-Database: a Spanish Sentence Database for Offline Handwriting Recognition
Vincent Vandeghinste, Erik Tjong Kim Sang, Using a Parallel Transcript/Subtitle Corpus for Sentence Compression
Kiril Simov, Alexander Simov, Hristo Ganev, Krasimira Ivanova, Ilko Grigorov, The CLaRK System: XML-based Corpora Development System for Rapid Prototyping
Xavier Carreras, Isaac Chao, Lluís Padró, Muntsa Padró, FreeLing: An Open-Source Suite of Language Analyzers
Yves Lepage, Guilhem Peralta, Using Paradigm Tables to Generate New Utterances Similar to those Existing in Linguistic Resources
Carlos Amaral, Dominique Laurent, André Martins, Afonso Mendes, Cláudia Pinto, Design and Implementation of a Semantic Search Engine for Portuguese
Arantza Díaz de Ilarraza, Aitzpea Garmendia, Maite Oronoz, Abar-Hitz: An Annotation Tool for the Basque Dependency Treebank
Valia Kordoni, Julia Neu, Creating Multi-purpose Linguistic Resources for Modern Greek: a Deep Modern Greek Grammar
Felix Sasaki, Andreas Witt, Dafydd Gibbon, Thorsten Trippel, Concept-based Queries: Combining and Reusing Linguistic Corpus Formats and Query Languages
Paul Gévaudan, Dirk Wiebel, Dynamic Lexicographic Data Modelling. A Diachronic Dictionary Development Report
Agnès Tutin, Meriam Haddara, Ruslan Mitkov, Constantin Orasan, Annotation of Anaphoric Expressions in an Aligned Bilingual Corpus
Christian Biemann, Uwe Quasthoff, Christian Wolff, Linguistic Corpus Search
A. Chalamandaris, P. Tsiakoulis, S. Raptis, G. Giannopoulos, G. Carayannis, Bypassing Greeklish!
Catarina Ribeiro, Ricardo Santos, Rui Pedro Chaves, Palmira Marrafa, Semi-Automatic UNL Dictionary Generation Using WordNet.PT
Zygmunt Vetulani, An Environment for Dialogue Corpora Collection (ENDIACC)
Nuno Cavalheiro Marques, Sérgio Gonçalves, Applying a Part-of-Speech Tagger to Postal Address Detection on the Web
Long Qiu, Min-Yen Kan, Tat-Seng Chua, A Public Reference Implementation of the RAP Anaphora Resolution Algorithm
Luciana Bordoni, Leonardo Pasqualini, Filippo Sciarrone, CHeM: A System for the Automatic Analysis of e-mails in the Restoration and Conservation Domain
Session P2-W: Named Entity
Hsin-Hsi Chen, Yi-Lin Chu, Pattern Discovery in Named Organization Corpus
Eckhard Bick, A Named Entity Recognizer for Danish
Borislav Popov, Angel Kirilov, Diana Maynard, Dimitar Manov, Creation of Reusable Components and Language Resources for Named Entity Recognition in Russian
Jakub Piskorski, Extraction of Polish Named-Entities
Robert Irie, Beth Sundheim, Resources for Place Name Analysis
Joaquim F. Ferreira da Silva, Zornitsa Kozareva, José Gabriel Pereira Lopes, Cluster Analysis and Classification of Named Entities
Session P3-W: Machine Translation
Matthias Eck, Stephan Vogel, Alex Waibel, Language Model Adaptation for Statistical Machine Translation Based on Information Retrieval
Lambros Kranias, Anna Samiotou, Automatic Translation Memory Fuzzy Match Post-Editing: A Step Beyond Traditional TM/MT Integration
Francisco Nevado, Francisco Casacuberta, Josu Landa, Translation Memories Enrichment by Statistical Bilingual Segmentation
Tamás Gröbler, Gábor Hodász, Balázs Kis, MetaMorpho TM: A Rule-Based Translation Corpus
Ray Clifford, Neil Granoien, Douglas Jones, Wade Shen, Clifford Weinstein, The Effect of Text Difficulty on Machine Translation Performance -- A Pilot Study with ILR-Rated Texts in Spanish, Farsi, Arabic, Russian and Korean
Ariadna Font Llitjós, Jaime Carbonell, The Translation Correction Tool: English-Spanish User Studies
--------------------------------------------------------------------------------
VOLUME II
Session P4-G: General Issues, Architectures for LRs & Evaluation Infrastructures
Cornelis H.A. Koster, Stefan Gradmann, The Language Belongs to the People!
David M. de Matos, Ricardo Ribeiro, Nuno J. Mamede, Rethinking Reusable Resources
Peter Wittenburg, Heidi Johnson, Markus Buchhorn, Hennie Brugman, Daan Broeder, Architecture for Distributed Language Resource Management and Archiving
Angelo Dalli, Valentin Tablan, Kalina Bontcheva, Yorick Wilks, Daan Broeder, Hennie Brugman, Peter Wittenburg, Web Services Architecture for Language Resources
Daan Broeder, Thierry Declerck, Laurent Romary, Markus Uneson, Sven Strömqvist, Peter Wittenburg, A Large Metadata Domain of Language Resources
Kiyong Lee, Lou Burnard, Laurent Romary, Eric de la Clergerie, Thierry Declerck, Syd Bauman, Harry Bunt, Lionel Clément, Tomaž Erjavec, Azim Roussanaly, Claude Roux, Towards an International Standard on Feature Structure Representation
Gregory Ernest Monaco, Abdelhadi Soudi, An Emerging Transcontinental Collaborative Research and Education Agenda in Human Language Technologies
Valérie Mapelli, Maria Nava, Sylvain Surcin, Djamel Mostefa, Khalid Choukri, Technolangue: A Permanent Evaluation and Information Infrastructure
Session P5-W: Learning & Acquisition
Maya Ando, Satoshi Sekine, Shun Ishizaki, Automatic Extraction of Hyponyms from Japanese Newspapers. Using Lexico-syntactic Patterns
Hideki Kashioka, Grouping Synonymous Sentences from a Parallel Corpus
Reinhard Rapp, A Freely Available Automatically Generated Thesaurus of Related Words
Hiroyuki Shinnou, Minoru Sasaki, Semi-supervised Learning by Fuzzy Clustering and Ensemble Learning
Laura Alonso, Irene Castellón, Jordi Escribano, Xavier Messeguer, Lluís Padró, Multiple Sequence Alignment for Characterizing the Lineal Structure of Revision
Ben Hutchinson, Mining the Web for Discourse Markers
Magnus Merkel, Andreas Lange, A Pattern Extraction Workbench Combining Multiple Linguistic Levels
Li Tang, Donghong Ji, Lingpeng Yang, Yu Nie, A Model of Semantic Representations Analysis for Chinese Sentences
Ulrich Heid, Bettina Säuberlich, Esther Debus-Gregor, Werner Scholze-Stubenrecht, Tools for Upgrading Printed Dictionaries by Means of Corpus-based Lexical Acquisition
Kyoko Kanzaki, Qing Ma, Eiko Yamamoto, Masaki Murata, Hitoshi Isahara, Extraction of Hyperonymy of Adjectives from Large Corpora by Using the Neural Network Model
Nadine Aldinger, Towards a Dynamic Lexicon: Predicting the Syntactic Argument Structure of Complex Verbs
Kiril Simov, Petya Osenova, A Hybrid Strategy For Regular Grammar Parsing
Svetlana Sheremetyeva, A Flexible Language Acquisition Tool Kit for Natural Language Processing
Anders Nøklestad, Memory-based Classification of Proper Names in Norwegian
Session P6-T: Terminology Tools & Data
Raúl Araya, Jordi Vivaldi, Mercedes, a Term-in-Context Highlighter
Luís Sarmento, Belinda Maia, Diana Santos, The Corpógrafo - a Web-based Environment for Corpora Research
Junko Hosaka, Igor V. Kurochkin, Akihiko Konagaya, PBIE: A Data Preparation Toolkit Toward Developing a Parsing-Based Information Extraction System
Minoru Sasaki, Hiroyuki Shinnou, Information Retrieval System Using Latent Contextual Relevance
Paola Mariani, Costanza Badii, Methods of Digital Access for Legal Language Documentation
Rita Marinelli, Adriana Roventini, Alessandro Enea, Building a Maritime Domain Lexicon: a Few Considerations on the Database Structure and the Semantic Coding
Lorenzo Piccioni, Eros Zanchetta, XTERM: A Flexible Standard-Compliant XML-Based Termbase Management System
Joachim Wermter, Udo Hahn, An Annotated German-Language Medical Text Corpus as Language Resource
Peter Anick, Exploiting Anchor Text as a Lexical Resource
Session P7-EW: Evaluation of LRs & Tools
Diana Santos, Anabela Barreiro, On the Problems of Creating a Golden Standard of Inflected Forms in Portuguese
Sofia Stamou, Dimitris Christodoulakis, Handling Subtle Sense Distinctions Through Wordnet Semantic Types
Marie-Laure Reinberger, Walter Daelemans, Unsupervised Text Mining for Ontology Extraction: An Evaluation of Statistical Measures
Oscar Corcho, Raúl García-Castro, Asunción Gómez-Pérez, Benchmarking Ontology Tools. A Case Study for the WebODE Platform
Scott S. L. Piao, Paul Rayson, Dawn Archer, Tony McEnery, Evaluating Lexical Resources for a Semantic Tagger
Uwe D. Reichel, Karl Weilhammer, Automated Morphological Segmentation and Evaluation
António Branco, João Silva, Evaluating Solutions for the Rapid Development of State-of-the-Art POS Taggers for Portuguese
Le An Ha, A Practical Comparison of Different Filters Used in Automatic Term Extraction
Widad Mustafa El Hadi, Ismail Timimi, Marianne Dabbadie, EVALDA-CESART Project: Terminological Resources Acquisition Tools Evaluation Campaign
Session P8-M: Packaging Multimodal Corpora
Frédéric Landragin, Alexandre Denis, Annalisa Ricci, Laurent Romary, Multimodal Meaning Representation for Generic Dialogue Systems Architectures
Brian MacWhinney, Steven Bird, Christopher Cieri, Craig Martell, Talkbank: Building an Open Unified Multimodal Database of Communicative Interaction
Ai Kawazoe, Asanobu Kitamoto, Nigel Collier, Annotation of Coreference Relations Among Linguistic Expressions and Images in Biological Articles
Laurent Romary, Amalia Todirascu, David Langlois, Experiments on Building Language Resources for Multi-Modal Dialogue Systems
Philippe Martin, WinPitch Corpus, a Text to Speech Alignment Tool for Multimodal Corpora
Session P9-SE: Speech: Tools, Platforms, Databases, Infrastructures
Mickel Grönroos, Manne Miettinen, Infrastructure for Collaborative Annotation of Speech
Yorick Wilks, Nick Webb, Andrea Setzer, Mark Hepple, Roberta Catizone, Human Dialogue Modelling Using Annotated Corpora
Wolfgang Minker, Comparative Evaluation of a Stochastic Parser on Semantic and Syntactic-semantic Labels
J.C.T. Beeken, P.H.J. van der Kamp, The Centre for Dutch Language and Speech Technology (TST Centre)
Christoph Draxler, Klaus Jänsch, SpeechRecorder - A Universal Platform Independent Multi-Channel Audio Recording Software
Alessandro Panunzi, Eugenio Picchi, Massimo Moneglia, Using PiTagger for Lemmatization and PoS Tagging of a Spontaneous Speech Corpus: C-Oral-Rom Italian
Peter A. Heeman, The American English SALA-II Data Collection
Henk van den Heuvel, Dorota Iskra, Eric Sanders, Folkert de Vriend, SLR Validation: Current Trends and Developments
Emanuela Cresti, Fernanda Bacelar do Nascimento, Antonio Moreno Sandoval, Jean Veronis, Philippe Martin, Khalid Choukri, The C-ORAL-ROM CORPUS. A Multilingual Resource of Spontaneous Speech for Romance Languages
Dafydd Gibbon, Firmin Ahoua, Eddi Gbéry, Eno-Abasi Urua, Moses Ekpenyong, WALA: A Multilingual Resource Repository for West African Languages
Toomas Altosaar, Matti Karjalainen, Design of an Interactive Web-based User Interface for Speech Database Query Formation
Alvin Martin, David Miller, Mark Przybocki, Joseph Campbell, Hirotaka Nakasone, Conversational Telephone Speech Corpus Collection for the NIST Speaker Recognition Evaluation 2004
Dorota Iskra, Rainer Siemund, Jamal Borno, Asuncion Moreno, Ossama Emam, Khalid Choukri, Oren Gedge, Herbert Tropf, Albino Nogueiras, Imed Zitouni, Anastasios Tsopanoglou, Nikos Fakotakis, OrienTel - Telephony Databases Across Northern Africa and the Middle East
Petr Pollák, Jan Černocký, Orthographic and Phonetic Annotation of Very Large Czech Corpora with Quality Assessment
Viet-Bac Le, Do-Dat Tran, Eric Castelli, Laurent Besacier, Jean-François Serignat, Spoken and Written Language Resources for Vietnamese
Elisabeth Pinto, Delphine Charlet, Hélène François, Djamel Mostefa, Olivier Boëffard, Dominique Fohr, Odile Mella, Frédéric Bimbot, Khalid Choukri, Yann Philip, Francis Charpentier, Development of New Telephone Speech Databases for French: the NEOLOGOS Project
Josef Psutka, Pavel Ircing, Jan Hajič, Vlasta Radová, Josef V. Psutka, William J. Byrne, Samuel Gustman, Issues in Annotation of the Czech Spontaneous Speech Corpus in the MALACH project
Alex Trutnev, Antoine Rozenknop, Martin Rajman, Speech Recognition Simulation and its Application for Wizard-of-Oz Experiments
Kallirroi Georgila, Nikos Fakotakis, George Kokkinakis, A Graphical Tool for Handling Rule Grammars in Java Speech Grammar Format
Georges Fafiotte, Christian Boitet, Mark Seligman, Zong Chengqing, Collecting and Sharing Bilingual Spontaneous Speech Corpora: the ChinFaDial Experiment
Oliver Schonefeld, Jan-Torsten Milde, Embedding IMDI Metadata into a Large Phonetic Corpus
Christopher Cieri, Joseph P. Campbell, Hirotaka Nakasone, David Miller, Kevin Walker, The Mixer Corpus of Multilingual, Multichannel Speaker Recognition Data
Session O11-EW: Evaluation of Disambiguation Systems & Ontologies
Florentina Vasilescu, Philippe Langlais, Guy Lapalme, Evaluating Variants of the Lesk Approach for Disambiguating Words
Roberto Basili, Marco Cammisa, Fabio Massimo Zanzotto, A Similarity Measure for Unsupervised Semantic Disambiguation
Christopher Brewster, Harith Alani, Srinandan Dasmahapatra, Yorick Wilks, Data Driven Ontology Evaluation
Asunción Gómez-Pérez, M. Carmen Suárez-Figueroa, Ontology Evaluation Functionalities of RDF(S), DAML+OIL, and OWL Parsers and Ontology Platforms
Session O12-W: Coreference & Anaphora (I)
Anke Holler, Jan Frederik Maas, Angelika Storrer, Exploiting Coreference Annotations for Text-to-Hypertext Conversion
Felix Sasaki, Andreas Witt, Co-reference in Japanese Task-oriented Dialogues: A Contribution to the Development of Language-specific and Language-general Annotation Schemes and Resources
Hélène Manuélian, Generating Coreferential Descriptions from a Structured Model of the Context
Massimo Poesio, Mijail A. Kabadjov, A General-Purpose, Off-the-shelf Anaphora Resolution Module: Implementation and Preliminary Evaluation
Session O13-S: Phonetically-oriented Databases
Daan Wissing, Jean-Pierre Martens, Ulrike Janke, Wim Goedertier, A Spoken Afrikaans Language Resource Designed for Research on Pronunciation Variations
Julie Carson-Berndsen, Robert Kelly, Acquiring Reusable Multilingual Phonotactic Resources
Moritz Neugebauer, Stephen Wilson, Phonological Treebanks. Issues in Generation and Application
Diana Binnenpoorte, Catia Cucchiarini, Helmer Strik, Lou Boves, Improving Automatic Phonetic Transcription of Spontaneous Speech Through Variant-Based Pronunciation Variation Modelling
Session O14-W: Summarisation (I)
Jahna Otterbacher, Dragomir Radev, RevisionBank: A Resource for Revision-based Multi-document Summarization and Evaluation
Kedar Bellare, Anish Das Sarma, Atish Das Sarma, Navneet Loiwal, Vaibhav Mehta, Ganesh Ramakrishnan, Pushpak Bhattacharyya, Generic Text Summarization Using WordNet
V. Finley Lacatusu, Steven J. Maiorano, Sanda M. Harabagiu, Multi-Document Summarization Using Multiple-Sequence Alignment
Dragomir Radev, Timothy Allison, Sasha Blair-Goldensohn, John Blitzer, Arda Çelebi, Stanko Dimitrov, Elliott Drabek, Ali Hakim, Wai Lam, Danyu Liu, Jahna Otterbacher, Hong Qi, Horacio Saggion, Simone Teufel, Michael Topper, Adam Winkel, Zhu Zhang, MEAD - a Platform for Multidocument Multilingual Text Summarization
--------------------------------------------------------------------------------
VOLUME III
Panel Summary
Brian MacWhinney, Collaborative Commentary: Opening Up Spoken Language Databases
Keynote Speeches
Junichi Tsujii, Thesaurus or Logical Ontology, Which do we Need for Mining Text?
Marilyn Walker, Can We Talk? Prospects for Automatically Training Spoken Dialogue Systems
Session O15-W: Named Entity
Marc Rössler, Corpus-based Learning of Lexical Resources for German Named Entity Recognition
Diana Maynard, Kalina Bontcheva, Hamish Cunningham, Automatic Language-Independent Induction of Gazetteer Lists
Daniel Ferrés, Marc Massot, Muntsa Padró, Horacio Rodríguez, Jordi Turmo, Automatic Building Gazetteers of Co-referring Named Entities
Paul Morarescu, Sanda Harabagiu, NameNet: a Self-Improving Resource for Name Classification
Session O16-EW: Profiling, Document Classification & Evaluation
Viktor Pekar, Richard Evans, Ruslan Mitkov, Categorizing Web Pages as a Preprocessing Step for Information Extraction
Antonio Sanfilippo, Gus Calapristi, Vernon Crow, Beth Hetzler, Alan Turner, Meaningful Clusters
Jörg Steffen, N-Gram Language Modeling for Robust Multi-Lingual Document Classification
Udo Hahn, Joachim Wermter, Pumping Documents Through a Domain and Genre Classification Pipeline
Session O17-W: Information Extraction & Disambiguation
Dan Tufis, Radu Ion, Nancy Ide, Word Sense Disambiguation as a Wordnets' Validation Method in Balkanet
Kalliopi Zervanou, John McNaught, A Domain-Independent Approach to IE Rule Development
Hayssam Traboulsi, David Cheng, Khurshid Ahmad, Text Corpora, Local Grammars and Prediction
Glòria Vàzquez, Ana Fernández Montraveta, Irene Castellón, Laura Alonso, Semantic Categorization of Spanish Se-constructions
Session O18-MS: Multimodal Corpora
Lei Chen, Yang Liu, Mary Harper, Eduardo Maia, Susan McRoy, Evaluating Factors Impacting the Accuracy of Forced Alignments in a Multimodal Corpus
Alfonso Ortega, Federico Sukno, Eduardo Lleida, Alejandro Frangi, Antonio Miguel, Luis Buera, Ernesto Zacur, AV@CAR: A Spanish Multichannel Multimodal Corpus for In-Vehicle Automatic Audio-Visual Speech Recognition
Katerina Pastra, Yorick Wilks, Image-Language Multimodal Corpora: Needs, Lacunae and an AI Synergy for Annotation
Baden Hughes, Catherine Bow, Steven Bird, Functional Requirements for an Interlinear Text Editor
Session O19-TW: Information Retrieval & Indexing
Mark Stevenson, Paul Clough, EuroWordNet as a Resource for Cross-language Information Retrieval
Sofia Stamou, Goran Nenadic, Dimitris Christodoulakis, Exploring Balkanet Shared Ontology for Multilingual Conceptual Indexing
Manolis Maragoudakis, Nikos Fakotakis, Bayesian Semantics Incorporation to Web Content for Natural Language Information Retrieval
Yalina Alphonse, Pierrette Bouillon, Methodology For Building Thematic Indexes In Medicine For French
Session O20-W: Corpus Semantic Annotation
Roberto Bartolini, Alessandro Lenci, Simonetta Montemagni, Vito Pirrelli, Claudia Soria, Semantic Mark-up of Italian Legal Texts Through NLP-based Techniques
Katrin Erk, Sebastian Padó, A Powerful and Versatile XML Format for Representing Role-semantic Annotation
Adam Meyers, Ruth Reeves, Catherine Macleod, Rachel Szekely, Veronika Zielinska, Brian Young, Ralph Grishman, Annotating Noun Argument Structure for NomBank
Argyrios Vasilakopoulos, Michele Bersani, William J. Black, A Suite of Tools for Marking Up Textual Data for Temporal Text Mining Scenarios
Louise Guthrie, Roberto Basili, Fabio Zanzotto, Kalina Bontcheva, Hamish Cunningham, David Guthrie, Jia Cui, Marco Cammisa, Jerry Cheng-Chieh Liu, Cassia Farria Martin, Kristiyan Haralambiev, Martin Holub, Klaus Macherey, Fredrick Jelinek, Large Scale Experiments for Semantic Labeling of Noun Phrases in Raw Text
Kaarel Kaljurand, Fabio Rinaldi, James Dowdall, Michael Hess, Exploiting Language Resources for Semantic Web Annotations
Session O21-EW: Evaluation of Machine Translation & Multilinguality Systems
Paul Buitelaar, Diana Steffen, Martin Volk, Dominic Widdows, Bogdan Sacaleanu, Špela Vintar, Stanley Peters, Hans Uszkoreit, Evaluation Resources for Concept-based Cross-Lingual Information Retrieval in the Medical Domain
Christopher B. Quirk, Training a Sentence-Level Machine Translation Confidence Measure
Athanasios Karasimos, Amy Isard, Multi-lingual Evaluation of a Natural Language Generation System
Bogdan Babych, Anthony Hartley, Modelling Legitimate Translation Variation for Automatic Evaluation of MT Quality
George Doddington, Alexis Mitchell, Mark Przybocki, Lance Ramshaw, Stephanie Strassel, Ralph Weischedel, The Automatic Content Extraction (ACE) Program - Tasks, Data, and Evaluation
Carol Peters, Martin Braschler, Khalid Choukri, Julio Gonzalo, Michael Kluck, The Future of Evaluation for Cross-Language Information Retrieval Systems
Session O22-EW: Parsing Systems & Evaluation
Manolis Maragoudakis, Nikos Fakotakis, George Kokkinakis, A Bayesian Model for Shallow Syntactic Parsing of Natural Language Texts
Stefan Klatt, A High Quality Partial Parser for Annotating German Text Corpora
Bernd Bohnet, Halyna Seniv, Mapping Dependency Structures to Phrase Structures and the Automatic Acquisition of Mapping Rules
Roberto Bartolini, Alessandro Lenci, Simonetta Montemagni, Vito Pirrelli, Hybrid Constraints for Robust Parsing: First Experiments and Evaluation
Hisami Suzuki, Phrase-Based Dependency Evaluation of a Japanese Parser
Eric K. Ringger, Robert C. Moore, Eugene Charniak, Lucy Vanderwende, Hisami Suzuki, Using the Penn Treebank to Evaluate Non-Treebank Parsers
Session O23-SE: Broadcast News Speech Corpora
An Vandecatseye, Jean-Pierre Martens, Joao Neto, Hugo Meinedo, Carmen Garcia-Mateo, Javier Dieguez, France Mihelic, Janez Zibert, Jan Nouza, Petr David, Matus Pleva, Anton Cizmar, Harris Papageorgiou, Christina Alexandris, The COST278 Pan-European Broadcast News Database
C. Barras, G. Adda, M. Adda-Decker, B. Habert, P. Boula de Mareüil, P. Paroubek, Automatic Audio and Manual Transcripts Alignment, Time-code Transfer and Selection of Exact Transcripts
G. Bordel, A. Ezeiza, K. Lopez de Ipina, M. Méndez, M. Peñagarikano, T. Rico, C. Tovar, E. Zulueta, Development of Resources for a Bilingual Automatic Index System of Broadcast News in Basque and Spanish
G. Gravier, J-F. Bonastre, E. Geoffrois, S. Galliano, K. Mc Tait, K. Choukri, The ESTER Evaluation Campaign for the Rich Transcription of French Broadcast News
Khalid Choukri, Mahtab Nikkhou, Niklas Paulsson, Network of Data Centres (NetDC): BNSC - An Arabic Broadcast News Speech Corpus
Mohamed Afify, Ossama Emam, Collection and Evaluation of Broadcast News Data for Arabic
Session O24-TW: MultiWord Expressions & Terminology
Jonas Sjöbergh, Viggo Kann, Finding the Correct Interpretation of Swedish Compounds, a Statistical Approach
Jan Odijk, Reusable Lexical Representations for Idioms
Stefan Evert, Ulrich Heid, Kristina Spranger, Identifying Morphosyntactic Preferences in Collocations
Alexander Geyken, Bootstrapping a Database of German Multi-word Expressions
James Dowdall, Will Lowe, Jeremy Ellman, Fabio Rinaldi, Michael Hess, The Role of MultiWord Terminology in Knowledge Management
Béatrice Daille, Samuel Dufour-Kowalski, Emmanuel Morin, French-English Multi-word Term Alignment Based on Lexical Context Analysis
Session O25-EGSW: Large Programs, Data Centres & International Cooperation
Michael Emonts, Current Projects in Languages of Military Interest at the Defense Language Institute
Christopher Cieri, Mark Liberman, A Progress Report from the Linguistic Data Consortium: Recent Activities in Resource Creation and Distribution and the Development of Tools and Standards
Khalid Choukri, Recent Activities within the European Language Resources Association: Issues on Sharing Language Resources and Evaluation
Nicoletta Calzolari, Khalid Choukri, Maria Gavrilidou, Bente Maegaard, Paola Baroni, Hanne Fersøe, Alessandro Lenci, Valérie Mapelli, Monica Monachini, Stelios Piperidis, ENABLER Thematic Network of National Projects: Technical, Strategic and Political Issues of LRs
Hanne Fersøe, Monica Monachini, ELRA Validation Methodology and Standard Promotion for Linguistic Resources
Violetta Cavalli-Sforza, Jaime G. Carbonell, Peter J. Jansen, Developing Language Resources for a Transnational Digital Government System
Session O26-W: Learning & Acquisition (II)
Reinhard Rapp, Utilizing the One-Sense-per-Discourse Constraint for Fully Unsupervised Word Sense Induction and Disambiguation
Olivia Sanchez-Graillet, Massimo Poesio, Acquiring Bayesian Networks from Text
Lee Schwartz, Takako Aikawa, Multilingual Corpus-based Approach to the Resolution of English -ing
Manuela Kunze, Dietmar Rösner, Corpus Based Enrichment of GermaNet Verb Frames
Chris Biemann, Stefan Bordag, Uwe Quasthoff, Automatic Acquisition of Paradigmatic Relations Using Iterated Co-occurrences
Franca Debole, Fabrizio Sebastiani, An Analysis of the Relative Difficulty of Reuters-21578 Subsets
Session O27-ESW: Question Answering
Karin Müller, Semi-Automatic Construction of a Question Treebank
Francesca Bertagna, Using Semantic Language Resources to Support Textual Inference for Question Answering
Vasco Calais Pedro, Jeongwoo Ko, Eric Nyberg, Teruko Mitamura, An Information Repository Model for Advanced Question Answering Systems
Nina Wacholder, Sharon Small, Bing Bai, Diane Kelly, Robert Rittman, Sean Ryan, Robert Salkin, Peng Song, Ying Sun, Liu Ting, Paul Kantor, Tomek Strzalkowski, Designing a Realistic Evaluation of an End-to-end Interactive Question Answering System
Agnes Lisowska, Andrei Popescu-Belis, Susan Armstrong, User Query Analysis for the Specification and Evaluation of a Dialogue Processing and Retrieval System
Nelleke Oostdijk, Lou Boves, Using Large Multi-purpose Corpora for Specific Research Questions: Discourse Phenomena Related to Wh-questions in the Spoken Dutch Corpus
Session O28-S: Dialogue Corpora
Vincenzo Pallotta, Hatem Ghorbel, Patrick Ruch, Giovanni Coray, An Argumentative Annotation Schema for Meeting Discussions
Magdalena Wolska, Bao Quoc Vo, Dimitra Tsovaltzi, Ivana Kruijff-Korbayová, Elena Karagjosova, Helmut Horacek, Armin Fiedler, Christoph Benzmüller, An Annotated Corpus of Tutorial Dialogs on Mathematical Theorem Proving
Niels Ole Bernsen, Laila Dybkjær, Svend Kiilerich, Evaluating Conversation with Hans Christian Andersen
Florian Schiel, MAUS Goes Iterative
Jean Carletta, Shipra Dingare, Malvina Nissim, Tatiana Nikitina, Using the NITE XML Toolkit on the Switchboard Corpus to Study Syntactic Choice: a Case Study
Malvina Nissim, Shipra Dingare, Jean Carletta, Mark Steedman, An Annotation Scheme for Information Status in Dialogue
Session O29-EMSW: Summarisation Systems & Evaluation (II)
Hidetsugu Nanba, Manabu Okumura, Comparison of Some Automatic and Manual Methods for Summary Evaluation Based on the Text Summarization Challenge 2
Laura Alonso, Maria Fuentes, Marc Massot, Horacio Rodríguez, Re-using High-quality Resources for Continued Evaluation of Automated Summarization Systems
Constantin Orăsan, Viktor Pekar, Laura Hasler, A Comparison of Summarisation Methods Based on Term Specificity Estimation
Laura Hasler, "Why do you Ignore me?" - Proof that not all Direct Speech is Bad
Walter Daelemans, Anja Höthker, Erik Tjong Kim Sang, Automatic Sentence Simplification for Subtitling in Dutch and English
Saif Ahmad, Paulo C F de Oliveira, Khurshid Ahmad, Summarization of Multimodal Information
--------------------------------------------------------------------------------
VOLUME IV
Session P10-W: Computational Lexicons
Tony Veale, Polysemy and Category Structure in WordNet: An Evidential Approach
Tomaž Erjavec, Kristina Hmeljak Sangawa, Irena Srdanović, Anton ml. Vahčič, Making an XML-based Japanese-Slovene Learners' Dictionary
Sanni Nimb, A Corpus-based Syntactic Lexicon for Adverbs
Dan Tufis, Eduard Barbu, A Methodology and Associated Tools for Building Interlingual Wordnets
Dan Tufis, Radu Ion, Nancy Ide, Word Sense Disambiguation as a Wordnets’ Validation Method in Balkanet
Florbela Barreto, Raquel Amaro, Multifunctional Computational Lexicon of Contemporary Portuguese: An Available Resource for Multitype Applications
Anna Braasch, Sussi Olsen, STO: A Danish Lexicon Resource - Ready for Applications
Carlo Strapparava, Alessandro Valitutti, WordNet Affect: an Affective Extension of WordNet
Leo Wanner, Margarita Alonso Ramos, Antonia Martí, Enriching the Spanish EuroWordNet by Collocations
Charles J. Fillmore, Collin F. Baker, Hiroaki Sato, FrameNet as a "Net"
Adam Meyers, Ruth Reeves, Catherine Macleod, Rachel Szekely, Veronika Zielinska, Brian Young, The Cross-Breeding of Dictionaries
Nilda Ruimy, Pierrette Bouillon, Bruno Cartoni, Semi-Automatic Derivation of a French Lexicon from CLIPS
Cvetana Krstev, Duško Vitas, Ranka Stanković, Ivan Obradović, Gordana Pavlović-Lažetić, Combining Heterogeneous Lexical Resources
Monica Monachini, Federico Calzolari, Michele Mammini, Sergio Rossi, Marisa Ulivieri, Unifying Lexicons in view of a Phonological and Morphological Lexical DB
Ann Copestake, Fabre Lambeau, Benjamin Waldron, Francis Bond, Dan Flickinger, Stephan Oepen, A Lexicon Module for a Grammar Development Environment
Jaap Kamps, Maarten Marx, Robert J. Mokken, Maarten de Rijke, Using WordNet to Measure Semantic Orientations of Adjectives
Huarui Zhang, Churen Huang, Shiwen Yu, Distributional Consistency: As a General Method for Defining a Core Lexicon
Eneko Agirre, Oier Lopez de Lacalle, Publicly Available Topic Signatures for all WordNet Nominal Senses
Aline Villavicencio, Timothy Baldwin, Benjamin Waldron, A Multilingual Database of Idioms
Key-Sun Choi, Hee-Sook Bae, Wonseok Kang, Juho Lee, Eunhe Kim, Hekyeong Kim, Donghee Kim, Youngbin Song, Hyosik Shin, Korean-Chinese-Japanese Multilingual Wordnet with Shared Semantic Hierarchy
Palmira Marrafa, Extending Wordnets To Implicit Information
Session P11-W: Syntactic & Semantic Corpus Annotation
Andreas Wagner, Bettina Zeisler, A Syntactically Annotated Corpus of Tibetan
Richard Campbell, Eric Ringger, Converting Treebank Annotations to Language Neutral Syntax
Michael Daum, Kilian A. Foth, Wolfgang Menzel, Automatic Transformation of Phrase Treebanks to Dependency Trees
Milena Slavcheva, Verb Valency Descriptors for a Syntactic Treebank
Andrea Sansò, MED-TYP: A Typological Database for Mediterranean Languages
Manfred Klenner, Fabio Rinaldi, Michael Hess, Steps Towards Semantically Annotated Language Resources
Session P12-W: Corpora for Multilingual Use
Hiroshi Nakagawa, Hidetaka Masuda, Dai Sato, Terminal Device Oriented Comparable Corpora and its Alignment - Towards Extracting Paraphrasing Patterns
Masumi Narita, Chieko Sato, Masatoshi Sugiura, Connector Usage in the English Essay Writing of Japanese EFL Learners
Anthony McEnery, Zhonghua Xiao, The Lancaster Corpus of Mandarin Chinese: A Corpus for Monolingual and Contrastive Language Study
Xavier Gómez-Guinovart, Elena Sacau Fontenla, Parallel Corpora for the Galician Language: Building and Processing of the CLUVI (Linguistic Corpus of the University of Vigo)
Jörg Tiedemann, Lars Nygaard, The OPUS Corpus - Parallel and Free: http://logos.uio.no/opus
Božo Bekavac, Petya Osenova, Kiril Simov, Marko Tadić, Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian
Session P13-W: General Issues & Large Programs
Thatsanee Charoenporn, Virach Sornlertlamvanich, Sawit Kasuriya, Chatchawarn Hansakunbuntheung, Hitoshi Isahara, Open Collaborative Development of the Thai Language Resources for Natural Language Processing
Anna Samiotou, Lambros Kranias, Dimitrios Kokkinakis, Intelligent Building of Language Resources for HLT Applications
Péter Halácsy, András Kornai, László Németh, András Rung, István Szakadát, Viktor Trón, Creating Open Language Resources for Hungarian
Ulrich Callmeier, Andreas Eisele, Ulrich Schäfer, Melanie Siegel, The DeepThought Core Architecture Framework
Christian Biemann, Stefan Bordag, Uwe Quasthoff, Christian Wolff, Web Services for Language Resources and Language Technology Applications
Thorsten Trippel, Felix Sasaki, Dafydd Gibbon, Consistent Storage of Metadata in Inference Lexica: the MetaLex Approach
Fabio Tamburini, Building Distributed Language Resources By Grid Computing
Serge A. Yablonsky, Integration of Russian Language Resources
Session P14-W: Morphosyntactic Data & Tools
Stephan Bopp, Sandro Pedrazzini, Elisabeth Maier, How to Disassemble Alphabetical Processions - Morphological Treatment of Unknown Words
Thanh Bon Nguyen, Thi Minh Huyen Nguyen, Laurent Romary, Xuan Luong Vu, Developing Tools and Building Linguistic Resources for Vietnamese Morpho-syntactic Processing
Adam Przepiórkowski, Zygmunt Krynicki, Łukasz Dębowski, Marcin Woliński, Daniel Janus, Piotr Bański, A Search Tool for Corpora with Positional Tagsets and Ambiguities
Jaroslava Hlaváčová, Jana Klímová, Derivational Relations in Flectional Languages - Czech Case
Sun-Mee Bae, Key-Sun Choi, Lexical Analysis of Agglutinative Languages Using a Dictionary of Lemmas and Lexical Transducers
Tom Laureys, Guy De Pauw, Hugo Van hamme, Walter Daelemans, Dirk Van Compernolle, Evaluation and Adaptation of the Celex Dutch Morphological Database
Sonja E. Bosch, Laurette Pretorius, Software Tools for Morphological Tagging of Zulu Corpora and Lexicon Development
Attila Novák, Viktor Nagy, Csaba Oravecz, Combining Symbolic and Statistical Methods in Morphological Analysis and Unknown Word Guessing
Antoni Oliver, Marko Tadić, Enlarging the Croatian Morphological Lexicon by Automatic Lexical Acquisition from Raw Corpora
Helmut Schmid, Arne Fitschen, Ulrich Heid, SMOR: A German Computational Morphology Covering Derivation, Composition and Inflection
Yuka Tateisi, Jun-ichi Tsujii, Part-of-Speech Annotation of Biology Research Abstracts
Jochen Trommer, Dalina Kallulli, A Morphological Analyzer for Standard Albanian
Abdelhadi Soudi, Andreas Eisele, Generating an Arabic Full-form Lexicon for Bidirectional Morphology Lookup
Radek Sedláček, The Core of the Czech Derivational Dictionary
Marc Vilain, Building Part-of-Speech Corpora Through Histogram Hopping
Session P15-T: Terminology Acquisition
Magnus Sahlgren, Automatic Bilingual Lexicon Acquisition Using Random Indexing of Aligned Bilingual Data
Alessandro Cucchiarelli, Roberto Navigli, Francesca Neri, Paola Velardi, Automatic Generation of Glosses in the OntoLearn System
Henrik Selsøe Sørensen, The Bilingual Web Dictionary on Demand
Bruno Cartoni, Pierrette Bouillon, Yalina Alphonse, Sabine Lehmann, Automatisation of the Activity of Term Collection in Different Languages
Margarita Hospedales, Manel Rodríguez, The GENOMA-KB Platform: Queries over Integrated Linguistic Resources
Eiko Yamamoto, Kyoji Umemura, Related Word-pairs Extraction Without Dictionaries
Marco Baroni, Silvia Bernardini, BootCaT: Bootstrapping Corpora and Terms from the Web
Daan Broeder, Peter Wittenburg, Onno Crasborn, Using Profiles for IMDI Metadata Creation
Andrew Hippisley, Chara Karavasili, A Natural Language Approach to Information Management: Tracking Scientific Advances Through the Structure of Words
Christophe Jouis, Jean-Marie Ferru, Intranet Try To Find Project (ITTF): An Approach for the Search of Relevant Information Inside an Organization
Session P16-E: Evaluation of Systems & Tools
H. Folch, B. Habert, M. Jardino, N. Pernelle, M.C. Rousset, A. Termier, Highlighting Latent Structure in Documents
Yasmina Quatrain, Sylvaine Nugier, Anne Peradotto, An Evaluation Protocol for Text Mining Tools: ALCESTE, SAS Text Miner, SPAD-CRM and Temis Text Mining Solutions Testing
Olga Uryupina, Evaluating Name-Matching for Coreference Resolution
Michael Kluck, Evaluation of Cross-Language Information Retrieval Using the Domain-Specific GIRT Data as Parallel German-English Corpus
Antoinette Renouf, Andrew Kehoe, Textual Distraction as a Basis for Evaluating Automatic Summarisers
Diana Pérez, Enrique Alfonseca, Pilar Rodríguez, Application of the BLEU Method for Evaluating Free-text Answers in an E-learning Environment
Jonathan G. Fiscus, Results of the 2003 Topic Detection and Tracking Evaluation
Boris Dobrov, Igor Kuralenok, Natalia Loukachevitch, Igor Nekrestyanov, Ilya Segalovich, Russian Information Retrieval Evaluation Seminar
Session P17-M: Multimodal Resources, Tools & Documentation
Paul Schmidt, Sandrine Garnier, Mike Sharwood, Toni Badia, Lourdes Díaz, Martí Quixal, Ana Ruggia, Antonio S. Valderrabanos, Alberto J. Cruz, Enrique Torrejon, Celia Rico, Jorge Jimenez, ALLES: Integrating NLP in ICALL Applications
Dafydd Gibbon, Catherine Bow, Steven Bird, Baden Hughes, Securing Interpretability: The Case of Ega Language Documentation
Lars Degerstedt, Arne Jönsson, Open Resources for Language Technology
Stefanie Herrmann, Hartmut Keck, Stephan Kepser, A Multi-Modal Documentation System for Warao
Session P18-S: Speech Corpora & Annotation/Processing Tools
Christina Alexandris, Stavroula-Evita Fotinea, Reusing Language Resources for Speech Applications involving Emotion
Eva Navas, Amaia Castelruiz, Iker Luengo, Jon Sánchez, Inmaculada Hernáez, Designing and Recording an Audiovisual Database of Emotional Speech in Basque
Nikos Fakotakis, Corpus Design, Recording and Phonetic Analysis of Greek Emotional Database
Stefan Schaden, CrossTowns: Automatically Generated Phonetic Lexicons of Cross-lingual Pronunciation Variants of European City Names
Darinka Verdonik, Matej Rojc, Zdravko Kačič, Creating Slovenian Language Resources for Development of Speech-to-speech Translation Components
Yong-Ju Lee, Bong-Wan Kim, Young-Il Kim, Dae-Lim Choi, Kwang-Hyun Lee, Yongnam Um, Creation and Assessment of Korean Speech and Noise DB in Car Environment
Mitsuo Shimohata, Eiichiro Sumita, Yuji Matsumoto, Building a Paraphrase Corpus for Speech Translation
John S. Garofolo, Christophe D. Laprun, Martial Michel, Vincent M. Stanford, Elham Tabassi, The NIST Meeting Room Pilot Corpus
Andrei Popescu-Belis, Abstracting a Dialog Act Tagset for Meeting Processing
Massimo Moneglia, Measurements of Spoken Language Variability in a Multilingual Corpus. Predictable Aspects
L. Devillers, I. Vasilescu, Reliability of Lexical and Prosodic Cues in Two Real-life Spoken Dialog Corpora
Robert S. Belvin, Susanne Riehemann, Kristin Precoda, A Fine-Grained Evaluation Method for Speech-to-Speech Machine Translation Using Concept Annotations
Hanne Fersøe, Elviira Hartikainen, Henk van den Heuvel, Giulio Maltese, Asunción Moreno, Shaunie Shammass, Ute Ziegenhain, Creation and Validation of Large Lexica for Speech-to-Speech Translation Purposes
Emi Izumi, Kiyotaka Uchimoto, Hitoshi Isahara, The Overview of the SST Speech Corpus of Japanese Learner English and Evaluation Through the Experiment on Automatic Detection of Learners' Errors
Tomoyosi Akiba, Atsushi Fujii, Katunobu Itou, Collecting Spontaneously Spoken Queries for Information Retrieval
Nikos Fakotakis, Cypriot Speech Database: Data Collection and Greek to Cypriot Dialect Adaptation
Evie Coussé, Steven Gillis, Hanne Kloots, Marc Swerts, The Influence of the Labeller’s Regional Background on Phonetic Transcriptions: Implications for the Evaluation of Spoken Language Resources
Andrei Popescu-Belis, Maria Georgescul, Alexander Clark, Susan Armstrong, Building and Using a Corpus of Shallow Dialogue Annotated Meetings
Ulrich Heid, Holger Voormann, Jan-Torsten Milde, Ulrike Gut, Katrin Erk, Sebastian Padó, Querying Both Time-aligned and Hierarchical Corpora with NXT Search
Victoria Arranz, Núria Castell, Josep Maria Crego, Jesús Giménez, Adrià de Gispert, Patrik Lambert, Bilingual Connections for Trilingual Corpora: An XML Approach
Mary D. Swift, Myroslava O. Dzikovska, Joel R. Tetreault, James F. Allen, Semi-automatic Syntactic and Semantic Corpus Annotation with a Deep Parser
Nadia Mana, Roldano Cattoni, Emanuele Pianta, Franca Rossi, Fabio Pianesi, Susanne Burger, The Italian NESPOLE! Corpus: a Multilingual Database with Interlingua Annotation in Tourism and Medical Domains
Esmeralda Uraga, César Gamboa, VOXMEX Speech Database: Design of a Phonetically Balanced Corpus
Session O30-SW: Infrastructure for LRs
Christian Biemann, Stefan Bordag, Uwe Quasthoff, Christian Wolff, Web Services for Language Resources and Language Technology Applications
Thierry Declerck, Paul Buitelaar, Nicoletta Calzolari, Alessandro Lenci, Towards a Language Infrastructure for the Semantic Web
Kalina Bontcheva, Open-source Tools for Creation, Maintenance, and Storage of Lexical Resources for Language Generation from Ontologies
Stefan Baumann, Caren Brinckmann, Silvia Hansen-Schirra, Geert-Jan Kruijff, Ivana Kruijff-Korbayová, Stella Neumann, Erich Steiner, Elke Teich, Hans Uszkoreit, The MULI Project: Annotation and Analysis of Information Structure in German and English
Session O31-EW: Coreference, Anaphora & Evaluation (II)
Anna Kupść, Teruko Mitamura, Benjamin Van Durme, Eric Nyberg, Pronominal Anaphora Resolution for Unrestricted Text
Judita Preiss, Caroline Gasperin, Ted Briscoe, Can Anaphoric Definite Descriptions be Replaced by Pronouns?
Rebecca J. Passonneau, Computing Reliability for Coreference Annotation
Andrei Popescu-Belis, Loïs Rigouste, Susanne Salmon-Alt, Laurent Romary, Online Evaluation of Coreference Resolution
Session O32-ES: Evaluation of Speech Annotation & Systems
Morena Danieli, Juan María Garrido, Massimo Moneglia, Andrea Panizza, Silvia Quazza, Marc Swerts, Evaluation of Consensus on the Annotation of Prosodic Breaks in the Romance Corpus of Spontaneous Speech "C-ORAL-ROM"
Jacques Duchateau, Tim Ceyssens, Hugo Van hamme, Use and Evaluation of Prosodic Annotations in Dutch
Alex Trutnev, Martin Rajman, Comparative Evaluations in the Domain of Automatic Speech Recognition
Arlindo O. Veiga, Fernando S. Perdigão, An Efficient Word Confidence Measure Using Likelihood Ratio Scores
Session O33-TW: Morphosyntactic Corpora & Tools
Rute Costa, Raquel Silva, The Verb in the Terminological Collocations. Contribution to the Development of a Morphological Analyser: MorphoCom
Tomaž Erjavec, MULTEXT-East Version 3: Multilingual Morphosyntactic Specifications, Lexicons and Corpora
Stefan Evert, The Statistical Analysis of Morphosyntactic Distributions
Alexis Palmer, Jonas Kuhn, Carlota Smith, Utilization of Multiple Language Resources for Robust Grammar-Based Tense and Aspect Classification
--------------------------------------------------------------------------------
VOLUME V
Panel Summaries
Hans Uszkoreit, Strategic Directions of National and International Research Funding
Keynote Speeches
Nick Campbell, Getting to the Heart of the Matter; Speech is More than Just the Expression of Text or Language
Gregor Thurmair, Multilingual Content Processing
Session O34-W: Lexicon & Semantics (II)
Wim Peters, Incremental Knowledge Acquisition from WordNet and EuroWordNet
Chu-Ren Huang, Ru-Yng Chang, Hshiang-Pin Lee, Sinica BOW (Bilingual Ontological Wordnet): Integration of Bilingual WordNet and SUMO
Karin Kipper, Benjamin Snyder, Martha Palmer, Extending a Verb-lexicon Using a Semantically Annotated Corpus
Charles J. Fillmore, Collin F. Baker, Hiroaki Sato, FrameNet as a "Net"
Session O35-W: Learning & Acquisition (III)
Khurshid Ahmad, Maria Teresa Musacchio, Discovery of (New) Knowledge and the Analysis of Text Corpora
Doaa Samy, Antonio Moreno-Sandoval, José M. Guirao, Construction of a Bilingual Arabic-Spanish Lexicon of Verbs Based on a Parallel Corpus
Murat Deviren, Khalid Daoudi, Kamel Smaïli, Language Modeling Using Dynamic Bayesian Networks
Anna Sinopalnikova, Pavel Smrz, Word Association Norms as a Unique Supplement of Traditional Language Resources
Session O36-SW: Machine Translation & Speech-to-Speech Translation
Maja Popović, Hermann Ney, Towards the Use of Word Stems and Suffixes for Statistical Machine Translation
Toshiyuki Takezawa, Genichiro Kikui, A Comparative Study on Human Communication Behaviors and Linguistic Characteristics for Speech-to-Speech Translation
Stephan Vogel, Christian Monson, Augmenting Manual Dictionaries for Statistical Machine Translation Systems
Martin Čmejrek, Jan Cuřín, Jiří Havelka, Jan Hajič, Vladislav Kuboň, Prague Czech-English Dependency Treebank. Syntactically Annotated Resources for Machine Translation
Session O37-EMS: Evaluation of Spoken & Multimodal Systems
Sebastian Möller, Jan Krebber, Alexander Raake, Paula Smeele, Martin Rajman, Mirek Melichar, Vincenzo Pallotta, Gianna Tsakou, Basilis Kladis, Anestis Vovos, Jettie Hoonhout, Dietmar Schuchardt, Nikos Fakotakis, Todor Ganchev, Ilyas Potamitis, INSPIRE: Evaluation of a Smart-Home System for Infotainment Management and Device Control
Sebastian Möller, A New ITU-T Recommendation on the Evaluation of Telephone-Based Spoken Dialogue Systems
Harald Höge, Josef G. Bauer, Christian Geißler, Panji Setiawan, Kai Steinert, Evaluation of Microphone Array Front-Ends for ASR - an Extension of the AURORA Framework
Salma Jamoussi, Kamel Smaïli, Dominique Fohr, Jean-Paul Haton, A Complete Understanding Speech System Based on Semantic Concepts
Session O38-EW: Proofing, Controlled Language & Evaluation
Lina Henriksen, Bart Jongejan, Bente Maegaard, Corporate Voice, Tone of Voice and Controlled Language Techniques
Na-Rae Han, Martin Chodorow, Claudia Leacock, Detecting Errors in English Article Usage with a Maximum Entropy Classifier Trained on a Large, Diverse Corpus
Christian Monson, Lori Levin, Rodolfo Vega, Ralf Brown, Ariadna Font Llitjos, Alon Lavie, Jaime Carbonell, Eliseo Cañulef, Rosendo Huisca, Data Collection and Analysis of Mapudungun Morphology for Spelling Correction
Johnny Bigert, Probabilistic Detection of Context-Sensitive Spelling Errors
Session O39-EW: Evaluation of Information Extraction & Summarisation Systems
Avik Sarkar, Anne De Roeck, A Framework for Evaluating the Suitability of Non-English Corpora for Language Engineering
Atsushi Fujii, Makoto Iwayama, Noriko Kando, Test Collections for Patent-to-Patent Retrieval and Patent Map Generation in NTCIR-4 Workshop
Anne De Roeck, Avik Sarkar, Paul Garthwaite, Frequent Term Distribution Measures for Dataset Profiling
Nikolaos Nanas, Victoria Uren, Anne de Roeck, John Domingue, Beyond TREC's Filtering Track
A. Lavelli, M. E. Califf, F. Ciravegna, D. Freitag, C. Giuliano, N. Kushmerick, L. Romano, A Critical Survey of the Methodology for IE Evaluation
Simone Teufel, Hans van Halteren, Agreement in Human Factoid Annotation for Summarization Evaluation
Session O40-W: Corpora
David Martínez, Eneko Agirre, The Effect of Bias on an Automatically-built Word Sense Corpus
Sabine Bartsch, Annotating a Corpus for Building a Domain-specific Knowledge Base
Karlheinz Mörth, Rethinking Readability of Digital Editions - The Case of the AAC's "Digital Brenner"
Balázs Kis, Begoña Villada, Gosse Bouma, Gábor Ugray, Tamás Bíró, Gábor Pohl, John Nerbonne, A New Approach to the Corpus-based Statistical Investigation of Hungarian Multi-word Lexemes
Nancy Ide, Keith Suderman, The American National Corpus First Release
Kiril Simov, Petya Osenova, Sia Kolkovska, Elisaveta Balabanova, Dimitar Doikoff, A Language Resources Infrastructure for Bulgarian
Session O41-EMS: Evaluation of Speech & Multimodal Dialogue Systems & Methodology
Hans Dybkjær, Laila Dybkjær, From Acts and Topics to Transactions and Dialogue Smoothness
Laila Dybkjær, Niels Ole Bernsen, Wolfgang Minker, Usability Evaluation of Multimodal and Domain-Oriented Spoken Language Dialogue Systems
David R. Traum, Susan Robinson, Jens Stephan, Evaluation of Multi-party Virtual Reality Dialogue Interaction
Holmer Hemsen, Evaluation of a Multimodal Dialogue System for Small-screen Devices
Susan Robinson, Bilyana Martinovski, Saurabh Garg, Jens Stephan, David Traum, Issues in Corpus Development for Multi-party Multi-modal Task-oriented Dialogue
Pedro Concejero Cerezo, Juan José Rodríguez Soler, Daniel Tapias Merino, Alberto J. Sánchez García, Methodology for Rapid Prototyping and Testing of ASR Based User Interfaces
Session O42-TW: Terminology & Learning
Gaël Dias, Sérgio Nunes, Evaluation of Different Similarity Measures for the Extraction of Multiword Units in a Reinforcement Learning Environment
Philipp Cimiano, Andreas Hotho, Steffen Staab, Clustering Concept Hierarchies from Text
Marco Baroni, Sabrina Bisi, Using Cooccurrence Statistics and the Web to Discover Synonyms in a Technical Language
Jorge Vivaldi, Horacio Rodríguez, Automatically Selecting Domain Markers for Terminology Extraction
I. Alegria, A. Gurrutxaga, P. Lizaso, X. Saralegi, S. Ugartetxea, R. Urizar, A XML-Based Term Extraction Tool for Basque
Yoko Mizuta, Nigel Collier, An Annotation Scheme for a Rhetorical Analysis of Biology Articles
Session P19-SW: Corpora
Serge Sharoff, Towards Basic Categories for Describing Properties of Texts in a Corpus
Beom-mo Kang, Hunggyu Kim, Sejong Korean Corpora in the Making
J. G. Kruyt, The Integrated Language Database of 8th - 21st-Century Dutch
Elisabete Ranchhod, Paula Carvalho, Cristina Mota, Anabela Barreiro, Portuguese Large-scale Language Resources for NLP Applications
Lorena Seijo Pereiro, Ana Martínez Ínsua, Francisco Méndez Pazó, Francisco Campillo Díaz, Eduardo Rodríguez Banga, A Galician Textual Corpus for Morphosyntactic Tagging with Application to Text-to-Speech Synthesis
M. Taulé, M. Civit, N. Artigas, M. García, L. Màrquez, M.A. Martí, B. Navarro, MiniCors and Cast3LB: Two Semantically Tagged Spanish Corpora
P. H. J. van der Kamp, J. G. Kruyt, Putting the Dutch PAROLE Corpus to Work
Marco Baroni, Silvia Bernardini, Federica Comastri, Lorenzo Piccioni, Alessandra Volpi, Guy Aston, Marco Mazzoleni, Introducing the La Repubblica Corpus: A Large, Annotated, TEI(XML)-compliant Corpus of Newspaper Italian
Yoshida Kyôsuke, Hashimoto Taiichi, Tokunaga Takenobu, Tanaka Hozumi, Retrieving Annotated Corpora for Corpus Annotation
Sandra Aluisio, Gisele Montilha Pinheiro, Aline M. P. Manfrin, Leandro H. M. de Oliveira, Luiz C. Genoves Jr., Stella E. O. Tagnin, The Lácio-Web: Corpora and Tools to Advance Brazilian Portuguese Language Investigations and Computational Linguistic Tools
Dragomir Radev, Jahna Otterbacher, Zhu Zhang, CST Bank: A Corpus for the Study of Cross-document Structural Relationships
Jonas Kuhn, B'alam Mateo-Toledo, Applying Computational Linguistic Techniques in a Documentary Project for Q'anjob'al (Mayan, Guatemala)
Kyonghee Paik, Kiyonori Ohtake, Kazuhide Yamamoto, A Comparison of Two Variant Corpora: The Same Content with Different Source
Tylman Ule, Kiril Simov, Unexpected Productions May Well be Errors
Maria Luigia Ceccotti, Manuela Sassi, Computational Lexicography and Carlo Emilio Gadda, Principe dell'Analisi e Duca della Buona Cognizione
Hanno Biber, Evelyn Breiteneder, The AAC [Austrian Academy Corpus] - An Enterprise to Develop Large Electronic Text Corpora
Robert Král, Semantic Annotating of Czech Corpus via WSD
Eugenio Picchi, Maria Luigia Ceccotti, Sebastiana Cucurullo, Manuela Sassi, Eva Sassolini, Linguistic Miner: An Italian Linguistic Knowledge System
Kenji Sagae, Brian MacWhinney, Alon Lavie, Adding Syntactic Annotations to Transcripts of Parent-Child Dialogs
Giuseppe Cappeli, Paulo Alberto, The OLISSIPO and LECTIO Projects
Session P20-W: Tools for Corpora & Lexicons
Maria Fernanda Bacelar do Nascimento, Amália Mendes, Luísa Pereira, Providing On-line Access to Portuguese Language Resources: Corpora and Lexicons
Tokunaga Takenobu, Koyama Tomofumi, Saito Suguru, Nakajima Masayuki, Classification of Japanese Spatial Nouns
Daisuke Kawahara, Ryohei Sasano, Sadao Kurohashi, Toward Text Understanding: Integrating Relevance-tagged Corpus and Automatically Constructed Case Frames
Donghong Ji, Li Tang, Lingpeng Yang, Building a Conceptual Graph Bank for Chinese Language
Lionel Clément, Benoît Sagot, Bernard Lang, Morphology Based Automatic Acquisition of Large-coverage Lexica
Kiril Ribarov, Towards Intelligent Written Cultural Heritage Processing - Lexical processing
Session P21-W: Acquisition of Collocations & Patterns
Hsin-Hsi Chen, Yi-Cheng Yu, Chih-Long Lin, Collocation Extraction Using Web Statistics
David Wible, Chin-Hwa Kuo, Nai-Lung Tsao, Improving Collocation Extraction for High Frequency Words
M. Begoña Villada Moirón, Discarding Noise in an Automatically Acquired Lexicon of Support verb Constructions
Valeria Quochi, Representing Italian Complex Nominals: A Pilot Study
Borja Navarro, Manuel Palomar, Patricio Martínez-Barco, Automatic Extraction of Syntactic Semantic Patterns for Multilingual Resources
Violeta Seretan, Luka Nerima, Eric Wehrli, Using the Web as a Corpus for the Syntactic-Based Collocation Identification
--------------------------------------------------------------------------------
VOLUME VI
Session P22-W: Ontologies
Nuno Seco, Tony Veale, Jer Hayes, Concept Creation in Lexical Ontologies
Luciana Bordoni, Investigation on Semantics to Improve the COVAX System
Marjorie McShane, Stephen Beale, Sergei Nirenburg, Some Meaning Procedures of Ontological Semantics
Natalia V. Loukachevitch, Boris V. Dobrov, Development of Ontologies with Minimal Set of Conceptual Relations
Dominique Dutoit, Pierre Nugues, Patrick de Torcy, The Integral Dictionary: An Ontological Resource for the Semantic Web: Integration of EuroWordNet, Balkanet, TID, and SUMO
Karel Pala, Pavel Smrz, Top Ontology as a Tool for Semantic Role Tagging
Georgiana Puşcaşu, A Framework for Temporal Resolution
Guadalupe Aguado de Cea, Inmaculada Álvarez-de-Mon, Antonio Pareja-Lora, OntoTag's Linguistic Ontologies: Enhancing Higher Level and Semantic Web Annotations
Session P23-W: Tools, Systems & Applications
Kamlesh Dutta, Saroj Kaushik, Nupur Prakash, Information Extraction from Hindi Texts
Núria Bel, Cornelis H.A. Koster, Marta Villegas, Cost-effective Cross-lingual Document Classification
Toni Badia, Àngel Gil, Martí Quixal, Oriol Valentín, NLP-enhanced Error Checking for Catalan Unrestricted Text
Stephan Busemann, Hans-Ulrich Krieger, Resources and Techniques for Multilingual Information Extraction
Horacio Saggion, Identifying Definitions in Text Collections for Question Answering
Joaquim Moré, Salvador Climent, Antoni Oliver, A Grammar and Style Checker Based on Internet Searches
Hristo Tanev, Milen Kouylekov, Matteo Negri, Bonaventura Coppola, Bernardo Magnini, Multilingual Pattern Libraries for Question Answering: a Case Study for Definition Questions
Walter Kasper, Jörg Steffen, Jakub Piskorski, Paul Buitelaar, Integrated Language Technologies for Multilingual Information Services in the MEMPHIS Project
Rachel Aires, Aline Manfrin, Sandra Aluísio, Diana Santos, What is my Style? Using Stylistic Features of Portuguese Web Texts to Classify Web Pages According to Users' Needs
Catarina Ribeiro, Ricardo Santos, João Correia, Rui Pedro Chaves, Palmira Marrafa, INQUER: A WordNet-based Question-Answering Application
Márton Miháltz, Word Sense Disambiguation Using Random Indexing
Stephan Busemann, EGRAM - A Grammar Development Environment and its Usage for Language Generation
Roberto Basili, Nicola Lorusso, Maria Teresa Pazienza, Fabio Massimo Zanzotto, A2Q: An Agent-based Architecture for Multilingual Q&A
Dragomir Radev, Timothy Allison, Sasha Blair-Goldensohn, John Blitzer, Arda Çelebi, Stanko Dimitrov, Elliott Drabek, Ali Hakim, Wai Lam, Danyu Liu, Jahna Otterbacher, Hong Qi, Horacio Saggion, Simone Teufel, Michael Topper, Adam Winkel, Zhu Zhang, MEAD - A Platform for Multidocument Multilingual Text Summarization
Mark Hepple, Neil Ireson, Paolo Allegrini, Simone Marchi, Simonetta Montemagni, Jose Maria Gomez Hidalgo, NLP-enhanced Content Filtering Within the POESIA Project
Session P24-T: Terminology Tools & Data
Michael Carl, Ecaterina Rascu, Johann Haller, Using Weighted Abduction to Align Term Variant Translations in Bilingual Texts
Satoshi Sekine, Chikashi Nobata, Definition, Dictionaries and Tagger for Extended Named Entity Hierarchy
Dan Tufis, Term Translations in Parallel Corpora: Discovery and Consistency Check
Daniel Ferrés, Marc Massot, Muntsa Padró, Horacio Rodríguez, Jordi Turmo, Automatic Classification of Geographic Named Entities
I. Alegria, A. Gurrutxaga, P. Lizaso, X. Saralegi, S. Ugartetxea, R. Urizar, A XML-Based Term Extraction Tool for Basque
Natalia V. Loukachevitch, Boris V. Dobrov, Development of Bilingual Domain-Specific Ontology for Automatic Conceptual Indexing
Bodil Nistrup Madsen, Hanne Erdman Thomsen, Carl Vikner, Principles of a System for Terminological Concept Modelling
Melania Degeratu, Vasileios Hatzivassiloglou, An Automatic Method for Constructing Domain-Specific Ontology Resources
Gabriella Pardelli, Manuela Sassi, Sara Goggi, From Weaver to the ALPAC Report
Session P25-EW: Evaluation of Language Technologies
Alvin F. Martin, John S. Garofolo, Jonathan G. Fiscus, Audrey N. Le, David S. Pallett, Mark A. Przybocki, Gregory A. Sanders, NIST Language Technology Evaluation Cookbook
Yasuhiro Akiba, Eiichiro Sumita, Hiromi Nakaiwa, Seiichi Yamamoto, Hiroshi G. Okuno, Incremental Methods to Select Test Sentences for Evaluating Translation Ability
Andrew Finch, Yasuhiro Akiba, Eiichiro Sumita, How Does Automatic Machine Translation Evaluation Correlate with Human Scoring as the Number of Reference Translations Increases?
Anne Vilnat, Patrick Paroubek, Laura Monceaux, Isabelle Robba, Véronique Gendner, Gabriel Illouz, Michèle Jardino, The Ongoing Evaluation Campaign of Syntactic Parsing of French: EASY
Rita Nübel, Evaluation and Adaptation of a Specialised Language Checking Tool for Non-specialised Machine Translation and Non-expert MT Users for Multi-lingual Telecooperation
Bogdan Babych, Debbie Elliott, Anthony Hartley, Calibrating Resource-light Automatic MT Evaluation: a Cheap Approach to Ranking MT Systems by the Usability of Their Output
Gabriel Infante-Lopez, Maarten de Rijke, Comparing the Ambiguity Reduction Abilities of Probabilistic Context-Free Grammars
Jennifer Foster, Parsing Ungrammatical Input: an Evaluation Procedure
Per Weijnitz, Eva Forsbom, Ebba Gustavii, Eva Pettersson, Jörg Tiedemann, MT Goes Farming: Comparing Two Machine Translation Approaches on a New Domain
Timothy Baldwin, Emily M. Bender, Dan Flickinger, Ara Kim, Stephan Oepen, Road-testing the English Resource Grammar Over the British National Corpus
Ying Zhang, Stephan Vogel, Alex Waibel, Interpreting BLEU/NIST Scores: How Much Improvement do We Need to Have a Better System?
Session P26-M: Multimodal Annotation Tools
Bayan Abu Shawar, Eric Atwell, A Chatbot as a Novel Corpus Visualization Tool
Carmen Garcia-Mateo, Javier Dieguez-Tirado, Laura Docio-Fernandez, Antonio Cardenal-Lopez, Transcrigal: A Bilingual System for Automatic Indexing of Broadcast News
Hennie Brugman, Albert Russel, Annotating Multi-media/Multi-modal Resources with ELAN
Christian Weiss, A Framework for Data-driven Video-realistic Audio-visual Speech-synthesis
David Day, Chad McHenry, Robyn Kozierok, Laurel Riek, Callisto: A Configurable Annotation Workbench
Kazuaki Maeda, Stephanie Strassel, Annotation Tools for Large-Scale Corpus Development: Using AGTK at the Linguistic Data Consortium
Session P27-SE: Spoken Corpora & Evaluation
Bojan Kotnik, Zdravko Kačič, Bogomir Horvat, The Development and Integration of the LDA-Toolkit Into COST249 SpeechDat(II) SIG Reference Recognizer
Özlem Öztürk, Özgul Salor, Tolga Çiloğlu, Mubeccel Demirekler, Duration Modeling For Turkish Text-to-Speech Synthesis System
Tania Ellbogen, Florian Schiel, Alexander Steffen, The BITS Speech Synthesis Corpus for German
Janez Žibert, France Mihelič, Development of Slovenian Broadcast News Speech Database
Daniel Tihelka, Jindřich Matoušek, The Design of Czech Language Formal Listening Tests for the Evaluation of TTS Systems
Andrej Žgank, Tomaž Rotovnik, Mirjam Sepesy Maučec, Darinka Verdonik, Janez Kitak, Damjan Vlaj, Vladimir Hozjan, Zdravko Kačič, Bogomir Horvat, Acquisition and Annotation of Slovenian Broadcast News Database
Andrej Žgank, Zdravko Kačič, Frank Diehl, Klara Vicsi, Gyorgy Szaszak, Jozef Juhar, Slavomir Lihan, The COST 278 MASPER Initiative - Crosslingual Speech Recognition with Large Telephone Databases
Janez Stergar, Caglayan Erdem, Bogomir Horvat, Zdravko Kačič, A Data-driven Adaptation of Prosody in a Multilingual TTS
Daniel Aioanei, Julie Carson-Berndsen, Anja Geumann, Robert Kelly, Moritz Neugebauer, Stephen Wilson, A Multilingual Phonological Resource Toolkit for Ubiquitous Speech Technology
Slaven Bilac, Timothy Baldwin, Hozumi Tanaka, Evaluating the FOKS Error Model
Guillaume Gibert, Gérard Bailly, Frédéric Eliséi, Denis Beautemps, Rémi Brun, Evaluation of a Speech Cuer: From Motion Capture to a Concatenative Text-to-cued Speech System
V. Guijarrubia, I. Torres, L.J. Rodríguez, Evaluation of a Spoken Phonetic Database in Basque Language
Laurence Devillers, Hélène Maynard, Sophie Rosset, Patrick Paroubek, Kevin McTait, D. Mostefa, Khalid Choukri, Laurent Charnay, Caroline Bousquet, Nadine Vigouroux, Frédéric Béchet, Laurent Romary, Jean-Yves Antoine, J. Villaneau, Myriam Vergnes, J. Goulian, The French MEDIA/EVALDA Project: the Evaluation of the Understanding Capability of Spoken Language Dialogue Systems
Christophe Van Bael, Helmer Strik, Henk van den Heuvel, On the Usefulness of Large Spoken Language Corpora for Linguistic Research
Panagiotis Zervas, Manolis Maragoudakis, Nikos Fakotakis, George Kokkinakis, Learning to Predict Pitch Accents Using Bayesian Belief Networks for Greek Language
S.R. Deepa, Kalika Bali, A.G. Ramakrishnan, Partha Pratim Talukdar, Automatic Generation of Compound Word Lexicon for Hindi Speech Synthesis
Thierry Poibeau, Bénédicte Goujon, Semi-automatic Acquisition of Command Grammar
Lars Bo Larsen, Usability Evaluation of Spoken Dialogue Systems
António Teixeira, Liliana Ferreira, Lurdes Moutinho, Rosa Lídia Coimbra, Raquel Lisboa, An Acoustic Corpus Contemplating Regional Variation for Studies of European Portuguese Nasals
Kazuki Adachi, Tomoki Toda, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano, Perceptual Evaluation of Quality Deterioration Owing to Prosody Modification
Saurabh Garg, Bilyana Martinovski, Susan Robinson, Jens Stephan, Joel Tetreault, David R. Traum, Evaluation of Transcription and Annotation Tools for a Multi-modal, Multi-party Dialogue Corpus
Kitazawa Shigeyoshi, Kiriyama Shinya, Itoh Toshihiko, Nick Campbell, Japanese MULTEXT: a Prosodic Corpus
Session O43-W: Semantics & Semantic Web
Vivi Năstase, Rada Mihalcea, Finding Semantic Associations on Express Lane
Nancy Ide, David Woolner, Exploiting Semantic Web Technologies for Intelligent Access to Historical Documents
Hiroyuki Kaji, Osamu Imaichi, Constructing Word-Sense Association Networks from Bilingual Dictionary and Comparable Corpora
Session O44-EW: Corpus Annotation & Evaluation
Brian Mitchell, Robert Gaizauskas, A Labelled Corpus for Prepositional Phrase Attachment
Kateřina Veselá, Jiří Havelka, Eva Hajičová, Annotators’ Agreement: The Case of Topic-Focus Articulation
Ana-Maria Barbu, A Word Alignment System Based on a Translation Equivalence Extractor
Session O45-STW: Lexicon Syntax & Semantics
Canasai Kruengkrai, Thatsanee Charoenporn, Virach Sornlertlamvanich, Hitoshi Isahara, Enriching a Thai Lexical Database with Selectional Preferences
Lonneke van der Plas, Vincenzo Pallotta, Martin Rajman, Hatem Ghorbel, Automatic Keyword Extraction from Spoken Text. A Comparison of Two Lexical Resources: EDR and WordNet
Montserrat Marimon, Núria Bel, Lexical Entry Templates for Robust Deep Parsing
Session O46-MW: Annotation of Multimodal Corpora
Thorsten Trippel, Dafydd Gibbon, Alexandra Thies, Jan-Torsten Milde, Karin Looks, Benjamin Hell, Ulrike Gut, CoGesT: a Formal Transcription System for Conversational Gesture
Harry Bunt, Laurent Romary, Standardization in Multimodal Content Representation: Some Methodological Issues
Ajay S. Bhaskarabhatla, Sriganesh Madhvanath, Experiences in Collection of Handwriting Data for Online Handwriting Recognition in Indic Scripts
Session O47-W: Treebanks
Heike Telljohann, Erhard Hinrichs, Sandra Kübler, The Tüba-D/Z Treebank: Annotating German with a Context-Free Backbone
Anne Abeillé, Nicolas Barrier, Enriching a French Treebank
Eleni Miltsakaki, Rashmi Prasad, Aravind Joshi, Bonnie Webber, The Penn Discourse Treebank