[Download] NooJ: a linguistic development environment

Re: [Download] NooJ: a linguistic development environment

NooJ v2.0 RELEASED?

Dear colleagues,

we are pleased to announce the release of NooJ v2.0.

NooJ is a linguistic engineering development platform that allows
linguists and NLP developers to formalize various levels of
linguistic phenomena and to build various NLP applications. The
software, its manual and linguistic resources can be downloaded free
of charge from www.nooj4nlp.net.

Besides a number of enhancements of the interface (syntax coloring,
linguistic resource management, etc.) and of its included free
linguistic resources, v2.0 contains:

-- A new corpus processor that applies a typical NooJ linguistic
query to a corpus made of 10,000+ texts in a few minutes.

-- A more robust dictionary compiler. For instance, it compiles the
Hungarian dictionary that describes the equivalent of a list of 120+
million word forms in a few hours (it takes a few minutes to compile
the English dictionary).

-- A new linguistic engine that better integrates the morphological
and syntactic levels of analysis via new operations on variables. Its
most visible enhancements are its two types of constraints:

<$N=:N+Hum> checks that the linguistic unit stored in variable $N
matches the query <N+Hum> (any NooJ query is valid to the right of
the operator "=:")

<$N$Nb="p"> checks that the value of the lexical property "Nb" of the
linguistic unit stored in $N is equal to "p"

<$N$Nb=$A$Nb> checks that the values of the lexical property "Nb" of
the linguistic units $N and $A are equal

Lexical properties can be set either in each dictionary entry
(e.g. "+Nb=p") or in the properties' definition file (via a rule such
as "Nb = s + p;").
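
To make the three constraints above concrete, here is a minimal Python
sketch that reads each of them as a predicate over the units stored in
variables. It is an illustration only, not NooJ's implementation: the
Unit class, the function names, and the sample values are hypothetical,
only the simple "Category+Feature" query form is handled, and treating a
missing property as a failed agreement check is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical stand-in for an annotated linguistic unit (not NooJ's data model)."""
    category: str                                   # e.g. "N" or "A"
    features: set = field(default_factory=set)      # flags such as "Hum"
    properties: dict = field(default_factory=dict)  # property/value pairs such as {"Nb": "p"}


def matches_query(unit, query):
    """Mimic <$N=:N+Hum>: the unit has the category and every +feature of the query."""
    category, *required = query.split("+")
    return unit.category == category and all(f in unit.features for f in required)


def property_equals(unit, name, value):
    """Mimic <$N$Nb="p">: the named property of the unit has the given value."""
    return unit.properties.get(name) == value


def properties_agree(left, right, name):
    """Mimic <$N$Nb=$A$Nb>: the property is set on the left unit and equal on both.

    Assumption: a property that is not set never agrees.
    """
    return name in left.properties and left.properties[name] == right.properties.get(name)


# Sample variable bindings (invented for this example).
variables = {
    "N": Unit("N", {"Hum"}, {"Nb": "p"}),  # a plural human noun
    "A": Unit("A", set(), {"Nb": "s"}),    # a singular adjective
}

print(matches_query(variables["N"], "N+Hum"))                  # <$N=:N+Hum>
print(property_equals(variables["N"], "Nb", "p"))              # <$N$Nb="p">
print(properties_agree(variables["N"], variables["A"], "Nb"))  # <$N$Nb=$A$Nb>
```

With these sample bindings the first two checks succeed and the
agreement check fails, because the noun is plural and the adjective is
singular.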

-- When variables are not explicitly set, NooJ links them to the
corresponding lexical symbols. For instance, $N will be linked to the
nearest symbol <N> and $N$Vsup will encode the value of the property
VSup for the noun. Series of variables that are set in a loop can be
retrieved with the series' variable symbol "$$". For instance, the
series of adjectives that occur to the left of a noun can be accessed
with the symbol $$A.

-- The Machine Translation engine now supports checks on
recursively defined linguistic units. For instance,
<$N$ZH$Cl=$A$ZH$Cl> checks that the classifiers of the Chinese
translation of a noun and an adjective match.

-- noojapply includes the new dictionary and corpus processors; it
parses texts in which text units are delimited with XML tags (such as
<p> or <s>).
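
As a rough illustration of what "text units delimited with XML tags"
means, the Python sketch below splits a tagged text into units. It is not
how noojapply works internally: the function name and the sample text are
hypothetical, and it assumes the delimiting tags are not nested and carry
no attributes.

```python
import re


def split_text_units(text, tag):
    """Return the content of every <tag>...</tag> element as one text unit.

    Assumption: the delimiting tags are not nested and have no attributes.
    """
    pattern = re.compile(r"<{0}>(.*?)</{0}>".format(re.escape(tag)), re.DOTALL)
    return [unit.strip() for unit in pattern.findall(text)]


# Invented sample: two paragraph-level text units.
sample = "<p>First paragraph. It has two sentences.</p>\n<p>Second paragraph.</p>"

for number, unit in enumerate(split_text_units(sample, "p"), start=1):
    print(number, unit)
```

Each returned string would then be handled as a separate unit, so a query
match could not span two <p> elements.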

Enjoy,
Max Silberztein
 
Re: [Download] NooJ: a linguistic development environment

Hello everyone,

I'm a Chinese NooJ user at LASELDI, working on French-Chinese machine translation. If you have any problems with NooJ, you can send mail to nooj-info@yahoogroups.com (in French or in English). By the way, a new version, B0301, is now available as an update at http://www.nooj4nlp.net/.

Enjoy

Mei

PS:

Dear colleagues,

a new update for NooJ is available for download. Some important fixes,
including:

-- multi-word expressions were not always recognized correctly by
looking up a compound-word dictionary

-- there is now synchronization between the text and its TAS. If you
click a word in the text and the TAS is displayed, it should bring you
to the corresponding annotation

Enjoy
--Max
 
Re: Could someone show me how to use NooJ to search corpora other than its bundled ones, such as LOCNESS or CLEC,

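When I tried using NooJ to search its bundled text, "The Portrait of a Lady", for the pattern it<be><A>that, everything worked fine. But when I run it on LOCNESS, the NooJ regex above does not work, although I can still search for a specific string such as "it is true that". (I have already created a file named Loncess.noc.) I don't know why.
Could someone show me how to use NooJ to search corpora other than its bundled ones, such as LOCNESS or CLEC, taking it<be><A>that as an example? Thank you!
2. Also, I can create a new corpus, but why can't I create a new text? I have spent a whole evening on trial and error and still can't solve these two problems. Please help me with them. Thank you very much!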
 