AILab@IMCS web service: morphological analyzer/synthesizer

category LRT (Language Resources and Technology)
language Latvian
access rights The web service is primarily intended for testing and evaluation purposes, and for experimental/academic/non-commercial usage.
Different terms of use can be discussed individually (see contacts below).
Please let us know how and where you are using, or would like to use, this service or the underlying morphological lexicon.
download
Latvian Morphological Lexicon by Institute of Mathematics and Computer Science at the University of Latvia is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
description The morphological lexicon is based on the lexicon of the Dictionary of the Standard Latvian Language.
Participles and numerals are currently not included in the lexicon. Some morphological features are also not yet included in the tagset.
statistics
               lemmas  word forms  features*
nouns          32 386     355 488    710 976
pronouns           51         472        944
adjectives      6 086     681 632  3 408 160
verbs          12 002     347 729  1 174 964
adverbs         6 497       6 497          0
adpositions        40          40          0
conjunctions       28          28          0
interjections     288         288          0
particles          53          53          0
total          57 431   1 392 227  5 295 044

*Entry-level features (partOfSpeech, reflexivity, and, for nouns and pronouns, grammaticalGender) are not included in the figures.

last update 2011-03-11
service type RESTful
interface

http://valoda.ailab.lv/ws/morph/?method=analyze&writtenForm=<word_form>[&PID=<boolean>]
http://valoda.ailab.lv/ws/morph/?method=synthesize&writtenForm=<lemma>[&partOfSpeech=<word_class>][&PID=<boolean>]

PID - indicates whether the persistent identifiers are included (true by default)
boolean = true | false
word_class = noun | pronoun | adjective | verb | adverb | adposition | conjunction | interjection | particle
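
The two URL templates above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_url and its argument names are hypothetical, and the validation rules are inferred only from the parameter description above (partOfSpeech is accepted for synthesize only; PID is sent only when set explicitly).

```python
from urllib.parse import urlencode

# Endpoint taken from the interface description above.
BASE_URL = "http://valoda.ailab.lv/ws/morph/"

# Values allowed for partOfSpeech, as listed under word_class.
WORD_CLASSES = {
    "noun", "pronoun", "adjective", "verb", "adverb",
    "adposition", "conjunction", "interjection", "particle",
}


def build_url(method, written_form, part_of_speech=None, pid=None):
    """Build a GET URL for the service (hypothetical helper).

    method:         "analyze" (word form) or "synthesize" (lemma)
    part_of_speech: optional, only meaningful for "synthesize"
    pid:            optional bool; omitted means the service default (true)
    """
    if method not in ("analyze", "synthesize"):
        raise ValueError("method must be 'analyze' or 'synthesize'")
    params = {"method": method, "writtenForm": written_form}
    if part_of_speech is not None:
        if method != "synthesize":
            raise ValueError("partOfSpeech applies to 'synthesize' only")
        if part_of_speech not in WORD_CLASSES:
            raise ValueError("unknown word class: " + part_of_speech)
        params["partOfSpeech"] = part_of_speech
    if pid is not None:
        params["PID"] = "true" if pid else "false"
    # urlencode percent-encodes non-ASCII characters as UTF-8 by default.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_url("analyze", "ceļu"))
    print(build_url("synthesize", "celt", part_of_speech="verb", pid=False))
```

Building the query string with urlencode rather than by hand keeps Latvian letters such as ļ and ā correctly encoded as UTF-8.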

HTTP method POST / GET
MIME type request: application/x-www-form-urlencoded
response: text/xml
encoding UTF-8
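
As an illustration of the request and response formats listed above, the sketch below prepares a form-encoded POST request and inspects an XML reply using only the Python standard library. The function names are hypothetical, and the element names of the real response are not assumed here: element_names simply lists whatever tags the reply contains. Whether the service is still reachable at this address is also not guaranteed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the interface description above.
ENDPOINT = "http://valoda.ailab.lv/ws/morph/"


def make_post_request(method, written_form):
    """Prepare (but do not send) a form-encoded POST request."""
    # urlencode percent-encodes the UTF-8 bytes, so the body is plain ASCII.
    body = urlencode({"method": method, "writtenForm": written_form}).encode("ascii")
    return Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def element_names(xml_bytes):
    """Return the sorted set of element tags found in an XML document."""
    root = ET.fromstring(xml_bytes)
    return sorted({element.tag for element in root.iter()})


def fetch(method, written_form, timeout=10):
    """Send the request and return the raw text/xml response body (bytes)."""
    with urlopen(make_post_request(method, written_form), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Requires network access and a running service.
    print(element_names(fetch("analyze", "māju")))
```

Listing the tags first is a cautious way to explore the LMF-based response before writing code that depends on particular elements.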
annotation standards LMF (ISO 24613)
ISOcat (ISO 12620); data categories used by the analyzer/synthesizer are listed here.
contact us
demo
e.g. (word form/lemma): es/es, ceļu/celt, māju/māja

(c) IMCS UL, 2010 - 2011