From lingua at geez.org Sun Nov 25 18:04:23 2007
From: lingua at geez.org (Daniel Yacob)
Date: Sun Nov 25 19:28:09 2007
Subject: [am-nlp] Lexical Resources added to NLP.Amharic.Org
Message-ID:

Greetings All,

I deeply regret that I cannot be in Addis Ababa this week to participate in the "International Workshop on Language Resource Development for Language Technology Research in Ethiopian Languages" starting today. My best wishes go out to the workshop participants, and I hope that we might see a workshop summary sent to the list afterward.

On the occasion of the workshop I have made an extra effort to make some lexical resources public online under the Amharic NLP site:

  http://nlp.amharic.org/resources/lexical/word-lists/

Selected highlights:

Dictionaries:
  Desta Tekle Wold - Addis Yamarinya Mezgebe Kalat (term entry column, 58,323 entries)
  Tessema Habte Mikael Getzew - Kasatie Birhan Tessema YeAmarinya Mezgebe Kalat (full lexicon with glosses, 59,791 entries)
  Amsalu Aklilu - Amharic English Dictionary (word list only, 16,231 entries)
  Caressa Ferruccio - "Manuale Linguistico Per L'Africa Orientale" 1936 (1,471 Italian words translated into: Afaan Oromoo, Amharic, Tigrinya, French and Arabic)

Verbs:
  C.H. Armbruster - "Initia Amharica" (983 verbs in 9 conjugations)
  Bender-Fulas - "Amharic Verb Morphology" (1,263 verbs in 5 conjugations)
  Merged Verbs of Armbruster & Bender-Fulas (1,775 verbs in 9 basic conjugations with translations, 15,975 verb forms in total)

Toponymic:
  Solomon Gebre Christos - "List of Ethiopian Authors" (754 full names of persons and institutes)
  Berhan Ayyalew - "Transliteration of Some Amharic Names..." (7,194 personal names)
  Geographical Names (3,959 names of places)

These resources will go through minor updates over the next two weeks (largely for documentation) and then will likely freeze for at least two months. A merged word list from the dictionaries should also be added within this period.

That the materials are online today is owed more to the extended US Thanksgiving holiday this week. I had earlier tried to get the resources online for the Ethiopian New Year but failed, then promised to get them up by mid-October and failed, and tried again for Wolf Leslau's birthday last week and still didn't have enough time.

Almost every resource is at some stage of development. I had resisted placing them online until they were fully refined and defect-free. However, this is turning out to be a lengthy process, and I end up getting lots of requests for them anyway. The status of each resource is indicated on its individual page.

My own intention is to merge the resources into an RDF(S)/OWL knowledge base and take that toward a WordNet. Experimentation here has gone well, and I should be placing some early results online early next year. I would be interested to hear requests for other useful formats that people would like to have.

Relatedly, I have a Ge'ez word list with POS tags if anyone is interested in it (placing it under amharic.org doesn't make total sense, but I will probably add it there anyway), and a full Sabaean lexicon is also in the pipeline for research use.

cheers,

-Daniel