Apertium

Type

Dataset

Link

http://www.apertium.org/

Source

Ckan.net

"Apertium is a toolbox to build open-source shallow-transfer machine translation systems, especially suitable for related language pairs: it includes the engine, maintenance tools, and open linguistic data for several language pairs."

Language-pair data includes:


    Spanish ⇆ Catalan (apertium-es-ca)


    Spanish ← Romanian (apertium-es-ro)


    French ⇆ Catalan (apertium-fr-ca)


    Occitan ⇆ Catalan (apertium-oc-ca)


    English ⇆ Galician (apertium-en-gl)


    Swedish → Danish (apertium-sv-da)


    Occitan ⇆ Spanish (apertium-oc-es)


    Spanish ⇆ Portuguese (apertium-es-pt)


    English ⇆ Catalan (apertium-en-ca)


    English ⇆ Spanish (apertium-en-es)


    English ⇆ Esperanto (apertium-en-eo)


    Spanish ⇆ Galician (apertium-es-gl)


    French ⇆ Spanish (apertium-fr-es)


    Esperanto ← Spanish (apertium-eo-es)


    Welsh → English (apertium-cy-en)


    Breton → French (apertium-br-fr)


    Esperanto ← Catalan (apertium-eo-ca)


    Portuguese ⇆ Catalan (apertium-pt-ca)


    Portuguese ⇆ Galician (apertium-pt-gl)


    Basque → Spanish (apertium-eu-es)


    Norwegian Nynorsk ⇆ Norwegian Bokmål (apertium-nn-nb)

The above are the "released" language pairs. Their data includes:


    dictionaries for morphological analysis and generation


    disambiguation (statistical models, rules, in some cases Constraint Grammars)


    bilingual (transfer) dictionaries


    structural transfer rules

There is also a lot of data of these kinds for unreleased language pairs, e.g. Icelandic → English and North Sámi → Lule Sámi, as well as tools to maintain such data.
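The arrow notation in the pair listings can be read mechanically. The following is a minimal sketch, not part of Apertium itself: `parse_pair` is a hypothetical helper, and the interpretation of the arrows (⇆ as both directions, → as left-to-right only, ← as right-to-left only) is an assumption based on how the list reads, not something the page states.

```python
import re
from typing import List, Optional, Tuple

# Matches lines such as "Spanish ⇆ Catalan (apertium-es-ca)".
# The package name in parentheses is optional, so entries like
# "Icelandic → English" (no package given) also parse.
LINE_RE = re.compile(
    r"^\s*(?P<left>.+?)\s*(?P<arrow>[⇆→←])\s*(?P<right>.+?)"
    r"\s*(?:\((?P<pkg>apertium-[\w-]+)\))?\s*$"
)


def parse_pair(line: str) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Return (package name or None, list of (source, target) directions).

    Hypothetical helper; the arrow semantics are an assumed reading.
    """
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a language-pair line: {line!r}")
    left, right = match.group("left"), match.group("right")
    arrow = match.group("arrow")
    if arrow == "⇆":      # assumed: translation in both directions
        directions = [(left, right), (right, left)]
    elif arrow == "→":    # assumed: left language into right language only
        directions = [(left, right)]
    else:                 # "←", assumed: right language into left language only
        directions = [(right, left)]
    return match.group("pkg"), directions


if __name__ == "__main__":
    for entry in (
        "Spanish ⇆ Catalan (apertium-es-ca)",
        "Spanish ← Romanian (apertium-es-ro)",
        "Icelandic → English",
    ):
        print(parse_pair(entry))
```

Under this reading, a one-way entry yields a single (source, target) tuple and a two-way entry yields two, which makes it straightforward to check whether a given translation direction is covered by a listed package.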

License

The COPYING file in each language-pair data archive contains a copy of the GPL.
