Open language resources for promoting the development of language and technology in Sweden
Reference number | |
Coordinator | Institutet för språk och folkminnen - Språkrådet |
Funding from Vinnova | SEK 136 185 |
Project duration | November 2013 - June 2014 |
Status | Completed |
Important results from the project
The project's goal was to make many of the Language Council's lexicons and term lists freely available as open data, in order to stimulate the development of technologies for the languages of Sweden and to contribute to increased access to information and services. That goal has been met: 28 dictionaries, comprising over 500 000 words in total, have been made freely available under the CC BY license. We have also listed 175 word lists that we can make available on request.
Expected long-term effects
The following dictionaries and word lists have been made directly available as open data at http://www.sprakochfolkminnen.se/sprak/sprak-och-it/oppna-sprakdata.html:
- Lexin: a Swedish dictionary plus bilingual dictionaries between Swedish and 19 minority languages, with about 28 000 entries for each language (in some cases 5 000).
- Word list for interpreters: Swedish terms, about 5 000 entries.
- Multilingual word lists: 6 Swedish-Finnish term lists of varying size, totaling about 10 000 words.
A further 175 listed multilingual word lists are available on request from the Language Council.
Approach and implementation
The open language resources have been made available on the basis of a systematic inventory, prioritization, investigation and processing of data. Word lists that were not previously available in XML format have been converted to XML. As a result of this work, the Language Council has also established better management practices for new resources, so that they can be identified, maintained and made available systematically, according to the principle that our resources should be as openly accessible as possible.
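As an illustration of the kind of conversion described above, the following is a minimal sketch in Python of turning a tab-separated bilingual word list into XML. It does not reproduce the Language Council's actual tooling or schema: the function name `wordlist_to_xml`, the element names `wordlist`, `entry`, `term` and `translation`, the tab-separated input layout, and the sample word pairs are all assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def wordlist_to_xml(tsv_text, source_lang="sv", target_lang="fi"):
    """Convert tab-separated term pairs into an XML string.

    The element and attribute names are hypothetical, not the schema
    used for the published resources.
    """
    root = ET.Element("wordlist", {"source": source_lang, "target": target_lang})
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank or incomplete lines rather than guessing at missing values.
        if len(row) < 2 or not row[0].strip():
            continue
        entry = ET.SubElement(root, "entry")
        ET.SubElement(entry, "term").text = row[0].strip()
        ET.SubElement(entry, "translation").text = row[1].strip()
    return ET.tostring(root, encoding="unicode")


# Illustrative Swedish-Finnish pairs, invented for this example.
sample = "språk\tkieli\nordlista\tsanasto\n"
xml_text = wordlist_to_xml(sample)
print(xml_text)
```

Building the tree with `xml.etree.ElementTree` rather than concatenating strings means that characters such as `&` or `<` in a term are escaped automatically, which matters for word lists containing punctuation or abbreviations.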