Embeddings (mappings of linguistic units, such as words, sentences, or characters, to vectors of real numbers) play a central role in modern language technology. Training embedding models is often costly, which is why pretrained embeddings are widely used. On this page we provide lists of pretrained embeddings for Swedish and of studies that focus on evaluating Swedish embeddings. If you have suggestions or comments, please contact us.
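To make the definition above concrete, here is a minimal sketch of how word embeddings are used: words map to real-valued vectors, and semantic relatedness is typically measured as cosine similarity between those vectors. The three-dimensional vectors below are hypothetical toy values chosen for illustration; real pretrained embeddings have hundreds of dimensions and are loaded from a model file rather than written by hand.

```python
import math

# Toy 3-dimensional embeddings for three Swedish words.
# Hypothetical values, purely for illustration.
embeddings = {
    "hund": [0.9, 0.1, 0.3],   # "dog"
    "katt": [0.8, 0.2, 0.4],   # "cat"
    "bil":  [0.1, 0.9, 0.7],   # "car"
}

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v (1.0 = identical direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Semantically related words should get more similar vectors:
print(cosine_similarity(embeddings["hund"], embeddings["katt"]))
print(cosine_similarity(embeddings["hund"], embeddings["bil"]))
```

With well-trained embeddings, "hund" and "katt" end up closer to each other than either is to "bil"; the pretrained resources listed on this page provide such vectors for Swedish.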
- Sahlgren, Magnus, and Fredrik Olsson. 2019. Gender Bias in Pretrained Swedish Embeddings. Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa).
- Fallgren, Per, Jesper Segeblad, and Marco Kuhlmann. 2016. Towards a Standard Dataset of Swedish Word Vectors. Sixth Swedish Language Technology Conference (SLTC).
- Holmer, Daniel. 2020. Context matters: Classifying Swedish texts using BERT's deep bidirectional word embeddings. Bachelor thesis at Linköping University.
- Adewumi, Tosin, Foteini Liwicki, and Marcus Liwicki. 2020. Exploring Swedish & English fastText Embeddings with the Transformer.
- Adewumi, Tosin, Foteini Liwicki, and Marcus Liwicki. 2020. Corpora Compared: The Case of the Swedish Gigaword & Wikipedia Corpora.