| Resource | Task | Split | Citation | 
|---|---|---|---|
| Aspect-Based Sentiment Analysis (Immigration) | Label the sentiment that the author of a text expressed towards immigration on a 1–5 scale | 10-fold cross-validation, consecutive | [1] | 
| DaLAJ | Determine whether a sentence is correct Swedish or not | Hold-out | [2] | 
| Swedish FAQ (mismatched) | Match the question with the answer within a category | Test only | |
| SweSAT synonyms | Select the correct synonym or description of a word or expression | Test only | |
| Swedish Analogy test set | Given two word pairs A:B and C:D, capture that the relation between A and B is the same as between C and D | Test only | [3] | 
| Swedish Test Set for SemEval 2020 Task 1: Unsupervised Lexical Semantic Change Detection | Determine whether a given word has changed its meaning during a hundred-year period; determine to what extent a given word has changed its meaning during a hundred-year period | Test only | [4] | 
| SweFraCas | Given the question and the premises, choose the suitable answer | Test only | |
| SweWinograd | Resolve pronouns to their antecedents in items constructed to require reasoning (Winograd Schemata) | Test only | |
| SweWinogender | Find the correct antecedent of a personal pronoun, avoiding gender bias | Test only | [5] | 
| SweDiagnostics | Determine the logical relation between the two sentences | Test only | |
| SweParaphrase | Determine how similar two sentences are | Test only | |
| SuperSim | Predict semantic word similarity and/or relatedness between words out of context | Test only | [6] | 
| SweWiC | Determine whether instances of a word in two contexts represent the same word sense | Test only | |

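
The "10-fold cross-validation, consecutive" split listed for the sentiment resource can be illustrated with a small sketch. It assumes that "consecutive" means each test fold is a contiguous block of items in their original order, without shuffling; that reading, the function name `consecutive_folds` and the item count of 25 are illustrative assumptions, not part of the SuperLim documentation.

```python
def consecutive_folds(n_items, n_folds=10):
    """Yield (train_indices, test_indices) pairs with contiguous test blocks.

    Assumption: items keep their original order and are not shuffled.
    If n_items is not divisible by n_folds, the first folds get one extra item.
    """
    base, extra = divmod(n_items, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        # Everything outside the current block is used for training.
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


# Hypothetical example: 25 items split into 10 consecutive folds.
folds = list(consecutive_folds(25, n_folds=10))
```

Each item appears in exactly one test fold, so averaging a metric over the ten folds covers the whole resource once.
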
SuperLim does not currently have a leaderboard (though we do hope to create one). As a temporary solution, we will be collecting the available information here. If you have evaluated your model on some of our data, please let us know and we will add your results!
Yes, in its current version SuperLim is mostly a suite of test sets (however, splits into train, dev and test are provided for some of the larger resources). We strive to develop it further, which will hopefully result in training data appearing here as well.
There is currently no predefined answer, since we do not want to impose any unnecessary restrictions on how the models solve the tasks. We suggest that you devise the necessary method yourself (for example, you can average across contextualized embeddings in order to generate "classic/static/type" embeddings). It is important, however, that you document what you do very clearly (if you average, how exactly? If you use any additional corpora for that, which ones?), since that might affect comparability.
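
As one possible reading of the averaging suggestion, the minimal sketch below collapses per-occurrence (contextualized) vectors into a single vector per word type by taking the arithmetic mean. The function names `average_to_static` and `cosine`, the toy two-dimensional vectors and the example words are illustrative assumptions; in practice the vectors would come from your own model and corpus, and other pooling choices are equally possible.

```python
import math
from collections import defaultdict


def average_to_static(occurrences):
    """Average contextualized vectors per word type.

    occurrences: iterable of (word, vector) pairs, one pair per token
    occurrence, where each vector is a list of floats of equal length.
    Returns a dict mapping each word to the mean of its vectors.
    """
    sums = {}
    counts = defaultdict(int)
    for word, vec in occurrences:
        if word not in sums:
            sums[word] = [0.0] * len(vec)
        sums[word] = [s + x for s, x in zip(sums[word], vec)]
        counts[word] += 1
    return {w: [s / counts[w] for s in total] for w, total in sums.items()}


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy data (hypothetical): two occurrences of "bank", one of "flod".
toy = [("bank", [1.0, 0.0]), ("bank", [0.0, 1.0]), ("flod", [1.0, 1.0])]
static = average_to_static(toy)
similarity = cosine(static["bank"], static["flod"])
```

Whatever procedure you use, report details such as which layer the vectors come from, how subword pieces are combined and which corpus supplies the occurrences, since these choices affect comparability.
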
Instructions will appear here later. But if you have already trained something (even if it is only one task and not the whole set), drop us a line about how it went; it will really help us!
Please contact us at sb-info@svenska.gu.se. Do the same if you have any other question not covered here.
The name of the collection is SuperLim. The initial work on it was funded by the project, which is called SuperLim in Swedish and SwedishGlue in English.