In computer-assisted language learning, a core task is to repeatedly evaluate and compare the progress of language learners using language proficiency tests. We present a linguistically grounded model for predicting the difficulty of such tests along four dimensions: solution difficulty, candidate ambiguity, inter-gap dependency, and paragraph difficulty. We show that cues from all four dimensions contribute to test difficulty and that our model performs on a par with human experts.