Assignment 1: WordNet

Deadline: Friday, September 18, 23.59 CET

How to report the assignment

  • The completed assignment file assign1.py should be mailed to ildiko.pilan@svenska.gu.se with an appropriate subject line. Remember to write your full names in the mail and as a header in the assignment file (as a docstring).
  • Requirements

    Description

    In this assignment we will work with the Princeton WordNet.

    1. Read NLTK book chapter 2.5 about Princeton WordNet, and test the examples.

      Note 1: the focus of this assignment is on the synonymy relation with its corresponding concept synset (Senses and Synonyms in chapter 2.5), but have a look at the other relations also.

    2. Create a file assign1.py containing the code below (replace NAME with your full names). Define all functions in this file.

      """Assignment 1: WordNet
      Name 1: NAME
      Name 2: NAME
      """
      from nltk.corpus import wordnet as wn
      import random
      
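    3. Define a function synset_string(word) that, given word, returns a string with the following information for every synset word is a member of: the number of the synset (counting from 1), its lemma names, its definition, and each of its examples (or "no examples" if it has none).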

      If word is missing from WordNet, return the string "no synsets".

      Hint 1: before you start, think through what needs to be done, and define help functions where appropriate.

      Hint 2: use string formatting.

      The output of synset_string is exemplified below.

      >>> import assign1
      
      >>> assign1.synset_string('dog')
      'synset 1: {dog, domestic_dog, Canis_familiaris}\n  def: "a member of the genus Canis [...] the rabbit"\n\n'
      
      >>> print(assign1.synset_string('dog'))
      synset 1: {dog, domestic_dog, Canis_familiaris}
        def: "a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds"
        example: "the dog barked all night"
      
      synset 2: {frump, dog}
        def: "a dull unattractive unpleasant girl or woman"
        example: "she got a reputation as a frump"
        example: "she's a real dog"
      
      synset 3: {dog}
        def: "informal term for a man"
        example: "you lucky dog"
      
      synset 4: {cad, bounder, blackguard, dog, hound, heel}
        def: "someone who is morally reprehensible"
        example: "you dirty dog"
      
      synset 5: {frank, frankfurter, hotdog, hot_dog, dog, wiener, wienerwurst, weenie}
        def: "a smooth-textured sausage of minced beef or pork usually smoked; often served on a bread roll"
        no examples
      
      synset 6: {pawl, detent, click, dog}
        def: "a hinged catch that fits into a notch of a ratchet to move a wheel forward or prevent it from moving backward"
        no examples
      
      synset 7: {andiron, firedog, dog, dog-iron}
        def: "metal supports for logs in a fireplace"
        example: "the andirons were too hot to touch"
      
      synset 8: {chase, chase_after, trail, tail, tag, give_chase, dog, go_after, track}
        def: "go after with the intent to catch"
        example: "The policeman chased the mugger down the alley"
        example: "the dog chased the rabbit"
      
      >>> print(assign1.synset_string('qrsx'))
      no synsets
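
Following Hint 2, the layout of a single entry shown above can be produced with str.format. The sketch below is only an illustration under stated assumptions: format_entry is a hypothetical helper name, and it takes plain Python values instead of WordNet objects, so looking up the lemma names, definition and examples through NLTK is still left to you.

```python
def format_entry(number, lemmas, definition, examples):
    """Format one synset entry from plain Python values (hypothetical helper)."""
    # Doubled braces {{ and }} produce literal { and } in the output.
    lines = ['synset {}: {{{}}}'.format(number, ', '.join(lemmas))]
    lines.append('  def: "{}"'.format(definition))
    if examples:
        for example in examples:
            lines.append('  example: "{}"'.format(example))
    else:
        lines.append('  no examples')
    # Every entry ends with an empty line, as in the sample output.
    return '\n'.join(lines) + '\n\n'


print(format_entry(3, ['dog'], 'informal term for a man', ['you lucky dog']))
```

Keeping the formatting in a helper of this kind is one way to follow Hint 1: synset_string then only has to loop over the synsets and join the formatted entries.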
      
    4. Define a function synonyms(word) that returns the set of all lemma names found in the synsets of word.

      The output of synonyms is exemplified below.

      >>> assign1.synonyms('dog')
      { 'go_after', 'chase_after', 'pawl', 'dog', 'wiener', 'tag',
      'frankfurter', 'hound', 'click', 'chase', 'andiron', 'hot_dog', 
      'tail', 'Canis_familiaris', 'give_chase', 'wienerwurst', 'bounder', 
      'domestic_dog', 'track', 'frank', 'trail', 'blackguard', 'weenie', 
      'frump', 'firedog', 'detent', 'dog-iron', 'cad', 'heel', 'hotdog' }
      
      >>> assign1.synonyms('computer')
      { 'data_processor', 'computing_machine', 'information_processing_system', 
      'calculator', 'computing_device', 'estimator', 'electronic_computer', 
      'computer', 'figurer', 'reckoner' }
      >>> assign1.synonyms('qrsx')
      { }
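
Collecting names from several synsets into one set can be done with a set comprehension. The sketch below uses toy lists as a stand-in for the lemma names of three synsets (an assumption made to keep it independent of WordNet); it is not the required implementation of synonyms.

```python
# Toy data standing in for the lemma names of three synsets (not read from WordNet).
groups = [['frump', 'dog'], ['dog'], ['pawl', 'detent', 'click', 'dog']]

# A set comprehension flattens the nested lists and drops duplicates such as 'dog'.
names = {name for group in groups for name in group}
print(sorted(names))

# With no groups at all the result is an empty set.
empty = {name for group in [] for name in group}
print(empty)
```

Note that Python 3 displays an empty set as set(), and that the order in which the elements of a set are shown can differ between runs.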
      
    5. Define a function random_pick(lst) that randomly selects an element from a list.

      Hint: use random.randint in module random. Try out random.randint(1,10) (execute it a couple of times).

      The output of random_pick is exemplified below.

      >>> assign1.random_pick(['a','b','c','d','e'])
      'e'
      
      >>> assign1.random_pick(['a','b','c','d','e'])
      'b'
      
      >>> assign1.random_pick(['a','b','c','d','e'])
      'c'
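
Regarding the hint: random.randint(a, b) includes both endpoints, which matters when the result is used as a list index. The following is a minimal sketch of that detail; the variable names are chosen for illustration only.

```python
import random

items = ['a', 'b', 'c', 'd', 'e']

# randint(a, b) can return both a and b, so the largest valid index is len(items) - 1.
index = random.randint(0, len(items) - 1)
print(index, items[index])
```

Using random.randint(0, len(items)) instead would occasionally raise an IndexError, because len(items) itself is not a valid index.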
      
    6. Define a function random_synonym(word) that randomly returns one of the synonyms of word, excluding word itself unless it is the only synonym.

      If word is not in WordNet, then the function returns *word* (word with a star at the beginning and at the end).

      Hint: use synonyms and random_pick.

      The output of random_synonym is exemplified below.

      >>> assign1.random_synonym('dog')
      'bounder'
      
      >>> assign1.random_synonym('dog')
      'go_after'
      
      >>> assign1.random_synonym('dog')
      'hound'
      
      >>> assign1.random_synonym('computer')
      'computing_machine'
      
      >>> assign1.random_synonym('xqrs')
      '*xqrs*'
      
    7. Define a function synonymify(sentence) that applies random_synonym to every word in sentence.

      The output of synonymify is exemplified below.

      >>> assign1.synonymify('the dog sits in front his computer')
      '*the* wienerwurst baby-sit IN front_man *his* estimator'
      
      >>> assign1.synonymify('the dog sits in front his computer')
      '*the* frankfurter baby-sit inch front_line *his* calculator'
      
      >>> assign1.synonymify('the dog sits in front his computer')
      '*the* go_after sit_around IN look *his* data_processor'
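
Applying a function to every word of a sentence can be sketched with split and join. In the example below, apply_to_words is a hypothetical helper and str.upper merely stands in for the per-word function; splitting on whitespace is also an assumption, which works for the sample sentence but does not handle punctuation.

```python
def apply_to_words(sentence, func):
    """Apply func to each whitespace-separated word (hypothetical helper)."""
    return ' '.join(func(word) for word in sentence.split())


# str.upper is only a stand-in for the function applied to each word.
print(apply_to_words('the dog sits', str.upper))
```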
      
    8. To think about: the output of synonymify is rather strange. Do you have any ideas on how to improve the function so that the meaning of sentence is preserved?