Basic usage

Synset and Variants

To create a new (empty) synset:

>>> import eurown
>>> a = eurown.Synset()
>>> print a
<eurown.Synset object at 0x80ab10c>
>>> a.polarisText
u'0 WORD_MEANING'
>>> print a.polarisText
0 WORD_MEANING
>>>

The polarisText property returns the synset as a (unicode) string in Polaris import-export format.

A synset has a part-of-speech property, pos, which can be one of 'a', 'b', 'v' or 'n'. It is predefined as 'pn' when we have a WORD_INSTANCE instead of a WORD_MEANING:

>>> a.pos = 'n'
>>> print a.polarisText
0 WORD_MEANING
  1 PART_OF_SPEECH "n"
>>> b = eurown.WordInstance()
>>> print b.polarisText
0 WORD_INSTANCE
  1 PART_OF_SPEECH "pn"

To make some new variants (each with a literal and a sense number; var3 also gets a gloss):

>>> var1 = eurown.Variant(literal='test',sense=1)
>>> var2 = eurown.Variant(literal='trial',sense=1)
>>> var3 = eurown.Variant(literal='test',sense=2)
>>> var3.gloss = u'This is test'
>>> var4 = eurown.Variant(literal='exam',sense=1)

Let’s assign variants var1 and var2 to synset a:

>>> a.variants = eurown.Variants([var1, var2])
>>> print a.polarisText
0 WORD_MEANING
  1 PART_OF_SPEECH "n"
  1 VARIANTS
    2 LITERAL "test"
      3 SENSE 1
    2 LITERAL "trial"
      3 SENSE 1

Then make a new synset and assign variants var3 and var4 to it:

>>> snset2 = eurown.Synset(pos='n')
>>> snset2.variants = eurown.Variants([var3, var4])
>>> print var3.polarisText
2 LITERAL "test"
  3 SENSE 2
  3 DEFINITION "This is test"

We can also append a fifth variant directly to snset2.variants:

>>> snset2.variants.append(eurown.Variant(literal='examination',sense=1))

Now we should have a synset (snset2) with three variants:

>>> print snset2.polarisText
0 WORD_MEANING
  1 PART_OF_SPEECH "n"
  1 VARIANTS
    2 LITERAL "test"
      3 SENSE 2
      3 DEFINITION "This is test"
    2 LITERAL "exam"
      3 SENSE 1
    2 LITERAL "examination"
      3 SENSE 1
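
The outputs above suggest a regular layout: every line starts with its nesting level, is indented two spaces per level, and carries a tag plus an optional quoted string or bare number. A minimal standalone sketch of that layout follows. It does not use eurown, it is written for Python 3, and the render function and the tuple structure are hypothetical names chosen for illustration, not the module's implementation.

```python
# Standalone sketch; does not import eurown. All names here are hypothetical.

def render(tag, value=None, children=(), level=0):
    """Return Polaris-style lines for one record and its nested children."""
    indent = '  ' * level                     # two spaces per nesting level
    line = '%s%d %s' % (indent, level, tag)
    if value is not None:
        if isinstance(value, str):
            line += ' "%s"' % value           # strings are quoted
        else:
            line += ' %s' % value             # numbers are written bare
    lines = [line]
    for child_tag, child_value, grandchildren in children:
        lines.extend(render(child_tag, child_value, grandchildren, level + 1))
    return lines


# A record shaped like synset a above: (tag, value, children).
synset = ('WORD_MEANING', None, [
    ('PART_OF_SPEECH', 'n', []),
    ('VARIANTS', None, [
        ('LITERAL', 'test', [('SENSE', 1, [])]),
        ('LITERAL', 'trial', [('SENSE', 1, [])]),
    ]),
])

text = '\n'.join(render(*synset))
```

Under these assumptions, text holds the same seven lines that print a.polarisText showed for synset a.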

Internal Relations

A relation consists of a relation name and a target concept. Let’s make a synset that will be related to our snset2:

>>> snset3 = eurown.Synset(pos='n')
>>> var6 = eurown.Variant(literal="communication",sense=1)
>>> var7 = eurown.Variant(literal="communicating",sense=1)
>>> snset3.variants = eurown.Variants([var6,var7])
>>> print snset3.polarisText
0 WORD_MEANING
  1 PART_OF_SPEECH "n"
  1 VARIANTS
    2 LITERAL "communication"
      3 SENSE 1
    2 LITERAL "communicating"
      3 SENSE 1

Now we can link snset2 to it via a “has_hyperonym” relation:

>>> rel = eurown.Relation(name='has_hyperonym',target_concept=snset3)
>>> snset2.addRelation(rel)
>>> print snset2.polarisText
0 WORD_MEANING
  1 PART_OF_SPEECH "n"
  1 VARIANTS
    2 LITERAL "test"
      3 SENSE 2
      3 DEFINITION "This is test"
    2 LITERAL "exam"
      3 SENSE 1
    2 LITERAL "examination"
      3 SENSE 1
  1 INTERNAL_LINKS
    2 RELATION "has_hyperonym"
      3 TARGET_CONCEPT
        4 PART_OF_SPEECH "n"
        4 LITERAL "communication"
          5 SENSE 1

This is the result of calling the addRelation() method on the source synset.

Parsing IO File

Parsing a Polaris IO file is done by Parser. First, we create a parser instance:

>>> p = eurown.Parser()

The parser can be given a file name:

>>> p.fileName = 'kb59-utf_8.txt'

We can parse one line, one synset, or even one wordnet file at a time.
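
To make the line and synset granularity concrete, here is a rough standalone sketch of reading the format shown in the earlier outputs. It is not the eurown Parser: parse_line and iter_synsets are hypothetical helpers, written for Python 3, and they assume that a line is "level TAG [value]" and that a new record starts at every level 0 line.

```python
import re

# Hypothetical helpers, not part of eurown.
LINE_RE = re.compile(r'^\s*(\d+)\s+(\w+)(?:\s+(.*?))?\s*$')


def parse_line(line):
    """Split one Polaris line into (level, tag, value)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError('not a Polaris line: %r' % line)
    level, tag, raw = match.groups()
    if not raw:
        value = None                  # tag without an argument
    elif len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        value = raw[1:-1]             # quoted string
    elif raw.isdigit():
        value = int(raw)              # bare number, e.g. a sense number
    else:
        value = raw                   # anything else is kept verbatim
    return int(level), tag, value


def iter_synsets(lines):
    """Group parsed lines into records, one per level 0 line."""
    record = []
    for line in lines:
        if not line.strip():
            continue                  # skip blank lines
        parsed = parse_line(line)
        if parsed[0] == 0 and record:
            yield record
            record = []
        record.append(parsed)
    if record:
        yield record
```

For example, parse_line('    2 LITERAL "test"') gives the tuple (2, 'LITERAL', 'test'), and iter_synsets yields one list of such tuples per synset.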

WordNet

The module can work with more than one wordnet at a time. When instantiating a wordnet, we give it a file name and then build all the necessary indexes. Building the indexes may take some time:

>>> wn = eurown.WordNet(name='et', ioFileName='kb59-utf_8.txt')
>>> wn.make_indexes()
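
What such an index might contain can be illustrated with a standalone sketch. This is an assumption about the general idea, not the code behind make_indexes(): build_literal_index is a hypothetical function, written for Python 3, that maps each variant literal (assumed to sit on a level 2 LITERAL line) to the byte offsets of the records that contain it.

```python
import io

# Hypothetical sketch, not the eurown implementation.


def build_literal_index(stream):
    """Map each literal to the byte offsets of the records containing it."""
    index = {}
    record_offset = None
    while True:
        offset = stream.tell()        # position of the line about to be read
        line = stream.readline()
        if not line:
            break                     # end of file
        fields = line.decode('utf-8').split(None, 2)
        if len(fields) < 2:
            continue                  # blank or malformed line
        if fields[0] == '0':
            record_offset = offset    # a new record starts here
        elif (fields[0] == '2' and fields[1] == 'LITERAL'
              and len(fields) == 3 and record_offset is not None):
            literal = fields[2].strip().strip('"')
            offsets = index.setdefault(literal, [])
            if record_offset not in offsets:
                offsets.append(record_offset)
    return index


# In-memory stand-in for an IO file opened in binary mode.
sample = (
    b'0 WORD_MEANING\n'
    b'  1 PART_OF_SPEECH "n"\n'
    b'  1 VARIANTS\n'
    b'    2 LITERAL "test"\n'
    b'      3 SENSE 1\n'
    b'0 WORD_MEANING\n'
    b'  1 PART_OF_SPEECH "n"\n'
    b'  1 VARIANTS\n'
    b'    2 LITERAL "test"\n'
    b'      3 SENSE 2\n'
    b'    2 LITERAL "exam"\n'
    b'      3 SENSE 1\n'
)
index = build_literal_index(io.BytesIO(sample))
```

With an index of this shape, a lookup for a literal returns offsets that let a reader seek straight to each matching record instead of scanning the whole file, which is why building it once up front can pay off.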

Examples

A script that asks the user for a word and prints some basic information (literal, sense, gloss and examples) for each synset that contains it:

import eurown

wn = eurown.WordNet(name='et',
                    ioFileName='kb59-utf_8.txt')

wn.make_indexes()

def test_by_literal(literal):
    # Look the literal up in the index; unknown words print nothing.
    if literal in wn.literalIndex:
        snset_offsets = wn.literalIndex[literal]
        for i in snset_offsets:
            print i  # offset of the synset in the IO file
            p = eurown.Parser(fileName='kb59-utf_8.txt')
            synset = p.parse_synset(offset=i)
            print 5*'='
            for j in synset.variants:
                print '%s_%d' % (j.literal, j.sense)
                print j.gloss
                print j.examples

def show_synsets():
    # Prompt forever ('otsi' is Estonian for 'search'); stop with Ctrl-C.
    while True:
        a = raw_input('otsi: ')
        test_by_literal(a)

show_synsets()
