from nltk.corpus import wordnet as guru

Find Synonyms from NLTK WordNet in Python

Statistics show that English WordNet contains 155287 words and 117659 synonym sets. The attributes available on the WordNet object can be listed by typing dir(guru). Before the corpus has been used for the first time, this shows the attributes of NLTK's lazy corpus loader:

['_LazyCorpusLoader__args', '_LazyCorpusLoader__kwargs', '_LazyCorpusLoader__load', '_LazyCorpusLoader__name', '_LazyCorpusLoader__reader_cls', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__', '_unload', 'subdir', 'unicode_repr']

Let us look at some of the features available in WordNet.

Synset: Also called a synonym set, a synset is a collection of synonymous words. Let us check an example:

from nltk.corpus import wordnet
syns = wordnet.synsets("dog")
print(syns)

Output:

[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
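Each synset in that list carries more than its name. As a rough sketch, the helper below collects the definition and lemma names of every sense of a word, using the `name()`, `definition()`, and `lemma_names()` methods of NLTK's Synset objects. The function names and the `reader` parameter are illustrative and not part of NLTK, and the sketch assumes the WordNet data has already been fetched with `nltk.download("wordnet")`.

```python
def describe_synsets(word, reader=None):
    """Return (synset name, definition, lemma names) for every sense of `word`.

    `reader` is a hypothetical hook: any object with a WordNet-style
    `synsets()` method. When omitted, NLTK's WordNet corpus is used.
    """
    if reader is None:
        # Imported lazily; needs nltk plus the downloaded "wordnet" data.
        from nltk.corpus import wordnet as reader
    return [
        (syn.name(), syn.definition(), syn.lemma_names())
        for syn in reader.synsets(word)
    ]


def print_senses(word, reader=None):
    """Print one line per sense: name, definition, and member words."""
    for name, definition, lemmas in describe_synsets(word, reader):
        print(f"{name}: {definition} {lemmas}")
```

Calling `print_senses("dog")` should print one line for each of the synsets listed in the output above.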

Lexical Relations: These are relations that are reciprocated. If a relation holds between {x1, x2, …, xn} and {y1, y2, …, yn}, then a corresponding relation also holds between {y1, y2, …, yn} and {x1, x2, …, xn}. For example, antonymy works in both directions (if "active" is an antonym of "passive", then "passive" is an antonym of "active"), and hypernymy and hyponymy are inverses of each other. Let us write a Python program to find the synonyms and antonyms of the word "active" using WordNet.

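from nltk.corpus import wordnet

synonyms = []
antonyms = []

# Walk through every synset (sense) of the word and each lemma in it
for syn in wordnet.synsets("active"):
    for lemma in syn.lemmas():
        synonyms.append(lemma.name())
        # Record the first antonym when the lemma has any
        if lemma.antonyms():
            antonyms.append(lemma.antonyms()[0].name())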

print(set(synonyms))
print(set(antonyms))

The output of the code:

{'dynamic', 'fighting', 'combat-ready', 'active_voice', 'active_agent', 'participating', 'alive', 'active'}
{'stative', 'passive', 'quiet', 'passive_voice', 'extinct', 'dormant', 'inactive'}

The first set contains the synonyms and the second set contains the antonyms.

Explanation of the code

WordNet is a corpus, so it is imported from nltk.corpus.
Two empty lists, synonyms and antonyms, are created to collect the results.
The synsets of the word "active" are looked up with wordnet.synsets(), and the name of every lemma in each synset is appended to the synonyms list.
For each lemma that has antonyms, the name of its first antonym is appended to the antonyms list.
Both lists are converted to sets to remove duplicates and then printed.
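The steps above can also be wrapped in a reusable function. The sketch below is one possible variant, not the program shown earlier: it collects into sets directly and keeps every antonym of a lemma rather than only the first. The function name and the `reader` parameter are illustrative and not part of NLTK; `synsets()`, `lemmas()`, `name()`, and `antonyms()` are the same NLTK calls used in the program.

```python
def synonyms_and_antonyms(word, reader=None):
    """Return (synonyms, antonyms) of `word` as two sets of lemma names.

    `reader` is a hypothetical hook: any object with a WordNet-style
    `synsets()` method. When omitted, NLTK's WordNet corpus is used.
    """
    if reader is None:
        # Imported lazily; needs nltk plus the downloaded "wordnet" data.
        from nltk.corpus import wordnet as reader
    synonyms, antonyms = set(), set()
    for syn in reader.synsets(word):
        for lemma in syn.lemmas():
            synonyms.add(lemma.name())
            # A lemma may list several antonyms; keep all of them.
            antonyms.update(ant.name() for ant in lemma.antonyms())
    return synonyms, antonyms
```

With NLTK and its WordNet data installed, `synonyms_and_antonyms("active")` returns the two sets in one call. Because sets are unordered, the printed order of the words can differ between runs.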

Conclusion: WordNet is a lexical database that has been used by a major search engine. For a given word or phrase, WordNet can supply information such as:

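synonyms: words having the same meaning
hypernyms: the generic term used to designate a class of specifics (e.g., meal is a hypernym of breakfast)
hyponyms: a specific member of a more general class (e.g., breakfast is a hyponym of meal)
holonyms: the whole of which something is a part (e.g., meal is a holonym of proteins and carbohydrates, which are parts of a meal)
meronyms: a part of a larger whole (e.g., meal is a meronym of daily food intake)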
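These relations are exposed as methods on NLTK's Synset objects. The following is a minimal sketch that gathers them for one synset; the name `related_terms` is illustrative. NLTK splits holonyms and meronyms into member, part, and substance variants, and the sketch simply merges the three, which is a simplification made here rather than anything WordNet prescribes.

```python
def related_terms(synset):
    """Collect the names of synsets related to `synset`, grouped by relation."""

    def names(synsets):
        return sorted(s.name() for s in synsets)

    return {
        "hypernyms": names(synset.hypernyms()),
        "hyponyms": names(synset.hyponyms()),
        # Holonyms: the wholes that this synset is a part of.
        "holonyms": names(
            synset.member_holonyms()
            + synset.part_holonyms()
            + synset.substance_holonyms()
        ),
        # Meronyms: the parts that make up this synset.
        "meronyms": names(
            synset.member_meronyms()
            + synset.part_meronyms()
            + synset.substance_meronyms()
        ),
    }
```

With NLTK installed, it could be tried on the first sense of "meal", for example `related_terms(wordnet.synsets("meal")[0])` after `from nltk.corpus import wordnet`.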

WordNet also provides information on coordinate terms, derivations, senses, and more. It is used to find the similarity between any two words. It also holds information on related words. In a nutshell, it can be treated as a dictionary or thesaurus. Going deeper, WordNet is divided into four subnets:

Noun
Verb
Adjective
Adverb
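Every synset belongs to one of these subnets, and its `pos()` method reports which one. The sketch below counts the senses of a word per part of speech. The function name, the `reader` parameter, and the `POS_LABELS` table are illustrative; the single-letter tags are the ones NLTK uses, where `s` marks a satellite adjective and is counted here as an adjective.

```python
from collections import Counter

# NLTK's one-letter part-of-speech tags mapped to the four subnets.
POS_LABELS = {"n": "noun", "v": "verb", "a": "adjective", "s": "adjective", "r": "adverb"}


def count_senses_by_pos(word, reader=None):
    """Return a Counter mapping subnet name to the number of senses of `word`.

    `reader` is a hypothetical hook: any object with a WordNet-style
    `synsets()` method. When omitted, NLTK's WordNet corpus is used.
    """
    if reader is None:
        # Imported lazily; needs nltk plus the downloaded "wordnet" data.
        from nltk.corpus import wordnet as reader
    return Counter(POS_LABELS.get(syn.pos(), syn.pos()) for syn in reader.synsets(word))
```

For the "dog" output shown earlier, this would report seven noun senses and one verb sense. To restrict a lookup to a single subnet, `synsets()` also accepts a `pos` argument, for example `wordnet.synsets("dog", pos=wordnet.VERB)`.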

WordNet can be used in artificial intelligence applications for text analysis. With its help, you can build your own corpus for spell checking, language translation, spam detection, and more. You can also adapt the corpus to support dynamic functionality of your own. In effect, it is a ready-made corpus that you can use in whatever way suits you.
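As one concrete instance of the word-similarity use mentioned earlier, the sketch below scores two words by the best `path_similarity()` found between any pair of their synsets. Taking the maximum over all sense pairs is one simple convention chosen for illustration, not the only way to compare words; the function name and the `reader` parameter are likewise illustrative and not part of NLTK.

```python
def max_path_similarity(word_a, word_b, reader=None):
    """Return the highest path similarity between any senses of two words.

    Returns None when no pair of senses can be compared. `reader` is a
    hypothetical hook: any object with a WordNet-style `synsets()` method.
    When omitted, NLTK's WordNet corpus is used.
    """
    if reader is None:
        # Imported lazily; needs nltk plus the downloaded "wordnet" data.
        from nltk.corpus import wordnet as reader
    best = None
    for syn_a in reader.synsets(word_a):
        for syn_b in reader.synsets(word_b):
            # path_similarity() may return None when no path links the senses.
            score = syn_a.path_similarity(syn_b)
            if score is not None and (best is None or score > best):
                best = score
    return best
```

With NLTK and its WordNet data installed, `max_path_similarity("dog", "cat")` gives a score between 0 and 1, where higher values mean the closest senses of the two words sit nearer each other in the hypernym hierarchy.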