Generating PCFG From Universal Tagset

I am trying to build a PCFG using the POS tags obtained from the code below:

from nltk.corpus import treebank
corpus = treebank.tagged_sents(tagset='universal')
tags = set()
for

Solution 1:

I found the answer to this question. Instead of using the fromstring method, build the PCFG object directly by passing the constructor a start symbol (an nltk.Nonterminal object) and a list of nltk.ProbabilisticProduction objects, as below:

from nltk.grammar import PCFG, ProbabilisticProduction
from nltk.grammar import Nonterminal as NT

# A production whose right-hand side is two nonterminals
rule = ProbabilisticProduction(NT('TS'), [NT('.'), NT('NT6')], prob=1)

# Adding a terminal production (terminals are plain strings)
lexical = ProbabilisticProduction(NT('NT6'), ['terminal'], prob=1)

start = NT('Q0')  # Q0 is the start symbol for my grammar
# Takes the start symbol and a list of ProbabilisticProductions;
# the probabilities for each left-hand side must sum to 1
grammar = PCFG(start, [rule, lexical])
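
For the tag-to-word part of such a grammar, the probabilities can be estimated by relative frequency from the tagged sentences. This is only a rough sketch, not part of the original answer: the helper names lexical_rule_probs and build_lexical_productions are made up for illustration, and the tiny sample list is a hypothetical stand-in for treebank.tagged_sents(tagset='universal'), so it runs without downloading the corpus.

```python
from collections import Counter, defaultdict


def lexical_rule_probs(tagged_sents):
    """Estimate P(word | tag) by relative frequency from (word, tag) pairs."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[tag][word] += 1

    probs = {}
    for tag, words in counts.items():
        total = sum(words.values())
        for word, n in words.items():
            # Dividing by the per-tag total makes each tag's rules sum to 1
            probs[(tag, word)] = n / total
    return probs


def build_lexical_productions(probs):
    """Turn {(tag, word): prob} into NLTK ProbabilisticProduction objects."""
    # Imported here so the counting helper above can be used without NLTK
    from nltk.grammar import Nonterminal, ProbabilisticProduction

    return [
        ProbabilisticProduction(Nonterminal(tag), [word], prob=p)
        for (tag, word), p in probs.items()
    ]


# Hypothetical sample standing in for treebank.tagged_sents(tagset='universal')
sample = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB"), (".", ".")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB"), (".", ".")],
    [("the", "DET"), ("cat", "NOUN"), ("barks", "VERB"), (".", ".")],
]
probs = lexical_rule_probs(sample)
```

The list returned by build_lexical_productions(probs) covers only the lexical rules. It would still need to be combined with the structural productions (the ones with nonterminals on the right-hand side) before being passed to PCFG together with the start symbol, since the constructor checks that the probabilities for each left-hand side sum to 1.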
