Technology

Cogilex uses a proprietary, patented technology for its syntactic and semantic analysis components. Cogilex's general approach is based on Dependency Syntax Theory and uses the concept of multi-interpretation Lexical Frames.

Lexical Frames

Lexical Frames can represent the syntactic relationships in a given sentence without resolving all syntactic ambiguities. For instance, for the sentence "Functional changes are early indicators of growth in clonal development", the Lexical Representation looks like this (noun phrase structure is not shown):

1	N	Functional changes
2	V	be
	Subject	Functional changes	1
	What	early indicators	4
	of	growth	6
	in	clonal development	9
4	N	early indicators
	of	growth	6
	in	clonal development	9
5	Prep	of
6	N	growth
	in	clonal development	9
7	Prep	in
9	N	clonal development

Words and Frames: All words in the original sentence occupy one numbered group in the result. Furthermore, words for which some syntactic dependencies were found have lexical frames attached (the indented lines below each numbered line).

Word Number: Word numbers correspond to the position of words in the original text, starting at 0. Noun phrases (consecutive adjectives and nouns) are collected as one unit and given the number of the last word. For example, "Functional changes" covers words 0 and 1, so it is numbered 1.
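The numbering rule can be sketched in a few lines of Python. This is only an illustration, not the Cogilex implementation: the function name `number_units`, the tag set ("Adj", "N", "V", "Prep") and the pre-tagged input are all assumptions made for the example.

```python
def number_units(tagged):
    """Group consecutive adjectives and nouns into one unit numbered by its last word.

    `tagged` is a list of (word, tag) pairs; positions start at 0.
    The tag names are hypothetical, chosen to match the example output.
    """
    units = []
    i = 0
    while i < len(tagged):
        word, tag = tagged[i]
        if tag in ("Adj", "N"):
            # Extend the noun phrase while the next word is an adjective or noun.
            j = i
            while j + 1 < len(tagged) and tagged[j + 1][1] in ("Adj", "N"):
                j += 1
            phrase = " ".join(w for w, _ in tagged[i:j + 1])
            units.append((j, "N", phrase))  # number of the last word in the phrase
            i = j + 1
        else:
            units.append((i, tag, word))
            i += 1
    return units


SENTENCE = [
    ("Functional", "Adj"), ("changes", "N"), ("are", "V"),
    ("early", "Adj"), ("indicators", "N"), ("of", "Prep"),
    ("growth", "N"), ("in", "Prep"), ("clonal", "Adj"), ("development", "N"),
]

for unit in number_units(SENTENCE):
    print(unit)
```

Run on the example sentence, the sketch yields the unit numbers 1, 2, 4, 5, 6, 7 and 9, matching the numbered lines in the representation above. It keeps the surface form "are" where the representation shows "be".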

Frame Slots: Each frame slot contains three pieces of information: the slot name, the word as it appeared in the original sentence, and the word number. The slot name is either the name of a syntactic relationship or a preposition.
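As a rough illustration of this structure, the representation could be loaded into simple Python objects. The class names `Slot` and `Entry`, the function `parse_representation`, and the assumption that the listing is tab-separated are all hypothetical; the actual Cogilex output format and API are not documented here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slot:
    name: str    # syntactic relationship ("Subject", "What") or a preposition ("of", "in")
    text: str    # slot filler as it appeared in the sentence
    number: int  # word number of the filler


@dataclass
class Entry:
    number: int    # word number of this unit
    category: str  # "N", "V" or "Prep" in the example
    text: str
    slots: List[Slot] = field(default_factory=list)  # the lexical frame, possibly empty


def parse_representation(text: str) -> List[Entry]:
    """Parse a listing in which slot lines start with a tab (assumed layout)."""
    entries: List[Entry] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if fields[0] == "":
            # Indented line: a slot belonging to the most recent numbered entry.
            name, filler, number = fields[1:4]
            entries[-1].slots.append(Slot(name, filler, int(number)))
        else:
            number, category, word = fields[:3]
            entries.append(Entry(int(number), category, word))
    return entries


EXAMPLE = (
    "1\tN\tFunctional changes\n"
    "2\tV\tbe\n"
    "\tSubject\tFunctional changes\t1\n"
    "\tWhat\tearly indicators\t4\n"
    "\tof\tgrowth\t6\n"
    "\tin\tclonal development\t9\n"
    "4\tN\tearly indicators\n"
    "\tof\tgrowth\t6\n"
    "\tin\tclonal development\t9\n"
    "5\tPrep\tof\n"
    "6\tN\tgrowth\n"
    "\tin\tclonal development\t9\n"
    "7\tPrep\tin\n"
    "9\tN\tclonal development\n"
)

entries = parse_representation(EXAMPLE)
```

Each `Entry` corresponds to one numbered line, and its `slots` list holds the indented lines that follow it.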

Why are Slots repeated?: This is the power of Lexical Frames. Most parsers try to resolve every ambiguous attachment syntactically. This is not a good idea, because disambiguation can only be done accurately on a semantic basis. The Cogilex Parser instead returns all possible attachments. For instance, in the example above, "clonal development" is attached in three different places:

  1. be - in clonal development
  2. early indicators - in clonal development
  3. growth - in clonal development
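To make the repetition concrete, the following sketch collects every head to which a given word number is attached. The `FRAMES` dictionary simply transcribes the example frames; its layout and the function name `attachments` are assumptions for illustration.

```python
# head number -> (head text, list of (slot name, filler text, filler number))
FRAMES = {
    2: ("be", [
        ("Subject", "Functional changes", 1),
        ("What", "early indicators", 4),
        ("of", "growth", 6),
        ("in", "clonal development", 9),
    ]),
    4: ("early indicators", [
        ("of", "growth", 6),
        ("in", "clonal development", 9),
    ]),
    6: ("growth", [
        ("in", "clonal development", 9),
    ]),
}


def attachments(frames, filler_number):
    """Return (head number, head text, slot name) for every frame that holds the filler."""
    return [
        (head_number, head_text, slot_name)
        for head_number, (head_text, slots) in frames.items()
        for slot_name, _, number in slots
        if number == filler_number
    ]


print(attachments(FRAMES, 9))
# [(2, 'be', 'in'), (4, 'early indicators', 'in'), (6, 'growth', 'in')]
```

The same call with word number 6 shows that "growth" is itself attached twice, to "be" and to "early indicators".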

There are many ways to resolve this kind of attachment ambiguity, both syntactic and semantic. Here are a few possibilities:

  • Minimal attachment: This is the normal strategy used by parsers. In the example, attachment 3 would be selected.
  • NP Attachment: Prepositional phrases are attached to the closest noun. In the example, attachment 3 would still be selected.
  • VP Attachment: Prepositional phrases are attached to the closest verb. In the example, attachment 1 would be selected.
  • Semantic Frames: The Cogilex Entity Attribute Extractor uses semantic frames to decide attachments. These frames are essentially a set of rules expressing how information about a specific entity is expected to be found. In that case, the attachment decision is made so as to maximize the richness and likelihood of the different interpretations.
  • Lexical database: There are lexical ways of deciding between alternatives: either statistical methods that estimate the probability of words being related (e.g. is "indicator in development" more or less probable than "growth in development"?) or lexical methods that compare word classes and word synonyms.
  • Other: Depending on the end application, a variety of other strategies could be applied. The Cogilex parser is geared towards flexibility by not forcing semantic decisions at the syntactic level.
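Because the parser returns every candidate, a resolution strategy can be applied afterwards as a separate step. The sketch below illustrates NP attachment, VP attachment and a score-based choice over the three candidate heads for "clonal development". All function names are hypothetical, and the scores are placeholder numbers invented for the example, not real statistics.

```python
# Candidate heads for "in clonal development" (word number 9):
# (head number, head category, head text)
CANDIDATES = [
    (2, "V", "be"),
    (4, "N", "early indicators"),
    (6, "N", "growth"),
]


def closest(candidates, category, filler_number):
    """Pick the head of the given category nearest to the left of the filler."""
    eligible = [c for c in candidates if c[1] == category and c[0] < filler_number]
    return max(eligible, key=lambda c: c[0]) if eligible else None


def np_attachment(candidates, filler_number):
    """Attach the prepositional phrase to the closest noun."""
    return closest(candidates, "N", filler_number)


def vp_attachment(candidates, filler_number):
    """Attach the prepositional phrase to the closest verb."""
    return closest(candidates, "V", filler_number)


def best_by_score(candidates, score):
    """Attach to the head with the highest score (e.g. a lexical association estimate)."""
    return max(candidates, key=score)


# Placeholder association scores, made up purely for illustration.
HYPOTHETICAL_SCORES = {"be": 0.1, "early indicators": 0.3, "growth": 0.6}

print(np_attachment(CANDIDATES, 9))
print(vp_attachment(CANDIDATES, 9))
print(best_by_score(CANDIDATES, lambda c: HYPOTHETICAL_SCORES[c[2]]))
```

With these inputs, NP attachment selects "growth" (attachment 3) and VP attachment selects "be" (attachment 1), as described in the list above. Swapping strategies only changes the selection function, not the parser output.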

  © 1998-2005 Cogilex R&D