Products and Services

Cogilex provides Natural Language Processing solutions to industrial partners and clients for a wide variety of applications involving text mining and information extraction. Cogilex's expertise lies in the fine-grained understanding of unstructured text through data-driven syntactic and semantic analysis.

Cogilex uses a sophisticated toolbox of Win32 and Linux components to assemble customized solutions. Cogilex components include:

  • Language Identifier: Establishing the language of a document or portion of text. The Cogilex Language Identifier can work on multi-language documents.
  • Canonical Form Generator: Attributing generic forms to words at different levels of abstraction, from simple stemming to synonym and hypernym replacement. This is useful for sophisticated text indexing.
  • Morphological Tagger: Assigning parts of speech and morphological attributes to words. This is used as input for syntactic parsing or can be used on its own for advanced text indexing and retrieval.
  • Noun Phrase Tagger: Determining the boundaries and structure of noun phrases. This can be used for concept extraction and text indexing, and it generates the input for the Entity Identifier.
  • Entity Identifier: Identifying the semantic nature of noun phrases within a hierarchical ontology using sophisticated contextual rules.
  • Fine Grain Concordancer: Retrieving the context of occurrence of a pattern in a set of documents. These patterns can be based on a variety of lexical, syntactic and semantic properties.
  • Syntactic Parser: Establishing the grammatical relationships between words and groups of words. The Cogilex parser is a robust dependency parser for English.
  • Entity Attribute Extractor: Creating Entity-Attribute-Value records from the result of syntactic parsing, based on knowledge representations of different domains. This is used for fine-grained information extraction from sentences. High levels of accuracy can be achieved in well-specified domains.
  • Semantic Interpreter: Validating Entity records and deducing new ones from domain knowledge descriptions. This component provides a high-level analysis of relationships between different extracted records as well as a measure of the accuracy and likelihood of information extracted from sentences.
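
To make the idea of Entity-Attribute-Value records concrete, here is a minimal Python sketch of how such records might be derived from dependency relations. It is only an illustration and does not reflect Cogilex's actual components or interfaces: the Dependency and EAVRecord types, the relation names "subject" and "object", the DOMAIN_ATTRIBUTES table and the extract_records function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """One head-relation-dependent triple, as a dependency parser might emit."""
    head: str
    relation: str
    dependent: str


@dataclass(frozen=True)
class EAVRecord:
    """An Entity-Attribute-Value record."""
    entity: str
    attribute: str
    value: str


# Hypothetical domain knowledge: which attribute a given verb expresses.
DOMAIN_ATTRIBUTES = {"costs": "price", "weighs": "weight"}


def extract_records(dependencies):
    """Build EAV records from verbs that have both a subject and an object."""
    subjects = {}
    objects = {}
    for dep in dependencies:
        if dep.relation == "subject":
            subjects[dep.head] = dep.dependent
        elif dep.relation == "object":
            objects[dep.head] = dep.dependent

    records = []
    for verb, attribute in DOMAIN_ATTRIBUTES.items():
        # Only verbs known to the domain table produce a record.
        if verb in subjects and verb in objects:
            records.append(EAVRecord(subjects[verb], attribute, objects[verb]))
    return records


# Hand-written relations for "The printer costs 200 dollars."
parse = [
    Dependency("costs", "subject", "printer"),
    Dependency("costs", "object", "200 dollars"),
]
records = extract_records(parse)
```

In this toy setup the domain table does the work that a real knowledge representation would do: relations whose verb is not described in the domain simply yield no record, which is one reason accuracy depends on how well the domain is specified.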

Fine-grained information extraction is as much an art as a science. Because of the complexity involved in parameterizing and fine-tuning these NLP modules for any given application, Cogilex components are not available off the shelf but only as part of customized packages for specific applications.


  © 1998-2005 Cogilex R&D