analyzing syntax

Subject: analyzing syntax, Monday October 20, 2008 12:26 am

1. A syntax analysis method for analyzing syntax and describing the grammatical function of the syntax, after establishing a morpheme dictionary program for analyzing morphemes of an input sentence, a grammar rule database for storing grammar rules, and a subcategorization database for storing the details of subcategories belonging to heads, such as stems of words and word endings, of each component of a sentence such that the syntactic status of an inflective word ending is admitted based on the marker theory which regards both postpositions and endings as syntactic units, and combination relations between words can be grammatically defined as a whole, the method comprising:

analyzing morphemes wherein if a sentence desired to be analyzed is input, the contents of morphemes are analyzed in units of polymorphemes according to the morpheme dictionary program, and after selecting an analysis case of a morpheme appropriate to the input data among morpheme analysis data by polymorpheme, preprocessing is performed; and

analyzing syntax wherein, with the analyzed morphemes, partial structures of a sentence are first established according to grammatical roles stored in the grammar rule database, and then, by using the subcategorization database, the entire structure is established, and by calculating the weighted value of each structure, the optimum case is determined and output.
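To make the two steps of claim 1 easier to follow, here is a minimal Python sketch of the flow: morpheme lookup, partial structures from grammar rules, a whole structure checked against subcategorization data, and selection by weight. Everything in it is an assumption made for illustration: the names MORPHEME_DICT, GRAMMAR_RULES and SUBCAT, the English toy words, the weights, and the simplification that the last word is the head of the sentence. The claim itself does not specify any of these.

```python
from itertools import product

# Hypothetical toy data; none of these entries come from the claims.
MORPHEME_DICT = {
    "dogs": [[("dog", "N"), ("s", "PL")]],
    "bark": [[("bark", "V")], [("bark", "N")]],  # an ambiguous word: two analysis cases
}
GRAMMAR_RULES = {("N", "PL"): "NP", ("N",): "NP", ("V",): "VP"}
# Subcategorization: which partial structures a head may combine with, and a weight.
SUBCAT = {"VP": {("NP",): 1.0}, "NP": {("NP",): 0.2}}


def analyze_morphemes(sentence):
    """Step 1: look up every word; each word yields one or more analysis cases."""
    return [MORPHEME_DICT[word] for word in sentence.split()]


def analyze_syntax(cases_per_word):
    """Step 2: build partial structures, then score whole-sentence candidates."""
    candidates = []
    for choice in product(*cases_per_word):
        # Partial structures: one label per word, taken from the grammar rules.
        labels = [GRAMMAR_RULES[tuple(tag for _, tag in case)] for case in choice]
        # Assumption: the last partial structure is the head of the sentence.
        head, dependents = labels[-1], tuple(labels[:-1])
        weight = SUBCAT.get(head, {}).get(dependents, 0.0)
        candidates.append((weight, head, dependents))
    return max(candidates)  # the highest-weighted candidate is the optimum case


best = analyze_syntax(analyze_morphemes("dogs bark"))
```

With this toy data, the verb reading of "bark" receives the higher weight, so `best` is `(1.0, "VP", ("NP",))`.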
2. The method of claim 1, wherein analyzing syntax comprises:

performing preprocessing in which whether or not there is a sentence construction included in a multiple morpheme list is determined by a multiple morpheme list program, and if there is a multiple morpheme sentence construction, the multiple morpheme construction is transformed into a multiple morpheme form, and the meanings of words are determined by a semantic feature program and are included in morphemes;

forming a partial structure by operating and repeating an internal loop, wherein if a morpheme tagged with the semantic feature part of speech is input, the morpheme is treated as an individual morpheme, and by determining according to grammatical roles stored in the grammar rule database whether or not local structure rules are applied to a morpheme selected, a local structure is formed, and by referring to a succeeding object to be processed and determining whether or not a recursive local structure is formed, an internal structure is established, and if there are no other internal structures, a following process is repeatedly performed;

forming an entire structure according to the category and a sentence construction and an expression form based on the subcategorization database and the adjunct type database;

selecting an optimum case by calculating the weight of each structure based on the location or the characteristic of a sentence construction and selecting a most important structure; and

outputting an optimum case with mobile type (tree type) linking lines such that relations among the entire structure, each partial structure, and each morpheme of the determined optimum case are correspondingly connected and indicated by the linking lines.
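The repeated internal loop and the tree-type output in claim 2 can be pictured roughly as follows. This is only a sketch under assumptions: LOCAL_RULES, the tag names (N, POSTP, V, END) and the placeholder morphemes are invented for the example, the loop simply merges adjacent nodes until no rule applies, and indentation stands in for the drawn linking lines.

```python
# Hypothetical local structure rules: two adjacent labels reduce to one label.
LOCAL_RULES = {("N", "POSTP"): "NP", ("V", "END"): "VP", ("NP", "VP"): "S"}


def form_structures(tagged):
    """Repeatedly merge adjacent nodes while any local structure rule applies."""
    # A leaf is (label, morpheme string); an inner node is (label, [children]).
    nodes = [(tag, morpheme) for morpheme, tag in tagged]
    changed = True
    while changed:
        changed = False
        for i in range(len(nodes) - 1):
            label = LOCAL_RULES.get((nodes[i][0], nodes[i + 1][0]))
            if label:
                # Replace the two neighbours with one node that contains both.
                nodes[i:i + 2] = [(label, [nodes[i], nodes[i + 1]])]
                changed = True
                break  # rescan from the left after every merge
    return nodes


def render(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    label, body = node
    if isinstance(body, str):
        return ["  " * depth + f"{label}: {body}"]
    lines = ["  " * depth + label]
    for child in body:
        lines.extend(render(child, depth + 1))
    return lines


# Placeholder morphemes n1, p1, v1, e1 stand for a noun, postposition, verb stem, ending.
tree = form_structures([("n1", "N"), ("p1", "POSTP"), ("v1", "V"), ("e1", "END")])
print("\n".join(render(tree[0])))
```

Merging a recursive local structure (here NP and VP into S) only becomes possible after the smaller structures exist, which is why the loop rescans after each merge.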
3. The method of claim 2, wherein the semantic feature program is a program for classifying the meanings of words in predetermined types, the meanings as elements for determining the syntactic characteristic of a morpheme and meaning information, such that the meanings contribute to reducing structural equivalency in a compound sentence structure and the list of adjuncts for each inflective word is determined; the multiple morpheme list program is a program performing classification by type in order to classify word features of postpositions in an identical type or suffixes having postposition functions; the grammar rule database stores information defining grammatical roles on respective primitives; the subcategorization database stores information on details of constituents that can belong to an inflective word, and forms of changeable inflective word endings; and the adjunct type database stores information on general features of postpositions, endings, or suffixes having functions similar to postpositions or endings, which determine the type of a local structure capable of being combined by a core word, as elements determining equivalency of a multiple branch structure.
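The preprocessing described in claims 2 and 3 (collapsing a multiple morpheme construction into one unit and attaching semantic features to morphemes) might look something like the sketch below. The list contents, the English stand-in "in front of" for a postposition-like expression, the feature names, and the "MULTI" marker are all hypothetical.

```python
# Hypothetical multiple morpheme list: sequences that are treated as a single unit.
MULTI_MORPHEME_LIST = {("in", "front", "of"): "in_front_of"}
# Hypothetical semantic feature classes for individual words.
SEMANTIC_FEATURES = {"school": "PLACE", "teacher": "HUMAN"}


def preprocess(morphemes):
    """Merge listed multi-morpheme sequences and tag the rest with semantic features."""
    out = []
    i = 0
    while i < len(morphemes):
        for seq, merged in MULTI_MORPHEME_LIST.items():
            if tuple(morphemes[i:i + len(seq)]) == seq:
                out.append((merged, "MULTI"))  # one unit replaces the whole sequence
                i += len(seq)
                break
        else:
            # No listed sequence starts here: keep the morpheme, add its feature if known.
            out.append((morphemes[i], SEMANTIC_FEATURES.get(morphemes[i])))
            i += 1
    return out


result = preprocess(["teacher", "in", "front", "of", "school"])
```

Here `result` is `[("teacher", "HUMAN"), ("in_front_of", "MULTI"), ("school", "PLACE")]`; words without a known feature would carry `None`.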

4. A natural language retrieval method for retrieving documents (sentences) by inputting a natural language question using a syntax analysis method based on a mobile configuration concept, the method comprising:

analyzing a document in which sentence analysis information of a document that is an object of retrieval is stored in a sentence information database by a syntax analysis method based on a mobile configuration concept wherein a subcategorization database, which stores the details of subcategories belonging to heads, such as stems of words and word endings, of each component of a sentence such that the syntactic status of an inflective word ending is admitted and the combination relations between words can be grammatically defined as a whole, is established, and if a sentence desired to be analyzed is input, the contents of morphemes are analyzed and with the analyzed morphemes, partial structures of a sentence are first established according to grammatical roles stored in a grammar rule database, and then, by using the subcategorization database, the entire structure is established;

analyzing question syntax in which in the document information database, if a question in a natural language is input, the syntax of the question is first analyzed according to the syntax analysis method based on the mobile configuration concept, the syntax analysis result is dissected in units of words according to syntax information, the interrogative sentence type of a question is captured, and a dissected, detailed question is determined;

retrieving a document in which the role of the tag of the detailed question determined in a sentence analysis dictionary is converted into a tag for retrieval according to the desired interrogative sentence type, a word having the converted tag for retrieval is retrieved in the sentence analysis dictionary, and a ranking is calculated based on the frequency of retrieval; and

displaying a result including retrieved words, sentences including tags for retrieval, and the contents of a document including the sentences.
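The retrieval step of claim 4 (convert the interrogative type into a tag for retrieval, look up words carrying that tag, rank by frequency) could be sketched like this. QUESTION_TAG, SENTENCE_DB, the role tags and the sample sentences are assumptions made for the example, not part of the claim.

```python
from collections import Counter

# Hypothetical mapping from interrogative sentence type to the tag to retrieve.
QUESTION_TAG = {"who": "SUBJ", "what": "OBJ", "where": "LOC"}

# Hypothetical sentence analysis dictionary: each analyzed sentence maps tags to words.
SENTENCE_DB = [
    {"SUBJ": "Kim", "PRED": "wrote", "OBJ": "report"},
    {"SUBJ": "Lee", "PRED": "wrote", "OBJ": "letter"},
    {"SUBJ": "Kim", "PRED": "wrote", "OBJ": "letter"},
]


def retrieve(question_type, known):
    """Return (word, count) pairs for the asked-about tag, most frequent first."""
    tag = QUESTION_TAG[question_type]
    hits = Counter(
        sentence[tag]
        for sentence in SENTENCE_DB
        # Keep sentences that have the tag and agree with every known constituent.
        if tag in sentence and all(sentence.get(k) == v for k, v in known.items())
    )
    return hits.most_common()


ranking = retrieve("who", {"PRED": "wrote"})  # "Who wrote ...?"
```

With this data, `ranking` is `[("Kim", 2), ("Lee", 1)]`, since "Kim" is retrieved from two sentences and "Lee" from one.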
5. The method of claim 4, wherein retrieving a document comprises:

performing a general retrieval mode (step) in which by using only syntactically analyzed information, and based on only the result of syntax analysis of a question, a document database already analyzed is searched and matching contents are extracted and provided; and

performing a special retrieval mode (step) in which when a special expression is included in a question, according to the selection of a retriever, retrieval conditions for special retrieval mode are generated, by special retrieval rule information and a noun system database, and based on the conditions, contents semantically dependent on a predetermined component are retrieved and provided,

wherein the general retrieval step is formed of a component matching retrieval method by which data matching direct constituents of a given question are extracted and provided, and a meaning matching retrieval method by which constituents forming a question are included and data including predicates that are core words and semantically similar predicates are extracted and provided, and the special retrieval step uses the special retrieval rule information and a database based on a semantic hierarchical structure of a noun such as a noun system database.
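The difference between the general retrieval mode (component matching and meaning matching) and the special retrieval mode of claim 5 can be illustrated with a small sketch. The data, the function names, and the use of a parent-pointer table as the noun system database are all assumptions for illustration.

```python
# Hypothetical data: similar predicates, a noun hierarchy, and analyzed documents.
SIMILAR_PREDICATES = {"buy": {"buy", "purchase"}}
NOUN_SYSTEM = {"apple": "fruit", "pear": "fruit", "fruit": "food", "hammer": "tool"}
DOCS = [
    {"PRED": "buy", "OBJ": "apple"},
    {"PRED": "purchase", "OBJ": "pear"},
    {"PRED": "buy", "OBJ": "hammer"},
]


def is_a(noun, category):
    """Walk up the noun hierarchy to see whether noun falls under category."""
    while noun is not None:
        if noun == category:
            return True
        noun = NOUN_SYSTEM.get(noun)
    return False


def component_match(query):
    """General mode: every constituent of the question must match directly."""
    return [d for d in DOCS if all(d.get(k) == v for k, v in query.items())]


def meaning_match(query):
    """General mode: also accept predicates that are semantically similar."""
    preds = SIMILAR_PREDICATES.get(query["PRED"], {query["PRED"]})
    rest = {k: v for k, v in query.items() if k != "PRED"}
    return [
        d for d in DOCS
        if d["PRED"] in preds and all(d.get(k) == v for k, v in rest.items())
    ]


def special_match(pred, obj_category):
    """Special mode: the object may be any noun under the given category."""
    return [d for d in DOCS if d["PRED"] == pred and is_a(d["OBJ"], obj_category)]
```

For example, `component_match({"PRED": "buy", "OBJ": "apple"})` returns only the first document, `meaning_match({"PRED": "buy"})` returns all three, and `special_match("buy", "fruit")` returns the first document because "apple" sits under "fruit" in the hierarchy while "hammer" does not.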