The challenge is:
- Natural language embodies meaning (semantics)
- This meaning should be extracted and translated into a different representation, ideally a mathematical one
- The interrelations between concepts should be clarified and also encoded into a mathematical representation
- A document should be analyzed against a world model or instance model that a large network may hold. The result should be a network model that represents the meaning of that document within that world model or instance model
- I distinguish between what I call model meaning and what I call instance meaning. Model meaning is something that applies to all instances (the truth of an instance), whereas an instance may differ because it has different or additional concepts or elements that do not, or do not always, apply to the model meaning. In (correct) language an instance is easily recognized by words such as "he", "its", "his", "her", "them", by things that belong to someone, or by things/concepts that have a specific name or identifier. General concepts do not have these names or identifiers.
- Encode a query into a network model translation and disambiguate if necessary. Then find all network model translations that have similarities to the key network model
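The last two points can be sketched roughly as follows. This is only an illustration under my own assumptions: a "network model" is reduced to a set of (concept, relation, concept) edges written by hand, similarity is measured with plain Jaccard overlap (my choice, not something the points above prescribe), and the names `is_instance_statement`, `similarity` and `find_similar` are hypothetical. Extracting the networks from real text is the hard part and is not attempted here.

```python
import re

# Instance markers taken from the list above; a real system would need many more.
INSTANCE_MARKERS = {"he", "its", "his", "her", "them"}

def is_instance_statement(sentence):
    """Heuristic: a sentence containing an instance marker describes an instance."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return bool(words & INSTANCE_MARKERS)

def similarity(network_a, network_b):
    """Jaccard overlap between two sets of (concept, relation, concept) edges."""
    if not network_a and not network_b:
        return 0.0
    return len(network_a & network_b) / len(network_a | network_b)

def find_similar(query_network, document_networks, threshold=0.25):
    """Return ids of documents whose network overlaps enough with the query."""
    scored = [
        (similarity(query_network, network), doc_id)
        for doc_id, network in document_networks.items()
    ]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score >= threshold]

# Hand-written networks (assumption: some earlier analysis step produced these).
document_networks = {
    "doc1": {("dog", "chases", "cat"), ("cat", "is_a", "animal")},
    "doc2": {("cat", "is_a", "animal"), ("cat", "sits_on", "mat")},
    "doc3": {("bird", "flies_to", "south")},
}

query_network = {("cat", "is_a", "animal")}
matches = find_similar(query_network, document_networks)
```

Raising or lowering `threshold` is one simple way to get a smaller or wider set of matching documents.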
What is needed is to encode that particular meaning into a key, such that this key has a specific meaning, or a range of meanings within some error, and can be used to look up the pertaining document. It should work the same way as an index in which a word is stored with references to a range of documents. Knowing the word, we can look it up in the database and retrieve all documents in which the word occurred.
For meaning, this is obviously very different and far from straightforward. There is also very likely a large margin of error in analyzing meaning: the use of synonyms adds to this error, and even a single choice of synonym might slightly change the meaning.
It would be great to choose, for example, a very long number that properly represents the induced meaning of the document, where the document itself generates a range of different possible meanings that can be expressed as numbers close to the generated one. This allows a query to be more effective and to find a wider or narrower range of documents.
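For comparison, the word-based lookup mentioned here is easy to write down. A minimal sketch, assuming documents are plain strings keyed by an id; the function name `build_word_index` and the sample documents are made up for illustration:

```python
from collections import defaultdict
import re

def build_word_index(documents):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(doc_id)
    return index

# Made-up sample documents.
documents = {
    "doc1": "The dog chased the cat",
    "doc2": "A cat sat on the mat",
    "doc3": "Birds fly south in winter",
}

word_index = build_word_index(documents)

# Knowing the word, retrieve every document in which it occurred.
cat_docs = sorted(word_index["cat"])
```

The goal would be the same kind of lookup, but with a key that stands for meaning instead of a literal word.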
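One existing technique that behaves a little like this is a SimHash-style fingerprint: each document becomes a 64-bit number, and similar inputs tend to produce numbers that differ in only a few bits. The sketch below is my own illustration, not a solution to the problem: it hashes words, not meaning, and it only handles synonyms through a tiny, hypothetical `CONCEPTS` table that I made up. The names `meaning_key` and `distance` are likewise invented.

```python
import hashlib

# Hypothetical synonym table mapping surface words to a shared concept label.
CONCEPTS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "fast": "quick",
    "rapid": "quick",
}

BITS = 64  # length of the key in bits

def concept_hash(concept):
    """Stable 64-bit hash of a concept label."""
    digest = hashlib.blake2b(concept.encode("utf-8"), digest_size=BITS // 8).digest()
    return int.from_bytes(digest, "big")

def meaning_key(text):
    """SimHash-style key: every concept votes on every bit of the result."""
    votes = [0] * BITS
    for word in text.lower().split():
        h = concept_hash(CONCEPTS.get(word, word))
        for bit in range(BITS):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    return sum(1 << bit for bit in range(BITS) if votes[bit] > 0)

def distance(key_a, key_b):
    """Number of differing bits (Hamming distance) between two keys."""
    return bin(key_a ^ key_b).count("1")

key_a = meaning_key("the car is fast")
key_b = meaning_key("the automobile is rapid")
key_c = meaning_key("birds fly south in winter")

same_meaning_distance = distance(key_a, key_b)
other_distance = distance(key_a, key_c)
```

Because both synonym sentences reduce to the same concepts, their keys are identical here. A query could then accept every document within some maximum `distance` of its own key, and widening or narrowing that maximum gives the wider or narrower range of documents.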