Monday, August 06, 2007
Knowledge has arrived!
The purpose is not just to consume the knowledge within the books. Interestingly, I've found that through my explorations of cognitive science, I've started to think in a more focused way about deeply philosophical issues like the meaning of meaning. When making life decisions and so on, it's a good idea to know why you are doing things and what you are doing them for.
So... if I don't post here for a while... that's the reason!
Thursday, August 02, 2007
The importance of Cognitive Science
The importance and application of cogsci is here now, in full: to develop new applications for computers that go beyond the usual click-and-do replication of human action. Cogsci is an adventure into the (partial) replication of human will and need, with the objective of filtering information, entities or people on our behalf.
Think about the social websites that are popping up everywhere. Why do I have to go online? Do I have the time, and do I want to browse through paaaaages of irrelevant nonsense just to find something remotely interesting? And even if I resolve one particular need at that time, what about all the other stuff that comes next?
So, why consider social networking as the action of going online in the first place? Shouldn't we think of social networking as an ambient network that is all around us, all the time? Going online is a bit boring and limiting, but the alternative of being constantly notified of every new event is pretty tiresome too.
The resolution is therefore that a computer must find information that is meaningful to us at the time. For this, it needs to find out our interests (beyond keyword matching), find out how we feel, find out where we want to go, find out what we would like to buy, and basically find out what is going on around us and with us. More sensors in our environment are likely to produce that kind of information. Smarter ways of interacting remotely (through phones) are also going to help. Smarter ways of data mining and finding relevant results are key to solving this problem.
The problem is always that a computer has very limited sensors for its environment and cannot infer or create a meaningful representation (of meaning) in the first place. It doesn't have emotional sensors to find out our mood (unless, for example, we instruct some ambient-ball gadget to indicate it). This makes the computer, for now, quite a limited and hopeless ally in our battle to filter information!
Tuesday, July 17, 2007
On the meaning of meaning...
The challenge is:
- Natural language embodies meaning (semantics)
- The embodiment of this meaning should be extracted and translated to a different representation, ideally mathematical
- The interrelations between concepts should be clarified and also encoded into a mathematical representation
- A document should be analyzed according to a world model or instance model that a large network may have; then a representative network model of the meaning of that document can be generated within that world model or instance model
- I make the distinction between what I call model meaning and what I call instance meaning: model meaning is something that applies to all instances (the truth of an instance), whereas an instance may differ because it has different or additional concepts or elements that do not always apply to the model meaning. An instance is easily recognized in (correct) language by words such as "he", "its", "his", "her", "them": things that belong to someone, or things/concepts that have a specific name or identifier. General concepts do not have such names or identifiers.
- Encode a query into a network model translation and disambiguate if necessary. Then find all network model translations that have similarities to the key network model
The necessity is to encode that particular meaning into a different key, such that this key represents a specific meaning, or a range of meanings within some error margin, which can be used to look up the pertaining document. It should work the same way as storing a word that references a range of documents: knowing the word, we can look it up in the database and retrieve all documents in which it occurs.
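The word-to-documents lookup described here is the classic inverted index. A minimal sketch in Python (the document texts are made up for illustration):

```python
from collections import defaultdict

documents = {
    1: "the meaning of meaning",
    2: "meaning is induced by the environment",
    3: "computers process one thing at a time",
}

# Build the word -> set-of-documents index the analogy describes.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.split():
        index[word].add(doc_id)

# Knowing the word, we retrieve every document in which it occurred.
print(sorted(index["meaning"]))
```

The open problem the post points at is replacing the `word` key with a "meaning" key while keeping this cheap lookup structure.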
For meaning, this is obviously very different and far from straightforward; moreover, there is very likely a large margin of error in analyzing a document's meaning (the use of synonyms adds to this error, and a single choice of synonym might also slightly change the meaning).
It would be great, for example, to choose a very long number that properly represents the induced meaning of the document, where the document itself generates a range of different possible meanings that can be expressed as numbers close to the generated one. This would allow a query to be more effective and to find a wider or narrower range of documents.
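An existing technique in this spirit is SimHash-style fingerprinting: a document's words vote on the bits of one long number, so documents with similar word usage get numbers that differ in few bit positions. A sketch (the hash choice and example word lists are my assumptions, and bit distance only approximates word overlap, not meaning):

```python
import hashlib

def simhash(words, bits=64):
    # Each word votes +1/-1 on every bit position of the fingerprint;
    # the per-position majority becomes the final bit.
    totals = [0] * bits
    for word in words:
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        for i in range(bits):
            totals[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i, t in enumerate(totals) if t > 0)

def distance(a, b):
    # Number of differing bits: the "error margin" between two keys.
    return bin(a ^ b).count("1")

doc = "the cat sat on the mat".split()
variant = "the cat sat on the rug".split()
unrelated = "stock prices fell sharply today".split()

d_close = distance(simhash(doc), simhash(variant))
d_far = distance(simhash(doc), simhash(unrelated))
# d_close is typically much smaller than d_far, so a query key can
# match "all documents within N bits" instead of one exact word.
```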
Monday, July 16, 2007
The problem: inferring meaning for computers
Well, reading Wikipedia, which is of course not the best reference on knowledge but acceptable for starters like me, I see that there are a number of very difficult problems arising when mapping meaning towards a mathematical element.
Meaning is induced by the environment and the interpretation of elements of a language. One text noted that knowledge is not stored as a linear corpus of text in the mind, but rather more like a network of elements that together represent the idea or concept. This means that rather than recalling the text corpus that describes the idea (after reading it the first time, for example), knowledge is continuously reconstructed from the stored elements that we find (individually) important and relevant. This seems to mean that memory, and the way things are stored, are very relevant for semantics. It also explains quite well how interpretation (based on experience) allows one person to totally misunderstand another, even though the language may be correct.
The problem with computers is that they are in general stateful (stacks, memory, CPU cache) and process one thing at a time. Consider for example the following paragraph from Wikipedia:
"In these situations "context" serves as the input, but the interpreted utterance also modifies the context, so it is also the output. Thus, the interpretation is necessarily dynamic".
It's easy to understand that when we process a certain corpus of text, the meaning and interpretation of that text will change as we scan it. To me this means that the analysis of a text in a single pass does not equate to the continuous, recursive analysis of that text, since the text itself is able to modify the context in which it is read. There is a feedback in the text that a computer will need to simulate. It seems that the more I read about semantics, the less I find computers able to simulate the mind processes that lead to understanding of meaning and communication of ideas. Let alone searching for it in a 400TB database (the Internet).
Besides natural language in text form or speech, we are able to make sounds and facial expressions, and we communicate through body language. The total of these elements forms a larger message that a computer cannot process. The emotional weight of certain texts is also difficult for computers to simulate.
As I have written before, it does not seem possible at the moment to reliably construct a mathematical model for semantic search that works. Only parts of the problem as a whole can be simulated (a better word is approximated).
Still, it would certainly be very interesting to see whether semantics as a whole can be better approximated by applying further matrix operations to matrices of different purposes. For example, we could use LSI and LSA to consider the relevance of one text to another on a very dry level, but multiply this with knowledge of a particular context of reference, also represented in another matrix, in the hope of finding something more meaningful.
Matrices seem very useful in the context of deriving knowledge out of something we don't really understand :). A neural network is a matrix, LSI uses matrices, and it's probably possible to come up with different matrices that represent contextual information or an approximation of context itself.
Assuming that we have a matrix for a concept or context, what happens when we apply that matrix as an operation on an LSI document representation? It may be far too early to do that, however. In order to come up with anything useful, it's necessary (from the perspective of the computer) to come up with a certain processing pipeline for semantic search.
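To make the "very dry level" concrete, here is the raw term-vector comparison that LSI starts from, before its SVD step uncovers latent structure (the example documents are invented). Notice how first-order word overlap misleads: the budget document scores closer to the ship document than the boat document does, purely because of the shared word "the":

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between two bag-of-words term vectors.
    def norm(v):
        return math.sqrt(sum(x * x for x in v.values()))
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    return dot / (norm(a) * norm(b)) if a and b else 0.0

docs = {
    "d1": "the ship sailed across the sea",
    "d2": "a boat crossed the ocean",
    "d3": "the committee approved the budget",
}
vec = {name: Counter(text.split()) for name, text in docs.items()}

# d1 and d2 share only "the", so the unrelated budget text d3
# actually scores higher than the semantically similar boat text d2.
sim_boat = cosine(vec["d1"], vec["d2"])
sim_budget = cosine(vec["d1"], vec["d3"])
```

This failure is exactly the gap the extra "context matrix" multiplication above would have to close; it is also what LSA's deeper statistical analysis claims to fix.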
These efforts probably also require us to re-think Human-Computer Interaction. A lot of our communication abilities are simply lost when we interact with a computer over the keyboard, unless we assume that our ability to communicate those concepts through language alone is very precise. As I said before, when we communicate with people who have similar experiences, the level of detail in the communication need not be very large. This is because the knowledge reconstruction at the other end happens in more or less the same way (based on rather crude elements in the communication), which means that a lot of details are not present in the text. A computer might then find it very difficult to reconstruct the same meaning or apply it to the right/same context.
A further problem is the representation of knowledge, context and semantics. We invented data-structures like lists, arrays and trees that represent elements from quite restricted sets. The choice between these structures is governed by the general operation that is executed upon them and decisions are led by resource or processing limitations. However, the data structures were generally developed on the basis that the operations on them were known beforehand and the kind of operation (and utility of each element) is known at or before processing time.
Semantic networks (or representations of knowledge and/or context) do not seem to meet this requirement:
- A representation of a concept, idea or element is never the root of things, or at least not a root that I can easily identify at the moment. Does a semantic network have a root at all? I imagine it more as an infinitely connected network without a specific parent, a network of relationships.
- The representation of a network in a computer data structure is not basic computer science.
- Traversing this network is very costly.
- The memory requirements for maintaining it in computer memory are very high as well.
- It is unclear how a computer can derive meaning from traversing the network, let alone apply meaning to the elements for which it is traversing the network.
- Even if there are specific meanings that can be matched or inferred, the processing cost is likely very high.
- The stateful computer is not likely to be very helpful in this regard.
This goes back to a philosophical discussion on what the smallest elements of meaning are and how they interact together.
Latent Semantic Analysis
http://lsa.colorado.edu/whatis.html
"As a practical method for the statistical characterization of word usage, we know that LSA produces measures of word-word, word-passage and passage-passage relations that are reasonably well correlated with several human cognitive phenomena involving association or semantic similarity. Empirical evidence of this will be reviewed shortly. The correlation must be the result of the way peoples' representation of meaning is reflected in the word choice of writers, and/or vice-versa, that peoples' representations of meaning reflect the statistics of what they have read and heard. LSA allows us to approximate human judgments of overall meaning similarity, estimates of which often figure prominently in research on discourse processing. It is important to note from the start, however, that the similarity estimates derived by LSA are not simple contiguity frequencies or co-occurrence contingencies, but depend on a deeper statistical analysis (thus the term "Latent Semantic"), that is capable of correctly inferring relations beyond first order co-occurrence and, as a consequence, is often a very much better predictor of human meaning-based judgments and performance.
Of course, LSA, as currently practiced, induces its representations of the meaning of words and passages from analysis of text alone. None of its knowledge comes directly from perceptual information about the physical world, from instinct, or from experiential intercourse with bodily functions and feelings. Thus its representation of reality is bound to be somewhat sterile and bloodless."

Having read this from the perspective of inferring meaning from a corpus of text, I think the perspectives and statements on the use of LSA or LSI are too positive for it to become anything truly useful for web search by itself alone.
A philosophical discussion on the meaning of meaning can be useful to understand how meaning is actually represented or can be analyzed. If ever we understand how meaning is derived, it should be possible to generate better approximate (mathematical?) models.
It's very difficult to infer any kind of meaning without having access to the real world the way that humans do. It would be interesting to find out what the world looks like to deaf or blind people. This should give us useful clues about the way a computer perceives a corpus of text. Moreover, maybe the way disabled people compensate can be a useful indication for compensations in LSA or LSI.
It is very interesting, though, to see how meaning and semantics can be (in limited ways) represented by a mathematical calculation. This raises the question of whether the mind itself is a large, very quick and efficient calculator, or whether it depends on certain natural processes. Personally, as I wrote in another post, I think that the mind does not rely on calculation alone, and that the model of a stack-based computer does not even come close to resembling our "internal CPU".
The intricate and complex process of deriving meaning from the environment requires an interaction between memory, interpretation, analysis and emotion. Mapping this to a computer:
- Memory == RAM and disk, probably very, very large and not always accurately represented (human memory is 'fuzzy')
- Analysis == Deconstruction of events into smaller parts
- Interpretation == The idea inferred from the sum of the smaller parts, with extra information added from memory (similar cases)
- Emotion == A lookup and induction of feelings based on the sum of the smaller parts, recalling the emotions associated with (the sum of) those events. Examples are the feelings induced when watching or reading a romantic love story, or, in other cases, the levels of stress induced by a previously suffered trauma.
These realizations lead me to believe that, in order for a semantic search to be really successful, one must replicate people's memories, emotions and contexts, and analyze each corpus of text (the Internet) within the context of that particular person. To analyze and consider the whole Internet within the context of each individual is an impossible task; if we do it based on certain profiles instead, we might be able to pull it off.
The ideal situation would be the possibility of storing "meaning", and not just keywords, from a certain corpus of text, and only later matching this meaning with intention (search). I don't think we are yet able to represent meaning in ways other than text, unless we consider LSA or LSI's large arrays of numbers (matrices) to be indications of meaning?
Ugh! Sounds like LSD might be a better means to approximate meaning :)
Friday, July 13, 2007
Semantic Intelligence
I'm very skeptical about these approaches at the moment, but I don't totally discard them. The problem with a computer is that it is a fairly linear device. Most programs today run by means of a stack, onto which information about the current execution context is pushed. Basically, it's used to store the contexts of previous actions temporarily, so that the CPU can either descend into deeper tasks or revert to previous contexts and continue from there.
I'm not sure whether in the future we're looking to change this computing concept significantly. A program is basically something that starts up and then, in general, proceeds deeper to process more specific actions, winds back, then processes more specific actions of a different nature.
This concept also more or less holds for distributed computing, in many of the ways it is implemented today. If you look at Google's MapReduce for example, it reads input, processes that input and converts it to another representation, then writes the output of the process to a more persistent medium, for example GFS.
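The read-process-write shape of MapReduce can be sketched in a few lines; this is a toy in-memory version of the idea, not Google's implementation, with word counting as the usual example:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Map: emit (word, 1) pairs; runs independently per input split.
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    # Shuffle: group all pairs by key, then reduce each group to a total.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}

documents = ["the cat sat", "the dog sat", "the cat ran"]
intermediate = chain.from_iterable(map_phase(d) for d in documents)
counts = reduce_phase(intermediate)
print(counts)  # {'the': 3, 'cat': 2, 'sat': 2, 'dog': 1, 'ran': 1}
```

Even parallelized across machines, each step is still a linear read-transform-write pass, which is the point being made here.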
In the next paragraphs I imagine a certain model, which is not an exact representation of the brain or how it works, but which serves the purpose of understanding things better. Perhaps analogies can be made to specific parts of the brain later to explain this model.
I imagine that the brain and different kinds of processing work by signalling many nodes of a network at the same time, rather than choosing one path of execution. There are exceptionally complex rules for event routing and management, and not all events will necessarily arrive, but each event may induce another node, which may become part of the storm of events until the brain reaches a more or less steady state.
In this model, the events fire at the same time and very quickly resolve to a certain state that induces a certain thought (or memory?). Even though this sounds very random, there is one thing that gives these states meaning (in this model): the process of learning. The process where we remember what a certain state means, because we pull that particular similar state from memory, and that state, at another time or in another context, induced a certain meaning. In this case, analogy is pulling a more or less similar state from memory, analyzing its meaning again and comparing that with the actual context we are in at the moment. The final conclusion may be wrong, but in that case we have one more experience (or state) to store that allows us to better define the differences in the future.
So, in this model, rather than many linear functions being processed for a result, it's as if networks of different purposes interact to give us the context or semantics of a certain situation. I am not entirely sure yet whether this means thought, or whether it is the combination of thought and feeling. Let's see if I can analyze the different components of this model:
- Analysis
- Interpretation
- Memory
- Instinct, feeling, emotion, fear, etc.
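The storm-of-events model above resembles what is called spreading activation over a weighted network: firing one node sends decaying signals to its neighbours until activity settles. A sketch, where the tiny graph, the weights, the decay and the cut-off threshold are all invented for illustration:

```python
# Hypothetical association network: node -> {neighbour: link weight}.
links = {
    "plane": {"airplane": 0.9, "geometry": 0.3},
    "airplane": {"flight": 0.8, "airport": 0.7},
    "geometry": {"surface": 0.6},
}

def spread(start, steps=3, decay=0.5, threshold=0.05):
    # Fire `start`, then propagate decaying activation in waves;
    # signals below the threshold never "arrive", and after a few
    # steps the network reaches a rough steady state.
    activation = {start: 1.0}
    for _ in range(steps):
        wave = {}
        for node, level in activation.items():
            for neighbour, weight in links.get(node, {}).items():
                signal = level * weight * decay
                if signal > threshold:
                    wave[neighbour] = wave.get(neighbour, 0.0) + signal
        for node, level in wave.items():
            activation[node] = max(activation.get(node, 0.0), level)
    return activation

state = spread("plane")
```

The resulting activation pattern, not any single node, is the "state" that would be compared against remembered states in the model.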
Well, the difference that this model shows is that semantic analysis talks about generally accepted meaning rather than individual meaning. The generally accepted meaning could be resolved by voting, or by letting people indicate their association when a word appears on screen. This seems totally wrong. If, for example, a recent event like 9/11 occurs and the screen shows "plane", most would type "airplane", and that meaning would very quickly distort the other possible meanings: a flat surface, an "astral" plane, a geometric plane, a compass plane, etc. Meaning by itself doesn't seem to bear any relationship to frequency.
If this holds true, then as soon as any model that shapes semantic analysis in computers has any relationship with frequency, that model or implementation is flawed.
Saturday, April 28, 2007
The IT worker of the future
What we all probably need to do is re-think the objectives of companies, as I wrote about in earlier posts. Whilst I am not going to even start to highlight particular companies in this post (it is irrelevant), the concept and context of a corporation and company responsibility should be evaluated. Is it meaningful that a company only aims to make money in this world? What if we, as a democracy, changed the legislation to make companies also aim for other objectives? Should we introduce protective laws against job off-shoring?
If the trend continues, then I am afraid all IT workers will need to start learning more than just IT. The technology is getting easier all the time or at least more accessible and documented. Just look on any search engine for some particular problem and it is likely that you get many possible solutions. People with lower wages in other countries have access to that very same information. How can you make a difference?
I believe that much of the actual software that is written today will become more commoditized. Not to the level of steel, just more. If this happens, then it is no longer enough to just know about technology. In the future, you will very likely need complementary knowledge in order to keep your job in your own country. You must bridge the gap between technology and some other activity that is really valuable to the business. This kind of activity is much more difficult to execute than following a requirements specification. It is also an activity that can produce a lot more value for your employer than if it were simply to acquire an existing solution.
The first thing to consider is that IT doesn't really matter. It is there to be used, but it's a bit like a truck. The truck itself provides no value just by its existence; the way it is used is what brings the cargo from A to B. So the shipment of cargo from A to B is the actual thing that provides value, not the truck moving from A to B. The actual activity of building systems is meaningless on its own, because you do it for a certain goal. That particular goal has meaning, not the system itself. The more you understand this distinction, the better and more efficient the systems you build will be.
I think that the IT people of the future can no longer sustain themselves in some richer countries unless they understand how to improve themselves to better contribute to actual goals. This is possible by applying IT knowledge on top of better, in-depth knowledge of the problem domain. Just building systems isn't going to cut it anymore. Focus on the problem, find out everything about the problem domain, find out how it really works, then shape your technical solution around that.
So, knowing about IT is still important in order to apply that knowledge to the problem. But from a value perspective, it becomes much more important to understand the actual problem domains in which we are working. Basically, you should aim to be the person who writes the specification for producing something.