Tuesday, December 16, 2008

Wisdom: Rule mining

Artificial Intelligence is very likely to gain a lot more traction in the coming decade. I think it has already started. A.I. is not a science that is solely concerned with rebuilding the human brain or just a couple of cognitive aspects. A.I. is also related to questions that pertain to the interaction of agents within a society or organization. In that sense, it's trying to combine individual decisions and individual cognitive abilities with the cognitive abilities and behaviour of that organization as a whole. A.I. is already a multi-disciplinary field of science with strong links to computer science, mathematics, philosophy, psychology, cognitive science, anthropology, management studies and possibly a couple more :). I'd like to start making the claim that A.I. isn't actually a domain of computer science. I see it primarily as the science of converting problems in other spaces (psychological, behavioural, analytic, business) into expressive models that can run on computers. So it does share a lot of "knowledge" with the computer sciences, but for A.I. running on a computer is only the last step. C.S. on the other hand has many topics which are solely related to how to run something on a computer faster or more efficiently, so it is constrained to the elements of C.S. itself.

Thus, in other words, A.I. uses Computer Science as a means to offer a model or simulation of reality. Computers are good tools to use, since they have the capacity to process mountains of information in easy steps.

A big challenge of A.I. is actually the A. part. This A. part deals with computers that only accept an explicit and deterministic language, something that we're not exactly used to. A computer was designed (although it doesn't always behave that way :) to be 100% deterministic. Every cause and effect must be clear. In other words, every cause or event needs to have the intended effects and every observed effect must have a perfectly explainable cause. Challenges still abound here in cases where events are not received or effects occur that seem inappropriate to the current context (the system is acting weird).

Yet the world doesn't always act in a deterministic way, and we don't use the same deterministic language within the same organization, not even within a single relationship. As soon as someone tries to impose a single perspective about "how the world works" on an organization, it somehow starts to fight back. Slightly different interpretations work better in different contexts. Computers can't deal with that though, since they are not truly context-sensitive.

In this case, strategies that are based on fuzzier representations of data sets can work better. The problem with those fuzzy strategies is that a computer can't derive rules from them. So, it doesn't actually gain any knowledge, other than a mathematical representation of how something works.

A very interesting academic exercise though would be an attempt to mine for rules in fuzzy data sets. Suppose that a system finds that customer A likes videos A, B and C, and customer B likes videos D, C and B. What are the common properties between those videos and how are they important for purchasing decisions? Can we actually profile customers in non-mathematical terms in this way and make statements like: "customer A likes action movies, except those with Keanu Reeves as the main actor"?

The establishment of such rules requires a lot of knowledge about the concepts that a computer is dealing with. As an example, SVD is an algorithm that can be used to analyze preferences or "like-ness" of books or videos. But it cannot state anything in our language about those concepts. If it were possible to construct phrases from such analysis, then the computer could also use that knowledge to develop (executable) rule sets.

Or maybe we shouldn't start with analysis in the first place, but start with rulesets. Develop a hypothesis and test that hypothesis (by how far it is true) through the mathematical analysis?

The ability of a computer to switch between sets of rules and mere "analytical processing", even though it was not programmed to do so in the first place, should be a very important area of research for the future. Learning for human beings is also about assimilating "knowledge statements" from our peers and then checking whether those statements are true by testing them against our experience of reality.

The outcome can be:
  • No experience on the topic, so I cannot verify if it is true or false. (insufficient evidence to validate your claim).
  • Insufficient knowledge in parsing the statement (I don't know what you mean, could you rephrase that please?)
  • That sounds interesting. Indeed I have some evidence that suggests your claim is true. Can you give me more examples?
A rule strength rating should probably also be given. We'll often find outcomes that contradict the rules. In those cases, we could be missing "except-if" cases or "and-A-and-B", where we failed to observe B being true most of the time, except for the last case where it was false.
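A rule strength rating of this kind can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions: the `SoftRule` class, the `rule_strength` function and the viewing history are all made up for illustration, and "strength" is simply taken to be the fraction of applicable observations in which the rule's conclusion held.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class SoftRule:
    """A rule that is 'generally true': it applies when `condition` holds
    and then claims that `conclusion` holds (hypothetical structure)."""
    name: str
    condition: Callable[[dict], bool]
    conclusion: Callable[[dict], bool]


def rule_strength(rule: SoftRule, observations: List[dict]) -> Tuple[Optional[float], int]:
    """Return (strength, support).

    support  = number of observations the rule applies to
    strength = fraction of those in which the conclusion held,
               or None when there is no experience on the topic.
    """
    applicable = [o for o in observations if rule.condition(o)]
    if not applicable:
        return None, 0
    hits = sum(1 for o in applicable if rule.conclusion(o))
    return hits / len(applicable), len(applicable)


# Made-up viewing history of one customer.
history = [
    {"genre": "action", "actor": "Bruce Willis", "liked": True},
    {"genre": "action", "actor": "Keanu Reeves", "liked": False},
    {"genre": "action", "actor": "Jet Li", "liked": True},
    {"genre": "drama", "actor": "Keanu Reeves", "liked": False},
    {"genre": "action", "actor": "Keanu Reeves", "liked": False},
]

likes_action = SoftRule(
    "likes action movies",
    condition=lambda o: o["genre"] == "action",
    conclusion=lambda o: o["liked"],
)

# The same rule with an "except-if" clause added.
likes_action_except_keanu = SoftRule(
    "likes action movies, except with Keanu Reeves",
    condition=lambda o: o["genre"] == "action" and o["actor"] != "Keanu Reeves",
    conclusion=lambda o: o["liked"],
)

likes_westerns = SoftRule(
    "likes westerns",
    condition=lambda o: o["genre"] == "western",
    conclusion=lambda o: o["liked"],
)

for rule in (likes_action, likes_action_except_keanu, likes_westerns):
    print(rule.name, rule_strength(rule, history))
```

On this toy history the plain rule is contradicted half of the time, the version with the exception holds in every applicable case (on only two observations), and the western rule comes back with no support at all, which corresponds to the "insufficient evidence" outcome in the list above.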

How do you design a rule-based program that isn't as explicit and hard as Prolog for example, but more like a "soft-rule" program where it accepts statements that are generally true, but not necessarily always and where the computer can verify for itself the strength of those claims as well as form others based on observed data?

Thursday, December 04, 2008

Quantum mechanics and consciousness

In quantum mechanics, the very act of human observation changes reality. That is, by observing something, you are having an effect on what you are seeing. Sounds strange, doesn't it?

Once, I watched these "ants" at work when carrying a piece of lime up a wall to their hideout. I wondered how on earth an ant society with such a limited ability to communicate (pheromones are the main way of communication) manages this. Apparently they still have other ways of communicating. Why? Because the piece of lime is heavy! Even for an ant. Still, 20 or so ants managed to work together, carrying it uphill. But how does one ant know that another is getting tired? Do they need to?

This blog post here is about "entangled minds". It is an explanation about how we use intuition as another means of sensing the world around us. If you think the idea is ridiculous that minds can be entangled, and that at a sub-atomic level, things may interact in a different way than we think imaginable (that is, not like "matter"), then...

How about this... Do you ever get this eerie feeling of being watched? And when you turn around or look around you, there *is* someone actually watching you? And you can sense this even when you have your back turned to the person?

Or how do you explain that people in different parts of the world made historic inventions roughly around the same time?

Going back a couple of posts, I considered the point that consciousness is actually nothing more than an observation of mental processes that will happen no matter what you want or do. This is a perspective on the brain as if it were a giant, powerful computer that executes things no matter what, that can be influenced by drugs (by inhibiting or stimulating receptors on cells), and that on a higher level receives pre-processed input from our senses. It is also influenced by those ideas in quantum mechanics like the observation problem.

If this holds, then just as matter could be slightly more static, we as human beings are actually also part of that matter, like the table in front of you and everything around us. If we are not as much in control of our thoughts as we think, then our thoughts are possibly also controlled by those things around us. Thus, as some movies make us believe, by observing something it falls together into a single state. But by not observing it, some object could be anywhere, anyhow and anytime. That is, it is possibly in all places at the same time. When we turn around and observe it however, it becomes static.

Now... we should probably not assume that it makes us a superhero, but maybe the idea is that by observing it we are interacting with it at a sub-atomic level.

This is surely mind-numbing to think about. We'd like to think that we're discrete, individual personalities that make up our own minds about things and have our specific achievements. But then we find out that we're just part of this big mess in a very different way.

If the above holds, then what does this mean for "causality", "determinism"? What patterns of control are there at this sub-atomic layer? How does that system keep itself in balance? And are electron patterns or executions in a computer ever going to reach this quantum state, in such a way that when a computer observes something, it is interacting with the environment? And if it is not, how is that going to affect the computer's effectiveness? Without intuition, will it ever be able to interact with an environment at all?

If we push this quantum space out of balance, will it push back?

Saturday, November 29, 2008

Singular Value Decomposition

An extremely nice tutorial about Singular Value Decomposition shows how you can extract pretty specific information from a bunch of data. I think SVD is very interesting for analyzing data from different perspectives. One perspective is the product (how close is it really to another?) and the other perspective is the customer (how close is customer A to customer B?).
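The two perspectives can be sketched with NumPy. This is a minimal sketch, not the method from the tutorial: the ratings matrix is made up, the choice of two latent dimensions (`k = 2`) is arbitrary, and the helper `cosine` is my own.

```python
import numpy as np

# Made-up ratings: rows are customers, columns are products, 0 = not rated.
ratings = np.array([
    [5, 4, 0, 0],
    [4, 5, 0, 1],
    [0, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Decompose into customer factors (U), strengths (s) and product factors (Vt).
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

k = 2  # keep only the two strongest latent dimensions (an arbitrary choice)
customers = U[:, :k] * s[:k]    # one row of coordinates per customer
products = Vt[:k, :].T * s[:k]  # one row of coordinates per product


def cosine(a, b):
    """Cosine similarity between two coordinate vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Customer perspective: how close is customer 0 to customers 1 and 2?
print(cosine(customers[0], customers[1]), cosine(customers[0], customers[2]))
# Product perspective: how close is product 0 to products 1 and 2?
print(cosine(products[0], products[1]), cosine(products[0], products[2]))
```

In this toy matrix customers 0 and 1 rate the same products highly, so they end up close together in the reduced space, while customer 2 ends up far away. The same holds for the products. Note that the numbers say nothing about why they are close, which is exactly the limitation discussed in the rule mining post.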

The problem starts to occur when people change their preferences. People generally go through phases (well, not all of us, but many!), and this is accompanied by different needs and different preferences. For this reason, A.I. designers need to understand that historical information only has limited value. The temporal trends in such analysis never come up to the surface, but I'm sure that some research is being done in this area, to further contextualize data in the realms of time.

I'm likely to speak at Sogeti Engineering World 2009, yet to be confirmed. My presentation will be about Artificial Intelligence and how it applies to business. Already now, businesses at lower levels are getting more interested in making the most out of their data. They have good knowledge about how their business works and who (in general) their customers are, but they cannot quantify their customer base from different perspectives.

My presentation will make clear how Artificial Intelligence is important to cases like response modeling, online recommendations, retention modeling and it will explain to engineers how they can apply certain techniques (borrowed from libraries) to their own problems at hand.

Where most people think of A.I. as some kind of black magic or silver bullet, I think it's important to realize that it's just juggling with numbers (at the moment). Over the past 50 years, A.I. has expanded into a number of different territories. One territory is more related to our "explicit knowledge" about things: the rule-based systems and Prolog. The other area is more related to "tacit knowledge", or what we know without being able to tell how we know it. It just works/is.

Neural networks, SVD and Kohonen maps are more mathematical constructs around the idea of tacit knowledge. We can't really trace it from input -> output, we just know it works. Languages like Prolog, on the other hand, work on the execution of basic rules or truths and demonstrate how the real world would act.

Our minds continuously sway between these two different areas of knowledge. We infer a lot of different information just through observation, sometimes supported by external teachers. But we also judge observations on truths that we have learned, or rules.

Many solutions in A.I. have depended on the combination of different techniques to offer the best solution. One solution that seems to work well now, for example, is spam assassination. SpamAssassin, now an Apache project, is one of the most popular spam-fighting schemes for email servers. It doesn't depend on a single scheme to rule out spam, but combines them as part of a certain model. Each different technique is either restraining or backing up another technique.
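The general idea of combining techniques that restrain or back each other up can be sketched as a sum of scores compared against a threshold, which is roughly how SpamAssassin's scoring works. This is a minimal sketch only: the check names, scores, `KNOWN_SENDERS` set and `classify` function below are invented for illustration and are not SpamAssassin's actual rules or API.

```python
from typing import Callable, List, NamedTuple, Tuple


class Check(NamedTuple):
    name: str
    score: float                       # positive = looks like spam, negative = looks legitimate
    matches: Callable[[dict], bool]


KNOWN_SENDERS = {"colleague@example.org"}  # hypothetical whitelist

# Hypothetical checks; each one stands in for a separate technique.
CHECKS = [
    Check("MENTIONS_FREE_MONEY", 3.0, lambda m: "free money" in m["body"].lower()),
    Check("SUBJECT_ALL_CAPS", 2.5, lambda m: m["subject"].isupper()),
    Check("MANY_EXCLAMATIONS", 1.0, lambda m: m["body"].count("!") >= 3),
    Check("KNOWN_SENDER", -4.0, lambda m: m["sender"] in KNOWN_SENDERS),
]

THRESHOLD = 5.0  # total score at or above this marks the message as spam


def classify(message: dict) -> Tuple[bool, float, List[str]]:
    """Return (is_spam, total_score, names_of_checks_that_fired)."""
    fired = [c for c in CHECKS if c.matches(message)]
    total = sum(c.score for c in fired)
    return total >= THRESHOLD, total, [c.name for c in fired]


suspicious = {"sender": "stranger@example.com", "subject": "WIN NOW",
              "body": "Free money!!! Click here"}
from_friend = dict(suspicious, sender="colleague@example.org")

print(classify(suspicious))   # three positive checks fire
print(classify(from_friend))  # the same checks fire, but the known sender pulls the total down
```

No single check decides the outcome. The three positive checks together push the first message over the threshold, while the negative score for a known sender restrains them in the second case.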

The very interesting question here is that in computers, we tend to use either RBS (Rule Based Systems) or other techniques like Neural Networks or Bayesian Belief Networks to solve a certain problem. One system is invoked before the other, as in a type of hierarchy. If we assume that the human brain only has neurons at its disposal, how can all these different techniques be applied in unison at the right time and moment? How do we know which strategy to rely on?

Tuesday, November 11, 2008

linux intrepid tricks

I upgraded to Intrepid recently and just two days ago, my system collapsed. For some reason, while opening a new tab in Firefox, the entire system just stopped functioning. No terminal, no Shift+F1, no login... So I reset, expecting things to resolve themselves. Naturally, the reboot entered "fsck", which found a number of errors. However, I couldn't leave the machine working on that since I had to leave. In the evening, I tried things again, but it got slightly worse. It took 1 hour for a single fsck run with loads of messages in between. By then, I was thinking that I could reduce the time for fsck by removing a DVD dump of one of the DVDs I own. Bad idea. As soon as I restarted and went into rw mode, I got grub "17" errors on restart. That means that the boot loader can't even resolve the partition to boot from.

I did have a live cd lying around somewhere, but that was not of great help. "cfdisk" absolutely refused to run. I could not mount from a terminal in the liveCD

( mount -t ext3 /dev/mapper/isw_xxxxxxx /mnt/target )

resulting in "superblock errors" or "partition could not be recognized" and those sorts of things.

And from within "grub", it couldn't even see /boot/grub/stage1. setup (hd0) didn't work either.

Well, searching around on the internet regularly turns up the suggestion to use the "grub" trick, or the suggestion that the root (hdx,y) setting is incorrect, but my problem clearly was a hosing of the entire file system. I thought.

Well, since I am running from a fake RAID array, I needed to remember to install "dmraid". Intrepid has this by default now, but in Feisty the repository needed to be activated through the sources.list first, followed by an apt-get update and then the install. After that, run "dmraid -ay" to get the /dev/mapper devices to work.

It makes no sense to mount a RAID-ed partition directly through /dev/sda2. You should remember that as well :). On the internet, I couldn't see very good pointers, but eventually I decided to finish where the single user mode left off: fsck.

root@recife# fsck -y /dev/mapper/isw_xxxxxxxx02

eventually ran the entire file system check and resolved looooads of errors. Mounting this on /mnt/target later did work. I could also sort of boot into the system, but because /etc was gone, it wasn't very helpful :). So the entire system got hosed, but from the Live CD system, I could rescue a couple of important files and put them onto different systems or mail them around. Thus, I didn't lose my university assignments and what have you, but the entire installed system is a loss.

I've now re-installed Intrepid from the netboot cd (downloaded from the internet) and that worked in one go. There's a guide on the internet on how to install that for fakeraid systems. It's a lot easier. Grub however still has problems getting things organized, so you should pay heed there. Also, it seems that "update-grub" doesn't work properly when menu.lst does not exist. Actually, it does attempt to ask you if it should be generated, but that doesn't work well. I ended up creating a single file "line" with a "y" in there and then adjusting the /usr/sbin/update-grub script (line 1085).

On reboot, things already worked fine, but I like to install the nvidia restricted module drivers for better performance. The screen resolution for my IIyama was still problematic though. It only got to 1024x768. Eventually, I ran nvidia-xconfig, which put in more crap into xorg.conf, then restarted xorg ( nohup /etc/init.d/gdm restart ), after which I had more options to choose from.

Right now, I think I've more or less entirely upgraded to the system I had, so I can carry on hacking and doing things. For some reason, the old system was slowing down significantly. And then there's not even a heavy registry to be supported.

Monday, November 03, 2008

Mental Causation

An old philosophical problem is the problem related to mental causation. The question relates to how a mental event can cause physical events or whether mental events are the results of physical events. In my previous blogs, I once posted about how clever we think we are. This post is sort of an extension on that. In the post, I pointed out that we consciously often consider ourselves more intelligent and better than other species, but our actions are not necessarily that much better in regard to action -> consequence. It's just more words and more fluff. In short, we easily believe that we're radically analyzing a certain situation, considering it from any angle, objectively, but when one uses hindsight to analyze the situational developments later on, we often see that the original arguments were severely misguided or didn't have any such intended effect.

In my studies, I'm now following courses on modelling. The A.I. classes are divided into one group following Collective Web Intelligence and another following Human Ambience. The latter requires understanding more about decision-making, well-being, psychology, sociology, altruism and so on. You wouldn't exactly expect it from courses in A.I., but there you go.

It's intensely interesting. One of the courses today is about emergence, which I also blogged about before. Emergence is about simple constructs which act/interact in rather simple ways, which eventually construct a new model of behaviour at a higher level. Ants are the most common example, where each individual ant follows a couple of simple rules, but the behaviour of the ant-hill overall is far more complex than the sum of the individual ants together.

You could consider the mind as not having any actual conscious thought at all. A not-so-inspiring idea is to think of ourselves as soul-less beings, within which just run a very high number of different physiological processes (100 billion neurons), shooting off electrical messages between one another whilst being impacted by a couple of hundred different proteins, which are messages from one organ to another. So, we have no specific 'soul', we're just like robots with very complex physiological processes, eventually yielding a certain behaviour that allows us to interact with others.

The ability of a neuron to form an electrical current, then, is the physiological level. Let's call this emergence level A. But forming this current together with a simple method for recognizing a previous pattern (neuron A firing off and then neuron B responding similarly because it has done so before, also known as strengthening of a synapse) is a cognitive process: it is no longer just a process of firing electrical currents between neurons, but a more complicated process of responding to certain firing patterns. Let's call this emergence level B.
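The step from level A to level B can be illustrated with a textbook-style Hebbian update ("neurons that fire together wire together"). This is a deliberately crude sketch, not a model of real neurons: the function `hebbian_update`, the learning rate and the decay value are all assumptions chosen for illustration.

```python
def hebbian_update(weight, pre_active, post_active, rate=0.1, decay=0.01):
    """Return the new synapse strength (kept between 0 and 1).

    If neuron A (pre) and neuron B (post) fire together, the synapse is
    strengthened towards 1; otherwise it slowly decays towards 0.
    """
    if pre_active and post_active:
        return weight + rate * (1.0 - weight)
    return weight - decay * weight


weight = 0.2  # arbitrary starting strength

# Neuron B responds together with neuron A twenty times in a row.
for _ in range(20):
    weight = hebbian_update(weight, pre_active=True, post_active=True)
strengthened = weight

# Afterwards A fires twenty times without B responding.
for _ in range(20):
    weight = hebbian_update(weight, pre_active=True, post_active=False)
weakened = weight

print(strengthened, weakened)
```

Each individual update is trivial (level A), but the resulting strengths encode which firing patterns have occurred together before, which is the kind of pattern memory meant by level B.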

(We then need to take a couple of too quick steps by jumping to enormous assumptions and conclusions.) If we assume that thoughts are somehow emerging from these patterns of firing neurons, then the 'memory' together with some other 'machinery' for computing and predicting the results of actions could be seen as the basis of our behaviour. Thus, behaviour in this definition is the ability to recognize, remember and predict future outcomes and then to act on those computations. The next level is our decision-making and behaviour, level C.

When you go one more level up, you get to the behaviour level of a complete society. Remember the ants? For humans, you can develop similar models, because we have a model for our economy (where each of us acts as an agent) and a model for certain criminological events, etc. The behaviour of society is made up of individual decisions at level C, but overall might develop a new emergence level D, that of the collective.

The interesting part in this consideration is that mental processes aren't so much "spirited". From the Stanford Encyclopedia of Philosophy:

(1) The human body is a material thing.
(2) The human mind is a spiritual thing.
(3) Mind and body interact.
(4) Spirit and matter do not interact.

The above four rules regard the mind as a very special kind of element, sort of like a merger of the soul with some physical abilities that the brain has (vision, smell, motor control, etc.), but decision making, emotion, etc. are considered somewhat divine.

If we simply regard the mind as a number of computations that are biologically there and thoughts and consciousness are the de-materialization(?) of certain cell assemblies becoming activated or not, then we can find ways to merge this blog story with certain theories about how DNA is actually indirectly programming us and how we serve as carrying "agents" for the continuation of the DNA structure. Thus, in that sense, we are walking biological computers, which are continuously responding to our environment, learning from it and through those processes become more efficient in the propagation of cultures of DNA.

One can wonder whether our consciousness is really that 'evolved' in the sense that it is the motor of all our cognitive processes, decisions and what have you. Are we guiding our actions and thought processes through our conscious 'participation' in this process or is consciousness the reflection of the human brain itself, which has basically already determined the best course of action and has considered each alternative? Thus, in this latter idea, consciousness is more like an observation of "mental processes" that have already taken place or are going to take place thereafter. Thus, the difference here is that we must properly identify the CPU, memory and machine and not point at the monitor screen to describe "the computer". In this analogy, consciousness is the reflection of what goes on in a computer (thus, the image on the computer monitor), but it should not be mistaken for the computer itself, which is generally more out of view, housing the CPU and memory.

What is not explained though in this entire story is the element of attention and how we are able to 'consciously' execute certain actions or pay attention to important things. Is that just a matter of directing more attention and execution power to physical events? If it is, then who's instructing our machine that something is important and should be paid attention to? Is the brain in this sense so self-preserving and intelligent that it controls itself? Or is there an externality involved which directs the attention of the machine? Or are we thinking too much in hierarchical terms and is the entire problem of decision-making the problem of weighing up cost/benefit and dealing with direct influences first vs. more indirect influences?

Thursday, October 30, 2008

Gene expressions as the process for building programs

Gene expression is the process which eventually leads to the production of a complicated protein molecule. Each protein looks slightly different and has a different role overall in the human body. When and where proteins are released is encoded in the genes. Basically, transcripts (RNAs) are created from the DNA, which could be viewed as part of a blueprint in reverse form, and from these transcripts the proteins are developed in combination with other processes (regulators). Eventually, the protein is assimilated and then it starts executing its 'designed' function. Some biologists are now working on reverse-engineering this process (thus, reverse-engineering the construction of biological processes as you could call it), back into the programming as it is contained in the DNA.

To call DNA the 'building blocks' of life is thus a bit of a misnomer. It's a very large blueprint, or rather, information store. I then think of proteins as agents, constructed through the process of translation of instructions (their purpose) from the RNA transcript. Whereas DNA is just a big information store, the proteins actively carry out the duty as laid out in the instruction set of DNA. These duties can vary significantly. Some proteins help in cell construction, others help by being an information carrier, carrying messages from one organ or part of the body to another, where they meet other proteins (called receptors), causing a biochemical response in that cell, which in turn causes another biochemical reaction which can change our behaviour.

The timing of the construction of certain cells (body development) is contained in the DNA. The DNA will ensure that certain parts of the blueprint are released at the desired time to ensure correct development and functioning. It's difficult not to be in awe of the entire design of life, and how the relatively simple function of one element in combination with other not-so-complex functions eventually leads to an emergent intelligent behaviour, or rather, a biologically balanced system.

One of the challenges in biology is how to discover where a certain protein, having a certain function, effectively was coded in the DNA. Thus... what did the information look like in the DNA structure which caused a certain protein to have its shape, size and function? Reverse-engineering that process will eventually lead to a much greater understanding of DNA itself. At the moment, this reverse-engineering is mostly done by comparing DNA strands of those individuals that have slightly different features, and then guessing where those differences are 'kept' in the blueprint. Although this is useful, it'll only give indications of what the sequence should be to produce that particular feature. It cannot yet be used to induce a feature that is different from both features observed.

The challenge for computer programs using genetic expressions however is even greater. There is no DNA yet for programs from which programs can be written. I really doubt whether it should lead to programs 'as we know them' (thus, a single DNA feature leading to one specific 'rule' or bytecode).

Imagine an execution environment in which a neural network could be executed. If the DNA contains instructions to form neurons and synapses, then the resulting network is going to be radically different from any NN design we know nowadays. If proteins govern the construction of the network and its size, then the execution environment itself can monitor available memory and take appropriate steps to regulate the proteins + network in such a way that it gives the best accuracy and yield (function?). Thus, the environment would itself be a certain part of the natural selection algorithm.

The problem remains always in the construction of code, or 'function'. The function that is contained in a neural network will generally be constrained by the preprogramming of the environment itself. That is, the execution environment will be programmed to carry out certain functions, but the execution environment itself cannot 'self-innovate' and evolve new functions over time. So, in other words, it's like saying that the functions that a program could ever develop are those functions which are emergent from the simple functions defined in the execution environment.

Nevertheless, can such an environment with only a couple of pre-programmed capabilities lead to new insights and interesting scientific results?

If we produce the following analogy: "nature" is the execution environment of the world around us. We are biological complex life-forms which rely on this 'execution environment' to act. In this sense, 'nature' is an abstract form, all around us, not represented in concrete form. Our biological processes allow us to perceive events from material elements around us (being either other humans, cars, houses, etc.). We can see the representation, hear it, touch it or otherwise interact with that representation.

Similarly, in the execution environment in the computer, we can give a program developed by a gene expression a "world" or a "material representation". It'll be entirely abstract as in bits and bytes, but that doesn't necessarily matter.

We believe the world itself is real, because we experience it as consistent and always there. But if you've seen "The Matrix" (which btw I don't believe is real :), then you can easily understand my point that the experience of something, or being conscious of something, doesn't necessarily mean that it has to be real 'as we know it'.

Back to the point, if the program doesn't know any better, it'll just be aware of its own world. That world consists of the inputs of the mouse, the microphone, internet?, keyboard and so on. The output available to it is the video card (thus the screen), the speakers and internet again. Following that pattern, the program could theoretically interface with us directly, reacting to outside real-world inputs, but always indirectly through a proxy and also indirectly provide feedback. It's as if the program always wears VR-goggles and doesn't know any better, but we can see the effects of its reasoning on-screen or through other outputs.
  • Enormous simplification of nature (biological processes) == execution environment
  • Material objects == "modified" input/output hardware channels
Of course... one needs to start with the design for the execution environment in the first place :).

Wednesday, October 22, 2008

Genetic programming

The main philosophy behind the previous article was that Genetic Algorithms do not modify the structure of a computer program, but only the contents that the program uses. A program in this case is a specific design for a neural network and other things.
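To make the distinction concrete, here is a minimal sketch of a genetic algorithm in which the structure of the "program" is fixed and only its contents evolve. Everything in it is an assumption for illustration: the fixed structure is a hypothetical function `a * x + b` rather than a neural network, and the population size, mutation width and target data are arbitrary.

```python
import random


def program(params, x):
    """The fixed structure. Evolution never changes this code, only `params`."""
    a, b = params
    return a * x + b


# Made-up target behaviour the program should approximate: y = 3x + 1.
SAMPLES = [(x, 3 * x + 1) for x in range(-5, 6)]


def error(params):
    """Sum of squared errors over the samples (lower is fitter)."""
    return sum((program(params, x) - y) ** 2 for x, y in SAMPLES)


def evolve(generations=200, size=30, seed=0):
    rng = random.Random(seed)
    population = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(size)]
    initial_best = min(error(p) for p in population)
    for _ in range(generations):
        population.sort(key=error)
        survivors = population[: size // 2]          # selection: fitter half survives
        children = []
        while len(survivors) + len(children) < size:
            mum, dad = rng.sample(survivors, 2)
            child = (mum[0], dad[1])                  # crossover of parameters
            child = tuple(g + rng.gauss(0, 0.3) for g in child)  # mutation
            children.append(child)
        population = survivors + children
    best = min(population, key=error)
    return best, error(best), initial_best


best, best_error, initial_error = evolve()
print(best, best_error, initial_error)
```

Because the fitter half of the population always survives unchanged, the best error can never get worse from one generation to the next. What the algorithm can never do is discover that a different structure, say a quadratic, would fit better. That is the limitation the previous article was about.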

The same article hinted at the assumption that we're inclined to think in states, not in dynamics, and that we're able to reason "perfectly" using explicitly defined states with clear boundaries and attributes. The idea of evolving a program's structure as in the previous post has already been suggested before, but not researched to great extent. The interesting line of thought in those articles is that the program itself is something which evolves, not just the parameters that control it.

Possibly, it's just a philosophical discussion on where to draw the boundaries. The computer itself as hardware doesn't evolve from nothingness and the next thing that engineers will claim is that it should be built 'organically' in order to come up with 'organs' (hardware elements and devices) more suited to its task.

So having said that, is there any use in laying this boundary closer to the hardware? We'll need to define a correct boundary, representing the boundary between the individual or organism and its surroundings. In this comparison, the program becomes the evolutionary organism and the hardware is the environment. The hardware then becomes responsible for evaluating the program as it starts to move about the space. Programs that misbehave or are unable to perform their intended function should be removed from the space and a new program from the population should be taken.

One complexity here is that nature itself is analog in nature and doesn't necessarily prevent or prohibit invalid combinations. That is, the design of it is very permissive. Since a computer has been designed by humans based on explicitly defined theories and models, it is not very difficult for the individual to reach an invalid state, thereby halting that individual's progress (it dies) or in worse cases, halting the environment altogether, requiring a reset. The latter, in analogy with the world around us, would mean that specific mutations in a specific individual on this earth might lead to the immediate termination of us all and a "restart" of planet earth.

So to regard the computer as a suitable environment for running evolutionary programs is a bit far-fetched, due to the, so far, required functioning of a computer, which is to behave consistently and explicitly, according to the rules that govern the execution of a certain program or operating system.

Another problem is that certain hardware is already in place and has been designed according to these explicit hardware designs (not organically grown). For example, a computer keyboard is attached to a computer and requires a device driver to read information from that device. Thus, a keyboard is input to an individual, but it's an organ which is already grown and needs to be attached to the program with all the limitations of its design that it may have. On the other hand, we could also regard the keyboard as some element in nature, for example light or auditory information, the signals of which need to be interpreted by the program in order to process it in the expected way.

Because the computer is not so permissive, it may be difficult to converge on a driver or program which starts to approximate that behaviour. There is only a small set of instructions in the gene which could lead to a valid program. In comparison with nature, it is more likely that the organism wouldn't be invalid, just that it would have features that are not as advantageous to its nature (unless the end result is the same... invalid == "dead for sure"?).

As complexity progresses, small mutations should eventually converge on an organism which is better suited to deal with its environment due to the concept of natural selection. Since a computer is so explicit about well-behaving programs and any invalid instruction anywhere might kill the program, this is reason to think about a design that builds in more knowledge of the environment in which the program operates. For example, insert the program in between inputs/outputs and let it loose within those constraints, rather than allowing it to evolve naturally in between all kinds of input/output ports, hopefully evolving into something useful.

Thus, the biggest challenge here is to find a specific, suitable grammar which can be used to form the program itself, and how the elements of that grammar can be represented by lines of genetic instructions, such that any manipulation of the genetic instructions produces valid processing instructions, never invalid ones. My preference definitely goes to runtime environments for handling such kinds of information, both because the complexity of dealing with the hardware itself is greatly reduced and because the program is able to run in a more controlled environment.
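
One way to picture such a grammar is the sketch below. It is only an illustration under my own assumptions: the toy grammar, the names decode and evaluate, and the depth limit are all made up for this example. Each number in the gene picks one alternative of a grammar rule (modulo the number of alternatives), so any list of numbers, however it is mutated, decodes to a syntactically valid expression.

```python
from itertools import cycle

# Hypothetical toy grammar: each non-terminal maps to a list of alternatives.
GRAMMAR = {
    "expr": [["x"], ["const"], ["(", "expr", "op", "expr", ")"]],
    "op": [["+"], ["-"], ["*"]],
    "const": [["1"], ["2"], ["3"]],
}

def decode(genome, max_depth=4):
    """Map a non-empty list of integers (the 'gene') to a valid expression."""
    if not genome:
        raise ValueError("genome must contain at least one codon")
    codons = cycle(genome)  # wrap around so the gene never runs out

    def expand(symbol, depth):
        if symbol not in GRAMMAR:
            return symbol  # terminal symbol, emitted as-is
        options = GRAMMAR[symbol]
        if depth >= max_depth:
            # Past the depth limit, drop alternatives that recurse into "expr",
            # which guarantees that decoding terminates.
            options = [o for o in options if "expr" not in o]
        choice = options[next(codons) % len(options)]
        return "".join(expand(s, depth + 1) for s in choice)

    return expand("expr", 0)

def evaluate(expression, x):
    """Run the decoded program; it can only see the variable x."""
    return eval(expression, {"__builtins__": {}}, {"x": x})

if __name__ == "__main__":
    gene = [2, 0, 1, 1, 0]
    program = decode(gene)
    print(program, "->", evaluate(program, 2))
```

The point of the design is that mutation happens on the list of numbers, never on the program text, so the "picky CPU" only ever sees well-formed code. Whether the resulting program does anything useful is a separate question for the evaluation step.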

Another question is how that grammar is compiled. Graph theory? Can we express symbolic reasoning which comes closer to the design of the computer, but which is not as explicit as a rule-based system?

It'd be cool to consider a program running in a JVM, which would receive a stream of bits from the keyboard and is then evaluated on its ability to send the appropriate letter to an output stream, which directs it to the screen. The challenge here is to find a correct method of representing 'construction instructions', and in what language and what manner these should be translated into valid code which is able to run on a picky CPU.

Monday, October 20, 2008

Fluid intelligence & redesigning software engineering

To the left is a little diagram which I created, which shows how I believe that we're making sense of things. The black line is a continuous line representing continuous, physical motion or action. The red lines are stops in between where we're perceiving the change. Relevance: a discussion on how we're limited in thinking in terms of actual motion or change, as opposed to state descriptions. We don't have much trouble deducing A->B->C relationships and then calling that motion, but the motion underlying the path that is undertaken cannot be properly described (since we describe it as from A to B to C).

In programming languages, it's somewhat similar. We're thinking in terms of state, and the more explicitly that state is determined, the better we're able to convey ideas to others. Also, controlling proper function relies on the verification of states from one point to another.

This may indicate that we're not very adept at designing systems that are in constant motion. Or you could rephrase that as saying that we're not very good at perceiving and describing very complex motions or actions without resorting to explicitly recognizing individual states within that motion and then inferring the forces that are pushing objects or issues from one state to another.

The human brain grows very quickly after embryonic development. In certain stages, neurons are created at a rate of 225,000 per minute. The build-up of the body is regulated by the genes. The genotype determines the blueprint, the schedule (in time) and the way your body could develop. The phenotype is the actual result, which arises from the interaction between the genotype and the environment.

The way people often reason about computers is in the state A->state B kind of way. It's always making references to certain states, or from one verified behaviour to the next. When I think of true artificial intelligence, it doesn't mean just changing the factors or data (compared to neuronal connections, neuron strengths, interconnections, inhibitions and their relationships), but the ability to grow a new network from a blueprint.

Turning to evolutionary computing, the question isn't so much to develop a new program, it's about designing a contextual algorithm void of data, which is then used as a model where factors are loaded in. Assuming that the functioning of the model is correct, the data is modified until the output approximates the desired result. This could be guided by a heuristic function, allowing "generations of data" to become consistently better.
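
As a rough illustration of that idea (not any particular library's method): the model below is a fixed straight line, and only its two factors evolve. The names model, error and evolve, the population size and the mutation strength are all assumptions picked for the example.

```python
import random

def model(params, x):
    """The fixed 'algorithm void of data': a straight line."""
    a, b = params
    return a * x + b

def error(params, samples):
    """Heuristic: squared distance between model output and desired result."""
    return sum((model(params, x) - y) ** 2 for x, y in samples)

def evolve(samples, generations=200, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(pop_size)]
    history = []  # best error seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda p: error(p, samples))
        history.append(error(population[0], samples))
        survivors = population[: pop_size // 2]  # the better half lives on
        children = [
            (a + rng.gauss(0, 0.1), b + rng.gauss(0, 0.1))  # small mutations
            for a, b in survivors
        ]
        population = survivors + children
    best = min(population, key=lambda p: error(p, samples))
    return best, history

if __name__ == "__main__":
    target = [(x, 2 * x + 1) for x in range(-5, 6)]  # desired behaviour
    best, history = evolve(target)
    print("best factors:", best, "error:", error(best, target))
```

Because the better half of each generation is kept unchanged, the best error can never get worse from one generation to the next. Note that the structure of the model never changes here, which is exactly the limitation discussed in the rest of this post.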

Fluid intelligence is a term from psychology, which is a measure for the ability to derive order from chaos and solve new problems. Crystallized intelligence is the ability to use skills, knowledge and experience. Although this cannot be compared one-to-one with Genetic Algorithms, there's a hint of similarity of a GA with crystallized intelligence. Neural networks for example do not generally reconstruct themselves into a new order, they're keeping their structure the same, but modify their weights. This eventually leads to a certain limitation of the system if that initial structure is not appropriately chosen.

The human body works differently. It doesn't start from an existing structure, it builds that structure using the genes. Those genes mutate or are different, causing differences in our phenotype (appearance). The brain creates more neurons than it needs and eventually prunes connections and neurons once those are not actually used (and thus not needed). Techniques in A.I. to do something similar exist, but I haven't (yet) come across techniques to fuse structures together using a range of "program specifiers".

The genes also seem to have some concept of timing. They know exactly when to activate and when to turn off. There's a great interaction, chemically, between different entities. It's a system of signals that cause other systems to react.

You could compare the signalling system to an operating system:
  • Messages are amino-acids, hormones and chemicals.
  • The organs would then be the specific substructures of the kernel, each having a specific task, making sense of the environment by interfacing with their 'hardware'.
  • By growing new cells according to the structure laid out in the genes (reaction to messages), the kernel could then start constructing new 'software modules' (higher-level capabilities like vision, hearing), device drivers (transducers & muscles), and so on, possibly layered on top of another.
Thus, function becomes separate from data (messages), and function itself is able to evolve and through the evolution of function, data interchange and data production will change as well. Possibly, the variability of data (types of messages) and the interpretation could change automatically, possibly regulating this data flow further. Would it become more chaotic?

It'd be good to find out if there are techniques to prescribe the development of a certain piece of software. Is a computer in any way, given its hardware architecture, capable of supporting such an evolutionary model? Kernels are pretty clever nowadays and together with hardware, they can detect memory violations and replace / remove processes or drivers when needed. They cannot however regulate the internal state of hardware once it's in an incorrect state, unless specific functions exist.

The other very important point is that there's no better heuristic for evolutionary evaluation than nature. It's there, it seems to promote balance, and it both nurtures the species and threatens them. Without a diverse system (heuristic) for an evolutionary computer, developing a smarter machine seems almost hopeless. If we assume that nature itself is 'external' to (but also part of) an organism, then we could also perceive such a computer as a black box and assess its functioning in that way. This would allow us to step back from the internal complexities (verification of state->state) and assess its functioning differently.

Emergent, efficient behaviours promoting survivability in the situation should be stimulated. The problem with this approach is that the system which needs to evaluate the performance of the agent could well be more complex than the agent that needs to evolve. Status quo?

Saturday, October 18, 2008

The invisible weight of being

Preparing for the exams, I'm taking some time off to get away from the first order predicate logic, psychology, exam training and so on. I am getting some interesting thoughts and combinations from reading through the book.

First order logic introduces the idea of quantification to propositional logic. The latter deals with atomic propositions which can be combined by logical connectives, forming a kind of statement of truth. It's the lowest level that you can go to in order to make a statement about something. First order logic expands this with quantification and predication. The difference here is that propositional logic can only convey binary (true/false) relationships between whole propositions. You could compare this with "if A, then B". So, the truth of one thing can be related to the truth of another, but nothing more.

In FOPL, you can make predicates like "if everybody can dance, then Fred is a good dance instructor". The difference with the previous statement is that there are verbs included, which are predicates of a capability or property of an element, and the elements are quantified through "everybody" or "Fred" or "there exists at least one".
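
Written out, the difference looks roughly like this. The predicate names CanDance and GoodInstructor are my own labels for the example sentence, not standard notation from the book.

```latex
% Propositional logic: two opaque propositions joined by a connective
A \rightarrow B

% First order logic: the same shape, but with quantified predicates inside
\bigl(\forall x\; \mathrm{CanDance}(x)\bigr) \rightarrow \mathrm{GoodInstructor}(\mathrm{Fred})

% "There exists at least one" uses the existential quantifier
\exists x\; \mathrm{CanDance}(x)
```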

Now... trying to apply FOPL to our own methods of reasoning about the world, I recognize that we tend to make errors. That is, we're not generally developing a full, exact model of a certain knowledge domain (that is, having each relationship between objects in that world represented by a statement in formal logic), but have rather loose associations between those objects which are used to reason between them.

The deviations in the ability to reason exactly about things (in more complicated situations) may be due to the inability to measure exactly or with great certainty, but other more common reasons include cognitive bias.

If you remain in the FOPL world, this would mean that we could develop incorrect predicates on the functioning of the world around us. Consider the following story:

"The teacher says:'People not studying may fail their exams. People that do study, welcome to the class'".

Does the above mean by exact definition that people who do not study are not welcome? We could easily infer that from the above sentence. If the teacher meant that students not studying are not welcome, he should have said: "People not studying are not welcome here", which he did not. We thus tend to infer additional (incorrect?) knowledge based on a statement that only covered part of the student group, but not all. We assumed that students not studying are not welcome, because students that do study were explicitly mentioned and were explicitly welcomed.
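
The teacher's example can be checked mechanically. A small sketch, with my own variable names: the statement is read as "studies implies welcome", and we look for a situation where that holds while the tempting conclusion "not studying implies not welcome" fails.

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

# Enumerate every combination of (studies, welcome) and keep those where
# the teacher's statement is true but the inferred conclusion is false.
counterexamples = [
    (studies, welcome)
    for studies, welcome in product([False, True], repeat=2)
    if implies(studies, welcome) and not implies(not studies, not welcome)
]

if __name__ == "__main__":
    print(counterexamples)
```

A single counterexample is enough to show that the inference is not valid: a student who does not study can still be welcome without contradicting anything the teacher said.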

So, we're not consistently reasoning with explicitly declared knowledge. We're inferring lots of different relationships from experiences around us which may be correct or not.

Learning is about making inferences: inferring information by looking at a world and attempting to test assumptions. The question is then not so much how we can test assumptions to be true, but how to develop the assumptions in the first place. Cognitive bias shows that we're not necessarily always correct in developing our assumptions, but also that we're not necessarily correct in the execution of our logic, such that we may develop incorrect conclusions even though our essential knowledge does not change.

The interesting thing about FOPL next is that the symbols used for expressing the relationships are simple. Negation, implication, quantification, that is about it. When we use language, it feels as if the verb belongs to the object itself, but in FOPL the action or capability is another associative element through an implication. Since FOPL is a way to express Boolean relationships, reasoning with uncertainty makes FOPL not so immediately useful, unless implications include a measure of certainty. But then we cannot reason using FOPL.

We could also ask to what extent a computer has the ability to develop a hypothesis, and what techniques exist for hypothesis development. Looking at humans, we have different ways of testing our hypotheses. If hypothesis testing is taken as a goal, then we need to introduce some new logic which may or may not be true and test that against our existing knowledge. We'll need to be sensitive to evidence that refutes the hypothesis as well as evidence that supports it. If the hypothesis is definitely incorrect, there's no need to look further. If the hypothesis is somewhere in between, then we're probably lacking some information or missing an intermediate level which includes other dependencies. Thus... a hypothesis may be successful in that it indicates an indirect relationship between two elements, which can only be further investigated by researching the true relationships that lie between them. A true scientific approach would then establish the goal to prove the relationship exists and, in doing so, attempt to find other relationships with potentially unrelated elements, bring them into the equation and establish sub-goals to verify the truth of the sub-relationship.

It would be very difficult for a computer to find other elements that are contextually related. If first order predicate logic is a means to describe and lay down predicates about the functioning of the world around us, what tools could we use to reverse-engineer the underlying logic of those rules? Imagine that there's a person that has never received formal education. How different is the perception of their world from a person who has? Do they use the same formal knowledge and reasoning methods?

Wednesday, October 08, 2008

A.I.: Modeling reality through supposed (uncertain) associations

If you have an account at Amazon, you may have noticed that somewhere on the screen after your login, the system produces a list of recommendations on things that you may find interesting. This is a little project started by Greg Linden. You could consider this some kind of A.I. engine. The basis of the idea is the assumption/claim that an association exists: when customer A buys book X and Y and customer B buys book X only, customer B may also be interested in book Y. This model can be extended further by logging what has been in the shopping cart at some point in time, since that is probably of interest to a person, whether or not they end up buying it.

Does the relationship really exist? Probably in x % of the cases the relationship is real, but I have bought books for my wife for example and since then, the engine keeps recommending me books on corporate social responsibility. Although I do find the topic interesting, I'd rather hear summaries about it than dive into a 400-page bible describing it :).

But such is life then. A computer has very sparse information about online customers to reason with. And once you develop such technology, it's a good thing to shout about it, since it's good marketing. However, the point of this story is not to evaluate the effectiveness of the algorithm or engine behind Amazon recommendations, it's to show that these A.I. systems are not necessarily that complicated.
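
To show how little is needed for the basic assumption, here is a minimal sketch. It is not Amazon's actual algorithm, which I don't know; the function name recommend and the scoring by overlap are my own choices for illustration.

```python
from collections import Counter

def recommend(purchases, customer, top_n=3):
    """Suggest books bought by customers who share purchases with `customer`.

    purchases: dict mapping a customer name to the set of books they bought.
    """
    owned = purchases[customer]
    scores = Counter()
    for other, books in purchases.items():
        if other == customer:
            continue
        overlap = len(owned & books)  # how much the two customers have in common
        if overlap == 0:
            continue  # no assumed association without a shared purchase
        for book in books - owned:
            scores[book] += overlap
    return [book for book, _ in scores.most_common(top_n)]

if __name__ == "__main__":
    purchases = {"A": {"X", "Y"}, "B": {"X"}}
    print(recommend(purchases, "B"))
```

With customer A owning X and Y and customer B owning only X, the sketch suggests Y to B, which is exactly the association described above. Everything beyond that (shopping carts, wish lists, profiles) is a refinement of which associations you choose to count.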

The first thing to do is to understand that modeling this space is all about finding sensible relationships / associations. In Amazon's case, this is a customer that you may be able to profile further. Do you know their age? What is their profession? Are they reading fiction/novels? Are they reading professional books? Where is their IP from? Can you find out if they're behind a firewall of a large company / university? When you send them some material to try them out, did they click your links? Did they then also buy the book? And why? Of course, you wouldn't start by finding out as much as possible, but you need to think about which properties of a customer are important and figure out a way to determine them.

At the other end of the spectrum are books, waiting for readers. A book has a category, it's got a total number of pages, it has a target age group, it has customer reviews with stars describing its popularity, some are paperbacks, others are always sold when put in the shopping cart, others are removed later, some are clicked on when you send small campaigns to a select customer group. Thus, very soon, the two domains are somehow married together in the middle, but in many different ways, of which some ways cannot be analyzed with great certainty.

A little bit of data-mining helps here to test the certainty of your hypothesis. The next step is to think of a model where you put these things together. You could consider using a neural network.... but why? That'll work well for data that is more or less similar, but can consumer behaviour really be considered that way?

Other approaches consider production rules. It's not much different from IF-THEN rules, except that you're not processing them in the order in which they are declared in the program. The problem here lies in the fact that you have millions of books that you may be able to match to millions of customers, but testing every possible combination would certainly cost a lot of processing cycles for nothing. So you need some more intelligence to wisely pre-select sets.

The ideal thing would be to develop a system that is perfectly informed. That is, it knows exactly what your interests are at a certain time and it tries to match products against those interests. Two problems here. Consumer behaviour tells us nobody is going to stop at a website to enter their interests. Second, a customer may not know they're really looking for something until they see it. The second reason being much more interesting, since it's "impulse" buying to a high degree. Exactly what you'd need.

Well, and in case you were expecting a finale where I give you the secret to life, the universe and everything.... :)... This is where it ends. There is no other final conclusion but to understand that a server in 2008 cannot have perfect information about you, especially not when you choose to be anonymous and known at the same time.

So... reasoning and dealing with uncertainty it remains. The efficiency of recommendations is highly dependent (100% dependent actually) on the relationships and associations that you assume in the model. In the case of Amazon, they started with "what customer A bought, customer B might also find interesting", and developed their concepts further to "wish lists" and mixing it with other information. That still does not capture interests that arise suddenly, which is generally what happens when changes occur in your life. You may for example start buying a house, start a new course in cooking, start a business, or have a colleague who talked about DNA and find that you think it's really interesting.

Also, chances are that once you've bought books about a subject and let's say it's technical, you're saturated by that knowledge (or author), and thus your interest wanes. The recommendations you'll see are very likely bound to the same domain. So they are not nearly as effective (except for those who are totally consumed by the subject :).

As a change, you can also attack this from a totally different angle. The information you can build up about your products can be very deep. You could theoretically use consumer behaviour to find out more about your products, rather than applying it to understand your customers better. The idea is to generate intricate networks of associations between your products. Then link those associations back to anonymous users later on. The more you know about your products and the hidden associations they may have, the more quickly you can react to anonymous demand. You could also use it to not search for books with a certain term in the title or text, but find books that are ontologically related to the term.

For example, a customer types "artificial intelligence". It's tempting to show books about A.I., but is it really what the customer is looking for? You could make this into a kind of game. Start with a very generic entry point, quickly zoom in on an "area of interest", which is interconnected with a host of products, books and other types. Then start showing 5 options that allow the user to browse your space differently. Always show 20 sample products after that. When a user clicks a specific product, it gets a score that binds the selected terms (the path) to the product, and it's shown again to other users who follow the same or a similar path. The higher a product scores (the more popular it is), the more often it'll automatically pop up.

The above model could then be expanded. The idea is that you're not just seeing products that are easily related to the domain of interest, but also products that have less obvious relationships. That allows customers to see things they wouldn't have looked for themselves and it can pique interest. It's a bit like entering a store without exactly knowing what you want. It's also a bit like searching on the internet. Who knows what you generate if you don't allow the most obvious associations, but only the less obvious ones (which could be part of the heuristics/score).
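
The scoring part of that game could be sketched as follows. The class name PathScores and its methods are hypothetical, and this version only matches identical paths; matching "similar" paths would need a distance measure that I leave out here.

```python
from collections import defaultdict

class PathScores:
    """Binds a browsing path (sequence of chosen terms) to clicked products."""

    def __init__(self):
        # path (tuple of terms) -> product -> number of clicks
        self._scores = defaultdict(lambda: defaultdict(int))

    def record_click(self, path, product):
        self._scores[tuple(path)][product] += 1

    def top_products(self, path, limit=20):
        """Products for this path, most clicked first (ties by name)."""
        scores = self._scores.get(tuple(path), {})
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [product for product, _ in ranked[:limit]]

if __name__ == "__main__":
    store = PathScores()
    path = ["artificial intelligence", "evolution"]
    store.record_click(path, "book-on-genetic-algorithms")
    store.record_click(path, "book-on-genetic-algorithms")
    store.record_click(path, "book-on-neural-networks")
    print(store.top_products(path))
```

The limit of 20 mirrors the 20 sample products mentioned above. Popular products rise to the top on their own, which is also the weakness: without some decay or randomness, early winners stay winners.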

Just be careful not to get this scenario. :)

Sunday, October 05, 2008

The relationship between rationality and intelligence

Every day, we make decisions on a continuous basis. We've come across many situations before, and can thus reliably estimate a path of best resolution for situations we have experienced. In other cases, we haven't seen much of a similar situation, but still develop an opinion or gut feeling and most likely embark on a path to resolution.

We can call our thoughts and actions rational or not rational. The word rational refers to an ability to fully explain an action, and most likely we'll agree that assumptions are not an acceptable means of forming it, unless we have data/information to back up those claims. Thus, rationality involves an act or decision that is developed from a calculation and estimate of existing experiences. Irrational thoughts and actions are the products of assumptions, incomplete data or little experience. You could closely couple rationality with logic, although rationality may be a little larger than logic. Logic requires predicates and through logic and knowledge represented in the rules of logic, one can "reason" about the validity of claims, thoughts and actions. However, since logic follows those rules only, whenever knowledge is not embodied within the rules, the system cannot appropriately confirm or deny a specific claim, thought or action.

Intelligence could be seen therefore as the ability to act outside the realms of logic and rationality, based on the premise of uncertainty, and intelligent reasoning is the ability to infer new relationships through trial and error or 'logical reasoning' with analogous material and developing gut-feel probabilities that another situation will behave in similar ways or slightly different with expectations on how it will differ (although we could be really wrong there).

Induction is the ability to estimate the outcome of a situation based on a set of assumptions, initial states, goals and effects. Deduction is the ability to find out under which conditions a situation came to be. Both are intelligent actions.

A computer is a purely rational machine. It acts within the knowledge it was given and we haven't so far agreed that computers are really intelligent. Although some systems exist that can perform very specific tasks in a very efficient way, those systems are entirely rational and cannot deduce/induce new knowledge from their surroundings (enrich themselves with new programming).

Rational is also defined sometimes as "void of emotion and bias". This bias is caused by how easy it is for you to recall memory from similar situations. Stronger emotional situations generally are easier to remember (and this is generally for the good). Many times, we're over-compensating risks related to explosions, accidents or attacks, more than what is needed to appropriately reduce the risk. Some academic research is highly biased, because the author wanted to find the evidence that his claims are true, rather than remain open to find contradictory results. Rational reasoning thus requires us to eliminate the bias, not be guided by opinion, but rely on facts and computation to come to a conclusion.

The following text is related to power and rationality:

http://flyvbjerg.plan.aau.dk/whatispower.php

The interesting question that you can derive from the text is: "How can people in important governmental positions correctly apply the power that is given to them and make rational decisions in the interest of the people they serve?".

In order to make rational decisions, we must not be biased by irrational opinion. That is... the thoughts and arguments that we come up with must be fully explainable and not be tainted by personal expectations from the leader. We can choose to trust the leader on those claims, but without any explanation given, there is little reason to provide that trust.

Artificial Intelligence in this sense can be applied to some of these problems, although it should probably not be considered leading? There are some AI programs for example in research that can be used by the justice system to analyze historical cases. A current case can then be evaluated against the historical punishments, such that the judge has an extra tool to ensure the punishment given is fair and enough, considering the situation and previous cases. Certainly, each case by itself is one to be considered individually, but the programs give an indication of the similarity. It's thus a tool for the judge to verify his own bias, if any exists.

Saturday, October 04, 2008

1+1+1=5

The title refers to the fact that in emergence, the total result of interactions of smaller elements is more than their sum. Or rather, having many small elements or organisms perform actions according to simple rules, the overall result of the following of these rules yields a new kind of behaviour of the system itself, which may far surpass the expected sum of the results.

Where some people consider a neuron the most simple building block available in constructing a network (it either fires or doesn't and it can be influenced with chemicals), each cell in our body is actually an agent by itself which on a lower level has very intricate capabilities and behaviours. A cell in itself could in a way be considered an organism, even though a human has many of those and they are interrelated.

In order to follow my drift, you should look up the article on the definition of a cell in biology: Wikipedia link
Each cell is at least somewhat self-contained and self-maintaining: it can take in nutrients, convert these nutrients into energy, carry out specialized functions, and reproduce as necessary. Each cell stores its own set of instructions for carrying out each of these activities.
Cell surface membranes also contain receptor proteins that allow cells to detect external signalling molecules such as hormones.
Or... each cell has the ability to sustain itself, has its own behavior and purpose and follows a set of simpler rules than the entire organism. Cells can generally multiply, although this depends on the type of cell. The more complicated a cell is, the less its capability to multiply. Some cells are said not to be able to multiply at all (neurons), although other research has indicated that this is not entirely the case.
Cells are capable of synthesizing new proteins, which are essential for the modulation and maintenance of cellular activities. This process involves the formation of new protein molecules from amino acid building blocks based on information encoded in DNA/RNA. Protein synthesis generally consists of two major steps: transcription and translation.
Proteins have very complicated structures and may contain specific receptors, such that certain proteins may react to chemicals in the environment. This reaction may trigger a certain kind of behavior, thereby serving a particular purpose. For example, liver cells may give off chemicals to indicate to the body that there's a falling level of nutrients, thereby causing a desire to eat:

http://en.wikipedia.org/wiki/Hunger

The chemical is detected by receptors (protein molecules):
In biochemistry, a receptor is a protein molecule, embedded in either the plasma membrane or cytoplasm of a cell, to which a mobile signaling (or "signal") molecule may attach.
So, each cell in this system has a very specific function. It monitors levels of hormones or sugar (type of molecules) and the entire functioning of the organism basically is the recognition of signatures of a certain complex molecule, generally proteins.

The proteins are specified by DNA. The DNA is a large blueprint; when it is split, each strand serves as a template from which the DNA is reconstructed. Unfortunately, during this templating process, it is possible that certain "errors" occur, which are basically mutations of the original DNA. You've started with a DNA signature that is the result of the merger of two cells, one from your father and one from your mother. From that starting set, and throughout your lifetime, the cells in different parts of your body re-use that DNA signature to renew and recreate other cells. The older you get, the more likely it becomes that a cell multiplication leads to the kind of error that gives a cell potentially fatal behaviour: cancer. The cell basically becomes rogue in that it starts to multiply quickly, thereby breaking some rules in the aggregate system. It develops a lump of some sort. When the cell also develops the ability to move (which some cells do and others don't), things become dangerous, since the cells bearing the DNA that makes them multiply at a very high rate move to other parts of the body.

Thus... in short... considering a neuron as a cell that fires and as the lowest important building block of a neural network is a grave mistake. Each cell itself has very, very complicated workings, reactions and behaviours that are each in themselves very important, as these define the simple rules of the behaviour of the cell. If the cell has behaviour which may change over time or be heavily influenced by changes in the environment, we cannot assume that ignoring that effect in a network of 100 billion neurons will make no difference compared to the view where it's considered of the utmost importance to understand and model them.

In previous posts (important numbers and statements), I've done some calculations on the memory requirements for a human brain. The result is that, assuming 4 bytes per neuron and connection, you'd need 400 Terabyte (400,000 Gigabyte) of memory in order to store all the connections and neuronal information.
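The arithmetic behind that figure can be sketched as follows. The 100 billion neurons and the 4 bytes per item come from the text; the 1,000 connections per neuron is an assumption, picked here because it reproduces the 400 Terabyte result, and the function name is made up for illustration.

```python
# Back-of-the-envelope memory estimate for storing a brain-sized network.
# ASSUMPTION: 1,000 connections per neuron (not stated in this post).

NEURONS = 100_000_000_000          # 100 billion neurons
CONNECTIONS_PER_NEURON = 1_000     # assumed average number of connections
BYTES_PER_ITEM = 4                 # 4 bytes per neuron and per connection


def brain_memory_bytes(neurons, connections_per_neuron, bytes_per_item):
    """Return (neuron_bytes, connection_bytes) for a flat storage scheme."""
    neuron_bytes = neurons * bytes_per_item
    connection_bytes = neurons * connections_per_neuron * bytes_per_item
    return neuron_bytes, connection_bytes


neuron_bytes, connection_bytes = brain_memory_bytes(
    NEURONS, CONNECTIONS_PER_NEURON, BYTES_PER_ITEM
)
TERABYTE = 10 ** 12  # decimal terabyte
print(f"neurons:     {neuron_bytes / TERABYTE:.1f} TB")
print(f"connections: {connection_bytes / TERABYTE:.1f} TB")
```

Under these assumptions nearly all of the memory goes to the connections, not to the neurons themselves.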

Now... each neuron is a complicated cell, which through changes in its immediate environments or differing levels of chemicals, could slightly modify its behaviour. It could start to fire more often or fire less. Thus, in the simplest form for a model, each neuron would need to have a threshold modifier, which is influenced by another system, to regulate its individual responsiveness.
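A minimal sketch of what such a threshold modifier could look like, assuming a very simple neuron that fires when its summed input reaches a threshold. The class and attribute names are hypothetical and the numbers are arbitrary.

```python
class Neuron:
    """Toy threshold neuron whose responsiveness can be modulated."""

    def __init__(self, base_threshold):
        self.base_threshold = base_threshold
        # Set from outside by another (e.g. chemical) system; 0.0 means neutral.
        self.threshold_modifier = 0.0

    def effective_threshold(self):
        return self.base_threshold + self.threshold_modifier

    def fires(self, weighted_input):
        """Fire when the input reaches the (modulated) threshold."""
        return weighted_input >= self.effective_threshold()


neuron = Neuron(base_threshold=1.0)
print(neuron.fires(0.8))            # input stays below the base threshold

# Another system lowers the threshold, making the neuron more responsive.
neuron.threshold_modifier = -0.3
print(neuron.fires(0.8))            # the same input now reaches the threshold
```

The point of keeping the modifier separate from the base threshold is that the regulating system can change responsiveness without touching what the neuron has learned.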

If we take into account that besides the processing of signals, the brain also responds strongly to chemical changes brought about by external factors, such as fear or emotions, then one could say that "emotive neurons" are those neurons that give off proteins of a certain type on the recognition of danger, causing other neurons to become much more responsive in their processing of signals. The exact level of chemicals produced is dependent on the number of cells that would produce a certain chemical and how strongly they produce it. Since this also depends on learning, the question remains whether the producer learns to produce less or whether the signal processors inhibit the signal more as soon as it is observed.

Thus... there may be three very complicated effects at work in the human brain, relating to consciousness and our efficiency of acting in our environment. We have the neuron cells, which I see as pattern recognizers, which also learn and where sub-assemblies of neurons work together to create a learning experience (process signal, recognize situation, provide stimulus to react to situation, verify effectiveness of reaction, reduce/increase stimulus).

And then there are the glial cells, which are often said to outnumber neurons by a factor of 10, but which have so far not been researched in great detail. Could it be that there's a secret in the interaction of neurons + chemicals and the glials that together, as three complicated systems, produce that which we call "consciousness"?

Monday, September 29, 2008

memory network

I've looked into some links about constructions of memory networks. I'm dreaming up a design for modeling an NN within a computer memory where it's better aligned or fitter for fast processing or recollection:

http://www.ploscompbiol.org/article/info:doi%2F10.1371%2Fjournal.pcbi.0030141

http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=AD0671492

http://cat.inist.fr/?aModele=afficheN&cpsidt=16334395

Food for thought when going to work today :).

Friday, September 26, 2008

On the use of memory

The difference between memory in computers and organic memory is a strange thing. The memory in computers is linearly aligned and stores elements very differently than organic memory. Where the storage area is not large enough in computers, we tend to use databases to make the memory "searchable" and malleable.

Computers generally need to trawl through the entire space in order to find something, albeit with clever algorithms rather than an exhaustive search. We're modeling the world around us into a specific definition to work with and then compare new input to those things we know.

Organic memory is very different. Through the input of elements and situations, the recall of other elements in memory seems activated automatically without the need for a search. As if very small biological elements recognize features in the stream of information and respond. This may cause other cells to become activated too, thereby leading to the recognition of one set of features with previous experiences. It's as if the recognition drifts up automatically, rather than that a general CPU searches for some meaning based on a description or broken down representation.

The problem with computers, then, is our limitation / lack of imagination in representing memory in a non-linear form. The way computers are modeled requires the programmer to define concepts, elements and situations explicitly, or less explicitly using rules, and then search, within a certain space, for similar situations and reason further from there. The computer is totally lost if the situation cannot be mapped to anything.

I was thinking lately that it would be a ground-breaking discovery if memory could be modeled in similar ways to organic memory. That is, the access to memory being non-linear and a type of network, rather than developing a linear access algorithm. If you consider a certain memory space where a network may position its recognition elements (neurons?), then the connection algorithm of a current computer should map the linear memory differently into a more diffuse space of an inter-connected network. Basically, I'm not saying anything different than "create a neural network" at this time, but I'm considering other possibilities to use the mechanical properties of memory access in a clever way and as such to reduce the required memory for connection storage and to figure out a possibility to indicate "similarity" by the proximity of memory address locations.

Or, alternatively to that, use a neural network to determine a set of index vectors that will map to a large linear space. The index vectors can be compared to a semantic signature of any element. This signature should be developed in such a way that it is categorizing the element from various perspectives. Basically, considering the ability for semantic indexing of text, the technique is used to find texts that are semantically similar.

The larger the neural network, the finer its ability to recognize features. But our minds do not allocate 100 billion neurons to the (same) ability of pattern recognition. Thus, you could talk of specialized sub-networks of analysis that together define a certain result (binding problem).

But perhaps we're thinking too much again in the terms of input-processing-output as I've indicated before. We like things to be explicitly defined, since it provides a method of understanding. What if the networks don't work together in a hierarchy (input->network1->network2->output->reasoning), but work together in a network themselves?

Then this would mean that such a network could aggregate information from different sub-networks together to form a complicated mesh itself. The activation of certain elements in one part could induce the activation of neurons in another, leading to a new sequence of activation by reasoning over very complicated inputs of other neuronal networks. For example, what if thinking about a bear causes our vision analysis network to fire up / become induced and then produce a picture of such a bear?

Imagine a core of a couple of neural networks that have complicated information available to them from "processing" neural networks before them. If those core networks are interconnected in intricate ways and influence one another, then it's likely that a smell causes the memory of a vision or hearing in another network, albeit slightly weaker than normal.

Leaving this thought alone for now...

Thursday, September 18, 2008

Reusable A.I.


A miscellaneous crazy thought I had today was to look into the potential of reuse for artificial intelligence algorithms or networks. Think sharing the factors, definitions, neurons... Persisted artificial neural networks, which are only possible in computers because their state is deterministic and extractable from memory.

The different functions of sight, smell, hearing and so forth are always seen as very specific properties of a system and it's specialized after that. What would happen once the networks are interchangeable, pluggable and can start talking to one another? Even then we'll need to make decisions on which networks to support and which not to, but next to this problem, the underlying platform... the glue between the networks, could serve as a basis for connecting grid / cloud computing of A.I. networks that each serve a particular purpose.

Some very complicated problems require some heavy processing power. I strongly believe that subdividing the problem space into smaller problems, each with its own expertise and domain, can help to significantly decrease the eventual complexity and required hardware.

Bayes theorem

In the artificial intelligence courses, we've made a start on exploring Bayes' theorem. In simpler words, Bayes is about discovering (strengths of) relationships between cause and effect. You look at an event, develop a hypothesis about an occurrence which may have led to that event, and express the chance that the event occurs given that the hypothesis (the potential cause) is true. The theorem then lets you turn that around: given that the event was observed, how probable is the hypothesis?

This theorem is used in medical analysis for example (what is the chance for a person to have meningitis, considering the person has a headache?). Of course, the theorem can be expanded by combining two hypotheses. When both occur at the same time, *then* what is the chance for the event (meningitis) to occur?
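To make the meningitis question concrete, here is a small sketch of the calculation. The three input probabilities are made-up numbers purely for illustration, not medical data, and the function name is hypothetical.

```python
def bayes_posterior(prior, likelihood, evidence):
    """P(hypothesis | event) = P(event | hypothesis) * P(hypothesis) / P(event)."""
    return likelihood * prior / evidence


# ASSUMED, illustrative numbers (not real medical statistics):
p_meningitis = 1 / 50_000            # P(meningitis): prior for a random person
p_headache_given_meningitis = 0.9    # P(headache | meningitis)
p_headache = 0.1                     # P(headache): any cause at all

p_meningitis_given_headache = bayes_posterior(
    prior=p_meningitis,
    likelihood=p_headache_given_meningitis,
    evidence=p_headache,
)
print(f"P(meningitis | headache) = {p_meningitis_given_headache:.5f}")
```

With numbers like these, the headache makes meningitis several times more likely than before, yet it remains very improbable, because headaches are common and meningitis is rare.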

For another direct application, consider credit-card fraud. If you have a large training set and you have a large database of credit-card fraudsters, you could determine from the data-set the probability that a female, a certain age group or people from a certain neighborhood/city/area commits credit card fraud. You could theoretically even come up with a number for a credit card applicant to state the probability the person would commit credit card fraud and so on. Of course, you have to maintain the view that you're dealing with probability, not certainty.

Bayes can also be used by learning systems to develop associations or relationships. It doesn't thus produce a boolean true|false relationship between elements, but a probability relationship. Theoretically, I think it should even be possible to organize different kinds of relationships (dependencies, associations, classifications) using Bayes just by looking at a large data-set. The problem here is that such an engine shouldn't be looking at all possible combinations of cause / effect, but logically reason within those, so make deductions about possibly sensible combinations.

Then one can question whether we as human beings absolutely exclude some silly statements. If we did employ true|false for each hypothesis with nothing in between, then we would have trouble understanding the world around us too, since it's full of exceptions. Does this suggest that some sort of Bayesian theorem is at the basis of our association determinations in a neural network?

http://www.inference.phy.cam.ac.uk/mackay/Bayes_FAQ.html

So, it's interesting to read about this...

Wednesday, September 17, 2008

Semantic priming

Psychology and neuro-science have done research on the effects of semantic priming. Priming is the effect where you're prepared to interpret a certain word, vision or thing in a certain way. When you're primed to react to a soon-to-be-fired stimulus and you're paying attention to it and the stimulus actually occurs, the reaction time is very quick. As soon as another stimulus is given however, the reaction to that is generally very slow.

I think priming is very interesting in the context of semantic processing of text or the world around us, in that a context of a story may also prime us for what comes next. That is, we always build up some kind of expectation of what will happen next and read on happily to see what's really going to occur.

Some time ago, I posted something about latent semantic indexing. It's more or less an indexation of latent semantics. Latent means:

1. present but not visible, apparent, or actualized; existing as potential: latent ability.
2. Pathology. (of an infectious agent or disease) remaining in an inactive or hidden phase; dormant.
3. Psychology. existing in unconscious or dormant form but potentially able to achieve expression: a latent emotion.
4. Botany. (of buds that are not externally manifest) dormant or undeveloped.

So, latent means hidden or dormant. It's about semantic meaning that you can't see inside the text, but use as a key for indexing that meaning or definition. In other posts, I doubted the viability of constructing formal knowledge systems (where knowledge is explicitly documented), due to the huge size and the huge efforts required in defining this knowledge (and the obvious ambiguities and disagreements that go along with it). Other than that, knowledge is also dynamic and changing, not static.

Considering priming thus and binding this with a technique for latent indexing, one could achieve a system where related symbols are primed before they are interpreted. Given different indices for vision, smell, audio and somatosensory information, each specific index could eventually (without saying how) be made to point to the same symbol, thus strengthening the interpretation of the world around a robot or something similar.

Thus, rather than explicitly defining relationships between concepts, consider the possibility of the definition (and growing) of indexed terms which partially trigger related terms (prime the interpreter), as the interpreter moves on to the next data in the stream. This could allow a system to follow a certain context and distinguish relationships of things in different contexts, because the different contexts have different activation profiles of each symbol.
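A rough sketch of this priming idea, assuming a tiny hand-made table of associations. The terms, the association strengths and the decay factor are all invented for illustration; in the scheme described above they would have to grow out of the indexing itself.

```python
# Hypothetical association strengths between indexed terms (made-up values).
ASSOCIATIONS = {
    "bank": {"money": 0.6, "river": 0.6},
    "loan": {"money": 0.7, "bank": 0.5},
    "fishing": {"river": 0.7, "bank": 0.5},
}
DECAY = 0.5  # assumed: how much priming fades with each new term in the stream


def read_stream(terms):
    """Return the activation profile after reading the terms in order."""
    activation = {}
    for term in terms:
        # Earlier priming fades a little with every step.
        activation = {t: a * DECAY for t, a in activation.items()}
        # The term itself becomes fully active.
        activation[term] = 1.0
        # Related terms are partially triggered (primed), capped at 1.0.
        for neighbour, strength in ASSOCIATIONS.get(term, {}).items():
            activation[neighbour] = min(1.0, activation.get(neighbour, 0.0) + strength)
    return activation


finance = read_stream(["loan", "bank"])
nature = read_stream(["fishing", "bank"])
print(finance["money"] > finance["river"])   # "bank" read in a money context
print(nature["river"] > nature["money"])     # "bank" read in a river context
```

The same word "bank" ends up with a different activation profile depending on what came before it, which is the context-following behaviour the paragraph above is after.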

Coupling this with probability algorithms, it would be interesting to see what we find. In fact, using probability is the same as the development of a hypothesis or "what-if" scenario. Whereas a certain relationship does not yet exist, we seek ways to prove the relationship exists by collecting evidence for it.

Some other activities that we learn are subconsciously learned. That is, the action/reaction consequences of throwing an object and having it drop on the floor. If the object is of metal, it probably won't break. If it's made of glass, it'd probably shatter. Those things are not obvious to children, but can quickly be learnt. Glass is transparent, feels a certain way, and there are a number of standard elements which are generally of glass. Plastic looks similar, but makes a different sound. We should aim to prevent dropping the glass on a hard floor. This bit of knowledge is actually a host of different relationships of actions, reactions, properties of objects either visible or audible and by combining these things together, we can reason about a certain outcome.

The psychology book also importantly notes the idea of attention. It specifically states that when attention is not given, performance of analysis, reasoning or control drops significantly. This means that we're able to do only one or two things at a time. One consciously, the other not so. Even then, it's the entire mind, with its control, audible and visible verification mechanisms, that controls the outcome.

The interesting part of this post is that it assumes that symbols as we know them are not named explicitly by natural language, but are somehow coded using an index, which has been organized in such a way that neighboring indexed items become somewhat activated (primed) as well to allow for the resolution of ambiguities. An ambiguity is basically a fork between two paths of meaning, where the resolution should come by interpreting further input or requesting input from an external source in an attempt to solve it (unless assumptions are made about what it means).

Another thing that drew my attention is that recently and strongly primed symbols may be primed strongly in the future, independent of their context. This is mostly related to audio signals, for example the mentioning of your name. You could be in a pub hearing a buzz, but when your name is called somewhere, you can recognize it immediately within that buzz (thus, the neurons involved in auditory recognition are primed to react to it).

It's probably worthwhile to extend this theory by developing the model further and considering human actions, reasoning, learning and perception within that model (as opposed to building a network and trying out how it performs). Since it's already very difficult to re-create human abilities using the exact same replicas of biological cells, why not consider simpler acts and verify parts of this reasoning with such a smaller network?

The first elements of such a network require a clever way of indexing signals and representations. In this way, the indexing mechanism itself is actually a clever heuristic, which may re-index already known symbols and place them in a different space. The indexing mechanism doesn't feel static.

Monday, September 15, 2008

Pre-processors of information

The psychology course I'm taking requires reading through a pretty large book (albeit in not too small type and with loads of pictures). The sensory system is explained, so it's sometimes more like a biology book. It's basically stating that rods are to analyze dim-lit places and cones are for richer-lit places. The cones can discern color and have lower sensitivity.

Researchers have determined that right after the light is transduced by the cones and rods, nerve cells already start pre-processing the information. You should compare this pre-processing to the execution of an image filter in Photoshop. It runs some edge detection filters for example, improves contrast here and there and then sends the result on to the primary visual cortex for further analysis.
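To give an idea of what such an edge detection filter does, here is a small sketch using the well-known Sobel kernel on a tiny made-up image. This is only an analogy in code: it says nothing about how nerve cells actually compute, and the image and function name are invented for the example.

```python
# Sobel kernel that responds to left-to-right changes in brightness.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def horizontal_edges(image):
    """Apply SOBEL_X to the interior pixels of a 2D list of brightness values."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]  # border pixels stay 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    total += SOBEL_X[ky][kx] * image[y + ky - 1][x + kx - 1]
            result[y][x] = abs(total)
    return result


# A tiny made-up "scene": dark on the left, bright on the right.
scene = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]
for row in horizontal_edges(scene):
    print(row)
```

The flat dark and flat bright areas produce zeros, and only the pixels around the dark-to-bright boundary get a high value. What gets passed on is the edge, not the full picture.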

I'm doing some personal experiments by looking at a scene for 1-2 seconds, then closing my eyes and attempting to reconstruct that scene. Looking at it for longer or more frequently makes the image more perfect, but the actual image with eyes open has a lot more information than that which I can reliably reconstruct. Or rather... I can reconstruct details of A tree or road, but possibly not THE tree or road out there. So my belief system of what I think a tree is or a road is starts to interfere. The interesting thing is that the actual image I can reconstruct is mostly based on the edges and swaths of colors.

Example detail image

Example "memory" image

It's not very clear (due to time constraints), but the middle part of the picture still has the highest detail, whereas the sides of it have less due to peripheral vision.

The mind thus deconstructs the scene first by edge detection, finding lines, but at the same time highly depends on the ability to identify complete objects. Very small children for example are already surprised or pay attention when objects that were thought to be together suddenly seem to be actually apart.

It does take some time to identify something that we've never seen before, but pretty quickly we're able to recognize similar things, although we may not know the expert name for it.

By deconstructing the scene, you could say it also becomes a sort of "3D world" that we can personally manipulate and visualize further (mind's eye). So I don't think we're continuously re-rasterizing heavy and complex objects, but have the ability to consider an object whole by its edges/traces, then rotate it, translate it or do with it as we please.

In these senses, the sciences that deal with signal processing and so on should depend on these techniques heavily. It is possible to recognize objects through their pixels, but perhaps by running filters on them beforehand, the features are more easily detected and the pattern recognition mechanism might just be significantly better. Thus... the way in which signals are presented probably always requires pre-processors before they are sent to some neural network for further processing. In that sense, the entire body thinks, not just the brain.

Thursday, September 04, 2008

A.I. is about search

Well, so far I've enrolled in a couple of courses. One of them is "A.I. Kaleidoscope". It's a great course with very good course material, including exceptional material from the professor.

The book I'm reading, called "Artificial Intelligence", is very well written and highlights a couple of philosophical understandings as well as explains the mathematical underpinnings of A.I. that have been established so far. So it's quite a broad area it is discussing.

One of the statements I came across is that A.I. is about searching problem spaces. Whereas some problems have algorithms, other problems have a state space, in which situations are mapped onto states and transitions from one state to another are shown as arcs in a graph. That brings the discussion to graph theory and trees, and to breadth-first and depth-first searches, heuristics, and so on.
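As a small illustration of searching a state space, here is a sketch of a breadth-first search over a graph of states. The graph itself is made up; a real problem would generate its states and arcs from the rules of the domain.

```python
from collections import deque

# A made-up state space: each state maps to the states reachable from it.
STATE_GRAPH = {
    "start": ["a", "b"],
    "a": ["c"],
    "b": ["c", "goal"],
    "c": ["goal"],
    "goal": [],
}


def breadth_first_search(graph, start, goal):
    """Return a path with the fewest arcs from start to goal, or None."""
    frontier = deque([[start]])   # paths still to be extended, oldest first
    visited = {start}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:
            return path
        for next_state in graph.get(state, []):
            if next_state not in visited:
                visited.add(next_state)
                frontier.append(path + [next_state])
    return None


print(breadth_first_search(STATE_GRAPH, "start", "goal"))
# prints ['start', 'b', 'goal']
```

Popping from the right end of the frontier instead of the left (a stack rather than a queue) would turn this into a depth-first search, which no longer guarantees the path with the fewest arcs.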

The idea is thus that A.I. is about mapping knowledge within a certain domain and understanding the phases or steps that an expert goes through in order to come to a reasonable conclusion (reasonable meaning not necessarily optimal, but certainly acceptable).

In previous posts, I sometimes discussed that we human beings aren't necessarily purely rational, but act emotionally as if we're programmed. We think we're exceptionally clever though. Well, another part of the book discusses the fact that we only consider things intelligent that act in ways that we ourselves would and could do. You could argue for example that intelligence shouldn't be subject to such a "narrow?" definition. But philosophically, there is no common agreement on the actual definition of intelligence, so this discussion isn't that useful at this time (within a blog that is).

I'd like for the moment to set aside the general "folk" consensus that intelligence is solely determined by human observation and imitation (dolphins are considered intelligent because they seem able to have intricate conversations in their speech and behave in seemingly human ways to our stimulus and interactions). Taking thus a slightly wider interpretation of intelligence, and accepting the statement that "A.I. is about search", you can only conclude that Google built the most intelligent being on the planet. It's capable of searching through 70% of the internet at lightspeeds, you always find what you're looking for (unless it's too specific or not specific enough) and so on. Now, perhaps the implementation isn't necessarily intelligent, but the performance of the system surely demonstrates to me, using my browser, that it's a very intelligent system.

One important branch in A.I. is about "emergence", something I blogged about recently. It's when simple individuals within their own little context and environment execute actions, which within a greater context build up to a very intricate system that no individual could control, but together displays highly sophisticated attributes of intelligence. An example could be free market mechanisms. You could say that the information that a single individual has to control the logistics of vegetables in a single city would be limited, and most likely a single individual couldn't optimize this task. But all vegetable sellers in a certain city are very likely to be apt in optimizing their local inventory in such a way that it has least waste and optimal profit. Optimal profit means having just enough for people in their local environment to benefit, but not too much to have it thrown away.

These "agents" as they are called in A.I. act on their immediate environment. But taken together on a higher level, their individual actions contribute to a higher level of intelligence or optimization than a single instance, computer, individual or thing could possibly achieve if it were to try to understand the entire problem space and optimize in it. The core of the above is that many intricate and complex systems consist of simple agents that behave according to simple rules, but by consistently applying these rules, they can achieve "intelligence" that far exceeds their individual capacity.

So, A.I. does seem to be about search, but it's not about finding the optimal. Maths is about finding optimals and truths, it's an algorithm, thus (must be/needs to be) absolute and consistent. A.I. is about a problem space, possible solutions and trying to find optimal solutions (applying "intelligence") as best as you can, but always taking into account the cost to get there.

Humans don't always find optimal solutions to problems. They deal with problems at hand and are sometimes called "silly" or "stupid" by other humans (agents).

One of the things I liked about the book is that "culture" and "society" are instrumental to intelligence. It clearly suggests that there's a need for interaction for intelligence to occur. In fact, for intelligence to exist. It highly suggests that intelligence is thus cultural, but also infused and created by the culture itself.

If A.I. is about search, and more recent posts are about semantic models, where does this leave neural networks? I think the following:
  1. You can't build a human brain into a computer due to memory, bandwidth, cpu and space constraints. So forget about it.
  2. A.I. shows that you can model certain realities in different ways. There are known ways to do this through graphs, but those graphs have too harsh and clear relationships between them. They should be softer.
  3. Searching a space doesn't exclude the possibility of indexing knowledge.
  4. Relational databases may have tables that have multiple indices. Why not knowledge embedded in A.I. systems with multiple entry points, based on the input sensor?
Thus... what if we imagine an A.I. which in some ways behaves unlike the human brain but in other ways like it, and which uses multiple "semantic" indices for interpreting certain contents and contexts?

Latent Semantic Indexing is a technique to describe a certain text and then give it some sort of index (rating?). You could then do the same to another piece of text and compare the two. The degree to which the two are alike gives a certain score for the similarity. Thus, LSI could serve as a demonstration of the technique for semantic indexing (and possibly storage) of other receptors as well (sensors/senses).
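The "index two texts, then compare" step can be sketched like this. Note that this is not actual LSI, which works on a reduced form of a whole term-document matrix; it is a much cruder stand-in that only counts words and compares the counts with cosine similarity. The function names and sample sentences are made up.

```python
import math
from collections import Counter


def signature(text):
    """Turn a text into a crude 'index': a count of each lowercase word."""
    return Counter(text.lower().split())


def similarity(sig_a, sig_b):
    """Cosine similarity between two word-count signatures (0.0 to 1.0)."""
    # Counter returns 0 for words that are missing from sig_b.
    dot = sum(sig_a[word] * sig_b[word] for word in sig_a)
    norm_a = math.sqrt(sum(c * c for c in sig_a.values()))
    norm_b = math.sqrt(sum(c * c for c in sig_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc_a = signature("the bear eats fish in the river")
doc_b = signature("a bear catches fish near the river")
doc_c = signature("the bank approved my loan")

print(similarity(doc_a, doc_b) > similarity(doc_a, doc_c))
```

The first two sentences share more words and therefore score as more alike. The weakness of plain word counting is that two texts about the same thing in different words score low, which is exactly the gap that looking for latent (hidden) semantics is supposed to close.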

Imagine that a computer has access to smells (artificial nose), images (camera), audible sounds (microphone) and so on and it has the ability to maintain a certain stream of this information in memory for a certain amount of time. The information together is a certain description of the current environment. Then, we code the current information using an algorithm yet to be constructed such that it can be indexed. And we create a symbol "A" in the table (the meaning) and create indices for smell, vision and hearing to point to A. Any future perception of either the smell, or the vision or the hearing might point to A, but not as strongly as when all indices point to it (confusion).
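A minimal sketch of that symbol table with one index per sense. The coding algorithm is still "yet to be constructed", so the perception codes below ("wet-fur" and so on) are simply made-up placeholders for its output, and all function names are hypothetical.

```python
from collections import defaultdict

# One index per sense; each maps a coded perception to the symbols it points at.
# The codes stand in for the output of the coding algorithm the text leaves open.
indices = {
    "smell": defaultdict(set),
    "vision": defaultdict(set),
    "hearing": defaultdict(set),
}


def learn(symbol, perceptions):
    """Make each sense's index point from the perceived code to the symbol."""
    for sense, code in perceptions.items():
        indices[sense][code].add(symbol)


def recall(perceptions):
    """Score each symbol by the fraction of senses whose index points to it."""
    votes = defaultdict(int)
    for sense, code in perceptions.items():
        for symbol in indices[sense].get(code, ()):
            votes[symbol] += 1
    return {symbol: count / len(indices) for symbol, count in votes.items()}


learn("A", {"smell": "wet-fur", "vision": "brown-shape", "hearing": "growl"})

print(recall({"smell": "wet-fur"}))
print(recall({"smell": "wet-fur", "vision": "brown-shape", "hearing": "growl"}))
```

A single matching sense points to A only weakly, while all three senses together point to it at full strength, like a table with multiple indices that all lead to the same row.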

The problem space in this example is more limited to the combination of the senses and what it means and searching for possible explanations within each "sense" area.

The difference with more classic A.I. is that the classic version attempts to define context and define reality IN ORDER to classify it. The above version doesn't care much about the actual meaning (how we experience or classify it with our knowledge after x years of life). It cares about how one situation is similar to another one. In that sense, the definition of meaning is about how similar some situation is to another.

Now... if the indices are constructed correctly, similar situations should be close to one another. Thus, a computer should be able to quickly activate other memories and records of possibly similar situations.