PRACTICAL THINKING. — TECHNOLOGY USER. — PyLogTalk? ... [ Word count: 3600 ~ 15 PAGES | Revised: 2018.10.8 ]
BLOG
Digression regarding simple intelligent systems.
[Updated. Revised. 2 more sections. 5 and 6 are the new ones. ]
Toy system to build. Documented.
PROBLEM: ROBUSTNESS
Basically we're looking at an actor model. Petri net features. Agents with tags.
This can also implement a multilayer neural net as needed.
One reason for this design is user security. The other is the ease of adding advanced features.
Actors will be agents.
Each of them will have methods available such as learning algorithms. But agents together form a net.
Everything is concurrent.
So we'll probably have to build most things from scratch. On the other hand, it then becomes very easy to add features, and we win overall.
Think Smalltalk. Erlang.
Each agent will be running multiple tasks. And many, many agents running. At "once". (There is no well-defined concept of "time".)
Most agents will be relatively concise code. Some will have logical methods, some will use ML, and they will try methods until a problem is solved, and then halt.
If they run out of methods, or take too long, they also halt. Just fail to produce any output.
FUNDAMENTALS
We'll use the word Actor in a somewhat nonstandard manner compared to the original papers of Hewitt, going beyond the actor axioms; but we'll reserve the word Agent for something that uses tags.
Tags in the sense of Holland will be introduced in the next document. Or else in revisions of this one.
Basically to start we need an ecosystem of several items:
Actor = ( Guards | Methods | Parameters )
when idling. Guards is a set of heuristics. Methods is as it sounds.
Inputs are messages. Each actor has some memory.
The result is Actor = ( Inputs | Guards | Parameters | Methods ) when not idling.
Messages themselves have some memory. They can be "marked" by different actors.
One heuristic is selected from Guards and processes an input. Then it reduces the set of Methods.
One method is selected from Methods and if it yields any output (at all), then the output is stored. Meanwhile the Input is removed.
The result is Actor = ( Inputs/<inputs that yielded outputs> | Guards | Parameters | Methods | Outputs ).
Parameters, which can be changed by a message received, determine to which other actor an output is sent.
An output may be a whole actor.
A message is a really minimal actor.
So actors can construct actors.
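As a concrete data structure, the slot notation above might look like the following sketch. This is only one plausible reading of the ( Inputs | Guards | Parameters | Methods | Outputs ) record; the class and field names are illustrative, not a fixed API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the actor record; all names are illustrative.
@dataclass
class Actor:
    inputs: list = field(default_factory=list)      # messages received
    guards: list = field(default_factory=list)      # heuristics that prune Methods
    parameters: dict = field(default_factory=dict)  # routing targets, marks, etc.
    methods: list = field(default_factory=list)     # callables tried on inputs
    outputs: list = field(default_factory=list)     # results awaiting dispatch

# A message is a really minimal actor: content plus marks, nothing else.
def make_message(content, marks=()):
    return Actor(inputs=[content], parameters={"marks": list(marks)})
```

Since an output may itself be a whole actor, the outputs slot can hold Actor instances, which is how actors construct actors.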
And methods are drawn from a common list. Shared. But separate instances.
Oftentimes an actor will produce several actors, which will produce outputs, then crash and disappear once their outputs are out. The base actor, however, will remain available to users.
Actors can only send messages to actors whose name they have information regarding.
That means some actors do nothing but receive and forward messages to several other actors, whose names (locations) need not be known to the sending actors.
Order for inputs won't matter. Rather, inputs are labeled: arguments are identified by name, not by position.
F ( A,, B ) = F ( B,, A ), which does not reduce to F ( x1, x2 ).
F ( Label1_value1,, Label2_value2 ) = F ( Label2_value2,, Label1_value1 )
, which is evaluated as F ( 「Scrape」 ( Label1_value1 ), 「Scrape」 ( Label2_value2 ) ) = F ( value1, value2 ).
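In Python terms this is essentially the difference between positional and keyword arguments. A minimal illustration, where the function f and the label names are made up:

```python
# Labeled inputs: only the labels matter, not the order of arrival.
def f(**labeled):
    # "Scrape": strip the labels off and keep the values they name.
    value1, value2 = labeled["label1"], labeled["label2"]
    return value1 - value2  # any function of the bare values

# F ( Label1_value1,, Label2_value2 ) = F ( Label2_value2,, Label1_value1 )
same_one_way = f(label1=5, label2=10)
same_other_way = f(label2=10, label1=5)
```

Both calls bind the same values to the same labels, so they evaluate identically even though the arguments arrive in different orders.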
We would start with the simplest implementation that can be scaled and proceed worse-is-better.
For example, existing actor support would be insufficient. We need to build a generic process that spawns ( Inputs | Guards | Parameters | Methods | Outputs ) data structures.
It keeps track of the memory. Let's say N classes of actors, where the amount of memory available for each slot in ( ) is a multiple 1, 2, ..., N of Z KB.
Then equip each actor automatically with a "Mark" method and a "Make" method. Let the "Make" be able to produce only a message:
( MessageContent | | Marks | | ).
All input and output operations are messages. Message passing.
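A minimal sketch of the two built-ins, using plain dicts for the ( MessageContent | | Marks | | ) shape; the function names and the dict keys are illustrative:

```python
# "Make", restricted here to producing only a message: ( content | | marks | | ).
def make(content):
    return {"content": content, "marks": []}

# "Mark": an actor stamps a message with its own name.
def mark(message, actor_name):
    message["marks"].append(actor_name)
    return message
```

The generic spawning process would equip every actor with these two automatically.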
Note regarding languages in which implemented. For example, current Python has some known issues with pure concurrency. The above framework is primarily intended to make sure none of these issues bother us. (It handles concurrency.)
Once the framework is there, we primarily work on creating Methods to put inside Actors.
The key is not to run out of memory, I think. And to avoid things locking.
After that we pipe together Actors to create AI. Each performs its role and passes the intermediate work to the next one.
Very modular. Allows us to demo very quickly.
And allows the user to cook their own AI.
Not super efficient as is, but there is the nice security property that recovering an actor/agent from merely observed memory is ... challenging if possible. And most often not possible.
By the way, how the actor model is implemented doesn't matter too much, so long as it's efficient.
More methods will come next after we have actors.
We'll be adding more and more methods over time. So it's not a predefined class. Hierarchical encapsulation.
At the start, we only need Mark and Make.
Make is also what sends messages. But it can create full actors. What it does is package the outputs, if any, from the actors, with Methods from a set accessible to all Actors.
A message is just a really stripped down actor that has no methods, not even Mark or Make.
I highly recommend reading up a little about Smalltalk. Or Erlang.
It may give you some good ideas how to implement.
I based the framework above on Concurrent Prolog. The usual ML, etc., will come in later as methods. For example, if an actor is given a text and tags it and the text gets > 100 reads, for example, the actor that automates tagging will treat this as a Success, and send factor analysis of coefficients and text to the Method in the common set. Which may or may not accept the update. (I think you can see where this is going.)
For example, however, if < 100 reads, then simply nothing is sent.
Smalltalk-style design, so that nothing waits for anything. Discord's back end is like that.
If you can figure out how to make the simplest possible model with GPU acceleration, that would be excellent.
Anything that works will be something lightweight and that scales simply by adding methods. Each method its own program - which is (i) copied entirely into new actors when they are spawned (ii) basically just shares a socket with the framework. From the perspective of the actor, or the user, each method appears like a black box.
Anything can be changed without anything else changing. Totally independent.
Once a basic actor framework exists, we'll add the ability to time out after a count. That is the next step, and it will be just a small modification of the generic Make and Mark methods, which all actors are equipped with.
So move in small steps. Worse-is-better.
EXAMPLE 1
Think like in Smalltalk. Flat concurrent prolog.
For example:
( MessageContent | | <ActorName>, ... , <ActorName>, Count | | ) = ( Add | | Bob, Count_1 | | )
If Alice receives that AND having 1 mark is sufficient authority AND it has a method that understands Add, it will output an Adding process and send a message back to Bob with the name of the adder (e.g. the hash).
Presumably the actor we're calling Bob will send another message ( 5, 10, <name-of-Bob> | | | | ) to <name-of-the-adder>, which will output ( 15 | | | | ) to Bob.
The adder, in that case, has methods for 5 and 10, and a method to guess that Bob is where it should send the output.
What we need to do is allocate some memory for each of these things, and once not used, clean that memory.
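The round trip can be walked through synchronously as a toy. Everything here, the names, the message shapes, and the inbox registry, is a stand-in for the real concurrent framework:

```python
# Toy synchronous walk-through of the Alice / Bob / adder exchange.
inboxes = {"Alice": [], "Bob": []}   # name -> pending messages

def send(name, message):
    inboxes.setdefault(name, []).append(message)

# Alice, having understood "Add", outputs an Adding process and tells
# Bob its name ("adder-1" stands in for a content hash).
inboxes["adder-1"] = []
send("Bob", {"adder": "adder-1"})

# Bob asks the adder to add 5 and 10, naming itself as the reply target.
adder = inboxes["Bob"].pop()["adder"]
send(adder, {"args": (5, 10), "reply_to": "Bob"})

# The adder applies its method and sends ( 15 | | | | ) back to Bob.
request = inboxes[adder].pop()
send(request["reply_to"], {"result": sum(request["args"])})
```

Once the adder's inbox is empty and it stays idle, its memory would be cleaned, as described above.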
Some methods will be more advanced. Like image processing via ML.
Even here we can allow actors to speak different languages.
(We just need a model of error, or malicious code. To test. And this may be it.)
For those in computer science: we leave context-free languages behind at this point.
But it's all simpler and more common than it may sound. Suppose an actor receives a message not meant for it; say ( 5,, 10,, <name-of-Bob> | | | | ) arrives at Chocolate, which is not even an Adder, nor a Multiplier. 5 and 10 just don't mean anything to it, have no meaning in that context; it has no means of processing and eliminating them.
Then each method of Chocolate simply crashes when applied in some random order to ( 5,, 10,, <name-of-Bob> ).
What else can it do? Any options?
Sure.
- It can send the message back to the sender if all its methods applied to it crash. This is how Alice knows there's an error. Meanwhile this is how we can detect errors: messages coming back directly to the sender too slowly, too quickly, or too frequently means we have bugs. Something wrong.
A background process that runs statistics on this and keeps count would be nice.
Up to you if you want to implement this as a test.
- It can send a message to a random actor, asking for a copy of a random method as a message. Then it tries that method on the current message. Like AI.
Not particularly safe; also inefficient. Even if there's a defined gradient for relation or fitness of method and message, as Wolpert recently demonstrated, it cannot be efficient. More on this later.
But relatively interesting, and powerful.
We may implement this in the toy model. (Don't worry about the gradient. Just hook randomness freely from the order in which processes begin or complete, as that is genuinely indeterminate. This means the randomness is biased by the phase, or relation of states, of the whole system. I was going to say state of the whole system, but there is really no such thing for a large distributed system of this type.)
There are ways to constrain this and make it safe. But that would require something like types. At the moment there are none.
- It doesn't reply. Alice sends the message elsewhere after no response to it, once some count C reaches K. We could set K. Tacit communication is still communication.
- It can mark the message. Then, in a variation on the second approach, send the message to a random actor. Very unsafe and inefficient.
- It can send part of the message, and its own name, calling it a Request in Parameters, another kind of mark, to a random actor. If the syntax matches for another actor, that actor sends it the method, which it can load. If there is no match, the other actor marks the message and sends it off randomly. An actor scans for its own mark in a message and randomly forwards the message if its own mark is there.
Possibly efficient. Whether that is the case depends on the exact system. What actors, what messages. Any natural gradient due to distribution of methods among actors.
None of these five approaches has to be implemented. But they reveal that we need (i) the capability for several kinds of marks for messages in Parameters, (ii) data structures for actors that permit sending a method as a special case of a message and loading methods not already present, and (iii) unloading methods after some use.
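Points (ii) and (iii) together amount to a small load/unload table. A sketch, with all names hypothetical:

```python
# Methods shipped as messages, loaded on demand, unloaded after some use.
loaded = {}  # method name -> [callable, remaining uses]

def load_method(name, fn, uses=3):
    # (ii): accept a method that arrived as (part of) a message.
    loaded[name] = [fn, uses]

def apply_method(name, *args):
    fn, _ = loaded[name]
    result = fn(*args)
    loaded[name][1] -= 1
    if loaded[name][1] == 0:  # (iii): unload once the use budget is spent
        del loaded[name]
    return result
```

A real system would also need the marks from (i) so a wandering method request never loops forever; that part is omitted here.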
EXAMPLE 2
How to think?
Consider an actor equipped with several methods. Some are applied in order; others are applied in random order to the input. The ordered methods may be, but are not necessarily, all the methods present in the actor.
Suppose an image arrives of size (v,w). The actor spawns N x M copies of itself, minus this copying method, and sends each the image.
Each is equipped with adjoined integer parameters (n,m) < (v,w), of which there are N values, and a parameter RGB, also an integer, of which there are M values. So for every combination there is one copy; the parameters of each are different.
Let the largest possible (n,m) be at least a quarter of (v,w). These are the dimensions of a subimage. The smallest possible (n,m) is random. Meanwhile the (i)-th spawned actor has (n,m) greater than the (i+1)-th spawned actor. So there is a sequence of inequalities which is satisfied, but it's biased random.
RGB is similar but varies in intervals of 25, or 50, for example: differences that are relatively large.
Each actor has a method that cuts up the image from the upper left corner into subimages of the dimensions it has stored. And it has a method that takes these outputs and, for each subimage, counts the pixels within plus or minus 25, or 50, of its RGB.
It compares the counts; if they are within, say, 100 of each other, it sends, to the actor which sent the image, one subimage and its two parameters. But if there are no matches in this sense, it does nothing.
Each spawned actor, when idle, crashes. Memory cleared.
So for example, if congruent text was in some position in the same quantity, of whatever color, littered on an image in some tiled fashion, but random angle or position in each tile, the actor that sent the image to the processor would soon have a copy of the text and how to find it.
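A toy version of the tiling-and-counting step, with a grayscale grid standing in for the image and a single intensity for the RGB parameter; all values here are illustrative:

```python
# Cut the image from the upper-left corner into n x m subimages,
# dropping any ragged edge.
def tiles(img, n, m):
    v, w = len(img), len(img[0])
    return [[row[x:x + m] for row in img[y:y + n]]
            for y in range(0, v - v % n, n)
            for x in range(0, w - w % m, m)]

# Count pixels within +/- band of the actor's stored intensity.
def near_count(tile, value, band=25):
    return sum(abs(p - value) <= band for row in tile for p in row)

# A "tiled text" pattern: every 2x2 tile carries the same number of
# bright pixels, just in different positions within the tile.
img = [[200, 10, 200, 10],
       [10, 200, 10, 200],
       [200, 10, 200, 10],
       [10, 200, 10, 200]]
counts = [near_count(t, 200) for t in tiles(img, 2, 2)]
```

Since the counts agree, this copy of the actor would send one subimage plus its two parameters back to the sender; had they disagreed by more than the tolerance, it would do nothing.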
CORRECTNESS VERSUS OPTIMIZATION. CORRECTNESS AND OPTIMIZATION.
MESSAGE PASSING WITHOUT MESSAGE PASSING. OR PURE MESSAGE PASSING WITHOUT MESSAGES
I've made some progress on one of the last main ingredients we need, which is a "difficulty" heuristic.
Basically we'll let processes that take too long time out. Meanwhile contributing to a statistic.
We want to minimize the number of timing out processes.
So we need a way to decide how "hard" some method is, such that we can do max min, and if a "hard" method is drawn near the time limit by which point the actor is going to time out, it can decide to skip it and try something else. (Because another, easier method may get done in time.)
Basically this reduces the frequency of "accidental" failures: --- actors which fail to produce an output before a cut off time, given an input, not because they can't produce a valid output or something wrong with the input but simply due to randomness in the order in which their valid-in-the-context methods happened to be applied.
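Deadline-aware method selection might be sketched like this. The cost estimates, the budget, and the function names are all illustrative; a real system would learn the estimates from the timeout statistics mentioned above.

```python
import time

# Skip any method whose estimated cost would overrun the actor's time
# budget, so an easier method gets a chance before the actor times out.
def run_until_deadline(methods, arg, budget=0.05):
    """methods: list of (estimated_seconds, callable) pairs."""
    deadline = time.monotonic() + budget
    for est, fn in sorted(methods, key=lambda p: p[0]):  # cheapest first
        if time.monotonic() + est > deadline:            # would time out: skip
            continue
        out = fn(arg)
        if out is not None:
            return out
    return None  # the actor halts without producing any output

methods = [(10.0, lambda x: x * 2),  # "hard": never fits the budget
           (0.0, lambda x: x + 1)]   # "easy": runs in time
```

Trying the cheapest viable method first is one way to do the max-min selection; the important property is just that a hard method drawn near the cutoff is skipped rather than started.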
Regarding code, right now, the idea to keep in mind is that any implementation is fine if:
(a) when tested it runs on the server,
(b) actors can read other actors very generically.
[Messages are just really simple actors. — So getting the actors reading other actors is all that matters.]
We should probably put a global size cap of K MB on each actor. Then the system becomes defined as a program which consists of many programs each < K in size.
So we have a system where all objects are actors. Actors can generate others.
Hence like in Smalltalk/Squeak, all other classes and metaclasses besides those used to define the smallest actors, are also themselves actors.
The most basic messages are just actors with no methods.
E.g., if ( 5, 10, <name-of-Bob> | | | | ) is sent to actor Add, that should result in Add sending ( 15 | | | | ) to actor Bob.
Reading: Probably each method should be a separate file, each one just another file on the server. The actor just loads whatever is inside each such document. So each method, at the level of the actor definition, is a pair ( MethodName_thename,, Link_alinktoafileontheserver ).
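A sketch of one-method-per-file loading. The use of exec here is acceptable only for the toy model (a real system would sandbox the file's contents); the file and function names are made up:

```python
import pathlib
import tempfile

# The actor holds (MethodName, Link) pairs and loads whatever is inside
# each linked document on demand.
def load_method_file(path, name):
    namespace = {}
    exec(pathlib.Path(path).read_text(), namespace)  # toy model only
    return namespace[name]

# Write a stand-in "method document" to disk, then load it by name.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("def add(a, b):\n    return a + b\n")
add = load_method_file(f.name, "add")
```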
Then reading works as follows.
(i) When A receives a message M from B, the message goes to the location of A, and "runs". This is significant. Not so much A reads M as rather M runs and A reads.
(ii) Unless M runs, A should not be able to read it.
(iii) If M is run by A, then M should first of all know A is by it. That A is trying to read it should be available to M.
(iv) M runs and A reads, in which case the data "A" is passed to any methods in the message M. [There are none, most often; but there could be some, and this functionality will be needed.]
After that happens, only then is the "input" part of the message copied into the "input" part of A. [The ability to share is a default part of the definition of all actors. Not a special method.]
Each actor is idle (a) if its input part is empty, or (b) if its methods part is empty. Once its input part is filled, if it has any methods, it applies them in some order. [For now, let the order be randomized.]
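Steps (i) through (iv) can be sketched directly; the dict shapes and names are stand-ins:

```python
# The message "runs" first, learns who is reading it, and only then is
# its input part copied into the reader.
def run_message(message, reader_name):
    # (iii)/(iv): the reader's name is passed to any methods the
    # message carries (usually there are none).
    for method in message.get("methods", []):
        method(reader_name)
    return message["inputs"]  # what the reader may now copy

def read(actor, message):
    # (ii): the only way to read is to let the message run.
    actor["inputs"].extend(run_message(message, actor["name"]))

A = {"name": "A", "inputs": []}
seen = []                                        # what the message learns
M = {"inputs": [15], "methods": [seen.append]}   # M records who reads it
read(A, M)
```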
Suppose A wants to read M but M is not coming to A. For example, M is a mailbox; then A sends a message O to M.
M then would rather send back a message to A. Specifically: M is a mailbox, and N is a message from B, meant for A.
N would have, in the inputs part, another actor: the message that M would send to A. So N would have the form ( ( 15 | | | | ) | | | | ).
When N is run by M and M reads N, ( 15 | | | | ) is copied into the inputs part of M.
And that is what M knows to send to A after it receives a message from A.
So we basically have message passing without any message passing. Just one type of thing: --- actors which can run in proximity of each other and read each other.
RANDOM THOUGHTS
This assumes a high-velocity, big database. Suppose the project is social media, and that we'd like to offer users immutability for their important posts (possibly all of them, if they deem them such). Persistent data structures.
We can, actually, avoid scaling issues.
Blockchain for data makes for digital paper. (As hard to change as real print. Which has major uses. But, of course, if desirable, we can implement it later.)
One issue is basically running out of RAM, however the system is served. We must, therefore, allow SSD to be used; not having to keep the whole chain and more in RAM is quite possible with an actor message-passing framework.
This could involve using very small pieces. And creating a gradient of "related", a distance function, which pragmatically determines the best order to load them into RAM.
Questions for later.
Many options however.
Or we can implement Kanerva's Sparse Distributed Memory for data on the chain, using our framework. (Already several Python implementations of that.)
Basically, with data on the chain, it's possible to eat part of publishing. Especially as Steem is dropping the ball in more than one way.
In any case, we can always add that in later. Pretty easy so long as our basic framework is what we're moving toward anyway.
[Another way to avoid scaling issues is to make sure all possible activity for each post converges to some small finite load, even as use increases. (Steem's attempted approach. One reason for their seven day upvote windows.)
More similar to Lightning and such things, yet another way is to have several levels of chains, with larger blocks, but exponentially less frequent transactions between one chain and another chain of the same type with larger blocks. When transferred and verified to the larger-block chain, the lower chains are cleared by an above-threshold number of signatures. (Blocks on lower chains can be cleared in that case, whereas blocks on the uppermost chain cannot be cleared in any case.) This is easier in our situation, because forwarding actors can fall back onto a lower level to read if a data link present in the front end is still missing in a higher chain, which is preferentially read from, and read randomly.]
Suppose you are using blockchain only to monetize and running on an ordinary database (but MongoDB or something equally fast) for data storage for serving.
The other alternative for data being immutable is even simpler.
We have AI. So we can instantly add immutable data, where the user desires it. Just make a service fee based actor, that cross-chain posts.
It can copy the post in our system to STEEM, EOS, BCH (memo.cash), ..., if dragged onto the post. And the user just pays the fee for that, if any, like that.
Meanwhile we have some kind of small transaction fee, or just exchange for the actor blob.
The actor would make an account on those other chains and keep track of the keys; it's smart enough for that.
And the result would be an AI framework that's integrated with, for example, STEEM.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Text and images: ©tibra.
Somewhere at the very top of the text above I put a tag: — Revised: Date.
Leave comments below, with suggestions.
Maybe points to discuss. — As time permits.
Guess what? Meanwhile the length may've doubled . . . ¯\ _ (ツ) _ /¯ . . .