
Re: Why have a multipart document address?



Right, as far as you go.  We spent a lot of time designing this in
1979, and never got as far as implementing it.  The fact that you could
do it that way justified our design enough that we continued
implementing Green.

Let's abstract a layer and remember that the documents don't have to be
on their home server; they may have been moved somewhere else.  So each
machine has a map of which spans of the docuverse it holds, which it has
cached, and which are cached elsewhere, as well as pointers to where
more information about the whereabouts of the info can be found.

So a query about versions of a remote document can be answered by
consulting the map of who knows about address A, and sending a query
about A in that direction.  Likewise, when versions or links are made,
the information about them has to be sent in the direction of the nodes
which have to know about it.  We envisioned something like an enfilade
that holds a reflection of what is in various nodes, but not all the
information.  The point here is to have a distributed tree rather than a
point-to-point crosspoint switch, so that we get n log n rather than
n squared traffic.
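
For concreteness, here is a minimal Python sketch of the "send the query
in that direction" idea.  It is only an illustration, not anything from
the 1979 design: the Node class, the hints table and the
longest-prefix rule are all assumptions made for the example, and the
addresses are made up.  Each node holds only what it stores itself plus
a few prefix hints saying which neighbour knows more.  It does not try
to demonstrate the n log n traffic figure.

```python
class Node:
    """Hypothetical node holding a partial map of the docuverse."""

    def __init__(self, name):
        self.name = name
        self.local = {}   # address tuple -> version addresses held here
        self.hints = {}   # address prefix tuple -> neighbour that knows more

    def next_hop(self, address):
        # Longest matching prefix wins (an assumption for this sketch).
        for n in range(len(address), 0, -1):
            hop = self.hints.get(address[:n])
            if hop is not None:
                return hop
        return None

    def query_versions(self, address, max_hops=16):
        # Answer locally if possible, otherwise pass the query along.
        if address in self.local:
            return self.local[address]
        hop = self.next_hop(address)
        if hop is None or max_hops == 0:
            return None   # nobody we know of has information about it
        return hop.query_versions(address, max_hops - 1)


home, relay, asker = Node("home"), Node("relay"), Node("asker")
doc = (1, 2, 0, 1, 0, 7)   # made-up document address
home.local[doc] = [(1, 2, 0, 1, 0, 7, 1), (1, 2, 0, 1, 0, 7, 2)]
relay.hints[(1, 2)] = home   # relay knows who holds everything under 1.2
asker.hints[(1,)] = relay    # asker only knows a direction for 1.*
versions = asker.query_versions(doc)
```

The asker never talks to the home node directly; it only needs one hint
that points roughly the right way.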

How close are you to implementing this, and what base is it going on?
It's good to have thought this out a few steps in advance of what you
are implementing, so that you know what direction you're headed.  The
single-layer kind of system you describe seems to reflect the
interconnectivity we currently have in the internet, and may be more
appropriate than the quasi-hierarchical model we had.  Then again, since
nodes vary in bandwidth a lot more than in storage size, maybe that
should inform how this plays out.


On Fri, 2005-02-11 at 16:11, Jeff Rush wrote:
> Roger, while we're on addressing considerations, can you shed any light
> on the backend <-> backend protocol?  I've come up with an approach I'm
> implementing but I've no idea if it is what the Xanadu team was
> considering doing.  Finding a way to (a) avoid storing a full copy of
> the docuverse at each node, and (b) alerting nodes to the availability
> of new documents in some manageable fashion is an interesting problem.
> Using the Green division of the tumbler space, the topic at hand, seems
> to help.
> 
> Given a document address, the node portion can be extracted and used
> first to check the local cache and, if the document is not found there,
> second to issue a retrieval request to the home node for that document.
> No problem.
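
The lookup described in the paragraph above might look roughly like the
following Python sketch.  The assumptions are mine: the address is
written as a dotted string, the node portion is taken to be the fields
before the first 0 separator, and home_nodes is a hypothetical table
whose entries stand in for a network request to each home node.

```python
def node_part(address):
    """Return the fields before the first 0 separator (assumed layout)."""
    fields = [int(f) for f in address.split(".")]
    if 0 in fields:
        fields = fields[:fields.index(0)]
    return tuple(fields)


def fetch(address, cache, home_nodes):
    """Check the local cache first, then ask the document's home node."""
    if address in cache:
        return cache[address]
    home = home_nodes[node_part(address)]   # KeyError if the node is unknown
    document = home(address)                # stand-in for a retrieval request
    cache[address] = document
    return document


calls = []

def home_stub(address):
    # Pretend home node: records each request and returns dummy contents.
    calls.append(address)
    return "contents of " + address

cache = {}
home_nodes = {(1, 2): home_stub}
first = fetch("1.2.0.1.0.7", cache, home_nodes)
second = fetch("1.2.0.1.0.7", cache, home_nodes)   # served from the cache
```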
> 
> My concern is how to obtain the document address for new, remotely
> created documents, i.e. how does node A become aware that there is new
> content at node B, with whom he has never held a conversation.
> 
> That breaks down really into how queries for links are handled.  Say I
> want to find out who links to my resume, stored on my server.  Those
> links may be on my node, or on any of a million other nodes.  If I issue
> a query for "links to doc 34 of type 45" to my node, I'll find those
> links whose home is on my node, but not those on a remote node.
> 
> My solution is that each time any node creates a link, it iterates over
> the endpoints and sends a notify to each node involved in some way in
> that link.  Those nodes can then pull a copy of the link from my home
> copy into their cache, and add it to their internal index.  Now when I
> query my own node, I'll find those links that reside far away.
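
As a rough illustration of the notify-and-pull scheme in the paragraph
above, here is a small Python sketch.  It is not Jeff's implementation.
The Node class, the in-memory network dictionary standing in for the
transport, and the (node id, document number) address tuples are all
hypothetical.

```python
class Node:
    """Hypothetical backend node with a link store and a link index."""

    def __init__(self, node_id, network):
        self.node_id = node_id
        self.network = network   # node id -> Node, stand-in for the transport
        network[node_id] = self
        self.links = {}          # link id -> (type, endpoints), home or cached
        self.index = {}          # (document, type) -> set of link ids

    def _store(self, link_id, link_type, endpoints):
        self.links[link_id] = (link_type, endpoints)
        for doc in endpoints:
            self.index.setdefault((doc, link_type), set()).add(link_id)

    def create_link(self, link_id, link_type, endpoints):
        # Store the home copy, then notify every other node an endpoint is on.
        self._store(link_id, link_type, endpoints)
        involved = {doc[0] for doc in endpoints} - {self.node_id}
        for node_id in involved:
            self.network[node_id].notify(self.node_id, link_id)

    def notify(self, home_id, link_id):
        # Pull a copy of the link from its home node and index it locally.
        link_type, endpoints = self.network[home_id].links[link_id]
        self._store(link_id, link_type, endpoints)

    def links_to(self, doc, link_type):
        return self.index.get((doc, link_type), set())


network = {}
a, b, c = Node("A", network), Node("B", network), Node("C", network)
resume = ("A", 34)                                   # doc 34 on node A
b.create_link("B-link-1", 45, (("B", 1), resume))    # type 45 link made on B
found = a.links_to(resume, 45)                       # query A's own index
```

Node C has no endpoint in the link, so it is never notified and holds no
copy.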
> 
> This means that links are guaranteed to reside on any node with which
> they have a linking relationship, but not on unrelated nodes, until
> referenced by a reader.
> 
> This is subject to scalability problems, either accidental in the case
> of a commonly used catalog document like a list of all known users in
> Xanadu, or intentional in the case of a cracker.
> 
> My, admittedly poor, solution is that each time a node receives a notify
> from a remote node, it first counts the number of such links it
> currently holds for that remote node, and drops the notify message if
> some threshold is exceeded.  This loses information but preserves my
> local storage from overload.
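
The threshold check described above could be sketched in Python as
below.  NotifyGate and its threshold parameter are names invented for
the example, and dropping once the count reaches the threshold (rather
than once it goes past it) is an arbitrary choice.

```python
from collections import Counter


class NotifyGate:
    """Hypothetical per-remote-node limit on cached links."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.held = Counter()   # remote node id -> links held for that node

    def accept(self, remote_node):
        # Drop the notify if we already hold too many links from this node.
        if self.held[remote_node] >= self.threshold:
            return False
        self.held[remote_node] += 1
        return True


gate = NotifyGate(threshold=2)
results = [gate.accept("B") for _ in range(3)]   # third notify is dropped
other = gate.accept("C")                         # other nodes are unaffected
```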
> 
> -Jeff
> 
> 
> On Thu, 2005-02-10 at 10:22 -0800, roger gregory wrote:
> > Thanks, Andrew, my reply was a little incoherent, but I wanted to point
> > to some of the higher level considerations.  The important thing is that
> > it's not clear how we would design this differently, even though we
> > seem to be in a different universe now.  The considerations that led to
> > those design decisions still stand.
> > 
> > I think it's instructive to notice that Gold didn't have a different
> > addressing scheme.  Not that it was intended to use the same scheme, but
> > that it was considered well enough solved and modular enough, that any
> > design changes could be postponed till closer to shipping.  Given the
> > kind of redesign that took place in Gold that's a resounding
> > endorsement!  Still it could use some reexamination for the next quantum
> > leap, though I'll continue to use it for my green stuff, and green<=>
> > html stuff.
-- 
Roger Gregory

roger@xxxxxxxxxxxxxxxxxxxxx
 
http://www.halfwaytoanywhere.com