Re: An idea: Apache Object Adaptor

From: Michael Poole <poole troilus org>
To: orbit-list gnome org
Subject: Re: An idea: Apache Object Adaptor
Date: 18 Dec 2000 15:48:54 -0500

dthompson characterlink net (Darrin Thompson) writes:

> On 12/18/00, 11:28:05 AM, Michael Poole <poole troilus org> wrote regarding 
> Re: An idea: Apache Object Adaptor:
> 
> I know it's sketchy but I don't think it's THAT bad. :-) And my 
> inspiration didn't come from buzzwords. (ok, sorry I mentioned XML. That 
> would be kind of stupid now that I think about it.) :-) My inspiration is
> data-driven apps, for which this approach seems well suited.

Well, I was sort of trying to get your attention, and probably at
least a little influenced by my day work, which involves trying to
cram kinds of data processing into less than 200 ns.  Next to that,
"normal" programs sometimes seem creeping slow, regardless of how they
run compared to others of their kind.

> > First, you ignore the overhead of going through HTTP rather than a
> > more CORBA-oriented protocol (like IIOP).  On the client, it's not
> > that hard and it doesn't matter that much, but on the server, having
> > Apache dispatch the requests properly makes for a big CPU cycle hit,
> > and for real scalability you don't want to over-burden the server.
> 
> Hmmmmm. My experiences with Apache modules have shown that if a request 
> doesn't need to access a file on the filesystem, Apache can handle it 
> with extreme efficiency. But see below, I'll define what scalability 
> means to me more clearly.

That could be.  I don't know what kinds of tricks Apache plays with
parsing to make it fast, but it would have to parse at least the
GET/POST line from an HTTP request and possibly a Content-Type header,
which is somewhat slower than you could get with an efficient ORB
protocol.
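
To make the comparison concrete, here is a rough Python sketch (an
illustration only, not a benchmark and not anything Apache or ORBit
actually does). It assumes the fixed 12-byte GIOP message header, with
bit 0 of the flags octet selecting little-endian; the function names
and the sample messages are made up for the example.

```python
import struct

def parse_giop_header(data: bytes) -> dict:
    """Decode the fixed 12-byte GIOP message header (assumed layout)."""
    magic, major, minor, flags, msg_type = struct.unpack(">4sBBBB", data[:8])
    if magic != b"GIOP":
        raise ValueError("not a GIOP message")
    # Bit 0 of the flags octet: 1 = little-endian, 0 = big-endian.
    order = "<" if flags & 1 else ">"
    (size,) = struct.unpack(order + "I", data[8:12])
    return {"version": (major, minor), "type": msg_type, "size": size}

def parse_http_request(data: bytes) -> dict:
    """Hypothetical minimal HTTP parse: request line plus Content-Type."""
    head, _, _ = data.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    method, target, _version = lines[0].split(b" ", 2)
    content_type = None
    for line in lines[1:]:
        # Header names are case-insensitive, so every line must be
        # split and normalized before it can be compared.
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-type":
            content_type = value.strip().decode("ascii")
    return {
        "method": method.decode("ascii"),
        "target": target.decode("ascii"),
        "content_type": content_type,
    }
```

The GIOP side is two fixed-offset unpacks; the HTTP side has to scan
for delimiters and walk a variable number of text headers before the
request can be dispatched. How much that matters in practice depends on
the server, which is the open question here.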

> > Second, you don't really define scalability well.  I assume you mean
> > scalability to handle a reasonable number of potentially interacting
> > requests from many different clients (rather than scalability to
> > handle many linear requests from one client quickly).
> 
> You are right, I don't. My needs usually involve both of the above. A 
> small number of clients that hit the server hard and a large number of 
> clients that are gentler on the server. My performance "requirement" is 
> that handling of all requests begin in a reasonably short period of time.

Thanks for the clarification.

> So, yes it's true that Apache will dispatch a request slower than 
> straight IIOP. But I believe the difference is measured in hundreds of 
> microseconds. For a data driven application (using Oracle etc.) this 
> difference is orders of magnitude lower than delays introduced by 
> Apache's dispatching.

I'm not sure I follow the last sentence; it seems to say that the
dispatch time for a data driven application based on Oracle is in
the range of microseconds?

> > Why does the definition of scalability matter?  Because the truly hard
> > part of scaling a server to many clients (and you can ask Oracle and
> > Microsoft and IBM's SQL server teams about this) isn't so much in
> > talking to all the clients -- how you do that efficiently is mostly a
> > function of the OS and its primitives for that.  The truly hard part
> > of scaling a server is making it concurrent, so that you can have
> > several requests being processed at the same time.  For this, no
> > performance-killing hack to add software layers will help; the server
> > must be written so that it does intelligent locking.  And for this,
> > CGI scripts are very bad; there are no widely portable IPC locking
> > mechanisms available besides the problematic sysv ones, and those are
> > generally not very fast.
> 
> This is an excellent point. In a data driven app, however, a fully 
> featured RDBMS provides a lot of help here.

Probably.  If you're that closely integrated with an RDBMS, maybe it
would be better to write something that lives inside its address
space.  I don't know your application, so that could be impractical or
silly for some other reason.

> > (Depending on the operations, having a single-threaded server might
> > give more throughput or even lower latency than a badly done
> > multi-threaded server, because the multi-threaded server tends to get
> > in its own way.)
> 
> But I only write well done multi-threaded servers. :-)
> 
> In my experience, I can never beat the speed and fairness of a pool of 
> threads pulling requests from a queue. There are small latencies 
> introduced by the pool, but it's nearly impossible for one request to 
> starve the other requests. I always get real crisp response from this 
> approach. 

I'd guess that this depends on the kinds of requests (processing time,
server I/O bandwidth) you send to the server, but in general, I agree
with that.
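
For reference, the pool-and-queue arrangement described above can be
sketched in a few lines of Python. This is a minimal sketch under
assumptions of my own: `run_pool`, the `handler` callback, and the
worker count are hypothetical names chosen for the example, not code
from either of our servers.

```python
import queue
import threading

def run_pool(requests, handler, workers=4):
    """Process requests with a fixed pool of threads fed by one FIFO queue."""
    pending = queue.Queue()
    results = {}
    lock = threading.Lock()  # guards the shared results dict

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work for this thread
                break
            req_id, payload = item
            value = handler(payload)
            with lock:
                results[req_id] = value

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for item in enumerate(requests):
        pending.put(item)  # requests are picked up in arrival order
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return [results[i] for i in range(len(results))]
```

Because every worker pulls from the same queue, a slow request ties up
only one thread while the others keep draining the queue. That is the
fairness property; whether throughput is good still depends on how long
each request runs and what it contends for.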

> > Third, keeping state when the server runs as a CGI script costs much
> > more than just session-tracking.  If your server isn't long-lived, you
> > have to read data every time the server starts, and write the data back
> > before it stops.
> 
> Again you are right. However, CGI is only one way that a request could be 
> handled. Apache offers CGI (slowest), mod_perl (very very much faster), 
> native modules (speed limited by Apache only).

Right.  But from what I can tell, under Apache 1.3, mod_perl ends up
creating a separate perl environment for each process/thread of httpd.
This ties in with my point about preserving state between requests,
which you also point out may not be a big concern for all
applications.
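
A minimal Python sketch of the state-keeping difference, with made-up
names throughout (`handle_short_lived`, `LongLivedServer`, and the JSON
state file are assumptions for the example, and cross-process locking
is left out entirely):

```python
import json
import os
import tempfile

def handle_short_lived(path, key):
    """CGI-style: reload all state, update it, write it back, every request."""
    state = {}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    state[key] = state.get(key, 0) + 1
    with open(path, "w") as f:
        json.dump(state, f)
    return state[key]

class LongLivedServer:
    """Long-lived process: state stays in memory between requests."""

    def __init__(self):
        self.state = {}

    def handle(self, key):
        self.state[key] = self.state.get(key, 0) + 1
        return self.state[key]

def demo():
    """Run three requests through each style and return both result lists."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "state.json")
        short = [handle_short_lived(path, "hits") for _ in range(3)]
    server = LongLivedServer()
    long_lived = [server.handle("hits") for _ in range(3)]
    return short, long_lived
```

Both styles give the same answers, but the short-lived one pays for a
full read and write of the state on each request. The long-lived one
avoids that cost only if all requests reach the same process; with one
interpreter per httpd child, each child would hold its own separate
copy of `state`.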

> > * Concurrency headaches (worse than they need to be)
> 
> > * Longer state recovery and recording times
> 
> HTTP processing overhead is very small. If this approach is only applied 
> to RDBMS and "data heavy" apps, the last two concerns are already expected 
> and aren't all that hard to manage at all.

I guess this is really where our opinions differ.  HTTP processing
seems slower to me than you want in an ORB (although maybe Apache does
have tricks to make it competitive), and most of the applications I
think about don't match RDBMSes nicely (for example, because the data
structures are hard to reflect in the table/row/column model).  For
the kind of applications you describe, I can see it as a possible fit,
but I do not think that HTTP is a good transport for CORBA.

-- Michael
References:
- An idea: Apache Object Adaptor
  - From: Darrin Thompson
- Re: An idea: Apache Object Adaptor
  - From: Michael Poole
- Re: An idea: Apache Object Adaptor
  - From: Darrin Thompson