DTD: DbLinks and other complications



Vivien Malerba wrote:
  I wrote
> > What about the case where a table is NOT accessible directly,
> > but is indirectly accessible via the database you are connecting
> > to ?
> > (snip my obviously not-good-enough intro to dblinks)
> Can you describe that, I don't understand what it is (I'm not an Oracle user).
> Is that more or less like a view?

OK, explanations of ugliness the DTD needs to deal with if it is
to exploit the top-end databases properly (which it has to if it is
to be taken seriously by senior managers).  I'll discuss database
links, object types, and recursive queries.

I do not consider myself qualified to offer a nice solution tho.

Database Links
--------------
   MARY is running on her linux box.
   She can connect to DB0 on a Solaris box, sunny.
   Another database DB1 lives on an AIX box, aches.
   However, MARY cannot connect (ip routers/firewalls/whatever)
   directly to aches.
   The DBA (or other privileged user) creates a "dblink"
   (in this example called "CROSSHACK")
   that lives in the DB0 database on sunny, points to
   the aches database, and allows other DB1 users to get
   at the "hyperlinked" tables that way.
   Imagine further that BOTH databases contain schemas
   "SCOTT", each schema having a table "EMP".
   Once connected to DB0, MARY might be able to access both.
       SCOTT.EMP refers to the table on sunny
       SCOTT.EMP@CROSSHACK refers to the table on aches.
   Both tables can even be used in the one SQL statement.
   An individual user may or may not have privileges to
   use or manipulate a particular link.

   Think of a symlink in a directory you have execute on
   that points to a file that REALLY lives in a directory
   that you do NOT have execute on.  Its a bit like that,
   but with the chain-of-finding explicitly stated.

   They can exist between different versions of database
   on the same box as well.  Thus, if Oracle9 came out,
   and gda could not get at it, an Oracle8 database with
   a dblink might be used.

   The rights to create or modify the dblink definition
   typically are held only by DBAs.

Object Types
------------
   (Dr. Codd, close your eyes now!)
   Object types are a bit like C structures, sometimes with
   variable length arrays, that exist virtually as a "column"
   in a table.  They can even be nested.
   Thus the table "CUSTOMER" might have columns "MAIL_ADDRESS"
   and "DELIVERY_ADDRESS", each of which is of abstract type
   "ADDRESS" with the following structure:
      LEAD_LINES
          variable_length_array 5 max of varchar2(30)
      CITY_ID number 
         (referencing the CITY table, implying state/country)
      POST_ZIP_CODE varchar2(10)
      UNECE_LOCODE varchar(9)
      TAIL_LINES
          variable_length_array 2 max of varchar2(30)

    While there are some good points to this, (with stored
    procedures outer join problems can be avoided, for
    example, and abstract types can have their own methods
    or operators defined), this "object-relational" model
    can blow the simple relational model all the way back
    to codasyl.  Imagine if transactions were stored as
    a variable length array underneath account.  Or more
    realistically, email/phone_numbers underneath a person!

    Despite my distaste for these, I know they will be
    implemented in applications, and gda will have to
    deal with them.

    I know that Postgres and Oracle provide this sort of
    thing at least.

Recursive Queries
-----------------
    Oracle has long had a tree structured SQL statement.
    Imagine an EMPLOYEE table with the following structure.
        EMP_ID
        MGR_ID (null or a self-reference to EMP_ID)
        EMP_NAME
    select concat(lpad(' ',depth-pseudo-column), EMP_NAME)
            INDENTED_NAME,
        depth-pseudo-column, EMP_ID, MGR_ID
    from EMPLOYEE
    connect by prior EMP_ID = MGR_ID
    start with MGR_ID is null ;
    BIG_BOSS                 0    1234    (null)
     FRED                    1    1235    1234
      JOE                    2    1260    1235
      JILL                   2    1236    1235
       JOHN                  3    1220    1236
     MARY                    1    1111    1234
      HARRY                  2    1212    1111

     (There are limits, it blows up if you hit a depth
     of 255 from the start, and you cannot join without
     using a function on a column to retrieve the foreign
     data, but it nicely reflects trees which are the
     natural structure of many things).

     While the DTD can handle it, (its a straight row by
     column dataset being returned), it *WOULD* be
     nice to flag it for loading into a tree widget.

     I heard a rumor that DB2 can do something similar.
     
<asbestos suit on>
Why am I mentioning these difficult issues?  Because I
believe it is crucial to target the "top end of town".
Much as I hate to admit it, there are three real markets
for client software:
  1. Win32 only
     ODBC will continue to dominate for some years.
  2. Mixed Wintel/Lintel
     Borland will probably dominate in the near
     term.
  3. Architecture-neutral
     Currently dominated by perl DBI::DBD.  GDA
     certainly has a place here, especially if
     it can provide perl/python/et al bindings
     like Gtk--, perlgtk and pygtk.
     However in this area, inability to work fully
     with the likes of Oracle, Sybase, Informix,
     Postgres, etc, will lead to poor takeup, and
     Gnome database apps will suffer, and the big
     users who are candidates for using GDA will
     revert to the Win32 clients.

DBI::DBD is successful even though it admits it is
not as feature-rich as ODBC because it gets to the
big databases on any architecture.  However, it too
will be having problems, I believe, dealing with
object-types.

Even non-database programmers I deal with once they
learn that MySQL has no triggers and stuff-all
constraints drop it like a scone.  Postgres is the
natural home for serious open source databasing.
OpenSource bigot that I am, Oracle has a great
track record for never-say-die 24x7 work, distributed
queries, replication and hot backups that is hard
too find elsewhere, and thus hard to recommend
against.  (OK, I'm biased, I was the Oracle FAQ
in the early 90's - so shoot me!)

For GDA to work as I need it to, and be as successful
as we *all* want, the DTD needs to be capable of
properly handling the corporate/government databases,
running on platforms other than Win32 and Lintel,
and having more than one concurrent user.

Speaking with a managers hat on, there are issues
that can sneak through where GDA developers are running
on their own Linux box.  (Excuse ignorance here
please if I am basing comments on gda 0.95.)
Security of the gda corba factories and the SQL
statements running through them is critical on
multi-user machines.  I cannot allow deployment of
gda if the information of other connections (such
as SQL statements issued by other users) is
visible to just any old gda corba client.  Too
often, client programs will pass through sensitive
information through the middleware to the backend.
While the diagnostics available through a common
gda log are great for debugging, with 0.95, to get
the factory working, the file in /tmp had to be
writable by all users, could be read by all, so
any authorized user could check out what everybody
else was running.  Is this being addressed?  How
is it possible to declare to gda that certain users
have administrator privileges over connections
for monitoring/debugging/killing other sessions?
(Perhaps this is not a GDA issue at all, but
getting to be a general problem with linux-based
development).

I'm making these statements *not* because I am 
against the gda and related projects, but because
I am desperate for them to succeed.
<asbestos suit off>

Sorry for the ramble, the lack of time to code,
but all I can offer at the moment is (I hope
constructive) criticism.

Regards
David
-- 
David T. Bath bathd@edipost.auspost.com.au
+613 9204 8713 (W) 0418 316 634 (Mbl)




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]