Commit | Line | Data |
---|---|---|
1 | @device(imprint10) | |
2 | @make(Report, Form 1) | |
3 | @style(hyphenation on) | |
4 | @style(BindingMargin 0) | |
5 | @style(justification off) | |
6 | @style(FontFamily ComputerModernRoman10) | |
7 | @style(footnotes "") | |
8 | @begin(center) | |
9 | @majorheading(GDB) | |
10 | @heading(Global Databases for Project Athena) | |
11 | @subheading(Noah Mendelsohn) | |
12 | @subheading(April 17, 1986) | |
13 | @end(center) | |
14 | @blankspace(1 cm) | |
15 | This note is intended as a brief introduction to the Global Database | |
16 | (GDB) system being | |
17 | developed at MIT Project Athena. | |
18 | GDB is an ongoing effort, still in its early stages, to provide the | |
19 | services of a high-performance shared relational database to the | |
20 | heterogeneous systems comprising the Athena network. Specifications | |
21 | have been developed for a set of library routines to be used by | |
22 | @i[clients] to access the database. | |
23 | Current plans are to use the Ingres relational database product from | |
24 | RTI as a local data manager, but to support access via the client | |
25 | library from any Berkeley Unix@+[TM]@foot(@+[TM]Unix is a trademark of | |
26 | AT&T Bell Laboratories) system in the internet. Though early | |
27 | versions will manage only a single copy of any given relation, | |
28 | replication may be added at some point in the future. | |
29 | In the meantime the client library provides a uniform framework for | |
30 | writing database applications at Athena. | |
31 | ||
32 | While we were designing the client library, it became apparent that many of its | |
33 | underlying services for structured data storage and transmission would | |
34 | be of value for a variety of applications. Most of these interfaces | |
35 | have been exposed, and the GDB project has undertaken as a secondary | |
36 | goal the development of these simple services for structured data | |
37 | maintenance and transmission. | |
38 | ||
39 | @section(Raison d'etre@+[1]@foot[@+[1]with apologies for lack of accents in the | |
40 | font!]) | |
41 | ||
42 | The GDB project was motivated by the observation that Athena | |
43 | applications tend to exploit the computational and display services of | |
44 | the system much more effectively than they use the network. | |
45 | Furthermore, | |
46 | those applications which do use the network tend to have strong | |
47 | machine type affinities, running comfortably on either a Vax or an | |
48 | RT/PC, | |
49 | but rarely both. Indeed, the @i[strategic] Athena database system is | |
50 | currently unavailable on the RT/PCs. | |
51 | ||
52 | Of the many unexplored uses of the network, globally accessible | |
53 | databases seem to have great value in a variety of disciplines, and | |
54 | they are also badly needed for certain aspects of Athena | |
55 | administration. By providing well-architected services for global | |
56 | data sharing, we hope to achieve at least two goals: (1) set the | |
57 | precedent that user-written applications and Athena-supplied services, | |
58 | like @b[madm], @b[chhome], and @b[passwd], run compatibly | |
59 | from any machine in the network, and (2) encourage | |
60 | the development of new database applications by eliminating the need | |
61 | for individual projects and departments to develop their own | |
62 | transmission and encapsulation protocols. | |
63 | ||
64 | @section(Implementation Goals) | |
65 | ||
66 | The following goals have been established for the architecture and | |
67 | implementation of GDB: | |
68 | ||
69 | @begin(itemize) | |
70 | Access to databases stored on incompatible machines (e.g. RT/PC to | |
71 | Vax) should be supported transparently. | |
72 | ||
73 | Multiple databases, possibly at several sites, should be accessible | |
74 | simultaneously. The ability to work on several databases concurrently | |
75 | is desirable. | |
76 | ||
77 | Appropriate facilities for managing structured data returned from the | |
78 | database should be provided for programmers (e.g. access to fields by | |
79 | name). | |
80 | ||
81 | Asynchronous operation should be supported, for several reasons: | |
82 | @begin(itemize) | |
83 | Required for control of simultaneous access to multiple databases. | |
84 | ||
85 | Needed for graceful interruption of long-running or erroneous | |
86 | requests. | |
87 | ||
88 | Facilitates pipelining of requests, thereby maximizing overlap of | |
89 | server and client processing. | |
90 | @end(itemize) | |
91 | ||
92 | When the internal interfaces used for session control and data | |
93 | transmission can be generalized without adding unnecessarily to their | |
94 | complexity, then those interfaces should be documented and exposed. | |
95 | @end(itemize) | |
96 | ||
97 | @section(Implementation Strategy) | |
98 | ||
99 | Several approaches to achieving these goals were considered, and an | |
100 | implementation strategy has been chosen. | |
101 | ||
102 | One approach to achieving the required function would be to rely on | |
103 | the appearance of RTI products containing the necessary facilities. | |
104 | At the very least, we would need a full-function Ingres port to the | |
105 | 4.2 system on the RT/PC. RTI would further have to extend Ingres for | |
106 | access to databases through the internet, and they would have to | |
107 | support such access across multiple machine types. These extensions | |
108 | would give us a core of function suitable for limited application, | |
109 | though we would have to see whether flexibility and performance were | |
110 | truly appropriate for our needs. If RTI should come forward with a | |
111 | commitment to produce these products within the next few months, then | |
112 | the need for the libraries described herein might not be so great. | |
113 | Lacking such products from RTI, it seems essential that we carry | |
114 | forward with a strategy for database access from @i[all] of the | |
115 | workstations in the Athena network. | |
116 | ||
117 | Having decided to do at least some of the necessary work ourselves, | |
118 | we have several implementation strategies to choose from. One, which is | |
119 | currently being pursued by Roman Budzianowski, is to interpose the | |
120 | appropriate transmission services between the RTI Ingres front end and | |
121 | back end. This technique has a number of interesting advantages, and | |
122 | some disadvantages. The primary advantage is the ability to run | |
123 | existing Ingres applications, including some of the forms and query | |
124 | facilities, through the network. Also, Roman reports that he has | |
125 | succeeded in running an interesting subset of applications without too | |
126 | much effort. The disadvantages of Roman's approach are the lack of | |
127 | a strategy for supporting non-Vax machines until RTI comes out with | |
128 | the appropriate base products, and the dependence on undocumented | |
129 | interfaces. | |
130 | In some cases, the front end and the back end share files, while | |
131 | in others they seem to exchange signals. It remains to be seen how successfully | |
132 | we can divine these undocumented interfaces, how stable they remain | |
133 | over time, and whether--given an RTI Ingres on the RT/PC--we can | |
134 | figure out how to do the right byte swapping on the binary data sent | |
135 | through Ingres' pipes and files. Our conclusion is that Roman's | |
136 | effort should continue, because it can | |
137 | achieve valuable results without excessive effort. | |
138 | Nevertheless, this scheme falls short of our requirement for balanced | |
139 | support of all the machines in the Athena network, so we recommend an | |
140 | alternate implementation as the primary base for Athena application | |
141 | development. | |
142 | ||
143 | A machine-independent access method for relational databases could be | |
144 | constructed in many different ways. One technique we have considered | |
145 | and rejected is to base an implementation on the RPC prototype | |
146 | developed by Steve Miller and Bob Souza. While RPC is | |
147 | convenient, and the prototype appears to be of very high quality, it | |
148 | fails to meet our needs in several crucial areas. We have concluded | |
149 | that asynchronous interaction between clients and servers is | |
150 | essential for performance, for parallel execution of multiple queries, | |
151 | and for interruptibility of ongoing operations. Synchronous RPC seems | |
152 | ill-suited to these requirements. A secondary concern is the lack of | |
153 | support for procedure calls across heterogeneous architectures in the | |
154 | current version of the RPC prototype. The right hooks are supposedly | |
155 | there, but the necessary alignment and type conversion routines have | |
156 | not been built. Indeed, the prototype has yet to be ported to the | |
157 | RT/PC. | |
158 | ||
159 | As a result of this analysis, we have designed a system which uses RTI | |
160 | Ingres for the things it does (or purports to do) well, and we have | |
161 | added a flexible, asynchronous transport mechanism for transmission of | |
162 | structured data between heterogeneous processors. The specification, | |
163 | outlined in a separate document, includes libraries for creation and | |
164 | management of tuples and relations in virtual memory, along with a | |
165 | simple mechanism for typing the fields comprising a tuple. Layered | |
166 | upon these are services for transmitting fields, tuples and relations | |
167 | through the internet, doing the necessary conversions and | |
168 | re-alignments when moving between incompatible machines. These | |
169 | services, in turn, are used by a library which provides almost | |
170 | all of the services of Ingres EQUEL to clients throughout the network. | |
171 | ||
172 | @section(Project Status) | |
173 | ||
174 | The specification for the interfaces to the client library routines is | |
175 | available in draft form and is now being refined. In parallel, design | |
176 | is proceeding on the protocol to be used between the sites, and on the | |
177 | related software structures used to encapsulate and parse the | |
178 | transmitted data. The design does include a simple but flexible | |
179 | proposal for managing asynchronous activities. Coding will start soon | |
180 | on those pieces of the library which seem to be stable; refinement of | |
181 | the other parts will take a few more weeks. While our rate of | |
182 | progress will depend greatly on the number of people doing the work | |
183 | and on their other responsibilities (neither of which is clear at | |
184 | this time), I'm optimistic that a basic implementation will start | |
185 | showing signs of life within a couple of months, with polishing taking | |
186 | a bit longer. |
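
To make the layering described under Implementation Strategy more concrete, the following is a minimal sketch in C of two of the ideas mentioned there: a tuple descriptor whose typed fields can be looked up by name, and a machine-independent encoding for an integer field so that a Vax and an RT/PC can exchange it without ad hoc byte swapping. Every name here (`field_type`, `field_desc`, `tuple_desc`, `field_index`, `put_int32`, `get_int32`) is hypothetical and chosen only for illustration. The actual GDB interfaces are defined in the separate specification and may differ substantially, and the choice of most-significant-byte-first order on the wire is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical field types; the real GDB type system is defined in its
 * own specification and is not reproduced here. */
enum field_type { FT_INT32, FT_STRING };

/* One named, typed field of a tuple (illustrative layout only). */
struct field_desc {
    const char *name;
    enum field_type type;
};

/* Describes the fields that make up a tuple. */
struct tuple_desc {
    size_t nfields;
    const struct field_desc *fields;
};

/* Return the index of the field called `name`, or -1 if there is none. */
int field_index(const struct tuple_desc *td, const char *name)
{
    assert(td != NULL && name != NULL);
    for (size_t i = 0; i < td->nfields; i++) {
        if (strcmp(td->fields[i].name, name) == 0)
            return (int)i;
    }
    return -1;
}

/* Encode a 32-bit integer most significant byte first, whatever the
 * byte order of the sending machine (assumed wire format). */
void put_int32(unsigned char out[4], int32_t value)
{
    uint32_t u;
    memcpy(&u, &value, sizeof u);   /* reinterpret the bits as unsigned */
    out[0] = (unsigned char)(u >> 24);
    out[1] = (unsigned char)(u >> 16);
    out[2] = (unsigned char)(u >> 8);
    out[3] = (unsigned char)u;
}

/* Decode the same format back into the receiver's native representation. */
int32_t get_int32(const unsigned char in[4])
{
    uint32_t u = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    int32_t value;
    memcpy(&value, &u, sizeof value);
    return value;
}
```

Because the encoder and decoder work on individual bytes through shifts rather than copying the integer's memory image onto the wire, the same source compiles to correct code on either byte order. A client would use something like `field_index` to locate a field by name in a descriptor, then apply the conversion appropriate to that field's type when sending or receiving the tuple.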