gdb/gdb.prob

   1 /************************************************************************/
   2 /*
   3 /*                    PROBLEMS WITH THE GDB CODE
   4 /*
   5 /************************************************************************/
   6
   7 * Go through all of wesommers changes in /u1/sms/gdb
   8         - Eye catchers faster
   9         - fastprogress
  10         - signal handling
  11         - some fatal error stuff
  12
  13 * check error return handling on all system calls used by GDB
  14
  15 * terminate_db was never written
  16
  17 * comments on start_performing_db_operation lie
  18
  19 * clean up comments in dbserv.qc
  20
  21 * clean up comments in gdb_db.c
  22
  23 * Fix ptr <--> int casts on semantic routines
  24
  25 * Buffer input through gdb_move_data
  26
  27 * Ingres version 5 support
  28
  29 * Bouncing connections so that there is only one GDB port
  30
  31 * decode of descriptor currently assumes that NULL will be passed if receiving
  32   into unallocated descriptor.  Autos sometimes aren't null.
  33
  34 * Straighten out description of db_query in the library reference.
  35
  36 * Document retrieval limit setting
  37
  38 * Debug retrieval limit setting
  39
  40 * Document -l option on dbserv
  41
  42 * Document user and host fields and how to zap for anonymous use.
  43   Also, indicate risks of trusting the information in those fields.
  44
  45 * There is some inconsistency in the use of tuple vs. tuple descriptor
  46   as argument to the routines like FIELD_OFFSET_IN_TUPLE
  47
  48 * Offsets would be more convenient if from the start of the tuple rather than
  49   from the start of the 0th field.
  50
  51 * Some of the routines defined as macros may not work if semicolons are
  52   used after invocation.  This should be documented or changed.
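
       One common C remedy for the semicolon problem is to wrap a
       multi-statement macro in do { ... } while (0), so that an invocation
       followed by a semicolon forms exactly one statement.  A minimal
       sketch; CLEAR_PAIR and clear_if are hypothetical names used only
       for illustration, not GDB routines:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro (not part of GDB).  The do/while(0) wrapper makes
 * "CLEAR_PAIR(a, b);" a single statement, so it is safe in if/else. */
#define CLEAR_PAIR(a, b) do { (a) = 0; (b) = 0; } while (0)

/* Clears *x and *y when flag is nonzero.
 * Returns 1 if they were cleared, 0 if not. */
int clear_if(int flag, int *x, int *y)
{
    assert(x != NULL && y != NULL);
    if (flag)
        CLEAR_PAIR(*x, *y);     /* this semicolon does not orphan the else */
    else
        return 0;
    return 1;
}
```

       With a bare { ... } body instead, the semicolon after the invocation
       would end the if statement and the else would fail to compile.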
  53
  54 * there is some inconsistency in the semantic routines over which
  55   decode routines presume null values of their targets, and which
  56   ones actively free any existing data.  Likewise for the null
  57   routines.  For the moment, we'll ignore it as an internal issue,
  58   but this should be fixed when it starts causing more trouble than
  59   it's worth.
  60
  61
  62 * should check all uses of register declaration to make sure they
  63   are appropriate
  64
  65 * make sure null values are handled properly when tuples are
  66   initialized.  Looks like the string routines may be easily confused
  67   about non-initialized null values.
  68
  69 * must update documentation and sample programs to indicate requirement
  70   to call the gdb_init routine.  Perhaps there should be a closeup routine
  71   too?
  72
  73 ** create-tuple-descriptor may be mis-computing data-len in the case
  74    where alignment is leaving gaps.  Looks like the gaps are not getting
  75    added in. (I think this was fixed during debugging with Saber.)
  76
  77 * must make a plan to exhaustively test code paths.  Several whole
  78   routines have not been tried at all
  79
  80 * the design for out of band signalling and operation interruption must
  81   be considered more carefully.
  82
  83 * perhaps gdb.h should be split into a piece which is visible to all
  84   users (i.e. lives in /usr/include) and an internal piece used only
  85   for compiling the library.
  86
  87 * we need some way to mark and distinguish inbound and outbound half
  88   sessions.  For the moment, they can be implicit in their position in
  89   the connection data structure, but that seems a bit tacky.  Could
  90   use routine pointers to do the reading and the writing.
  91
  92 * could implement a buffer/flush scheme at some level to avoid repeated
  93   and unnecessary system calls for write/read.  Not clear what level is
  94   best to do this--I'm inclined to do it in the operations themselves, not
  95   in the transport service.
  96
  97 * should come up with a way to encode those types which need a recursive
  98   call on cleanup.  We can probably get rid of null_tuple_strings that
  99   way, and pick up a bit of generality.
 100
 101 * don't yet have a proper architecture for the routines used at the
 102   server end.  Which ones should be exposed and documented to
 103   implementors of new services?  How general should they be?
 104
 105 * CONN_STATUS values should be cleaned up.
 106
 107 * make sure the cancellation function in OPERATION_DATA is handled properly.
 108
 109 * How do we really want to handle predecessor failing conditions?  Should
 110   we purge the queue, as currently claimed, or just attempt the next one?
 111   Probably just attempt the next one.
 112
 113 * select timer handling is being implemented very approximately for now.
 114   We are setting the full timer each time the real select is called, not
 115   accounting for time already elapsed when select is called repeatedly.
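
       One way to account for elapsed time is to fix a deadline once and
       recompute the remaining timeout before every call to select.  A
       minimal sketch of that computation; time_remaining is a hypothetical
       helper, not a GDB routine, and it assumes both inputs are normalized
       (0 <= tv_usec < 1000000):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper (not a GDB routine).  Stores in *left the time
 * remaining from *now until *deadline, clamped to zero once the deadline
 * has passed.  Returns 1 if any time remains, 0 if the deadline expired. */
int time_remaining(const struct timeval *deadline,
                   const struct timeval *now,
                   struct timeval *left)
{
    long sec, usec;

    assert(deadline != NULL && now != NULL && left != NULL);
    sec  = (long)(deadline->tv_sec - now->tv_sec);
    usec = (long)(deadline->tv_usec - now->tv_usec);
    if (usec < 0) {             /* borrow one second */
        sec  -= 1;
        usec += 1000000L;
    }
    if (sec < 0) {              /* deadline already passed */
        left->tv_sec  = 0;
        left->tv_usec = 0;
        return 0;
    }
    left->tv_sec  = sec;
    left->tv_usec = usec;
    return sec > 0 || usec > 0;
}
```

       A caller would compute the deadline once (current time plus the
       requested timeout), then on each pass read the clock, call
       time_remaining, and hand *left to select rather than the full timer.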
 116
 117 * level of indirection on select is really not clear
 118
 119 * in dealing with file descriptor masks passed to functions, one should
 120   probably use the & notation consistently to avoid unnecessary structure
 121   copying.  See especially in the interfaces to gdb_fd_copy, etc. in
 122   gdb_trans.c
 123
 124 * current semantics of op_select imply that lots of manipulation of
 125   operation queues will be required as operations complete and the
 126   lists get logically shorter.  We may want to change the semantics
 127   of op_select and/or provide new operations for adding and deleting
 128   operations from an operation list, possibly with attendant re-allocation.
 129
 130 * how do we do exception handling on the connections?
 131
 132 * gdb_hcon_progress: how to handle cancelling, etc.
 133
 134 * the distinction between initialization and continuation routines
 135   is looking pretty artificial.  Leading to duplicate code in
 136   places like gdb_hcon_progress.
 137
 138 * out of band signalling not implemented.  The sockets are set to null
 139   during creation.
 140
 141 * when starting peer sessions a race condition is built into the current
 142   algorithm.  startup may fail when the two sides are initiated at about
 143   the same time.  Must be fixed for production use.
 144
 145 * start_peer_connection has no way to return a reason for failure.
 146
 147 * connections are currently made to a well known port.   Eventually this
 148   should be fixed.
 149
 150 * one could argue that the creation of the listening socket should be done
 151   once during initialization, not repeatedly.  On the other hand, this
 152   significantly raises the cost for someone who never listens.
 153
 154 * We could automate the activation of gdb_init by checking for a flag
 155   in the appropriate routines.
 156
 157 * define consistent set of op_results, apply them in gdb_ops.c
 158
 159 * synchronous calling of continuation routines may be a mistake?  I don't
 160   think so, since the length of the chain is statically limited.  Would
 161   be a mistake in the case of extensive re-queueing, but then we go back
 162   to the queue manager, NO?
 163
 164 * queue_operation takes a half connection as an argument, but they are
 165   not properly described in the spec.   There should be macros to
 166   extract the input and output half connections from a connection.
 167
 168 * our writeup should note ease of supporting shared state servers
 169   (e.g. a very live conferencing system?)
 170
 171 * for performance, changes should be made to minimize the number of read
 172   and write system calls done.  This could (perhaps) be done with look-behind
 173   on the outbound and buffered read on the inbound queues, used in conjunction
 174   with the iovec technique.  Should improve performance significantly.
 175   May also need a 'block' hint for use on a half connection which allows
 176   an application queueing lots of little requests to get them batched
 177   together.
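
       The iovec technique mentioned here can be sketched with writev,
       which hands several buffers to the kernel in one system call.  The
       helper names below are hypothetical and the header/body split is only
       an assumed example of two pieces that would otherwise cost two writes:

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not a GDB routine): send a header and a body with
 * a single writev call instead of two write calls.  Returns the number of
 * bytes written, or -1 on error.  A short write is possible; this sketch
 * leaves retrying to the caller. */
ssize_t send_two_parts(int fd, const void *hdr, size_t hdr_len,
                       const void *body, size_t body_len)
{
    struct iovec iov[2];

    assert(hdr != NULL && body != NULL);
    iov[0].iov_base = (void *)hdr;      /* iov_base is not const-qualified */
    iov[0].iov_len  = hdr_len;
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = body_len;
    return writev(fd, iov, 2);
}

/* Demonstration over a pipe: returns 1 if both parts arrive back to back. */
int demo_batched_write(void)
{
    int fds[2];
    char buf[16];
    ssize_t n;
    int ok;

    if (pipe(fds) != 0)
        return 0;
    n = send_two_parts(fds[1], "HDR:", 4, "payload", 7);
    ok = (n == 11
          && read(fds[0], buf, sizeof buf) == 11
          && memcmp(buf, "HDR:payload", 11) == 0);
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```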
 178
 179 * we're not handling the case where select returns -1 in cnsel, particularly
 180   for EBADF indicating a closed descriptor.
 181
 182 * start_peer_connection should use randomized waits to re-try connect and
 183   avoid the current startup window.
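
       A randomized wait could be as simple as the sketch below.
       random_wait_ms is a hypothetical helper, and the choice of
       milliseconds and of a half-open range is an assumption made for
       illustration:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not a GDB routine): pick a wait, in milliseconds,
 * somewhere in [min_ms, max_ms).  rand() % span has a slight modulo bias,
 * which does not matter for spreading out connection retries. */
unsigned random_wait_ms(unsigned min_ms, unsigned max_ms)
{
    unsigned span;

    assert(min_ms < max_ms);
    span = max_ms - min_ms;
    return min_ms + (unsigned)rand() % span;
}
```

       Each side would need a different seed (via srand, for example one
       derived from its process id); otherwise both peers draw the same
       sequence of waits and the startup window remains.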
 184
 185 * need a better way to return errors on things like starting a connection.
 186   should scan all of the synchronous interfaces to find out where this
 187   should be used.
 188
 189 * it's possible that HCON_BUSY should be replaced by a mechanism to lock
 190   all of gdb_progress
 191
 192 * most operations should have cancellation routines.
 193
 194 * when tuples are sent for server/client stuff, make sure the descriptors
 195   are properly destroyed when not needed
 196
 197 * should integrate deletion as an operation on a type, along with
 198   optimization for nop'ing it.  Should then go through all of code
 199   which does things like null-tuple-strings and straighten it out.
 200
 201 * document start_sending_object start_receiving_object
 202
 203 * anything which starts an operation should make sure that the
 204   status of the supplied operation is NOT_RUNNING.  There is a
 205   nasty condition which results when the same operation is started
 206   twice.
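
       The check could look roughly like this.  The types and names below
       are hypothetical stand-ins; only the NOT_RUNNING state comes from
       the note above, and the real operation structure is not shown here:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GDB operation type. */
enum sk_status { SK_NOT_RUNNING, SK_RUNNING, SK_COMPLETE };

struct sk_operation {
    enum sk_status status;
};

/* Refuses to start an operation that is already in progress.
 * Returns 0 on success, -1 if the operation was not NOT_RUNNING. */
int sk_start_operation(struct sk_operation *op)
{
    assert(op != NULL);
    if (op->status != SK_NOT_RUNNING)
        return -1;              /* started twice: report, do not re-queue */
    op->status = SK_RUNNING;
    return 0;
}
```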
 207
 208 * disable GDB_PORT because it encourages conflict among applications
 209
 210 * implement validate routines in create_forking_server
 211
 212 * check userinfod and inetd to make sure that server environment is
 213   appropriate.
 214
 215 * build async versions of db ops, at both client and server
 216
 217 * use of "string" vs "STRING" in documentation is confusing
 218
 219 * macros for getting at fields in tuples
 220
 221 * document non-portabilities
 222         - 4.2 dependencies
 223         - representation of data by enc/dec
 224
 225 * are text lengths being handled well on queries? (Looks ok, 9/17/86)
 226
 227 * lots of error checking is missing in dbserv
 228
 229 * Confusion in documentation as to whether receive_object allocates
 230   a new relation (for example)
 231
 232 * delete relation needs a way to delete non-contiguous data other
 233   than strings
 234
 235 * syntax checking of queries
 236
 237 * return codes for operations, especially DB operations, are a mess.
 238
 239 * we should have some equivalent of perror
 240
 241 * we can possibly give some information about where we were at
 242   the time of a failure if we keep a logical stack.
 243
 244 * gdb_init should be callable multiple times
 245
 246 * i.d. fields are currently implemented with strings, which is very
 247   slow.  Should re-implement with integers as magic numbers
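
       A sketch of integer i.d. fields, combined with the idea (also noted
       in this list) of changing the i.d. upon deletion.  All names and
       magic values below are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical magic numbers; the values are arbitrary. */
#define SK_TUPLE_ID      0x54555031L
#define SK_TUPLE_DELETED 0x44454144L

struct sk_tuple {
    long id;                    /* integer eye catcher instead of a string */
    int  value;
};

void sk_init_tuple(struct sk_tuple *t)
{
    assert(t != NULL);
    t->id    = SK_TUPLE_ID;
    t->value = 0;
}

/* One integer compare replaces a string compare. */
int sk_is_tuple(const struct sk_tuple *t)
{
    return t != NULL && t->id == SK_TUPLE_ID;
}

/* Marks the tuple as deleted so that later checks reject stale pointers.
 * Releasing the storage is left to the caller. */
void sk_retire_tuple(struct sk_tuple *t)
{
    assert(sk_is_tuple(t));
    t->id = SK_TUPLE_DELETED;
}
```

       The check only helps while the storage is still addressable; once
       the memory has been freed, reading the id is itself an error.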
 248
 249 * must figure out handling of errors on queued database requests.
 250   which ones cause a tear-down.  Are there problems dequeueing
 251   the appropriate operations in the right order?  Could things get
 252   out of sync if other things are queued ahead?  Seems to require
 253   a pretty good understanding of how things get cancelled implicitly
 254   and explicitly.  Maybe all explicit cancellations should be done
 255   using terminate_db.  There is also some question about how
 256   careful we have to be in handling the explicit cancellation of
 257   a single db operation, or whether that should be supported.  Not
 258   clear that we have enough of a bracket context to do that cleanly.
 259
 260 * document preempt_and_start_receiving_object, also, maybe should
 261   do one for sending.
 262
 263 * Notes should be provided on create/delete semantics.  Current is:
 264         tuple_descs are reference counted.  Your creates should
 265         match your deletes.   Same is true for relations and operations.
 266         Likewise tuples EXCEPT:  if you create a tuple, add it to
 267         a relation, and delete the relation, the tuple itself is
 268         deleted, and all dangling pointers to it are invalid.
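
       The "creates should match deletes" rule for a reference-counted
       object can be sketched as follows.  This is an illustration under
       assumed names, not the GDB implementation of tuple descriptors:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted descriptor, for illustration only. */
struct sk_desc {
    int refs;
};

static int sk_live_descs = 0;   /* descriptors not yet freed */

int sk_live_count(void)
{
    return sk_live_descs;
}

/* Create: the caller holds the first reference.
 * Returns NULL if allocation fails. */
struct sk_desc *sk_create_desc(void)
{
    struct sk_desc *d = malloc(sizeof *d);

    if (d == NULL)
        return NULL;
    d->refs = 1;
    sk_live_descs++;
    return d;
}

/* Take an additional reference; it must later be matched by a delete. */
void sk_reference_desc(struct sk_desc *d)
{
    assert(d != NULL && d->refs > 0);
    d->refs++;
}

/* Delete: drop one reference; storage is freed when the last one goes. */
void sk_delete_desc(struct sk_desc *d)
{
    assert(d != NULL && d->refs > 0);
    if (--d->refs == 0) {
        sk_live_descs--;
        free(d);
    }
}
```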
 269
 270 * Must assign version numbers to all programs, documents and files
 271
 272 * gdb_ugd should reference K&R
 273
 274 * should make a CSTRING_T to represent a null terminated string by its pointer
 275
 276 * should be able to limit the number of tuples retrieved.
 277
 278 * Perhaps should be able to interrupt ongoing database operations, or
 279   in any case, the restriction on severing the connection should be
 280   documented.
 281
 282 * send_object and receive_object should return OP_SUCCESS not OP_COMPLETE
 283
 284 * create_type is documented, but it does not exist.  Also, it is mentioned
 285   in the language reference, but not really written up properly.
 286
 287 * add missing chapters to the user's guide
 288
 289 * Steve: forward reference in user guide to async
 290
 291 * Steve: diagrams of general system structure and flow
 292
 293 * More reference material in user guide
 294         - GDB data types
 295         - Other structures
 296
 297 * what are rules about memory allocation
 298         - what does it
 299
 300 * confusion over pointers and macros in documentation
 301
 302 * Missing chapters
 303
 304 * i.d. fields should be changed upon deletion to try and catch misuse
 305   of deleted data
 306
 307 * explicit host addresses are not accepted
 308
 309 * debugging forked servers is very difficult
 310
 311 * there should be more detailed failure codes on send_object receive_object
 312
 313 /************************************************************************/
 314 /*
 315 /*                      "Solved" Problems
 316 /*
 317 /************************************************************************/
 318
 319 * need a routine to wait for all the operations in a list
 320   should be easy to build out of op_sel (DONE)
 321
 322 * there should be a macro called TERMINATED for internal use which
 323   would be true when an operation was completed or cancelled.  This
 324   should be used by most of the routines which check for completion,
 325   in anticipation of the day when error recovery is done properly.
 326   (DONE...name of the macro is OP_DONE)
 327
 328 * see if select can be used with a listening socket (DONE...it works)
 329
 330 * we need a send_object and receive_object as well as start_sending_object
 331   and start_receiving_object (DONE, but documentation is missing)
 332
 333 * should put handling in select for the exceptfd case to catch closing
 334   connections, and likewise should check for errors on read/write.  Maybe
 335   only errors on read/write should be checked.  (DONE..exceptfd's are
 336   irrelevant...I think they're for getting signals, anyway, we do the
 337   right thing now when a socket fails by closing the corresponding
 338   connection.)
 339
 340 * make sure port numbers are handled OK in the server/client model
 341   (DONE)
 342
 343 * the database server MUST delete its relations, and we must decide
 344   how tuple descriptor reclamation is to be done (Done)
 345
 346 * make db_alloc and db_free replaceable (Done)
 347
 348 * James had problems with create_tuple_descriptor in that he got
 349   one of the parameters duplicated (not a gdb problem)
 350
 351 * async queries and db_ops (Done)
 352
 353 * we should put eye catchers into structured data and check it. (done)
 354
 355 * db_query is not nearly asynchronous enough (Done)
 356
 357 * Strings:  James forgot to initialize strings after creating tuple.
 358
 359 * we are missing a delete_relation routine, which would have to
 360   recursively delete all tuples after calling delete_tuple_strings
 361   or whatever on them.  This should be called by the decode relation
 362   and the null relation semantic routines.  (Done)
 363
 364 * naming inconsistency between "new-tuple" and "create-_____" all others (Done)
 365
 366
 367 /************************************************************************/
 368 /*
 369 /*              Problems in Providing SQL Support
 370 /*
 371 /************************************************************************/
 372
 373 * Character strings in SQL limited to 255 bytes, in literal at least
 374
 375