Date: Tue, 26 Jul 1994 14:37:39 -0400
To: Kimberly Carney <kim@MIT.EDU>
Cc: dkk@MIT.EDU, op@MIT.EDU
In-Reply-To: Kimberly Carney's message of Tue, 26 Jul 1994 14:00:29 EDT,
        <9407261800.AA13079@chich.MIT.EDU>
Subject: Re: AFS prdb sync'd with Moira
From: "Richard Basch" <basch@MIT.EDU>

[ Note: make sure that you have the binaries from afsuser copied
  onto your local machine; also make sure that you have all
  scripts (see below) copied locally before you start, since AFS may
  disappear on you.  Alternatively, your machine should not depend
  on the Athena cell for syspacks, root.afs (/afs), etc.
  Make sure you have superuser tokens for all cells that you might need.]

The binaries are now in /moira/sync/ on the Moira server.
Most of the commands must be done on the Moira server.

touch /moira/afs/noafs
(This gives you some grace time, but watch for critical AFS errors after
this happens, as you will have to handle those by hand.)

[ AFS incrementals time out 30 minutes after they start.  After that
  they log a critical error and have to be done by hand! ]
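
To keep an eye on that 30-minute window, something like the following
could be run from a second terminal.  This is only a sketch in POSIX sh:
the function name and the idea of recording a start timestamp are
illustrative assumptions, and it conservatively treats the moment the
lock file was touched as the start of the clock.

```shell
#!/bin/sh
# Sketch only: report how much of the noafs grace period is left.
# GRACE_MINUTES=30 comes from the note above; everything else here
# (names, the timestamp bookkeeping) is hypothetical.

GRACE_MINUTES=30

# grace_remaining START_EPOCH NOW_EPOCH
# Prints the whole minutes left before incrementals start timing out,
# or 0 once the grace period has run out.
grace_remaining() {
    start=$1
    now=$2
    elapsed=$(( (now - start) / 60 ))
    left=$(( GRACE_MINUTES - elapsed ))
    if [ "$left" -lt 0 ]; then
        left=0
    fi
    echo "$left"
}
```

A possible use, assuming your date(1) supports "+%s": save
start=`date +%s` right after the touch, then run
grace_remaining "$start" "`date +%s`" whenever you want to know how
long you have.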
/moira/sync/afssync /var/prdb.moira
(I recommend that you do the following few steps concurrently with this,
as the "noafs" lock file doesn't give you too much grace time.)
[ ^^ This takes roughly 20-40 minutes.]

[ Check all PTS servers to make sure they have consistent versions.
  Note the PTS database version number, and make sure no write
  transactions are in progress.  "udebug -p 7002".]

rcp root@orf:/usr/afs/db/prdb.DB0 /var/prdb.old
(use "udebug <server> -p 7002" before and after to make sure the version

[ Check to make sure that the version number is the same with udebug.]
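
The version comparison can be made less error-prone with a small
helper.  The sketch below (POSIX sh plus awk) assumes you have already
copied each server's database version out of its udebug output by hand
into "host version" pairs; it does not parse udebug itself, since that
output is not shown in this message.  The function name is made up for
illustration.

```shell
#!/bin/sh
# Sketch only: confirm that every server reports the same prdb version.
# Input on stdin: one "host version" pair per line, collected by hand
# from "udebug <server> -p 7002" (collection step is an assumption).

# same_version
# Exits 0 when all versions match; prints each mismatch and exits 1
# otherwise.  Empty input also counts as a failure.
same_version() {
    awk '
        NF < 2 { next }                        # skip blank or malformed lines
        !have  { first = $2 ""; have = 1 }     # remember first version as a string
        $2 != first {
            mismatch = 1
            print "MISMATCH: " $1 " has " $2 ", expected " first
        }
        { seen++ }
        END { exit ((mismatch || seen == 0) ? 1 : 0) }
    '
}
```

Running it once before the rcp and once after, against the same list
of servers, gives the "before and after" check described above.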
/moira/sync/pt_util -x -m -u -g -d /var/prdb.extra -p /var/prdb.old
perl /moira/sync/pt_util.pl < /var/prdb.extra > /var/prdb.extra.sort
(These two commands extract and prepare the personal groups and special
user entries in the old prdb for being reincorporated into the new prdb.)

*** Make sure the "afssync" command has completed ***
cp /var/prdb.moira /var/prdb.new
/moira/sync/pt_util -w -d /var/prdb.extra.sort -p /var/prdb.new
(This almost completes the preparation of the prdb.)
[ ^^ This takes 40 minutes and may take longer; the time grows
  exponentially with the number of personal groups.]

(Save the numbers printed.)

Copy /var/prdb.new to *ALL* the database servers (/usr/afs/db/prdb.new).

The following should be done as quickly as possible...

foreach i ( <db servers> )
    bos shutdown $i ptserver
    bos exec $i "rm /usr/afs/db/prdb.DB*; mv /usr/afs/db/prdb.new /usr/afs/db/prdb.DB0"
end

foreach i ( <db servers> )
    bos restart $i ptserver
end
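
Since this step has to go quickly, it may help to generate the exact
command list ahead of time and read it over before running anything.
The loops above are csh; the sketch below is POSIX sh and only prints
the same bos commands for the hosts you pass in.  The function name is
hypothetical and nothing is executed.

```shell
#!/bin/sh
# Sketch only: print (do not run) the ptserver swap commands from the
# procedure above, in the same order: shut down and swap the database
# on every server first, then restart every server.

# print_swap_commands HOST...
print_swap_commands() {
    for host in "$@"; do
        printf '%s\n' "bos shutdown $host ptserver"
        printf '%s\n' "bos exec $host \"rm /usr/afs/db/prdb.DB*; mv /usr/afs/db/prdb.new /usr/afs/db/prdb.DB0\""
    done
    for host in "$@"; do
        printf '%s\n' "bos restart $host ptserver"
    done
}
```

For example, print_swap_commands orf prill > /tmp/swap.cmds leaves a
file you can check against the list of database servers before the
outage starts.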
Watch the status of the servers using "udebug" to make sure things are
going well... make sure the beacons are working, and that once quorum
is established the servers are resynchronizing their notions of the
databases, that the dbcurrent and up fields all become set, and that the
state goes to 1f.  Also watch out for large rx packet queues on port
7002 using rxdebug, as the fileservers may get excessively backlogged;
restart servers if the congestion remains excessive.

[ Use udebug on prill.... it will take 75 seconds for the pts servers to
  elect a master, and then additional time for the master to propagate
  its database to the rest of the pts servers.]

(If the ids are lower than the saved ones, reset them to the saved
ones using "pts setmax".)
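
The comparison against the saved numbers can be sketched as below
(POSIX sh).  Two assumptions to check before relying on it: that
"pts setmax" takes -user and -group arguments as in the pts
documentation, and that for group ids, which are negative, "lower"
means closer to zero than the saved value.  The function name is
hypothetical, and the function only prints the commands it would
suggest.

```shell
#!/bin/sh
# Sketch only: decide whether the max id counters need to be reset.
# Assumes group ids are negative, so a counter that has fallen behind
# is one that is closer to zero than the saved value.

# setmax_commands SAVED_USER SAVED_GROUP CUR_USER CUR_GROUP
# Prints a "pts setmax" line for each counter that is behind the saved
# value; prints nothing if both are fine.
setmax_commands() {
    saved_user=$1
    saved_group=$2
    cur_user=$3
    cur_group=$4
    if [ "$cur_user" -lt "$saved_user" ]; then
        echo "pts setmax -user $saved_user"
    fi
    if [ "$cur_group" -gt "$saved_group" ]; then
        echo "pts setmax -group $saved_group"
    fi
}
```

For instance, with saved values 20000 and -6000 and current values
19000 and -5000, both counters would be reported as needing a reset.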
pts ex system:administrators
(good spot check, especially since it has special people)
(also spot check one of the personal groups and perhaps something like
the membership of rcmd.ronald-ann)

(You need to remove the lock file you put on.)

***************************************************************************

1. There is also a faster pt_util command for integrating the various personal
groups.  However, it has not been fully verified.  It can be found in the
development sources as pt_util-fast.c.  Feel free to try using this one, but
I would also recommend generating the database the old way just in case...

2. The goal is to minimize the outage and minimize the potential for changes,
so concurrency is highly recommended.

3. Make sure you copy the database to all the protection servers, as the
servers will be more than happy to give "no such user" answers and users
will not be able to reestablish authentic connections without doing

4. Don't do this when you're tired...  With certain mistakes, there may be
no cleanup procedure available.

5. /moira/afs/noafs is only good for 30 minutes.  Keep track of the
critical log, and you may have to do some operations by hand when the
operation is complete.  Also, if requests depend on other requests, they
may be processed out of order, and fail, and may need to be done by hand.

***************************************************************************

(The following is a very old message...)

To: op@MIT.EDU, mar@MIT.EDU
Subject: AFS/Moira sync
From: "Richard Basch" <basch@MIT.EDU>

I have rebuilt the AFS protection database from the information in
Moira and the old prdb (for the special entries and personal groups).  It
has been installed without a problem.  The old prdb is in

prill:/usr/afs/db/prdb.old

As usual, I installed it with no interruption of service (there may have
been a couple minutes when AFS was a bit slow as the protection database
servers were being restarted, but that's it).

The following is the basic procedure I used to create the new prdb...

moira2# /moira/bin/afssync /var/prdb
Doing users: Tue Sep 7 23:59:37 1993
Doing groups: Wed Sep 8 00:16:26 1993
Error adding group system:mit id -101: Entry for id already exists
Error adding group system:authuser id -102: Entry for id already exists
Error adding group system:administrators id -204: Entry for id already exists
Reading/preparing members: Wed Sep 8 00:30:11 1993
Doing members: Wed Sep 8 00:34:22 1993
Done (16591 users, 18144 groups, 23 kerberos, 39097 members): Wed Sep 8 00:41:10 1993

prill# /mit/opssrc/moira/afssync/pt_util -x -m -u -g -d /tmp/xxx
prill# perl /mit/moiradev/pmax/afssync/pt_util.pl < /tmp/xxx > /tmp/prdb.extra

moira2# cp -p prdb prdb.new
moira2# /mit/moiradev/pmax/afssync/pt_util -w -d /var/prdb.extra -p /var/prdb.new

Error while creating who:rune-staff: User or group doesn't exist
Error while creating system:gsipbbin: Entry for id already exists
Error while creating celine:admin: User or group doesn't exist
Error while creating celine:titan: User or group doesn't exist