Commit | Line | Data |
---|---|---|
3d8d4b36 | 1 | Date: Tue, 26 Jul 1994 14:37:39 -0400 |
2 | To: Kimberly Carney <kim@MIT.EDU> | |
3 | Cc: dkk@MIT.EDU, op@MIT.EDU | |
4 | In-Reply-To: Kimberly Carney's message of Tue, 26 Jul 1994 14:00:29 EDT, | |
5 | <9407261800.AA13079@chich.MIT.EDU> | |
6 | Subject: Re: AFS prdb sync'd with Moira | |
7 | From: "Richard Basch" <basch@MIT.EDU> | |
8 | ||
9 | [ Note: make sure that you have the binaries from afsuser copied | |
10 | locally onto your local machine; also make sure that you have all | |
11 | scripts (see below) copied locally before you start, since AFS may | |
12 | disappear on you. Alternatively, your machine should not | |
13 | depend on the Athena cell for syspacks, root.afs (/afs), etc. | |
14 | Make sure you have superuser tokens for all cells that you might need.] | |
15 | ||
16 | The binaries are now in /moira/sync/ on the Moira server. | |
17 | Most of the commands must be done on the Moira server. | |
18 | ||
19 | touch /moira/afs/noafs | |
20 | (This gives you some grace time, but watch for critical AFS errors after | |
21 | this happens, as you will have to handle those by hand.) | |
22 | ||
23 | [ AFS incrementals time out 30 minutes after they start. | |
24 | After that, they log a critical error and then | |
25 | have to be done by hand! ] | |
26 | ||
27 | /moira/sync/afssync /var/prdb.moira | |
28 | (I recommend that you do the following few steps concurrently with this, | |
29 | as the "noafs" lock file doesn't give you too much grace time.) | |
30 | [ ^^ This takes roughly 20-40 minutes.] | |
31 | ||
32 | [ Check all PTS servers to make sure they have a consistent version. | |
33 | Note the PTS database version number and make sure no write | |
34 | transactions are in progress (use "udebug <server> -p 7002").] | |
35 | ||
36 | rcp root@orf:/usr/afs/db/prdb.DB0 /var/prdb.old | |
37 | (use "udebug <server> -p 7002" before and after to make sure the version | |
38 | hasn't changed.) | |
39 | ||
40 | [ Check to make sure that the version number is the same with udebug.] | |
41 | ||
42 | /moira/sync/pt_util -x -m -u -g -d /var/prdb.extra -p /var/prdb.old | |
43 | perl /moira/sync/pt_util.pl < /var/prdb.extra > /var/prdb.extra.sort | |
44 | (These two commands extract and prepare the personal groups and special | |
45 | user entries in the old prdb for being reincorporated into the new prdb.) | |
46 | ||
47 | *** Make sure the "afssync" command has completed *** | |
48 | cp /var/prdb.moira /var/prdb.new | |
49 | /moira/sync/pt_util -w -d /var/prdb.extra.sort -p /var/prdb.new | |
50 | (This almost completes the preparation of the prdb.) | |
51 | [ ^^ This takes 40 minutes and may take longer; the time grows | |
52 | exponentially with the number of personal groups.] | |
53 | ||
54 | pts listmax | |
55 | (Save the numbers printed.) | |
56 | ||
57 | copy /var/prdb.new to *ALL* the database servers (/usr/afs/db/prdb.new) | |
58 | ||
59 | The following should be done as quickly as possible... | |
60 | ||
61 | foreach i ( <db servers> ) | |
62 | bos shutdown $i ptserver | |
63 | bos exec $i "rm /usr/afs/db/prdb.DB*; mv /usr/afs/db/prdb.new /usr/afs/db/prdb.DB0" | |
64 | end | |
65 | ||
66 | foreach i ( <db servers> ) | |
67 | bos restart $i ptserver | |
68 | end | |
69 | ||
70 | Watch the status of the servers using "udebug" to make sure things are | |
71 | going well: make sure the beacons are working and that, once quorum | |
72 | is established, the servers are resynchronizing their notions of the | |
73 | databases, the dbcurrent and up fields all become set, and the | |
74 | state goes to 1f. Also watch for large rx packet queues on port | |
75 | 7002 using rxdebug, as the fileservers may get excessively backlogged, | |
76 | and restart servers if the congestion remains excessive. | |
77 | ||
78 | [ Use udebug on prill. It will take 75 seconds for the pts servers to | |
79 | elect a master, and then additional time for the master to propagate | |
80 | its database to the rest of the pts servers.] | |
81 | ||
82 | pts listmax | |
83 | (If the ids are lower than the saved ones, reset them to | |
84 | the saved values using "pts setmax".) | |
85 | ||
86 | pts ex system:administrators | |
87 | (good spot check, especially since it has special people) | |
88 | (also spot check one of the personal groups and perhaps something like | |
89 | the membership of rcmd.ronald-ann) | |
90 | ||
91 | rm /moira/afs/noafs | |
92 | (You need to remove the lock file you put on.) | |
93 | ||
94 | -Richard | |
95 | ||
96 | *************************************************************************** | |
97 | ||
98 | NOTES: | |
99 | ||
100 | 1. There is also a faster pt_util command for integrating the various personal | |
101 | groups. However, it has not been fully verified. It can be found in the | |
102 | development sources as pt_util-fast.c. Feel free to try using this one, but | |
103 | I would also recommend generating the database the old way just in case... | |
104 | ||
105 | 2. The goal is to minimize the outage and minimize the potential for changes, | |
106 | so concurrency is highly recommended. | |
107 | ||
108 | 3. Make sure you copy the database to all the protection servers, as the | |
109 | servers will otherwise be more than happy to give "no such user" answers and users | |
110 | will not be able to reestablish authentic connections without doing | |
111 | "aklog -force". | |
112 | ||
113 | 4. Don't do this when you're tired... There may be no cleanup procedure | |
114 | available for certain mistakes. | |
115 | ||
116 | 5. /moira/afs/noafs is only good for 30 minutes. Keep track of the | |
117 | critical log, as you may have to do some operations by hand when the | |
118 | procedure is complete. Also, requests that depend on other requests | |
119 | may be processed out of order, fail, and need to be done by hand. | |
120 | ||
121 | *************************************************************************** | |
122 | ||
123 | (The following is a very old message...) | |
124 | ||
125 | ||
126 | To: op@MIT.EDU, mar@MIT.EDU | |
127 | Cc: tjm@MIT.EDU | |
128 | Subject: AFS/Moira sync | |
129 | From: "Richard Basch" <basch@MIT.EDU> | |
130 | ||
131 | ||
132 | I have rebuilt the AFS protection database from the information in | |
133 | Moira and the old prdb (for the special entries and personal groups). It | |
134 | 
134 | has been installed without a problem. The old prdb is in | |
135 | ||
136 | prill:/usr/afs/db/prdb.old | |
137 | ||
138 | As usual, I installed it with no interruption of service (there may have | |
139 | been a couple minutes when AFS was a bit slow as the protection database | |
140 | servers were being restarted, but that's it). | |
141 | ||
142 | The following is the basic procedure I used to create the new prdb... | |
143 | ||
144 | -Richard | |
145 | ||
146 | ||
147 | moira2# /moira/bin/afssync /var/prdb | |
148 | Doing users: Tue Sep 7 23:59:37 1993 | |
149 | Doing groups: Wed Sep 8 00:16:26 1993 | |
150 | Error adding group system:mit id -101: Entry for id already exists | |
151 | Error adding group system:authuser id -102: Entry for id already exists | |
152 | Error adding group system:administrators id -204: Entry for id already exists | |
153 | Reading/preparing members: Wed Sep 8 00:30:11 1993 | |
154 | Doing members: Wed Sep 8 00:34:22 1993 | |
155 | Done (16591 users, 18144 groups, 23 kerberos, 39097 members): Wed Sep 8 00:41:10 1993 | |
156 | ||
157 | prill# /mit/opssrc/moira/afssync/pt_util -x -m -u -g -d /tmp/xxx | |
158 | prill# perl /mit/moiradev/pmax/afssync/pt_util.pl < /tmp/xxx > /tmp/prdb.extra | |
159 | ||
160 | moira2# cd /var | |
161 | moira2# cp -p prdb prdb.new | |
162 | moira2# /mit/moiradev/pmax/afssync/pt_util -w -d /var/prdb.extra -p /var/prdb.new | |
163 | Ubik Version is: 0.0 | |
164 | Error while creating who:rune-staff: User or group doesn't exist | |
165 | Error while creating system:gsipbbin: Entry for id already exists | |
166 | Error while creating celine:admin: User or group doesn't exist | |
167 | Error while creating celine:titan: User or group doesn't exist | |
168 |
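
The two "pts listmax" steps in the first message (save the numbers before the swap, compare them after the restart) can be sketched as a small helper. This is only an illustration, not part of the original procedure. It is written in POSIX sh rather than the csh used for the foreach loops, the function names `parse_listmax` and `setmax_commands` are hypothetical, and it assumes that `pts listmax` prints one line of the form "Max user id is N and max group id is -M." It also assumes that "lower" means a smaller number for the user counter and a less negative number for the group counter, since AFS group ids are negative.

```sh
#!/bin/sh
# Sketch only: compare a saved "pts listmax" line with the current one.
# Assumed line format: "Max user id is 1234 and max group id is -567."

# Print "<user max> <group max>" parsed from one listmax line.
# Prints nothing if the line does not match the assumed format.
parse_listmax() {
    printf '%s\n' "$1" |
        sed -n 's/^Max user id is \(-\{0,1\}[0-9][0-9]*\) and max group id is \(-\{0,1\}[0-9][0-9]*\)\.\{0,1\}$/\1 \2/p'
}

# Usage: setmax_commands "<saved listmax line>" "<current listmax line>"
# Prints the "pts setmax" commands needed to restore the saved counters,
# or nothing if the current counters are already at or past the saved ones.
# The commands are only printed, never executed.
setmax_commands() {
    set -- $(parse_listmax "$1") $(parse_listmax "$2")
    [ $# -eq 4 ] || { echo "cannot parse listmax output" >&2; return 1; }
    saved_user=$1 saved_group=$2 cur_user=$3 cur_group=$4

    # User ids grow upward, so a smaller current value means the counter regressed.
    [ "$cur_user" -lt "$saved_user" ] && echo "pts setmax -user $saved_user"

    # Group ids grow downward (more negative), so the comparison is reversed.
    [ "$cur_group" -gt "$saved_group" ] && echo "pts setmax -group $saved_group"

    return 0
}
```

For example, saving the output with `saved=$(pts listmax)` before the servers are shut down and running `setmax_commands "$saved" "$(pts listmax)"` after they are back would print the commands to review and run by hand. Printing instead of executing keeps the operator in the loop, which seems in keeping with the cautions in the notes.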