The executables are in /moira/bin/ on the Moira server, with sources
in /mit/moiradev/src/afssync/. Most of the commands are run on the
Moira server.

#### Set up a workspace ####

    mkdir -p /moira/sync
    cd /moira/sync

#### This is preparation for the resync, to save non-Moira users. ####
First, get a recent copy of the prdb, and extract non-Moira entries:

    /moira/bin/udebug prill -port 7002
    rcp -px root@prill:/usr/afs/db/prdb.DB0 prdb.old
    /moira/bin/udebug prill -port 7002
If the two udebug runs show that the database version changed during
the copy, repeat all three commands until it does not.
(udebug can be found in /usr/athena/bin; "prill" here and below is some
DB server.)
(Also check for "0 of them for write" at the end of the udebug output.
It might matter.)
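
The copy-and-compare step above can be wrapped in a small retry loop. This is only a sketch in Bourne shell: copy_when_stable and the two helper names in the usage comment are hypothetical and not part of Moira or AFS. The status helper is assumed to print only text that stays constant while the database is unchanged (raw udebug output also contains timing information, so it has to be filtered first; the grep shown is a guess at a suitable filter).

```shell
# copy_when_stable STATUS_CMD COPY_CMD [TRIES]
# Runs STATUS_CMD, then COPY_CMD, then STATUS_CMD again, and succeeds only
# when both status outputs are identical; otherwise retries (default 5 tries).
copy_when_stable() {
    status_cmd=$1
    copy_cmd=$2
    tries=${3:-5}
    while [ "$tries" -gt 0 ]; do
        before=$("$status_cmd")
        "$copy_cmd" || return 1
        after=$("$status_cmd")
        if [ "$before" = "$after" ]; then
            return 0
        fi
        tries=$((tries - 1))
    done
    return 1
}

# Hypothetical usage (assumes the version appears on lines containing "version"):
#   db_version() { /moira/bin/udebug prill -port 7002 | grep -i version; }
#   fetch_prdb() { rcp -px root@prill:/usr/afs/db/prdb.DB0 prdb.old; }
#   copy_when_stable db_version fetch_prdb || echo "prdb kept changing"
```

The function returns non-zero if the copy fails or if the status never settles, so the caller can decide whether to try again later.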

    /moira/bin/pt_util -x -m -u -g -d prdb.extra -p prdb.old
    perl /moira/bin/pt_util.pl < prdb.extra > prdb.extra.sort
to extract and prepare the personal groups and special user entries in
the old prdb for being reincorporated into the new prdb.

    awk -F\| '$9 == 3 {print $1}' /backup/backup_1/users > /tmp/deactivated

and the following perl script:

#!/usr/athena/bin/perl -w

open(OUT, ">prdb.extra.trimmed") or die "prdb.extra.trimmed: $!";

# Load the deactivated usernames.
for ( `cat /tmp/deactivated` ) {
    chomp;
    $ex{$_} = 1;
}

$punt = 0;

foreach $L ( `cat prdb.extra.sort` ) {
    @w = split(/ /,$L);
    $_ = $w[0];
    if ( /:/ ) {
        # A prefixed (personal) group: drop it, and the lines that
        # follow it, if the owner is deactivated.
        @x = split(/:/,$w[0]);
        if ($ex{$x[0]}) {
            $punt=1;
        } else {
            $punt=0;
        }
    } else {
        # If we got here, we're either a user, a prefixless
        # group, or a group member.
        $punt = 0 if $w[0];
    }
    print OUT $L unless $punt == 1;
}

close(OUT);
exit 0;

to remove the personal groups of users who have been deactivated.
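
For illustration, the same trimming rule can be written as a single awk pass. This is a sketch and not a replacement for the script above: trim_deactivated is a hypothetical name, and it assumes that entry lines start in the first column while member lines start with a space or tab.

```shell
# trim_deactivated DEACTIVATED_FILE < sorted-dump > trimmed-dump
# Drops every "owner:group" entry whose owner is listed in DEACTIVATED_FILE,
# together with the indented member lines that follow it.
trim_deactivated() {
    awk -v list="$1" '
        BEGIN {
            # Load the deactivated usernames, one per line.
            while ((getline u < list) > 0) dead[u] = 1
        }
        /^[^ \t]/ {
            # An entry line: a user, a prefixless group, or owner:group.
            punt = 0
            if (index($1, ":")) {
                split($1, p, ":")
                punt = (p[1] in dead)
            }
        }
        # Member lines inherit the decision made for their entry.
        !punt
    '
}

# Hypothetical usage:
#   trim_deactivated /tmp/deactivated < prdb.extra.sort > prdb.extra.trimmed
```

Comparing the output of the two versions on the same input (with diff) is a cheap way to confirm that the trimming did what was intended.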

    awk '/^[^ ][^:]*@/ {printf "KERBEROS:%s\n",$1}' prdb.extra.trimmed \
        > foreign
    blanche afs-foreign-users -f foreign
Get a list of all the @andrew.cmu.edu type (non-athena.mit.edu cell)
users, and sync the Moira list afs-foreign-users to this list.
Moira then adds those entries to the group system:afs-foreign-users,
thus keeping them from being lost in the prdb resync.
Sanity checking the diffs before running the blanche command is recommended.

    awk '/^[^ 0-9][^:@]*$/ {printf "KERBEROS:%s@ATHENA.MIT.EDU\n",$1}' \
        prdb.extra.trimmed > oddities
    awk '/^[^ ][0-9.]* .*$/ {printf "KERBEROS:%s\n",$1}' prdb.extra.trimmed \
        >> oddities
    echo "LIST:afs-foreign-users" >> oddities
    blanche afs-odd-entities -f oddities
Do the equivalent of afs-foreign-users for domestic users. We make
the afs-foreign-users list a member of the more general afs-odd-entities.
Sanity checking the diffs before running the blanche command is recommended.
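
One way to do the sanity check is to compare the list's current membership against the new file before syncing. The helper below is a sketch: membership_diff is a hypothetical name, and it assumes the current membership has already been saved to a file, one TYPE:name entry per line, in the same form as the new file. How that file is produced is left to the operator.

```shell
# membership_diff CURRENT_FILE NEW_FILE
# Prints "+ member" for entries only in NEW_FILE and "- member" for entries
# only in CURRENT_FILE. Both files hold one member per line.
membership_diff() {
    cur_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    sort -u "$1" > "$cur_sorted"
    sort -u "$2" > "$new_sorted"
    comm -13 "$cur_sorted" "$new_sorted" | sed 's/^/+ /'   # would be added
    comm -23 "$cur_sorted" "$new_sorted" | sed 's/^/- /'   # would be removed
    rm -f "$cur_sorted" "$new_sorted"
}

# Hypothetical usage ("current.members" is an assumed file name):
#   membership_diff current.members oddities
```

A long run of "-" lines is the thing to look for, since those entries would drop off the list and so out of the resynced prdb.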

WAIT for the incremental updates from the `blanche` changes to complete.

#### Now the actual resync begins. Incremental updates must stop. ####

    touch /moira/afs/noafs
to disable AFS incremental updates during the synchronization. The
afs.incr (?) will wait 30 minutes on an incremental update before
timing out, so the resync should complete in that time, or list
changes in Moira might need to be propagated by hand.
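
Because the 30-minute window starts when incrementals begin to wait, it can help to note the time of the touch and check the remaining time between steps. The following is a minimal sketch: resync_secs_left and RESYNC_WINDOW_SECS are hypothetical names, and `date +%s` is assumed to be available on the machine.

```shell
RESYNC_WINDOW_SECS=1800   # the 30 minutes mentioned above

# resync_secs_left START NOW
# Both arguments are seconds since the epoch; prints the seconds remaining
# in the window (negative once the window has passed).
resync_secs_left() {
    echo $(( RESYNC_WINDOW_SECS - ($2 - $1) ))
}

# Hypothetical usage:
#   start=$(date +%s); touch /moira/afs/noafs
#   ... later ...
#   echo "$(resync_secs_left "$start" "$(date +%s)") seconds left"
```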

    /moira/bin/afssync prdb.moira
to dump the prdb data that is in Moira (users, groups, and group
memberships). This step takes about ten minutes, but can be done
concurrently with the next few steps.

REPEAT the preparation commands above (from the prdb copy through the
trimming script), thus regenerating prdb.extra.trimmed from a now
completely up-to-date prdb.

*** Make sure the "afssync" command has completed ***

    cp prdb.moira prdb.new
    /moira/bin/pt_util -w -d prdb.extra.trimmed -p prdb.new \
        >& prdb.extra.err
This use of pt_util will presumably log errors about failed user
creations and list additions. (To start over, do both the `cp` and
`pt_util` again.) You can filter out the "User or group doesn't exist"
type of lines that were caused by a user deactivation with something
like:
    awk -F\| '$9 == 3 {print $1}' /backup/backup_1/users > /tmp/deactivated
    perl -e 'for(`cat /tmp/deactivated`){ chop; $ex{$_}=1;} \
        foreach $L (`cat prdb.extra.err`){ $f=0; \
        @w=split(/[ :]/,$L); for(@w){ $f=1 if $ex{$_}; } \
        next if $f; print $L; }'
Now, back to the resync.

The only remaining errors should be errors creating system:foo groups,
because they already exist. These generally mean that the group has
an odd user on it (root instance, IP acl, etc.) and can safely be
ignored.

Errors of the form:
    Error while creating dcctdw:foo: Badly formed name (group prefix doesn't match owner?)
are probably an indication that a user with personal groups had a
username change. (In the past they have also meant that a user with
personal groups was deactivated and the uid was re-used; this was
because we didn't trim the prdb.extra.sort file in the past.)
Assuming these errors are due to a username change, the groups should
be renamed, and you should regenerate prdb.extra.trimmed starting with
a fresh prdb from prill. (You may want to abort and
rm /moira/afs/noafs and try again later.)
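
To see at a glance which groups are affected, the group names can be pulled out of the error log. This sketch assumes every such line has exactly the form quoted above; badly_formed_groups is a hypothetical helper name.

```shell
# badly_formed_groups ERR_FILE
# Prints the group name from each "Badly formed name" error line.
badly_formed_groups() {
    sed -n 's/^Error while creating \(.*\): Badly formed name.*/\1/p' "$1"
}

# Hypothetical usage:
#   badly_formed_groups prdb.extra.err | sort -u
```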

    pts listmax > prdb.listmax
    foreach i ( <db servers> )
        rsh $i -l root -x /bin/athena/detach -a # detach packs
        rsh $i -l root -x rm -f /usr/afs/db/{prdb.new,pre-resync-prdb}
        rcp -px prdb.new root@${i}:/usr/afs/db/prdb.new
    end # staging
    foreach i ( <db servers> )
        bos shutdown $i ptserver -wait
        bos exec $i "mv /usr/afs/db/prdb.DB0 /usr/afs/db/pre-resync-prdb; rm /usr/afs/db/prdb.DB*; mv /usr/afs/db/prdb.new /usr/afs/db/prdb.DB0"
    end
    foreach i ( <db servers> )
        bos restart $i ptserver
    end

    /moira/bin/udebug prill -port 7002
to watch the status of the servers to make sure things are going well,
where "prill" is the preferred db server (the sync site).

Make sure the beacons are working, and that once quorum is established
(~90 seconds) the servers are resynchronizing their notions of
the databases, the "dbcurrent" and "up" fields all become set,
and the state goes to "1f". Also, if "sdi" isn't running, watch out
for large rx packet queues on port 7002 using rxdebug, as the
fileservers may get excessively backlogged; restart servers if
the congestion remains excessive.

    pts listmax
    cat prdb.listmax
and if the id maxima are lower than the saved ones, reset them
appropriately to the saved ones using `pts setmax`.
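
The comparison can be done mechanically. The sketch below makes two assumptions that should be checked against the real output: that each `pts listmax` output contains the user maximum followed by the group maximum as the only two integers in the text, and that "lower" for the (negative) group id means smaller in magnitude. listmax_regressions is a hypothetical name.

```shell
# listmax_regressions SAVED_FILE CURRENT_FILE
# Prints "user N" and/or "group N" (N = saved value) for each maximum whose
# current magnitude is smaller than the saved one; prints nothing otherwise.
listmax_regressions() {
    awk '
        function abs(x) { return x < 0 ? -x : x }
        {
            which = (FILENAME == ARGV[1]) ? "saved" : "now"
            for (i = 1; i <= NF; i++) {
                f = $i
                sub(/[.,]$/, "", f)          # drop trailing punctuation
                if (f ~ /^-?[0-9]+$/) id[which, ++n[which]] = f + 0
            }
        }
        END {
            if (abs(id["now", 1]) < abs(id["saved", 1])) print "user", id["saved", 1]
            if (abs(id["now", 2]) < abs(id["saved", 2])) print "group", id["saved", 2]
        }
    ' "$1" "$2"
}

# Hypothetical usage ("prdb.listmax.now" is an assumed file name):
#   pts listmax > prdb.listmax.now
#   listmax_regressions prdb.listmax prdb.listmax.now
```

Any line printed names a maximum that should be put back to the saved value with `pts setmax`.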

    pts ex system:administrators
as a good spot check, especially since it has special people.
(Also spot check one of the personal groups and perhaps something like
the membership of rcmd.reynelda.)

    rm /moira/afs/noafs
to remove the lock file and let Moira's AFS incrementals continue.

The afssync program doesn't deal with null instance KERBEROS
members of lists which are groups (example: if LIST zacheiss contains
KERBEROS zacheiss@ATHENA.MIT.EDU). To get around this, run:

    /moira/bin/sync.pl

which will create /var/tmp/sync.out, containing the pts commands
needed to add all the null instance KERBEROS members back to the pts
groups they belong in. If it looks sane, run:

    sh /var/tmp/sync.out

Any failed additions are probably from lists that contain both USER
username and KERBEROS username@ATHENA.MIT.EDU.
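
Part of "looks sane" can be checked mechanically before the file is handed to sh. This sketch assumes that every meaningful line of sync.out should begin with "pts" and that blank lines and "#" comments are harmless; unexpected_sync_lines is a hypothetical helper, and reading the file by eye is still worthwhile.

```shell
# unexpected_sync_lines FILE
# Prints, with line numbers, every line that is not a pts command,
# a comment, or blank. No output means nothing unexpected was found.
unexpected_sync_lines() {
    grep -n -v -E '^(pts[[:space:]]|#|[[:space:]]*$)' "$1"
}

# Hypothetical usage:
#   unexpected_sync_lines /var/tmp/sync.out
```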

NOTES

1. Don't do this when you're tired... There may be no cleanup procedure
available for certain mistakes.

2. /moira/afs/noafs is only good for 30 minutes. Keep track of the
critical log, and you may have to do some operations by hand when the
operation is complete. Also, if requests depend on other requests, they
may be processed out of order, and fail, and may need to be done by hand.