[This is a still-under-construction rewrite of the afssync
instructions, adapted to the Ingres/Maxine -> Oracle/SPARC port, and
is also being updated and simplified.]


The executables are in /moira/bin/ on the Moira server, with sources
in /mit/moiradev/src/afssync/.  Most of the commands are run on the
Moira server.

#### Set up a workspace ####

    mkdir -p /moira/sync
    cd /moira/sync

#### This is preparation for the resync, to save non-Moira users. ####
First, get a recent copy of the prdb, and extract non-Moira entries:

    /moira/bin/udebug aggy -port 7002
    rcp -px root@aggy:/usr/afs/db/prdb.DB0 prdb.old
    /moira/bin/udebug aggy -port 7002
If the two udebugs show that the database version changed during the
copy, repeat all three commands until it does not.
(udebug can be found in afsuser; "aggy" here and below is some DB server)
(Also check for "0 of them for write" at the end.  It might matter.)
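
The check-copy-check cycle above can be wrapped in a small retry loop.
This is a minimal sketch in POSIX sh (not the csh used elsewhere in this
file).  The names copy_until_stable, GET_VERSION_CMD and COPY_CMD are
hypothetical; a real version command would have to pull the database
version out of the udebug output, whose exact format is not shown here.

```shell
# Sketch only: copy_until_stable is a hypothetical helper, not part of
# Moira or AFS.
#
# usage: copy_until_stable GET_VERSION_CMD COPY_CMD [MAX_TRIES]
#   GET_VERSION_CMD  prints the current database version on stdout
#   COPY_CMD         performs the copy (e.g. a wrapper around the rcp)
copy_until_stable() {
    get_version=$1
    do_copy=$2
    max_tries=${3:-5}
    try=1
    while [ "$try" -le "$max_tries" ]; do
        before=$("$get_version") || return 2
        "$do_copy" || return 2
        after=$("$get_version") || return 2
        if [ "$before" = "$after" ]; then
            return 0    # version did not move during the copy
        fi
        echo "version changed ($before -> $after), retrying" >&2
        try=$((try + 1))
    done
    return 1            # never saw a stable copy
}
```

The function returns 0 once a copy is bracketed by two identical
versions, 1 if it gives up, and 2 if either command fails.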

    /moira/bin/pt_util -x -m -u -g -d prdb.extra -p prdb.old
    perl /moira/bin/pt_util.pl < prdb.extra > prdb.extra.sort
to extract and prepare the personal groups and special user entries in
the old prdb for being reincorporated into the new prdb.

    awk -F\| '$8 == 3 {print $1}' /backup/backup_1/users > /tmp/deactivated
    perl -e 'for(`cat /tmp/deactivated`) { chop; $ex{$_}=1;} \
      $punt=0; foreach $L (`cat prdb.extra.sort`){ \
      @w=split(/ /,$L); $_=$w[0]; if ( /:/ ) \
      {@x=split(/:/,$w[0]); if($ex{$x[0]}) {$punt=1;}else{$punt=0;}} \
      print $L unless $punt==1;}' > prdb.extra.trimmed
to remove the personal groups for users who are deactivated.
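
For readers who find the perl one-liner hard to follow, the same
filtering logic can be sketched in awk.  This is an illustration in
POSIX sh, not a replacement for the command above; trim_deactivated is
a hypothetical name.  It assumes, as the perl does, that a group entry
starts in column one with "owner:name" and that the lines following it
(for example indented member lines) belong to that group.

```shell
# Sketch only: trim_deactivated is a hypothetical helper, not a Moira tool.
# usage: trim_deactivated DEACTIVATED_FILE SORTED_PRDB_FILE
trim_deactivated() {
    awk '
        FILENAME == ARGV[1] { ex[$1] = 1; next }  # deactivated logins
        {
            first = $0
            sub(/ .*/, "", first)            # text before the first space
            if (index(first, ":") > 0) {     # a group entry, owner:name
                owner = first
                sub(/:.*/, "", owner)
                punt = (owner in ex)         # drop it if owner is deactivated
            }
            if (!punt) print                 # following lines inherit punt
        }
    ' "$1" "$2"
}
```

As in the perl, the "punt" flag stays set until the next line whose
first word contains a colon, so everything listed under a dropped group
is dropped with it.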

    awk '/^[^ ][^:]*@/ {printf "KERBEROS:%s\n",$1}' prdb.extra > foreign
    blanche afs-foreign-users -f foreign
Get a list of all the @andrew.cmu.edu type (non-athena.mit.edu cell)
users, and sync the Moira list afs-foreign-users to this list.
Moira then adds those entries to the group system:afs-foreign-users,
thus keeping them from being lost in the prdb resync.
Sanity checking the diffs before running the blanche command is recommended.
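
One way to do that sanity check is to compare the current membership
with the proposed file before syncing.  The sketch below is POSIX sh
and uses only sort, comm and sed; list_changes is a hypothetical name.
It assumes the current membership has already been saved to a file, one
member per line, in the same KERBEROS:name form as the proposed file.
How that file is produced is left to the operator.

```shell
# Sketch only: list_changes is a hypothetical helper, not a Moira tool.
# usage: list_changes CURRENT_MEMBERS_FILE PROPOSED_MEMBERS_FILE
# Prints "- member" for entries that would be removed and
# "+ member" for entries that would be added.
list_changes() {
    cur_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    sort -u "$1" > "$cur_sorted"
    sort -u "$2" > "$new_sorted"
    comm -23 "$cur_sorted" "$new_sorted" | sed 's/^/- /'   # only in current
    comm -13 "$cur_sorted" "$new_sorted" | sed 's/^/+ /'   # only in proposed
    rm -f "$cur_sorted" "$new_sorted"
}
```

An unexpectedly long list of removals is a hint that the proposed file
was built from a bad or truncated prdb.extra.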

    awk '/^[^ 0-9][^:@]*$/ {printf "KERBEROS:%s@ATHENA.MIT.EDU\n",$1}' \
      prdb.extra > oddities
    awk '/^[^ ][0-9.]* .*$/ {printf "KERBEROS:%s\n",$1}' prdb.extra >> oddities
    echo "LIST:afs-foreign-users" >> oddities
    blanche afs-odd-entities -f oddities
Do the equivalent of afs-foreign-users for domestic users.  We make
the afs-foreign-users list a member of the more general afs-odd-entities.
Sanity checking the diffs before running the blanche command is recommended.

WAIT for the incremental updates from the `blanche` changes to complete.

#### Now the actual resync begins.  Incremental updates must stop. ####

    touch /moira/afs/noafs
to disable AFS incremental updates during the synchronization.  The
afs.incr (?) will wait 30 minutes on an incremental update before
timing out, so the resync should complete in that time, or list
changes in Moira might need to be propagated by hand.
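
Because of that 30-minute window, it can help to note when the lock
file was created and check the remaining time now and then.  A minimal
sketch in POSIX sh follows; minutes_left is a hypothetical name, and
the use of `date +%s` to get epoch seconds is an assumption about the
local date command.

```shell
# Sketch only: minutes_left is a hypothetical helper, not a Moira tool.
# usage: minutes_left START_EPOCH NOW_EPOCH [LIMIT_MINUTES]
# Prints the whole minutes remaining before the limit (default 30, the
# timeout quoted above); the result is negative once the limit has passed.
minutes_left() {
    limit=${3:-30}
    echo $(( limit - ($2 - $1) / 60 ))
}
```

For example, save `start=$(date +%s)` right after the touch, then run
`minutes_left "$start" "$(date +%s)"` between later steps.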

    /moira/bin/afssync prdb.moira
to dump the prdb data that is in Moira (users, groups, and group
memberships).  This step takes about ten minutes, but can be done
concurrently with the next few steps.

REPEAT the preparation commands above (copy the prdb, then extract,
sort, and trim it), thus regenerating prdb.extra.trimmed from a now
completely-up-to-date prdb.

*** Make sure the "afssync" command has completed ***

    cp prdb.moira prdb.new
    /moira/bin/pt_util -w -d prdb.extra.trimmed -p prdb.new \
      >& prdb.extra.err
This use of pt_util will presumably log errors about failed user
creations and list additions.  (To start over, do both the `cp` and
`pt_util` again.)  You can filter out the "User or group doesn't exist"
type of lines that were caused by a user deactivation with something
like:
    awk -F\| '$8 == 3 {print $1}' /backup/backup_1/users > /tmp/deactivated
    perl -e 'for(`cat /tmp/deactivated`){ chop; $ex{$_}=1;} \
      foreach $L (`cat prdb.extra.err`){ $f=0; \
      @w=split(/[ :]/,$L); for(@w){ $f=1 if $ex{$_}; } \
      next if $f; print $L; }'
Now, back to the resync.

The only remaining errors should be errors creating system:foo groups,
because they already exist.  These generally mean that that group has
an odd user on it (root instance, IP acl, etc.) and can safely be
ignored.

Errors of the form:
    Error while creating dcctdw:foo: Badly formed name (group prefix doesn't match owner?)
are probably an indication that a user with personal groups had a
username change.  (In the past they have also meant that a user with
personal groups was deactivated and the uid was re-used; this was
because we didn't trim the prdb.extra.sort file in the past.)
Assuming these errors are due to a username change, the groups should
be renamed, and you should regenerate prdb.extra.trimmed starting with
a fresh prdb from aggy.  (You may want to abort and
rm /moira/afs/noafs and try again later.)

    pts listmax > prdb.listmax
    foreach i ( <db servers> )
      rsh $i -l root -x /bin/athena/detach -a    # detach packs
      rsh $i -l root -x rm -f /usr/afs/db/{prdb.new,pre-resync-prdb}
      rcp -px prdb.new root@${i}:/usr/afs/db/prdb.new
    end    # staging
    foreach i ( <db servers> )
      bos shutdown $i ptserver -wait
      bos exec $i "mv /usr/afs/db/prdb.DB0 /usr/afs/db/pre-resync-prdb; rm /usr/afs/db/prdb.DB*; mv /usr/afs/db/prdb.new /usr/afs/db/prdb.DB0"
    end
    foreach i ( <db servers> )
      bos restart $i ptserver
    end

    /moira/bin/udebug prill -port 7002
to watch the status of the servers to make sure things are going well,
where "prill" is the preferred db server (the sync site).

Make sure the beacons are working and that, once quorum is established
(~90 seconds), the servers are resynchronizing their notions of
the databases, the "dbcurrent" and "up" fields all become set,
and the state goes to "1f".  Also, if "sdi" isn't running, watch out
for large rx packet queues on port 7002 using rxdebug, as the
fileservers may get excessively backlogged; restart servers if
the congestion remains excessive.

    pts listmax
    cat prdb.listmax
and if the id maxima are lower than the saved ones, reset them
appropriately to the saved ones using `pts setmax`.
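
The comparison can also be done mechanically.  The sketch below is
POSIX sh; listmax_ids and listmax_regressed are hypothetical names.  It
assumes `pts listmax` prints a single line of the form
"Max user id is N and max group id is M." and that group ids are
negative, so a group maximum closer to zero than the saved one counts
as lower.  Check both assumptions against the real output before
relying on it.

```shell
# Sketch only: hypothetical helpers, not part of AFS or Moira.
# Assumed input line:  Max user id is 12345 and max group id is -6789.

# Print "USERMAX GROUPMAX" extracted from a saved listmax file.
listmax_ids() {
    sed -n 's/^Max user id is \(-\{0,1\}[0-9][0-9]*\) and max group id is \(-\{0,1\}[0-9][0-9]*\)\.\{0,1\}$/\1 \2/p' "$1"
}

# usage: listmax_regressed SAVED_FILE CURRENT_FILE
# Prints one line per maximum that is lower than the saved value;
# prints nothing when no reset is needed.
listmax_regressed() {
    set -- $(listmax_ids "$1") $(listmax_ids "$2")
    [ $# -eq 4 ] || { echo "could not parse listmax output" >&2; return 2; }
    # $1 $2 = saved user/group max, $3 $4 = current user/group max
    if [ "$3" -lt "$1" ]; then echo "user max dropped: $1 -> $3"; fi
    if [ "$4" -gt "$2" ]; then echo "group max dropped: $2 -> $4"; fi
    return 0
}
```

For example, run `pts listmax > prdb.listmax.now` and then
`listmax_regressed prdb.listmax prdb.listmax.now`; any output names a
value to restore with `pts setmax`.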

    pts ex system:administrators
as a good spot check, especially since it has special people.
(Also spot check one of the personal groups and perhaps something like
the membership of rcmd.ronald-ann.)

    rm /moira/afs/noafs
to remove the lock file and let Moira's afs incrementals continue.


NOTES

1. Don't do this when you're tired...  For certain mistakes, there may
   be no cleanup procedure available.

2. /moira/afs/noafs is only good for 30 minutes.  Keep track of the
   critical log, and you may have to do some operations by hand when the
   operation is complete.  Also, if requests depend on other requests, they
   may be processed out of order, and fail, and may need to be done by hand.