[This is a still-under-construction rewrite of the afssync
instructions, adapted to the Ingres/Maxine -> Oracle/SPARC port, and
is also being updated and simplified.]


The executables are in /moira/bin/ on the moira server, with sources
in /mit/moiradev/src/afssync/. Most of the commands are run on the
Moira server.

FULL INSTRUCTIONS
("SUMMARY" is below)

#### Set up a workspace ####

    mkdir -p /moira/sync
    cd /moira/sync

#### This is preparation for the resync, to save non-Moira users. ####
First, get a recent copy of the prdb, and extract non-Moira entries:

    /moira/bin/udebug aggy -port 7002
    rcp root@aggy:/usr/afs/db/prdb.DB0 prdb.old
    /moira/bin/udebug aggy -port 7002
If the two udebug runs report different database versions, the prdb
changed during the copy: lather-rinse-repeat until they match.
(udebug can be found in afsuser; "aggy" here and below is some DB server)
(Also check for "0 of them for write" at the end. It might matter.)
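
The check-copy-check cycle above can be sketched as a retry loop. This is
only an illustration in POSIX sh (not the csh used elsewhere in these
instructions). The helpers get_version and do_copy are hypothetical and
must be supplied by the operator; in particular, nothing here assumes how
the version appears in real udebug output.

```sh
# Sketch: accept a copy only if a version stamp is identical before and
# after it was taken. get_version and do_copy are HYPOTHETICAL helpers
# defined by the caller; this function never parses udebug output itself.
stable_copy() {
    max_tries=${1:-5}
    try=1
    while [ "$try" -le "$max_tries" ]; do
        before=$(get_version) || return 2
        do_copy || return 2
        after=$(get_version) || return 2
        if [ "$before" = "$after" ]; then
            echo "stable at version $before after $try attempt(s)"
            return 0
        fi
        try=$((try + 1))
    done
    echo "version kept changing after $max_tries attempts" >&2
    return 1
}
```

Here get_version would wrap the udebug call and print just the database
version, and do_copy would wrap the rcp. A nonzero return means the copy in
prdb.old should not be trusted.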

    /moira/bin/pt_util -x -m -u -g -d prdb.extra -p prdb.old
    perl /moira/bin/pt_util.pl < prdb.extra > prdb.extra.sort
to extract and prepare the personal groups and special user entries in
the old prdb for being reincorporated into the new prdb.

    awk '/^[^ ][^:]*@/ {printf "KERBEROS:%s\n",$1}' prdb.extra > foreign
    blanche afs-foreign-users -f foreign
Get a list of all the @andrew.cmu.edu type (non-athena.mit.edu cell)
users, and sync the Moira list afs-foreign-users to this list.
Moira then adds those entries to the group system:afs-foreign-users,
thus keeping them from being lost in the prdb resync.

    awk '/^[^ ][^:@]*$/ {printf "KERBEROS:%s\n",$1}' prdb.extra > oddities
    echo "LIST:afs-foreign-users" >> oddities
    blanche afs-odd-entities -f oddities
Do the same for domestic (local-cell) entries, syncing them to the Moira
list afs-odd-entities. We make the afs-foreign-users list a member of the
more general afs-odd-entities.
WAIT for the incremental updates from the `blanche` changes to complete.
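
To see what the two awk patterns above select, here is a small sketch in
POSIX sh. The patterns are copied from the commands above; the sample dump
is invented for illustration and is NOT claimed to be the real pt_util
output format. It only has the properties the patterns rely on: top-level
entries start in column one, member lines are indented, group names contain
a colon, and foreign users contain an @.

```sh
# Foreign-cell users: unindented entries with an @ before any colon.
foreign_members() {
    awk '/^[^ ][^:]*@/ {printf "KERBEROS:%s\n",$1}' "$@"
}

# Domestic oddities: unindented entries with no colon and no @ on the line.
domestic_members() {
    awk '/^[^ ][^:@]*$/ {printf "KERBEROS:%s\n",$1}' "$@"
}

# HYPOTHETICAL sample input, made up for this sketch.
sample_dump() {
    cat <<'EOF'
jdoe 128/20 1234 -204 0
bob@andrew.cmu.edu 128/20 70001 -204 0
jdoe:friends 2/20 -900 1234 1234
   bob@andrew.cmu.edu 70001
EOF
}
```

With this sample, `sample_dump | foreign_members` prints
KERBEROS:bob@andrew.cmu.edu and `sample_dump | domestic_members` prints
KERBEROS:jdoe. The group line and the indented member line are skipped by
both patterns.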

#### Now the actual resync begins. Incremental updates must stop. ####

    touch /moira/afs/noafs
to disable AFS incremental updates during the synchronization. The
afs.incr (?) will wait 30 minutes on an incremental update before
timing out, so the resync should complete in that time, or list
changes in Moira might need to be propagated by hand.
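
Since the 30-minute limit matters, it may help to track how much of it is
left. The following POSIX sh sketch is only arithmetic on timestamps: the
window length comes from the "30 minutes" stated above, and the start time
is assumed to be recorded by the operator at the moment noafs is touched.
The function names are made up for this example.

```sh
# Length of the incremental-update timeout, per the text above.
NOAFS_WINDOW=$((30 * 60))

# $1 = epoch seconds when noafs was created, $2 = current epoch seconds
noafs_remaining() {
    echo $(( NOAFS_WINDOW - ($2 - $1) ))
}

noafs_status() {
    left=$(noafs_remaining "$1" "$2")
    if [ "$left" -gt 0 ]; then
        echo "ok: $left seconds left before incrementals start timing out"
    else
        echo "late: window exceeded by $(( 0 - left )) seconds; check for updates to redo by hand"
    fi
}
```

For example, save `date +%s` (where the local date supports it) right
after the touch, and later pass that value and a fresh `date +%s` to
noafs_status.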

    /moira/bin/afssync prdb.moira
to dump the prdb data that is in Moira (users, groups, and group
memberships). This step takes about ten minutes, but can be done
concurrently with the next few steps.

REPEAT the first two sets of commands, above (the udebug/rcp copy and
the pt_util extraction), thus regenerating prdb.extra and
prdb.extra.sort from a now completely-up-to-date prdb.

*** Make sure the "afssync" command has completed ***

    cp prdb.moira prdb.new
    /moira/bin/pt_util -w -d prdb.extra.sort -p prdb.new
This use of pt_util will presumably log errors about failed user
creations and list additions. (To start over, do both the `cp` and
`pt_util` again.) If its output was saved to prdb.extra.err, as in the
SUMMARY, you can filter out the "User or group doesn't exist" type of
lines that were caused by a user deactivation with something like:
    awk -F\| '$8 == 3 {print $1}' /backup/backup_1/users > /tmp/deactivated
    perl -e 'for(`cat /tmp/deactivated`){ chop; $ex{$_}=1;} \
      foreach $L (`cat prdb.extra.err`){ $f=0; \
      @w=split(/[ :]/,$L); for(@w){ $f=1 if $ex{$_}; } \
      next if $f; print $L; }'
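
The same filtering idea can be sketched in awk, which avoids the csh
quoting of the perl one-liner. This is an alternative illustration in POSIX
sh, not a replacement blessed by these instructions; the function name is
made up, and the error lines used to try it are whatever pt_util actually
wrote.

```sh
# Sketch: print only the error-log lines that mention no deactivated login.
# $1 = file of deactivated logins, one per line; $2 = error log
filter_deactivated() {
    awk '
        # First file: remember each non-empty login.
        FILENAME == ARGV[1] { if ($1 != "") ex[$1] = 1; next }
        {
            n = split($0, w, /[ :]/)      # same separators as the perl
            for (i = 1; i <= n; i++)
                if (w[i] in ex) next      # mentions a deactivated login
            print
        }
    ' "$1" "$2"
}
```

Usage would be `filter_deactivated /tmp/deactivated prdb.extra.err`. One
difference from the perl: there the last word of each log line still
carries its trailing newline and so can never match a login, while the awk
sketch compares every word.
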
Now, back to the resync.

    pts listmax > /var/prdb.listmax
    foreach i ( <db servers> )
        bos shutdown $i ptserver
        bos exec $i "rm /usr/afs/db/prdb.DB*; mv /usr/afs/db/prdb.new /usr/afs/db/prdb.DB0"
    end
    foreach i ( <db servers> )
        bos restart $i ptserver
    end
(The `mv` expects the new database to be on each server already as
/usr/afs/db/prdb.new; the SUMMARY copies it there with rcp before the
shutdown.)

    /moira/bin/udebug prill -port 7002
to watch the status of the servers to make sure things are going well,
where "prill" is the preferred db server (the sync site).

Make sure the beacons are working, and that once quorum is established
(~90 seconds) the servers are resynchronizing their notions of
the databases and that the "dbcurrent" and "up" fields all become set
and the state goes to "1f". Also, if "sdi" isn't running, watch out
for large rx packet queues on port 7002 using rxdebug, as the
fileservers may get excessively backlogged; restart servers if the
congestion remains excessive.

    pts listmax
    cat /var/prdb.listmax
and if the id maxima are lower than the saved ones, reset them
appropriately to the saved ones using `pts setmax`.
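
The comparison above can be sketched as a helper that prints, but does not
run, a `pts setmax` command. Two things are assumptions of this sketch
rather than statements from these instructions: that `pts listmax` prints a
single line of the form "Max user id is N and max group id is -M.", and
that "lower" for group ids (which are negative) means closer to zero than
the saved value. The function names are made up.

```sh
# ASSUMPTION: input is a line such as
#   Max user id is 5063 and max group id is -385.
# Prints "<user max> <group max>".
parse_listmax() {
    awk '/^Max user id is/ { g = $NF; sub(/\.$/, "", g); print $5, g }'
}

# $1 = saved listmax line, $2 = current listmax line
suggest_setmax() {
    saved=$(printf '%s\n' "$1" | parse_listmax)
    current=$(printf '%s\n' "$2" | parse_listmax)
    if [ -z "$saved" ] || [ -z "$current" ]; then
        echo "could not parse listmax output" >&2
        return 1
    fi
    saved_user=${saved% *};  saved_group=${saved#* }
    cur_user=${current% *};  cur_group=${current#* }
    args=""
    # User ids grow upward, so a smaller current maximum is a regression.
    if [ "$cur_user" -lt "$saved_user" ]; then
        args="$args -user $saved_user"
    fi
    # Group ids are negative and grow downward (interpretation, see above).
    if [ "$cur_group" -gt "$saved_group" ]; then
        args="$args -group $saved_group"
    fi
    if [ -n "$args" ]; then
        echo "pts setmax$args"
    fi
    return 0
}
```

For example, `suggest_setmax "$(cat /var/prdb.listmax)" "$(pts listmax)"`
prints nothing when neither maximum regressed. Review any suggested command
before running it.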

    pts ex system:administrators
as a good spot check, especially since it has special people.
(Also spot check one of the personal groups and perhaps something like
the membership of rcmd.ronald-ann.)

    rm /moira/afs/noafs
to remove the lock file and let Moira's AFS incrementals continue.


NOTES

1. Don't do this when you're tired. Some mistakes may have no cleanup
procedure available.

2. /moira/afs/noafs is only good for 30 minutes. Keep track of the
critical log; you may have to redo some operations by hand once the
resync is complete. Also, requests that depend on other requests may be
processed out of order and fail, and then need to be done by hand.


SUMMARY

    # db servers with sync site first:
set db=(prill agamemnon chimera)
set u="/moira/bin/udebug -port 7002 -server"
set prefix="/moira/sync/prdb"
cd `dirname $prefix`

####### The following DOES NOT WORK currently. pt_util needs fixing
#### BEFORE Moira and afs.incr are closed off:
    # repeat as necessary:
$u $db[2]; rcp root@$db[2]\:/usr/afs/db/prdb.DB0 $prefix.old; $u $db[2]
/moira/bin/pt_util -x -m -u -g -d $prefix.extra -p $prefix.old
awk '/^[^ ][^:]*@/ {printf "KERBEROS:%s\n",$1}' $prefix.extra > extra.foreign
blanche afs-foreign-users -f extra.foreign
awk '/^[^ ][^:@]*$/ {printf "KERBEROS:%s\n",$1}' $prefix.extra > extra.domestic
echo "LIST:afs-foreign-users" >> extra.domestic
blanche afs-odd-entities -f extra.domestic

#### WAIT for the above afs.incr events to take place (see moira.log)
touch /moira/afs/noafs
/moira/bin/afssync $prefix.moira >& $prefix.afssync.err &
    # repeat as necessary:
$u $db[2]; rcp root@$db[2]\:/usr/afs/db/prdb.DB0 $prefix.old; $u $db[2]
/moira/bin/pt_util -x -m -u -g -d $prefix.extra -p $prefix.old
perl /moira/bin/pt_util.pl < $prefix.extra > $prefix.extra.sort
wait
more $prefix.afssync.err
cp $prefix.moira $prefix.new
/moira/bin/pt_util -w -d $prefix.extra.sort -p $prefix.new >& $prefix.extra.err
    # and review $prefix.extra.err

pts listmax > $prefix.listmax
set dbdir=/usr/afs/db
foreach i ( $db )
    echo "$i..."
    rcp -px $prefix.new ${i}:$dbdir
end
foreach i ( $db )
    bos shutdown $i ptserver
    bos exec $i "rm $dbdir/prdb.DB*; mv $dbdir/prdb.new $dbdir/prdb.DB0"
end
foreach i ( $db )
    bos restart $i ptserver
end

    # checks, etc:
$u $db[1]

######## more on checks

rm /moira/afs/noafs