The host stabile.debian.org is once again creating snapshots as snapshot-master, and all the data from the secondary setup on sibelius has been merged, so everything is back to normal.
In the meantime we found a new controller, and snapshot-master is now back online. One disk in one of the four RAID-6 arrays has also failed, and we are looking into replacing it soon.
The secondary setup on sibelius has faithfully continued to take snapshots (except for backports.org, for which we somehow failed to get pushed), and we are in the process of merging those new snapshots into the database on master.
The disk-related problems raised some concerns about data integrity. The actual data of snapshot is kept in the farm, a content-addressed file system. That is to say, all files are named by the SHA-1 hash of their content. The metadata that links hashes to packages and files in snapshotted archive trees is stored in a PostgreSQL database. To verify the health of the farm we have implemented a rudimentary fsck which verifies two things:
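To illustrate what content addressing makes possible, here is a minimal Python sketch of an integrity check over such a farm. It is not the actual fsck code: the function name `find_corrupt_files` is hypothetical, and the only assumption about the farm layout is that each file's base name is the hex SHA-1 of its content, in whatever subdirectory it lives.

```python
import hashlib
import os


def find_corrupt_files(farm_root):
    """Return sorted paths whose file name differs from the SHA-1 of the content.

    Assumption (not taken from the real farm): every regular file below
    farm_root is named by the hex SHA-1 digest of its content.
    """
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(farm_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.sha1()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            if digest.hexdigest() != name:
                mismatches.append(path)
    return sorted(mismatches)
```

An empty result means every file still hashes to its own name. Any path returned points at a file whose content no longer matches the name it was stored under.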
We are also currently importing historical snapshots of the debian-ports ftp tree as requested in Bug #571118.
If all goes well we should be back to a normal state in a couple of days. Then we have to deal with removing things that were removed from the source archives due to licensing problems. Once that last hurdle is cleared we can finally announce snapshot and make it an official service.
So we finally got a second machine up and serving a copy of snapshot. Of course this means that now a disk controller breaks in the snapshot-master machine and thus half of our disks are rendered inaccessible. Even RAID-6 doesn't like that very much.
Therefore snapshot is currently not receiving any new data, and the service is provided by only one of the servers that previously shared a DNS round-robin rotation.
We had planned to eventually set up a secondary snapshot master that would do the import runs into an alternate database for just such occasions. This occurrence prompted us to somewhat expedite that project.
So while snapshot will not get any updates for now, we should be able to inject most mirrorruns into snapshot-master when it gets back. We'll have lost a few runs of volatile and the main debian archive, but we should not see a gap longer than roughly a day.
Finished setting up mirroring scripts. While snapshot-master (stabile) is still the entity importing new snapshots into the system, all the data is now replicated to a secondary site (sibelius). The web front-end runs on both of them.