Opened 8 years ago

Last modified 8 years ago

#4119 new Task/To do item

Discussion of new data protection policies

Reported by: https://id.mayfirst.org/jamie Owned by: https://id.mayfirst.org/jamie
Priority: Medium Component: Tech
Keywords: Cc:
Sensitive: no

Description

After the data loss we experienced on julia, May First/People Link held a meeting to discuss next steps, which resulted in several proposals on new data protection procedures.

This ticket is being created to discuss those proposals.

Change History (3)

comment:1 Changed 8 years ago by https://id.mayfirst.org/alfredo

There is a potential concern in these excellent proposals. The audit would, presumably, check to make sure backups were done and that's great.

But will this audit simply check to make sure the process was started and completed? If there is a glitch in the process or code that ends in us not copying the full content of a directory or sub-directory (or the files inside those dirs), a "process started and completed" audit might not pick up on that.

In the simpler, more primitive days of People Link, I would use rsync and then do a visual check of both source and target to see if they matched in size -- selective, because it was me doing it, but I'd actually check the size of files, for instance. A more thorough way to do our automated audit would be a comparison check of some kind for file size, just like my manual one in the olden days.

Is such a thing possible and feasible? It sounds horridly resource- and time-intensive, but computers *are* pretty darn fast.

If we had that, we would be near "confident" about the backups.
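For illustration, the kind of size comparison described above could be sketched roughly as follows. This is only a minimal sketch, not an existing May First/People Link tool: the function name `compare_sizes` and the assumption that source and backup are both reachable as local directory trees are mine.

```python
import os

def compare_sizes(source_root, target_root):
    """Return (relative_path, source_size, target_size) for every regular
    file under source_root that is missing from target_root or has a
    different size there. target_size is None when the file is missing.

    Assumption: both trees are mounted locally; a real backup target
    would more likely live on another machine."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            # Skip symlinks and special files; only compare regular files.
            if os.path.islink(src) or not os.path.isfile(src):
                continue
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(target_root, rel)
            src_size = os.path.getsize(src)
            dst_size = os.path.getsize(dst) if os.path.isfile(dst) else None
            if src_size != dst_size:
                problems.append((rel, src_size, dst_size))
    return sorted(problems)
```

Two caveats: files that change on a live server between the backup and the check will show up as mismatches, and equal sizes do not prove equal contents, so this catches missing or truncated files rather than every possible corruption.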

Other than that concern, I think the proposals are really great.

Alfredo

comment:2 Changed 8 years ago by https://id.mayfirst.org/jamie

Yeah - I think that's a concern. I opened #4138 to cover automated audits. That would be a system that, independent of the server that is supposed to launch the backup process itself, would check the backup logs to see if the commands completed successfully.

If a transfer was incomplete or failed, that should alert us to the problem.
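As a rough illustration of that kind of independent log check, here is a minimal sketch. Everything specific in it is an assumption made for the example, not the actual log layout: one `<server>.log` file per server in a single directory, a `backup completed` marker on the last line of a successful run, and a 26-hour freshness window.

```python
import os
import time

# Hypothetical success marker; real backup logs will look different.
SUCCESS_MARKER = "backup completed"

def audit_logs(log_dir, expected_servers, max_age_hours=26, now=None):
    """Return {server: reason} for every expected server whose log is
    missing, stale, or does not end with the success marker."""
    now = time.time() if now is None else now
    failures = {}
    for server in expected_servers:
        path = os.path.join(log_dir, server + ".log")
        if not os.path.isfile(path):
            failures[server] = "no log found"
            continue
        age_hours = (now - os.path.getmtime(path)) / 3600
        if age_hours > max_age_hours:
            failures[server] = "log is stale"
            continue
        with open(path, encoding="utf-8", errors="replace") as f:
            lines = [line.strip() for line in f if line.strip()]
        # An empty log or a last line without the marker counts as a failure.
        if not lines or SUCCESS_MARKER not in lines[-1]:
            failures[server] = "last run did not report success"
    return failures
```

Driving the check from a list of expected servers, rather than from whatever logs happen to exist, means a backup that never started is reported as well as one that failed.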

However, there's a separate proposal (with no ticket attached) for a human backup audit system. I think we need to be sure not only that the backups ran properly, but that everything that is supposed to be backed up is being backed up. I'm not really sure how to design something like that.

I think it would need to be some kind of randomized process where a human would be assigned a server or a handful of files on a server and see if they could be restored.
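One way the random selection step might look, purely as a sketch: the function name `pick_restore_sample`, the default of five files, and the idea of sampling from a locally readable copy of the live file tree are all assumptions for illustration. The restore attempt itself stays a human task.

```python
import os
import random

def pick_restore_sample(source_root, how_many=5, seed=None):
    """Pick a random handful of files from the live tree for a person to
    try restoring from backup. Sampling from the source (not the backup)
    means files that were never backed up can still be selected."""
    rng = random.Random(seed)
    all_files = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            all_files.append(os.path.relpath(full, source_root))
    all_files.sort()  # stable order so a given seed is reproducible
    return rng.sample(all_files, min(how_many, len(all_files)))
```

Recording the seed alongside the audit result would let someone else reproduce the same selection later.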

jamie

comment:3 Changed 8 years ago by https://id.mayfirst.org/jamie

For the record - the ad hoc backup group has been discussing and working on these issues.

Notes are available from our first and second meetings.

jamie
