How to handle the deletion of data?

I am working on a document management system where users who belong to a given organization can upload documents, tag them, and later retrieve them. Note that these documents are owned by the organization, not by the individual users who uploaded them.

I am trying to decide whether I should push for one of the following business rules regarding the deletion of records:

  1. Do not delete the data but tag them as deleted (soft-delete).
  2. Actually delete the row from the table (hard-delete).
  3. Move the deleted record to another table (moved-delete).
  4. Maybe some other strategy?

Under what conditions would you recommend each of the above as a business rule? Some concerns/thoughts for each option:

Soft-Delete - How should unique constraints such as user's email and username be handled?  I think a reasonable business rule is to make username unique for all time but not for email.
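
One way to express that rule in the schema is a sketch like the following. It assumes SQLite driven from Python's sqlite3 module purely so it can be run as-is; the table name, the `deleted_at` column, and the helper functions are all made up for illustration. The username carries a plain UNIQUE constraint (unique for all time), while the email is covered by a partial unique index that only looks at rows that are not soft-deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id         INTEGER PRIMARY KEY,
        username   TEXT NOT NULL UNIQUE,  -- unique for all time, deleted or not
        email      TEXT NOT NULL,
        deleted_at TEXT                   -- NULL means the row is live
    );
    -- Partial unique index: email only has to be unique among live rows.
    CREATE UNIQUE INDEX users_live_email
        ON users (email) WHERE deleted_at IS NULL;
""")

def add_user(username, email):
    """Insert a live user and return its id (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO users (username, email) VALUES (?, ?)", (username, email)
    )
    return cur.lastrowid

def soft_delete_user(user_id):
    """Mark the user as deleted instead of removing the row."""
    conn.execute(
        "UPDATE users SET deleted_at = datetime('now') "
        "WHERE id = ? AND deleted_at IS NULL",
        (user_id,),
    )

def can_insert(username, email):
    """Return True if the insert succeeded, False if a constraint blocked it."""
    try:
        add_user(username, email)
        return True
    except sqlite3.IntegrityError:
        return False
```

PostgreSQL accepts the same WHERE clause on an index; other engines may need a different workaround, so check before relying on this.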

Hard-Delete - Foreign key constraints could be an issue. For instance, I currently have a non-NULL uploadedByUserId column. If an individual user uploads a document and that user is later deleted, I can't just delete the document, because it belongs to the organization, not the user. One option is to make the column nullable, but that isn't ideal, and there are other cases besides user-id which are not so simple. Or maybe the user should be required to reassign all documents first so there will not be a constraint violation?
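
The "reassign first" idea could look roughly like this. It is only a sketch, again assuming SQLite through Python's sqlite3 so it can be run directly; the tables and the `hard_delete_user` helper are hypothetical. The foreign key is declared ON DELETE RESTRICT so a user who still has documents cannot be removed, and the reassignment plus the delete happen in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE documents (
        id               INTEGER PRIMARY KEY,
        title            TEXT NOT NULL,
        uploadedByUserId INTEGER NOT NULL
            REFERENCES users (id) ON DELETE RESTRICT
    );
""")

def hard_delete_user(user_id, reassign_to):
    """Reassign the user's documents, then delete the user, atomically."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute(
            "UPDATE documents SET uploadedByUserId = ? WHERE uploadedByUserId = ?",
            (reassign_to, user_id),
        )
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
```

If `reassign_to` does not exist, the UPDATE itself violates the foreign key and the whole operation is rolled back, so the column can stay NOT NULL.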

Moved-Deletes - Seems like it will have the same challenges with foreign keys as hard-delete. If going down this path, should one mirror the columns of each live table in a matching deleted table, or serialize the data and save it as JSON in a single deleted table?
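
For the JSON variant, a minimal sketch might be the following. It assumes SQLite via Python's sqlite3, and the `deleted_records` table, its columns, and `move_delete_document` are invented names, not an established pattern from any library. The row is copied into one generic archive table as a JSON payload and removed from the live table in the same transaction.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets a row be converted to a dict by column name
conn.executescript("""
    CREATE TABLE documents (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        tags  TEXT
    );
    CREATE TABLE deleted_records (
        id           INTEGER PRIMARY KEY,
        source_table TEXT NOT NULL,
        source_id    INTEGER NOT NULL,
        payload      TEXT NOT NULL,  -- the whole original row serialized as JSON
        deleted_at   TEXT NOT NULL DEFAULT (datetime('now'))
    );
""")

def move_delete_document(doc_id):
    """Archive the document row as JSON, then delete it. Returns False if not found."""
    with conn:  # one transaction: either both statements apply or neither does
        row = conn.execute(
            "SELECT * FROM documents WHERE id = ?", (doc_id,)
        ).fetchone()
        if row is None:
            return False
        conn.execute(
            "INSERT INTO deleted_records (source_table, source_id, payload) "
            "VALUES (?, ?, ?)",
            ("documents", doc_id, json.dumps(dict(row))),
        )
        conn.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
        return True
```

The trade-off is that a single JSON archive table survives schema changes to the live tables, but it is harder to query and to restore from than a mirrored table.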

Any other general insight would be appreciated.

I know I am dating myself, but I remember working on database tables before the internet, when we had the option of archiving the data (soft delete) and then purging it (hard delete) once it was more than so many years old. I have seen online services do exactly that, and I believe Meta (Facebook) Messenger gives you the option of archiving data. Email should always be used as a constraint ALONG with a username, a street address, or better yet a date of birth. It depends on how secure you need it to be. I know my doctor requires me not only to give my date of birth, but also to pick my address out of four options to verify who I am when I check in online or even in person. As a business you need accountability wherever a lapse could get you into legal trouble.
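
The archive-then-purge approach could be sketched as below. This assumes SQLite via Python's sqlite3 and a made-up `deleted_at` column and `purge_deleted` function; the retention period is a parameter because the right number of years depends on the business.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (
        id         INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        deleted_at TEXT  -- NULL while live, 'YYYY-MM-DD HH:MM:SS' once archived
    );
""")

def purge_deleted(older_than_years):
    """Hard-delete rows that were soft-deleted more than N years ago.

    Returns the number of rows removed.
    """
    with conn:
        cur = conn.execute(
            "DELETE FROM documents "
            "WHERE deleted_at IS NOT NULL "
            "AND deleted_at < datetime('now', ?)",
            (f"-{int(older_than_years)} years",),
        )
        return cur.rowcount
```

Something like this would typically run from a scheduled job rather than from a user request.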

Edited by Strider64

Thanks Strider. I understand that for some applications one can get in legal trouble for not hard-deleting data (privacy, etc.). That is not the case for my application, and it does not need over-the-top security either. While adding a constraint on email plus username/street/birthday/etc. solves some of the constraint issues, wouldn't doing so prevent users from signing in with just their email?

I'm a huge fan of soft-delete, personally. IMO the only time data should be completely and actually deleted is in the case of a GDPR request or some other situation where you can face legal action for not deleting the information completely. I've even seen setups where the user's data is replaced by basically a hash - the record still exists, any foreign records still exist and link back to the original record, but the personal data has been scrambled or blinded in some way that makes it basically useless to anyone who can view it.
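
A rough sketch of that blinding approach, assuming SQLite via Python's sqlite3: the schema, the `anonymized_at` column, the `blind` and `anonymize_user` helpers, and the use of a salted SHA-256 are all assumptions for illustration, not a description of any particular setup. The user row and its foreign-key links stay in place while the personal fields are overwritten.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE users (
        id            INTEGER PRIMARY KEY,
        username      TEXT NOT NULL UNIQUE,
        email         TEXT NOT NULL,
        anonymized_at TEXT
    );
    CREATE TABLE documents (
        id               INTEGER PRIMARY KEY,
        title            TEXT NOT NULL,
        uploadedByUserId INTEGER NOT NULL REFERENCES users (id)
    );
""")

def blind(value, salt):
    """Replace a personal value with a salted SHA-256 hex digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

def anonymize_user(user_id, salt):
    """Overwrite personal fields in place. Returns False if nothing to do."""
    with conn:
        row = conn.execute(
            "SELECT username, email FROM users "
            "WHERE id = ? AND anonymized_at IS NULL",
            (user_id,),
        ).fetchone()
        if row is None:
            return False
        username, email = row
        conn.execute(
            "UPDATE users SET username = ?, email = ?, "
            "anonymized_at = datetime('now') WHERE id = ?",
            (blind(username, salt), blind(email, salt), user_id),
        )
        return True
```

A hash of an email can be guessed by hashing candidate addresses if the salt leaks, so where the data must be truly unrecoverable a random token in place of the hash may be the safer choice.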

Thanks for your sound advice, maxxd. I am still going back and forth, but I am more in your camp. I thought this blog was good, as it tried to give the perspectives of both approaches, but it ultimately recommended hard deletes. I am definitely not saying it is right, and I would much appreciate your thoughts on its pros and cons. More importantly, any high-level recommendations (and maybe later low-level ones) on how to implement this with PHP would be great.

For my personal non-PHP projects, I tend to use soft deletion (just marking the row as deleted) over hard deletion. There's enough storage space on the server anyway, and it's easier to keep an active record of what's sticking around than to "destroy the evidence", so to speak.
