It’s a lot to unpack for sure. I worked in computer security from the late 90s to the early 2000s, and since then I have been working as a software engineer with privacy as a primary part of my job. That said, I will make the assertion that nothing can ever be completely secure. Security and usability are always at odds with one another. Do I think a reasonable amount of security can be applied to personal DNA sequence data? Yes, but this is quite a discussion.
One of the problems that Copilot alluded to is data storage. We’re talking about around 300 gigabytes. This data isn’t easily downloaded, and if you do download it, where do you want to store it? Do you trust the website that is storing your data? This is absolutely a headache for the non-techie person.
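To put a rough number on "isn’t easily downloaded", here is a back-of-the-envelope sketch in Python. The connection speeds are my own assumptions, picked only for illustration, and it assumes the link sustains that rate the whole time, which real connections rarely do.

```python
def download_hours(size_gb: float, mbps: float) -> float:
    """Hours needed to move size_gb gigabytes at a sustained mbps megabits per second."""
    size_megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    return size_megabits / mbps / 3600  # seconds -> hours


# Assumed connection speeds (illustrative only).
for mbps in (25, 100, 1000):
    print(f"{mbps:>5} Mbit/s: {download_hours(300, mbps):.1f} hours")
```

Even on a fast home connection that is hours of uninterrupted transfer, before you have decided where the file will live.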
I think we’ve all gotten those emails on multiple occasions that say, “Our website was compromised.” If you need any convincing that no website is completely secure, have a look at the Rowhammer attack as an example of where computer security, or the lack thereof, stands at present. That said, we’ve already opened Pandora’s box: we access our medical records and we bank online nowadays. The difference between medical records and banking is that in banking we have accounting to track transactions… and our money is insured, to a degree. I think you can see where I’m going with this.
The internet itself is insecure by design. Let me give an example. The military came up with the concept of white, black, and grey systems. Black systems stand alone and are not allowed to be connected to the internet. White systems are connected to the internet. Grey systems can connect to white or black systems, but never to both at the same time, and they are not left connected to the internet. Exposure is limited. If you have data that you really care about, you want it on a black system and you want to allow only grey systems to connect to it.
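The grey-system rule is easier to see as a tiny state machine. This is only a sketch of the policy as I described it, not any real military tooling; the names `Zone` and `GreySystem` are made up for illustration.

```python
from enum import Enum
from typing import Optional


class Zone(Enum):
    WHITE = "white"  # connected to the internet
    BLACK = "black"  # stand-alone, never on the internet


class GreySystem:
    """Hypothetical grey machine: at most one connection at any time."""

    def __init__(self) -> None:
        self.connected_to: Optional[Zone] = None

    def connect(self, zone: Zone) -> None:
        # Refuse a second link so white and black are never bridged.
        if self.connected_to is not None:
            raise RuntimeError(
                f"already connected to {self.connected_to.value}; disconnect first"
            )
        self.connected_to = zone

    def disconnect(self) -> None:
        self.connected_to = None


grey = GreySystem()
grey.connect(Zone.WHITE)   # fetch what is needed from the internet side
grey.disconnect()          # never left connected
grey.connect(Zone.BLACK)   # now carry it to the isolated system
try:
    grey.connect(Zone.WHITE)  # bridging attempt
except RuntimeError as err:
    print("refused:", err)
```

The point of the design is that the black system never shares a live path with the internet; the grey machine is the only thing that ever touches both, and only one at a time.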
When speaking about privacy we use the terms “linkable” and “linked”. Copilot again:
Certainly! When discussing data privacy, the terms “linkable” and “linked” refer to how information can be associated with specific individuals or datasets. Here’s a breakdown:
Linkable Data
Definition:
Data that can potentially be connected to an individual, even if it isn’t directly identifying on its own.
Examples:
- Partial location history
- Browser fingerprint
- IP address
Why it matters:
While linkable data may seem anonymous at first glance, it can become identifying when combined with other sources. For example, knowing someone’s workplace, combined with the time they access a specific website, might reveal their identity.
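The workplace-plus-time example can be sketched in a few lines of Python. Everything here is invented for illustration: the log, the roster, the names, and the `link` helper are all hypothetical, and real re-identification works on far messier data.

```python
# Hypothetical "anonymous" web log: no names, only the workplace network and access time.
web_log = [
    {"network": "Acme Corp", "time": "09:05", "page": "/genetic-results"},
    {"network": "Acme Corp", "time": "13:30", "page": "/home"},
    {"network": "Globex", "time": "09:05", "page": "/home"},
]

# Hypothetical second source that happens to share the same two attributes.
roster = [
    {"name": "Alice", "employer": "Acme Corp", "online_at": "09:05"},
    {"name": "Bob", "employer": "Acme Corp", "online_at": "13:30"},
    {"name": "Carol", "employer": "Globex", "online_at": "09:05"},
]


def link(log, people):
    """Join the two sources on (workplace, time); a unique match names the visitor."""
    linked = []
    for entry in log:
        matches = [
            p["name"]
            for p in people
            if p["employer"] == entry["network"] and p["online_at"] == entry["time"]
        ]
        if len(matches) == 1:  # exactly one candidate: linkable has become linked
            linked.append((matches[0], entry["page"]))
    return linked


print(link(web_log, roster))
```

Neither list identifies anyone's browsing on its own. It is the join on two harmless-looking fields that does it.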
Linked Data
Definition:
Data that has already been associated or joined with other information to identify a specific individual.
Examples:
- Your name attached to purchase history
- An email address tied to medical records
- Social media profile connected with browsing activity
Why it matters:
Linked data is typically not anonymous and presents higher privacy risks because it’s already been assembled in a way that reveals personal details.
Quick Comparison
Both play key roles in privacy discussions, especially around data anonymization and re-identification risks. Regulations like the GDPR and CCPA often distinguish between them when setting standards for user consent and protection.