Sequenced DNA comparison,

without DNA sequence disclosure.

🢩 A Compliance Imperative 🢨

Patents: GRANTED

Offering: Licences with know-how, training & reference code.


Using Keys with DNA


        
        Cryptography has long used keys to encrypt and decrypt messages so that only the intended recipients can read the originals, even when the encrypted messages travel out in the open. The most basic forms of secrecy involve hushed voices and secret encoders and decoders that you trade with certain people. Cryptography makes it possible to send and receive secret information out in the open, to yell for all the world to hear, and to communicate securely with people you have never met in person. Undisclosed DNA brings these innovations to the world of DNA matching.


1     Traditional Public and Private Keys

Pretend that Alice has an important message to deliver to Bob, but Alice does not have a secure channel for delivering that message. During delivery, other parties can see the message. If Alice and Bob want the contents of that message to stay hidden from other parties, they could use asymmetric cryptography.
        Asymmetry is a fundamental component within cryptography. It can result in maths that is easy to perform in one direction but difficult to do in the other direction. We can use this property to create key pairs. Every private key has its public key and vice versa.
        Bob creates a key pair so that other people can send private messages to him that no one else can read. He keeps the private key on his own computer, but he gives out the public key to the world. Someone such as Alice can take this public key and use it to encrypt a message for Bob. Only the owner of the corresponding private key can decrypt that message and read it.
        The message began clear and was simple to read. After encryption, it looks like gibberish to everyone. It is virtually impossible for anyone to read that message’s contents without the corresponding private key.
        Everybody can see the public key, whereas nobody except Bob will ever touch the private key. It is easy to confirm that a private key is paired with a public key when you have both keys. If you only have the public key, however, you would need billions of years to figure out the private key.
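
To make the key-pair idea concrete, here is a minimal sketch of textbook RSA with deliberately tiny primes. It is only an illustration of "easy one way, hard the other": the numbers and the function names `encrypt` and `decrypt` are assumptions chosen for this example, and real systems use far larger primes plus padding.

```python
# Toy RSA key pair. The primes are tiny so the arithmetic is readable;
# this is an illustration, not a secure implementation.
p, q = 61, 53                 # secret primes (illustrative values)
n = p * q                     # modulus, part of both keys
phi = (p - 1) * (q - 1)       # used only while building the keys
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

bob_public = (e, n)           # Bob gives this to the world
bob_private = (d, n)          # Bob keeps this on his own computer


def encrypt(message, public_key):
    """Alice encrypts an integer message smaller than the modulus."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, private_key):
    """Bob recovers the message with his private key."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


secret = 65
ciphertext = encrypt(secret, bob_public)
recovered = decrypt(ciphertext, bob_private)
print(ciphertext != secret, recovered == secret)
```

With primes this small, anyone could factor `n` and rebuild the private key in an instant. The security of real key pairs rests on choosing numbers large enough that this step becomes impractical.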


2     Innovations in Key Usage

Traditionally, cryptographers saw their goal as performing encryption in a single step that was as perfect as can be. It would be easy on your computer, unique for every user across space and time, seemingly random, impossible to reverse engineer – and all in one shot.
        On the other hand, Undisclosed DNA takes the tools and principles of cryptography, but applies them with different means to different ends. We use a person’s DNA sequences – let’s say Bob’s – to construct an encryption key.
        At this point, you would think that we would use the same person’s DNA to make the matching private key. It doesn’t help anyone, however, to make a message that only the same person can read. The usage of DNA would just be an arduous middle step, and the results would not be as powerful as those of the cryptography we use on the internet today.
        First, Bob does not create a public encryption key with the goal of giving a copy to everyone. Bob writes a message and encrypts it with that key. He could then decrypt his own message with a decryption key (a private key) made from his own DNA. But what’s the point?
        We pivot from the idea of linking private keys and public keys. Instead, many people can create private keys from their own DNA. Then, each person can try to decrypt, with her own private key, the message encrypted with a key based on Bob’s DNA. The message itself is not as important as the ability to decrypt it successfully. If a decryption key made from your DNA can read the secret message, then your DNA is similar to Bob’s. In other words, you and Bob are blood relatives.
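
The idea that a successful decryption is itself the signal can be sketched with standard-library tools. This is a toy under loud assumptions, not the Undisclosed DNA method: the keys below are stand-in byte strings rather than anything derived from DNA, the names `seal` and `try_open` are hypothetical, and the cipher is a simple hash-based keystream with an HMAC check so that a wrong key is detected instead of yielding gibberish.

```python
import hashlib
import hmac


def keystream(key, length):
    """Stretch a key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key, message):
    """Encrypt by XOR with the keystream and attach an integrity tag."""
    cipher = bytes(m ^ k for m, k in zip(message, keystream(key, len(message))))
    tag = hmac.new(key, cipher, hashlib.sha256).digest()
    return cipher, tag


def try_open(key, cipher, tag):
    """Return the message if the key fits, otherwise None."""
    expected = hmac.new(key, cipher, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None
    return bytes(c ^ k for c, k in zip(cipher, keystream(key, len(cipher))))


# Stand-in keys (assumption: related DNA leads to a usable key, unrelated does not).
bob_key = hashlib.sha256(b"stand-in for a key derived from Bob's DNA").digest()
relative_key = bob_key
stranger_key = hashlib.sha256(b"stand-in for an unrelated person's key").digest()

cipher, tag = seal(bob_key, b"hello, family")
print(try_open(relative_key, cipher, tag))   # the message comes back
print(try_open(stranger_key, cipher, tag))   # None: no match
```

The content of the message is almost irrelevant here. What matters is whether `try_open` returns something or nothing.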


3     Matching Keys to Proteins

In order to perform cryptographic functions, all DNA must become numbers. Unlike cryptography that would construct a new private key randomly, we take the totals of the four nucleotide bases in your mitochondrial DNA and each of your two X chromosomes (in the case of biological females) or in your mitochondrial DNA and your X and Y chromosomes (biological males).
        The more closely that two people are related (as in siblings versus third cousins), the more similar their DNA will be. Each person will possess similar amounts of each of the four base pairs of GC, CG, AT, and TA. This results in keys that decrypt similar content.
        If you can use your DNA to create a key that successfully decrypts a message encrypted by a key derived from Alice’s DNA but not one encrypted by a key made from Bob’s DNA, then you are related to Alice but not Bob. If you can decrypt a message from Bob’s DNA but not Alice’s, then you are related to Bob. If you can decrypt both messages, then – surprise – you, Alice, and Bob are all related.
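
As a rough illustration of turning DNA into numbers, the sketch below counts the four bases in a sequence and compares two people's totals. The short sequences are invented for the example, and `base_totals` and `total_difference` are hypothetical helper names. How the totals feed into an actual key is not described here, so this shows only the counting step.

```python
from collections import Counter


def base_totals(sequence):
    """Return the counts of A, C, G and T in a DNA string, in that order."""
    counts = Counter(sequence.upper())
    return tuple(counts[base] for base in "ACGT")


def total_difference(totals_a, totals_b):
    """Sum of absolute differences between two sets of base totals."""
    return sum(abs(a - b) for a, b in zip(totals_a, totals_b))


# Invented toy sequences; real inputs would be whole chromosomes.
alice = base_totals("ACGTACGTAC")
relative = base_totals("ACGTACGTAA")
stranger = base_totals("GGGGCCCCGG")

print(alice)                                # totals for A, C, G, T
print(total_difference(alice, relative))    # small difference
print(total_difference(alice, stranger))    # larger difference
```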


4     Hiding the Connections

A simple illustration of the double helix of DNA, the production of mRNA, and key creation for Undisclosed DNA can do a disservice. For cellular functions, the matching of complementary pairs reveals what the original versions were. A bit of messenger RNA that is G-A-C-C came from C-T-G-G, which is not exactly a secret.
        The private keys of Undisclosed DNA are not simple complements, however. We use a concept in mathematical logic that computer scientists of all stripes, and especially cryptographers, have employed: exclusive or (XOR). Cryptographers used this in an interesting way to check data, the HMAC. XOR also allows for obscuring data. In either case, we need to differentiate it from other uses of ‘or’.
        An ‘inclusive or’ question could be ‘Will we have juice or punch to drink at the picnic?’ with the answer of ‘Yes, we are taking a few flavors – orange, grape, and apple.’ As long as the picnic will serve fruit juice or fruit punch, we can reply in the affirmative. The story is the same with a daydreamer: ‘Someday, I want to visit Jamaica or Barbados, somewhere tropical.’ She would not be disappointed if she won a travel package to tour the Caribbean for a month and stayed multiple nights on both islands.
        In contrast, we see ‘exclusive or’ questions with directions. ‘When we reach the trailhead, do we turn left or right?’ A smartalec could say, ‘Yes, you have to turn. Going straight is not an option. It is a fork in the path.’ Normally, though, you would interpret this query to mean ‘Should I walk left, or should I walk right?’ We know that we can only go left or go right.
        A computer could assign a value to left and then to right. In binary, a 1 is yes, and a 0 is no. If the correct trail is to the left, the readout is 1,0. If we should go right, we have 0,1. The impossible choice to go left and right at the same time is 1,1. The equally impossible choice to go neither left nor right is 0,0. If the silly scenarios are 0,0 and 1,1, then we can say that each is false.
        Let’s translate the concepts of true and false into binary as well. We end up with 1,0=1, 0,1=1, 0,0=0, 1,1=0, and we now have XOR. If we look at the final result of 1, all we know is that our hiking guide told us to go left at the trailhead or she instructed us to head right. If we look at the final result of 0, then we only know that our hiking guide is not making any sense because she just told us to go left and right at the same time or told us to do neither and instead to fly upward. Similarly, the private keys that we make from DNA cannot be reversed.
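
The XOR table can be checked in a few lines, and the same operation can be used to obscure data. In this minimal sketch the `mask` is a random value invented for the example and `xor_bytes` is a hypothetical helper. Someone who sees only the masked output, without the mask, learns nothing about the input, just as a result of 1 does not say whether the guide meant left or right.

```python
import secrets


def xor(a, b):
    """Exclusive or on single bits: 1 only when exactly one input is 1."""
    return a ^ b


# All four combinations of (left, right).
truth_table = {(a, b): xor(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)


def xor_bytes(data, mask):
    """XOR two byte strings of equal length, byte by byte."""
    return bytes(d ^ m for d, m in zip(data, mask))


data = b"GATTACA"
mask = secrets.token_bytes(len(data))   # random mask, kept secret
hidden = xor_bytes(data, mask)          # looks random without the mask
restored = xor_bytes(hidden, mask)      # applying the mask again undoes it
print(restored == data)
```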


5     Genetic Drift

The ability to decrypt a message may seem to be black and white. After all, the cryptography we see in PGP email, TLS for websites, or the Signal Protocol in Signal or WhatsApp works that way. You either have the exact key you need to decrypt a message or you don’t. An American expression reminds us that ‘close’ only counts with the game of horseshoes and with grenades in war.
        Undisclosed DNA breaks from traditional applications of cryptography to allow for multiple possibilities to decrypt a message. If you have the exact same DNA, and therefore the same key, as someone who encrypted a message under the methods of Undisclosed DNA, then you can decrypt the message. If you only share DNA that is similar but not identical – maybe you are siblings – then you can still decrypt it successfully. Someone may want to reach out further and tweak the message encryption process so that second or third cousins can also be matched. Undisclosed DNA lets you do that.
        To connect relatives and indirectly measure the distance of that relation, we look at genetic drift. All humans are at least distant relatives. Importantly for genetic diversity, the connections are very slight. A person in Beijing may share a common ancestor with someone in Warsaw, but it may have been a thousand years ago. Also, every new plant, animal, bacterium, and human will have some mutations.
        These very tiny mutations accelerate the differentiation between people. As these tiny changes accumulate and kinship ties become more distant, we can say that genetic drift has also grown. Two people who exhibit very little genetic drift will be able to decrypt messages encrypted by the other.
        Reaching one’s own child seems straightforward, but we can apply the matching methods much more widely with the “volume” of a message. I.e., we can decide how loud that call is. Without a volume control, you would only have a simple binary of “close family versus random stranger”, and that has relatively few use cases.
        Let’s say you want to find cousins or a lost relative when you do not have access to the DNA of that person’s parents. Maybe you just want to see who is out there. With Undisclosed DNA you may call out loudly whilst maintaining the secrecy that you get from whispering.
        You can adjust the encryption of a message in such a way that only your daughter could read it. Alternatively, you can modify your message so that anyone who is a second cousin or closer can decrypt it successfully.
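
One way to picture a volume control is to coarsen the numbers before a key is derived from them. The sketch below is a toy under stated assumptions, not the Undisclosed DNA method: the base totals are invented, `derive_key` and its `volume` parameter are hypothetical, and the bucketing is deliberately crude (two similar totals that fall on either side of a bucket boundary would still produce different keys).

```python
import hashlib


def derive_key(totals, volume):
    """Derive a key from base totals after rounding them into buckets.

    A larger `volume` means wider buckets, so more distant relatives
    end up with the same key.
    """
    coarse = tuple(total // volume for total in totals)
    return hashlib.sha256(repr(coarse).encode("utf-8")).digest()


# Invented totals for A, C, G, T.
you = (1000, 1010, 990, 1005)
daughter = (1003, 1012, 992, 1007)
second_cousin = (1040, 1050, 960, 1030)
stranger = (1500, 700, 1300, 600)

quiet, loud = 10, 100

# Quiet call: only the closest relative derives the same key.
print(derive_key(you, quiet) == derive_key(daughter, quiet))
print(derive_key(you, quiet) == derive_key(second_cousin, quiet))

# Loud call: the second cousin now matches, the stranger still does not.
print(derive_key(you, loud) == derive_key(second_cousin, loud))
print(derive_key(you, loud) == derive_key(stranger, loud))
```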
        Not only does this method preserve the privacy of everyone’s DNA and the messages themselves, it also allows for casting a wide net. Before now, the typical use cases from DNA matching were paternity tests or matching crime scene DNA to suspects. In those cases, you had the people in question.
        Another interesting application is to see what relatives even exist and in what number. If you have your messages only allow for matching with close relatives, a man may find that he in fact has one child in Barcelona. Maybe a woman he briefly dated there had become pregnant. A woman who was taken in from abroad as a war orphan may want to find relatives decades later. To her surprise, when she adjusts her encryption to allow decryption by second cousins, she finds that a dozen of her relatives are still alive back in the country of her birth.
        These wonderful possibilities have also presented a dilemma, however. Until now, you would have had to forgo your privacy and completely trust some company with your complete DNA. From a practical standpoint, this workflow made finding connections very difficult. If the cousins of our woman in the example chose not to give up their DNA to a matching service, then they would never have been found. At this point, what is the solution?
        Many people will understandably oppose a global database of everyone’s complete DNA to which you upload your DNA and let the computer run a search on every other person’s DNA to look for commonalities. With Undisclosed DNA, we can exchange messages with a great many people but without compromising anyone’s DNA.


6     Tagging

For every message that gets sent, we calculate a hash code on the overall sequence, both to hide the general DNA profile of the sender and to make every message “sender” unique. No one can reverse this value into the original DNA sequence because an infinite number of DNA profiles could match a given hash value. The hash value, however, is long enough so that we should never see an accidental match between any two of the eight billion people on earth.
        Twins, triplets, quadruplets – identical siblings share all their DNA. As a result, any mathematical functions you do for one person would yield the same results as for the other person. This presents some problems. We lose the uniqueness of a sender. Was a given message encrypted by the key made from the DNA of this woman living in Bristol or her twin who was adopted in Surrey?
        For the input of the hashing function, one uses the overall DNA sequence, but one can also add the current time. This ensures a different cryptographic hash for every message, even when identical siblings share the same DNA. Because of how hashes work, no one can know what the original time was. The outputs of cryptographic hashes look random, as if the tiniest change to the input makes each number in the output have a fifty-fifty chance of changing. Therefore, it is infeasible in practice to determine what any of the input data was.
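
A minimal sketch of such a tag, assuming SHA-256 as the hash and a simple "sequence plus timestamp" input; the function name `message_tag`, the separator, the toy sequence, and the timestamps are all illustrative choices rather than details of the actual scheme.

```python
import hashlib


def message_tag(dna_sequence, timestamp):
    """Hash the DNA sequence together with a timestamp (assumed input format)."""
    payload = f"{dna_sequence}|{timestamp}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count how many of the 256 output bits differ between two tags."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Identical twins share the same DNA but send at different moments.
twin_dna = "ACGTTGCAACGT"
tag_bristol = message_tag(twin_dna, "2024-01-01T09:00:00Z")
tag_surrey = message_tag(twin_dna, "2024-01-01T09:00:01Z")

print(tag_bristol)
print(tag_surrey)
print(differing_bits(tag_bristol, tag_surrey), "of 256 bits differ")
```

A one-second change in the timestamp is enough to produce an unrelated-looking tag, so the two twins' messages stay distinguishable even though their DNA input is identical.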


7     Reaching Out

Until now, this has covered the ability to decrypt something – anything – with the correct key. If you can successfully open this metaphorical treasure chest, then you have proven your genetic relation to someone. The actual contents of that treasure chest – what the message says – open the door to myriad options.