You’re surfing the internet on a random Tuesday, when suddenly, you receive an unexpected email.

“What *holiday package?*” you mutter to yourself, as your finger hovers over the delete button. “On second thought, I should check this out,” you decide seconds later, recalling your tendency to browse Amazon after a few too many drinks.

“Tracking…tracking…*wait, what*?” Suddenly, all your active windows close: your email browser, solitaire, and even that word document you were kind-of working on. They’re replaced by a single window, displaying the notice below.

As your unwelcome visitor lingers, annoyance gradually turns to dread. Thoughts like “A hundred dollars for *what*?” and “What’s a *CryptoLocker?” *and “I had a whole paragraph done on that document!” pass through your mind. Finally, you come to accept the obvious: you’re the latest victim of Ransomware.

Ransomware has been around for a while, but the variant described above, coming just lately out of Russia and Eastern Europe, is the most insidious yet. CryptoLocker’s encryption protocol is considered by computer scientists to be virtually impossible to beat. So, victims inevitably either pay the money, using Bitcoin or some other untraceable payment system, or lose their files. Long story short: if you ever want to see that word document again, it’ll cost ya.

Ransomware has taken the world by storm, infecting over 250,000 in just a few months [1]. Of those infected, about 3% pay the ransom, which averages at $300 [2]. Thus our friendly neighborhood Russian cyber-gangs have pulled in roughly 250,000 * 0.03 * $300 = $2,250,000.

Among those 3% who paid the ransom are the members of the Swansea Police Department in Boston, whose entire network was infected with ransomware. They reportedly paid $750 to regain control of their computers [3]. So much for not negotiating with terrorists.

In the cyber-criminals defense, though, $300 isn’t too bad a price (less than another computer), and they probably could have asked much more from a police department. I can just picture the conversation going like this:

SERGEI: We hooked the *police department? *Hell. YES!!! Talk about a jackpot. $100 a computer? Hell, we could charge $1000! They *need *those files!

IVAN: Come on, man. What are you, a sociopath? We charge $100 for *every *computer, even if it’s the president’s. I’m trying to run an *honest business *here!

In fact, though a few ransom-payers have reported not getting their files back, most do. And CryptoLocker’s masterminds have even been known to monitor tech support forums, helping victims complete their payments! [4] Can you say *win-win*?

Add to that the advanced technical knowledge required to design the software, and the fact that as of yet no Crypto Locker has been caught [3]. These guys are disgusting; they’re parasitic; and I don’t condone them in the least. But as far as criminals go—much like Jesse James and his crew, robbing trains in the 1800’s—you *kinda gotta respect them*.

That being said, make sure to stop by here [5] before you head out.

In the meantime, let’s learn some computer science.

Why exactly is CryptoLocker so hard to crack? The strength of the RSA encryption lies in the difficulty of factoring large numbers. It’s easy to multiply two massive prime numbers, producing an even larger product. But starting from that product and finding the original two primes is a much, much harder task. CryptoLocker’s creators take advantage of this fact, encrypting files with lengthy keys that are easy to produce but incredibly difficult to decipher.

More specifically: in order to solve the RSA problem, the computer must find some number *d *such that:

(*d ** *e*) Mod φ(*n*) = 1

Where *n *is some 2048-bit (massive) semiprime, or a number with only two prime factors *p *and *q*; φ(*n*) is given by (*p *– 1)(*q *– 1); and *e *is some number between 1 and φ(n) coprime with φ(*n*). [6]

For example, consider primes *p *and *q *given by *p *= 5 and *q *= 13.

Their product is the semiprime 5 * 13 = 65 = *n*.

φ(*n*) = (5 – 1)(13 – 1) = 4 * 12 = 48.

Choose some number *e *coprime with 48; say, *e *= 11.

For what *d *is *d * e* Mod φ(*n*) equal to 1?

Well, *d * e* Mod φ(*n*) = 1 –> 11*d *Mod 48 = 1.

One possible solution is *d *= 35. Note that 35 * 11 = 385, which, when divided by 48, produces a remainder of 1. So, the so-called public key is (*e = *11, *n = *65). This is available for the world to see, including the ransomware victim. The private key, which only the hackers know, is (*d = *35, *n = *65). This private key can be used to decrypt files that have been encrypted using the public key.

Now that we’ve produced our public and private keys, let’s try an encryption ourselves. Choose some secret message given by *m*: how about *m *= 23.* *We find the encrypted version *c* of this message with the expression *c *= *m ^{e}* Mod

*n*= 23

^{11}Mod 65 = 17. So,

*c*= 17.

Decryption of *c *requires the *private key*, and is carried out with the expression *m *= *c ^{d} *Mod

*n*. Let’s try it:

*m =*17

^{35}Mod 65 = 23. We’re back to our original message!

The strength of the RSA system lies in the fact that, given the public key, the private key is incredibly difficult to calculate. In our case, it was pretty easy. It’s clear that 65 is the product of primes 5 and 13; from here, we can calculate φ(*n*); from here, the expression *d * e* Mod φ(*n*) = 1 is easy to evaluate. But, with huge semiprimes like the ones used in CryptoLocker, *p *and *q *are pretty much impossible to discern. Without *p *and *q*, we can’t find φ(*n*), and without this we have no way to determine *d*. Thus the code can’t be cracked. We used a 2-digit *n*, but the semiprime used in CryptoLocker might be 400 or more digits. The largest RSA code that has been factored so far was 232 digits; RSA numbers larger than this probably won’t be factored any time in the near future [7].

…which brings me back to this!

References:

- The Inquirer: Cryptolocker ransomware has infected quarter of a million systems since September
- The Guardian: CryptoLocker attacks that hold your computer to ransom
- CBS Boston: Cryptolocker Ransomware Being Described As ‘The Perfect Crime’
- ZD Net: CryptoLocker’s crimewave: A trail of millions in laundered Bitcoin
- The Guardian: 10 ways to beat CryptoLocker
- RSA Algorithm Example
- RSA Numbers

There is one very interesting phenomenon happening here.

Euler’s totient function

φis familiar to algebraists as the function which, given ann, denotes the number of integerskwhich are both less thannand coprime ton. Note that the functionφis defined onarbitraryintegersn. (Euler’s function is significant because given an integern, each integerkboth less than and coprime toncorresponds to a generator of the cyclic group ofnelements.)Now,

φis defined above quite differently. Above,φis onlydefinedon semiprimesn= p * q, andφ(n) = (p – 1)(q – 1).I claim that these disparate functions

coincide on semiprimes(indeed, on their entire shared domain of definition). Given any semiprimen= p * q, the number of integerskwhich are both less than and coprime tonispreciselyequal to (p – 1)(q – 1).To prove this, it suffices to show that the number of integers which are both less than or equal to

nanddividenis equal to (pq) – (p – 1 )(q – 1) = (pq) – (pq – p – q + 1) = p + q – 1. Let’s countn‘s divisors. These are preciselyq, 2q, …, pq, andp, 2p, …, qp. Removing one to account for double-counting pq, we clearly arrive at p + q – 1, and the coincidence is proven.Question for the readers: can we generalize this to a closed form for Euler’s totient function for arbitrary (i.e., not necessarily semiprime) integers

n?We may use the above-established equivalence of these definitions to answer an important problem yet-unanswered in the above text. Why does the fact that

d is chosen as it is(i.e., that d satisfies (d * e) ≡ 1 (mod φ(n)) imply thatd serves as the decryptor(i.e., that m^e ≡ c (mod n) –> c^d ≡ m (mod n))? I have a possible answer.Though the statement of this above “decryptor theorem” refers to the semi-primes-based definition of

φ, we may, by the established equivalence, considerinsteadthe traditional algebraic definition as the totient function.We use a fact from elementary number theory, commonly called Euler’s totient theorem. This theorem states that for coprime

n,m, we have m^φ(n) ≡ 1 (mod n). Note also that the simple property of arithmetic that (a^b)^c = a^(b*c) extends generally to arbitrary rings as well (this follows from the definition of exponentiation and the associativity of the multiplication operation). Finally, recall that, by our hypothesis and by definition of modulo, we have k *φ(n) + 1 = (d*e) for some integer k. I assume basic knowledge of the rules of modular arithmetic.Thus, we get

c^d ≡ (m^e)^d ≡ m^(e*d) ≡ m^(k *

φ(n) + 1) ≡ (m^φ(n))^k * (m^1) ≡ 1^k * m^1 ≡ m (mod n)This is our desired result.

So that’s why RSA works! Euler’s theorem can be proven using some elementary algebra.

One question remains: Euler’s theorem works only for m

coprime ton. RSA’s specification seems to not include this condition… Is this an oversight on our part, or is something deeper happening?