Random blatherings by Jeff: On bitcoin data spam, and evil data

Tuesday, April 30, 2013

On bitcoin data spam, and evil data

What happens if somebody puts evil data in the blockchain? What responses are available?

It is a truly awful situation, and difficult to address.

What happened?

The easiest way to explain what happened here is through analogy. Imagine if someone picked a penny stock on the NYSE and made a sequence of apparently pointless trades. Then they announced that the prices of their stock trades actually encoded links to some "evil" websites. You know, maybe $0.01 means "a" and $0.02 means "b", etc. Stock market tickers are public, lots of places archive that data, so now lots of people have "links to evil data". Except really they don't. What they have is a list of stock trades. You'd need special software to turn that into some other kind of data.

This is what someone has done with Bitcoin. They sent a series of monetary transactions that did not actually represent real trades, and then announced that with a special program you could turn them back into some text. That text then contains links to, well, I don't actually know what because I haven't looked. But let's assume it's bad stuff.

What solutions are available? Software update?

The answer is very complex, with implications that travel to the heart of bitcoin's value.

Sending bitcoins requires two pieces of data: a bitcoin address, and an amount (number of bitcoins). There is no "comments field" or anything of that nature. A bitcoin address is just a random-looking 20-byte piece of data. Normally those 20 bytes are derived by hashing a public key with SHA256 and then RIPEMD160, but the network cannot distinguish a valid 20 bytes from an invalid 20 bytes. Therefore, if you are willing to waste money -- albeit very small fractions like 0.00000001 bitcoins -- by sending that money to invalid bitcoin addresses, you have essentially created a channel for arbitrary data transmission.
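To make the channel concrete, here is a minimal Python sketch (mine, not taken from any bitcoin client) of how any 20 bytes can be dressed up as an ordinary-looking address and read back later. It assumes the standard Base58Check layout: a version byte of 0x00, the 20 bytes, then the first 4 bytes of a double SHA256 as a checksum. The function names are made up for illustration.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def _checksum(body: bytes) -> bytes:
    # First 4 bytes of SHA256(SHA256(body)).
    return hashlib.sha256(hashlib.sha256(body).digest()).digest()[:4]


def base58check_encode(payload20: bytes, version: bytes = b"\x00") -> str:
    """Wrap exactly 20 arbitrary bytes as an address-shaped string."""
    if len(payload20) != 20:
        raise ValueError("expected exactly 20 bytes")
    raw = version + payload20
    raw += _checksum(raw)
    n = int.from_bytes(raw, "big")
    digits = ""
    while n > 0:
        n, rem = divmod(n, 58)
        digits = B58_ALPHABET[rem] + digits
    # Each leading zero byte is written as a literal '1'.
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + digits


def base58check_decode(address: str) -> bytes:
    """Recover the 20 payload bytes, verifying the checksum."""
    n = 0
    for ch in address:
        n = n * 58 + B58_ALPHABET.index(ch)
    pad = len(address) - len(address.lstrip("1"))
    raw = b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")
    body, checksum = raw[:-4], raw[-4:]
    if _checksum(body) != checksum:
        raise ValueError("bad checksum")
    return body[1:]  # drop the version byte


# Hypothetical message, padded with zero bytes to fill one "address".
message = b"hello, blockchain!".ljust(20, b"\x00")
address = base58check_encode(message)
assert base58check_decode(address) == message
```

Note that the checksum only protects against typos. It says nothing about whether anybody holds a key for those 20 bytes, which is why such an "address" passes every format check while the coins sent to it are simply lost.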

The bitcoin blockchain is in one sense a massively replicated ~7GB database that stores data for all eternity. There remains the open question of what happens if somebody dumps data into the blockchain, unrelated to currency. Maybe a government finds that data illegal. Smart people argue that mens rea (the legal requirement of criminal intent) and similar mitigating factors apply. But it remains an unknown. Meanwhile, the vast majority of people are burdened with this awful data they don't care about, simply to use the bitcoin payment system they do care about.

There are many conflicting motives and incentives (very Brave New War-ish):

Anarchist activists want to publish this information, to force authorities to act (or not) when this illegal data is published.
Bitcoin activists want to publish this information, to force developers (us) to address The Filter Issue (see below).
Some people see more value in bitcoin as "eternity data storage", if expensive and inefficient, than bitcoin as a currency.
It is, quite literally, impossible to prevent use of bitcoin for data transmission. It is a purely digital currency. Who can say which digits are "evil" or "good", allowed or disallowed? You can detect certain patterns, and possibly filter those.
Many bitcoin users are using bitcoin for its intended purpose, as currency transfer, and dislike carrying the costs for these data transmission uses.
As this carrying-data issue rears its head, it increases the costs for anyone running a P2P node on the all-volunteer bitcoin P2P network. This shrinks the total number of bitcoin P2P nodes.
As such, due to both legal and resource-usage issues, "data spam" has long been theorized as an attack vector.

The "Filter Issue"

There are very large ramifications to filtering out transactions, even ones that are obviously data spam.

Fungibility: currently, all bitcoins have the same value. My 1.0 BTC and your 1.0 BTC are equivalent in value. Once you start filtering transactions, you are injecting policy-based censorship into the mix. Some bitcoins are accepted by all, some bitcoins are only accepted by a few. The value of a bitcoin itself becomes a product of its ancestry. If this policy is implemented, perhaps by court order to a bitcoin mining pool, it could lead to chain forks, where, e.g., bitcoin users in the United States see a different set of spendable bitcoins than users outside the US. That would be a disaster for bitcoin.

It is widely speculated, based on common forum comments in the crypto-anarchist community, that this current round of data spam is intended to force bitcoin users, developers and governments of the world to take action to censor -- or not -- certain bitcoin transactions. Trying to force the issue, to establish a precedent one way or the other. Or, more pessimistically, a party could be simply trying to shut down bitcoin.

The bitcoin community is very staunchly anti-censorship, but if data spam were to threaten the life of bitcoin, I imagine ideology-neutral "it looks like data, not currency" filtering might appear. Bitcoin is ultimately a product of voting -- you vote by choosing which software version and software ruleset to download.

The users can always vote data spam off the island... but will they? Is data transmission a valid use of bitcoin? The users themselves choose the definition of "valid."
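For a feel of what "it looks like data, not currency" filtering might mean, here is a rough Python sketch of one possible heuristic. It is an assumption of mine, not a rule any bitcoin software is known to apply: real address hashes are effectively random bytes, so an output whose 20 bytes are almost all printable ASCII (or zero padding) is suspicious. The name `looks_like_text` and the 0.9 threshold are hypothetical.

```python
import hashlib
import string

# Byte values of printable ASCII characters (letters, digits, punctuation, whitespace).
PRINTABLE = set(string.printable.encode("ascii"))


def looks_like_text(hash160: bytes, threshold: float = 0.9) -> bool:
    """Flag a 20-byte output hash whose bytes are mostly printable ASCII or zero padding."""
    if not hash160:
        return False
    hits = sum(1 for b in hash160 if b in PRINTABLE or b == 0)
    return hits / len(hash160) >= threshold


# A stand-in for a genuine hash: 20 bytes of SHA256 output are effectively random.
genuine = hashlib.sha256(b"some public key").digest()[:20]
# A stand-in for data spam: plain text padded to 20 bytes.
spam = b"hello, blockchain!".ljust(20, b"\x00")

print(looks_like_text(genuine), looks_like_text(spam))
```

A filter like this is trivially evaded by compressing or encrypting the data first, after which it is indistinguishable from a real hash. That is the "impossible to prevent" point above: you can catch certain patterns, not the practice.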

What solutions could be deployed right now?

Currently being discussed is avoiding the relay of economically worthless (under $0.0001, say) bitcoin transactions. Thus, higher transaction fees would be required to send out lots of data, directly raising the cost.
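As a sketch of what such a relay rule could look like, the Python below refuses to relay a transaction if any of its outputs is worth less than the cutoff. This is an illustration under stated assumptions, not the rule under discussion as actually implemented: the `TxOut` type, the function names, and the exchange rate passed in (about $100 per bitcoin) are all assumed for the example.

```python
from dataclasses import dataclass
from typing import Iterable

SATOSHI_PER_BTC = 100_000_000  # 1 BTC = 10^8 base units


@dataclass
class TxOut:
    """Hypothetical stand-in for a transaction output."""
    value_satoshi: int


def is_worthless(out: TxOut, usd_per_btc: float, min_usd: float = 0.0001) -> bool:
    """True if the output's value at the assumed exchange rate is below the cutoff."""
    usd_value = out.value_satoshi / SATOSHI_PER_BTC * usd_per_btc
    return usd_value < min_usd


def should_relay(outputs: Iterable[TxOut], usd_per_btc: float) -> bool:
    """Relay only if every output carries a non-trivial amount of money."""
    return not any(is_worthless(o, usd_per_btc) for o in outputs)


# At an assumed $100 per bitcoin, a 1-satoshi output is worth $0.000001.
spam_tx = [TxOut(1), TxOut(1), TxOut(1)]
normal_tx = [TxOut(10_000)]  # $0.01
print(should_relay(spam_tx, usd_per_btc=100.0))    # False
print(should_relay(normal_tx, usd_per_btc=100.0))  # True
```

The rule does not make data spam impossible. Each 20-byte chunk simply has to carry more money, so the cost per byte goes up by whatever multiple the cutoff imposes.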

See Gregory Maxwell's post, "to prevent arbitrary data storage in txouts — The Ultimate Solution" for a proposed solution.

17 comments:

BurBurBur, April 30, 2013 at 1:35 PM
The blockchain IS a place for information storage. It is just a matter of interpretation of the data. You do not like it - you do not interpret it. And put your mind at ease knowing that they paid hefty transaction fees for those megabytes.

Anonymousg64, April 30, 2013 at 2:46 PM
why not just implement address checking in the mining nodes
so you can't send to invalid addresses.

http://rosettacode.org/wiki/Bitcoin/address_validation

Amos Bairn, April 30, 2013 at 9:36 PM
>"Sending bitcoins requires two pieces of data: a bitcoin address, and an amount (number of bitcoins). There is no "comments field" or anything of that nature."

That is not quite true. A bitcoin transaction sends coins to a script that the next user has to satisfy in order to send them on. Someone who knows what they are doing could insert plaintext information into a script and still send the bitcoins to a valid address.

Platinum Engineer, May 1, 2013 at 6:09 AM
There is actually exactly that, a comments section basically. A public note can be embedded in the blockchain for all to see.

Tiago, May 2, 2013 at 8:21 AM
Any large enough number may be interpreted as *anything you want*.

The blockchain is a huge number. It may be interpreted as pretty much anything one may imagine. Any CP picture existent may be "extracted" from it, as well as any "secret documents from the CIA" and so on.
Obviously, to be able to make such extraction, you need the necessary software. Bitcoin clients will never be written to make such "nasty" extractions, so we are fine.

Prosecutors can't just get your computer, pass it through their nasty software, and then claim: see! CP got out!
That could be done out of any computer in the world, with or without a blockchain.

So, please, don't overreact and attempt to kill "arbitrary data in the blockchain". There might be interesting and useful use cases for it that just didn't come up yet. As long as spamming is discouraged (and miners have a strong interest to do it), we're fine.

Anonymous, May 3, 2013 at 3:42 PM
The ability to encode data into the block chain could be a useful anonymous comms application for many people (because the block chain is received by everyone it is impossible to tell who the intended recipient of encoded communications is). Good and bad applications are possible e.g. an oppressed person in North Korea sending out news to the CIA vs an Osama bin Laden sending out instructions to attack. Many though may just like the idea of creating indelible graffiti on the Internet. Block chain message encoding can't easily be prevented, but if levels increase, makes sense to me to introduce a minimum transaction size to prevent everyone with an application doing it and degrading the network for free.

Giano, May 21, 2013 at 5:02 PM
1* - i can't imagine any bad data, can you give me an example?
2* - if the government wants to block this data from being read, they have to block the method that explains how the data has to be read, not the whole blockchain, this is obvious!!

can you explain that to me better? i'm not an expert in this field....

Unknown, May 29, 2013 at 8:15 AM
What if I am hired by Goldman Sachs: "here, take this 2M USD, go destroy BTC".
Let's say that 1 BTC == $100.
I go and get 10K BTC for $1M, and keep the other million for myself ;).
Let's say I send 0.0001 BTC to fake addresses, or whatever you are talking about (arbitrary data in the block chain).
From 10K BTC I could make 100 million fake transactions, and if I send 0.00001 BTC each it would be 1 billion invalid transactions.
So my question is: would this kind of action hurt the BTC network, the miners, or the coin's value in any way?

Giano, October 5, 2013 at 10:05 AM
I have another question (excuse me if I said silly things before): http://cnnmoneytech.tumblr.com/post/49468888972/how-to-turn-bitcoin-code-into-a-ben-bernanke-portrait is it still possible to embed comments directly in the transaction hash, and if yes, how?