Monday, July 7, 2014

What should the news write about, today?

I have always wanted to be journalist.  As a youth, it seemed terribly entertaining, jet-setting around the world chasing stories by day.  Living like Ernest Hemingway by night.  Taking [some of the first in the world] web monkey jobs at Georgia Tech's Technique and CNN gave me the opportunity to learn the news business from the inside.

Fundamentally, from an engineering perspective, the news business is out of sync with actual news events.

News events happen in real time.  There are bursts of information.  Clusters of events happen within a short span of time.  There is more news on some days, less news on others.

The news business demands content, visits, links, shares, likes, follows, trends.  Assembly line production demands regularized schedules; deadlines.  Deadlines imply a story must be written, even if there is no story to write.

Today's news business is incredibly cut throat.  Old dinosaurs are thrashing about.  Young upstarts are too.  My two young children only know of newspapers from children's storybooks, and icons on their Android tablets.  Classified ads, once a traditional revenue driver, have gone the way of the Internet.  Many "newspapers" are largely point-and-click template affairs, with a little local reporting thrown in.  Robots auto-post every press release.  Content stealing abounds.

All these inherent barriers exist for those brave few journalists left on the robot battlefield.  As usual with any industry that is being automated, the key to staying ahead is doing things that humans are good at, but robots not: creativity, inventiveness, curiosity, detective work.  Avoiding herds, cargo cults, bike shedding, conventional wisdom.

In my ideal world, news sites would post more news, look and feel a bit different, on days and weeks where there is a lot of news.  On slow news days, the site/app should feel like it's a slow news day.

The "it bleeds, it leads" pattern is worn out, and must be thrown in the rubbish.

Every day, every week, when a reporter is met with the challenge of meeting a deadline to feed the content beast, the primary question should be:  What recent trends/events impact the biggest percentage of your audience?

Pick any "mainstream" news site.  How many stories impact those beyond the immediate protagonists/antagonists/victims/authorities involved?

The news business, by its very nature, obscures and disincentivizes reporting on deep, impactful, and probably boring trends shaping our lives.  The biggest changes that happen to the human race are largely apparent in hindsight, looking back over the decades or hundreds or thousands of years.

The good stories are always the hardest to find.  Every "news maker" has the incentive to puff their accomplishments, and hide their failures.  Scientists have the same incentives (sadly):  Science needs negative feedback ("this theory/test failed!") yet there are few incentives to publish that.  Reporters must seek and tell the untold story, not the story everyone already knows.

Journalists of 20+ years ago were information gateways.  Selecting which bit of information to publish, or not, was a key editorial power.  Now, with the Internet, the practice is "publish all, sift later."  Today's journalists must reinvent themselves as modern detectives, versus the information gateways and "filters" of past decades.



Friday, June 13, 2014

Bitcoin and 51% mining power

Meta: This doesn't cover all incentives. More a high level reminder for new folks.

Tweet: #bitcoin mining market under-studied & interesting. Where else
can 50% market leaders disappear, and market adjusts in real time?
#Resilient

Explanation:

Bitcoin mining pools are entities that serve to aggregate the security services provided by bitcoin mining hardware owned by individuals all over the world.  These mining pools execute bitcoin monetary policy -- they are the key network entities that select transactions to be included in The Official Timeline of Bitcoin Transactions (the blockchain).

The companies and individuals that own bitcoin mining hardware form a second tier in the market.  These miners choose an aggregator (mining pool) to which they provide computing services, in exchange for bitcoin payments.

The unique and interesting bit is that these second tiers miners all employ software that auto-switches between mining pools based on a variety of economic factors:  pool monetary policy choices, profitability and fee structure of the pool, technical availability of the pool, collective strength of the pool (size of the aggregation) versus other pools, etc.

Thus, a large and popular mining pool, dominating the market with >50% marketshare, may disappear in an instant.  Or another pool may be more profitable.  Second tier miners all employ software that switches between first tier aggregators in real time.  Low economic friction vis a vis market entry implies that market leadership follows three trends:
  1. Network effects generate large marketshares rapidly.
  2. Low economic friction (low cost of entry) implies market leadership changes frequently.  Every 12 months or so.
  3. The market is resilient against failure of market leaders, even those with > 50% marketshare.
It is natural and expected that miners will see a pool grow large, and switch away to other pools.  ETA:  Standard recommendation, use P2Pool.

Finally, remember that mining pools and miners are paid with tokens within the system -- bitcoins.  It is always in a miner's interest that bitcoins maintain their value.  Any behavior that harms the network as a whole will directly impact a large miner's income stream.  The larger the miner, the larger the impact.


Thursday, June 5, 2014

Why I will not be joining the NSA protest

(copied from a twitter rant)

Some random points about the NSA and global surveillance:

Today, in 2014, paying low volume retail prices, it costs a DIY stalker $400/month to track every human going through a single street corner.  For a large government at mass scale, their costs will be 1/100th or 1/1000th of that, or lower.  The costs of tracking everyone on the planet is falling through the floor.

Don't blame the NSA for being the first to buy a hammer off a store shelf.

Most network engineers presumed the Internet has been tapped for all its life. The Internet was not built to be secure. The original Internet protocols all sent passwords and other sensitive data over the network in plaintext. To this day, the most popular email protocol sends email across the Internet in plaintext, ensuring at least 10 entities (routers) have a copy of your email. It was trivial for any university student to snoop your email.

The NSA's global surveillance is a commentary on the future of tech for everyone. What the NSA has today, other countries have tomorrow, everyone has next year.

Further, we are presented with the obvious paradox:  Law enforcement (LEA) needs to follow criminals, whereever they go. National defense needs to follow attackers around the world. If you build a space away from LEA, criminals go there, and LEA is tasked to follow.

Nevertheless...  Freedom of [physical] assocation, perhaps even freedom of thought is threatened by global surveillance.  Today's global surveillance is a natural consequence of technology, not the fault of the NSA.

We now live in a world where all authors, thinkers, activists, politicians, judges, attorneys are automatically recorded.

The movement and communications of all "wired" citizens on Earth are tracked. Relevant factor is how tech advances to permit NSA to "remember" ever higher percentage of daily data. Data firehose is staggeringly huge, even for NSA.

No matter the layers of process protections and personal honor defending such data, access to the movements and communications of everyone will be abused for political or petty reasons.

Consider NAACP v. Alabama in the context of a universally tracked digital world.

Globally, we must have a conversation about practical freedom and privacy limits to be placed on data collected without our knowledge.  This is much bigger than the NSA, and we should not get distracted from the bigger picture of global surveillance by breathing fire at the first organization that makes use of well known techniques and technologies.

My personal recommendation are laws in every jurisdiction regarding privacy, data retention, forced data expiration (deletion), decreasing use of secret evidence, and eventual notification of investigation targets.  We must avoid the "pre-crime" trap, where predictive models lock society into a straightjacket based on word or thought alone. Citizens must be able to spout off. Youth must be allowed to screw up and be forgiven by society, rather than curse a person with a minor youthful transgression for the rest of their lives.

We must encourage the government to be transparent, while protecting the privacy of our citizens in a global, internetworked society.


Wednesday, May 14, 2014

Bitcoin and the kernel-in-a-kernel security sandbox problem

When considering sandboxes and security jails, the problem space is interesting.  There is an increasingly common pattern. I call it "kernel-in-a-kernel." The problem is not really sandboxing untrusted code, but more fundamentally, sandboxing untrusted behavior.

Bitcoin sees this acutely: bitcoind manages the bitcoin P2P network. P2P network is flood-fill a la Usenet, and anyone may connect to any node. Built-in DoS protections are a must, but these are inevitably hueristics which duct-tape one problem area, while leaving another open to algorithmic attacks ("this P2P command runs an expensive query, that impacts other connected nodes").

One comprehensive solution is accounting. Account for the various resources being used by each connected party (CPU, RAM, disk b/w, ...) and verify that some connections do not starve other connections of resources. This solution is a sandbox that essentially becomes a kernel unto itself, as the solution is not merely preventing sandbox jailbreaks but at a higher level limiting algorithmic jailbreaks.

Think about the high level economics of any computing situation. You have limited resources, and various actors have valid and malicious needs of those resources.  What is the best practical model for balancing a set of limited resources, given potential malicious or buggy/haywire users of these resources?



Thursday, April 17, 2014

Cost of a DIY NSA?

A core thesis of mine these days is that citizens worried about NSA surveillance are only seeing the tip of the iceberg.  Computers get cheaper and faster every year.  This puts surveillance technology within reach of just about anyone.

Let's build our own surveillance network on a street corner, with some spy cameras, and consider how much it costs.

Data point #1: Bandwidth and storage costs for a 24 hour video feed.  We'll imagine that one camera produces 1GB/day.

Data point #2: Spy cameras such as this one.  To ensure good coverage of our street corner from multiple angles, we will place 6 spy cameras at $9.00/each.

Data point #3: Amazon cloud storage.  Inbound bandwidth is free.  We are uploading 6GB/day.  Because we are super-smart and know the NSA's secret time travelling techniques, we will store 60 days worth of video.  That caps our data storage at 360 GB.  Cost: $10.80/month.

Data point #4: Open source biometrics software.  This is free.

Data point #5: Amazon cloud processing.  Being conservative and over-estimating, we will have a 3-computer cloud processing our video data for biometrics, running 24/7.  Cost @ m3.large: $302.00/month.

Data point #6: Derived data.  The data so far is just raw video.  We want to build a database of persons, cars, etc. over time.  We'll assume a 3x expansion of data storage due to this.  That increases our data storage cost to $43.20/month.  This is a conservative over-estimation, as derived data will likely be exponentially smaller than raw video.

Data point #7: Site controller, to which all the spy cameras connect.  Just need a laptop and an Internet connection.  Cost: $300.00 one time for laptop, $60.00/month for Internet.

Total non-labor costs for our street corner DIY NSA project:
  • $354.00 initial
  • $405.20/month
For this cost, you may obtain a wealth of biometric data:  faces, license plates, associations between people, other biometric markers such as voice or gait, product usage.  Anything that may be gleaned from audio or video by simple, legal, public observation over time may be data mined for biometrics and personal data.

Folks engaged in the NSA debate often do not realize how inexpensive and accessible is this technology to local law enforcement, private corporations, and criminals.

Cheap cloud storage and data mining has implications for freedom of association and freedom of thought.  The NSA via Snowden has made this issue starkly clearly... but the media and public miss the larger point that technology itself, not NSA abuses, are leading us to global surveillance state where we will be watched by any number of public and private parties without our knowledge or consent.

Note: This post intentionally over-estimates costs by taking retail prices, not at scale, and assuming naive software implementations.  A local law enforcement agency or Google-level corporation could easily reduce these costs by large factors (100-1000x).

Monday, April 14, 2014

On Amazon, bitcoin and Stan Lee

After experimenting with bitcoin in July 2010 ("the great slashdotting") and finding it to be a sound design, my thoughts ran in a predictable direction:  what are the implications of a global digital currency?  What are the engineering practicalities required to bootstrap a brand new digital economy from zero?

the Amazon, though not the one we're talking about in the articleIf you imagine a new currency being rolled out worldwide, the idea of how Amazon.com might implement the currency inevitably comes up.  Amazon's business has several major components that deal with payments of various types.  It is relevant to Amazon and bitcoin that we consider all of these payment types, not just the well known storefront payment flow.
  • Amazon.com storefront.  Buy a book, pay with credit card, etc.
  • Amazon Payments.  A bit of a Paypal clone, though they don't market it that way.  Can have a positive balance, send P2P payments in-system.
  • Handling fulfillment and payout to merchants who sell goods through their systems.
  • Amazon AWS Flexible Payments Service
  • Amazon AWS DevPay
  • In the time it took for you to read this, Amazon has probably created another cloud payment service.
While performing research related bitcoin, I examined the money transmittal space to learn which corporations maintained money transmittal licenses nationwide.  According to my research circa 2010, Amazon is one of the few Fortune 500 companies with money transmitter licensing in all US states.  Adopting bitcoin, at a minimum, probably requires Amazon to re-evaluate compliance at a time when they are also trying to lobby US states on sales tax issues.

The network effect (Amazon's size) must also be considered.  Amazon.com today is basically "the Internet store." No need to qualify further.  With upcoming local delivery efforts, that effect is even more pronounced.

Like Google or Wal-Mart, every move by a company of this size has enormous consequences, intended and unintended.  Amazon adopting bitcoin today would have a disruptive effect on bitcoin, credit card systems, and banks worldwide.  At scale, one hopes Amazon acknowledges that great power and aims to wield it responsibly.  It is a fair and rational position for a company of that size to sit back and let the market sort out which crypto-currency to adopt.

On the technical side of the equation, a new payment system at Amazon.com is likely a major undertaking.  Amazon's software is entirely homegrown, and quite complex.  So complex that their store evolved into a web services business (AWS).  Having been recruited by Amazon myself, and having friends who work at Amazon as engineers, I know that Amazon stays at the bleeding edge of computing technology.  Integrating bitcoin -- or any digital cash -- probably requires extensive software updates throughout the system.  Digital cash, after all, does not behave like a credit card or debit card.  Those behavior differences can ripple through highly custom software, increasing engineering costs.

As a bitcoin supporter, I certainly feel the aforementioned disruptive effect is a positive one for the world.  But there are many reasons why Amazon in particular would be conservative about adopting an experimental new digital cash.  Given the above factors, my prediction -- dating back to 2010 -- was always that Amazon would sit back and let others decide the bitcoin-or-not question.


Sunday, March 23, 2014

Arthur C. Clarke on economic meltdowns and network congestion

Here is a fun passage from Arthur C. Clarke's Rama II fictional novel, published in 1989.  It presents an arguably realistic crash scenario for our [real world] financial systems.  Food for thought in the bitcoin economy or fiat economy both. We are incredibly dependent on computer financial databases.

In contrast, terrestrial affairs were dominated by the emerging world economic crisis. On May 1, 2134, three of the largest international banks announced that they were insolvent because of bad loans. Within two days a panic had spread around the world. The more than one billion home terminals with access to the global financial markets were used to dump individual portfolios of stocks and bonds. The communications load on the Global Network System (GNS) was immense. The data transfer machines were stretched far beyond their capabilities and design specifications. Data gridlock delayed transactions for minutes, then hours, contributing additional momentum to the panic.
 

By the end of a week two things were apparent—that over half of the world's stock value had been obliterated and that many individuals, large and small investors alike, who had used their credit options to the maximum, were now virtually penniless. The supporting data bases that kept track of personal bank accounts and automatically transferred money to cover margin calls were flashing disaster messages in almost 20 percent of the houses in the world.
 

In truth, however, the situation was much much worse. Only a small percentage of the transactions were actually clearing through all the supporting computers because the data rates in all directions were far beyond anything that had ever been anticipated. In computer language, the entire global financial system went into the "cycle slip" mode. Billions and billions of information transfers at lower priorities were postponed by the network of computers while the higher priority tasks were being serviced first.
 

The net result of these data delays was that in most cases individual electronic bank accounts were not properly debited, for hours or even days, to account for the mounting stock market losses, Once the individual investors realized what was occurring, they rushed to spend whatever was still showing in their balances before the computers completed all the transactions. By the time governments and financial institutions understood fully what was going on and acted to stop all this frenetic activity, it was too late. The confused system had crashed completely. To reconstruct what had happened required carefully dumping and interleaving the backup checkpoint files stored at a hundred or so remote centers around the world.
 

For over three weeks the electronic financial management system that governed all money transactions was inaccessible to everybody. Nobody knew how much money he had—or how much anyone else had. Since cash had long ago become obsolete, only eccentrics and collectors had enough bank notes to buy even a week's groceries. People began to barter for necessities. Pledges based on friendship and personal acquaintance enabled many people to survive temporarily. But the pain had only begun. Every time the international management organization that oversaw the global financial system would announce that they were going to try to come back on-line and would plead with people to stay off their terminals except for emergencies, their pleas would be ignored, processing requests would flood the system, and the computers would crash again.
 

It was only two more weeks before the scientists of the world agreed on an explanation for the additional brightness in the apparition of Halley's Comet. But it was over four months before people could count again on reliable data base information from the GNS. The cost to human society of the enduring chaos was incalculable. By the time normal electronic economic activity had been restored, the world was in a violent financial down-spin that would not bottom out until twelve years later. It would be well over fifty years before the Gross World Product would return to the heights reached before the Crash of 2134.