The NSA's plans to track everything you do

Crappy cell phones. 3 billion cell phones worldwide, half of which will have video and microphones. So 1 billion cell phones, you can record about 10 years worth?

Unless those cell phones are always encoding video all of the time (and then transmitting the resultant data!) I don’t see the numbers adding up.

Heck, just recording the phone calls sounds like (in this context) almost zero data! A phone conversation is encoded at what, 8 kbit/sec? You are on your phone, what, 2 hours a day (!!) on average? That’s roughly 2.5GB/year/phone. Am I doing my math right? These numbers are tiny!
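For anyone who wants to check that arithmetic, here is a quick sketch. It assumes the bitrate means 8 kilobits per second (not kilobytes) and takes the 2 hours a day at face value; both are guesses from the post, not measured figures, and the function name is just made up for illustration.

```python
# Assumptions (guesses from the post, not measurements):
#   voice codec bitrate of 8 kilobits per second
#   2 hours of calls per day, every day of the year
BITRATE_BITS_PER_SEC = 8_000
HOURS_PER_DAY = 2
DAYS_PER_YEAR = 365


def yearly_call_bytes(bitrate_bits_per_sec, hours_per_day, days_per_year=DAYS_PER_YEAR):
    """Bytes needed to store one phone's calls for a year (hypothetical helper)."""
    seconds_on_phone = hours_per_day * 3600 * days_per_year
    return bitrate_bits_per_sec * seconds_on_phone // 8  # 8 bits per byte


per_phone = yearly_call_bytes(BITRATE_BITS_PER_SEC, HOURS_PER_DAY)
print(f"{per_phone / 1e9:.2f} GB per phone per year")
```

Under those assumptions it comes out a bit over 2.6 GB in decimal units, or a bit under 2.5 GiB, so the ballpark holds.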

Filling this data store up is going to be an impressive challenge. Maybe they are going to be storing the hi-def video raw. I love it when people do that (yes, for people who are going to be editing the resultant data it makes sense in some cases, but really that is not the use here … unless they are going to frame all of us!!).
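To put a number on how hard filling it would be, here is a rough sketch going the other way: how fast would each phone have to send data, nonstop, to fill a yottabyte? It assumes a yottabyte is 10^24 bytes and reuses the 1 billion phones and 10 years from above; the helper name is made up for illustration.

```python
# Assumptions: 1 yottabyte = 10**24 bytes (decimal), 1 billion phones,
# 10 years of collection, every phone transmitting around the clock.
YOTTABYTE = 10**24
PHONES = 10**9
YEARS = 10
SECONDS_PER_YEAR = 365 * 24 * 3600


def required_bytes_per_sec(capacity_bytes, phones, years):
    """Constant per-phone data rate needed to fill the store (hypothetical helper)."""
    return capacity_bytes / (phones * years * SECONDS_PER_YEAR)


rate = required_bytes_per_sec(YOTTABYTE, PHONES, YEARS)
print(f"{rate / 1e6:.2f} MB/s per phone, 24 hours a day")
print(f"{rate * 8 / 1e6:.1f} Mbit/s per phone")
```

That works out to roughly 3 MB/s per phone, every second for a decade, which is a few thousand times the 8 kbit/s voice rate guessed at above.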

The referenced yottabyte study is here.

Jesus, Jason, you’ve got some pull.

So the article quote of this report’s yottabyte citation failed to acknowledge the fact that the report itself said the number was bullshit :)

Anyhow, most random sensor data will be useless to man and beast outside the particular application and in-situ context, so there is no point in the NSA tracking it except to build their budget out still further.

The general problem of sifting through enormous random collections of data is basically AI-hard. No really significant work has been done in this area (conspiracy theories aside) since the early 80s. The W3C Semantic Web project is currently grinding away at it, but the Semantic Web approach requires data source authors to extensively annotate their data and conform to standard ontologies, which of course will not be the case for the vast majority of data the NSA acquires. Good luck to the NSA in even indexing all the text much less analyzing actual data in random formats, schemas, and encodings.

Currently the best approach to analyzing vast amounts of text is Google’s, which requires billions of user queries to be counted and analyzed to find out what users are actually interested in. This search-term-emergence approach is amazingly good for what it does, but it’s useless for many conventional data applications, and since it has no “intelligence” of its own whatsoever, it’s difficult to adapt it for many desirable kinds of analyses.

Hah, fun. Someone should mail him, at the very least he cited the wrong article.

[Tinfoil Hat]
Clearly they must have advanced Quantum Computers!
[/Tinfoil Hat]

But seriously, “Yotta” bytes? Somebody misplaced just a few zeroes.

I did mail Bamford (c/o NRB) and he replied quite promptly and at length. Still in the middle of an email conversation.

By the way… I assume the DoD JASON project is not in fact named after you?

QT3 for the win!

By the way… I assume the DoD JASON project is not in fact named after you?

I can neither confirm nor deny those accusations, Senator.

Why do I always hear the Terminator theme when I read reports like these?

There is an easy solution to reducing the amount of storage needed to maintain, analyze and index the communications of every American by 90%. We just need to kill 90% of all Americans. An epidemic would do nicely and allow priority individuals to be inoculated and removed from the impacted population. Efficient use of existing technology can get this project ahead of schedule and under budget.

fwiw cellphones can act like a microphone and record anything anytime and transmit that, they don’t even have to be turned on

Common sense and a dead battery would beg to differ with the idea that a powerless piece of technology can do anything.

You might want to do a bit more research before you leap to that conclusion, as foogla’s point isn’t speculation.

In particular, switching your phone “off” doesn’t guarantee it isn’t using any power. I suppose you could pull the battery out when you turn it off, but who do you know that actually goes that far?

Only murderers take their cell phone batteries out!

Okay, let’s use some NSA keywords that will undermine America’s security:

Public Option!

World Peace!

Clean Air and Water!

High Gas Mileage!

Recycle!

I await the Black Helicopters.

A fun experiment that I’m not going to try would be to call someone you know who is not a government official on your cellphone while in DC near the White House and threaten the President, using words that are likely to be monitored for, and in a way that allows you to claim that you were just kidding. Then you time how long it takes for the Secret Service to show up and arrest you. This would settle the question of whether or not phone conversations are monitored on a regular basis, and if so how efficient the system is in responding to imminent threats.

xkcd: More Accurate - Roll-over text.

Great, something ELSE I’ll have to explain next time I go through customs!

No, officer, that’s not a bomb that’s my laptop.

Yes, I bought this shirt online.

No, that’s corn-meal flour.

No, that’s my deodorant.

Yes, I saw him post that but I didn’t report him to the authorities.

No, you can’t have any.

Well, it would be an experiment, for sure.

Untrained speech recognition on phone quality voice is far too crappy for pulling some word like “murder” out of a conversation. All the masons at construction sites talking about “mortar” would constantly be seeing black helicopters descending silently towards them on autorotation.

As for human monitoring, how many cell phone conversations are there at the same time during busy hour in DC? That’s the number of people you’d need sitting there in Fort Meade plugged in to headsets, eyes glazing over, contemplating suicide.

The computer speech recognition could be good enough to flag say, 40 calls of interest out of the hundreds ongoing at any moment. That’s enough reduction that human operators could listen to a recording of them with probably just a few minutes delay.

I’m not defending the idea as a good one, but it’s definitely technically feasible to have effective near realtime monitoring with today’s technology.