Archive for the ‘ESSAYS’ Category.

Lead the Charge Against More Advanced APIs

I received a conference solicitation with the provocative title of “Lead the Charge Against More Advanced APIs”. You could register and:

Add to your skills to defend against genuinely advanced cyber attackers employing a myriad of methods such as DDoS, DNS and API … Gain tools and insights that can help you protect enterprises from more advanced APIs

I suppose that I should be kind and refrain from making fun of the copywriter. On the other hand, I really hope that this imprecision is not catching.

Hint to future copywriters on security topics: DNS is a service and APIs are interfaces. They can be attacked, but they are not attacks or attack methods.

Smart endpoints, complaint aggregators, carrier support, and real-time interfaces for law enforcement: A solution for the 2013 FTC Robocall Challenge

Submitted to the FTC Robocall Challenge on January 15, 2013 [link]

Overview

I propose a system composed of smart endpoints and complaint aggregators, with interfaces to carriers and law enforcement, partially supported by bounties from successful prosecutions.

Benefits from this system accrue to all parties:

  • Smart endpoint hardware and software near consumers provides call screening features in a simple comprehensible manner (from the consumer point-of-view, an answering machine plus screening features). Building in flexibility allows the system to remain nimble as techniques become more sophisticated. Smart endpoints can capture complete audio data, compute audio fingerprints, and make classification decisions based on both content and metadata.
  • Complaint aggregation services benefit from a stream of prompt data in high volume. Beneficiaries of that aggregated data include law enforcement personnel and prosecutors, who can prioritize investigations by volume, and build stronger cases with high incident counts that are well-documented, supporting higher fines from successful prosecutions.
  • Interfaces between endpoints, carriers, and complaint aggregators enable the use of live call transfers as one of the call rejection mechanisms. Benefits include improved opportunities for call tracing, and selective automation-supported transfer of calls to law enforcement for identifying qualifiers and telemarketers.
  • Financial incentives from sharing bounties on successful prosecutions give at least a psychological/marketing boost to the entire system. There is some history for bounties in the U.S., in the form of qui tam litigation. Naming the endpoints “privateers” and noting the history of letters of marque is one evocative way to market the concept to consumers. Who doesn’t want to own a privateer protecting their privacy?

Details

The consumer point of view

The smart endpoint is easily comprehended as an answering machine PLUS:

  • Easy call block (one-press blacklist) and call enable (whitelist)
    • Implementation: Blacklist with simple sequence such as “*#” or long-press-* or long-press-#. Whitelist via memory of outbound calls. The typical set of answering machine features is also provided.
  • Automatic screening into ring-through or take-a-message, with messages automatically classified into an inbox or a suspicious box.
  • Like the current generation of call screening, some use of Caller ID is not ruled out, though clearly it is not definitive for robocall identification. Caller ID may be useful mainly for classifying legal unwanted calls, since legal callers have no need to hide their source. Callees have every right to ignore high-volume unwanted calls despite their legality. Even forged Caller ID data may be useful as weak evidence if callers exhibit any predictable geographic or bogus forgery preferences.
  • Take-a-message behavior includes a CAPTCHA to add one more bit of evidence. I assert that “dial 23 to leave a message” is barely distinguishable from “leave a message after the tone” in annoyance level. (A minor disagreement with Mr Schulzrinne’s seminar presentation on “The Network”.)
  • Easy after-the-fact blocking (manual classification) while listening to recorded messages
  • Handles all unwanted calls: illegal robocalls or unwanted legal calls (Note: This is my definition of optimum behavior — the consumer gets to define “unwanted”.)
  • Low probability of false positives since CAPTCHA can take a message and mark it less-suspicious
  • Incentives: reporting incidents offers consumers:
    • Valuable prizes: opportunity for share of proceeds from prosecution
    • Satisfaction of getting a caller blocked on your friends’ phones
    • Knowledge that the reports of others are contributing to the quality of your classifier
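The screening behavior described in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `SmartEndpoint`, `Disposition`, and `screen` are hypothetical, the proposal does not say what a blacklisted caller should hear (the sketch simply rejects the call), and Caller ID is treated as a cheap first check even though it can be forged.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Disposition(Enum):
    RING_THROUGH = "ring through"
    INBOX = "take a message, file in inbox"
    SUSPICIOUS = "take a message, file in suspicious box"
    REJECT = "reject"  # assumption: the proposal does not specify this outcome


@dataclass
class SmartEndpoint:
    whitelist: Set[str] = field(default_factory=set)  # numbers the consumer has called
    blacklist: Set[str] = field(default_factory=set)  # numbers blocked with one press

    def note_outbound_call(self, number: str) -> None:
        # Whitelist via memory of outbound calls.
        self.whitelist.add(number)

    def block(self, number: str) -> None:
        # One-press blacklist, during a call or later while reviewing messages.
        self.blacklist.add(number)
        self.whitelist.discard(number)

    def screen(self, caller_id: Optional[str], passed_captcha: bool) -> Disposition:
        # Caller ID can be forged, so it is only weak evidence here.
        if caller_id in self.blacklist:
            return Disposition.REJECT
        if caller_id in self.whitelist:
            return Disposition.RING_THROUGH
        # Unknown caller: take a message; a passed CAPTCHA marks it less suspicious.
        return Disposition.INBOX if passed_captcha else Disposition.SUSPICIOUS
```

A real endpoint would also weigh audio content and aggregator data; this sketch covers only the list-and-CAPTCHA part of the decision.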

Behind the scenes, this endpoint can:

  • Send unwanted call data (recorded audio and/or acoustic fingerprints, caller ID) to complaint aggregator
  • Use crowdsourced collaborative filtering data from complaint aggregator to improve classification
    • pre-filing
    • post-filing
  • Transfer live calls (classified as unwanted) via carrier for live call tracing or human investigation, while passing incident and classification information out-of-band to the complaint aggregator so it can be shared immediately with cooperating law enforcement systems.

The system point of view

Complaint Aggregators can:

  • Collect high-quality unwanted call data, including:
    • recorded audio and/or derived data such as acoustic fingerprints, speech to text, or vocoder-based representations
    • evidence from CAPTCHA success/failure
    • evidence from human consumer’s manual classification
  • Offer to share valuable prizes when aggregated evidence contributes to prosecutions.

Carriers can:

  • Provide support for transferred calls from consumer endpoints, for live call tracing, or for transfer to live law enforcement investigators so qualifiers and telemarketers can be identified.
    • Implementation: Like current carrier switches that support call transfer via flash-dialcode-phonenumber, carriers could also support call transfers with an opaque incident number included in the dialing sequence. The opaque incident number could be passed to the destination as DNIS (dialed number) information, and this small datum could be a (aggregator#,incident#) key that would allow systems with access to aggregator data to immediately look up incident data (which was transmitted to aggregators out of band).
  • Offer the consumer endpoint features as a hosted IVR service instead of customer premise equipment
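One way to picture the opaque incident number is as a fixed-width digit string that packs the (aggregator#, incident#) pair, so it can ride along in a dialing sequence and be unpacked at the destination. The sketch below is an assumption-laden illustration, not a carrier specification: the field widths (3 and 7 digits) and all function names are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical field widths; a real deployment would have to agree on these.
AGGREGATOR_DIGITS = 3
INCIDENT_DIGITS = 7


def encode_incident_key(aggregator_id: int, incident_id: int) -> str:
    """Pack (aggregator#, incident#) into digits suitable for a dialing sequence."""
    if not 0 <= aggregator_id < 10 ** AGGREGATOR_DIGITS:
        raise ValueError("aggregator_id out of range")
    if not 0 <= incident_id < 10 ** INCIDENT_DIGITS:
        raise ValueError("incident_id out of range")
    return f"{aggregator_id:0{AGGREGATOR_DIGITS}d}{incident_id:0{INCIDENT_DIGITS}d}"


def decode_incident_key(digits: str) -> Tuple[int, int]:
    """Recover (aggregator#, incident#) from the digits delivered with the call."""
    expected = AGGREGATOR_DIGITS + INCIDENT_DIGITS
    if len(digits) != expected or not (digits.isascii() and digits.isdigit()):
        raise ValueError("malformed incident key")
    return int(digits[:AGGREGATOR_DIGITS]), int(digits[AGGREGATOR_DIGITS:])


def look_up_incident(digits: str, incidents: Dict[Tuple[int, int], dict]) -> dict:
    """Find the incident record that the endpoint sent to the aggregator out of band."""
    return incidents[decode_incident_key(digits)]
```

For example, `encode_incident_key(12, 34567)` yields `"0120034567"`, which the receiving system can decode and use as a lookup key into aggregator data.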

The law enforcement and prosecutor point of view

Complaint aggregators provide a high-quality stream of evidence:

  • verifiable audio recordings
  • automatic prompt high-volume clustering of identical robocall messages

Carrier forwarding of live calls includes:

  • opportunity for more information from network tracing of live calls,
  • opportunity for insertion of human investigators into robocall-initiated calls to collect information from qualifiers and telemarketers
    • automatic clustering based on initial robocall message provides opportunity to prioritize high-volume known offenders for live call transfer to human investigators

Therefore law enforcement is more likely to identify the actual source of illegal calls, and prosecutors have a strong record of high volume incidents supporting a case for high fines.
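As a minimal sketch of the clustering step, reports of the same message can be grouped under a shared fingerprint and ranked by volume. The fingerprint here is a deliberately crude stand-in: a hash of the normalized speech-to-text transcript, which only matches identical wording. A real acoustic fingerprint would tolerate noise and distortion. The function names are hypothetical.

```python
import hashlib
import re
from collections import Counter
from typing import Iterable, List, Tuple


def transcript_fingerprint(transcript: str) -> str:
    """Stand-in fingerprint: hash of the lowercased words, punctuation ignored."""
    words = re.findall(r"[a-z0-9]+", transcript.lower())
    return hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()


def rank_campaigns(transcripts: Iterable[str]) -> List[Tuple[str, int]]:
    """Group reports by fingerprint and return (fingerprint, count), largest first."""
    counts = Counter(transcript_fingerprint(t) for t in transcripts)
    return counts.most_common()
```

The highest-volume clusters at the top of the ranking are the natural candidates for live call transfer to human investigators.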

Interfaces among providers of these components and services are important

Smart endpoints and complaint aggregator services are likely to be tightly integrated: rapid, nimble feature development is important, so a single vendor supplying both would benefit from coordinating endpoint features with back-end database and computation features. But a competitive market with multiple endpoint/aggregator providers would be healthier than a single source. Each supplier could implement a closed proprietary system and innovate as rapidly as it wants.

Law enforcement systems and personnel would want a common interface to multiple complaint aggregators and multiple carriers. Some simple general interfaces for pulling evidence from aggregators and carriers in real time would limit implementation effort on the law enforcement side without slowing down innovation on the data collection side.

Discussion of hostile counter-measures

Indeed, illegal callers are likely to adopt some counter-measures, some of which will be more effective than others. All will increase the expenses incurred by the callers.

  • To evade CAPTCHA challenges, callers may implement voice recognition or insert humans. Both are expensive, and can be made more so by increasing the variety of challenges.
  • To evade content matching, callers can introduce chaff to recording content (noise, distortion, voice generator parameter changes, music, timing changes). Audio fingerprinting techniques are already immune to many of these variations. Since the real domain is speech, speech-to-text algorithms will tend to be insensitive to these recorded content changes as well.
  • Attackers could try to overwhelm or subvert aggregator services or data structures. However, participation in the infrastructure would be limited to subscribing users, with enough resilience to restrict access to legitimate devices and ignore denial of service attacks.
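To illustrate the first point about CAPTCHA variety, here is a minimal sketch of a "dial N to leave a message" challenge that changes on every call, so a caller cannot simply pre-record the right keypress. The function names are hypothetical, and a fielded system would presumably vary the wording and the type of challenge as well, not just the digits.

```python
import secrets


def new_challenge(length: int = 2) -> str:
    """Pick a fresh random digit string for each call, for example "23"."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "".join(str(secrets.randbelow(10)) for _ in range(length))


def prompt_for(code: str) -> str:
    """Text that a speech synthesizer would read to the caller."""
    return "Dial " + " ".join(code) + " to leave a message."


def passed(code: str, dtmf_digits: str) -> bool:
    """True when the digits keyed by the caller match the challenge."""
    return dtmf_digits.strip() == code
```

Lengthening the code or rotating among several challenge styles raises the caller's cost further, at some cost in annoyance to legitimate callers.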

Evaluation

At a minimum, even as a standalone device, a smart endpoint offers as much as current user-configured call screening devices, with a simple comprehensible consumer feature set.

Aggregating evidence from many endpoints implementing manual classification and automatic CAPTCHA (in collaborative filtering and crowdsourcing fashion) makes the endpoints more powerful than any standalone device.

Access to realtime streams of call information through aggregators allows law enforcement to move from correlating randomly sampled incomplete delayed complaint reports to acting on deliberately selected fully-documented immediate events. Then for prosecution, automation support for building large related incident lists is useful for maximizing fines.

Endpoint implementation can be in a hardware device near the consumer, or can be a hosted service located at a carrier or IVR vendor, or can be embedded in a mobile phone application. All of those implementations benefit from sharing data with aggregators.

For the future, an architecture including both distributed smart endpoints and centralized database and compute support is a natural solution to implementing new technologies, such as authenticated caller ID in a VoIP-based consumer world. Successful endpoint implementors and complaint aggregators in the current generation will be well positioned to deliver implementations and services for the next generation.

Services sans support

Face it, the most successful services in the new era are the ones that provide something valuable while keeping their per-user costs near zero: some service, no customer support, and users happy nonetheless.

Phone service does not fit that model. There are just too many occasions for “no support” to be unacceptable.

Today’s example: Porting a phone number from Verizon to Google Voice: Just $20, it works great, except when it doesn’t. In my case, SMS never successfully ported. The only support mechanism is a help page that states that it takes up to five business days for text messaging to resume after a port, and after that time, you can visit a web page to fill out a form that causes no observable action.

This could have been mitigated by supplying information instead of support. Expose the internal states of the porting process, so a customer can see progress or can know who to blame. Track the tickets on the problem reports.

But Google Voice as it stands offers no information and no support (and all attempts to “get a human” fail). So it gets the blame for failure to deliver, even if it’s somebody else’s fault. (Who knows, perhaps carriers and SMS gateway providers drag their feet on number porting. But with no information offered, all I know for sure is that Google Voice couldn’t get it done after weeks of waiting.)

In summary: For some businesses the appropriate level of offered support ought to be greater than zero. More status information can mean less customer support.

Don’t try to make me spam my contacts

High-quality social network sites grow because contacts are real, and site-mediated communication is welcome. For example, LinkedIn from the beginning treated contact information very carefully, never generating any email except by explicit request of a user. Therefore it felt safe to import contacts into it, since I wasn’t exposing my colleagues to unexpected spam. (LinkedIn has loosened up a bit. Originally you could not even try to connect to someone unless you already knew their email address. They made it easier to connect to people found by search only, and you can pay extra to send messages to strangers; nonetheless, in my experience it’s always user-initiated.)

Low-quality social network sites grow by finding ways to extract contacts from people so the system can spam them, or trick users into acting as individual spam drones. (The worst-case examples are those worm-like provocative wall postings that, once clicked, cause your friends to seem to post them also. Just up from that on the low rungs are the game sites that post frequent progress updates to all your friends.)

I’m a joiner and early adopter, but I rarely invite people to use a service they’re not already using. That’s my way of treating my contacts respectfully, and protecting my own reputation as a source of wanted communication, not piles of unsolicited invitations.

Google Plus has recently taken a step toward lower quality by changing their ‘Find People’ feature. Previously it identified/suggested Google Plus users separately (good). Now it identifies and suggests everyone on your contact list and beyond, without identifying whether they are already a Google Plus user. Really they are nudging me toward being an invite machine for them.

As a result, Google Plus will get less high-quality social-network building (among people who respect their contacts and take care with their communication), and more low-quality social-network building (piles of invites from people I barely know). If it goes too far downhill, Google will endanger the willingness of high-quality users to let Google know anything about their contacts or touch their email.

Desk Checking

Ole Eichhorn has written a great essay on “the lost art of desk checking,” sharing how slow and painful experiences with debugging led to habits of deliberate and careful pre-planning and checking.

My own parallel experiences: Okay, I’m going to date myself here too. I’m also 49 years old, but didn’t start programming until Senior High. First experiences were with Basic on a Xerox Sigma 7 (thanks, Xerox), and a Wang 2200B. Not much learned there.

I learned more during summer vacations, when I paid real money to the University of Rochester to use their mainframe. I discovered that my first APL programs actually worked. I tried my hand at IBM 360 assembly language programming, but debugging was expensive – each assemble/link/run cost over $2. So I started editing the binary object decks on a keypunch instead, reducing the cost of a link/run to something under 80 cents.

While I followed the technology curve and have all the modern development environment power tools, there’s nothing like designing cleanly and understanding what’s going on. To quote Eichhorn:

To write code I just look at my screen and start typing, and to fix code, I just look at my screen some more and type some more. So now, finally, I’m done with desk checking, right?

Wrong.

I desk check everything. Thoroughly.

And this, to me, is a major league black art which is lost to all those who didn’t have to hand-punch cards and wait a week for their deck to run. It is a lost art, but an essential art, because all the tools which make entering code and editing code and compiling code and running code faster don’t make your code better.

Prediction for 2008: Service providers avoid straightforward DTV answers

Like many others in 2008, I am cheap, don’t buy TVs very often, subscribe only to basic cable, and have questions about the impending February 17 2009 shutdown of analog over-the-air TV channels.

My prediction for 2008 is that confusion will reign because part of the answer is provided by cable, satellite, or telephone service companies, and their incentive is to maintain confusion because that’s an effective “up-sell” technique.

The simple story is that over-the-air (OTA) analog goes away, replaced by OTA digital. For OTA consumers, it’s just a matter of getting an ATSC tuner (built in to a newer TV, or standalone with a government-subsidized coupon).

The part that is different for every locality and service provider: what to do with analog TVs on analog cable systems. For every locality there is a simple cable story: the cable company could tell you their plans for analog channels, e.g. “We’ll continue to carry local channels for our analog customers through [let’s say] 2012.” But the cable companies will generally avoid that story. (I tried to extract it from TWC and they failed the first test, answered the wrong question entirely.)

Why would they tell you a simple “analog on cable is OK for N years” story when they would rather upgrade you to a new digital cable set-top box, and while they’re at it, try to replace your phone too?

So, even if it’s true that analog cable customers will live just fine on the analog cable plant for quite some time, in most promotional materials you’ll see that option only in extremely fine print, or omitted as a choice altogether.

Now, it is also true that for bandwidth utilization reasons, the cable companies would like to convert their cable plant to all-digital. If they somehow manage to convert all their cheap $8/month basic cable customers to some fatter bundle, all the better for them. The good thing is that digital OTA tuners will provide competition, so the cable company had better have something that competes with free digital for cheap customers, or they’ll just lose the low end altogether. (The only reason I have basic cable is because my analog OTA reception is poor. Once digital OTA becomes cheap (it’s not yet, standalone tuners are too expensive), I’ll be a digital OTA customer unless cable really makes it worthwhile not to switch. It’s a race to the bottom for my dollar.)

Once they start losing a significant number of customers to digital OTA, then they will start publicizing cheap basic analog and constructing cheap basic digital. But they will wait as long as possible.

Vote but Verify

Local Rochester-area political blogger Thomas Belknap recently railed about HR 811, interpreting its requirement of a voter-verified durable paper ballot as a small-minded banning of an attractive future of modern networked reliable electronic voting machines. I could not resist posting my disagreement into the comments on his blog, and perhaps I am going to convince him, as he edited out my most provocative snide political shots and left in some of my more reasoned comments.

As a security person, I must point out that if machines do not produce a reliable auditable record, then all you have is a fait accompli fraud-blessing device. That’s the short version of the security argument.

I’m willing to go along with NIST that, as of today, all-electronic systems are an important research topic, not a settled present alternative:

The approach to software-independence used in op scan is based on voter-verified paper records, but some all-electronic paperless approaches have been proposed. It is a research topic currently as to whether software independence may be able to be accomplished via systems that would produce an all-electronic voter-verified, independent audit trail (known as software IV systems).

A durable paper ballot requirement is not a retrograde goof, nor a rejection of e-voting. It’s a reflection of current reality, that all-electronic e-voting implementations are asking for trouble. Codifying an allowance for all-electronic systems today would just open the door to arguments about what’s good enough cryptographically, arguments that will be settled by folks even less competent than our representatives. Codifying the well-understood voter-verified paper audit trail as a requirement puts an immediate crimp in the shopping spree for fancy-looking machines that are rotten inside – a shopping spree that will continue if this law isn’t passed, creating an ever-larger lump of sunk investment in pretty bad technology.

A paper audit trail today isn’t a rejection of e-voting, it is progress toward a more robust implementation that in the future will, no doubt, also include other alternative durable auditable records.

For credible background on the security geek consensus, see the above-quoted NIST draft, the US ACM policy recommendation, or Bruce Schneier (University of Rochester physics alumnus!). Or anything by Ed Felten or Avi Rubin on this subject. In this case, our representatives seem to be listening to informed advisers.

Regarding politics: All parties’ oxen have been gored at one time or another by voting fraud or rumors of fraud, so this does seem like an issue on which a consensus could form.

Systems programmers help people

Way back in the 1970s, I attended a banquet at RIT, for incoming or prospective students. My assigned seat placed me next to another intended Computer Science major.

I had cut my teeth in high school on some Basic programming (on a Xerox Sigma mainframe and a Wang 2200B), then taught myself APL and IBM/360 assembly language (paying for access at UR to an APL terminal, and editing object decks on the keypunch to save money while debugging assembly language programs).

My dinnermate at the banquet had had no such experience. So in choosing her major and concentration, she had to depend on the layman’s descriptions she heard during a college visit. You see, application programmers write programs that actually do things. Meanwhile, system programmers work on the operating system.

What’s an operating system? Well, it doesn’t do anything itself, it’s just there to help people write application programs.

Why did she choose Computer Science with a system programming concentration? “I like to help people.”

Goodbye IE6

My installation of Microsoft Internet Explorer 6 (version 6.0.2900.2180.xpsp_sp2_gdr.050301-1519) has developed the unfortunate problem of frequently (about once a day) trashing its ability to render correctly: painting its window contents at various places all over the display, rendering in the wrong font, leaving turds all over its window while scrolling. Once it starts I have to kill iexplore.exe to make it stop. I believe it is fully-patched.

In my mind the appearance of this problem is correlated with the appearance of two new aggressive JavaScript interfaces: The much-improved BlogLines feed selector, and the very-irritating Yahoo Finance streaming quotes feature (which slows down every refresh even when set to “off”). That may just be coincidence.

It does mean there’s some serious undiscovered memory corruption going on inside IE6 somewhere.

It’s a good time to switch to Firefox and/or IE7.

Storage Innovation Ahead

The existence of cheap and presumed-reliable storage services such as Amazon S3 will cause a burst of innovation in personal and corporate storage options. A particularly good fit: content-addressable storage schemes such as plan9 venti and git, which offer frugal use of bandwidth (important when metered) and attractive features like version snapshots “for free.”
A little searching shows one talented software developer thinking along these lines already:
Brad Fitzpatrick: wsbackup — encrypted, over-the-net, multi-versioned backup.
There will be more.
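
To make the content-addressable idea concrete, here is a toy in-memory sketch: each blob is stored under the hash of its own contents, so unchanged files cost nothing to store or upload again, and a snapshot is just a small map from file names to hashes. The class and function names are hypothetical, SHA-256 is my choice for the sketch, and none of this reflects the actual on-disk formats of venti, git, or wsbackup.

```python
import hashlib
from typing import Dict


class ContentStore:
    """Toy content-addressed blob store; a dict stands in for remote storage."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so identical data is stored once.
        key = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self.blobs[key]


def snapshot(store: ContentStore, files: Dict[str, bytes]) -> Dict[str, str]:
    """Record a version as a mapping of file name to content hash."""
    return {name: store.put(data) for name, data in files.items()}


def restore(store: ContentStore, snap: Dict[str, str]) -> Dict[str, bytes]:
    """Rebuild the files of a snapshot from the store."""
    return {name: store.get(key) for name, key in snap.items()}
```

Taking two snapshots that differ in one file adds only one new blob to the store, which is where the frugal bandwidth use and the nearly free version history come from.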