Submitted to the FTC Robocall Challenge on January 15, 2013 [link]
Overview
I propose a system composed of smart endpoints and complaint aggregators, with interfaces to carriers and law enforcement, partially supported by bounties from successful prosecutions.
Benefits from this system accrue to all parties:
- Smart endpoint hardware and software near consumers provide call screening features in a simple, comprehensible manner (from the consumer point of view, an answering machine plus screening features). Building in flexibility allows the system to remain nimble as techniques become more sophisticated. Smart endpoints can capture complete audio data, compute audio fingerprints, and make classification decisions based on both content and metadata.
- Complaint aggregation services benefit from a stream of prompt data in high volume. Beneficiaries of that aggregated data include law enforcement personnel and prosecutors, who can prioritize investigations by volume, and build stronger cases with high incident counts that are well-documented, supporting higher fines from successful prosecutions.
- Interfaces between endpoints, carriers, and complaint aggregators enable the use of live call transfers as one of the call rejection mechanisms. Benefits include improved opportunities for call tracing, and selective automation-supported transfer of calls to law enforcement for identifying qualifiers and telemarketers.
- Financial incentives from sharing bounties on successful prosecutions give at least a psychological/marketing boost to the entire system. There is some precedent for bounties in the U.S., in the form of qui tam litigation. Naming the endpoints “privateers” and noting the history of letters of marque is one evocative way to market the concept to consumers. Who doesn’t want to own a privateer protecting their privacy?
Details
The consumer point of view
The smart endpoint is easily comprehended as an answering machine PLUS:
- Easy call block (one-press blacklist) and call enable (whitelist)
- Implementation: Blacklist with simple sequence such as “*#” or long-press-* or long-press-#. Whitelist via memory of outbound calls. The typical set of answering machine features is also provided.
- Automatic screening and classification into ring-through or take-a-message with automatic classification into an inbox or a suspicious box.
- Like the current generation of call screening, some use of Caller ID is not ruled out, though clearly it is not definitive for robocall identification. Caller ID may be useful mainly for classifying legal unwanted calls, since legal callers have no need to hide their source. Callees have every right to ignore high-volume unwanted calls despite their legality. Even forged Caller ID data may be useful as weak evidence if callers exhibit any predictable geographic or bogus forgery preferences.
- Take-a-message behavior includes a CAPTCHA to add one more bit of evidence. I assert that “dial 23 to leave a message” is barely distinguishable from “leave a message after the tone” in annoyance level. (A minor disagreement with Mr Schulzrinne’s seminar presentation on “The Network”.)
- Easy after-the-fact blocking (manual classification) while listening to recorded messages
- Handles all unwanted calls: illegal robocalls or unwanted legal calls (Note: This is my definition of optimum behavior — the consumer gets to define “unwanted”.)
- Low probability of false positives since CAPTCHA can take a message and mark it less-suspicious
- Incentives: reporting incidents offers consumers:
- Valuable prizes: opportunity for share of proceeds from prosecution
- satisfaction of getting a caller blocked on your friends’ phones
- knowledge that the reports of others are contributing to the quality of your classifier
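The consumer-facing behavior above can be sketched as a small decision routine. This is only an illustration under stated assumptions: the names `Endpoint`, `Outcome`, `note_outbound_call`, `block`, `new_challenge`, and `screen` are hypothetical, the CAPTCHA is reduced to a two-digit challenge, and Caller ID is used as the lookup key even though it can be forged and is only weak evidence.

```python
import random
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    RING_THROUGH = "ring through"
    INBOX = "message to inbox"
    SUSPICIOUS = "message to suspicious box"
    REJECT = "reject"


@dataclass
class Endpoint:
    # Hypothetical in-memory lists; a real device would persist them.
    whitelist: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)

    def note_outbound_call(self, number: str) -> None:
        # Numbers the consumer has dialed are remembered as wanted callers.
        self.whitelist.add(number)
        self.blacklist.discard(number)

    def block(self, number: str) -> None:
        # One-press blacklist, e.g. triggered by "*#" or a long press.
        self.blacklist.add(number)
        self.whitelist.discard(number)

    def new_challenge(self, rng: random.Random) -> str:
        # Two-digit challenge, spoken as "dial 23 to leave a message".
        return f"{rng.randrange(100):02d}"

    def screen(self, caller_id: str, challenge: str, dialed: str) -> Outcome:
        if caller_id in self.blacklist:
            return Outcome.REJECT
        if caller_id in self.whitelist:
            return Outcome.RING_THROUGH
        # Unknown caller: a message is taken either way; the CAPTCHA result
        # only decides which box the message lands in.
        return Outcome.INBOX if dialed == challenge else Outcome.SUSPICIOUS
```

Because a failed challenge still results in a recorded message, a misclassified human caller ends up in the suspicious box rather than being lost, which matches the low-false-positive goal in the list above.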
Behind the scenes, this endpoint can:
- Send unwanted call data (recorded audio and/or acoustic fingerprints, caller ID) to complaint aggregator
- Use crowdsourced collaborative filtering data from complaint aggregator to improve classification
- Transfer live calls (classified as unwanted) via carrier for live call tracing or human investigation, while passing incident and classification information out-of-band to the complaint aggregator so it can be shared immediately with cooperating law enforcement systems.
The system point of view
Complaint Aggregators can:
- Collect high-quality unwanted call data, including:
- recorded audio and/or derived data such as acoustic fingerprints, speech to text, or vocoder-based representations
- evidence from CAPTCHA success/failure
- evidence from human consumer’s manual classification
- Offer to share valuable prizes when aggregated evidence contributes to prosecutions.
Carriers can:
- Provide support for transferred calls from consumer endpoints, for live call tracing, or for transfer to live law enforcement investigators so qualifiers and telemarketers can be identified.
- Implementation: Like current carrier switches that support call transfer via flash-dialcode-phonenumber, carriers could also support call transfers with an opaque incident number included in the dialing sequence. The opaque incident number could be passed to the destination as DNIS (dialed number) information, and this small datum could be a (aggregator#,incident#) key that would allow systems with access to aggregator data to immediately look up incident data (which was transmitted to aggregators out of band).
- Offer the consumer endpoint features as a hosted IVR service instead of customer premise equipment
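The opaque incident number described above could be packed into a short digit string suitable for a dialing sequence. The following is a minimal sketch, assuming fixed field widths of 3 digits for the aggregator and 9 digits for the incident; those widths and the function names are illustrative assumptions, not part of any carrier specification.

```python
AGGREGATOR_DIGITS = 3  # assumed width of the aggregator number
INCIDENT_DIGITS = 9    # assumed width of the incident number


def encode_incident_key(aggregator: int, incident: int) -> str:
    """Pack (aggregator#, incident#) into a fixed-width digit string."""
    if not 0 <= aggregator < 10 ** AGGREGATOR_DIGITS:
        raise ValueError("aggregator number out of range")
    if not 0 <= incident < 10 ** INCIDENT_DIGITS:
        raise ValueError("incident number out of range")
    return f"{aggregator:0{AGGREGATOR_DIGITS}d}{incident:0{INCIDENT_DIGITS}d}"


def decode_incident_key(digits: str) -> tuple[int, int]:
    """Recover (aggregator#, incident#) from digits received as DNIS."""
    expected = AGGREGATOR_DIGITS + INCIDENT_DIGITS
    if len(digits) != expected or not (digits.isascii() and digits.isdigit()):
        raise ValueError("malformed incident key")
    return int(digits[:AGGREGATOR_DIGITS]), int(digits[AGGREGATOR_DIGITS:])
```

A receiving system would decode the digits and use the resulting pair to look up the incident record that the endpoint already sent to the aggregator out of band.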
The law enforcement and prosecutor point of view
Complaint aggregators provide a high-quality stream of evidence:
- verifiable audio recordings
- automatic, prompt, high-volume clustering of identical robocall messages
Carrier forwarding of live calls includes:
- opportunity for more information from network tracing of live calls,
- opportunity for insertion of human investigators into robocall-initiated calls to collect information from qualifiers and telemarketers
- automatic clustering based on initial robocall message provides opportunity to prioritize high-volume known offenders for live call transfer to human investigators
Therefore law enforcement is more likely to identify the actual source of illegal calls, and prosecutors have a strong record of high volume incidents supporting a case for high fines.
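As a rough illustration of how clustering supports prioritization, the sketch below groups incident reports by fingerprint and orders the clusters by volume. The report fields `fingerprint` and `incident` are hypothetical, and exact-match grouping stands in for whatever fingerprint comparison a real aggregator would use.

```python
from collections import defaultdict


def rank_clusters(reports):
    """Group reports by fingerprint and order clusters by incident count."""
    clusters = defaultdict(list)
    for report in reports:
        # Identical fingerprints are treated as the same robocall message.
        clusters[report["fingerprint"]].append(report["incident"])
    # Largest clusters first, so high-volume offenders surface at the top.
    return sorted(clusters.items(), key=lambda kv: len(kv[1]), reverse=True)
```

The head of the returned list is the natural candidate for live call transfer to a human investigator, and each cluster's incident list is the starting point for a documented record of related incidents.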
Interfaces among providers of these components and services are important
Smart endpoints and complaint aggregator services are likely to be tightly integrated. Rapid, nimble feature development is important, so a single vendor supplying both would benefit from coordinating endpoint features with back-end database and computation features. But a competitive market with multiple endpoint/aggregator providers would be healthier than a single source. Each supplier could implement a closed proprietary system and innovate as rapidly as it wants.
Law enforcement systems and personnel would want a common interface to multiple complaint aggregators and multiple carriers. Some simple general interfaces for pulling evidence from aggregators and carriers in real time would limit the implementation effort on the law enforcement side without slowing down innovation on the data collection side.
Discussion of hostile counter-measures
Indeed, illegal callers are likely to adopt some counter-measures, some of which will be more effective than others. All will increase the expenses incurred by the callers.
- To evade CAPTCHA challenges, callers may implement voice recognition or insert humans. Both are expensive, and can be made more so by increasing the variety of challenges.
- To evade content matching, callers can introduce chaff into recorded content (noise, distortion, voice generator parameter changes, music, timing changes). Audio fingerprinting techniques are already robust against many of these variations. Since the real domain is speech, speech-to-text algorithms will tend to be insensitive to these recorded content changes as well.
- Attackers could try to overwhelm or subvert aggregator services or data structures. However, participation in the infrastructure would be limited to subscribing users, with enough resilience to restrict access to legitimate devices and ignore denial of service attacks.
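To make the content-matching point concrete, here is a minimal sketch of comparing two messages by their transcripts instead of their raw audio. It assumes the transcripts come from a speech-to-text step that is not shown, and it uses word-shingle Jaccard similarity as a stand-in technique chosen for illustration; the function names are hypothetical.

```python
import re


def shingles(transcript: str, k: int = 3) -> set:
    """Return the set of k-word sequences in a normalized transcript."""
    # Lowercasing and dropping punctuation removes superficial differences.
    words = re.findall(r"[a-z0-9']+", transcript.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two transcripts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

Noise, music, or timing changes in the audio do not alter this score as long as the transcript is unchanged, so a caller would have to vary the wording itself to escape the match.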
Evaluation
At a minimum, even as a standalone device, a smart endpoint offers as much as current user-configured call screening devices, with a simple comprehensible consumer feature set.
Aggregating evidence from many endpoints implementing manual classification and automatic CAPTCHA (in collaborative filtering and crowdsourcing fashion) makes the endpoints more powerful than any standalone device.
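One simple way to picture the aggregation is a per-message tally of evidence reported by many endpoints. The sketch below is an assumption-laden illustration: the event labels (`manual_block`, `manual_allow`, `captcha_fail`, `captcha_pass`) are hypothetical, and equal weighting of every event is a simplification.

```python
from collections import Counter


def unwanted_fraction(events):
    """Fraction of evidence events marking a message as unwanted.

    `events` is an iterable of labels reported for one fingerprint.
    Returns None when there is no recognized evidence at all.
    """
    counts = Counter(events)
    negative = counts["manual_block"] + counts["captcha_fail"]
    positive = counts["manual_allow"] + counts["captcha_pass"]
    total = negative + positive
    return negative / total if total else None
```

An endpoint that has never seen a given message could consult this shared figure before deciding between the inbox and the suspicious box, which is something a standalone device cannot do.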
Access to realtime streams of call information through aggregators allows law enforcement to move from correlating randomly sampled, incomplete, delayed complaint reports to acting on deliberately selected, fully documented, immediate events. Then, for prosecution, automation support for building large lists of related incidents is useful for maximizing fines.
Endpoint implementation can be in a hardware device near the consumer, or can be a hosted service located at a carrier or IVR vendor, or can be embedded in a mobile phone application. All of those implementations benefit from sharing data with aggregators.
For the future, an architecture including both distributed smart endpoints and centralized database and compute support is a natural solution to implementing new technologies, such as authenticated caller ID in a VoIP-based consumer world. Successful endpoint implementors and complaint aggregators in the current generation will be well positioned to deliver implementations and services for the next generation.