Opening the camera

by Peter Rojas

Cameras in phones were treated mainly as an afterthought or weird novelty when they first started showing up in handsets in the early 2000s. One response to a BBC article back in 2001 about cameras being added to phones in Japan was, "Infinite uses for the teenager, not entirely sure what the rest of us would do with one though." Sixteen years later it is virtually impossible to find a mobile phone without one. The camera is now a critical -- possibly even defining -- feature of the smartphone, and yet there is still a remarkable amount of friction to doing more with the default camera on your phone than capturing a photo or a video.

That will need to change if the camera is going to be the starting point for so much of what we do with our phones and become a sort of visual browser that intermediates and augments our experience of the world. One way for that to happen would be for Apple and Google to allow developers to add lightweight extensions to the OS's default camera app, like lenses, filters, AR objects, or specialized image recognition capabilities that wouldn't necessarily justify building a full-blown app. Yes, iOS and Android developers have been able to take advantage of the phone's camera forever, but they've never been able to hook into or add onto the phone’s camera itself. 

Making the camera the place where these experiences live would help overcome context switching, which can be a remarkably high hurdle when it comes to getting users to actually engage with an app -- to say nothing of getting them to install one in the first place. That's going to be especially important with the addition of ARKit to iOS and ARCore to Android this year. It should be as easy as possible for users to point their camera at the world and then augment, enhance, or recognize it. We should be able to have the camera open and then decide what we want to do with it, not the other way around.
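To make the idea a little more concrete, here's a purely hypothetical sketch (in Python for brevity; a real version would live in the OS's native SDK) of what a camera-extension pipeline could look like. Every name here -- `Frame`, `CameraExtension`, `CameraPipeline`, `FlowerIdentifier` -- is invented for illustration; nothing like this API exists in iOS or Android today:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Frame:
    """A single camera frame plus any annotations extensions attach to it."""
    pixels: bytes
    width: int
    height: int
    labels: List[str] = field(default_factory=list)

class CameraExtension:
    """Base class for a lightweight hook into the default camera app."""
    name = "base"

    def process(self, frame: Frame) -> Frame:
        return frame  # default behavior: pass the frame through untouched

class FlowerIdentifier(CameraExtension):
    """Stand-in for a specialized image-recognition extension."""
    name = "flower-id"

    def process(self, frame: Frame) -> Frame:
        # A real extension would run a vision model here; this toy just tags the frame.
        frame.labels.append("flower: daisy")
        return frame

class CameraPipeline:
    """The OS-owned camera would run each registered extension on every frame."""
    def __init__(self):
        self.extensions: List[CameraExtension] = []

    def register(self, ext: CameraExtension) -> None:
        self.extensions.append(ext)

    def capture(self, frame: Frame) -> Frame:
        for ext in self.extensions:
            frame = ext.process(frame)
        return frame

pipeline = CameraPipeline()
pipeline.register(FlowerIdentifier())
shot = pipeline.capture(Frame(pixels=b"", width=1920, height=1080))
print(shot.labels)
```

The point of the sketch is the shape, not the details: developers ship small `process`-style hooks instead of full apps, and the default camera stays the single surface the user points at the world.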

There is already a precedent of sorts for this. Apple’s iMessage now has an App Store that goes beyond stickers to offer all sorts of contextual applications for enhancing chats, like Square for sending cash, or games like Crosswords with Friends. The latest figures I was able to find were from March of this year, when the iMessage App Store was estimated to have over 5,000 apps. That number has surely grown. And while Apple could do a lot to improve the experience here, being able to pull apps directly into the messaging experience makes a lot more sense than forcing users to jump out of what they're doing and into something else. 

We might see something along the lines of what I'm thinking come from Google first. Earlier this month they announced AR Stickers (basically 3D objects which can be inserted into scenes) for the camera of their line of flagship Pixel phones. They also unveiled Lens, which adds a number of computer vision-driven experiences to the Pixel's camera, like translating text written in other languages, identifying flowers, or pulling up ratings and reviews of a business simply by taking a photo of a storefront. Right now these are first-party, not third-party apps, but it’s not difficult to envision developers being able to augment the Lens experience in a variety of ways. Lens will be rolling out to Pixel phones soon, but it's expected to trickle down to other Android phones sooner or later. 

Apps which offer their own camera experiences won't go away, but opening up the camera as a sort of platform-within-a-platform would bring greater flexibility and intelligence to our devices than we have today. It would allow developers to add specialized image recognition capabilities, giving us phones which are better at knowing what they're looking at. It would also reduce the friction involved with using lenses, 3D objects, and other contextually-driven and location-based AR content and experiences, which wouldn't have to be siloed into their own apps (or only usable within Snapchat, Facebook, etc.). But perhaps most importantly, it would take us a step towards understanding and defining the future we are headed towards, where screens themselves become secondary to the augmented view of the world we will have through smartglasses. Even if that future is still a few (or more) years away, we are approaching a time when a handheld screen with a field of icons will no longer make sense, because we will expect interfaces and experiences which surface intelligently within our field of view. Building great UX and UI for the post-smartphone world will take time, but opening up the camera would help us begin to figure this out.

That time John Borthwick and I were booed on stage

by Peter Rojas

The first time I ever met John Borthwick was at an event where we were about to speak together on stage. About an hour later we were nearly booed off that stage.

It was November of 2003 and back then he was working at Time Warner, which had recently stopped calling itself "AOL Time Warner." I was the editor of a new-ish gadget blog called Gizmodo. I'd been asked to speak at Time Warner's annual executive retreat about "convergence," which at the time was a very hot buzzword that had something to do with the shift from analog to digital media. Given my area of expertise they wanted someone to come in and discuss how all these crazy new devices like iPods, smartphones (the Treo 600 had just hit), and connected media players were going to change the media landscape. 

I wasn't their first choice for this. Maybe not even their second or third. Whoever had been slated to give this talk had canceled at the last minute, and they were looking for a replacement. Supposedly, the speaker had been Steve Jobs. This was probably right at the tail end of that window when anyone could get him to speak at an event like this (remember he was trying to get big media companies to support the iTunes Store, which had launched earlier that year). Regardless of who it was, Upendra Shardanand, who was at Time Warner then, and I were acquainted, and he thought I'd make a good backup speaker and so reached out and asked me if I'd do it. There wasn't much time to get something together, but after a bit of back and forth with the team there we decided that the format would be me on stage with a high-level exec interviewing me as I went through demos of various new technologies, including how to watch video on a smartphone, record TV on a Media Center PC, stream video from a PC to your TV over WiFi, etc. It's all stuff that we don't even think about now because it's so commonplace, but at the time was about as niche as you could get. The exec who'd be on stage chatting with me? That was John Borthwick.

I spent the next few days gathering up all the gear I needed for the demos and prepping so I could run through everything without a hitch. I remember being a little stressed out trying to find an HP Media Center PC in stock at my local Best Buy, but eventually I was able to buy one. I spent the evening before prepping all of my gear and getting the demos ready. The next morning I headed over to the venue and got ready for my big entrance, which the organizers decided would consist of me riding in on a Segway and then dropping it off and bounding onto the stage to tell people about the future!

Up on stage John guided me through the different parts of the presentation, asking me questions and moving us along from demo to demo. Everything was going smoothly until I got to the part where I wanted to show how easy it was to download video off the internet and stream it from your PC's hard drive to your TV via a WiFi-enabled, DivX-capable DVD player that Gateway used to sell. I'd torrented an episode of Curb Your Enthusiasm for the demo, forgetting that the show aired on HBO, which was owned by Time Warner. We were about three seconds in before the HBO and Time Warner execs started booing and hissing. Keep in mind that this was just a couple of years after Napster had upended the music industry, and presumably the assembled crowd of high-powered media execs wanted some insight into what was coming next for video on the internet and how they might avoid that same fate, not a primer on how to pirate their IP.

Flummoxed, I blurted out the first thing that came to mind, which was, "Look, I don't even have cable, I don't really need Time Warner at all to watch this show," which as you can imagine, did not do anything to win the audience over to my side. John managed to calm things down a bit and move us along to the last bit of the presentation before I sheepishly exited the stage. I hadn't had a ton of public speaking experience at that point, and none of it involved being booed, so it was a little rough. I'm not sure I slept much that night.

A few days later I received a very nice note in the mail from Dick Parsons, who was CEO of Time Warner at the time, thanking me for my talk and saying that even though it wasn't pleasant, it was important for them to hear what I'd had to say.   

I didn't speak to John afterward -- I was way too embarrassed -- but somehow everything worked out. Two years later, in 2005, AOL (which was by then a subsidiary of Time Warner) bought the company I'd started a few months after all this happened. By 2008 I'd reconnected with John courtesy of mutual friend Om Malik, and a year later his startup studio, betaworks, invested in another startup of mine (AOL also ended up buying that company, oddly enough). Two years ago, as I was preparing to leave AOL for the second time, John asked me to join betaworks and help raise a new seed fund. Now, fourteen years after being booed on stage together, John and I (along with Matt Hartman) are partners in the new betaworks ventures seed fund.

Maybe We're Wrong About What Makes for Great VR

by Peter Rojas

Have we been thinking about what makes for great VR in the wrong way? There's a lot of anxiety in the industry lately about sluggish consumer adoption rates, which have been slower than hoped after last year's launch of the Oculus Rift, HTC Vive, and Sony's PlayStation VR. This anxiety is only compounded when comparing time spent using VR headsets against that of smartphones (which all of us use off and on all day now), television (which still consumes 5 hours of the average American's day), and console gaming.
It's still early, so it's understandable that adoption is happening slowly, but what if total time spent is the wrong benchmark for determining whether VR is valuable to us? This question had been gnawing away at me for the past year or so, ever since I picked up HTC's Vive headset, but it was only after playing the new co-operative quest games recently added to the popular social VR game Rec Room that I was finally able to put my finger on it. [I should disclose that betaworks ventures is an investor in Against Gravity, the company behind Rec Room, but to be clear, the reason we invested in Rec Room is because it's awesome; hopefully I still have some cred in these matters.] Rec Room currently has two quests: The Quest for the Golden Trophy, a dungeon crawler where you use virtual swords, bows, and crossbows to battle a variety of enemies across a number of different levels; and The Rise of Jumbotron, an Eighties-themed laser tag-style campaign that involves defeating hordes of robots.
Neither quest breaks new ground thematically, but that's not the point. Playing in room-scale VR, where you're physically moving around, brings an entirely new level of immersiveness to the experience, and Against Gravity got a bunch of little details right about the collaborative gameplay. Both quests are challenging -- you almost certainly need three other players along for the ride to finish them -- but not so hard that you can't get through with some persistence; the first time I played The Quest for the Golden Trophy, it took me and a crew of three other players I'd never met before about 45 minutes to get through it.
Forty-five minutes of gameplay would be short for a AAA title like GTA V or Skyrim (I'm about 55 hours into Skyrim), but what struck me was how satisfying it felt after I was done. For those forty-five minutes I was entirely immersed in the experience, working hard not to let down the three strangers I'd been randomly paired up with, and absolutely ecstatic when, after several tries, we were able to complete it. But what I hadn't expected was that at the end I was perfectly content to take my headset off and go back to the real world. I didn't want to keep playing or to spend a couple more hours inside of my headset. Playing The Quest was like eating a satisfying meal, one where I didn't leave feeling either hungry or overstuffed.

It was one of the best experiences I've had to date in VR and one that's stuck with me, and it has me wondering if a lot of us have been focusing on the wrong things when it comes to creating great VR experiences. It's natural to want to contextualize VR by comparing it to smartphones, television, and game consoles, the other tech platforms it most closely resembles, but that would be a mistake. Mobile is something which we thread throughout every waking moment of our day, but in sort of a light way. We check our phones a hundred or more times a day, but usually for no more than a few minutes at a time. It's not hard to watch TV or play console (and PC) games for hours on end.
VR is different. The best VR experiences are heavily concentrated -- they draw you completely in and demand more of you sensorially, emotionally, and (if you have a room-scale headset like the Rift or Vive) physically. Judging VR on the same terms as either mobile (as something that we use throughout the day or even every day) or console and PC gaming (as something we can do for hours at a time) doesn't make sense. Maybe we'll only spend forty-five minutes a week using VR, but what we'll get during that time is something so immersive and engrossing that it might be the most engaging, fun, or productive forty-five minutes of our week.
I don't quite know what this means in terms of where VR goes. In some sense spending time in VR starts to look more like a luxury good than anything else, something that you don't consume constantly, but when you do you savor it and come away immensely satisfied. If that's the case, the benchmark for a good VR experience should be something so good that you'll be happy to take off your headset when you're done. This has all kinds of implications for what kinds of VR products should be created and which businesses should be built -- I won't pretend to know the answers here -- but I'm convinced that VR startups which obsess over offering people the best forty-five minutes of their week will have the best chance of figuring it out. 

Voice-computing Needs Killer Apps

by Peter Rojas

Voice interfaces were all over CES this year. Gadget makers added support for voice-based computing to just about everything they could, ranging from stuff you'd expect, like TVs, speakers, and smartphones, to things which are a little more out there, such as refrigerators, wireless routers, and even air purifiers.

The point is that voice is shaping up to be one of the next big computing interfaces. That's good news for Amazon, which as Ben Thompson lays out in an excellent piece of analysis from earlier this month, has a golden opportunity with its Alexa voice computing platform to build a successful operating system. They failed to accomplish this when they forked Android for their line of Fire tablets and phones (well, just one phone). Ben does a better job than I could of explaining the business models around operating systems and how and why they can be so fantastically profitable, so please go read that piece. Suffice it to say, Alexa gives Amazon a shot at owning a primary computing interface, not unlike how Apple owns iOS or Microsoft owns Windows.

Alexa has proven to be remarkably popular with consumers. While Amazon has not released sales figures, its line of Alexa-powered Echo speakers is estimated to be present in around seven million US households, and the Echo appears to have been a hit this past holiday shopping season, with Amazon's companion app for it being the 5th most popular free iOS app the day after Christmas. (It probably didn't hurt that Amazon offered the smaller Echo Dot speaker for just $39, making it a nice, but not-too-expensive gift option.) Meanwhile, partners including Ford, LG, Lenovo, Whirlpool, Huawei, and DISH have been integrating Alexa into their products. With over 7,000 third-party "Alexa skills" and counting, Amazon has a lead for now over its competitors, which include Google (which is a partner in betaworks' Voicecamp accelerator), Apple, Microsoft, and to a lesser extent, Samsung, which recently acquired Viv. However, for Amazon to maintain its lead -- or for Google or anyone else wanting to build a voice-based computing platform, for that matter -- it's going to need to do what practically every successful operating system has done: be a great platform for developers. That means enabling others to build massively successful apps on top of it, which in turn develop into massively successful businesses.

The phrase "killer app" gets tossed around casually these days, but Amazon doesn't necessarily need something that'll drive sales of the Echo all on its own, like how VisiCalc, a spreadsheet app, did for the Apple II back in the day (though of course it wouldn't hurt). You can see how the iPhone didn't truly take off until Apple opened iOS up to developers, and even then it took a little while until someone built Instagram, which turned out to be one of the first massively successful apps. Instagram made iOS more valuable and useful to both Apple and its users, and other developers, seeing that success, were driven to create even more apps for the platform, making iOS even more valuable for Apple and its users.

Alexa has over 7,000 skills, and while there are some great ones out there, so far a killer third-party app for it hasn't emerged. What's tricky is that it's very difficult for any platform maker to engender this intentionally, since you can't just magically make it happen with aggressive dev evangelism. Part of the challenge is that with any new computing interface it takes time to figure out how to create apps and services that feel "native" to the platform. Platform makers also need to offer a good SDK and the right APIs; often devs are too limited in what they can build in the early days of a new OS. Then there are issues around monetization (whether it's ads, in-app purchases, commerce, etc.) and discovery (i.e. how users are going to find and share their apps); according to a report by VoiceLabs, 69% of Alexa skills have one or zero consumers, and retention numbers are poor. People will build plenty of stuff for fun when a platform is new and growing fast, but at some point they want to build businesses around those products. If app makers can't acquire users or make money, they're not going to invest their time into building for a new platform. Not surprisingly, there are still lots of open questions around how to accomplish either of those on voice platforms.
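For readers who haven't built one, it helps to see what a voice "skill" boils down to at its core: the platform turns speech into a recognized intent plus some slot values, and the developer maps each intent to a handler. This toy dispatcher (plain Python, not any real platform SDK -- every class and intent name here is invented for illustration) sketches that shape:

```python
from typing import Callable, Dict

class Skill:
    """Toy voice skill: routes a recognized intent to a handler function."""

    def __init__(self, name: str):
        self.name = name
        self.handlers: Dict[str, Callable[[dict], str]] = {}

    def intent(self, intent_name: str):
        """Decorator that registers a handler for one intent."""
        def wrap(fn: Callable[[dict], str]) -> Callable[[dict], str]:
            self.handlers[intent_name] = fn
            return fn
        return wrap

    def handle(self, intent_name: str, slots: dict) -> str:
        """Dispatch an intent (with its slot values) to the registered handler."""
        handler = self.handlers.get(intent_name)
        if handler is None:
            return "Sorry, I can't do that yet."  # graceful fallback for unhandled intents
        return handler(slots)

weather = Skill("weather")

@weather.intent("GetForecast")
def forecast(slots: dict) -> str:
    # A real skill would call a weather API; this just formats the slot value.
    return f"Tomorrow in {slots['city']} looks sunny."

print(weather.handle("GetForecast", {"city": "Brooklyn"}))
```

Everything hard about the platform question lives outside this sketch -- speech recognition, slot extraction, discovery, and monetization all belong to the platform maker, which is exactly why the quality of the SDK and APIs matters so much to developers.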

If you get this stuff right, then with enough experimentation (and time) others will eventually be able to build compelling, innovative products which couldn't exist anywhere else. The success of these kinds of products is almost impossible to predict beforehand, yet their emergence feels entirely obvious in retrospect, which is a major reason why we're doing our Voicecamp accelerator. We want to take a bunch of teams building promising new products for voice and create the right circumstances for them to explore what works and what doesn't when it comes to voice. It's possible that at the end of Voicecamp we’ll have a group of noble, but failed, experiments in trying to figure this out. However, we also strongly believe that taking that risk is worth it if there's a shot that one of them will build the first great voice app.