Why Computer Vision Has Become a Major Investment Theme for Me

Mark Suster
Both Sides of the Table
8 min read · Jun 16, 2016


Computer Vision Startup Nanit

If you follow me on Snapchat (msuster) you might already know that I’ve been looking at and investing in a number of companies in the computer vision space. My thesis is that it will become a major I/O metaphor for computing, a field sometimes referred to as HCI (human-computer interaction).

Today I am so excited to announce our latest investment in the category, Nanit, a smart baby monitor. The objective behind Nanit is to help parents “sleep more and monitor less.” By using computer vision, Nanit helps parents understand how well a child is sleeping and, if she’s having difficulties, what the causes may be (sound, ambient light, temperature or even, gasp, too much parental interference).

It is simply a stunning use of computer vision (features here) to help parents understand their babies and young children in ways that were never possible before. We believe that, with the corpus of data available, it can also help parents understand child development relative to peer groups and even assist doctors in medical diagnosis in the future. More about Nanit later, but first let me talk more broadly about our thesis.

Put simply: computers handle computation, storage and databases store and retrieve results, and networks move this information between locations. But in order for humans to “interact” with this digital world we need an entry point to “input” (I) information and a mechanism to interpret this information as “output” (O).

The major source of input throughout the past 50+ years has been the keyboard, joined over the past 25+ years by a friend called the mouse. The major source of output, of course, has been the computer monitor, which fortunately takes up a lot less desk real estate than it did when I started my first job in 1991.

In that same year I read my first technical article about how voice would profoundly change the I/O metaphor of computing, arguing that typing into a computer was both slow and unintuitive, and this many years before carpal tunnel syndrome became widely known. In the late 90s I began experimenting with voice as an input with a program by Dragon Systems called “NaturallySpeaking,” but I was never quite proficient enough to make it work for me.

Of course, in recent years voice as I/O has become mainstream. I already use Siri daily to send text messages while driving or to search for directions, and many friends are now raving about the Amazon Echo. (I think Father’s Day is coming up soon, Tania? Just sayin’ ;) )

Voice is interesting. But I think computer vision will be much more profound as our long-term human-computer interface, and I think it will dwarf “wearables” in the field people annoyingly call IoT (the Internet of Things). Put simply: intelligent cameras tied to computers will be able to interpret what is happening in the physical world with far more insight than a human using his or her eyes, and computers using projectors will be able to communicate with us far more effectively than we can by staring into a computer monitor all day.

To some extent we’ve already started to accept this as a society because the conversation about “autonomous vehicles” has moved mainstream. Most intelligent people accept that computers attached to your car can predict movements that would interfere with your driving (say, a bicycle about to zoom in front of you from the right-hand side of the vehicle) and respond more quickly and with fewer errors than a human.

Anybody who knows the field knows that computer vision is already being used for facial recognition in counter-terrorism, crowd control, combating hooliganism and so forth. But computer vision will also enhance the daily experiences of our lives and provide us with significantly more data about the world than we have ever had. (I’ll have to wait until July to tell you more about this, when we announce our next computer vision investment.)

One of the first investments I made in computer vision was Osmo, a product designed to get children off of the couch, to stop them staring into an iPad and instead get them using physical objects (and other people!) in a way that interacts with the iPad and thus fulfills their digital curiosity and ambitions. Osmo produces games, drawing tools and educational toys (most recently a game called “Coding” that uses physical tiles to teach logic to children as young as four). When you watch children interact with physical objects that control the computing environment, and see how it stimulates imagination and engagement, the field of computer vision becomes so obvious that you begin searching for many more opportunities in it.

One of the most obvious to me was Nanit, which was brought to our firm by my colleague Jordan Hudson, who met the team, fell in love with their founders & technology, and encouraged us to invest. He and I attend board meetings together at this NYC-based (and Israeli) company. It also helped that Jordan was about to have a baby of his own, Slim Shaney, who is now more than a year old (as is our investment).

The founder & CEO, Assaf Glazer, holds a PhD in computer vision from the Technion in Israel and furthered his work with a postdoctorate at Cornell Tech in New York, where he hatched Nanit. His co-founder and COO, Andy Berman, made the smart move from being a young VC to becoming an entrepreneur. The CTO, Tor Ivry, heads engineering out of Israel.

And for anybody who has seen me publicly encouraging entrepreneurs to hire senior marketing staff early: among the gold standards is Lisa Kennedy, who joined Nanit as CRO having previously been a senior executive at Diapers.com.

When I first saw the product and understood its goal of helping parents make better decisions, it was obvious to me that Nanit isn’t really a baby “monitor.” For example, my first son has a really skinny body type and as a baby struggled to put on weight. I remember how much emotional pressure this put on my wife, who wondered whether she was doing something wrong or whether it was affecting his health. I distinctly remember her waiting with bated breath to visit the pediatrician and find out how his weight was developing relative to his peer group and whether everything was ok.

With Nanit, of course, parents are able not only to know the baby’s weight and height (using computer vision to measure his or her proportions) but also to understand where these measurements fit within a cohort of similarly aged children.
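To make the cohort comparison concrete, here is a minimal sketch of the arithmetic involved. This is my own illustration, not Nanit’s code: the cohort numbers and the `weight_percentile` helper are hypothetical, and real growth references (like the WHO charts pediatricians use) are built on far more sophisticated models.

```python
from statistics import NormalDist

# Hypothetical cohort statistics: (mean, std dev) of weight in kg by age
# in months. Illustrative numbers only -- not real growth-chart data.
COHORT_WEIGHT = {6: (7.9, 0.9), 9: (8.9, 1.0), 12: (9.6, 1.1)}

def weight_percentile(age_months: int, weight_kg: float) -> float:
    """Place a baby's measured weight within its age cohort (0-100)."""
    mean, std = COHORT_WEIGHT[age_months]
    return NormalDist(mu=mean, sigma=std).cdf(weight_kg) * 100

# A 9-month-old weighing 8.1 kg:
print(f"Weight percentile: {weight_percentile(9, 8.1):.0f}")
```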

We all know that baby monitors are used by parents to know when their child is crying. But what if a camera could record how many times your baby wakes up in the night, how long it takes her to fall back to sleep and how this compares to other babies? What if you knew that your baby could self-soothe within 4 minutes and that this was, gasp, normal!?!

How many times is your child waking up? Could the amount of time that you spend attending to her needs in the middle of the night actually contribute to how long it’s taking her to fall back to sleep? These things are knowable using computer vision. And of course Nanit is securely encrypted and HIPAA compliant.
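Once a vision system can timestamp when a baby wakes and when she settles again, these metrics reduce to simple arithmetic. Here is a back-of-the-envelope sketch with made-up event data (again my own illustration, not Nanit’s implementation):

```python
from datetime import datetime, timedelta

# Hypothetical night of events a camera might log:
# (woke_up_at, fell_back_asleep_at) pairs detected from crib motion.
wakings = [
    (datetime(2016, 6, 15, 23, 40), datetime(2016, 6, 15, 23, 44)),
    (datetime(2016, 6, 16, 2, 15), datetime(2016, 6, 16, 2, 21)),
    (datetime(2016, 6, 16, 5, 5), datetime(2016, 6, 16, 5, 9)),
]

resettle_times = [asleep - awake for awake, asleep in wakings]
avg_resettle = sum(resettle_times, timedelta()) / len(resettle_times)

print(f"Wake-ups last night: {len(wakings)}")
print(f"Average time to fall back asleep: {avg_resettle}")
```

Comparing those numbers against other babies is then the same percentile exercise sketched above.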

Now. Here’s just a little bit more magic, one that took much longer to persuade me of but that I’m now absolutely convinced is a game changer. When Assaf first showed us Nanit he told us that parents would want to capture private videos of their child in the crib and be able to share them with friends or grandparents.

NFW, I thought. No. Forking. Way.

Boy was I wrong. Nanit has been in field tests with dozens of families in NYC over the past year, and once you see real data it’s astounding. As a parent you spend your waking hours trying to capture a little bit of the magic of your baby’s life: a first step, a first spoonful of food, a babble, a word … a smile. But of course you can’t go around with a smartphone strapped to your head, ready to capture every moment.

It turns out that there are constant magic moments in your child’s development that happen at night. He wakes up and starts talking with his teddy bear. He learns that his right hand is connected to the same body and mind as his left hand, and that if he uses them as teammates he can pick up objects. And there was the time Connor started learning to get a leg over the side of the crib (which weeks later led to a broken leg … true story).

Using computer vision, Nanit monitors when the baby is asleep and when he’s awake, and during the awake hours it creates private video moments on your phone that you can scroll through, smile at, enjoy, save and share.
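For the technically curious, the basic idea behind awake/asleep detection can be sketched with classic frame differencing. This is a generic computer vision technique, not a claim about how Nanit actually works; `is_awake` and its threshold are hypothetical.

```python
import numpy as np

def is_awake(prev_frame: np.ndarray, frame: np.ndarray,
             motion_threshold: float = 12.0) -> bool:
    """Crude awake/asleep test: mean absolute pixel change between two
    consecutive grayscale frames. A sleeping baby produces little motion;
    a baby chatting with his teddy bear produces a lot."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return float(diff.mean()) > motion_threshold

# A real system would smooth this signal over many frames and start
# saving a clip only after sustained motion, stopping once the baby
# settles again.
```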

As a father this is also pure magic. I remember business trips when my boys were young and how much I missed them. For non-parents I would describe this as the same gut-wrenching feeling as teenage love and missing your boyfriend or girlfriend. It’s an unknowable human feeling unless you’ve been there. I remember being in China for 2 weeks and having my wife put an iPad at the kids’ breakfast table so I could “FaceTime be there” in the morning and tell jokes and laugh. “Yes! Daddy really did see a guy eat a scorpion on a stick! Yuuuuck! I know! No, you’re a poopy diaper head!”

Now imagine being able to see the highlight reel of the best moments from your daughter’s sleep last night. Her first words and babbling, and trying to get the ducky to talk to the ladybug. I wouldn’t have believed it if I hadn’t seen video footage and known just how active babies and toddlers can be in their moments between sleep. And I watched how much parents appreciated this video capture, knowing how hard it is in the bleary-eyed daytime hours to capture great videos between changing poopy diapers, breastfeeding, calming down the screeching, etc.

And.

We believe that Nanit can eventually help predict patterns in childhood development and flag when things may be getting off course. The team has no ambition of playing doctor, but if data could assist a pediatrician in understanding a child’s development, sleeping patterns, weight changes, mood shifts or even an inability to communicate emotionally … imagine what a game changer that could be.

Computer vision …

As input it can capture the world around us and store data that helps us better manage our well-being, our health and our precious moments. It can guide us as parents to sleep more and monitor less (she’s going to be alright!) and can get our children off the couch and using physical toys again.

As output it can play back moments captured from life that can serve as an early warning of danger (Connor!) or as a memory capture that enhances our own sense of connectedness in a world whose pressures often separate us more than we may like.

I hope you’ll check out Nanit — it just went on pre-order and will be shipping very soon.

Their whole team has been such a pleasure to work with, and their innovations and future plans are so inspiring (and focused on societal improvements in health and well-being) that I can’t wait to see what they produce in the years ahead.

--


2x entrepreneur. Sold both companies (last to salesforce.com). Turned VC looking to invest in passionate entrepreneurs — I’m on Twitter at @msuster