I’ve loved cars since I was a little boy. From classic cars to custom hot rods, I loved them all, but I was especially fascinated by the futuristic vehicles featured on TV. Depending on which generation you identify with, you might remember KITT from Knight Rider, the Batmobile, or the nameless DeLorean from Back to the Future. Not only were these cars fast, they could think, talk and sometimes even see.

AI has given us the first generation of autonomous cars, and it’s pretty impressive. But a host of next-generation, AI-enhanced features goes even further in providing convenience and ensuring passenger safety.

Auto-evolution: AI at the edge for cars

Xnor is focused on bringing computer vision to edge devices, so our technology is particularly valuable for automobiles and commercial vehicles. Every AI capability we offer – whether it involves person, object or face recognition – delivers a degree of speed and accuracy that, until recently, was only possible using a high-end processor augmented by a neural accelerator. We take that same level of performance, improve upon it, and make it available on an edge device, such as a 1 GHz ARM processor or a simple onboard computer.

Object detection capabilities

Crime prevention

For car sharing companies or taxis, the system can enforce security regulations by recognizing when passengers hold weapons or other objects that present a safety hazard.

Loss prevention

Using object detection, the system can remind a passenger to retrieve the phone or purse they left on the seat. Transportation and logistics companies could receive an alert if a package was not delivered at the end of a route.

Face recognition capabilities

Here are a few of the capabilities that can be incorporated into a line of vehicles using Xnor’s face recognition or action detection models.

Secure access

Using face recognition, a driver can be authenticated even before they enter a vehicle. The door could automatically open for people recognized by the car, making hands-free entry possible. Our technology would even allow the car to differentiate between children and adults. Commercial vehicles could use that information to control access to certain areas by authorizing drivers.

Because all of this is done on-device, the data doesn’t need to be transmitted to the cloud, making the feature significantly more secure and practical.

Personalization

Once a driver or passenger is authenticated, the car could adjust settings to align with personal preferences, such as the position of the seat and steering column, interior temperature and infotainment system settings.

Driver awareness

ML-powered driver monitoring can tell when a driver is looking at a phone, instead of the road ahead. And if the driver becomes drowsy and their eyelids start to close, the system will know that too.

Emergency response

In the event of a crash or another emergency, the system can generate a passenger list, and notify someone if the driver does not respond to an audible alarm.

Passenger safety

Action detection models can be trained to detect specific gestures like fastening a seatbelt to ensure that everyone is buckled in.

Person and pet detection models can identify if a pet is left inside a car (a potentially dangerous situation on a hot day) or if an infant or small child is left behind, and then sound an alarm to notify the driver.

AI at the edge drives automotive innovation

Without recent advances in deep learning for computer vision, many of these features would be too difficult or expensive to implement.

Xnor’s AI technology is unique in that it delivers state-of-the-art performance on a commodity processor, using only the bare minimum for energy and memory.

Even with a simple onboard computer, Xnor models execute up to 10x faster than conventional solutions – while using up to 15x less memory and 30x less energy.

Taken together, all these capabilities make it both practical and profitable for automobile manufacturers to incorporate high-performance computer vision into a variety of applications for the commercial and consumer vehicle markets.

At Xnor, we’re fascinated by the creative and powerful ways our customers are working to incorporate machine learning into their line of cars and commercial vehicles. It’s not as cool as owning one of the super-smart, fast-talking exotic cars that my TV heroes used to drive, but it comes pretty close.

Read more about how you can incorporate the latest in computer vision into your line of vehicles.

Search for the term “the future of retailing” and you’ll see plenty of stories about physical retailers being marginalized by their dot-com counterparts. Some would say that physical stores are fading from the retail landscape. Quaint, but doomed. To understand why, consider the shopping experiences offered by each channel.

Online vs. Offline

For example, while checking the number of followers in their Instagram account, your future customer sees an image of their favorite celeb wearing shoes that they simply must have. Other distractions intervene, but after seeing several banner ads they finally click, swipe or tap their way to an online store. Thanks to cookies and ad tracking, the site already knows a great deal about the customer, from their purchase history down to their shoe size. The customer browses for products, reads reviews and compares items. With each click, the store knows a little bit more.

As the customer moves through the site, the convenience, selection and price advantages of shopping online become obvious. When they make a purchase, the customer can be rewarded for their loyalty with a coupon code, and the inventory system knows which item to reorder.

On the other hand, a retail store doesn’t know who you are the moment you walk in the door. They don’t know if you’ve bought from them – or from any of their competitors – before. They have no idea what color you like, or what shoe size you wear. Traditional retailers rely heavily on in-store displays or staff to guide customers through the store.

Now replay that scenario – but with one difference. This time it’s a physical store equipped with the latest generation in AI. Small cameras placed throughout the store use computer vision to provide an advanced level of retail analytics, possibly even better than what is available to online stores, while also creating a better experience for shoppers.

The Customer Journey in an AI-enabled Store

In this new scenario, a face recognition algorithm identifies customers and their demographics as they walk through the front door. Maybe this individual is a regular shopper and a member of your loyalty program. Based on their purchase history, you can send them a notification while they are in your store about new offerings that may be enticing to them.

As they move through the aisles, multiple cameras recognize that customer as the same person and track them throughout the store. Do the endcap displays attract their attention? Where do they stop and spend time? Does the location of a preferred product impact what else they buy nearby? Once your customers are at the check-out counter, payment can be as simple as a quick scan of their face.

On a larger scale, this data can be used to develop in-depth, real-time heatmaps without having to lift a finger. The information can also be bolstered with other AI capabilities such as emotion detection and action recognition in order to build highly detailed customer insights. Your customers and their paths through the store are now actionable data for your business, opening up a vast number of opportunities.
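As a rough illustration of how tracked positions could be turned into a heatmap, the sketch below bins floor coordinates into grid cells and counts the samples in each cell. The function name, the cell size and the sample track are hypothetical and chosen only for illustration; this is not a description of Xnor’s analytics pipeline.

```python
from collections import Counter
from typing import Iterable, Tuple


def build_heatmap(positions: Iterable[Tuple[float, float]], cell_size: float) -> Counter:
    """Count how many position samples fall into each (col, row) grid cell."""
    heat = Counter()
    for x, y in positions:
        # Integer cell index for this sample (floor division by the cell size).
        heat[(int(x // cell_size), int(y // cell_size))] += 1
    return heat


# Hypothetical track of one shopper, as (x, y) floor coordinates in metres.
track = [(0.5, 0.5), (0.8, 0.9), (1.2, 0.4), (3.1, 2.2), (3.4, 2.7), (3.9, 2.1)]
heat = build_heatmap(track, cell_size=1.0)
busiest = heat.most_common(1)[0]  # the cell with the most samples, and its count
print(busiest)
```

Cells with high counts are where shoppers linger, which is the kind of signal the questions above about endcaps and dwell time are asking for.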

Security and Store Operations

The analytics you collect on the floor will impact your customers and their experiences, but there’s a slew of potential opportunities behind the scenes that can streamline operations for your business.

Surveillance and access control are important in-store functions for avoiding crime and unauthorized activity. Using Xnor’s AI capabilities, security can be enhanced with features like weapon or dangerous action detection. Secure areas can be better controlled with computer vision solutions like face recognition and person detection to make sure only the right people have access to restricted areas.

Another particularly valuable function is inventory management. Knowing when items are out of stock on the shelves helps to restock more efficiently. Creating efficient, real-time solutions for monitoring items also helps to keep vendors up-to-date on their products within your store as well as how they are performing. This can also be tied to traffic patterns so you can understand how often people are interacting with different products.

Gaining a competitive advantage

Many see the future of retail as being fully automated, but that shift won’t happen overnight. Retailers are beginning to introduce these capabilities piece by piece in order to stay ahead without having to completely overhaul operations. By incorporating AI solutions developed by Xnor, your store will avoid the headaches of conventional AI solutions. Xnor models can run on commodity devices, so you don’t need to upgrade your cameras or pay for expensive cloud-computing services (which are less secure). Running on-device also reduces latency and power consumption so your solutions will pick up that power-walker even on a battery-powered camera that you can place anywhere.

With Xnor’s computer vision models, physical stores can have the retail analytics they need to compete with their online counterparts – and help a loyal customer to find the perfect pair of shoes.

Visit Xnor to learn how the next generation in AI can help your retail store compete.

Mention Smart Appliance, and most people think of using a smartphone to turn on house lights as they pull in the driveway, arm security systems, control thermostats, or check if Amazon left a package on the front porch. Initially, that level of functionality was impressive. But so far, the value associated with Smart Appliances has been centered around heightened security and managing your home from a remote location.

It’s time Smart Appliances got an upgrade.

Smart Appliances V1

The first iterations of Smart Appliances were hampered by technical limitations. In some cases, the only smart things about the earliest versions were touch screen interfaces, Bluetooth connectivity and the option to use a mobile device to control the appliance. Advanced features like food detection, if they were used at all, were constrained by the limitations inherent in AI technology at that time. One of those factors was the processing power needed to run an AI application. AI apps that could recognize and identify specific varieties of food required a robust processor with a neural or GPU accelerator, as well as an ample power source. Incorporating a power-hungry processor into the design of an energy-efficient appliance wasn’t practical. These apps also required a persistent, high-bandwidth connection to the cloud. The resulting latency could delay system response to user input and create a poor customer experience. At any rate, aside from the onerous compute requirement, food detection models were still in their infancy. They were often inconsistent, and it was difficult to train them to identify new items.

The new generation of food identification technology promises to break through those barriers. With highly efficient algorithms, AI apps can be run on a small embedded device inside the appliance, without a persistent, high-bandwidth internet connection.

Here are a few ways AI on the Edge can make a Smart refrigerator a little smarter:

  • Add items to a shopping list when they need to be replenished
  • Suggest a recipe based on the items you already have in your refrigerator
  • Make grocery shoppers faster and more informed
  • Make recommendations for how best to store certain produce
  • Provide cooking tips for certain foods
  • Detect when there’s a spill inside

With this kind of upgrade, homeowners can use the new generation of Smart Appliances to reduce their monthly grocery bill, reduce waste, and save time at the grocery store.

Compact, efficient algorithms are the brains behind smart appliances

With Xnor’s efficient, on-device computer vision models, smart appliances are now becoming a reality. Xnor’s food identification models offer appliance manufacturers some specific advantages over conventional AI solutions:

Improved performance

The new generation of food identification technology brings AI to edge devices, so there’s no need for internet connectivity. When Smart Appliances aren’t tethered to an internet connection, they are more responsive. Plus, there’s no risk of downtime due to a network or service outage. That translates into a better experience for consumers.

Improved accuracy

Even an item as ubiquitous as a Granny Smith apple comes in a variety of shades, sizes, and shapes. Our highly efficient training models deliver substantially higher accuracy, making it possible to visually identify food items in less than ideal lighting conditions, even if they are partially obscured.

Reduced energy use

Keeping energy consumption to a minimum is a top priority for appliance manufacturers. Xnor’s food detection models have been shown to be up to 30x more energy efficient than conventional AI technology.

Lower costs

Without the need for fast, power-hungry processors, the cost of introducing these features comes way down. Combined with low energy use and internet-free, on-device computing, it’s now possible to incorporate advanced food detection capabilities into a range of products at multiple price points.

Bon-Appetit

There’s a multitude of tasks involved in preparing a meal. By going beyond preserving and cooking food, refrigerators will begin to behave less like an appliance, and more like a virtual sous-chef. As a company that’s invested a significant amount of research in this area, we’d like to say, “Bon-Appetit!”

Visit us to learn how the next generation in food detection technology can boost the performance of your Smart Appliance.

Machine vision has long been the holy grail for unlocking a number of real-world use cases for AI – think of home automation and security, autonomous vehicles, crop-picking robots, retail analytics, delivery drones or real-time threat detection. However, until recently, AI models for computer vision have been constrained to expensive, sophisticated hardware that often contains neural accelerators, or they had to be processed in the cloud on GPU- or TPU-enabled servers. Through Xnor’s groundbreaking research, in coordination with the Allen Institute for AI, on YOLO, Xnor-Net, YOLO 9000, Tiny YOLO and other AI models, we’ve been able to move machine learning from expensive hardware and the cloud to simple, resource-constrained devices that can operate completely on-device and autonomously. This means you can run sophisticated deep learning models on embedded devices without the need for neural processors and without the need for a data connection to the cloud. For example, on a 1.4 GHz dual-core ARM chip with no GPU, we can run object detection with CPU utilization of only 55%, a memory footprint of only 20MB, and power consumption of less than 4.7W.

Object Detection

Let’s dig into one specific model that we’ve built – object detection. Object detection is a type of AI model that identifies categories of objects that are present in images or videos – think people, automobiles, animals, packages, signs, lights, etc. – and then localizes their presence by drawing a bounding box around them. Utilizing a CNN (convolutional neural network), the model is able to simultaneously draw multiple bounding boxes and then predict classification probabilities for those boxes based on a trained model.
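To make the output of such a model concrete, here is a minimal sketch of how detections might be represented and filtered by confidence. The Detection class, the 0.5 threshold and the sample values are assumptions made for illustration; they are not Xnor’s actual API or real model output.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    label: str                       # predicted category, e.g. "person"
    confidence: float                # classification probability in [0, 1]
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels


def keep_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Return only the detections whose confidence meets the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Hypothetical raw output for one frame: several boxes, each with a probability.
raw = [
    Detection("person", 0.92, (34, 20, 180, 310)),
    Detection("dog", 0.31, (200, 150, 320, 300)),
    Detection("package", 0.77, (10, 280, 90, 340)),
]
confident = keep_confident(raw)
print([d.label for d in confident])
```

A downstream application would typically act only on the confident detections, for example by drawing their boxes or raising an alert.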

Traditionally these models have been resource intensive because of the model architecture – the number of layers (convolution, pooling and fully connected) – and the fact that most CNNs use 32-bit precision floating-point operations.

Xnor’s approach is different and we’ve summarized this approach below.

Xnorization (How It Works)

Our models are optimized to run more efficiently and up to 10x faster through a process we call Xnorization. This process contains five essential steps. First, we binarize the AI model. Second, we design a compact model. Third, we prune the model. Fourth, we optimize the loss function. Fifth, we optimize the model for the specific piece of hardware.

Let’s explore each of these in further detail.

Model Binarization

To reduce the compute required to run our object detection models, the first step is to retrain these models into a binary neural network called Xnor-Net. In Binary-Weight-Networks, the filters are approximated with binary values. The result is convolutional operations that are 58x faster and memory savings of up to 32x. Furthermore, these binary networks are simple, accurate, and efficient. In fact, the classification accuracy with a Binary-Weight-Network version of AlexNet is only 2.9% less than the full-precision AlexNet (in top-1 measure).

To do this, both the filters and the input to convolutional layers are binary. This is done by approximating the convolutions using primarily binary operations. Finally, the operations are parallelized in CPUs and optimized to reduce model size. This gives us the ability to reduce floating point operations to as small as a binary operation, making it hyper efficient. Once completed, we have state-of-the-art accuracy for models that:

  • Are 10x faster
  • Can be 20-200x more power efficient
  • Need 8-15x less memory than traditional deep learning models
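The binarization idea can be sketched in a few lines. The example below approximates a weight vector as a scaling factor times a vector of signs, then computes a dot product between two sign vectors using only XOR/NOT and a bit count. It is a simplified pure-Python illustration with hypothetical function names, not Xnor’s production code; a real implementation would operate on packed machine words across whole convolution filters.

```python
from typing import List, Tuple


def binarize(weights: List[float]) -> Tuple[float, List[int]]:
    """Approximate weights as alpha * signs, with signs in {-1, +1}."""
    alpha = sum(abs(w) for w in weights) / len(weights)  # mean absolute value
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs


def pack_bits(signs: List[int]) -> int:
    """Pack a {-1, +1} vector into an integer: bit i is 1 where signs[i] == +1."""
    bits = 0
    for i, s in enumerate(signs):
        if s == 1:
            bits |= 1 << i
    return bits


def xnor_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two packed {-1, +1} vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 wherever the two signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


weights = [0.5, -1.5, 1.0, -1.0]   # hypothetical full-precision filter
inputs = [1, -1, -1, 1]            # hypothetical already-binarized input
alpha, w_signs = binarize(weights)
dot = xnor_dot(pack_bits(w_signs), pack_bits(inputs), len(weights))
print(alpha, w_signs, dot)
```

Storing one bit per weight instead of a 32-bit float is where a memory saving of up to 32x comes from, and the multiply-accumulate loop collapses into bitwise operations that CPUs execute on many weights at once.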

Compact Model Design

The second critical piece is to design models that are compact. Without compact model design, the compute required for the model remains high. Our Xnorized models utilize a compact design to reduce the number of required operations and model size. We design as few layers and parameters into the model as possible. The model design is dependent on the hardware, but we take the same fundamental approach for each model.

Sparse Model Design

Third, a variety of techniques are used to prune the model’s operations and parameters. This reduces the model size and minimizes the operations necessary to provide accurate results. Here, most of the parameters are assigned zero as their value. The remaining parameters, which are very few, will be non-zero. By doing this, we can ignore all the computations for the zero parameters and only save the indexes and the values for the non-zero parameters.
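The sketch below illustrates the storage idea: after pruning, only the indexes and values of the non-zero parameters are kept, and a dot product visits only those entries. Magnitude-based pruning with a fixed threshold is an assumption chosen for simplicity; the text above does not say which pruning techniques are actually used.

```python
from typing import Dict, List


def prune(weights: List[float], threshold: float) -> Dict[int, float]:
    """Keep only weights with magnitude at or above threshold, as {index: value}."""
    return {i: w for i, w in enumerate(weights) if abs(w) >= threshold}


def sparse_dot(sparse_weights: Dict[int, float], inputs: List[float]) -> float:
    """Dot product that only touches the stored non-zero weights."""
    return sum(w * inputs[i] for i, w in sparse_weights.items())


dense = [0.01, -0.9, 0.0, 0.02, 1.2, -0.03]   # hypothetical dense parameters
sparse = prune(dense, threshold=0.1)           # only two entries survive
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
result = sparse_dot(sparse, x)
print(sparse, result)
```

In this toy case the dot product needs two multiplications instead of six, and only two index/value pairs are stored.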

Optimized Loss Functions

Fourth, we’ve built groundbreaking new techniques for retraining models on labels predicted by a model rather than on the original annotations alone. Techniques like Label Refinery greatly increase accuracy by optimizing loss functions for a distribution of all possible categories. With Label Refinery, we actually rely on another neural network model to produce labels. These labels have the following properties: 1) Soft; 2) Informative; and 3) Dynamic.

Soft labels are able to categorize multiple objects in an image and can determine what percentage of the image is represented by what object category. Informative labels provide a range of categories with the relevant confidence, so, for example, if something is mislabeled as a cat, you can know that the second highest category is dog. Dynamic labels allow you to ensure that the random crop is labeling the correct object in the image by running the model dynamically as you sample over the image.
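To show what training against a distribution of categories can look like, here is a minimal sketch of a cross-entropy loss computed against a soft label instead of a one-hot label. The category names, logits and label values are hypothetical, and soft-target cross-entropy is used here as a stand-in; the exact loss used in Label Refinery may differ.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(student_logits: List[float], soft_labels: List[float]) -> float:
    """Cross-entropy between a target distribution and the student's prediction."""
    probs = softmax(student_logits)
    return -sum(t * math.log(p) for t, p in zip(soft_labels, probs) if t > 0)


categories = ["cat", "dog", "ball"]
refined = [0.6, 0.3, 0.1]    # hypothetical soft label produced by a labeling model
one_hot = [1.0, 0.0, 0.0]    # conventional hard label for the same crop
logits = [2.0, 1.0, -1.0]    # hypothetical student output for that crop

loss_soft = soft_cross_entropy(logits, refined)
loss_hard = soft_cross_entropy(logits, one_hot)
print(loss_soft, loss_hard)
```

With the soft label, the student is also penalized for how it ranks "dog" and "ball", which is the extra information the Informative property refers to.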

You can learn more about this technique here.

Hardware Optimization

Lastly, because we’re building models for all sorts of embedded devices, we need to optimize the model for different hardware platforms to provide the highest efficiency across a broad range of Intel and Arm CPUs, graphics processing units (GPUs), and FPGA devices. For example, we’ve partnered with Toradex and Ambarella to build person detection models that can be viewed here and here.

Results

By Xnorizing our models, we’re able to achieve cutting-edge results. We have miniaturized models that are < 1MB in size and can run on the smallest devices. The models have fewer operations, faster inferences and higher frames per second, and low latency because they are running on-device. And we use fewer joules per inference, which translates to lower power consumption.

Much of the convenience and security that Smart Homes have claimed to promise has yet to become a reality. To understand why, consider that the technology behind a Smart Home historically required significant CPU power combined with a GPU or an accelerator chip to provide capabilities like object detection and face identification. To keep solutions affordable, today’s solutions are missing these advanced features.

Now the newest generation of AI tech will allow software engineers to get past those barriers. We refer to it as AI at the Edge. Not only does it drive costs down, it enables a whole new suite of enhanced object detection and face identification capabilities, making it possible to deliver a wide range of new products and services for Smart Homes.

Imagine a smarter home with computer vision AI.

A day in the life of a Smart Home

Consider the impact this could have on a day in the life of a future Smart Home dweller. We’ll call her Amy.

7:15 am

As Amy pulls out of her driveway, she’s confident that her security system will keep her home secure while she’s at work. When her husband leaves a little later, there’s just one other member of the family still at home: the family dog. Mr. Wiggles would do anything for his family, but as a ten-pound chihuahua, he isn’t much of a help in protecting their home.

The home’s security system recognizes Mr. Wiggles as a pet, so he doesn’t accidentally set off the motion detectors as he roams from room to room. Multiple cameras track him as he roams about the yard, but there’s no danger of Mr. Wiggles triggering a false alarm.

Later that afternoon, when someone approaches the front porch, the home uses facial recognition to determine if the individual is an authorized or unidentified person and monitors their movement. If they are lingering, the system can send Amy a notification or even engage an alarm system.

If they leave a package on the front porch, the system will recognize that there was an item left and notify Amy that there is a package waiting for her on her porch.

3:25 pm

When Amy’s son comes home from middle school, a smart doorbell recognizes his face. To open the door, all he has to do is tell the smart doorbell to unlock it. Amy receives a notification that her son has arrived at home and entered the house safely. Her son makes a beeline to the refrigerator and grabs a snack. An AI-enabled camera recognizes that the last Hot Pocket is gone and adds it to the virtual grocery list.

6:12 pm

As Amy pulls in the driveway, a camera recognizes the car and the license plate and opens the garage door. Amy’s arms are full of packages, but there’s no need for a key to get in. She simply tells the smart doorbell to open the door. The system confirms her identity via facial and voice recognition, deactivates the alarm, opens the door, turns on the lights, and adjusts the thermostat to her desired indoor temperature.

The AI that delivers on the promise of a Smart Home

Consider the Smart Home features highlighted in this story:

  • Being able to tell the difference between a family member, a stranger, and Mr. Wiggles
  • Locking or unlocking doors based on recognizing specific people
  • Sending an alert when an unidentified person is spending time around the house
  • Following objects across multiple cameras to track a subject moving from room to room
  • Identifying hundreds of inanimate objects including various types of food, vehicles and packages

All the capabilities featured in this story would have been difficult if not impossible to achieve without a new approach to AI.

Xnor’s combination of optimized pre-trained learning models and tuned algorithms gives solution providers the power to deliver the functionality that makes Smart Homes smart. Visit us to learn more.