Design Compact Deep Learning Models: Small is the New Big

Davis Sawyer at GOTO Chicago 2020

The emergence of deep neural networks (DNNs) in recent years has enabled ground-breaking abilities and applications for modern intelligent systems. State-of-the-art DNNs achieve high accuracy on tasks in computer vision and natural language processing, even outperforming humans on object recognition tasks. At the same time, the increasing complexity and sophistication of DNNs comes at the cost of significant power consumption, model size and computing resources. For example, since 2012, the training complexity of AI models has increased by 350,000x. These factors limit deep learning’s performance in real-time applications, in large-scale systems, and on low-power devices.
Furthermore, many low-end and cost-effective devices do not have the resources to execute DNN inference, forcing users to sacrifice privacy and offload processing to the cloud. Application developers, software engineers and algorithm architects must now create intelligent solutions that deal with strict latency constraints, such as in smart city, mobility and healthcare applications, which often require that inference be performed in a matter of milliseconds ...

Davis Sawyer - Co-founder and chief product officer at AI software startup Deeplite