
A DNN Inference Offloading Scheme for Storage Arrays

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Recent advances in deep learning have brought a tremendous number of deep neural network (DNN) inference jobs into data centers. While hardware accelerators for DNN computation have progressed rapidly, the network capacity needed to transfer the large amounts of data that DNN computation consumes remains a common bottleneck that threatens service level objectives (SLOs). To alleviate this data-transfer bottleneck, we propose a novel system architecture that offloads DNN inference jobs to a storage node. Our system includes a concise API that reduces the programming burden of offloading computation, and a software architecture for running general DNN inference jobs in a conventional storage system. Experimental results show that, compared with existing systems, our system achieves 35% shorter average latency and a reduction of more than 99% in network usage for common image retrieval and classification jobs.

Original language: English
Title of host publication: Proceedings of the 3rd IEEE Eurasia Conference on IOT, Communication and Engineering 2021, ECICE 2021
Editors: Teen-Hang Meen
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 173-175
Number of pages: 3
ISBN (Electronic): 9781665445160
State: Published - 2021
Event: 3rd IEEE Eurasia Conference on IOT, Communication and Engineering, ECICE 2021 - Yunlin, Taiwan, Province of China
Duration: 29 Oct 2021 - 31 Oct 2021

Publication series

Name: Proceedings of the 3rd IEEE Eurasia Conference on IOT, Communication and Engineering 2021, ECICE 2021

Conference

Conference: 3rd IEEE Eurasia Conference on IOT, Communication and Engineering, ECICE 2021
Country/Territory: Taiwan, Province of China
City: Yunlin
Period: 29/10/21 - 31/10/21

Keywords

  • computation offloading
  • DNN inference
  • storage array
