Current Positions

  • 2019 - Present

    Sr. Hardware Safety Architect

    Automotive System Safety Engineering,
    NVIDIA

  • 2017 - Present

    Technical Consultant

    Compiler Micro-architecture Lab,
    Arizona State University

Previous Positions

  • 2016 - 2019

    SoC Design Engineer

    Internet of Things Group (IOTG),
    Intel

  • 2015 - 2016

    Research Assistant

    Compiler Micro-architecture Lab,
    Arizona State University

  • 2015 - 2016

    Graduate Research Aide

    Mathematical & Theoretical Biology Institute,
    Arizona State University

  • 2015 - 2016

    Teaching Assistant

    Ira A. Fulton Schools of Engineering,
    Arizona State University

Education

    • Master of Science
      Computer Engineering (2016)

      Arizona State University, Tempe

    • Bachelor of Technology
      Electronics & Communication Engineering (2013)

      GITAM University, Visakhapatnam


Awards and Accomplishments

  • 2023
    Patent Award, NVIDIA
    Recognition for a patent filed in the functional safety domain.
  • 2019
    Innovation Award, City of Chandler
    Recognized by the Chandler City Council for developing a curriculum to teach Python programming to schoolchildren with special needs. Received the 2019 Innovation Award from the city mayor as part of the Annual Chandler Volunteer Recognition Awards.
  • 2019
    Intel Divisional Award
    Recognized for demonstrating excellence in Speed & Execution for an IOTG critical product milestone.
  • 2018
    Intel Innovator Award
    Recognition for invention disclosures and patents filed in the domain of semiconductor functional safety.
  • 2017
    Founder & Instructor, Coding Club for kids with special needs
    Started a coding club to teach Python to kids with special needs in the Chandler Unified School District. Worked with Intel Involved volunteering teams to make it a year-long program.
  • 2016
    Invited Speaker, ASU Faculty Workshop for Research Computing
    Demonstrated ways to deploy and accelerate the execution of Uncertainty & Sensitivity Analysis workloads on ASU Research Computing clusters.
  • 2014
    Graduate Fellowship, Arizona State University
    Tuition waiver and scholarship for teaching assistance and graduate research at the Ira A. Fulton Schools of Engineering, ASU.
  • 2010
    Merit Scholarship, GITAM University
    Merit scholarship for good academic performance in the undergraduate programme.


InCheck: An In-application Recovery Scheme for Soft Errors

Dheeraj Lokam, Moslem Didehban, Aviral Shrivastava
Conference Paper: ACM/EDAC/IEEE 54th Design Automation Conference (DAC), Austin, Texas

Abstract

An ideal solution for soft error tolerance should hide the effect of soft errors from the user and provide correct results at the expected time. Software solutions are attractive because they can provide flexible reliability without imposing any hardware modifications. Our investigation of state-of-the-art error recovery techniques reveals that they suffer from poor coverage (the ability to detect and correctly recover from soft errors). This paper presents InCheck (In-application Checkpointing and Recovery) as an effective, safe, and timely software technique for complete error coverage. The key features of InCheck are verified register preservation, single-memory-location checkpoints, and safe and timely recovery. To evaluate the effectiveness of InCheck, we performed more than 210,000 fault injection experiments on different hardware components of an ARM Cortex-A53-like processor running MiBench applications. The original and SWIFT-R (state-of-the-art) protected programs produced 8,000 and 1,800 instances of wrong output, respectively, but programs protected by InCheck had no failures.

NEMESIS: A Software Approach for Computing in presence of Soft Errors

Dheeraj Lokam, Moslem Didehban, Aviral Shrivastava
Conference Paper: IEEE/ACM 36th International Conference on Computer Aided Design, Irvine, California

Abstract

Soft errors are considered the main reliability challenge for sub-nanoscale microprocessors. Software-level soft error resilience schemes are desirable because they require no hardware modifications and their protection can be tuned to the application requirements. However, existing software-level error-tolerance schemes do not provide a high level of protection. In this work, we present NEMESIS, a compiler-level, fine-grained soft error detection, diagnosis, and recovery technique that can provide a high degree of error resiliency. NEMESIS runs three versions of the computation and detects soft errors by checking the results of all memory write and branch operations. In case of a mismatch, the NEMESIS recovery routine reverts the effect of the error on the architectural state of the program, and the program resumes normal execution. Our extensive μ-architectural-level fault injection experiments show that the NEMESIS transformation is able to detect all soft errors and recover from 97% of detected errors.

An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes

Dheeraj Lokam
Manuscript: Master's Thesis

Abstract

Soft errors are considered a key reliability challenge for sub-nanoscale transistors. An ideal solution to this challenge should ultimately eliminate the effect of soft errors from the microprocessor. While forward recovery techniques achieve fast recovery from errors by simply voting out the wrong values, they incur the overhead of executing three copies. Backward recovery techniques need only two copies of execution, but suffer from checkpointing overhead. In this work I explored the efficiency of integrating checkpointing into the application and the effectiveness of the recovery that can be built on it. After evaluating the available fine-grained approaches to recovery, I introduce InCheck, an in-application recovery scheme that can be integrated into instruction-duplication-based techniques, thus providing fast error recovery. The proposed technique makes lightweight checkpoints at basic-block granularity and uses them for recovery. To evaluate the effectiveness of the proposed technique, 10,000 fault injection experiments were performed on different hardware components of a simulated modern in-order ARM processor. InCheck was able to recover from all detected errors by replaying about 20 instructions, whereas the state-of-the-art recovery scheme failed more than 200 times.

LIGHTWEIGHT CHECKPOINT TECHNIQUE FOR RESILIENCE AGAINST SOFT ERRORS

Dheeraj Lokam, Moslem Didehban, Aviral Shrivastava
Patent Granted: United States application 16/227,514, filed on 12/20/2018.

METHOD FOR DETECTING AND RECOVERY FROM SOFT ERRORS IN A COMPUTING DEVICE

Dheeraj Lokam, Moslem Didehban, Aviral Shrivastava
Patent Filed: United States application 62/681,129, filed on 06/11/2018.

MULTI-LEVEL FAULT SIMULATIONS FOR INTEGRATED CIRCUITS (IC)

Dheeraj Lokam, Kevin Locker, Siva Prasad Kota, Massimo Ceppi, Teo Cupaiuolo
Patent Granted: United States patent US10747633B2, filed on 09/24/2018.
