Behrooz Parhami's ECE 257A Course Page for Fall 2020

Fault-Tolerant Computing

Page last updated on 2020 December 23

Enrollment code: 12963
Prerequisite: ECE 154 (computer architecture), or equivalent
Class meetings: MW 10:00-11:30, Phelps 1440 (YouTube lectures)
Instructor: Professor Behrooz Parhami
Open office hours: MW 10:00-11:00 AM, via Zoom
Course announcements: Listed in reverse chronological order
Course calendar: Lecture, homework, and exam schedules
Homework assignments: Four assignments, worth a total of 40%
Exams: None for fall 2020
Research paper: Report 50%; Poster 10%
Research paper guidelines: Brief guide to format and contents
Poster presentation tips: Brief guide to format and structure
Policy on academic integrity: Please read very carefully
Grades: Statistics for homework and exam grades
References: Textbook and other sources (Textbook's web page)
Lecture slides: Via the textbook's Web page
Miscellaneous information: Motivation, catalog entry, history

Course Announcements

2020/12/23: The fall 2020 offering of ECE 257A is officially over and grades have been reported to the Registrar. I realize that on-line courses, entailing limited interaction with instructors, are quite challenging. I did my best to make the course flow smoothly and predictably, with quick feedback and ready access to me via e-mail and Zoom office hours. Suggestions on ways to improve the course material, lecture presentations, and assessments will be much appreciated. Happy holidays; hope to see some of you in my ECE 254B and 252B classes during winter and spring 2021 quarters!
2020/12/17: All students submitted their research papers on time and I have already begun reading and evaluating them. You will receive feedback on your paper and poster, along with a letter-grade for the course, soon, via a private e-mail to be sent no later than the grading deadline of December 23, 2020. I will hold no more regular office hours until January 2021, but will remain available via e-mail until 12/23.
2020/12/11: HW4 has been graded and its stats, along with those for the overall homework component, have been posted below. All of the submitted posters satisfied the first part of the research requirements, with feedback provided to some students via private e-mail messages. Good luck in the rest of your research. Remember that the PDF file of your research paper in final form will be due on W 12/16 by midnight. This deadline cannot be extended, as I will need time to read and compare the papers before the grading deadline of 12/23.
2020/12/03: I have already posted Lectures 15 & 16, the last two for the course. Once you have watched these lectures and turned in HW4 by Monday 12/7, 10:00 AM, you can focus on completing your research project, for which two items will be due by the following deadlines:
W 12/09 (any time): PDF file of your poster (see "Poster presentation tips")
W 12/16 (any time): PDF file of your paper (see "Research paper guidelines")
I will maintain my regular Zoom office hours through M 12/21 and will report your paper & course grades to you, along with feedback, by my grading deadline of W 12/23.
[P.S.: The December 16, 2020, 6:30 PM PST, IEEE Central Coast Section Zoom talk by Dr. Jessica Santana (UCSB TMP), entitled "Using Natural Language Processing to Measure Ethical Convergence in Scientific Discourse" is relevant to what we discussed about ethical considerations in dependable computing. You don't have to be an IEEE member to attend the talk free of charge. Please let me know if you are interested and I will provide you with registration details.]
2020/11/25: I have updated Part VII of the textbook (Failures: Computational Breaches) and its slides. A link to the recording of Lecture 15 will be posted by tomorrow and Lecture 16 (the last one for the course) will follow before Sunday. HW4, the last one for the course, will be due by 10:00 AM on Monday 12/07. During the last week of classes, you will be focusing on completing your research paper and the associated poster. I have decided against having Zoom poster presentations, but you still need to submit your prepared poster by W 12/09 as part of your research work.
2020/11/19: I have updated Part VI of the textbook (Degradation: Behavioral Lapses) and its slides. A link to the recording of Lecture 13 has already been posted and Lecture 14 will follow within 24 hours. I will post HW4, the last one for the course, over the coming weekend (11/21-22). Hope you are making good progress in your research project. If you need help with references or other aspects of your research, please reach out via e-mail or see me during one of my Zoom office hours. Final list of references and a provisional abstract will be due by M 11/30 (any time). A paper's abstract is usually the last part written, so "provisional" means that you won't be bound by the abstract that you submit at this time. The idea is to put into words what you have already accomplished in your project and what you hope to accomplish if everything goes right. You will have a chance to modify your abstract to correspond to the actual contents of your completed paper.
2020/11/12: I have updated Part V of the textbook (Malfunctions: Architectural Anomalies) and its slides. A link to the recording of Lecture 11 has already been posted and Lecture 12 will follow within 24 hours. The preliminary lists of research references that you submitted are all acceptable, so I won't provide separate feedback. I am looking forward to receiving your final references and provisional abstract by 11/30.
2020/11/08: I have posted HW3 a few days ahead of schedule, in case you have extra time to work on it during our "Research Focus" week. For your preliminary list of references, which is due by midnight tomorrow, please pick a couple of key papers that you have studied, plus a few more that explore various aspects of your topic. You will have a chance to include additional items, as you learn more about the topic, pursue leads from already-included references, and explore various subtopics. About half a dozen references would be sufficient at this point. [P.S.: I plan to finish the grading of HW2 later today.]
2020/11/01: Part IV of the textbook and its presentation slides have been updated. Link to the recording of Lecture 9 has been posted and a link for Lecture 10 will follow within 24 hours. Homework 2 is due on W 11/04, 10:00 AM. The week of November 9-13 (which includes the Veterans' Day Holiday on 11/11) is research focus week. I won't post a new homework or lecture, so as to give you time to make progress on your research project, in anticipation of your preliminary reference lists due on M 11/09 and a more complete list of references, along with a provisional abstract, due on M 11/30.
2020/10/26: [For some reason, the latest update to this page was inadvertently replaced by an old version. Hence, I am posting the updates again.] HW2 and links to Lectures 7 & 8 have been posted. Part III of the textbook and its slides have also been updated on the textbook's Web page.
2020/10/21: Your grade for HW1 along with solutions will be e-mailed to you by the end of today. I have also finished the task of assigning the research topics. The same e-mail will specify your assigned topic. Of the ten students who specified their preferred topics by today's extended deadline, 6 were assigned their first choice and 4 got their second choice.
2020/10/16: Link to the 78-minute video of Lecture 5 has been posted in the "Course Calendar" section below. Lecture 6 will be recorded soon and should become available within 24 hours. The list of research topics has been updated for fall 2020. A correction to Slide 96 (Part I, Chapter 3 of the textbook) has been e-mailed to you via GauchoSpace.
2020/10/09: Link to the 88-minute video of Lecture 3 has been posted in the "Course Calendar" section below. Lecture 4 will be recorded soon and should become available within 24 hours. HW1 has been posted a few days ahead of schedule.
2020/10/04: Link to the 105-minute video of Lecture 1 has been posted in the "Course Calendar" section below. Lecture 2 will be recorded soon and should become available within 24 hours. [Correction: In Lecture 1, Slide 22, I erroneously refer to a Swedish nuclear reactor as a US unit.]
2020/09/29: Course update sent to students via GauchoSpace: Updated versions of the textbook Chapters 1-4 (Part I) and associated slides have been posted to the textbook's Web page. A forthcoming ACM webinar by Dr. Moshe Vardi, entitled "Lessons from COVID-19: Efficiency vs Resilience," is of great interest to this course (Registration). It will be offered on Wednesday, October 14, 9:00 AM PDT. Please watch it if you can, as a problem in our HW1 relates to this presentation. If you cannot watch the webinar, there are alternate sources that I will specify in the problem statements for HW1. Videos for Lectures 1 and 2 (week of October 05-09) will be posted by the end of this week.
2020/09/25: Introductory message sent to the 11 enrolled students via GauchoSpace: I hope your summer was fun and/or productive, despite the difficult circumstances arising from the coronavirus pandemic. As your instructor for this course, I will try to function more or less normally during fall and provide you with significant learning opportunities with a modified course format to accommodate on-line instruction.
During the fall 2020 quarter, you will need to consult two Web pages on a regular basis: This course page and the textbook page. I will use GauchoSpace only for announcements and not as the main communication medium for the course.
Course lectures will be asynchronous, using the flipped-classroom model. I will record the lectures, post them on YouTube, and provide you with the link for watching. The first hour of each class session (MW 10:00-11:00) will be devoted to a Zoom office hour. I will also be available for answering questions through e-mail. ECE 257A requirements and grading scheme are described near the top of this Web page.
I will send you links to the first couple of lectures soon. I will also inform you when I have updated the first four chapters of the textbook, so that you can download them from the textbook's Web page.
Looking forward to interacting with you during the fall quarter!
2020/07/26: Free conference attendance: The European Dependable Computing Conference will be held virtually from September 8 to 10, 2020. Intel and Fraunhofer IKS are covering all organizational expenses. Hence, participation is free of charge, but you need to pre-register. [Conference program] [Workshops]
2020/06/11: Welcome to the ECE 257A web page for fall 2020. The course will be research-based, with 60% of your grade determined by your research report and poster presentation and 40% based on homework. I plan to update the lecture slides and textbook chapters over the summer months and through the fall quarter, with each revised chapter becoming available before discussion in class.
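To make the grading scheme concrete, here is a minimal Python sketch of how the three components could be combined into one percentage, using the weights stated on this page (homework 40%, research report 50%, poster 10%). The function name and the sample scores are hypothetical, and the sketch does not model the rule that a publishable report earns an "A" regardless of homework grades.

```python
# Weights as stated on this page: homework 40%, report 50%, poster 10%.
WEIGHTS = {"homework": 0.40, "report": 0.50, "poster": 0.10}


def weighted_score(scores):
    """Return the weighted percentage for a dict of component scores (0-100).

    Hypothetical helper for illustration; not the instructor's actual tool.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores for a hypothetical student (not actual course data).
example = {"homework": 90.0, "report": 80.0, "poster": 100.0}
print(round(weighted_score(example), 2))  # 0.4*90 + 0.5*80 + 0.1*100 = 86.0
```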

Course Calendar

Course lectures, homework assignments, and research paper deadlines have been scheduled as follows. This schedule will be strictly observed. In particular, no extension is possible for homework due dates. Please begin work on your assignments early. Each lecture corresponds to topics in 1-2 chapters of the instructor's forthcoming textbook on dependable computing. Chapter numbers are provided in parentheses, after day & date.

Day & Date (book chapters) Lecture topic [Homework posted/due] {Special notes}
M 10/05 (1) Background and motivation {Lecture 1}
W 10/07 (2) Dependability attributes {Lecture 2}

M 10/12 (3) Combinational modeling [HW1 posted, chs. 1-4] {Lecture 3}
W 10/14 (4) State-space modeling {Lecture 4}

M 10/19 (5, 7) Defect avoidance; Shielding and hardening {Research topics specified} {Lecture 5}
W 10/21 (6, 8) Defect circumvention; Yield enhancement [HW1 due] {Lecture 6}

M 10/26 (9, 11) Fault testing; Design for testability [HW2 posted, chs. 5-12] {Lecture 7}
W 10/28 (10, 12) Fault masking; Replication with voting {Research assignments finalized} {Lecture 8}

M 11/02 (13, 15) Error detection; Self-checking modules {Lecture 9}
W 11/04 (14, 16) Error correction; Redundant disk arrays [HW2 due] {Lecture 10}

M 11/09 Research-focus week: Getting started {Preliminary reference list due}
W 11/11 No lecture (Veterans' Day) [HW3 posted, chs. 13-20]

M 11/16 (17, 19) Malfunction diagnosis; Standby redundancy {Lecture 11}
W 11/18 (18, 20) Malfunction tolerance; Robust parallel processing {Lecture 12}

M 11/23 (21, 23) Degradation allowance; Resilient algorithms [HW3 due] {Lecture 13}
W 11/25 (22, 24) Degradation management; Software redundancy [HW4 posted, chs. 21-28] {Lecture 14}

M 11/30 (25, 27) Failure confinement; Agreement and adjudication {Ref's & abst. due} {Lecture 15}
W 12/02 (26, 28) Failure recovery; Fail-safe systems {Lecture 16}

M 12/07 Research-focus week: Finishing up [HW4 due]
W 12/09 Poster presentations {PDF of poster due} {Instructor and course evaluations}

W 12/16 {Full research paper PDF file due by midnight}
W 12/23 {Course grades due by midnight}

Homework Assignments

- Turn in your solutions as a PDF file attached to an e-mail sent by the due date/time.
- Because solutions will be handed out on the due date, no extension can be granted.
- Include your name, course name, and assignment number at the top of the first page.
- If homework is handwritten and scanned, make sure that the PDF is clean and legible.
- Although some cooperation is permitted, direct copying will have severe consequences.

Homework 1: Dependability and its modeling (chs. 1-4, due W 2020/10/21, 10:00 AM)
Do the following problems from the textbook: 1.6, 1.19, 2.26, 3.15, 4.12

Homework 2: Defects and faults (chs. 5-12, due W 2020/11/04, 10:00 AM)
Do the following problems from the textbook: 5.2, 7.1, 8.2, 10.6, 11.6ab

Homework 3: Errors and malfunctions (chs. 13-20, due M 2020/11/23, 10:00 AM)
Do the following problems from the textbook: 13.1, 14.3, 15.7, 17.9, 19.3

Homework 4: Degradations and failures (chs. 21-28, due M 2020/12/07, 10:00 AM)
Do the following problems from the textbook: 21.4, 22.2, 24.6, 25.1, 27.6

Sample Exams and Study Guide

(This section does not apply to fall 2020)
The following sample exam problems are meant to indicate the types and levels of problems, rather than the coverage (which is outlined in the course calendar).
Students are responsible for all sections and topics in the textbook and class handouts that are not explicitly excluded in the study guide that follows each sample exam, even if the material was not covered in class lectures.

Sample Midterm Exam (105 minutes)
Problems 3.12, 4.4, 9.4, and 12.1 from the textbook.

Midterm Exam Study Guide
Study Chapters 1-12 and review the problems in homework assignments 1-2. The following textbook sections are excluded: 6.6, 7.6, 8.6, 9.4, 9.6, 11.6

Sample Final Exam (120 minutes)
Problems 15.5, 17.1, 21.2, and 27.3 from the textbook.

Final Exam Study Guide
Study Chapters 13-28 and review the problems in homework assignments 3-4. The following textbook sections are excluded: 13.6, 14.6

Research Paper and Presentation

Each student will review a subfield of dependable computing or do original research on a selected and approved topic. A preliminary list of research topics is provided below (new topics, and new references for the current topics, may be added later). However, students should feel free to propose their own topics for approval. To propose a topic, send via e-mail a one-page narrative, including 2-3 key references, to the instructor.

A publishable report earns an "A" for the course, regardless of homework grades. See the course calendar for schedule and due dates and Research Paper Guidelines for formatting tips.

This year's suggested research topics for ECE 257A are built around the theme "Robustness of Interconnection Networks." You can get started on each topic by taking a look at the following two common references, plus one topic-specific reference that is provided further down on this page. The two common references are:

[Parh10] Parhami, B., "Robustness Attributes of Interconnection Networks for Parallel Processing," Keynote Lecture at the First Int'l Supercomputing Conf., Guadalajara, Mexico, March 2010. {PPT and PDF slides are available from B. Parhami's Publications Web page; see publication [262].}

[Sall12] Salles, R. M. and D. A. Marino Jr., "Strategies and Metric for Resilience in Computer Networks," Computer J., Vol. 55, No. 6, pp. 728-739, June 2012.

1. Effects of Missing Nodes on Network Diameter and Average Distance (Assigned to: Trenton G. Rochelle)
[Kris87] Krishnamoorthy, M.S. and B. Krishnamurthy, "Fault Diameter of Interconnection Networks," Computers & Mathematics with Applications, Vol. 13, Nos. 5/6, pp. 577-582, 1987.

2. Effects of Missing Links on Network Diameter and Average Distance (Assigned to: TBD)
[Kris87] Krishnamoorthy, M.S. and B. Krishnamurthy, "Fault Diameter of Interconnection Networks," Computers & Mathematics with Applications, Vol. 13, Nos. 5/6, pp. 577-582, 1987.

3. Synthesis of Interconnection Networks with Maximal Fault Tolerance (Assigned to: TBD)
[Chen09] Chen, W., W. J. Xiao, and B. Parhami, "Swapped (OTIS) Networks Built of Connected Basis Networks are Maximally Fault Tolerant," IEEE Trans. Parallel and Distributed Systems, Vol. 20, pp. 361-366, March 2009.

4. Adaptive Schemes for Point-to-Point Communication in Networks (Assigned to: Veena Bellamkonda)
[Ngai91] Ngai, J. Y. and C. L. Seitz, "A Framework for Adaptive Routing in Multicomputer Networks," Computer Architecture News, Vol. 19, No. 1, pp. 6-14, March 1991.

5. Adaptive Schemes for Collective Communication in Networks (Assigned to: TBD)
[Pand95] Panda, D. K., "Issues in Designing Efficient and Practical Algorithms for Collective Communication on Wormhole-Routed Systems," Proc. Int'l Conf. Parallel Processing Workshop on Challenges for Parallel Processing, 1995, pp. 8-15.

6. Deadlocks in Adaptive Routing and How to Avoid or Detect Them (Assigned to: Ziming Qi)
[Dall93] Dally, W. J. and H. Aoki, "Deadlock-Free Adaptive Routing in Multicomputer Networks Using Virtual Channels," IEEE Trans. Parallel and Distributed Systems, Vol. 4, No. 4, pp. 466-475, April 1993.

7. Diagnosability of Regular Degree-d Interconnection Networks (Assigned to: Gushu Li)
[Chan05] Chang, G.-Y., G. J. Chang, and G.-H. Chen, "Diagnosabilities of Regular Networks," IEEE Trans. Parallel and Distributed Systems, Vol. 16, No. 4, pp. 314-323, April 2005.

8. Diagnosability of Hierarchical or Multilevel Interconnection Networks (Assigned to: TBD)
[Xu09] Xu, M., K. Thulasiraman, and X.-D. Hu, "Conditional Diagnosability of Matching Composition Networks Under the PMC Model," IEEE Trans. Circuits and Systems II, Vol. 56, No. 11, pp. 875-879, November 2009.

9. Synthesis of Interconnection Networks with Maximal Diagnosability (Assigned to: Adam Jinhua Li)
[Chan05] Chang, G.-Y., G. J. Chang, and G.-H. Chen, "Diagnosabilities of Regular Networks," IEEE Trans. Parallel and Distributed Systems, Vol. 16, No. 4, pp. 314-323, April 2005.

10. Network Virtualization as a Tool for Fault Tolerance in Interconnection Networks (Assigned to: Varshika Mirtini)
[Fisc13] Fischer, A., J. F. Botero, M. T. Beck, H. De Meer, and X. Hesselbach, "Virtual Network Embedding: A Survey," IEEE Communications Surveys & Tutorials, Vol. 15, No. 4, pp. 1888-1906, 2013.
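
As a possible starting point for Topics 1 and 2, the following minimal Python sketch shows one way to measure how missing nodes affect a network's diameter and average distance. It is an illustration only, not code from the cited references: the binary hypercube is an assumed example network, the function names are hypothetical, and missing nodes are simply skipped during breadth-first search.

```python
from collections import deque
from itertools import combinations


def hypercube(d):
    """Adjacency lists of the d-dimensional binary hypercube (2**d nodes)."""
    return {v: [v ^ (1 << i) for i in range(d)] for v in range(2 ** d)}


def distances_from(adj, src, faulty):
    """Breadth-first-search distances from src, skipping faulty nodes."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in faulty and w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter_and_avg_distance(adj, faulty=frozenset()):
    """Return (diameter, average distance) over surviving node pairs,
    or None if the surviving network is disconnected."""
    alive = [v for v in adj if v not in faulty]
    if len(alive) < 2:
        return (0, 0.0)
    total, worst = 0, 0
    for s in alive:
        dist = distances_from(adj, s, faulty)
        if len(dist) != len(alive):
            return None  # some surviving node is unreachable from s
        total += sum(dist.values())
        worst = max(worst, max(dist.values()))
    return worst, total / (len(alive) * (len(alive) - 1))


adj = hypercube(3)
print(diameter_and_avg_distance(adj)[0])  # fault-free 3-cube: diameter 3

# Worst diameter over every choice of two missing nodes in the 3-cube.
worst = max(diameter_and_avg_distance(adj, frozenset(f))[0]
            for f in combinations(adj, 2))
print(worst)  # 4, i.e., one more than the fault-free diameter
```

Replacing the node set `faulty` with a set of removed edges would adapt the same experiment to missing links.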

Topics outside the main theme for the quarter

a. Reasoning Under Uncertainty, with Applications to Dependable Computing (Assigned to: TBD)
[IJAR16] Int'l J. Approximate Reasoning, Vol. 71, pp. 1-62, December 2016 (Five review articles on 40 years of Dempster-Shafer Theory)

b. Probabilistic Analysis of Program Correctness Under Soft Errors (Assigned to: Jeff A. Longo)
[Carb16] Carbin, M., S. Misailovic, and M. C. Rinard, "Verifying Quantitative Reliability for Programs that Execute on Unreliable Hardware," Communications of the ACM, Vol. 59, No. 8, pp. 83-91, August 2016.

c. Reliability of Reconfigurable 2D Processor Arrays with Distributed Switching (Assigned to: TBD)
[Parh20] Parhami, B., "Reliability and Modelability Advantages of Distributed Switching for Reconfigurable 2D Processor Arrays," Proc. 11th Annual IEEE Information Technology, Electronics and Mobile Communication Conf., November 2020, to appear.

d. Reliability Considerations in the Design of Neuromorphic Chips (Assigned to: Swetha Manoj Pillai)
[Gree20] Greengard, S., "Neuromorphic Chips Take Shape," Communications of the ACM, Vol. 63, No. 8, pp. 9-11, August 2020.

e. Dependable Computing Under Extreme Nanoscale Parameter Variations (Assigned to: Mingzhe Li)
[Ghos10] Ghosh, S. and K. Roy, "Parameter Variation Tolerance and Error Resiliency: New Design Paradigm for the Nanoscale Era," Proceedings of the IEEE, Vol. 98, No. 10, pp. 1718-1751, October 2010.

f. Fault-Tolerant and Easily-Testable Fourier Transform Networks (Assigned to: TBD)
[Jou88] Jou, J. Y. and J. A. Abraham, "Fault-Tolerant FFT Networks," IEEE Trans. Computers, Vol. 37, No. 5, pp. 548-561, May 1988.
[Lu05] Lu, S. K., J. S. Shih, and S. C. Huang, "Design-for-Testability and Fault-Tolerant Techniques for FFT Processors," IEEE Trans. VLSI Systems, Vol. 13, No. 6, pp. 732-741, 2005.

g. Robustness and Fault Tolerance in Natural and Artificial Neural Networks (Assigned to: Min Jian Yang)
[Torr17] Torres-Huitzil, C. and B. Girau, "Fault and Error Tolerance in Neural Networks," IEEE Access, Vol. 5, pp. 17322-17341, August 2017.

h. Reliability Considerations in the Design of Analog Mixed-Signal Chips (Assigned to: Zahra Fahimi)
[Joks20] Joksas, D., P. Freitas, et al., "Committee Machines—A Universal Method to Deal with Non-Idealities in Memristor-Based Neural Networks," Nature Communications, Vol. 11, No. 1, pp. 1-10, 2020.
[Rekh19] Rekhi, A. S., et al., "Analog/Mixed-Signal Hardware Error Modeling for Deep Learning Inference," Proc. 56th Annual Design Automation Conf., 2019, pp. 1-6.

Poster Presentation Tips

Here are some guidelines for preparing your research poster. The idea of the poster is to present your research results and conclusions thus far, get oral feedback during the session from the instructor and your peers, and provide the instructor with something to comment on before your final report is due. Please send a PDF copy of the poster via e-mail by midnight on the poster presentation day.

Posters prepared for conferences must be colorful and eye-catching, as they are typically competing with dozens of other posters for the attendees' attention. Here is an example of a conference poster. Such posters are often mounted on a colored cardboard base, even if the pages themselves are standard PowerPoint slides. In our case, you should aim for a "plain" poster (loose sheets, to be taped to the wall in our classroom) that conveys your message in a simple and direct way. Eight to 10 pages, each resembling a PowerPoint slide, would be an appropriate goal. You can organize the pages into a 2 x 4 (2 columns, 4 rows), 2 x 5, or 3 x 3 array on the wall. The top two of these might contain the project title, your name, course name and number, and a very short (50-word) abstract. The final two can perhaps contain your conclusions and directions for further work (including work that does not appear in the poster, but will be included in your research report). The rest will contain brief descriptions of ideas, with emphasis on diagrams, graphs, tables, and the like, rather than text, which is very difficult for a visitor to absorb in a very limited time span.

Grade Statistics

All grades listed are in percent, unless otherwise noted.
HW1 grades (letter): Range = [B, A], Mean = 3.71, Median = A–
HW2 grades (letter): Range = [B, A+], Mean = 3.60, Median = A–
HW3 grades (letter): Range = [B, A], Mean = 3.45, Median = B+
HW4 grades (letter): Range = [B, A], Mean = 3.74, Median = A–
Overall homework grades: Range = [81, 100], Mean = 91, Median = 92
Research paper grades: Range = [65, 90], Mean = 78, Median = 75
Course grades (letter): Range = [B, A], Mean = 3.43, Median = B+
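
For readers wondering how a range, mean, and median can be reported for letter grades, here is a minimal Python sketch. The grade-point mapping (A+ = A = 4.0, A- = 3.7, B+ = 3.3, B = 3.0, B- = 2.7) is an assumption, since this page does not state the scale used, and the sample grades are made up rather than actual course data.

```python
from statistics import mean, median_low

# Assumed ordering and grade-point mapping (not stated on this page).
ORDER = ["B-", "B", "B+", "A-", "A", "A+"]
POINTS = {"A+": 4.0, "A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "B-": 2.7}


def summarize(grades):
    """Return range, mean grade points, and median letter for a grade list."""
    ranks = sorted(ORDER.index(g) for g in grades)
    return {
        "range": (ORDER[ranks[0]], ORDER[ranks[-1]]),
        "mean": round(mean(POINTS[g] for g in grades), 2),
        # median_low keeps the median an actual letter grade for even counts
        "median": ORDER[median_low(ranks)],
    }


sample = ["A", "A-", "B+", "A", "B"]  # hypothetical grades
print(summarize(sample))
```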

References

Required text: B. Parhami, Dependable Computing: A Multilevel Approach; chapters will be posted as they are updated. Please visit the textbook's web page for general information. Lecture slides are also available there.
Some useful books (not required):
Koren/Krishna, Fault-Tolerant Systems, Morgan Kaufmann, 2007 (ISBN 0-12-088525-5)
Shooman, Reliability of Computer Systems and Networks, Wiley, 2002 (ISBN 0-471-29342-3)
Siewiorek/Swarz, Reliable Computer Systems, Digital Press, 1992 (ISBN 1-55558-075-0)
Johnson, Design and Analysis of Fault-Tolerant Digital Systems, Addison Wesley, 1989 (ISBN 0-201-07570-9)

Research resources:
Proc. IEEE/IFIP Int'l Conf. Dependable Systems and Networks (DSN), formerly known as Fault-Tolerant Computing Symp. (FTCS), annual, since 1971.
IEEE Trans. Dependable and Secure Computing, published since 2004
IEEE Trans. Reliability, published since 1955
IEEE Trans. Computers, published since 1952
UCSB library's electronic journals, collections, and other resources

Miscellaneous Information

Motivation: Dependability concerns are integral parts of engineering design. Ideally, we would like our computer systems to be perfect, always yielding timely and correct results. However, just as bridges collapse and airplanes crash occasionally, so too computer hardware and software cannot be made totally immune to unpredictable behavior. Despite great strides in component reliability and programming methodology, the exponentially increasing complexity of integrated circuits and software systems makes the design of perfect computer systems nearly impossible. In this course, we study the causes of computer system failures (impairments to dependability), techniques for ensuring correct and timely computations despite such impairments, and tools for evaluating the quality of proposed or implemented solutions.

Catalog entry: 257A. Fault-Tolerant Computing. (4) PARHAMI. Prerequisites: ECE 154. Lecture, 3 hours. Basic concepts of dependable computing. Reliability of nonredundant and redundant systems. Dealing with circuit-level defects. Logic-level fault testing and tolerance. Error detection and correction. Diagnosis and reconfiguration for system-level malfunctions. Degradation management. Failure modeling and risk assessment.
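
As a small illustration of the "reliability of nonredundant and redundant systems" item in the catalog entry, the following Python sketch compares a single module with a triple-modular-redundant (TMR) arrangement. It is a textbook-style example under stated assumptions, not material from the course: modules fail independently, the voter is perfect, module lifetimes follow the exponential law R(t) = exp(-lambda t), and the function names are hypothetical.

```python
import math


def module_reliability(failure_rate, t):
    """Reliability of one module at time t, assuming exponential lifetimes."""
    return math.exp(-failure_rate * t)


def tmr_reliability(r):
    """Reliability of 2-out-of-3 voting with a perfect voter.

    The system works if all three modules work (r**3) or exactly two
    work (3 * r**2 * (1 - r)), which simplifies to 3r^2 - 2r^3.
    """
    return 3 * r ** 2 - 2 * r ** 3


for r in (0.9, 0.5, 0.4):
    print(r, round(tmr_reliability(r), 3))
# 0.9 -> 0.972, 0.5 -> 0.5, 0.4 -> 0.352
```

Under these assumptions, triplication with voting improves on a single module only while the module reliability stays above 0.5.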

History: Professor Parhami took over the teaching of ECE 257A in the fall quarter of 1998. Previously, the course had been taught primarily by Dr. John Kelly, who instituted the two-course sequence ECE 257A/B, the first covering general topics and the second (now discontinued) devoted to his research focus on software fault tolerance. Borrowing from his experience in teaching dependable computing at other universities and based on an extensive survey of the field that he published in 1994, Professor Parhami oriented the course toward an original multilevel view of impairments to computer system dependability and techniques for avoiding or tolerating them. The levels of this model, in increasing order of abstraction, are: defects, faults, errors, malfunctions, degradations, and failures. A textbook based on this multilevel model of dependable computing is in preparation.
Offering of ECE 257A in fall 2019
Offering of ECE 257A in fall 2018
Offering of ECE 257A in fall 2016 (PDF file)
Offering of ECE 257A in fall 2015 (PDF file)
Offering of ECE 257A in winter 2015 (PDF file)
Offering of ECE 257A in fall 2013 (PDF file)
Offering of ECE 257A in fall 2012 (PDF file)
Offering of ECE 257A in fall 2009 (PDF file)
Offering of ECE 257A in fall 2007 (PDF file)
Offerings of ECE 257A in 1998 and 2006 (PDF file)