Boot log analysis for system crashiness

I am troubleshooting a problem that seems to show up on many different operating systems and many different machines. The mouse pointer continues to be displayed, yet the system does not respond to the keyboard or the mouse. It would seem that it must be stuck in a higher-priority interrupt that shuts out everything else until it completes: some kind of critical section that has become deadlocked, a deadly embrace. Whatever it is, it does not allow the system to be analyzed at the moment of failure. Perhaps it needs a dump at power-off that records the core state when this happens, or even a diagnostic that runs in System Management Mode (SMM).

I love problems and always seem to find a way to solve them, and this is the diary of a repair. I have a new system that exhibits a random failure of this type, and I will find the bug. Here is a quick-and-dirty (Q&D) starter shell script that looks through the logs for the time of the crash.


#!/usr/bin/env bash
# Search every uncompressed, non-empty log in /var/log for the hour of the crash.
function showlogs {
    cd /var/log || return 1
    for x in *
    do
        if [ -f "$x" ]
        then
            case "$x" in
                *gz*) continue ;;    # skip compressed, rotated logs
            esac
            if [ ! -s "$x" ]
            then
                continue             # skip empty files
            fi
            echo "In $x"
            grep "Mar 29 13:" "$x"
        fi
    done
}
showlogs
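
The same scan can be sketched in Python, which makes it easy to also read the gzip-rotated logs that the shell version skips. This is only a sketch under assumptions: the function name scan_logs is made up for illustration, and the rotated logs are assumed to be ordinary gzip files with a .gz suffix.

```python
import gzip
from pathlib import Path


def scan_logs(log_dir, needle):
    """Return {file name: [matching lines]} for regular files directly in log_dir."""
    hits = {}
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file() or path.stat().st_size == 0:
            continue  # skip directories and empty logs, as the shell version does
        # Assumption: rotated logs end in .gz and are plain gzip data.
        opener = gzip.open if path.suffix == ".gz" else open
        try:
            with opener(path, "rt", errors="replace") as handle:
                matches = [line.rstrip("\n") for line in handle if needle in line]
        except (OSError, EOFError):
            continue  # unreadable without root, or not really gzip data
        if matches:
            hits[path.name] = matches
    return hits
```

Calling scan_logs("/var/log", "Mar 29 13:") would return the matching lines grouped by file name; the search string is the one used in the shell script and has to be changed for each crash.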

I start with this command to get the time of the crash:

last | grep crash

There are many ways to go about this without building a custom kernel or using kernel debug messages. The machine state is available as files (everything is a file in UNIX), and I am not sure whether it blocks when a drive error occurs, but I would hope not. So I can look in /proc and write something that waits for a condition such as a missing heartbeat from a program, a network signal, or just a long timeout with no system activity such as a click. I shall write it in Python, as that is a little more understandable and extensible, and I should learn some interesting things about the kernel and UNIX in general in the process. Below is another little script to get the layout of the status.

I can send UDP packets and use a wxPython script as a heartbeat display of the last known state, though I am not sure whether that would stop as well. When the hang happens the clock does not update, so I am guessing that a display program would be blocked too, and perhaps I should use a different indicator. I would have to build a custom kernel to get access to SMM or the interrupt structure. I am certain that SMM would override any blocking at the interrupt or CPU level.
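
The UDP heartbeat idea might look roughly like the sketch below. Everything named here is an assumption for illustration: the port number, the five-second limit, the JSON packet layout, and the function names. The watcher is meant to run on a second machine, since a display on the hung machine would probably freeze along with everything else.

```python
import json
import socket
import time

HEARTBEAT_PORT = 50007  # assumption: any free UDP port will do
STALE_AFTER = 5.0       # assumption: seconds of silence that count as a hang


def make_beat(seq, now=None):
    """Encode one heartbeat datagram: sequence number, timestamp, load average."""
    if now is None:
        now = time.time()
    try:
        with open("/proc/loadavg") as handle:
            load = handle.read().split()[0]
    except OSError:
        load = "?"  # not Linux, or /proc is unavailable
    return json.dumps({"seq": seq, "time": now, "load": load}).encode("ascii")


def parse_beat(data):
    """Decode a datagram produced by make_beat()."""
    return json.loads(data.decode("ascii"))


def is_stale(last_time, now, limit=STALE_AFTER):
    """True when the last heartbeat is older than the limit."""
    return (now - last_time) > limit


def send_forever(host, port=HEARTBEAT_PORT, interval=1.0):
    """Run on the machine under test: one datagram per interval."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    seq = 0
    while True:
        sock.sendto(make_beat(seq), (host, port))
        seq += 1
        time.sleep(interval)


def watch(port=HEARTBEAT_PORT, limit=STALE_AFTER):
    """Run on a second machine: report the last known state when beats stop."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    sock.settimeout(limit)
    last = None
    while True:
        try:
            data, _addr = sock.recvfrom(1024)
            last = parse_beat(data)
        except socket.timeout:
            print("no heartbeat for %.1f s; last known state: %r" % (limit, last))
        except ValueError:
            continue  # ignore datagrams that are not valid heartbeats
```

The timestamp and sequence number in the last packet received give a lower bound on when the machine stopped, which can then be used as the search string for the log scan.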


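#!/usr/bin/env bash
# Print the stat file of every entry under /proc that has one (one line per process).
cd /proc || exit 1
for proc in *
do
    if [ -d "$proc" ]
    then
        whom="$proc/stat"
        if [ -f "$whom" ]
        then
            cat "$whom" 2>/dev/null    # the process may exit before we read it
        fi
    fi
done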
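
Each line that the script prints follows the /proc/<pid>/stat layout: the process ID, the command name in parentheses, a one-letter state, the parent process ID, and then a long run of counters. A minimal Python sketch for pulling those leading fields apart follows; the names parse_stat and snapshot are hypothetical, and only the first four fields are interpreted.

```python
import os


def parse_stat(line):
    """Split one /proc/<pid>/stat line into pid, command, state, ppid and the rest."""
    # The command name is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    rest = line[close_paren + 1:].split()
    return {
        "pid": int(line[:open_paren]),
        "comm": line[open_paren + 1:close_paren],
        "state": rest[0],       # e.g. R running, S sleeping, D uninterruptible
        "ppid": int(rest[1]),
        "fields": rest[2:],     # remaining counters, left uninterpreted here
    }


def snapshot(proc_dir="/proc"):
    """Parse the stat file of every numeric (per-process) directory in proc_dir."""
    procs = []
    for name in os.listdir(proc_dir):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_dir, name, "stat")) as handle:
                procs.append(parse_stat(handle.read()))
        except (OSError, ValueError):
            continue  # the process exited between listdir() and open()
    return procs
```

For a hang like this one, a list such as [p for p in snapshot() if p["state"] == "D"] is a reasonable first thing to watch, since state D marks processes stuck in uninterruptible sleep.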

Pinout of the 9-pin serial connector, as seen from the DTE (computer) side:

Pin  Signal  In/Out  Description
1    DCD     In      Data Carrier Detect
2    RxD     In      Receive Data
3    TxD     Out     Transmit Data
4    DTR     Out     Data Terminal Ready
5    GND     -       Ground
6    DSR     In      Data Set Ready
7    RTS     Out     Request To Send
8    CTS     In      Clear To Send
9    RI      In      Ring Indicator

I decided on serial for the moment as a diagnostic interface, as the USB stack seems to be working when the machine becomes unresponsive. gtkterm is a useful tool; you can man it and apt-get it from Debian. You can also inspect the port settings with stty -a -F /dev/ttyUSB0. You need to know the difference between DCE and DTE, and you need a gender changer or crossover cable to run DCE against DCE. I have a USB-to-serial cable and a laptop that has a serial port. Now all I need to do is write a serial program that sets the baud rate, data bits, parity, and stop bits, then waits for a command and generates some continuous status data. I am torn between Python and C, and I think I will use both :), just to see how different it is, or perhaps how fast. The GUI or display is actually the toughest part of that, so probably Python with wxPython, and C with OpenGL text, since I already have that code.
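
As a rough sketch of the Python side, the standard termios module is enough to set the line parameters without any extra packages. The choices below are assumptions, not settled decisions: 9600 baud, 8 data bits, no parity, 1 stop bit, and a made-up one-byte protocol where "s" asks for a status line and "q" quits. The function names are hypothetical as well.

```python
import os
import termios


def raw_8n1(attrs, baud=termios.B9600):
    """Return a copy of a tcgetattr() list set to raw mode, 8 data bits, no parity, 1 stop bit."""
    iflag, oflag, cflag, lflag, _ispeed, _ospeed, cc = attrs
    cc = list(cc)  # copy so the caller's list is left untouched
    # No software flow control and no CR/NL translation on input.
    iflag &= ~(termios.IXON | termios.IXOFF | termios.ICRNL | termios.INLCR | termios.ISTRIP)
    oflag &= ~termios.OPOST  # no output post-processing
    # No line editing, echo, or signal characters.
    lflag &= ~(termios.ICANON | termios.ECHO | termios.ISIG | termios.IEXTEN)
    # Clear character size, parity, and the two-stop-bit flag, then select 8 bits.
    cflag &= ~(termios.CSIZE | termios.PARENB | termios.CSTOPB)
    cflag |= termios.CS8 | termios.CLOCAL | termios.CREAD
    cc[termios.VMIN] = 1   # read() blocks until at least one byte arrives
    cc[termios.VTIME] = 0  # no inter-byte timeout
    return [iflag, oflag, cflag, lflag, baud, baud, cc]


def open_port(device="/dev/ttyUSB0", baud=termios.B9600):
    """Open a serial device and apply the raw 8N1 settings."""
    fd = os.open(device, os.O_RDWR | os.O_NOCTTY)
    termios.tcsetattr(fd, termios.TCSANOW, raw_8n1(termios.tcgetattr(fd), baud))
    return fd


def serve(fd, get_status):
    """Answer each 's' byte with one status line; 'q' or end of input stops the loop."""
    while True:
        command = os.read(fd, 1)
        if command in (b"q", b""):
            break
        if command == b"s":
            os.write(fd, get_status().encode("ascii", "replace") + b"\r\n")
```

Something like serve(open_port(), lambda: open("/proc/loadavg").read().strip()) would then answer each "s" typed in gtkterm on the other machine with the current load average. Hardware flow control is left off here, which keeps the cable requirements down to TxD, RxD, and ground.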

And here is my medieval crossover cable nightmare. I didn't have a crossover cable handy, so this is what I did to make it work. I measured the voltages: + was +6.0 V and - was -6.0 V, and IIRC it was commonly ±24 V or ±12 V back in the IBM terminal days?

Contributors

Automated Intelligence
