Tuesday 26th July 2016 17:11
1. Make sure you are not getting yourself blocked in the firewall. Check
/var/log/lfd.log for any blocks at the time of the issue. Also, ensure
that there are no cron jobs flushing iptables.
2. Have the physical memory in the server checked and/or swapped out. In
our experience, faulty memory is almost always the cause of server
hangs. RAM issues can be revealed after our service package work because
more memory is exercised by the increased page caching enabled for
services such as MySQL to aid performance, and by the increased load
from some applications, e.g. MailScanner.
3. If you're running a VPS, you will need to check whether the server has depleted its allocated resources.
4. Have the physical disks checked with offline tests, and check the
disks' S.M.A.R.T. data to see whether any errors are being logged against
the device, in particular bad blocks. This is the second most common
cause of server hangs that we see.
5. You need to have a console connected to your server so that you can
observe any errors or messages that are logged to it when the server
hangs. This is probably the most important step to take, as the console
is the last-ditch place for the kernel to log any problems.
6. At the time of the hang, you need to determine whether you can log in
to the server console. If so, you can fault find from there.
7. Once the server becomes available again, you need to check all of your
server logs thoroughly for entries made around the time of the hang. If
the kernel is able to write to disk, error messages may be present in
the system logs, in particular /var/log/messages.
8. If you are seeing failing server daemons, you need to check their respective error logs for the reason.
9. You need to check that none of your disk partitions are full and that they have not used up their inode allocation.
10. Make sure you're receiving and reading all of your server emails, in
particular any lfd emails regarding server load (PT_LOAD). Alternatively,
add additional recording of server performance (e.g. logging the output
from ps or top) on a very frequent basis to try to get a snapshot of what
the server is doing as it approaches a hang.
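
As a rough illustration of the frequent performance recording suggested in step 10, combined with the partition and inode check from step 9, the following Python sketch appends a snapshot to a log file at a fixed interval. It is only a starting point under stated assumptions: the function names, the log path in the usage note and the 10 second interval are all hypothetical, and the ps options assume the procps version of ps found on most Linux distributions.

```python
import os
import subprocess
import time


def percent_used(total, free):
    """Return the percentage in use, or None if the filesystem reports no total."""
    if total <= 0:
        return None
    return 100.0 * (total - free) / total


def partition_usage(path):
    """Return (block % used, inode % used) for the filesystem holding path."""
    st = os.statvfs(path)
    return (percent_used(st.f_blocks, st.f_bfree),
            percent_used(st.f_files, st.f_ffree))


def top_processes(limit=10):
    """Return the busiest processes by CPU as text (assumes procps ps)."""
    try:
        result = subprocess.run(
            ["ps", "-eo", "pid,pcpu,pmem,comm", "--sort=-pcpu"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"ps unavailable: {exc}"
    # Keep the header line plus the first `limit` processes.
    return "\n".join(result.stdout.splitlines()[:limit + 1])


def _fmt(value):
    return "n/a" if value is None else f"{value:.1f}%"


def format_snapshot(stamp, load, usage, processes):
    """Build one snapshot entry from already collected values."""
    lines = [f"=== {stamp} load={load[0]:.2f} {load[1]:.2f} {load[2]:.2f}"]
    for path, (blocks, inodes) in usage.items():
        lines.append(f"{path}: blocks={_fmt(blocks)} inodes={_fmt(inodes)}")
    lines.append(processes)
    return "\n".join(lines)


def record_snapshots(log_path, paths=("/",), interval=10, count=None):
    """Append a snapshot every `interval` seconds; count=None runs until stopped."""
    taken = 0
    while count is None or taken < count:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        usage = {p: partition_usage(p) for p in paths}
        entry = format_snapshot(stamp, os.getloadavg(), usage, top_processes())
        with open(log_path, "a") as fh:
            fh.write(entry + "\n")
            fh.flush()
            os.fsync(fh.fileno())  # push the entry to disk before the next sleep
        taken += 1
        if count is None or taken < count:
            time.sleep(interval)
```

Calling, for example, record_snapshots("/root/hang-snapshots.log", paths=("/", "/var")) would keep appending entries until interrupted. Each entry is synced to disk as soon as it is written, so the last entries before a hang have the best chance of surviving it. The file will grow continuously, so it should be removed or rotated once the investigation is over.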
If you are still unable to determine the cause after working through
each of the above steps (some of which you will probably need help with
from your server provider), you will most likely need to ask your server
provider to perform intensive load tests to help determine what is
causing the problem.
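
For the log checks in steps 1 and 7, it can help to pull out only the entries written close to the time of the hang. The Python sketch below is a minimal example under assumptions: it expects each line to begin with a traditional syslog timestamp such as "Jul 26 17:11:02" (which has no year, so the year is taken from the time you supply), it relies on English month abbreviations, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Parse a leading 'Jul 26 17:11:02' style timestamp; return None if absent."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_near(lines, moment, minutes=5):
    """Return the lines stamped within `minutes` either side of `moment`."""
    window = timedelta(minutes=minutes)
    matches = []
    for line in lines:
        stamp = parse_syslog_time(line, moment.year)
        if stamp is not None and abs(stamp - moment) <= window:
            matches.append(line.rstrip("\n"))
    return matches
```

A possible use is to open /var/log/lfd.log or /var/log/messages with open(path, errors="replace") and pass the file object to lines_near together with the approximate time of the hang, e.g. datetime(2016, 7, 26, 17, 11). Lines without a recognisable timestamp are skipped, so logs written in a different format will need their own parser.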