Don't reboot, debug!

Joshua Thijssen
September 18, 2015

Transcript

  1. Don't reboot, debug! A medic first aid course in debugging your server. Joshua Thijssen, @JayTaph
  5. Other causes:
    ➡ Apache / PHP / nginx / php-fpm
    ➡ Monitoring / backup
    ➡ Hanging cron jobs & runaway tools
    ➡ Connectivity / DNS problems
  8. ➡ Isolated user space.
    ➡ PID (process id) and state.
    ➡ Kernel “preempts”, or process yields.
    ➡ Multitasking.
  10. ➡ R: Running or runnable
    ➡ S: Interruptible sleep
    ➡ D: Uninterruptible sleep
    ➡ Z: Defunct process (zombie)
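
The state letters above are what Linux exposes in `/proc/<pid>/stat`. A minimal Python sketch of reading one follows; the helper names and the sample line are made up for illustration, and only the four states from the slide are mapped.

```python
# Hypothetical helpers: map the state letter of a /proc/<pid>/stat line
# to the descriptions used on the slide.
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "Z": "defunct (zombie)",
}

def parse_state(stat_line: str) -> str:
    """Return the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ")" characters, so split after the LAST closing parenthesis.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]

def describe_state(stat_line: str) -> str:
    """Translate the state letter, falling back for states not on the slide."""
    letter = parse_state(stat_line)
    return STATE_NAMES.get(letter, "other state (" + letter + ")")

# Invented sample line; a real one has many more fields after the state.
sample = "1234 (php-fpm: pool www) S 1 1234 1234 0 -1 4194560"
print(describe_state(sample))
```

On a live system the same letter appears in the STAT column of `ps` and the S column of `top`.
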
  14. ➡ Most processes are sleeping.
    ➡ External processes (and the kernel) can “wake up” a process at any time by sending “signals”.
    ➡ Fire signals with “kill”.
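
To make the signal mechanism concrete, here is a small Python sketch in which a process installs a handler and then signals itself. `os.kill()` is the programmatic counterpart of the `kill` command; the choice of SIGUSR1 and the handler name are assumptions for the example.

```python
import os
import signal
import time

received = []  # signal numbers seen by the handler

def on_signal(signum, frame):
    # Called by Python in the main thread once the signal has arrived.
    received.append(signum)

# Without a handler, the default action for SIGUSR1 is to terminate the process.
signal.signal(signal.SIGUSR1, on_signal)

# Send the signal to our own PID; the shell equivalent is `kill -USR1 <pid>`.
os.kill(os.getpid(), signal.SIGUSR1)

# Python runs handlers between bytecode instructions, so allow it a moment.
deadline = time.monotonic() + 1.0
while not received and time.monotonic() < deadline:
    time.sleep(0.01)

print(received == [signal.SIGUSR1])
```

A process in interruptible sleep (S) would be woken up to run such a handler; SIGKILL and SIGSTOP cannot be caught this way.
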
  18. ➡ Uninterruptible means it won’t handle signals (directly), but waits on its task to finish (it must wake up by itself).
    ➡ Used for high-performance loops that need to focus (like I/O).
    ➡ Can still be preempted by the kernel.
  21. ➡ Zombies aren’t bad.
    ➡ It’s just bad programming or administration that creates zombies.
    ➡ But there shouldn’t be many.
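
A zombie is a child that has exited but whose parent has not yet collected its exit status. The Python sketch below shows the "good programming" side: the parent reaps the child with `waitpid()`. The function name is invented for the example, and it assumes a Unix-like system where `os.fork()` is available.

```python
import os

def spawn_and_reap(exit_code: int) -> int:
    """Fork a child that exits at once, then reap it and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping Python-level cleanup handlers.
        os._exit(exit_code)
    # Parent: waitpid() collects the child's status and removes its entry
    # from the process table. If the parent never made this call, the
    # exited child would stay around in state Z (defunct).
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

print(spawn_and_reap(7))
```

If the parent itself exits, its zombies are handed to init, which reaps them; that is why killing or restarting a sloppy parent clears them.
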
  24. ➡ 1-minute, 5-minute and 15-minute averages.
    ➡ Calculated from the number of runnable processes (but has more sources nowadays).
    ➡ Depends on the number of CPUs!
  29. 14:57:22 up 35 days, 18:57, 1 user, load average: 1.52, 0.66, 0.27
    ➡ 1.52 average runnable processes in the last minute.
    ➡ 0.66 average in the last 5 minutes.
    ➡ 0.27 average in the last 15 minutes.
    ➡ Single CPU: 52% more than it can handle.
    ➡ Quad-core system: not doing very much.
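
The "depends on the number of CPUs" point is simple arithmetic: divide the load average by the CPU count. A rough Python sketch, using the `uptime` line from the slide; both function names are invented for illustration.

```python
def per_cpu_load(load_average: float, cpu_count: int) -> float:
    """Load average divided by the number of CPUs (1.0 means fully busy)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count

def parse_load_averages(uptime_line: str) -> tuple:
    """Pull the 1, 5 and 15 minute averages out of a line of `uptime` output."""
    _, _, tail = uptime_line.partition("load average:")
    return tuple(float(part) for part in tail.replace(",", " ").split())

line = "14:57:22 up 35 days, 18:57, 1 user, load average: 1.52, 0.66, 0.27"
one, five, fifteen = parse_load_averages(line)
print(round(per_cpu_load(one, 1), 2))  # single CPU: 1.52, i.e. 52% over capacity
print(round(per_cpu_load(one, 4), 2))  # quad core: 0.38, mostly idle
```

On a real machine `os.cpu_count()` and `os.getloadavg()` would supply the inputs instead of a pasted line.
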
  30. Q: How much memory does this process use? This is a REALLY hard question to answer! It depends on many factors!
  33. ➡ Virtual memory (VIRT)
    ➡ Shared memory (SHR, SHRD)
    ➡ Resident memory (RES or RSS)
    ➡ Swapped memory (SWP, SWAP)
  35. On a 32-bit system:
    ➡ Each process has a usable 4GB memory space.
    ➡ Even if you have less memory installed.
    ➡ 1GB is reserved for the kernel.
  36. Allocating memory
    ➡ New phone book entries are created.
    ➡ VIRT will increase.
    ➡ Allocating memory != using memory.
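
The difference between allocating and using memory can be observed on Linux by comparing `VmSize` (VIRT) and `VmRSS` (RES) in `/proc/self/status` before and after touching an allocation. The Python sketch below is one way to do that; the 64 MiB size and the helper names are arbitrary choices for the example, and the exact numbers will differ per system.

```python
import mmap
import os

def vm_fields(status_text: str) -> dict:
    """Return VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in ("VmSize", "VmRSS"):
            fields[name] = int(rest.split()[0])
    return fields

def read_self() -> dict:
    """Snapshot the current process's own memory counters."""
    with open("/proc/self/status") as handle:
        return vm_fields(handle.read())

if os.path.exists("/proc/self/status"):  # Linux only
    size = 64 * 1024 * 1024
    before = read_self()
    # Allocate: this only reserves address space, so VmSize should grow.
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    allocated = read_self()
    # Use: writing one byte per page forces the kernel to back it with RAM.
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1
    touched = read_self()
    region.close()
    for label, snap in (("before", before), ("allocated", allocated), ("touched", touched)):
        print(label, snap)
```

The expectation is that VmSize jumps right after the `mmap()` call while VmRSS only catches up once the pages have been written to.
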
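  38. <?php
    $pid = pcntl_fork();
    if ($pid === -1) {
        // Fork failed: no child process was created.
        exit("Could not fork\n");
    } elseif ($pid) {
        // Parent: pcntl_fork() returned the child's PID.
        echo "Hello, this is the parent process\n";
    } else {
        // Child: pcntl_fork() returned 0.
        echo "Hello, this is the child process\n";
    }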
  39. [Diagram: fork() =>. Physical memory holds pages A1, B1, C1; the two virtual address spaces hold A1, B1, C1 and A1`, B1`, C1`.]
  40. [Diagram: fork() =>, after a write. The virtual page B1` has become B2 and physical memory has gained a page B2; the other pages are unchanged.]
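
Copy-on-write itself happens inside the kernel and is not directly visible, but its effect is: after `fork()`, a write in the child never shows up in the parent. A small Python sketch of that behaviour follows, assuming a Unix-like system; the function name and the "B1"/"B2" values merely echo the diagram.

```python
import os

def child_write_is_private() -> tuple:
    """Fork, let the child overwrite a buffer, and return both views of it."""
    data = bytearray(b"B1")
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write makes the kernel copy the touched page,
        # so only the child's view of the buffer changes.
        data[:] = b"B2"
        os.write(write_end, bytes(data))
        os._exit(0)
    # Parent: read what the child saw, then compare with our own copy.
    os.close(write_end)
    child_view = os.read(read_end, 16)
    os.close(read_end)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return bytes(data), child_view

print(child_write_is_private())  # (b'B1', b'B2')
```

Pages that neither process writes to stay shared, which is why a forked worker's RES figure overstates the extra physical memory it really costs.
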
  43. $ free -m
                     total       used       free     shared    buffers     cached
        Mem:          3963       3500        462          0        722       1263
        -/+ buffers/cache:       1515       2448
        Swap:          400         20        379
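
The `-/+ buffers/cache` row in this older `free` layout is derived from the `Mem:` row: buffers and cache are subtracted from "used" and added to "free". A short Python sketch of that arithmetic with the numbers above; the function name is made up for the example.

```python
def without_buffers_and_cache(used: int, free: int, buffers: int, cached: int) -> tuple:
    """Recompute the "-/+ buffers/cache" row of the older `free` layout."""
    really_used = used - buffers - cached   # memory held by applications
    really_free = free + buffers + cached   # memory the kernel could hand out
    return really_used, really_free

# Values in MB taken from the Mem: row of the `free -m` output.
print(without_buffers_and_cache(used=3500, free=462, buffers=722, cached=1263))
# (1515, 2447)
```

The computed 2447 is one less than the 2448 that `free` printed, presumably because each column is rounded to whole megabytes separately. The practical reading: applications hold about 1.5GB, not the 3.5GB the first row suggests.
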
  46. ➡ With monitoring you have an excellent idea of:
    ➡ what is happening
    ➡ what happened
    ➡ what will likely happen
  47. ➡ syslog
    ➡ files
    ➡ mail
    ➡ slack / hipchat / irc
    ➡ logstash
    $ php composer.phar require monolog/monolog
  49. $ strace -ff -p <pid>
    ....
    socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 20
    fcntl(20, F_GETFL) = 0x2 (flags O_RDWR)
    fcntl(20, F_SETFL, O_RDWR|O_NONBLOCK) = 0
    connect(20, {sa_family=AF_INET, sin_port=htons(11211), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
    poll([{fd=20, events=POLLOUT}], 1, -1) = 1 ([{fd=20, revents=POLLOUT}])
    write(20, "get ez_client1/acls/"..., 44) = 44
    read(20, "END\r\n", 8196) = 5
    write(20, "get ez_client1/acl/g"..., 40) = 40
    read(20, "END\r\n", 8196) = 5
    write(20, "quit\r\n", 6) = 6
    shutdown(20, 2 /* send and receive */) = 0
    close(20) = 0
    mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)
    chmod("/tmp/smarty", 0777) = 0
    mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)
    chmod("/tmp/smarty", 0777) = 0
    access("/userdata/client1/user/templates/nl/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/user/templates/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/theme/templates/nl/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/theme/templates/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/etc/noxlogic/root/themes/ezshopping/templates/nl/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/etc/noxlogic/root/themes/ezshopping/templates/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)
    chmod("/tmp/smarty", 0777) = 0
    access("/userdata/client1/user/templates/nl/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/user/templates/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/theme/templates/nl/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/theme/templates/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/etc/noxlogic/root/themes/ezshopping/templates/nl/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/etc/noxlogic/root/themes/ezshopping/templates/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)
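
A trace like this gets long quickly, and the interesting part is often the calls that fail. As a rough illustration, the Python sketch below counts failed system calls per error name in saved strace output. The regular expression and function name are assumptions for this example and only cover the simple `call(...) = -1 ERRNO` line shape shown above, not every strace output format.

```python
import re
from collections import Counter

# Matches lines such as:
#   access("/path", F_OK) = -1 ENOENT (No such file or directory)
FAILED_CALL = re.compile(r'^(\w+)\(.*\)\s+=\s+-1\s+([A-Z0-9]+)\b')

def count_failures(trace_text: str) -> Counter:
    """Count (syscall, errno name) pairs for calls that returned -1."""
    failures = Counter()
    for line in trace_text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match:
            failures[(match.group(1), match.group(2))] += 1
    return failures

# A few lines copied from the trace on the slide.
sample = '''\
mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)
chmod("/tmp/smarty", 0777) = 0
access("/userdata/client1/user/templates/nl/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
access("/userdata/client1/user/templates/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
'''
for (syscall, errno_name), count in count_failures(sample).most_common():
    print(syscall, errno_name, count)
```

Run over the full trace, a summary like this would surface the repeated failing `access()` lookups and the redundant `mkdir()` calls at a glance.
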
  50. $ strace ping www.google.com
    ....
    mprotect(0xb757f000, 4096, PROT_READ) = 0
    munmap(0xb76d8000, 44104) = 0
    stat64("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=59, ...}) = 0
    socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
    connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.178.4")}, 16) = 0
    gettimeofday({1347446161, 382120}, NULL) = 0
    poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
    send(3, "u\205\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1", 32, MSG_NOSIGNAL) = 32
    poll([{fd=3, events=POLLIN}], 1, 5000
  51. ➡ Unobtrusive probes inside the kernel
    ➡ Scripts written in D language.
    ➡ SUN / Solaris only (licensing)
  52. ➡ SystemTap
    ➡ “GPL” version of dtrace
    ➡ Awesome, but complex
    ➡ But you need / want debug info packages
  53. ➡ There are some “providers” in the PHP core (zend_dtrace.{c,h,d})
    ➡ file / line
    ➡ function entry / exit
    ➡ exception caught / thrown
  55. Find me on twitter: @jaytaph
    Find me for development and training: www.noxlogic.nl
    Find me on email: [email protected]
    Find me for blogs: www.adayinthelifeof.nl
    Thank You!
    https://joind.in/talk/view/15191