воскресенье, 6 апреля 2014 г.

Sysdig - new Linux tracing tool for sysadmins

Last week Draios company made bold move - they made their Linux tracing tool Sysdig open-source.
What is Sysdig? As it says on own website - "strace + tcpdump + lsof + awesome sauce".
And I think that tool is really quite awesome.
Installation for daredevils is quite simple -
curl -s https://s3.amazonaws.com/download.draios.com/stable/install-sysdig | sudo bash
But for more responsible sysadmins there's good manual - set up sysdig repo and install it (you'll need linux headers and DKMS for automatic kernel module build and installation).
You can start learning sysdig using this simple examples:
  • See the top processes in terms of network bandwidth usage:
  • sysdig -c topprocs_net
  • See the top local server ports (in terms of total bytes)
  • sysdig -c fdbytes_by fd.sport
  • See the top client IPs (in terms of established connections)
  • sysdig -c fdcount_by fd.cip "evt.type=accept"
  • Топ процессов по использованию диска
  • sysdig -c topprocs_file
  • Print the top files that apache has been reading from or writing to
  • sysdig -c topfiles_bytes proc.name=httpd
  • List the processes that are using a high number of files
  • sysdig -c fdcount_by proc.name "fd.type=file"
  • See the files where most time has been spent
  • sysdig -c topfiles_time
  • See queries made via apache to an external MySQL server happening in real time
  • sysdig -A -c echo_fds fd.sip=192.168.30.5 and proc.name=apache2 and evt.buffer contains SELECT
You can also record all events server-wide (or process-wide, or using other sysdig filter):
sysdig -w out.scap proc.name=httpd
and analyze that later, using even MAC or Windows workstation.
Also there is a framework for Lua - Chisel - you can write a simple script and execute them immediately at sysdig run.
However, there's an open question still - how much additional load sysdig brings to server.
Let's make a simple test. I have an small virtual machine on Ubuntu 12.04, 1 GB of RAM, with Percona Mysql 5.6 installed.
  1. Install sysbench:
  2. sudo apt-get install sysbench
  3. Create empty database 'sbtest' and fill it with test data:
  4. sysbench --test=oltp --mysql-table-engine=innodb --oltp-table-size=10000 --mysql-user=root --mysql-password=root --db-driver=mysql --test=oltp prepare
    
  5. Run sysbench
  6. root@ubuntu:~# sysbench --num-threads=8 --max-requests=5000 --oltp-table-size=10000 --mysql-user=root --mysql-password=root --db-driver=mysql --test=oltp run
    
  7. Results
  8. sysbench 0.4.12:  multi-threaded system evaluation benchmark
    
    Running the test with following options:
    Number of threads: 8
    
    Doing OLTP test.
    Running mixed OLTP test
    Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
    Using "BEGIN" for starting transactions
    Using auto_inc on the id column
    Maximum number of requests for OLTP test is limited to 5000
    Threads started!
    Done.
    
    OLTP test statistics:
        queries performed:
            read:                            70014
            write:                           25003
            other:                           10001
            total:                           105018
        transactions:                        5000   (123.58 per sec.)
        deadlocks:                           1      (0.02 per sec.)
        read/write requests:                 95017  (2348.49 per sec.)
        other operations:                    10001  (247.19 per sec.)
    
    Test execution summary:
        total time:                          40.4587s
        total number of events:              5000
        total time taken by event execution: 323.5886
        per-request statistics:
             min:                                  5.83ms
             avg:                                 64.72ms
             max:                               8020.75ms
             approx.  95 percentile:             168.71ms
    
    Threads fairness:
        events (avg/stddev):           625.0000/24.51
        execution time (avg/stddev):   40.4486/0.01
    
Delete sbtest database, reboot virtual machine, repeat p.1 and 2
  1. Run sysdig in separate terminal:
  2. root@ubuntu:~# sysdig -w /root/mysqld.scap proc.name=mysqld
  3. Re-run test
  4. root@ubuntu:~# sysbench --num-threads=8 --max-requests=5000 --oltp-table-size=10000 --mysql-user=root --mysql-password=root --db-driver=mysql --test=oltp run
    
  5. Results
  6. sysbench 0.4.12:  multi-threaded system evaluation benchmark
    
    Running the test with following options:
    Number of threads: 8
    
    Doing OLTP test.
    Running mixed OLTP test
    Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
    Using "BEGIN" for starting transactions
    Using auto_inc on the id column
    Maximum number of requests for OLTP test is limited to 5000
    Threads started!
    Done.
    
    OLTP test statistics:
        queries performed:
            read:                            70014
            write:                           25002
            other:                           10001
            total:                           105017
        transactions:                        5000   (71.62 per sec.)
        deadlocks:                           1      (0.01 per sec.)
        read/write requests:                 95016  (1360.97 per sec.)
        other operations:                    10001  (143.25 per sec.)
    
    Test execution summary:
        total time:                          69.8150s
        total number of events:              5000
        total time taken by event execution: 558.1830
        per-request statistics:
             min:                                  9.35ms
             avg:                                111.64ms
             max:                               1590.65ms
             approx.  95 percentile:             304.89ms
    
    Threads fairness:
        events (avg/stddev):           625.0000/39.17
        execution time (avg/stddev):   69.7729/0.02
    
    
So, average query time has almost doubled - 111 ms instead of 65 ms. Not very impressive. Truthfully speaking, test was quite artificial and not very methodologically correct though....

Комментариев нет: