Tuesday, December 9, 2014

Semi-irregular Sysadmin Ninja's Github Digest (Vol. 17)

1. reptyr
Reparent a running program to a new terminal
https://github.com/nelhage/reptyr

Quite an old tool by @nelhage, but it seems he is actively developing it again. It really does move a running process to another terminal. "'reptyr PID' will grab the process with id PID and attach it to your current terminal. After attaching, the process will take input from and write output to the new terminal, including ^C and ^Z."
It is also quite interesting to know how it works - check this blog post if you are curious.

2. dockerana
Docker + Graphite + Graphana = Dockerana
https://github.com/dockerana/dockerana
It's exactly what it looks like - Graphite + Grafana packed in a Docker container. Quite convenient.

3. seagull
Friendly Web UI to monitor docker daemon http://96.126.127.93:10086
https://github.com/tobegit3hub/seagull
Seagull is the best friend of docker which provides Web UI to monitor docker daemon. The demo site was down when I first looked, but the screenshots look nice. It seems that the demo is working now.

4. Algorithms
Data Structures and Algorithms in Python
https://github.com/prakhar1989/Algorithms
Not very exciting stuff, but it might be useful. Just as it says, it is a collection of data structures and algorithms in Python.

5. pg_shard
PostgreSQL extension to scale out real-time reads and writes http://citusdata.com/docs/pg-shard
https://github.com/citusdata/pg_shard
Sharding helper extension for PostgreSQL. Nuff said, check docs.

6. peru
Maybe sometimes better than copy-paste.
https://github.com/buildinspace/peru
Ah, nice tool. Another approach to the eternal problem of dependencies in your repos. Like "git submodules" but easier. Works with Mercurial and SVN too, not only with git.

7. awesome-public-datasets
A awesome list of (large-scale) public datasets on the Internet. (On-going collection)
https://github.com/caesar0301/awesome-public-datasets
A list of many public (but sometimes not free) datasets on the Internet, for your fun and big data projects.

8. rocket
App Container runtime
https://github.com/coreos/rocket
CoreOS is creating its own container runtime instead of using Docker. Quite a controversial decision, check their blog for an explanation.

9. instavpn
the most user-friendly L2TP/IPsec VPN server
https://github.com/sockeye44/instavpn
A very user-friendly, simple but secure VPN. You need Ubuntu and 512 MB RAM: run curl -sS https://sockeye.cc/instavpn.sh | sudo bash, then browse to http://IP-ADDRESS:8080 or use the CLI to set it up.

10. shapeme
Evolve images using simulated annealing
https://github.com/antirez/shapeme
A small toy from @antirez - it takes a PNG and tries to evolve a bunch of triangles to copy it. Just for fun.

Monday, December 8, 2014

Semi-irregular Sysadmin Ninja's Github Digest (Vol. 16)


OK, I'm still trying to finish my old drafts and return to normal, weekly issues. Let's go!

1. devopsbookmarks.com
Website of devopsbookmarks.com http://www.devopsbookmarks.com
https://github.com/devopsbookmarks/devopsbookmarks.com

A cool new website which tries to collect all modern DevOps tools in one place (open-source and commercial too). And what is most exciting - everyone can participate through Github. :)

2. using-ngxlua-in-upyun
2014 Beijing OSC
https://github.com/timebug/using-ngxlua-in-upyun
It's also not a standalone repo, just the code repo for this presentation from a Chinese conference. If you're interested in Nginx + Lua / OpenResty - check it out, it's quite a good intro to the subject. Don't be afraid, it's in English.


3. sshrc
bring your .bashrc, .vimrc, etc. with you when you ssh
https://github.com/Russell91/sshrc
If you do remote admin tasks on "not your" servers from time to time, you're usually quite frustrated that the working environment there is nothing like your perfectly crafted, precious configs. You can fix that problem with this script, but beware of big Vim plugins - they need to be transferred to your home dir on the remote host during every login.

4. tmux-resurrect
Persists tmux environment across system restarts.
https://github.com/tmux-plugins/tmux-resurrect
It does exactly what was promised - "saves all the little details from your tmux environment so it can be completely restored after a system restart (or when you feel like it). No configuration is required. You should feel like you never quit tmux."

5. Openstackgeek
StackGeek OpenStack Deploy
https://github.com/StackGeek/openstackgeek
"StackGeek provides these scripts and this guide to enable you to get a working installation of OpenStack Icehouse going in about 10 minutes."
Nuff said.

6. weave
The Docker Network
https://github.com/zettio/weave
A very interesting project, the missing part of Docker, really. Networking is still the weakest part of Docker IMO, and this project will help you create virtual networks for your containers.


7. ZeroTierOne
Create flat virtual Ethernet networks of almost unlimited size. https://www.zerotier.com/
https://github.com/zerotier/ZeroTierOne
This project is similar to the previous one, but its main target is "normal" VMs and clouds, not containers. Looks quite mature and feature-rich.

8. msr-cloud-tools
MSR Cloud Tools
https://github.com/brendangregg/msr-cloud-tools
Again, more tools from Brendan Gregg. This time you can check whether your cloud "hardware" supports Turbo Boost, or read the CPU temperature directly from the CPU's MSRs (Model Specific Registers).

9. pcstat
Page Cache stat: get page cache stats for files on Linux
https://github.com/tobert/pcstat
Yes, this tool can show how many memory pages of a given file are in the Linux page cache. Nice to know for tuning DBs, e.g. Cassandra (that's what it was written for). It's not a very new idea, you can use the fadvise tool from https://code.google.com/p/linux-ftools/ too - but the Go code looks prettier IMO.
Also TIL about the mincore(2) syscall, on which both tools are based.
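To show what mincore(2) gives you, here is a minimal Python sketch that maps a file and asks the kernel which of its pages are resident. It is not how pcstat is implemented (that one is written in Go); the function name page_cache_stat and the temporary demo file are my own assumptions, and it is meant for Linux.

```python
import ctypes
import ctypes.util
import mmap
import os
import tempfile


def page_cache_stat(path):
    """Return (cached_pages, total_pages) for a file, using mincore(2)."""
    size = os.path.getsize(path)
    if size == 0:
        return 0, 0
    total = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE

    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mincore.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                             ctypes.POINTER(ctypes.c_ubyte)]
    libc.mincore.restype = ctypes.c_int

    with open(path, "rb") as f:
        # A private copy-on-write mapping counts as writable for Python,
        # which ctypes needs to get its address; the file itself is untouched.
        mapping = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_COPY)
    view = ctypes.c_char.from_buffer(mapping)
    try:
        vec = (ctypes.c_ubyte * total)()
        if libc.mincore(ctypes.addressof(view), size, vec) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        # The lowest bit of each byte tells whether that page is resident.
        cached = sum(byte & 1 for byte in vec)
    finally:
        del view  # release the buffer export so the mapping can be closed
        mapping.close()
    return cached, total


# Demo on a throwaway file that spans three full pages plus one byte.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"x" * (3 * mmap.PAGESIZE + 1))
    tmp.flush()
    cached, total = page_cache_stat(tmp.name)
    print("%d of %d pages cached" % (cached, total))
```

How many pages are reported as cached depends on the state of the machine, so treat the number as a snapshot, not a constant.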

10. lsleases
list assigned ip from any device in your network
https://github.com/j-keck/lsleases

Simple DHCP sniffer - will list all IP/MACs from devices in your network. Could be useful.

11. inspeqtor
Monitor your application infrastructure!
https://github.com/mperham/inspeqtor

The "famous" inspeqtor tool - a modern rewrite of Monit in Go, with extended syntax and a commercially available extension (because of which it was first hit with a DMCA notice by the Monit developers, but they dismissed their claim afterwards).

12. puppet-catalog-diff
Tool to diff Puppet catalogs
https://github.com/acidprime/puppet-catalog-diff

"A tool to compare two Puppet catalogs. While upgrading versions of Puppet or refactoring Puppet code you want to ensure that no unexpected changes will be made prior to doing the upgrade."
A very useful tool for upgrading Puppet between versions, indeed.

13. logsend
Logsend is high-performance tool for processing logs
https://github.com/ezotrank/logsend

"This like Logstash but more tiny and written by Golang. Supported outputs: influxdb, statsd and MySQL". If you need a tool for log processing, but Logstash looks somewhat bloated - check it out.




Sunday, December 7, 2014

A presentation on building a replacement for Graphite with Riemann, InfluxDB and Grafana

Quite controversial presentation IMO:


Yep, Clojure is cool, the JVM has threads, that's nice, InfluxDB rules - but why do we need Riemann then?
We can use cyanite - then we get Clojure, the JVM and Cassandra for storage - or graphite-influxdb - then we get InfluxDB, and who cares about threads and the Python GIL then?
Maybe Logstash / Heka integration is a cool idea, but you can do that with Graphite too...

Graphite scaling and my evaluation of Zipper Graphite stack

Hello!
Many Dev/Ops teams out there are using Graphite – a nice tool for collecting and graphing various metrics from your software and/or hardware. It is a really nice tool and a good example of good architecture – you can check out the Graphite chapter of the famous AOSA book.
But it's also no big secret that although Graphite is a great tool, scaling it is really not an easy task. As long as you run it on a single server, everything is fine; you can easily spread it over a couple of servers, but above that…
·      Problem one. If you are using whisper storage (the default and the only production-ready one), then the only clustering option is the normal Graphite cluster mechanism. But then, after adding or removing nodes, you need to rebalance the cluster, using e.g. the carbonate tool. That's fine, but for a loaded cluster with hundreds of thousands of metrics and tens of servers it could take not hours but days or weeks - and during rebalancing the Graphite cluster will produce very funny results.
·      Problem two. Current Graphite clustering is based on HTTP requests (remote nodes ask each other using the same graphite-web engine) and the current code is far from optimal, especially for aggregating functions across many nodes. Fixing that is in progress: we already have @bmhatfield's patch in the 0.9.x branch (plus a not yet merged parallelization improvement), and another approach from @jraby, which was turned into a patched graphite-web by @datacratic.
·      Problem three is relay / aggregator performance. Graphite relays and aggregators are CPU-bound (unlike carbon-caches, which are IO-bound), and because of Python's GIL a single process can't use more than one core for CPU-intensive calculations. So you need to build some load-balancer-based configuration with many relay processes, which doesn't make things less complex, believe me.
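To get a feeling for why rebalancing hurts (problem one), here is a minimal Python sketch of a consistent hash ring. It is not Graphite's actual hashing code; the node names, the replica count and the MD5-based ring positions are assumptions chosen for illustration. It counts how many metrics change owner when one node joins the cluster, and every moved metric is a whisper file that has to be copied.

```python
import bisect
import hashlib


def _position(key):
    """Map a string to a point on the ring (first 4 bytes of MD5, illustrative)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:8], 16)


class HashRing:
    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring many times to even out the spread.
        self._ring = sorted(
            (_position("%s:%d" % (node, i)), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, metric):
        # Walk clockwise to the first node point at or after the metric.
        idx = bisect.bisect_left(self._points, _position(metric))
        return self._ring[idx % len(self._ring)][1]


def moved_metrics(metrics, old_nodes, new_nodes):
    """Return the metrics whose owning node differs between the two rings."""
    old, new = HashRing(old_nodes), HashRing(new_nodes)
    return [m for m in metrics if old.node_for(m) != new.node_for(m)]


metrics = ["servers.host%d.cpu.user" % i for i in range(10000)]
old_nodes = ["store1", "store2", "store3"]
new_nodes = old_nodes + ["store4"]
moved = moved_metrics(metrics, old_nodes, new_nodes)
fraction = len(moved) / len(metrics)
print("metrics that change owner: %.1f%%" % (fraction * 100))
```

With a ring like this one would expect roughly a quarter of the metrics to move when a fourth node joins, all of them to the new node. That is the cheap case; multiply it by hundreds of thousands of whisper files and you see where the days go.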

So, if you are facing Graphite scaling problems, you have the following options:

1. Migrate to OpenTSDB.
Unfortunately, that will require quite a big effort if you have had your Graphite installation up and running for a while and have many tools and dashboards around it. Also, scaling HBase is a little bit harder than throwing more servers into the pool...

2. Migrate to new storage engines.
I'll explain this one a little bit. Which alternative options do we have for now?
a)    @pyr's cyanite
Quite mature, has some production instances running. It uses Cassandra as storage, which looks like a natural choice for Graphite data because of its tolerance for high write load.
I evaluated it and found out that for our metrics it would require 4x more space for data storage. For us it was a "no-go" then; I also suspected that its scalability was not so great at that time (it was a year ago, now it's much better).
b)    graphite-influxdb
InfluxDB also looks like quite a natural choice for storage - it's a native time series database with built-in sharding and clustering. Dieter Plaetinck wrote this for Vimeo and runs it there in production.
In my tests InfluxDB looked much better as storage - it took almost exactly the same space as whisper, but only for one month of data. That's because InfluxDB still has no retentions with aggregation (current InfluxDB retentions just purge old data without any aggregation) - so if you need to query across a big timespan (e.g. a year), you either have to store all data with high precision (wasting space) or do the aggregation on your own - but of course that will hit performance quite badly. Theoretically you can use continuous queries for aggregation at the InfluxDB level, but support for that is not integrated into graphite-influxdb.
And according to @dieterbe he's running graphite-influx on a single node for now, so I also suspect that scaling it across many nodes could be quite an adventurous journey too.
c)    ceres. Looks abandoned for now. I know that @dkulikovskiy made some changes in his own repo, including a new roll-up mechanism – and Yandex is running that at quite a big scale - but anyway, it doesn't look like the right path to take.
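To make the "retentions with aggregation" point from the InfluxDB option concrete, here is a small Python sketch of a roll-up: 1-minute points are averaged into 1-hour points, which is the kind of downsampling whisper does for older data. The function name and the bucket size are assumptions for illustration; this is neither whisper's nor InfluxDB's code.

```python
from collections import defaultdict


def rollup(points, bucket_seconds=3600):
    """Average (timestamp, value) points into fixed-size time buckets.

    None values (missing samples) are skipped; a bucket without any
    samples is omitted from the result.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        if value is not None:
            buckets[timestamp - timestamp % bucket_seconds].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Two hours of 1-minute samples: value 1.0 in the first hour, 3.0 in the second.
minute_points = [(t, 1.0 if t < 3600 else 3.0) for t in range(0, 7200, 60)]
hourly = rollup(minute_points)
print(len(minute_points), "points ->", len(hourly), "points:", hourly)
```

Without such a roll-up in the storage layer, a one-year graph has to read every high-precision point and aggregate it at query time, which is exactly the performance hit described above.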

So, we're still sticking with whisper, which means no solution for problem one yet for us. But if you are only starting to run your Graphite – maybe it's a good idea to run some new storage in parallel – especially if you already have Cassandra or InfluxDB in production.
What's next? What about the other problems?

We also ran into relay scaling problems quite fast, and after struggling with that a little bit we adopted @markchadwick's Scala-based graphite-relay – I just made it work correctly with Graphite hashing. We lived on that for about a year, but after that it started to consume too much CPU too.
At that time my boss @vlazarenko pointed me to this video from Linux.conf.au 2014 – it's only 20 minutes and worth watching. I found out from it that Booking.com uses Graphite, and uses it under quite a big load. In this video Devdas Bhagat also mentioned that after struggling with relay scalability they developed (and, what is much better, open-sourced) a new and shiny C-based Graphite relay named carbon-c-relay. (Edit: at first I mistakenly named it graphite-c-relay, d'oh... the real name is and always was carbon-c-relay.) We started using it immediately, and from then until now it has worked very well.
Its main contributor, Fabian Groffen, also implemented aggregation and regexes in the last couple of months, so for now carbon-c-relay looks like a complete and pretty sane alternative to the Python-based relay/aggregation daemons. I can find only one downside – its output is still line-based and not pickle-based, but that's completely OK for us. Edit: as @grobian mentioned (and I completely agree with that), pickle is insecure and bloated - the line protocol is better for that case.
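For readers who have not looked at the two carbon wire formats, here is a small Python sketch that encodes the same data point both ways: as a plaintext line ("path value timestamp") and as a length-prefixed pickle of (path, (timestamp, value)) tuples. The metric name and the helper function names are made up for illustration.

```python
import pickle
import struct


def encode_line(path, value, timestamp):
    """Carbon plaintext protocol: one 'path value timestamp' line per point."""
    return ("%s %s %d\n" % (path, value, timestamp)).encode("ascii")


def encode_pickle(points):
    """Carbon pickle protocol: a 4-byte big-endian length header followed by
    a pickled list of (path, (timestamp, value)) tuples."""
    payload = pickle.dumps([(p, (ts, v)) for p, v, ts in points], protocol=2)
    return struct.pack("!L", len(payload)) + payload


points = [("servers.web1.load", 0.42, 1417968000)]
line_message = b"".join(encode_line(*p) for p in points)
pickle_message = encode_pickle(points)
print(line_message)
print(len(line_message), "bytes as text,", len(pickle_message), "bytes pickled")
```

The "insecure" part is on the receiving side: unpickling data from an untrusted sender can execute arbitrary code, while a text line only has to be split and parsed.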

So, the third problem is solved for now.
But I really wondered how Booking.com deals with the second scaling problem, and after following @grobian on GitHub I found out how. It seems that they rewrote most parts of the Graphite stack in Go! And the results are quite impressive – they’re running about 90 backend servers with more than 55TB of whisper files!
And it looks like this project was implemented by only two people - Damian “@dgryski” Gryski and @grobian. I contacted Damian, asked a couple of questions and checked how their solution works. He also said that he and @grobian would write a blog post about their stack, but they still haven't - so I'll try to do so. :)

So, what does a normal Graphite cluster stack look like? I'll take part of Jamie's picture to illustrate:

Did you see those 3 graphite-web servers at the bottom? They’re communicating with each other (or a single frontend talks to all backends), so just imagine what happens if you have 10, 20, 30... backend servers instead of two - the rendering speed will be very low.
Then check out the Booking.com solution (they need some cool name for it; I will call it zipper-stack for brevity). Please also bear with my non-existent painting skills:


See? Graphite-web talks to a single daemon named carbonzipper. It talks to all backends, but not over the “pickle-over-HTTP” protocol but over a new “protobuf-over-HTTP” protocol - so on the backends we have a separate daemon for speaking that, named carbonserver. They also have a special carbonapi daemon - it talks to the zipper daemon too, but it has a subset of Graphite functions re-implemented in Go, so it is blazingly fast - you can switch all your monitoring metrics (which are rendered as text and not PNG) over to it.
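The fan-out idea can be sketched in a few lines of Python: ask all backends in parallel and merge whatever comes back. This is only an illustration under my own assumptions; the real daemons are written in Go and speak protobuf over HTTP, the backend URLs below are made up, and the gap-filling merge is a guess at a sensible strategy, not a description of carbonzipper's code.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(backends, fetch, metric):
    """Ask every backend for the metric in parallel and merge the answers.

    `fetch(backend, metric)` returns a list of values (None = no data)
    or None when the backend does not have the metric at all.
    """
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        responses = list(pool.map(lambda b: fetch(b, metric), backends))

    merged = None
    for values in responses:
        if values is None:
            continue
        if merged is None:
            merged = list(values)
            continue
        # Fill gaps in the merged series from the other copy.
        merged = [a if a is not None else b for a, b in zip(merged, values)]
    return merged


# Stand-in for the HTTP call: two backends hold partial copies of one metric.
FAKE_DATA = {
    "http://backend1:8080": {"app.requests": [1, None, 3, None]},
    "http://backend2:8080": {"app.requests": [1, 2, None, None]},
    "http://backend3:8080": {},
}


def fake_fetch(backend, metric):
    return FAKE_DATA[backend].get(metric)


result = fetch_all(list(FAKE_DATA), fake_fetch, "app.requests")
print(result)
```

The point of the design is that graphite-web makes one request to one local daemon, and the slow part (talking to many backends) happens concurrently in a process that is good at it.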
So, it looks good - let's deploy it.
As we were only doing an evaluation, we did it quick-and-dirty – just compiled the binaries and ran them on a server; for production you’ll need some packaging and configuration, of course. Also, this is not a manual for Graphite installation – I assume that you already have a working Graphite cluster and just want to check how the zipper stack works.
The initial build is easy - go to your build VM, install and set up Go there and run
root@vagrant-ubuntu-precise-64:~# go build github.com/dgryski/carbonapi
root@vagrant-ubuntu-precise-64:~# go build github.com/dgryski/carbonzipper
root@vagrant-ubuntu-precise-64:~# go build github.com/grobian/carbonserver
root@vagrant-ubuntu-precise-64:~# ls -al ~/go/bin/carbon*
total 53552
drwxr-xr-x 2 root root    4096 Nov 28 16:32 .
drwxr-xr-x 5 root root    4096 Aug  8 13:20 ..
-rwxr-xr-x 1 root root 8796848 Oct 28 16:38 carbonapi
-rwxr-xr-x 1 root root 8526040 Oct 28 16:36 carbonserver
-rwxr-xr-x 1 root root 8515904 Oct 28 16:37 carbonzipper

Copy the carbonserver binary to the Graphite backends, and the carbonzipper and carbonapi binaries to your frontend. (If you do not have a separate frontend, just make one - set up a separate server with graphite-web and install the binaries there.)

Go to the backends first and run the servers there - do not forget to run them under the same user as the carbon-caches (e.g. using screen when testing):
~$ ./carbonserver -p=8080 -stdout=true -v=true -vv=true -w="/opt/graphite/storage/whisper"
2014/12/07 16:13:10 starting carbonserver (development build)
2014/12/07 16:13:10 reading whisper files from: /opt/graphite/storage/whisper
2014/12/07 16:13:10 set GOMAXPROCS=12
2014/12/07 16:13:10 listening on :8080

Next, run carbonzipper on the frontend. Create its config file first:
~$ cat zipper.json
{
    "Backends": [
        "http://10.x.y.z1:8080",
        "http://10.x.y.z2:8080",
        "http://10.x.y.z3:8080"
    ]
}
I think you got the pattern. :)
Then run it in debug mode:
~$ ./carbonzipper -c="./zipper.json" -p=8080 -stdout -d=3
2014/12/07 15:58:44 starting carbonzipper (development version)
2014/12/07 15:58:44 setting GOMAXPROCS= 1
2014/12/07 15:58:44 querying servers= [http://10.x.y.z1:8080 http://10.x.y.z2:8080 http://10.x.y.z3:8080] uri= /metrics/find/?format=protobuf&query=%2A
2014/12/07 15:58:44 listening on :8080

Now you need to patch graphite-web a little bit - by default, putting your own IP into CLUSTER_SERVERS is not allowed - and that's exactly what we need to do:

--- a/webapp/graphite/storage.py
+++ b/webapp/graphite/storage.py
@@ -31,7 +31,7 @@
   def __init__(self, directories=[], remote_hosts=[]):
     self.directories = directories
     self.remote_hosts = remote_hosts
-    self.remote_stores = [ RemoteStore(host) for host in remote_hosts if not is_local_interface(host) ]
+    self.remote_stores = [ RemoteStore(host) for host in remote_hosts ]
 
     if not (directories or remote_hosts):
       raise ValueError("directories and remote_hosts cannot both be empty")


Then run your frontend with CLUSTER_SERVERS = ['127.0.0.1:8080'] in local_settings.py.

Everything is prepared; you can test your frontend with normal graphs.
You can check how carbonapi works too, but first you need to check which functions are re-implemented in Go in https://github.com/dgryski/carbonapi/blob/master/expr.go:

~$ ./carbonapi -z="http://localhost:8080" -stdout=true -p=9090 -tz="Europe/Amsterdam,3600"
2014/12/07 16:01:27 starting carbonapi (development build)
2014/12/07 16:01:27 using zipper http://localhost:8080
2014/12/07 16:01:27 using fixed timezone Europe/Amsterdam, offset 3600
2014/12/07 16:01:27 listening on port 9090

That's mostly it.
But I want to mention one important thing. When I tested my graphs on the zipper-stack with curl, the rendering speed was quite good. But when I looked at my last-hour graph generated by the zipper-stack instance with my own eyes, it looked like this:


The normal one looks like this:

Huh? Do you know what happened there? I do, unfortunately.
As I mentioned, Graphite is a really good piece of software and uses quite good engineering solutions to make things work under quite a big load in pure Python, not even using C. And the best-known part of this trick is named carbon-cache. When you put metrics into Graphite, they usually aren't flushed to disk instantly but go to RAM instead, via the carbon-cache daemon, which keeps them in memory, flushes them to disk periodically, and answers queries from graphite-web, so that on-disk and in-memory results get merged.
As you can see in my diagram, there are no more lines from carbonzipper to the carbon-caches in the zipper stack. That's right - carbonserver just reads whisper files from disk and doesn't ask the carbon-caches! It seems that on the Booking.com instances the disk flushing time for every single metric is below 60 seconds, so for “1-minute retention” whisper files (which both they and we are using) every file is updated and fresh after each minute. But our Graphite installation is different - we are using SAN disks and plenty of RAM instead of SSDs, so for us it takes up to 40 minutes to flush a metric to disk, and that’s why we have that gap in the “1-hour” graphs…
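The effect is easy to reproduce on paper. Below is a small Python sketch with made-up numbers that match the story above: one hour of 1-minute points, of which the last 40 minutes are still sitting in the cache. The function names are hypothetical and this is not Graphite code; it only shows what a reader sees with and without the in-memory points.

```python
def merge_with_cache(disk_points, cached_points):
    """Overlay not-yet-flushed points from memory on top of points read from disk."""
    merged = dict(disk_points)
    merged.update(cached_points)
    return merged


def render(points, start, end, step=60):
    """Return the series for [start, end) with None where no point exists."""
    return [points.get(t) for t in range(start, end, step)]


# One hour of 1-minute data; the last 40 minutes are still waiting in the cache.
start, end = 0, 3600
flushed_until = 20 * 60
disk = {t: 1.0 for t in range(start, flushed_until, 60)}
cache = {t: 1.0 for t in range(flushed_until, end, 60)}

disk_only = render(disk, start, end)
with_cache = render(merge_with_cache(disk, cache), start, end)
print("disk only: %d of %d points missing" % (disk_only.count(None), len(disk_only)))
print("with cache: %d of %d points missing" % (with_cache.count(None), len(with_cache)))
```

Reading only the files gives a graph where two thirds of the last hour is empty; merging in the cached points gives the full line.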
So for us it doesn't look like a viable solution, alas! Or we need to implement a carbon-cache interface for carbonserver in Go. But at the moment when I tested the zipper-stack, we were running an internal version of the hack which shortly afterwards became PR #1010, and it works quite well for now, so maybe I'll return to this later. :)
Edit: @grobian is working on a Go-based write daemon now - https://github.com/grobian/carbonwriter - so maybe it could be combined with carbonserver to make a solution similar to carbon-cache.
But YMMV, of course, just try zipper-stack - it looks very good and promising. 





Monday, December 1, 2014

MySQL Replication: What’s New in MySQL 5.7 and Beyond

Very interesting presentation about current and upcoming features of replication in MySQL 5.7.x:



Sunday, November 23, 2014

Semi-irregular Sysadmin Ninja's Github Digest (Vol. 15)

Hi All,

Let me introduce Volume 15 of Sysadmin Ninja's Github Digest! (As usual, in no particular order.)
But why volume 15, if the previous one was 13, and where are 11 and 14 then?
Well, because 11 and 14 are still in my drafts, and although I doubt that I'll ever publish them... well, let's at least be consistent - I'll try to continue publishing the digest on a weekly basis.

1. KeyBox.
A web-based SSH console that executes commands on multiple shells. KeyBox allows you to manage keys, share terminal commands, and upload files to multiple systems simultaneously. http://sshkeybox.com
https://github.com/skavanagh/KeyBox
"A web-based ssh console to execute commands and manage multiple systems simultaneously. KeyBox allows you to share terminal commands and upload files to all your systems. Once the sessions have been opened you can select a single system or any combination to run your commands. Additional system administrators can be added and their terminal sessions and history can be audited. Also, KeyBox can manage and distribute public keys that have been setup and defined."
Can be useful for a small team of sysadmins distributed across the globe.

2. FNordmetric
FnordMetric allows you collect and visualize timeseries data with SQL. http://fnordmetric.io
https://github.com/paulasmuth/fnordmetric
Hot stuff. A client-server application which "aims to be a StatsD+graphite competitor, it implements a wire compatible StatsD API". The main idea behind it is that you write your query in an SQL-like language called "ChartSQL" and get your graph in SVG format. For now it has quite a big number of graphing modes, but almost no functions. Written in C++, so quite fast, but good luck with porting the Graphite function library. Maybe it's a good idea to port Graphite as a backend for it. Looks promising anyway.

And talking about Graphite -
3. graphite-stresser
A stress testing tool for Graphite
https://github.com/feangulo/graphite-stresser
Nothing unusual, but a nice tool if you want to stress test your Graphite instance. Check the author's blog entry for details.

4. pstop
pstop - a top-like program for MySQL
https://github.com/sjmudd/pstop
"pstop is a program which collects information from MySQL 5.6+'s performance_schema database and uses this information to display server load in real-time." For example, you can get IOPS for innodb file, or locks / operations / latencies per table. At least, you can start using performance schema in your MySQL 5.6 instance for something useful. Another useful P_S tool is "sys schema", you can read recent entry in Percona blog about it.

5. Consul-template
Generic template rendering and notifications with Consul
https://github.com/hashicorp/consul-template
Quite a recent addition to the service discovery tool Consul.io - now you can use it for any service which doesn't understand service discovery through DNS - you can render config file templates for that service and reload it when Consul sees a configuration change. Again, the blog post is better than 1000 words.

6. VCLfiddle
VclFiddle is hosted at http://www.vclfiddle.net/
https://github.com/vclfiddle/vclfiddle

"VclFiddle is an online tool for experimenting with the Varnish Cache HTTP reverse-proxy in a sandboxed environment. The name comes from a combination of the Varnish Configuration Language (VCL) and another tool that inspired this project, JSFiddle."
I.e. you can edit your VCL config online, using a web editor, and check how it caches your website.

7. Rancher.io
Rancher is an open source project that provides infrastructure services designed specifically for Docker. http://www.rancher.io
https://github.com/rancherio/rancher
Quite an ambitious project for creating an AWS-like environment, but for Docker containers.

8. Atlas
A high-performance and stable proxy for MySQL
https://github.com/Qihoo360/Atlas
Another MySQL proxy. Well... Personally I have never seen any production running on a MySQL proxy solution (even on MySQL Fabric), but a Chinese company named Qihoo360 developed this one and insists that it's running on their production infrastructure.

9. Bosun
An advanced, open-source monitoring and alerting system by Stack Exchange http://bosun.io
https://github.com/bosun-monitor/bosun
Another Graphite competitor - an OpenTSDB-backed service with its own system metrics collector, scollector, and a graphing and alerting interface. Written in Go. Looks neat and scalable.


And speaking about Go -
10. Go-opstocat
Collection of Ops related patterns for Go apps at GitHub.
https://github.com/github/go-opstocat
and
11. Delve
Delve is a Go debugger, written in Go.
https://github.com/derekparker/delve

Going further.
12. bup
Very efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images).
https://bup.github.io/

https://github.com/bup/bup
Interesting new backup tool, quite green and fresh, but looks promising.

13. osquery
SQL powered operating system instrumentation, monitoring, and analytics.
http://osquery.io

https://github.com/facebook/osquery
Facebook open-sourced this tool quite recently. The idea looks very promising - again, present the system state as SQL tables on which you can run queries, from an interactive console or automatically, as a daemon.


And fun section
15. C4
C in four functions
https://github.com/rswier/c4
A C compiler in 500 lines of C. Reading its sources is quite fun. :)

16. Gravity
An orbital simulation game written in Elm
https://github.com/stephenbalaban/Gravity
You can play it here.

17. Convergence
Python/OpenCl Cellular Automata design & manipulation tool
https://github.com/InfiniteSearchSpace/PyCl-Convergence
Looks like fun, but I can't run it on my Mac for some reason, so no screenshots.

Sunday, November 16, 2014

Semi-irregular Sysadmin Ninja's Github Digest (Vol. 12)

Let's continue with the review of the most interesting Github projects.
N.B. this is vol. 12 - from quite an old draft, I hope it's still relevant :)
Sorry for the short comments, most projects speak for themselves.

1. changelog
"What's changed in the last twenty minutes?"
https://github.com/prezi/changelog

The basic idea is that you send any event that has even a remote chance of causing problems to a simple web service with a REST interface. Later, when something goes wrong, you can quickly check what's changed in the last minutes / hours.
Its interface looks like this:



Very nice idea, indeed, especially combined with some graphing solution with anomaly detection.
Clients for bash/python/java/etc. already exist.


2. db-readings
Readings in Databases
https://github.com/rxin/db-readings

A list of papers essential to understanding databases and building new data systems. Very nice reading for all sysadmins and developers. Only one problem - where can I find enough time for that? :)

3. bitcoinbook
Mastering Bitcoin - Unlocking digital currencies - Early Release Draft
https://github.com/aantonop/bitcoinbook

Another "must-read" book for everyone who is interested in digital currencies - it's about Bitcoin internals and technical implementation. An early release of the book is also available from O'Reilly.

4. github-cheat-sheet
A collection of cool hidden and not so hidden features of Git and GitHub
https://github.com/tiimgreen/github-cheat-sheet

Good stuff, even for experienced Git-hubbers.

5. Awesome Chef
A curated list of amazingly awesome Chef resources
https://github.com/obazoud/awesome-chef/

Where's our "Awesome Puppet"?

Ah, there it is:
6. Awesome Puppet
A curated list of amazingly awesome puppet resources
https://github.com/olindata/awesome-puppet


7. mcrouter
Mcrouter is a memcached protocol router for scaling memcached deployments.
https://github.com/facebook/mcrouter
Another nice piece of software from the Facebook Engineering team. A must-have if you're (still) using memcached on more than a couple of servers.


8. masscan
TCP port scanner, spews SYN packets asynchronously, scanning entire Internet in under 5 minutes
https://github.com/robertdavidgraham/masscan


9. powa
PostgreSQL Workload Analyzer http://dalibo.github.io/powa
https://github.com/dalibo/powa


10. elasticsearch-HQ
Monitoring and Management Web Application for ElasticSearch instances and clusters.
https://github.com/royrusso/elasticsearch-HQ

Also a nice tool for all ElasticSearch users.

Saturday, September 27, 2014

Semi-irregular SysadminNinja's Github Digest (vol. 13)

Hi All,

Third part of digest, most current one!
Two more to come.

1. gogs
Gogs(Go Git Service) is a painless self-hosted Git Service written in Go.
http://gogs.io
https://github.com/gogits/gogs

Another Github clone, but written in Go. Worth checking, if you need one.

2. pyston
An open-source Python implementation using JIT techniques.
https://tech.dropbox.com/2014/04/introducing-pyston-an-upcoming-jit-based-python-implementation/
https://github.com/dropbox/pyston

New Python implementation, blazingly fast, looks promising, recommended by Guido himself :).

3. profiling
An interactive Python profiler.
https://github.com/what-studio/profiling
Nice looking Python profiler, still under development but looks good:

4. blackbox
Safely store secrets in Git/Mercurial http://the-cloud-book.com
https://github.com/StackExchange/blackbox

Another solution for storing secrets in your repos, but now from StackExchange and @yesthattom himself. Fresh and professional looking.

5. sf1r-lite
Search Formula-1 - A distributed high performance massive data engine for enterprise/vertical search
https://github.com/izenecloud/sf1r-lite
Vertical search engines are quite rare in the open-source landscape. This is a new one from a Chinese startup, check the tech docs in English for details.

6. mannaggia
automatic saint calling for depressed Veteran unix Admins, in italian
https://github.com/LegolasTheElf/mannaggia
Hah, funny thing. "Automatic saint calling for depressed Veteran unix Admins, in italian
Developed in italian, can be easily adapted in other languages."
TIL what "mannaggia" means. :)

7. shellshock_poc
Shellshock DHCP RCE Proof of Concept
https://github.com/mschwager/shellshock_poc
Proof of concept of a Shellshock attack via DHCP. See https://www.trustedsec.com/september-2014/shellshock-dhcp-rce-proof-concept/ for details.

8. bash_shellshock 
Wrapper for /bin/bash that mitigates 'shellshock' 
If you do not trust the current Shellshock fixes, you can use your own wrapper to secure your server(s).



So, welcome to new SysadminNinja's house!

Hi All!

My old domain suddenly expired, so I decided to make some changes to my blog. So now it's named "Sysadmin Ninja's Blog" and it's located at the neat and fancy URL http://iamsysadmin.ninja ! (Yes, I know, but I was a little bit slowpoke-y and most of the yummy domains in the .ninja TLD were already taken.)
And I did some blog redesign - I hope the current theme is not too dark... Maybe I'll become a "light side" ninja soon. :)
So, it's time to continue our GitHub digest. The last one was more than a month ago, but I was quite busy with my move, sorry. But we have a fancy name now - "Semi-irregular Sysadmin Ninja's Github Digest"!
I had a ton of interesting repos for (more than) the last month, so I decided to split them into three volumes and will publish them this weekend.

Saturday, August 16, 2014

Semi-irregular Interesting Github Repos Ops Digest #10 (August 15)

Hi All,

A new name for this digest came to my mind - Semi-irregular! It sounds good to me, let's stick with that.
I wanted to celebrate Issue #10, but I found that there were no really cool projects to show for the week - maybe because of summer/vacation time, I don't know. So I waited one more week, and now -

Semi-irregular Interesting Github repos (for) Ops Digest, issue 10:

First, fresh docker projects:

1. docker-cheat-sheet
Docker Cheat Sheet
https://github.com/wsargent/docker-cheat-sheet

Good cheat sheet for Docker, must read.

2. flocker
Easily manage Docker containers and their data https://clusterhq.com
https://github.com/ClusterHQ/flocker

Open-source Docker orchestration service. Again, not much to say, please try the tutorial.

3. flynn
A next generation open source platform as a service (PaaS) https://flynn.io
https://github.com/flynn/flynn

Ah, that's a more complicated thing, but it looks very promising - a layered open-source PaaS platform, check it out.

4. github-trending
Tracking the most popular Github repos, updated daily
https://github.com/josephyzhou/github-trending

Self-explanatory: it tracks the most popular Github repos and updates a list of top objective-c, javascript and go repos in markdown output daily. You can pick other languages and run your own tracker if you want.

5. cloud-ssh
Cloud enhanced SSH client replacement with host auto-completion
http://leonsbox.com/cloud-ssh/
https://github.com/buger/cloud-ssh

Cloud enhanced SSH client replacement with host auto-completion, especially nice for AWS hosts.

6. cpubars
Lightweight terminal-based multicore CPU usage monitor
https://github.com/aclements/cpubars

cpubars is a simple terminal-based tool for monitoring CPU load in real-time, especially tailored to monitoring large multicores over SSH.


7. Password-Repo
https://github.com/Tek-Security-Group/Password-Repo

Quite a good collection of password lists, which can help make your services more secure - I mean, if you forbid users from using those obvious passwords.

8. mooltipass
Github repository dedicated to the mooltipass project
https://github.com/limpkin/mooltipass

Very nice project indeed, I like it very much! Arduino-based hardware password storage, powered by a smartcard - wow! Fully open-sourced. The prototype looks like this:


The creators are selling it for $80, but a real paranoid needs to build it on their own, of course.
More and more sysadmin tools are being written in Go. This is a collection of Go stuff for ops engineers, check it out.
And a useful port of the psutil library for Go:
10. gopsutil
psutil for golang https://github.com/shirou/gopsutil

Also, some Vim plugins were picked up by the Github radar this week:

11. vim-plug
Minimalist Vim Plugin Manager https://github.com/junegunn/vim-plug

New Vim plugin manager, I like it:



12. vim-grammarous
A powerful grammar checker for Vim using LanguageTool.
https://github.com/rhysd/vim-grammarous

"vim-grammarous is a powerful grammar checker for Vim. Simply do :GrammarousCheck to see the powerful checking"

13. vim-go
Go development plugin for Vim https://github.com/fatih/vim-go

"Full featured Go (golang) support for Vim. It comes with pre-defined sensible settings (like auto gofmt on save), has autocomplete, snippet support, improved syntax highlighting, go toolchain commands, etc... It's highly customizable and has settings for disabling/enabling features easily." Nuff said.

Fun section:
14. carp
"interesting" VM in C. Let's see how this goes. 
https://github.com/tekknolagi/carp
Sample implementation of a virtual machine in C. Check it out if you want to find out how a VM works.

15. sunfish
Sunfish: a Python Chess Engine in 111 lines of code
https://chessprogramming.wikispaces.com/Sunfish
https://github.com/thomasahle/sunfish


"Sunfish is a simple, but strong chess engine, written in Python, mostly for teaching purposes. Without tables and its simple interface, it takes up just 111 lines of code!
The clarity of the Sunfish code provides a great platform for experimenting, be it with evaluation functions, search extensions or anything. Fork it today and see what you can do!"

16. anagramatron
twitter anagram hunter http://anagramatron.tumblr.com/
https://github.com/cmyr/anagramatron

"Anagramatron hunts for pairs of tweets that use the exact same set of letters.
This script connects to the twitter stream. When it receives a new tweet it runs it through some filters, ignoring tweets that contain things like links or @mentions, or that contain less then a minimum number of characters."
You can check its results at http://anagramatron.tumblr.com/

Small disclaimer: I'm just a big fan of Github, and since we have so many sources of information around, I picked Github as the source for my digest of interesting open-source projects. And because I'm an Ops/Sysadmin person, I choose projects which are somewhat ops-related, but not necessarily. I'm not a GitHub employee and this digest is not connected with GitHub in any way.

Thursday, August 7, 2014

Google hiring task (first 10-digit prime in e) in bash one-liner

Hi All,
I was on a short vacation, and according to my vacation rules I try not to think about work-related stuff, or any computer-related stuff at all. I was not completely successful with that, thanks to my smartphone :) but I tried. During my train trip I got quite bored, so I started thinking about a task which my friend and fellow colleague master Victor had mentioned - the famous photo of the billboard which Google used for hiring purposes back in 2004 -

And master Victor said - "assuming you know about the e constant, solve in a Bash one-liner to get a job @Google :)" - and I was thinking about that during my trip. And I succeeded - with some remarks, of course.
Disclaimer: IMHO solving this task in Bash is NOT GOOD, because it violates the first principle of engineering - "choosing the right tool for the job" - and bash is definitely not the right tool for this task.
So, first I tried to find out whether the task could be solved in "pure bash", and I found that it could not. You can check this thorough investigation to find out that calculating e with the required precision is not possible in bash because of integer overflow.
So, let's "cheat" - i.e. use some tools together with bash. Which tools are suitable and do not count as cheating? I think bc is absolutely required for the math operations, and other unix tools like sed and coreutils are also OK.
Let's divide the task into pieces:

1. Generate e with enough precision (1000 digits).
I found that you can easily do it with bc (shown here with scale=100 to keep the output short):
root@ubuntu:~# echo "scale=100;e(1)" | bc -l
2.718281828459045235360287471352662497757247093699959574966967627724\
0766303535475945713821785251664274
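
Note that bc wraps long numbers with a backslash and a newline, as the backslash in the output above shows. A minimal sketch of getting the digits as one unbroken string, assuming bc and tr are installed; the function name e_digits is made up here and is not from the original script:

```shell
#!/usr/bin/env bash
# e_digits is a hypothetical helper name, not part of the original post.
# Print e with the requested number of decimal places on a single line.
e_digits() {
    local scale=${1:-1000}
    # bc -l loads the math library, where e(x) is the exponential function;
    # tr removes the backslash-newline pairs bc uses to wrap long numbers.
    echo "scale=${scale}; e(1)" | bc -l | tr -d '\\\n'
}
```

The result of e_digits 1000 can then be stored in a bash variable for the next step.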

2. We need to go through the digits of e, take 10 digits at a time to check, then move one digit forward and repeat, and so forth. We could do it using a temporary file and the "cut -cX-Y" syntax, but I found that using a bash variable and bash string manipulation (${string:X:10}) is much more convenient.
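
The sliding window from step 2 can be sketched with bash substring expansion. This is only an illustration; the function name windows and the sample string are made up here and are not digits of e:

```shell
#!/usr/bin/env bash
# windows is a hypothetical helper: print every WIDTH-character window
# of a string, moving one character forward each time.
windows() {
    local s=$1 width=${2:-10} i
    for (( i = 0; i + width <= ${#s}; i++ )); do
        echo "${s:i:width}"   # ${string:offset:length} is bash substring expansion
    done
}

# Sample input: a 12-character string has three 10-character windows.
windows 123456789012
```

Each printed line is one candidate number to be checked for primality.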

3. We need to check whether the current number is prime or not. We could do it in "pure bash" too, but I found that we can use the infamous factor tool from coreutils for that task:

root@ubuntu:~# factor 100
100: 2 2 5 5
root@ubuntu:~# factor 177
177: 3 59
root@ubuntu:~# factor 179
179: 179

As you can see, the factor tool factorizes any integer that is not too big, and a prime is printed with itself as its only factor - that's quite enough for our purposes.
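
Since factor prints a prime as "N: N", a primality test can be a simple word count on its output. A small sketch, assuming factor from GNU coreutils; is_prime is a name chosen here for illustration:

```shell
#!/usr/bin/env bash
# is_prime is a hypothetical helper built on coreutils' factor.
# factor prints "N: f1 f2 ..."; for a prime N that is exactly two words.
is_prime() {
    [ "$(factor "$1" | wc -w)" -eq 2 ]
}

for n in 100 177 179; do
    if is_prime "$n"; then
        echo "$n is prime"
    else
        echo "$n is not prime"
    fi
done
```

Of the three sample numbers, only 179 is reported as prime, matching the factor output shown above.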
So, let's glue everything together:
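
A minimal sketch of how the three pieces could be combined; this is not necessarily the script from the original post. All function names are made up for illustration, and it assumes bash, bc and coreutils (factor, tr) are available:

```shell
#!/usr/bin/env bash
# All function names below are hypothetical, chosen for this sketch.

# e with SCALE decimal places on one line (bc wraps long numbers with "\").
e_digits() { echo "scale=${1:-1000}; e(1)" | bc -l | tr -d '\\\n'; }

# factor prints "N: N" for a prime, i.e. exactly two words.
is_prime() { [ "$(factor "$1" | wc -w)" -eq 2 ]; }

# Print every 10-digit prime found in consecutive digits of the argument.
prime_windows() {
    local digits=${1//[!0-9]/}    # drop the decimal point
    local i w
    for (( i = 0; i + 10 <= ${#digits}; i++ )); do
        w=${digits:i:10}
        [[ $w == 0* ]] && continue    # leading zero: not a 10-digit number
        if is_prime "$w"; then echo "$w"; fi
    done
}
```

With these defined, prime_windows "$(e_digits 1000)" lists all matches, and piping it through head -n 1 keeps only the first one.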

And output is:
7427466391
7413596629
6059563073
3490763233
2988075319
1573834187
7021540891
5408914993
6480016847
9920695517
1838606261
6062613313
3845830007
1692836819
4425056953
2505695369
5490598793
1782154249
8215424999
9229576351
9519366803
5193668033
1825288693
8294887933
1730123819
4039701983
4804295311
8194558153
9455815301
1332069811
6181881593
1881593041
5930416903
1934580727
3858942287
4841984443
1978623209
3140934317
3640546253
8887070167
7683964243
4563549061
The answer is 7427466391. And we can also fold it into a one-liner, of course:



Mission accomplished. :)

Sunday, August 3, 2014

Random Ops Github Digest #9 (August 3rd 2014)

Hello, my fellow readers!
I decided not to wait until tomorrow and to make a new digest right now - I'm still not sure if this digest will be weekly or more random and sporadic. :)

So, let's start, again, my favourite Github projects (sometimes Ops-like) in no particular order.

1. cheat
cheat allows you to create and view interactive cheatsheets on the command-line.
https://github.com/chrisallenlane/cheat

Like the man command, but you get a small cheatsheet with just the commands instead of a big man page. Still not sure whether it's a good thing or not...

2. awesome-django
A curated list of awesome Django apps and projects.
https://github.com/rosarior/awesome-django

Another Awesome list, but for Django developers. Can be useful for Django Ops too.

For you, fellow Puppeteers:

3. r10k
Smarter Puppet deployment, powered by killer robots
https://github.com/adrienthebo/r10k

Relatively new tool for Puppet installations, useful if you are using librarian and dynamic environments.

4. hiera-consul
Hiera backend plugin for Consul
https://github.com/lynxman/hiera-consul

If you have a working Consul cluster, you can get/set values there directly from Hiera.

Going to DevOps stuff:

5. terraform
Terraform is a tool for building, changing, and combining infrastructure safely and efficiently. http://www.terraform.io
https://github.com/hashicorp/terraform

"Infrastructure-as-code" tool - describe your infrastructure, then deploy and control it on different providers. It's not a Puppet replacement, though - http://www.terraform.io/intro/vs/chef-puppet.html - it's just a more high-level tool.

6. infrataster
Infrastructure Behavior Testing Framework http://infrataster.net
https://github.com/ryotarai/infrataster

Another "Infrastructure-as-code" tool - it tests the infrastructure's behaviour from outside the servers:



7. devstep
Development environment builder powered by Docker and buildpacks
https://github.com/fgrehm/devstep

A dead simple, no frills development environment builder that is based around a simple goal:
"I want to git clone and run a single command to hack on any software project."
For more information please check http://fgrehm.viewdocs.io/devstep

8. libcloud-vagrant
Apache Libcloud compute provider for local Vagrant boxes
https://github.com/carletes/libcloud-vagrant

"libcloud-vagrant is a compute provider for Apache Libcloud which uses Vagrant to create VirtualBox nodes. With libcloud-vagrant installed, you could prototype a small cluster on your laptop, for instance, and then deploy it later on to Amazon, Rackspace, or any of the other clouds supported by Libcloud."

Rest of things:

9. seL4
seL4 microkernel
https://github.com/seL4/seL4

Freshly open-sourced L4-based microkernel, which has a formal verification proof. Not really a practical thing, but interesting - check its FAQ if curious.

10. znapzend
zfs send/receive backup system http://www.znapzend.org
https://github.com/oetiker/znapzend/

Cool new backup system specifically for ZFS!

11. pghero
Database insights made easy
https://github.com/ankane/pghero
Personally, I haven't worked with a more or less loaded PostgreSQL instance for quite a long time. But if you do - please check it out - it's a nice web dashboard for pg:


Today's fun section is quite big:

12. cool-old-term
A good looking terminal emulator which mimics the old cathode display
https://github.com/Swordifish90/cool-old-term 

Looks cool - but I'm not sure it's OK for work:




13. ld_preload-sounds
Generates raw WAV output by hooking malloc() and read().
https://github.com/gordol/ld_preload-sounds

You can now hear what your compilation sounds like!

14. conway
A real-time, persistent, multiplayer version of Conway's Game of Life
https://github.com/drewblaisdell/conway

You can roll out your own server and play with friends or play with strangers right now on http://lifecompetes.com

