ApacheCon

Clustered Logging with mod_log_spread - Theo Schlossnagle

Spread background: a multicast technology that creates a ring of hosts (a spread) that can each push packets to the entire ring. This results in a huge savings on network traffic, as the packets are not repeated for each host.

Aggregating logs is a hassle, but you've got to do it. You need all the data, ordered by time, and it's nice to group log entries into sessions so you can know user paths and behavior.

This is tough on clusters, as anyone who has grepped logs on cat01, 04, 06 knows ;)
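For a sense of the manual chore being replaced, here is a minimal sketch of merging per-host logs into one time-ordered stream. It assumes each host's log is already sorted and that every line starts with an ISO-8601 timestamp (real Apache logs use a different timestamp format); the function names and sample lines are made up for illustration.

```python
import heapq
from datetime import datetime

def parse_time(line):
    # Assumption: the line begins with an ISO-8601 timestamp, then a space.
    return datetime.fromisoformat(line.split(" ", 1)[0])

def merge_logs(*per_host_lines):
    """Merge already time-sorted per-host logs into one time-ordered list."""
    return list(heapq.merge(*per_host_lines, key=parse_time))

# Hypothetical sample lines from two cluster members.
cat01 = [
    "2000-01-01T10:00:01 cat01 GET /a",
    "2000-01-01T10:00:05 cat01 GET /c",
]
cat04 = [
    "2000-01-01T10:00:03 cat04 GET /b",
]

merged = merge_logs(cat01, cat04)
for line in merged:
    print(line)
```

This only works after the fact, once every host's file has been collected, which is exactly why real-time metrics are out of reach with this approach.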

Also, real-time metrics are never available, and they sure are nice.

Fixes: Network-enabled syslogs are not bad, but they result in lots of TCP traffic. Passive logging (sniffing, basically) loses packets, and SSL traffic makes it prohibitive.

mod_log_spread is closer to network-enabled syslog, but it uses the Spread concept to do multicast, so no extra packets are generated, even if the number of cluster members/logging boxes/monitors/etc. increases.

Spread config tips: It actually uses two ports, the one specified and the next one, for example TCP/UDP 4913 and UDP 4914. Also, when listing the hosts in the Spread, you must list them in the same order on every member of the ring!
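The two tips above can be turned into a quick sanity check. This is a sketch only: it does not read real Spread config files, and the helper names and the dictionary of host lists are hypothetical stand-ins for whatever each member's config actually contains.

```python
def spread_ports(base_port):
    """Return the two ports described above: the configured one and the next."""
    return (base_port, base_port + 1)

def hosts_in_same_order(configs):
    """configs maps a member name to the host list as written in its config.

    Returns True only if every member lists the hosts in the identical order.
    """
    orderings = list(configs.values())
    return all(o == orderings[0] for o in orderings[1:])

# Hypothetical host lists as they might appear on three ring members.
configs = {
    "cat01": ["cat01", "cat04", "cat06"],
    "cat04": ["cat01", "cat04", "cat06"],
    "cat06": ["cat04", "cat01", "cat06"],  # same hosts, different order
}

print(spread_ports(4913))
print(hosts_in_same_order(configs))
```

A check like this is handy for firewall rules (open both ports) and for catching a member whose host list was edited by hand.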

Replace mod_log_config with mod_log_spread in Apache on your content servers.

Then add log-consumer boxes to the ring: loggers, and monitors if desired. spreadlogd is perfect and stable for the logging task, and allows for customizable logging methods (file, SQL, etc.).

Now that you have real-time aggregated statistics, you can join the Spread ring however you like, say with a super cute little Cocoa app that shows real-time hits per second by response code, cluster member, user, etc., at no extra cost to the logging system at all.
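The counting side of such a monitor might look like the sketch below. It leaves out the part that actually joins the ring and just takes log lines from any iterable, assuming they arrive in Common Log Format; the regular expression, function name, and sample lines are illustrative, not anything shown in the talk.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
CLF = re.compile(r'^(\S+) \S+ (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')

def hits_per_second(lines):
    """Count hits keyed by (timestamp, status); CLF timestamps have 1s resolution."""
    counts = Counter()
    for line in lines:
        m = CLF.match(line)
        if m is None:
            continue  # skip anything that is not a CLF line
        timestamp, status = m.group(3), m.group(5)
        counts[(timestamp, status)] += 1
    return counts

# Hypothetical lines as a monitor might receive them from the ring.
sample = [
    '10.0.0.1 - alice [01/Jan/2000:12:00:00 +0000] "GET / HTTP/1.0" 200 512',
    '10.0.0.2 - - [01/Jan/2000:12:00:00 +0000] "GET /x HTTP/1.0" 404 0',
    '10.0.0.1 - alice [01/Jan/2000:12:00:00 +0000] "GET /a HTTP/1.0" 200 90',
    '10.0.0.3 - - [01/Jan/2000:12:00:01 +0000] "GET / HTTP/1.0" 200 512',
]

counts = hits_per_second(sample)
for (second, status), n in sorted(counts.items()):
    print(second, status, n)
```

Keying on group 1 (host) or group 2 (user) instead of the status gives the per-client and per-user views mentioned above.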

It is now advisable to write your own log handler, so you can include whatever neat information you want in your logs, allowing you to analyze the ratio of response codes, unequal file sizes across cluster members, user paths in real time, etc.
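Two of those analyses are easy to sketch once the handler has parsed each entry. The record layout here (a dict with `member`, `path`, `status`, and `size`) is an assumption for illustration; a real handler would produce whatever fields your log format carries.

```python
from collections import Counter, defaultdict

def status_ratios(records):
    """Return the fraction of hits for each response code."""
    counts = Counter(r["status"] for r in records)
    total = sum(counts.values())
    return {status: n / total for status, n in counts.items()} if total else {}

def unequal_sizes(records):
    """Return paths whose successful responses differ in size between members."""
    sizes = defaultdict(dict)  # path -> {member: last size seen}
    for r in records:
        if r["status"] == 200:
            sizes[r["path"]][r["member"]] = r["size"]
    return {
        path: by_member
        for path, by_member in sizes.items()
        if len(set(by_member.values())) > 1
    }

# Hypothetical parsed log records from two cluster members.
records = [
    {"member": "cat01", "path": "/logo.gif", "status": 200, "size": 1024},
    {"member": "cat04", "path": "/logo.gif", "status": 200, "size": 2048},
    {"member": "cat01", "path": "/index.html", "status": 200, "size": 500},
    {"member": "cat04", "path": "/missing", "status": 404, "size": 0},
]

print(status_ratios(records))
print(unequal_sizes(records))
```

A path showing up in `unequal_sizes` usually means one member is serving a stale or different copy of the file.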

WAN solution: run a Spread ring for each data center, and have one box that is a member of both, forwarding the packets one way through one pipe.
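The one-way forwarding idea can be modeled in a few lines. This is a toy in-memory stand-in, not the Spread API: the `Ring` class and `bridge_one_way` function are invented here purely to show why forwarding in a single direction keeps messages from looping between data centers.

```python
class Ring:
    """Toy stand-in for one data center's ring (not the real Spread API)."""

    def __init__(self, name):
        self.name = name
        self.listeners = []

    def join(self, callback):
        # A member is modeled as a callback that receives every message.
        self.listeners.append(callback)

    def multicast(self, message):
        for deliver in list(self.listeners):
            deliver(message)

def bridge_one_way(source, destination):
    """Forward everything heard on source to destination, never the reverse."""
    source.join(destination.multicast)

remote = Ring("remote-dc")
central = Ring("central-dc")

central_log = []
remote_log = []
central.join(central_log.append)  # logger in the central data center
remote.join(remote_log.append)    # logger in the remote data center

bridge_one_way(remote, central)

remote.multicast("hit from remote")
central.multicast("hit from central")

print(central_log)
print(remote_log)
```

The central logger sees traffic from both sites while the remote ring only ever carries its own, so the WAN pipe is used in one direction only.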

This looks freaking amazing. I cannot get the image of realtime user paths updating in an ncurses display out of my mind: WebQ Home -> Summary: Bio 101 -> Edit Survey ...

Apache Performance - Rich Bowen

This was basically an overview of the tools available and the hot spots to look for in optimizing Apache.

Tools:

Hotspots:

Chat with Rich (speaker for a couple of good Perl sessions) on IRC: DrBacchus on #apache on irc.freenode.net

Keynote - Building Software - Some guy named Doc

This opened, continued, and closed with a very, very extended analogy between software and construction.

Boils down to the important distinction between software prisons that compete by being more amenable, and modular, commodity-based systems.

I thought that notetaking for this event was over, but I'd just like to record this thought: "The net is our cathedral." Thank you, goodnight.

Authentication, Authorization and beyond... auth_ldap - Graham Leggett

Ugh, this doesn't look good - opens with a deeply technical but uninteresting overview of the platform-difference trouble.

I lost focus and looked into some other things - Patrick's summary was "They're never going to do what we want them to." Okay.