MongIops: Your Favorite Datastore, Only Faster

Last December, I had the pleasure of giving a presentation at MongoSV 2012 with Matthew Kennedy (Fusion-io Big Data Solutions Architect) and Dale Russell (CTO, Talksum). The slides from the talk are available.


Hello Again

Where did 7 months go? I’ve neglected this place for far too long! Since I last wrote here, I’ve embarked on a new adventure as the platform architect for Talksum, Inc. with my good friend and long-time colleague Dale Russell. I’ve been diving ever deeper into ZeroMQ since my first experiments (some documented here), as well as having a good time with a lot of other technologies. I’ve been heads down for a good while, but I think it’s time to poke my head back up and share… so hopefully some good posts are coming soon.


Simulating High Latency Networks with Linux and Netem

Recently I wanted to simulate “slow” requests against a web server in order to investigate how high-latency requests impacted the system overall. After some initial experimentation with writing my own code (in Python), I started looking around for other solutions. I wanted to be able to use Multi-Mechanize, one of my favorite tools for HTTP testing, without heavy modification. Eventually I stumbled onto netem. From the FAQ:

Netem allows testing of protocols and applications on a local network with simulated wide area networks. This was originally done to test congestion algorithms in TCP/IP, but is equally applicable to web applications and file servers.

It’s dirt simple to use, and it lets you stick with whatever HTTP traffic-generating tools you like. Here are a few simple examples along with their impact.

Multi-Mechanize Configuration

[global]
# total test duration, in seconds
run_time: 30
# seconds over which to ramp up the user threads
rampup: 0
# interval for the time-series results output, in seconds
results_ts_interval: 5
progress_bar: on
console_logging: off
xml_report: off

[user_group-1]
# three concurrent user threads running the script below
threads: 3
script: example_httplib.py

Multi-Mechanize Test Script

import httplib
import time

class Transaction(object):
    def __init__(self):
        self.custom_timers = {}

    def run(self):
        # time a single GET against www.google.com and record it as a custom timer
        start_timer = time.time()
        conn = httplib.HTTPConnection('www.google.com')
        conn.request('GET', '/')
        resp = conn.getresponse()
        content = resp.read()
        latency = time.time() - start_timer
        self.custom_timers['Google'] = latency
        assert (resp.status == 200), 'Bad HTTP Response'

if __name__ == '__main__':
    trans = Transaction()
    trans.run()
    print trans.custom_timers

First, the control run (no artificial latency introduced):



Next, let’s add 100ms latency:

[root@aenea multi-mechanize_1.011]# tc qdisc change dev wlan0 root netem delay 100ms

Now let’s do 100ms latency with a 20ms variation, using a normal distribution:

[root@aenea multi-mechanize_1.011]# tc qdisc change dev wlan0 root netem delay 100ms 20ms distribution normal
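
One note: tc qdisc change only works if a netem qdisc has already been attached to the interface. If it hasn’t, add one first, and delete it when you’re done testing so you don’t leave the delay in place. A quick sketch against the same wlan0 interface:

# attach a netem qdisc with a 100ms delay
tc qdisc add dev wlan0 root netem delay 100ms

# see what is currently attached to the interface
tc qdisc show dev wlan0

# remove the netem qdisc when finished
tc qdisc del dev wlan0 root netem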

This is really just scratching the surface with netem. You can also use it for the following (a few more example invocations are sketched right after the list):

  • Delay
  • Variable Delay
  • Delay Distribution
  • Packet Loss
  • Packet Reordering
  • Rate Control
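
A few of those, sketched as one-off commands against the same interface (exact option syntax can vary a bit between iproute2 versions, so double-check against your local documentation):

# randomly drop 1% of packets
tc qdisc change dev wlan0 root netem loss 1%

# duplicate 1% of packets
tc qdisc change dev wlan0 root netem duplicate 1%

# delay packets 10ms, but send 25% of them (50% correlated) immediately, forcing reordering
tc qdisc change dev wlan0 root netem delay 10ms reorder 25% 50%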

Have fun!


ZeroMQ Input / Output Plugins for Rsyslog

Just dropping a quick note that Aggregate Knowledge, where I work as Service Delivery Data Architect, made its first open source release this week: ZeroMQ input and output plugins for rsyslog. Give them a spin if they’re something you’re interested in! You can find them in the Aggregate Knowledge ZeroMQ Rsyslog Plugin repository on GitHub.


How Much Do My Date Partitions Grow Each Day? – Stupid PostgreSQL Tricks

Today I had to do some fast analysis on a data warehouse to see how much the database was growing each day. The PostgreSQL database was date partitioned in a fairly standard way: each partitioned table had child tables following the naming convention “tablename_YYYY_MM_DD”, and those child tables lived in a schema named “partitions”. There were several hundred tables partitioned this way. Here’s a quick bash one-liner that gets the total size of a single day’s partitions:

echo "SELECT sum(pg_total_relation_size(schemaname || '.' || tablename)) FROM pg_tables WHERE schemaname = 'partitions' AND tablename LIKE '%_2011_04_12'" | psql database_name


PostgreSQL 9, Listen / Notify, and Jruby – Part 2

I worked up one more example, this time using PL/Ruby rather than PL/pgSQL, and using YAML to serialize the NEW record handed to the trigger, send it to the client in the notification payload, and reconstitute it as a hash on the client side.

The PL/Ruby stored proc:

CREATE FUNCTION beam_me_up() RETURNS TRIGGER AS $$
require 'yaml'
payload = new.to_yaml
$Plans["engage"] = PL::Plan.new("NOTIFY watchers, '#{payload}'")
$Plans["engage"].exec()
$$ LANGUAGE 'plruby';

CREATE TRIGGER beam_me_up_trigger AFTER INSERT ON watched_table
FOR EACH ROW EXECUTE PROCEDURE beam_me_up();
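
With the function and trigger in place, any insert into watched_table (the same table from the previous post) fires beam_me_up(), which NOTIFYs the YAML-serialized row out on the “watchers” channel. For example:

-- fires beam_me_up_trigger, which sends the new row (as YAML) to any listeners
INSERT INTO watched_table (value) VALUES (1);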

And, the test client…

require 'rubygems'
require 'bundler/setup'
require 'java'
require 'yaml'

$LOAD_PATH << 'vendor/jars/'
require 'postgresql-9.0-801.jdbc3.jar'

# set up our database connection to the example database...
java_import java.sql.DriverManager
DriverManager.register_driver(org.postgresql.Driver.new)
url = "jdbc:postgresql://localhost/listen_notify_poller"

def insert_thread(url)
  insert_conn = DriverManager.get_connection(url)
  cnt = 0
  while cnt < 10
    stmt = insert_conn.create_statement
    stmt.execute("INSERT INTO watched_table (value) VALUES (1)")
    cnt = cnt + 1
    sleep 3
  end
end

def listen_thread(url)
  listen_conn = DriverManager.get_connection(url)

  stmt = listen_conn.create_statement
  stmt.execute("LISTEN watchers")
  stmt.close

  while true
    sleep 1
    puts 'polling...'

    notifications = listen_conn.get_notifications || []

    notifications.each do |notification|
      unless notification.nil?
        test = YAML::load(notification.parameter)
        puts test.inspect
      end
    end
  end
end

insert_thread = Thread.new{insert_thread(url)}
listen_thread = Thread.new{listen_thread(url)}

listen_thread.join
insert_thread.join

PostgreSQL 9, Listen / Notify, and Jruby.

The LISTEN / NOTIFY functionality of PostgreSQL received a large boost in the 9.0 release. Since I’ve lately been experimenting with JRuby, I decided to write up some example code that uses LISTEN / NOTIFY via JDBC.

LISTEN / NOTIFY basically provides a publish / subscribe message bus within PostgreSQL. Clients register as subscribers on a channel and receive all notifications sent to that channel.

9.0 added a “payload” option to NOTIFY: a notification can now carry an optional text string.
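
You can play with the payload by hand from a couple of psql sessions before involving any code. A quick sketch:

-- session 1: subscribe to the channel
LISTEN watchers;

-- session 2: send a notification with a payload
NOTIFY watchers, 'hello from session 2';

-- or the equivalent function form, which is handy inside PL/pgSQL
SELECT pg_notify('watchers', 'hello from session 2');

-- note: psql in session 1 reports pending notifications the next time it
-- runs a command (even an empty one), rather than instantly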

I worked up a quick test of using this functionality in JRuby.

The example sets up the following table in a local PostgreSQL 9.0 database:

CREATE TABLE watched_table (
  id              SERIAL PRIMARY KEY,
  value           INT,
  date_updated    TIMESTAMP NOT NULL DEFAULT now()
);

It then sets up a trigger that, upon insert, sends a notify event:

CREATE FUNCTION notify_trigger() RETURNS trigger AS $$
DECLARE
BEGIN
  PERFORM pg_notify('watchers', TG_TABLE_NAME || ',id,' || NEW.id );
  RETURN new;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER watched_table_trigger AFTER INSERT ON watched_table
FOR EACH ROW EXECUTE PROCEDURE notify_trigger();

In the example, the first thread inserts into this table every three seconds:

def insert_thread(url)
  insert_conn = DriverManager.get_connection(url)
  cnt = 0 
  while cnt < 10
    stmt = insert_conn.create_statement
    stmt.execute("INSERT INTO watched_table (value) VALUES (1)")
    cnt = cnt + 1 
    sleep 3
  end 
end

The listener then connects and polls PostgreSQL every second to see if there are new notifications. Since the id of the inserted record is sent in the notification payload, the listener thread can then retrieve the full record:

def listen_thread(url)
  listen_conn = DriverManager.get_connection(url)
  
  stmt = listen_conn.create_statement
  stmt.execute("LISTEN watchers")
  stmt.close

  while true
    sleep 1
    puts 'polling...'

    notifications = listen_conn.get_notifications || []

    notifications.each do |notification|
      unless notification.nil?
        puts "NOTIFICATION ------------"
        puts "pid: #{notification.pid}" 
        puts "name: #{notification.name}"
        puts "param: #{notification.parameter}"
        puts "-------------------------"

        id = notification.parameter.split(',').last
        puts "RETRIEVING RECORD FOR NOTIFIED ID: #{id}"
        stmt = listen_conn.create_statement
        rs = stmt.execute_query("SELECT * FROM watched_table WHERE id = #{id}")

        while rs.next
          puts "id:\t#{rs.get_int(1)}"
          puts "value:\t#{rs.get_int(2)}"
          puts "record_time:\t#{rs.get_timestamp(3)}"
        end

        puts "-------------------------"

        stmt.close
        rs.close
      end
    end
  end
end

Here’s the code of the entire example:

require 'rubygems'
require 'bundler/setup'
require 'java'

$LOAD_PATH << 'vendor/jars/'
require 'postgresql-9.0-801.jdbc3.jar'

# set up our database connection to the example database...
java_import java.sql.DriverManager
DriverManager.register_driver(org.postgresql.Driver.new)
url = "jdbc:postgresql://localhost/listen_notify_poller"

def insert_thread(url)
  insert_conn = DriverManager.get_connection(url)
  cnt = 0
  while cnt < 10
    stmt = insert_conn.create_statement
    stmt.execute("INSERT INTO watched_table (value) VALUES (1)")
    cnt = cnt + 1
    sleep 3
  end
end

def listen_thread(url)
  listen_conn = DriverManager.get_connection(url)
  
  # register our client as a listener on the "watchers" channel....
  stmt = listen_conn.create_statement
  stmt.execute("LISTEN watchers")
  stmt.close

  # check for new notifications once a second...
  while true
    sleep 1
    puts 'polling...'

    # check for notifications
    notifications = listen_conn.get_notifications || []

    # if there are notifications, display a little info on them
    notifications.each do |notification|
      unless notification.nil?
        puts "NOTIFICATION ------------"
        puts "pid: #{notification.pid}" 
        puts "name: #{notification.name}" 
        puts "param: #{notification.parameter}" 
        puts "-------------------------"

        # retrieve the inserted record this notification was for.
        id = notification.parameter.split(',').last
        puts "RETRIEVING RECORD FOR NOTIFIED ID: #{id}"
        stmt = listen_conn.create_statement
        rs = stmt.execute_query("SELECT * FROM watched_table WHERE id = #{id}")

        while rs.next
          puts "id:\t#{rs.get_int(1)}"
          puts "value:\t#{rs.get_int(2)}"
          puts "record_time:\t#{rs.get_timestamp(3)}"
        end
      
        puts "-------------------------"
      
        stmt.close
        rs.close
      end
    end
  end
end

insert_thread = Thread.new{insert_thread(url)}
listen_thread = Thread.new{listen_thread(url)}

listen_thread.join
insert_thread.join