Posts Tagged ‘git’

Creating Puppet types and providers is easy…

February 1st, 2010

Puppet types are used to manage individual configuration items. Puppet has a package type, a service type, a user type, etc. Each type has providers. Each provider handles the management of that configuration on a different platform or tool. For example, the package type has aptitude, yum, RPM, and DMG providers (amongst 22 others – what is wrong with people that they need to invent new packaging systems… but I digress).

There are a lot of types; in fact, I think Puppet covers a pretty good spectrum of configuration items that need to be managed. I don’t know of anything in particular that is missing that I can’t live without. But there are little gaps that are annoying. I’d like network and firewall types, for example, but creating both these types in a generic enough way to support multiple platforms would be, IMHO, a non-trivial problem.

Another gap is VCS/DVCS management. A lot of people use source code in repositories to do things with (including installing stuff from them, you bad people – package things… it’s healthier). Puppet currently relies on creating and removing these repositories with the exec type (which executes scripts or binaries), for example:

exec { "svn co http://core.svn.wordpress.org/trunk/ /var/www/wp":
    creates => "/var/www/wp",
}

This is a bit ugly, and it’d be a lot easier to write a Puppet type to manage repositories. But Puppet types and providers are written in Ruby and are really, really complex and hard to develop. Right? Right?

No. No, they are not… and I’m going to create a simple type and provider to show you. :)

Here’s a very (very!) simple Puppet type, called repo, for managing repositories. I’ve also created providers for SVN and Git as examples. The first part is the type itself – these are usually stored in lib/puppet/type or distributed via modules (see the PluginsInModules page in the Puppet wiki). I’ll create a file called repo.rb.

$ touch repo.rb

And then populate the file:

Puppet::Type.newtype(:repo) do
    @doc = "Manage repos"
 
    ensurable
 
    newparam(:source) do
        desc "The repo source"
 
        validate do |value|
            if value =~ /^git/
                resource[:provider] = :git
            else
                resource[:provider] = :svn
            end
        end
 
        isnamevar
 
    end
 
    newparam(:path) do
        desc "Destination path"
 
        validate do |value|
            unless value =~ /^\/[a-z0-9]+/
                raise ArgumentError , "%s is not a valid file path" % value
            end
        end
    end
end

So – pretty simple. We create a block, Puppet::Type.newtype(:repo) do, that creates a new type, which we’ve called repo.

Inside the block we’ve got a @doc string. This is the documentation for the type. Add whatever level of detail and examples are required in here.

We’ve also got the ensurable statement. Ensurable provides some “automagic” that creates a basic ensure property. Puppet types use the ensure property to determine the state of a configuration item.

service { "sshd":
    ensure => present,
}

The ensurable statement tells Puppet to expect three methods in our provider: create, destroy and exists?. These methods provide, respectively:

  • A command to create the resource
  • A command to delete the resource, and
  • A command to check for the existence of the resource

All we then need to do is specify these methods and their contents, and Puppet creates the supporting infrastructure around them. But more on this when we look at our providers.

Next, we’ve defined a new parameter – this one called source.

    newparam(:source) do
        desc "The repo source"
 
        validate do |value|
            if value =~ /^git/
                resource[:provider] = :git
            else
                resource[:provider] = :svn
            end
        end
 
        isnamevar
    end

The source parameter will tell the type where to go to retrieve/clone/checkout our source repository.

In this parameter we’re also using a hook called validate. It’s normally used to check the value for appropriateness, but here we’re using it to take a guess at what provider to use. Our code says: if the source parameter starts with git then use the Git provider; if not, default to the Subversion provider. This is obviously fairly crude as a default, and we can override it by defining the provider attribute in our resources:

provider => git,
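
As a rough illustration of that guess, here is the same rule as a plain Ruby method that runs outside Puppet. The method name guess_provider and the example URLs are hypothetical; only the regular expression comes from the type above.

```ruby
# Hypothetical stand-alone version of the guess made in the validate hook.
# Sources starting with "git" use the Git provider; anything else falls
# back to Subversion.
def guess_provider(source)
  source =~ /^git/ ? :git : :svn
end

puts guess_provider("git://example.com/project.git")   # git
puts guess_provider("http://example.com/svn/trunk/")   # svn
```

Note that a Git repository served over http would be guessed as svn by this rule, which is exactly the case where setting the provider attribute by hand matters.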

We’ve also used another piece of Puppet automagic, isnamevar, to make this parameter the “name” variable for this type. In Puppet-speak, the value of this parameter is used as the name of the resource.

(Types have two kinds of values – properties and parameters. Properties “do things”. They tell us HOW the provider works. We’ve only defined one property, ensure, by using the ensurable statement. Parameters are more like variables, they contain information relevant to configuring the resource the type manages rather than “doing things”.)

Finally, we’ve defined another parameter, path.

    newparam(:path) do
        desc "Destination path"
 
        validate do |value|
            unless value =~ /^\/[a-z0-9]+/
                raise ArgumentError , "%s is not a valid file path" % value
            end
        end
    end

This is a variable value that specifies where the type should put the cloned/checked-out repository. In this parameter we’ve again used the validate hook to create a block that checks the value for appropriateness. Here we’re just checking, very crudely, that the destination path looks like a valid fully-qualified file path. We could also use this kind of validation on the source parameter to confirm a valid source URL/location was being provided.

(You can also use another hook called munge to adjust the value of the parameter rather than validating it before passing it to the provider.)
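
To make the difference between the two hooks concrete, here is a minimal sketch in plain Ruby, with no Puppet required. The method names validate_path and munge_path are hypothetical stand-ins for the hook blocks, and stripping trailing slashes is just an arbitrary example of an adjustment; the regular expression and error message are the ones from the path parameter above.

```ruby
# Plain-Ruby stand-ins for the two hooks (no Puppet needed to run this).

# validate-style: reject a bad value by raising, otherwise do nothing.
def validate_path(value)
  unless value =~ /^\/[a-z0-9]+/
    raise ArgumentError, "%s is not a valid file path" % value
  end
end

# munge-style: accept the value but normalise it, here by dropping
# trailing slashes (an arbitrary choice for this example).
def munge_path(value)
  value.sub(/\/+\z/, "")
end

validate_path("/var/www/wp/")
puts munge_path("/var/www/wp/")   # /var/www/wp
```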

And that’s it for the type.

Next, we need to create a provider for our type. Let’s start with a Subversion provider like so:

require 'fileutils'
 
Puppet::Type.type(:repo).provide(:svn) do
    desc "SVN Support"
 
    commands :svncmd => "svn"
    commands :svnadmin => "svnadmin"
 
    def create
        svncmd "checkout", resource[:name], resource[:path]
    end
 
    def destroy
        FileUtils.rm_rf resource[:path]
    end
 
    def exists?
        File.directory? resource[:path]
    end
end

Up front we’ve required the fileutils library, which we’re going to use a method from. Next, we’ve defined the provider as a block:

Puppet::Type.type(:repo).provide(:svn) do

We tell Puppet that this is a provider called svn for the type called repo.

Then we use a desc method that allows us to add some documentation to our provider.

Next, we define the commands that this provider will use, here the svn and svnadmin binaries, to manipulate our resource’s configuration.

    commands :svncmd => "svn"
    commands :svnadmin => "svnadmin"

Puppet uses these commands to determine if the provider is appropriate to use on a client. If Puppet can’t find these commands in the local path then it will disable the provider.
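
The idea behind that check can be sketched in plain Ruby. This is not Puppet’s own lookup code, just an illustration of searching the path for a binary; the helper name command_available? is hypothetical.

```ruby
# Hypothetical helper: is there an executable file called `name` in any
# directory listed in the search path?
def command_available?(name, search_path = ENV.fetch("PATH", ""))
  search_path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Report on the two binaries the svn provider declares.
%w[svn svnadmin].each do |cmd|
  puts "#{cmd}: #{command_available?(cmd) ? 'found' : 'missing'}"
end
```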

Next, we’ve defined three methods – create, destroy and exists?. Sounds familiar? Yep, these are the methods that the ensurable statement expects to find in the provider:

The create method ensures our resource is created. It uses the svn command to check out a repository with a source of resource[:name] (remember the source parameter in our type is also the name variable of the type – we could also specify resource[:source] here) and a destination of resource[:path] (the value of the path attribute).

The destroy method ensures the deletion of the resource. In this case, it deletes the directory and files specified in the resource[:path] parameter.

Lastly, the exists? method checks to see if the resource exists. Its operation is pretty simple and closely linked with the value of the ensure attribute in the resource:

  • If exists? is false and ensure is set to present, then the create method will be called.
  • If exists? is true and ensure is set to absent, then the destroy method will be called.

In this case the exists? method checks if there is already a directory at the location specified in the resource[:path] parameter.
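
The two rules above amount to a small decision table. Here is a sketch of it in plain Ruby; the method name provider_action and the :none result are made up for illustration and are not part of Puppet.

```ruby
# Hypothetical model of the decision: given the desired state (the ensure
# attribute) and the result of exists?, which provider method is called?
def provider_action(ensure_value, exists)
  case [ensure_value, exists]
  when [:present, false] then :create
  when [:absent, true]   then :destroy
  else :none   # already in the desired state, nothing to do
  end
end

puts provider_action(:present, false)   # create
puts provider_action(:absent, true)     # destroy
puts provider_action(:present, true)    # none
```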

So, let’s put all this together and create a resource with our new type. I’ve assumed you’ve already distributed your type and providers to Puppet. We can then create a resource like:

repo { "wp":
    source => "http://core.svn.wordpress.org/trunk/",
    path => "/var/www/wp",
    ensure => present,
}

Simple eh? We specify a resource, the source we wish to check out or clone from, the destination path and the ensure attribute (present or absent) and that’s it.

You can see the complete code for this type and its providers at my Puppet repository on GitHub. It’s obviously very basic but should be easy to extend to provide additional capabilities (and currently has no tests – my bad). You can find further documentation (in a lot more detail!) on creating your own types and providers at the Puppet wiki.

Output GitHub commits as unified diffs

September 20th, 2009

I use GitHub a lot – in my view it’s the premier Git hosting solution out there and keeps getting better and better. Every now and again I stumble across a useful little trick or technique that I didn’t know about. This one is a means to output commits on GitHub as unified diff files. I’ve blogged it because I can’t find it anywhere in the documentation.

The trick is very easy – simply add .patch to the end of any commit URL to output the commit as a unified diff patch file. For example, this commit becomes a diff. Simple. Enjoy.
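
If you want to script it, the transformation is a one-liner. A minimal sketch in Ruby, where the helper name patch_url and the example repository URL are hypothetical:

```ruby
# Hypothetical helper: turn a commit URL into its .patch form, dropping
# any trailing slash first.
def patch_url(commit_url)
  commit_url.sub(/\/+\z/, "") + ".patch"
end

puts patch_url("https://github.com/example/project/commit/0123abc")
# https://github.com/example/project/commit/0123abc.patch
```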

Pro Git

August 7th, 2009

Congratulations to Scott Chacon on his new book Pro Git. I am highly jealous – this was one of the things I’ve really wanted to write about and was on my list of things to pitch to my publisher. As is the nature of these things, I am not as smart as I think I am and someone else also had the idea. :)

It’s a great little title and Scott, being one of the GitHub guys, is well positioned to write the definitive book on Git. You can find the whole book under a CC license at the Pro Git site, but if you love the book then buy a copy so Scott gets some royalties and can afford to eat and all… :)

Stashing with Git

August 2nd, 2009

I use Git fairly extensively and one of the features I use and love is stash. The stash command allows you to take point-in-time snapshots of development and “stash” them away. The source tree is then returned to the current commit.

So let’s take a quick example. Stashing is something I particularly do when I want to pull changes into a dirty tree. I am working on a problem and someone over there has committed something that changes the environment. I don’t want to commit my half-done work but I do want to keep it whilst I add the other commits into my tree. So I type:

$ git stash

This will “stash” away the changes in my dirty tree and leave it clean at the last commit.

I can then, for example, cherry-pick in a commit and then use git stash pop to add back in my changes.

$ git cherry-pick 9cc0589
$ git stash pop

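This will re-apply my stashed changes on top of the new commit and remove the stash. If this causes conflicts then you need to resolve them by hand; in that case the stash is not removed automatically, so drop it with git stash drop once you’re done.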

You can store more than one stash too – the last operations will just pluck the last stash from the list – but you can store a collection. You can see what stashes are available by using:

$ git stash list
stash@{0}: WIP on features/master/dailytask: 2d74623... Added daily build task to Rakefile
stash@{1}: WIP on features/master/dailytask: 2d74623... Added daily build task to Rakefile
stash@{2}: WIP on master: f13f08d... Minor fix to URL for LDAP nodes documentation

You could then apply a particular stash by specifying the stash’s ID.

$ git stash pop stash@{2}

You can see more features and options at the git-stash man page.

Git … oh my

June 20th, 2009

In recent projects I’ve been using a lot of Git and I love it. As a distributed source control tool it’s brilliant. This is particularly true when you need to gather and manage a wide variety of disparate patches and commits. On the projects I work on we get patches via a lot of paths:

  • Diffs attached to tickets
  • Diffs sent via email
  • Git branches and cherry picks

With the former two paths we (mostly me) have been cutting and pasting patch diffs into files or using wget.  We then use the patch command to apply the diffs to various branches and then commit the results.

With this approach we often lose track of the patch’s ownership and author. This is problematic from two perspectives – we can’t allocate credit where it is due, and when something goes wrong with a commit it’s often hard to track down the original author.

We obviously don’t have this issue with Git repositories and merging branches or cherry picking specific commits. With the latter it is easy to track patch authors and ownership – even through multiple merges and rebasing. So I like it when people fork the repository and provide a commit or feature/ticket branch when supplying code.

But for the others I’m trying to bring our process a little closer to Git by using the git am and git apply commands (also git-am and git-apply – though the git-command syntax is being deprecated) to pull in diffs and patches. The git am command processes mailboxes (mbox and Maildir), mail messages or RFC 2822 formatted messages provided via standard input. The git apply command applies a unified diff file to the current working tree.

Let’s start with the easy to use git apply command.  Let’s take one of our use cases: downloading a patch from our tracker and applying it.

First, we wget our patch file:

$ cd /tmp
$ wget http://projects.redmine.com/project/tickets/2222/patch2222.diff

We then change directory into our Git repository and check what branch we’re on.

$ cd ~/src/puppet
$ git branch
* master

We can now feed our patch file into git apply using something like cat and a pipe.

$ cat /tmp/patch2222.diff | git apply

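This will apply the patch to the working tree of the current branch. Note that git apply doesn’t create a commit itself, so we then git add the changed files and git commit them as usual.

Another useful trick is the --amend option. If your current commit is not yet pushed you can amend it. Just make your required edits, git add the updated files: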

$ git add filename

And then run:

$ git commit --amend

Git will populate your editor with the last commit message and you can update the commit.

We could also create a separate branch for our new commit.

$ git checkout -b tickets/master/2222
$ git apply < /tmp/patch2222.diff

Here we’ve directed the file straight into git apply rather than piped the output of the cat command.

Once the change is committed on that branch, we can merge it into the appropriate branch when ready.

$ git checkout master
$ git merge tickets/master/2222

Don’t forget you can also use the git rebase command to ensure your branch is rebased against the branch you’re merging into, to squash multiple commits or to redo the commits – git rebase (especially the --interactive switch) is the business. :)

Using our second command, git am, we can also pull patches straight from a mail client.  For example, on Thunderbird, I open up the message I want to import then (on OSX – on PC it’s basically the same commands prefixed with Ctrl not Command):

  1. Command-U to show the full message including headers
  2. Command-A to select all of the message
  3. Command-C to copy the selection

I can then go to the command line and do:

$ cat | git am

Followed by a Command-V to paste the content, and then Control-D to end the cat and submit my patch to be committed.

The git am command is quite clever. The author of the commit will be pulled from the From line of the message, the date and time of the commit from the date and time the message was sent, and the Subject and Body are used as the title and body of the commit message.
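
To picture that mapping, here is a small Ruby sketch that pulls the same fields out of a message. It is an illustration only, not how git itself parses mail: the method name commit_fields and the sample message are made up, and it ignores folded headers, MIME, encodings and the patch content.

```ruby
# Illustration only: extract the parts of a mail message that git am uses
# for the commit metadata.
def commit_fields(message)
  head, body = message.split(/\r?\n\r?\n/, 2)
  headers = {}
  head.each_line do |line|
    key, value = line.chomp.split(/:\s*/, 2)
    headers[key.downcase] = value if value
  end
  {
    author:  headers["from"],
    date:    headers["date"],
    # git am also drops a leading [PATCH] tag from the subject
    title:   headers["subject"].to_s.sub(/\A\[PATCH[^\]]*\]\s*/, ""),
    message: body.to_s.strip
  }
end

mail = <<~MAIL
  From: Jane Example <jane@example.com>
  Date: Sat, 20 Jun 2009 10:00:00 +1000
  Subject: [PATCH] Fix typo in README

  Corrects a spelling mistake.
MAIL

fields = commit_fields(mail)
puts fields[:author]   # Jane Example <jane@example.com>
puts fields[:title]    # Fix typo in README
```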

We can also add the -s switch to the git am command.  This adds a “Signed-off-by” line to the commit message using your details (usually name and email address).

$ cat | git am -s

It’s not a perfect model yet – I probably should be using git am on a Maildir or mbox file which contains our patches, but in our small development team I have the luxury of just selecting the patches and emails I want.

Migrating a Rails database from Sqlite3 to MySQL

August 25th, 2008

So when I first looked at Redmine I ran it up and used Sqlite3 as the database back-end. Then when I migrated our Trac data I just left Sqlite3 as the back-end database and migrated our data to that. With that startling lack of forethought aside, I always had the view the database should be MySQL because well:

a) I know it
b) I like it
c) It’s probably more scalable (IMHO)

So today I actually sat down to do the migration piece. I dumped out the sqlite3 database and tried to do some manual/scripted edits to convert it to something MySQL would import. Epic Fail.

So I tried the YAMLdb plugin, which abstracts database exports using YAML. A quick installation, some edits to config/database.yml, a rake db:dump and a rake db:load, and the data was moved:

… Create our database …

$ sudo mysql -p
mysql> create database redmine character set utf8;

… Grant privs to your chosen user …

mysql> GRANT ...

… Configure a test database for our new MySQL database …

$ vim config/database.yml

… for Rails version 2.1 and later install the plugin …

$ sudo script/plugin install git://.com/adamwiggins/yaml_db.git

… for Rails versions less than 2.1 use …

$ sudo script/plugin install http://.com/adamwiggins/yaml_db.git

… Dump out the current production database …

$ sudo rake db:dump RAILS_ENV=production

… Load the freshly created db/data.yml file into our test database …

$ sudo rake db:load RAILS_ENV=test

… Reconfigure the application to point to the new MySQL database as production …

$ vim config/database.yml

… Start Redmine …

$ sudo /etc/init.d/mongrel_cluster start

There was one bad field I had to do some manual editing on – I’m still not quite sure what was wrong with the field, but whatever I did fixed it – but otherwise it was very smooth.

Started up and now Redmine runs perfectly with MySQL as the back-end!

Puppet’s BuildBot

August 24th, 2008

So rather than doing the work I actually should be I’ve been playing with BuildBot. I had intended to get around to setting up BuildBot sometime in the next couple of months but I got hooked.

The reason I wanted to have a look at BuildBot was that Puppet has reached a stage where we simply can’t test every platform it runs on. We are also starting to get patches from a wider variety of sources. BuildBot will allow us to execute our tests on a wider variety of platforms. Hopefully, with the cooperation of the community, we can gather a really big collection of build platforms to test on.

Here’s the blurb for BuildBot:

The BuildBot is a system to automate the compile/test cycle required by most software projects to validate code changes. By automatically rebuilding and testing the tree each time something has changed, build problems are pinpointed quickly, before other developers are inconvenienced by the failure. The guilty developer can be identified and harassed without human intervention. By running the builds on a variety of platforms, developers who do not have the facilities to test their changes everywhere before checkin will at least know shortly afterwards whether they have broken the build or not. Warning counts, lint checks, image size, compile time, and other build parameters can be tracked over time, are more visible, and are therefore easier to improve.

The overall goal is to reduce tree breakage and provide a platform to run tests or code-quality checks that are too annoying or pedantic for any human to waste their time with. Developers get immediate (and potentially public) feedback about their changes, encouraging them to be more careful about testing before checkin.

It’s a very easy tool to deploy. The hardest part has been the slightly broken Git source handling and the assumption that any Git repository is local. I need to have a local Git repository to allow BuildBot to submit the right commit references to the PBChangeSource function.

But I designed a basic process for handling new commits:

1. Commit pushed to .
2. Commit bot at picks up commit and sends it to BuildBot Master.
3. BuildBot uses the git_buildbot.py script to calculate the before/after commit and branch references and tell BuildBot about them.
4. BuildBot executes the build and tells each slave to retrieve the commit and run the tests. Currently we’re running:

a. All the Unit tests
b. All the RSpec tests

5. We then get the results of the tests on the website and in an email to the new Puppet Builds mailing list.

In addition I’ve also enabled BuildBot’s IRC bot and added a new bot, called pinocchio, to the #puppet channel that reports on build status.

At this stage it’s all in test mode and when I’ve ironed out a few issues we should be in a position to do a production installation at ReductiveLabs and start canvassing for build slaves.

UPDATE

After mucking around with Buildbot I just couldn’t get a whole bunch of issues with Git resolved so we changed to as our CI – which works much better.  The message overall – CI and Git: still a young pair.  I’ve included our configuration below for edification:

# -*- python -*-
# ex: set syntax=python:

# This is a sample buildmaster config file. It must be installed as
# 'master.cfg' in your buildmaster's base directory (although the filename
# can be changed with the --basedir option to 'mktap buildbot master').

# It has one job: define a dictionary named BuildmasterConfig. This
# dictionary has a variety of keys to control different aspects of the
# buildmaster. They are documented in docs/config.xhtml .

# This is the dictionary that the buildmaster pays attention to. We also use
# a shorter alias to save typing.
c = BuildmasterConfig = {}

####### BUILDSLAVES

# the 'slaves' list defines the set of allowable buildslaves. Each element is
# a BuildSlave created from a bot-name and bot-password. These correspond to
# values given to the buildslave's mktap invocation.
from buildbot.buildslave import BuildSlave

c['slaves'] = [BuildSlave("debian", "debian"),
               BuildSlave("freebsd", "freebsd"),
               BuildSlave("redhat", "redhat")
              ]

# 'slavePortnum' defines the TCP port to listen on. This must match the value
# configured into the buildslaves (with their --master option)

c['slavePortnum'] = 9989

####### CHANGESOURCES

# the ‘change_source’ setting tells the buildmaster how it should find out
# about source code changes. Any class which implements IChangeSource can be
# put here: there are several in buildbot/changes/*.py to choose from.

from buildbot.changes.pb import PBChangeSource
c['change_source'] = PBChangeSource()

####### SCHEDULERS

## configure the Schedulers

from buildbot import scheduler

stable = scheduler.Scheduler(name="stable", builderNames=["debian_stable", "freebsd_stable", "redhat_stable"], treeStableTimer=60, branch="0.24.x")
dev = scheduler.Scheduler(name="dev", builderNames=["debian_dev", "freebsd_dev", "redhat_dev"], treeStableTimer=60, branch="master")

c['schedulers'] = [stable, dev]

####### BUILDERS

# the 'builders' list defines the Builders. Each one is configured with a
# dictionary, using the following keys:
#  name (required): the name used to describe this builder
#  slavename (required): which slave to use, must appear in c['bots']
#  builddir (required): which subdirectory to run the builder in
#  factory (required): a BuildFactory to define how the build is run
#  periodicBuildTime (optional): if set, force a build every N seconds

# buildbot/process/factory.py provides several BuildFactory classes you can
# start with, which implement build processes for common targets (GNU
# autoconf projects, CPAN perl modules, etc). The factory.BuildFactory is the
# base class, and is configured with a series of BuildSteps. When the build
# is run, the appropriate buildslave is told to execute each Step in turn.

# the first BuildStep is typically responsible for obtaining a copy of the
# sources. There are source-obtaining Steps in buildbot/steps/source.py for
# CVS, SVN, and others.

from buildbot.process import factory
from buildbot.steps import source, shell

pstable = factory.BuildFactory()
pstable.addStep(source.Git(repourl='git://.com/jamtur01/puppet.git', branch='0.24.x'))
pstable.addStep(shell.ShellCommand(command='rake spec', name='Spec Tests'))
pstable.addStep(shell.ShellCommand(command='rake unit', name='Unit Tests'))

pdev = factory.BuildFactory()
pdev.addStep(source.Git(repourl='git://reductivelabs.com/puppet', branch='master'))
pdev.addStep(shell.ShellCommand(command='rake spec', name='Spec Tests'))
pdev.addStep(shell.ShellCommand(command='rake unit', name='Unit Tests'))

debian_stable = {'name': "debian_stable",
                 'slavename': "debian",
                 'builddir': "debian_stable",
                 'factory': pstable,
                 }

debian_dev = {'name': "debian_dev",
              'slavename': "debian",
              'builddir': "debian_dev",
              'factory': pdev,
              }

redhat_stable = {'name': "redhat_stable",
                 'slavename': "redhat",
                 'builddir': "redhat_stable",
                 'factory': pstable,
                 }

redhat_dev = {'name': "redhat_dev",
              'slavename': "redhat",
              'builddir': "redhat_dev",
              'factory': pdev,
              }

freebsd_stable = {'name': "freebsd_stable",
                  'slavename': "freebsd",
                  'builddir': "freebsd_stable",
                  'factory': pstable,
                  }

freebsd_dev = {'name': "freebsd_dev",
               'slavename': "freebsd",
               'builddir': "freebsd_dev",
               'factory': pdev,
               }

c['builders'] = [debian_stable, debian_dev, freebsd_stable, freebsd_dev, redhat_stable, redhat_dev]

####### STATUS TARGETS

# 'status' is a list of Status Targets. The results of each build will be
# pushed to these targets. buildbot/status/*.py has a variety to choose from,
# including web pages, email senders, and IRC bots.

c['status'] = []

from buildbot.status import html
c['status'].append(html.WebStatus(http_port=8010))

from buildbot.status import mail
c['status'].append(mail.MailNotifier(fromaddr="buildbot@reductivelabs.com",
                                     extraRecipients=["puppet-build@googlegroups.com"],
                                     sendToInterestedUsers=False))

from buildbot.status import words
c['status'].append(words.IRC(host="irc.freenode.net", nick="pinocchio",
                             channels=["#puppet"],
                             password="password"))

# from buildbot.status import client
# c['status'].append(client.PBListener(9988))

####### DEBUGGING OPTIONS

# if you set 'debugPassword', then you can connect to the buildmaster with
# the diagnostic tool in contrib/debugclient.py . From this tool, you can
# manually force builds and inject changes, which may be useful for testing
# your buildmaster without actually committing changes to your repository (or
# before you have a functioning 'sources' set up). The debug tool uses the
# same port number as the slaves do: 'slavePortnum'.

#c['debugPassword'] = "debugpassword"

# if you set 'manhole', you can ssh into the buildmaster and get an
# interactive python shell, which may be useful for debugging buildbot
# internals. It is probably only useful for buildbot developers. You can also
# use an authorized_keys file, or plain telnet.
#from buildbot import manhole
#c['manhole'] = manhole.PasswordManhole("tcp:9999:interface=127.0.0.1",
#                                       "admin", "password")

####### PROJECT IDENTITY

# the 'projectName' string will be used to describe the project that this
# buildbot is working on. For example, it is used as the title of the
# waterfall HTML page. The 'projectURL' string will be used to provide a link
# from buildbot HTML pages to your project's home page.

c['projectName'] = "Puppet"
c['projectURL'] = "http://reductivelabs.com/trac/puppet/"

# the 'buildbotURL' string should point to the location where the buildbot's
# internal web server (usually the html.Waterfall page) is visible. This
# typically uses the port number set in the Waterfall 'status' entry, but
# with an externally-visible host name which the buildbot cannot figure out
# without some help.

#c['buildbotURL'] = "http://10.0.0.x:8010/"

You can measure security! – Observations of a digitally enlightened mind

November 27th, 2006

I have seen Amrit Williams speak a couple of times at Gartner. So I read with interest his recent wading into arguments on metrics.

I especially liked this mockery of the position that there are no metrics:

Apparently measuring security is like anal probing. Aliens, with their advanced technology, have cracked the space/time continuum but apparently the mysteries of the human rectum still elude them – metrics are like the ass of IT, with all our advances it still eludes us.

I confess I am a big fan of metrics – providing they are the right metrics and that they:

a) Actually measure something
b) Actually demonstrate the ROI on security that the business is getting for their dollars (and oh yes, they are the business’s dollars, never forget)
c) Can’t be gamed or played (see pretty much every Operational availability figure ever published)

Overall, it’s a well reasoned article and I look forward to his whitepaper on the topics of specific metrics. I recommend that if you have an interest in metrics that you give it, and the articles linked in it, a good read.

I’ve been around long enough to know a proposition

August 30th, 2005

I think John Brogden is a cad and an idiot and an offensive git. I also think his racist comments about Helena Carr were reprehensible. But somewhat overlooked was this: Brogden was not only a racist but he sexually harassed a woman, making an unwanted and unwelcome physical advance. He should have been chastised and punished for this act just as much as for his racist comments.

UPDATE: John Brogden tried to take his own life. This is terribly sad and indicates that he needs serious help. But it doesn’t change what he did. I am sorry for him and his family, but his self-destructive behaviour was brought on himself by his own actions. All of us have to take responsibility for our own actions. So should he.

Fun with IDS

February 27th, 2005

For any of you who have an interest in IDS I have come across two articles that are fascinating reading. The first article is an analysis of models and techniques for testing IDS signatures to ensure that they adequately match and detect attacks. It includes links to a broad collection of tools and further reading.

The second article covers a topic that greatly interests me – the use of Bayesian statistical analysis to reduce the volume of false positives detected by your IDS. Bayesian analysis is already used by a number of anti-spam tools, such as SpamAssassin, to reduce the volume of email incorrectly marked as spam. It also appears to have solid applications in IDS tuning. That tuning can be a truly black art at times, and finding the real attacks in amongst the network noise and false positives can be problematic. Traditional IDS uses signature matching, which tries to match network traffic against the signature of an attack. In many cases legitimate network traffic can also match attack signatures and is thus incorrectly identified by the IDS as an attack. These false positives all need to be identified, checked and reviewed. This can add enormous management and reporting overhead to an IDS solution.

With Bayesian analysis, network data is accumulated and analysed. During this phase both false positives and real attacks are flagged and fed into a statistical model. After a suitable volume of data is accumulated, patterns start to emerge in the model. These patterns reveal which traffic is statistically identifiable as a false positive and which is a malicious attack. These patterns are then applied to new network data. The patterns provide a more accurate indicator of false versus malicious traffic and significantly reduce the number of false positives identified by the IDS. Additionally, this new network traffic is also fed into the model and analysed, further refining the model.
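
To give a feel for the train-then-classify loop described above, here is a minimal naive Bayes sketch in Ruby. It is a generic textbook classifier, not the model from the article, and every name in it is an assumption: the AlertClassifier class, the feature strings and the handful of training alerts are all made up for illustration.

```ruby
# Minimal naive Bayes sketch for triaging IDS alerts. All feature names
# and training data below are invented for illustration.
class AlertClassifier
  def initialize
    @label_counts   = Hash.new(0)                              # alerts seen per label
    @feature_counts = Hash.new { |h, k| h[k] = Hash.new(0) }   # label => feature => count
    @vocabulary     = {}                                       # every feature seen so far
  end

  # Record one analyst-labelled alert (:false_positive or :attack).
  def train(features, label)
    @label_counts[label] += 1
    features.each do |f|
      @feature_counts[label][f] += 1
      @vocabulary[f] = true
    end
  end

  # Return the more probable label for a new alert.
  def classify(features)
    total = @label_counts.values.sum.to_f
    @label_counts.keys.max_by do |label|
      score = Math.log(@label_counts[label] / total)
      denom = @feature_counts[label].values.sum + @vocabulary.size
      features.each do |f|
        # Laplace smoothing so an unseen feature does not zero the score
        score += Math.log((@feature_counts[label][f] + 1.0) / denom)
      end
      score
    end
  end
end

clf = AlertClassifier.new
clf.train(%w[sig:sql-injection src:internal dst:intranet-wiki], :false_positive)
clf.train(%w[sig:sql-injection src:internal dst:intranet-wiki], :false_positive)
clf.train(%w[sig:port-scan src:internal dst:monitoring], :false_positive)
clf.train(%w[sig:sql-injection src:external dst:web-frontend], :attack)
clf.train(%w[sig:port-scan src:external dst:web-frontend], :attack)

puts clf.classify(%w[sig:sql-injection src:external dst:web-frontend])   # attack
puts clf.classify(%w[sig:port-scan src:internal dst:monitoring])         # false_positive
```

Feeding each newly reviewed alert back through train is the “further refining the model” step: the counts keep growing, so later classifications draw on more evidence.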

The results of a study run by the authors of the article indicate that using Bayesian analysis halved the number of false positives recorded by the IDS. Anything that so significantly increases the probability of your IDS only alerting on a genuine attack greatly enhances the security of your networks. Further developments in this area should prove extremely interesting.