Deleting old Confluence backup files

After I installed the Confluence wiki, I discovered I needed to do at least one thing to maintain it. By default, Confluence backs up the site every morning at 2 a.m., and the backup files keep accumulating unless you do something about them. On a small Confluence site like mine, it would take a couple of years to fill the disk. But if your site is large and you haven't already put something in place to prune older backup files, here's a simple Unix shell script you can add to your cron configuration.
#!/bin/sh
# Remove older Confluence backup files. Retain at least the last
# two weeks' worth of backups in case a restore is needed.
BACKUP_DIR=/var/confluence/backups   # adjust to your Daily Backup directory
DAYS_TO_RETAIN=14
find "$BACKUP_DIR" -maxdepth 1 -type f -ctime +"$DAYS_TO_RETAIN" -delete
Confluence backs up the site to a backups subdirectory. You can tell where that is (and change it if you want) under your Administration -> Daily Backup Admin page.

I put this script on my Linux box in the /etc/cron.weekly directory. Since I run the script only weekly, more files will build up than defined by the DAYS_TO_RETAIN variable, but my Confluence site is small and this doesn't matter to me. If your site is larger, you might want to put the script under /etc/cron.daily.

Confluence uses Quartz to schedule backups, so if you want to change the time from 2 a.m. (or make backups more or less frequent), see Confluence's "Changing time of Daily Backup" documentation page for instructions.

Concurrency the Java 1.5 way

I attended a talk tonight on Java concurrency presented by Stuart Halloway at the Northern Virginia JUG that provided a refresher on the java.util.concurrent package. Stuart is one of the founders of Relevance, author of Component Development for the Java Platform, a frequent speaker at technical symposiums, co-author of Rails for Java Developers, and a great technical speaker.

Stuart Halloway
Stuart spent the first few minutes telling us why he now focuses more on Ruby and Rails than on Java. Paraphrasing the title of a Java book written by his Relevance partner Justin Gehtland and Bruce Tate, Stuart says, "We can go even 'betterer,' 'fasterer' and 'lighterer' with some other technologies" like Rails and the Streamlined framework. However, when multithreading and concurrency are needed, Java way outshines the current state of Ruby and Rails, he said.

When considering multi-threading in order to increase the speed of a process, it is important to consider whether the slowness is due to the application being suspended while waiting for an external resource (e.g. a database, user input, disk), or whether the process is suspended while waiting for free CPU cycles. If the process is waiting for an external resource, Stuart said, the language and the number of CPUs won't matter much. "Java, assembly language and PHP all wait at the same speed," he said.

Stuart's talk covered:
  • Threads
  • Tasks and scheduling
  • Locking
  • Concurrent collections
  • Alternatives to threads
The key point to remember about Java threads is that they share code, data, resources, and heap storage, while each thread has its own instruction pointer and stack. Threading often isn't needed in server-side programming because components like EJBs and JEE container services abstract multi-threading away from the developer. But threading is useful when you need to:
  • Keep a user interface responsive (think Swing)
  • Take advantage of multiple processors in compute-heavy applications
  • Simplify code that would otherwise need to keep checking if other tasks need to be performed (implementing their own task-scheduling loop)
Before Java 1.5, Thread objects were the main way to achieve concurrency in the Java language. Developers would write a class that implements Runnable and pass an instance to a Thread. Two shortcomings of the Runnable interface are that its single method, run, doesn't return anything, and that it isn't declared to throw an exception to indicate something went wrong. "It's completely wrong," Stuart said.
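To make those shortcomings concrete, here's a minimal sketch of the pre-1.5 style (class and field names are my own): because run() can't return a value, the result has to be smuggled out through shared state.

```java
// Pre-1.5 style: implement Runnable and hand it to a Thread.
// run() returns nothing and can't throw a checked exception, so
// results and errors must travel through shared fields instead.
public class RunnableDemo {
    static class Summer implements Runnable {
        private final int[] numbers;
        volatile int result;          // carries the "return value" out of run()

        Summer(int[] numbers) { this.numbers = numbers; }

        public void run() {
            int sum = 0;
            for (int n : numbers) sum += n;
            result = sum;             // no way to return this directly
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Summer summer = new Summer(new int[] {1, 2, 3, 4});
        Thread t = new Thread(summer);
        t.start();
        t.join();                     // wait for the worker thread to finish
        System.out.println(summer.result);  // prints 10
    }
}
```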

Java 1.5 introduced higher-level classes that abstract away direct use of Thread objects. It introduced the Callable interface, whose call method does return something and is declared to throw an Exception. Programmers write Callable classes and pass instances to an ExecutorService, obtained from the Executors factory class or perhaps from an external library. The ExecutorServices provided by the Executors factory offer single-threaded execution or execution by one of two types of thread pools: a cached, expandable thread pool or a fixed-size thread pool.

When you give a Callable to an ExecutorService, you get back a Future object that will hold the result of the Callable's execution. The result can be an object, or an exception that is rethrown when you ask for the result. Stuart demonstrated code that exercised the new threading classes and showed how to use them. The code and the slides from his presentation are available online.
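Here's a minimal sketch of that flow (class and variable names are my own): a Callable submitted to an ExecutorService from the Executors factory, with the result retrieved through the returned Future.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallableDemo {
    public static void main(String[] args) throws Exception {
        // Unlike Runnable.run(), call() returns a value and may
        // declare checked exceptions.
        Callable<Integer> task = new Callable<Integer>() {
            public Integer call() throws Exception {
                int sum = 0;
                for (int i = 1; i <= 4; i++) sum += i;
                return sum;
            }
        };

        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<Integer> future = pool.submit(task);  // runs asynchronously
        System.out.println(future.get());            // blocks for the result; prints 10
        pool.shutdown();
    }
}
```

Future.get() either returns the Callable's result or rethrows any exception the task threw, wrapped in an ExecutionException.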

The Need for Locking

You don't need locks if you're just telling separate tasks to run concurrently. You need locking code when multiple threads access the same data at the same time. Java provides lock support with the:
  • synchronized keyword and blocks
  • Java 1.5 Lock interface objects, which offer an improvement over a straight synchronized block because you can tell the code to give up its attempt to acquire a lock after a timeout period expires.
  • ReadWriteLock interface, which offers separate locks for whether the process needs to read data or alter the data.
If you want, you can tweak how the ReadWriteLock operates, such as defining whether readers or writers get lock priority.
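A short sketch of the capabilities above (names are mine): tryLock gives up after a timeout instead of blocking forever, and ReentrantReadWriteLock's constructor takes a fairness flag, which is the standard knob affecting how waiting readers and writers are ordered.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockDemo {
    public static void main(String[] args) throws InterruptedException {
        Lock lock = new ReentrantLock();
        // Unlike a synchronized block, tryLock can give up after a timeout.
        if (lock.tryLock(100, TimeUnit.MILLISECONDS)) {
            try {
                System.out.println("acquired");
            } finally {
                lock.unlock();   // always release in a finally block
            }
        }

        // Separate read and write locks; "true" requests a fair lock,
        // which hands the lock out in roughly arrival order rather than
        // letting readers or writers barge in.
        ReadWriteLock rw = new ReentrantReadWriteLock(true);
        rw.readLock().lock();    // many readers may hold this concurrently
        rw.readLock().unlock();
    }
}
```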

Concurrent Collection Options

Strategies for handling concurrent access to collections, and their implications:
  • Do nothing
    It's fast, simple, but not thread-safe
  • Fail-fast iterators (introduced in Java 1.2)
    Fast, not thread-safe; misuse of concurrent access will probably cause a fast failure. Fail-fast uses optimistic locking: it assumes everyone can access a shared resource and relies on clean-up code if something goes wrong with multi-threaded writes. Java collections implement the fail-fast strategy using version numbers that iterators check to see whether the collection has changed.
  • Lock the entire collection
    Simple, slow, might be thread-safe (like Hashtable)
  • Lock partial collection
    Complex, maybe faster, maybe thread-safe.
  • Copy on write
    Fast read access, may read stale data. When you write to a collection, you get new copy, so your write can proceed. Iterators for reading threads point to older collection, so data can be stale.
  • Immutable
    Fast, simple, thread-safe, cannot change objects.
  • Application-controlled locking
    Difficult, allows any combination of the above strategies.
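The fail-fast strategy in the list above is easy to demonstrate (a sketch, names mine): structurally modifying an ArrayList after creating an iterator bumps the collection's internal version number, so the iterator's next call fails fast.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class FailFastDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        list.add("a");
        list.add("b");

        Iterator<String> it = list.iterator();
        list.add("c");   // structural change bumps the version number

        try {
            it.next();   // iterator sees the stale version and fails fast
        } catch (ConcurrentModificationException e) {
            System.out.println("fail-fast");  // prints fail-fast
        }
    }
}
```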
Java Collections Design Choices

Collections and strategies
  • Legacy (pre-Java 1.2): Lock entire collection
  • Collections (1.2) API: Lock none, fail-fast iterators
  • Synchronized wrappers (1.2): Lock entire collection
  • ConcurrentHashMap: Lock partial collection
    Uses "lock striping" to allow concurrent use of different buckets in the hash.
  • CopyOnWriteArrayList: Copy-on-write
    Very expensive if using big arrays that are written to regularly. Every write to the collection copies it again. Only advantageous if data is read-mostly.
  • String: immutable
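A small sketch of two of the collections above in action (class name is mine): ConcurrentHashMap allows safe concurrent access without locking the whole map, while CopyOnWriteArrayList copies its backing array on every write, so an iterator created before a write keeps reading the old, stale snapshot.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class ConcurrentCollectionsDemo {
    public static void main(String[] args) {
        // ConcurrentHashMap: lock-striped; its iterators never throw
        // ConcurrentModificationException.
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        map.put("hits", 1);
        map.put("hits", map.get("hits") + 1);
        System.out.println(map.get("hits"));    // prints 2

        // CopyOnWriteArrayList: each write copies the array, so this
        // iterator keeps pointing at the pre-write snapshot.
        List<String> list = new CopyOnWriteArrayList<String>();
        list.add("a");
        Iterator<String> it = list.iterator();
        list.add("b");                          // copies the backing array
        int seen = 0;
        while (it.hasNext()) { it.next(); seen++; }
        System.out.println(seen);               // prints 1: the snapshot is stale
    }
}
```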
Alternatives to Threads

Alternatives, pros and cons:
  • Container-managed threads: Pro: simple. Con: inflexible
    Like J2EE containers: you write applications as if you were the only user of the object. Scales well because most server-side data lives in the database, and the DB controls concurrency.
  • Non-blocking I/O: Pro: do work as it becomes available. Con: as complex as using threads
    For example, the java.nio package (1.4). Pro: start multiple operations and be notified when they're done. Con: as complicated as threads, and since server-side code is oriented around blocking waits, it tends to get ignored there.
  • Use multiple processes: Pro: simple. Con: inflexible
    When you need to perform more work, start more processes.
  • Event-driven code: Con: as complex as threads
  • Do nothing: Pro: simple. Con: slow (but performance might not matter for the application)
    "Probably more time has been wasted by optimizing code that doesn't need to be optimized."
Stuart also discussed the double-checked locking Java anti-pattern and why it is a problem. Heck, the perils surrounding the use of double-checked locking in Java have been known since what, 1997, when I think Java Developer's Journal published an article on it. But I've seen wickedly smart developers insert this potentially evil anti-pattern into their code out of ignorance of the subtle problem. I'm glad Stuart mentioned it as a reminder.
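For reference, here's a sketch of the idiom with the standard Java 5 fix applied (class name is mine): declaring the field volatile, which the revised Java 5 memory model makes safe; the pre-1.5 memory model did not.

```java
public class LazySingleton {
    // The broken double-checked-locking idiom omits "volatile"; without it,
    // another thread can observe a partially constructed instance. Under the
    // Java 5 memory model, marking the field volatile makes the idiom safe.
    private static volatile LazySingleton instance;

    private LazySingleton() { }

    public static LazySingleton getInstance() {
        if (instance == null) {                 // first check, no lock
            synchronized (LazySingleton.class) {
                if (instance == null) {         // second check, with lock
                    instance = new LazySingleton();
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance());  // prints true
    }
}
```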

For Java developers interested in learning more about programming using concurrency, Stuart recommended Java Concurrency in Practice by Brian Goetz. The book mixes academic rigor on threading with practical implications for Java developers, he said.

Stuart also will be in town Wednesday night to speak at the Northern Virginia Ruby User's Group. He'll be talking about the Streamlined framework for rapidly developing CRUD applications in Rails.

Stopping Firefox from auto-searching from the address bar

Of the many wonderful features I enjoy in the Mozilla Firefox browser, the feature to automatically perform a Google "I'm feeling lucky" search whenever I accidentally type a bad URL or keyword into the address bar isn't one of them.

The feature in question is Firefox's default behavior when what you type in the address bar doesn't look like a valid URL or one of the "Quick Search" bookmark keywords: it sends the text to Google's "I'm feeling lucky" search and takes you to whatever page comes back. I really like the keyword search feature. I'm constantly typing goo <search terms> into the Firefox address bar to perform a fast Google search. Firefox comes predefined with a Google search keyword as one of the "Quick Search" bookmarks. (I shortened the default "google" keyword for simplicity.) Firefox also comes with predefined keywords to search Wikipedia (wp), an online dictionary (dict), a stock-price lookup (quote), etc.

The problem with Firefox's automatic "I'm feeling lucky" search arises when I mistype one of my keyword searches. For instance, if my Yahoo keyword search is "yah" but I accidentally type
ya spring mvc
instead of performing a Yahoo search for sites talking about Spring's MVC web framework, I end up, for instance, at one of the Spring forum pages that happens to have the word "ya'll" in it. Google's "I'm feeling lucky" search thinks that's the page I want because it has the words ya, spring, and mvc. (We can leave aside for now whether ya'll is a word.)

Since I don't usually want Google to select a search-result page for me, I don't often use its "I'm feeling lucky" search. Except, that is, when I have a typo in my Firefox URL or keyword search on new installations of Firefox before I disable the automatic search feature.

Here's how to tell Firefox not to perform an "I'm feeling lucky" search whenever it doesn't understand the address you type in:
  1. Put your cursor into the Firefox address bar (Ctrl-L is a fast way)
  2. Type about:config
  3. In the Filter text box, type "keyword" and hit Enter or wait a second. You'll see a line that says:
    keyword.enabled                default  boolean  true
  4. Double-click the line. It will turn bold, and the value will change from true to false to indicate the feature is now turned off.
  5. You're done.
Now, whenever you type the wrong keyword or a bad URL into Firefox, you'll see an Alert box that says the URL is not valid and cannot be loaded. To me, that's what I'd expect Firefox to do rather than take me to some semi-random, unexpected page.

After you type the "about:config" and filter the results to the "keyword" configuration settings, you'll also see the "keyword.URL" preference setting. That's the URL Firefox uses to take you to the "I'm feeling lucky" Google search. If you like the Firefox auto-search feature but want to change the search URL, you can change the value by right-clicking on the line and selecting "Modify."

This level of configurability in Firefox is one of those features I really like. Instead of the typical software attitude of, "You don't like our default behavior? Tough!" Firefox lets you change many preferences to suit your own likes and dislikes.

JNDI error with Roller weblogger on Fedora Core 5

While planning my upgrade from Roller 2.1 to 2.3, I ran into an unexpected snag in Tomcat 5.5 running on Fedora Core 5. After I installed Roller 2.3 on my Fedora server and upgraded my database, Tomcat couldn't seem to create the <Resource> for the JDBC datasource. My catalina.log file had this unrevealing (to me) error:
Sep 22, 2006 9:23:29 AM org.apache.catalina.core.NamingContextListener addResource
WARNING: Failed to register in JMX: javax.naming.NamingException: Cannot create resource instance
Sep 22, 2006 9:23:47 AM org.apache.catalina.core.StandardContext start
SEVERE: Error filterStart
Sep 22, 2006 9:23:47 AM org.apache.catalina.core.StandardContext start
SEVERE: Context [/roller] startup failed due to previous errors
Exception in thread "Thread-9" java.lang.NullPointerException
The "Error filterStart" message seems to be Tomcat telling me that it couldn't start one of Roller's servlet filters after WorkerThread threw the NPE. Nowhere was Tomcat telling me, however, what resource it was trying to add at the time.

The messages in the roller.log file showed more helpful information. Roller obviously couldn't pull the DataSource from JNDI using the naming context java:comp/env/jdbc/rollerdb:
INFO  2006-09-22 09:23:30,239 RollerConfig:<clinit> - successfully loaded default properties.
INFO  2006-09-22 09:23:30,293 RollerConfig:<clinit> - successfully loaded custom properties file from classpath
INFO  2006-09-22 09:23:30,297 RollerConfig:<clinit> - no custom properties file specified via jvm option
WARN  2006-09-22 09:23:30,347 RollerContext:upgradeDatabaseIfNeeded - Unable to access DataSource
javax.naming.NamingException: Cannot create resource instance
[stack trace]
[Hibernate logging statements]
INFO  2006-09-22 09:23:41,904 NamingHelper:getInitialContext - JNDI InitialContext properties:{}
FATAL 2006-09-22 09:23:41,914 DatasourceConnectionProvider:configure - Could not find datasource:
javax.naming.NamingException: Cannot create resource instance
[stack trace]
ERROR 2006-09-22 09:23:41,928 RollerFactory:setRoller - Error instantiating
[stack trace]
Caused by: org.apache.roller.RollerException
... 31 more
FATAL 2006-09-22 09:23:41,938 RollerFactory:setRoller - Failed to instantiate fallback roller impl
java.lang.Exception: Doh! Couldn't instantiate a roller class
Still, the log messages weren't pointing me to why the datasource wasn't getting registered. I checked and double-checked the spelling in my Tomcat roller.xml context file and couldn't find any typos or bad configuration settings.
My Roller configuration seemed correct. My initial Google search didn't turn up a solution, and neither did searching the Roller FAQs or the Apache roller-user mailing list. But I noticed that when I googled the exact string "WARNING: Failed to register in JMX: javax.naming.NamingException: Cannot create resource instance" from the Tomcat log, both search results mentioned Fedora Core. My environment:

O/S: Fedora Core 5 (Linux kernel 2.6.17)
Server: Tomcat 5.5.15 (from FC5 "core" repository)
JVM: Sun's Java 1.5 (build 1.5.0_07-b03)
Database: MySQL 5.0.22 (from FC5 repository)

I thus shortened some of my search terms and added "Fedora Core 5" into the search, and came across this message from PKR Internet's task list for its "Taskjitsu" product that pointed right at the problem. According to that message, Tomcat 5.5 is built so the default datasource naming factory is org.apache.tomcat.dbcp.dbcp.BasicDataSourceFactory from the naming-factory-dbcp.jar JAR file. But the Fedora Core install package for Tomcat 5.5, tomcat5-5.5.15-1jpp_6fc, does not ship that JAR or its factory class in any other JAR file.

There are two ways to solve the problem. The first is to add a factory attribute to the <Resource> element in the Roller webapp's roller.xml context file, naming the Jakarta Commons DBCP BasicDataSourceFactory class.
The second way is to grab a copy of naming-factory-dbcp.jar from a binary distribution of Tomcat 5.5 and install it in Fedora Core's Tomcat common/lib directory (by default /usr/share/tomcat5/common/lib). I don't know whether one solution is preferable to the other. Both work, and the latter will probably resolve the same issue for other web applications. However, adding two JARs that might share some of the same classes could lead to future ClassCastExceptions if the order of the JARs in a classpath search ever changes. That doesn't seem likely, though, as long as Tomcat controls the search order for classes in common/lib. Comments appreciated on whether one solution is better than the other.

After I found these solutions, I was going to post it as an FYI to the roller-user mailing list. Before I did, I poked again through the list archives to see if someone else already mentioned it. Sure enough. I found this email from Conor P. Cahill, posted July 6, by searching for fedora. Conor's subject line was "MySQL Database connector problems," which is why I think I missed it on my first search.

His email detailed the problem and proposed adding the factory attribute to the <Resource> element.

Since I didn't find Conor's email on my first search for a solution, I thought I'd post the problem and solution here, with the log errors, in the hope it helps other Fedora Core users.

Atlassian Branches Into CI and SSO

Last night, I attended what was billed as the first-ever Atlassian user-group meeting. Scott Farquhar, one of the founders of Atlassian Software Systems in Australia, was here in northern Virginia for the event.

One of the more interesting segments of the evening was Farquhar's roadmap of future Atlassian products and what's coming in new versions of JIRA and Confluence. In addition to those issue-tracking and wiki products, Atlassian will be releasing a continuous integration product called Bamboo (available for download in early beta form), and a single sign-on and identity management product called Crowd. Both products will be priced in the $1,000 to $5,000 range.

One of the key motivators for Bamboo was that existing CI products, like CruiseControl, are complex to install and configure, Farquhar said. A goal of Bamboo is to be up and running in five minutes.

Crowd will be Atlassian's release of Authentisoft's IDX single sign-on product, developed in J2EE. Atlassian acquired Authentisoft earlier this month. More about the IDX acquisition is available on this TSS discussion thread.

For existing products, Farquhar said JIRA 3.7 will bring project roles and Issue Navigator views, and version 3.8 will support hierarchical project categories. Internally, he said, JIRA 4.0 will be built using Maven 2, and more of the base functionality will be pushed into plugins for easier customization. Confluence 2.3 will bring a clustered version to scale to several thousand users (with the help of Tangosol's Coherence clustered-caching product) and a people directory to view and find other wiki users.

It was interesting to hear that Confluence has a bigger need to scale than does the more popular JIRA issue tracker. Most JIRA installations manage projects for a division, he said, but companies are installing Confluence to be their corporate-wide collaboration tool, so it needs to be clustered. Because more large companies are using Confluence, Farquhar said, version 3.0 will add improved WYSIWYG page editing, as well as LDAP support, better backup and restore, and a simple installer.

Also as part of the evening, Jonathan Nolen from Atlassian talked about the latest JIRA and Confluence plugins. Some of the plugins, like embedding an Excel document in a wiki page and displaying a calendar from an iCal file, look downright useful.

Subversion Best Practices Notes

These are notes from a webinar on Subversion best practices, conducted Aug. 30, 2006 by CollabNet. Now that I've cleared the cobwebs from this blog, I'm posting them here for future reference.

Presenter: Garrett Rooney, author of Practical Subversion, and a Subversion project committer.

Two organizational strategies: unstable trunk vs. stable trunk:

Unstable trunk
  • Development occurs in unstable trunk
  • All developers are exposed to your changes immediately, so find problems faster
If you use unstable trunk, here's what helps make it work:
  • Atomic commits (one change does one thing)
  • Developers who pay attention by watching what gets committed
Stable trunk
  • In a stable-trunk model, you develop on branches and merge them back into the trunk when they're stable
  • Bugs don't disrupt people as much, but bugs don't get exposed as fast
  • On big teams, you might prefer a stable trunk because a bug committed to trunk could inconvenience hundreds of people. An alternative is to have each team work in its own unstable trunk.
Best Practices
  • PREFER an unstable trunk when possible
    Getting more eyes on changes is a good thing. If you have good tests, people are unlikely to commit serious problems to trunk.
  • For each project, create a trunk, branches and tags directory
    • Branches are copies of the trunk stored in the "branches" directory
    • Same with tags
    • This is just a convention, but good because people are used to it
  • Encourage frequent, small commits
    • If it doesn't get committed, it doesn't exist
    • Avoids lost code due to disk failure, accidental deletes
    • But you need to be confident your changes won't break things. Good tests help here
  • Each commit should target one goal
    • Prefer smaller changes -- each change done for one goal
    • Makes changes easier to understand, and simplifies merging between branches.
  • Goal is to optimize changes for understandability
  • Make sure people understand the change:
    • What did you change?
    • Why did you change it?
    • What files and functions were modified?
      • That way, can search change logs for specific functions
  • Write good commit log messages
    • Keep log message in your head if small changes
    • Write log messages as you go
    • Tools to simplify: tools/dev/svn-del.el (emacs macro)
  • Tests are critical for unstable trunks
    • Good tests that all developers can run will help
    • Create a culture where breaking tests is just as bad as breaking build
    • Consider test-first development
  • Commit notifications should go to all developers
    • Developers need to know what changes are being made
    • Use post-commit hook script to email changes
    • Use project-specific mailing lists
    • Email notification makes it harder for people to ignore changes
    • Creates a forum for review and for discussing pros/cons of changes
  • Avoid branching when possible
    • But don't be afraid to branch when it's needed

Use branches when:
  • Long-term refactoring that will break things
  • New modules that won't work for some time
  • Release branches
  • Long-lived development branches
  • Experimental branches
Branch risks:
  • Creates extra work because you'll need to merge
  • Divides developer attention by separating code bases
  • Means you have to worry about merging changes
  • Avoid branching when possible
Reducing pain of branches:
  • svn doesn't track merges (yet).
  • Remember when you branched
  • Record what you're merging whenever you commit
  • Tools can help; the svn project uses one to automate merges
  • Keep branches short-lived: changes are fewer, merges easier
  • Regularly merge back from branch to trunk
  • Don't make gratuitous whitespace/formatting changes. It makes merges harder later on

Free Confluence Personal Wiki

When I was looking for a Java wiki application to install on my personal Tomcat server, I saw that Atlassian offers a free "personal" version of its Confluence wiki server. Confluence is a super-wiki application that uses a database backend. Instructions are provided to set up Confluence with various open-source and commercial databases; it talked to my PostgreSQL server with no problem.

Now that I've been using Confluence for a couple of weeks, I recommend the free personal version for people wanting to install a wiki for their own use (i.e. one not meant for full team collaboration). It was fairly easy to install, and the features are nice even for personal use. Atlassian's personal license allows you to create two registered users with full access, and allows an unlimited number of anonymous visitors. The license is perpetual, and allows one year of upgrades.

The Confluence interface is nice. The wiki provides all the expected simple formatting features (heading levels, bold, italic, underline, bulleting/numbering, text colors, tables) to avoid having to use HTML. Other nice features:
  • Blogging support
  • Add multiple RSS feeds to a single page for a quick blog browse
  • Creates RSS feeds from your Wiki pages (new/updated pages, new comments, new blog posts)
  • Formatters to view Java code, XML, etc. in a nicer format
  • Task/ToDo list
  • Search
  • Organize sections of the site using "spaces"
  • Nice formatting features, like adding boxed panels for "Warning," "Info," "Note" sections to highlight blocks of text
  • Extensibility through plugins
I had previously used JSPWiki, which I certainly liked. Confluence, however, shows the polish and extra features you'd expect in a commercial product.

Now, I suppose the question Atlassian would like answered is, would I buy Confluence for $1,200 to $8,000? (Atlassian prices Confluence based on number of full-access registered users.) If I needed the extra features, especially the way Confluence integrates with Jira, Atlassian's more well-known bug-tracking product, I'd probably say yes. But if I just needed a way for my small team to collaborate in a shared web space, with RSS feeds to keep the team updated, I might want to save costs and use one of the freeware wikis.

As a side note, I just noticed that Atlassian will be in town (Washington, D.C.) Tuesday for a user group meeting. The meeting will look at using Jira and Confluence on agile projects. I hope to have time to stop by and see what's up.

An Unwitting Intro to SELinux

I recently upgraded my old RedHat 9 server to Fedora Core 5 and unexpectedly had to explore some of the intricacies of Security Enhanced Linux. The need to upgrade my home server came after I wanted to expose some public services, like a web server, to remote users. I was leery of opening ports on a server I hadn't upgraded for more than two years. Upgrades to RedHat 9 were easy when I first installed it. The O/S came with the RedHat Network, a service that allowed me to upgrade packages with the click of a button. But then RedHat dropped the free RHN service, and I left my packages to languish in old, and probably insecure, states.

When I wanted to add a public web server to the box, I first considered trying to figure out how to upgrade its RH9 packages. I found that during my period of upgrade negligence, kind folks have replaced the RedHat Network by offering "legacy" upgrades to all sorts of packages, built for RedHat 9. I could even install the Yum package-management utility to make upgrading easier.

Rather than go the easy route, and since I wanted to upgrade my 2.2 kernel anyway, I bit the bullet and downloaded the FC5 ISOs. From what I read about upgrading versus a fresh install, I went with a full install. Developers who went the upgrade route from RH9 reported big problems. I began by backing up important data onto a second hard drive (formatted as ext3) installed in the box -- which was the start of my upgrade problems, since the second hard drive is 80 GB and the computer's BIOS doesn't recognize anything larger than 32 GB. But I optimistically believed if RedHat 9 could work with this drive, certainly Fedora Core 5 could.

The install went relatively well on my old Micron Pentium III, maxed out at 384MB of RAM (did I mention it was old?). I tried the graphic installation, but that failed and completely locked the computer, requiring a hard reset. The Fedora Core 5 installer couldn't handle the video built into the motherboard. It didn't fail right away, of course. It first walked me through several steps before locking the computer. Once I went to the text-only install (which was more familiar to me after years of installing versions of RedHat Linux since its 4.x days), the install proceeded mostly uneventfully.

With FC5 installed, I tried to mount the second hard drive so I could restore my precious data. I used the *exact* same fstab entry I had used under RedHat 9 -- which should work, right? Nope. I got an error message about the drive not being formatted. I will mention that I was ultra-cautious during the install not to format this second device, hdb. I spent lots of time with online manuals, including the Large Disk HOWTO and the man pages for parted, fstab and mount.

After I tried many different ways to mount the drive, a Maxtor 4K080H4, and used the parted command to examine the partition table, it appeared the mount command was getting confused by a tiny DOS partition on the disk. That partition holds the bootable MaxBlast utility I had used on an older Linux box: the computer would boot into MaxBlast, which ran some code to fool the BIOS into recognizing the 80 GB hard drive, then booted Linux.

The magic solution came with adding a boot parameter, hdb=remap. From what I've read online, it appears the newer Linux kernels don't automatically skip over the MaxBlast hard drive utility. The Large-Disk-HOWTO manual even mentions using "hda=remap" as a way to solve some problems, but on my initial readings, I didn't realize those problems were my problem because I wasn't booting off of this second drive, just trying to mount it.

On to the SELinux issues.

FC5 comes with SELinux (Security Enhanced Linux) built into the kernel. I had heard of SELinux, but I had always thought of it as its own distribution, not something other distros were using. The security enhancements are defined as policies, which tell the kernel what an application is allowed to do, what types of files it can access, where an application should be run from, and probably many more settings. Looking back at all the SELinux problems I have had, I conclude that every one of them stems from not understanding its complexities. This seems to be the case for most SELinux users out there. In trying to solve my issues, I have read several mailing lists and blogs where users or vendors conclude that the way to solve SELinux problems is to disable it.

My first SELinux problem came with configuring Apache httpd. I configured httpd's DocumentRoot in httpd.conf to a directory in my second hard drive, the very one I had problems mounting initially. I'd start httpd and it would tell me it couldn't access the document root directory. All permissions (that I knew of) were set correctly. There was no reason at all httpd shouldn't be able to read its document root directory. No reason -- except for SELinux.

When SELinux on FC5 sees and stops a security violation, it writes an entry into /var/log/messages, but otherwise gives no other indication that it intervened and silently halted some activity. Fedora Core 6 is supposed to improve on this silent failure mode by introducing setroubleshoot. After I learned to check the system log for SELinux-related failures, here's what I found:

kernel: audit(1154839467.971:11): avc:  denied  { search } for  pid=29238 comm="httpd" name="/"
dev=hdb5 ino=2 scontext=user_u:system_r:httpd_t:s0 tcontext=system_u:object_r:file_t:s0 tclass=dir
It took me a while to learn to read these entries. It means the "httpd" application was denied permission to search (the "x" in the usual directory mode permission) the top-level directory on the hdb5 mounted file system. The name="/" threw me off at first because "/" is mounted on a completely separate hard drive. Apparently, SELinux knows only that the directory in question was the top level on the device it was trying to search, and doesn't bother to let you know the actual mount path to it.

The scontext is the source context SELinux was expecting to find for that resource. This context is defined in an SELinux policy somewhere, which defines what "httpd" should be allowed to do. The tcontext was the target context SELinux actually found for the resource it was trying to access. In the SELinux view, every file, directory and resource on the computer has a security context. You can configure SELinux to define what should be able to access what.

My conclusion: SELinux is pretty spiffy -- until you violate a security policy and can't figure out how to solve it. SELinux on FC5 seems like a complicated burglar alarm: you don't use it because you can't figure out what all the buttons are for. The lack of good SELinux documentation is defeating the SELinux goal, which is to protect the system from intentional attacks and unintended goofs. I'm hoping FC6's release, planned for next month, will make SELinux on Fedora simpler to use -- and more transparent when it intervenes. A burglar alarm won't ring when it's shut off.