[Review] FreeBSD 9.0

What’s New in FreeBSD 9.0

This article provides an overview of some of the new features available in FreeBSD 9.0.
FreeBSD 9.0-RELEASE introduces many new features which benefit FreeBSD users, application developers, and companies that use or base their products on FreeBSD. This article provides an overview of some of these features, including references to additional information. It does not list every new feature; the FreeBSD 9.0 Detailed Release Notes, available from freebsd.org, contain a summary of all the changes introduced in 9.0.
This article discusses features in the following categories: security, compilers and testing frameworks, filesystems and storage, networking, and miscellaneous.

Security

Capsicum Framework

Capsicum is a lightweight framework which extends a POSIX UNIX kernel to support new security capabilities and adds a userland sandbox API. It was originally developed as a collaboration between the University of Cambridge Computer Laboratory and Google, sponsored by a grant from Google, with FreeBSD as the prototype platform and Chromium as the prototype application. FreeBSD 9.0 provides kernel support as an experimental feature for researchers and early adopters. Application support will follow in a later FreeBSD release and there are plans to provide some initial Capsicum-protected applications in FreeBSD 9.1.
Traditional access control frameworks are designed to protect users from each other through the use of permissions and mandatory access control policies. However, they cannot protect the user when an application, such as a web browser, processes many potentially malicious inputs, such as HTML, scripting languages, and untrusted images. Capsicum provides application developers fine-grained control over files and network sockets to provide privilege separation within an application, with minimal code changes. In other words, it provides application compartmentalization, allowing the application itself to provide many different sandboxes to contain its various elements. As an example, each tab in the Chromium browser has its own sandbox; it is also possible to contain each image in its own sandbox. Creating sandboxes under Capsicum does not require privilege, a key problem with current UNIX sandbox approaches.
As an example, the insecure tcpdump application can be sandboxed with Capsicum in about 10 lines of code and the Chromium web browser can be sandboxed in about 100 lines of code. capsicum(4) provides an overview of the available system calls. More information, including links to technical publications, projects, and a mailing list, can be found at the Capsicum website: http://www.cl.cam.ac.uk/research/security/capsicum/.

Resource Limits

rctl(8) has been added to the system, allowing the user to display the current resource limits and to define what action will occur when a process exceeds its limits. Resource rules can be applied to processes, users, login classes, or jails. The racct API tracks per-process, per-jail, per-loginclass, and per-user resource accounting information. More information about resource limits and rctl can be found at http://wiki.freebsd.org/Hierarchical_Resource_Limits.
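As a rough illustration of the rule format that rctl(8) accepts (subject:subject-id:resource:action=amount/per), the following Python sketch composes rule strings for the four subject types named above. The helper name make_rctl_rule and the user name joe are hypothetical, and the resource and action names in the example (vmemoryuse, deny) are just one combination from the rctl(8) manual page; consult that page for the full list on your release.

```python
# Hypothetical helper: builds an rctl(8) rule string of the form
#   subject:subject-id:resource:action=amount[/per]
# It only formats text; it does not talk to the kernel.

SUBJECTS = {"process", "user", "loginclass", "jail"}


def make_rctl_rule(subject, subject_id, resource, action, amount, per=None):
    """Return a rule string, rejecting subject types rctl does not know."""
    if subject not in SUBJECTS:
        raise ValueError(f"unknown subject type: {subject!r}")
    if per is not None and per not in SUBJECTS:
        raise ValueError(f"unknown 'per' type: {per!r}")
    rule = f"{subject}:{subject_id}:{resource}:{action}={amount}"
    if per is not None:
        rule += f"/{per}"
    return rule


if __name__ == "__main__":
    # Deny user "joe" (a made-up account) more than 1 GB of virtual memory.
    print(make_rctl_rule("user", "joe", "vmemoryuse", "deny", "1g"))
```

The resulting string is what an administrator would pass to rctl -a to add the rule.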

Compilers and Testing Frameworks

LLVM Compiler Infrastructure

LLVM is a BSD-licensed compiler infrastructure with similar capabilities to the GPL3-licensed GCC compiler collection. Clang is the C, C++, Objective C, and Objective C++ front-end to LLVM and provides an alternative programming environment for developers and companies who prefer to use a BSD-licensed toolchain.
In addition to being BSD-licensed, Clang improves developer productivity with significantly improved error messages and a static code analyzer. The compiler is easily extendable to support research on new language features or code instrumentation.
Beginning with FreeBSD 9.0, the FreeBSD kernel and world can be compiled using Clang on most of the supported architectures. Work is ongoing to migrate the ports infrastructure so that any port can also be compiled with Clang. Details about architecture support, link time optimizations, automatic test generation, and links to additional resources can be found at http://wiki.freebsd.org/BuildingFreeBSDWithClang. More information about Clang can be found at http://clang.llvm.org/ and more information about LLVM is available from http://www.llvm.org/.
A video of Brooks Davis describing how the FreeBSD Project has been actively working to incorporate tools from the LLVM project into the base system is available at YouTube.

You can follow the status of the ports infrastructure with regard to Clang at http://wiki.freebsd.org/PortsAndClang.

Userland DTrace

DTrace is a general purpose, lightweight tracing framework that allows administrators, developers, and users to investigate causes of system failure or performance bottlenecks. FreeBSD introduced kernel-level DTrace support in FreeBSD 8.0. The addition of user-level DTrace support in 9.0 allows inspection of userland software and its correlation with the kernel, thus providing a much better picture of what exactly is going on behind the scenes.
http://wiki.freebsd.org/DTrace provides examples for using both kernel- and user-level DTrace on FreeBSD, as well as links to other DTrace resources.

Filesystems and Storage

Highly Available Storage (HAST)

The Highly Available Storage framework allows for synchronous, block-level replication of any storage media across several physically separated machines connected by a TCP/IP network. HAST can be understood as a network-based mirror, similar to Linux DRBD. When combined with FreeBSD’s carp(4), HAST makes it possible to build a highly available storage cluster that is resistant to hardware failures.
HAST is file system and application independent and can be combined with any existing GEOM class. In case of a primary node failure, the cluster will automatically switch to the secondary node, check and mount the UFS file system or import the ZFS pool, and continue to work without missing a single bit of data.
The FreeBSD Handbook describes how to configure HAST: http://www.freebsd.org/doc/handbook/disks-hast.html.

SU+J

Journaled softupdates for UFS is now the default filesystem type. It adds a light version of journaling to soft updates as described in this technical paper: http://www.mckusick.com/BSDCan/bsdcan2010.pdf. This significantly reduces boot time after an improper shutdown, because a background fsck only needs to be run if the journal log itself is corrupted.

ZFSv28

FreeBSD 9.0 ships with ZFSv28. This version of ZFS adds the following features:

  • deduplication: the process of eliminating duplicate copies of data. When enabled on datasets with duplicate data (for example, virtual images or jails), deduplication saves space and increases performance because less data is written and stored.
  • triple parity RAIDZ: RAIDZ3 offers three parity drives and can operate in degraded mode if up to three drives fail with no restrictions on which drives can fail.
  • zfs diff: command which describes which file system level changes have occurred between two snapshots.
  • zpool split: allows an administrator to extract one disk from each mirrored top-level vdev and use them to create a new pool with an exact copy of the data. The new pool can then be imported on any machine.
  • snapshot holds: permit users or applications to place holds on snapshots to prevent them from being deleted.
  • zpool import -F: allows the administrator to rewind a corrupted pool to an earlier transaction group.
  • read-only import: the ability to import a zpool as read-only.
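To make the deduplication idea in the list above concrete, here is a toy Python model of a block store that keeps a single copy of each unique block, keyed by its checksum. It only illustrates the concept and is not how ZFS is implemented; the class name DedupStore, the 4-byte block size, and the choice of SHA-256 as the checksum are assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy block store: identical blocks are stored once and reference-counted."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}    # checksum -> block data (one copy per unique block)
        self.refcount = {}  # checksum -> number of references to that block

    def write(self, data):
        """Split data into blocks and return the list of checksums for them."""
        refs = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            key = hashlib.sha256(block).hexdigest()
            if key not in self.blocks:
                self.blocks[key] = block  # first time this content is seen
            self.refcount[key] = self.refcount.get(key, 0) + 1
            refs.append(key)
        return refs

    def read(self, refs):
        """Reassemble the original data from a list of checksums."""
        return b"".join(self.blocks[key] for key in refs)

    def stored_bytes(self):
        """Bytes physically kept, counting each unique block once."""
        return sum(len(block) for block in self.blocks.values())


if __name__ == "__main__":
    store = DedupStore()
    first = store.write(b"AAAABBBBAAAA")
    second = store.write(b"AAAACCCC")
    print(store.read(first), store.read(second), store.stored_bytes())
```

In this run 20 bytes are written but only the three unique blocks are kept, which is the space saving the deduplication bullet describes for datasets with repeated content.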

Generic GEOM I/O Scheduler Framework

This framework schedules disk I/O requests in a device-independent manner, so that different disk I/O schedulers can be used on different I/O providers. It includes a couple of sample scheduling algorithms built on the framework and implements two forms of anticipatory scheduling.
The ability to create different I/O schedulers allows users to select the I/O scheduler best suited to the task. This can increase responsiveness in certain kinds of I/O workloads, such as a mix of sequential and random I/O.
Examples of how to use the provided schedulers can be found at http://svnweb.freebsd.org/base/head/sys/geom/sched/README?view=markup&pathrev=206497.

Changes to CAM and AHCI SATA

The new ATA/SATA driver supports AHCI-compliant hardware, port multipliers, and NCQ (tagged queueing) for increased performance on modern SATA drives. Performance has been greatly increased, larger data transfers are supported, and hot-plugging support is much improved. ATA/SATA drives can now be enumerated and manipulated via camcontrol(8), just like SCSI drives.
The cam(4) subsystem is now modularized and the addition of the ATA/SATA modules allows the CAM subsystem to grow into a framework for arbitrary transports and protocols. It also allows drivers to be written to support discrete hardware without jeopardizing the stability of non-related hardware.

Changes to Event Timer Infrastructure

The new event timers infrastructure provides unified APIs for writing event timer drivers and for choosing the best possible drivers by machine independent code. It provides support for both per-CPU and global timers in periodic and one-shot modes for the i386 and amd64 architectures. To improve performance in virtual machines and power usage in laptops, dynamic tick mode is enabled by default, replacing the periodic hardware timer interrupt ticking with one-shot variable-time ticks. This saves CPU time which would otherwise be spent handling timer interrupts which have no work assigned to them. Tickless mode can be turned off by setting the sysctl value of kern.eventtimer.periodic to 1. Technical details about dynamic tick mode can be found at http://permalink.gmane.org/gmane.os.freebsd.architechture/13276.
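The saving from dynamic tick mode can be sketched with a deliberately simplified Python model: a periodic timer interrupts the CPU at a fixed rate whether or not there is work, while a one-shot timer is programmed only for the next pending deadline. The function names, the tick rate of 1000 per second, and the list of deadlines are assumptions for illustration; the real kernel logic is more involved.

```python
def periodic_interrupts(duration_s, hz):
    """Interrupts taken by a periodic timer ticking hz times per second."""
    return int(duration_s * hz)


def one_shot_interrupts(deadlines_s, duration_s):
    """Interrupts taken when the timer fires only at scheduled deadlines.

    Deadlines outside the window are ignored and events that share a
    deadline are served by a single interrupt.
    """
    return len({d for d in deadlines_s if 0 <= d < duration_s})


if __name__ == "__main__":
    # One mostly idle second with work due at 0.25 s and (twice) at 0.5 s.
    deadlines = [0.25, 0.5, 0.5, 2.0]
    print(periodic_interrupts(1, 1000))        # fixed-rate ticking
    print(one_shot_interrupts(deadlines, 1))   # variable-time ticks
```

In this model the idle second costs 1000 interrupts with periodic ticking and 2 with one-shot ticks, which is the CPU time the paragraph above says would otherwise be spent on interrupts with no work assigned to them.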

Networking

Five New TCP Congestion Control Algorithms

The Centre for Advanced Internet Architectures at Swinburne University of Technology, with the support of the Cisco University Research Program Fund at Community Foundation Silicon Valley and the FreeBSD Foundation, delivered enhancements to FreeBSD’s TCP stack in order to support newer congestion control algorithms. These enhancements included a modular framework for adding future algorithms as well as new modular implementations of the H-TCP, CUBIC, Vegas, HD, and CHD algorithms.
Each congestion control algorithm is implemented as a loadable kernel module. Algorithms can be selected to suit the application/network characteristics and requirements of the host’s installation. The modular framework makes it much easier for developers to implement new algorithms, allowing FreeBSD’s TCP stack to be at the forefront of advancements in this area, while still maintaining the stability of its network stack.
Links to technical papers regarding the framework and algorithms can be found at http://caia.swin.edu.au/freebsd/5cc/.
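The idea of congestion control algorithms as interchangeable modules selected by name can be sketched in a few lines of Python. This is a toy model, not FreeBSD's kernel interface: the registry, the hook names on_ack and on_loss, and both algorithms (including the made-up "gentle" variant) are hypothetical and are not any of the five algorithms listed above.

```python
# Toy model of a pluggable congestion-control framework.
# Every name here is hypothetical; this is not the FreeBSD kernel API.

REGISTRY = {}


def register(name):
    """Class decorator that makes an algorithm selectable by name."""
    def wrap(cls):
        REGISTRY[name] = cls
        return cls
    return wrap


@register("aimd")
class Aimd:
    """Additive increase, multiplicative decrease."""

    def on_ack(self, cwnd):
        return cwnd + 1.0 / cwnd      # grows by about one segment per window

    def on_loss(self, cwnd):
        return max(cwnd / 2.0, 1.0)   # halve the window, never below one


@register("gentle")
class Gentle:
    """Made-up variant that backs off less sharply on loss."""

    def on_ack(self, cwnd):
        return cwnd + 1.0 / cwnd

    def on_loss(self, cwnd):
        return max(cwnd * 0.8, 1.0)


def run(algorithm, events, cwnd=10.0):
    """Feed a sequence of "ack"/"loss" events to the named algorithm."""
    algo = REGISTRY[algorithm]()
    for event in events:
        cwnd = algo.on_ack(cwnd) if event == "ack" else algo.on_loss(cwnd)
    return cwnd


if __name__ == "__main__":
    trace = ["ack", "ack", "loss", "ack"]
    for name in sorted(REGISTRY):
        print(name, run(name, trace))
```

The caller in this sketch never changes when a new algorithm is registered, which mirrors why a modular framework makes it easier to add algorithms without disturbing the rest of the stack.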

“IPv6-Only”

FreeBSD has been on the leading edge of IPv6 development ever since FreeBSD 4.0 was released in 2000 with the KAME reference implementation of IPv4/IPv6 networking support. In addition, the FreeBSD Project has been serving releases from IPv6-enabled servers for more than 8 years and FreeBSD’s website, mailing lists, and developer infrastructure have been IPv6-enabled since 2007.
Beginning with FreeBSD 9.0, no-IPv4 snapshots of FreeBSD are available. By completely decoupling IPv6 from IPv4, early adopters and developers can determine if “IPv6-ready” applications really are ready for IPv6 or if bugs were hidden due to the ability to fall back on IPv4. Providing an implementation of an IPv6-only kernel without IPv4 support provides the FreeBSD Project with the ability to test and fix such regressions while encouraging other software developers to improve their code for true IPv6 readiness. More information about no-IPv4 versions of FreeBSD is available from http://www.freebsd.org/ipv6/.
To support IPv6-only, rtadvd(8) and rtsold(8) were completely overhauled to support RFC 6106. rtsold can now update /etc/resolv.conf using the openresolv DNS management framework (http://roy.marples.name/projects/openresolv). An optional kernel module is available to provide Secure Neighbor Discovery protocol (SeND) support; SeND is described in RFC 3971.
Continuing earlier efforts, more global options can now be controlled on a per-interface basis, such as the ability to accept router advertisements on one interface while still forwarding. This is needed to effectively run FreeBSD as an IPv6 CPE device. The single /etc/rc.conf option ipv6_cpe_wanif will correctly set all sysctls and interface options to make creating a CPE as easy as possible.

High Performance SSH (HPN-SSH)

OpenSSH's network performance is limited by statically defined internal flow control buffers. These buffers often end up acting as a bottleneck for the network throughput of SCP, especially on long, high-bandwidth network links. HPN-SSH adds support for dynamically adjusted buffers to allow the full use of the bandwidth of long fat pipes such as 100Mbps or greater, trans-oceanic, or trans-continental links. Bandwidth-delay products up to 64MB are also supported. This implementation includes a multithreaded cipher implementation which makes such bandwidth sustainable on the CPU side.
HPN is enabled by default in FreeBSD 9.0's sshd and several HPN options have been added to /etc/ssh/sshd_config. These options, as well as some performance tips, are described in http://svnweb.freebsd.org/base/head/crypto/openssh/README.hpn?revision=224638&view=markup.
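The bandwidth-delay product mentioned above is simply link speed multiplied by round-trip time, and it gives the amount of data that must be in flight to keep a link full. The following Python sketch works through the arithmetic; the function names and the example links are assumptions, and the 64MB figure from the text is interpreted here as 64 MiB.

```python
# Assumption: the "64MB" limit in the text is read as 64 MiB.
MAX_BDP_BYTES = 64 * 1024 * 1024


def bdp_bytes(bandwidth_mbps, rtt_ms):
    """Bandwidth-delay product in bytes.

    bandwidth_mbps is the link speed in megabits per second (10^6 bit/s)
    and rtt_ms is the round-trip time in milliseconds.
    """
    bits_in_flight = bandwidth_mbps * 1_000_000 * rtt_ms // 1000
    return bits_in_flight // 8


def within_limit(bandwidth_mbps, rtt_ms):
    """True if the link's bandwidth-delay product fits in the assumed limit."""
    return bdp_bytes(bandwidth_mbps, rtt_ms) <= MAX_BDP_BYTES


if __name__ == "__main__":
    # A hypothetical 100Mbps link with a 150 ms round-trip time.
    print(bdp_bytes(100, 150))
    # A hypothetical 1Gbps link with a 500 ms round-trip time.
    print(bdp_bytes(1000, 500), within_limit(1000, 500))
```

For the first example link, 1,875,000 bytes must be in flight to use the full bandwidth; a flow control buffer smaller than that leaves the link partly idle, which is the bottleneck that dynamically adjusted buffers are meant to remove.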

Miscellaneous

Several other features are also worth mentioning:

  • large-scale SMP support for systems with more than 32 CPUs. Previously, the kernel structures were unable to account for such a large number of CPUs, so this change implements extensible CPU accounting. Yahoo! provided systems for testing these changes.
  • improved USB 3.0 support.
  • the default NFS client and nfsd(8) now support NFSv4. Backwards compatibility for older NFS clients is provided with the oldnfs mount type.
  • a new kernel-mode NFS lock manager has been added, improving performance and behavior of NFS locking. A new clear_locks(8) command has been added to clear locks held on behalf of an NFS client.
  • sysinstall has been replaced with bsdinstall. Its features are described at http://wiki.freebsd.org/BSDInstall and its usage is detailed in the FreeBSD Handbook: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/bsdinstall.html.
  • the kernel now supports a new textdump(4) format of kernel dumps. A textdump provides higher-level information via mechanically generated/extracted debugging output, rather than a simple memory dump. This facility can be used to generate brief kernel bug reports that are rich in debugging information, but are not dependent on kernel symbol tables or precisely synchronized source code.
  • FreeBSD 9.0 can be installed on the Sony PlayStation 3 using the instructions at http://people.freebsd.org/~nwhitehorn/ps3/README.
  • call and return rule actions were added to ipfw(8): http://svnweb.freebsd.org/base?view=revision&revision=223666.

Conclusion

With the release of FreeBSD 9.0, the FreeBSD Project continues to innovate in the areas of security, compilers, filesystems, and networking. You can find out more information about the FreeBSD Project and download FreeBSD 9.0 from freebsd.org.

Source: MagazineBSD VOL 5 NO 0.1
