Report of Jonathan, Ensworth
Linux is a POSIX-compatible, multithreaded, multitasking, configurable, SMP-capable, open source, monolithic and modular operating system kernel which runs on over forty different platforms and supports hundreds of thousands of different hardware configurations. Linux is the result of the work of thousands of different contributors, including independent contributors and organizations such as IBM, Google, Nokia, Intel, the NSA and others. Linux is used on the largest and most powerful supercomputer in the world, and on some of the world’s smallest cell phones and systems-on-a-chip. It is unique in its ability to adapt to virtually any situation, any application, real-time, embedded, distributed, or clustered. In its current iteration the Linux kernel, the core of the system, is over six and a half million lines of carefully crafted source code. The source code is available publicly with the rights of end users to read, modify, use, and distribute the source code in any way they wish, as long as these same rights remain to those who they distribute their modifications to.
Overall Design Edit
This section describes the overall design of Linux, including it’s APIs (Application Programming Interface), threads and process management, drivers, schedulers, and more.
Linux overall design is a monolithic kernel design in which most kernel services are offered within a single binary executable (within the kernel itself). This kernel design was a subject of some debate early in Linux's history (See Crictism and Contraversy section for details). Despite it's design, Linux however supports loadable kernel modules which can add limited functionality like additional hardware support or unique features. However, since Linux is an open source operating system, is designed with the ability to add and remove functionality at compile-time very easily (as opposed to runtime). It is not uncommon for a advanced user or specific application to compile his/or her own kernel with custom options. Thus Linux inherits many of the benefits of microkernel architectures (modularity), with none of the disadvantages (performance-loss).
Linux overall architecture and APIs are closely related to that of another operating system, Unix. Unix itself spawned a large amount of compatible operating systems. In time, it was shown that Unix needed a standard, a set a rules which all Unix-based OSes should follow to ensure compatibility with user-mode code. The standard was created and known as “POSIX”, or the Portable Operating System Interface for uniX. Unix-compatible operating systems are said to be “POSIX-compliant”. POSIX compliance tests a number of things, such as threading, process creation and destruction, standard shell functions, input and output streams, IPC, networking, and other things.
Linux is a mostly POSIX-compliant OS. Its threading architecture uses the pthread (or “POSIX threads”) mechanism, its sockets are based off of the standard BSD sockets interface. Process creation and destruction works in the typical Unix tree like way, when a process requests another process be created, it typically “forks” the process off itself, that process becoming a child of the given parent process. All processes on the system are children to the master process known as init.
One of the most important parts of a kernel is the scheduler. It is important because it affects the performance of almost every part of the kernel and operating system. Scheduling is the job of allocating CPU time to different tasks within an operating system. The scheduler makes it possible to execute multiple programs at the same time, thus sharing the CPU with users of varying needs.
The Linux scheduler strives to meet several objectives. These objectives include fairness and preventing starvation, efficiency, minimal overhead, minimizing turnaround, wait and response times. Fairness does not mean that every thread should have the same degree of access to CPU time with the same priority, but it means that no thread should ever starve. Since context switching is expensive, allowing tasks to run for longer periods of time increases efficiency. However, interactivity is also important for the Linux scheduler, given the growing effort to optimize Linux for desktop environments, which makes the efficiency suffer, because interactivity generally means having more frequent context switches. The scheduler tries to reduce turnaround time – sum of the service time and amount of time wasted waiting in the ready queue. The response from a program must be as fast as possible. Also another important factor which is often ignored is the variance between the response times.
There have been a number of changes between Linux 2.4 and 2.6. In Linux 2.4 all tasks are on a single global tasklist with each task assigned a "goodness" rating. Each task is assigned a certain number of clock ticks. If the task yields the CPU for I/O before the timeslice is complete it will receive a higher priority. The tasklist is not ordered in 2.4. For large numbers of processors the scheduler was taking too long.
Linux 2.6 has 140 parallel FIFO queues per scheduler (one for each priority). Most queues, except on a heavily loaded system, are empty. The bitmap indicates which queues are empty and processor queues are periodically rebalanced.
Before the 2.6 kernel, the Linux scheduler had a significant limitation when many tasks were active. This was due to the scheduler being implemented using an algorithm with O(n) complexity. In this type of scheduler, the more tasks are active, the longer it takes to schedule a task. At very high loads, the processor can be consumed with scheduling and devote little time to the tasks themselves. Thus, the algorithm lacked scalability. During the Linux 2.5 development period, a new scheduling algorithm was one of the most significant changes to the kernel. The Linux 2.4 scheduler while widely used, reliable, and in general pretty good, has several very undesirable characteristics. These characteristics were embedded in the design, and when Ingo Molnar decided to fix them he produced an entirely new scheduler instead of making modifications to the old one. The fact that the Linux 2.4 scheduling algorithm contained O(n) algorithms was perhaps its greatest flaw, and the improvement in using only O(1) algorithms was the most welcome improvement of Linux 2.6. Big-O notation is used to denote the growth rate of an algorithm’s execution time based on the amount of input. That is, O(1) algorithm is guaranteed to complete in a certain amount of time regardless of the size of the input. The Linux 2.6 scheduler does not contain any algorithms that run in worse than O(1) time. Regardless of how many tasks are on the system every part of the scheduler is guaranteed to execute within a certain constant amount of time. This allows the Linux kernel to efficiently handle massive numbers of tasks without increasing overhead costs as the number of tasks grows.
There are two key data structures in the Linux 2.6 scheduler that allow for it to perform its duties in O(1) time – runqueues and priority arrays. The runqueue data structure is defined as a struct in kernel/sched.c. It is not defined in kernel/sched.h because abstracting the scheduler’s inner workings from its public interface is an important architectural goal. Priority arrays data structure is the basis for most of the Linux 2.6 scheduler’s advantageous behavior, in particular its O(1) time performance. The Linux 2.6 scheduler always schedules the highest priority task on a system, and if multiple tasks exist at the same priority level, they are scheduled Round Robin with each other. Priority arrays make finding the highest priority task in a system a constant-time operation, and also make Round Robin behavior within priority levels possible in constant-time. The schedule() function is the main scheduler function. Its job is to pick a new task to run and switch to it. It is called whenever a task wishes to give up the CPU voluntarily (often through the sys_sched_yield() system call), and if scheduler_tick() sets the TIF_NEED_RESCHED flag on a task because it has run out of timeslice, then schedule() will get called when preempts are re-enabled.
In April 2007 Ingo Molnar released a new patchset titled the "Modular Scheduler Core and Completely Fair Scheduler". This project is a complete rewrite of the Linux task scheduler. His goal was “to address various feature requests and to fix deficiencies in the vanilla scheduler that were suggested/found in the past few years, both for desktop scheduling and for server scheduling workloads”. The Completely Fair Scheduler handles CPU resource allocation for executing processes, and aims to maximize overall CPU utilization while maximizing interactive performance. In contrast to the previous O(1) scheduler, the CFS scheduler implementation is not based on runqueues. Instead a red-black tree implements a 'timeline' of future task execution. Additionally, the scheduler uses nanosecond granularity accounting, making the previous notion of timeslices redundant, the atomic units by which an individual process' share of the CPU was allocated. Like the old O(1) scheduler, CFS utilizes a concept called "sleeper fairness", which considers sleeping or waiting tasks equivalent to those on the runqueue. This means that interactive tasks which spend most of their time waiting for user input or other events get a comparable share of CPU time when they need it. The fair queuing CFS scheduler has a scheduling complexity of O(log N), where N is the number of tasks in the runqueue. Choosing a task can be done in constant time, but reinserting a task after it ran requires O(log N) operations, because the runqueue is implemented as a red-black tree. CFS is the first implementation of a fair queuing process scheduler in a widely used general-purpose operating system.
Processes and IPC Edit
As stated before, Linux is a mostly POSIX-compliant OS. The POSIX standard enforces certain modes of IPC and process creation which all implementations most follow. Process creation and destruction works in the typical Unix tree like way, which all processes are children of another process. Specifically, when a process requests another process be created, it typically “forks” the process off itself, that process becoming a child of the given parent process. When a process is created it is assignmed a "Process Identification Number" or PID. The PID is used to reference the process for many different UNIX commands such as kill. All processes on the system are children to the master process known as init, which has the PID of 1.
Init is an actual user-mode process, but it often referred to as a pseudo-user mode process since it is given special privileges from the OS. For instance an init process can not be killed, and init is specially called by the kernel after it finishes initialization. A process which is directly forked off of init is typically referred to as a service. A “service” process runs with access for all users, and is usually launched early on in system boot up. Example services usually include services which provide file system services (such as FUSE), or the X Window System (which provides a GUI). When a given parent process is killed, all its children are often killed as well. For instance, when you kill the X Window System, all its children (which are the graphical applications running on top of it), are also killed.
Processes running on Linux which do not have interactivity or conventional input/out are known as daemons. Typically a server implementation of Linux will have many such daemons running, such as the Apache web server, the MySQL or Oracle database, FTPd and more. Typically the binary names of a daemon end with d, for instance, mysqld, httpd, ftpd. The Linux OS typically does not differentiate between daemons and normal processes, except in the case of the O(1) scheduler. (See section on O(1) scheduler for details)
Memory Manager Edit
The one most common services processes request from the kernel is memory allocation. Linux memory allocation has for many years been handled by the SLAB allocator. Christoph Lameter thought the SLAB allocator had many problems: "SLAB Object queues exist per node, per CPU. The alien cache queue even has a queue array that contain a queue for each processor on each node. For very large systems the number of queues and the number of objects that may be caught in those queues grows exponentially. On our systems with 1k nodes / processors we have several gigabytes just tied up for storing references to objects for those queues This does not include the objects that could be on those queues. One fears that the whole memory of the machine could one day be consumed by those queues." He created the new allocator is known as SLUB, which was recently merged with the kernel.SLUB is unique in that it requires no metadata per slab, thus saving memory for the actual contents of the slab.
One of the most unique aspects of Linux is the diversity and plethora of it's applications. The same Linux kernel (often with custom compilation flags) is used on small system-on-a-chip devices such as cell phones and other handheld devices, to some of the largest supercomputers, clusters, servers, and massively distributed machines. One of the major reasons Linux is so flexible has very little to do with it's engineered aspects, but rather the way it is distributed. Linux is an open source kernel, meaning that it's source code is freely available for anyone to view, modify, and distribute. Because of this, custom "kernels" with differing functionality can be created.
Because of Linux's liberal copyright terms, Linux is often bundled as part of a "distribution" (or "distro") with various applications and libraries. There are many hundreds of these distros, but the most popular ones are Red Hat Enterprise Linux (RHEL), Fedora, Ubuntu, Slackware, Gentoo, and SuSE.
Linux has many advantages that other operating systems do not. Perhaps the biggest advantage of all is that it is open source, which allows you to customize it for the specific task at hand. It is also royalty or license fee free. The architecture of Linux is very scalable, allowing it to run on hundreds of thousands of hardware configurations big and small without any code modification. It has a proven driver model and network stack, efficient scheduler and memory manager. It has an plethora of commercial support, documentation, and resources for both common and uncommon tasks.
Limitations and Controversy Edit
Linux is not without it's share of critics. Very early on it it's history, there was a famous heated exchange between Linus Trovalds (the original author of Linux), and Dr. Andy Tanenbaum (author of Minix, and considered an operating system expert) about Linux's fundamental architecture. Andy Tanembaum believed monolithic kernels to be obsolete design and microkernels to the future of all modern kernel designs. Linus pointed out all the problems with Minix's design, and replied to a specific comment of Tanembaum about him being a OS researcher and professor, to which Linus concluded: "your job is being a professor and researcher: That's one hell of a good excuse for some of the brain-damages of minix". Tanembaum responded by saying that if Linus was his student, he would fail him for doing such a poor job with Linux's design.
A lot of criticism of Linux comes not from it's engineering, but the open source movement which surrounds it. Some critics claim that open source has the potential to hurt the lucrativeness of the software industry. Others even go as far to say that the open source movement is an "anti-capitalist" or "communist" movement. If these claims are nonsense or not, they are still rather common.
Another common criticism is for Linux on the desktop, a market traditionally Linux has done poorly on. One of the common criticisms is poor hardware support, and the lack of support for running Windows applications. However, hardware support for PCs has improved greatly over the years, and the Wine project allows for native support for Windows applications on top of Linux.
Recent controversy erupted around the choice of scheduler in the Linux. Very recently, the kernel's primary process scheduler system was replaced with a completely fair scheduler (CFS) by Ingo Molnar. The CFS offers better performance for interactive tasks and has a simpler design then the earlier O(1) scheduler (also by Ingo Molnar). However, another scheduler which was in development and testing, the Staircase Deadline scheduler (SD) was aligned to be the next Linux scheduler. However, it was not chosen in the end, Linus opting to use the much newer CFS scheduler. The developer of the SD scheduler, Con Kovolivas, left the Linux development team in disgust in a highly public way. The SD vs CFS debate spread all over the Linux community and is still a matter of controversy, with people stating that the SD scheduler is superior.
In retrospect dispite the notable changenling Open-Source product face (according to page 115 of the seventh edition of "Object-Oriented & Classical Software Engineering" book written by Stephen R. Schach) open source product are generally staffed by unpaid volunteers. In order to attract volunteers to open source project and keep them interested it is essential that all times they view the project as "worthwhile" core group developers need to posess finely honed skills. Dispite all these changellings Linux continues to flourish as a powerful, robust, efficient, reliable, scalable operating system. developers (of various skill level )and organizations and instituional groups(such as IBM, Google, Nokia, Intel, the NSA, FAU Technical service group) either use or contribute to the further development of linux. As of october 2007 the worlds largest software development company Google(surpassing Wall Mart and Microsoft with a with a market value of $190 billion dollars)was rumored to be developing a powerful desktop operating system code name "goobu" a spin-off of linux Ubutu operating system.And as of June 2007, imperical data shows Linux is on 74% of the top 500 supercomputers in the world and with the release of linux 2.6 schulder was taylored with desktop operating system in mind. last but not least due to the fact that linux is an open source software users become co-developers submitting bugs, error, faults, defects in software and ways to fix these issues. General these issues are resloved in a timely manner with a new release of software any where from one week to three months in contrast to close source software(such as Microsoft that general take a any where from six moths to two year to reslove failure reporting issues).
- In 1991, Linux was 10,250 lines of code, now it's more than 8 million lines; current production is 86 lines of new code per hour, or 2,000 lines of code per day.
- As of June 2007, Linux is on 74% of the top 500 supercomputers in the world, including the largest.
1. http://www.russross.com/papers/cs261/, An Evaluation of the Linux Scheduler for Single User Workstations
2. Charles Babcock, “The Relentless Pace of Linux”, InformationWeek., October 22, 2007, pp.42-46
3. Steven Graham, Steve Shah, “Linux Administration: A Beginner’s Guide, Third Edition”, McGraw Hill/Osborne, 2003, pp. 3-6, 13-14, 207-220
4. Ellen Siever et al., “Linux in a Nutshell”, O’Reilly, 2003, p. 16
5. http://archive.is/20120719022634/kerneltrap.org/node/11737 - Completely Fair Scheduler (part of mainline Linux 2.6.23)
6. http://www.hpl.hp.com/research/linux/kernel/o1.php - O(1) Scheduler (main scheduler pre-Linux 2.6.23)
7. http://lwn.net/Articles/224865/ - Staircase Deadline Scheduler (part of linux-ck branch)
8. Christopher Negus, “Red Hat: Fedora Linux 3 Bible”, Wiley, 2005, pp. 954-959
9. Linus Torvalds, David Diamond, “Just for Fun”, Harper Business, 2001
10. http://www.winehq.org/ - Wine Project main website
11. http://www.oreilly.com/catalog/opensources/book/appa.html - From O'Reilly Books: The Tanenbaum-Torvalds Debate
12. Object-Oriented & Classical Software Engineering seventh edition by Stephen R. Schach